Article

The Arbitrarily Varying Relay Channel †

Department of Electrical Engineering, Technion—Israel Institute of Technology, Haifa 32000, Israel
*
Author to whom correspondence should be addressed.
Parts of this work have been presented at the 2018 International Symposium on Information Theory, Vail, Colorado, 17–22 June 2018, and at the 56th Annual Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, 3–5 October 2018. This work was supported by the Israel Science Foundation (grant No. 1285/16).
Entropy 2019, 21(5), 516; https://doi.org/10.3390/e21050516
Submission received: 26 March 2019 / Revised: 11 May 2019 / Accepted: 20 May 2019 / Published: 22 May 2019

Abstract

We study the arbitrarily varying relay channel, which models communication with relaying in the presence of an active adversary. We establish the cutset bound and partial decode-forward bound on the random code capacity. We further determine the random code capacity for special cases. Then, we consider conditions under which the deterministic code capacity is determined as well. In addition, we consider the arbitrarily varying Gaussian relay channel with sender frequency division under input and state constraints. We determine the random code capacity, and establish lower and upper bounds on the deterministic code capacity. Furthermore, we show that as opposed to previous relay models, the primitive relay channel has a different behavior compared to the non-primitive relay channel in the arbitrarily varying scenario.

1. Introduction

The relay channel was first introduced by van der Meulen [1] to describe point-to-point communication with the help of a relay, which receives a noisy version of the transmitter signal and transmits a signal of its own to the destination receiver. The relay channel is generally perceived as a fundamental building block for multihop networks (see e.g., [2,3], Chapter 16), where some nodes receive and transmit in order to assist the information flow between other nodes. The capacity of the relay channel is not known in general. However, Cover and El Gamal established the cutset upper bound, the decode-forward lower bound, and the partial decode-forward lower bound [4]. It was also shown in [4] that for the reversely degraded relay channel, direct transmission is capacity achieving. For the degraded relay channel, the decode-forward lower bound and the cutset upper bound coincide, thus characterizing the capacity for this model [4].
In general, the partial decode-forward lower bound is tighter than both direct transmission and decode-forward lower bounds. El Gamal and Zahedi [5] determined the capacity of the relay channel with orthogonal sender components, by showing that the partial decode-forward lower bound and cutset upper bound coincide. A variation of the relay channel, referred to as the primitive relay channel, was introduced by Kim [2], and attracted a lot of attention (see e.g., [6,7,8,9,10,11,12] and references therein). Recently, there has also been a growing interest in the Gaussian relay channel, as e.g., in [5,7,9,13,14,15,16] and references therein. In particular, El Gamal and Zahedi [5] introduced the Gaussian relay channel with sender frequency division (SFD), as a special case of a relay channel with orthogonal sender components. There are many other relaying scenarios, including secrecy [17,18], networking [15,19,20,21,22], parallel relaying [23,24,25], diamond channels [26,27,28], side information [29,30,31,32,33], etc.
In practice, the channel statistics are not necessarily known exactly, and they may even change over time. The arbitrarily varying channel (AVC) is an appropriate model to describe such a situation [34]. In real systems, such variations are caused by fading in wireless communication [35,36,37,38,39,40,41,42], memory faults in storage [43,44,45,46,47], malicious attacks on identification and authorization systems [48,49], etc. The AVC is especially relevant to communication in the presence of an adversary, or a jammer, attempting to disrupt communication. Jamming attacks are not limited to point-to-point communication, and are a major security concern for cognitive radio networks [50] and wireless sensor networks [42,51,52,53,54], for instance.
Considering the AVC without a relay, Blackwell et al. determined the random code capacity [34], i.e., the capacity achieved by stochastic-encoder stochastic-decoder coding schemes with common randomness. It was also demonstrated in [34] that the random code capacity is not necessarily achievable using deterministic codes. A well-known result by Ahlswede [55] is the dichotomy property of the AVC. Specifically, the deterministic code capacity either equals the random code capacity or else, it is zero. Subsequently, Ericson [56] and Csiszár and Narayan [57] established a simple single-letter condition, namely non-symmetrizability, which is both necessary and sufficient for the capacity to be positive. Ahlswede’s Robustification Technique (RT) is a useful technique for the AVC analysis, developed and applied to classical AVC settings [58,59]. Essentially, the RT uses a reliable code for the compound channel to construct a random code for the AVC applying random permutations to the codeword symbols. A continuing line of works on arbitrarily varying networks includes among others the arbitrarily varying broadcast channel [60,61,62,63,64,65], multiple-access channel [60,66,67,68,69,70,71,72,73,74,75], and wiretap channel [76,77,78,79,80,81,82,83,84]. The reference lists here are far from being exhaustive.
In this work, we introduce a new model, namely, the arbitrarily varying relay channel (AVRC). The AVRC combines the previous models, i.e., the relay channel and the AVC, and we believe that it is a natural problem to consider, in light of the jamming attacks on current and future networks, as mentioned above. In the analysis, we incorporate the block Markov coding schemes of [4] in Ahlswede’s Robustification and Elimination Techniques [55,59]. A straightforward application of Ahlswede’s RT fails to comply with the strictly causal relay transmission. In a recent work [85,86], by the authors of this paper, a modified RT was presented and applied to the point-to-point AVC with causal side information under input and state constraints, without a relay. This was the first time that the application of the RT exploited the structure of the original compound channel code to construct a random code for the AVC, as opposed to earlier work where the original code is treated as a “black box”. Here, we present another modification of the RT, which also exploits the structure of the original compound channel code, but in a different manner. The analysis also requires redefining the compound channel, and we refer to the newly defined channel as the block-compound relay channel.
We establish the cutset upper bound and the full/partial decode-forward lower bound on the random code capacity of the AVRC. The random code capacity is determined in special cases of the degraded AVRC, the reversely degraded AVRC, and the AVRC with orthogonal sender components. Then, we give extended non-symmetrizability conditions under which the deterministic code capacity coincides with the random code capacity. We show by example that the deterministic code capacity can be strictly lower than the random code capacity of the AVRC. Then, we consider the Gaussian AVRC with SFD, under input and state constraints. The random code capacity is determined using the previous results, whereas the deterministic code capacity is lower and upper bounded using an independent approach. Specifically, we extend the techniques from [87], where Csiszár and Narayan determine the capacity of the Gaussian AVC under input and state constraints. It is shown that for low values of the input constraint, the deterministic code capacity can be strictly lower than the random code capacity, yet non-zero.
Furthermore, we give similar bounds for the primitive AVRC, where there is a noiseless link between the relay and the receiver of limited capacity [2]. We find the capacity of the primitive counterpart of the Gaussian AVRC with SFD, in which case the deterministic and random code capacities coincide, regardless of the value of the input constraint. We deduce that Kim’s assertion—that “the primitive relay channel captures most essential features and challenges of relaying, and thus serves as a good testbed for new relay coding techniques” [2]—is not true in the arbitrarily varying scenario.
This work is organized as follows. In Section 2, the basic definitions and notation are provided. In Section 3, we give the main results on the general AVRC. The Gaussian AVRC with SFD is introduced in Section 4, and the main results are given in Section 5. The definition and results on the primitive AVRC are in Section 6.

2. Definitions

2.1. Notation

We use the following notation conventions throughout. Calligraphic letters $\mathcal{X}, \mathcal{S}, \mathcal{Y}, \ldots$ are used for finite sets. Lowercase letters $x, s, y, \ldots$ stand for constants and values of random variables, and uppercase letters $X, S, Y, \ldots$ stand for random variables. The distribution of a random variable $X$ is specified by a probability mass function (pmf) $P_X(x) = p(x)$ over a finite set $\mathcal{X}$. The set of all pmfs over $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. We use $x^j = (x_1, x_2, \ldots, x_j)$ to denote a sequence of letters from $\mathcal{X}$. A random sequence $X^n$ and its distribution $P_{X^n}(x^n) = p(x^n)$ are defined accordingly. For a pair of integers $i$ and $j$, $1 \le i \le j$, we define the discrete interval $[i:j] = \{i, i+1, \ldots, j\}$. The notation $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is used when it is understood from the context that the length of the sequence is $n$, and the $\ell^2$-norm of $\mathbf{x}$ is denoted by $\|\mathbf{x}\|$.

2.2. Channel Description

A state-dependent discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ consists of five sets, $\mathcal{X}$, $\mathcal{X}_1$, $\mathcal{S}$, $\mathcal{Y}$ and $\mathcal{Y}_1$, and a collection of conditional pmfs $W_{Y,Y_1|X,X_1,S}$. The sets stand for the input alphabet, the relay transmission alphabet, the state alphabet, the output alphabet, and the relay input alphabet, respectively. The alphabets are assumed to be finite, unless explicitly said otherwise. The channel is memoryless without feedback, and therefore
$$W_{Y^n,Y_1^n|X^n,X_1^n,S^n}(y^n, y_1^n | x^n, x_1^n, s^n) = \prod_{i=1}^{n} W_{Y,Y_1|X,X_1,S}(y_i, y_{1,i} | x_i, x_{1,i}, s_i). \tag{1}$$
Communication over a relay channel is depicted in Figure 1. Following [29], a relay channel $W_{Y,Y_1|X,X_1,S}$ is called degraded if the channel can be expressed as
$$W_{Y,Y_1|X,X_1,S}(y, y_1 | x, x_1, s) = W_{Y_1|X,X_1,S}(y_1 | x, x_1, s) \, W_{Y|Y_1,X_1,S}(y | y_1, x_1, s), \tag{2}$$
and it is called reversely degraded if
$$W_{Y,Y_1|X,X_1,S}(y, y_1 | x, x_1, s) = W_{Y|X,X_1,S}(y | x, x_1, s) \, W_{Y_1|Y,X_1,S}(y_1 | y, x_1, s). \tag{3}$$
We say that the relay channel is strongly degraded or strongly reversely degraded if the respective definition holds with a relay marginal that is independent of the state. That is, $W_{Y,Y_1|X,X_1,S}$ is strongly degraded if $W_{Y,Y_1|X,X_1,S} = W_{Y_1|X,X_1} W_{Y|Y_1,X_1,S}$, and similarly, $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded if $W_{Y,Y_1|X,X_1,S} = W_{Y|X,X_1,S} W_{Y_1|Y,X_1}$. For example, if $Y_1 = X + Z$ and $Y = Y_1 + X_1 + S$, where $Z$ is an independent additive noise, then $W_{Y,Y_1|X,X_1,S}$ is strongly degraded. In contrast, if $Y = X + X_1 + S$ and $Y_1 = Y + Z$, then $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded.
The arbitrarily varying relay channel (AVRC) is a discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ with a state sequence of unknown distribution, not necessarily independent or stationary. That is, $S^n \sim q(s^n)$ with an unknown joint pmf $q(s^n)$ over $\mathcal{S}^n$. In particular, $q(s^n)$ can give mass 1 to some state sequence $s^n$. We use the shorthand notation $\mathcal{L} = \{W_{Y,Y_1|X,X_1,S}\}$ for the AVRC, where the alphabets are understood from the context.
To analyze the AVRC, we consider the compound relay channel. Different models of compound relay channels have been considered in the literature [30,88]. Here, we define the compound relay channel as a discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ with a discrete memoryless state, where the state distribution $q(s)$ is not known exactly, but rather belongs to a family of distributions $\mathcal{Q}$, with $\mathcal{Q} \subseteq \mathcal{P}(\mathcal{S})$. That is, $S^n \sim \prod_{i=1}^{n} q(s_i)$, with an unknown pmf $q \in \mathcal{Q}$ over $\mathcal{S}$. We use the shorthand notation $\mathcal{L}^{\mathcal{Q}}$ for the compound relay channel, where the transition probability $W_{Y,Y_1|X,X_1,S}$ and the alphabets are understood from the context.
In the analysis, we also use the following model. Suppose that the user transmits $B > 0$ blocks of length $n$, and the jammer is entitled to use a different state distribution $q_b(s) \in \mathcal{Q}$ for every block $b \in [1:B]$, while the encoder, relay and receiver are aware of this jamming scheme. In other words, every block is governed by a different memoryless state. We refer to this channel as the block-compound relay channel, denoted by $\mathcal{L}^{\mathcal{Q} \times B}$. Although this is a toy model, it is a useful tool for the analysis of the AVRC.

2.3. Coding

We introduce some preliminary definitions, starting with the definitions of a deterministic code and a random code for the AVRC $\mathcal{L}$. Note that in general, the term ‘code’, unless mentioned otherwise, refers to a deterministic code.
Definition 1 
(A code, an achievable rate and capacity). A $(2^{nR}, n)$ code for the AVRC $\mathcal{L}$ consists of the following: a message set $[1:2^{nR}]$, where it is assumed throughout that $2^{nR}$ is an integer, an encoder $f: [1:2^{nR}] \to \mathcal{X}^n$, a sequence of $n$ relaying functions $f_{1,i}: \mathcal{Y}_1^{i-1} \to \mathcal{X}_{1,i}$, $i \in [1:n]$, and a decoding function $g: \mathcal{Y}^n \to [1:2^{nR}]$.
Given a message $m \in [1:2^{nR}]$, the encoder transmits $x^n = f(m)$. At time $i \in [1:n]$, the relay transmits $x_{1,i} = f_{1,i}(y_1^{i-1})$ and then receives $y_{1,i}$. The relay codeword is given by $x_1^n = f_1^n(y_1^n) \triangleq \big( f_{1,i}(y_1^{i-1}) \big)_{i=1}^{n}$. The decoder receives the output sequence $y^n$, and finds an estimate of the message $\hat{m} = g(y^n)$ (see Figure 1). We denote the code by $\mathscr{C} = \big( f(\cdot), f_1^n(\cdot), g(\cdot) \big)$. Define the conditional probability of error of the code $\mathscr{C}$ given a state sequence $s^n \in \mathcal{S}^n$ by
$$P_{e|s^n}^{(n)}(\mathscr{C}) = \frac{1}{2^{nR}} \sum_{m=1}^{2^{nR}} \; \sum_{(y^n, y_1^n) \,:\, g(y^n) \neq m} \; \prod_{i=1}^{n} W_{Y,Y_1|X,X_1,S}\big( y_i, y_{1,i} \,\big|\, f_i(m), f_{1,i}(y_1^{i-1}), s_i \big). \tag{4}$$
Now, define the average probability of error of $\mathscr{C}$ for some distribution $q(s^n) \in \mathcal{P}(\mathcal{S}^n)$,
$$P_e^{(n)}(q, \mathscr{C}) = \sum_{s^n \in \mathcal{S}^n} q(s^n) \cdot P_{e|s^n}^{(n)}(\mathscr{C}). \tag{5}$$
Observe that $P_e^{(n)}(q, \mathscr{C})$ is linear in $q$, and thus continuous. We say that $\mathscr{C}$ is a $(2^{nR}, n, \varepsilon)$ code for the AVRC $\mathcal{L}$ if it further satisfies
$$P_e^{(n)}(q, \mathscr{C}) \le \varepsilon, \quad \text{for all } q(s^n) \in \mathcal{P}(\mathcal{S}^n). \tag{6}$$
A rate $R$ is called achievable if for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n, \varepsilon)$ code. The operational capacity is defined as the supremum of the achievable rates and it is denoted by $\mathbb{C}(\mathcal{L})$. We use the term ‘capacity’ referring to this operational meaning, and in some places we call it the deterministic code capacity in order to emphasize that achievability is measured with respect to deterministic codes.
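Returning to the error requirement in Definition 1, a short worked step may help. Since the average probability of error is linear in the state distribution, the worst case over all joint pmfs is attained at a point mass, so the requirement is equivalent to a bound for every individual state sequence. The sketch below uses the script symbol for the code and is only an illustration of this standard observation.

```latex
% Upper bound: an average never exceeds the maximum of the averaged terms.
% Equality: choose q to put mass 1 on a maximizing state sequence.
\max_{q \in \mathcal{P}(\mathcal{S}^n)} P_e^{(n)}(q, \mathscr{C})
  = \max_{q \in \mathcal{P}(\mathcal{S}^n)} \sum_{s^n \in \mathcal{S}^n} q(s^n) \, P_{e|s^n}^{(n)}(\mathscr{C})
  = \max_{s^n \in \mathcal{S}^n} P_{e|s^n}^{(n)}(\mathscr{C}).
% Hence the requirement for all q holds if and only if
% P_{e|s^n}^{(n)}(\mathscr{C}) \le \varepsilon for every s^n \in \mathcal{S}^n.
```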
We now define the analogous quantities for stochastic-encoder stochastic-decoder triplets with common randomness. The codes formed by these triplets are referred to as random codes.
Definition 2 
(Random code). A $(2^{nR}, n)$ random code for the AVRC $\mathcal{L}$ consists of a collection of $(2^{nR}, n)$ codes $\{ \mathscr{C}_\gamma = (f_\gamma, f_{1,\gamma}^n, g_\gamma) \}_{\gamma \in \Gamma}$, along with a probability distribution $\mu(\gamma)$ over the code collection $\Gamma$. We denote such a code by $\mathscr{C}^\Gamma = (\mu, \Gamma, \{ \mathscr{C}_\gamma \}_{\gamma \in \Gamma})$. Analogously to the deterministic case, a $(2^{nR}, n, \varepsilon)$ random code has the additional requirement
$$P_e^{(n)}(q, \mathscr{C}^\Gamma) = \sum_{\gamma \in \Gamma} \mu(\gamma) P_e^{(n)}(q, \mathscr{C}_\gamma) \le \varepsilon, \quad \text{for all } q(s^n) \in \mathcal{P}(\mathcal{S}^n). \tag{7}$$
The capacity achieved by random codes is denoted by $\mathbb{C}^{\star}(\mathcal{L})$, and it is referred to as the random code capacity.

3. Main Results—General AVRC

We present our results on the compound relay channel and the AVRC.

3.1. The Compound Relay Channel

We establish the cutset upper bound and the partial decode-forward lower bound for the compound relay channel. Consider a given compound relay channel $\mathcal{L}^{\mathcal{Q}}$. Let
$$\mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}) \triangleq \inf_{q \in \mathcal{Q}} \max_{p(x, x_1)} \min \big\{ I_q(X, X_1; Y), \; I_q(X; Y, Y_1 | X_1) \big\}, \tag{8}$$
and
$$\mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) \triangleq \max_{p(u, x, x_1)} \min \Big\{ \inf_{q \in \mathcal{Q}} I_q(U, X_1; Y) + \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1, U), \; \inf_{q \in \mathcal{Q}} I_q(U; Y_1 | X_1) + \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1, U) \Big\}, \tag{9}$$
where the subscripts ‘CS’ and ‘PDF’ stand for ‘cutset’ and ‘partial decode-forward’, respectively.
Lemma 1.
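The capacity of the compound relay channel $\mathcal{L}^{\mathcal{Q}}$ is bounded by
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) \ge \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}), \tag{10}$$
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) \le \mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}). \tag{11}$$
Specifically, if $R < \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}})$, then there exists a $(2^{nR}, n, e^{-an})$ block Markov code over $\mathcal{L}^{\mathcal{Q}}$ for sufficiently large $n$ and some $a > 0$.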
The proof of Lemma 1 is given in Appendix A. The achievability proof is based on block Markov coding interlaced with the partial decode-forward scheme. That is, the encoder sends a sequence of messages over multiple blocks. The message in each block consists of two components, a decode-forward component and a direct transmission component, where only the former is decoded by the relay. The name ‘decode-forward component’ reflects the fact that the relay decodes this message component and forwards its estimate to the destination receiver. Once the decoder has received all blocks, the decode-forward components are decoded backwards, i.e., starting with the message in the last block and proceeding to the first. Using the estimates of the decode-forward components, the direct transmission components are decoded forwards, i.e., starting with the message in the first block and proceeding to the last. The ambiguity of the state distribution needs to be dealt with throughout all of those estimations. In both decoding stages, the receiver performs joint typicality decoding using a set of types that “quantizes” the set $\mathcal{Q}$ of state distributions.
Remark 1.
If the set of state distributions $\mathcal{Q}$ is convex, then the upper bound expression in the RHS of Equation (8) has a min-max form. On the other hand, in the lower bound expression in the RHS of Equation (9), the maximum comes first, and then we have multiple min terms, which makes this expression a lot more complicated than the classical partial decode-forward bound [4] (see also [3], Theorem 16.3), where Markov properties lead to a simpler expression. We note that this phenomenon (or one might say, disturbance) where the lower bound has multiple min terms is not exclusive to the AVRC. A noteworthy example is the arbitrarily varying wiretap channel [76,89], where the lower bound has the form of $\max \, [\min I_q(U; Y) - \max I_q(U; Z)]$. While the capacity of the classical wiretap channel is known, the arbitrarily varying counterpart has remained an open problem for several years.
Observe that taking $U = \emptyset$ in (9) gives the direct transmission lower bound,
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) \ge \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) \ge \max_{p(x, x_1)} \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1). \tag{12}$$
Taking $U = X$ in (9) results in a full decode-forward lower bound,
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) \ge \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) \ge \max_{p(x, x_1)} \inf_{q \in \mathcal{Q}} \min \big\{ I_q(X, X_1; Y), \; I_q(X; Y_1 | X_1) \big\}. \tag{13}$$
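For completeness, the two substitutions can be expanded as follows. This is only a sketch of the intermediate algebra, assuming the partial decode-forward expression in (9) and the elementary facts that a constant $U$ carries no information and that $I_q(X; Y | X_1, X) = 0$.

```latex
% Case U = \emptyset: I_q(U, X_1; Y) = I_q(X_1; Y) and I_q(U; Y_1 | X_1) = 0,
% so the second term inside the minimum of (9) is the smaller one.
\min\Big\{ \inf_{q \in \mathcal{Q}} I_q(X_1; Y) + \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1),\;
           0 + \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1) \Big\}
  = \inf_{q \in \mathcal{Q}} I_q(X; Y | X_1).

% Case U = X: I_q(X; Y | X_1, U) = 0, and a minimum of infima equals
% the infimum of the pointwise minimum.
\min\Big\{ \inf_{q \in \mathcal{Q}} I_q(X, X_1; Y),\; \inf_{q \in \mathcal{Q}} I_q(X; Y_1 | X_1) \Big\}
  = \inf_{q \in \mathcal{Q}} \min\big\{ I_q(X, X_1; Y),\; I_q(X; Y_1 | X_1) \big\}.

% In both cases, maximizing over p(x, x_1) gives a lower bound on R_PDF,
% because the maximum in (9) ranges over a larger set of distributions p(u, x, x_1).
```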
This yields the following corollary. The corollary uses the terms of a strongly degraded relay channel, for which $W_{Y,Y_1|X,X_1,S} = W_{Y_1|X,X_1} W_{Y|Y_1,X_1,S}$, and a strongly reversely degraded relay channel, for which $W_{Y,Y_1|X,X_1,S} = W_{Y|X,X_1,S} W_{Y_1|Y,X_1}$, as defined in Section 2.2.
Corollary 1.
Let $\mathcal{L}^{\mathcal{Q}}$ be a compound relay channel, where $\mathcal{Q}$ is a compact convex set.
1. If $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded, then
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) = \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) = \mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}) = \min_{q \in \mathcal{Q}} \max_{p(x, x_1)} I_q(X; Y | X_1). \tag{14}$$
2. If $W_{Y,Y_1|X,X_1,S}$ is strongly degraded, then
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q}}) = \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) = \mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}) = \max_{p(x, x_1)} \min \Big\{ \min_{q \in \mathcal{Q}} I_q(X, X_1; Y), \; I(X; Y_1 | X_1) \Big\}. \tag{15}$$
The proof of Corollary 1 is given in Appendix B. Part 1 follows from the direct transmission and cutset bounds, (12) and (8), respectively, while part 2 is based on the full decode-forward and cutset bounds, (13) and (8), respectively, along with the convexity considerations in the remark below.
Remark 2.
On a technical level, there are two purposes for considering the strongly degraded relay channel, for which the marginal channel to the relay is independent of the state, i.e., $W_{Y_1|X,X_1,S} = W_{Y_1|X,X_1}$ (see Section 2.2). First, this ensures that $X \to (X_1, Y_1) \to Y$ form a Markov chain, without conditioning on $S$. Second, as pointed out in Remark 1, there is a difference between the order of the min and max in the lower and upper bounds (cf. (8) and (9)). Thus, to prove the capacity results of Corollary 1 above, we apply the minimax theorem. In general, a pointwise minimum of two convex functions is not necessarily convex. Nevertheless, having assumed that the relay channel is strongly degraded, the functional $G(p, q) = \min \{ I_q(X, X_1; Y), I(X; Y_1 | X_1) \}$ is quasi-convex in the state distribution, i.e.,
$$G\big(p, (1 - \alpha) q_1 + \alpha q_2\big) \le \max \big\{ G(p, q_1), \; G(p, q_2) \big\}, \tag{16}$$
for every $p \in \mathcal{P}(\mathcal{X} \times \mathcal{X}_1)$, $q_1, q_2 \in \mathcal{Q}$, and $0 \le \alpha \le 1$. The quasi-convex shape is illustrated in Figure 2, which depicts $G(p, q)$ for an example given in the sequel. By [90] (Theorem 3.4), the minimax theorem applies to quasi-convex functions as well, which simplifies the proof of Corollary 1.
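One way to see the quasi-convexity claim of Remark 2 is sketched below. The shorthand $F(q)$ and $c$ is introduced here only for illustration; the argument assumes the standard fact that mutual information is convex in the channel for a fixed input distribution, and that the averaged channel $\sum_s q(s) W_{Y|X,X_1,S}$ is linear in $q$.

```latex
% Shorthand (illustrative): F(q) = I_q(X, X_1; Y), which is convex in q for fixed p,
% and c = I(X; Y_1 | X_1), which does not depend on q when the channel is strongly degraded.
G\big(p, (1-\alpha) q_1 + \alpha q_2\big)
  = \min\big\{ F\big((1-\alpha) q_1 + \alpha q_2\big),\; c \big\}
  \le \min\big\{ (1-\alpha) F(q_1) + \alpha F(q_2),\; c \big\}
  \le \min\big\{ \max\{ F(q_1), F(q_2) \},\; c \big\}
  = \max\big\{ \min\{ F(q_1), c \},\; \min\{ F(q_2), c \} \big\}
  = \max\big\{ G(p, q_1),\; G(p, q_2) \big\}.
```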
The following corollary is a direct consequence of Lemma 1 and it is significant for the random code analysis of the AVRC.
Corollary 2.
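The capacity of the block-compound relay channel $\mathcal{L}^{\mathcal{Q} \times B}$ is bounded by
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q} \times B}) \ge \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}), \tag{17}$$
$$\mathbb{C}(\mathcal{L}^{\mathcal{Q} \times B}) \le \mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}). \tag{18}$$
Specifically, if $R < \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}})$, then there exists a $(2^{nR}, n, e^{-an})$ block Markov code over $\mathcal{L}^{\mathcal{Q} \times B}$ for sufficiently large $n$ and some $a > 0$.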
The proof of Corollary 2 is given in Appendix C.

3.2. The AVRC

We give lower and upper bounds on the random code capacity and the deterministic code capacity of the AVRC $\mathcal{L}$.

3.2.1. Random Code Lower and Upper Bounds

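The random code bounds below are obtained through a modified version of Ahlswede’s RT, using our results on the block-compound relay channel in Corollary 2. Define
$$\mathsf{R}^{\star}_{PDF}(\mathcal{L}) \triangleq \mathsf{R}_{PDF}(\mathcal{L}^{\mathcal{Q}}) \Big|_{\mathcal{Q} = \mathcal{P}(\mathcal{S})}, \qquad \mathsf{R}^{\star}_{CS}(\mathcal{L}) \triangleq \mathsf{R}_{CS}(\mathcal{L}^{\mathcal{Q}}) \Big|_{\mathcal{Q} = \mathcal{P}(\mathcal{S})}. \tag{19}$$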
Theorem 1.
The random code capacity of an AVRC $\mathcal{L}$ is bounded by
$$\mathsf{R}^{\star}_{PDF}(\mathcal{L}) \le \mathbb{C}^{\star}(\mathcal{L}) \le \mathsf{R}^{\star}_{CS}(\mathcal{L}). \tag{20}$$
The proof of Theorem 1 is given in Appendix D. To prove Theorem 1 we modify Ahlswede’s RT. A straightforward application of Ahlswede’s RT fails to comply with the strictly causal relay transmission. Essentially, the RT uses a reliable code for the compound channel to construct a random code for the AVC, applying random permutations to the transmitted codeword. However, the relay cannot apply permutations to its transmission, since at time $i \in [1:n]$, the relay cannot compute $f_{1,j}(y_1^{j-1})$, for $j > i$, as the relay encoder only knows the past received symbols $y_{1,1}, \ldots, y_{1,i-1}$, and does not have access to the symbols $y_{1,i}, \ldots, y_{1,j-1}$ which will be received in the future. To resolve this difficulty, we use a block Markov code for the block-compound channel. In a block Markov coding scheme, the relay sends $x_{1,b}^n$ in block $b$, using the sequence of symbols $y_{1,b-1}^n$ received in the previous block. Since the entire sequence $y_{1,b-1}^n$ is known to the relay encoder, permutations can be applied to the transmission in each block separately. Hence, our proof exploits the structure of the original block-compound channel code to construct a random code for the AVRC, as opposed to classical works where the RT is used such that the original code is treated as a “black box” [59].
Remark 3.
Block Markov coding with partial decode-forward is not a simple scheme by itself, and thus, using the RT requires careful attention. In particular, by close inspection of the proof of Theorem 1, one may recognize that the necessity of using the block-compound relay channel, rather than the standard compound channel, stems from the fact that for the AVRC, the state sequences may have completely different types in each block. For each block, we use the RT twice. First, the RT is applied to the probability of the backward decoding error, for the message component which is decoded by the relay. Then, it is applied to the probability of forward decoding error, for the message component which is transmitted directly.
Together with Corollary 1, the theorem above yields another corollary.
Corollary 3.
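Let $\mathcal{L}$ be an AVRC.
1. If $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded,
$$\mathbb{C}^{\star}(\mathcal{L}) = \mathsf{R}^{\star}_{PDF}(\mathcal{L}) = \mathsf{R}^{\star}_{CS}(\mathcal{L}) = \min_{q(s)} \max_{p(x, x_1)} I_q(X; Y | X_1). \tag{21}$$
2. If $W_{Y,Y_1|X,X_1,S}$ is strongly degraded,
$$\mathbb{C}^{\star}(\mathcal{L}) = \mathsf{R}^{\star}_{PDF}(\mathcal{L}) = \mathsf{R}^{\star}_{CS}(\mathcal{L}) = \max_{p(x, x_1)} \min \Big\{ \min_{q(s)} I_q(X, X_1; Y), \; I(X; Y_1 | X_1) \Big\}. \tag{22}$$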
Before we proceed to the deterministic code capacity, we note that Ahlswede’s Elimination Technique [55] can be applied to the AVRC as well. Hence, the size of the code collection of any reliable random code can be reduced to polynomial size.

3.2.2. Deterministic Code Lower and Upper Bounds

In the next statements, we characterize the deterministic code capacity of the AVRC $\mathcal{L}$. We consider conditions under which the deterministic code capacity is positive and coincides with the random code capacity, and conditions under which it is lower. For every $x_1 \in \mathcal{X}_1$, let $\mathcal{W}_1(x_1)$ and $\mathcal{W}(x_1)$ denote the marginal AVCs from the sender to the relay and from the sender to the destination receiver, respectively,
$$\mathcal{W}_1(x_1) = \{ W_{Y_1|X,X_1,S}(\cdot | \cdot, x_1, \cdot) \}, \qquad \mathcal{W}(x_1) = \{ W_{Y|X,X_1,S}(\cdot | \cdot, x_1, \cdot) \}. \tag{23}$$
See Figure 3.
Lemma 2 gives a condition under which the deterministic code capacity is the same as the random code capacity. The condition is given in terms of the marginal AVCs $\mathcal{W}_1(x_1)$ and $\mathcal{W}(x_1)$.
Lemma 2.
If the marginal sender-relay and sender-receiver AVCs have positive capacities, i.e., $\mathbb{C}(\mathcal{W}_1(x_{1,1})) > 0$ and $\mathbb{C}(\mathcal{W}(x_{1,2})) > 0$, for some $x_{1,1}, x_{1,2} \in \mathcal{X}_1$, then the capacity of the AVRC $\mathcal{L}$ is positive, and it coincides with the random code capacity, i.e., $\mathbb{C}(\mathcal{L}) = \mathbb{C}^{\star}(\mathcal{L}) > 0$.
The proof of Lemma 2 is given in Appendix E, extending Ahlswede’s Elimination Technique [55].
Next, we give a computable sufficient condition under which the deterministic code capacity coincides with the random code capacity. For the point-to-point AVC, this occurs if and only if the channel is non-symmetrizable [56,57] (Definition 2). Our condition here is given in terms of an extended definition of symmetrizability, akin to [67] (Definition 3.2).
Definition 3.
A state-dependent relay channel $W_{Y,Y_1|X,X_1,S}$ is said to be symmetrizable-$\mathcal{X}|\mathcal{X}_1$ if for some conditional distribution $J(s|x)$,
$$\sum_{s \in \mathcal{S}} W_{Y,Y_1|X,X_1,S}(y, y_1 | x, x_1, s) J(s | \tilde{x}) = \sum_{s \in \mathcal{S}} W_{Y,Y_1|X,X_1,S}(y, y_1 | \tilde{x}, x_1, s) J(s | x), \quad \forall \, x, \tilde{x} \in \mathcal{X}, \; x_1 \in \mathcal{X}_1, \; y \in \mathcal{Y}, \; y_1 \in \mathcal{Y}_1. \tag{24}$$
Equivalently, for every given $x_1 \in \mathcal{X}_1$, the channel $W_{\bar{Y}|X,X_1,S}(\cdot | \cdot, x_1, \cdot)$ is symmetrizable, where $\bar{Y} = (Y, Y_1)$.
A similar definition applies to the marginals $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$. Note that symmetrizability of each of these marginals can be checked without reference to whether the channel is degraded or strongly degraded.
Corollary 4.
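Let $\mathcal{L}$ be an AVRC.
1. If $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$ are non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then $\mathbb{C}(\mathcal{L}) = \mathbb{C}^{\star}(\mathcal{L}) > 0$. In this case,
$$\mathsf{R}^{\star}_{PDF}(\mathcal{L}) \le \mathbb{C}(\mathcal{L}) \le \mathsf{R}^{\star}_{CS}(\mathcal{L}). \tag{25}$$
2. If $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded, where $W_{Y_1|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then
$$\mathbb{C}(\mathcal{L}) = \mathbb{C}^{\star}(\mathcal{L}) = \mathsf{R}^{\star}_{PDF}(\mathcal{L}) = \mathsf{R}^{\star}_{CS}(\mathcal{L}) = \min_{q(s)} \max_{p(x, x_1)} I_q(X; Y | X_1). \tag{26}$$
3. If $W_{Y,Y_1|X,X_1,S}$ is strongly degraded, where $W_{Y|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$ and $W_{Y_1|X,X_1}(y_1 | x, x_1) \neq W_{Y_1|X,X_1}(y_1 | \tilde{x}, x_1)$ for some $x, \tilde{x} \in \mathcal{X}$, $x_1 \in \mathcal{X}_1$ and $y_1 \in \mathcal{Y}_1$, then
$$\mathbb{C}(\mathcal{L}) = \mathbb{C}^{\star}(\mathcal{L}) = \mathsf{R}^{\star}_{PDF}(\mathcal{L}) = \mathsf{R}^{\star}_{CS}(\mathcal{L}) = \max_{p(x, x_1)} \min \Big\{ \min_{q(s)} I_q(X, X_1; Y), \; I(X; Y_1 | X_1) \Big\}. \tag{27}$$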
The proof of Corollary 4 is given in Appendix F.
Remark 4.
By Corollary 4, we have that non-symmetrizability of the marginal AVCs, $\mathcal{W}_1(x_{1,1})$ and $\mathcal{W}(x_{1,2})$, for some $x_{1,1}, x_{1,2} \in \mathcal{X}_1$, is a sufficient condition for positive capacity (see Figure 3). This raises the question whether it is a necessary condition as well. In other words: If $\mathcal{W}_1(x_1)$ and $\mathcal{W}(x_1)$ are symmetrizable for all $x_1 \in \mathcal{X}_1$, does that necessarily imply that the capacity is zero? The answer is no. We show this using a very simple example. Suppose that $Y_1 = S$ and $Y = (X_1, X + S)$, where all variables are binary. It is readily seen that for both $Y_1$ and $Y$, the input and the state are symmetric, for every given $X_1 = x_1$. Hence, $\mathcal{W}_1(x_1)$ and $\mathcal{W}(x_1)$ are symmetrizable for all $x_1 \in \mathcal{X}_1$. Nevertheless, we note that since the relay can send $X_1 = Y_1 = S$, this is equivalent to an AVC with state information at the decoder. As the decoder can use $X_1$ to eliminate the state, the capacity of this AVRC is $\mathbb{C}(\mathcal{L}) = 1$. In Lemma 3 below, we give a stronger condition which is a necessary condition for positive capacity.
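The example in Remark 4 can be checked by brute force. The following Python sketch is illustrative only and rests on assumptions not stated in the text: the sum $X + S$ is taken modulo 2, the symmetrizing distribution is chosen as the uniform $J(s|x) = 1/2$, and the relay respects strict causality by forwarding the state it observed in the previous time slot, so that $n - 1$ message bits are recovered over $n$ channel uses. All function names are hypothetical.

```python
from itertools import product

BITS = (0, 1)

def is_symmetrized(W, J, inputs, states, outputs):
    """Check sum_s W(y|x,s) J(s|x~) == sum_s W(y|x~,s) J(s|x) for all x, x~, y."""
    for x, xt in product(inputs, repeat=2):
        for y in outputs:
            lhs = sum(W(y, x, s) * J(s, xt) for s in states)
            rhs = sum(W(y, xt, s) * J(s, x) for s in states)
            if abs(lhs - rhs) > 1e-12:
                return False
    return True

def uniform_J(s, x):
    # Assumed symmetrizing distribution: the state is uniform, whatever x is.
    return 0.5

def relay_marginal(y1, x, s):
    # Sender-relay marginal of the example: Y1 = S.
    return 1.0 if y1 == s else 0.0

def make_receiver_marginal(x1):
    # Sender-receiver marginal for a fixed relay symbol x1: Y = (x1, X xor S).
    def W(y, x, s):
        return 1.0 if y == (x1, x ^ s) else 0.0
    return W

def transmit(bits, states):
    """Send len(states) - 1 message bits; the last channel input is a dummy 0."""
    x = list(bits) + [0]
    assert len(x) == len(states)
    x1 = 0  # the relay has observed nothing before the first time slot
    outputs = []
    for xi, si in zip(x, states):
        outputs.append((x1, xi ^ si))
        x1 = si  # the relay receives y1 = s only after transmitting
    return outputs

def decode(outputs):
    # The state of slot i is revealed by the relay symbol of slot i + 1.
    return [outputs[i][1] ^ outputs[i + 1][0] for i in range(len(outputs) - 1)]
```

Under these assumptions both marginals pass the symmetrizability check, while `decode(transmit(bits, states))` returns the message bits for every state sequence, which is consistent with a rate of $(n-1)/n$ approaching 1.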
Remark 5.
Note that there are four symmetrizability cases in terms of the sender-relay channel $W_{Y_1|X,X_1,S}$ and the sender-receiver channel $W_{Y|X,X_1,S}$. For the case where $W_{Y_1|X,X_1,S}$ and $W_{Y|X,X_1,S}$ are both non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, the lemma above asserts that the capacity coincides with the random code capacity. In other cases, one may expect the capacity to be lower than the random code capacity. For instance, if $W_{Y|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, while $W_{Y_1|X,X_1,S}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then the capacity is positive by direct transmission. Furthermore, in this case, if the channel is reversely degraded, then the capacity coincides with the random code capacity. However, it remains in question whether this is true in general, when the channel is not reversely degraded.
Next, we consider conditions under which the capacity is zero. Observe that if $W_{Y,Y_1|X,X_1,S}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$ then so are $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$. Intuitively, if the AVRC is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then it is a poor channel. For example, say $Y_1 = X + X_1 + S$ and $Y = X \cdot X_1 \cdot S$, with $\mathcal{S} = \mathcal{X}$. Then, the jammer can confuse the decoder by taking the state sequence $S^n$ to be some codeword. The following lemma validates this intuition.
Lemma 3.
If the AVRC $\mathcal{L}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then it has zero capacity, i.e., $\mathbb{C}(\mathcal{L}) = 0$. Equivalently, non-symmetrizability-$\mathcal{X}|\mathcal{X}_1$ of the AVRC $\mathcal{L}$ is a necessary condition for positive capacity.
Lemma 3 is proved in Appendix G, using an extended version of Ericson’s technique [56]. For a strongly degraded AVRC, we have a simpler symmetrizability condition under which the capacity is zero.
Definition 4.
Let $W_{Y,Y_1|X,X_1,S} = W_{Y_1|X,X_1} W_{Y|Y_1,X_1,S}$ be a strongly degraded relay channel. We say that $W_{Y,Y_1|X,X_1,S}$ is symmetrizable-$\mathcal{X}_1 \times \mathcal{Y}_1$ if for some conditional distribution $J(s|x_1,y_1)$,
$$\sum_{s \in \mathcal{S}} W_{Y|Y_1,X_1,S}(y|y_1,x_1,s)\, J(s|\tilde{x}_1,\tilde{y}_1) = \sum_{s \in \mathcal{S}} W_{Y|Y_1,X_1,S}(y|\tilde{y}_1,\tilde{x}_1,s)\, J(s|x_1,y_1), \quad \forall\, \tilde{x}_1, x_1 \in \mathcal{X}_1,\ y \in \mathcal{Y},\ y_1, \tilde{y}_1 \in \mathcal{Y}_1. \tag{28}$$
Equivalently, the channel $W_{Y|\bar{Y}_1,S}$ is symmetrizable, where $\bar{Y}_1 = (Y_1, X_1)$.
Lemma 4.
If the AVRC $\mathcal{L}$ is strongly degraded and symmetrizable-$\mathcal{X}_1 \times \mathcal{Y}_1$, then it has zero capacity, i.e., $C(\mathcal{L}) = 0$.
Lemma 4 is proved in Appendix H. An example is given below.
Example 1.
Consider a state-dependent relay channel $W_{Y,Y_1|X,X_1,S}$, specified by
$$Y_1 = X + Z \mod 2, \qquad Y = X_1 + S,$$
where $\mathcal{X} = \mathcal{X}_1 = \mathcal{Z} = \mathcal{S} = \mathcal{Y}_1 = \{0,1\}$ and $\mathcal{Y} = \{0,1,2\}$, and the additive noise is distributed according to $Z \sim \text{Bernoulli}(\theta)$, $0 \leq \theta \leq 1$. It is readily seen that $W_{Y,Y_1|X,X_1,S}$ is strongly degraded and symmetrizable-$\mathcal{X}_1 \times \mathcal{Y}_1$, by (2) and (28). In particular, (28) is satisfied with $J(s|x_1,y_1) = 1$ for $s = x_1$, and $J(s|x_1,y_1) = 0$ otherwise. Hence, by Lemma 4, the capacity is $C(\mathcal{L}) = 0$. On the other hand, we show that the random code capacity is given by $C^{\star}(\mathcal{L}) = \min\left\{ \frac{1}{2},\, 1 - h(\theta) \right\}$, using Corollary 3. The derivation of the random code capacity is given in Appendix I.
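The symmetrizability claim in Example 1 can be checked by brute force. The sketch below assumes the condition of Definition 4 as it is stated in this section, with the receiver channel $Y = X_1 + S$ (integer sum) and the jammer strategy $J(s|x_1,y_1) = 1$ for $s = x_1$; the helper names are hypothetical and $h(\cdot)$ is taken to be the binary entropy function in bits. It also evaluates the stated random code capacity formula, without deriving it.

```python
import math
from itertools import product

X1_ALPHABET = (0, 1)
Y1_ALPHABET = (0, 1)
S_ALPHABET = (0, 1)
Y_ALPHABET = (0, 1, 2)


def receiver_channel(y, y1, x1, s):
    """W_{Y|Y1,X1,S} for Example 1: Y = X1 + S, regardless of Y1."""
    return 1.0 if y == x1 + s else 0.0


def copy_relay_symbol(s, x1, y1):
    """Jammer strategy J(s|x1,y1) from Example 1: the state equals x1."""
    return 1.0 if s == x1 else 0.0


def is_symmetrizing(channel, jammer):
    """Check the symmetrizability condition of Definition 4 for a given J."""
    for y, x1, x1_t, y1, y1_t in product(
        Y_ALPHABET, X1_ALPHABET, X1_ALPHABET, Y1_ALPHABET, Y1_ALPHABET
    ):
        lhs = sum(channel(y, y1, x1, s) * jammer(s, x1_t, y1_t) for s in S_ALPHABET)
        rhs = sum(channel(y, y1_t, x1_t, s) * jammer(s, x1, y1) for s in S_ALPHABET)
        if abs(lhs - rhs) > 1e-12:
            return False
    return True


def binary_entropy(theta):
    if theta in (0.0, 1.0):
        return 0.0
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)


def random_code_capacity(theta):
    """Formula stated in Example 1: min{1/2, 1 - h(theta)}."""
    return min(0.5, 1.0 - binary_entropy(theta))


print(is_symmetrizing(receiver_channel, copy_relay_symbol))
print(random_code_capacity(0.1))
```

A constant jammer such as $J(s|x_1,y_1) = 1$ for $s = 0$ fails the same check, which shows that the test is not vacuous.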

3.3. AVRC with Orthogonal Sender Components

Consider the special case of a relay channel $W_{Y,Y_1|X,X_1,S}$ with orthogonal sender components [5], [3] (Section 16.6.2), where $X = (X', X'')$ and
$$W_{Y,Y_1|X',X'',X_1,S}(y,y_1|x',x'',x_1,s) = W_{Y|X',X_1,S}(y|x',x_1,s) \cdot W_{Y_1|X'',X_1,S}(y_1|x'',x_1,s).$$
Here, we address the case where the channel output depends on the state only through the relay, i.e., $W_{Y|X',X_1,S}(y|x',x_1,s) = W_{Y|X',X_1}(y|x',x_1)$.
Lemma 5.
Let $\mathcal{L} = \{ W_{Y|X',X_1} W_{Y_1|X'',X_1,S} \}$ be an AVRC with orthogonal sender components. The random code capacity of $\mathcal{L}$ is given by
$$C^{\star}(\mathcal{L}) = R_{PDF}(\mathcal{L}) = R_{CS}(\mathcal{L}) = \max_{p(x_1)\, p(x'|x_1)\, p(x''|x_1)} \min\left\{ I(X',X_1;Y),\ \min_{q(s)} I_q(X'';Y_1|X_1) + I(X';Y|X_1) \right\}.$$
If $W_{Y_1|X'',X_1,S}$ is non-symmetrizable-$\mathcal{X}''|\mathcal{X}_1$, and $W_{Y|X',X_1}(y|x',x_1) \neq W_{Y|X',X_1}(y|\tilde{x}',x_1)$ for some $x_1 \in \mathcal{X}_1$, $x', \tilde{x}' \in \mathcal{X}'$, $y \in \mathcal{Y}$, then the deterministic code capacity is given by $C(\mathcal{L}) = R_{PDF}(\mathcal{L}) = R_{CS}(\mathcal{L})$.
The proof of Lemma 5 is given in Appendix J. To prove Lemma 5, we apply the methods of [5] to our results. Specifically, we use the partial decode-forward lower bound in Theorem 1, taking $U = X''$ (see (9) and (19)).

4. Gaussian AVRC with Sender Frequency Division

We give extended results for the Gaussian AVRC with sender frequency division (SFD), which is a special case of the AVRC with orthogonal sender components [5]. We determine the random code capacity of the Gaussian AVRC with SFD, and give lower and upper bounds on the deterministic code capacity. The derivation of the deterministic code bounds is mostly independent of our previous results, and it is based on the technique of [87]. The Gaussian relay channel $W_{Y,Y_1|X,X_1,S}$ with SFD is specified by
$$Y_1 = X'' + Z, \qquad Y = X' + X_1 + S,$$
where the Gaussian additive noise $Z \sim \mathcal{N}(0,\sigma^2)$ is independent of the channel state. As opposed to Lemma 5, the main channel here depends on the state, while the channel to the relay does not. In the case of a Gaussian channel, power limitations need to be accounted for, and thus, we consider the Gaussian relay channel under input and state constraints. Specifically, the transmissions of the user and the relay are subject to input constraints $\Omega > 0$ and $\Omega_1 > 0$, respectively, and the jammer is under a state constraint $\Lambda$, i.e.,
$$\frac{1}{n} \sum_{i=1}^{n} \left( (X'_i)^2 + (X''_i)^2 \right) \leq \Omega, \quad \frac{1}{n} \sum_{i=1}^{n} X_{1,i}^2 \leq \Omega_1 \quad \text{w.p. } 1, \qquad \frac{1}{n} \sum_{i=1}^{n} S_i^2 \leq \Lambda \quad \text{w.p. } 1.$$
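As a concrete reading of the constraints above, the following sketch checks the per-sequence average power of a given set of length-$n$ sequences. The function name and argument names are hypothetical, chosen only for illustration.

```python
def satisfies_constraints(x_prime, x_dprime, x_relay, state, omega, omega1, lam):
    """Check the input and state constraints on concrete sequences.

    x_prime, x_dprime: the two sender components X' and X''.
    x_relay: the relay transmission X1.  state: the jammer sequence S.
    """
    n = len(state)
    sender_power = sum(a * a + b * b for a, b in zip(x_prime, x_dprime)) / n
    relay_power = sum(v * v for v in x_relay) / n
    state_power = sum(v * v for v in state) / n
    return sender_power <= omega and relay_power <= omega1 and state_power <= lam


print(satisfies_constraints([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5],
                            omega=1.0, omega1=1.0, lam=0.25))
```

Note that the sender constraint is on the total power of both components, so power can be shifted freely between $X'$ and $X''$.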
We note that Ahlswede’s Elimination Technique cannot be used under a state constraint (see [57]). Indeed, if the jammer concentrates a lot of power on the shared randomness transmission, then this transmission needs to be robust against a state constraint that is higher than $\Lambda$. Thereby, the results given in Section 3.2.2 do not apply to the Gaussian AVRC under input and state constraints.
For the compound relay channel, the state constraint is in the average sense. That is, we say that the Gaussian compound relay channel $\mathcal{L}^{\mathcal{Q}}$ with SFD is under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$ if
$$\frac{1}{n} \sum_{i=1}^{n} \left( (X'_i)^2 + (X''_i)^2 \right) \leq \Omega, \quad \frac{1}{n} \sum_{i=1}^{n} X_{1,i}^2 \leq \Omega_1 \quad \text{w.p. } 1, \qquad \mathcal{Q} = \{ q(s) \,:\, \mathbb{E} S^2 \leq \Lambda \}.$$
Coding definitions and notation are as follows. The definition of a code is similar to that of Section 2.3. The encoding function is denoted by $\mathbf{f} = (\mathbf{f}', \mathbf{f}'')$, with $\mathbf{f}' : [1:2^{nR}] \to \mathbb{R}^n$ and $\mathbf{f}'' : [1:2^{nR}] \to \mathbb{R}^n$, and the relay encoding function is denoted by $\mathbf{f}_1 : \mathbb{R}^n \to \mathbb{R}^n$, where $f_{1,i} : \mathbb{R}^{i-1} \to \mathbb{R}$, for $i \in [1:n]$. The boldface notation indicates that the encoding functions produce sequences. Here, the encoder and the relay satisfy the input constraints $\|\mathbf{f}'(m)\|^2 + \|\mathbf{f}''(m)\|^2 \leq n\Omega$ and $\|\mathbf{f}_1(\mathbf{y}_1)\|^2 \leq n\Omega_1$ for all $m \in [1:2^{nR}]$ and $\mathbf{y}_1 \in \mathbb{R}^n$. At time $i \in [1:n]$, given a message $m \in [1:2^{nR}]$, the encoder transmits $(x'_i, x''_i) = (f'_i(m), f''_i(m))$, and the relay transmits $x_{1,i} = f_{1,i}(y_{1,1}, \ldots, y_{1,i-1})$. The decoder receives the output sequence $\mathbf{y}$, and finds an estimate $\hat{m} = g(\mathbf{y})$. A $(2^{nR}, n, \varepsilon)$ code $\mathcal{C}$ for the Gaussian AVRC satisfies $P^{(n)}_{e|\mathbf{s}}(\mathcal{C}) \leq \varepsilon$, for all $\mathbf{s} \in \mathbb{R}^n$ with $\|\mathbf{s}\|^2 \leq n\Lambda$, where
$$P^{(n)}_{e|\mathbf{s}}(\mathcal{C}) = \frac{1}{2^{nR}} \sum_{m=1}^{2^{nR}} \int_{\mathcal{D}(m,\mathbf{s})^c} \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-\|\mathbf{z}\|^2 / 2\sigma^2}\, d\mathbf{z},$$
with
$$\mathcal{D}(m,\mathbf{s}) = \left\{ \mathbf{z} \in \mathbb{R}^n \,:\, g\Big( \mathbf{f}'(m) + \mathbf{f}_1\big( \mathbf{f}''(m) + \mathbf{z} \big) + \mathbf{s} \Big) = m \right\}.$$
Achievable rates, deterministic code capacity and random code capacity are defined as before. Next, we give our results on the Gaussian compound relay channel and the Gaussian AVRC with SFD.

5. Main Results—Gaussian AVRC with SFD

We give our results on the Gaussian compound relay channel and the Gaussian AVRC with SFD. The results on this compound relay channel and on the random code capacity of this AVRC are obtained through a straightforward extension of our previous results and derivations. However, the derivation of the deterministic code bounds is mostly independent of our previous results, and it is based on modifying the technique by Csiszár and Narayan in their paper on the Gaussian AVC [87].

5.1. Gaussian Compound Relay Channel

We determine the capacity of the Gaussian compound relay channel with SFD under input and state constraints. Let
$$F_G(\alpha,\rho) \triangleq \min\left\{ \frac{1}{2} \log\left( 1 + \frac{\Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega\,\Omega_1}}{\Lambda} \right),\ \frac{1}{2} \log\left( 1 + \frac{(1-\alpha)\Omega}{\sigma^2} \right) + \frac{1}{2} \log\left( 1 + \frac{(1-\rho^2)\alpha\Omega}{\Lambda} \right) \right\}.$$
Lemma 6.
The capacity of the Gaussian compound relay channel with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is given by
$$C(\mathcal{L}^{\mathcal{Q}}) = \max_{0 \leq \alpha, \rho \leq 1} F_G(\alpha,\rho),$$
and it is identical to the random code capacity, i.e., $C^{\star}(\mathcal{L}^{\mathcal{Q}}) = C(\mathcal{L}^{\mathcal{Q}})$.
The proof of Lemma 6 is given in Appendix K, based on our results in the previous sections. The parameter $0 \leq \alpha \leq 1$ represents the fraction of input power invested in the transmission of the message component which is decoded by the relay, in the partial decode-forward coding scheme. Specifically, in the achievability proof in [5], $\alpha\Omega$ and $(1-\alpha)\Omega$ are the variances of $X'$ and $X''$, respectively. The parameter $\rho$ stands for the correlation coefficient between the decode-forward transmission $X'$ and the relay transmission $X_1$.
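To get a feel for the formula in Lemma 6, the sketch below evaluates $F_G(\alpha,\rho)$ and maximizes it over a uniform grid. It is a numerical illustration only: the logarithm is assumed to be base 2 (rates in bits), the grid search is an approximation of the exact maximization, and the function names are hypothetical.

```python
import math


def F_G(alpha, rho, omega, omega1, lam, sigma2):
    """The function F_G(alpha, rho) of Section 5.1, with log base 2 assumed."""
    cooperative = 0.5 * math.log2(
        1 + (omega1 + alpha * omega + 2 * rho * math.sqrt(alpha * omega * omega1)) / lam
    )
    to_relay = 0.5 * math.log2(1 + (1 - alpha) * omega / sigma2)
    direct = 0.5 * math.log2(1 + (1 - rho ** 2) * alpha * omega / lam)
    return min(cooperative, to_relay + direct)


def compound_capacity(omega, omega1, lam, sigma2, steps=200):
    """Approximate max over 0 <= alpha, rho <= 1 by a grid search.

    Returns (value, alpha, rho) for the best grid point.
    """
    best = (0.0, 0.0, 0.0)
    for i in range(steps + 1):
        for j in range(steps + 1):
            alpha, rho = i / steps, j / steps
            value = F_G(alpha, rho, omega, omega1, lam, sigma2)
            if value > best[0]:
                best = (value, alpha, rho)
    return best


print(compound_capacity(omega=2.0, omega1=2.0, lam=1.0, sigma2=0.5))
```

At $(\alpha,\rho) = (1,0)$ all sender power goes to $X'$ and the second term reduces to $\frac{1}{2}\log(1 + \Omega/\Lambda)$, which is the direct transmission rate.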

5.2. Gaussian AVRC

We determine the random code capacity of the Gaussian AVRC with SFD under constraints.
Theorem 2.
The random code capacity of the Gaussian AVRC with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is given by
$$C^{\star}(\mathcal{L}) = C(\mathcal{L}^{\mathcal{Q}}) = \max_{0 \leq \alpha, \rho \leq 1} F_G(\alpha,\rho).$$
The proof of Theorem 2 is given in Appendix L. The proof follows the same considerations as in our previous results.
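Next, we give lower and upper bounds on the deterministic code capacity of the Gaussian AVRC with SFD under constraints, obtained by generalizing the non-standard techniques by Csiszár and Narayan in their 1991 paper on the Gaussian AVC [87]. Define
$$R_{G,low}(\mathcal{L}) \triangleq \max F_G(\alpha,\rho) \quad \text{subject to} \quad 0 \leq \alpha, \rho \leq 1, \quad (1-\rho^2)\alpha\Omega > \Lambda, \quad \frac{\Omega_1}{\Omega} \left( \sqrt{\Omega_1} + \rho\sqrt{\alpha\Omega} \right)^2 > \Lambda + (1-\rho^2)\alpha\Omega, \tag{39}$$
and
$$R_{G,up}(\mathcal{L}) \triangleq \max F_G(\alpha,\rho) \quad \text{subject to} \quad 0 \leq \alpha, \rho \leq 1, \quad \Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega \cdot \Omega_1} \geq \Lambda. \tag{40}$$
It can be seen that $R_{G,low} \leq R_{G,up}$, since
$$\Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega \cdot \Omega_1} = (1-\rho^2)\alpha\Omega + \left( \sqrt{\Omega_1} + \rho\sqrt{\alpha\Omega} \right)^2 \geq (1-\rho^2)\alpha\Omega.$$
The analysis is based on the following lemma by [87].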
Lemma 7
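(see [87] (Lemma 1)). For every $\varepsilon > 0$, $8\sqrt{\varepsilon} < \eta < 1$, $K > 2\varepsilon$, and $M = 2^{nR}$, with $2\varepsilon \leq R \leq K$, and $n \geq n_0(\varepsilon,\eta,K)$, there exist $M$ unit vectors $\mathbf{a}(m) \in \mathbb{R}^n$, $m \in [1:M]$, such that for every unit vector $\mathbf{c} \in \mathbb{R}^n$ and $0 \leq \theta, \zeta \leq 1$,
$$\left| \left\{ \tilde{m} \in [1:M] \,:\, \langle \mathbf{a}(\tilde{m}), \mathbf{c} \rangle \geq \theta \right\} \right| \leq 2^{n\left( \left[ R + \frac{1}{2}\log(1-\theta^2) \right]_+ + \varepsilon \right)},$$
and if $\theta \geq \eta$ and $\theta^2 + \zeta^2 > 1 + \eta - 2^{-2R}$, then
$$\frac{1}{M} \left| \left\{ m \in [1:M] \,:\, |\langle \mathbf{a}(\tilde{m}), \mathbf{a}(m) \rangle| \geq \theta,\ |\langle \mathbf{a}(\tilde{m}), \mathbf{c} \rangle| \geq \zeta, \text{ for some } \tilde{m} \neq m \right\} \right| \leq 2^{-n\varepsilon},$$
where $[t]_+ = \max\{0,t\}$ and $\langle \cdot,\cdot \rangle$ denotes the inner product.
Intuitively, the lemma states that under certain conditions, a codebook can be constructed with an exponentially small fraction of “bad” messages, for which the codewords are non-orthogonal to each other and to the state sequence.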
Theorem 3.
The deterministic code capacity of the Gaussian AVRC with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is bounded by
$$R_{G,low}(\mathcal{L}) \leq C(\mathcal{L}) \leq R_{G,up}(\mathcal{L}).$$
The proof of Theorem 3 is given in Appendix M.
Remark 6.
Csiszár and Narayan [87] have shown that for the classical Gaussian AVC, reliable decoding is guaranteed when the input constraint $\Omega$ is larger than the state constraint $\Lambda$. Here, we use a partial decode-forward coding scheme, where the message has two components, one which is decoded by the relay, and the other is transmitted directly. The respective optimization constraints $\frac{\Omega_1}{\Omega} \left( \sqrt{\Omega_1} + \rho\sqrt{\alpha\Omega} \right)^2 > \Lambda + (1-\rho^2)\alpha\Omega$ and $(1-\rho^2)\alpha\Omega > \Lambda$ in the RHS of (39) guarantee reliability for each decoding step.
Remark 7.
Csiszár and Narayan [87] have further shown that for the classical Gaussian AVC, if $\Omega \leq \Lambda$, the capacity is zero. The converse proof in [87] follows by considering a jammer who chooses the state sequence to be a codeword. Due to the symmetry between $\mathbf{X}$ and $\mathbf{S}$, the decoder cannot distinguish between the transmitted codeword and the impostor sent by the jammer. Here, we consider a jammer who simulates $\mathbf{X}' + \mathbf{X}_1$. Specifically, the jammer draws a codeword $\mathbf{X}'' = \mathbf{f}''(\tilde{m})$ uniformly at random, and then generates a sequence $\tilde{\mathbf{Y}}_1$ distributed according to the conditional distribution $P_{\mathbf{Y}_1|M=\tilde{m}}$. If the sequence $\tilde{\mathbf{S}} = \mathbf{f}'(\tilde{m}) + \mathbf{f}_1(\tilde{\mathbf{Y}}_1)$ satisfies the state constraint $\Lambda$, then the jammer chooses $\tilde{\mathbf{S}}$ as the state sequence. Defining $\alpha\Omega$, $\Omega_1$, and $\rho$ as the empirical decode-forward transmission power, relay transmission power, and their correlation coefficient, respectively, we have that the state constraint $\|\tilde{\mathbf{S}}\|^2 \leq n\Lambda$ holds with high probability, if $\Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega \cdot \Omega_1} < \Lambda$. The details are in Appendix M.
Figure 4 depicts the bounds on the capacity of the Gaussian AVRC with SFD under input and state constraints, as a function of the input constraint $\Omega = \Omega_1$, under state constraint $\Lambda = 1$ and $\sigma^2 = 0.5$. The top dashed line depicts the random code capacity of the Gaussian AVRC. The solid lines depict the deterministic code lower and upper bounds $R_{G,low}(\mathcal{L})$ and $R_{G,up}(\mathcal{L})$. For low values, $\Omega < \frac{\Lambda}{4} = 0.25$, we have that $R_{G,up}(\mathcal{L}) = 0$, hence the deterministic code capacity is zero, and it is strictly lower than the random code capacity. The dotted lower line depicts the direct transmission lower bound, which is $F_G(1,0)$ for $\Omega > \Lambda$, and zero otherwise [57]. For intermediate values of $\Omega$, direct transmission is better than the lower bound in Theorem 3. For high values of $\Omega$, on the other hand, the optimization constraints in (39) and (40) are inactive, hence our bounds are tight, and the capacity coincides with the random code capacity, i.e., $C(\mathcal{L}) = C^{\star}(\mathcal{L}) = R_{G,low}(\mathcal{L}) = R_{G,up}(\mathcal{L})$.
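The three curves discussed for Figure 4 can be approximated numerically. The sketch below is an illustration under stated assumptions, not the computation used for the figure: it uses a uniform grid over $(\alpha,\rho)$, a base-2 logarithm, and the convention that a bound equals zero when its feasible set is empty; the lower-bound constraints are taken as $(1-\rho^2)\alpha\Omega > \Lambda$ and $\frac{\Omega_1}{\Omega}(\sqrt{\Omega_1} + \rho\sqrt{\alpha\Omega})^2 > \Lambda + (1-\rho^2)\alpha\Omega$, and the upper-bound constraint as $\Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega\Omega_1} \geq \Lambda$. The function names are hypothetical.

```python
import math


def F_G(alpha, rho, omega, omega1, lam, sigma2):
    """F_G(alpha, rho) of Section 5.1, with log base 2 assumed."""
    cooperative = 0.5 * math.log2(
        1 + (omega1 + alpha * omega + 2 * rho * math.sqrt(alpha * omega * omega1)) / lam
    )
    to_relay = 0.5 * math.log2(1 + (1 - alpha) * omega / sigma2)
    direct = 0.5 * math.log2(1 + (1 - rho ** 2) * alpha * omega / lam)
    return min(cooperative, to_relay + direct)


def gaussian_avrc_bounds(omega, omega1, lam, sigma2, steps=100):
    """Grid approximations of (R_G_low, R_G_up, random code capacity)."""
    r_low = r_up = c_star = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            alpha, rho = i / steps, j / steps
            value = F_G(alpha, rho, omega, omega1, lam, sigma2)
            c_star = max(c_star, value)
            total = omega1 + alpha * omega + 2 * rho * math.sqrt(alpha * omega * omega1)
            fresh = (1 - rho * rho) * alpha * omega
            coherent = (math.sqrt(omega1) + rho * math.sqrt(alpha * omega)) ** 2
            if total >= lam:                       # constraint of the upper bound
                r_up = max(r_up, value)
            if fresh > lam and (omega1 / omega) * coherent > lam + fresh:
                r_low = max(r_low, value)          # constraints of the lower bound
    return r_low, r_up, c_star


for omega in (0.2, 2.0, 100.0):
    print(omega, gaussian_avrc_bounds(omega, omega, lam=1.0, sigma2=0.5))
```

With $\Omega = \Omega_1 = 0.2$ and $\Lambda = 1$, the largest possible value of $\Omega_1 + \alpha\Omega + 2\rho\sqrt{\alpha\Omega\Omega_1}$ is $4\Omega = 0.8 < \Lambda$, so the upper bound vanishes while the random code value stays positive, in line with the threshold $\Omega < \Lambda/4$ mentioned above.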

6. The Primitive AVRC

In this section, we give our results on the primitive AVRC [2], and then consider the Gaussian case. Part of the motivation given in [2] to consider the primitive relay channel was that the overall behavior and properties are the same as those of the non-primitive (“regular”) relay channel. We show that this is not true in the arbitrarily varying scenario. In particular, the behavior of the primitive Gaussian AVRC with SFD is different compared to the non-primitive counterpart considered above.

6.1. Definitions and Notation

Consider a setup where the sender transmits information over a state-dependent memoryless relay channel $W_{Y,Y_1|X,S}$, while there is a noiseless link of capacity $C_1 > 0$ between the relay and the receiver. Communication over a primitive relay channel is depicted in Figure 5. Given a message $M \in [1:2^{nR}]$, the encoder transmits $X^n = f(M)$ over the channel $W_{Y,Y_1|X,S}$, which is referred to as the primitive relay channel. The relay receives $Y_1^n$ and sends an index $L = f_1(Y_1^n)$ to the receiver, where $f_1 : \mathcal{Y}_1^n \to [1:2^{nC_1}]$. The decoder receives both the channel output sequence $Y^n$ and the relay output $L$, and finds an estimate of the message $\hat{M} = g(Y^n, L)$. In accordance with the previous definitions, the primitive AVRC $\mathcal{L}^{\text{prim}} = \{ W_{Y,Y_1|X,S} \}$ has a state sequence of unknown distribution, not necessarily independent nor stationary. The deterministic code capacity and the random code capacity are defined as before, and denoted by $C(\mathcal{L}^{\text{prim}})$ and $C^{\star}(\mathcal{L}^{\text{prim}})$, respectively.

6.2. Main Results—Primitive AVRC

We give our results on the primitive AVRC below. However, since the proofs are based on the same arguments as given for the non-primitive AVRC, we omit the proofs of the results in this section. The details are given in [91].
Using similar arguments to those given for the non-primitive relay channel, we obtain the following bounds on the random code capacity,
$$R_{CS} \triangleq \min_{q(s)} \max_{p(x)} \min\left\{ I_q(X;Y) + C_1,\ I_q(X;Y,Y_1) \right\}, \tag{45}$$
and
$$R_{PDF} \triangleq \max_{p(u,x)} \min\left\{ \min_{q(s)} I_q(U;Y) + \min_{q(s)} I_q(X;Y|U) + C_1,\ \min_{q(s)} I_q(U;Y_1) + \min_{q(s)} I_q(X;Y|U) \right\}. \tag{46}$$
Theorem 4.
The random code capacity of a primitive AVRC $\mathcal{L}^{\text{prim}}$ is bounded by
$$R_{PDF} \leq C^{\star}(\mathcal{L}^{\text{prim}}) \leq R_{CS}.$$
Those bounds have the same form as the cutset upper bound and the partial decode-forward lower bound in Section 3 (cf. (8), (9) and (45), (46)). As in Section 3, we can use the bounds above to determine the capacity in the strongly degraded and reversely degraded cases, based on the direct transmission lower bound (for $U = \emptyset$), and the full decode-forward lower bound (for $U = X$).
Corollary 5.
Let $\mathcal{L}^{\text{prim}}$ be a primitive AVRC.
1. 
If $W_{Y,Y_1|X,S}$ is strongly reversely degraded, i.e., $W_{Y,Y_1|X,S} = W_{Y|X,S} W_{Y_1|Y}$, then
$$C^{\star}(\mathcal{L}^{\text{prim}}) = \min_{q(s)} \max_{p(x)} I_q(X;Y).$$
2. 
If $W_{Y,Y_1|X,S}$ is strongly degraded, i.e., $W_{Y,Y_1|X,S} = W_{Y_1|X} W_{Y|Y_1,S}$, then
$$C^{\star}(\mathcal{L}^{\text{prim}}) = \max_{p(x)} \min\left\{ \min_{q(s)} I_q(X;Y) + C_1,\ I(X;Y_1) \right\}.$$
As for the deterministic code capacity, we give the following theorem.
Theorem 5.
Let $\mathcal{L}^{\text{prim}}$ be a primitive AVRC.
1. 
If $W_{Y_1|X,S}$ is non-symmetrizable, then $C(\mathcal{L}^{\text{prim}}) = C^{\star}(\mathcal{L}^{\text{prim}})$. In this case,
$$R_{PDF} \leq C(\mathcal{L}^{\text{prim}}) \leq R_{CS}.$$
2. 
If $W_{Y,Y_1|X,S}$ is strongly reversely degraded, where $W_{Y_1|X,S}$ is non-symmetrizable, then
$$C(\mathcal{L}^{\text{prim}}) = \min_{q(s)} \max_{p(x)} I_q(X;Y).$$
3. 
If $W_{Y,Y_1|X,S}$ is strongly degraded, such that $W_{Y_1|X}(y_1|x) \neq W_{Y_1|X}(y_1|\tilde{x})$ for some $x, \tilde{x} \in \mathcal{X}$, $y_1 \in \mathcal{Y}_1$, then
$$C(\mathcal{L}^{\text{prim}}) = \max_{p(x)} \min\left\{ \min_{q(s)} I_q(X;Y) + C_1,\ I(X;Y_1) \right\}.$$
4. 
If $W_{\tilde{Y}|X,S}$ is symmetrizable, where $\tilde{Y} = (Y, Y_1)$, then $C(\mathcal{L}^{\text{prim}}) = 0$.
The proof of Theorem 5 is available in [91]. To illustrate our results, we give the following example of a primitive AVRC.
Example 2.
Consider a state-dependent primitive relay channel $W_{Y,Y_1|X,S}$, specified by
$$Y_1 = X(1-S), \qquad Y = X + S,$$
where $\mathcal{X} = \mathcal{S} = \mathcal{Y}_1 = \{0,1\}$, $\mathcal{Y} = \{0,1,2\}$, and $C_1 = 1$, i.e., the link between the relay and the receiver is a noiseless bit pipe. It can be seen that both the sender-relay and the sender-receiver marginals are symmetrizable. Indeed, $W_{Y|X,S}$ satisfies
$$\sum_{s \in \mathcal{S}} W_{Y|X,S}(y|x,s)\, J(s|\tilde{x}) = \sum_{s \in \mathcal{S}} W_{Y|X,S}(y|\tilde{x},s)\, J(s|x), \quad \forall\, x, \tilde{x} \in \mathcal{X},\ y \in \mathcal{Y}, \tag{53}$$
with $J(s|x) = 1$ for $s = x$, and $J(s|x) = 0$ otherwise, while $W_{Y_1|X,S}$ satisfies (53) with $J(s|x) = 1$ for $s = 1 - x$, and $J(s|x) = 0$ otherwise. Nevertheless, the capacity of the primitive AVRC $\mathcal{L}^{\text{prim}} = \{ W_{Y,Y_1|X,S} \}$ is $C(\mathcal{L}^{\text{prim}}) = 1$, which can be achieved using a code of length $n = 1$, with $f(m) = m$, $f_1(y_1) = y_1$,
$$g(y,\ell) = g(y,y_1) = \begin{cases} 0 & y = 0 \\ 1 & y = 2 \\ y_1 & y = 1 \end{cases}$$
for $m, y_1 \in \{0,1\}$ and $y \in \{0,1,2\}$. This example shows that even if the sender-relay and sender-receiver marginals are symmetrizable, the capacity may still be positive. We further note that the condition in part 4 of Theorem 5 implies that $W_{Y|X,S}$ and $W_{Y_1|X,S}$ are both symmetrizable, but not vice versa, as shown by this example. That is, as the capacity is positive, we have that $W_{\tilde{Y}|X,S}$ is non-symmetrizable, where $\tilde{Y} = (Y, Y_1)$, despite the fact that the marginals $W_{Y|X,S}$ and $W_{Y_1|X,S}$ are both symmetrizable.
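Both claims of Example 2 are small enough to verify exhaustively. The sketch below assumes the channel $Y_1 = X(1-S)$, $Y = X + S$ and the length-one code given in the example; the function names are hypothetical. It checks that the two marginals satisfy the symmetrizability condition with the stated jammer strategies, and that the code decodes correctly for every message and every state.

```python
from itertools import product

BITS = (0, 1)


def channel(x, s):
    """Primitive relay channel of Example 2: returns (y, y1)."""
    return x + s, x * (1 - s)


def decoder(y, relay_index):
    """Decoder g of Example 2: the relay bit resolves the ambiguous output y = 1."""
    if y == 0:
        return 0
    if y == 2:
        return 1
    return relay_index


def zero_error():
    """True if f(m) = m, f1(y1) = y1 and g recover m for every state."""
    for m, s in product(BITS, BITS):
        y, y1 = channel(m, s)
        if decoder(y, y1) != m:
            return False
    return True


def symmetrizes(marginal, jammer, outputs):
    """Check the symmetrizability condition for a marginal W(out|x,s) and J(s|x)."""
    for out, x, x_t in product(outputs, BITS, BITS):
        lhs = sum(marginal(out, x, s) * jammer(s, x_t) for s in BITS)
        rhs = sum(marginal(out, x_t, s) * jammer(s, x) for s in BITS)
        if lhs != rhs:
            return False
    return True


def receiver_marginal(y, x, s):
    return 1 if y == x + s else 0


def relay_marginal(y1, x, s):
    return 1 if y1 == x * (1 - s) else 0


print(zero_error())
print(symmetrizes(receiver_marginal, lambda s, x: 1 if s == x else 0, (0, 1, 2)))
print(symmetrizes(relay_marginal, lambda s, x: 1 if s == 1 - x else 0, (0, 1)))
```

The two marginals are symmetrized by different jammer strategies, which is consistent with the joint channel to $(Y, Y_1)$ being non-symmetrizable.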

6.3. Primitive Gaussian AVRC

Consider the primitive Gaussian relay channel with SFD,
$$Y_1 = X'' + Z, \qquad Y = X' + S.$$
Suppose that input and state constraints are imposed as before, i.e., $\frac{1}{n} \sum_{i=1}^{n} \left( (X'_i)^2 + (X''_i)^2 \right) \leq \Omega$ and $\frac{1}{n} \sum_{i=1}^{n} S_i^2 \leq \Lambda$ with probability 1. The capacity of the primitive Gaussian AVRC with SFD, under input constraint $\Omega$ and state constraint $\Lambda$, is given by
$$C(\mathcal{L}^{\text{prim}}) = C^{\star}(\mathcal{L}^{\text{prim}}) = \max_{0 \leq \alpha \leq 1} \left[ \frac{1}{2} \log\left( 1 + \frac{\alpha\Omega}{\Lambda} \right) + \min\left\{ C_1,\ \frac{1}{2} \log\left( 1 + \frac{(1-\alpha)\Omega}{\sigma^2} \right) \right\} \right]. \tag{56}$$
This result is due to the following. Observe that one could treat this primitive AVRC as two independent channels, one from $X'$ to $Y$ and the other from $X''$ to $Y_1$, dividing the input power into $\alpha\Omega$ and $(1-\alpha)\Omega$, respectively. Based on this observation, the random code direct part follows from [92]. Next, the deterministic code direct part follows from part 1 of Theorem 5, and the converse part follows straightforwardly from the cutset upper bound in Theorem 4.

7. Discussion

We have presented the model of the arbitrarily varying relay channel (AVRC), as a state-dependent relay channel, where jamming attacks result in either a random or a deterministic state sequence, $S^n \sim q(s^n)$, where the joint distribution $q(s^n)$ is unknown and it is not necessarily of a product form. We have established the cutset upper bound and the partial decode-forward lower bound on the random code capacity of the AVRC. We have determined the random code capacity in special cases of the degraded AVRC, the reversely degraded AVRC, and the AVRC with orthogonal sender components. To do so, we used the direct transmission lower bound and the full decode-forward lower bound, along with quasi-convexity properties which are required in order to use the minimax theorem.
We have provided generalized symmetrizability conditions under which the deterministic code capacity coincides with the random code capacity. Specifically, we have shown that if the sender-relay and sender-receiver marginals are non-symmetrizable for a given relay transmission, then the capacity is positive. We further noted that this is a sufficient condition for positive capacity, which raises the question whether it is also a necessary condition. In other words, if those marginals are symmetrizable for every given relay transmission, does that necessarily imply that the capacity is zero? The answer is no, and we have refuted this assertion using a simple example, where the relay acts as a source of state information to the receiver. Then, we provided a stronger symmetrizability condition, which is necessary for the capacity to be positive. We have shown by example that the deterministic code capacity can be strictly lower than the random code capacity of the AVRC.
The Gaussian AVRC with sender frequency division (SFD) under input and state constraints is also addressed in this paper. The random code capacity is determined using the above results, whereas the deterministic code capacity is lower and upper bounded using an independent approach. Specifically, we extended the technique by Csiszár and Narayan in their 1991 paper on the Gaussian AVC [87]. We have shown that the deterministic code capacity can be strictly lower than the random code capacity, for low values of the input constraint.
Furthermore, we have considered the primitive AVRC, where there is a noiseless link of limited capacity between the relay and the receiver [2]. We tested Kim’s assertion that “the primitive relay channel captures most essential features and challenges of relaying, and thus serves as a good testbed for new relay coding techniques” [2]. We have shown that this assertion is not true in the arbitrarily varying scenario. Specifically, for the primitive Gaussian AVRC with SFD, the deterministic code capacity and the random code capacity are always the same, regardless of the value of the input constraint (see (56)), in contrast to our findings for the non-primitive case, as demonstrated in Figure 4.

Author Contributions

Formal analysis, U.P.; Investigation, U.P.; Methodology, U.P.; Supervision, Y.S.; Writing—original draft, U.P.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AVC: Arbitrarily varying channel
AVRC: Arbitrarily varying relay channel
DMC: Discrete memoryless channel
pmf: probability mass function
RT: Robustification technique
SFD: Sender frequency division
Eq.: Equation
RHS: Right hand side
LHS: Left hand side

Appendix A. Proof of Lemma 1

Appendix A.1. Partial Decode-Forward Lower Bound

We construct a block Markov code using the partial decode-forward scheme. That is, the encoder sends a sequence of messages over multiple blocks. The message in each block consists of two components, a decode-forward component, and a direct transmission component, where only the former is decoded by the relay. Once the decoder has received all blocks, the decode-forward components are decoded backwards, i.e., starting with the message in the last block going backwards. Using the estimation of the decode-forward components, the direct transmission components are decoded forwards, i.e., starting with the message in the first block going forwards. The ambiguity of the state distribution needs to be treated throughout all of those estimations. Hence, we use joint typicality with respect to a state type, which is “close” to some $q \in \mathcal{Q}$. Let $\delta > 0$ be arbitrarily small. Define a set of state types $\hat{\mathcal{Q}}_n$ by
$$\hat{\mathcal{Q}}_n = \left\{ \hat{P}_{s^n} \,:\, s^n \in \mathcal{A}^{\delta_1}(q) \text{ for some } q \in \mathcal{Q} \right\}, \tag{A1}$$
where
$$\delta_1 \triangleq \frac{\delta}{2 \cdot |\mathcal{S}|}. \tag{A2}$$
Namely, $\hat{\mathcal{Q}}_n$ is the set of types that are $\delta_1$-close to some state distribution $q(s)$ in $\mathcal{Q}$. A code $\mathcal{C}$ for the compound relay channel is constructed as follows.
The encoders use $B$ blocks, each consisting of $n$ channel uses, to convey $(B-1)$ independent messages to the receiver. Furthermore, each message $M_b$, for $b \in [1:B-1]$, is divided into two independent messages. That is, $M_b = (M'_b, M''_b)$, where $M'_b$ and $M''_b$ are uniformly distributed, i.e.,
$$M'_b \sim \text{Unif}[1:2^{nR'}], \quad M''_b \sim \text{Unif}[1:2^{nR''}], \quad \text{with } R' + R'' = R,$$
for $b \in [1:B-1]$. For convenience of notation, set $M'_0 = M'_B \equiv 1$ and $M''_0 = M''_B \equiv 1$. The average rate $\frac{B-1}{B} \cdot R$ is arbitrarily close to $R$ for sufficiently large $B$.
Codebook Generation: Fix the distribution $P_{U,X,X_1}(u,x,x_1)$, and let
$$P^{q}_{X,Y,Y_1|U,X_1}(x,y,y_1|u,x_1) = P_{X|U,X_1}(x|u,x_1) \sum_{s \in \mathcal{S}} q(s)\, W_{Y,Y_1|X,X_1,S}(y,y_1|x,x_1,s). \tag{A4}$$
We construct $B$ independent codebooks. For $b \in [2:B-1]$, generate $2^{nR'}$ independent sequences $x^n_{1,b}(m'_{b-1})$, $m'_{b-1} \in [1:2^{nR'}]$, at random, each according to $\prod_{i=1}^{n} P_{X_1}(x_{1,i})$. Then, generate $2^{nR'}$ sequences,
$$u^n_b(m'_b | m'_{b-1}) \sim \prod_{i=1}^{n} P_{U|X_1}\big( u_i \,|\, x_{1,b,i}(m'_{b-1}) \big), \quad m'_b \in [1:2^{nR'}],$$
conditionally independent given $x^n_{1,b}(m'_{b-1})$. Then, for every $m'_b \in [1:2^{nR'}]$, generate $2^{nR''}$ sequences,
$$x^n_b(m'_b, m''_b | m'_{b-1}) \sim \prod_{i=1}^{n} P_{X|U,X_1}\big( x_i \,|\, u_{b,i}(m'_b | m'_{b-1}),\, x_{1,b,i}(m'_{b-1}) \big), \quad m''_b \in [1:2^{nR''}],$$
conditionally independent given $\big( u^n_b(m'_b | m'_{b-1}),\, x^n_{1,b}(m'_{b-1}) \big)$. We have thus generated $B-2$ independent codebooks,
$$\mathcal{F}_b = \left\{ \Big( x^n_{1,b}(m'_{b-1}),\ u^n_b(m'_b | m'_{b-1}),\ x^n_b(m'_b, m''_b | m'_{b-1}) \Big) \,:\, m'_{b-1}, m'_b \in [1:2^{nR'}],\ m''_b \in [1:2^{nR''}] \right\},$$
for $b \in [2:B-1]$. The codebooks $\mathcal{F}_1$ and $\mathcal{F}_B$ are generated in the same manner, with fixed $m'_0 = m'_B \equiv 1$ and $m''_0 = m''_B \equiv 1$. Encoding and decoding are illustrated in Figure A1.
Encoding: To send the message sequence $(m'_1, m''_1, \ldots, m'_{B-1}, m''_{B-1})$, transmit $x^n_b(m'_b, m''_b | m'_{b-1})$ at block $b$, for $b \in [1:B]$.
Relay Encoding: In block 1, the relay transmits $x^n_{1,1}(1)$. Set $\tilde{m}'_0 \equiv 1$. At the end of block $b \in [1:B-1]$, the relay receives $y^n_{1,b}$, and finds some $\tilde{m}'_b \in [1:2^{nR'}]$ such that
$$\big( u^n_b(\tilde{m}'_b | \tilde{m}'_{b-1}),\ x^n_{1,b}(\tilde{m}'_{b-1}),\ y^n_{1,b} \big) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q}_{Y_1|U,X_1} \big), \quad \text{for some } q \in \hat{\mathcal{Q}}_n.$$
If there is none or there is more than one such index, set $\tilde{m}'_b = 1$. In block $b+1$, the relay transmits $x^n_{1,b+1}(\tilde{m}'_b)$.
Backward Decoding: Once all blocks $(y^n_b)_{b=1}^{B}$ are received, decoding is performed backwards. Set $\hat{m}'_B = \hat{m}''_B \equiv 1$. For $b = B-1, B-2, \ldots, 1$, find a unique $\hat{m}'_b \in [1:2^{nR'}]$ such that
$$\big( u^n_{b+1}(\hat{m}'_{b+1} | \hat{m}'_b),\ x^n_{1,b+1}(\hat{m}'_b),\ y^n_{b+1} \big) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q}_{Y|U,X_1} \big), \quad \text{for some } q \in \hat{\mathcal{Q}}_n.$$
If there is none, or more than one such $\hat{m}'_b \in [1:2^{nR'}]$, declare an error.
Then, the decoder uses $\hat{m}'_1, \ldots, \hat{m}'_{B-1}$ as follows. For $b = B-1, B-2, \ldots, 1$, find a unique $\hat{m}''_b \in [1:2^{nR''}]$ such that
$$\big( u^n_b(\hat{m}'_b | \hat{m}'_{b-1}),\ x^n_b(\hat{m}'_b, \hat{m}''_b | \hat{m}'_{b-1}),\ x^n_{1,b}(\hat{m}'_{b-1}),\ y^n_b \big) \in \mathcal{A}^{\delta}\big( P_{U,X,X_1} P^{q}_{Y|X,X_1} \big), \quad \text{for some } q \in \hat{\mathcal{Q}}_n.$$
If there is none, or more than one such $\hat{m}''_b \in [1:2^{nR''}]$, declare an error. We note that using the set of types $\hat{\mathcal{Q}}_n$ instead of the original set of state distributions $\mathcal{Q}$ simplifies the analysis, since $\mathcal{Q}$ is not necessarily finite nor countable.
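The role of the set of state types can be illustrated with a short sketch. Since the exact definition of the typical set is given outside this section, the code assumes that a sequence is $\delta_1$-typical for $q$ when its empirical type differs from $q$ by at most $\delta_1$ in every symbol, with $\delta_1 = \delta / (2|\mathcal{S}|)$. The set of state distributions is represented by a finite list of candidates, which is a simplification for illustration, and all names are hypothetical.

```python
from collections import Counter


def empirical_type(sequence, alphabet):
    """Empirical distribution (type) of a state sequence."""
    counts = Counter(sequence)
    n = len(sequence)
    return {a: counts[a] / n for a in alphabet}


def is_close(state_type, q, delta1):
    """Componentwise delta1-closeness (assumed form of typicality)."""
    return all(abs(state_type[a] - q[a]) <= delta1 for a in q)


def type_in_set(sequence, alphabet, candidates, delta):
    """True if the type of the sequence is delta1-close to some candidate q."""
    delta1 = delta / (2 * len(alphabet))
    state_type = empirical_type(sequence, alphabet)
    return any(is_close(state_type, q, delta1) for q in candidates)


candidates = [{0: 0.7, 1: 0.3}]
print(empirical_type([0, 0, 1, 0], (0, 1)))
print(type_in_set([0, 0, 1, 0], (0, 1), candidates, delta=0.4))
```

For a fixed block length there are only finitely many types, so a decoder that searches over types has a finite search space even when the set of state distributions is uncountable.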
Figure A1. The partial decode-forward coding scheme. The block index b [ 1 : B ] is indicated at the top. In the following rows, we have the corresponding elements: (1) sequences transmitted by the encoder; (2) estimated messages at the relay; (3) sequences transmitted by the relay; (4) estimated messages at the destination decoder. The arrows in the second row indicate that the relay encodes forwards with respect to the block index, while the arrows in the fourth row indicate that the receiver decodes backwards.
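The block structure summarized in Figure A1 can be written out as a schedule of message indices. The sketch below only tracks which indices appear in each block, assuming error-free relaying and the convention $m'_0 = m'_B = m''_B = 1$; it does not generate codewords, and the function names are hypothetical.

```python
def block_markov_schedule(m_prime, m_dprime):
    """Message indices used in each of the B blocks.

    m_prime, m_dprime: decode-forward and direct components for blocks
    1, ..., B-1 (two lists of equal length B-1).
    """
    assert len(m_prime) == len(m_dprime)
    num_blocks = len(m_prime) + 1
    mp = [1] + list(m_prime) + [1]      # m'_0 = 1 and m'_B = 1
    mpp = [1] + list(m_dprime) + [1]    # m''_B = 1 (index 0 is unused)
    schedule = []
    for b in range(1, num_blocks + 1):
        schedule.append({
            "block": b,
            # encoder sends x_b(m'_b, m''_b | m'_{b-1})
            "encoder": (mp[b], mpp[b], mp[b - 1]),
            # relay sends x_{1,b}(m'_{b-1}), assuming it decoded correctly
            "relay": mp[b - 1],
        })
    return schedule


def backward_decoding_order(num_blocks):
    """Order in which the receiver decodes the decode-forward components."""
    return list(range(num_blocks - 1, 0, -1))


for row in block_markov_schedule([3, 2], [5, 4]):
    print(row)
print(backward_decoding_order(3))
```

The last block carries no new message, which is the source of the rate factor $\frac{B-1}{B}$.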
Analysis of Probability of Error: Assume without loss of generality that the user sent $(M'_b, M''_b) = (1,1)$, and let $q^*(s) \in \mathcal{Q}$ denote the actual state distribution chosen by the jammer. The error event is bounded by the union of the events
$$\mathcal{E}_1(b) = \{ \tilde{M}'_b \neq 1 \}, \quad \mathcal{E}_2(b) = \{ \hat{M}'_b \neq 1 \}, \quad \mathcal{E}_3(b) = \{ \hat{M}''_b \neq 1 \}, \quad \text{for } b \in [1:B-1].$$
Then, the probability of error is bounded by
$$P^{(n)}_{e}(q,\mathcal{C}) \leq \sum_{b=1}^{B-1} \Pr\big( \mathcal{E}_1(b) \big) + \sum_{b=1}^{B-1} \Pr\big( \mathcal{E}_2(b) \cap \mathcal{E}_1^c(b) \big) + \sum_{b=1}^{B-1} \Pr\big( \mathcal{E}_3(b) \cap \mathcal{E}_1^c(b) \cap \mathcal{E}_2^c(b) \cap \mathcal{E}_2^c(b-1) \big), \tag{A12}$$
with $\mathcal{E}_2(0) = \emptyset$, where the conditioning on $(M'_b, M''_b) = (1,1)$ is omitted for convenience of notation.
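We begin with the probability of erroneous relaying, $\Pr\big( \mathcal{E}_1(b) \big)$. Define
$$\begin{aligned} \mathcal{E}_{1,1}(b) &= \left\{ \big( U^n_b(1 | \tilde{M}'_{b-1}),\ X^n_{1,b}(\tilde{M}'_{b-1}),\ Y^n_{1,b} \big) \notin \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q}_{Y_1|U,X_1} \big) \text{ for all } q \in \hat{\mathcal{Q}}_n \right\}, \\ \mathcal{E}_{1,2}(b) &= \left\{ \big( U^n_b(m'_b | \tilde{M}'_{b-1}),\ X^n_{1,b}(\tilde{M}'_{b-1}),\ Y^n_{1,b} \big) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q}_{Y_1|U,X_1} \big), \text{ for some } m'_b \neq 1,\ q \in \hat{\mathcal{Q}}_n \right\}. \end{aligned} \tag{A13}$$
For $b \in [1:B-1]$, the relay error event is bounded as
$$\mathcal{E}_1(b) \subseteq \mathcal{E}_1(b-1) \cup \mathcal{E}_{1,1}(b) \cup \mathcal{E}_{1,2}(b) = \mathcal{E}_1(b-1) \cup \big( \mathcal{E}_1(b-1)^c \cap \mathcal{E}_{1,1}(b) \big) \cup \big( \mathcal{E}_1(b-1)^c \cap \mathcal{E}_{1,2}(b) \big),$$
with $\mathcal{E}_1(0) = \emptyset$. Thus, by the union of events bound,
$$\Pr\big( \mathcal{E}_1(b) \big) \leq \Pr\big( \mathcal{E}_1(b-1) \big) + \Pr\big( \mathcal{E}_{1,1}(b) \,|\, \mathcal{E}_1(b-1)^c \big) + \Pr\big( \mathcal{E}_{1,2}(b) \,|\, \mathcal{E}_1(b-1)^c \big). \tag{A15}$$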
Consider the second term on the RHS of (A15). We now claim that given that $\mathcal{E}_1(b-1)^c$ occurred, i.e., $\tilde{M}'_{b-1} = 1$, the event $\mathcal{E}_{1,1}(b)$ implies that $\big( U^n_b(1|1), X^n_{1,b}(1), Y^n_{1,b} \big) \notin \mathcal{A}^{\delta/2}\big( P_{U,X_1} P^{q''}_{Y_1|U,X_1} \big)$ for all $q'' \in \mathcal{Q}$. This claim is due to the following. Assume to the contrary that $\mathcal{E}_{1,1}(b)$ holds, but $\big( U^n_b(1|1), X^n_{1,b}(1), Y^n_{1,b} \big) \in \mathcal{A}^{\delta/2}\big( P_{U,X_1} P^{q''}_{Y_1|U,X_1} \big)$ for some $q'' \in \mathcal{Q}$. Then, for a sufficiently large $n$, there exists a type $q'(s)$ such that
$$|q'(s) - q''(s)| \leq \delta_1, \tag{A16}$$
for all $s \in \mathcal{S}$, and by the definition in (A1), $q' \in \hat{\mathcal{Q}}_n$. Then, (A16) implies that
$$\left| P^{q'}_{Y_1|U,X_1}(y_1|u,x_1) - P^{q''}_{Y_1|U,X_1}(y_1|u,x_1) \right| \leq |\mathcal{S}| \cdot \delta_1 = \frac{\delta}{2},$$
for all $u \in \mathcal{U}$, $x_1 \in \mathcal{X}_1$ and $y_1 \in \mathcal{Y}_1$ (see (A4) and (A2)). Hence, $\big( U^n_b(1|1), X^n_{1,b}(1), Y^n_{1,b} \big) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q'}_{Y_1|U,X_1} \big)$, which contradicts the first assumption. It follows that
$$\begin{aligned} \Pr\big( \mathcal{E}_{1,1}(b) \,|\, \mathcal{E}_1(b-1)^c \big) &\leq \Pr\Big( \big( U^n_b(1|1), X^n_{1,b}(1), Y^n_{1,b} \big) \notin \mathcal{A}^{\delta/2}\big( P_{U,X_1} P^{q''}_{Y_1|U,X_1} \big) \text{ for all } q'' \in \mathcal{Q} \,\Big|\, \mathcal{E}_1(b-1)^c \Big) \\ &\leq \Pr\Big( \big( U^n_b(1|1), X^n_{1,b}(1), Y^n_{1,b} \big) \notin \mathcal{A}^{\delta/2}\big( P_{U,X_1} P^{q^*}_{Y_1|U,X_1} \big) \,\Big|\, \mathcal{E}_1(b-1)^c \Big). \end{aligned} \tag{A18}$$
Since the codebooks $\mathcal{F}_1, \ldots, \mathcal{F}_B$ are independent, the sequence $\big( U^n_b(1|1), X^n_{1,b}(1) \big)$ from the codebook $\mathcal{F}_b$ is independent of the relay estimate $\tilde{M}'_{b-1}$, which is a function of $Y^n_{1,b-1}$ and the codebook $\mathcal{F}_{b-1}$. Thus, the RHS of (A18) tends to zero exponentially as $n \to \infty$ by the law of large numbers and Chernoff’s bound.
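The step from (A16) to the bound $|\mathcal{S}| \cdot \delta_1$ uses only the fact that the averaged channel is linear in the state distribution and that every channel probability is at most 1. The sketch below checks this numerically for one example. Here `W[s][y]` stands for the conditional probability of the relay output $y$ given state $s$, for one fixed pair $(u, x_1)$; the matrix, the two state distributions, and the function names are made up for illustration.

```python
def averaged_channel(W, q):
    """Output distribution sum_s q(s) W(y|s) of the state-averaged channel."""
    num_outputs = len(W[0])
    return [sum(q[s] * W[s][y] for s in range(len(q))) for y in range(num_outputs)]


def max_output_gap(W, q_a, q_b):
    """Largest componentwise gap between the two averaged channels."""
    p_a = averaged_channel(W, q_a)
    p_b = averaged_channel(W, q_b)
    return max(abs(a - b) for a, b in zip(p_a, p_b))


# Hypothetical channel with three states and two outputs.
W = [[0.9, 0.1],
     [0.2, 0.8],
     [0.5, 0.5]]
q_in_set = [0.50, 0.30, 0.20]
q_type = [0.52, 0.29, 0.19]
delta1 = max(abs(a - b) for a, b in zip(q_in_set, q_type))

print(max_output_gap(W, q_type, q_in_set), len(q_in_set) * delta1)
```

In this example the gap is well below $|\mathcal{S}| \cdot \delta_1$, since the bound ignores cancellation between states.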
We move to the third term on the RHS of (A15). By the union of events bound, the fact that the number of type classes in $\mathcal{S}^n$ is bounded by $(n+1)^{|\mathcal{S}|}$, and the independence of the codebooks, we have that
$$\begin{aligned} \Pr\big( \mathcal{E}_{1,2}(b) \,|\, \mathcal{E}_1(b-1)^c \big) &\leq (n+1)^{|\mathcal{S}|} \cdot \sup_{q' \in \hat{\mathcal{Q}}_n} \Pr\Big( \big( U^n_b(m'_b|1), X^n_{1,b}(1), Y^n_{1,b} \big) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q'}_{Y_1|U,X_1} \big) \text{ for some } m'_b \neq 1 \Big) \\ &\leq (n+1)^{|\mathcal{S}|} \cdot 2^{nR'} \cdot \sup_{q' \in \hat{\mathcal{Q}}_n} \sum_{u^n, x_1^n} P_{U^n,X_1^n}(u^n,x_1^n) \cdot \sum_{y_1^n \,:\, (u^n,x_1^n,y_1^n) \in \mathcal{A}^{\delta}( P_{U,X_1} P^{q'}_{Y_1|U,X_1} )} P^{q^*}_{Y_1^n|X_1^n}(y_1^n|x_1^n), \end{aligned} \tag{A19}$$
where the last line follows since $U^n_b(m'_b|1)$ is conditionally independent of $Y^n_{1,b}$ given $X^n_{1,b}(1)$, for every $m'_b \neq 1$. Let $y_1^n$ satisfy $(u^n,x_1^n,y_1^n) \in \mathcal{A}^{\delta}\big( P_{U,X_1} P^{q'}_{Y_1|U,X_1} \big)$. Then, $(x_1^n,y_1^n) \in \mathcal{A}^{\delta_2}\big( P^{q'}_{X_1,Y_1} \big)$ with $\delta_2 \triangleq |\mathcal{U}| \cdot \delta$. By Lemmas 2.6 and 2.7 in [93],
$$P^{q^*}_{X_1^n,Y_1^n}(x_1^n,y_1^n) = 2^{-n\left( H(\hat{P}_{x_1^n,y_1^n}) + D(\hat{P}_{x_1^n,y_1^n} \| P^{q^*}_{X_1,Y_1}) \right)} \leq 2^{-n H(\hat{P}_{x_1^n,y_1^n})} \leq 2^{-n\left( H_{q'}(X_1,Y_1) - \varepsilon_1(\delta) \right)},$$
hence,
$$P^{q^*}_{Y_1^n|X_1^n}(y_1^n|x_1^n) \leq 2^{-n\left( H_{q'}(Y_1|X_1) - \varepsilon_2(\delta) \right)},$$
where $\varepsilon_1(\delta), \varepsilon_2(\delta) \to 0$ as $\delta \to 0$. Therefore, by Equations (A19) and (A20), along with [93] (Lemma 2.13),
$$\Pr\big( \mathcal{E}_{1,2}(b) \,|\, \mathcal{E}_1(b-1)^c \big) \leq (n+1)^{|\mathcal{S}|} \cdot \sup_{q' \in \mathcal{Q}} 2^{-n\left[ I_{q'}(U;Y_1|X_1) - R' - \varepsilon_3(\delta) \right]},$$
with $\varepsilon_3(\delta) \to 0$ as $\delta \to 0$. Using induction, we have by (A15) that $\Pr\big( \mathcal{E}_1(b) \big)$ tends to zero exponentially as $n \to \infty$, for $b \in [1:B-1]$, provided that $R' < \inf_{q' \in \mathcal{Q}} I_{q'}(U;Y_1|X_1) - \varepsilon_3(\delta)$.
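The polynomial factor $(n+1)^{|\mathcal{S}|}$ in the bound above comes from counting types, and it is harmless because it is dominated by the exponential term. The sketch below computes the exact number of types of length-$n$ sequences (a standard stars-and-bars count) and the base-2 logarithm of a bound of the form $(n+1)^{|\mathcal{S}|} \cdot 2^{-n(I - R' - \varepsilon)}$. The numerical values of the mutual information, rate and slack are placeholders chosen for illustration, and the function names are hypothetical.

```python
import math


def number_of_types(n, alphabet_size):
    """Exact number of types of length-n sequences over the alphabet."""
    return math.comb(n + alphabet_size - 1, alphabet_size - 1)


def log2_union_bound(n, alphabet_size, mutual_information, rate, epsilon):
    """log2 of (n+1)^{|S|} * 2^{-n (I - R' - epsilon)}."""
    return alphabet_size * math.log2(n + 1) - n * (mutual_information - rate - epsilon)


print(number_of_types(10, 2), (10 + 1) ** 2)
for n in (10, 100, 1000):
    print(n, log2_union_bound(n, 2, mutual_information=0.5, rate=0.3, epsilon=0.05))
```

With a positive gap $I - R' - \varepsilon$, the logarithm of the bound decreases linearly in $n$ once $n$ is moderately large, so the bound itself decays exponentially.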
As for the erroneous decoding of $M'_b$ at the receiver, observe that given $\mathcal{E}_1(b)^c$, the relay sends $X^n_{1,b+1}(1)$ in block $b+1$, hence
$$\big( U^n_{b+1}(1|1),\ X^n_{b+1}(1,1|1),\ X^n_{1,b+1}(1) \big) \sim \prod_{i=1}^{n} P_{U,X,X_1}(u_i,x_i,x_{1,i}). \tag{A22}$$
At the destination receiver, decoding is performed backwards, hence the error events have a different form compared to those of the relay (cf. (A13) and the events below). Define the events,
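$$
\begin{aligned}
\mathcal{E}_{2,1}(b) &= \big\{ (U_{b+1}^n(\hat{M}'_{b+1}|1), X_{1,b+1}^n(1), Y_{b+1}^n) \notin \mathcal{A}^{\delta}(P_{U,X_1} P^{q}_{Y|U,X_1}) \text{ for all } q \in \hat{\mathcal{Q}}_n \big\}, \\
\mathcal{E}_{2,2}(b) &= \big\{ (U_{b+1}^n(\hat{M}'_{b+1}|m'_b), X_{1,b+1}^n(m'_b), Y_{b+1}^n) \in \mathcal{A}^{\delta}(P_{U,X_1} P^{q}_{Y|U,X_1}) \text{ for some } m'_b \neq 1,\ q \in \hat{\mathcal{Q}}_n \big\}.
\end{aligned}
$$
For $b \in [1:B-1]$, the error event $\mathcal{E}_2(b)$ is bounded by
$$
\mathcal{E}_2(b) \subseteq \mathcal{E}_2(b+1) \cup \mathcal{E}_{2,1}(b) \cup \mathcal{E}_{2,2}(b) = \mathcal{E}_2(b+1) \cup \big( \mathcal{E}_2(b+1)^c \cap \mathcal{E}_{2,1}(b) \big) \cup \big( \mathcal{E}_2(b+1)^c \cap \mathcal{E}_{2,2}(b) \big),
$$
with $\mathcal{E}_2(B) = \emptyset$. Thus,
$$
\Pr\big( \mathcal{E}_2(b) \mid \mathcal{E}_1(b)^c \big) \le \Pr\big( \mathcal{E}_2(b+1) \mid \mathcal{E}_1(b)^c \big) + \Pr\big( \mathcal{E}_{2,1}(b) \mid \mathcal{E}_1(b)^c, \mathcal{E}_2(b+1)^c \big) + \Pr\big( \mathcal{E}_{2,2}(b) \mid \mathcal{E}_1(b)^c, \mathcal{E}_2(b+1)^c \big).
$$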
By similar arguments to those used above, we have that
$$
\Pr\big( \mathcal{E}_{2,1}(b) \mid \mathcal{E}_1(b)^c, \mathcal{E}_2(b+1)^c \big) \le \Pr\Big( (U_{b+1}^n(1|1), X_{1,b+1}^n(1), Y_{b+1}^n) \notin \mathcal{A}^{\delta/2}(P_{U,X_1} P^{q^*}_{Y|U,X_1}) \,\Big|\, \mathcal{E}_1(b)^c \Big),
$$
which tends to zero exponentially as $n \to \infty$, due to (A22), and by the law of large numbers and Chernoff's bound. Then, by similar arguments to those used for the bound on $\Pr\big( \mathcal{E}_{1,2}(b) \mid \mathcal{E}_1(b-1)^c \big)$, the third term on the RHS of (A25) tends to zero as $n \to \infty$, provided that $R' < \inf_{q \in \mathcal{Q}} I_q(U,X_1;Y) - \varepsilon_4(\delta)$, where $\varepsilon_4(\delta) \to 0$ as $\delta \to 0$. Using induction, we have by (A25) that the second term on the RHS of (A12) tends to zero exponentially as $n \to \infty$, for $b \in [1:B-1]$.
Moving to the error event for $M''_b$, define
$$
\begin{aligned}
\mathcal{E}_{3,1}(b) &= \big\{ (U_b^n(\hat{M}'_b|\hat{M}'_{b-1}), X_b^n(\hat{M}'_b, 1|\hat{M}'_{b-1}), X_{1,b}^n(\hat{M}'_{b-1}), Y_b^n) \notin \mathcal{A}^{\delta}(P_{U,X,X_1} P^{q}_{Y|X,X_1}) \text{ for all } q \in \hat{\mathcal{Q}}_n \big\}, \\
\mathcal{E}_{3,2}(b) &= \big\{ (U_b^n(\hat{M}'_b|\hat{M}'_{b-1}), X_b^n(\hat{M}'_b, m''_b|\hat{M}'_{b-1}), X_{1,b}^n(\hat{M}'_{b-1}), Y_b^n) \in \mathcal{A}^{\delta}(P_{U,X,X_1} P^{q}_{Y|X,X_1}) \text{ for some } m''_b \neq 1,\ q \in \hat{\mathcal{Q}}_n \big\}.
\end{aligned}
$$
Given $\mathcal{E}_2(b)^c \cap \mathcal{E}_2(b-1)^c$, we have that $\hat{M}'_b = 1$ and $\hat{M}'_{b-1} = 1$. Then, by similar arguments to those used above,
$$
\begin{aligned}
&\Pr\big( \mathcal{E}_3(b) \mid \mathcal{E}_1(b)^c \cap \mathcal{E}_2(b)^c \cap \mathcal{E}_2(b-1)^c \big) \\
&\quad\le \Pr\big( \mathcal{E}_{3,1}(b) \mid \mathcal{E}_1(b)^c \cap \mathcal{E}_2(b)^c \cap \mathcal{E}_2(b-1)^c \big) + \Pr\big( \mathcal{E}_{3,2}(b) \mid \mathcal{E}_1(b)^c \cap \mathcal{E}_2(b)^c \cap \mathcal{E}_2(b-1)^c \big) \\
&\quad\le e^{-a_0 n} + (n+1)^{|\mathcal{S}|} \cdot \sup_{q \in \mathcal{Q}} \sum_{m''_b \neq 1} \Pr\Big( (U_b^n(1|1), X_b^n(1, m''_b|1), X_{1,b}^n(1), Y_b^n) \in \mathcal{A}^{\delta}(P_{U,X,X_1} P^{q}_{Y|X,X_1}) \,\Big|\, \mathcal{E}_1(b)^c \Big) \\
&\quad\le e^{-a_0 n} + (n+1)^{|\mathcal{S}|} \cdot \sup_{q \in \mathcal{Q}} 2^{-n\left[ I_q(X;Y|U,X_1) - R'' - \varepsilon_5(\delta) \right]},
\end{aligned}
$$
where $a_0 > 0$ and $\varepsilon_5(\delta) \to 0$ as $\delta \to 0$. The second inequality holds by (A22) along with the law of large numbers and Chernoff's bound, and the last inequality holds as $X_b^n(1, m''_b|1)$ is conditionally independent of $Y_b^n$ given $(U_b^n(1|1), X_{1,b}^n(1))$ for every $m''_b \neq 1$. Thus, the third term on the RHS of (A12) tends to zero exponentially as $n \to \infty$, provided that $R'' < \inf_{q \in \mathcal{Q}} I_q(X;Y|U,X_1) - \varepsilon_5(\delta)$. Eliminating $R'$ and $R''$, we conclude that the probability of error, averaged over the class of the codebooks, decays exponentially to zero as $n \to \infty$, provided that $R < R_{PDF}(\mathcal{L}^{\mathcal{Q}})$. Therefore, there must exist a $(2^{nR}, n, \varepsilon)$ deterministic code, for a sufficiently large $n$. □

Appendix A.2. Cutset Upper Bound

This is a straightforward consequence of the cutset bound in [4]. Assume to the contrary that there exists an achievable rate $R > R_{CS}(\mathcal{L}^{\mathcal{Q}})$. Then, for some $q^*(s)$ in the closure of $\mathcal{Q}$,
$$
R > \max_{p(x,x_1)} \min\big\{ I_{q^*}(X,X_1;Y),\ I_{q^*}(X;Y,Y_1|X_1) \big\}.
$$
By the achievability assumption, we have that for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n)$ random code $\mathcal{C}^{\Gamma}$ such that $P_e^{(n)}(q, \mathcal{C}^{\Gamma}) \le \varepsilon$ for every i.i.d. state distribution $q \in \mathcal{Q}$, and in particular for $q^*$. This holds even if $q^*$ is in the closure of $\mathcal{Q}$ but not in $\mathcal{Q}$ itself, since $P_e^{(n)}(q, \mathcal{C}^{\Gamma})$ is continuous in $q$. Consider using this code over a standard relay channel $W_{Y,Y_1|X,X_1}$ without a state, where $W_{Y,Y_1|X,X_1}(y,y_1|x,x_1) = \sum_{s \in \mathcal{S}} q^*(s) W_{Y,Y_1|X,X_1,S}(y,y_1|x,x_1,s)$. It follows that the rate $R$ as in (A29) can be achieved over the relay channel $W_{Y,Y_1|X,X_1}$, in contradiction to [4]. We deduce that the assumption is false, and $R > R_{CS}(\mathcal{L}^{\mathcal{Q}})$ cannot be achieved. □

Appendix B. Proof of Corollary 1

This is a straightforward consequence of Lemma 1, which states that the capacity of the compound relay channel is bounded by $R_{PDF}(\mathcal{L}^{\mathcal{Q}}) \le C(\mathcal{L}^{\mathcal{Q}}) \le R_{CS}(\mathcal{L}^{\mathcal{Q}})$. Thus, if $W_{Y,Y_1|X,X_1,S}$ is reversely degraded such that $W_{Y,Y_1|X,X_1,S} = W_{Y|X,X_1} W_{Y_1|Y,X_1,S}$, then $I_q(X;Y,Y_1|X_1) = I_q(X;Y|X_1)$, and the bounds coincide by the minimax theorem [90], cf. (8) and (12). Similarly, if $W_{Y,Y_1|X,X_1,S}$ is strongly degraded, i.e., $W_{Y,Y_1|X,X_1,S} = W_{Y_1|X,X_1} W_{Y|Y_1,X_1,S}$, then $I_q(X;Y,Y_1|X_1) = I(X;Y_1|X_1)$, and by (8) and (13),
$$
R_{CS}(\mathcal{L}^{\mathcal{Q}}) = \min_{q(s) \in \mathcal{Q}} \max_{p(x,x_1)} \min\big\{ I_q(X,X_1;Y),\ I(X;Y_1|X_1) \big\},
$$
$$
R_{PDF}(\mathcal{L}^{\mathcal{Q}}) = \max_{p(x,x_1)} \min_{q(s) \in \mathcal{Q}} \min\big\{ I_q(X,X_1;Y),\ I(X;Y_1|X_1) \big\}.
$$
Observe that $\min\{ I_q(X,X_1;Y), I(X;Y_1|X_1) \}$ is concave in $p(x,x_1)$ and quasi-convex in $q(s)$ (see e.g., [94] (Section 3.4)), hence the bounds (A30) and (A31) coincide by the minimax theorem [90]. □

Appendix C. Proof of Corollary 2

Consider the block-compound relay channel $\mathcal{L}^{\mathcal{Q} \times B}$, where the state distribution $q_b \in \mathcal{Q}$ varies from block to block. Since the encoder, relay and receiver are aware of this jamming scheme, they can use a block coding scheme that is synchronized with the jammer block strategy. Thus, the capacity is the same as that of the ordinary compound channel, i.e., $C(\mathcal{L}^{\mathcal{Q} \times B}) = C(\mathcal{L}^{\mathcal{Q}})$ and $C^{\star}(\mathcal{L}^{\mathcal{Q} \times B}) = C^{\star}(\mathcal{L}^{\mathcal{Q}})$. Hence, (17) and (18) follow from Lemma 1. As for the second part of Corollary 2, observe that the block Markov coding scheme used in the proof of the partial decode-forward lower bound can be applied as is to the block-compound relay channel, since the relay and the destination receiver do not estimate the state distribution while decoding the messages (see Appendix A). Furthermore, the analysis also holds, where the actual state distribution $q^*$ in (A18)–(A20) and (A26) is now replaced by the state distribution $q_b^*$ which corresponds to block $b \in [1:B]$. □

Appendix D. Proof of Theorem 1

First, we explain the general idea. We modify Ahlswede’s Robustification Technique (RT) [59] to the relay channel. Namely, we use codes for the compound relay channel to construct a random code for the AVRC using randomized permutations. However, in our case, the strictly causal nature of the relay imposes a difficulty, and the application of the RT is not straightforward.
In [59], there is noncausal state information and a random code is defined via permutations of the codeword symbols and the received sequence. Here, however, the relay cannot apply permutations to its transmission $x_1^n$, because it depends on the received sequence $y_1^n$ in a strictly causal manner. We resolve this difficulty using block Markov codes for the block-compound relay channel to construct a random code for the AVRC, applying $B$ in-block permutations to the relay transmission, which depends only on the sequence received in the previous block. The details are given below.

Appendix D.1. Partial Decode-Forward Lower Bound

We show that every rate $R < R_{PDF}(\mathcal{L})$ (see (19)) can be achieved by random codes over the AVRC $\mathcal{L}$, i.e., $C^{\star}(\mathcal{L}) \ge R_{PDF}(\mathcal{L})$. We start with Ahlswede's RT [59], stated below. Let $h : \mathcal{S}^n \to [0,1]$ be a given function. If, for some fixed $\alpha_n \in (0,1)$, and for all $q(s^n) = \prod_{i=1}^{n} q(s_i)$, with $q \in \mathcal{P}(\mathcal{S})$,
$$
\sum_{s^n \in \mathcal{S}^n} q(s^n) h(s^n) \le \alpha_n,
$$
then,
$$
\frac{1}{n!} \sum_{\pi \in \Pi_n} h(\pi s^n) \le \beta_n, \quad \text{for all } s^n \in \mathcal{S}^n,
$$
where $\Pi_n$ is the set of all $n$-tuple permutations $\pi : \mathcal{S}^n \to \mathcal{S}^n$, and $\beta_n = (n+1)^{|\mathcal{S}|} \cdot \alpha_n$.
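The RT statement above can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the state alphabet is binary, the function `rt_check` and the example `indicator_h` are hypothetical names, and the hypothesis is evaluated only on the $n+1$ type distributions $q(1) = k/n$, which are the distributions that the standard proof of the RT relies on.

```python
import itertools


def rt_check(n, h):
    """Return (alpha, beta, worst) for a function h on {0,1}^n.

    alpha: largest i.i.d. average of h over the types q(1) = k/n (assumption:
           only these n+1 distributions are evaluated).
    beta:  (n+1)^{|S|} * alpha with |S| = 2.
    worst: largest permutation average of h over all state sequences.
    """
    seqs = list(itertools.product((0, 1), repeat=n))
    alpha = 0.0
    for k in range(n + 1):
        p1 = k / n
        avg = sum(h(s) * p1 ** sum(s) * (1 - p1) ** (n - sum(s)) for s in seqs)
        alpha = max(alpha, avg)
    beta = (n + 1) ** 2 * alpha
    perms = list(itertools.permutations(range(n)))
    worst = 0.0
    for s in seqs:
        perm_avg = sum(h(tuple(s[i] for i in p)) for p in perms) / len(perms)
        worst = max(worst, perm_avg)
    return alpha, beta, worst


def indicator_h(s):
    """Hypothetical example: h is 1 on a single sequence and 0 elsewhere."""
    return 1.0 if s == (1, 1, 0, 0) else 0.0
```

Calling `rt_check(4, indicator_h)` compares the worst-case permutation average with $\beta_n$; for such a short blocklength the bound is loose, but the inequality itself can be observed directly.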
According to Corollary 2, for every $R < R_{PDF}(\mathcal{L})$, there exists a $(2^{nR(B-1)}, nB, e^{-2\theta n})$ block Markov code for the block-compound relay channel $\mathcal{L}^{\mathcal{P}(\mathcal{S}) \times B}$ for some $\theta > 0$ and sufficiently large $n$, where $B > 0$ is arbitrarily large. Recall that the code constructed in the proof in Appendix A has the following form. The encoders use $B > 0$ blocks to convey $B-1$ messages $m_b$, $b \in [1:B-1]$. Each message consists of two parts, i.e., $m_b = (m'_b, m''_b)$, where $m'_b \in [1:2^{nR'}]$ and $m''_b \in [1:2^{nR''}]$. In block $b \in [1:B]$, the encoder sends $x_b^n = f_b(m'_b, m''_b | m'_{b-1})$, with fixed $m'_0$ and $m_B$, and the relay transmits $x_{1,b}^n = f_{1,b}(y_{1,b-1}^n)$, using the sequence received in the previous block. After receiving the entire output sequence $(y_b^n)_{b=1}^{B}$, the decoder finds an estimate for the messages. Set $\hat{m}'_B = 1$. The first part of each message is decoded backwards as $\hat{m}'_b = g'_b(y_{b+1}^n, \hat{m}'_{b+1})$, for $b = B-1, B-2, \ldots, 1$. Then, the second part of each message is decoded as $\hat{m}''_b = g''_b(y_b^n, \hat{m}'_1, \ldots, \hat{m}'_{B-1})$, for $b \in [1:B-1]$. The overall blocklength is then $n \cdot B$ and the average rate is $\frac{B-1}{B}(R' + R'')$.
Given such a block Markov code C B M for the block-compound relay channel L P ( S ) × B , we have that
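$$
\Pr_{\mathcal{C}^{BM}}\big( \mathcal{E}'_b \mid (\mathcal{E}'_{b+1})^c \big) \le e^{-2\theta n}, \qquad \Pr_{\mathcal{C}^{BM}}\big( \mathcal{E}''_b \mid (\mathcal{E}'_1)^c, \ldots, (\mathcal{E}'_{b-1})^c \big) \le e^{-2\theta n},
$$
for $b = B-1, \ldots, 1$, where $\mathcal{E}'_0 = \mathcal{E}'_B = \emptyset$, and $\mathcal{E}'_b = \{ \hat{M}'_b \neq M'_b \}$, $\mathcal{E}''_b = \{ \hat{M}''_b \neq M''_b \}$, $b \in [1:B-1]$. That is, for every sequence of state distributions $q_1, \ldots, q_{b+1}$, where $q_t(s_t^n) = \prod_{i=1}^{n} q_t(s_{t,i})$ for $t \in [1:b+1]$,
$$
\sum_{s_1^n \in \mathcal{S}^n} q_1(s_1^n) \sum_{s_2^n \in \mathcal{S}^n} q_2(s_2^n) \cdots \sum_{s_{b+1}^n \in \mathcal{S}^n} q_{b+1}(s_{b+1}^n) \cdot h'_b(s_1^n, s_2^n, \ldots, s_{b+1}^n) \le e^{-2\theta n},
$$
and
$$
\sum_{s_1^n \in \mathcal{S}^n} q_1(s_1^n) \sum_{s_2^n \in \mathcal{S}^n} q_2(s_2^n) \cdots \sum_{s_b^n \in \mathcal{S}^n} q_b(s_b^n) \cdot h''_b(s_1^n, s_2^n, \ldots, s_b^n) \le e^{-2\theta n},
$$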
where
$$
\begin{aligned}
h'_b(s_1^n, s_2^n, \ldots, s_{b+1}^n) = {} & \frac{1}{2^{n(b+1)(R'+R'')}} \sum_{(m'_1,m''_1), \ldots, (m'_{b+1},m''_{b+1})} \sum_{y_{1,b}^n \in \mathcal{Y}_1^n} \Pr\big( Y_{1,b}^n = y_{1,b}^n \mid (M'_1,M''_1) = (m'_1,m''_1), \ldots, (M'_b,M''_b) = (m'_b,m''_b),\ S_1^n = s_1^n, \ldots, S_b^n = s_b^n \big) \\
& \times \sum_{y_{b+1}^n \,:\, g'_b(y_{b+1}^n, m'_{b+1}) \neq m'_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_{b+1}^n \mid f_{b+1}(m'_{b+1}, m''_{b+1}|m'_b),\ f_{1,b+1}(y_{1,b}^n),\ s_{b+1}^n \big)
\end{aligned}
$$
and
$$
\begin{aligned}
h''_b(s_1^n, s_2^n, \ldots, s_b^n) = {} & \frac{1}{2^{nR''}} \sum_{m''_b=1}^{2^{nR''}} \frac{1}{2^{nR'(B-1)}} \sum_{m'_1, \ldots, m'_{B-1}} \sum_{y_{1,b-1}^n \in \mathcal{Y}_1^n} \Pr\big( Y_{1,b-1}^n = y_{1,b-1}^n \mid (M'_1,M''_1) = (m'_1,m''_1), \ldots, (M'_{b-1},M''_{b-1}) = (m'_{b-1},m''_{b-1}),\ S_1^n = s_1^n, \ldots, S_{b-1}^n = s_{b-1}^n \big) \\
& \times \sum_{y_b^n, y_{1,b}^n \,:\, g''_b(y_b^n, m'_1, \ldots, m'_{B-1}) \neq m''_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_b^n \mid f_b(m'_b, m''_b|m'_{b-1}),\ f_{1,b}(y_{1,b-1}^n),\ s_b^n \big).
\end{aligned}
$$
The conditioning in the equations above can be explained as follows. In (A37), due to the code construction, the sequence $Y_{1,b}^n$ received at the relay in block $b \in [1:B]$ depends only on the messages $(M'_t, M''_t)$ with $t \le b$. The decoded message $\hat{M}'_b$, at the destination receiver, depends on messages $M'_t$ with $t > b$, since the receiver decodes this part of the message backwards. In (A38), since the second part of the message $M''_b$ is decoded after backward decoding is complete, the estimation of $M''_b$ at the decoder depends on the entire sequence $\hat{M}'_1, \ldots, \hat{M}'_{B-1}$. By (A35)–(A36), for every $t \in [1:b]$, $h'_b$ and $h''_b$ as functions of $s_{t+1}^n$ and $s_t^n$, respectively, satisfy (A32) with $\alpha_n = e^{-2\theta n}$, given that the state sequences in the other blocks are fixed. Hence, applying Ahlswede's RT recursively, we obtain
$$
\begin{aligned}
\frac{1}{(n!)^{b+1}} \sum_{\pi_1, \pi_2, \ldots, \pi_{b+1} \in \Pi_n} h'_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_{b+1} s_{b+1}^n) &\le (n+1)^{B|\mathcal{S}|} e^{-2\theta n} \le e^{-\theta n}, \\
\frac{1}{(n!)^{b}} \sum_{\pi_1, \pi_2, \ldots, \pi_b \in \Pi_n} h''_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_b s_b^n) &\le (n+1)^{B|\mathcal{S}|} e^{-2\theta n} \le e^{-\theta n},
\end{aligned}
$$
for all $(s_1^n, s_2^n, \ldots, s_{b+1}^n) \in \mathcal{S}^{(b+1)n}$ and sufficiently large $n$, such that $(n+1)^{B|\mathcal{S}|} \le e^{\theta n}$.
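To get a feel for how large $n$ must be for the condition $(n+1)^{B|\mathcal{S}|} \le e^{\theta n}$, the following sketch searches for the smallest such blocklength in the log domain. The function name `smallest_blocklength` and the cap `n_max` are illustrative assumptions; since $\ln(n+1)/n$ is decreasing in $n$, the condition keeps holding for every larger $n$ once it first holds.

```python
import math


def smallest_blocklength(B, state_alphabet_size, theta, n_max=10**7):
    """Smallest n >= 1 with (n+1)^(B*|S|) <= exp(theta*n), or None if n_max is hit."""
    exponent = B * state_alphabet_size
    for n in range(1, n_max + 1):
        # Compare logarithms to avoid overflow for large exponents.
        if exponent * math.log(n + 1) <= theta * n:
            return n
    return None
```

For instance, with $B = 2$, $|\mathcal{S}| = 2$ and $\theta = 1$, the polynomial factor is absorbed from a fairly short blocklength onward; smaller $\theta$ or larger $B$ pushes the threshold up.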
On the other hand, for every $\pi_1, \pi_2, \ldots, \pi_{b+1} \in \Pi_n$, we have that
$$
h'_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_{b+1} s_{b+1}^n) = \mathbb{E}\, h'_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_{b+1} s_{b+1}^n \mid M'_t, M''_t,\ t = 1, \ldots, b+1),
$$
with
$$
\begin{aligned}
& h'_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_{b+1} s_{b+1}^n \mid m'_t, m''_t,\ t = 1, \ldots, b+1) \\
&= \sum_{y_{1,1}^n, \ldots, y_{1,b}^n} \prod_{t=0}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( y_{1,t+1}^n \mid f_{t+1}(m'_{t+1}, m''_{t+1}|m'_t),\ f_{1,t+1}(y_{1,t}^n),\ \pi_{t+1} s_{t+1}^n \big) \\
&\quad \times \sum_{y_{b+1}^n \,:\, g'_b(y_{b+1}^n, m'_{b+1}) \neq m'_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_{b+1}^n \mid f_{b+1}(m'_{b+1}, m''_{b+1}|m'_b),\ f_{1,b+1}(y_{1,b}^n),\ \pi_{b+1} s_{b+1}^n \big) \\
&\stackrel{(a)}{=} \sum_{y_{1,1}^n, \ldots, y_{1,b}^n} \prod_{t=0}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( \pi_{t+1} y_{1,t+1}^n \mid f_{t+1}(m'_{t+1}, m''_{t+1}|m'_t),\ f_{1,t+1}(\pi_t y_{1,t}^n),\ \pi_{t+1} s_{t+1}^n \big) \\
&\quad \times \sum_{y_{b+1}^n \,:\, g'_b(\pi_{b+1} y_{b+1}^n, m'_{b+1}) \neq m'_b} W_{Y^n|X^n,X_1^n,S^n}\big( \pi_{b+1} y_{b+1}^n \mid f_{b+1}(m'_{b+1}, m''_{b+1}|m'_b),\ f_{1,b+1}(\pi_b y_{1,b}^n),\ \pi_{b+1} s_{b+1}^n \big) \\
&\stackrel{(b)}{=} \sum_{y_{1,1}^n, \ldots, y_{1,b}^n} \prod_{t=0}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( y_{1,t+1}^n \mid \pi_{t+1}^{-1} f_{t+1}(m'_{t+1}, m''_{t+1}|m'_t),\ \pi_{t+1}^{-1} f_{1,t+1}(\pi_t y_{1,t}^n),\ s_{t+1}^n \big) \\
&\quad \times \sum_{y_{b+1}^n \,:\, g'_b(\pi_{b+1} y_{b+1}^n, m'_{b+1}) \neq m'_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_{b+1}^n \mid \pi_{b+1}^{-1} f_{b+1}(m'_{b+1}, m''_{b+1}|m'_b),\ \pi_{b+1}^{-1} f_{1,b+1}(\pi_b y_{1,b}^n),\ s_{b+1}^n \big),
\end{aligned}
$$
where $(a)$ is obtained by changing the order of summation over $y_{1,1}^n, \ldots, y_{1,b}^n$ and $y_{b+1}^n$; and $(b)$ holds because the relay channel is memoryless. Similarly,
$$
h''_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_b s_b^n) = \mathbb{E}\, h''_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_b s_b^n \mid M'_1, \ldots, M'_{B-1}, M''_t,\ t = 1, \ldots, b),
$$
with
$$
\begin{aligned}
& h''_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_b s_b^n \mid m'_1, \ldots, m'_{B-1}, m''_t,\ t = 1, \ldots, b) \\
&= \sum_{y_{1,1}^n, \ldots, y_{1,b-1}^n} \prod_{t=1}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( y_{1,t}^n \mid f_t(m'_t, m''_t|m'_{t-1}),\ f_{1,t}(y_{1,t-1}^n),\ \pi_t s_t^n \big) \\
&\quad \times \sum_{y_b^n \,:\, g''_b(y_b^n, m'_1, \ldots, m'_{B-1}) \neq m''_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_b^n \mid f_b(m'_b, m''_b|m'_{b-1}),\ f_{1,b}(y_{1,b-1}^n),\ \pi_b s_b^n \big) \\
&\stackrel{(a)}{=} \sum_{y_{1,1}^n, \ldots, y_{1,b-1}^n} \prod_{t=1}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( \pi_t y_{1,t}^n \mid f_t(m'_t, m''_t|m'_{t-1}),\ f_{1,t}(\pi_{t-1} y_{1,t-1}^n),\ \pi_t s_t^n \big) \\
&\quad \times \sum_{y_b^n \,:\, g''_b(\pi_b y_b^n, m'_1, \ldots, m'_{B-1}) \neq m''_b} W_{Y^n|X^n,X_1^n,S^n}\big( \pi_b y_b^n \mid f_b(m'_b, m''_b|m'_{b-1}),\ f_{1,b}(\pi_{b-1} y_{1,b-1}^n),\ \pi_b s_b^n \big) \\
&\stackrel{(b)}{=} \sum_{y_{1,1}^n, \ldots, y_{1,b-1}^n} \prod_{t=1}^{b-1} W_{Y_1^n|X^n,X_1^n,S^n}\big( y_{1,t}^n \mid \pi_t^{-1} f_t(m'_t, m''_t|m'_{t-1}),\ \pi_t^{-1} f_{1,t}(\pi_{t-1} y_{1,t-1}^n),\ s_t^n \big) \\
&\quad \times \sum_{y_b^n \,:\, g''_b(\pi_b y_b^n, m'_1, \ldots, m'_{B-1}) \neq m''_b} W_{Y^n|X^n,X_1^n,S^n}\big( y_b^n \mid \pi_b^{-1} f_b(m'_b, m''_b|m'_{b-1}),\ \pi_b^{-1} f_{1,b}(\pi_{b-1} y_{1,b-1}^n),\ s_b^n \big).
\end{aligned}
$$
Then, consider the $(2^{nR(B-1)}, nB)$ random Markov block code $\mathcal{C}^{BM}_{\Pi}$, specified by
$$
f_{b,\pi}(m'_b, m''_b|m'_{b-1}) = \pi_b^{-1} f_b(m'_b, m''_b|m'_{b-1}), \qquad f_{1,b,\pi}(y_{1,b-1}^n) = \pi_b^{-1} f_{1,b}(\pi_{b-1} y_{1,b-1}^n),
$$
and
$$
g'_{b,\pi}(y_{b+1}^n, \hat{m}'_{b+1}) = g'_b(\pi_{b+1} y_{b+1}^n, \hat{m}'_{b+1}), \qquad g''_{b,\pi}(y_b^n, \hat{m}'_1, \ldots, \hat{m}'_{B-1}) = g''_b(\pi_b y_b^n, \hat{m}'_1, \ldots, \hat{m}'_{B-1}),
$$
for $\pi_1, \ldots, \pi_B \in \Pi_n$, with a uniform distribution $\mu(\pi_1, \ldots, \pi_B) = \frac{1}{|\Pi_n|^B} = \frac{1}{(n!)^B}$. That is, a set of $B$ independent permutations is chosen at random and applied to all blocks simultaneously, while the order of the blocks remains intact. As we restricted ourselves to a block Markov code, the relaying function in a given block depends only on symbols received in the previous block; hence, the relay can implement those in-block permutations, and the coding scheme does not violate the causality requirement.
From (A41) and (A43), we see that using the random code $\mathcal{C}^{BM}_{\Pi}$, the error probabilities for the messages $M'_b$ and $M''_b$ are given by
$$
\begin{aligned}
\Pr_{\mathcal{C}^{BM}_{\Pi}}\big( \mathcal{E}'_b \mid (\mathcal{E}'_{b+1})^c,\ S_1^n = s_1^n, \ldots, S_{b+1}^n = s_{b+1}^n \big) &= \sum_{\pi_1, \ldots, \pi_B \in \Pi_n} \mu(\pi_1, \ldots, \pi_B)\, h'_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_{b+1} s_{b+1}^n), \\
\Pr_{\mathcal{C}^{BM}_{\Pi}}\big( \mathcal{E}''_b \mid (\mathcal{E}'_1)^c, \ldots, (\mathcal{E}'_{B-1})^c,\ S_1^n = s_1^n, \ldots, S_b^n = s_b^n \big) &= \sum_{\pi_1, \ldots, \pi_B \in \Pi_n} \mu(\pi_1, \ldots, \pi_B)\, h''_b(\pi_1 s_1^n, \pi_2 s_2^n, \ldots, \pi_b s_b^n),
\end{aligned}
$$
for all $s_1^n, \ldots, s_{b+1}^n \in \mathcal{S}^n$, $b \in [1:B-1]$, and therefore, together with (A39), we have that the probability of error of the random code $\mathcal{C}^{BM}_{\Pi}$ is bounded by $P_e^{(n)}(q, \mathcal{C}^{BM}_{\Pi}) \le e^{-\theta n}$, for every $q(s^{nB}) \in \mathcal{P}(\mathcal{S}^{nB})$. That is, $\mathcal{C}^{BM}_{\Pi}$ is a $(2^{nR(B-1)}, nB, e^{-\theta n})$ random code for the AVRC $\mathcal{L}$, where the overall blocklength is $nB$, and the average rate $\frac{B-1}{B} \cdot R$ tends to $R$ as $B \to \infty$. This completes the proof of the partial decode-forward lower bound.

Appendix D.2. Cutset Upper Bound

The proof immediately follows from Lemma 1, since the random code capacity of the AVRC is bounded by the random code capacity of the compound relay channel, i.e., $C^{\star}(\mathcal{L}) \le C^{\star}(\mathcal{L}^{\mathcal{P}(\mathcal{S})})$. □

Appendix E. Proof of Lemma 2

We use the approach of [55], with the required adjustments. We use the random code constructed in the proof of Theorem 1. Let $R < C^{\star}(\mathcal{L})$, and consider the case where the marginal sender-relay and sender-receiver AVCs have positive capacity, i.e.,
$$
C(W_1(x_{1,1})) > 0, \quad \text{and} \quad C(W(x_{1,2})) > 0,
$$
for some $x_{1,1}, x_{1,2} \in \mathcal{X}_1$ (see (23)). By Theorem 1, for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n, \varepsilon)$ random code $\mathcal{C}^{\Gamma} = \big( \mu(\gamma) = \frac{1}{k},\ \Gamma = [1:k],\ \{ \mathcal{C}_{\gamma} \}_{\gamma \in \Gamma} \big)$, where $\mathcal{C}_{\gamma} = (f_{\gamma}^n, f_{1,\gamma}, g_{\gamma})$, for $\gamma \in \Gamma$. Following Ahlswede's Elimination Technique [55], it can be assumed that the size of the code collection is bounded by $k = |\Gamma| \le n^2$. By (A46), we have that for every $\varepsilon' > 0$ and sufficiently large $\nu'$, the code index $\gamma \in [1:k]$ can be sent through the relay channel $W_{Y_1|X,X_1,S}$ using a $(2^{\nu' \tilde{R}'}, \nu', \varepsilon')$ deterministic code $\mathcal{C}'_i = (\tilde{f}'^{\,\nu'}, \tilde{g}')$, where $\tilde{R}' > 0$, while the relay repeatedly transmits the symbol $x_{1,1}$. Since $k$ is at most polynomial, the encoder can reliably convey $\gamma$ to the relay with a negligible blocklength, i.e., $\nu' = o(n)$. Similarly, there exists a $(2^{\nu'' \tilde{R}''}, \nu'', \varepsilon'')$ code $\mathcal{C}''_i = (\tilde{f}''^{\,\nu''}, \tilde{g}'')$ for the transmission of $\gamma \in [1:k]$ through the channel $W_{Y|X,X_1,S}$ to the receiver, where $\nu'' = o(n)$ and $\tilde{R}'' > 0$, while the relay repeatedly transmits the symbol $x_{1,2}$.
Now, consider a code formed by the concatenation of $\mathcal{C}'_i$ and $\mathcal{C}''_i$ as consecutive prefixes to a corresponding code in the code collection $\{ \mathcal{C}_{\gamma} \}_{\gamma \in \Gamma}$. That is, the encoder first sends the index $\gamma$ to the relay and the receiver, and then it sends the message $m \in [1:2^{nR}]$ to the receiver. Specifically, the encoder first transmits the $(\nu' + \nu'')$-sequence $(\tilde{f}'^{\,\nu'}(\gamma), \tilde{f}''^{\,\nu''}(\gamma))$ to convey the index $\gamma$, while the relay transmits the $(\nu' + \nu'')$-sequence $(\tilde{x}_1'^{\,\nu'}, \tilde{x}_1''^{\,\nu''})$, where $\tilde{x}_1'^{\,\nu'} = (x_{1,1}, x_{1,1}, \ldots, x_{1,1})$ and $\tilde{x}_1''^{\,\nu''} = (x_{1,2}, x_{1,2}, \ldots, x_{1,2})$. At the end of this transmission, the relay uses the first $\nu'$ symbols it received to estimate the code index as $\hat{\gamma}' = \tilde{g}'(\tilde{y}_1^{\nu'})$.
Then, the message $m$ is transmitted by the codeword $x^n = f_{\gamma}(m)$, while the relay transmits $x_1^n = f_{1,\hat{\gamma}'}^n(y_1^n)$. Subsequently, decoding is performed in two stages as well; the decoder first estimates the index, with $\hat{\gamma}'' = \tilde{g}''(\tilde{y}^{\nu''})$, and the message is then estimated by $\hat{m} = g_{\hat{\gamma}''}(y^n)$. By the union of events bound, the probability of error is then bounded by $\varepsilon_c = \varepsilon + \varepsilon' + \varepsilon''$, for every joint distribution in $\mathcal{P}(\mathcal{S}^{\nu' + \nu'' + n})$. That is, the concatenated code is a $(2^{(\nu' + \nu'' + n) \tilde{R}_n}, \nu' + \nu'' + n, \varepsilon_c)$ code over the AVRC $\mathcal{L}$, where the blocklength is $n + o(n)$, and the rate $\tilde{R}_n = \frac{n}{\nu' + \nu'' + n} \cdot R$ approaches $R$ as $n \to \infty$. □
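The rate loss caused by the two prefixes can be made concrete with a small calculation. The sketch below assumes, for illustration only, that the code collection has exactly $k = n^2$ members and that each prefix code carries $\log_2 k$ bits at a fixed positive rate; the function name `concatenated_rate` and its parameters are hypothetical.

```python
import math


def concatenated_rate(n, R, prefix_rate_relay, prefix_rate_receiver):
    """Rate of the concatenated code and the two prefix lengths.

    Assumes k = n^2 code indices, so each prefix must carry log2(n^2) bits.
    """
    index_bits = math.log2(n ** 2)
    nu_relay = math.ceil(index_bits / prefix_rate_relay)        # prefix to the relay
    nu_receiver = math.ceil(index_bits / prefix_rate_receiver)  # prefix to the receiver
    rate = n / (nu_relay + nu_receiver + n) * R
    return rate, nu_relay, nu_receiver
```

Because the prefix lengths grow only logarithmically in $n$ under these assumptions, the ratio $n / (\nu' + \nu'' + n)$ tends to one and the rate approaches $R$.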

Appendix F. Proof of Corollary 4

Consider part 1. By Definition 3, if $W_{Y_1|X,X_1,S}$ and $W_{Y|X,X_1,S}$ are not symmetrizable-$\mathcal{X}|\mathcal{X}_1$ then there exist $x_{1,1}, x_{1,2} \in \mathcal{X}_1$ such that the DMCs $W_{Y_1|X,X_1,S}(\cdot|\cdot, x_{1,1}, \cdot)$ and $W_{Y|X,X_1,S}(\cdot|\cdot, x_{1,2}, \cdot)$ are non-symmetrizable in the sense of [57] (Definition 2). This, in turn, implies that $C(W_1(x_{1,1})) > 0$ and $C(W(x_{1,2})) > 0$, due to [57] (Theorem 1). Hence, by Lemma 2, $C(\mathcal{L}) = C^{\star}(\mathcal{L})$, and by Theorem 1, $R_{PDF}(\mathcal{L}) \le C(\mathcal{L}) \le R_{CS}(\mathcal{L})$.
Part 3 immediately follows from part 1 and Corollary 3. As for part 2, consider a strongly reversely degraded relay channel. We claim that if $W_{Y|X,X_1,S}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then $W_{Y_1|X,X_1,S}$ is also symmetrizable-$\mathcal{X}|\mathcal{X}_1$. Indeed, suppose that $W_{Y|X,X_1,S}$ is symmetrized by some $J(s|x,x_1)$ (see Definition 24). Then, for every $x, \tilde{x} \in \mathcal{X}$, $x_1 \in \mathcal{X}_1$, and $y_1 \in \mathcal{Y}_1$,
$$
\begin{aligned}
\sum_{s \in \mathcal{S}} J(s|\tilde{x},x_1) W_{Y_1|X,X_1,S}(y_1|x,x_1,s)
&= \sum_{s \in \mathcal{S}} J(s|\tilde{x},x_1) \sum_{y \in \mathcal{Y}} W_{Y,Y_1|X,X_1,S}(y,y_1|x,x_1,s) \\
&\stackrel{(a)}{=} \sum_{y \in \mathcal{Y}} W_{Y_1|Y,X_1}(y_1|y,x_1) \sum_{s \in \mathcal{S}} J(s|\tilde{x},x_1) W_{Y|X,X_1,S}(y|x,x_1,s) \\
&\stackrel{(b)}{=} \sum_{y \in \mathcal{Y}} W_{Y_1|Y,X_1}(y_1|y,x_1) \sum_{s \in \mathcal{S}} J(s|x,x_1) W_{Y|X,X_1,S}(y|\tilde{x},x_1,s) \\
&\stackrel{(c)}{=} \sum_{s \in \mathcal{S}} J(s|x,x_1) \sum_{y \in \mathcal{Y}} W_{Y,Y_1|X,X_1,S}(y,y_1|\tilde{x},x_1,s) \\
&= \sum_{s \in \mathcal{S}} J(s|x,x_1) W_{Y_1|X,X_1,S}(y_1|\tilde{x},x_1,s),
\end{aligned}
$$
where $(a)$ and $(c)$ hold since $W_{Y,Y_1|X,X_1,S}$ is strongly reversely degraded, and $(b)$ holds since $W_{Y|X,X_1,S}$ is symmetrized by $J(s|x,x_1)$. This means that $W_{Y_1|X,X_1,S}$ is also symmetrizable-$\mathcal{X}|\mathcal{X}_1$. It can be deduced that given the conditions of part 2, both $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$ are non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$. Hence, the proof follows from part 1 and Corollary 3. □
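The symmetrizing condition used in step $(b)$ can be tested mechanically on small alphabets. The sketch below is a toy illustration, not taken from the paper: the relay input $x_1$ is held fixed and suppressed, the channel is a noiseless adder $Y = X + S$, and the names `is_symmetrized`, `adder_channel`, `copy_jammer`, and `constant_jammer` are hypothetical.

```python
from itertools import product


def is_symmetrized(W, J, inputs, states, outputs, tol=1e-12):
    """Check sum_s J(s|x~) W(y|x,s) == sum_s J(s|x) W(y|x~,s) for all x, x~, y."""
    for x, x_tilde, y in product(inputs, inputs, outputs):
        lhs = sum(J(s, x_tilde) * W(y, x, s) for s in states)
        rhs = sum(J(s, x) * W(y, x_tilde, s) for s in states)
        if abs(lhs - rhs) > tol:
            return False
    return True


def adder_channel(y, x, s):
    """Toy channel W(y|x,s): the output is the integer sum of input and state."""
    return 1.0 if y == x + s else 0.0


def copy_jammer(s, x):
    """J(s|x) that sets the state equal to the given input symbol."""
    return 1.0 if s == x else 0.0


def constant_jammer(s, x):
    """J(s|x) that always selects state 0, regardless of the input."""
    return 1.0 if s == 0 else 0.0
```

With binary inputs and states and outputs in {0, 1, 2}, the copying jammer symmetrizes the adder channel because $x + \tilde{x} = \tilde{x} + x$, whereas the constant jammer does not.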

Appendix G. Proof of Lemma 3

The proof is based on generalizing the technique by [56]. Let $\mathcal{L}$ be symmetrizable-$\mathcal{X}|\mathcal{X}_1$. Assume to the contrary that a positive rate $R > 0$ can be achieved. That is, for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n, \varepsilon)$ code $\mathcal{C} = (f, f_1, g)$. Hence, the size of the message set is at least 2, i.e.,
$$
\mathsf{M} \triangleq 2^{nR} \ge 2.
$$
We now show that there exists a distribution $q(s^n)$ such that the probability of error $P_e^{(n)}(q, \mathcal{C})$ is bounded from below by a positive constant, in contradiction to the assumption above.
By Definition 3, there exists a conditional distribution $J(s|x)$ that satisfies (24). Then, consider the state sequence distribution $q(s^n) = \frac{1}{\mathsf{M}} \sum_{m=1}^{\mathsf{M}} J^n(s^n|x^n(m))$, where $J^n(s^n|x^n) = \prod_{i=1}^{n} J(s_i|x_i)$ and $x^n(m) = f(m)$. For this distribution, the probability of error is given by
$$
\begin{aligned}
P_e^{(n)}(q, \mathcal{C}) &= \sum_{s^n \in \mathcal{S}^n} \frac{1}{\mathsf{M}} \sum_{\tilde{m}=1}^{\mathsf{M}} J^n(s^n|x^n(\tilde{m})) \cdot \frac{1}{\mathsf{M}} \sum_{m=1}^{\mathsf{M}} \sum_{(y^n,y_1^n) \,:\, g(y^n) \neq m} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) \\
&= \frac{1}{2\mathsf{M}^2} \sum_{m=1}^{\mathsf{M}} \sum_{\tilde{m}=1}^{\mathsf{M}} \sum_{(y^n,y_1^n) \,:\, g(y^n) \neq m} \sum_{s^n \in \mathcal{S}^n} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) J^n(s^n|x^n(\tilde{m})) \\
&\quad + \frac{1}{2\mathsf{M}^2} \sum_{m=1}^{\mathsf{M}} \sum_{\tilde{m}=1}^{\mathsf{M}} \sum_{(y^n,y_1^n) \,:\, g(y^n) \neq \tilde{m}} \sum_{s^n \in \mathcal{S}^n} W^n(y^n,y_1^n|x^n(\tilde{m}), f_1^n(y_1^n), s^n) J^n(s^n|x^n(m))
\end{aligned}
$$
with $W^n \equiv W_{Y^n,Y_1^n|X^n,X_1^n,S^n}$ for short notation, where in the last sum we interchanged the summation indices $m$ and $\tilde{m}$. Then, consider the last sum, and observe that by (24), we have that
$$
\begin{aligned}
\sum_{s^n \in \mathcal{S}^n} W^n(y^n,y_1^n|x^n(\tilde{m}), f_1^n(y_1^n), s^n) J^n(s^n|x^n(m))
&= \prod_{i=1}^{n} \sum_{s_i \in \mathcal{S}} W(y_i,y_{1,i}|x_i(\tilde{m}), f_{1,i}(y_1^{i-1}), s_i) J(s_i|x_i(m)) \\
&= \prod_{i=1}^{n} \sum_{s_i \in \mathcal{S}} W(y_i,y_{1,i}|x_i(m), f_{1,i}(y_1^{i-1}), s_i) J(s_i|x_i(\tilde{m})) \\
&= \sum_{s^n \in \mathcal{S}^n} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) J^n(s^n|x^n(\tilde{m})).
\end{aligned}
$$
Substituting (A50) in (A49), we have
$$
\begin{aligned}
P_e^{(n)}(q, \mathcal{C}) &= \frac{1}{2\mathsf{M}^2} \sum_{m=1}^{\mathsf{M}} \sum_{\tilde{m}=1}^{\mathsf{M}} \sum_{s^n \in \mathcal{S}^n} \Bigg[ \sum_{(y^n,y_1^n) \,:\, g(y^n) \neq m} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) J^n(s^n|x^n(\tilde{m})) \\
&\qquad\qquad + \sum_{(y^n,y_1^n) \,:\, g(y^n) \neq \tilde{m}} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) J^n(s^n|x^n(\tilde{m})) \Bigg] \\
&\ge \frac{1}{2\mathsf{M}^2} \sum_{m=1}^{\mathsf{M}} \sum_{\tilde{m} \neq m} \sum_{s^n \in \mathcal{S}^n} \sum_{y^n, y_1^n} W^n(y^n,y_1^n|x^n(m), f_1^n(y_1^n), s^n) J^n(s^n|x^n(\tilde{m})) = \frac{\mathsf{M}(\mathsf{M}-1)}{2\mathsf{M}^2} \ge \frac{1}{4},
\end{aligned}
$$
where the last inequality follows from (A48), hence a positive rate cannot be achieved. □
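The two closing steps of the bound above can be spelled out as a short worked derivation. It is added here for clarity and uses only (A48), the normalization of $W^n$ and $J^n$, and the fact that the decoder outputs a single message; the set notation $\mathcal{D}_m$ is introduced for this illustration only.

```latex
% For \tilde{m} \neq m, every pair (y^n, y_1^n) satisfies g(y^n) \neq m or g(y^n) \neq \tilde{m},
% so the two decoding-error regions cover the whole output space:
\mathcal{D}_m \triangleq \{ (y^n, y_1^n) : g(y^n) \neq m \}, \qquad
\mathcal{D}_m \cup \mathcal{D}_{\tilde{m}} = \mathcal{Y}^n \times \mathcal{Y}_1^n
\quad \text{for } \tilde{m} \neq m .

% Dropping the terms with \tilde{m} = m and using the covering property gives the first inequality.
% Each remaining inner sum equals one, since W^n and J^n are probability distributions:
\sum_{s^n} \sum_{y^n, y_1^n} W^n(y^n, y_1^n | x^n(m), f_1^n(y_1^n), s^n) \, J^n(s^n | x^n(\tilde{m})) = 1 .

% There are M(M-1) ordered pairs (m, \tilde{m}) with \tilde{m} \neq m, hence
\frac{\mathsf{M}(\mathsf{M}-1)}{2\mathsf{M}^2}
= \frac{1}{2}\left( 1 - \frac{1}{\mathsf{M}} \right)
\ge \frac{1}{2}\left( 1 - \frac{1}{2} \right) = \frac{1}{4}
\quad \text{for } \mathsf{M} \ge 2 .
```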

Appendix H. Proof of Lemma 4