The Arbitrarily Varying Relay Channel †

We study the arbitrarily varying relay channel, which models communication with relaying in the presence of an active adversary. We establish the cutset bound and partial decode-forward bound on the random code capacity. We further determine the random code capacity for special cases. Then, we consider conditions under which the deterministic code capacity is determined as well. In addition, we consider the arbitrarily varying Gaussian relay channel with sender frequency division under input and state constraints. We determine the random code capacity, and establish lower and upper bounds on the deterministic code capacity. Furthermore, we show that, in contrast to previous relay models, the primitive relay channel behaves differently from the non-primitive relay channel in the arbitrarily varying scenario.


I. INTRODUCTION
The relay channel was first introduced by van der Meulen [20] to describe point-to-point communication with the help of a relay, which receives a noisy version of the transmitter's signal and transmits a signal of its own to the destination receiver, in a strictly causal manner. The capacity of the relay channel is not known in general; however, Cover and El Gamal established the cutset upper bound, the decode-forward lower bound, and the partial decode-forward lower bound [9]. It was also shown in [9] that for the reversely degraded relay channel, direct transmission is capacity achieving. For the degraded relay channel, the decode-forward bound and the cutset bound coincide, thus characterizing the capacity for this model [9]. In general, the partial decode-forward lower bound is tighter than both the direct transmission and the decode-forward lower bounds. El Gamal and Zahedi [16] determined the capacity of the relay channel with orthogonal sender components, by showing that the partial decode-forward bound and the cutset bound coincide. Recently, there has been growing interest in the Gaussian relay channel under input constraints, considered e.g. in [16, 29, 7, 19, 28, 26, 27]. In particular, El Gamal and Zahedi [16] determined the capacity of the Gaussian relay channel with sender frequency division, as a special case of a relay channel with orthogonal sender components.
In practice, the channel statistics are not necessarily known exactly, and they may even change over time. The arbitrarily varying channel (AVC) is an appropriate model for such a situation [5]. Considering the AVC without a relay, Blackwell et al. determined the random code capacity [5], i.e. the capacity achieved by coding schemes with a stochastic encoder and a stochastic decoder that share common randomness. It was also demonstrated in [5] that the random code capacity is not necessarily achievable using deterministic codes. A well-known result by Ahlswede [1] is the dichotomy property of the AVC: the deterministic code capacity either equals the random code capacity or else is zero. Subsequently, Ericson [15] and Csiszár and Narayan [13] established a simple single-letter condition, namely non-symmetrizability, which is both necessary and sufficient for the deterministic code capacity to be positive.
In this work, we study the arbitrarily varying relay channel (AVRC), which combines the two models above, i.e. the relay channel and the AVC. In the analysis, we incorporate the block Markov coding schemes of [9] into Ahlswede's Robustification and Elimination Techniques [1, 2]. We establish the cutset upper bound and the full/partial decode-forward lower bound on the random code capacity of the AVRC. We determine the random code capacity for the special cases of the degraded AVRC, the reversely degraded AVRC, and the AVRC with orthogonal sender components. Then, we give extended symmetrizability conditions under which the deterministic code capacity coincides with the random code capacity, and conditions under which it is zero. We show by example that the deterministic code capacity can be strictly lower than the random code capacity of the AVRC. Furthermore, we consider the Gaussian AVRC with sender frequency division (SFD), under input and state constraints. The random code capacity is determined using the previous results, whereas the deterministic code capacity is lower and upper bounded using an independent approach. Specifically, we extend the techniques of [11], where Csiszár and Narayan determine the capacity of the Gaussian AVC under input and state constraints.
This work is organized as follows. In Section II, the basic definitions and notation are provided. In Section III, we give the main results on the general AVRC. The Gaussian AVRC with SFD is introduced in Section IV, and the corresponding main results are given in Section V. The proofs are given in the appendices.
This work was supported by the Israel Science Foundation (grant No. 1285/16).

A. Notation
We use the following notation conventions throughout. Calligraphic letters $\mathcal{X}, \mathcal{S}, \mathcal{Y}, \ldots$ are used for finite sets. Lowercase letters $x, s, y, \ldots$ stand for constants and values of random variables, and uppercase letters $X, S, Y, \ldots$ stand for random variables. The distribution of a random variable $X$ is specified by a probability mass function (pmf) $P_X(x) = p(x)$ over a finite set $\mathcal{X}$. The set of all pmfs over $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. We use $x^j = (x_1, x_2, \ldots, x_j)$ to denote a sequence of letters from $\mathcal{X}$. A random sequence $X^n$ and its distribution $P_{X^n}(x^n) = p(x^n)$ are defined accordingly. For a pair of integers $i$ and $j$, $1 \le i \le j$, we define the discrete interval $[i:j] = \{i, i+1, \ldots, j\}$. The notation $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is used when it is understood from the context that the length of the sequence is $n$, and the $\ell_2$-norm of $\mathbf{x}$ is denoted by $\|\mathbf{x}\|$.

B. Channel Description
A state-dependent discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ consists of five sets, $\mathcal{X}, \mathcal{X}_1, \mathcal{S}, \mathcal{Y}$ and $\mathcal{Y}_1$, and a collection of conditional pmfs $W_{Y,Y_1|X,X_1,S}$. The sets stand for the input alphabet, the relay transmission alphabet, the state alphabet, the output alphabet, and the relay input alphabet, respectively. The alphabets are assumed to be finite, unless explicitly stated otherwise. The channel is memoryless without feedback, and therefore
$$W^n(y^n, y_1^n \,|\, x^n, x_1^n, s^n) = \prod_{i=1}^n W_{Y,Y_1|X,X_1,S}(y_i, y_{1,i} \,|\, x_i, x_{1,i}, s_i).$$
The arbitrarily varying relay channel (AVRC) is a discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ with a state sequence of unknown distribution, not necessarily independent nor stationary. That is, $S^n \sim q(s^n)$ with an unknown joint pmf $q(s^n)$ over $\mathcal{S}^n$. In particular, $q(s^n)$ can give mass $1$ to some state sequence $s^n$. We use the shorthand notation $\mathcal{L} = \{W_{Y,Y_1|X,X_1,S}\}$ for the AVRC, where the alphabets are understood from the context.
To analyze the AVRC, we consider the compound relay channel. Different models of compound relay channels have been considered in the literature [24, 3]. Here, we define the compound relay channel as a discrete memoryless relay channel $(\mathcal{X}, \mathcal{X}_1, \mathcal{S}, W_{Y,Y_1|X,X_1,S}, \mathcal{Y}, \mathcal{Y}_1)$ with a discrete memoryless state, where the state distribution $q(s)$ is not known exactly, but rather belongs to a family of distributions $\mathcal{Q}$, with $\mathcal{Q} \subseteq \mathcal{P}(\mathcal{S})$. That is, $S^n \sim \prod_{i=1}^n q(s_i)$, with an unknown pmf $q \in \mathcal{Q}$ over $\mathcal{S}$. We use the shorthand notation $\mathcal{L}^{\mathcal{Q}}$ for the compound relay channel, where the transition probability $W_{Y,Y_1|X,X_1,S}$ and the alphabets are understood from the context.
In the analysis, we also use the following model. Suppose that the user transmits $B > 0$ blocks of length $n$, and the jammer is entitled to use a different state distribution $q_b(s) \in \mathcal{Q}$ for every block $b \in [1:B]$, while the encoder, relay and receiver are aware of this jamming scheme. In other words, every block is governed by a different memoryless state. We refer to this channel as the block-compound relay channel, denoted by $\mathcal{L}^{\mathcal{Q} \times B}$. Although this is a toy model, it is a useful tool for the analysis of the AVRC.

C. Coding
We introduce some preliminary definitions, starting with the definitions of a deterministic code and a random code for the AVRC $\mathcal{L}$. Note that in general, the term 'code', unless mentioned otherwise, refers to a deterministic code.

Definition 1 (A code, an achievable rate and capacity). A $(2^{nR}, n)$ code for the AVRC $\mathcal{L}$ consists of the following: a message set $[1:2^{nR}]$, where it is assumed throughout that $2^{nR}$ is an integer; an encoder $f : [1:2^{nR}] \to \mathcal{X}^n$; a sequence of relay encoding functions $f_{1,i} : \mathcal{Y}_1^{i-1} \to \mathcal{X}_1$, for $i \in [1:n]$; and a decoding function $g : \mathcal{Y}^n \to [1:2^{nR}]$. Given a message $m \in [1:2^{nR}]$, the encoder transmits $x^n = f(m)$. At time $i \in [1:n]$, the relay transmits $x_{1,i} = f_{1,i}(y_1^{i-1})$ and then receives $y_{1,i}$. The relay codeword is thus given by $x_1^n = f_1^n(y_1^{n-1}) \triangleq \big(f_{1,1}, f_{1,2}(y_{1,1}), \ldots, f_{1,n}(y_1^{n-1})\big)$. The decoder receives the output sequence $y^n$ and finds an estimate of the message $\hat{m} = g(y^n)$ (see Figure 1). We denote the code by $\mathscr{C} = (f, f_1^n, g)$. Define the conditional probability of error of the code $\mathscr{C}$ given a state sequence $s^n \in \mathcal{S}^n$ by
$$P^{(n)}_{e|s^n}(\mathscr{C}) = \frac{1}{2^{nR}} \sum_{m=1}^{2^{nR}} \Pr\big( g(Y^n) \neq m \,\big|\, s^n,\ m \text{ sent} \big).$$
Now, define the average probability of error of $\mathscr{C}$ for some distribution $q(s^n) \in \mathcal{P}(\mathcal{S}^n)$,
$$P^{(n)}_e(q, \mathscr{C}) = \sum_{s^n \in \mathcal{S}^n} q(s^n)\, P^{(n)}_{e|s^n}(\mathscr{C}).$$
Observe that $P^{(n)}_e(q, \mathscr{C})$ is linear in $q$, and thus continuous. We say that $\mathscr{C}$ is a $(2^{nR}, n, \varepsilon)$ code for the AVRC $\mathcal{L}$ if it further satisfies $P^{(n)}_e(q, \mathscr{C}) \le \varepsilon$ for every $q(s^n) \in \mathcal{P}(\mathcal{S}^n)$. A rate $R$ is called achievable if for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n, \varepsilon)$ code. The operational capacity is defined as the supremum of the achievable rates and is denoted by $C(\mathcal{L})$. We use the term 'capacity' referring to this operational meaning, and in some places we call it the deterministic code capacity in order to emphasize that achievability is measured with respect to deterministic codes.
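To make the error criterion concrete, the following sketch estimates $P^{(n)}_{e|s^n}(\mathscr{C})$ by Monte Carlo simulation for a small discrete relay channel. The encoder f, relay map f1, decoder g, and channel W are placeholders to be supplied by the user; they are illustrative and not part of the model above.

import numpy as np

rng = np.random.default_rng(0)

def error_prob_given_state(f, f1, g, W, s_seq, n, num_msgs, trials=10_000):
    # Monte Carlo estimate of P_e|s^n(C), the average probability of error
    # of the code C = (f, f1^n, g) given a fixed state sequence s^n.
    #   f:  encoder, message m -> codeword x^n (integer array of length n)
    #   f1: relay map, (time i, past outputs y1^{i-1}) -> symbol x1_i
    #   g:  decoder, y^n -> message estimate
    #   W:  dict mapping (x, x1, s) to a 2-D pmf array over (y, y1)
    errors = 0
    for _ in range(trials):
        m = rng.integers(num_msgs)            # message drawn uniformly
        x = f(m)
        y = np.zeros(n, dtype=int)
        y1 = np.zeros(n, dtype=int)
        for i in range(n):
            x1_i = f1(i, y1[:i])              # strictly causal relaying
            p = W[(x[i], x1_i, s_seq[i])]     # one memoryless channel use
            flat = rng.choice(p.size, p=p.ravel())
            y[i], y1[i] = np.unravel_index(flat, p.shape)
        errors += int(g(y) != m)
    return errors / trials

Averaging this quantity over $s^n \sim q(s^n)$ yields $P^{(n)}_e(q, \mathscr{C})$, matching the definition above.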
We proceed now to define the parallel quantities for triplets of a stochastic encoder, stochastic relay and stochastic decoder with common randomness. The codes formed by these triplets are referred to as random codes.
Definition 2 (Random code). A $(2^{nR}, n)$ random code for the AVRC $\mathcal{L}$ consists of a collection of $(2^{nR}, n)$ codes $\{\mathscr{C}_\gamma = (f_\gamma, f^n_{1,\gamma}, g_\gamma)\}_{\gamma \in \Gamma}$, along with a probability distribution $\mu(\gamma)$ over the code collection $\Gamma$. We denote such a code by $\mathscr{C}^\Gamma = (\mu, \Gamma, \{\mathscr{C}_\gamma\}_{\gamma \in \Gamma})$. Analogously to the deterministic case, a $(2^{nR}, n, \varepsilon)$ random code has the additional requirement
$$P^{(n)}_e(q, \mathscr{C}^\Gamma) = \sum_{\gamma \in \Gamma} \mu(\gamma)\, P^{(n)}_e(q, \mathscr{C}_\gamma) \le \varepsilon, \quad \text{for every } q(s^n) \in \mathcal{P}(\mathcal{S}^n).$$
The capacity achieved by random codes is denoted by $C^\star(\mathcal{L})$, and it is referred to as the random code capacity.

III. MAIN RESULTS - GENERAL AVRC
We present our results on the compound relay channel and the AVRC.

A. The Compound Relay Channel
We establish the cutset upper bound and the partial decode-forward lower bound for the compound relay channel. Consider a given compound relay channel $\mathcal{L}^{\mathcal{Q}}$, and define
$$R_{CS}(\mathcal{L}^{\mathcal{Q}}) = \inf_{q \in \mathcal{Q}} \max_{p(x,x_1)} \min\big\{ I_q(X,X_1;Y),\ I_q(X;Y,Y_1|X_1) \big\}, \quad (8)$$
$$R_{DF}(\mathcal{L}^{\mathcal{Q}}) = \max_{p(u,x,x_1)} \Big[ \min\big\{ \inf_{q \in \mathcal{Q}} I_q(U;Y_1|X_1),\ \inf_{q \in \mathcal{Q}} I_q(U,X_1;Y) \big\} + \inf_{q \in \mathcal{Q}} I_q(X;Y|U,X_1) \Big], \quad (9)$$
where the subscripts 'CS' and 'DF' stand for 'cutset' and 'decode-forward', respectively.
Lemma 1. The capacity of the compound relay channel $\mathcal{L}^{\mathcal{Q}}$ is bounded by
$$R_{DF}(\mathcal{L}^{\mathcal{Q}}) \le C(\mathcal{L}^{\mathcal{Q}}) \le R_{CS}(\mathcal{L}^{\mathcal{Q}}).$$
Specifically, if $R < R_{DF}(\mathcal{L}^{\mathcal{Q}})$, then there exists a $(2^{nR}, n, e^{-an})$ block Markov code over $\mathcal{L}^{\mathcal{Q}}$ for sufficiently large $n$ and some $a > 0$.
The proof of Lemma 1 is given in Appendix A. Observe that taking $U = \emptyset$ in (9) gives the direct transmission lower bound,
$$R_{DT}(\mathcal{L}^{\mathcal{Q}}) = \max_{p(x,x_1)} \inf_{q \in \mathcal{Q}} I_q(X;Y|X_1). \quad (12)$$
Taking $U = X$ in (9) results in the full decode-forward lower bound,
$$R_{FDF}(\mathcal{L}^{\mathcal{Q}}) = \max_{p(x,x_1)} \min\Big\{ \inf_{q \in \mathcal{Q}} I_q(X;Y_1|X_1),\ \inf_{q \in \mathcal{Q}} I_q(X,X_1;Y) \Big\}. \quad (13)$$
This yields the following corollary.

Corollary 2. 1) If the compound relay channel $\mathcal{L}^{\mathcal{Q}}$ is reversely degraded, then $C(\mathcal{L}^{\mathcal{Q}}) = R_{DT}(\mathcal{L}^{\mathcal{Q}})$. 2) If $\mathcal{L}^{\mathcal{Q}}$ is degraded, then $C(\mathcal{L}^{\mathcal{Q}}) = R_{FDF}(\mathcal{L}^{\mathcal{Q}})$.

The proof of Corollary 2 is given in Appendix B. Part 1 follows from the direct transmission and cutset bounds, (12) and (8), respectively, while part 2 is based on the full decode-forward and cutset bounds, (13) and (8), respectively. The following corollary is a direct consequence of Lemma 1, and it is significant for the random code analysis of the AVRC.

Corollary 3. The capacity of the block-compound relay channel $\mathcal{L}^{\mathcal{Q} \times B}$ is bounded by
$$R_{DF}(\mathcal{L}^{\mathcal{Q}}) \le C(\mathcal{L}^{\mathcal{Q} \times B}) \le R_{CS}(\mathcal{L}^{\mathcal{Q}}). \quad (18)\text{-}(19)$$
Specifically, if $R < R_{DF}(\mathcal{L}^{\mathcal{Q}})$, then there exists a $(2^{nR}, n, e^{-an})$ block Markov code over $\mathcal{L}^{\mathcal{Q} \times B}$ for sufficiently large $n$ and some $a > 0$.
The proof of Corollary 3 is given in Appendix C.
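As a numerical illustration of the quantities in Lemma 1, the following sketch evaluates the inner $\min\{\cdot,\cdot\}$ of the cutset expression for a fixed input pmf, taking the infimum over a grid of state distributions. The toy channel ($Y = X \oplus S$, $Y_1 = X \oplus X_1$) and the uniform input pmf are arbitrary choices for illustration only; since the input pmf is fixed rather than maximized, the result merely lower-bounds the corresponding max-min expression.

import numpy as np
from itertools import product

def mi(pxy):
    # I(X;Y) in bits for a joint pmf given as a 2-D array
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    m = pxy > 0
    return float((pxy[m] * np.log2(pxy[m] / (px @ py)[m])).sum())

# Toy binary AVRC, W[x, x1, s, y, y1]: Y = X xor S, Y1 = X xor X1
W = np.zeros((2, 2, 2, 2, 2))
for x, x1, s in product(range(2), repeat=3):
    W[x, x1, s, x ^ s, x ^ x1] = 1.0

p = np.full((2, 2), 0.25)                          # fixed input pmf p(x, x1)
Q = [np.array([1 - t, t]) for t in np.linspace(0, 1, 21)]

def cutset_inner(q):
    j = np.einsum('ab,s,absyz->absyz', p, q, W)    # p(x, x1, s, y, y1)
    i1 = mi(j.sum(axis=(2, 4)).reshape(4, 2))      # I_q(X, X1; Y)
    i2 = 0.0                                       # I_q(X; Y, Y1 | X1)
    for x1 in range(2):
        block = j[:, x1].sum(axis=1)               # p(x, y, y1, X1 = x1)
        px1 = block.sum()
        i2 += px1 * mi((block / px1).reshape(2, 4))
    return min(i1, i2)

bound = min(cutset_inner(q) for q in Q)            # inf over the grid of q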

B. The AVRC
We give lower and upper bounds on the random code capacity and the deterministic code capacity of the AVRC $\mathcal{L}$.

1) Random Code Lower and Upper Bounds
Theorem 4. The random code capacity of an AVRC $\mathcal{L}$ is bounded by
$$R^\star_{DF}(\mathcal{L}) \le C^\star(\mathcal{L}) \le R^\star_{CS}(\mathcal{L}), \quad \text{where } R^\star_{DF}(\mathcal{L}) \triangleq R_{DF}(\mathcal{L}^{\mathcal{P}(\mathcal{S})}),\ R^\star_{CS}(\mathcal{L}) \triangleq R_{CS}(\mathcal{L}^{\mathcal{P}(\mathcal{S})}). \quad (20)$$

The proof of Theorem 4 is given in Appendix D. Together with Corollary 2, this yields another corollary, Corollary 5, for the degraded and the reversely degraded AVRC. Before we proceed to the deterministic code capacity, we note that Ahlswede's Elimination Technique [1] can be applied to the AVRC as well. Hence, the size of the code collection of any reliable random code can be reduced to polynomial size.
2) Deterministic Code Lower and Upper Bounds: In the next statements, we characterize the deterministic code capacity of the AVRC $\mathcal{L}$. We consider conditions under which the deterministic code capacity coincides with the random code capacity, and conditions under which it is lower. For every $x_1 \in \mathcal{X}_1$, let $\mathcal{W}_1(x_1)$ and $\mathcal{W}(x_1)$ denote the marginal AVCs from the sender to the relay and from the sender to the destination receiver, respectively, i.e. $\mathcal{W}_1(x_1) = \{W_{Y_1|X,X_1,S}(\cdot|\cdot,x_1,\cdot)\}$ and $\mathcal{W}(x_1) = \{W_{Y|X,X_1,S}(\cdot|\cdot,x_1,\cdot)\}$, where $W_{Y_1|X,X_1,S}$ and $W_{Y|X,X_1,S}$ are the marginals of $W_{Y,Y_1|X,X_1,S}$.

Lemma 6. If the marginal sender-relay and sender-receiver AVCs have positive capacities, i.e. $C(\mathcal{W}_1(x_{1,1})) > 0$ and $C(\mathcal{W}(x_{1,2})) > 0$ for some $x_{1,1}, x_{1,2} \in \mathcal{X}_1$, then the capacity of the AVRC $\mathcal{L}$ coincides with the random code capacity, i.e. $C(\mathcal{L}) = C^\star(\mathcal{L})$.
The proof of Lemma 6 is given in Appendix E. Next, we give a computable sufficient condition under which the deterministic code capacity coincides with the random code capacity. For the point-to-point AVC, this occurs if and only if the channel is non-symmetrizable [15], [13, Definition 2]. Our condition here is given in terms of an extended definition of symmetrizability, akin to [17, Definition 9].
Definition 3 (Symmetrizability-$\mathcal{X}|\mathcal{X}_1$). A state-dependent relay channel $W_{Y,Y_1|X,X_1,S}$ is said to be symmetrizable-$\mathcal{X}|\mathcal{X}_1$ if, for some conditional distribution $J(s|x,x_1)$,
$$\sum_{s \in \mathcal{S}} J(s|\tilde{x}, x_1)\, W_{Y,Y_1|X,X_1,S}(y,y_1|x,x_1,s) = \sum_{s \in \mathcal{S}} J(s|x, x_1)\, W_{Y,Y_1|X,X_1,S}(y,y_1|\tilde{x},x_1,s),$$
for all $x, \tilde{x} \in \mathcal{X}$, $x_1 \in \mathcal{X}_1$, $y \in \mathcal{Y}$ and $y_1 \in \mathcal{Y}_1$. Equivalently, for every given $x_1 \in \mathcal{X}_1$, the channel $W_{Y,Y_1|X,X_1,S}(\cdot,\cdot|\cdot,x_1,\cdot)$ is symmetrizable in the sense of [13]. A similar definition applies to the marginals $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$.
Corollary 7. Let $\mathcal{L}$ be an AVRC. 1) If the marginals $W_{Y_1|X,X_1,S}$ and $W_{Y|X,X_1,S}$ are both non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then $C(\mathcal{L}) = C^\star(\mathcal{L})$. 2) If $\mathcal{L}$ is reversely degraded and $W_{Y|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then $C(\mathcal{L}) = C^\star(\mathcal{L})$. 3) If $W_{Y,Y_1|X,X_1,S}$ is degraded, such that $W_{Y,Y_1|X,X_1,S} = W_{Y_1|X,X_1} W_{Y|Y_1,X_1,S}$, where $W_{Y|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$ and $W_{Y_1|X,X_1}(y_1|x,x_1) \neq W_{Y_1|X,X_1}(y_1|\tilde{x},x_1)$ for some $x, \tilde{x} \in \mathcal{X}$, $x_1 \in \mathcal{X}_1$ and $y_1 \in \mathcal{Y}_1$, then $C(\mathcal{L}) = C^\star(\mathcal{L})$.

The proof of Corollary 7 is given in Appendix F. Note that there are four symmetrizability cases in terms of the sender-relay channel $W_{Y_1|X,X_1,S}$ and the sender-receiver channel $W_{Y|X,X_1,S}$. For the case where $W_{Y_1|X,X_1,S}$ and $W_{Y|X,X_1,S}$ are both non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, the lemma above asserts that the capacity coincides with the random code capacity. In the other cases, one may expect the capacity to be lower than the random code capacity. For instance, if $W_{Y|X,X_1,S}$ is non-symmetrizable-$\mathcal{X}|\mathcal{X}_1$, while $W_{Y_1|X,X_1,S}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then the capacity is positive by direct transmission. Furthermore, in this case, if the channel is reversely degraded, then the capacity coincides with the random code capacity. However, it remains in question whether this is true in general, when the channel is not reversely degraded.
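The symmetrizability-$\mathcal{X}|\mathcal{X}_1$ condition of Definition 3 (as reconstructed above) is a linear feasibility problem in $J(s|x,x_1)$, and can therefore be checked numerically. The following sketch, written under that reconstructed definition, tests feasibility with a linear program; the alphabet sizes and indexing scheme are illustrative choices.

import numpy as np
from itertools import product
from scipy.optimize import linprog

def is_symmetrizable_x_given_x1(W, nx, nx1, ns, ny, ny1):
    # W indexed as W[x, x1, s, y, y1]. Feasibility of J(s|x, x1) >= 0 with
    #   sum_s J(s|xt, x1) W(y, y1|x, x1, s) = sum_s J(s|x, x1) W(y, y1|xt, x1, s)
    # for all x, xt, x1, y, y1, and sum_s J(s|x, x1) = 1 for each (x, x1).
    nvar = nx * nx1 * ns
    idx = lambda x, x1, s: (x * nx1 + x1) * ns + s
    A_eq, b_eq = [], []
    for x, x1 in product(range(nx), range(nx1)):      # pmf constraints
        row = np.zeros(nvar)
        for s in range(ns):
            row[idx(x, x1, s)] = 1.0
        A_eq.append(row); b_eq.append(1.0)
    for x, xt, x1, y, y1 in product(range(nx), range(nx), range(nx1),
                                    range(ny), range(ny1)):
        if x >= xt:
            continue                                  # each pair (x, xt) once
        row = np.zeros(nvar)
        for s in range(ns):
            row[idx(xt, x1, s)] += W[x, x1, s, y, y1]
            row[idx(x, x1, s)] -= W[xt, x1, s, y, y1]
        A_eq.append(row); b_eq.append(0.0)
    res = linprog(np.zeros(nvar), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, 1)] * nvar)
    return res.status == 0                            # 0 = feasible solution found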
Next, we consider conditions under which the capacity is zero. Observe that if $W_{Y,Y_1|X,X_1,S}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then so are $W_{Y|X,X_1,S}$ and $W_{Y_1|X,X_1,S}$. Intuitively, this means that the AVRC is a poor channel as well. For example, if $Y_1 = X + X_1 + S$ and $Y = X \cdot X_1 \cdot S$, then the jammer can confuse the decoder by taking the state sequence to be some codeword. The following lemma validates this intuition.

Lemma 8. If the AVRC $\mathcal{L}$ is symmetrizable-$\mathcal{X}|\mathcal{X}_1$, then it has zero capacity, i.e. $C(\mathcal{L}) = 0$.
Lemma 8 is proved in Appendix G. If the AVRC is degraded, then we have a simpler symmetrizability condition under which the capacity is zero.
Lemma 9 is proved in Appendix H. An example is given below.
On the other hand, we show that the random code capacity is given by $C^\star(\mathcal{L}) = \min\left\{ \frac{1}{2},\ 1 - h(\theta) \right\}$, using Corollary 5. The derivation is given in Appendix I.
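The expression above is immediate to evaluate; a two-line check with the binary entropy function, for an illustrative value of $\theta$:

import numpy as np

def h(t):
    # binary entropy in bits
    return 0.0 if t in (0.0, 1.0) else -t * np.log2(t) - (1 - t) * np.log2(1 - t)

theta = 0.2
C_star = min(0.5, 1 - h(theta))   # C*(L) = min{1/2, 1 - h(theta)} ~= 0.278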

C. AVRC with Orthogonal Sender Components
Consider the special case of a relay channel $W_{Y,Y_1|X,X_1,S}$ with orthogonal sender components [16], [14, Section 16.6.2], where $X = (X', X'')$. Here, we address the case where the channel output depends on the state only through the relay, i.e.
$$W_{Y,Y_1|X,X_1,S}(y, y_1 \,|\, x', x'', x_1, s) = W_{Y|X',X_1}(y \,|\, x', x_1)\, W_{Y_1|X'',X_1,S}(y_1 \,|\, x'', x_1, s). \quad (30)$$
Lemma 10. Let $\mathcal{L} = \{W_{Y|X',X_1} W_{Y_1|X'',X_1,S}\}$ be an AVRC with orthogonal sender components. The random code capacity of $\mathcal{L}$ is given by
$$C^\star(\mathcal{L}) = \max_{p(x_1)p(x'|x_1)p(x''|x_1)} \min\Big\{ I(X',X_1;Y),\ \inf_{q} I_q(X'';Y_1|X_1) + I(X';Y|X_1) \Big\}. \quad (31)$$
The proof of Lemma 10 is given in Appendix J. To prove Lemma 10, we apply the methods of [16] to our results. Specifically, we use the partial decode-forward lower bound in Theorem 4, taking $U = X''$ (see (9) and (20)).

IV. GAUSSIAN AVRC WITH SENDER FREQUENCY DIVISION
We determine the random code capacity of the Gaussian AVRC with sender frequency division (SFD), and give lower and upper bounds on the deterministic code capacity. The derivation of the deterministic code bounds is mostly independent of our previous results, and it is based on the technique of [11]. The Gaussian relay channel $W_{Y,Y_1|X,X_1,S}$ with SFD is a special case of a relay channel with orthogonal sender components [16], specified by
$$Y_1 = X'' + Z_1, \qquad Y = X' + X_1 + S + Z,$$
where $Z$ and $Z_1$ are Gaussian additive noises $\sim \mathcal{N}(0, \sigma^2)$, independent of the channel state. We consider the Gaussian relay channel under input and state constraints. Specifically, the user's and the relay's transmissions are subject to input constraints $\Omega > 0$ and $\Omega_1 > 0$, respectively, and the jammer is under a state constraint $\Lambda$, i.e.
$$\|\mathbf{x}'\|^2 + \|\mathbf{x}''\|^2 \le n\Omega, \qquad \|\mathbf{x}_1\|^2 \le n\Omega_1, \qquad \|\mathbf{s}\|^2 \le n\Lambda.$$
For the compound relay channel, the state constraint is in the average sense. That is, we say that the Gaussian compound relay channel $\mathcal{L}^{\mathcal{Q}}$ with SFD is under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$ if $\mathcal{Q} = \{q(s) : \mathbb{E} S^2 \le \Lambda\}$. Coding definitions and notation are as follows. The definition of a code is similar to that of Subsection II-C. The encoding function is denoted by $\mathbf{f} = (\mathbf{f}', \mathbf{f}'')$ and the relay function by $\mathbf{f}_1$, where the boldface notation indicates that the encoding functions produce sequences. Here, the encoder and the relay satisfy the input constraints $\|\mathbf{f}'(m)\|^2 + \|\mathbf{f}''(m)\|^2 \le n\Omega$ and $\|\mathbf{f}_1(\mathbf{y}_1)\|^2 \le n\Omega_1$, and the relay transmits $x_{1,i} = f_{1,i}(y_{1,1}, \ldots, y_{1,i-1})$. The decoder receives the output sequence $\mathbf{y}$ and finds an estimate $\hat{m} = g(\mathbf{y})$. A $(2^{nR}, n, \varepsilon)$ code $\mathscr{C}$ for the Gaussian AVRC satisfies $P^{(n)}_{e|\mathbf{s}}(\mathscr{C}) \le \varepsilon$ for every $\mathbf{s} \in \mathbb{R}^n$ with $\|\mathbf{s}\|^2 \le n\Lambda$. Achievable rates, the deterministic code capacity and the random code capacity are defined as before. Next, we give our results on the Gaussian compound relay channel and the Gaussian AVRC with SFD.

A. Gaussian Compound Relay channel
We determine the capacity of the Gaussian compound relay channel with SFD under input and state constraints. For $0 \le \alpha, \rho \le 1$, let
$$F_G(\alpha, \rho) = \min\left\{ \frac{1}{2}\log\!\left(1 + \frac{\bar{\alpha}\Omega}{\sigma^2}\right) + \frac{1}{2}\log\!\left(1 + \frac{(1-\rho^2)\alpha\Omega}{\sigma^2 + \Lambda}\right),\ \frac{1}{2}\log\!\left(1 + \frac{\alpha\Omega + \Omega_1 + 2\rho\sqrt{\alpha\Omega\,\Omega_1}}{\sigma^2 + \Lambda}\right) \right\}, \quad (37)$$
where $\bar{\alpha} = 1 - \alpha$.

Lemma 11. The capacity of the Gaussian compound relay channel with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is given by
$$C(\mathcal{L}^{\mathcal{Q}}) = \max_{0 \le \alpha, \rho \le 1} F_G(\alpha, \rho),$$
and it is identical to the random code capacity, i.e. $C^\star(\mathcal{L}^{\mathcal{Q}}) = C(\mathcal{L}^{\mathcal{Q}})$.
The proof of Lemma 11 is given in Appendix K, based on our results in the previous sections.
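The maximization in Lemma 11 is over two scalar parameters and is easy to carry out numerically. The following sketch evaluates $F_G$ as reconstructed above (a decode-forward term versus a cut term, with the main-channel noise power inflated from $\sigma^2$ to $\sigma^2 + \Lambda$) and maximizes it over a grid; it illustrates the structure of the formula and is not a verified reproduction of the paper's numerical results.

import numpy as np

def F_G(alpha, rho, Omega, Omega1, Lam, sigma2):
    # decode-forward term: relay hop plus direct link under worst-case state
    r_df = (0.5 * np.log2(1 + (1 - alpha) * Omega / sigma2)
            + 0.5 * np.log2(1 + (1 - rho**2) * alpha * Omega / (sigma2 + Lam)))
    # cut (multiple-access) term at the receiver
    r_cut = 0.5 * np.log2(1 + (alpha * Omega + Omega1
                               + 2 * rho * np.sqrt(alpha * Omega * Omega1))
                          / (sigma2 + Lam))
    return min(r_df, r_cut)

grid = np.linspace(0, 1, 201)
C_rand = max(F_G(a, r, Omega=1.0, Omega1=1.0, Lam=1.0, sigma2=0.5)
             for a in grid for r in grid)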

B. Gaussian AVRC
We determine the random code capacity of the Gaussian AVRC with SFD under constraints.
Theorem 12. The random code capacity of the Gaussian AVRC with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is given by
$$C^\star(\mathcal{L}) = \max_{0 \le \alpha, \rho \le 1} F_G(\alpha, \rho).$$

The proof of Theorem 12 is given in Appendix L, and it follows the same considerations as in our previous results. Next, we give lower and upper bounds, $R_{G,low}(\mathcal{L})$ and $R_{G,up}(\mathcal{L})$, on the deterministic code capacity of the Gaussian AVRC with SFD under constraints; it can be seen that $R_{G,low}(\mathcal{L}) \le R_{G,up}(\mathcal{L})$. The analysis is based on Lemma 13 below, taken from [11].
Lemma 13 (see [11, Lemma 1]) is a codeword selection result; in its statement, $[t]_+ = \max\{0, t\}$ and $\langle \cdot, \cdot \rangle$ denotes the inner product. Intuitively, the lemma states that under certain conditions, a codebook can be constructed with an exponentially small fraction of 'bad' messages, for which the codewords are non-orthogonal to each other and to the state sequence.

Theorem 14. The capacity of the Gaussian AVRC with SFD, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, is bounded by
$$R_{G,low}(\mathcal{L}) \le C(\mathcal{L}) \le R_{G,up}(\mathcal{L}).$$

The proof of Theorem 14 is given in Appendix M. Figure 2 depicts the bounds on the capacity of the Gaussian AVRC with SFD under input and state constraints, as a function of the input constraint $\Omega = \Omega_1$, under state constraint $\Lambda = 1$ and $\sigma^2 = 0.5$. The top dashed line depicts the random code capacity of the Gaussian AVRC. The solid lines depict the deterministic code lower and upper bounds $R_{G,low}(\mathcal{L})$ and $R_{G,up}(\mathcal{L})$. For low values, $\Omega < \Lambda/4 = 0.25$, we have $R_{G,up}(\mathcal{L}) = 0$; hence the deterministic code capacity is zero, and it is strictly lower than the random code capacity. The dotted lower line depicts the direct transmission lower bound, which is $F_G(1, 0)$ for $\Omega > \Lambda$, and zero otherwise [13]. For intermediate values of $\Omega$, direct transmission is better than the lower bound in Theorem 14, whereas for high values of $\Omega$, our bounds are tight, and the capacity coincides with the random code capacity, i.e. $C(\mathcal{L}) = C^\star(\mathcal{L})$.

APPENDIX A PROOF OF LEMMA 1

A. Partial Decode-Forward Lower Bound
We construct a block Markov code, where the backward decoder uses joint typicality with respect to a state type which is 'close' to some $q \in \mathcal{Q}$. Let $\delta > 0$ be arbitrarily small. Define the set of state types
$$\hat{\mathcal{Q}}_n = \left\{ \hat{P}_{s^n} \,:\, s^n \in \mathcal{S}^n,\ |\hat{P}_{s^n}(s) - q(s)| \le \delta_1 \text{ for all } s \in \mathcal{S}, \text{ for some } q \in \mathcal{Q} \right\}, \quad (46)$$
where $\delta_1 > 0$ is determined by $\delta$ and vanishes as $\delta \to 0$. Namely, $\hat{\mathcal{Q}}_n$ is the set of types that are $\delta_1$-close to some state distribution $q(s)$ in $\mathcal{Q}$.
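For a binary state alphabet, the set $\hat{\mathcal{Q}}_n$ can be enumerated directly, since a type of $\mathcal{S}^n$ is determined by a single count. A minimal sketch, with an illustrative constraint set $\mathcal{Q} = \{q : q(1) \le 0.3\}$:

import numpy as np

n, delta1 = 20, 0.05
types = [np.array([1 - k / n, k / n]) for k in range(n + 1)]   # all types of S^n
Q_hat_n = [t for t in types
           if any(abs(t[1] - q1) <= delta1                     # delta1-close to Q
                  for q1 in np.linspace(0.0, 0.3, 301))]

For $|\mathcal{S}| = 2$, closeness in every coordinate reduces to closeness of $q(1)$, which the sketch exploits.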
A code $\mathscr{C}$ for the compound relay channel is constructed as follows. The encoders use $B$ blocks, each consisting of $n$ channel uses, to convey $(B-1)$ independent messages to the receiver. Furthermore, each message is divided into two parts, $m_b = (m'_b, m''_b)$, with rates $R'$ and $R''$, respectively, where $R = R' + R''$.

Codebook Generation: Fix the distribution $P_{U,X,X_1}(u, x, x_1)$. For every block $b \in [2:B-1]$, generate a codebook
$$\mathcal{F}_b = \big\{ x_1^n(m'_{b-1}),\ u^n(m'_b \,|\, m'_{b-1}),\ x^n(m'_b, m''_b \,|\, m'_{b-1}) \big\},$$
where the codewords are drawn conditionally independently, $x_1^n(m'_{b-1}) \sim \prod_{i=1}^n P_{X_1}$, $u^n(m'_b|m'_{b-1}) \sim \prod_{i=1}^n P_{U|X_1}$ and $x^n(m'_b, m''_b|m'_{b-1}) \sim \prod_{i=1}^n P_{X|U,X_1}$, for $m'_{b-1}, m'_b \in [1:2^{nR'}]$ and $m''_b \in [1:2^{nR''}]$. We have thus generated $B-2$ independent codebooks $\mathcal{F}_b$, for $b \in [2:B-1]$. The codebooks $\mathcal{F}_1$ and $\mathcal{F}_B$ are generated in the same manner, with fixed $m'_0 = 1$ and $(m'_B, m''_B) = (1, 1)$, respectively. Encoding and decoding are illustrated in Figure 3.
Encoding: To send the message sequence $(m'_b, m''_b)_{b=1}^{B-1}$, in block $b$ the encoder transmits $x^n(m'_b, m''_b \,|\, m'_{b-1})$, with $m'_0 = m'_B = m''_B = 1$.

Relay Encoding: At the end of block $b$, the relay receives $y^n_{1,b}$ and finds some $\hat{m}'_b \in [1:2^{nR'}]$ such that $\big( u^n(\hat{m}'_b \,|\, \hat{m}'_{b-1}), x_1^n(\hat{m}'_{b-1}), y^n_{1,b} \big)$ is jointly typical with respect to some type in $\hat{\mathcal{Q}}_n$. If there is none, or there is more than one such, set $\hat{m}'_b = 1$. In block $b+1$, the relay transmits $x^n_{1,b+1}(\hat{m}'_b)$.

Backward Decoding: Once all blocks $(y^n_b)_{b=1}^B$ are received, decoding is performed backwards. Set $\tilde{m}'_B = 1$, and for $b = B-1, B-2, \ldots, 1$, find a unique $\tilde{m}'_b \in [1:2^{nR'}]$ such that $\big( u^n(\tilde{m}'_{b+1} \,|\, \tilde{m}'_b), x_1^n(\tilde{m}'_b), y^n_{b+1} \big)$ is jointly typical with respect to some type in $\hat{\mathcal{Q}}_n$. If there is none, or more than one such $\tilde{m}'_b \in [1:2^{nR'}]$, declare an error. Then, the decoder uses $\tilde{m}'_1, \ldots, \tilde{m}'_{B-1}$ as follows: for every $b \in [1:B-1]$, find a unique $\tilde{m}''_b \in [1:2^{nR''}]$ such that $\big( x^n(\tilde{m}'_b, \tilde{m}''_b \,|\, \tilde{m}'_{b-1}), u^n(\tilde{m}'_b \,|\, \tilde{m}'_{b-1}), x_1^n(\tilde{m}'_{b-1}), y^n_b \big)$ is jointly typical with respect to some type in $\hat{\mathcal{Q}}_n$. If there is none, or more than one such $\tilde{m}''_b \in [1:2^{nR''}]$, declare an error.

We note that using the set of types $\hat{\mathcal{Q}}_n$ instead of the original set of state distributions $\mathcal{Q}$ alleviates the analysis, since $\mathcal{Q}$ is not necessarily finite nor countable.

Analysis of Probability of Error:
Assume without loss of generality that the user sent $(M'_b, M''_b) = (1, 1)$, and let $q^*(s) \in \mathcal{Q}$ denote the actual state distribution chosen by the jammer. The error event is bounded by the union of the events $\mathcal{E}_1(b) = \{\hat{M}'_b \neq 1\}$ (erroneous relaying), $\mathcal{E}_2(b) = \{\tilde{M}'_b \neq 1\}$ and $\mathcal{E}_3(b) = \{\tilde{M}''_b \neq 1\}$ (erroneous decoding), for $b \in [1:B-1]$. Then, the probability of error is bounded by the sum of the corresponding conditional probabilities, with $\mathcal{E}_2(0) = \emptyset$, where the conditioning on the correct neighboring-block events is implicit. We begin with the probability of erroneous relaying, $\Pr(\mathcal{E}_1(b))$, which is bounded recursively by the union of events bound, with $\mathcal{E}_1(0) = \emptyset$. Consider the second term on the RHS of (60). We claim that, given that the received sequences are not jointly typical with respect to any type $q' \in \hat{\mathcal{Q}}_n$, they are not jointly typical with respect to any $q'' \in \mathcal{Q}$ either. This claim is due to the following. Assume to the contrary that the sequences are jointly typical with respect to some $q'' \in \mathcal{Q}$. Then, for sufficiently large $n$, there exists a type $q'(s)$ such that $|q'(s) - q''(s)| \le \delta_1$ for all $s \in \mathcal{S}$, and by the definition in (46), $q' \in \hat{\mathcal{Q}}_n$. Then, (61) implies joint typicality with respect to $q'$, for all $u \in \mathcal{U}$, $x_1 \in \mathcal{X}_1$ and $y_1 \in \mathcal{Y}_1$ (see (49) and (47)), which contradicts the first assumption. It follows that the second term on the RHS of (60) is bounded by the probability that the sequences are atypical with respect to the actual state distribution $q^*$. Since the codebooks $\mathcal{F}_1, \ldots, \mathcal{F}_B$ are independent, the sequences in distinct blocks are independent as well; thus, the RHS of (63) tends to zero exponentially as $n \to \infty$ by the law of large numbers and Chernoff's bound.
We move to the third term on the RHS of (60). By the union of events bound, the fact that the number of type classes in $\mathcal{S}^n$ is bounded by $(n+1)^{|\mathcal{S}|}$, and the independence of the codebooks, we obtain (64), where the last line follows since $(x_1^n, y_1^n) \in \mathcal{A}^{\delta_2}(P^{q'}_{X_1,Y_1})$ with $\delta_2 \triangleq |\mathcal{U}| \cdot \delta$. By Lemmas 2.6 and 2.7 in [10], each such probability is bounded by $2^{-n(H_{q'}(X_1,Y_1) - \varepsilon_1(\delta))}$; hence (65) follows, where $\varepsilon_1(\delta), \varepsilon_2(\delta) \to 0$ as $\delta \to 0$. Therefore, by (64)-(65), along with [10, Lemma 2.13], the third term vanishes exponentially provided that $R' < \inf_{q' \in \mathcal{Q}} I_{q'}(U; Y_1 | X_1) - \varepsilon_3(\delta)$, with $\varepsilon_3(\delta) \to 0$ as $\delta \to 0$. Using induction, we have by (60) that $\Pr(\mathcal{E}_1(b))$ tends to zero exponentially as $n \to \infty$, for $b \in [1:B-1]$.

As for the erroneous decoding of $M'_b$ at the receiver, observe that given $\mathcal{E}_1(b)^c$, the relay sends $x^n_{1,b+1}(1)$. At the destination receiver, decoding is performed backwards; hence, the error events have a different form compared to those of the relay (cf. (58) and the events below). Define the corresponding events, with $\mathcal{E}_2(B) = \emptyset$, leading to the bound (70). By similar arguments to those used above, the second term on the RHS of (70) tends to zero exponentially as $n \to \infty$, due to (67), and by the law of large numbers and Chernoff's bound. Then, by similar arguments to those used for the bound on $\Pr(\mathcal{E}_1(b))$, the third term on the RHS of (70) tends to zero as $n \to \infty$, provided that $R' < \inf_{q' \in \mathcal{Q}} I_{q'}(U, X_1; Y) - \varepsilon_4(\delta)$, where $\varepsilon_4(\delta) \to 0$ as $\delta \to 0$. Using induction, we have by (70) that the second term on the RHS of (57) tends to zero exponentially as $n \to \infty$.

Moving to the error event for $M''_b$, define the corresponding events; then, by similar arguments to those used above, the probability is bounded by $2^{-na_0}$, where $a_0 > 0$ and $\varepsilon_5(\delta) \to 0$ as $\delta \to 0$. The second inequality holds by (67) along with the law of large numbers and Chernoff's bound, and the last inequality holds for every $m''_b \neq 1$. Thus, the third term on the RHS of (57) tends to zero exponentially as $n \to \infty$, provided that $R'' < \inf_{q' \in \mathcal{Q}} I_{q'}(X; Y | U, X_1) - \varepsilon_5(\delta)$. Eliminating $R'$ and $R''$, we conclude that the probability of error, averaged over the class of codebooks, decays exponentially to zero as $n \to \infty$, provided that $R < R_{DF}(\mathcal{L}^{\mathcal{Q}})$. Therefore, there must exist a $(2^{nR}, n, \varepsilon)$ deterministic code, for sufficiently large $n$.

B. Cutset Upper Bound
This is a straightforward consequence of the cutset bound in [9]. Assume to the contrary that there exists an achievable rate $R > R_{CS}(\mathcal{L}^{\mathcal{Q}})$. Then, for some $q^*(s)$ in the closure of $\mathcal{Q}$,
$$R > \max_{p(x,x_1)} \min\big\{ I_{q^*}(X,X_1;Y),\ I_{q^*}(X;Y,Y_1|X_1) \big\}. \quad (74)$$
By the achievability assumption, we have that for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n)$ random code $\mathscr{C}^\Gamma$ such that $P^{(n)}_e(q, \mathscr{C}^\Gamma) \le \varepsilon$ for every i.i.d. state distribution $q \in \mathcal{Q}$, and in particular for $q^*$. This holds even if $q^*$ is in the closure of $\mathcal{Q}$ but not in $\mathcal{Q}$ itself, since $P^{(n)}_e(q, \mathscr{C}^\Gamma)$ is continuous in $q$. Consider using this code over a standard relay channel $W_{Y,Y_1|X,X_1}$ without a state, where $W_{Y,Y_1|X,X_1}(y, y_1|x, x_1) = \sum_{s \in \mathcal{S}} q^*(s)\, W_{Y,Y_1|X,X_1,S}(y, y_1|x, x_1, s)$. It follows that the rate $R$ as in (74) can be achieved over the relay channel $W_{Y,Y_1|X,X_1}$, in contradiction to [9]. We deduce that the assumption is false, and $R > R_{CS}(\mathcal{L}^{\mathcal{Q}})$ cannot be achieved.

APPENDIX C PROOF OF COROLLARY 3
Consider the block-compound relay channel $\mathcal{L}^{\mathcal{Q} \times B}$, where the state distribution $q_b \in \mathcal{Q}$ varies from block to block. Since the encoder, relay and receiver are aware of this jamming scheme, the capacity is the same as that of the ordinary compound channel, i.e. $C(\mathcal{L}^{\mathcal{Q} \times B}) = C(\mathcal{L}^{\mathcal{Q}})$ and $C^\star(\mathcal{L}^{\mathcal{Q} \times B}) = C^\star(\mathcal{L}^{\mathcal{Q}})$. Hence, (18) and (19) follow from Lemma 1. As for the second part of the corollary, observe that the block Markov coding scheme used in the proof of the decode-forward lower bound can be applied as-is to the block-compound relay channel, since the relay and the destination receiver do not estimate the state distribution while decoding the messages (see Appendix A). Furthermore, the analysis also holds, where the actual state distribution $q^*$ in (63)-(65) and (71) is now replaced by the state distribution $q^*_b$ which corresponds to block $b \in [1:B]$.

APPENDIX D PROOF OF THEOREM 4
First, we explain the general idea. We adapt Ahlswede's Robustification Technique (RT) [2] to the relay channel. Namely, we use codes for the compound relay channel to construct a random code for the AVRC using randomized permutations. However, in our case, the strictly causal nature of the relay imposes a difficulty, and the application of the RT is not straightforward.
In [2], there is noncausal state information, and a random code is defined via permutations of the codeword symbols and the received sequence. Here, however, the relay cannot apply permutations to its transmission $x_1^n$, because it depends on the received sequence $y_1^n$ in a strictly causal manner. We resolve this difficulty by using block Markov codes for the block-compound relay channel to construct a random code for the AVRC, applying $B$ in-block permutations to the relay transmission, which depends only on the sequence received in the previous block. The details are given below.
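A minimal sketch of the in-block permutation mechanism, with stand-in codewords and a stand-in relay map; it only illustrates why causality is preserved, not the full coding scheme.

import numpy as np

rng = np.random.default_rng(1)
n, B = 8, 4

perms = [rng.permutation(n) for _ in range(B)]      # shared common randomness
inv = [np.argsort(p) for p in perms]                # inverse permutations

# Encoder: permute the codeword of each block b by pi_b.
x_blocks = [rng.integers(2, size=n) for _ in range(B)]       # stand-in codewords
x_sent = [x_blocks[b][perms[b]] for b in range(B)]

# Relay: its block-(b+1) transmission depends only on the block-b received
# sequence, so pi_b can be undone and pi_{b+1} applied after block b has
# ended -- the in-block permutations never violate strict causality.
y1 = [rng.integers(2, size=n) for _ in range(B)]             # stand-in outputs
relay_map = lambda y_prev: y_prev                            # stand-in relay code
x1_blocks = [np.zeros(n, dtype=int)]                         # block 1: fixed word
x1_blocks += [relay_map(y1[b][inv[b]]) for b in range(B - 1)]
x1_sent = [x1_blocks[b][perms[b]] for b in range(B)]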

A. Partial Decode-Forward Lower Bound
We show that every rate $R < R^\star_{DF}(\mathcal{L})$ (see (20)) can be achieved by random codes over the AVRC $\mathcal{L}$, i.e. $C^\star(\mathcal{L}) \ge R^\star_{DF}(\mathcal{L})$. We start with Ahlswede's RT [2], stated below. Let $h : \mathcal{S}^n \to [0, 1]$ be a given function. If, for some fixed $\alpha_n \in (0, 1)$,
$$\sum_{s^n \in \mathcal{S}^n} q(s^n)\, h(s^n) \le \alpha_n,$$
for all $q(s^n) = \prod_{i=1}^n q(s_i)$, with $q \in \mathcal{P}(\mathcal{S})$, then
$$\frac{1}{n!} \sum_{\pi \in \Pi_n} h(\pi s^n) \le \beta_n, \quad \text{for all } s^n \in \mathcal{S}^n,$$
where $\Pi_n$ is the set of all $n$-tuple permutations $\pi : \mathcal{S}^n \to \mathcal{S}^n$, and $\beta_n = (n+1)^{|\mathcal{S}|} \alpha_n$. According to Corollary 3, for every $R < R^\star_{DF}(\mathcal{L})$, there exists a $(2^{nR(B-1)}, nB, e^{-2\theta n})$ block Markov code for the block-compound relay channel $\mathcal{L}^{\mathcal{P}(\mathcal{S}) \times B}$ for some $\theta > 0$ and sufficiently large $n$, where $B > 0$ is arbitrarily large.
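The RT can be checked numerically on a small example: for a binary state alphabet, the permutation average of $h$ depends on $s^n$ only through its type, and $\alpha_n$ can be approximated by a grid over i.i.d. distributions. A minimal sketch (the grid slightly underestimates the supremum, but the $(n+1)^{|\mathcal{S}|}$ slack absorbs this):

import numpy as np
from itertools import product, permutations

rng = np.random.default_rng(2)
n, S = 6, (0, 1)
seqs = list(product(S, repeat=n))
h = {s: rng.random() for s in seqs}          # arbitrary h : S^n -> [0, 1]

def iid_avg(q1):
    # expectation of h under the i.i.d. distribution with q(1) = q1
    return sum(v * np.prod([q1 if c else 1 - q1 for c in s])
               for s, v in h.items())

alpha_n = max(iid_avg(q1) for q1 in np.linspace(0, 1, 101))

def perm_avg(s):
    # (1/n!) * sum over permutations pi of h(pi s^n)
    return np.mean([h[tuple(s[i] for i in p)] for p in permutations(range(n))])

worst = max(perm_avg(s) for s in seqs)
assert worst <= (n + 1) ** len(S) * alpha_n  # the RT conclusion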
Recall that the code constructed in the proof in Appendix A has the following form. Given such a block Markov code $\mathscr{C}_{BM}$ for the block-compound relay channel $\mathcal{L}^{\mathcal{P}(\mathcal{S}) \times B}$, we have that, for $b = B-1, \ldots, 1$, the error probabilities satisfy the hypothesis of the RT. That is, for every sequence of state distributions $q_1, \ldots, q_{b+1}$, where $q_t(s^n_t) = \prod_{i=1}^n q_t(s_{t,i})$, the expected error probabilities are exponentially small. The conditioning in the corresponding equations can be explained as follows. In (82), due to the code construction, the sequence $Y^n_{1,b}$ depends on the message and the states of block $b$ alone. Then, by the RT, the permutation averages of the error probabilities are bounded for all $(s^n_1, s^n_2, \ldots, s^n_{b+1}) \in \mathcal{S}^{(b+1)n}$ and sufficiently large $n$, such that $(n+1)^{B|\mathcal{S}|} \le e^{\theta n}$. On the other hand, for every $\pi_1, \pi_2, \ldots, \pi_{b+1} \in \Pi_n$, the permuted code attains the same error probabilities, where (a) is obtained by changing the order of summation over $y^n_{1,1}, \ldots, y^n_{1,b}$ and $y^n_{b+1}$, and (b) holds because the relay channel is memoryless. Then, consider the $(2^{nR(B-1)}, nB)$ random block Markov code $\mathscr{C}^{\Pi}_{BM}$, specified by the permuted encoder, relay and decoder, for $\pi_1, \ldots, \pi_B \in \Pi_n$, with the uniform distribution $\mu(\pi_1, \ldots, \pi_B) = \frac{1}{(n!)^B}$. That is, a set of $B$ independent permutations is chosen at random and applied to all blocks simultaneously, while the order of the blocks remains intact. As we restricted ourselves to a block Markov code, the relaying function in a given block depends only on symbols received in the previous block; hence, the relay can implement those in-block permutations, and the coding scheme does not violate the causality requirement.
From (86) and (88), we see that using the random code $\mathscr{C}^{\Pi}_{BM}$, the error probabilities for the messages $M'_b$ and $M''_b$ are exponentially small for all $s^n_1, \ldots, s^n_{b+1} \in \mathcal{S}^n$, $b \in [1:B-1]$. Therefore, together with (84), we have that the probability of error of the random code $\mathscr{C}^{\Pi}_{BM}$ is bounded by $P^{(n)}_e(q, \mathscr{C}^{\Pi}_{BM}) \le e^{-\theta n}$, for every $q(s^{nB}) \in \mathcal{P}(\mathcal{S}^{nB})$. That is, $\mathscr{C}^{\Pi}_{BM}$ is a $(2^{nR(B-1)}, nB, e^{-\theta n})$ random code for the AVRC $\mathcal{L}$, where the overall blocklength is $nB$, and the average rate $\frac{B-1}{B} \cdot R$ tends to $R$ as $B \to \infty$. This completes the proof of the partial decode-forward lower bound.

B. Cutset Upper Bound
The proof immediately follows from Lemma 1, since the random code capacity of the AVRC is bounded by the random code capacity of the compound relay channel, i.e. $C^\star(\mathcal{L}) \le C^\star(\mathcal{L}^{\mathcal{P}(\mathcal{S})})$.

APPENDIX E PROOF OF LEMMA 6
We use the approach of [1], with the required adjustments, together with the random code constructed in the proof of Theorem 4. Let $R < C^\star(\mathcal{L})$, and consider the case where the marginal sender-relay and sender-receiver AVCs have positive capacities, i.e. $C(\mathcal{W}_1(x_{1,1})) > 0$ and $C(\mathcal{W}(x_{1,2})) > 0$ for some $x_{1,1}, x_{1,2} \in \mathcal{X}_1$.

APPENDIX J PROOF OF LEMMA 10
The proof follows the lines of [16]. Consider an AVRC $\mathcal{L} = \{W_{Y|X',X_1} W_{Y_1|X'',X_1,S}\}$ with orthogonal sender components. We apply Theorem 4, which states that $R^\star_{DF}(\mathcal{L}) \le C^\star(\mathcal{L}) \le R^\star_{CS}(\mathcal{L})$.

A. Achievability Proof
To show achievability, we set $U = X''$ and $p(x', x'', x_1) = p(x_1)p(x'|x_1)p(x''|x_1)$ in the partial decode-forward lower bound. Hence, by (9), a lower bound on the random code capacity follows. Now, by (30), the mutual information terms simplify, and (106) reduces to the expression on the RHS of (31), which is achievable by deterministic codes as well, due to Corollary 7.

B. Converse Proof
By (8) and (20), the cutset upper bound takes the following form,
$$C^\star(\mathcal{L}) \le \max_{p(x',x'',x_1)} \inf_{q} \min\big\{ I_q(X',X'',X_1;Y),\ I_q(X',X'';Y,Y_1|X_1) \big\},$$
where the last line is due to the minimax theorem [25]. For the AVRC with orthogonal sender components, as specified by (30), we have the following Markov relations: $X'' \leftrightarrow (X', X_1) \leftrightarrow Y$ and $X' \leftrightarrow (X'', X_1, S) \leftrightarrow Y_1$. Hence, by (109), $I(X', X'', X_1; Y) = I(X', X_1; Y)$. As for the second mutual information on the RHS of (107), by the mutual information chain rule, $I(X', X''; Y, Y_1 | X_1) \le I(X''; Y_1 | X_1) + I(X'; Y | X_1)$, where (a) is due to (108), (b) is due to (109), and (c) holds since conditioning reduces entropy. Therefore, the cutset bound reduces to the same expression. Without loss of generality, the maximization in (111) can be restricted to distributions of the form $p(x', x'', x_1) = p(x_1) \cdot p(x'|x_1) \cdot p(x''|x_1)$.

APPENDIX K PROOF OF LEMMA 11
Consider the Gaussian compound relay channel with SFD under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, i.e. $\mathcal{Q} = \{q(s) : \mathbb{E}S^2 \le \Lambda\}$.

A. Achievability Proof
We begin with the following lemma, which follows from [18] and [12].

Lemma 15. Consider the Gaussian AVC $\bar{\mathcal{W}} = \{W_{\bar{Y}|\bar{X},\bar{S}}\}$, specified by $\bar{Y} = \bar{X} + \bar{S}$, under input constraint $P$ and state constraint $N$. For $\bar{X} \sim \mathcal{N}(0, P)$, the state distribution $\bar{S} \sim \mathcal{N}(0, N)$ minimizes $I_q(\bar{X}; \bar{Y})$ over all $q$ with $\mathbb{E}\bar{S}^2 \le N$, and the minimum value is $\frac{1}{2}\log\big(1 + \frac{P}{N}\big)$.
Proof of Lemma 15. Consider the Gaussian AVC $\bar{\mathcal{W}} = \{W_{\bar{Y}|\bar{X},\bar{S}}\}$, specified by $\bar{Y} = \bar{X} + \bar{S}$, under input constraint $P$ and state constraint $N$. Then, by Csiszár and Narayan [12], the random code capacity of the AVC $\bar{\mathcal{W}}$, under input constraint $P$ and state constraint $N$, is given by $C^\star(\bar{\mathcal{W}}) = \frac{1}{2}\log\big(1 + \frac{P}{N}\big)$. On the other hand, by Hughes and Narayan [18], $C^\star(\bar{\mathcal{W}}) = \min_{q : \mathbb{E}\bar{S}^2 \le N}\ \max_{p : \mathbb{E}\bar{X}^2 \le P} I_q(\bar{X}; \bar{Y})$. As the saddle point value $I_q(\bar{X}; \bar{Y}) = \frac{1}{2}\log\big(1 + \frac{P}{N}\big)$ is attained with $\bar{X} \sim \mathcal{N}(0, P)$ and $\bar{S} \sim \mathcal{N}(0, N)$, we have that $\bar{S} \sim \mathcal{N}(0, N)$ minimizes $I_q(\bar{X}; \bar{Y})$ for $\bar{X} \sim \mathcal{N}(0, P)$.
Next, we use the lemma above to prove the direct part. Although we previously assumed that the input, state and output alphabets are finite, our results for the compound relay channel can be extended to the continuous case as well, using standard discretization techniques [4, 1], [14, Section 3.4.1]. In particular, Lemma 1 can be extended to the compound relay channel $\mathcal{L}^{\mathcal{Q}}$ under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, by choosing a distribution $p(x', x'', x_1)$ such that $\mathbb{E}(X'^2 + X''^2) \le \Omega$ and $\mathbb{E}X_1^2 \le \Omega_1$. Then, the capacity of $\mathcal{L}^{\mathcal{Q}}$ is bounded below by
$$\min\Big\{ \inf_q I_q(X''; Y_1 | X_1) + \inf_q I_q(X'; Y | X_1),\ \inf_q I_q(X', X_1; Y) \Big\}, \quad (115)$$
which follows from the partial decode-forward lower bound by taking $U = X''$. Lemma 1 further states that there exists a block Markov code that achieves this rate such that the probability of error decays exponentially as the blocklength increases.
Let $0 \le \alpha, \rho \le 1$, and let $(X', X'', X_1)$ be jointly Gaussian, where $X'$ has variance $\alpha\Omega$, $X''$ has variance $\bar{\alpha}\Omega$, $X_1$ has variance $\Omega_1$, the correlation coefficient of $X'$ and $X_1$ is $\rho$, and $X''$ is independent of $(X', X_1)$. Hence, $\mathrm{Var}(X'|X_1 = x_1) = (1-\rho^2)\alpha\Omega$ for all $x_1 \in \mathbb{R}$, and by Lemma 15, the term $\inf_q I_q(X'; Y | X_1)$ is evaluated accordingly. It is left for us to evaluate the first term on the RHS of (115). By a standard whitening transformation, there exist two independent Gaussian random variables $T_1$ and $T_2$ such that $X' + X_1 + Z = T_1 + T_2$, with $\mathrm{Var}(T_1) = \mathrm{Var}(X'|X_1 = x_1)$ for all $x_1 \in \mathbb{R}$. Hence, $Y = T_1 + T_2 + S$. Let $\tilde{S} \triangleq T_1 + S$; then Lemma 15 applies to the channel $Y = T_2 + \tilde{S}$ as well. Substituting (117), (118) and (122) into the RHS of (115), and observing that the first sum on the RHS of (123) can be expressed as the corresponding term of $F_G(\alpha, \rho)$, the direct part follows from (123).
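A quick numerical sanity check of the whitening step: decomposing $X' + X_1 + Z$ into independent Gaussian parts $T_1$ (the component of $X'$ orthogonal to $X_1$) and $T_2$ (the rest), with $\mathrm{Var}(T_1) = \mathrm{Var}(X'|X_1) = (1-\rho^2)\alpha\Omega$. The particular split below is one valid choice, used for illustration under the reconstructed channel $Y = X' + X_1 + S + Z$.

import numpy as np

rng = np.random.default_rng(3)
alpha, rho, Omega, Omega1, sigma2 = 0.6, 0.4, 1.0, 1.0, 0.5
N = 200_000

x1 = rng.normal(0, np.sqrt(Omega1), N)
v = rng.normal(0, np.sqrt((1 - rho**2) * alpha * Omega), N)  # v independent of x1
xp = rho * np.sqrt(alpha * Omega / Omega1) * x1 + v          # X' with corr. rho
z = rng.normal(0, np.sqrt(sigma2), N)

t1 = v                                                       # orthogonal part
t2 = (1 + rho * np.sqrt(alpha * Omega / Omega1)) * x1 + z    # remaining part
assert np.allclose(t1 + t2, xp + x1 + z)                     # T1 + T2 = X'+X1+Z
print(np.var(t1), (1 - rho**2) * alpha * Omega)              # Var(T1) matches
print(abs(np.corrcoef(t1, t2)[0, 1]) < 0.01)                 # near independence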

B. Converse Proof
The converse part follows by evaluating the cutset bound for the Gaussian case, where the last equality is due to [16].

APPENDIX L PROOF OF THEOREM 12

A. Achievability Proof
To show that $C^\star(\mathcal{L}) \ge C(\mathcal{L}^{\mathcal{Q}})$, we follow the steps in the proof of Theorem 4, where we replace Ahlswede's original RT with the modified version in [21, Lemma 4], with $l_n(s^n) = \frac{1}{n}\sum_{i=1}^n s_i^2$. Then, by Lemma 11, it follows that $C^\star(\mathcal{L}) \ge C(\mathcal{L}^{\mathcal{Q}}) = \max_{0 \le \alpha, \rho \le 1} F_G(\alpha, \rho)$. The details are omitted.

B. Converse Proof
Assume to the contrary that there exists an achievable rate $R$ exceeding the capacity of the Gaussian compound relay channel under state constraint $\Lambda - \delta$, using random codes over the Gaussian AVRC $\mathcal{L}$, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, where $\delta > 0$ is arbitrarily small. That is, for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR}, n)$ random code $\mathscr{C}^\Gamma = (\mu, \Gamma, \{\mathscr{C}_\gamma\}_{\gamma \in \Gamma})$ for the Gaussian AVRC $\mathcal{L}$, under input constraints $\Omega$ and $\Omega_1$ and state constraint $\Lambda$, such that the probability of error is at most $\varepsilon$ for all $m \in [1:2^{nR}]$ and $\mathbf{s} \in \mathbb{R}^n$ with $\|\mathbf{s}\|^2 \le n\Lambda$.
Consider using the random code $\mathscr{C}^\Gamma$ over the Gaussian compound relay channel $\mathcal{L}^{\mathcal{Q}}$ under state constraint $(\Lambda - \delta)$, i.e. with $\mathcal{Q} = \{q(s) : \mathbb{E}S^2 \le \Lambda - \delta\}$, under input constraints $\Omega$ and $\Omega_1$. Let $q(s) \in \mathcal{Q}$ be a given state distribution, and define a sequence of i.i.d. random variables $S_1, \ldots, S_n \sim q(s)$. Letting $q(s^n) \triangleq \prod_{i=1}^n q(s_i)$, the probability of error is bounded by the sum of the probability of error given that the state sequence satisfies $\|\mathbf{S}\|^2 \le n\Lambda$, and the probability that this constraint is violated. The first term is bounded by (128), and the second term vanishes as well by the law of large numbers, since $q(s)$ is in (129). Hence, the rate $R$ in (127) is achievable for the Gaussian compound relay channel $\mathcal{L}^{\mathcal{Q}}$, in contradiction to Lemma 11. We deduce that the assumption is false, and (127) cannot be achieved.
APPENDIX M PROOF OF THEOREM 14

A. Lower Bound

The sequences $\mathbf{x}''(m'_b)$, $m'_b \in [1:2^{nR'}]$, are chosen as follows. Observe that the channel from the sender to the relay, $Y_1 = X'' + Z$, does not depend on the state. Thus, by Shannon's well-known result on the point-to-point Gaussian channel [22], the message $m'_b$ can be conveyed to the relay reliably under the input constraint, where $\delta_1$ is arbitrarily small (see also [8, Chapter 9]). That is, for every $\varepsilon > 0$ and sufficiently large $n$, there exists a $(2^{nR'}, n, \varepsilon)$ code $\mathscr{C}''$ for the sender-relay channel. Applying Lemma 13 of [11] repeatedly yields the following.

Lemma 16. For every $\varepsilon > 0$, there exist a codebook for the encoder and a codebook for the relay such that, for every unit vector $\mathbf{c} \in \mathbb{R}^n$ and $0 \le \theta, \zeta \le 1$, the guarantees of Lemma 13 hold for each of them, provided that $\theta \ge \eta$.

Then, define the decoding rule as follows, noting that the norm of the combined transmission could be greater than $n\alpha\Omega$ due to the possible correlation between the codewords of the encoder and the relay. If there is more than one such $\tilde{m}'_b \in [1:2^{nR'}]$, declare an error. Then, the decoder uses $\tilde{m}'_1, \ldots, \tilde{m}'_{B-1}$ as follows. For $b = B-1, B-2, \ldots, 1$, find a unique $\tilde{m}''_b \in [1:2^{nR''}]$ satisfying the corresponding decoding metric. If there is more than one such $\tilde{m}''_b \in [1:2^{nR''}]$, declare an error.

Analysis of Probability of Error: Fix a state sequence $\mathbf{s}$ with $\|\mathbf{s}\|^2 \le n\Lambda$. The error event is bounded by the union of the following events, for $b \in [1:B-1]$: the relay decoding error $\mathcal{E}_1(b)$, and the receiver decoding errors $\mathcal{E}_2(b)$ and $\mathcal{E}_3(b)$ for $M'_b$ and $M''_b$, respectively. Then, the conditional probability of error given the state sequence $\mathbf{s}$ is bounded by the corresponding sum, with $\mathcal{E}_1(0) = \mathcal{E}_2(0) = \emptyset$, where the conditioning on $\mathbf{S} = \mathbf{s}$ is omitted for convenience of notation. Recall that we have defined $\mathscr{C}''$ as a $(2^{nR'}, n, \varepsilon)$ code for the point-to-point Gaussian channel $Y_1 = X'' + Z$. Hence, the first sum on the RHS of (147) is bounded by $B \cdot \varepsilon$, which is arbitrarily small.

As for the erroneous decoding of $M'_b$ at the receiver, consider the events $\mathcal{E}_{2,1}(b), \ldots, \mathcal{E}_{2,4}(b)$ of the corresponding decomposition. By the union of events bound, we have (150). By Lemma 16, given $R' > -\frac{1}{2}\log(1 - \eta^2)$, the first term is exponentially small, since $\log(1+t) \le t$ for $t \in \mathbb{R}$. As $\eta^2 \ge 8\varepsilon$, the last expression tends to zero as $n \to \infty$. Similarly, $\Pr(\mathcal{E}_{2,2}(b))$ and $\Pr(\mathcal{E}_{2,3}(b))$ tend to zero as well. Moving to the fourth term on the RHS of (150), observe that for sufficiently small $\varepsilon$ and $\eta$, given the complementary events, the encoder transmits $\mathbf{x}(M'_b, M''_b)$, the relay transmits $\mathbf{x}_1(M'_b)$, and the inner-product bounds (153) and (154) hold. Hence, by (153) and (154), dividing both sides of the inequality by $n(1 + \rho\sqrt{\alpha\gamma^{-1}})$, we obtain (156). Next, we partition the set of values of $\zeta$ into $K$ intervals, where $K$ is a finite constant which is independent of $n$, as in Lemma 16. By (155) and (157), given the event $\mathcal{E}_2(b)$, we have (158), where the last inequality is due to (133), for sufficiently small $\delta > 0$. To see this, observe that the inequality in (133) is strict, and it implies that $\tau_1 > 0$ for sufficiently small $\delta > 0$. Furthermore, if the complementary condition holds, then (162) follows, and by (155), this can be further bounded as in (164). By Lemma 16, the RHS of (164) tends to zero as $n \to \infty$ provided that two conditions are met. For sufficiently small $\varepsilon$ and $\eta$, we have that $\eta \le \theta_1 = \frac{\tau_1}{\gamma(\Omega - \delta)}$, hence the first condition is met. Then, observe that the second condition is equivalent to a quadratic condition, whose minimum value is obtained by differentiation, up to a correction $\delta_1$, where $\delta_1 \to 0$ as $\delta \to 0$. Thus, the RHS of (164) tends to zero as $n \to \infty$, provided that $R' = R'_\alpha(\mathcal{L}) - \delta'$, for an arbitrary $\delta' > 0$, if $\eta$ and $\delta$ are sufficiently small.

Moving to the error event for $M''_b$, consider the events $\mathcal{E}_{3,1}(b), \mathcal{E}_{3,2}(b), \ldots$ of the corresponding decomposition. By the union of events bound, we have (173). By Lemma 16, given $R'' > -\frac{1}{2}\log(1 - \eta^2)$, the first term is exponentially small, since $\log(1+t) \le t$ for $t \in \mathbb{R}$.
As $\eta^2 \ge 8\varepsilon$, the last expression tends to zero as $n \to \infty$. Similarly, $\Pr(\mathcal{E}_{3,2}(b))$ tends to zero as well. As for the third term on the RHS of (173), observe that for sufficiently small $\varepsilon$ and $\eta$, given the complementary events, the encoder transmits $\mathbf{x}(M'_b, M''_b)$ with $\|\mathbf{x}\|^2 \le n\alpha\Omega$, and the relay transmits $\mathbf{x}_1(M'_b)$. Observe further that for sufficiently small $\varepsilon$ and $\eta$, by (177), given the event $\mathcal{E}_3(b)$, we have (180). By part 2 of Lemma 16, the RHS of (180) tends to zero as $n \to \infty$ provided that two conditions are met, with partition points $\zeta''_k$ for $k \in [1:K-1]$ and $\zeta''_K = 0$. For sufficiently small $\varepsilon$ and $\eta$, we have that $\eta \le \tau''_1$, hence the first condition is met. By differentiation, the minimum value of the function $G(\tau) = \tau^2 + \frac{\beta^2}{n\Lambda}(1 - 2\delta - \tau)^2$ is given by $\frac{\beta^2(1-2\delta)^2}{\beta^2 + n\Lambda}$. Thus, the RHS of (180) tends to zero as $n \to \infty$, provided that $R'' = R''_\alpha(\mathcal{L}) - \delta''$, for an arbitrary $\delta'' > 0$, if $\eta$ and $\delta$ are sufficiently small. We have thus shown achievability of every rate $R < R'_\alpha(\mathcal{L}) + R''_\alpha(\mathcal{L})$, with $R'_\alpha$ and $R''_\alpha$ as in (142). This completes the proof of the lower bound.
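The scalar minimization used above is elementary; the following check compares a fine grid search with the closed form $\min_\tau G(\tau) = \frac{\beta^2(1-2\delta)^2}{\beta^2 + n\Lambda}$, for hypothetical parameter values.

import numpy as np

beta2, nLam, delta = 2.0, 5.0, 0.05        # hypothetical beta^2, n*Lambda, delta
G = lambda t: t**2 + (beta2 / nLam) * (1 - 2 * delta - t)**2
taus = np.linspace(-1.0, 2.0, 2_000_001)
numeric = G(taus).min()
closed = beta2 * (1 - 2 * delta)**2 / (beta2 + nLam)
assert abs(numeric - closed) < 1e-6        # grid minimum matches the closed form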

B. Upper Bound
Let $R > 0$ be an achievable rate. Then, there exists a sequence of $(2^{nR}, n, \varepsilon^*_n)$ codes $\mathscr{C}_n = (\mathbf{f}, \mathbf{f}_1, g)$ for the Gaussian AVRC $\mathcal{L}$ with SFD such that $\varepsilon^*_n \to 0$ as $n \to \infty$, where the encoder consists of a pair $\mathbf{f} = (\mathbf{f}', \mathbf{f}'')$, with $\mathbf{f}' : [1:2^{nR}] \to \mathbb{R}^n$ and $\mathbf{f}'' : [1:2^{nR}] \to \mathbb{R}^n$. Assume without loss of generality that the codewords have zero mean.
Then, recall that by (189), the expectation of $\frac{1}{n}\|\mathbf{f}'(M) + \mathbf{f}_1(Y_1^n)\|^2$ is strictly lower than $\Lambda$, and for sufficiently large $n$, the conditional expectation of $\frac{1}{n}\|\mathbf{f}'(\tilde{M}) + \mathbf{f}_1(\tilde{\mathbf{Y}}_1)\|^2$ given $\tilde{M} = M$ is also strictly lower than $\Lambda$. Thus, by Chebyshev's inequality, the probability of error is bounded from below by a positive constant. Following this contradiction, we deduce that if the code is reliable, then $\Lambda \le (1 + \alpha + 2\rho\sqrt{\alpha})\Omega$. It is left for us to show that for $\alpha$ and $\rho$ as defined in (187), we have $R < F_G(\alpha, \rho)$ (see (37)). For a $(2^{nR}, n, \varepsilon^*_n)$ code, the probability of error is at most $\varepsilon^*_n$ for all $\mathbf{s} \in \mathbb{R}^n$ with $\|\mathbf{s}\|^2 \le n\Lambda$. Then, consider using the code $\mathscr{C}$ over the Gaussian relay channel $W^q_{Y,Y_1|X,X_1}$, induced by an i.i.d. state sequence $\mathbf{S}$ with $S_i \sim q = \mathcal{N}(0, \Lambda - \delta)$. First, we show that the code $\mathscr{C}$ is reliable for this channel, and then we show that $R < F_G(\alpha, \rho)$. Using the code $\mathscr{C}$ over the channel $W^q_{Y,Y_1|X,X_1}$, the probability of error is bounded by the sum of two terms, where we bound the first term by $\varepsilon^{**}_n$, using the law of large numbers, and the second term using (199), with $\varepsilon^{**}_n \to 0$ as $n \to \infty$. Since $W^q_{Y,Y_1|X,X_1}$ is a channel without a state, we can now show that $R < F_G(\alpha, \rho)$ by following the lines of [9] and [16]. By Fano's inequality and [9, Lemma 4], we obtain the standard multi-letter bounds, where $q = \mathcal{N}(0, \Lambda - \delta)$, $\mathbf{X}' = \mathbf{f}'(M)$, $\mathbf{X}'' = \mathbf{f}''(M)$, $\mathbf{X}_1 = \mathbf{f}_1(\mathbf{Y}_1)$, and $\varepsilon_n \to 0$ as $n \to \infty$. For the Gaussian relay channel with SFD, we have the Markov relations (203)-(204); hence, by (204), $I_q(X'_i, X''_i, X_{1,i}; Y_i) = I_q(X'_i, X_{1,i}; Y_i)$. Moving to the second bound on the RHS of (202), we follow the lines of [16]. Then, by the mutual information chain rule, we obtain the corresponding single-letter bound, where (a) is due to (203), (b) is due to (204), and (c) holds since conditioning reduces entropy. Introducing a time-sharing random variable $K \sim \mathrm{Unif}[1:n]$, which is independent of $\mathbf{X}', \mathbf{X}'', \mathbf{X}_1, \mathbf{Y}, \mathbf{Y}_1$, we obtain a single-letter characterization. Now, by the maximum differential entropy lemma (see e.g. [8, Theorem 8.6.5]), the bounds (207) and (208) follow, where $\alpha$, $\alpha_1$ and $\rho$ are given by (187). Since $\delta > 0$ is arbitrary, and $\alpha_1 \le 1$, the proof follows from (206)-(208).

Fig. 1. Communication over the AVRC $\mathcal{L} = \{W_{Y,Y_1|X,X_1,S}\}$. Given a message $M$, the encoder transmits $X^n = f(M)$. At time $i \in [1:n]$, the relay transmits $X_{1,i}$ based on all the past symbols $Y_1^{i-1}$, and then receives a new symbol $Y_{1,i}$. The decoder receives the output sequence $Y^n$ and finds an estimate of the message $\hat{M} = g(Y^n)$.

Fig. 2. Bounds on the capacity of the Gaussian AVRC with SFD. The dashed upper line depicts the random code capacity of the Gaussian AVRC as a function of the input constraint $\Omega = \Omega_1$, under state constraint $\Lambda = 1$ and $\sigma^2 = 0.5$. The solid lines depict the deterministic code lower and upper bounds $R_{G,low}(\mathcal{L})$ and $R_{G,up}(\mathcal{L})$. The dotted lower line depicts the direct transmission lower bound.


Fig. 3. Partial decode-forward coding scheme. The block index $b \in [1:B]$ is indicated at the top. In the following rows, we have the corresponding elements: (1) sequences transmitted by the encoder; (2) estimated messages at the relay; (3) sequences transmitted by the relay; (4) estimated messages at the destination decoder. The arrows in the second row indicate that the relay encodes forward with respect to the block index, while the arrows in the fourth row indicate that the receiver decodes backwards.