The Arbitrarily Varying Relay Channel

Uzi Pereg; Yossef Steinberg

doi:10.3390/e21050516

Abstract

We study the arbitrarily varying relay channel, which models communication with relaying in the presence of an active adversary. We establish the cutset bound and partial decode-forward bound on the random code capacity. We further determine the random code capacity for special cases. Then, we consider conditions under which the deterministic code capacity is determined as well. In addition, we consider the arbitrarily varying Gaussian relay channel with sender frequency division under input and state constraints. We determine the random code capacity, and establish lower and upper bounds on the deterministic code capacity. Furthermore, we show that as opposed to previous relay models, the primitive relay channel has a different behavior compared to the non-primitive relay channel in the arbitrarily varying scenario.

Keywords:

arbitrarily varying channel; relay channel; decode-forward; Markov block code; minimax theorem; deterministic code; random code; symmetrizability

1. Introduction

The relay channel was first introduced by van der Meulen [1] to describe point-to-point communication with the help of a relay, which receives a noisy version of the transmitter signal and transmits a signal of its own to the destination receiver. The relay channel is generally perceived as a fundamental building block for multihop networks (see e.g., [2,3], Chapter 16), where some nodes receive and transmit in order to assist the information flow between other nodes. The capacity of the relay channel is not known in general; however, Cover and El Gamal established the cutset upper bound, the decode-forward lower bound, and the partial decode-forward lower bound [4]. It was also shown in [4] that for the reversely degraded relay channel, direct transmission is capacity-achieving. For the degraded relay channel, the decode-forward lower bound and the cutset upper bound coincide, thus characterizing the capacity for this model [4].

In general, the partial decode-forward lower bound is tighter than both direct transmission and decode-forward lower bounds. El Gamal and Zahedi [5] determined the capacity of the relay channel with orthogonal sender components, by showing that the partial decode-forward lower bound and cutset upper bound coincide. A variation of the relay channel, referred to as the primitive relay channel, was introduced by Kim [2], and attracted a lot of attention (see e.g., [6,7,8,9,10,11,12] and references therein). Recently, there has also been a growing interest in the Gaussian relay channel, as e.g., in [5,7,9,13,14,15,16] and references therein. In particular, El Gamal and Zahedi [5] introduced the Gaussian relay channel with sender frequency division (SFD), as a special case of a relay channel with orthogonal sender components. There are many other relaying scenarios, including secrecy [17,18], networking [15,19,20,21,22], parallel relaying [23,24,25], diamond channels [26,27,28], side information [29,30,31,32,33], etc.

In practice, the channel statistics are not necessarily known exactly, and they may even change over time. The arbitrarily varying channel (AVC) is an appropriate model to describe such a situation [34]. In real systems, such variations are caused by fading in wireless communication [35,36,37,38,39,40,41,42], memory faults in storage [43,44,45,46,47], malicious attacks on identification and authorization systems [48,49], etc. It is especially relevant to communication in the presence of an adversary, or a jammer, attempting to disrupt communication. Jamming attacks are not limited to point-to-point communication, and pose a major security concern for cognitive radio networks [50] and wireless sensor networks [42,51,52,53,54], for instance.

Considering the AVC without a relay, Blackwell et al. determined the random code capacity [34], i.e., the capacity achieved by stochastic-encoder stochastic-decoder coding schemes with common randomness. It was also demonstrated in [34] that the random code capacity is not necessarily achievable using deterministic codes. A well-known result by Ahlswede [55] is the dichotomy property of the AVC. Specifically, the deterministic code capacity either equals the random code capacity or else is zero. Subsequently, Ericson [56] and Csiszár and Narayan [57] established a simple single-letter condition, namely non-symmetrizability, which is both necessary and sufficient for the capacity to be positive. Ahlswede’s Robustification Technique (RT) is a useful technique for the AVC analysis, developed and applied to classical AVC settings [58,59]. Essentially, the RT uses a reliable code for the compound channel to construct a random code for the AVC by applying random permutations to the codeword symbols. A continuing line of works on arbitrarily varying networks includes, among others, the arbitrarily varying broadcast channel [60,61,62,63,64,65], multiple-access channel [60,66,67,68,69,70,71,72,73,74,75], and wiretap channel [76,77,78,79,80,81,82,83,84]. The reference lists here are far from exhaustive.

In this work, we introduce a new model, namely, the arbitrarily varying relay channel (AVRC). The AVRC combines the previous models, i.e., the relay channel and the AVC, and we believe that it is a natural problem to consider, in light of the jamming attacks on current and future networks, as mentioned above. In the analysis, we incorporate the block Markov coding schemes of [4] in Ahlswede’s Robustification and Elimination Techniques [55,59]. A straightforward application of Ahlswede’s RT fails to comply with the strictly causal relay transmission. In a recent work [85,86], by the authors of this paper, a modified RT was presented and applied to the point-to-point AVC with causal side information under input and state constraints, without a relay. This was the first time that the application of the RT exploited the structure of the original compound channel code to construct a random code for the AVC, as opposed to earlier work where the original code is treated as a “black box”. Here, we present another modification of the RT, which also exploits the structure of the original compound channel code, but in a different manner. The analysis also requires redefining the compound channel, and we refer to the newly defined channel as the block-compound relay channel.

We establish the cutset upper bound and the full/partial decode-forward lower bound on the random code capacity of the AVRC. The random code capacity is determined in special cases of the degraded AVRC, the reversely degraded AVRC, and the AVRC with orthogonal sender components. Then, we give extended non-symmetrizability conditions under which the deterministic code capacity coincides with the random code capacity. We show by example that the deterministic code capacity can be strictly lower than the random code capacity of the AVRC. Then, we consider the Gaussian AVRC with SFD, under input and state constraints. The random code capacity is determined using the previous results, whereas the deterministic code capacity is lower and upper bounded using an independent approach. Specifically, we extend the techniques from [87], where Csiszár and Narayan determine the capacity of the Gaussian AVC under input and state constraints. It is shown that for low values of the input constraint, the deterministic code capacity can be strictly lower than the random code capacity, yet non-zero.

Furthermore, we give similar bounds for the primitive AVRC, where there is a noiseless link of limited capacity between the relay and the receiver [2]. We find the capacity of the primitive counterpart of the Gaussian AVRC with SFD, in which case the deterministic and random code capacities coincide, regardless of the value of the input constraint. We deduce that Kim’s assertion that “the primitive relay channel captures most essential features and challenges of relaying, and thus serves as a good testbed for new relay coding techniques” [2] does not hold in the arbitrarily varying scenario.

This work is organized as follows. In Section 2, the basic definitions and notation are provided. In Section 3, we give the main results on the general AVRC. The Gaussian AVRC with SFD is introduced in Section 4, and the main results are given in Section 5. The definition and results on the primitive AVRC are in Section 6.

2. Definitions

2.1. Notation

We use the following notation conventions throughout. Calligraphic letters

X, S, Y, \dots

are used for finite sets. Lowercase letters

x, s, y, \dots

stand for constants and values of random variables, and uppercase letters

X, S, Y, \dots

stand for random variables. The distribution of a random variable X is specified by a probability mass function (pmf)

P_{X} (x) = p (x)

over a finite set

X

. The set of all pmfs over

X

is denoted by

P (X)

. We use

x^{j} = (x_{1}, x_{2}, \dots, x_{j})

to denote a sequence of letters from

X

. A random sequence

X^{n}

and its distribution

P_{X^{n}} (x^{n}) = p (x^{n})

are defined accordingly. For a pair of integers i and j,

1 \leq i \leq j

, we define the discrete interval

[i : j] = {i, i + 1, \dots, j}

. The notation

x = (x_{1}, x_{2}, \dots, x_{n})

is used when it is understood from the context that the length of the sequence is n, and the

ℓ^{2}

-norm of

x

is denoted by

∥x∥

.

2.2. Channel Description

A state-dependent discrete memoryless relay channel

(X, X_{1}, S, W_{Y, Y_{1} | X, X_{1}, S}, Y, Y_{1})

consists of five sets,

X

,

X_{1}

,

S

,

Y

and

Y_{1}

, and a collection of conditional pmfs

W_{Y, Y_{1} | X, X_{1}, S}

. The sets stand for the input alphabet, the relay transmission alphabet, the state alphabet, the output alphabet, and the relay input alphabet, respectively. The alphabets are assumed to be finite, unless explicitly said otherwise. The channel is memoryless without feedback, and therefore

\begin{matrix} W_{Y^{n}, Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (y^{n}, y_{1}^{n} | x^{n}, x_{1}^{n}, s^{n}) = \prod_{i = 1}^{n} W_{Y, Y_{1} | X, X_{1}, S} (y_{i}, y_{1, i} | x_{i}, x_{1, i}, s_{i}) . \end{matrix}

(1)

Communication over a relay channel is depicted in Figure 1. Following [29], a relay channel

W_{Y, Y_{1} | X, X_{1}, S}

is called degraded if the channel can be expressed as

\begin{matrix} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s) = W_{Y_{1} | X, X_{1}, S} (y_{1} | x, x_{1}, s) W_{Y | Y_{1}, X_{1}, S} (y | y_{1}, x_{1}, s), \end{matrix}

(2)

and it is called reversely degraded if

\begin{matrix} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s) = W_{Y | X, X_{1}, S} (y | x, x_{1}, s) W_{Y_{1} | Y, X_{1}, S} (y_{1} | y, x_{1}, s) . \end{matrix}

(3)

We say that the relay channel is strongly degraded or strongly reversely degraded, if the respective definition holds such that the sender-relay marginal is independent of the state. That is,

W_{Y, Y_{1} | X, X_{1}, S}

is strongly degraded if

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y_{1} | X, X_{1}} W_{Y | Y_{1}, X_{1}, S}

, and similarly,

W_{Y, Y_{1} | X, X_{1}, S}

is strongly reversely degraded if

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y | X, X_{1}, S} W_{Y_{1} | Y, X_{1}}

. For example, if

Y_{1} = X + Z

and

Y = Y_{1} + X_{1} + S

, where Z is an independent additive noise, then

W_{Y, Y_{1} | X, X_{1}, S}

is strongly degraded. In contrast, if

Y = X + X_{1} + S

and

Y_{1} = Y + Z

, then

W_{Y, Y_{1} | X, X_{1}, S}

is strongly reversely degraded.

Figure 1. Communication over the arbitrarily varying relay channel

L = {W_{Y, Y_{1} | X, X_{1}, S}}

. Given a message M, the encoder transmits

X^{n} = f (M)

. At time

i \in [1 : n]

, the relay transmits

X_{1, i}

based on all the symbols of the past

Y_{1}^{i - 1}

and then receives a new symbol

Y_{1, i}

. The decoder receives the output sequence

Y^{n}

, and finds an estimate of the message

\hat{M} = g (Y^{n})

.

The arbitrarily varying relay channel (AVRC) is a discrete memoryless relay channel

(X, X_{1}, S, W_{Y, Y_{1} | X, X_{1}, S}, Y, Y_{1})

with a state sequence of unknown distribution, which is not necessarily independent or stationary. That is,

S^{n} \sim q (s^{n})

with an unknown joint pmf

q (s^{n})

over

S^{n}

. In particular,

q (s^{n})

can give mass 1 to some state sequence

s^{n}

. We use the shorthand notation

L = {W_{Y, Y_{1} | X, X_{1}, S}}

for the AVRC, where the alphabets are understood from the context.

To analyze the AVRC, we consider the compound relay channel. Different models of compound relay channels have been considered in the literature [30,88]. Here, we define the compound relay channel as a discrete memoryless relay channel

(X, X_{1}, S, W_{Y, Y_{1} | X, X_{1}, S}, Y, Y_{1})

with a discrete memoryless state, where the state distribution

q (s)

is not known exactly, but rather belongs to a family of distributions

Q

, with

Q \subseteq P (S)

. That is,

S^{n} \sim \prod_{i = 1}^{n} q (s_{i})

, with an unknown pmf

q \in Q

over

S

. We use the shorthand notation

L^{Q}

for the compound relay channel, where the transition probability

W_{Y, Y_{1} | X, X_{1}, S}

and the alphabets are understood from the context.

In the analysis, we also use the following model. Suppose that the user transmits

B > 0

blocks of length n, and the jammer is entitled to use a different state distribution

q_{b} (s) \in Q

for every block

b \in [1 : B]

, while the encoder, relay and receiver are aware of this jamming scheme. In other words, every block is governed by a different memoryless state. We refer to this channel as the block-compound relay channel, denoted by

L^{Q \times B}

. Although this is a toy model, it is a useful tool for the analysis of the AVRC.

2.3. Coding

We introduce some preliminary definitions, starting with the definitions of a deterministic code and a random code for the AVRC

L

. Note that in general, the term ‘code’, unless mentioned otherwise, refers to a deterministic code.

Definition 1

(A code, an achievable rate and capacity). A

(2^{n R}, n)

code for the AVRC

L

consists of the following: a message set

[1 : 2^{n R}]

, where it is assumed throughout that

2^{n R}

is an integer, an encoder

f : [1 : 2^{n R}] \to X^{n}

, a sequence of n relaying functions

f_{1, i} : Y_{1}^{i - 1} \to X_{1, i}

,

i \in [1 : n]

, and a decoding function

g : Y^{n} \to [1 : 2^{n R}]

.

Given a message

m \in [1 : 2^{n R}]

, the encoder transmits

x^{n} = f (m)

. At time

i \in [1 : n]

, the relay transmits

x_{1, i} = f_{1, i} (y_{1}^{i - 1})

and then receives

y_{1, i}

. The relay codeword is given by

x_{1}^{n} = f_{1}^{n} (y_{1}^{n}) ≜ {(f_{1, i} (y_{1}^{i - 1}))}_{i = 1}^{n}

. The decoder receives the output sequence

y^{n}

, and finds an estimate of the message

\hat{m} = g (y^{n})

(see Figure 1). We denote the code by

C = (f (\cdot), f_{1}^{n} (\cdot), g (\cdot))

. Define the conditional probability of error of the code

C

given a state sequence

s^{n} \in S^{n}

by

\begin{matrix} P_{e | s^{n}}^{(n)} (C) = \frac{1}{2^{n R}} \sum_{m = 1}^{2^{n R}} \sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq m} \left[\prod_{i = 1}^{n} W_{Y, Y_{1} | X, X_{1}, S} (y_{i}, y_{1, i} | f_{i} (m), f_{1, i} (y_{1}^{i - 1}), s_{i})\right] . \end{matrix}

(4)

Now, define the average probability of error of

C

for some distribution

q (s^{n}) \in P (S^{n})

,

\begin{matrix} P_{e}^{(n)} (q, C) = \sum_{s^{n} \in S^{n}} q (s^{n}) \cdot P_{e | s^{n}}^{(n)} (C) . \end{matrix}

(5)

Observe that

P_{e}^{(n)} (q, C)

is linear in q, and thus continuous. We say that

C

is a

(2^{n R}, n, ε)

code for the AVRC

L

if it further satisfies

\begin{matrix} P_{e}^{(n)} (q, C) \leq ε, \quad \text{for all } q (s^{n}) \in P (S^{n}) . \end{matrix}

(6)

A rate R is called achievable if for every

ε > 0

and sufficiently large n, there exists a

(2^{n R}, n, ε)

code. The operational capacity is defined as the supremum of the achievable rates and it is denoted by

C (L)

. We use the term ‘capacity’ referring to this operational meaning, and in some places we call it the deterministic code capacity in order to emphasize that achievability is measured with respect to deterministic codes.
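The strictly causal operation of the relay in Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deterministic binary channel `toy_channel`, the forwarding rule in `forwarding_relay`, and all function names are hypothetical and do not appear in the paper. The point is that the relaying function at time i is handed only the past relay observations.

```python
from typing import Callable, List, Sequence, Tuple


def toy_channel(x: int, x1: int, s: int) -> Tuple[int, int]:
    """Hypothetical deterministic binary channel; returns (y, y1)."""
    return x ^ x1, x ^ s


def transmit(
    codeword: Sequence[int],
    relay_fns: Sequence[Callable[[Tuple[int, ...]], int]],
    states: Sequence[int],
    channel: Callable[[int, int, int], Tuple[int, int]],
) -> Tuple[List[int], List[int]]:
    """Run one codeword x^n = f(m) through the channel under the state sequence s^n."""
    y: List[int] = []
    y1: List[int] = []
    for i, (x, s) in enumerate(zip(codeword, states)):
        # Strict causality: f_{1,i} sees only the relay symbols received so far.
        x1 = relay_fns[i](tuple(y1))
        y_i, y1_i = channel(x, x1, s)
        y.append(y_i)
        y1.append(y1_i)
    return y, y1


def forwarding_relay(n: int, seen_lengths: List[int]):
    """Illustrative relaying functions: forward the last received symbol (0 at the first time instance)."""
    def f1(past: Tuple[int, ...]) -> int:
        seen_lengths.append(len(past))  # record how much of y1 was available
        return past[-1] if past else 0
    return [f1] * n


seen: List[int] = []
y, y1 = transmit([1, 0, 1, 1], forwarding_relay(4, seen), [0, 1, 1, 0], toy_channel)
```

The list `seen` records the number of relay observations available at each time instance, which makes the strictly causal constraint easy to inspect.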

We proceed now to define the parallel quantities when using stochastic-encoders stochastic-decoder triplets with common randomness. The codes formed by these triplets are referred to as random codes.

Definition 2

(Random code). A

(2^{n R}, n)

random code for the AVRC

L

consists of a collection of

(2^{n R}, n)

codes

{C_{γ} = (f_{γ}, f_{1, γ}^{n}, g_{γ})}_{γ \in Γ}

, along with a probability distribution

μ (γ)

over the code collection Γ. We denote such a code by

C^{Γ} = (μ, Γ, {C_{γ}}_{γ \in Γ})

. Analogously to the deterministic case, a

(2^{n R}, n, ε)

random code has the additional requirement

\begin{matrix} P_{e}^{(n)} (q, C^{Γ}) = \sum_{γ \in Γ} μ (γ) P_{e}^{(n)} (q, C_{γ}) \leq ε, \quad \text{for all } q (s^{n}) \in P (S^{n}) . \end{matrix}

(7)

The capacity achieved by random codes is denoted by

C^{⋆} (L)

, and it is referred to as the random code capacity.
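The error criteria (5) and (7) can be made concrete with a small numerical sketch. Since the error probability is linear in q, its maximum over all state distributions is attained at a point mass on some state sequence, so it suffices to scan the state sequences. The conditional error values below are made-up toy numbers chosen for illustration, not results from the paper, and the function names are hypothetical.

```python
from itertools import product


def avg_error(q, cond_err):
    """Equation (5): average of the conditional error over q(s^n)."""
    return sum(q[s] * cond_err[s] for s in cond_err)


def random_code_error(mu, cond_errs, q):
    """Equation (7): mixture of the errors of the codes C_gamma under mu."""
    return sum(mu[g] * avg_error(q, cond_errs[g]) for g in mu)


def point_mass(s_star, seqs):
    """A distribution q(s^n) that gives mass 1 to the sequence s_star."""
    return {s: (1.0 if s == s_star else 0.0) for s in seqs}


def worst_case(err_fn, seqs):
    """Maximum of a linear functional of q, found by scanning point masses."""
    return max(err_fn(point_mass(s, seqs)) for s in seqs)


# Toy setting (assumed): binary states, n = 2, and two codes, each of which
# fails on exactly one state sequence.
SEQS = list(product((0, 1), repeat=2))
COND_ERRS = {
    0: {s: (1.0 if s == (0, 0) else 0.0) for s in SEQS},
    1: {s: (1.0 if s == (1, 1) else 0.0) for s in SEQS},
}
MU = {0: 0.5, 1: 0.5}

worst_det = [
    worst_case(lambda q, g=g: avg_error(q, COND_ERRS[g]), SEQS) for g in MU
]
worst_rand = worst_case(lambda q: random_code_error(MU, COND_ERRS, q), SEQS)
```

In this toy setting each deterministic code has a state sequence on which it always fails, whereas the jammer cannot target both codes at once when the code index is drawn from μ.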

3. Main Results—General AVRC

We present our results on the compound relay channel and the AVRC.

3.1. The Compound Relay Channel

We establish the cutset upper bound and the partial decode-forward lower bound for the compound relay channel. Consider a given compound relay channel

L^{Q}

. Let

\begin{matrix} R_{C S} (L^{Q}) ≜ & \inf_{q \in Q} \max_{p (x, x_{1})} \min \{I_{q} (X, X_{1}; Y), I_{q} (X; Y, Y_{1} | X_{1})\}, \end{matrix}

(8)

and

\begin{matrix} R_{P D F} (L^{Q}) ≜ \max_{p (u, x, x_{1})} \min \{\inf_{q \in Q} I_{q} (U, X_{1}; Y) + \inf_{q \in Q} I_{q} (X; Y | X_{1}, U), \\ \inf_{q \in Q} I_{q} (U; Y_{1} | X_{1}) + \inf_{q \in Q} I_{q} (X; Y | X_{1}, U)\}, \end{matrix}

(9)

where the subscripts ‘

C S

’ and ‘

P D F

’ stand for ‘cutset’ and ‘partial decode-forward’, respectively.

Lemma 1.

The capacity of the compound relay channel

L^{Q}

is bounded by

\begin{matrix} C (L^{Q}) \geq R_{P D F} (L^{Q}), \end{matrix}

(10)

\begin{matrix} C^{⋆} (L^{Q}) \leq R_{C S} (L^{Q}) . \end{matrix}

(11)

Specifically, if

R < R_{P D F} (L^{Q})

, then there exists a

(2^{n R}, n, e^{- a n})

block Markov code over

L^{Q}

for sufficiently large n and some

a > 0

.

The proof of Lemma 1 is given in Appendix A. The achievability proof is based on block Markov coding interlaced with the partial decode-forward scheme. That is, the encoder sends a sequence of messages over multiple blocks. The message in each block consists of two components: a decode-forward component and a direct transmission component, where only the former is decoded by the relay. The name ‘decode-forward component’ reflects the fact that the relay decodes this message component and forwards its estimate to the destination receiver. Once the decoder has received all blocks, the decode-forward components are decoded backwards, i.e., starting with the message in the last block going backwards. Using the estimates of the decode-forward components, the direct transmission components are decoded forwards, i.e., starting with the message in the first block going forwards. The uncertainty about the state distribution must be handled in each of these decoding steps. In both decoding stages, the receiver performs joint typicality decoding using a set of types that “quantizes” the set

Q

of state distributions.

Remark 1.

If the set of state distributions

Q

is convex, then the upper bound expression in the RHS of Equation (8) has a

\min \max

form. On the other hand, in the lower bound expression in the RHS of Equation (9), the maximum comes first, and then we have multiple min terms, which makes this expression a lot more complicated than the classical partial decode-forward bound [4] (see also [3], Theorem 16.3), where Markov properties lead to a simpler expression. We note that this phenomenon (or one might say, disturbance) where the lower bound has multiple min terms is not exclusive to the AVRC. A noteworthy example is the arbitrarily varying wiretap channel [76,89], where the lower bound has the form of

\max [\min I_{q} (U; Y) - \max I_{q} (U; Z)]

. While the capacity of the classical wiretap channel is known, the arbitrarily varying counterpart has remained an open problem for several years.

Observe that taking

U = \emptyset

in (9) gives the direct transmission lower bound,

\begin{matrix} C (L^{Q}) \geq & R_{P D F} (L^{Q}) \geq \max_{p (x, x_{1})} \inf_{q \in Q} I_{q} (X; Y | X_{1}) . \end{matrix}

(12)

Taking

U = X

in (9) results in a full decode-forward lower bound,

\begin{matrix} C (L^{Q}) \geq & R_{P D F} (L^{Q}) \geq \max_{p (x, x_{1})} \inf_{q \in Q} \min \{I_{q} (X, X_{1}; Y), I_{q} (X; Y_{1} | X_{1})\} . \end{matrix}

(13)

This yields the following corollary. The corollary uses the terms of a strongly degraded relay channel, for which

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y_{1} | X, X_{1}} W_{Y | Y_{1}, X_{1}, S}

, and a strongly reversely degraded relay channel, for which

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y | X, X_{1}, S} W_{Y_{1} | Y, X_{1}}

, as defined in Section 2.2.

Corollary 1.

Let

L^{Q}

be a compound relay channel, where

Q

is a compact convex set.

1. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly reversely degraded, then

$\begin{matrix} C (L^{Q}) = R_{P D F} (L^{Q}) = R_{C S} (L^{Q}) = \min_{q \in Q} \max_{p (x, x_{1})} I_{q} (X; Y | X_{1}) . \end{matrix}$

(14)
2. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly degraded, then

$\begin{matrix} C (L^{Q}) = R_{P D F} (L^{Q}) = R_{C S} (L^{Q}) = \max_{p (x, x_{1})} \min \{\min_{q \in Q} I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\} . \end{matrix}$

(15)

The proof of Corollary 1 is given in Appendix B. Part 1 follows from the direct transmission and cutset bounds, (12) and (8), respectively, while part 2 is based on the full decode-forward and cutset bounds, (13) and (8), respectively, along with the convexity considerations in the remark below.

Remark 2.

On a technical level, there are two purposes for considering the strongly degraded relay channel, for which the marginal channel to the relay is independent of the state, i.e.,

W_{Y_{1} | X, X_{1}, S} = W_{Y_{1} | X, X_{1}}

(see Section 2.2). First, this ensures that

X - (X_{1}, Y_{1}) - Y

form a Markov chain, without conditioning on S. Second, as pointed out in Remark 1, the order of the min and max differs between the lower and upper bounds (cf. (8) and (9)). Therefore, to prove the capacity results of Corollary 1 above, we apply the minimax theorem. In general, a pointwise minimum of two convex functions is not necessarily convex. Nevertheless, having assumed that the relay channel is strongly degraded, the functional

G (p, q) = \min {I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})}

is quasi-convex in the state distribution, i.e.,

\begin{matrix} G (p, (1 - α) q_{1} + α q_{2}) \leq \max (G (p, q_{1}), G (p, q_{2})), \end{matrix}

(16)

for every

p \in P (X \times X_{1})

,

q_{1}, q_{2} \in Q

, and

0 \leq α \leq 1

. The quasi-convex shape is illustrated in Figure 2, which depicts

G (p, q)

for an example given in the sequel. By [90] (Theorem 3.4), the minimax theorem applies to quasi-convex functions as well, which facilitates the proof of Corollary 1.

Figure 2. The functional

G (p, q) = \min {I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})}

, for

S \sim Bernoulli (q)

,

0 \leq q \leq 1

, as a function of q. The figure corresponds to Example 1, where

G (p, q) = \min {1 - \frac{1}{2} h (q), 1 - h (θ)}

, for

p (x, x_{1}) = p (x) p (x_{1})

, with

X \sim Bernoulli (\frac{1}{2})

and

X_{1} \sim Bernoulli (\frac{1}{2})

, and

θ = 0.08

. Clearly,

G (p, q)

is not convex in q, but rather quasi-convex in q.
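The shape described in the caption of Figure 2 can be checked numerically. The sketch below evaluates the expression quoted there, G(p, q) = min{1 - h(q)/2, 1 - h(θ)} with θ = 0.08, on a grid and tests inequality (16) as well as the ordinary convexity inequality. The grid resolution, the tolerance, and the function names are assumptions made for this illustration.

```python
import math


def h(t: float) -> float:
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1.0 - t) * math.log2(1.0 - t)


THETA = 0.08  # value used in the caption of Figure 2


def G(q: float, theta: float = THETA) -> float:
    """G(p, q) for the input distribution fixed in Figure 2, as a function of q."""
    return min(1.0 - 0.5 * h(q), 1.0 - h(theta))


def check_shape(points: int = 21, alphas: int = 11, tol: float = 1e-12):
    """Return (quasi_convex, convex) as observed on a finite grid."""
    grid = [i / (points - 1) for i in range(points)]
    quasi_convex, convex = True, True
    for q1 in grid:
        for q2 in grid:
            for k in range(alphas):
                a = k / (alphas - 1)
                mid = G((1.0 - a) * q1 + a * q2)
                # Inequality (16): value on the segment never exceeds the larger endpoint.
                if mid > max(G(q1), G(q2)) + tol:
                    quasi_convex = False
                # Convexity would require the value to lie below the chord.
                if mid > (1.0 - a) * G(q1) + a * G(q2) + tol:
                    convex = False
    return quasi_convex, convex


quasi_convex, convex = check_shape()
```

On this grid the quasi-convexity inequality holds throughout, while the chord inequality fails, for instance between q = 0 and q = 1/2, where the flat part of G lies above the chord.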

The following corollary is a direct consequence of Lemma 1 and it is significant for the random code analysis of the AVRC.

Corollary 2.

The capacity of the block-compound relay channel

L^{Q \times B}

is bounded by

\begin{matrix} C (L^{Q \times B}) \geq R_{P D F} (L^{Q}), \end{matrix}

(17)

\begin{matrix} C^{⋆} (L^{Q \times B}) \leq R_{C S} (L^{Q}) . \end{matrix}

(18)

Specifically, if

R < R_{P D F} (L^{Q})

, then there exists a

(2^{n R}, n, e^{- a n})

block Markov code over

L^{Q \times B}

for sufficiently large n and some

a > 0

.

The proof of Corollary 2 is given in Appendix C.

3.2. The AVRC

We give lower and upper bounds, on the random code capacity and the deterministic code capacity, for the AVRC

L

.

3.2.1. Random Code Lower and Upper Bounds

The random code bounds below are obtained through a modified version of Ahlswede’s RT, using our results on the block-compound relay channel in Corollary 2. Define

\begin{matrix} R_{P D F}^{⋆} (L) ≜ R_{P D F} (L^{Q}) |_{Q = P (S)}, R_{C S}^{⋆} (L) ≜ R_{C S} (L^{Q}) |_{Q = P (S)} . \end{matrix}

(19)

Theorem 1.

The random code capacity of an AVRC

L

is bounded by

\begin{matrix} R_{P D F}^{⋆} (L) \leq C^{⋆} (L) \leq R_{C S}^{⋆} (L) . \end{matrix}

(20)

The proof of Theorem 1 is given in Appendix D. To prove Theorem 1, we modify Ahlswede’s RT. A straightforward application of Ahlswede’s RT fails to comply with the strictly causal relay transmission. Essentially, the RT uses a reliable code for the compound channel to construct a random code for the AVC, applying random permutations to the transmitted codeword. However, the relay cannot apply permutations to its transmission, since at time

i \in [1 : n]

, the relay cannot compute

f_{1, j} (y_{1}^{j - 1})

, for

j > i

, as the relay encoder only knows the past received symbols

y_{1, 1}, \dots, y_{1, i - 1}

, and does not have access to the symbols

y_{1, i}, \dots, y_{1, j - 1}

which will be received in the future. To resolve this difficulty, we use a block Markov code for the block-compound relay channel. In a block Markov coding scheme, the relay sends

x_{1, b}^{n}

in block b, using the sequence of symbols

y_{1, b - 1}^{n}

received in the previous block. Since the entire sequence

y_{1, b - 1}^{n}

is known to the relay encoder, permutations can be applied to the transmission in each block separately. Hence, our proof exploits the structure of the original block-compound channel code to construct a random code for the AVRC, as opposed to classical works where the RT is used such that the original code is treated as a “black box” [59].
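The per-block permutation idea can be sketched as follows. This is only a bookkeeping illustration under stated assumptions, not the construction of Appendix D: the deterministic channel `toy_channel`, the forwarding rule `relay_block_encoder`, and the convention that position i of the transmitted block carries symbol π(i) of the codeword are all hypothetical choices. The sketch shows that the relay can permute its block-b transmission because it depends only on the complete sequence received in block b - 1, and that for a memoryless channel, permuting the transmission is equivalent to running the original code under a permuted state sequence.

```python
import random


def permute(seq, pi):
    """Position i of the output carries seq[pi[i]]."""
    return [seq[p] for p in pi]


def inverse(pi):
    """Inverse permutation, so that permute(permute(seq, pi), inverse(pi)) == seq."""
    inv = [0] * len(pi)
    for i, p in enumerate(pi):
        inv[p] = i
    return inv


def toy_channel(x, x1, s):
    """Hypothetical deterministic memoryless binary channel; returns (y, y1)."""
    return (x ^ s) | x1, x ^ s


def relay_block_encoder(prev_y1, n):
    """Illustrative block relay encoder: resend the previous block's observation."""
    return [0] * n if prev_y1 is None else list(prev_y1)


def run_blocks(blocks_x, states, perms):
    """Transmit B blocks, applying a separate permutation to each block."""
    n = len(blocks_x[0])
    prev_y1, outputs = None, []
    for x, s, pi in zip(blocks_x, states, perms):
        # The whole relay codeword for this block is known before the block starts,
        # since it depends only on the previous block, so it can be permuted.
        x1 = relay_block_encoder(prev_y1, n)
        tx, tx1 = permute(x, pi), permute(x1, pi)
        raw = [toy_channel(a, b, c) for a, b, c in zip(tx, tx1, s)]
        inv = inverse(pi)
        outputs.append(permute([r[0] for r in raw], inv))  # receiver undoes pi
        prev_y1 = permute([r[1] for r in raw], inv)        # relay undoes pi
    return outputs
```

Running `run_blocks` with permutations π_b and states s_b gives the same de-permuted outputs as running it with identity permutations and states `permute(s_b, inverse(pi_b))`, which is the equivalence that lets an error bound for the unpermuted code be transferred to permuted state sequences.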

Remark 3.

Block Markov coding with partial decode-forward is not a simple scheme by itself, and thus, using the RT requires careful attention. In particular, by close inspection of the proof of Theorem 1, one may recognize that the necessity of using the block-compound relay channel, rather than the standard compound channel, stems from the fact that for the AVRC, the state sequences may have completely different types in each block. For each block, we use the RT twice. First, the RT is applied to the probability of the backward decoding error, for the message component which is decoded by the relay. Then, it is applied to the probability of forward decoding error, for the message component which is transmitted directly.

Together with Corollary 1, the theorem above yields another corollary.

Corollary 3.

Let

L

be an AVRC.

1. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly reversely degraded,

$\begin{matrix} C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \min_{q (s)} \max_{p (x, x_{1})} I_{q} (X; Y | X_{1}) . \end{matrix}$

(21)
2. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly degraded,

$\begin{matrix} C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \max_{p (x, x_{1})} \min \{\min_{q (s)} I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\} . \end{matrix}$

(22)

Before we proceed to the deterministic code capacity, we note that Ahlswede’s Elimination Technique [55] can be applied to the AVRC as well. Hence, the size of the code collection of any reliable random code can be reduced to polynomial size.

3.2.2. Deterministic Code Lower and Upper Bounds

In the next statements, we characterize the deterministic code capacity of the AVRC

L

. We consider conditions under which the deterministic code capacity is positive, and it coincides with the random code capacity, and conditions under which it is lower. For every

x_{1} \in X_{1}

, let

W_{1} (x_{1})

and

W (x_{1})

denote the marginal AVCs from the sender to the relay and from the sender to the destination receiver, respectively,

\begin{matrix} W_{1} (x_{1}) = {W_{Y_{1} | X, X_{1}, S} (\cdot | \cdot, x_{1}, \cdot)}, W (x_{1}) = {W_{Y | X, X_{1}, S} (\cdot | \cdot, x_{1}, \cdot)} . \end{matrix}

(23)

See Figure 3.

Figure 3. The marginals of the arbitrarily varying relay channel. For every relay transmission

x_{1} \in X_{1}

, the marginal sender-relay AVC is denoted by

W_{1} (x_{1}) = {W_{Y_{1} | X, X_{1}, S} (\cdot | \cdot, x_{1}, \cdot)}

, and the marginal sender-receiver AVC is denoted by

W (x_{1}) = {W_{Y | X, X_{1}, S} (\cdot | \cdot, x_{1}, \cdot)}

. A sufficient condition, under which the deterministic code capacity is the same as the random code capacity of the AVRC, is given in Lemma 2. This condition is also a sufficient condition for positive capacity, but as explained in Remark 4, it is not a necessary condition.

Lemma 2 gives a condition under which the deterministic code capacity is the same as the random code capacity. The condition is given in terms of the marginal AVCs

W_{1} (x_{1})

and

W (x_{1})

.

Lemma 2.

If the marginal sender-relay and sender-receiver AVCs have positive capacities, i.e.,

C (W_{1} (x_{1, 1})) > 0

and

C (W (x_{1, 2})) > 0

, for some

x_{1, 1}, x_{1, 2} \in X_{1}

, then the capacity of the AVRC

L

is positive, and it coincides with the random code capacity, i.e.,

C (L) = C^{⋆} (L) > 0

.

The proof of Lemma 2 is given in Appendix E, extending Ahlswede’s Elimination Technique [55].

Next, we give a computable sufficient condition, under which the deterministic code capacity coincides with the random code capacity. For the point-to-point AVC, this occurs if and only if the channel is non-symmetrizable [56,57] (Definition 2). Our condition here is given in terms of an extended definition of symmetrizability, akin to [67] (Definition 3.2).

Definition 3.

A state-dependent relay channel

W_{Y, Y_{1} | X, X_{1}, S}

is said to be symmetrizable-

X | X_{1}

if for some conditional distribution

J (s | x)

,

\begin{matrix} \sum_{s \in S} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s) J (s | \tilde{x}) = \sum_{s \in S} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | \tilde{x}, x_{1}, s) J (s | x), \\ \forall x, \tilde{x} \in X, x_{1} \in X_{1}, y \in Y, y_{1} \in Y_{1} . \end{matrix}

(24)

Equivalently, for every given

x_{1} \in X_{1}

, the channel

W_{\bar{Y} | X, X_{1}, S} (\cdot | \cdot, x_{1}, \cdot)

is symmetrizable, where

\bar{Y} = (Y, Y_{1})

.

A similar definition applies to the marginals

W_{Y | X, X_{1}, S}

and

W_{Y_{1} | X, X_{1}, S}

. Note that symmetrizability of each of these marginals can be checked, without reference to whether the channel is degraded or strongly degraded.

Corollary 4.

Let

L

be an AVRC.

1. If $W_{Y | X, X_{1}, S}$ and $W_{Y_{1} | X, X_{1}, S}$ are non-symmetrizable-$X | X_{1}$, then $C (L) = C^{⋆} (L) > 0$. In this case,

$\begin{matrix} R_{P D F}^{⋆} (L) \leq C (L) \leq R_{C S}^{⋆} (L) . \end{matrix}$

(25)
2. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly reversely degraded, where $W_{Y_{1} | X, X_{1}, S}$ is non-symmetrizable-$X | X_{1}$, then

$\begin{matrix} C (L) = C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \min_{q (s)} \max_{p (x, x_{1})} I_{q} (X; Y | X_{1}) . \end{matrix}$

(26)
3. If $W_{Y, Y_{1} | X, X_{1}, S}$ is strongly degraded, where $W_{Y | X, X_{1}, S}$ is non-symmetrizable-$X | X_{1}$ and $W_{Y_{1} | X, X_{1}} (y_{1} | x, x_{1}) \neq W_{Y_{1} | X, X_{1}} (y_{1} | \tilde{x}, x_{1})$ for some $x, \tilde{x} \in X$, $x_{1} \in X_{1}$ and $y_{1} \in Y_{1}$, then

$\begin{matrix} C (L) = C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \max_{p (x, x_{1})} \min \{\min_{q (s)} I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\} . \end{matrix}$

(27)

The proof of Corollary 4 is given in Appendix F.

Remark 4.

By Corollary 4, we have that non-symmetrizability of the marginal AVCs,

W_{1} (x_{1, 1})

and

W (x_{1, 2})

, for some

x_{1, 1}, x_{1, 2} \in X_{1}

, is a sufficient condition for positive capacity (see Figure 3). This raises the question whether it is a necessary condition as well. In other words: If

W_{1} (x_{1})

and

W (x_{1})

are symmetrizable for all

x_{1} \in X_{1}

, does that necessarily imply that the capacity is zero? The answer is no. We show this using a very simple example. Suppose that

Y_{1} = S

and

Y = (X_{1}, X + S)

, where all variables are binary. It is readily seen that for both

Y_{1}

and Y, the input and the state are symmetric, for every given

X_{1} = x_{1}

. Hence,

W_{1} (x_{1})

and

W (x_{1})

are symmetrizable for all

x_{1} \in X_{1}

. Nevertheless, we note that since the relay can send

X_{1} = Y_{1} = S

, this is equivalent to an AVC with state information at the decoder. As the decoder can use

X_{1}

to eliminate the state, the capacity of this AVRC is

C (L) = 1

. In Lemma 3 below, we give a stronger condition which is a necessary condition for positive capacity.
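The example in Remark 4 can be simulated directly. The sketch below assumes that the sum X + S is taken modulo 2 and that the relay is strictly causal, so that it forwards the state observed at time i − 1 at time i; the decoder then recovers all but the last input symbol, which gives a rate of (n − 1)/n. The function name `relay_state_forwarding` is hypothetical.

```python
import random

def relay_state_forwarding(x, s):
    """Simulate Y_1 = S, Y = (X_1, X + S mod 2) with a one-step-delayed relay.

    Returns the decoder's estimates of x[0], ..., x[n-2].
    """
    n = len(x)
    y1 = list(s)                      # the relay observes the state itself
    x1 = [0] + y1[:-1]                # strictly causal relay: forward the previous state
    y = [(x1[i], (x[i] + s[i]) % 2) for i in range(n)]
    # The state of time i arrives as the first output component at time i + 1.
    return [(y[i][1] + y[i + 1][0]) % 2 for i in range(n - 1)]

rng = random.Random(0)
n = 16
x = [rng.randint(0, 1) for _ in range(n)]
states = {
    "all zeros": [0] * n,
    "all ones": [1] * n,
    "copy of the input": list(x),
    "random": [rng.randint(0, 1) for _ in range(n)],
}
for name, s in states.items():
    print(name, relay_state_forwarding(x, s) == x[:-1])
```

The check passes for every state sequence tried, including the one in which the jammer copies the transmitted sequence.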

Remark 5.

Note that there are 4 symmetrizability cases in terms of the sender-relay channel

W_{Y_{1} | X, X_{1}, S}

and the sender-receiver channel

W_{Y | X, X_{1}, S}

. For the case where

W_{Y_{1} | X, X_{1}, S}

and

W_{Y | X, X_{1}, S}

are both non-symmetrizable-

X | X_{1}

, Corollary 4 asserts that the capacity coincides with the random code capacity. In other cases, one may expect the capacity to be lower than the random code capacity. For instance, if

W_{Y | X, X_{1}, S}

is non-symmetrizable-

X | X_{1}

, while

W_{Y_{1} | X, X_{1}, S}

is symmetrizable-

X | X_{1}

, then the capacity is positive by direct transmission. Furthermore, in this case, if the channel is reversely degraded, then the capacity coincides with the random code capacity. However, it remains in question whether this is true in general, when the channel is not reversely degraded.

Next, we consider conditions under which the capacity is zero. Observe that if

W_{Y, Y_{1} | X, X_{1}, S}

is symmetrizable-

X | X_{1}

then so are

W_{Y | X, X_{1}, S}

and

W_{Y_{1} | X, X_{1}, S}

. Intuitively, if the AVRC is symmetrizable-

X | X_{1}

, then it is a poor channel. For example, say

Y_{1} = X + X_{1} + S

and

Y = X \cdot X_{1} \cdot S

, with

S = X

. Then, the jammer can confuse the decoder by taking the state sequence

S^{n}

to be some codeword. The following lemma validates this intuition.

Lemma 3.

If the AVRC

L

is symmetrizable-

X | X_{1}

, then it has zero capacity, i.e.,

C (L) = 0

. Equivalently, non-symmetrizability-

X | X_{1}

of the AVRC

L

is a necessary condition for positive capacity.

Lemma 3 is proved in Appendix G, using an extended version of Ericson’s technique [56]. For a strongly degraded AVRC, we have a simpler symmetrizability condition under which the capacity is zero.

Definition 4.

Let

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y_{1} | X, X_{1}} W_{Y | Y_{1}, X_{1}, S}

be a strongly degraded relay channel. We say that

W_{Y, Y_{1} | X, X_{1}, S}

is symmetrizable-

X_{1} \times Y_{1}

if for some conditional distribution

J (s | x_{1}, y_{1})

,

\begin{matrix} \sum_{s \in S} W_{Y | Y_{1}, X_{1}, S} (y | y_{1}, x_{1}, s) J (s | {\tilde{x}}_{1}, {\tilde{y}}_{1}) = \sum_{s \in S} W_{Y | Y_{1}, X_{1}, S} (y | {\tilde{y}}_{1}, {\tilde{x}}_{1}, s) J (s | x_{1}, y_{1}), \\ \forall {\tilde{x}}_{1}, x_{1} \in X_{1}, y \in Y, y_{1}, {\tilde{y}}_{1} \in Y_{1} . \end{matrix}

(28)

Equivalently, the channel

W_{Y | {\bar{Y}}_{1}, S}

is symmetrizable, where

{\bar{Y}}_{1} = (Y_{1}, X_{1})

.

Lemma 4.

If the AVRC

L

is strongly degraded and symmetrizable-

X_{1} \times Y_{1}

, then it has zero capacity, i.e.,

C (L) = 0

.

Lemma 4 is proved in Appendix H. An example is given below.

Example 1.

Consider a state-dependent relay channel

W_{Y, Y_{1} | X, X_{1}, S}

, specified by

\begin{matrix} Y_{1} = & X + Z \mod 2, \\ Y = & X_{1} + S, \end{matrix}

where

X = X_{1} = Z = S = Y_{1} = {0, 1}

and

Y = {0, 1, 2}

, and the additive noise is distributed according to

Z \sim Bernoulli (θ)

,

0 \leq θ \leq 1

. It is readily seen that

W_{Y, Y_{1} | X, X_{1}, S}

is strongly degraded and symmetrizable-

X_{1} \times Y_{1}

, by (2) and (28). In particular, (28) is satisfied with

J (s | x_{1}, y_{1}) = 1

for

s = x_{1}

, and

J (s | x_{1}, y_{1}) = 0

otherwise. Hence, by Lemma 4, the capacity is

C (L) = 0

. On the other hand, we show that the random code capacity is given by

C^{⋆} (L) = \min \{\frac{1}{2}, 1 - h (θ)\}

, using Corollary 3. The derivation of the random code capacity is given in Appendix I.
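Both claims of Example 1 can be checked numerically. The sketch below verifies (28) for the stated kernel by enumeration, and evaluates the expression min{1/2, 1 − h(θ)} by a grid search over Bernoulli inputs and states. It assumes logarithms to base 2 and that the cooperative term reduces to I_q(X_1; Y) because Y depends on (X_1, S) only; the function names are illustrative.

```python
from math import log2
from itertools import product

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def W(y, y1, x1, s):
    """W_{Y|Y1,X1,S} for Y = X1 + S (integer sum); y1 does not affect Y."""
    return 1.0 if y == x1 + s else 0.0

def J(s, x1, y1):
    """Jammer kernel of Example 1: the state copies x1."""
    return 1.0 if s == x1 else 0.0

def satisfies_eq28():
    """Enumerate all arguments of (28) and compare both sides."""
    B = (0, 1)
    for x1, x1t, y1, y1t, y in product(B, B, B, B, (0, 1, 2)):
        lhs = sum(W(y, y1, x1, s) * J(s, x1t, y1t) for s in B)
        rhs = sum(W(y, y1t, x1t, s) * J(s, x1, y1) for s in B)
        if abs(lhs - rhs) > 1e-12:
            return False
    return True

def mi_x1_y(a, b):
    """I_q(X1;Y) in bits for X1 ~ Bernoulli(a), S ~ Bernoulli(b), Y = X1 + S."""
    py = [(1 - a) * (1 - b), a * (1 - b) + (1 - a) * b, a * b]
    hy = -sum(p * log2(p) for p in py if p > 0)
    return hy - h(b)                  # H(Y|X1) = H(S)

GRID = [k / 100 for k in range(101)]
# Worst-case cooperative term, maximized over the relay input distribution.
maxmin = max(min(mi_x1_y(a, b) for b in GRID) for a in GRID)

def random_code_capacity(theta):
    """min{1/2, 1 - h(theta)}, with the 1/2 taken from the grid search above."""
    return min(maxmin, 1 - h(theta))

print(satisfies_eq28(), maxmin, random_code_capacity(0.1))
```

The grid search returns 1/2 for the cooperative term, attained at a uniform relay input against a uniform state, in agreement with the stated formula.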

3.3. AVRC with Orthogonal Sender Components

Consider the special case of a relay channel

W_{Y, Y_{1} | X, X_{1}, S}

with orthogonal sender components [5]; [3] (Section 16.6.2), where

X = (X^{'}, X^{″})

and

\begin{matrix} W_{Y, Y_{1} | X^{'}, X^{″}, X_{1}, S} (y, y_{1} | x^{'}, x^{″}, x_{1}, s) = W_{Y | X^{'}, X_{1}, S} (y | x^{'}, x_{1}, s) \cdot W_{Y_{1} | X^{″}, X_{1}, S} (y_{1} | x^{″}, x_{1}, s) . \end{matrix}

(29)

Here, we address the case where the channel output depends on the state only through the relay, i.e.,

W_{Y | X^{'}, X_{1}, S} (y | x^{'}, x_{1}, s) = W_{Y | X^{'}, X_{1}} (y | x^{'}, x_{1})

.

Lemma 5.

Let

L = \{W_{Y | X^{'}, X_{1}} W_{Y_{1} | X^{″}, X_{1}, S}\}

be an AVRC with orthogonal sender components. The random code capacity of

L

is given by

\begin{matrix} C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \max_{p (x_{1}) p (x^{'} | x_{1}) p (x^{″} | x_{1})} \min \{I (X^{'}, X_{1}; Y), \min_{q (s)} I_{q} (X^{″}; Y_{1} | X_{1}) + I (X^{'}; Y | X_{1})\} . \end{matrix}

(30)

If

W_{Y_{1} | X^{″}, X_{1}, S}

is non-symmetrizable-

X^{″} | X_{1}

, and

W_{Y | X^{'}, X_{1}} (y | x^{'}, x_{1}) \neq W_{Y | X^{'}, X_{1}} (y | {\tilde{x}}^{'}, x_{1})

for some

x_{1} \in X_{1}

,

x^{'}, {\tilde{x}}^{'} \in X^{'}

,

y \in Y

, then the deterministic code capacity is given by

C (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L)

.

The proof of Lemma 5 is given in Appendix J. To prove Lemma 5, we apply the methods of [5] to our results. Specifically, we use the partial decode-forward lower bound in Theorem 1, taking

U = X^{″}

(see (9) and (19)).

4. Gaussian AVRC with Sender Frequency Division

We give extended results for the Gaussian AVRC with sender frequency division (SFD), which is a special case of the AVRC with orthogonal sender components [5]. We determine the random code capacity of the Gaussian AVRC with SFD, and give lower and upper bounds on the deterministic code capacity. The derivation of the deterministic code bounds is mostly independent of our previous results, and it is based on the technique by [87]. The Gaussian relay channel

W_{Y, Y_{1} | X, X_{1}, S}

with SFD is a special case of a relay channel with orthogonal sender components [5], specified by

\begin{matrix} Y_{1} = & X^{″} + Z, \\ Y = & X^{'} + X_{1} + S, \end{matrix}

(31)

where the Gaussian additive noise

Z \sim N (0, σ^{2})

is independent of the channel state. As opposed to Lemma 5, the main channel here depends on the state, while the channel to the relay does not. In the case of a Gaussian channel, power limitations need to be accounted for, and thus, we consider the Gaussian relay channel under input and state constraints. Specifically, the user and the relay’s transmission are subject to input constraints

Ω > 0

and

Ω_{1} > 0

, respectively, and the jammer is under a state constraint

Λ

, i.e.,

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} ((X_{i}^{'})^{2} + (X_{i}^{″})^{2}) \leq Ω, \quad \frac{1}{n} \sum_{i = 1}^{n} X_{1, i}^{2} \leq Ω_{1} \quad \text{w.p. } 1, \\ \frac{1}{n} \sum_{i = 1}^{n} S_{i}^{2} \leq Λ \quad \text{w.p. } 1 . \end{matrix}

(32)

We note that Ahlswede’s Elimination Technique cannot be used under a state constraint (see [57]). Indeed, if the jammer concentrates a lot of power on the shared randomness transmission, then this transmission needs to be robust against a state constraint that is higher than

Λ

. Thereby, the results given in Section 3.2.2 do not apply to the Gaussian AVRC under input and state constraints.

For the compound relay channel, the state constraint is in the average sense. That is, we say that the Gaussian compound relay channel

L^{Q}

with SFD is under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

if

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} ((X_{i}^{'})^{2} + (X_{i}^{″})^{2}) \leq Ω, \quad \frac{1}{n} \sum_{i = 1}^{n} X_{1, i}^{2} \leq Ω_{1} \quad \text{w.p. } 1, \\ Q = \{q (s) : E S^{2} \leq Λ\} . \end{matrix}

(33)

Coding definitions and notation are as follows. The definition of a code is similar to that of Section 2.3. The encoding function is denoted by

f = (f^{'}, f^{″})

, with

f^{'} : [1 : 2^{n R}] \to R^{n}

and

f^{″} : [1 : 2^{n R}] \to R^{n}

, and the relay encoding function is denoted by

f_{1} : R^{n} \to R^{n}

, where

f_{1, i} : R^{i - 1} \to R

, for

i \in [1 : n]

. The boldface notation indicates that the encoding functions produce sequences. Here, the encoder and the relay satisfy the input constraints

{∥f^{'} (m)∥}^{2} + {∥f^{″} (m)∥}^{2} \leq n Ω

and

{∥f_{1} (y_{1})∥}^{2} \leq n Ω_{1}

for all

m \in [1 : 2^{n R}]

and

y_{1} \in R^{n}

. At time

i \in [1 : n]

, given a message

m \in [1 : 2^{n R}]

, the encoder transmits

(x_{i}^{'}, x_{i}^{″}) = (f_{i}^{'} (m), f_{i}^{″} (m))

, and the relay transmits

x_{1, i} = f_{1, i} (y_{1, 1}, \dots, y_{1, i - 1})

. The decoder receives the output sequence

y

, and finds an estimate

\hat{m} = g (y)

. A

(2^{n R}, n, ε)

code

C

for the Gaussian AVRC satisfies

P_{e | s}^{(n)} (C) \leq ε

, for all

s \in R^{n}

with

{∥s∥}^{2} \leq n Λ

, where

\begin{matrix} P_{e | s}^{(n)} (C) = \frac{1}{2^{n R}} \sum_{m = 1}^{2^{n R}} \int_{D {(m, s)}^{c}} \frac{1}{{(2 π σ^{2})}^{n / 2}} e^{- {∥z∥}^{2} / 2 σ^{2}} d z, \end{matrix}

(34)

with

\begin{matrix} D (m, s) = \{z \in R^{n} : g (f^{'} (m) + f_{1} (f^{″} (m) + z) + s) = m\} . \end{matrix}

(35)

Achievable rates, deterministic code capacity and random code capacity are defined as before. Next, we give our results on the Gaussian compound relay channel and the Gaussian AVRC with SFD.

5. Main Results—Gaussian AVRC with SFD

We give our results on the Gaussian compound relay channel and the Gaussian AVRC with SFD. The results on this compound relay channel and on the random code capacity of this AVRC are obtained through a straightforward extension of our previous results and derivations. However, the derivation of the deterministic code bounds is mostly independent of our previous results, and it is based on modifying the technique by Csiszár and Narayan in their paper on the Gaussian AVC [87].

5.1. Gaussian Compound Relay Channel

We determine the capacity of the Gaussian compound relay channel with SFD under input and state constraints. Let

\begin{matrix} F_{G} (α, ρ) ≜ \min \{\frac{1}{2} \log (1 + \frac{Ω_{1} + α Ω + 2 ρ \sqrt{α Ω} \sqrt{Ω_{1}}}{Λ}), \\ \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ})\} . \end{matrix}

(36)

Lemma 6.

The capacity of the Gaussian compound relay channel with SFD, under input constraints Ω and

Ω_{1}

and state constraint Λ, is given by

\begin{matrix} C (L^{Q}) = \max_{0 \leq α, ρ \leq 1} F_{G} (α, ρ), \end{matrix}

(37)

and it is identical to the random code capacity, i.e.,

C (L^{Q}) = C^{⋆} (L^{Q})

.

The proof of Lemma 6 is given in Appendix K, based on our results in the previous sections. The parameter

0 \leq α \leq 1

represents the fraction of input power invested in the transmission of the message component which is decoded by the relay, in the partial decode-forward coding scheme. Specifically, in the achievability proof in [5],

α Ω

and

(1 - α) Ω

are the variances of

X^{'}

and

X^{″}

, respectively. The parameter

ρ

stands for the correlation coefficient between the decode-forward transmission

X^{'}

and the relay transmission

X_{1}

.
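The maximization in (37) is over two scalar parameters, so it can be approximated by a plain grid search. The following sketch assumes logarithms to base 2 (rates in bits) and a uniform grid over (α, ρ); the function names, the grid resolution and the numerical values of Ω, Ω_1, Λ and σ² in the usage lines are illustrative assumptions.

```python
from math import log2, sqrt

def F_G(alpha, rho, omega, omega1, lam, sigma2):
    """Evaluate (36) in bits for given power split alpha and correlation rho."""
    first = 0.5 * log2(
        1 + (omega1 + alpha * omega + 2 * rho * sqrt(alpha * omega) * sqrt(omega1)) / lam
    )
    second = (0.5 * log2(1 + (1 - alpha) * omega / sigma2)
              + 0.5 * log2(1 + (1 - rho ** 2) * alpha * omega / lam))
    return min(first, second)

def compound_capacity(omega, omega1, lam, sigma2, steps=200):
    """Grid approximation of (37); returns (value, alpha, rho) at the best grid point."""
    grid = [k / steps for k in range(steps + 1)]
    return max((F_G(a, r, omega, omega1, lam, sigma2), a, r) for a in grid for r in grid)

value, alpha, rho = compound_capacity(omega=2.0, omega1=2.0, lam=1.0, sigma2=0.5)
print(f"C ~ {value:.4f} bits at alpha={alpha:.3f}, rho={rho:.3f}")
```

Note that the corner α = ρ = 1 always yields F_G = 0, since both logarithms in the second term of (36) vanish there; the maximizer therefore lies in the interior or on another part of the boundary.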

5.2. Gaussian AVRC

We determine the random code capacity of the Gaussian AVRC with SFD under constraints.

Theorem 2.

The random code capacity of the Gaussian AVRC with SFD, under input constraints Ω and

Ω_{1}

and state constraint Λ, is given by

\begin{matrix} C^{⋆} (L) = C (L^{Q}) = \max_{0 \leq α, ρ \leq 1} F_{G} (α, ρ) . \end{matrix}

(38)

The proof of Theorem 2 is given in Appendix L. The proof follows the same considerations as in our previous results.

Next, we give lower and upper bounds on the deterministic code capacity of the Gaussian AVRC with SFD under constraints, obtained by generalizing the non-standard techniques by Csiszár and Narayan in their 1991 paper on the Gaussian AVC [87]. Define

\begin{matrix} R_{G, l o w} (L) ≜ & \max & F_{G} (α, ρ) \\ & \text{subject to} & 0 \leq α, ρ \leq 1, \\ & & (1 - ρ^{2}) α Ω > Λ, \\ & & \frac{Ω_{1}}{Ω} {(\sqrt{Ω_{1}} + ρ \sqrt{α Ω})}^{2} > Λ + (1 - ρ^{2}) α Ω \end{matrix}

(39)

and

\begin{matrix} R_{G, u p} (L) ≜ & \max & F_{G} (α, ρ) \\ & \text{subject to} & 0 \leq α, ρ \leq 1, \\ & & Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} \geq Λ . \end{matrix}

(40)

It can be seen that

R_{G, l o w} \leq R_{G, u p}

, since

\begin{matrix} Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} = (1 - ρ^{2}) α Ω + {(\sqrt{Ω_{1}} + ρ \sqrt{α Ω})}^{2} \geq (1 - ρ^{2}) α Ω . \end{matrix}

(41)

The analysis is based on the following lemma by [87].

Lemma 7

(see [87] (Lemma 1)). For every

ε > 0

,

8 \sqrt{ε} < η < 1

,

K > 2 ε

, and

M = 2^{n R}

, with

2 ε \leq R \leq K

, and

n \geq n_{0} (ε, η, K)

, there exist

M

unit vectors

a (m) \in R^{n}

,

m \in [1 : M]

, such that for every unit vector

c \in R^{n}

and

0 \leq θ, ζ \leq 1

,

\begin{matrix} | \{\tilde{m} \in [1 : M] : ⟨ a (\tilde{m}), c ⟩ \geq θ\} | \leq 2^{n ({[R + \frac{1}{2} \log (1 - θ^{2})]}_{+} + ε)}, \end{matrix}

(42)

and if

θ \geq η

and

θ^{2} + ζ^{2} > 1 + η - 2^{- 2 R}

, then

\begin{matrix} \frac{1}{M} | \{m \in [1 : M] : | ⟨ a (\tilde{m}), a (m) ⟩ | \geq θ, | ⟨ a (\tilde{m}), c ⟩ | \geq ζ, \text{ for some } \tilde{m} \neq m\} | \leq 2^{- n ε}, \end{matrix}

(43)

where

{[t]}_{+} = \max {0, t}

and

⟨ \cdot, \cdot ⟩

denotes inner product.

Intuitively, the lemma states that under certain conditions, a codebook can be constructed with an exponentially small fraction of “bad” messages, for which the codewords are non-orthogonal to each other and the state sequence.

Theorem 3.

The deterministic code capacity of the Gaussian AVRC with SFD, under input constraints Ω and

Ω_{1}

and state constraint Λ, is bounded by

\begin{matrix} R_{G, l o w} (L) \leq C (L) \leq R_{G, u p} (L) . \end{matrix}

(44)

The proof of Theorem 3 is given in Appendix M.

Remark 6.

Csiszár and Narayan [87] have shown that for the classical Gaussian AVC, reliable decoding is guaranteed when the input constraint Ω is larger than the state constraint Λ. Here, we use a partial decode-forward coding scheme, where the message has two components: one is decoded by the relay, and the other is transmitted directly. The respective optimization constraints

\frac{Ω_{1}}{Ω} {(\sqrt{Ω_{1}} + ρ \sqrt{α Ω})}^{2} > Λ + (1 - ρ^{2}) α Ω

and

(1 - ρ^{2}) α Ω > Λ

in the RHS of (39), guarantee reliability for each decoding step.

Remark 7.

Csiszár and Narayan [87] have further shown that for the classical Gaussian AVC, if

Ω \leq Λ

, the capacity is zero. The converse proof in [87] follows by considering a jammer who chooses the state sequence to be a codeword. Due to the symmetry between

X

and

S

, the decoder cannot distinguish between the transmitted codeword and the impostor sent by the jammer. Here, we consider a jammer who simulates

X^{'} + X_{1}

. Specifically, the jammer draws a codeword

X^{'} = f^{'} (\tilde{m})

uniformly at random, and then, generates a sequence

{\tilde{Y}}_{1}

distributed according to the conditional distribution

P_{Y_{1} | M = \tilde{m}}

. If the sequence

\tilde{S} = f^{'} (\tilde{m}) + f_{1} ({\tilde{Y}}_{1})

satisfies the state constraint Λ, then the jammer chooses

\tilde{S}

as the state sequence. Defining

α Ω

,

Ω_{1}

, and ρ as the empirical decode-forward transmission power, relay transmission power, and their correlation coefficient, respectively, we have that the state constraint

{∥\tilde{S}∥}^{2} \leq n Λ

holds with high probability, if

Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} < Λ

. The details are in Appendix M.
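The power expression in Remark 7 is the expansion of the squared norm of a sum of two sequences. As a minimal numerical sketch, the snippet below draws two correlated sequences (a hypothetical stand-in for the decode-forward and relay transmissions, not the codewords of any actual code), computes the empirical powers and correlation coefficient, and compares the resulting expression with the empirical power of the jamming sequence. All names and numerical values are assumptions made for illustration.

```python
import random
from math import sqrt

def empirical_parameters(x_prime, x1):
    """Return (alpha * Omega, Omega_1, rho) as empirical quantities."""
    n = len(x_prime)
    p_df = sum(v * v for v in x_prime) / n        # plays the role of alpha * Omega
    p_relay = sum(v * v for v in x1) / n          # plays the role of Omega_1
    rho = sum(a * b for a, b in zip(x_prime, x1)) / (n * sqrt(p_df * p_relay))
    return p_df, p_relay, rho

def jammer_power(x_prime, x1):
    """Empirical power of the simulated state sequence x' + x_1."""
    return sum((a + b) ** 2 for a, b in zip(x_prime, x1)) / len(x_prime)

rng = random.Random(1)
n = 1000
x_prime = [rng.gauss(0.0, 0.3) for _ in range(n)]
x1 = [0.5 * a + rng.gauss(0.0, 0.4) for a in x_prime]   # relay signal correlated with x'
p_df, p_relay, rho = empirical_parameters(x_prime, x1)
predicted = p_relay + p_df + 2 * rho * sqrt(p_df * p_relay)
LAM = 1.0                                               # assumed state constraint
print(predicted, jammer_power(x_prime, x1), predicted < LAM)
```

The two power values agree up to rounding, so the jammer's sequence is admissible exactly when the expression stays below Λ.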

Figure 4 depicts the bounds on the capacity of the Gaussian AVRC with SFD under input and state constraints, as a function of the input constraint

Ω = Ω_{1}

, under state constraint

Λ = 1

and

σ^{2} = 0.5

. The top dashed line depicts the random code capacity of the Gaussian AVRC. The solid lines depict the deterministic code lower and upper bounds

R_{G, l o w} (L)

and

R_{G, u p} (L)

. For low values,

Ω < \frac{Λ}{4} = 0.25

, we have that

R_{G, u p} (L) = 0

, hence the deterministic code capacity is zero, and it is strictly lower than the random code capacity. The dotted lower line depicts the direct transmission lower bound, which is

F_{G} (1, 0)

for

Ω > Λ

, and zero otherwise [57]. For intermediate values of

Ω

, direct transmission is better than the lower bound in Theorem 3. For high values of

Ω

, the optimization constraints in (39) and (40) are inactive, hence, our bounds are tight, and the capacity coincides with the random code capacity, i.e.,

C (L) = C^{⋆} (L) = R_{G, l o w} (L) = R_{G, u p} (L)

.
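The curves described above can be approximated by a grid search over (α, ρ). The sketch below uses the parameter values quoted for Figure 4 (Ω = Ω_1, Λ = 1, σ² = 0.5) and assumes base-2 logarithms, a 101 × 101 grid, and the convention that a maximum over an empty feasible set equals zero. The function names are illustrative, and the printed values are grid approximations rather than the exact curves of the figure.

```python
from math import log2, sqrt

LAM, SIGMA2 = 1.0, 0.5                 # state constraint and relay noise variance
GRID = [k / 100 for k in range(101)]   # grid for both alpha and rho

def F_G(a, r, om, om1):
    """Evaluate (36) in bits."""
    t1 = 0.5 * log2(1 + (om1 + a * om + 2 * r * sqrt(a * om * om1)) / LAM)
    t2 = (0.5 * log2(1 + (1 - a) * om / SIGMA2)
          + 0.5 * log2(1 + (1 - r * r) * a * om / LAM))
    return min(t1, t2)

def best(om, om1, feasible):
    """Maximize F_G over feasible grid points; an empty feasible set gives rate 0."""
    vals = [F_G(a, r, om, om1) for a in GRID for r in GRID if feasible(a, r)]
    return max(vals, default=0.0)

def bounds(om):
    """Return (R_low, R_up, C_star) from (39), (40) and (38) with Omega_1 = Omega."""
    om1 = om

    def low_ok(a, r):
        direct = (1 - r * r) * a * om
        return (direct > LAM
                and (om1 / om) * (sqrt(om1) + r * sqrt(a * om)) ** 2 > LAM + direct)

    def up_ok(a, r):
        return om1 + a * om + 2 * r * sqrt(a * om * om1) >= LAM

    return (best(om, om1, low_ok), best(om, om1, up_ok),
            best(om, om1, lambda a, r: True))

for om in (0.2, 0.5, 2.0, 8.0):
    low, up, rand = bounds(om)
    print(f"Omega={om}: R_low={low:.3f}  R_up={up:.3f}  C*={rand:.3f}")
```

For Ω = 0.2 the constraint in (40) is infeasible on the whole grid, so the upper bound is zero while the random code capacity remains positive, as stated above.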

Figure 4. Bounds on the capacity of the Gaussian AVRC with sender frequency division. The dashed upper line depicts the random code capacity of the Gaussian AVRC as a function of the input constraint

Ω = Ω_{1}

, under state constraint

Λ = 1

and

σ^{2} = 0.5

. The solid lines depict the deterministic code lower and upper bounds

R_{G, l o w} (L)

and

R_{G, u p} (L)

. The dotted lower line depicts the direct transmission lower bound.

6. The Primitive AVRC

In this section, we give our results on the primitive AVRC [2], and then consider the Gaussian case. Part of the motivation given in [2] to consider the primitive relay channel was that the overall behavior and properties are the same as those of the non-primitive (“regular”) relay channel. We show that this is not true in the arbitrarily varying scenario. In particular, the behavior of the primitive Gaussian AVRC with SFD differs from that of the non-primitive counterpart considered above.

6.1. Definitions and Notation

Consider a setup where the sender transmits information over a state-dependent memoryless relay channel

W_{Y, Y_{1} | X, S}

, while there is a noiseless link of capacity

C_{1} > 0

between the relay and the receiver. Communication over a primitive relay channel is depicted in Figure 5. Given a message

M \in [1 : 2^{n R}]

, the encoder transmits

X^{n} = f (M)

over the channel

W_{Y, Y_{1} | X, S}

, which is referred to as the primitive relay channel. The relay receives

Y_{1}^{n}

and sends an index

L = f_{1} (Y_{1}^{n})

to the receiver, where

f_{1} : Y_{1}^{n} \to [1 : 2^{n C_{1}}]

. The decoder receives both the channel output sequence

Y^{n}

and the relay output L, and finds an estimate of the message

\hat{M} = g (Y^{n}, L)

. In accordance with the previous definitions, the primitive AVRC

L_{prim} = {W_{Y, Y_{1} | X, S}}

has a state sequence of unknown distribution, not necessarily independent nor stationary. The deterministic code capacity and the random code capacity are defined as before, and denoted by

C (L_{prim})

and

C^{⋆} (L_{prim})

, respectively.

Figure 5. Communication over the primitive AVRC

L

. Given a message M, the encoder transmits

X^{n} = f (M)

. The relay receives

Y_{1}^{n}

and sends

L = f_{1} (Y_{1}^{n})

, where

f_{1} : Y_{1}^{n} \to [1 : 2^{n C_{1}}]

. The decoder receives both the channel output sequence

Y^{n}

and the relay output L, and finds an estimate of the message

\hat{M} = g (Y^{n}, L)

.

6.2. Main Results—Primitive AVRC

We give our results on the primitive AVRC below. Since the proofs are based on the same arguments as those given for the non-primitive AVRC, we omit them in this section. The details are given in [91].

Using arguments similar to those given for the non-primitive relay channel, we obtain the following bounds on the random code capacity,

\begin{matrix} R_{C S}^{⋆} ≜ & \min_{q (s)} \max_{p (x)} \min \{I_{q} (X; Y) + C_{1}, I_{q} (X; Y, Y_{1})\}, \end{matrix}

(45)

and

\begin{matrix} R_{P D F}^{⋆} ≜ & \max_{p (u, x)} \min \{\min_{q (s)} I_{q} (U; Y) + \min_{q (s)} I_{q} (X; Y | U) + C_{1}, \min_{q (s)} I_{q} (U; Y_{1}) + \min_{q (s)} I_{q} (X; Y | U)\} . \end{matrix}

(46)

Theorem 4.

The random code capacity of a primitive AVRC

L_{prim}

is bounded by

\begin{matrix} R_{P D F}^{⋆} \leq C^{⋆} (L_{prim}) \leq R_{C S}^{⋆} . \end{matrix}

(47)

These bounds have the same form as the cutset upper bound and the partial decode-forward lower bound in Section 3 (cf. (8), (9) and (45), (46)). As in Section 3, we can use the bounds above to determine the capacity in the strongly degraded and reversely degraded cases, based on the direct transmission lower bound (for

U = \emptyset

), and the full decode-forward lower bound (for

U = X

).

Corollary 5.

Let

L_{prim}

be a primitive AVRC.

1.: If $W_{Y, Y_{1} | X, S}$ is strongly reversely degraded, i.e., $W_{Y, Y_{1} | X, S} = W_{Y | X, S} W_{Y_{1} | Y}$ , then

$\begin{matrix} C^{⋆} (L_{prim}) = \min_{q (s)} \max_{p (x)} I_{q} (X; Y) . \end{matrix}$

(48)
2.: If $W_{Y, Y_{1} | X, S}$ is strongly degraded, i.e., $W_{Y, Y_{1} | X, S} = W_{Y_{1} | X} W_{Y | Y_{1}, S}$ , then

$\begin{matrix} C^{⋆} (L_{prim}) = \max_{p (x)} \min \{\min_{q (s)} I_{q} (X; Y) + C_{1}, I (X; Y_{1})\} . \end{matrix}$

(49)

As for the deterministic code capacity, we give the following theorem.

Theorem 5.

Let

L_{prim}

be a primitive AVRC.

1.: If $W_{Y_{1} | X, S}$ is non-symmetrizable, then $C (L_{prim}) = C^{⋆} (L_{prim})$ . In this case,

$\begin{matrix} R_{P D F}^{⋆} \leq C (L_{prim}) \leq R_{C S}^{⋆} . \end{matrix}$

(50)
2.: If $W_{Y, Y_{1} | X, S}$ is strongly reversely degraded, where $W_{Y_{1} | X, S}$ is non-symmetrizable, then

$\begin{matrix} C (L_{prim}) = \min_{q (s)} \max_{p (x)} I_{q} (X; Y) . \end{matrix}$

(51)
3.: If $W_{Y, Y_{1} | X, S}$ is strongly degraded, such that $W_{Y_{1} | X} (y_{1} | x) \neq W_{Y_{1} | X} (y_{1} | \tilde{x})$ for some $x, \tilde{x} \in X$ , $y_{1} \in Y_{1}$ , then

$\begin{matrix} C (L_{prim}) = \max_{p (x)} \min \{\min_{q (s)} I_{q} (X; Y) + C_{1}, I (X; Y_{1})\} . \end{matrix}$

(52)
4.: If $W_{\tilde{Y} | X, S}$ is symmetrizable, where $\tilde{Y} = (Y, Y_{1})$ , then $C (L_{prim}) = 0$ .

The proof of Theorem 5 is available in [91]. To illustrate our results, we give the following example of a primitive AVRC.

Example 2.

Consider a state-dependent primitive relay channel

W_{Y, Y_{1} | X, S}

, specified by

\begin{matrix} Y_{1} = & X (1 - S), \\ Y = & X + S, \end{matrix}

where

X = S = Y_{1} = {0, 1}

,

Y = {0, 1, 2}

, and

C_{1} = 1

, i.e., the link between the relay and the receiver is a noiseless bit pipe. It can be seen that both the sender-relay and the sender-receiver marginals are symmetrizable. Indeed,

W_{Y | X, S}

satisfies

\begin{matrix} \sum_{s \in S} W_{Y | X, S} (y | x, s) J (s | \tilde{x}) = \sum_{s \in S} W_{Y | X, S} (y | \tilde{x}, s) J (s | x), \forall x, \tilde{x} \in X, y \in Y, \end{matrix}

(53)

with

J (s | x) = 1

for

s = x

, and

J (s | x) = 0

otherwise, while

W_{Y_{1} | X, S}

satisfies (53) with

J (s | x) = 1

for

s = 1 - x

, and

J (s | x) = 0

otherwise. Nevertheless, the capacity of the primitive AVRC

L_{prim} = {W_{Y, Y_{1} | X, S}}

is

C (L_{prim}) = 1

, which can be achieved using a code of length

n = 1

, with

f (m) = m

,

f_{1} (y_{1}) = y_{1}

,

\begin{matrix} g (y, ℓ) = g (y, y_{1}) = \begin{cases} 0 & y = 0 \\ 1 & y = 2 \\ y_{1} & y = 1 \end{cases} \end{matrix}

(54)

for

m, y_{1} \in {0, 1}

and

y \in {0, 1, 2}

. This example shows that even if the sender-relay and sender-receiver marginals are symmetrizable, the capacity may still be positive. We further note that the condition in part 4 of Theorem 5 implies that

W_{Y | X, S}

and

W_{Y_{1} | X, S}

are both symmetrizable, but not vice versa, as shown by this example. That is, as the capacity is positive, we have that

W_{\tilde{Y} | X, S}

is non-symmetrizable, where

\tilde{Y} = (Y, Y_{1})

, despite the fact that the marginals

W_{Y | X, S}

and

W_{Y_{1} | X, S}

are both symmetrizable.
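Because all alphabets in Example 2 are tiny, both claims can be verified by exhaustive enumeration. The sketch below checks that the length-1 code with decoder (54) recovers the message for every state, and that the two stated kernels satisfy the marginal symmetrizability condition. It only handles deterministic channels and deterministic 0/1 kernels, which is a simplifying assumption, and the function names are illustrative.

```python
from itertools import product

def channel(x, s):
    """One use of Example 2: returns (y, y1) with Y = X + S and Y1 = X(1 - S)."""
    return x + s, x * (1 - s)

def g(y, ell):
    """Decoder (54): y settles the message unless y == 1, where the relay bit decides."""
    return {0: 0, 2: 1}.get(y, ell)

def code_is_error_free():
    """Try every message and every state with f(m) = m and f1(y1) = y1."""
    for m, s in product((0, 1), repeat=2):
        y, y1 = channel(m, s)
        if g(y, y1) != m:
            return False
    return True

def symmetrized(out, J):
    """Check (53) for a deterministic output map out(x, s) and a kernel s = J(x).

    For deterministic maps, both sides of (53) are indicator functions, so they
    agree for every output exactly when out(x, J(xt)) == out(xt, J(x)).
    """
    return all(out(x, J(xt)) == out(xt, J(x)) for x, xt in product((0, 1), repeat=2))

print(code_is_error_free())
print(symmetrized(lambda x, s: x + s, lambda x: x))              # marginal to Y
print(symmetrized(lambda x, s: x * (1 - s), lambda x: 1 - x))    # marginal to Y1
print(symmetrized(channel, lambda x: x), symmetrized(channel, lambda x: 1 - x))
```

Neither of the two marginal kernels symmetrizes the joint output (Y, Y_1), which is consistent with the positive capacity of this example; this check alone does not rule out other kernels.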

6.3. Primitive Gaussian AVRC

Consider the primitive Gaussian relay channel with SFD,

\begin{matrix} Y_{1} = & X^{″} + Z, \\ Y = & X^{'} + S, \end{matrix}

(55)

Suppose that input and state constraints are imposed as before, i.e.,

\frac{1}{n} \sum_{i = 1}^{n} ((X_{i}^{'})^{2} + (X_{i}^{″})^{2}) \leq Ω

and

\frac{1}{n} \sum_{i = 1}^{n} S_{i}^{2} \leq Λ

with probability 1. The capacity of the primitive Gaussian AVRC with SFD, under input constraint

Ω

and state constraint

Λ

is given by

\begin{matrix} C (L_{prim}) = C^{⋆} (L_{prim}) = \max_{0 \leq α \leq 1} [\frac{1}{2} \log (1 + \frac{α Ω}{Λ}) + \min \{C_{1}, \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}})\}] . \end{matrix}

(56)

This result is due to the following. Observe that one could treat this primitive AVRC as two independent channels, one from

X^{'}

to Y and the other from

X^{″}

to

Y_{1}

, dividing the input power into

α Ω

and

(1 - α) Ω

, respectively. Based on this observation, the random code direct part follows from [92]. Next, the deterministic code direct part follows from part 1 of Theorem 5, and the converse part follows straightforwardly from the cutset upper bound in Theorem 4.

7. Discussion

We have presented the model of the arbitrarily varying relay channel (AVRC), as a state dependent relay channel, where jamming attacks result in either a random or a deterministic state sequence,

S^{n} \sim q (s^{n})

, where the joint distribution

q (s^{n})

is unknown and it is not necessarily of a product form. We have established the cutset upper bound and the partial decode-forward lower bound on the random code capacity of the AVRC. We have determined the random code capacity in special cases of the degraded AVRC, the reversely degraded AVRC, and the AVRC with orthogonal sender components. To do so, we used the direct transmission lower bound and the full decode-forward lower bound, along with quasi-convexity properties which are required in order to use the minimax theorem.

We have provided generalized symmetrizability conditions under which the deterministic code capacity coincides with the random code capacity. Specifically, we have shown that if the sender-relay and sender-receiver marginals are non-symmetrizable for a given relay transmission, then the capacity is positive. We further noted that this is a sufficient condition for positive capacity, which raises the question whether it is also a necessary condition. In other words, if those marginals are symmetrizable for every given relay transmission, does that necessarily imply that the capacity is zero? The answer is no, and we have refuted this assertion using a simple example, where the relay acts as a source of state information to the receiver. Then, we provided a stronger symmetrizability condition, which is necessary for the capacity to be positive. We have shown by example that the deterministic code capacity can be strictly lower than the random code capacity of the AVRC.

The Gaussian AVRC with sender frequency division (SFD) under input and state constraints is also addressed in this paper. The random code capacity is determined using the above results, whereas the deterministic code capacity is lower and upper bounded using an independent approach. Specifically, we extended the technique by Csiszár and Narayan in their 1991 paper on the Gaussian AVC [87]. We have shown that the deterministic code capacity can be strictly lower than the random code capacity, for low values of the input constraint.

Furthermore, we have considered the primitive AVRC, where there is a noiseless link of limited capacity between the relay and the receiver [2]. We tested Kim’s assertion that “the primitive relay channel captures most essential features and challenges of relaying, and thus serves as a good testbed for new relay coding techniques” [2]. We have shown that this assertion is not true in the arbitrarily varying scenario. Specifically, for the primitive Gaussian AVRC with SFD, the deterministic code capacity and the random code capacity are always the same, regardless of the value of the input constraint (see (56)), in contrast to our findings for the non-primitive case, as demonstrated in Figure 4.

Author Contributions

Formal analysis, U.P.; Investigation, U.P.; Methodology, U.P.; Supervision, Y.S.; Writing—original draft, U.P.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AVC	Arbitrarily varying channel
AVRC	Arbitrarily varying relay channel
DMC	Discrete memoryless channel
pmf	probability mass function
RT	Robustification technique
SFD	Sender frequency division
Eq.	Equation
RHS	Right hand side
LHS	Left hand side

Appendix A. Proof of Lemma 1

Appendix A.1. Partial Decode-Forward Lower Bound

We construct a block Markov code using the partial decode-forward scheme. That is, the encoder sends a sequence of messages over multiple blocks. The message in each block consists of two components, a decode-forward component, and a direct transmission component, where only the former is decoded by the relay. Once the decoder has received all blocks, the decode-forward components are decoded backwards, i.e., starting with the message in the last block going backwards. Using the estimation of the decode-forward components, the direct transmission components are decoded forwards, i.e., starting with the message in the first block going forwards. The ambiguity of the state distribution needs to be treated throughout all of those estimations. Hence, we use joint typicality with respect to a state type, which is “close” to some

q \in Q

. Let

δ > 0

be arbitrarily small. Define a set of state types

{\hat{Q}}_{n}

by

\begin{matrix} {\hat{Q}}_{n} = \{{\hat{P}}_{s^{n}} : s^{n} \in A^{δ_{1}} (q) \text{ for some } q \in Q\}, \end{matrix}

(A1)

where

\begin{matrix} δ_{1} ≜ \frac{δ}{2 \cdot | S |} . \end{matrix}

(A2)

Namely,

{\hat{Q}}_{n}

is the set of types that are

δ_{1}

-close to some state distribution

q (s)

in

Q

. A code

C

for the compound relay channel is constructed as follows.

The encoders use B blocks, each consisting of n channel uses, to convey

(B - 1)

independent messages to the receiver. Furthermore, each message

M_{b}

, for

b \in [1 : B - 1]

, is divided into two independent messages. That is,

M_{b} = (M_{b}^{'}, M_{b}^{″})

, where

M_{b}^{'}

and

M_{b}^{″}

are uniformly distributed, i.e.,

\begin{matrix} M_{b}^{'} \sim \text{Unif} [1 : 2^{n R^{'}}], \quad M_{b}^{″} \sim \text{Unif} [1 : 2^{n R^{″}}], \quad \text{with } R^{'} + R^{″} = R, \end{matrix}

(A3)

for

b \in [1 : B - 1]

. For convenience of notation, set

M_{0}^{'} = M_{B}^{'} \equiv 1

and

M_{0}^{″} = M_{B}^{″} \equiv 1

. The average rate

\frac{B - 1}{B} \cdot R

is arbitrarily close to R.

Codebook Generation: Fix the distribution

P_{U, X, X_{1}} (u, x, x_{1})

, and let

\begin{matrix} P_{X, Y, Y_{1} | U, X_{1}}^{q} (x, y, y_{1} | u, x_{1}) = P_{X | U, X_{1}} (x | u, x_{1}) \sum_{s \in S} q (s) W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s) . \end{matrix}

(A4)

We construct B independent codebooks. For

b \in [2 : B - 1]

, generate

2^{n R^{'}}

independent sequences

x_{1, b}^{n} (m_{b - 1}^{'})

,

m_{b - 1}^{'} \in [1 : 2^{n R^{'}}]

, at random, each according to

\prod_{i = 1}^{n} P_{X_{1}} (x_{1, i})

. Then, generate

2^{n R^{'}}

sequences,

\begin{matrix} u_{b}^{n} (m_{b}^{'} | m_{b - 1}^{'}) \sim & \prod_{i = 1}^{n} P_{U | X_{1}} (u_{i} | x_{1, b, i} (m_{b - 1}^{'})), m_{b}^{'} \in [1 : 2^{n R^{'}}], \end{matrix}

(A5)

conditionally independent given

x_{1, b}^{n} (m_{b - 1}^{'})

. Then, for every

m_{b}^{'} \in [1 : 2^{n R^{'}}]

, generate

2^{n R^{″}}

sequences,

\begin{matrix} x_{b}^{n} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}) \sim & \prod_{i = 1}^{n} P_{X | U, X_{1}} (x_{i} | u_{b, i} (m_{b}^{'} | m_{b - 1}^{'}), x_{1, b, i} (m_{b - 1}^{'})), m_{b}^{″} \in [1 : 2^{n R^{″}}], \end{matrix}

(A6)

conditionally independent given

(u_{b}^{n} (m_{b}^{'} | m_{b - 1}^{'}), x_{1, b}^{n} (m_{b - 1}^{'}))

. We have thus generated

B - 2

independent codebooks,

\begin{matrix} F_{b} = \{(x_{1, b}^{n} (m_{b - 1}^{'}), u_{b}^{n} (m_{b}^{'} | m_{b - 1}^{'}), x_{b}^{n} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})) : m_{b - 1}^{'}, m_{b}^{'} \in [1 : 2^{n R^{'}}], m_{b}^{″} \in [1 : 2^{n R^{″}}]\}, \end{matrix}

(A7)

for

b \in [2 : B - 1]

. The codebooks

F_{1}

and

F_{B}

are generated in the same manner, with fixed

m_{0}^{'} = m_{B}^{'} \equiv 1

and

m_{0}^{″} = m_{B}^{″} \equiv 1

. Encoding and decoding are illustrated in Figure A1.

Encoding: To send the message sequence

(m_{1}^{'}, m_{1}^{″}, \dots, m_{B - 1}^{'}, m_{B - 1}^{″})

, transmit

x_{b}^{n} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})

at block b, for

b \in [1 : B]

.

Relay Encoding: In block 1, the relay transmits

x_{1, 1}^{n} (1)

. Set

{\tilde{m}}_{0}^{'} \equiv 1

. At the end of block

b \in [1 : B - 1]

, the relay receives

y_{1, b}^{n}

, and finds some

{\tilde{m}}_{b}^{'} \in [1 : 2^{n R^{'}}]

such that

\begin{matrix} (u_{b}^{n} ({\tilde{m}}_{b}^{'} | {\tilde{m}}_{b - 1}^{'}), x_{1, b}^{n} ({\tilde{m}}_{b - 1}^{'}), y_{1, b}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q}), \quad \text{for some } q \in {\hat{Q}}_{n} . \end{matrix}

(A8)

If there is no such index, or more than one, set

{\tilde{m}}_{b}^{'} = 1

. In block

b + 1

, the relay transmits

x_{1, b + 1}^{n} ({\tilde{m}}_{b}^{'})

.
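The timing of the encoder and relay transmissions can be summarized as a simple schedule. The sketch below is only a bookkeeping aid under the conventions stated above (index 0 and index B denote the fixed dummy messages); the function name `block_schedule` and the string labels are hypothetical.

```python
def block_schedule(B: int):
    """List, for each block b = 1..B, which message indices each node uses.

    Indices 0 and B refer to the fixed dummy messages (all equal to 1).
    """
    rows = []
    for b in range(1, B + 1):
        rows.append({
            "block": b,
            # encoder sends x_b^n(m'_b, m''_b | m'_{b-1})
            "encoder": (f"m'_{b}", f"m''_{b}", f"m'_{b-1}"),
            # relay sends x_{1,b}^n of its estimate formed after block b-1
            "relay": f"m~'_{b-1}",
            # at the end of block b < B the relay estimates m'_b
            "relay_decodes": f"m'_{b}" if b < B else None,
        })
    return rows


for row in block_schedule(4):
    print(row)
```

The schedule makes the strict causality visible: the relay transmission in block b uses only an estimate formed from the sequence received in block b - 1.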

Backward Decoding: Once all blocks

{(y_{b}^{n})}_{b = 1}^{B}

are received, decoding is performed backwards. Set

{\hat{m}}_{B}^{'} = {\hat{m}}_{B}^{″} \equiv 1

. For

b = B - 1, B - 2, \dots, 1

, find a unique

{\hat{m}}_{b}^{'} \in [1 : 2^{n R^{'}}]

such that

\begin{matrix} (u_{b + 1}^{n} ({\hat{m}}_{b + 1}^{'} | {\hat{m}}_{b}^{'}), x_{1, b + 1}^{n} ({\hat{m}}_{b}^{'}), y_{b + 1}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y | U, X_{1}}^{q}), \quad \text{for some } q \in {\hat{Q}}_{n} . \end{matrix}

(A9)

If there is none, or more than one such

{\hat{m}}_{b}^{'} \in [1 : 2^{n R^{'}}]

, declare an error.

Then, the decoder uses

{\hat{m}}_{1}^{'}, \dots, {\hat{m}}_{B - 1}^{'}

as follows. For

b = B - 1, B - 2, \dots, 1

, find a unique

{\hat{m}}_{b}^{″} \in [1 : 2^{n R^{″}}]

such that

\begin{matrix} (u_{b}^{n} ({\hat{m}}_{b}^{'} | {\hat{m}}_{b - 1}^{'}), x_{b}^{n} ({\hat{m}}_{b}^{'}, {\hat{m}}_{b}^{″} | {\hat{m}}_{b - 1}^{'}), x_{1, b}^{n} ({\hat{m}}_{b - 1}^{'}), y_{b}^{n}) \in A^{δ} (P_{U, X, X_{1}} P_{Y | X, X_{1}}^{q}), \quad \text{for some } q \in {\hat{Q}}_{n} . \end{matrix}

(A10)

If there is none, or more than one such

{\hat{m}}_{b}^{″} \in [1 : 2^{n R^{″}}]

, declare an error. We note that using the set of types

{\hat{Q}}_{n}

instead of the original set of state distributions

Q

simplifies the analysis, since

Q

is not necessarily finite nor countable.
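The benefit of working with types is that there are only polynomially many of them. The sketch below counts the types of length-n sequences over an alphabet of size k, using the standard stars-and-bars count, and compares it with the bound (n + 1)^k used later in the analysis. The function names are illustrative assumptions.

```python
from itertools import product
from math import comb


def num_types(n: int, k: int) -> int:
    """Number of distinct types of length-n sequences over k symbols."""
    return comb(n + k - 1, k - 1)


def num_types_brute(n: int, k: int) -> int:
    """Same count by enumerating all k**n sequences (small n only)."""
    counts = {
        tuple(seq.count(a) for a in range(k))
        for seq in product(range(k), repeat=n)
    }
    return len(counts)


n, k = 5, 3
print(num_types(n, k), num_types_brute(n, k), (n + 1) ** k)
```

Since the set of state types is a subset of all types, its size is also at most (n + 1)^{|S|}, which is what makes a union bound over it affordable.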

Figure A1. The partial decode-forward coding scheme. The block index

b \in [1 : B]

is indicated at the top. In the following rows, we have the corresponding elements: (1) sequences transmitted by the encoder; (2) estimated messages at the relay; (3) sequences transmitted by the relay; (4) estimated messages at the destination decoder. The arrows in the second row indicate that the relay encodes forwards with respect to the block index, while the arrows in the fourth row indicate that the receiver decodes backwards.


Analysis of Probability of Error: Assume without loss of generality that the user sent

(M_{b}^{'}, M_{b}^{″}) = (1, 1)

, and let

q^{*} (s) \in Q

denote the actual state distribution chosen by the jammer. The error event is bounded by the union of the events

\begin{matrix} E_{1} (b) = \{{\tilde{M}}_{b}^{'} \neq 1\}, \quad E_{2} (b) = \{{\hat{M}}_{b}^{'} \neq 1\}, \quad E_{3} (b) = \{{\hat{M}}_{b}^{″} \neq 1\}, \quad \text{for } b \in [1 : B - 1] . \end{matrix}

(A11)

Then, the probability of error is bounded by

\begin{matrix} P_{e}^{(n)} (q, C) \leq & \sum_{b = 1}^{B - 1} \Pr (E_{1} (b)) + \sum_{b = 1}^{B - 1} \Pr (E_{2} (b) ∣ E_{1}^{c} (b)) + \sum_{b = 1}^{B - 1} \Pr (E_{3} (b) ∣ E_{1}^{c} (b) \cap E_{2}^{c} (b) \cap E_{2}^{c} (b - 1)), \end{matrix}

(A12)

with

E_{2} (0) = \emptyset

, where the conditioning on

(M_{b}^{'}, M_{b}^{″}) = (1, 1)

is omitted for convenience of notation.

We begin with the probability of erroneous relaying,

\Pr (E_{1} (b))

. Define

\begin{matrix} E_{1, 1} (b) = & \{(U_{b}^{n} (1 | {\tilde{M}}_{b - 1}^{'}), X_{1, b}^{n} ({\tilde{M}}_{b - 1}^{'}), Y_{1, b}^{n}) \notin A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}}) \text{ for all } q^{'} \in {\hat{Q}}_{n}\}, \\ E_{1, 2} (b) = & \{(U_{b}^{n} (m_{b}^{'} | {\tilde{M}}_{b - 1}^{'}), X_{1, b}^{n} ({\tilde{M}}_{b - 1}^{'}), Y_{1, b}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}}) \text{ for some } m_{b}^{'} \neq 1, q^{'} \in {\hat{Q}}_{n}\} . \end{matrix}

(A13)

For

b \in [1 : B - 1]

, the relay error event is bounded as

\begin{matrix} E_{1} (b) \subseteq & E_{1} (b - 1) \cup E_{1, 1} (b) \cup E_{1, 2} (b) \\ = & E_{1} (b - 1) \cup (E_{1} {(b - 1)}^{c} \cap E_{1, 1} (b)) \cup (E_{1} {(b - 1)}^{c} \cap E_{1, 2} (b)), \end{matrix}

(A14)

with

E_{1} (0) = \emptyset

. Thus, by the union of events bound,

\begin{matrix} \Pr (E_{1} (b)) \leq \Pr (E_{1} (b - 1)) + \Pr (E_{1, 1} (b) ∣ E_{1} {(b - 1)}^{c}) + \Pr (E_{1, 2} (b) ∣ E_{1} {(b - 1)}^{c}) . \end{matrix}

(A15)

Consider the second term on the RHS of (A15). We now claim that given that

E_{1} {(b - 1)}^{c}

occurred, i.e.,

{\tilde{M}}_{b - 1}^{'} = 1

, the event

E_{1, 1} (b)

implies that

(U_{b}^{n} (1 | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \notin A^{δ / 2} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{″}})

for all

q^{″} \in Q

. This claim is due to the following. Assume to the contrary that

E_{1, 1} (b)

holds, but

(U_{b}^{n} (1 | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \in A^{δ / 2} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{″}})

for some

q^{″} \in Q

. Then, for a sufficiently large n, there exists a type

q^{'} (s)

such that

\begin{matrix} | q^{'} (s) - q^{″} (s) | \leq δ_{1}, \end{matrix}

(A16)

for all

s \in S

, and by the definition in (A1),

q^{'} \in {\hat{Q}}_{n}

. Then, (A16) implies that

\begin{matrix} | P_{Y_{1} | U, X_{1}}^{q^{'}} (y_{1} | u, x_{1}) - P_{Y_{1} | U, X_{1}}^{q^{″}} (y_{1} | u, x_{1}) | \leq | S | \cdot δ_{1} = \frac{δ}{2}, \end{matrix}

(A17)

for all

u \in U

,

x_{1} \in X_{1}

and

y_{1} \in Y_{1}

(see (A4) and (A2)). Hence,

(U_{b}^{n} (1 | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}})

, which contradicts the first assumption. It follows that

\begin{matrix} \Pr (E_{1, 1} (b) ∣ E_{1} {(b - 1)}^{c}) \\ \leq & \Pr ((U_{b}^{n} (1 | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \notin A^{δ / 2} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{″}}) for all q^{″} \in Q ∣ E_{1} {(b - 1)}^{c}) \\ \leq & \Pr ((U_{b}^{n} (1 | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \notin A^{δ / 2} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{*}}) ∣ E_{1} {(b - 1)}^{c}) . \end{matrix}

(A18)

Since the codebooks

F_{1}, \dots, F_{B}

are independent, the sequence

(U_{b}^{n} (1 | 1), X_{1, b}^{n} (1))

from the codebook

F_{b}

is independent of the relay estimate

{\tilde{M}}_{b - 1}^{'}

, which is a function of

Y_{1, b - 1}^{n}

and the codebook

F_{b - 1}

. Thus, the RHS of (A18) tends to zero exponentially as

n \to \infty

by the law of large numbers and Chernoff’s bound.

We move to the third term in the RHS of (A15). By the union of events bound, the fact that the number of type classes in

S^{n}

is bounded by

{(n + 1)}^{| S |}

, and the independence of the codebooks, we have that

\begin{matrix} \Pr (E_{1, 2} (b) ∣ E_{1} {(b - 1)}^{c}) \\ \leq & {(n + 1)}^{| S |} \cdot \sup_{q^{'} \in {\hat{Q}}_{n}} \Pr ((U_{b}^{n} (m_{b}^{'} | 1), X_{1, b}^{n} (1), Y_{1, b}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}}) for some m_{b}^{'} \neq 1) \\ \leq & {(n + 1)}^{| S |} \cdot 2^{n R^{'}} \cdot \sup_{q^{'} \in {\hat{Q}}_{n}} [\sum_{u^{n}, x_{1}^{n}} P_{U^{n}, X_{1}^{n}} (u^{n}, x_{1}^{n}) \cdot \sum_{y_{1}^{n} : (u^{n}, x_{1}^{n}, y_{1}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}})} P_{Y_{1}^{n} | X_{1}^{n}}^{q^{*}} (y_{1}^{n} | x_{1}^{n})], \end{matrix}

(A19)

where the last line follows since

U_{b}^{n} (m_{b}^{'} | 1)

is conditionally independent of

Y_{1, b}^{n}

given

X_{1, b}^{n} (1)

, for every

m_{b}^{'} \neq 1

. Let

y_{1}^{n}

satisfy

(u^{n}, x_{1}^{n}, y_{1}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y_{1} | U, X_{1}}^{q^{'}})

. Then,

(x_{1}^{n}, y_{1}^{n}) \in A^{δ_{2}} (P_{X_{1}, Y_{1}}^{q^{'}})

with

δ_{2} ≜ | U | \cdot δ

. By Lemmas 2.6 and 2.7 in [93],

\begin{matrix} P_{X_{1}^{n}, Y_{1}^{n}}^{q^{*}} (x_{1}^{n}, y_{1}^{n}) = 2^{- n (H ({\hat{P}}_{x_{1}^{n}, y_{1}^{n}}) + D ({\hat{P}}_{x_{1}^{n}, y_{1}^{n}} | | P_{X_{1}, Y_{1}}^{q^{*}}))} \leq 2^{- n H ({\hat{P}}_{x_{1}^{n}, y_{1}^{n}})} \leq 2^{- n (H_{q^{'}} (X_{1}, Y_{1}) - ε_{1} (δ))}, \end{matrix}

hence,

\begin{matrix} P_{Y_{1}^{n} | X_{1}^{n}}^{q^{*}} (y_{1}^{n} | x_{1}^{n}) \leq 2^{- n (H_{q^{'}} (Y_{1} | X_{1}) - ε_{2} (δ))}, \end{matrix}

(A20)

where

ε_{1} (δ), ε_{2} (δ) \to 0

as

δ \to 0

. Therefore, by Equations (A19) and (A20), along with [93] (Lemma 2.13),

\begin{matrix} \Pr (E_{1, 2} (b) ∣ E_{1} {(b - 1)}^{c}) \leq {(n + 1)}^{| S |} \cdot \sup_{q^{'} \in Q} 2^{- n [I_{q^{'}} (U; Y_{1} | X_{1}) - R^{'} - ε_{3} (δ)]}, \end{matrix}

(A21)

with

ε_{3} (δ) \to 0

as

δ \to 0

. Using induction, we have by (A15) that

\Pr (E_{1} (b))

tends to zero exponentially as

n \to \infty

, for

b \in [1 : B - 1]

, provided that

R^{'} < \inf_{q^{'} \in Q} I_{q^{'}} (U; Y_{1} | X_{1}) - ε_{3} (δ)

.

As for the erroneous decoding of

M_{b}^{'}

at the receiver, observe that given

E_{1} {(b)}^{c}

, the relay sends

X_{1, b + 1}^{n} (1)

in block

b + 1

, hence

\begin{matrix} (U_{b + 1}^{n} (1 | 1), X_{b + 1}^{n} (1, 1 | 1), X_{1, b + 1}^{n} (1)) \sim P_{U, X, X_{1}} (u, x, x_{1}) . \end{matrix}

(A22)

At the destination receiver, decoding is performed backwards, hence the error events have a different form compared to those of the relay (cf. (A13) and the events below). Define the events,

\begin{matrix} E_{2, 1} (b) = & \{(U_{b + 1}^{n} ({\hat{M}}_{b + 1}^{'} | 1), X_{1, b + 1}^{n} (1), Y_{b + 1}^{n}) \notin A^{δ} (P_{U, X_{1}} P_{Y | U, X_{1}}^{q^{'}}) \text{ for all } q^{'} \in {\hat{Q}}_{n}\}, \\ E_{2, 2} (b) = & \{(U_{b + 1}^{n} ({\hat{M}}_{b + 1}^{'} | m_{b}^{'}), X_{1, b + 1}^{n} (m_{b}^{'}), Y_{b + 1}^{n}) \in A^{δ} (P_{U, X_{1}} P_{Y | U, X_{1}}^{q^{'}}) \text{ for some } m_{b}^{'} \neq 1, q^{'} \in {\hat{Q}}_{n}\} . \end{matrix}

(A23)

For

b \in [1 : B - 1]

, the error event

E_{2} (b)

is bounded by

\begin{matrix} E_{2} (b) \subseteq & E_{2} (b + 1) \cup E_{2, 1} (b) \cup E_{2, 2} (b) \\ = & E_{2} (b + 1) \cup (E_{2} {(b + 1)}^{c} \cap E_{2, 1} (b)) \cup (E_{2} {(b + 1)}^{c} \cap E_{2, 2} (b)), \end{matrix}

(A24)

with

E_{2} (B) = \emptyset

. Thus,

\begin{matrix} \Pr (E_{2} (b) ∣ E_{1} {(b)}^{c}) \leq & \Pr (E_{2} (b + 1) ∣ E_{1} {(b)}^{c}) + \Pr (E_{2, 1} (b) ∣ E_{1} {(b)}^{c}, E_{2} {(b + 1)}^{c}) \\ + \Pr (E_{2, 2} (b) ∣ E_{1} {(b)}^{c}, E_{2} {(b + 1)}^{c}) . \end{matrix}

(A25)

By similar arguments to those used above, we have that

\begin{matrix} \Pr (E_{2, 1} (b) ∣ E_{1} {(b)}^{c}, E_{2} {(b + 1)}^{c}) \leq \Pr ((U_{b + 1}^{n} (1 | 1), X_{1, b + 1}^{n} (1), Y_{b + 1}^{n}) \notin A^{δ / 2} (P_{U, X_{1}} P_{Y | U, X_{1}}^{q^{*}}) ∣ E_{1} {(b)}^{c}), \end{matrix}

(A26)

which tends to zero exponentially as

n \to \infty

, due to (A22), and by the law of large numbers and Chernoff’s bound. Then, by similar arguments to those used for the bound on

\Pr (E_{1, 2} (b) ∣ E_{1} {(b - 1)}^{c})

, the third term on the RHS of (A25) tends to zero as

n \to \infty

, provided that

R^{'} < \inf_{q^{'} \in Q} I_{q^{'}} (U, X_{1}; Y) - ε_{4} (δ)

, where

ε_{4} (δ) \to 0

as

δ \to 0

. Using induction, we have by (A25) that the second term on the RHS of (A12) tends to zero exponentially as

n \to \infty

, for

b \in [1 : B - 1]

.

Moving to the error event for

M_{b}^{″}

, define

\begin{matrix} E_{3, 1} (b) = & \{(U_{b}^{n} ({\hat{M}}_{b}^{'} | {\hat{M}}_{b - 1}^{'}), X_{b}^{n} ({\hat{M}}_{b}^{'}, 1 | {\hat{M}}_{b - 1}^{'}), X_{1, b}^{n} ({\hat{M}}_{b - 1}^{'}), Y_{b}^{n}) \notin A^{δ} (P_{U, X, X_{1}} P_{Y | X, X_{1}}^{q^{'}}) \text{ for all } q^{'} \in {\hat{Q}}_{n}\}, \\ E_{3, 2} (b) = & \{(U_{b}^{n} ({\hat{M}}_{b}^{'} | {\hat{M}}_{b - 1}^{'}), X_{b}^{n} ({\hat{M}}_{b}^{'}, m_{b}^{″} | {\hat{M}}_{b - 1}^{'}), X_{1, b}^{n} ({\hat{M}}_{b - 1}^{'}), Y_{b}^{n}) \in A^{δ} (P_{U, X, X_{1}} P_{Y | X, X_{1}}^{q^{'}}) \\ & \text{ for some } m_{b}^{″} \neq 1, q^{'} \in {\hat{Q}}_{n}\} . \end{matrix}

(A27)

Given

E_{2} {(b)}^{c} \cap E_{2} {(b - 1)}^{c}

, we have that

{\hat{M}}_{b}^{'} = 1

and

{\hat{M}}_{b - 1}^{'} = 1

. Then, by similar arguments to those used above,

\begin{matrix} \Pr (E_{3} (b) ∣ E_{1} {(b)}^{c} \cap E_{2} {(b)}^{c} \cap E_{2} {(b - 1)}^{c}) \\ \leq & \Pr (E_{3, 1} (b) ∣ E_{1} {(b)}^{c} \cap E_{2} {(b)}^{c} \cap E_{2} {(b - 1)}^{c}) + \Pr (E_{3, 2} (b) ∣ E_{1} {(b)}^{c} \cap E_{2} {(b)}^{c} \cap E_{2} {(b - 1)}^{c}) \\ \leq & e^{- a_{0} n} + {(n + 1)}^{| S |} \cdot \sup_{q^{'} \in Q} \sum_{m_{b}^{″} \neq 1} \Pr ((U_{b}^{n} (1 | 1), X_{b}^{n} (1, m_{b}^{″} | 1), X_{1, b}^{n} (1), Y_{b}^{n}) \in A^{δ} (P_{U, X, X_{1}} P_{Y | X, X_{1}}^{q^{'}}) ∣ E_{1} {(b)}^{c}) \\ \leq & e^{- a_{0} n} + {(n + 1)}^{| S |} \cdot \sup_{q^{'} \in Q} 2^{- n [I_{q^{'}} (X; Y | U, X_{1}) - R^{″} - ε_{5} (δ)]}, \end{matrix}

(A28)

where

a_{0} > 0

and

ε_{5} (δ) \to 0

as

δ \to 0

. The second inequality holds by (A22) along with the law of large numbers and Chernoff’s bound, and the last inequality holds as

X_{b}^{n} (1, m_{b}^{″} | 1)

is conditionally independent of

Y_{b}^{n}

given

(U_{b}^{n} (1 | 1), X_{1, b}^{n} (1))

for every

m_{b}^{″} \neq 1

. Thus, the third term on the RHS of (A12) tends to zero exponentially as

n \to \infty

, provided that

R^{″} < \inf_{q^{'} \in Q} I_{q^{'}} (X; Y | U, X_{1}) - ε_{5} (δ)

. Eliminating

R^{'}

and

R^{″}

, we conclude that the probability of error, averaged over the class of the codebooks, exponentially decays to zero as

n \to \infty

, provided that

R < R_{P D F} (L^{Q})

. Therefore, there must exist a

(2^{n R}, n, ε)

deterministic code, for a sufficiently large n. □
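The elimination of R' and R'' in the last step can be illustrated numerically. The sketch below assumes that the three mutual informations have already been evaluated on a finite sample of state distributions standing in for Q, so that each infimum becomes a minimum over a list; the numbers are hypothetical and the function name `pdf_rate_bound` is illustrative.

```python
def pdf_rate_bound(i_relay, i_dest, i_direct):
    """Combine the three rate conditions of the error analysis.

    i_relay[j]  ~ I_q(U; Y1 | X1)   for the j-th sampled state distribution
    i_dest[j]   ~ I_q(U, X1; Y)
    i_direct[j] ~ I_q(X; Y | U, X1)
    Returns the suprema of admissible R', R'' and of R = R' + R''.
    """
    r_prime = min(min(i_relay), min(i_dest))   # both conditions on R'
    r_dprime = min(i_direct)                   # condition on R''
    return r_prime, r_dprime, r_prime + r_dprime


# Hypothetical values in bits per channel use, one entry per sampled q.
print(pdf_rate_bound([0.5, 0.75], [1.0, 0.625], [0.25, 0.5]))
```

Any rate strictly below the third returned value can be split into R' and R'' that satisfy all three conditions, for the fixed input distribution under consideration.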

Appendix A.2. Cutset Upper Bound

This is a straightforward consequence of the cutset bound in [4]. Assume to the contrary that there exists an achievable rate

R > R_{C S} (L^{Q})

. Then, for some

q^{*} (s)

in the closure of

Q

,

\begin{matrix} R > \max_{p (x, x_{1})} \min \{I_{q^{*}} (X, X_{1}; Y), I_{q^{*}} (X; Y, Y_{1} | X_{1})\} . \end{matrix}

(A29)

By the achievability assumption, we have that for every

ε > 0

and sufficiently large n, there exists a

(2^{n R}, n)

random code

C^{Γ}

such that

P_{e}^{(n)} (q, C^{Γ}) \leq ε

for every i.i.d. state distribution

q \in Q

, and in particular for

q^{*}

. This holds even if

q^{*}

is in the closure of

Q

but not in

Q

itself, since

P_{e}^{(n)} (q, C^{Γ})

is continuous in q. Consider using this code over a standard relay channel

W_{Y, Y_{1} | X, X_{1}}

without a state, where

W_{Y, Y_{1} | X, X_{1}} (y, y_{1} | x, x_{1}) = \sum_{s \in S} q^{*} (s) W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s)

. It follows that the rate R as in (A29) can be achieved over the relay channel

W_{Y, Y_{1} | X, X_{1}}

, in contradiction to [4]. We deduce that the assumption is false, and

R > R_{C S} (L^{Q})

cannot be achieved. □
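The state-averaged channel used in this argument is a convex combination of the per-state transition matrices. The following minimal sketch computes it for a toy example; the nested-list layout `W[s][i][o]`, where `i` indexes an input pair (x, x1) and `o` indexes an output pair (y, y1), and all numerical values are assumptions made for illustration.

```python
def average_channel(W, q):
    """Return the channel sum_s q[s] * W[s], averaged over the state."""
    n_in, n_out = len(W[0]), len(W[0][0])
    return [
        [sum(q[s] * W[s][i][o] for s in range(len(q))) for o in range(n_out)]
        for i in range(n_in)
    ]


# Toy example: two states, two input pairs, two output pairs.
W = [
    [[0.9, 0.1], [0.2, 0.8]],   # transition matrix under state 0
    [[0.5, 0.5], [0.5, 0.5]],   # transition matrix under state 1
]
q_star = [0.75, 0.25]
W_avg = average_channel(W, q_star)
print(W_avg)
```

Each row of the averaged matrix is again a probability vector, so the result is a valid relay channel without a state, to which the classical cutset bound applies.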

Appendix B. Proof of Corollary 1

This is a straightforward consequence of Lemma 1, which states that the capacity of the compound relay channel is bounded by

R_{P D F} (L^{Q}) \leq C (L^{Q}) \leq R_{C S} (L^{Q})

. Thus, if

W_{Y, Y_{1} | X, X_{1}, S}

is reversely degraded such that

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y | X, X_{1}} W_{Y_{1} | Y, X_{1}, S}

, then

I_{q} (X; Y, Y_{1} | X_{1}) = I_{q} (X; Y | X_{1})

, and the bounds coincide by the minimax theorem [90], cf. (8) and (12). Similarly, if

W_{Y, Y_{1} | X, X_{1}, S}

is strongly degraded, i.e.,

W_{Y, Y_{1} | X, X_{1}, S} = W_{Y_{1} | X, X_{1}} W_{Y | Y_{1}, X_{1}, S}

, then

I_{q} (X; Y, Y_{1} | X_{1}) = I (X; Y_{1} | X_{1})

, and by (8) and (13),

\begin{matrix} R_{C S} (L^{Q}) = & \min_{q (s) \in Q} \max_{p (x, x_{1})} \min \{I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\}, \end{matrix}

(A30)

\begin{matrix} R_{P D F} (L^{Q}) = & \max_{p (x, x_{1})} \min_{q (s) \in Q} \min \{I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\} . \end{matrix}

(A31)

Observe that

\min \{I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\}

is concave in

p (x, x_{1})

and quasi-convex in

q (s)

(see, e.g., [94] (Section 3.4)), hence the bounds (A30) and (A31) coincide by the minimax theorem [90]. □

Appendix C. Proof of Corollary 2

Consider the block-compound relay channel

L^{Q \times B}

, where the state distribution

q_{b} \in Q

varies from block to block. Since the encoder, relay and receiver are aware of this jamming scheme, they can use a block coding scheme that is synchronized with the jammer block strategy. Thus, the capacity is the same as that of the ordinary compound channel, i.e.,

C (L^{Q \times B}) = C (L^{Q})

and

C^{⋆} (L^{Q \times B}) = C^{⋆} (L^{Q})

. Hence, (17) and (18) follow from Lemma 1. As for the second part of Corollary 2, observe that the block Markov coding scheme used in the proof of the partial decode-forward lower bound can be applied as is to the block-compound relay channel, since the relay and the destination receiver do not estimate the state distribution while decoding the messages (see Appendix A). Furthermore, the analysis also holds, where the actual state distribution

q^{*}

, in (A18)–(A20) and (A26), is now replaced by the state distribution

q_{b}^{*}

which corresponds to block

b \in [1 : B]

. □

Appendix D. Proof of Theorem 1

First, we explain the general idea. We adapt Ahlswede’s Robustification Technique (RT) [59] to the relay channel. Namely, we use codes for the compound relay channel to construct a random code for the AVRC using randomized permutations. However, in our case, the strictly causal nature of the relay imposes a difficulty, and the application of the RT is not straightforward.

In [59], there is noncausal state information and a random code is defined via permutations of the codeword symbols and the received sequence. Here, however, the relay cannot apply permutations to its transmission

x_{1}^{n}

, because it depends on the received sequence

y_{1}^{n}

in a strictly causal manner. We resolve this difficulty using block Markov codes for the block-compound relay channel to construct a random code for the AVRC, applying B in-block permutations to the relay transmission, which depends only on the sequence received in the previous block. The details are given below.

Appendix D.1. Partial Decode-Forward Lower Bound

We show that every rate

R < R_{P D F}^{⋆} (L)

(see (19)) can be achieved by random codes over the AVRC

L

, i.e.,

C^{⋆} (L) \geq R_{P D F}^{⋆} (L)

. We start with Ahlswede’s RT [59], stated below. Let

h : S^{n} \to [0, 1]

be a given function. If, for some fixed

α_{n} \in (0, 1)

, and for all

q (s^{n}) = \prod_{i = 1}^{n} q (s_{i})

, with

q \in P (S)

,

\begin{matrix} \sum_{s^{n} \in S^{n}} q (s^{n}) h (s^{n}) \leq α_{n}, \end{matrix}

(A32)

then,

\begin{matrix} \frac{1}{n!} \sum_{π \in Π_{n}} h (π s^{n}) \leq β_{n}, \quad \text{for all } s^{n} \in S^{n}, \end{matrix}

(A33)

where

Π_{n}

is the set of all n-tuple permutations

π : S^{n} \to S^{n}

, and

β_{n} = {(n + 1)}^{| S |} \cdot α_{n}

.
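The statement of the RT can be checked by brute force for a very short block. The sketch below uses a binary state alphabet, n = 5, and an arbitrary function h drawn at random; these choices, and the grid over q, are assumptions for illustration. The grid contains every type k/n, and its maximum lower-bounds the true supremum over q, so the check is, if anything, stricter than the statement.

```python
import random
from itertools import permutations, product
from math import factorial


def rt_check(n: int = 5, seed: int = 0):
    """Compare the permutation average of h with (n+1)^{|S|} * alpha, |S| = 2."""
    rng = random.Random(seed)
    seqs = list(product((0, 1), repeat=n))
    # arbitrary h : S^n -> [0, 1], scaled down so the bound is not vacuous
    h = {s: 0.01 * rng.random() for s in seqs}

    def expect(p1: float) -> float:
        # expectation of h under the i.i.d. distribution with q(1) = p1
        return sum(h[s] * p1 ** sum(s) * (1 - p1) ** (n - sum(s)) for s in seqs)

    grid = [k / (10 * n) for k in range(10 * n + 1)]  # includes all types k/n
    alpha = max(expect(p1) for p1 in grid)
    beta = (n + 1) ** 2 * alpha

    perms = list(permutations(range(n)))
    worst = max(
        sum(h[tuple(s[i] for i in p)] for p in perms) / factorial(n)
        for s in seqs
    )
    return alpha, beta, worst


alpha, beta, worst = rt_check()
print(alpha, beta, worst)
```

For such a small n the polynomial factor dominates and the bound is loose; the technique is useful because α_n decays exponentially while the factor (n + 1)^{|S|} grows only polynomially.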

According to Corollary 2, for every

R < R_{P D F}^{⋆} (L)

, there exists a

(2^{n R (B - 1)},

n B,

e^{- 2 θ n})

block Markov code for the block-compound relay channel

L^{P (S) \times B}

for some

θ > 0

and sufficiently large n, where

B > 0

is arbitrarily large. Recall that the code constructed in the proof in Appendix A has the following form. The encoders use

B > 0

blocks to convey

B - 1

messages

m_{b}

,

b \in [1 : B - 1]

. Each message consists of two parts, i.e.,

m_{b} = (m_{b}^{'}, m_{b}^{″})

, where

m_{b}^{'} \in [1 : 2^{n R^{'}}]

and

m_{b}^{″} \in [1 : 2^{n R^{″}}]

. In block

b \in [1 : B]

, the encoder sends

x_{b}^{n} = f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})

, with fixed

m_{0}

and

m_{B}

, and the relay transmits

x_{1, b}^{n} = f_{1, b} (y_{1, b - 1}^{n})

, using the sequence received in the previous block. After receiving the entire output sequence

{(y_{b}^{n})}_{b = 1}^{B}

, the decoder finds an estimate for the messages. Set

{\hat{m}}_{B}^{'} = 1

. The first part of each message is decoded backwards as

{\hat{m}}_{b}^{'} = g_{b}^{'} (y_{b + 1}^{n}, {\hat{m}}_{b + 1}^{'})

, for

b = B - 1, B - 2, \dots, 1

. Then, the second part of each message is decoded as

{\hat{m}}_{b}^{″} = g_{b}^{″} (y_{b}^{n}, {\hat{m}}_{1}^{'}, \dots, {\hat{m}}_{B - 1}^{'})

, for

b \in [1 : B - 1]

. The overall blocklength is then

n \cdot B

and the average rate is

\frac{B - 1}{B} (R^{'} + R^{″})

.

Given such a block Markov code

C_{B M}

for the block-compound relay channel

L^{P (S) \times B}

, we have that

\begin{matrix} \Pr_{C_{B M}} (E_{b}^{'} | {(E_{b + 1}^{'})}^{c}) \leq e^{- 2 θ n}, \Pr_{C_{B M}} (E_{b}^{″} | E_{1}^{' c}, \dots, E_{b - 1}^{' c}) \leq e^{- 2 θ n} \end{matrix}

(A34)

for

b = B - 1, \dots, 1

, where

E_{0}^{'} = E_{B}^{'} = \emptyset

, and

E_{b}^{'} = \{{\hat{M}}_{b}^{'} \neq M_{b}^{'}\}

,

E_{b}^{″} = \{{\hat{M}}_{b}^{″} \neq M_{b}^{″}\}

,

b \in [1 : B - 1]

. That is, for every sequence of state distributions

q_{1}, \dots, q_{b + 1}

, where

q_{t} (s_{t}^{n}) = \prod_{i = 1}^{n} q_{t} (s_{t, i})

for

t \in [1 : b + 1]

,

\begin{matrix} \sum_{s_{1}^{n} \in S^{n}} q_{1} (s_{1}^{n}) \sum_{s_{2}^{n} \in S^{n}} q_{2} (s_{2}^{n}) \dots \sum_{s_{b + 1}^{n} \in S^{n}} q_{b + 1} (s_{b + 1}^{n}) \cdot h_{b}^{'} (s_{1}^{n}, s_{2}^{n}, \dots, s_{b + 1}^{n}) \leq e^{- 2 θ n}, \end{matrix}

(A35)

and

\begin{matrix} \sum_{s_{1}^{n} \in S^{n}} q_{1} (s_{1}^{n}) \sum_{s_{2}^{n} \in S^{n}} q_{2} (s_{2}^{n}) \dots \sum_{s_{b}^{n} \in S^{n}} q_{b} (s_{b}^{n}) \cdot h_{b}^{″} (s_{1}^{n}, s_{2}^{n}, \dots, s_{b}^{n}) \leq e^{- 2 θ n}, \end{matrix}

(A36)

where

\begin{matrix} h_{b}^{'} (s_{1}^{n}, s_{2}^{n}, \dots, s_{b + 1}^{n}) = \frac{1}{2^{n (b + 1) (R^{'} + R^{″})}} \sum_{(m_{1}^{'}, m_{1}^{″}), \dots, (m_{b + 1}^{'}, m_{b + 1}^{″})} \\ \sum_{y_{1, b}^{n} \in Y_{1}^{n}} \Pr (Y_{1, b}^{n} = y_{1, b}^{n} ∣ (M_{1}^{'}, M_{1}^{″}) = (m_{1}^{'}, m_{1}^{″}), \dots, (M_{b}^{'}, M_{b}^{″}) = (m_{b}^{'}, m_{b}^{″}), S_{1}^{n} = s_{1}^{n}, \dots, S_{b}^{n} = s_{b}^{n}) \\ \times \sum_{y_{b + 1}^{n} : g_{b}^{'} (y_{b + 1}^{n}, m_{b + 1}^{'}) \neq m_{b}^{'}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b + 1}^{n} | f_{b + 1} (m_{b + 1}^{'}, m_{b + 1}^{″} | m_{b}^{'}), f_{1, b + 1} (y_{1, b}^{n}), s_{b + 1}^{n}) \end{matrix}

(A37)

and

\begin{matrix} h_{b}^{″} (s_{1}^{n}, s_{2}^{n}, \dots, s_{b}^{n}) = \frac{1}{2^{n R^{″}}} \sum_{m_{b}^{″} = 1}^{2^{n R^{″}}} \frac{1}{2^{n R^{'} (B - 1)}} \sum_{m_{1}^{'}, \dots, m_{B - 1}^{'}} \\ \sum_{y_{1, b - 1}^{n} \in Y_{1}^{n}} \Pr (Y_{1, b - 1}^{n} = y_{1, b - 1}^{n} | (M_{1}^{'}, M_{1}^{″}) = (m_{1}^{'}, m_{1}^{″}), \dots, (M_{b - 1}^{'}, M_{b - 1}^{″}) = (m_{b - 1}^{'}, m_{b - 1}^{″}), \\ S_{1}^{n} = s_{1}^{n}, \dots, S_{b - 1}^{n} = s_{b - 1}^{n}) \\ \times \sum_{y_{b}^{n}, y_{1, b}^{n} : g_{b}^{″} (y_{b}^{n}, m_{1}^{'}, \dots, m_{B - 1}^{'}) \neq m_{b}^{″}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b}^{n} | f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), f_{1, b} (y_{1, b - 1}^{n}), s_{b}^{n}) . \end{matrix}

(A38)

The conditioning in the equations above can be explained as follows. In (A37), due to the code construction, the sequence

Y_{1, b}^{n}

received at the relay in block

b \in [1 : B]

depends only on the messages

(M_{t}^{'}, M_{t}^{″})

with

t \leq b

. The decoded message

{\hat{M}}_{b}^{'}

, at the destination receiver, depends on messages

M_{t}^{'}

with

t > b

, since the receiver decodes this part of the message backwards. In (A38), since the second part of the message

M_{b}^{″}

is decoded after backward decoding is complete, the estimation of

M_{b}^{″}

at the decoder depends on the entire sequence

{\hat{M}}_{1}^{'}, \dots, {\hat{M}}_{B - 1}^{'}

. By (A35)–(A36), for every

t \in [1 : b]

,

h_{b}^{'}

and

h_{b}^{″}

as functions of

s_{t + 1}^{n}

and

s_{t}^{n}

, respectively, satisfy (A32) with

α_{n} = e^{- 2 θ n}

, given that the state sequences in the other blocks are fixed. Hence, applying Ahlswede’s RT recursively, we obtain

\begin{matrix} \frac{1}{{(n!)}^{b + 1}} \sum_{π_{1}, π_{2}, \dots, π_{b + 1} \in Π_{n}} h_{b}^{'} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b + 1} s_{b + 1}^{n}) \leq {(n + 1)}^{B | S |} e^{- 2 θ n} \leq e^{- θ n}, \\ \frac{1}{{(n!)}^{b}} \sum_{π_{1}, π_{2}, \dots, π_{b} \in Π_{n}} h_{b}^{″} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b} s_{b}^{n}) \leq {(n + 1)}^{B | S |} e^{- 2 θ n} \leq e^{- θ n}, \end{matrix}

(A39)

for all

(s_{1}^{n}, s_{2}^{n}, \dots, s_{b + 1}^{n}) \in S^{(b + 1) n}

and sufficiently large n, such that

{(n + 1)}^{B | S |} \leq e^{θ n}

.
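How large n must be for the polynomial factor to be absorbed can be computed directly. The sketch below finds the smallest blocklength from which (n + 1)^{B|S|} ≤ e^{θn} holds for every larger n; the function name `min_blocklength` and the sample parameters are illustrative assumptions.

```python
from math import ceil, log


def min_blocklength(B: int, num_states: int, theta: float) -> int:
    """Smallest n0 >= 1 with (n+1)^(B*|S|) <= exp(theta*n) for all n >= n0."""
    c = B * num_states

    def holds(n: int) -> bool:
        # compare logarithms to avoid huge numbers
        return c * log(n + 1) <= theta * n

    # theta*n - c*log(n+1) is convex in n and increasing once n + 1 >= c/theta,
    # so the first success at or beyond that point is never lost again.
    n = max(1, ceil(c / theta) - 1)
    while not holds(n):
        n += 1
    return n


# Hypothetical parameters: B = 10 blocks, binary state, theta = 0.1.
print(min_blocklength(10, 2, 0.1))
```

The threshold grows with B, which is why n is taken sufficiently large for a fixed number of blocks B.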

On the other hand, for every

π_{1}, π_{2}, \dots, π_{b + 1} \in Π_{n}

, we have that

\begin{matrix} h_{b}^{'} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b + 1} s_{b + 1}^{n}) = E h_{b}^{'} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b + 1} s_{b + 1}^{n} | M_{t}^{'}, M_{t}^{″}, t = 1, \dots, b + 1), \end{matrix}

(A40)

with

\begin{matrix} h_{b}^{'} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b + 1} s_{b + 1}^{n} | m_{t}^{'}, m_{t}^{″}, t = 1, \dots, b + 1) \\ = & \sum_{y_{1, 1}^{n}, \dots, y_{1, b}^{n}} \prod_{t = 0}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{1, t + 1}^{n} | f_{t + 1} (m_{t + 1}^{'}, m_{t + 1}^{″} | m_{t}^{'}), f_{1, t + 1} (y_{1, t}^{n}), π_{t + 1} s_{t + 1}^{n}) \\ \times \sum_{y_{b + 1}^{n} : g_{b}^{'} (y_{b + 1}^{n}, m_{b + 1}^{'}) \neq m_{b}^{'}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b + 1}^{n} | f_{b + 1} (m_{b + 1}^{'}, m_{b + 1}^{″} | m_{b}^{'}), f_{1, b + 1} (y_{1, b}^{n}), π_{b + 1} s_{b + 1}^{n}) \\ \overset{(a)}{=} & \sum_{y_{1, 1}^{n}, \dots, y_{1, b}^{n}} \prod_{t = 0}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (π_{t + 1} y_{1, t + 1}^{n} | f_{t + 1} (m_{t + 1}^{'}, m_{t + 1}^{″} | m_{t}^{'}), f_{1, t + 1} (π_{t} y_{1, t}^{n}), π_{t + 1} s_{t + 1}^{n}) \\ \times \sum_{y_{b + 1}^{n} : g_{b}^{'} (π_{b + 1} y_{b + 1}^{n}, m_{b + 1}^{'}) \neq m_{b}^{'}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (π_{b + 1} y_{b + 1}^{n} | f_{b + 1} (m_{b + 1}^{'}, m_{b + 1}^{″} | m_{b}^{'}), f_{1, b + 1} (π_{b} y_{1, b}^{n}), π_{b + 1} s_{b + 1}^{n}) \\ \overset{(b)}{=} & \sum_{y_{1, 1}^{n}, \dots, y_{1, b}^{n}} \prod_{t = 0}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{1, t + 1}^{n} | π_{t + 1}^{- 1} f_{t + 1} (m_{t + 1}^{'}, m_{t + 1}^{″} | m_{t}^{'}), π_{t + 1}^{- 1} f_{1, t + 1} (π_{t} y_{1, t}^{n}), s_{t + 1}^{n}) \\ \times \sum_{y_{b + 1}^{n} : g_{b}^{'} (π_{b + 1} y_{b + 1}^{n}, m_{b + 1}^{'}) \neq m_{b}^{'}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b + 1}^{n} | π_{b + 1}^{- 1} f_{b + 1} (m_{b + 1}^{'}, m_{b + 1}^{″} | m_{b}^{'}), π_{b + 1}^{- 1} f_{1, b + 1} (π_{b} y_{1, b}^{n}), s_{b + 1}^{n}), \end{matrix}

(A41)

where

(a)

is obtained by changing the order of summation over

y_{1, 1}^{n}, \dots, y_{1, b}^{n}

and

y_{b + 1}^{n}

; and

(b)

holds because the relay channel is memoryless. Similarly,

\begin{matrix} h_{b}^{″} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b} s_{b}^{n}) = E h_{b}^{″} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b} s_{b}^{n} | M_{1}^{'}, \dots, M_{B - 1}^{'}, M_{t}^{″}, t = 1, \dots, b), \end{matrix}

(A42)

with

\begin{matrix} h_{b}^{″} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b} s_{b}^{n} | m_{1}^{'}, \dots, m_{B - 1}^{'}, m_{t}^{″}, t = 1, \dots, b) \\ = & \sum_{y_{1, 1}^{n}, \dots, y_{1, b - 1}^{n}} \prod_{t = 1}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{1, t}^{n} | f_{t} (m_{t}^{'}, m_{t}^{″} | m_{t - 1}^{'}), f_{1, t} (y_{1, t - 1}^{n}), π_{t} s_{t}^{n}) \\ \times \sum_{y_{b}^{n} : g_{b}^{″} (y_{b}^{n}, m_{1}^{'}, \dots, m_{B - 1}^{'}) \neq m_{b}^{″}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b}^{n} | f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), f_{1, b} (y_{1, b - 1}^{n}), π_{b} s_{b}^{n}) \\ \overset{(a)}{=} & \sum_{y_{1, 1}^{n}, \dots, y_{1, b - 1}^{n}} \prod_{t = 1}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (π_{t} y_{1, t}^{n} | f_{t} (m_{t}^{'}, m_{t}^{″} | m_{t - 1}^{'}), f_{1, t} (π_{t - 1} y_{1, t - 1}^{n}), π_{t} s_{t}^{n}) \\ \times \sum_{y_{b}^{n} : g_{b}^{″} (π_{b} y_{b}^{n}, m_{1}^{'}, \dots, m_{B - 1}^{'}) \neq m_{b}^{″}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (π_{b} y_{b}^{n} | f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), f_{1, b} (π_{b - 1} y_{1, b - 1}^{n}), π_{b} s_{b}^{n}) \\ \overset{(b)}{=} & \sum_{y_{1, 1}^{n}, \dots, y_{1, b - 1}^{n}} \prod_{t = 1}^{b - 1} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{1, t}^{n} | π_{t}^{- 1} f_{t} (m_{t}^{'}, m_{t}^{″} | m_{t - 1}^{'}), π_{t}^{- 1} f_{1, t} (π_{t - 1} y_{1, t - 1}^{n}), s_{t}^{n}) \\ \times \sum_{y_{b}^{n} : g_{b}^{″} (π_{b} y_{b}^{n}, m_{1}^{'}, \dots, m_{B - 1}^{'}) \neq m_{b}^{″}} W_{Y^{n} | X^{n}, X_{1}^{n}, S^{n}} (y_{b}^{n} | π_{b}^{- 1} f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), π_{b}^{- 1} f_{1, b} (π_{b - 1} y_{1, b - 1}^{n}), s_{b}^{n}) . \end{matrix}

(A43)

Then, consider the

(2^{n R (B - 1)}, n B)

random Markov block code

C_{B M}^{Π}

, specified by

\begin{matrix} f_{b, π} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}) = π_{b}^{- 1} f_{b} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), f_{1, b, π} (y_{1, b - 1}^{n}) = π_{b}^{- 1} f_{1, b} (π_{b - 1} y_{1, b - 1}^{n}), \end{matrix}

(A44a)

and

\begin{matrix} g_{b, π}^{'} (y_{b + 1}^{n}, {\hat{m}}_{b + 1}^{'}) = g_{b}^{'} (π_{b + 1} y_{b + 1}^{n}, {\hat{m}}_{b + 1}^{'}), \quad g_{b, π}^{″} (y_{b}^{n}, {\hat{m}}_{1}^{'}, \dots, {\hat{m}}_{B - 1}^{'}) = g_{b}^{″} (π_{b} y_{b}^{n}, {\hat{m}}_{1}^{'}, \dots, {\hat{m}}_{B - 1}^{'}), \end{matrix}

(A44b)

for

π_{1}, \dots, π_{B} \in Π_{n}

, with a uniform distribution

μ (π_{1}, \dots, π_{B}) = \frac{1}{| Π_{n} |^{B}} = \frac{1}{{(n!)}^{B}}

. That is, a set of B independent permutations is chosen at random and applied to all blocks simultaneously, while the order of the blocks remains intact. As we restricted ourselves to a block Markov code, the relaying function in a given block depends only on symbols received in the previous block, hence, the relay can implement those in-block permutations, and the coding scheme does not violate the causality requirement.
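The in-block permutations work because a memoryless channel is invariant under a common reordering of its symbols: permuting the output and state sequences has the same effect as applying the inverse permutation to the input. The sketch below verifies this identity numerically for a single block. It is a simplification that assumes a toy point-to-point channel table `W[s][x][y]` instead of the relay channel, and all names and values are illustrative.

```python
import random
from math import isclose, prod


def permute(pi, seq):
    """Apply a permutation: position i of the result holds seq[pi[i]]."""
    return tuple(seq[j] for j in pi)


def inverse(pi):
    """Return the inverse permutation of pi."""
    inv = [0] * len(pi)
    for i, j in enumerate(pi):
        inv[j] = i
    return tuple(inv)


def channel_prob(W, y, x, s):
    """Memoryless channel: product of per-letter probabilities W[s_i][x_i][y_i]."""
    return prod(W[si][xi][yi] for yi, xi, si in zip(y, x, s))


rng = random.Random(1)
n = 6
W = [[[0.9, 0.1], [0.2, 0.8]], [[0.6, 0.4], [0.3, 0.7]]]  # toy W[s][x][y]
x, y, s = (tuple(rng.randrange(2) for _ in range(n)) for _ in range(3))
pi = tuple(rng.sample(range(n), n))

lhs = channel_prob(W, permute(pi, y), x, permute(pi, s))   # W^n(pi y | x, pi s)
rhs = channel_prob(W, y, permute(inverse(pi), x), s)       # W^n(y | pi^{-1} x, s)
print(isclose(lhs, rhs))
```

Because each permutation acts within a single block, the relay only needs the sequence received in the previous block before applying it, which is consistent with the causality requirement.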

From (A41) and (A43), we see that using the random code

C_{B M}^{Π}

, the error probabilities for the messages

M_{b}^{'}

and

M_{b}^{″}

are given by

\begin{matrix} \Pr_{C_{B M}^{Π}} (E_{b}^{'} | {(E_{b + 1}^{'})}^{c}, S_{1}^{n} = s_{1}^{n}, \dots, S_{b + 1}^{n} = s_{b + 1}^{n}) = \sum_{π_{1}, \dots, π_{B} \in Π_{n}} μ (π_{1}, \dots, π_{B}) h_{b}^{'} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b + 1} s_{b + 1}^{n}), \\ \Pr_{C_{B M}^{Π}} (E_{b}^{″} | E_{1}^{' c}, \dots, E_{B - 1}^{' c}, S_{1}^{n} = s_{1}^{n}, \dots, S_{b}^{n} = s_{b}^{n}) = \sum_{π_{1}, \dots, π_{B} \in Π_{n}} μ (π_{1}, \dots, π_{B}) h_{b}^{″} (π_{1} s_{1}^{n}, π_{2} s_{2}^{n}, \dots, π_{b} s_{b}^{n}), \end{matrix}

(A45)

for all

s_{1}^{n}, \dots, s_{b + 1}^{n} \in S^{n}

,

b \in [1 : B - 1]

, and therefore, together with (A39), we have that the probability of error of the random code

C_{B M}^{Π}

is bounded by

P_{e}^{(n)} (q, C_{B M}^{Π}) \leq e^{- θ n}

, for every

q (s^{n B}) \in P (S^{n B})

. That is,

C_{B M}^{Π}

is a

(2^{n R (B - 1)}, n B, e^{- θ n})

random code for the AVRC

L

, where the overall blocklength is

n B

, and the average rate

\frac{B - 1}{B} \cdot R

tends to R as

B \to \infty

. This completes the proof of the partial decode-forward lower bound.

Appendix D.2. Cutset Upper Bound

The proof immediately follows from Lemma 1, since the random code capacity of the AVRC is bounded by the random code capacity of the compound relay channel, i.e.,

C^{⋆} (L) \leq C^{⋆} (L^{P (S)})

. □

Appendix E. Proof of Lemma 2

We use the approach of [55], with the required adjustments. We use the random code constructed in the proof of Theorem 1. Let

R < C^{⋆} (L)

, and consider the case where the marginal sender-relay and sender-receiver AVCs have positive capacity, i.e.,

\begin{matrix} C (W_{1} (x_{1, 1})) > 0, and C (W (x_{1, 2})) > 0, \end{matrix}

(A46)

for some

x_{1, 1}, x_{1, 2} \in X_{1}

(see (23)). By Theorem 1, for every

ε > 0

and sufficiently large n, there exists a

(2^{n R}, n, ε)

random code

C^{Γ} = (μ (γ) = \frac{1}{k}, Γ = [1 : k], {C_{γ}}_{γ \in Γ})

, where

C_{γ} = (f_{γ}^{n}, f_{1, γ}, g_{γ})

, for

γ \in Γ

. Following Ahlswede’s Elimination Technique [55], it can be assumed that the size of the code collection is bounded by

k = | Γ | \leq n^{2}

. By (A46), we have that for every

ε^{'} > 0

and sufficiently large

ν^{'}

, the code index

γ \in [1 : k]

can be sent through the relay channel

W_{Y_{1} | X, X_{1}, S}

using a

(2^{ν^{'} {\tilde{R}}^{'}}, ν^{'}, ε^{'})

deterministic code

C_{i}^{'} = ({\tilde{f}}^{ν^{'}}, {\tilde{g}}^{'})

, where

{\tilde{R}}^{'} > 0

, while the relay repeatedly transmits the symbol

x_{1, 1}

. Since k is at most polynomial, the encoder can reliably convey

γ

to the relay with a negligible blocklength, i.e.,

ν^{'} = o (n)

. Similarly, there exists

(2^{ν^{″} {\tilde{R}}^{″}}, ν^{″}, ε^{″})

code

C_{i}^{″} = ({\tilde{f}}^{ν^{″}}, {\tilde{g}}^{″})

for the transmission of

γ \in [1 : k]

through the channel

W_{Y | X, X_{1}, S}

to the receiver, where

ν^{″} = o (n)

and

{\tilde{R}}^{″} > 0

, while the relay repeatedly transmits the symbol

x_{1, 2}

.

Now, consider a code formed by the concatenation of

C_{i}^{'}

and

C_{i}^{″}

as consecutive prefixes to a corresponding code in the code collection

{C_{γ}}_{γ \in Γ}

. That is, the encoder first sends the index

γ

to the relay and the receiver, and then it sends the message

m \in [1 : 2^{n R}]

to the receiver. Specifically, the encoder first transmits the

(ν^{'} + ν^{″})

-sequence

({\tilde{f}}^{ν^{'}} (γ), {\tilde{f}}^{ν^{″}} (γ))

to convey the index

γ

, while the relay transmits the

(ν^{'} + ν^{″})

-sequence

({\tilde{x}}_{1}^{ν^{'}}, {\tilde{x}}_{1}^{ν^{″}})

, where

{\tilde{x}}_{1}^{ν^{'}} = (x_{1, 1}, x_{1, 1}, \dots, x_{1, 1})

and

{\tilde{x}}_{1}^{ν^{″}} = (x_{1, 2}, x_{1, 2}, \dots, x_{1, 2})

. At the end of this transmission, the relay uses the first

ν^{'}

symbols it received to estimate the code index as

{\hat{γ}}^{'} = {\tilde{g}}^{'} ({\tilde{y}}_{1}^{ν^{'}})

.

Then, the message m is transmitted by the codeword

x^{n} = f_{γ} (m)

, while the relay transmits

x_{1}^{n} = f_{1, {\hat{γ}}^{'}}^{n} (y_{1}^{n})

. Subsequently, decoding is performed in two stages as well; the decoder estimates the index at first, with

{\hat{γ}}^{″} = {\tilde{g}}^{″} ({\tilde{y}}^{ν^{″}})

, and the message is then estimated by

\hat{m} = g_{{\hat{γ}}^{″}} (y^{n})

. By the union of events bound, the probability of error is then bounded by

ε_{c} = ε + ε^{'} + ε^{″}

, for every joint distribution in

P (S^{ν^{'} + ν^{″} + n})

. That is, the concatenated code is a

(2^{(ν^{'} + ν^{″} + n) {\tilde{R}}_{n}}, ν^{'} + ν^{″} + n, ε_{c})

code over the AVRC

L

, where the blocklength is

n + o (n)

, and the rate

{\tilde{R}}_{n} = \frac{n}{ν^{'} + ν^{″} + n} \cdot R

approaches R as

n \to \infty

. □
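To see why the two prefixes cost nothing asymptotically, the following sketch computes the rate of the concatenated code for a few blocklengths. It rests on simplifying assumptions that are not in the proof: the collection size is taken as exactly k = n^2, each prefix length is set to the smallest ν with 2^{ν R̃} ≥ k, and the prefix rates 0.1 and 0.05 bits per channel use are arbitrary placeholders. The function names are hypothetical.

```python
import math

def prefix_length(n, prefix_rate):
    """Channel uses needed to index k = n**2 codes at rate prefix_rate (bits/use)."""
    k = n ** 2
    return math.ceil(math.log2(k) / prefix_rate)

def effective_rate(n, R, rate_to_relay, rate_to_receiver):
    """Rate n / (nu' + nu'' + n) * R of the concatenated code."""
    nu1 = prefix_length(n, rate_to_relay)
    nu2 = prefix_length(n, rate_to_receiver)
    return n / (nu1 + nu2 + n) * R

R = 0.5  # placeholder target rate
rates = [effective_rate(n, R, 0.1, 0.05) for n in (10**2, 10**4, 10**6)]
```

Since the prefix lengths grow like log n while the main block grows like n, the computed rates increase towards R, in line with ν' = o(n) and ν″ = o(n).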

Appendix F. Proof of Corollary 4

Consider part 1. By Definition 3, if

W_{Y_{1} | X, X_{1}, S}

and

W_{Y | X, X_{1}, S}

are not symmetrizable-

X | X_{1}

then there exist

x_{1, 1}, x_{1, 2} \in X_{1}

such that the DMCs

W_{Y_{1} | X, X_{1}, S} (\cdot | \cdot, x_{1, 1}, \cdot)

and

W_{Y | X, X_{1}, S} (\cdot | \cdot, x_{1, 2}, \cdot)

are non-symmetrizable in the sense of [57] (Definition 2). This, in turn, implies that

C (W_{1} (x_{1, 1})) > 0

and

C (W (x_{1, 2})) > 0

, due to [57] (Theorem 1). Hence, by Lemma 2,

C (L) = C^{⋆} (L)

, and by Theorem 1,

R_{P D F}^{⋆} (L) \leq C (L) \leq R_{C S}^{⋆} (L)

.

Part 3 immediately follows from part 1 and Corollary 3. As for part 2, consider a strongly reversely degraded relay channel. We claim that if

W_{Y | X, X_{1}, S}

is symmetrizable-

X | X_{1}

, then

W_{Y_{1} | X, X_{1}, S}

is also symmetrizable-

X | X_{1}

. Indeed, suppose that

W_{Y | X, X_{1}, S}

is symmetrized by some

J (s | x, x_{1})

(see Definition 24). Then, for every

x, \tilde{x} \in X

,

x_{1} \in X_{1}

, and

y_{1} \in Y_{1}

,

\begin{matrix} \sum_{s \in S} J (s | \tilde{x}, x_{1}) W_{Y_{1} | X, X_{1}, S} (y_{1} | x, x_{1}, s) = & \sum_{s \in S} J (s | \tilde{x}, x_{1}) \sum_{y \in Y} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | x, x_{1}, s) \\ \overset{(a)}{=} & \sum_{y \in Y} W_{Y_{1} | Y, X_{1}} (y_{1} | y, x_{1}) \sum_{s \in S} J (s | \tilde{x}, x_{1}) W_{Y | X, X_{1}, S} (y | x, x_{1}, s) \\ \overset{(b)}{=} & \sum_{y \in Y} W_{Y_{1} | Y, X_{1}} (y_{1} | y, x_{1}) \sum_{s \in S} J (s | x, x_{1}) W_{Y | X, X_{1}, S} (y | \tilde{x}, x_{1}, s) \\ \overset{(c)}{=} & \sum_{s \in S} J (s | x, x_{1}) \sum_{y \in Y} W_{Y, Y_{1} | X, X_{1}, S} (y, y_{1} | \tilde{x}, x_{1}, s) \\ = & \sum_{s \in S} J (s | x, x_{1}) W_{Y_{1} | X, X_{1}, S} (y_{1} | \tilde{x}, x_{1}, s), \end{matrix}

(A47)

where

(a)

and

(c)

hold since

W_{Y, Y_{1} | X, X_{1}, S}

is strongly reversely degraded, and

(b)

holds since

W_{Y | X, X_{1}, S}

is symmetrized by

J (s | x, x_{1})

. This means that

W_{Y_{1} | X, X_{1}, S}

is also symmetrizable-

X | X_{1}

. It can be deduced that given the conditions of part 2, both

W_{Y | X, X_{1}, S}

and

W_{Y_{1} | X, X_{1}, S}

are non-symmetrizable-

X | X_{1}

. Hence, the proof follows from part 1 and Corollary 3. □

Appendix G. Proof of Lemma 3

The proof generalizes the technique of [56]. Let

L

be a symmetrizable-

X | X_{1}

AVRC. Assume to the contrary that a positive rate

R > 0

can be achieved. That is, for every

ε > 0

and sufficiently large n, there exists a

(2^{n R}, n, ε)

code

C = (f, f_{1}, g)

. Hence, the size of the message set is at least 2, i.e.,

\begin{matrix} M ≜ 2^{n R} \geq 2 . \end{matrix}

(A48)

We now show that there exists a distribution

q (s^{n})

such that the probability of error

P_{e}^{(n)} (q, C)

is bounded from below by a positive constant, in contradiction to the assumption above.

By Definition 3, there exists a conditional distribution

J (s | x)

that satisfies (24). Then, consider the state sequence distribution

q (s^{n}) = \frac{1}{M} \sum_{m = 1}^{M} J^{n} (s^{n} | x^{n} (m))

, where

J^{n} (s^{n} | x^{n}) = \prod_{i = 1}^{n} J (s_{i} | x_{i})

and

x^{n} (m) = f (m)

. For this distribution, the probability of error is given by

\begin{matrix} P_{e}^{(n)} (q, C) = & \sum_{s^{n} \in S^{n}} [\frac{1}{M} \sum_{\tilde{m} = 1}^{M} J^{n} (s^{n} | x^{n} (\tilde{m}))] \cdot \frac{1}{M} \sum_{m = 1}^{M} \sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq m} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) \\ = & \frac{1}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq m} \sum_{s^{n} \in S^{n}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (\tilde{m})) \\ + \frac{1}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq \tilde{m}} \sum_{s^{n} \in S^{n}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (\tilde{m}), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (m)) \end{matrix}

(A49)

with

W^{n} \equiv W_{Y^{n}, Y_{1}^{n} | X^{n}, X_{1}^{n}, S^{n}}

for brevity, where in the last sum we interchanged the summation indices m and

\tilde{m}

. Then, consider the last sum, and observe that by (24), we have that

\begin{matrix} \sum_{s^{n} \in S^{n}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (\tilde{m}), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (m)) = & \prod_{i = 1}^{n} [\sum_{s_{i} \in S} W (y_{i}, y_{1, i} | x_{i} (\tilde{m}), f_{1, i} (y_{1}^{i - 1}), s_{i}) J (s_{i} | x_{i} (m))] \\ = & \prod_{i = 1}^{n} [\sum_{s_{i} \in S} W (y_{i}, y_{1, i} | x_{i} (m), f_{1, i} (y_{1}^{i - 1}), s_{i}) J (s_{i} | x_{i} (\tilde{m}))] \\ = & \sum_{s^{n} \in S^{n}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (\tilde{m})) . \end{matrix}

(A50)

Substituting (A50) in (A49), we have

\begin{matrix} P_{e}^{(n)} (q, C) = & \frac{1}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \sum_{s^{n} \in S^{n}} [\sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq m} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (\tilde{m})) \\ + \sum_{(y^{n}, y_{1}^{n}) : g (y^{n}) \neq \tilde{m}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (\tilde{m}))] \end{matrix}

\begin{matrix} \geq & \frac{1}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} \neq m} \sum_{s^{n} \in S^{n}} \sum_{y^{n}, y_{1}^{n}} W^{n} (y^{n}, y_{1}^{n} | x^{n} (m), f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | x^{n} (\tilde{m})) \\ = & \frac{M (M - 1)}{2 M^{2}} \geq \frac{1}{4}, \end{matrix}

(A51)

where the last inequality follows from (A48), hence a positive rate cannot be achieved. □
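The two ingredients of this argument can be checked numerically on a toy example. The sketch below is an assumption-laden illustration, not the channel of the lemma: it drops the relay and uses a point-to-point channel with binary input and state and output Y = X + S (real addition), together with the jammer strategy J(s|x) = 1{s = x}. It verifies the symmetrizing identity for this toy channel and evaluates the bound M(M-1)/(2M^2) from (A51). All names are hypothetical.

```python
from itertools import product

X = S = (0, 1)      # toy binary input and state alphabets
Y = (0, 1, 2)       # output alphabet of the real-valued sum

def W(y, x, s):
    """Toy deterministic channel: the output is the real sum of input and state."""
    return 1.0 if y == x + s else 0.0

def J_copy(s, x):
    """Candidate symmetrizing distribution: the jammer imitates a channel input."""
    return 1.0 if s == x else 0.0

def J_uniform(s, x):
    """A state distribution that ignores x; it does not symmetrize this channel."""
    return 0.5

def is_symmetrized(W, J):
    """Check sum_s W(y|x,s) J(s|x~) == sum_s W(y|x~,s) J(s|x) for all x, x~, y."""
    for x, x_tilde, y in product(X, X, Y):
        lhs = sum(W(y, x, s) * J(s, x_tilde) for s in S)
        rhs = sum(W(y, x_tilde, s) * J(s, x) for s in S)
        if abs(lhs - rhs) > 1e-12:
            return False
    return True

def error_lower_bound(M):
    """The quantity M (M - 1) / (2 M^2) appearing in (A51)."""
    return M * (M - 1) / (2 * M ** 2)

symmetrized = is_symmetrized(W, J_copy)
not_symmetrized = is_symmetrized(W, J_uniform)
bounds = [error_lower_bound(M) for M in range(2, 50)]
```

With the copying jammer the receiver cannot tell which of the two inputs is the true one, which is the intuition behind the constant lower bound of 1/4 on the error probability for every M ≥ 2.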

Appendix H. Proof of Lemma 4

Let

L = {W_{Y_{1} | X, X_{1}} W_{Y | Y_{1}, X_{1}, S}}

be a symmetrizable-

X_{1} \times Y_{1}

degraded AVRC. The proof follows lines similar to those of Appendix G. First, assume to the contrary that there exists a

(2^{n R}, n, ε)

code

C = (f, f_{1}, g)

, with

M ≜ 2^{n R} \geq 2

. By Definition 4, there exists

J (s | x_{1}, y_{1})

that satisfies (28). Hence, defining

\begin{matrix} q (s^{n}) = \frac{1}{M} \sum_{m = 1}^{M} \sum_{y_{1}^{n} \in Y_{1}^{n}} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}} (y_{1}^{n} | f (m), f_{1}^{n} (y_{1}^{n})) J^{n} (s^{n} | f_{1}^{n} (y_{1}^{n}), y_{1}^{n}), \end{matrix}

(A52)

where

J^{n} (s^{n} | x_{1}^{n}, y_{1}^{n}) = \prod_{i = 1}^{n} J (s_{i} | x_{1, i}, y_{1, i})

, we have that

\begin{matrix} \sum_{s^{n} \in S^{n}} W^{n} (y^{n} | {\tilde{y}}_{1}^{n}, f_{1}^{n} ({\tilde{y}}_{1}^{n}), s^{n}) J^{n} (s^{n} | f_{1}^{n} (y_{1}^{n}), y_{1}^{n}) = & \sum_{s^{n} \in S^{n}} W^{n} (y^{n} | y_{1}^{n}, f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | f_{1}^{n} ({\tilde{y}}_{1}^{n}), {\tilde{y}}_{1}^{n}) . \end{matrix}

(A53)

By manipulations similar to those in Appendix G, we obtain

\begin{matrix} P_{e}^{(n)} (q, C) = \frac{1}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \sum_{y_{1}^{n}, {\tilde{y}}_{1}^{n}} W_{Y_{1}^{n} | X^{n}, X_{1}^{n}} ({\tilde{y}}_{1}^{n} | f (\tilde{m}), f_{1}^{n} ({\tilde{y}}_{1}^{n})) W_{Y_{1}^{n} | X^{n}, X_{1}^{n}} (y_{1}^{n} | f (m), f_{1}^{n} (y_{1}^{n})) \\ \times \sum_{s^{n} \in S^{n}} [\sum_{y^{n} : g (y^{n}) \neq m} W^{n} (y^{n} | y_{1}^{n}, f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | f_{1}^{n} ({\tilde{y}}_{1}^{n}), {\tilde{y}}_{1}^{n}) \\ + \sum_{y^{n} : g (y^{n}) \neq \tilde{m}} W^{n} (y^{n} | y_{1}^{n}, f_{1}^{n} (y_{1}^{n}), s^{n}) J^{n} (s^{n} | f_{1}^{n} ({\tilde{y}}_{1}^{n}), {\tilde{y}}_{1}^{n})] \\ \geq \frac{M (M - 1)}{2 M^{2}} \geq \frac{1}{4}, \end{matrix}

(A54)

hence a positive rate cannot be achieved. □

Appendix I. Analysis of Example 1

We show that the random code capacity of the AVRC in Example 1 is given by

C^{⋆} (L) = \min \{\frac{1}{2}, 1 - h (θ)\}

. As the AVRC is degraded, the random code capacity is given by

\begin{matrix} C^{⋆} (L) = R_{P D F}^{⋆} (L) = R_{C S}^{⋆} (L) = \max_{p (x, x_{1})} \min \{\min_{0 \leq q \leq 1} I_{q} (X, X_{1}; Y), I (X; Y_{1} | X_{1})\}, \end{matrix}

(A55)

due to part 2 of Corollary 3, where

q \equiv q (1) = 1 - q (0)

. Now, consider the direct part. Set

p (x, x_{1}) = p (x) p (x_{1})

, where

X \sim Bernoulli (1 / 2)

and

X_{1} \sim Bernoulli (1 / 2)

. Then,

\begin{matrix} I (X; Y_{1} | X_{1}) = 1 - h (θ), \\ H_{q} (Y) = \frac{1}{2} [- q \log (\frac{1}{2} q) - (1 - q) \log (\frac{1}{2} (1 - q))] - \frac{1}{2} \log (\frac{1}{2}) = 1 + \frac{1}{2} h (q), \\ H_{q} (Y | X, X_{1}) = h (q) . \end{matrix}

(A56)

Hence,

\begin{matrix} C^{⋆} (L) \geq & \min \{\min_{0 \leq q \leq 1} [1 - \frac{1}{2} h (q)], 1 - h (θ)\} = \min \{\frac{1}{2}, 1 - h (θ)\} . \end{matrix}

(A57)

As for the converse part, we have the following bounds,

\begin{matrix} C^{⋆} (L) \leq & \max_{p (x, x_{1})} I (X; Y_{1} | X_{1}) = 1 - h (θ), \end{matrix}

(A58)

and

\begin{matrix} C^{⋆} (L) \leq & \max_{p (x, x_{1})} \min_{0 \leq q \leq 1} I_{q} (X, X_{1}; Y) \leq \max_{p (x, x_{1})} [H_{q} (Y) - H_{q} (Y | X, X_{1})] |_{q = \frac{1}{2}} \\ = & \max_{0 \leq p \leq 1} [1 + \frac{1}{2} h (p)] - 1 = \frac{1}{2}, \end{matrix}

(A59)

where

p ≜ \Pr (X_{1} = 1)

. □
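The direct part can be checked numerically. The sketch below evaluates H_q(Y) term by term as written in (A56), subtracts H_q(Y | X, X_1) = h(q), and minimizes over a grid of q. The function names and the grid resolution are assumptions made for this illustration, and logarithms are taken to base 2.

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def H_Y(q):
    """H_q(Y) as written in (A56), before simplification to 1 + h(q)/2."""
    total = 0.5  # the term -(1/2) log2(1/2)
    for w in (q, 1 - q):
        if w > 0:
            total += -0.5 * w * math.log2(0.5 * w)
    return total

def random_code_capacity(theta, grid=1001):
    """min{ min_q [H_q(Y) - h(q)], 1 - h(theta) }, with q on a uniform grid."""
    qs = [i / (grid - 1) for i in range(grid)]
    worst = min(H_Y(q) - h(q) for q in qs)
    return min(worst, 1 - h(theta))

value_noiseless = random_code_capacity(0.0)
value_noisy = random_code_capacity(0.25)
```

The inner minimum is attained at q = 1/2, where it equals 1/2, so the computed value agrees with \min \{\frac{1}{2}, 1 - h (θ)\}.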

Appendix J. Proof of Lemma 5

The proof follows the lines of [5]. Consider an AVRC

L = {W_{Y | X^{'}, X_{1}} W_{Y_{1} | X^{″}, X_{1}, S}}

with orthogonal sender components. We apply Theorem 1, which states that

R_{P D F}^{⋆} (L) \leq C^{⋆} (L) \leq R_{C S}^{⋆} (L)

.

Appendix J.1. Achievability Proof

To show achievability, we set

U = X^{″}

and

p (x^{'}, x^{″}, x_{1}) = p (x_{1}) p (x^{'} | x_{1}) p (x^{″} | x_{1})

in the partial decode-forward lower bound

R_{P D F}^{⋆} (L) ≜ R_{P D F} (L^{Q}) |_{Q = P (S)}

. Hence, by (9),

\begin{matrix} R_{P D F}^{⋆} (L) \geq & \max_{p (x_{1}) p (x^{'} | x_{1}) p (x^{″} | x_{1})} \min \{I (X^{'}, X^{″}, X_{1}; Y), \min_{q (s)} I_{q} (X^{″}; Y_{1} | X_{1}) + I (X^{'}; Y | X_{1}, X^{″})\} . \end{matrix}

(A60)

Now, by (29), we have that

(X^{″}, Y_{1}) - (X^{'}, X_{1}) - Y

form a Markov chain. As

(X_{1}, X^{'}, X^{″}) \sim p (x_{1}) p (x^{'} | x_{1}) p (x^{″} | x_{1})

, it further follows that

(X^{″}, Y_{1}) - X_{1} - Y

form a Markov chain, hence

I (X^{'}, X^{″}, X_{1}; Y) = I (X^{'}, X_{1}; Y)

and

I (X^{'}; Y | X_{1}, X^{″}) = I (X^{'}; Y | X_{1})

. Thus, (A60) reduces to the expression in the RHS of (30). If

W_{Y_{1} | X^{″}, X_{1}, S}

is non-symmetrizable-

X^{″} | X_{1}

, then (A60) is achievable by deterministic codes as well, due to Corollary 4.

Appendix J.2. Converse Proof

By (8) and (19), the cutset upper bound takes the following form,

\begin{matrix} R_{C S}^{⋆} (L) = & \min_{q (s)} \max_{p (x^{'}, x^{″}, x_{1})} \min \{I (X^{'}, X^{″}, X_{1}; Y), I_{q} (X^{'}, X^{″}; Y, Y_{1} | X_{1})\} \\ = & \max_{p (x^{'}, x^{″}, x_{1})} \min \{I (X^{'}, X^{″}, X_{1}; Y), \min_{q (s)} I_{q} (X^{'}, X^{″}; Y, Y_{1} | X_{1})\}, \end{matrix}

(A61)

where the last line is due to the minimax theorem [90]. For the AVRC with orthogonal sender components, as specified by (29), we have the following Markov relations:

\begin{matrix} Y_{1} - (X^{″}, X_{1}) - (X^{'}, Y), \end{matrix}

(A62)

\begin{matrix} (X^{″}, Y_{1}) - (X^{'}, X_{1}) - Y . \end{matrix}

(A63)

Hence, by (A63),

I (X^{'}, X^{″}, X_{1}; Y) = I (X^{'}, X_{1}; Y)

. As for the second mutual information in the RHS of (A61), by the mutual information chain rule,

\begin{matrix} I_{q} (X^{'}, X^{″}; Y, Y_{1} | X_{1}) = & I_{q} (X^{″}; Y_{1} | X_{1}) + I_{q} (X^{'}; Y_{1} | X^{″}, X_{1}) + I_{q} (X^{'}, X^{″}; Y | X_{1}, Y_{1}) \\ \overset{(a)}{=} & I_{q} (X^{″}; Y_{1} | X_{1}) + I_{q} (X^{'}, X^{″}; Y | X_{1}, Y_{1}) \\ \overset{(b)}{=} & I_{q} (X^{″}; Y_{1} | X_{1}) + H_{q} (Y | X_{1}, Y_{1}) - H (Y | X^{'}, X_{1}) \\ \overset{(c)}{\leq} & I_{q} (X^{″}; Y_{1} | X_{1}) + I (X^{'}; Y | X_{1}) \end{matrix}

(A64)

where

(a)

is due to (A62),

(b)

is due to (A63), and

(c)

holds since conditioning reduces entropy. Therefore,

\begin{matrix} R_{C S}^{⋆} (L) \leq \max_{p (x^{'}, x^{″}, x_{1})} \min \{I (X^{'}, X_{1}; Y), \min_{q (s)} I_{q} (X^{″}; Y_{1} | X_{1}) + I (X^{'}; Y | X_{1})\} . \end{matrix}

(A65)

Without loss of generality, the maximization in (A65) can be restricted to distributions of the form

p (x^{'}, x^{″}, x_{1}) = p (x_{1}) \cdot p (x^{'} | x_{1}) \cdot p (x^{″} | x_{1})

. □

Appendix K. Proof of Lemma 6

Consider the Gaussian compound relay channel with SFD under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

, i.e.,

Q = {q (s) : E S^{2} \leq Λ}

.

Appendix K.1. Achievability Proof

Consider the direct part. Although we previously assumed that the input, state and output alphabets are finite, our results for the compound relay channel can be extended to the continuous case as well, using standard discretization techniques [3] (Section 3.4.1); [55,95]. In particular, Lemma 1 can be extended to the compound relay channel

L^{Q}

under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

, by choosing a distribution

p (x^{'}, x^{″}, x_{1})

such that

E (X^{' 2} + X^{″ 2}) \leq Ω

and

E X_{1}^{2} \leq Ω_{1}

. Then, the capacity of

L^{Q}

is bounded by

\begin{matrix} C (L^{Q}) \geq R_{P D F} (L^{Q}) \geq \max_{\begin{matrix} p (x^{″}) p (x^{'}, x_{1}) : \\ E (X^{' 2} + X^{″ 2}) \leq Ω, \\ E X_{1}^{2} \leq Ω_{1} \end{matrix}} \min \{ & \min_{q (s) : E S^{2} \leq Λ} I_{q} (X_{1}; Y) + \min_{q (s) : E S^{2} \leq Λ} I_{q} (X^{'}; Y | X_{1}), \\ I (X^{″}; Y_{1}) + \min_{q (s) : E S^{2} \leq Λ} I_{q} (X^{'}; Y | X_{1})\}, \end{matrix}

(A66)

which follows from the partial decode-forward lower bound by taking

U = X^{″}

. Lemma 1 further states that there exists a block Markov code that achieves this rate such that the probability of error decays exponentially as the blocklength increases.

Let

0 \leq α, ρ \leq 1

, and let

(X^{'}, X^{″}, X_{1})

be jointly Gaussian with

\begin{matrix} X^{'} \sim N (0, α Ω), X^{″} \sim N (0, (1 - α) Ω), X_{1} \sim N (0, Ω_{1}), \end{matrix}

(A67)

where the correlation coefficient of

X^{'}

and

X_{1}

is

ρ

, while

X^{″}

is independent of

(X^{'}, X_{1})

. Hence,

\begin{matrix} I (X^{″}; Y_{1}) = \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) . \end{matrix}

(A68)

Since Gaussian noise is the worst additive noise under variance constraint [96] (Lemma II.2), and as

V ar (X^{'} | X_{1} = x_{1}) = (1 - ρ^{2}) α Ω

for all

x_{1} \in R

, we have that

\begin{matrix} \min_{q (s) : E S^{2} \leq Λ} I_{q} (X^{'}; Y | X_{1}) = \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ}) . \end{matrix}

(A69)

It remains to evaluate the first term in the RHS of (A66). By a standard whitening transformation, there exist two independent Gaussian random variables

T_{1}

and

T_{2}

such that

\begin{matrix} X^{'} + X_{1} = T_{1} + T_{2}, \end{matrix}

(A70)

\begin{matrix} T_{1} \sim N (0, (1 - ρ^{2}) α Ω), T_{2} \sim N (0, Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}}) . \end{matrix}

(A71)

Hence,

Y = T_{1} + T_{2} + S

, and as

V ar (X^{'} | X_{1} = x_{1}) = V ar (T_{1})

for all

x_{1} \in R

, we have that

\begin{matrix} I_{q} (X_{1}; Y) = & H_{q} (Y) - H_{q} (X^{'} + S | X_{1}) \\ = & H_{q} (Y) - H_{q} (T_{1} + S) = I_{q} (T_{2}; Y) \end{matrix}

(A72)

Let

\bar{S} ≜ T_{1} + S

. Then, since Gaussian noise is the worst additive noise under variance constraint [96] (Lemma II.2),

\begin{matrix} \min_{q (s) : E S^{2} \leq Λ} I_{q} (X_{1}; Y) = & \min_{q (s) : E S^{2} \leq Λ} I_{q} (T_{2}; T_{2} + \bar{S}) = \frac{1}{2} \log (1 + \frac{V ar (T_{2})}{V ar (T_{1}) + Λ}) \\ = & \frac{1}{2} \log (\frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} + Λ}{(1 - ρ^{2}) α Ω + Λ}) . \end{matrix}

(A73)

Substituting (A68), (A69) and (A73) in the RHS of (A66), we have that

\begin{matrix} C (L^{Q}) \geq & \max_{0 \leq α, ρ \leq 1} \min {\frac{1}{2} \log (\frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} + Λ}{(1 - ρ^{2}) α Ω + Λ}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ}), \\ \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ})} . \end{matrix}

(A74)

Observe that the first sum in the RHS of (A74) can be expressed as

\begin{matrix} \frac{1}{2} \log (\frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} + Λ}{(1 - ρ^{2}) α Ω + Λ}) + \frac{1}{2} \log (\frac{(1 - ρ^{2}) α Ω + Λ}{Λ}) \\ = & \frac{1}{2} \log (\frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} + Λ}{Λ}) = \frac{1}{2} \log (1 + \frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}}}{Λ}) . \end{matrix}

(A75)

Hence, the direct part follows from (A74). □

Appendix K.2. Converse Proof

By Lemma 1,

C^{⋆} (L^{Q}) \leq R_{C S} (L^{Q})

. Now, observe that

\begin{matrix} R_{C S} (L^{Q}) = & \min_{q (s) : E S^{2} \leq Λ} \max_{\begin{matrix} p (x^{″}) p (x^{'}, x_{1}) : \\ E (X^{' 2} + X^{″ 2}) \leq Ω, \\ E X_{1}^{2} \leq Ω_{1} \end{matrix}} \min \{I_{q} (X^{'}, X_{1}; Y), I (X^{″}; Y_{1}) + I_{q} (X^{'}; Y | X_{1})\} \\ \leq & \max_{\begin{matrix} p (x^{″}) p (x^{'}, x_{1}) : \\ E (X^{' 2} + X^{″ 2}) \leq Ω, \\ E X_{1}^{2} \leq Ω_{1} \end{matrix}} \min \{I_{q} (X^{'}, X_{1}; Y), I (X^{″}; Y_{1}) + I_{q} (X^{'}; Y | X_{1})\} |_{S \sim N (0, Λ)} \\ = & \max_{0 \leq α, ρ \leq 1} \min \{\frac{1}{2} \log (1 + \frac{Ω_{1} + ρ^{2} α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}}}{Λ}), \\ \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ})\}, \end{matrix}

(A76)

where the last equality is due to [5]. □
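For concrete parameter values, the max-min expression in the last line of (A76) can be evaluated by a simple grid search. The sketch below transcribes that expression as printed. The function names, the grid resolution, the base-2 logarithm, and the numerical values of Ω, Ω_1, Λ and σ^2 are all assumptions made for illustration.

```python
import math

def gaussian_rate_bound(alpha, rho, Omega, Omega1, Lam, sigma2):
    """The two-term minimum inside the maximization of (A76), in bits."""
    first = 0.5 * math.log2(
        1 + (Omega1 + rho ** 2 * alpha * Omega
             + 2 * rho * math.sqrt(alpha * Omega * Omega1)) / Lam)
    second = (0.5 * math.log2(1 + (1 - alpha) * Omega / sigma2)
              + 0.5 * math.log2(1 + (1 - rho ** 2) * alpha * Omega / Lam))
    return min(first, second)

def maximize_over_grid(Omega, Omega1, Lam, sigma2, steps=101):
    """Grid search over 0 <= alpha, rho <= 1; returns (value, alpha, rho)."""
    best = (-1.0, 0.0, 0.0)
    for i in range(steps):
        for j in range(steps):
            alpha, rho = i / (steps - 1), j / (steps - 1)
            val = gaussian_rate_bound(alpha, rho, Omega, Omega1, Lam, sigma2)
            if val > best[0]:
                best = (val, alpha, rho)
    return best

# Placeholder parameters: input powers 10, state power 1, relay noise variance 1.
value, alpha_star, rho_star = maximize_over_grid(10.0, 10.0, 1.0, 1.0, steps=21)
```

A grid search only approximates the maximum from below; a finer grid or a numerical optimizer would tighten the value.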

Appendix L. Proof of Theorem 2

Appendix L.1. Achievability Proof

To show that

C^{⋆} (L) \geq C (L^{Q})

, we follow the steps in the proof of Theorem 1, where we replace Ahlswede’s original RT with the modified version in [85, 86] (Lemma 9), plugging

l^{n} (s^{n}) = \frac{1}{n} \sum_{i = 1}^{n} s_{i}^{2}

. Then, by Lemma 6, it follows that

\begin{matrix} C^{⋆} (L) \geq & \max_{0 \leq α, ρ \leq 1} \min {\frac{1}{2} \log (1 + \frac{(1 + α + 2 ρ \sqrt{α}) Ω}{Λ}), \\ \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ})} . \end{matrix}

(A77)

The details are omitted. □

Appendix L.2. Converse Proof

Assume to the contrary that there exists an achievable rate R such that

\begin{matrix} R > & \max_{0 \leq α, ρ \leq 1} \min {\frac{1}{2} \log (1 + \frac{(1 + α + 2 ρ \sqrt{α}) Ω}{Λ - δ}), \\ \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ - δ})} \end{matrix}

(A78)

using random codes over the Gaussian AVRC

L

, under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

, where

δ > 0

is arbitrarily small. That is, for every

ε > 0

and sufficiently large n, there exists a

(2^{n R}, n)

random code

C^{Γ} = (μ, Γ, {C_{γ}}_{γ \in Γ})

for the Gaussian AVRC

L

, under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

, such that

\begin{matrix} P_{e | s} (C^{Γ}) \leq ε, \end{matrix}

(A79)

for all

m \in [1 : 2^{n R}]

and

s \in R^{n}

with

{∥s∥}^{2} \leq n Λ

.

Consider using the random code

C^{Γ}

over the Gaussian compound relay channel

L^{Q}

under state constraint

(Λ - δ)

, i.e., with

\begin{matrix} Q = {q (s) : E S^{2} \leq Λ - δ}, \end{matrix}

(A80)

under input constraints

Ω

and

Ω_{1}

. Let

\bar{q} (s) \in Q

be a given state distribution. Then, define a sequence of i.i.d. random variables

{\bar{S}}_{1}, \dots, {\bar{S}}_{n} \sim \bar{q} (s)

. Letting

\bar{q} (s^{n}) ≜ \prod_{i = 1}^{n} \bar{q} (s_{i})

, the probability of error is bounded by

\begin{matrix} P_{e}^{(n)} (\bar{q}, C^{Γ}) \leq & \sum_{s^{n} : l^{n} (s^{n}) \leq Λ} {\bar{q}}^{n} (s^{n}) P_{e | s^{n}}^{(n)} (C^{Γ}) + \Pr (\frac{1}{n} \sum_{i = 1}^{n} {\bar{S}}_{i}^{2} > Λ) . \end{matrix}

(A81)

Then, the first sum is bounded by (A79), and the second term vanishes as well by the law of large numbers, since

\bar{q} (s)

is in (A80). Hence, the rate R in (A78) is achievable for the Gaussian compound relay channel

L^{Q}

, in contradiction to Lemma 6. We deduce that the assumption is false, and (A78) cannot be achieved. □
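The law-of-large-numbers step in (A81) can be illustrated by simulation. The sketch below assumes, purely for illustration, a Gaussian state S ~ N(0, Λ - δ) with Λ = 1 and δ = 0.2, and estimates the probability that the empirical power exceeds Λ. The function name, the seed, and the number of trials are arbitrary choices.

```python
import random

def violation_frequency(n, variance, Lam, trials, rng):
    """Estimate Pr((1/n) * sum_i S_i^2 > Lam) for i.i.d. S_i ~ N(0, variance)."""
    sigma = variance ** 0.5
    count = 0
    for _ in range(trials):
        power = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n)) / n
        if power > Lam:
            count += 1
    return count / trials

rng = random.Random(1)      # fixed seed, for reproducibility only
Lam, delta = 1.0, 0.2       # placeholder state constraint and slack
freqs = {n: violation_frequency(n, Lam - delta, Lam, 400, rng)
         for n in (10, 100, 1000)}
```

Because E S^2 is strictly below Λ, the estimated frequency of violating the constraint shrinks as n grows, which is why the second term in (A81) vanishes.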

Appendix M. Proof of Theorem 3

Consider the Gaussian AVRC

L

with SFD under input constraints

Ω

and

Ω_{1}

and state constraint

Λ

. In the proof, we modify the techniques by Csiszár and Narayan [87]. In the direct part, we use their correlation binning technique within the decode-forward coding scheme, and in the converse part, we consider a jamming scheme which simulates the transmission sum by the encoder and the relay.

Appendix M.1. Lower Bound

We construct a block Markov code using backward minimum-distance decoding in two steps. The encoders use B blocks, each consisting of n channel uses, to convey

(B - 1)

independent messages to the receiver, where each message

M_{b}

, for

b \in [1 : B - 1]

, is divided into two independent messages. That is,

M_{b} = (M_{b}^{'}, M_{b}^{″})

, where

M_{b}^{'}

and

M_{b}^{″}

are uniformly distributed, i.e.,

\begin{matrix} M_{b}^{'} \sim Unif [1 : 2^{n R^{'}}], M_{b}^{″} \sim Unif [1 : 2^{n R^{″}}], with R^{'} + R^{″} = R, \end{matrix}

(A82)

for

b \in [1 : B - 1]

. For convenience of notation, set

M_{0}^{'} = M_{B}^{'} \equiv 1

and

M_{0}^{″} = M_{B}^{″} \equiv 1

. The average rate

\frac{B - 1}{B} \cdot R

is arbitrarily close to R.

Codebook Construction: Fix

0 \leq α, ρ \leq 1

with

\begin{matrix} (1 - ρ^{2}) α Ω > Λ, \end{matrix}

(A83)

\begin{matrix} \frac{Ω_{1}}{Ω} {(\sqrt{Ω_{1}} + ρ \sqrt{α Ω})}^{2} > Λ + (1 - ρ^{2}) α Ω . \end{matrix}

(A84)

We construct B codebooks

F_{b}

of the following form,

\begin{matrix} F_{b} = \{(x_{1} (m_{b - 1}^{'}), x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), x^{″} (m_{b}^{'})) : m_{b - 1}^{'}, m_{b}^{'} \in [1 : 2^{n R^{'}}], m_{b}^{″} \in [1 : 2^{n R^{″}}]\}, \end{matrix}

(A85)

for

b \in [2 : B - 1]

. The codebooks

F_{1}

and

F_{B}

have the same form, with fixed

m_{0}^{'} = m_{B}^{'} \equiv 1

and

m_{0}^{″} = m_{B}^{″} \equiv 1

.

The sequences

x^{″} (m_{b}^{'})

,

m_{b}^{'} \in [1 : 2^{n R^{'}}]

are chosen as follows. Observe that the channel from the sender to the relay,

Y_{1} = X^{″} + Z

, does not depend on the state. Thus, by Shannon’s well-known result on the point-to-point Gaussian channel [97], the message

m_{b}^{'}

can be conveyed to the relay reliably, under input constraint

(1 - α) Ω

, provided that

R^{'} < \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) - δ_{1}

, where

δ_{1}

is arbitrarily small (see also [98] (Chapter 9)). That is, for every

ε > 0

and sufficiently large n, there exists a

(2^{n R^{'}}, n, ε)

code

C^{″} = (x^{″} (m_{b}^{'}), g_{1} (y_{1, b}))

, such that

{∥x^{″} (m_{b}^{'})∥}^{2} \leq n (1 - α) Ω

for all

m_{b}^{'} \in [1 : 2^{n R^{'}}]

.

Next, we choose the sequences

x_{1} (m_{b - 1}^{'})

and

x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})

, for

m_{b - 1}^{'}, m_{b}^{'} \in [1 : 2^{n R^{'}}]

,

m_{b}^{″} \in [1 : 2^{n R^{″}}]

. Applying [87] (Lemma 7) repeatedly yields the following.

Lemma A1.

For every

ε > 0

,

8 \sqrt{ε} < η < 1

,

K > 2 ε

,

2 ε \leq R^{'} \leq K

,

2 ε \leq R^{″} \leq K

, and

n \geq n_{0} (ε, η, K)

,

1. There exist $2^{n R^{'}}$ unit vectors,

$\begin{matrix} a (m_{b - 1}^{'}) \in R^{n}, m_{b - 1}^{'} \in [1 : 2^{n R^{'}}], \end{matrix}$

(A86)

such that for every unit vector $c \in R^{n}$ and $0 \leq θ, ζ \leq 1$ ,

$\begin{matrix} |\{{\tilde{m}}_{b - 1}^{'} \in [1 : 2^{n R^{'}}] : ⟨ a ({\tilde{m}}_{b - 1}^{'}), c ⟩ \geq θ\}| \leq 2^{n ({[R^{'} + \frac{1}{2} \log (1 - θ^{2})]}_{+} + ε)}, \end{matrix}$

(A87)

and if $θ \geq η$ and $θ^{2} + ζ^{2} > 1 + η - 2^{- 2 R^{'}}$ , then

$\begin{matrix} \frac{1}{2^{n R^{'}}} | {m_{b - 1}^{'} \in [1 : 2^{n R^{'}}] : | ⟨ a ({\tilde{m}}_{b - 1}^{'}), a (m_{b - 1}^{'}) ⟩ | \geq θ, | ⟨ a ({\tilde{m}}_{b - 1}^{'}), c ⟩ | \geq ζ, \\ for some {\tilde{m}}_{b - 1}^{'} \neq m_{b - 1}^{'}} | \leq 2^{- n ε} . \end{matrix}$

(A88)

2. Furthermore, for every $m_{b}^{'} \in [1 : 2^{n R^{'}}]$, there exist $2^{n R^{″}}$ unit vectors,

$\begin{matrix} v (m_{b}^{'}, m_{b}^{″}) \in R^{n}, m_{b}^{″} \in [1 : 2^{n R^{″}}], \end{matrix}$

(A89)

such that for every unit vector $c \in R^{n}$ and $0 \leq θ, ζ \leq 1$ ,

$\begin{matrix} |\{{\tilde{m}}_{b}^{″} \in [1 : 2^{n R^{″}}] : ⟨ v (m_{b}^{'}, {\tilde{m}}_{b}^{″}), c ⟩ \geq θ\}| \leq 2^{n ({[R^{″} + \frac{1}{2} \log (1 - θ^{2})]}_{+} + ε)}, \end{matrix}$

(A90)

and if $θ \geq η$ and $θ^{2} + ζ^{2} > 1 + η - 2^{- 2 R^{″}}$ , then

$\begin{matrix} \frac{1}{2^{n R^{″}}} | {m_{b}^{″} \in [1 : 2^{n R^{″}}] : | ⟨ v (m_{b}^{'}, {\tilde{m}}_{b}^{″}), v (m_{b}^{'}, m_{b}^{″}) ⟩ | \geq θ, | ⟨ v (m_{b}^{'}, {\tilde{m}}_{b}^{″}), c ⟩ | \geq ζ, \\ for some {\tilde{m}}_{b}^{″} \neq m_{b}^{″}} | \leq 2^{- n ε} . \end{matrix}$

(A91)

Then, define

\begin{matrix} x_{1} (m_{b - 1}^{'}) = \sqrt{n γ (Ω - δ)} \cdot a (m_{b - 1}^{'}), \\ x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}) = ρ \sqrt{α γ^{- 1}} \cdot x_{1} (m_{b - 1}^{'}) + β \cdot v (m_{b}^{'}, m_{b}^{″}), \end{matrix}

(A92)

where

\begin{matrix} β ≜ & \sqrt{n (1 - ρ^{2}) α (Ω - δ)}, γ ≜ Ω_{1} / Ω . \end{matrix}

(A93)

Note that

{∥x_{1} (m_{b - 1}^{'})∥}^{2} = n γ (Ω - δ) < n Ω_{1}

, for all

m_{b - 1}^{'} \in [1 : 2^{n R^{'}}]

. On the other hand,

{∥x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})∥}^{2}

could be greater than

n α Ω

due to the possible correlation between

x_{1} (m_{b - 1}^{'})

and

v (m_{b}^{'}, m_{b}^{″})

.
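The scaling in (A92) and (A93) can be made concrete with a short sketch. Expanding the squared norm gives ∥x'∥^2 = n α (Ω - δ) (1 + 2 ρ \sqrt{1 - ρ^2} ⟨a, v⟩), so the power α Ω is exceeded only when a and v are sufficiently correlated. In the code below, random unit vectors stand in for the vectors of Lemma A1 (an assumption, since the lemma only asserts existence), all function names are hypothetical, and the parameter values are placeholders. The last function mirrors the encoder rule of sending the all-zero sequence when the constraint would be violated.

```python
import math
import random

def random_unit_vector(n, rng):
    """A random unit vector; a stand-in for the vectors guaranteed by Lemma A1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

def sq_norm(x):
    """Squared Euclidean norm."""
    return sum(t * t for t in x)

def build_codewords(a, v, alpha, rho, Omega, Omega1, delta):
    """Form x_1 and x' from unit vectors a and v as in (A92) and (A93)."""
    n = len(a)
    gamma = Omega1 / Omega
    beta = math.sqrt(n * (1 - rho ** 2) * alpha * (Omega - delta))
    x1 = [math.sqrt(n * gamma * (Omega - delta)) * t for t in a]
    scale = rho * math.sqrt(alpha / gamma)
    x_prime = [scale * s + beta * t for s, t in zip(x1, v)]
    return x1, x_prime

def encode_first_component(x_prime, alpha, Omega):
    """Send x' if it meets the power constraint n*alpha*Omega, else all zeros."""
    n = len(x_prime)
    if sq_norm(x_prime) <= n * alpha * Omega:
        return list(x_prime)
    return [0.0] * n

rng = random.Random(0)
n = 64
alpha, rho, Omega, Omega1, delta = 0.5, 0.6, 4.0, 2.0, 0.1  # placeholders
a, v = random_unit_vector(n, rng), random_unit_vector(n, rng)
x1, x_prime = build_codewords(a, v, alpha, rho, Omega, Omega1, delta)
inner = sum(s * t for s, t in zip(a, v))
sent = encode_first_component(x_prime, alpha, Omega)
```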

Encoding: Let

(m_{1}^{'}, m_{1}^{″}, \dots, m_{B - 1}^{'}, m_{B - 1}^{″})

be a sequence of messages to be sent. In block

b \in [1 : B]

, if

{∥x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'})∥}^{2} \leq n α Ω

, transmit

(x^{'} (m_{b}^{'}, m_{b}^{″} | m_{b - 1}^{'}), x^{″} (m_{b}^{'}))

. Otherwise, transmit

(0, x^{″} (m_{b}^{'}))

.

Relay Encoding: In block 1, the relay transmits

x_{1} (1)

. At the end of block

b \in [1 : B - 1]

, the relay receives

y_{1, b}

, and finds an estimate

{\bar{m}}_{b}^{'} = g_{1} (y_{1, b})

. In block

b + 1

, the relay transmits

x_{1} ({\bar{m}}_{b}^{'})

.

Backward Decoding: Once all blocks

{(y_{b})}_{b = 1}^{B}

are received, decoding is performed backwards. Set

{\hat{m}}_{0}^{'} = {\hat{m}}_{0}^{″} \equiv 1

. For

b = B - 1, B - 2, \dots, 1

, find a unique

{\hat{m}}_{b}^{'} \in [1 : 2^{n R^{'}}]

such that

\begin{matrix} ∥y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} ({\hat{m}}_{b}^{'})∥ \leq ∥y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} (m_{b}^{'})∥, for all m_{b}^{'} \in [1 : 2^{n R^{'}}] . \end{matrix}

(A94)

If there is more than one such

{\hat{m}}_{b}^{'} \in [1 : 2^{n R^{'}}]

, declare an error.

Then, the decoder uses

{\hat{m}}_{1}^{'}, \dots, {\hat{m}}_{B - 1}^{'}

as follows. For

b = B - 1, B - 2, \dots, 1

, find a unique

{\hat{m}}_{b}^{″} \in [1 : 2^{n R^{″}}]

such that

\begin{matrix} ∥y_{b} - x_{1} ({\hat{m}}_{b - 1}^{'}) - x^{'} ({\hat{m}}_{b}^{'}, {\hat{m}}_{b}^{″} | {\hat{m}}_{b - 1}^{'})∥ \leq ∥y_{b} - x_{1} ({\hat{m}}_{b - 1}^{'}) - x^{'} ({\hat{m}}_{b}^{'}, m_{b}^{″} | {\hat{m}}_{b - 1}^{'})∥, for all m_{b}^{″} \in [1 : 2^{n R^{″}}] . \end{matrix}

(A95)

If there is more than one such

{\hat{m}}_{b}^{″} \in [1 : 2^{n R^{″}}]

, declare an error.
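The first decoding rule (A94) is a minimum-distance search with an explicit tie check. A minimal sketch follows, assuming codewords are stored as Python lists and messages are indexed from 0 rather than 1. The function names are hypothetical, and `None` is used here to signal a declared error.

```python
import math

def min_distance_decode(y, candidates):
    """Index of the unique candidate closest to y, or None if the minimum is tied."""
    dists = [sum((yi - ci) ** 2 for yi, ci in zip(y, c)) for c in candidates]
    best = min(dists)
    winners = [i for i, d in enumerate(dists) if d == best]
    return winners[0] if len(winners) == 1 else None

def decode_relay_message(y_next, x1_codebook, alpha, rho, gamma):
    """Rule (A94): compare y_{b+1} with (1 + rho*sqrt(alpha/gamma)) * x_1(m)."""
    scale = 1 + rho * math.sqrt(alpha / gamma)
    scaled = [[scale * t for t in x1] for x1 in x1_codebook]
    return min_distance_decode(y_next, scaled)

# Toy codebook with two relay codewords; the scale factor is 1.25 here.
x1_codebook = [[1.0, 0.0], [0.0, 1.0]]
decoded = decode_relay_message([1.2, 0.1], x1_codebook, alpha=0.25, rho=0.5, gamma=1.0)
tied = min_distance_decode([0.5, 0.5], x1_codebook)
```

The second rule (A95) has the same structure, with the candidates x^{'} ({\hat{m}}_{b}^{'}, m_{b}^{″} | {\hat{m}}_{b - 1}^{'}) compared against y_{b} - x_{1} ({\hat{m}}_{b - 1}^{'}), so `min_distance_decode` could be reused for it.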

Analysis of Probability of Error: Fix

s \in S^{n}

, and let

\begin{matrix} c_{0} ≜ \frac{s}{∥s∥} . \end{matrix}

(A96)

The error event is bounded by the union of the following events. For

b \in [1 : B - 1]

, define

\begin{matrix} E_{1} (b) = {{\bar{M}}_{b}^{'} \neq M_{b}^{'}}, E_{2} (b) = {{\hat{M}}_{b}^{'} \neq M_{b}^{'}}, E_{3} (b) = {{\hat{M}}_{b}^{″} \neq M_{b}^{″}} . \end{matrix}

(A97)

Then, the conditional probability of error given the state sequence

s

is bounded by

\begin{matrix} P_{e | s} (C) \leq & \sum_{b = 1}^{B - 1} \Pr (E_{1} (b)) + \sum_{b = 1}^{B - 1} \Pr (E_{2} (b) \cap E_{1}^{c} (b)) + \sum_{b = 1}^{B - 1} \Pr (E_{3} (b) \cap E_{1}^{c} (b - 1) \cap E_{2}^{c} (b) \cap E_{2}^{c} (b - 1)), \end{matrix}

(A98)

with

E_{1} (0) = E_{2} (0) = \emptyset

, where the conditioning on

S = s

is omitted for convenience of notation. Recall that we have defined

C^{″}

as a

(2^{n R^{'}}, n, ε)

code for the point-to-point Gaussian channel

Y_{1} = X^{″} + Z

. Hence, the first sum in the RHS of (A98) is bounded by

B \cdot ε

, which is arbitrarily small.

To be more concise, we only give the details for erroneous decoding of

M_{b}^{'}

at the receiver. Consider the following events,

\begin{matrix} E_{2} (b) = {∥Y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} ({\tilde{m}}_{b}^{'})∥ \leq ∥Y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} (M_{b}^{'})∥, for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}}, \\ E_{2, 1} (b) = {| ⟨ a (M_{b}^{'}), c_{0} ⟩ | \geq η}, \\ E_{2, 2} (b) = {| ⟨ a (M_{b}^{'}), v (M_{b + 1}) ⟩ | \geq η} \\ E_{2, 3} (b) = {| ⟨ v (M_{b + 1}), c_{0} ⟩ | \geq η} \\ {\tilde{E}}_{2} (b) = E_{2} (b) \cap E_{1}^{c} (b) \cap E_{2, 1}^{c} (b) \cap E_{2, 2}^{c} (b) \cap E_{2, 3}^{c} (b), \end{matrix}

(A99)

where

M_{b + 1} = (M_{b + 1}^{'}, M_{b + 1}^{″})

. Then,

\begin{matrix} E_{2} (b) \cap E_{1}^{c} (b) \subseteq & E_{2, 1} (b) \cup E_{2, 2} (b) \cup E_{2, 3} (b) \cup (E_{2} (b) \cap E_{1}^{c} (b)) \\ = & E_{2, 1} (b) \cup E_{2, 2} (b) \cup E_{2, 3} (b) \cup {\tilde{E}}_{2} (b) . \end{matrix}

(A100)

Hence, by the union of events bound, we have that

\begin{matrix} \Pr (E_{2} (b) \cap E_{1}^{c} (b)) \leq & \Pr (E_{2, 1} (b)) + \Pr (E_{2, 2} (b)) + \Pr (E_{2, 3} (b)) + \Pr ({\tilde{E}}_{2} (b)) . \end{matrix}

(A101)

By Lemma A1, given

R^{'} > - \frac{1}{2} \log (1 - η^{2})

, the first term is bounded by

\begin{matrix} \Pr (E_{2, 1} (b)) = & \Pr (⟨ a (M_{b}^{'}), c_{0} ⟩ \geq η) + \Pr (⟨ a (M_{b}^{'}), - c_{0} ⟩ \geq η) \\ \leq & 2 \cdot \frac{1}{2^{n R^{'}}} \cdot 2^{n (R^{'} + \frac{1}{2} \log (1 - η^{2}) + ε)} \leq 2 \cdot 2^{n (- \frac{1}{2} η^{2} + ε)}, \end{matrix}

(A102)

since

\log (1 + t) \leq t

for

t > - 1

. As

η^{2} \geq 8 ε

, the last expression tends to zero as

n \to \infty

. Similarly,

\Pr (E_{2, 2} (b))

and

\Pr (E_{2, 3} (b))

tend to zero as well. Moving to the fourth term in the RHS of (A101), observe that for a sufficiently small

ε

and

η

, the event

E_{2, 2}^{c} (b)

implies that

{∥x^{'} (M_{b + 1} | M_{b}^{'})∥}^{2} \leq n α Ω

, while the event

E_{1}^{c} (b)

means that

{\bar{M}}_{b}^{'} = M_{b}^{'}

. Hence, the encoder transmits

(x^{'} (M_{b + 1} | M_{b}^{'}), x^{″} (M_{b + 1}^{'}))

, the relay transmits

x_{1} (M_{b}^{'})

, and we have that

\begin{matrix} {∥Y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} ({\tilde{m}}_{b}^{'})∥}^{2} - {∥Y_{b + 1} - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} (M_{b}^{'})∥}^{2} \\ = & {∥(1 + ρ \sqrt{α γ^{- 1}}) x_{1} (M_{b}^{'}) + β v (M_{b + 1}) + s - (1 + ρ \sqrt{α γ^{- 1}}) x_{1} ({\tilde{m}}_{b}^{'})∥}^{2} - {∥β v (M_{b + 1}) + s∥}^{2} \\ = & 2 {(1 + ρ \sqrt{α γ^{- 1}})}^{2} (\frac{1}{2} {∥x_{1} (M_{b}^{'})∥}^{2} + \frac{1}{2} {∥x_{1} ({\tilde{m}}_{b}^{'})∥}^{2} - ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩) \\ + 2 (1 + ρ \sqrt{α γ^{- 1}}) (⟨ x_{1} (M_{b}^{'}), β v (M_{b + 1}) + s ⟩ - ⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩) \end{matrix}

(A103)

Then, since

{∥x_{1} (m_{b}^{'})∥}^{2} = n γ (Ω - δ)

for all

m_{b}^{'} \in [1 : 2^{n R^{'}}]

, we have that

\begin{matrix} E_{2} (b) \cap E_{1}^{c} (b) \cap E_{2, 2}^{c} (b) \subseteq & {(1 + ρ \sqrt{α γ^{- 1}}) ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ + ⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩ \geq \\ n (1 + ρ \sqrt{α γ^{- 1}}) γ (Ω - δ) + ⟨ x_{1} (M_{b}^{'}), β v (M_{b + 1}) + s ⟩, for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}} . \end{matrix}

(A104)

Observe that for sufficiently small

ε

and

η

, the event

E_{2, 1}^{c} (b) \cap E_{2, 2}^{c} (b) \cap E_{2, 3}^{c} (b)

implies that

\begin{matrix} ⟨ x_{1} (M_{b}^{'}), β v (M_{b + 1}) + s ⟩ \geq - δ, \end{matrix}

(A105)

and

\begin{matrix} {∥β v (M_{b + 1}) + s∥}^{2} \leq n [(1 - ρ^{2}) α Ω + Λ] . \end{matrix}

(A106)

Hence, by (A104) and (A105),

\begin{matrix} {\tilde{E}}_{2} (b) = E_{2} (b) \cap E_{1}^{c} (b) \cap E_{2, 1}^{c} (b) \cap E_{2, 2}^{c} (b) \cap E_{2, 3}^{c} (b) \\ \subseteq & {(1 + ρ \sqrt{α γ^{- 1}}) ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ + ⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩ \geq n (1 + ρ \sqrt{α γ^{- 1}}) γ (Ω - 2 δ), \\ for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}} . \end{matrix}

(A107)

Dividing both sides of the inequality by

n (1 + ρ \sqrt{α γ^{- 1}})

, we obtain

\begin{matrix} {\tilde{E}}_{2} (b) \subseteq \{\frac{1}{n} ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ + \frac{⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩}{n (1 + ρ \sqrt{α γ^{- 1}})} \geq γ (Ω - 2 δ), for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}\} . \end{matrix}

(A108)

Next, we partition the set of values of

\frac{1}{n} ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩

into K bins. Let

τ_{1} < τ_{2} < \dots < τ_{K}

be such a partition, where

\begin{matrix} τ_{1} = γ (Ω - 2 δ) - \frac{\sqrt{(Ω - δ) [(1 - ρ^{2}) α Ω + Λ]}}{1 + ρ \sqrt{α γ^{- 1}}}, τ_{K} = γ (Ω - 3 δ), \\ τ_{k + 1} - τ_{k} \leq γ \cdot δ, for k \in [1 : K - 1], \end{matrix}

(A109)

where K is a finite constant which is independent of n, as in Lemma A1. By (A106) and (A108), given the event

{\tilde{E}}_{2} (b)

, we have that

\begin{matrix} \frac{1}{n} ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ \geq τ_{1} > 0, \end{matrix}

(A110)

where the last inequality is due to (A84), for sufficiently small

δ > 0

. To see this, observe that the inequality in (A84) is strict, and it implies that

\begin{matrix} \sqrt{γ} \cdot (\sqrt{γ Ω} + ρ \sqrt{α Ω}) > \sqrt{(1 - ρ^{2}) α Ω + Λ} . \end{matrix}

(A111)

Hence, for sufficiently small

δ > 0

,

τ_{1} > 0

as

\begin{matrix} τ_{1} = \frac{\sqrt{Ω - 2 δ}}{1 + ρ \sqrt{α γ^{- 1}}} \cdot (\sqrt{γ} (\sqrt{γ (Ω - 2 δ)} + ρ \sqrt{α (Ω - 2 δ)}) - \sqrt{\frac{Ω - δ}{Ω - 2 δ} [(1 - ρ^{2}) α Ω + Λ]}) . \end{matrix}

(A112)

Furthermore, if

τ_{k} \leq \frac{1}{n} ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ < τ_{k + 1}

, then

\begin{matrix} \frac{⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩}{n (1 + ρ \sqrt{α γ^{- 1}})} \geq γ (Ω - 2 δ) - τ_{k + 1} \geq γ (Ω - 3 δ) - τ_{k} . \end{matrix}

(A113)

Thus,

\begin{matrix} \Pr ({\tilde{E}}_{2} (b)) \leq \sum_{k = 1}^{K - 1} \Pr (\frac{1}{n} | ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ | \geq τ_{k}, \frac{| ⟨ x_{1} ({\tilde{m}}_{b}^{'}), β v (M_{b + 1}) + s ⟩ |}{n (1 + ρ \sqrt{α γ^{- 1}})} \geq γ (Ω - 3 δ) - τ_{k}, \\ for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}) + \Pr (\frac{1}{n} | ⟨ x_{1} ({\tilde{m}}_{b}^{'}), x_{1} (M_{b}^{'}) ⟩ | \geq τ_{K}, for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}) . \end{matrix}

(A114)

By (A106), this can be further bounded by

\begin{matrix} \Pr ({\tilde{E}}_{2} (b)) \leq & \sum_{k = 1}^{K} \Pr (| ⟨ a ({\tilde{m}}_{b}^{'}), a (M_{b}^{'}) ⟩ | \geq θ_{k}, | ⟨ a ({\tilde{m}}_{b}^{'}), c^{'} (M_{b + 1}) ⟩ | \geq ζ_{k}, for some {\tilde{m}}_{b}^{'} \neq M_{b}^{'}), \end{matrix}

(A115)

where

\begin{matrix} c^{'} (m_{b + 1}) ≜ \frac{β v (m_{b + 1}) + s}{∥β v (m_{b + 1}) + s∥}, \end{matrix}

(A116)

and

\begin{matrix} θ_{k} ≜ \frac{τ_{k}}{γ (Ω - δ)}, ζ_{k} ≜ \frac{(1 + ρ \sqrt{α γ^{- 1}}) (γ (Ω - 3 δ) - τ_{k})}{\sqrt{γ (Ω - δ) ((1 - ρ^{2}) α Ω + Λ)}}, for k \in [1 : K - 1]; θ_{K} ≜ \frac{τ_{K}}{γ (Ω - δ)}, ζ_{K} = 0 . \end{matrix}

(A117)

By Lemma A1, the RHS of (A115) tends to zero as

n \to \infty

provided that

\begin{matrix} θ_{k} \geq η and θ_{k}^{2} + ζ_{k}^{2} > 1 + η - e^{- 2 R^{'}}, for k \in [1 : K] . \end{matrix}

(A118)

For sufficiently small

ε

and

η

, we have that

η \leq θ_{1} = \frac{τ_{1}}{γ (Ω - δ)}

, hence the first condition is met. Then, observe that the second condition is equivalent to

G (τ_{k}) > 1 + η - e^{- 2 R^{'}}

, for

k \in [1 : K - 1]

, where

\begin{matrix} G (τ) = {(A τ)}^{2} + D^{2} {(L - τ)}^{2}, \end{matrix}

(A119)

with

\begin{matrix} A = \frac{1}{γ (Ω - δ)}, D = \frac{1 + ρ \sqrt{α γ^{- 1}}}{\sqrt{γ (Ω - δ) ((1 - ρ^{2}) α Ω + Λ)}}, L = γ (Ω - 3 δ) . \end{matrix}

(A120)

By differentiation, we have that the minimum value of this function is given by

\min_{τ_{1} \leq τ \leq τ_{K}} G (τ) = \frac{A^{2} D^{2} L^{2}}{A^{2} + D^{2}} = \frac{D^{2}}{A^{2} + D^{2}} - δ_{1}

, where

δ_{1} \to 0

as

δ \to 0

. Thus, the RHS of (A115) tends to zero as

n \to \infty

, provided that

\begin{matrix} R^{'} < & - \frac{1}{2} \log (1 + η - \frac{D^{2}}{A^{2} + D^{2}} + δ_{1}) \\ = & - \frac{1}{2} \log (η + δ_{1} + \frac{(1 - ρ^{2}) α Ω + Λ}{(γ + α + 2 ρ \sqrt{α γ}) Ω + Λ - δ γ {(1 + ρ \sqrt{α γ^{- 1}})}^{2}}) . \end{matrix}

(A121)

This is satisfied for

R^{'} = R_{α}^{'} (L) - δ^{'}

, with

\begin{matrix} R_{α}^{'} (L) = \frac{1}{2} \log (\frac{(γ + α + 2 ρ \sqrt{α γ}) Ω + Λ}{(1 - ρ^{2}) α Ω + Λ}) = - \frac{1}{2} \log (\frac{(1 - ρ^{2}) α Ω + Λ}{(γ + α + 2 ρ \sqrt{α γ}) Ω + Λ}) \end{matrix}

(A122)

for an arbitrary

δ^{'} > 0

, if

η

and

δ

are sufficiently small.
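The minimization of G(τ) in (A119) and the algebra behind (A121) can be checked numerically. The sketch below is an illustration under assumed parameter values (none are taken from the paper), and the helper names `g_params` and `G` are hypothetical. It compares the closed-form minimum A²D²L²/(A² + D²) with a grid search, and checks that at δ = 0 the quantity 1 − D²/(A² + D²) equals the ratio inside the logarithm of (A122).

```python
import math

def g_params(gamma, Omega, alpha, rho, Lam, delta):
    """Constants A, D, L of (A120)."""
    V = (1 - rho**2) * alpha * Omega + Lam
    c = 1 + rho * math.sqrt(alpha / gamma)
    A = 1.0 / (gamma * (Omega - delta))
    D = c / math.sqrt(gamma * (Omega - delta) * V)
    L = gamma * (Omega - 3 * delta)
    return A, D, L

def G(tau, A, D, L):
    """The quadratic of (A119)."""
    return (A * tau) ** 2 + D**2 * (L - tau) ** 2

# Hypothetical parameter values, chosen only for illustration.
gamma, Omega, alpha, rho, Lam, delta = 0.8, 2.0, 0.5, 0.3, 1.0, 1e-3

A, D, L = g_params(gamma, Omega, alpha, rho, Lam, delta)
tau_star = D**2 * L / (A**2 + D**2)              # stationary point of G
closed_form = A**2 * D**2 * L**2 / (A**2 + D**2)  # claimed minimum value
grid_min = min(G(L * k / 10000, A, D, L) for k in range(10001))

# At delta = 0, 1 - D^2/(A^2 + D^2) should match the ratio in (A122).
A0, D0, _ = g_params(gamma, Omega, alpha, rho, Lam, 0.0)
lhs = A0**2 / (A0**2 + D0**2)
rhs = ((1 - rho**2) * alpha * Omega + Lam) / (
    (gamma + alpha + 2 * rho * math.sqrt(alpha * gamma)) * Omega + Lam
)
```

The grid search runs over [0, L], which contains the stationary point, so it can only approach the closed-form value from above.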

As for the error event for

M_{b}^{″}

, a similar derivation shows that the probability term in the last sum in (A98) exponentially tends to zero as

n \to \infty

, provided that

\begin{matrix} R^{″} < & - \frac{1}{2} \log (1 + η - \frac{β^{2} {(1 - 2 δ)}^{2}}{β^{2} + n Λ}) < - \frac{1}{2} \log (η + \frac{Λ}{(1 - ρ^{2}) α (Ω - δ) + Λ}) . \end{matrix}

(A123)

This is satisfied for

R^{″} = R_{α}^{″} (L) - δ^{″}

, with

\begin{matrix} R_{α}^{″} (L) = \frac{1}{2} \log (\frac{(1 - ρ^{2}) α Ω + Λ}{Λ}) = - \frac{1}{2} \log (\frac{Λ}{(1 - ρ^{2}) α Ω + Λ}) \end{matrix}

(A124)

for an arbitrary

δ^{″} > 0

, if

η

and

δ

are sufficiently small.

We have thus shown achievability of every rate

\begin{matrix} R < \min \{R_{α}^{'} (L) + R_{α}^{″} (L), \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + R_{α}^{″} (L)\}, \end{matrix}

(A125)

where

\begin{matrix} R_{α}^{'} (L) + R_{α}^{″} (L) = & \frac{1}{2} \log (\frac{(γ + α + 2 ρ \sqrt{α γ}) Ω + Λ}{(1 - ρ^{2}) α Ω + Λ}) + \frac{1}{2} \log (\frac{(1 - ρ^{2}) α Ω + Λ}{Λ}) \\ = & \frac{1}{2} \log (1 + \frac{Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}}}{Λ}) \end{matrix}

(A126)

(see (A93)). This completes the proof of the lower bound.
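The second equality in (A126) can be spelled out step by step. The derivation below is a sketch that takes γΩ = Ω_1, which is what the equality in (A126) requires; the definition itself is in (A93), outside this part of the proof.

```latex
\begin{align*}
R'_{\alpha}(L) + R''_{\alpha}(L)
  &= \frac{1}{2}\log\left(
       \frac{(\gamma+\alpha+2\rho\sqrt{\alpha\gamma})\,\Omega+\Lambda}{(1-\rho^{2})\alpha\Omega+\Lambda}
       \cdot
       \frac{(1-\rho^{2})\alpha\Omega+\Lambda}{\Lambda}
     \right) \\
  &= \frac{1}{2}\log\left(
       1+\frac{\gamma\Omega+\alpha\Omega+2\rho\sqrt{\alpha\gamma}\,\Omega}{\Lambda}
     \right) \\
  &= \frac{1}{2}\log\left(
       1+\frac{\Omega_{1}+\alpha\Omega+2\rho\sqrt{\alpha\Omega\cdot\Omega_{1}}}{\Lambda}
     \right),
\end{align*}
% last step: \gamma\Omega = \Omega_1 and
% \sqrt{\alpha\gamma}\,\Omega = \sqrt{\alpha\Omega\cdot\gamma\Omega} = \sqrt{\alpha\Omega\cdot\Omega_1}.
```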

Appendix M.2. Upper Bound

Let

R > 0

be an achievable rate. Then, there exists a sequence of

(2^{n R}, n, ε_{n}^{*})

codes

C_{n} = (f, f_{1}, g)

for the Gaussian AVRC

L

with SFD such that

ε_{n}^{*} \to 0

as

n \to \infty

, where the encoder consists of a pair

f = (f^{'}, f^{″})

, with

f^{'} : [1 : 2^{n R}] \to R^{n}

and

f^{″} : [1 : 2^{n R}] \to R^{n}

. Assume without loss of generality that the codewords have zero mean, i.e.,

\begin{matrix} \frac{1}{2^{n R}} \sum_{m = 1}^{2^{n R}} \frac{1}{n} \sum_{i = 1}^{n} f_{i} (m) = 0, \\ \int_{- \infty}^{\infty} d y_{1} \cdot \frac{1}{2^{n R}} \sum_{m \in [1 : 2^{n R}]} P_{Y_{1} | M} (y_{1} | m) \cdot \frac{1}{n} \sum_{i = 1}^{n} f_{1, i} (y_{1, 1}, y_{1, 2}, \dots, y_{1, i - 1}) = 0, \end{matrix}

(A127)

where

P_{Y_{1} | M} (y_{1} | m) = \frac{1}{{(2 π σ^{2})}^{n / 2}} e^{- {∥y_{1} - f^{″} (m)∥}^{2} / 2 σ^{2}}

. If this is not the case, redefine the code such that the mean is subtracted from each codeword. Then, define

\begin{matrix} α ≜ \frac{1}{n Ω} \cdot \frac{1}{2^{n R}} \sum_{m \in [1 : 2^{n R}]} {∥f^{'} (m)∥}^{2}, \\ α_{1} ≜ \frac{1}{n Ω_{1}} \cdot \frac{1}{2^{n R}} \sum_{m \in [1 : 2^{n R}]} \int_{- \infty}^{\infty} d y_{1} \cdot P_{Y_{1} | M} (y_{1} | m) \cdot {∥f_{1} (y_{1})∥}^{2}, \\ ρ ≜ \frac{1}{n \sqrt{α Ω \cdot α_{1} Ω_{1}}} \int_{- \infty}^{\infty} d y_{1} \cdot \frac{1}{2^{n R}} \sum_{m \in [1 : 2^{n R}]} P_{Y_{1} | M} (y_{1} | m) \cdot ⟨ f^{'} (m), f_{1} (y_{1}) ⟩ . \end{matrix}

(A128)

Since the code satisfies the input constraints

Ω

and

Ω_{1}

, we have that

α

,

α_{1}

and

ρ

are in the interval

[0, 1]

.

First, we show that if

\begin{matrix} Λ > Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}} + δ, \end{matrix}

(A129)

then the capacity is zero, where

δ > 0

is arbitrarily small. Consider the following jamming strategy. The jammer draws a message

\tilde{M} \in [1 : 2^{n R}]

uniformly at random, and then, generates a sequence

{\tilde{Y}}_{1} \in R^{n}

distributed according to

P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m})

. Let

\tilde{S} = f^{'} (\tilde{M}) + f_{1} ({\tilde{Y}}_{1})

. If

\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ

, the jammer chooses

\tilde{S}

to be the state sequence. Otherwise, let the state sequence consist of all zeros. Observe that

\begin{matrix} E {∥\tilde{S}∥}^{2} = & E {∥f^{'} (\tilde{M}) + f_{1} ({\tilde{Y}}_{1})∥}^{2} \\ = & E {∥f^{'} (\tilde{M})∥}^{2} + E {∥f_{1} ({\tilde{Y}}_{1})∥}^{2} + 2 E ⟨ f^{'} (\tilde{M}), f_{1} ({\tilde{Y}}_{1}) ⟩ \\ = & n (α Ω + α_{1} Ω_{1} + 2 ρ \sqrt{α Ω \cdot α_{1} Ω_{1}}) \\ \leq & n (α Ω + Ω_{1} + 2 ρ \sqrt{α Ω \cdot Ω_{1}}) < n (Λ - δ), \end{matrix}

(A130)

where the third equality is due to (A128), and the last inequality is due to (A129). Thus, by Chebyshev’s inequality, there exists

κ > 0

such that

\begin{matrix} \Pr (\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ) \geq κ . \end{matrix}

(A131)
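The jamming strategy described above can be sketched in a few lines. The Python fragment below is a toy illustration, not the construction of the paper: the two-message code (`f_prime`, `f_second`, `f_relay`), the block length, and all numeric values are hypothetical. The jammer draws its own message, simulates the relay observation through the channel Y_1 = X″ + Z, forms S̃ = f′(M̃) + f_1(Ỹ_1), and falls back to the all-zero sequence when the state constraint Λ is violated.

```python
import random

def jammer_state(f_prime, f_second, f_relay, num_messages, sigma, Lam, rng):
    """Draw one state sequence according to the jamming strategy sketched above."""
    m_tilde = rng.randrange(num_messages)                               # uniform message
    y1_tilde = [x + rng.gauss(0.0, sigma) for x in f_second(m_tilde)]   # Y1 = X'' + Z
    s_tilde = [a + b for a, b in zip(f_prime(m_tilde), f_relay(y1_tilde))]
    n = len(s_tilde)
    if sum(v * v for v in s_tilde) / n <= Lam:                          # state constraint
        return s_tilde
    return [0.0] * n

# Hypothetical toy code of block length 4 with two messages.
N = 4

def f_prime(m):
    return [0.5 if m == 0 else -0.5] * N

def f_second(m):
    return [1.0 if m == 0 else -1.0] * N

def f_relay(y1):
    # Strictly causal relay: symbol i depends only on y1[0..i-1].
    return [0.0] + [0.5 if v >= 0 else -0.5 for v in y1[:-1]]

rng = random.Random(0)
state = jammer_state(f_prime, f_second, f_relay, 2, 0.5, 1.0, rng)
```

In this toy code the power of S̃ never exceeds 0.8125, so with Λ = 1 the jammer always transmits S̃; with a very small Λ the fallback branch returns the all-zero sequence.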

The state sequence

S

is then distributed according to

\begin{matrix} P_{S | {\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ}} (s) = \frac{1}{2^{n R}} \sum_{\tilde{m} \in [1 : 2^{n R}]} \int_{{\tilde{y}}_{1} : f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1}) = s} d {\tilde{y}}_{1} P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}), \\ \Pr (S = 0 ∣ \frac{1}{n} {∥\tilde{S}∥}^{2} > Λ) = 1 . \end{matrix}

(A132)

Assume to the contrary that a positive rate can be achieved when the channel is governed by such a state sequence, hence the size of the message set is at least 2, i.e.,

M ≜ 2^{n R} \geq 2

. The probability of error is then bounded by

\begin{matrix} P_{e}^{(n)} (q, C) = & \int_{- \infty}^{\infty} d s \cdot q (s) P_{e | s}^{(n)} (C) \geq \Pr (\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ) \cdot \int_{s : \frac{1}{n} {∥s∥}^{2} \leq Λ} d s \cdot P_{S | {\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ}} (s) \cdot P_{e | s}^{(n)} (C) \\ \geq & κ \cdot \int_{s : \frac{1}{n} {∥s∥}^{2} \leq Λ} d s \cdot P_{S | {\frac{1}{n} {∥\tilde{S}∥}^{2} \leq Λ}} (s) \cdot P_{e | s}^{(n)} (C) \end{matrix}

(A133)

where the inequality holds by (A131). Next, we have that

\begin{matrix} P_{e | s}^{(n)} (C) = & \frac{1}{M} \sum_{m = 1}^{M} \int_{- \infty}^{\infty} d y_{1} \cdot P_{Y_{1} | M} (y_{1} | m) \cdot 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + s) \neq m\}, \end{matrix}

(A134)

where we define the indicator function

G (y_{1}) = 1 {y_{1} \in A}

such that

G (y_{1}) = 1

if

y_{1} \in A

, and

G (y_{1}) = 0

otherwise. Substituting (A132) and (A134) into (A133) yields

\begin{matrix} P_{e}^{(n)} (q, C) \geq & κ \cdot \int_{s : \frac{1}{n} {∥s∥}^{2} \leq Λ} d s \cdot \frac{1}{M} \sum_{\tilde{m} = 1}^{M} \int_{{\tilde{y}}_{1} : f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1}) = s} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \times \frac{1}{M} \sum_{m = 1}^{M} \int_{- \infty}^{\infty} d y_{1} \cdot P_{Y_{1} | M} (y_{1} | m) \cdot 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + s) \neq m\} . \end{matrix}

(A135)

Eliminating

s = f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})

, and adding the constraint

\frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ

, we obtain the following,

\begin{matrix} P_{e}^{(n)} (q, C) \geq & \frac{κ}{M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \int_{\begin{matrix} (y_{1}, {\tilde{y}}_{1}) : \frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})∥}^{2} \leq Λ \end{matrix}} d y_{1} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} (y_{1} | m) P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \times 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})) \neq m\} . \end{matrix}

(A136)

Now, by interchanging the summation variables

(m, y_{1})

and

(\tilde{m}, {\tilde{y}}_{1})

, we have that

\begin{matrix} P_{e}^{(n)} (q, C) \geq & \frac{κ}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \int_{\begin{matrix} (y_{1}, {\tilde{y}}_{1}) : \frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})∥}^{2} \leq Λ \end{matrix}} d y_{1} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} (y_{1} | m) P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \times 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})) \neq m\} \\ + & \frac{κ}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} = 1}^{M} \int_{\begin{matrix} (y_{1}, {\tilde{y}}_{1}) : \frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})∥}^{2} \leq Λ \end{matrix}} d y_{1} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} (y_{1} | m) P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \times 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})) \neq \tilde{m}\} . \end{matrix}

(A137)

Thus,

\begin{matrix} P_{e}^{(n)} (q, C) \geq & \frac{κ}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} \neq m} \int_{\begin{matrix} (y_{1}, {\tilde{y}}_{1}) : \frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})∥}^{2} \leq Λ \end{matrix}} d y_{1} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} (y_{1} | m) P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \times [1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})) \neq m\} \\ + 1 \{y_{1} : g (f^{'} (m) + f_{1} (y_{1}) + f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})) \neq \tilde{m}\}] . \end{matrix}

(A138)

As the sum in the square brackets is at least 1 for all

\tilde{m} \neq m

, it follows that

\begin{matrix} P_{e}^{(n)} (q, C) \geq & \frac{κ}{2 M^{2}} \sum_{m = 1}^{M} \sum_{\tilde{m} \neq m} \int_{\begin{matrix} (y_{1}, {\tilde{y}}_{1}) : \frac{1}{n} {∥f^{'} (m) + f_{1} (y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{m}) + f_{1} ({\tilde{y}}_{1})∥}^{2} \leq Λ \end{matrix}} d y_{1} d {\tilde{y}}_{1} \cdot P_{Y_{1} | M} (y_{1} | m) P_{Y_{1} | M} ({\tilde{y}}_{1} | \tilde{m}) \\ \geq & \frac{κ}{4} \cdot \Pr (\begin{matrix} \frac{1}{n} {∥f^{'} (M) + f_{1} (Y_{1})∥}^{2} \leq Λ, \\ \frac{1}{n} {∥f^{'} (\tilde{M}) + f_{1} ({\tilde{Y}}_{1})∥}^{2} \leq Λ, \tilde{M} \neq M \end{matrix}) . \end{matrix}

(A139)

Then, recall that by (A130), the expectation of

\frac{1}{n} {∥f^{'} (M) + f_{1} (Y_{1})∥}^{2}

is strictly lower than

Λ

, and for a sufficiently large n, the conditional expectation of

\frac{1}{n} {∥f^{'} (\tilde{M}) + f_{1} ({\tilde{Y}}_{1})∥}^{2}

given

\tilde{M} \neq M

is also strictly lower than

Λ

. Thus, by Chebyshev’s inequality, the probability of error is bounded from below by a positive constant. Following this contradiction, we deduce that if the code is reliable, then

Λ \leq Ω_{1} + α Ω + 2 ρ \sqrt{α Ω \cdot Ω_{1}}

.
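The heart of the argument in (A136)–(A139) is a symmetry: when the state equals a valid "sender plus relay" signal for another message, the receiver sees the same sum for the pair (m, m̃) and for the swapped pair (m̃, m), so any decoder errs on at least one of the two. The sketch below illustrates this with a deliberately simplified toy model in which each message maps to one fixed signal (the relay noise is ignored); the `signals` table and both decoders are hypothetical.

```python
def forced_errors(decoder, signals):
    """Count decoding errors when the state mimics the signal of another message."""
    errors = 0
    pairs = 0
    for m, u in signals.items():
        for m_tilde, u_tilde in signals.items():
            if m == m_tilde:
                continue
            # Received block: the same vector for (m, m_tilde) and (m_tilde, m).
            y = [a + b for a, b in zip(u, u_tilde)]
            errors += decoder(y) != m
            pairs += 1
    return errors, pairs

# Hypothetical noiseless "sender plus relay" signals for three messages.
signals = {
    0: [1.0, 0.5, -1.0],
    1: [-1.0, 1.0, 0.5],
    2: [0.5, -1.0, 1.0],
}

def correlation_decoder(y):
    # Picks the message whose signal correlates best with y.
    return max(signals, key=lambda m: sum(a * b for a, b in zip(y, signals[m])))

errors, pairs = forced_errors(correlation_decoder, signals)
```

Whatever deterministic decoder is plugged in, at least half of the ordered pairs are decoded incorrectly, which mirrors the lower bound on the bracketed sum in (A138).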

It remains to show that for

α

and

ρ

as defined in (A128), we have that

R < F_{G} (α, ρ)

(see (36)). For a

(2^{n R}, n, ε_{n}^{*})

code,

\begin{matrix} P_{e | s}^{(n)} (C) \leq ε_{n}^{*}, \end{matrix}

(A140)

for all

s \in R^{n}

with

{∥s∥}^{2} \leq n Λ

. Then, consider using the code

C

over the Gaussian relay channel

W_{Y, Y_{1} | X, X_{1}}^{\bar{q}}

, specified by

\begin{matrix} Y_{1} = X^{″} + Z, \\ Y = X^{'} + X_{1} + \bar{S}, \end{matrix}

(A141)

where the sequence

\bar{S}

is i.i.d.

\sim \bar{q} = N (0, Λ - δ)

. First, we show that the code

C

is reliable for this channel, and then we show that

R < F_{G} (α, ρ)

. Using the code

C

over the channel

W_{Y, Y_{1} | X, X_{1}}^{\bar{q}}

, the probability of error is bounded by

\begin{matrix} P_{e}^{(n)} (\bar{q}, C) \leq \Pr (\frac{1}{n} {∥\bar{S}∥}^{2} > Λ) + \int_{s : \frac{1}{n} {∥s∥}^{2} \leq Λ} d s \cdot {\bar{q}}^{n} (s) \cdot P_{e | s}^{(n)} (C) \leq ε_{n}^{*} + ε_{n}^{* *}, \end{matrix}

(A142)

where we have bounded the first term by

ε_{n}^{* *}

using the law of large numbers and the second term using (A140), where

ε_{n}^{* *} \to 0

as

n \to \infty

. Since

W_{Y, Y_{1} | X, X_{1}}^{\bar{q}}

is a channel without a state, we can now show that

R < F_{G} (α, ρ)

by following the lines of [4] and [5]. By Fano’s inequality and [4] (Lemma 4), we have that

\begin{matrix} R \leq & \frac{1}{n} \sum_{i = 1}^{n} I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}, X_{1, i}; Y_{i}) + ε_{n}, \\ R \leq & \frac{1}{n} \sum_{i = 1}^{n} I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}; Y_{i}, Y_{1, i} | X_{1, i}) + ε_{n}, \end{matrix}

(A143)

where

\bar{q} = N (0, Λ - δ)

,

X^{'} = f^{'} (M)

,

X^{″} = f^{″} (M)

,

X_{1} = f_{1} (Y_{1})

, and

ε_{n} \to 0

as

n \to \infty

. For the Gaussian relay channel with SFD, we have the following Markov relations,

\begin{matrix} Y_{1, i} - X_{i}^{″} - (X_{i}^{'}, X_{1, i}), \end{matrix}

(A144)

\begin{matrix} (X_{i}^{″}, Y_{1, i}) - (X_{i}^{'}, X_{1, i}) - Y_{i} . \end{matrix}

(A145)

Hence, by (A145),

I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}, X_{1, i}; Y_{i}) = I_{\bar{q}} (X_{i}^{'}, X_{1, i}; Y_{i})

. Moving to the second bound in the RHS of (A143), we follow the lines of [5]. Then, by the mutual information chain rule, we have

\begin{matrix} I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}; Y_{i}, Y_{1, i} | X_{1, i}) = & I (X_{i}^{″}; Y_{1, i} | X_{1, i}) + I (X_{i}^{'}; Y_{1, i} | X_{i}^{″}, X_{1, i}) + I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}; Y_{i} | X_{1, i}, Y_{1, i}) \\ \overset{(a)}{=} & I (X_{i}^{″}; Y_{1, i} | X_{1, i}) + I_{\bar{q}} (X_{i}^{'}, X_{i}^{″}; Y_{i} | X_{1, i}, Y_{1, i}) \\ \overset{(b)}{=} & [H (Y_{1, i} | X_{1, i}) - H (Y_{1, i} | X_{i}^{″})] + [H_{\bar{q}} (Y_{i} | X_{1, i}, Y_{1, i}) - H_{\bar{q}} (Y_{i} | X_{i}^{'}, X_{1, i})] \\ \overset{(c)}{\leq} & I_{q_{1}} (X_{i}^{″}; Y_{1, i}) + I (X_{i}^{'}; Y_{i} | X_{1, i}) \end{matrix}

(A146)

where

(a)

is due to (A144),

(b)

is due to (A145), and

(c)

holds since conditioning reduces entropy. Introducing a time-sharing random variable

K \sim Unif [1 : n]

, which is independent of

X^{'}

,

X^{″}

,

X_{1}

,

Y

,

Y_{1}

, we have that

\begin{matrix} R - ε_{n} \leq & I_{\bar{q}} (X_{K}^{'}, X_{1, K}; Y_{K} | K) \\ R - ε_{n} \leq & I (X_{K}^{″}; Y_{1, K} | K) + I_{\bar{q}} (X_{K}^{'}; Y_{K} | X_{1, K}, K) . \end{matrix}

(A147)

Now, by the maximum differential entropy lemma (see e.g., [98] (Theorem 8.6.5)),

\begin{matrix} I_{\bar{q}} (X_{K}^{'}, X_{1, K}; Y_{K} | K) \leq & \frac{1}{2} \log (\frac{E [{(X_{K}^{'} + X_{1, K})}^{2}] + (Λ - δ)}{Λ - δ}) = \frac{1}{2} \log (1 + \frac{α Ω + α_{1} Ω_{1} + 2 ρ \sqrt{α Ω \cdot α_{1} Ω_{1}}}{Λ - δ}) \end{matrix}

(A148)

and

\begin{matrix} I (X_{K}^{″}; Y_{1, K} | K) + I_{\bar{q}} (X_{K}^{'}; Y_{K} | X_{1, K}, K) \leq & \frac{1}{2} \log \frac{E X_{K}^{″ 2} + σ^{2}}{σ^{2}} + \frac{1}{2} \log \frac{[1 - \frac{{(E (X_{K}^{'} \cdot X_{1, K}))}^{2}}{E X_{K}^{' 2} \cdot E X_{1, K}^{2}}] E X_{K}^{' 2} + (Λ - δ)}{Λ - δ} \\ = & \frac{1}{2} \log (1 + \frac{(1 - α) Ω}{σ^{2}}) + \frac{1}{2} \log (1 + \frac{(1 - ρ^{2}) α Ω}{Λ - δ}), \end{matrix}

(A149)

where

α

,

α_{1}

and

ρ

are given by (A128). Since

δ > 0

is arbitrary, and

α_{1} \leq 1

, the proof follows from (A147)–(A149). □
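As a numerical sanity check, the two terms of the lower bound (A125)–(A126) can be compared with the right-hand sides of (A148) and (A149). The sketch below is an illustration under assumed values: the function names and all numbers are hypothetical, γ is taken as Ω_1/Ω, and logarithms are natural. With α_1 = 1 and δ = 0 the corresponding terms coincide; how they relate to F_G(α, ρ) of (36) is not checked here, since (36) lies outside this part of the proof.

```python
import math

def lower_bound_terms(alpha, rho, Omega, Omega1, Lam, sigma2):
    """The two arguments of the min in (A125), assuming gamma = Omega1 / Omega."""
    gamma = Omega1 / Omega
    V = (1 - rho**2) * alpha * Omega + Lam
    r1 = 0.5 * math.log(
        ((gamma + alpha + 2 * rho * math.sqrt(alpha * gamma)) * Omega + Lam) / V
    )  # (A122)
    r2 = 0.5 * math.log(V / Lam)  # (A124)
    direct = 0.5 * math.log(1 + (1 - alpha) * Omega / sigma2)
    return r1 + r2, direct + r2

def upper_bound_terms(alpha, alpha1, rho, Omega, Omega1, Lam, sigma2, delta):
    """Right-hand sides of (A148) and (A149)."""
    noise = Lam - delta
    cross = 2 * rho * math.sqrt(alpha * Omega * alpha1 * Omega1)
    t1 = 0.5 * math.log(1 + (alpha * Omega + alpha1 * Omega1 + cross) / noise)
    t2 = 0.5 * math.log(1 + (1 - alpha) * Omega / sigma2) + 0.5 * math.log(
        1 + (1 - rho**2) * alpha * Omega / noise
    )
    return t1, t2

# Hypothetical parameter values, chosen only for illustration.
alpha, rho, Omega, Omega1, Lam, sigma2 = 0.5, 0.3, 2.0, 1.5, 1.0, 0.7

lower = lower_bound_terms(alpha, rho, Omega, Omega1, Lam, sigma2)
upper = upper_bound_terms(alpha, 1.0, rho, Omega, Omega1, Lam, sigma2, 0.0)
upper_half_power = upper_bound_terms(alpha, 0.5, rho, Omega, Omega1, Lam, sigma2, 0.0)
```

For nonnegative ρ, lowering α_1 below 1 can only decrease the first term, which is why the proof replaces α_1 by its maximal value.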

References

Van der Meulen, E.C. Three-terminal communication channels. Adv. Appl. Probab. 1971, 3, 120–154.
Kim, Y.H. Coding techniques for primitive relay channels. In Proceedings of the Allerton Conference on Communication, Control and Computing, Monticello, IL, USA, 26–28 September 2007; pp. 129–135.
El Gamal, A.; Kim, Y. Network Information Theory; Cambridge University Press: Cambridge, UK, 2011.
Cover, T.; Gamal, A.E. Capacity theorems for the relay channel. IEEE Trans. Inf. Theory 1979, 25, 572–584.
Gamal, A.E.; Zahedi, S. Capacity of a class of relay channels with orthogonal components. IEEE Trans. Inf. Theory 2005, 51, 1815–1817.
Xue, F. A New Upper Bound on the Capacity of a Primitive Relay Channel Based on Channel Simulation. IEEE Trans. Inf. Theory 2014, 60, 4786–4798.
Wu, X.; Özgür, A. Cut-set bound is loose for Gaussian relay networks. In Proceedings of the Allerton Conference on Communication, Control and Computing, Monticello, IL, USA, 29 September–2 October 2015; pp. 1135–1142.
Chen, Y.; Devroye, N. Zero-Error Relaying for Primitive Relay Channels. IEEE Trans. Inf. Theory 2017, 63, 7708–7715.
Wu, X.; Özgür, A. Cut-set bound is loose for Gaussian relay networks. IEEE Trans. Inf. Theory 2018, 64, 1023–1037.
Wu, X.; Barnes, L.P.; Özgür, A. “The Capacity of the Relay Channel”: Solution to Cover’s Problem in the Gaussian Case. IEEE Trans. Inf. Theory 2019, 65, 255–275.
Mondelli, M.; Hassani, S.H.; Urbanke, R. A New Coding Paradigm for the Primitive Relay Channel. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2018), Talisa Hotel in Vail, CO, USA, 17–22 June 2018; pp. 351–355.
Ramachandran, V. Gaussian degraded relay channel with lossy state reconstruction. AEU Int. J. Electr. Commun. 2018, 93, 348–353.
Chen, Z.; Fan, P.; Wu, D.; Xiong, K.; Letaief, K.B. On the achievable rates of full-duplex Gaussian relay channel. In Proceedings of the Global Communication Conference (GLOBECOM’2014), Austin, TX, USA, 8–12 December 2014; pp. 4342–4346.
Kolte, R.; Özgür, A.; Gamal, A.E. Capacity Approximations for Gaussian Relay Networks. IEEE Trans. Inf. Theory 2015, 61, 4721–4734.
Jin, X.; Kim, Y. The Approximate Capacity of the MIMO Relay Channel. IEEE Trans. Inf. Theory 2017, 63, 1167–1176.
Wu, X.; Barnes, L.P.; Özgür, A. The geometry of the relay channel. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2017), Aachen, Germany, 25–30 June 2017; pp. 2233–2237.
Lai, L.; El Gamal, H. The relay–eavesdropper channel: Cooperation for secrecy. IEEE Trans. Inf. Theory 2008, 54, 4005–4019.
Yeoh, P.L.; Yang, N.; Kim, K.J. Secrecy outage probability of selective relaying wiretap channels with collaborative eavesdropping. In Proceedings of the Global Communication Conference (GLOBECOM’2015), San Diego, CA, USA, 6–10 December 2015; pp. 1–6.
Kramer, G.; van Wijngaarden, A.J. On the white Gaussian multiple-access relay channel. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2000), Sorrento, Italy, 25–30 June 2000; p. 40.
Schein, B.E. Distributed Coordination in Network Information Theory. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2001.
Rankov, B.; Wittneben, A. Achievable Rate Regions for the Two-way Relay Channel. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2006), Seattle, WA, USA, 9–14 July 2006; pp. 1668–1672.
Gunduz, D.; Yener, A.; Goldsmith, A.; Poor, H.V. The Multiway Relay Channel. IEEE Trans. Inf. Theory 2013, 59, 51–63.
Maric, I.; Yates, R.D. Forwarding strategies for Gaussian parallel-relay networks. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2004), Chicago, IL, USA, 27 June–2 July 2004; p. 269.
Kochman, Y.; Khina, A.; Erez, U.; Zamir, R. Rematch and forward for parallel relay networks. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2008), Toronto, ON, Canada, 6–11 July 2008; pp. 767–771.
Awan, Z.H.; Zaidi, A.; Vandendorpe, L. Secure communication over parallel relay channel. arXiv 2010, arXiv:1011.2115.
Xue, F.; Sandhu, S. Cooperation in a Half-Duplex Gaussian Diamond Relay Channel. IEEE Trans. Inf. Theory 2007, 53, 3806–3814.
Kang, W.; Ulukus, S. Capacity of a class of diamond channels. In Proceedings of the Allerton Conference on Communication, Control and Computing, Urbana-Champaign, IL, USA, 23–26 September 2008; pp. 1426–1431.
Chern, B.; Özgür, A. Achieving the Capacity of the N-Relay Gaussian Diamond Network Within logN Bits. IEEE Trans. Inf. Theory 2014, 60, 7708–7718.
Sigurjónsson, S.; Kim, Y.H. On multiple user channels with state information at the transmitters. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2005), Adelaide, Australia, 4–9 September 2005; pp. 72–76.
Simeone, O.; Gündüz, D.; Shamai, S. Compound relay channel with informed relay and destination. In Proceedings of the Allerton Conference on Communication, Control and Computing, Monticello, IL, USA, 30 September–2 October 2009; pp. 692–699.
Zaidi, A.; Vandendorpe, L. Lower bounds on the capacity of the relay channel with states at the source. EURASIP J. Wirel. Commun. Netw. 2009, 2009, 1–22.
Zaidi, A.; Kotagiri, S.P.; Laneman, J.N.; Vandendorpe, L. Cooperative Relaying With State Available Noncausally at the Relay. IEEE Trans. Inf. Theory 2010, 56, 2272–2298.
Zaidi, A.; Shamai, S.; Piantanida, P.; Vandendorpe, L. Bounds on the capacity of the relay channel with noncausal state at the source. IEEE Trans. Inf. Theory 2013, 59, 2639–2672.
Blackwell, D.; Breiman, L.; Thomasian, A.J. The capacities of certain channel classes under random coding. Ann. Math. Stat. 1960, 31, 558–567.
Simon, M.K.; Alouini, M.S. Digital Communication over Fading Channels; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 95.
Shamai, S.; Steiner, A. A broadcast approach for a single-user slowly fading MIMO channel. IEEE Trans. Inf. Theory 2003, 49, 2617–2635.
Abdul Salam, A.; Sheriff, R.; Al-Araji, S.; Mezher, K.; Nasir, Q. Novel Approach for Modeling Wireless Fading Channels Using a Finite State Markov Chain. ETRI J. 2017, 39, 718–728.
Ozarow, L.H.; Shamai, S.; Wyner, A.D. Information theoretic considerations for cellular mobile radio. IEEE Trans. Veh. Tech. 1994, 43, 359–378.
Goldsmith, A.J.; Varaiya, P.P. Capacity of fading channels with channel side information. IEEE Trans. Inf. Theory 1997, 43, 1986–1992.
Caire, G.; Shamai, S. On the capacity of some channels with channel state information. IEEE Trans. Inf. Theory 1999, 45, 2007–2019.
Zhou, S.; Zhao, M.; Xu, X.; Wang, J.; Yao, Y. Distributed wireless communication system: A new architecture for future public wireless access. IEEE Commun. Mag. 2003, 41, 108–113.
Xu, Y.; Lu, R.; Shi, P.; Li, H.; Xie, S. Finite-time distributed state estimation over sensor networks with round-robin protocol and fading channels. IEEE Trans. Cybern. 2018, 48, 336–345.
Kuznetsov, A.V.; Tsybakov, B.S. Coding in a memory with defective cells. Probl. Peredachi Inf. 1974, 10, 52–60.
Heegard, C.; Gamal, A.E. On the capacity of computer memory with defects. IEEE Trans. Inf. Theory 1983, 29, 731–739.
Kuznetsov, A.V.; Vinck, A.J.H. On the general defective channel with informed encoder and capacities of some constrained memories. IEEE Trans. Inf. Theory 1994, 40, 1866–1871.
Kim, Y.; Kumar, B.V.K.V. Writing on dirty flash memory. In Proceedings of the Allerton Conference on Communication, Control and Computing, Monticello, IL, USA, 30 September–3 October 2014; pp. 513–520.
Bunin, A.; Goldfeld, Z.; Permuter, H.H.; Shamai, S.; Cuff, P.; Piantanida, P. Key and message semantic-security over state-dependent channels. IEEE Trans. Inf. Forensic Secur. 2018.
Gungor, O.; Koksal, C.E.; Gamal, H.E. An information theoretic approach to RF fingerprinting. In Proceedings of the Asilomar Conference on Signals, Systems and Computers (ACSSC’2013), Pacific Grove, CA, USA, 3–6 November 2013; pp. 61–65.
Ignatenko, T.; Willems, F.M.J. Biometric security from an information-theoretical perspective. Found. Trends® Commun. Inf. Theory 2012, 7, 135–316.
Han, G.; Xiao, L.; Poor, H.V. Two-dimensional anti-jamming communication based on deep reinforcement learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017.
Xu, W.; Trappe, W.; Zhang, Y.; Wood, T. The feasibility of launching and detecting jamming attacks in wireless networks. In Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking and Computing, Urbana-Champaign, IL, USA, 25–27 May 2005; pp. 46–57.
Alnifie, G.; Simon, R. A multi-channel defense against jamming attacks in wireless sensor networks. In Proceedings of the ACM Workshop on QoS Security Wireless Mobile Networks, Crete Island, Greece, 22 October 2007; pp. 95–104.
Padmavathi, G.; Shanmugapriya, D. A survey of attacks, security mechanisms and challenges in wireless sensor networks. arXiv 2009, arXiv:0909.0576.
Wang, T.; Liang, T.; Wei, X.; Fan, J. Localization of Directional Jammer in Wireless Sensor Networks. In Proceedings of the 2018 International Conference on Robots & Intelligent System (ICRIS’2018), Changsha, China, 26–27 May 2018; pp. 198–202.
Ahlswede, R. Elimination of correlation in random codes for arbitrarily varying channels. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 1978, 44, 159–175.
Ericson, T. Exponential error bounds for random codes in the arbitrarily varying channel. IEEE Trans. Inf. Theory 1985, 31, 42–48.
Csiszár, I.; Narayan, P. The capacity of the arbitrarily varying channel revisited: Positivity, constraints. IEEE Trans. Inf. Theory 1988, 34, 181–193.
Ahlswede, R. Coloring hypergraphs: A new approach to multi-user source coding, Part 2. J. Comb. 1980, 5, 220–268.
Ahlswede, R. Arbitrarily varying channels with states sequence known to the sender. IEEE Trans. Inf. Theory 1986, 32, 621–629.
Jahn, J.H. Coding of arbitrarily varying multiuser channels. IEEE Trans. Inf. Theory 1981, 27, 212–226.
Hof, E.; Bross, S.I. On the deterministic-code capacity of the two-user discrete memoryless Arbitrarily Varying General Broadcast channel with degraded message sets. IEEE Trans. Inf. Theory 2006, 52, 5023–5044.
Winshtok, A.; Steinberg, Y. The arbitrarily varying degraded broadcast channel with states known at the encoder. In Proceedings of the 2006 IEEE International Symposium on Information Theory, Seattle, WA, USA, 9–14 July 2006; pp. 2156–2160.
He, X.; Khisti, A.; Yener, A. MIMO Multiple Access Channel With an Arbitrarily Varying Eavesdropper: Secrecy Degrees of Freedom. IEEE Trans. Inf. Theory 2013, 59, 4733–4745.
Pereg, U.; Steinberg, Y. The arbitrarily varying degraded broadcast channel with causal side information at the encoder. In Proceedings of the 2018 International Zurich Seminar Information Communication (IZS’2018), Aachen, Germany, 25–30 June 2018; pp. 20–24.
Keresztfalvi, T.; Lapidoth, A. Partially-robust communications over a noisy channel. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2018), Vail, CO, USA, 17–22 June 2018; pp. 2003–2006.
Gubner, J.A. Deterministic Codes for Arbitrarily Varying Multiple-Access Channels. Ph.D. Dissertation, University of Maryland, College Park, MD, USA, 1988.
Gubner, J.A. On the deterministic-code capacity of the multiple-access arbitrarily varying channel. IEEE Trans. Inf. Theory 1990, 36, 262–275.
Gubner, J.A. State constraints for the multiple-access arbitrarily varying channel. IEEE Trans. Inf. Theory 1991, 37, 27–35.
Gubner, J.A. On the capacity region of the discrete additive multiple-access arbitrarily varying channel. IEEE Trans. Inf. Theory 1992, 38, 1344–1347.
Gubner, J.A.; Hughes, B.L. Nonconvexity of the capacity region of the multiple-access arbitrarily varying channel subject to constraints. IEEE Trans. Inf. Theory 1995, 41, 3–13.
Ahlswede, R.; Cai, N. Arbitrarily Varying Multiple-Access Channels; Universität Bielefeld: Bielefeld, Germany, 1996.
Ahlswede, R.; Cai, N. Arbitrarily varying multiple-access channels. I. Ericson’s symmetrizability is adequate, Gubner’s conjecture is true. IEEE Trans. Inf. Theory 1999, 45, 742–749.
He, X.; Khisti, A.; Yener, A. MIMO multiple access channel with an arbitrarily varying eavesdropper. In Proceedings of the Allerton Conference on Communication, Control and Computing (Allerton’2011), Monticello, IL, USA, 28–30 September 2011; pp. 1182–1189.
Wiese, M.; Boche, H. The arbitrarily varying multiple-access channel with conferencing encoders. IEEE Trans. Inf. Theory 2013, 59, 1405–1416.
Nitinawarat, S. On the Deterministic Code Capacity Region of an Arbitrarily Varying Multiple-Access Channel Under List Decoding. IEEE Trans. Inf. Theory 2013, 59, 2683–2693.
MolavianJazi, E.; Bloch, M.; Laneman, J.N. Arbitrary jamming can preclude secure communication. In Proceedings of the Allerton Conference on Communication, Control and Computing, Monticello, IL, USA, 30 September–2 October 2009; pp. 1069–1075.
Boche, H.; Schaefer, R.F. Capacity results and super-activation for wiretap channels with active wiretappers. IEEE Trans. Inf. Forensic Secur. 2013, 8, 1482–1496.
Aydinian, H.; Cicalese, F.; Deppe, C. Information Theory, Combinatorics, and Search Theory; Springer: Berlin/Heidelberg, Germany, 2013; Chapter 5.
Boche, H.; Schaefer, R.F.; Poor, H.V. On arbitrarily varying wiretap channels for different classes of secrecy measures. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2014), Honolulu, HI, USA, 29 June–4 July 2014; pp. 2376–2380.
Boche, H.; Schaefer, R.F.; Poor, H.V. On the continuity of the secrecy capacity of compound and arbitrarily varying wiretap channels. IEEE Trans. Inf. Forensic Secur. 2015, 10, 2531–2546.
Nötzel, J.; Wiese, M.; Boche, H. The arbitrarily varying wiretap channel: Secret randomness, stability, and super-activation. IEEE Trans. Inf. Theory 2016, 62, 3504–3531.
Goldfeld, Z.; Cuff, P.; Permuter, H.H. Arbitrarily Varying Wiretap Channels with Type Constrained States. IEEE Trans. Inf. Theory 2016, 62, 7216–7244.
He, D.; Luo, Y. Arbitrarily varying wiretap channel with state sequence known or unknown at the receiver. arXiv 2017, arXiv:1701.02043.
Boche, H.; Deppe, C. Secure identification for wiretap channels; Robustness, super-additivity and continuity. IEEE Trans. Inf. Forensic Secur. 2018, 13, 1641–1655.
Pereg, U.; Steinberg, Y. The Arbitrarily Varying Channel Under Constraints with Side Information at the Encoder. IEEE Trans. Inf. Theory 2019, 65, 861–887.
Pereg, U.; Steinberg, Y. The arbitrarily varying channel under constraints with causal side information at the encoder. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2017), Aachen, Germany, 25–30 June 2017; pp. 2805–2809.
Csiszár, I.; Narayan, P. Capacity of the Gaussian arbitrarily varying channel. IEEE Trans. Inf. Theory 1991, 37, 18–26.
Behboodi, A.; Piantanida, P. On the simultaneous relay channel with informed receivers. In Proceedings of the IEEE International Symposium on Information Theory (ISIT’2009), Seoul, Korea, 28 June–3 July 2009; pp. 1179–1183.
Bjelaković, I.; Boche, H.; Sommerfeld, J. Capacity results for arbitrarily varying wiretap channels. In Information Theory, Combinatorics, and Search Theory; Springer: Berlin/Heidelberg, Germany, 2013; pp. 123–144.
Sion, M. On General Minimax Theorems. Pac. J. Math. 1958, 8, 171–176.
Pereg, U.; Steinberg, Y. The arbitrarily varying Gaussian relay channel with sender frequency division. arXiv 2018, arXiv:1805.12595.
Hughes, B.; Narayan, P. Gaussian arbitrarily varying channels. IEEE Trans. Inf. Theory 1987, 33, 267–284.
Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011.
Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
Blackwell, D.; Breiman, L.; Thomasian, A.J. The capacity of a class of channels. Ann. Math. Stat. 1959, 30, 1229–1241.
Diggavi, S.N.; Cover, T.M. The worst additive noise under a covariance constraint. IEEE Trans. Inf. Theory 2001, 47, 3072–3081.
Shannon, C. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656.
Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.

Figure 1. Communication over the arbitrarily varying relay channel $\mathcal{L} = \{W_{Y,Y_1|X,X_1,S}\}$. Given a message $M$, the encoder transmits $X^n = f(M)$. At time $i \in [1:n]$, the relay transmits $X_{1,i}$ based on all the symbols of the past $Y_1^{i-1}$ and then receives a new symbol $Y_{1,i}$. The decoder receives the output sequence $Y^n$, and finds an estimate of the message $\hat{M} = g(Y^n)$.

Figure 2. The functional $G(p,q) = \min\{I_q(X,X_1;Y),\, I(X;Y_1|X_1)\}$, for $S \sim \mathrm{Bernoulli}(q)$, $0 \leq q \leq 1$, as a function of $q$. The figure corresponds to Example 1, where $G(p,q) = \min\{1 - \frac{1}{2} h(q),\, 1 - h(\theta)\}$, for $p(x,x_1) = p(x)\,p(x_1)$, with $X \sim \mathrm{Bernoulli}(\frac{1}{2})$ and $X_1 \sim \mathrm{Bernoulli}(\frac{1}{2})$, and $\theta = 0.08$. Clearly, $G(p,q)$ is not convex in $q$, but rather quasi-convex in $q$.
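
The convexity claim in the caption of Figure 2 can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes that h denotes the binary entropy function in bits, and the helper names (`h`, `G`, `violates_midpoint_convexity`, `is_quasi_convex_on_grid`) are illustrative. It evaluates the closed form from Example 1 with θ = 0.08, exhibits one chord that lies below the function (so G is not convex in q), and verifies the quasi-convexity inequality on a finite grid only.

```python
import math

THETA = 0.08  # theta as stated in the caption of Figure 2


def h(x: float) -> float:
    """Binary entropy in bits (assumed meaning of h), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def G(q: float, theta: float = THETA) -> float:
    """Closed form from the caption: min{1 - h(q)/2, 1 - h(theta)}."""
    return min(1.0 - 0.5 * h(q), 1.0 - h(theta))


def violates_midpoint_convexity(f, a: float, b: float, tol: float = 1e-12) -> bool:
    """True if f at the midpoint of [a, b] lies strictly above the chord."""
    return f(0.5 * (a + b)) > 0.5 * (f(a) + f(b)) + tol


def is_quasi_convex_on_grid(f, n: int = 101, tol: float = 1e-12) -> bool:
    """Check f(m) <= max(f(a), f(b)) for all grid points a < m < b in [0, 1]."""
    vals = [f(k / (n - 1)) for k in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            bound = max(vals[i], vals[j]) + tol
            if any(v > bound for v in vals[i + 1:j]):
                return False
    return True


if __name__ == "__main__":
    # A single violated chord is enough to rule out convexity.
    print("convexity violated on [0, 0.5]:", violates_midpoint_convexity(G, 0.0, 0.5))
    # The grid check is evidence for quasi-convexity, not a proof.
    print("quasi-convex on grid:", is_quasi_convex_on_grid(G))
```

The grid test only samples finitely many points. The analytic reason is that $1 - \frac{1}{2} h(q)$ is convex in $q$, and the minimum of a convex function and a constant has convex sublevel sets.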

Figure 3. The marginals of the arbitrarily varying relay channel. For every relay transmission $x_1 \in \mathcal{X}_1$, the marginal sender-relay AVC is denoted by $W_1(x_1) = \{W_{Y_1|X,X_1,S}(\cdot|\cdot,x_1,\cdot)\}$, and the marginal sender-receiver AVC is denoted by $W(x_1) = \{W_{Y|X,X_1,S}(\cdot|\cdot,x_1,\cdot)\}$. A sufficient condition, under which the deterministic code capacity is the same as the random code capacity of the AVRC, is given in Lemma 2. This condition is also a sufficient condition for positive capacity, but as explained in Remark 4, it is not a necessary condition.

Figure 4. Bounds on the capacity of the Gaussian AVRC with sender frequency division. The dashed upper line depicts the random code capacity of the Gaussian AVRC as a function of the input constraint $\Omega = \Omega_1$, under state constraint $\Lambda = 1$ and $\sigma^2 = 0.5$. The solid lines depict the deterministic code lower and upper bounds $R_{G,\mathrm{low}}(\mathcal{L})$ and $R_{G,\mathrm{up}}(\mathcal{L})$. The dotted lower line depicts the direct transmission lower bound.

Figure 5. Communication over the primitive AVRC $\mathcal{L}$. Given a message $M$, the encoder transmits $X^n = f(M)$. The relay receives $Y_1^n$ and sends $L = f_1(Y_1^n)$, where $f_1 : \mathcal{Y}_1^n \to [1 : 2^{nC_1}]$. The decoder receives both the channel output sequence $Y^n$ and the relay output $L$, and finds an estimate of the message $\hat{M} = g(Y^n, L)$.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
