List-Decoding Capacity of the Gaussian Arbitrarily-Varying Channel †

In this paper, we determine the capacity of the Gaussian arbitrarily-varying channel with a (possibly stochastic) encoder and a deterministic list-decoder under the average probability of error criterion. We assume that both the legitimate and the adversarial signals are restricted by their power constraints. We also assume that there is no path between the adversary and the legitimate user but the adversary knows the legitimate user’s code. We show that for any list size L, the capacity is equivalent to the capacity of a point-to-point Gaussian channel with noise variance increased by the adversary power, if the adversary has less power than L times the transmitter power; otherwise, the capacity is zero. In the converse proof, we show that if the adversary has enough power, then the decoder can be confounded by the adversarial superposition of several codewords while satisfying its power constraint with positive probability. The achievability proof benefits from a novel variant of the Csiszár-Narayan method for the arbitrarily-varying channel.


Introduction
An arbitrarily-varying channel (AVC) represents a memoryless channel including unknown parameters that are changing arbitrarily from channel use to channel use. Because these parameters (state) can change arbitrarily, we consider this to be a model for an active adversarial jammer. This adversary sends its signal to restrain the legitimate transmitter and receiver from maintaining reliable communication. In wireless channels, these unpleasant adversaries can easily enter channels, so it is of great importance to study the adversary's effect on the channel capacity. The capacity of the AVC depends on the coding method (such as randomized coding, stochastic encoder or deterministic coding), the performance criterion (such as average or maximum error probability) and the amount of adversary's knowledge about the transmitted signal (omniscient, myopic or oblivious adversary). Table 1 provides a summary of various models appearing in the literature, including the one considered here.
Blackwell et al. introduced AVC in Reference [1] and under the average error probability criterion they found the capacity of the discrete memoryless AVC to be given by a min-max expression over the mutual information of input and output. They employed randomized coding that is, common randomness between the encoder and the decoder and assumed the jammer to be oblivious that is, the jammer does not have any information about the transmitted signal except the code. In Reference [2], it is shown that the min-max expression is equivalent to the corresponding max-min one. Further, in Reference [3], the authors examined that this capacity remains the same even for the maximum error probability criterion, again provided access to common randomness. The case without common randomness between the transmitter and the receiver is referred to as the deterministic code setting. Ahlswede in a notable paper [4] characterized the deterministic capacity of a discrete AVC under the average probability of error through a dichotomy theorem. He proved that the capacity either corresponds to the AVC randomized code capacity or else it equals zero but he did not state any necessary or sufficient condition for which of the two cases prevails. Ericson, in Reference [5], found the necessary condition for the positive alternative by defining symmetrizability. A symmetrizable AVC is an AVC in which the adversary can mimic the transmitted signal in order to prevent the decoder from distinguishing between the true message and an adversarial imitation. Thus, he showed that if the deterministic code capacity of an AVC is positive then the channel should be nonsymmetrizable. Later, in Reference [6], a sufficient condition was provided by Csiszár and Narayen stating that if the AVC is nonsymmetrizable then the deterministic code capacity is not zero. Therefore, considering both conditions, the deterministic code capacity of an AVC is positive if and only if the channel is nonsymmetrizable.
The capacity of discrete AVC is also investigated under input and state (or adversarial signal) constraints. Restricted by peak or average input and state cost constraints, the random code capacity of discrete AVC is studied in Reference [7] using the average probability of error as the performance criterion. Furthermore, the second part of Csiszár and Narayan work in Reference [6] focuses on the deterministic code capacity of AVC under input and state constraints for the same performance criterion. They proved that in this case if the capacity is positive then it is less than or equal to the corresponding random code capacity. In particular, with input and state constraints, the capacity can be positive but strictly less than the random code capacity. Note that this does not occur without cost constraints. Csiszár, in Reference [8], extended this result to general input and output alphabets and state sets rather than only finite alphabets and state sets.
There is a wide variety of research on different versions of AVCs under various adversaries model, including References [9][10][11]. Sarwate, in [9], considered a myopic adversary in which there is a discrete memoryless channel (DMC) between the legitimate user and the jammer and the jammer chooses its signal based on this noisy version of the user's codeword. He found the capacity by minimizing over all DMCs that the jammer can applied by its worst strategies. In Reference [10], single letter characterizations of capacity is obtained in the presence of a delayed adversary which can observe the transmitted signal after a delay. By assuming randomization at the encoder, the capacity is corresponding to the randomized code capacity. B. K. Dey et al., in Reference [11], obtained upper bounds on the capacity of binary channel in the presence of a causal adversary for both maximal and average error probabilities.
On the other hand, AVCs are also studied in network settings throughout [12][13][14][15][16][17][18], such as multiple-access and broadcast channels. The authors in Reference [12][13][14] investigated the capacity of arbitrarily varying multiple-access channels (AVMACs). In Reference [13], it is proved that symmetrizability, which is defined for the two-user AVC, is a sufficient condition for an AVMAC to have empty interior for its deterministic code capacity region. Moreover, throughout [15][16][17][18], the capacity of arbitrarily varying wiretap channels (AVWCs) are determined.  [22] Gaussian Avg Oblivious C P Λ+σ 2 , P > Λ 0, This paper focuses on the Gaussian AVC (GAVC), wherein all alphabets are continuous rather than discrete. Initially, Ahlswede, in Reference [23], studied the capacity of a GAVC in which the adversary chooses the noise variance rather than an additive signal. Hughes and Narayan in Reference [21] determined the randomized code capacity of GAVC under the peak power input and state constraints. They further extended their result in Reference [24] for a vector GAVC. The deterministic code capacity of the GAVC, for the average probability of error, was found in Reference [22]. The authors showed that if the adversary's power is greater than the legitimate transmitter's power, then symmetrizability occurs causing the capacity to drop to zero. Note that for a discrete AVC with no cost constraint non-symmetrizability makes the deterministic capacity positive and equal to the randomized capacity [6] (Theorem 1). It is further proved in [6] (Theorem 3) that under input and state cost constraints, non-symmetrizability only results in positive deterministic capacity but it is sometimes strictly less than the randomized capacity. In the Gaussian case, even though there are input and state cost constraints, the behavior is like that of a discrete AVC with no cost constraint, in that if the channel is non-symmetrizable, then its deterministic capacity is positive and equal to the randomized capacity [22].
For the first time, in Reference [19], Hughes showed that using list-decoding, in which the decoder can decode to a small (and bounded) list rather than a unique message estimate, causes positive capacity for most symmetrizable discrete-memoryless AVCs. Intuitively, list-decoding combats the symmetrizing attack by allowing the list to contain the true message as well as the counterfeit(s) generated by the adversary; thus, the receiver can successfully decode even if it cannot specify the correct message. Furthermore, the authors in Reference [20] extended the list-decoding result to the discrete-memoryless AVCs with state constraints. They determined upper and lower bounds on the capacity by introducing two notions of symmetrizability for this channel.
The capacity of AVMAC was also studied with list-decoding in References [25,26]. Sirin Nitinawarat in Reference [25] introduced symmetrizability of an AVMAC and showed that the capacity region for deterministic codes with fixed list-size is empty if the list size is less than the symmetrizability Ω. He obtained that the capacity corresponds to the random code capacity if the list size is greater than (1 + Ω) 2 . H. Boche and R.F. Schaefer in Reference [26] obtained the list capacity region of AVMAC with conferencing encoders which is proved for large list size to be equal to the common randomness assisted capacity region. Moreover, in Reference [27], the authors found the deterministic code and random code capacity regions of arbitrarily varying broadcast channel with receiver side information. By defining a concept of symmetrizability for the channel, they characterized deterministic list codes capacity region as either identical to the random code capacity region or empty.
In this paper, we characterize the capacity of GAVC in Reference [22], using list-decoding for any list size and almost all power values of the transmitter and adversary, a similar result to that of Hughes in Reference [19] which obtained the list capacity for the discrete-memoryless AVC, for which the capacity was determined in Reference [6]. We assume that the encoder may be stochastic-that is, the encoder has access to private randomness-and a deterministic list decoder with constant list size L. Under the average probability of error criterion and without common randomness, we obtain the capacity of GAVC with list decoding to be equal to the corresponding randomized code capacity if the list size is greater than the power ratio of the jammer to the legitimate user; otherwise, the capacity is zero. Generally, our problem is a generalized version of the multiple packing of spherical caps problem in Reference [28] with Gaussian noise; although, they assumed maximal probability of error as the performance criterion. Their upper bound, which is only calculated for the noiseless case, depends on the list size L even in the asymptotic case. However, we only have list size in our symmetrizability conditions rather than the capacity itself.
In our converse proof (in Section 4), the adversary focuses on two possible strategies, one of which is simply sending Gaussian noise which causes the channel to act as a standard Gaussian channel with increased noise variance. The second strategy for the adversary is to transmit the superposition of some random (counterfeit) codewords, which is shown to be possible with positive probability if its power is large enough. In our achievability proof (in Section 5), we employ Gaussian codewords with a particular version of minimum distance list-decoding based on typicality. We extend the scheme of Reference [22] to show that with high probability a random Gaussian codebook has desirable properties to make the probability of error zero. However, our achievability proof originates from the idea of Reference Reference [6] based on typical sets, rather than the geometric approach of Reference [22]. This scheme allows us for simpler characterizations of codebook constraints. It is worth mentioning that we prove the achievability for the deterministic encoder since it suffices to achieve a rate even by a deterministic code, that is any deterministic code is a realistic value of a stochastic code. Our converse and achievability proofs apply for any list size; our previous work [29] provided proofs only for list size L = 2.
Notations: Upper case letters denote random variables while lower case letters specify a deterministic value or the realization of a random variable. Bold letters denotes n-length vectors. We indicate inner product and 2-norm by ·, · and · , respectively. We use | · | + and E[·] to denote the positive-part function and the expectation, respectively. Also, for an integer N, notation [N] represents the set {1, 2, 3, . . . , N}. Notation I n and 1 n stand for the identity matrix of size n and a vector of size n with n ones elements, respectively. For a vector v, we use superscript v T to denote its transpose. Both log(·) and exp(·) functions has base 2, so we define the Gaussian channel capacity function C(x) = 1 2 log(1 + x).

Problem Statement
A GAVC is a modified standard point-to-point Gaussian channel in the presence of an additive arbitrarily chosen adversary signal. Both transmitter and receiver do not know anything about the adversary signal and the adversary do not have any information about the transmitted signal except the codebooks. The received signal is given by where the n-length vector x is the legitimate transmitter's signal, s represents the independent adversary signal and noise V is a sequence of n-length i.i.d. zero mean Gaussian random variables with variance σ 2 , independent of x and s. We have the assumption of peak power constraints for the transmitter and adversary signals respectively as x 2 ≤ nP and s 2 ≤ nΛ. In addition, the transmitter and receiver are assumed to know the three parameters P, Λ and σ 2 . An (n, N, N r , L) stochastic list code for the GAVC is given by: where the rate of the code is R = 1 n log(N/L). The transmitter encodes the message M and its private randomness K to x(M, K) where M and K are chosen uniformly respectively from sets M and K. At the receiver, signal Y is decoded by a deterministic function φ to the set D L which is the set of all subsets of [N] with cardinality at most L. In other words, L is the size of the list decoder.
First, define the probability of error e(s, i) for a specific message i ∈ [N] in the presence of a specific adversary signal s ∈ R n as the probability that i / ∈ φ(y). Therefore, the average probability of error for s is given byē Finally, the overall probability of error P (n) e is obtained by maximizing over all possible choices of jammers' sequences s satisfying peak power constraint s 2 ≤ nΛ. Suppose r is rate of private randomness. Given L and r, rate R is achievable if there exists a sequence of n, L2 nR , 2 nr , L codes such that lim n→∞ P (n) e = 0. The list-code capacity C (L, r) is the supremum of all achievable rates given L and r.

Main Results
Theorem 1. The list-code capacity of GAVC is given by Note that the capacity for Λ = LP is unsolved.

Remark 1.
Note that this result holds for all r, including r = 0 which corresponds to a deterministic encoder. That is, the capacity does not depend on the amount of private randomness.

Remark 2.
The condition on the ratio Λ P determines whether it is possible for the adversary to launch a symmetrizing attack, wherein it transmits a superposition of codewords. Since each codeword has power P, the most codewords that the adversary can superpose while obeying its power constraint of Λ is the Λ P . Thus, if the allowable list size is greater than Λ P , then even under this attack the decoder can output a list made up of the true message and the superposed codewords selected by the adversary. Of course, the decoder does not know which is which but it can still guarantee that the true message is in the list. Thus, the worst the adversary can do is to act as an extra additive Gaussian noise with variance Λ, so the capacity is equal to the capacity of a standard Gaussian channel with increased noise variance as in C( P Λ+σ 2 ) . However, if the allowable list size is less than Λ P , then there are too many possibilities for the decoder to decode correctly, so the capacity is zero. Note that none of this depends on the channel noise, so σ 2 does not come into play in the condition on L.

Remark 3.
For the achievability proof, we make no assumptions about what the adversary does. However, for the converse proof, it is allowable to weaken the adversary by making certain assumptions about its behavior, because doing so can only increase the achievable rates. Since the converse is an upper bound on achievable rates, weakening the adversary in this manner still yields a valid upper bound.
In our proofs in Sections 4 and 5, without loss of generality we restrict ourselves to the transmitter's power P = 1 which can be done by scaling the output signal.

Converse Proof
Without loss of generality, suppose P = 1. For the first case where Λ < L, we assume that we have a code (n, L2 nR , 2 nr , L) with vanishing probability of error. Since these codes must function for arbitrary jamming signals, we may derive an outer bound by assuming the adversary transmits Gaussian noise with variance Λ − γ for any γ > 0 or 0 if Gaussian realization has power greater than Λ. By the law of large numbers, with high probability the resulting channel is equivalent to a standard Gaussian channel with noise variance σ 2 + Λ − γ. Thus, since γ can be chosen arbitrarily small, from the capacity of a non-adversarial Gaussian channel, Now, assume the symmetrizable case where Λ > L. In order to show C (L, r) = 0, first consider a sequence of stochastic codebooks and probability of error P (n) e . We claim that if R > 0 and the jammer has the following strategy, then P (n) e is bounded away from zero for sufficiently large n: The jammer randomly and uniformly chooses L messages M 1 , . . . , M L from [L2 nR ] and also L private keys K 1 , . . . , K L from [2 nr ] where M i and K i are independent. Note that the jammer knows the transmitted codebook. The jammer then constructs where µ ∈ R n is a constant vector that we will choose later. The jammer transmits S = Z if Z 2 ≤ nΛ or else transmits S = 0. In the former case, the received signal is where M 0 is the true message. If then from the decoder's perspective, any L of the L + 1 messages might have been forged by the adversary. Therefore, the list decoder with list size at most L has a probability of error at least 1 L+1 ; that is, the probability that the decoder chooses L from the L + 1 messages that does not contain the true message M 0 . That is, where X i = x(M i , K i ) and the second term in (10) shows the probability of messages M 0 , . . . , M L not being distinct which tends to zero as n → ∞. Note that X 0 , X 1 , . . . , X L are independent and each distributed as a transmitted sequence from the code. We proceed to show that there exists a choice of µ based only on the codebook such that (10) is bounded away from zero for sufficiently large n if R > 0. Let Note that α ≤ 1, since by the power constraint we always have P( X 0 2 ≤ n) = 1. Fix any δ > 0 and let Since α > α we have γ > 0. Thus for n sufficiently large, there exists µ ∈ R n such that This µ is the one to be used by the jammer in (5). Let B n be the set of all x that satisfy x − µ 2 ≤ nα. Note that P(X 0 ∈ B n ) ≥ γ − δ.
Since α − δ < α , by the definition of α , Specifically, there exists n sufficiently large so that for all µ ∈ R n , Fix any x 1 ∈ B n and consider those x ∈ B n such that Thus, we obtain the following by applying (15) Moreover, if x 1 , . . . , x L ∈ B n satisfy x i − µ, x j − µ ≤ n √ δα for all i = j ∈ {1, . . . , L }, then where (23) holds since α < α + δ ≤ 1 + δ and (24) holds for sufficiently small δ and by assumption Λ > L. Now we have where (25) follows from the analysis leading to (24), (26) follows from the union bound and the fact that X 0 , X 1 , . . . , X L are independent and (27) follows from the lower bound on the probability of B n and (20). For sufficiently small δ, (27) is bounded away from zero, so by (10), P (n) e is also bounded away from zero for sufficiently large n if R > 0.

Achievability Proof
Before proceeding to the proof, let define the typical set for Gaussian random variables X 1 , . . . , X k as: Without loss of generality, assume P = 1, r = 0 and Note that assuming r = 0 makes the code deterministic. Thus, it suffices to achieve the list-code capacity by a (n, L2 nR , L) deterministic code construction as follows: Codebook generation: Fix > > γ > 0. We generate L2 nR i.i.d zero mean Gaussian sequences Encoding: The transmitter sends X = X(m) if its power is less than 1, otherwise it sends zero. Decoding: First, given y, for 1 ≤ ≤ L let set S be the set of -tuple messages (m 1 , . . . , m ) that (x(m 1 ), . . . , x(m ), y) ∈ T (n) (X 1 , . . . , X , Y) for any set of zero-mean Gaussian variables (X 1 , . . . , X , Y) such that and 1 ≤ A ≤ 1 + σ 2 + Λ. Now we define the decoding function as where we choose between multiple minimizing L arbitrarily. Analysis of the probability of error: To prove that the constructed code is achievable meaning that the probability of error tends to zero as n → ∞, we utilize several lemmas including the following. We provide some necessary codebook properties that hold with high probability in Lemma 1, the proof for which is in the Appendix A. Lemma 1. Let X(m) for m ∈ [N], N = L2 nR , be the Gaussian codebook described above. With probability approaching 1 as n → ∞, the codebook satisfies the following, for any x, s where s 2 ≤ nΛ and any zero-mean jointly Gaussian random vector (X, X 1 , . . . , X , S) with positive definite covariance matrices with diagonals at most (1, 1, . . . , 1, Λ) for all 1 ≤ ≤ L: The following lemmas can be easily generalized for Gaussian random variables by following the corresponding lemmas in Reference [30] for discrete memoryless random variables.
Conditional Typicality Lemma: Let (X, Y) ∼ f (x, y) be jointly Gaussian random variables. Suppose that x ∈ T (n) (X) and Y ∼ f (y|x) = ∏ n i=1 f Y|X (y i |x i ). Then, for every > , Joint Typicality Lemma: Let (X, Y, Z) ∼ f (x, y, z) be jointly Gaussian random variables. If (x,ỹ) is a pair of arbitrary sequences andZ ∼ ∏ n i=1 f Z|X (z i |x i ) then there exists δ( ) > 0 that tends to zero as → 0 such that Assume that the legitimate user transmits message M. Then, the probability of error is upper bounded by the sum of the following L error events probabilities: By Lemma 1, we may assume we have a deterministic codebook that satisfies (33)-(37). Consider any state sequence s. By (33), with high probability (x(M), s) ∈ T (n) (X, S) where (X, S) are independent and EX 2 = 1, ES 2 ≤ Λ (33). Thus, by the conditional typicality lemma, for every > with high probability (x i , s, V) ∈ T (n) (X, S, V) where (X, S, V) are mutually independent and EV 2 = σ 2 . We first bound probability event P for 1 ≤ < L. Define the shorthand X = (XX 1 . . . X SV). Let V denote a finite set of Gaussian distributions of X that is -dense in the set of all Gaussian distributions of X with variances at most (1, 1, . . . , 1, Λ, σ 2 ). Note that the cardinality of V does not depend on n. We may upper bound P by We will show that 1 N ∑ N m=1 e X (m, s) → 0 for all Gaussian vectors X (whether or not they are in V ). We may restrict ourselves to X where (X, S, V) are mutually independent, where (44) holds since the legitimate transmitter, the jammer and the noise are independently generated, (45) follows from the assumptions for X , (46) corresponds to EX i (Y − X i ) = 0 following from (31) and the assumption that (m 1 , . . . , m ) ∈ S and (47) is obtained by (31) as follows: Now, suppose I(XV; SX 1 . . . X ) = 0. (52) and where (53) 2 and so we must have However, this contradicts the assumptions that X is independent from S, X 1 , . . . , X , V, since Therefore, the assumption in (52) is false and there exists η > 0 such that η ≤ I(XV; SX 1 . . . X ) = I(XV; X 1 . . . X |S).
Now, consider the following two cases.
Let Z = X + S + V − X 1 . From (46)-(47), we get Using this result in (70), we obtain e X (m, s) ≤ exp n R − C 1 meaning that e X (m, s) is exponentially vanishing if δ( ) is sufficiently small and the rate condition in (30) holds. Now, we consider error probability P L . Define X L = (XX 1 X 2 . . . X L SV). Let V L denote a finite -dense subset of Gaussian vectors X L with variances at most 1, 1, 1, . . . , 1, Λ, σ 2 . Thus, P L can be upper bounded by where (83) follows from (31). Since EX i X j = 0 for all i, j ∈ [L] and i = j, the covariance matrix of (S, X 1 , . . . , X L ) is equal to which has the determinant of ES 2 − L. This determinant should be positive since the covariance matrix Cov(S, X 1 , . . . , X L ) is positive definite. However, since ES 2 ≤ Λ, this assumption contradicts the assumption that Λ < L in (29). Thus, there exists η > 0 such that where we have used the fact that I(XV; S) = 0. Now, we may consider two cases R < min{I(X 1 ; S), . . . , I(X L ; S)} and R ≥ min{I(X 1 ; S), . . . , I(X L ; S)}. Therefore, using an identical argument as in the cases (a) and (b) for P , it follows that e X L is also exponentially vanishing.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Proof of Lemma 1
In order to prove (33), we use our proof in Reference [31] (Lemma 6) for one codebook. Moreover, to obtain (34)-(37), we apply the corresponding proof of these four equations in Reference [19] (Lemma 1) for Gaussian distributions. Note that Reference [19] focuses on discrete alphabets but the same proofs can be extended to Gaussian distributions by quantization of the set of continuous random variables in the following way.
Using the similar proof in Reference [19] (Lemma 1), we obtain for given x, s and covariance matrix Cov(X, X 1 , . . . , X , S) that the complement of each event in (34)-(37) happens with decreasingly doubly exponential probability for sufficiently large n meaning that Then, in order to complete the proof, since for any fixed ν the cardinality of finite set S n is only increasingly exponentially in n and the set V ( +2)×( +2) is finite along with the doubly decreasing exponential probabilities in (A1)-(A4), we derive that with probability approaching to 1, all inequalities in (34)-(37) hold simultaneously for sufficiently large n. Since these inequalities hold for every element in the finite sets S n and V ( +2)×( +2) , then for any vector s, x and any given covariance matrix Cov(X, X 1 , . . . , X , S) (with x 2 = n, s 2 ≤ nΛ) which is not in its corresponding ν-dense subset, there exists a point in the corresponding ν-dense subset that is close enough to it (in its ν distance neighborhood). Now, by using the continuity properties of all sets, we may conclude that (34)-(37) hold also for any point which is not in the ν-dense subset.