Abstract
We study the problem of communicating over a discrete memoryless two-way channel using non-adaptive schemes, under a zero probability of error criterion. We derive single-letter inner and outer bounds for the zero-error capacity region, based on random coding, linear programming, linear codes, and the asymptotic spectrum of graphs. Among other results, we provide a single-letter outer bound based on a combination of Shannon's vanishing-error capacity region and a two-way analogue of the linear programming bound for point-to-point channels, which, in contrast to the one-way case, is generally better than both. Moreover, we establish an outer bound for the zero-error capacity region of a two-way channel via the asymptotic spectrum of graphs, and show that this bound can be achieved in certain cases.
1. Introduction
The problem of reliable communication over a discrete memoryless two-way channel (DM-TWC) was originally introduced and investigated by Shannon [1] in a seminal paper that marked the inception of multi-user information theory. A DM-TWC is characterized by a quadruple of finite input and output alphabets $\mathcal{X}_A$, $\mathcal{X}_B$, $\mathcal{Y}_A$, $\mathcal{Y}_B$, and a conditional probability distribution $P(y_A, y_B \mid x_A, x_B)$, where $x_A \in \mathcal{X}_A$, $x_B \in \mathcal{X}_B$, $y_A \in \mathcal{Y}_A$, $y_B \in \mathcal{Y}_B$. The channel is memoryless in the sense that channel uses are independent, that is, for any i,
$$P\left(y_{A,i}, y_{B,i} \mid x_A^i, x_B^i, y_A^{i-1}, y_B^{i-1}\right) = P\left(y_{A,i}, y_{B,i} \mid x_{A,i}, x_{B,i}\right).$$
In [1], Shannon provided inner and outer bounds for the vanishing-error capacity region of the DM-TWC, in the general setting where the users are allowed to adapt their transmissions on the fly based on past observations. We note that Shannon's inner bound is tight for non-adaptive schemes, namely when the users map their messages to codewords in advance. The non-adaptive DM-TWC is also sometimes called the restricted DM-TWC [2]. Shannon's inner and outer bounds were later improved by utilizing auxiliary random variable techniques [3,4,5], and sufficient conditions under which his bounds coincide have been obtained [6,7]. However, despite much effort, the capacity region of a general DM-TWC under the vanishing-error criterion remains elusive. In fact, a strong indication of the inherent difficulty of the problem is given by Blackwell's binary multiplying channel, a simple, deterministic, common-output channel whose capacity region remains unknown to this day [4,5,8,9,10].
In yet another seminal work, Shannon proposed and studied the zero-error capacity of the point-to-point discrete memoryless channel [2], also known as the Shannon capacity of a graph. This problem has been extensively studied since, most notably in [11,12], yet remains generally unsolved. In this paper, we consider the problem of zero-error communication over a DM-TWC. We limit our discussion to the case of non-adaptive schemes, for which the capacity region is known in the vanishing-error case [1]. Despite the obvious difficulty of the problem (the point-to-point zero-error capacity is a special case), its two-way nature adds a new combinatorial dimension that renders it interesting to study. To the best of our knowledge, this problem has not been addressed before, except in the special case of the binary multiplying channel, where upper and lower bounds on the non-adaptive zero-error sum-capacity have been obtained [13,14,15]. Our bounds are partially based on generalizations of these ideas; an earlier short version of this work appeared in [16].
The problem of non-adaptive communication over a DM-TWC can be formulated as follows. Alice and Bob would like to simultaneously convey messages $m_A \in [M_A]$ and $m_B \in [M_B]$, respectively, to each other over $n$ uses of the DM-TWC. To that end, Alice maps her message to an input sequence (codeword) $x_A^n(m_A) \in \mathcal{X}_A^n$ using an encoding function $f_A : [M_A] \to \mathcal{X}_A^n$, and Bob maps his message to an input sequence (codeword) $x_B^n(m_B) \in \mathcal{X}_B^n$ using an encoding function $f_B : [M_B] \to \mathcal{X}_B^n$. We call the pair of codeword collections $(\mathcal{C}_A, \mathcal{C}_B)$ a codebook pair. Note that the encoding functions depend only on the messages, and not on the outputs observed during transmission, hence the name non-adaptive. When the transmissions end, Alice and Bob observe the resulting (random) channel outputs $y_A^n$ and $y_B^n$, respectively, and attempt to decode the message sent by their counterpart without error. When this is possible, that is, when there exist decoding functions $g_A$ and $g_B$ such that $g_A(m_A, y_A^n) = m_B$ and $g_B(m_B, y_B^n) = m_A$, for all message pairs, with probability one, the codebook pair (or the encoding functions) is called uniquely decodable. A rate pair $(R_A, R_B) = \left(\frac{1}{n}\log M_A, \frac{1}{n}\log M_B\right)$ is achievable for the DM-TWC if a uniquely decodable code exists for some n. The non-adaptive zero-error capacity region of a DM-TWC is the closure of the set of all achievable rate pairs, and is denoted here by $\mathcal{C}_0$. Moreover, the non-adaptive zero-error sum-capacity of a DM-TWC, denoted by $C_{0,\mathrm{sum}}$, is the supremum of the sum-rate $R_A + R_B$ taken over all achievable rate pairs.
The main objective of this paper is to provide several single-letter outer and inner bounds on the non-adaptive zero-error capacity region of the DM-TWC. The remainder of this paper is organized as follows. In Section 2, we provide some necessary mathematical preliminaries, discussing in particular the characterization of the zero-error DM-TWC capacity via confusion graphs, its behavior under graph homomorphisms, and one-shot zero-error communication. Section 3 is devoted to three general outer bounds on the zero-error capacity region of the DM-TWC, which are based on Shannon's vanishing-error non-adaptive capacity region, a two-way analogue of the linear programming bound for point-to-point channels, and the Shannon capacity of a graph. In Section 4, we provide two general inner bounds, using random coding and random linear codes, respectively. In Section 5, we establish outer bounds for certain types of DM-TWCs via the asymptotic spectra of graphs, and also explicitly construct uniquely decodable codebook pairs achieving the outer bound. Some concluding remarks appear in Section 6.
2. Preliminaries
2.1. Shannon Capacity of a Graph
Let $G = (V, E)$ be a graph with vertex set V and edge set E. Two vertices $u, v \in V$ are adjacent, denoted as $u \sim v$, if there is an edge between $u$ and $v$, that is, $\{u, v\} \in E$. An independent set in G is a subset of pairwise non-adjacent vertices. A maximum independent set is an independent set with the largest possible number of vertices. The size of a maximum independent set in G is called the independence number of G, denoted by $\alpha(G)$. The complement of a graph G, denoted by $\overline{G}$, is a graph with the same vertex set, where two distinct vertices of $\overline{G}$ are adjacent if and only if they are not adjacent in G. We write $K_n$ and $\overline{K_n}$ for the complete graph (containing all possible edges) and the empty graph (containing no edges) over n vertices, respectively.
Let $G$ and $H$ be two graphs. The strong product (or normal product) of the graphs G and H is the graph $G \boxtimes H$ such that
- (1) the vertex set of $G \boxtimes H$ is the Cartesian product $V(G) \times V(H)$;
- (2) two vertices $(g, h)$ and $(g', h')$ are adjacent if and only if one of the following holds: (a) $g = g'$ and $h \sim h'$; (b) $g \sim g'$ and $h = h'$; (c) $g \sim g'$ and $h \sim h'$.
The n-fold strong product of the graph G with itself is denoted by $G^{\boxtimes n}$. The Shannon capacity of the graph G was defined in [2] to be
$$\Theta(G) \triangleq \lim_{n \to \infty} \frac{1}{n} \log \alpha\left(G^{\boxtimes n}\right) = \sup_{n} \frac{1}{n} \log \alpha\left(G^{\boxtimes n}\right),$$
where the limit exists and equals the supremum by Fekete's lemma. We note that throughout the paper all logarithms are taken to base 2.
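As a concrete sanity check (our illustration, not part of the paper), the following Python snippet builds a strong product with networkx and verifies that $\alpha(C_5) = 2$ while $\alpha(C_5 \boxtimes C_5) = 5$, which already yields the classical lower bound $\Theta(C_5) \geq \frac{1}{2}\log 5$:

```python
import itertools
import networkx as nx

def strong_product(G, H):
    """Strong product G x H: two distinct vertex pairs are adjacent iff
    each coordinate is equal or adjacent in the respective factor."""
    P = nx.Graph()
    P.add_nodes_from(itertools.product(G.nodes, H.nodes))
    for (g, h), (g2, h2) in itertools.combinations(list(P.nodes), 2):
        if (g == g2 or G.has_edge(g, g2)) and (h == h2 or H.has_edge(h, h2)):
            P.add_edge((g, h), (g2, h2))
    return P

def independence_number(G):
    # alpha(G) equals the clique number of the complement; fine for tiny graphs.
    return max(len(c) for c in nx.find_cliques(nx.complement(G)))

C5 = nx.cycle_graph(5)  # the pentagon
print(independence_number(C5))                      # 2
print(independence_number(strong_product(C5, C5)))  # 5
```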
The disjoint union of the graphs G and H is the graph $G \sqcup H$ such that $V(G \sqcup H) = V(G) \sqcup V(H)$ and $E(G \sqcup H) = E(G) \sqcup E(H)$. A graph homomorphism from G to H, denoted by $\phi : G \to H$, is a mapping $\phi : V(G) \to V(H)$ such that if $u \sim v$ in G, then $\phi(u) \sim \phi(v)$ in H. We write $G \leq H$ if there exists a graph homomorphism from the complement of G to the complement of H.
In [17], Zuiddam introduced the notion of the asymptotic spectrum of graphs, and provided a dual characterization of the Shannon capacity of a graph by applying Strassen's theory of asymptotic spectra; the Lovász theta number [12], the fractional clique cover number, the complement of the fractional orthogonal rank [18], and the fractional Haemers bound over any field [11,19,20] are specific elements of the asymptotic spectrum (also called spectral points).
Theorem 1
([17]). Let $\mathcal{G}$ be a collection of graphs that is closed under the disjoint union ⊔ and the strong product ⊠, and that also contains the single-vertex graph $K_1$. Define the asymptotic spectrum $\Delta(\mathcal{G})$ as the set of all mappings $F : \mathcal{G} \to \mathbb{R}_{\geq 0}$ such that for all $G, H \in \mathcal{G}$:
- (1) if $G \leq H$, then $F(G) \leq F(H)$;
- (2) $F(G \sqcup H) = F(G) + F(H)$;
- (3) $F(G \boxtimes H) = F(G) \cdot F(H)$;
- (4) $F(K_1) = 1$.
Then, $\Theta(G) = \min_{F \in \Delta(\mathcal{G})} \log F(G)$. In other words, $\Theta(G) \leq \log F(G)$ for every $F \in \Delta(\mathcal{G})$, and the minimum is attained by some spectral point.
As remarked in [17], the Shannon capacity itself (more precisely, the mapping $G \mapsto 2^{\Theta(G)}$) is in general not an element of $\Delta(\mathcal{G})$. In fact, it is not additive under ⊔ by a result of Alon [21], and also not multiplicative under ⊠ by a result of Haemers [11]. In Section 3.3, to derive an outer bound for the zero-error capacity of a DM-TWC, we will employ the multiplicativity of $F$ under the ⊠ operation for $F \in \Delta(\mathcal{G})$.
2.2. Confusion Graphs of Channels
In this subsection, we characterize the zero-error capacity of a discrete memoryless point-to-point channel, as well as the zero-error capacity region of a DM-TWC, in terms of suitably defined graphs. The point-to-point characterization is well known and goes back to Shannon [2], and the DM-TWC case is a natural generalization thereof.
A discrete memoryless point-to-point channel consists of a finite input alphabet $\mathcal{X}$, a finite output alphabet $\mathcal{Y}$, and a conditional probability distribution $W(y \mid x)$, where $x \in \mathcal{X}$, $y \in \mathcal{Y}$. The channel is memoryless in the sense that $W(y^n \mid x^n) = \prod_{i=1}^{n} W(y_i \mid x_i)$. Suppose that a transmitter would like to convey a message $m \in [M]$ to a receiver over the channel. To that end, the transmitter sends an input sequence $x^n(m)$ using an encoding function $f : [M] \to \mathcal{X}^n$, and the receiver, after observing the corresponding channel outputs $y^n$, guesses the message using a decoding function $g : \mathcal{Y}^n \to [M]$. This pair $(f, g)$ is called an $(M, n)$ code, and such a code is called uniquely decodable if $g(y^n) = m$ holds for any $m$ and any correspondingly possible $y^n$. A rate R is called achievable if a uniquely decodable $(2^{nR}, n)$ code exists for some n. The zero-error capacity of the channel is defined as the supremum of all achievable rates.
A channel W is associated with a confusion graph G, whose vertex set is the input alphabet $\mathcal{X}$, and where two vertices $x, x'$ are adjacent, denoted as $x \sim x'$, if and only if there exists an output that is possible under both of them, that is, some $y \in \mathcal{Y}$ such that $W(y \mid x) > 0$ and $W(y \mid x') > 0$. It is easy to verify that an $(M, n)$ code is uniquely decodable if and only if its codebook is an independent set of the graph $G^{\boxtimes n}$, the n-fold strong product of the graph G. Consequently, the zero-error capacity of a point-to-point channel is equal to the Shannon capacity of its confusion graph G. Note that there are infinitely many distinct channels with the same confusion graph, and all of these channels have the same zero-error capacity.
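The construction of a confusion graph is mechanical; here is an illustrative helper (the dict-based channel representation W[x][y] = P(y|x) is our own choice), applied to the noisy typewriter channel, whose confusion graph is the pentagon:

```python
import networkx as nx

def confusion_graph(W):
    """Confusion graph of a point-to-point channel W[x][y] = P(y|x):
    two inputs are adjacent iff some output is possible under both."""
    G = nx.Graph()
    G.add_nodes_from(W)
    xs = list(W)
    for i, x in enumerate(xs):
        for x2 in xs[i + 1:]:
            if any(W[x].get(y, 0) > 0 and W[x2].get(y, 0) > 0 for y in W[x]):
                G.add_edge(x, x2)
    return G

# Noisy typewriter: input x yields x or x+1 (mod 5), each with probability 1/2.
W = {x: {x: 0.5, (x + 1) % 5: 0.5} for x in range(5)}
print(sorted(confusion_graph(W).edges()))  # [(0, 1), (0, 4), (1, 2), (2, 3), (3, 4)]
```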
We now proceed to similarly associate a DM-TWC with a collection of confusion graphs, which will then be shown to characterize its zero-error capacity region. To that end, note that when Alice sends a letter $a \in \mathcal{X}_A$, the resulting channel from Bob back to Alice at that same instant is the point-to-point channel $P(y_A \mid a, x_B)$. This channel is associated with a confusion graph $G_a$, whose vertex set is $\mathcal{X}_B$ and where two vertices $x_B, x'_B$ are adjacent, denoted in this case by $x_B \sim_a x'_B$, if and only if there exists some $y_A \in \mathcal{Y}_A$ such that both $P(y_A \mid a, x_B) > 0$ and $P(y_A \mid a, x'_B) > 0$, where
$$P(y_A \mid x_A, x_B) = \sum_{y_B \in \mathcal{Y}_B} P(y_A, y_B \mid x_A, x_B).$$
Symmetrically, when Bob sends a letter $b \in \mathcal{X}_B$, the resulting channel from Alice to Bob at that same instant is associated with a confusion graph $H_b$, whose vertex set is $\mathcal{X}_A$, and where two vertices $x_A, x'_A$ are adjacent, denoted in this case by $x_A \sim_b x'_A$, if and only if there exists some $y_B \in \mathcal{Y}_B$ such that both $P(y_B \mid x_A, b) > 0$ and $P(y_B \mid x'_A, b) > 0$, where
$$P(y_B \mid x_A, x_B) = \sum_{y_A \in \mathcal{Y}_A} P(y_A, y_B \mid x_A, x_B).$$
Based on the foregoing discussion, a DM-TWC can be decomposed into a collection of discrete memoryless point-to-point channels, and is hence associated with a corresponding collection of confusion graphs, denoted by $\mathcal{G} = \left(\{G_a\}_{a \in \mathcal{X}_A}, \{H_b\}_{b \in \mathcal{X}_B}\right)$, where $V(G_a) = \mathcal{X}_B$ and $V(H_b) = \mathcal{X}_A$. The following useful observation is immediate, and in particular shows that the zero-error capacity region of a DM-TWC is a function of its confusion graphs only. Thus, from here on, we will sometimes identify the channel with its collection of confusion graphs.
Proposition 1.
Consider a DM-TWC associated with the collection of confusion graphs $\mathcal{G} = (\{G_a\}_{a \in \mathcal{X}_A}, \{H_b\}_{b \in \mathcal{X}_B})$. A codebook pair $(\mathcal{C}_A, \mathcal{C}_B)$ is uniquely decodable for the channel if and only if for any $x_A^n \in \mathcal{C}_A$ and $x_B^n \in \mathcal{C}_B$, it holds that $\mathcal{C}_B$ is an independent set of $G_{x_{A,1}} \boxtimes \cdots \boxtimes G_{x_{A,n}}$, and $\mathcal{C}_A$ is an independent set of $H_{x_{B,1}} \boxtimes \cdots \boxtimes H_{x_{B,n}}$.
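Proposition 1 can be tested directly by brute force. The sketch below (using our illustrative dict-of-graphs representation, with the reconstructed notation $G_a$, $H_b$) checks whether a codebook pair is uniquely decodable:

```python
from itertools import combinations

def collide(graphs, steer, w, w2):
    """True iff w and w2 are equal-or-adjacent in every coordinate of the
    strong product of the graphs selected coordinate-wise by `steer`."""
    return all(u == v or graphs[s].has_edge(u, v)
               for s, u, v in zip(steer, w, w2))

def uniquely_decodable(CA, CB, G, H):
    """Brute-force rendering of Proposition 1. G[a] is the confusion graph on
    Bob's alphabet when Alice sends a; H[b] is the symmetric counterpart."""
    return (all(not collide(G, xa, u, v)
                for xa in CA for u, v in combinations(CB, 2)) and
            all(not collide(H, xb, u, v)
                for xb in CB for u, v in combinations(CA, 2)))
```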
In particular, we see that the capacity region depends only on the corresponding confusion graphs $\mathcal{G}$. Hence, in the sequel, we will write $\mathcal{C}_0(\mathcal{G})$ and $C_{0,\mathrm{sum}}(\mathcal{G})$ to represent the capacity region and the sum-capacity, respectively. We will also often identify the channel with its confusion graphs, and refer to it as $\mathcal{G}$ when it is clear from the context. This also leads to the following immediate observation, analogous to the point-to-point case.
Proposition 2.
If two DM-TWCs have the same confusion graphs up to some relabeling of the input symbols, then their zero-error capacity regions coincide.
This further immediately implies:
Proposition 3.
$\mathcal{C}_0$ depends only on the conditional marginal distributions $P(y_A \mid x_A, x_B)$ and $P(y_B \mid x_A, x_B)$.
The strong product of two DM-TWCs $\mathcal{G}_1$ and $\mathcal{G}_2$, denoted by $\mathcal{G}_1 \boxtimes \mathcal{G}_2$, refers to a DM-TWC having input alphabets $\mathcal{X}_{A,1} \times \mathcal{X}_{A,2}$ and $\mathcal{X}_{B,1} \times \mathcal{X}_{B,2}$, as well as confusion graphs
$$G_{(a_1, a_2)} = G_{a_1} \boxtimes G_{a_2}, \qquad H_{(b_1, b_2)} = H_{b_1} \boxtimes H_{b_2}.$$
Considering the zero-error sum-capacity with respect to the strong product, we have the lemma below.
Lemma 1.
$C_{0,\mathrm{sum}}(\mathcal{G}_1 \boxtimes \mathcal{G}_2) \geq C_{0,\mathrm{sum}}(\mathcal{G}_1) + C_{0,\mathrm{sum}}(\mathcal{G}_2)$.
Proof.
To prove this lemma, it suffices to show that for any uniquely decodable codebook pairs for the channels $\mathcal{G}_1$ and $\mathcal{G}_2$, of sizes $(M_{A,1}, M_{B,1})$ and $(M_{A,2}, M_{B,2})$ respectively, there exists an $(M_{A,1} M_{A,2}, M_{B,1} M_{B,2})$ uniquely decodable codebook pair for the associated product channel $\mathcal{G}_1 \boxtimes \mathcal{G}_2$. To that end, assuming without loss of generality a common blocklength, let $\mathcal{C}_A$ and $\mathcal{C}_B$ consist of all symbol-wise pairings of codewords from the two constituent codebooks.
It is easy to verify that $(\mathcal{C}_A, \mathcal{C}_B)$ is uniquely decodable for the product channel. Moreover, $|\mathcal{C}_A| = M_{A,1} M_{A,2}$ and $|\mathcal{C}_B| = M_{B,1} M_{B,2}$. The lemma follows. □
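The codebook combination used in the proof is simply a symbol-wise pairing; a minimal sketch, assuming a common blocklength for the two constituent codebooks:

```python
def product_codebook(C1, C2):
    """Pair equal-length codewords symbol-wise: one use of the product
    channel corresponds to one use of each constituent channel."""
    return [tuple(zip(w1, w2)) for w1 in C1 for w2 in C2]

CA = product_codebook([(0, 1), (1, 0)], [(2, 2), (3, 2)])
print(len(CA), CA[0])  # 4 ((0, 2), (1, 2))
```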
2.3. Dual Graph Homomorphisms
In this subsection, we study the behavior of the zero-error capacity region of a DM-TWC under graph homomorphisms, generalizing a similar analysis from the point-to-point channel case [2]. Let $\mathcal{G} = (\{G_a\}_{a \in \mathcal{X}_A}, \{H_b\}_{b \in \mathcal{X}_B})$ and $\mathcal{G}' = (\{G'_{a'}\}_{a' \in \mathcal{X}'_A}, \{H'_{b'}\}_{b' \in \mathcal{X}'_B})$ be two collections of confusion graphs corresponding to two DM-TWCs. A dual graph homomorphism from $\mathcal{G}$ to $\mathcal{G}'$ is a pair of mappings $(\phi, \psi)$, where $\phi : \mathcal{X}_A \to \mathcal{X}'_A$ and $\psi : \mathcal{X}_B \to \mathcal{X}'_B$, such that
- (1) if $x_B \sim x'_B$ in $\overline{G_a}$, then $\psi(x_B) \sim \psi(x'_B)$ in $\overline{G'_{\phi(a)}}$; and
- (2) if $x_A \sim x'_A$ in $\overline{H_b}$, then $\phi(x_A) \sim \phi(x'_A)$ in $\overline{H'_{\psi(b)}}$.
It is easy to see that the dual graph homomorphism is a natural generalization of the standard graph homomorphism of two graphs, in the sense that both are adjacency preserving. We write $\mathcal{G} \leq \mathcal{G}'$ if there exists a dual graph homomorphism from $\mathcal{G}$ to $\mathcal{G}'$. Then:
Lemma 2.
If $\mathcal{G} \leq \mathcal{G}'$, and the graphs in $\mathcal{G}$ and $\mathcal{G}'$ do not have self-loops, then $\mathcal{C}_0(\mathcal{G}) \subseteq \mathcal{C}_0(\mathcal{G}')$.
Proof.
Suppose $\mathcal{G} \leq \mathcal{G}'$ via a dual graph homomorphism $(\phi, \psi)$, and suppose $(\mathcal{C}_A, \mathcal{C}_B)$ is a uniquely decodable codebook pair of length n for the DM-TWC $\mathcal{G}$. Let $\mathcal{C}'_A = \{\phi(x_A^n) : x_A^n \in \mathcal{C}_A\}$ and $\mathcal{C}'_B = \{\psi(x_B^n) : x_B^n \in \mathcal{C}_B\}$, where $\phi$ and $\psi$ are applied coordinate-wise.
We now show that $(\mathcal{C}'_A, \mathcal{C}'_B)$ is a uniquely decodable codebook pair for the DM-TWC $\mathcal{G}'$. To that end, it suffices to show that for any distinct $x_A^n, \bar{x}_A^n \in \mathcal{C}_A$ and distinct $x_B^n, \bar{x}_B^n \in \mathcal{C}_B$, we have
Indeed, since $(\mathcal{C}_A, \mathcal{C}_B)$ is a uniquely decodable codebook pair, there exist coordinates $i, j$ such that $x_{B,i} \sim \bar{x}_{B,i}$ in $\overline{G_{x_{A,i}}}$ and $x_{A,j} \sim \bar{x}_{A,j}$ in $\overline{H_{x_{B,j}}}$. By the definition of $(\phi, \psi)$, we have that $\psi(x_{B,i}) \sim \psi(\bar{x}_{B,i})$ in $\overline{G'_{\phi(x_{A,i})}}$ and $\phi(x_{A,j}) \sim \phi(\bar{x}_{A,j})$ in $\overline{H'_{\psi(x_{B,j})}}$, implying (1). It is also evident that $|\mathcal{C}'_A| = |\mathcal{C}_A|$ and $|\mathcal{C}'_B| = |\mathcal{C}_B|$. The lemma now follows by taking the union over all uniquely decodable codebook pairs for $\mathcal{G}$. □
2.4. One-Shot Zero-Error Communication
In this subsection, we consider the problem of zero-error communication over a DM-TWC with only a single channel use by the two parties (i.e., $n = 1$). We refer to the associated set of achievable rate pairs as the one-shot zero-error capacity region, and to the associated sum-rate as the one-shot zero-error sum-capacity. Recall that the one-shot zero-error capacity of a point-to-point channel is simply the logarithm of the independence number of its confusion graph; this quantity yields a lower bound on the zero-error capacity of the channel, and also provides an infinite-letter expression for the capacity when evaluated over the product graph. It is therefore interesting to study the analogue of the independence number in the two-way case, which in particular yields an inner bound on the zero-error capacity region of the DM-TWC. For simplicity of exposition, we will focus here on the one-shot zero-error sum-capacity only.
For convenience, we first define some notions. Let $\mathcal{G}$ be a DM-TWC with confusion graphs $\{G_a\}_{a \in \mathcal{X}_A}$ and $\{H_b\}_{b \in \mathcal{X}_B}$. A pair of subsets $S \subseteq \mathcal{X}_A$ and $T \subseteq \mathcal{X}_B$ is called a dual clique pair of the DM-TWC if S is a clique in each $H_b$ for $b \in T$, and T is a clique in each $G_a$ for $a \in S$. A pair of subsets $S \subseteq \mathcal{X}_A$ and $T \subseteq \mathcal{X}_B$ is called a dual independent pair of the DM-TWC if T is an independent set of the graph $G_a$ for each $a \in S$, and S is an independent set of the graph $H_b$ for each $b \in T$. A maximum dual independent pair is a dual independent pair with the largest possible product of sizes $|S| \cdot |T|$. This product is called the independence product of $\mathcal{G}$, denoted by $\alpha(\mathcal{G})$. According to the definition, the one-shot zero-error sum-capacity of the DM-TWC is $\log \alpha(\mathcal{G})$. It is also readily seen that if two channels have the same confusion graphs up to some relabeling of input symbols, then they have the same collections of dual clique pairs and dual independent pairs, and hence the same one-shot zero-error sum-capacity.
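For tiny alphabets, the independence product can be computed by exhaustive search. An illustrative sketch (the representation is ours), evaluated on Blackwell's binary multiplying channel, where the independence product is 2 and the one-shot zero-error sum-capacity is therefore 1 bit:

```python
from itertools import chain, combinations
import networkx as nx

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def independence_product(XA, XB, G, H):
    """Largest |S|*|T| over dual independent pairs (S, T)."""
    best = 0
    for S in powerset(XA):
        for T in powerset(XB):
            if all(not G[a].has_edge(u, v)
                   for a in S for u, v in combinations(T, 2)) and \
               all(not H[b].has_edge(u, v)
                   for b in T for u, v in combinations(S, 2)):
                best = max(best, len(S) * len(T))
    return best

def make_graph(nodes, edges):
    g = nx.Graph()
    g.add_nodes_from(nodes)
    g.add_edges_from(edges)
    return g

# Blackwell's multiplying channel (common output y = x_A * x_B): sending 0
# renders the counterpart's symbols indistinguishable, sending 1 reveals them.
G = {0: make_graph([0, 1], [(0, 1)]), 1: make_graph([0, 1], [])}
H = {0: make_graph([0, 1], [(0, 1)]), 1: make_graph([0, 1], [])}
print(independence_product([0, 1], [0, 1], G, H))  # 2, e.g. S = {0, 1}, T = {1}
```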
For two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, let $G_1 \cup G_2$ be the union of $G_1$ and $G_2$, such that $V(G_1 \cup G_2) = V_1 \cup V_2$ and $E(G_1 \cup G_2) = E_1 \cup E_2$. Notice that the graph disjoint union ⊔ in Section 2.1 is the special case of the union ∪ in which the vertex sets of $G_1$ and $G_2$ are disjoint. For notational convenience, in the rest of this subsection we let $G = \bigcup_{a \in \mathcal{X}_A} G_a$ and $H = \bigcup_{b \in \mathcal{X}_B} H_b$. The following simple observations are now in order.
Proposition 4.
Suppose $(S, T)$ is a dual independent pair of $\mathcal{G}$. Then:
- (1)
- If , then . The equality holds by taking and T be a maximum independent set of , where .
- (2)
- .
- (3)
- S is an independent set of .
Proof.
The results follow directly from the definition of dual independent pairs. □
Lemma 3.
Let $\mathcal{G}$ be a DM-TWC and let G, H be graphs such that $G = \bigcup_{a \in \mathcal{X}_A} G_a$ and $H = \bigcup_{b \in \mathcal{X}_B} H_b$. Then:
- (1)
- (2)
- (3)
- (4)
- .
Proof.
(1) The lower bound follows from Proposition 4 (1) and the symmetry of S and T. From Proposition 4 (2), we have
yielding the upper bound.
(2) From claim (1) above, we have . The equality holds by taking S and T as the maximum independent sets of H and G respectively.
(3) From claims (1) and (2) above, we have
On the other hand, suppose is a dual independent pair. We have the following three cases: (i) If then by Proposition 4, claim (1), we have . (ii) If , similar to case (i), we have . (iii) If and , then by Proposition 4, claim (2), we obtain . Thus .
(4) is a direct consequence of claim (1) above. The lemma follows. □
By graph homomorphisms we immediately have:
Proposition 5.
If , then
Next, we shall provide an upper bound for $\alpha(\mathcal{G})$ via a generalization of the Lovász theta number [12]. Let $M$ be an arbitrary positive semi-definite matrix (i.e., $M \succeq 0$), and let $M_{ij}$ be its $(i, j)$th entry. Let J be the all-one matrix, and let I be the identity matrix. For any matrices A and B of the same dimensions, denote $\langle A, B \rangle = \sum_{i,j} A_{ij} B_{ij}$, and denote by $A^T$ the transpose of the matrix A. Now define $\vartheta(\mathcal{G})$ as the value of the following optimization program (2):
Lemma 4.
$\alpha(\mathcal{G}) \leq \vartheta(\mathcal{G})$.
Proof.
Suppose with , is a maximum dual independent pair such that . For a number m and a set S, denote . Let be an matrix such that
Notice that for any vector we have
This shows that is a positive semi-definite matrix satisfying the equality constraints in (2). Accordingly, is a feasible solution for program (2) and
implying the result. This completes the proof. □
2.5. Information-Theoretic Notations
We recall some standard information-theoretic quantities that will be used in the sequel. Let $X, Y$ be two discrete random variables taking values in the sets $\mathcal{X}, \mathcal{Y}$ according to a joint probability distribution $P_{XY}$. Let $P_X$ denote the marginal probability distribution of X, where $P_X(x) = \sum_{y \in \mathcal{Y}} P_{XY}(x, y)$, and let $P_Y$ be the marginal probability distribution of Y, defined similarly. The Shannon entropy of X is denoted by $H(X) = -\sum_{x \in \mathcal{X}} P_X(x) \log P_X(x)$, with the convention $0 \log 0 = 0$. In particular, the binary entropy function is written as $h(p) = -p \log p - (1 - p) \log(1 - p)$, where $p \in [0, 1]$. The conditional entropy of X given Y is written as $H(X \mid Y) = -\sum_{x, y} P_{XY}(x, y) \log P_{X \mid Y}(x \mid y)$. The mutual information between X and Y is $I(X; Y) = H(X) - H(X \mid Y)$. The conditional mutual information of $(X; Y)$ given another random variable Z is $I(X; Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z)$. The following basic properties will be used in the arguments afterwards.
Proposition 6.
(1) $H(X) \geq 0$. (Non-negativity)
(2) $H(X \mid Y) \leq H(X)$ and $H(X \mid Y, Z) \leq H(X \mid Z)$. (Conditioning reduces entropy)
(3) $H(X_1, \dots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_1, \dots, X_{i-1})$. (Entropy chain rule)
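These quantities are easy to evaluate numerically; a small self-contained sketch (ours, not the paper's) for a joint pmf given as a dict:

```python
import math

def entropy(p):
    """Shannon entropy (base 2) of a pmf {outcome: probability}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def mutual_information(pxy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint pmf {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), v in pxy.items():
        px[x] = px.get(x, 0) + v
        py[y] = py.get(y, 0) + v
    return entropy(px) + entropy(py) - entropy(pxy)

# Doubly symmetric binary source with crossover 0.1: I(X;Y) = 1 - h(0.1).
pxy = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
print(mutual_information(pxy))  # ~0.531
```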
3. Outer Bounds
In this section, we provide single-letter outer bounds for the non-adaptive zero-error capacity region of the DM-TWC. First in Section 3.1, we present two simple outer bounds, one based on Shannon’s vanishing-error non-adaptive capacity region and the other on a two-way analogue of the linear programming bound for point-to-point channels. Next in Section 3.2, we combine the two bounds given in Section 3.1 and obtain an outer bound that is generally better than both. Finally, in Section 3.3 we derive another single-letter outer bound via the asymptotic spectra of graphs.
3.1. Simple Bounds
It is trivial to see that Shannon’s vanishing-error non-adaptive capacity region of the DM-TWC ([1], Theorem 3) contains its zero-error counterpart. First recall Shannon’s bound in [1].
Lemma 5
([1]). The vanishing-error non-adaptive capacity region of a DM-TWC is the convex hull of the set
$$\bigcup \left\{ (R_A, R_B) : R_A \leq I(X_A; Y_B \mid X_B),\ R_B \leq I(X_B; Y_A \mid X_A) \right\},$$
where the union is taken over all product input probability distributions $P_{X_A} P_{X_B}$.
Together with Proposition 2, this immediately yields the following outer bound.
Lemma 6.
is contained in
where
The first intersection is taken over all DM-TWCs with the same adjacency as the given channel, and the maximum is taken over all product input probability distributions.
Remark 1.
We now proceed to obtain a combinatorial outer bound. Recall that a dual clique pair of a DM-TWC is a pair of subsets $S \subseteq \mathcal{X}_A$ and $T \subseteq \mathcal{X}_B$ such that S is a clique in each $H_b$ for $b \in T$, and T is a clique in each $G_a$ for $a \in S$. In the sequel, we adopt the convention that .
Lemma 7.
Proof.
Let be a uniquely decodable codebook pair of length n. We will show that:
by induction on n, where the leading constant is independent of n.
Indeed, for the base case , one could take subsets , such that for any distinct and distinct , we have and . Clearly, and (7) follows by taking sufficiently large.
Assume that (7) holds for every length smaller than n, and let us proceed to prove it for length n. Suppose is a uniquely decodable codebook pair of length n. For a vector , let denote its projection onto all coordinates other than i. For each coordinate and each , , let
be the projections of each codebook obtained by fixing the ith coordinate. Define the distributions induced by these projections over and respectively to be
Furthermore, for any two subsets and , define the codebooks induced by the unions over S and T of the respective projected codebooks, to be
Note that if is a dual clique pair such that and , then the unions in (10) are disjoint, as otherwise this would contradict the assumption that is uniquely decodable. Hence
and also, for any it must hold that is a uniquely decodable codebook pair of length . Combining (8), (9) and (11) gives
By the inductive hypothesis, we obtain
where the second inequality follows from the definition of in (6). This completes the proof. □
The following is a trivial corollary of Lemmas 6 and 7.
Corollary 1.
is contained in
where
3.2. An Improved Bound
We now provide a single-letter outer bound, in which the order of the minimum and the maximum in (14) is swapped. This generally yields a tighter outer bound due to the max–min inequality. In fact, our bound can be seen as a generalization of the one obtained by Holzman and Körner for the binary multiplying channel [13], in which case the max–min is indeed strictly tighter than the min–max.
Theorem 2.
is contained in
where
The first intersection is taken over all DM-TWCs with the same adjacency as the given channel, and the maximum is taken over all product input probability distributions.
Proof.
The intersection over all follows from Proposition 2. Hence without loss of generality, we prove that for , each achievable rate pair satisfies , where .
To that end, for each uniquely decodable codebook pair of length n, we will show that:
by induction on n, where the leading constant is independent of n. The base case $n = 1$ follows in the same way as the base case in the proof of Lemma 7. Assume that (17) holds for all lengths smaller than n, and let us prove it also holds for length n. Suppose that is a uniquely decodable codebook pair of length n. Following the same steps (8)–(12) as in the argument of Lemma 7, we also have:
Now, if there exists a dual clique pair and a coordinate such that
then (18) implies
where the inequality follows from the inductive hypothesis and (19). Therefore, we conclude that (17) holds under condition (19).
Assume now that condition (19) is not satisfied, that is,
Let and be codewords chosen from and respectively, uniformly at random, and let , be the corresponding channel outputs. Since is a uniquely decodable codebook pair of length n, it must be that:
On the other hand, we have:
where (23) follows from the entropy chain rule and the memorylessness of the channel, and (24) follows from the fact that conditioning reduces entropy. Similarly,
Combining (20)–(26), we obtain
where and are defined in (4) and (6), respectively, and the maximum is taken over all product input probability distributions such that , following condition (20). This completes the proof. □
We remark that Theorem 2 immediately implies, in particular, the following upper bound on the zero-error capacity of the point-to-point discrete memoryless channel.
Corollary 2.
The zero-error capacity of the discrete memoryless channel is upper bounded by
The outer minimum is taken over all channels having the same confusion graph as the given channel, the outer maximum is taken over all input distributions, and the inner maximum is taken over all cliques C of the confusion graph of the channel.
As it turns out, the upper bound in Corollary 2 coincides with the linear programming bound on the zero-error capacity of a point-to-point discrete memoryless channel in [2]. Namely,
for any point-to-point discrete memoryless channel . This fact was originally conjectured by Shannon [2] and later proved by Ahlswede [22]. In other words, this means that in the point-to-point case, Corollary 1 yields exactly the same bound as Theorem 2. However, this is not the case in general for the DM-TWC. For example, recall that Holzman and Körner [13] derived the bound in Theorem 2 in the special case of the (deterministic) binary multiplying channel (using ) and numerically showed that it is strictly better than what can be obtained from Corollary 1. Next we give another example showing that Theorem 2 outperforms Corollary 1 for a noisy (i.e., non-deterministic) DM-TWC as well.
Example 1.
Let , and the conditional probability distribution be given by the following table, where $\epsilon \in (0, 1)$:

|  | 00 | 01 | 10 | 11 | 20 | 21 |
|---|---|---|---|---|---|---|
| 00 | 1 | 1 | 0 | 0 | 0 | 0 |
| 01 | 0 | 0 | 0 | 0 | 1 | 0 |
| 10 | 0 | 0 | 0 | $\epsilon$ | 0 | 1 |
| 11 | 0 | 0 | 1 | $1 - \epsilon$ | 0 | 0 |

Corollary 1 gives the upper bound
where
and . In contrast, Theorem 2 yields a tighter upper bound of
3.3. An Outer Bound via Shannon Capacity of a Graph
Based on Lemma 3 and the Shannon capacity of a graph, we immediately have the following bound.
Lemma 8.
It is worth noting that the above bound can be tight, in the sense that when all and , it is easily verified that . However, the bound in Lemma 8 is not tight in general. Later in Section 5, we will improve the bound of Lemma 8 for certain scenarios and show that the improved bound (Theorem 5) can outperform Theorem 2 (see Example 3), and can be achieved in special cases (see Theorem 7).
4. Inner Bounds
In this section, we present two inner bounds for the non-adaptive zero-error capacity region of the DM-TWC, one based on random coding and the other on linear codes.
4.1. Random Coding
The random coding argument for the DM-TWC is standard and generalizes a known bound of Shannon for the one-way case [2]. To obtain the random coding inner bound, we need the following lemma from [1].
Lemma 9
([1]). Let X be a random variable taking values in $\mathcal{X}$, and let $f_1, \dots, f_k$ be a collection of nonnegative functions on $\mathcal{X}$. Then there exists $x \in \mathcal{X}$ such that $f_i(x) \leq k \cdot \mathbb{E}[f_i(X)]$ for all $1 \leq i \leq k$.
Theorem 3.
contains the region:
where the union is taken over all input distributions $P_{X_A}$, $P_{X_B}$.
Proof.
We randomly draw a codebook pair such that (resp. ) consists of (resp. ) statistically independent words, where each word is generated i.i.d. according to a probability distribution (resp. ). A word is called bad if there exist two words that are either equal or adjacent in . For any particular words and coordinate , the probability that in is upper bounded by:
Since all the coordinates are independent, the probability that in is at most:
Denote by the number of 2-subsets such that in . Then,
where the first inequality is by Markov's inequality, and the second inequality follows from (33) and the linearity of expectation. Similarly, a word is called bad if there exist two words that are equal or adjacent in , and we have
Let , be the number of bad words in and respectively. Then, we have:
By Lemma 9, there exists a pair such that
Remove all the bad words in and respectively, yielding a codebook pair such that:
It is readily seen that is a uniquely decodable codebook pair. □
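The expurgation step of the proof is easy to imitate numerically. A toy run (our experiment, not the paper's) on Blackwell's multiplying channel: draw i.i.d. codebooks, delete every word under which two counterpart codewords are equal or adjacent, and the surviving pair is uniquely decodable by Proposition 1:

```python
import random
from itertools import combinations
import networkx as nx

def make_graph(nodes, edges):
    g = nx.Graph()
    g.add_nodes_from(nodes)
    g.add_edges_from(edges)
    return g

# Confusion graphs of Blackwell's multiplying channel.
G = {0: make_graph([0, 1], [(0, 1)]), 1: make_graph([0, 1], [])}
H = {0: make_graph([0, 1], [(0, 1)]), 1: make_graph([0, 1], [])}

def collide(graphs, steer, w, w2):
    return all(u == v or graphs[s].has_edge(u, v)
               for s, u, v in zip(steer, w, w2))

random.seed(1)
n, M = 8, 8
CA = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(M)]
CB = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(M)]

# Expurgate bad words with respect to the original counterpart codebooks.
CA2 = [xa for xa in CA
       if not any(collide(G, xa, u, v) for u, v in combinations(CB, 2))]
CB2 = [xb for xb in CB
       if not any(collide(H, xb, u, v) for u, v in combinations(CA, 2))]
print(len(CA2), len(CB2))  # the surviving pair is uniquely decodable
```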
4.2. Linear Codes
In this subsection, we present a construction of uniquely decodable codes via linear codes, which generalizes a known result for the binary multiplying channel [15]. Let us first introduce some notation. Suppose D is a set of letters, are vectors of length n, and is a collection of vectors of length n. Let:
denote the collection of indices where . For let denote the vector obtained from by projecting onto the coordinates in I, and denote
Let $\mathcal{G}$ be a DM-TWC. We say that $a \in \mathcal{X}_A$ is a detecting symbol if $x_B \not\sim x'_B$ in $G_a$ for any distinct $x_B, x'_B \in \mathcal{X}_B$, that is, if $G_a$ is edgeless. A detecting symbol $b \in \mathcal{X}_B$ is defined analogously. Let $D_A$ and $D_B$ denote the sets of all detecting symbols in $\mathcal{X}_A$ and $\mathcal{X}_B$, respectively. A vector $x_A^n \in \mathcal{X}_A^n$ is called a detecting vector for a codebook $\mathcal{C}_B \subseteq \mathcal{X}_B^n$ if any two distinct words of $\mathcal{C}_B$ have distinct projections onto the coordinate set $\{i : x_{A,i} \in D_A\}$.
Similarly, a vector $x_B^n \in \mathcal{X}_B^n$ is a detecting vector for $\mathcal{C}_A \subseteq \mathcal{X}_A^n$ if any two distinct words of $\mathcal{C}_A$ have distinct projections onto $\{i : x_{B,i} \in D_B\}$.
The following claim is immediate.
Proposition 7.
Let $\mathcal{C}_A \subseteq \mathcal{X}_A^n$, $\mathcal{C}_B \subseteq \mathcal{X}_B^n$. If each $x_A^n \in \mathcal{C}_A$ is a detecting vector for $\mathcal{C}_B$ and each $x_B^n \in \mathcal{C}_B$ is a detecting vector for $\mathcal{C}_A$, then $(\mathcal{C}_A, \mathcal{C}_B)$ is a uniquely decodable codebook pair.
Proposition 7 provides a sufficient condition for unique decodability, which is not necessary in general (see Example 2). Nevertheless, this sufficient condition furnishes us with a way of constructing uniquely decodable codes by employing linear codes.
Example 2.
Suppose that , such that , , and , , . Let and . It is easy to verify that is a uniquely decodable codebook pair. However, and , implying that is not a detecting vector for .
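Under our reading of the (partially reconstructed) definitions above, the sufficient condition of Proposition 7 can be verified mechanically; the following sketch and its toy input are illustrative assumptions, not the paper's code:

```python
def is_detecting_vector(vec, codebook, detecting):
    """vec detects `codebook` if any two codewords have distinct projections
    onto the coordinates where vec carries a detecting symbol."""
    idx = [i for i, s in enumerate(vec) if s in detecting]
    proj = [tuple(w[i] for i in idx) for w in codebook]
    return len(set(proj)) == len(codebook)

# In Blackwell's multiplying channel the symbol 1 is detecting: a user
# sending 1 observes the other user's input directly.
CB = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_detecting_vector((1, 1, 0), CB, {1}))  # True: projections 01, 10, 11
```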
Assume that $|\mathcal{X}_A| = q_A$ and $|\mathcal{X}_B| = q_B$, where $q_A, q_B$ are prime powers, and let us think of the alphabets as $\mathbb{F}_{q_A}$ and $\mathbb{F}_{q_B}$, respectively. The following theorem gives an inner bound on the capacity region, which is a generalization of Tolhuizen's construction for Blackwell's multiplying channel [15].
Theorem 4.
Let be a DM-TWC with input alphabet sizes , , where , are prime powers. If and contain and detecting symbols respectively, then contains the region
where $h(\cdot)$ is the binary entropy function.
To prove this theorem, we need the following lemma. The case where and was proved in ([15], Theorem 3); Lemma 10 follows from a similar argument.
Lemma 10.
Let q, be prime powers, be positive integers such that , and with cardinality . Then there exists a pair satisfying that:
- (1)
- is a q-ary linear code;
- (2)
- such that
- (3)
- for each , we have and .
Proof.
Let A be a matrix of full rank over ; then is the q-ary linear code generated by A. Recall that for every , as in (40). Denote:
Let denote the submatrix of A with columns indexed by . It is easy to see that is equivalent to . Denote:
and let us proceed by double counting the cardinality of .
On the one hand, the number of vectors such that is . For each such , there are corresponding matrices such that , where is the number of invertible matrices over , see ([15], Lemma 3). Hence, we have:
On the other hand, the number of matrices is . By (44) and the pigeonhole principle, there exist a matrix and a corresponding code such that . Letting , the lemma follows. □
Proof of Theorem 4.
For , let us identify with , and let the respective sets of all detecting symbols be with .
To prove the existence of a uniquely decodable codebook pair based on Proposition 7, we first use Lemma 10 to find two "one-sided" uniquely decodable linear codebook pairs, and then combine them into the desired codebook pair by employing their cosets in and .
First, letting , , and in Lemma 10, we have a pair satisfying that is a -ary linear code and such that
Similarly, letting , , and in Lemma 10, we have a pair satisfying that is a -ary linear code and such that
The property in Lemma 10 implies that each is a detecting vector for for . Note that if is a coset of , then each is also a detecting vector for .
Now we are going to combine the two pairs and . Since has cosets, by the pigeonhole principle there exists a coset of such that:
We note that for any DM-TWC, one may exploit only a subset of the input symbols to meet the requirements in Theorem 4. Hence, we in fact have the following more general bound.
Corollary 3.
Let be a DM-TWC with input alphabets , . Then contains the region:
where the first union is taken over all , such that and are prime powers, and contain and detecting symbols, respectively.
Notice that the region (43) depends on the numbers , of symbols being used and on the corresponding numbers of detecting symbols. It is thus possible that using only a smaller subset of channel inputs yields higher achievable rates (under our linear coding strategy) than those obtained by using larger subsets. For the channel of Example 1, Corollary 3 shows that a lower bound on the maximum sum-rate is , which is better than the random coding lower bound .
5. Certain Types of DM-TWC
In this section, we consider the DM-TWC in the scenario that the communication in one direction is stable (in particular, noiseless). First we briefly review the probabilistic refinement of the Shannon capacity of a graph in Section 5.1. Then in Section 5.2, we provide an outer bound on the zero-error capacity region via the asymptotic spectrum of graphs. In Section 5.3, we present explicit constructions that attain our outer bound in certain special cases.
5.1. Probabilistic Refinement of the Shannon Capacity of a Graph
We first recall some basic notions and results from the method of types. Let $x^n = (x_1, \dots, x_n) \in \mathcal{X}^n$ be a sequence and $N(a \mid x^n)$ be the number of times that $a \in \mathcal{X}$ appears in $x^n$. The type $P_{x^n}$ of $x^n$ is the relative proportion of each symbol in $x^n$, that is, $P_{x^n}(a) = \frac{1}{n} N(a \mid x^n)$ for all $a \in \mathcal{X}$. Let $\mathcal{P}_n(\mathcal{X})$ denote the collection of all possible types of sequences of length n. For every $P \in \mathcal{P}_n(\mathcal{X})$, the type class of P is the set of sequences of type P, that is, $T_P^n = \{x^n \in \mathcal{X}^n : P_{x^n} = P\}$. The ϵ-typical set of P is
$$T_{[P]_\epsilon}^n = \left\{ x^n \in \mathcal{X}^n : \left| P_{x^n}(a) - P(a) \right| \leq \epsilon \ \text{for all } a \in \mathcal{X} \right\}.$$
The joint type $P_{x^n, y^n}$ of a pair of sequences $(x^n, y^n)$ is the relative proportion of occurrences of each pair of symbols, that is, $P_{x^n, y^n}(a, b) = \frac{1}{n} N(a, b \mid x^n, y^n)$ for all $a \in \mathcal{X}$ and $b \in \mathcal{Y}$. By Bayes' rule, the conditional type is defined as
$$P_{y^n \mid x^n}(b \mid a) = \frac{P_{x^n, y^n}(a, b)}{P_{x^n}(a)}.$$
Lemma 11
([23]). $|\mathcal{P}_n(\mathcal{X})| \leq (n + 1)^{|\mathcal{X}|}$.
Lemma 12
([23]). For every $P \in \mathcal{P}_n(\mathcal{X})$, we have $(n + 1)^{-|\mathcal{X}|}\, 2^{n H(P)} \leq |T_P^n| \leq 2^{n H(P)}$.
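A quick numerical illustration of Lemmas 11 and 12 (standard method-of-types facts, not paper-specific): for a binary alphabet there are only $n + 1$ types, while a single type class is exponentially large:

```python
from collections import Counter
from itertools import product
from math import comb

n = 12
types = Counter(tuple(sorted(Counter(s).items()))
                for s in product((0, 1), repeat=n))
print(len(types))                           # n + 1 = 13 types (cf. Lemma 11)
print(types[((0, 8), (1, 4))], comb(n, 4))  # 495 495: type class of P = (2/3, 1/3)
```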
In [24], Csiszár and Körner introduced the probabilistic refinement of the Shannon capacity of a graph, imposing that the independent set consists of sequences of the (asymptotically) same type. Let denote the subgraph of induced by . The Shannon capacity of graph G relative to P is defined as
Let denote the subgraph of induced by . Then, it is readily seen that:
For each , define
If , then according to Lemma 12, we have
for any . Very recently, Vrana [25] proved the following results on .
Lemma 13
([25]). The limit in (48) exists and
- (1)
- ;
- (2)
- for .
According to Lemma 11, it is easily seen that:
Here, we would like to mention that the probabilistic refinement of the Lovász theta number was introduced and investigated by Marton in [26] via a non-asymptotic formula, and the probabilistic refinement of the fractional clique cover number was studied in relation to the graph entropy in [27].
5.2. An Outer Bound via the Asymptotic Spectrum of Graphs
In this subsection, we derive an outer bound for the case when all are the same, namely, for all .
Theorem 5.
is contained in the region
Proof.
Suppose that is a uniquely decodable codebook pair of length n. For any and , let denote the joint type of the pair and
By Lemma 11, there are at most different joint types over . Thus by the pigeonhole principle, there exists one joint type such that:
Notice that for each , we have:
Now we are going to upper bound the cardinality of . Let (resp. ) denote the collection of (resp. ) that appears in , that is, there exists (resp. ) such that . Then we immediately have
Let us now turn to upper bound the cardinalities of and . Since is uniquely decodable, by Proposition 1, for any it must hold that is an independent set of . Accordingly,
Also, for , we notice that is an independent set of with type . To be precise, we have:
Therefore we have:
where (54) follows from (50); (55) follows from the fact that , are fixed when n tends to infinity; (56) follows from (51); (57) follows from (52) and (53); (58) follows from Theorem 1 that for any graph G; (59) follows from Theorem 1 that each is multiplicative with respect to the strong product; and (60) follows from Theorem 1 and Lemma 13.
This completes the proof. □
In particular, considering the DM-TWC such that , , and is a general graph, we have the following result.
Theorem 6.
is contained in the region
Proof.
Recall that: and . According to Theorem 5, we have:
where the last equality is achieved by taking and . □
We remark that Theorem 6 (and hence also Theorem 5) can outperform Theorem 2; see the following example.
Example 3.
Consider the channel of Theorem 6 where G is the Pentagon graph $C_5$. It is well known from [2,12] that $\Theta(C_5) = \frac{1}{2} \log 5$. Then by Theorem 6 we have an upper bound on the sum-rate , while Theorem 2 only gives an upper bound .
5.3. Explicit Constructions
In this subsection, we present explicit constructions of uniquely decodable codebook pairs which could attain the outer bound of Theorem 6 in certain special cases.
Theorem 7.
Let m be a prime power, and be a disjoint union of s cliques. Then .
Proof.
First by Theorem 6, we have an upper bound on the sum-capacity given by
Next, we consider the lower bound. Notice that . We can reformulate the channel accordingly as:
where the first corresponds to a channel with input alphabets and ; and the second is with input alphabets and . Together with Lemma 1, we have:
On the one hand, it is easy to see that:
since this is a clean channel over which Alice and Bob can always communicate without error. On the other hand, by Lemma 10, we obtain:
In fact, letting , and in Lemma 10, we have a pair satisfying that is an m-ary linear code and such that
Now let and . Then it is easy to see that is a uniquely decodable codebook pair with respect to the channel . The corresponding sum-rate is
Taking , we obtain a lower bound on the best possible sum-rate, that is, (64). □
6. Concluding Remarks
In this paper, we investigated the non-adaptive zero-error capacity region of the DM-TWC and provided several single-letter inner and outer bounds, some of which coincide in certain special cases. Determining the exact zero-error capacity region of a general DM-TWC remains an open problem, and clearly a difficult one, since it includes the notorious Shannon capacity of a graph as a special case. Despite this inherent difficulty, the problem is richer than the graph capacity setting, and we believe it deserves further study in order to obtain tighter bounds and smarter constructions.
One appealing direction is to extend Lovász's semi-definite relaxation approach in order to obtain tighter outer bounds, mimicking the graph capacity case. This, however, does not seem to be a simple task. In particular, one may ask whether the natural quantity defined in (2), which upper-bounds the one-shot zero-error sum-capacity, is sub-multiplicative with respect to the graph strong product, in which case it would also serve as an upper bound for the zero-error sum-capacity. This is, however, not evident, in part since the problem (2) is not a semi-definite program. We have also considered other variations of the program (2). In particular, we have attempted to modify the non-linear constraints to be of a linear form for some suitable symmetric matrix A. We have also looked at some variants of the orthonormal representation. For example, we considered the case where each graph vertex is labeled by a unit vector, such that whenever two vertices are nonadjacent, the projections of their vectors onto the subspace spanned by the vectors in some set F are orthogonal. However, proving sub-multiplicativity in any of these settings has so far resisted our best efforts.
It would also be of much interest to consider the adaptive zero-error capacity of the DM-TWC. Allowing Alice and Bob to adapt their transmissions on the fly can in general enlarge the zero-error capacity region. As a simple example, note that a point-to-point channel with noiseless feedback is a special case of the DM-TWC (where Bob has no information to send). In [2], Shannon explicitly derived the zero-error capacity with feedback for the point-to-point channel, and pointed out that for the channel corresponding to the Pentagon graph this capacity is given by $\log \frac{5}{2}$. This is strictly larger than the zero-error capacity without feedback, $\frac{1}{2} \log 5$, which can be thought of in this case as the non-adaptive zero-error capacity of the channel. Exploring the differences between the adaptive and non-adaptive zero-error capacity regions of a general DM-TWC remains a challenging direction for future work.
Author Contributions
Conceptualization, Y.G. and O.S.; methodology, Y.G. and O.S.; investigation, Y.G. and O.S.; writing—original draft preparation, Y.G. and O.S.; writing—review and editing, Y.G. and O.S. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by an ERC grant no. 639573, ISF grant no. 1495/18, and JSPS grant no. 21K13830.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
We would like to thank Sihuang Hu and Lele Wang for some helpful discussions on the generalization of Lovász theta number.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Shannon, C.E. Two-way communication channels. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, Oakland, CA, USA, 20 June–30 July 1961; pp. 611–644.
- Shannon, C.E. The zero error capacity of a noisy channel. IRE Trans. Inf. Theory 1956, 2, 8–19.
- Han, T. A general coding scheme for the two-way channel. IEEE Trans. Inf. Theory 1984, 30, 35–44.
- Hekstra, A.P.; Willems, F.J. Dependence balance bounds for single-output two-way channels. IEEE Trans. Inf. Theory 1989, 35, 44–53.
- Zhang, Z.; Berger, T.; Schalkwijk, J. New outer bounds to capacity regions of two-way channels. IEEE Trans. Inf. Theory 1986, 32, 383–386.
- Weng, J.; Song, L.; Alajaji, F.; Linder, T. Capacity of two-way channels with symmetry properties. IEEE Trans. Inf. Theory 2019, 65, 6290–6313.
- Weng, J.; Song, L.; Alajaji, F.; Linder, T. Sufficient conditions for the tightness of Shannon’s capacity bounds for two-way channels. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 1410–1414.
- Sabag, O.; Permuter, H.H. An achievable rate region for the two-way channel with common output. In Proceedings of the 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2–5 October 2018; pp. 527–531.
- Schalkwijk, J. The binary multiplying channel—A coding scheme that operates beyond Shannon’s inner bound region. IEEE Trans. Inf. Theory 1982, 28, 107–110.
- Schalkwijk, J. On an extension of an achievable rate region for the binary multiplying channel. IEEE Trans. Inf. Theory 1983, 29, 445–448.
- Haemers, W. On some problems of Lovász concerning the Shannon capacity of a graph. IEEE Trans. Inf. Theory 1979, 25, 231–232.
- Lovász, L. On the Shannon capacity of a graph. IEEE Trans. Inf. Theory 1979, 25, 1–7.
- Holzman, R.; Körner, J. Cancellative pairs of families of sets. Eur. J. Combin. 1995, 16, 263–266.
- Janzer, B. A new upper bound for cancellative pairs. Electron. J. Combin. 2018, 25, 2–13.
- Tolhuizen, L.M. New rate pairs in the zero-error capacity region of the binary multiplying channel without feedback. IEEE Trans. Inf. Theory 2000, 46, 1043–1046.
- Gu, Y.; Shayevitz, O. On the non-adaptive zero-error capacity of the discrete memoryless two-way channel. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 3107–3111.
- Zuiddam, J. The asymptotic spectrum of graphs and the Shannon capacity. Combinatorica 2019, 39, 1173–1184.
- Cubitt, T.; Mancinska, L.; Roberson, D.E.; Severini, S.; Stahlke, D.; Winter, A. Bounds on entanglement-assisted source-channel coding via the Lovász theta number and its variants. IEEE Trans. Inf. Theory 2014, 60, 7330–7344.
- Blasiak, A. A Graph-Theoretic Approach to Network Coding. Ph.D. Thesis, Cornell University, Ithaca, NY, USA, 2013.
- Bukh, B.; Cox, C. On a fractional version of Haemers’ bound. IEEE Trans. Inf. Theory 2019, 65, 3340–3348.
- Alon, N. The Shannon capacity of a union. Combinatorica 1998, 18, 301–310.
- Ahlswede, R. Channels with arbitrarily varying channel probability functions in the presence of noiseless feedback. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1973, 25, 239–252.
- Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: Cambridge, UK, 1981.
- Csiszár, I.; Körner, J. On the capacity of the arbitrarily varying channel for maximum probability of error. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1981, 57, 87–101.
- Vrana, P. Probabilistic refinement of the asymptotic spectrum of graphs. Combinatorica 2021.
- Marton, K. On the Shannon capacity of probabilistic graphs. J. Comb. Theory Ser. B 1993, 57, 183–195.
- Körner, J. Coding of an information source having ambiguous alphabet and the entropy of graphs. In Proceedings of the 6th Prague Conference on Information Theory, Prague, Czech Republic, 1 January 1973; pp. 411–425.