Network Coding for Line Networks with Broadcast Channels

An achievable rate region for line networks with edge and node capacity constraints and broadcast channels (BCs) is derived. The region is shown to be the capacity region if the BCs are orthogonal, deterministic, physically degraded, or packet erasure with one-bit feedback. If the BCs are physically degraded with additive Gaussian noise then independent Gaussian inputs achieve capacity.


Introduction
Consider a line network with edge and node capacity constraints as shown in Figure 1. "Supernode" $u$, $u = 1, 2, 3, 4$, consists of two nodes ui and uo, where the "i" represents "input" and the "o" represents "output".
More generally, for $N$ supernodes $V = \{1, 2, \ldots, N\}$ we have $2N$ nodes and $2(N-1) + N$ directed edges. Every edge $(a, b)$ is labeled with a capacity constraint $C_{(a,b)}$, and for simplicity we write $C_{(ui,uo)}$ as $C_u$.
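The node and edge counts above can be checked with a minimal Python sketch. The function name `line_network_edges` and the tuple encoding of nodes are illustrative assumptions, not notation from the paper.

```python
def line_network_edges(n):
    """Directed edges of a line network with n supernodes.

    Node (u, "i") is the input node and (u, "o") the output node of
    supernode u (an encoding chosen here for illustration only).
    """
    edges = []
    for u in range(1, n + 1):
        edges.append(((u, "i"), (u, "o")))          # "node" edge with capacity C_u
        if u > 1:
            edges.append(((u, "o"), (u - 1, "i")))  # edge to the left neighbor
        if u < n:
            edges.append(((u, "o"), (u + 1, "i")))  # edge to the right neighbor
    return edges


edges = line_network_edges(4)
nodes = {node for edge in edges for node in edge}
print(len(nodes), len(edges))  # 2N nodes and 2(N - 1) + N edges for N = 4
```

For every $N$ the edge list has $N$ node edges plus $2(N-1)$ edges between neighboring supernodes, in agreement with the count in the text.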
Figure 1. A line network with edge and node capacity constraints.
Let $u \to D(u)$ with $D(u) = \{v(1), \ldots, v(L)\}$ denote a multicast traffic session, where $u, v(1), \ldots, v(L)$ are supernodes. The meaning is that a source message is available at supernode $u$ and is destined for the supernodes in the set $D(u)$. Since $u$ takes on $N$ values and $D(u)$ can take on $2^{N-1} - 1$ values, there are up to $N(2^{N-1} - 1)$ multicast sessions.
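As a quick check of this count, the following Python sketch enumerates all sessions by brute force. The function name `multicast_sessions` and the representation of a session as a pair (source, destination set) are assumptions made for illustration.

```python
from itertools import combinations


def multicast_sessions(n):
    """List all sessions (u, D(u)) with D(u) a non-empty set of supernodes other than u."""
    sessions = []
    for u in range(1, n + 1):
        others = [v for v in range(1, n + 1) if v != u]
        for size in range(1, len(others) + 1):
            for dests in combinations(others, size):
                sessions.append((u, frozenset(dests)))
    return sessions


for n in (2, 3, 4, 5):
    # Both columns should agree: enumerated count and N(2^(N-1) - 1).
    print(n, len(multicast_sessions(n)), n * (2 ** (n - 1) - 1))
```

The enumeration gives 9 sessions for $N = 3$ and 28 sessions for $N = 4$, the values used in the examples of this paper.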
We associate sources with input nodes ui and sinks with output nodes uo. Such line networks are special cases of discrete memoryless networks (DMNs) and we use the capacity definition from [1] (Section III.D). The capacity region was recently established in [2]. A binary linear network code achieves capacity and progressive d-separating edge-cut (PdE) bounds [3] provide the converse.
The goal of this work is to extend results from [2] to wireless line networks by using insights from two-way relaying [4], broadcasting with cooperation [5], and broadcasting with side-information [6]. The model is shown in Figure 2 where the difference to Figure 1 is that node uo transmits over a two-receiver broadcast channel (BC) $P(y_{u-1}, y_{u+1} | x_u)$ to nodes $(u-1)$i and $(u+1)$i (see [7]). The channel outputs at node ui are
$$Y_{u-1,u} = f_{u-1,u}(X_{u-1}, Z_{u-1}), \qquad Y_{u+1,u} = f_{u+1,u}(X_{u+1}, Z_{u+1})$$
for some functions $f_{u-1,u}(\cdot)$ and $f_{u+1,u}(\cdot)$, and where the $Z_u$, $u = 1, 2, \ldots, N$, are statistically independent. We permit the noise random variables $Z_u$ to be common to $Y_{u,u-1}$ and $Y_{u,u+1}$ for generality. The edges (ui, uo) are the usual links with capacity $C_u$. Such line networks are again special cases of DMNs and we use the capacity definition from [1] (Section III.D).
Figure 2. A line network with broadcasting and node capacity constraints.
The paper is organized as follows. Section 2 reviews the capacity region for line networks derived in [2]. Section 3 gives our main result: an achievable rate region for line networks with BCs. Section 4 shows that this region is the capacity region for orthogonal, deterministic, and physically degraded BCs, and packet erasure BCs with feedback. We further show that for physically degraded Gaussian BCs the best input distributions are Gaussian. Section 5 relates our work to recent work on relaying and concludes the paper.

Review of Wireline Capacity
We review the main result from [2]. Let $m(u \to D(u))$ and $R(u \to D(u))$ denote the message bits and rate, respectively, of traffic session $u \to D(u)$. We collect the bits going through supernode $u$ into the 8 sets (6)-(13). The idea is that
• $m^{(u)}_{LR}$ and $m^{(u)}_{RL}$ represent traffic flowing from left-to-right and right-to-left, respectively, through supernode $u$ without being required at supernode $u$;
• $m_{LRu}$ and $m_{RLu}$ represent traffic flowing from left-to-right and right-to-left, respectively, through supernode $u$ but required at supernode $u$ also;
• $m_u$ represents traffic from the left or right that is destined for supernode $u$ but not destined for any nodes on the right or left, respectively (so this traffic "stops" at supernode $u$ on its way from the left or right);
• $m_{u,LR}$, $m_{u,R}$, and $m_{u,L}$ represent traffic originating at supernode $u$ and destined for nodes on both the left and right, right only, and left only, respectively.
The non-negative message rates are denoted by $R$ with the same indices, e.g., $R_{u,R}$ is the rate of $m_{u,R}$.
Theorem 1 (Theorem 1 in [2]) The capacity region of a line network with supernodes $u = 1, 2, \ldots, N$ is specified by the inequalities (14)-(16).
Remark 1 The converse in [2] follows by PdE arguments [3] and achievability follows by using rate-splitting, routing, copying, and "butterfly" binary linear network coding. We review both the PdE bound and the coding method after Examples 1 and 2 below.
Example 1 Consider $N = 3$ for which we have 9 possible multicast sessions. The network is as in Figure 1 but where the nodes 4i and 4o are removed, as well as the edges touching them. For supernode $u = 1$ we collect 7 of these sessions into the 2 sets (17) and (18). (We abuse notation and write $u \to \{v\}$ as $u \to v$.) Sessions $2 \to 3$ and $3 \to 2$ are missing from (17) and (18) because they do not involve supernode 1.
Similarly, for supernode 2 we collect the 9 sessions into 8 sets. Finally, for supernode 3 we have 2 sets. The inequalities of Theorem 1 are given in (23)-(25). We discuss the 7 inequalities (23)-(25) in more detail. Consider first the converse. We write a classic cut as $(S, S^c)$, where $S$ is the set of nodes on one side of the cut and $S^c$ is the set of nodes on the other side of the cut. The inequalities with the edge capacities $C_{1,2}$, $C_{2,1}$, $C_{2,3}$, $C_{3,2}$ are classic cut bounds. For example, the cut $(S, S^c) = (\{1i, 1o\}, \{2i, 2o, 3i, 3o\})$ gives the bound $R_{1,R} \le C_{1,2}$.
The inequalities with the "node" capacities $C_1$, $C_2$, $C_3$ in (23)-(25) are not classic cut bounds. To see this, consider the bound $R_1 + R_{1,R} \le C_1$. A classic cut bound would require us to choose $\{1o, 2o, 3o\} \subseteq S^c$ because $m_1$ and $m_{1,R}$ generally include messages with positive rates for all supernodes. But then the only way to isolate the edge (1i, 1o) is to choose $S = \{1i\}$, which gives the too-weak bound $R_{1,R} \le C_1$.
We require a stronger method and use PdE bounds. We use the notation in [3]: $E_d$ is the edge cut, $S_d$ is the set of sources whose sum-rate we bound, and $\pi(\cdot)$ is the permutation that defines the order in which we test the sources. Consider the edge cut $E_d = \{(1i, 1o)\}$, the source set $S_d$ with the traffic sessions (17) and (18), and any permutation $\pi(\cdot)$ for which the sessions (17) appear before the sessions (18). After removing edge (1i, 1o) the PdE algorithm removes edge (1o, 2i) because node 1o has no incoming edges. We next test if the sessions (17) are disconnected from one of their destinations; indeed they are because one of these destinations is node 1o. The PdE algorithm now removes the remaining edges in the graph because the nodes 2i, 2o, 3i, 3o are not the sources of messages in (18). As a result, the remaining sessions (18) are disconnected from their destinations and the PdE bound gives $R_1 + R_{1,R} \le C_1$. The bound on $C_3$ follows similarly. The bound on $C_2$ is more subtle and we develop it in a more general context below (see the text after Example 2).
For achievability, note that all 7 inequalities are routing bounds except for the bound on $C_2$ in (24). To approach this bound, we use a classic method and XOR the bits in sessions $1 \to 3$ and $3 \to 1$ before sending them through edge (2i, 2o). More precisely, we combine $m(1 \to 3)$ and $m(3 \to 1)$ to form
$$m(1 \to 3) \oplus m(3 \to 1) \qquad (26)$$
by which we mean the bits formed when the smaller-rate message bits are XORed with a corresponding number of bits of the larger-rate message. The remaining larger-rate message bits are appended so that (26) has rate $\max(R(1 \to 3), R(3 \to 1))$. The message (26) is sent to node 2o together with the remaining messages received at node 2i. We must thus satisfy the bound on $C_2$ in (24).
For the routing bounds there are two subtleties. First, node 2o forwards to the left only those bits of (26) that involve $m(3 \to 1)$. If $R(1 \to 3) > R(3 \to 1)$ then the appended uncoded bits belong to $m(1 \to 3)$ and are removed before forwarding, so node 2o communicates $m(3 \to 1)$ to node 1i at rate $R(3 \to 1)$. If $R(1 \to 3) \le R(3 \to 1)$ then no bits should be removed and node 2o again communicates $m(3 \to 1)$ to node 1i at rate $R(3 \to 1)$. The bits node 2o forwards to the right are treated similarly. In summary, the rates for messages $m(3 \to 1)$ and $m(1 \to 3)$ on edges (2o, 1i) and (2o, 3i), respectively, are simply the classic routing rates.
The second routing subtlety is more straightforward: after node 1i receives the XORed bits, it can recover $m(3 \to 1)$ by subtracting the bits $m(1 \to 3)$ that it knows. Finally, node 1i transmits $m(3 \to 1)$ to node 1o. Node 3i operates similarly.
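The XOR step at supernode 2 can be illustrated with a short Python sketch. It is a toy model under stated assumptions: messages are lists of bits, and the helper names `xor_combine` and `recover` are hypothetical. Forwarding to the left only the first `len(m31)` bits, and to the right only the first `len(m13)` bits, mirrors the statement that the outgoing edges carry the classic routing rates.

```python
def xor_combine(a, b):
    """XOR the overlapping prefix of two bit lists and append the leftover
    bits of the longer list, so the result has length max(len(a), len(b))."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    mixed = [x ^ y for x, y in zip(short, long_)]
    return mixed + long_[len(short):]


def recover(received, known, wanted_len):
    """Recover the wanted message (wanted_len bits) from the received bits,
    using the known message as side information."""
    k = min(len(known), wanted_len)
    head = [r ^ x for r, x in zip(received[:k], known[:k])]
    # Bits beyond the XORed prefix arrive uncoded.
    return head + received[k:wanted_len]


m13 = [1, 0, 1, 1, 0]            # bits of session 1 -> 3 (the larger-rate message here)
m31 = [0, 1, 1]                  # bits of session 3 -> 1
coded = xor_combine(m13, m31)    # sent through edge (2i, 2o)

to_left = coded[:len(m31)]       # node 2o forwards only these bits to node 1i
to_right = coded[:len(m13)]      # node 2o forwards only these bits to node 3i

print(recover(to_left, m13, len(m31)) == m31)    # node 1i knows m13
print(recover(to_right, m31, len(m13)) == m13)   # node 3i knows m31
```

In this toy instance edge (2i, 2o) carries `max(len(m13), len(m31))` bits rather than their sum, which is the saving that the bound on $C_2$ reflects.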
Example 2 Consider Figure 1 with $N = 4$ for which there are 28 possible multicast sessions. For supernode 1 we collect 19 of these sessions into 2 sets.
The 9 sessions not involving supernode 1 are missing. The rate bounds for supernode 1 are given by (14)-(16) with $u = 1$. The messages and rate bounds for supernode 4 are similar.
Similarly, for supernode 2 we collect 26 of 28 sessions into 8 sets as follows.
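The session counts in Examples 1 and 2 can be reproduced with the following Python sketch. The classification rule is our reading of the verbal description of the 8 sets in Section 2, and the function names and string labels are illustrative assumptions rather than the paper's notation.

```python
from itertools import combinations


def multicast_sessions(n):
    """All sessions (source, destination set) for n supernodes."""
    out = []
    for u in range(1, n + 1):
        others = [v for v in range(1, n + 1) if v != u]
        for size in range(1, len(others) + 1):
            out.extend((u, frozenset(d)) for d in combinations(others, size))
    return out


def classify(u, source, dests):
    """Label of the set at supernode u that holds the session, or None if
    the session does not involve supernode u."""
    left = any(d < u for d in dests)    # some destination lies to the left of u
    right = any(d > u for d in dests)   # some destination lies to the right of u
    here = u in dests                   # u itself is a destination
    if source == u:
        if left and right:
            return "m_{u,LR}"
        return "m_{u,R}" if right else "m_{u,L}"
    if source < u:                      # traffic arrives from the left
        if right:
            return "m_{LRu}" if here else "m^(u)_{LR}"
        return "m_u" if here else None
    if left:                            # traffic arrives from the right
        return "m_{RLu}" if here else "m^(u)_{RL}"
    return "m_u" if here else None


def sets_at(u, n):
    """Group the sessions involving supernode u by their label."""
    groups = {}
    for source, dests in multicast_sessions(n):
        label = classify(u, source, dests)
        if label is not None:
            groups.setdefault(label, []).append((source, sorted(dests)))
    return groups


for n, u in [(3, 1), (3, 2), (4, 1), (4, 2)]:
    groups = sets_at(u, n)
    # Prints N, u, number of sessions involving u, number of non-empty sets.
    print(n, u, sum(len(g) for g in groups.values()), len(groups))
```

Under this reading the sketch gives 7 sessions in 2 sets for $N = 3$, $u = 1$; all 9 sessions in 8 sets for $N = 3$, $u = 2$; 19 sessions in 2 sets for $N = 4$, $u = 1$; and 26 sessions in 8 sets for $N = 4$, $u = 2$, matching the counts stated in the examples.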
The converse and coding method for $N > 3$ are entirely similar to the case $N = 3$. However, we have not yet developed the PdE bound for $N = 3$ and edge cut $E_d = \{(2i, 2o)\}$. We do this now but in the more general context of $N \ge 2$ and $E_d = \{(ui, uo)\}$ for any $u$.
• Remove (ui, uo) and then remove (uo, $(u-1)$i) and (uo, $(u+1)$i) because node uo has no incoming edges. The resulting graph at supernode $u$ is shown in Figure 3.
• Test if the sessions (8)-(10) (sessions $m_{LRu}$, $m_{RLu}$, $m_u$) are disconnected from one of their destinations, which they are because one of these destinations is node uo.
• Remove all edges to the right of supernode $u$ because the nodes to the right are not the sources of the remaining sessions (6), (11)-(13) (sessions $m^{(u)}_{LR}$, $m_{u,LR}$, $m_{u,R}$, and $m_{u,L}$).
• Test if the sessions (6), (11) and (12) (sessions $m^{(u)}_{LR}$, $m_{u,LR}$, $m_{u,R}$) are disconnected from one of their destinations, which they are because one of these destinations is to the right of supernode $u$.
• Remove all edges to the left of supernode $u$ because the nodes to the left are not the sources of the sessions (13) (sessions $m_{u,L}$).
• Test if the sessions (13) are disconnected from one of their destinations, which they are.
Since the algorithm completes successfully, the PdE bound (almost) gives inequality (14), but with $R^{(u)}_{LR}$ in place of $\max(R^{(u)}_{LR}, R^{(u)}_{RL})$. The other inequality, i.e., the one with $R^{(u)}_{RL}$, follows by choosing $S_d$ with all the traffic sessions (6)-(13) except for (6), and by modifying $\pi(\cdot)$ so that the edges to the left of supernode $u$ are removed first, and then the edges to the right.

Achievable Rates with Broadcast
We separate channel and network coding, which sounds simple enough.However, every BC receiver has side information about some of the messages being transmitted, so we will need the methods of [6].We further use the theory in [5] to describe our achievable rate region.
We begin by having each node ui combine $m^{(u)}_{LR}$ and $m^{(u)}_{RL}$ to form
$$m^{(u)}_{LR} \oplus m^{(u)}_{RL} \qquad (37)$$
by which we mean the same operation as in (26): the smaller-rate message bits are XORed with a corresponding number of bits of the larger-rate message. The remaining larger-rate message bits are appended so that (37) has rate $\max(R^{(u)}_{LR}, R^{(u)}_{RL})$. The message (37) is sent to node uo together with the remaining messages received at node ui. As a result, we must satisfy the bound (14).
The bits arriving at node uo are (37) and (8)-(13). Bits $m_u$ are removed at node uo since this node is their final destination. The bits (37) and (8)-(9) and (11) must be broadcast to both nodes $(u-1)$i and $(u+1)$i. The remaining bits $m_{u,R}$ and $m_{u,L}$ are destined (or dedicated) for the right and left only, respectively. However, we know from information theory for broadcast channels [7] that it can help to broadcast parts of these dedicated messages to both receivers. So we split $m_{u,R}$ and $m_{u,L}$ into two parts each, namely the respective $(m'_{u,R}, m''_{u,R})$ and $(m'_{u,L}, m''_{u,L})$, where $m'_{u,R}$ and $m'_{u,L}$ are broadcast to both nodes $(u-1)$i and $(u+1)$i. The rates of $m'_{u,R}$ and $m''_{u,R}$ are the respective $R'_{u,R}$ and $R''_{u,R}$, and similarly for $R'_{u,L}$ and $R''_{u,L}$.
We choose a joint distribution $P_{S_u T_u W_u X_u}(\cdot)$ and generate a codebook of length-$n$ codewords $w_u$ by choosing every letter of every codeword independently using $P_{W_u}(\cdot)$.
We next choose "binning" rates $R_{T_u}$ and $R_{S_u}$. For every $w_u$, we choose $2^{n(R''_{u,R} + R_{T_u})}$ length-$n$ codewords $t_u$ by choosing the $i$th letter $t_{u,i}$ of $t_u$ via the distribution $P_{T_u|W_u}(\cdot | w_{u,i})$, where $w_{u,i}$ is the $i$th letter of $w_u$. We label $t_u$ with the arguments of $w_u$, $m''_{u,R}$, and a "bin" index from $\{1, 2, \ldots, 2^{nR_{T_u}}\}$.
Similarly, for every $w_u$ we generate $2^{n(R''_{u,L} + R_{S_u})}$ length-$n$ codewords $s_u$ via $P_{S_u|W_u}(\cdot)$ and label $s_u$ with the arguments of $w_u$, $m''_{u,L}$, and a "bin" index from $\{1, 2, \ldots, 2^{nR_{S_u}}\}$.
Next, the encoder tries to find a pair of bin indices such that $(w_u, t_u, s_u)$ is jointly typical according to one's favorite flavor of typicality. Using standard typicality arguments (see, e.g., [5]) a typical triple exists with high probability if $n$ is large and
$$R_{S_u} + R_{T_u} > I(S_u; T_u | W_u).$$
Once this triple is found, we transmit a length-$n$ signal $x_u$ that is generated via $P_{X_u|S_u T_u W_u}(\cdot | s_{u,i}, t_{u,i}, w_{u,i})$ for $i = 1, 2, \ldots, n$.
The receivers use joint typicality decoders to recover their messages. They further use their knowledge (or side-information) about some of the messages. The result is that decoding is reliable if $n$ is large and if certain rate constraints are satisfied (see [5,6]).
Finally, we use Fourier-Motzkin elimination (see [5]) to remove $R_{S_u}$, $R_{T_u}$, $R'_{u,L}$, $R'_{u,R}$, $R''_{u,L}$, and $R''_{u,R}$ from these constraints and obtain the following result.
Theorem 2 An achievable rate region for a line network with broadcast channels is given by the bounds
for any choice of $P(s_u, t_u, w_u, x_u)$ and for all $u$, and where $S_u T_u W_u - X_u - Y_{u,u-1} Y_{u,u+1}$ forms a Markov chain for all $u$.
Example 3 Consider $N = 3$ for which we have the sessions (17)-(22). The inequalities of Theorem 2 are
Observe that the channels from node 1o to node 2i, and node 3o to node 2i, are memoryless channels with capacities $C_{1,2}$ and $C_{3,2}$, respectively. In fact, from (49) and (51) it is easy to see that we may as well choose $W_1$, $S_1$, $W_3$, and $T_3$ as constants. Moreover, we should choose $T_1 = X_1$ and $S_3 = X_3$, and then choose the input distributions so that $I(X_1; Y_{1,2}) = C_{1,2}$ and $I(X_3; Y_{3,2}) = C_{3,2}$. The inequalities (44)-(48) at node $u = 2$ correspond to Marton's region [10] (Section 7.8) for broadcast channels including a common rate. We will see in the next section that if we specialize to the model of [2] then only the bounds (43)-(45) remain at node 2 because the bounds (46)-(48) are redundant.
In fact, if all BCs in Figure 2 are orthogonal (see [8] (p. 419)) then the model reduces to that of Figure 1, so we should recover Theorem 1 from Theorem 2.

Orthogonal Channels
Let $X_u = (X_{u,u-1}, X_{u,u+1})$. Suppose $C_{u,u-1}$ and $C_{u,u+1}$ are the respective capacities of the memoryless channels $P_{Y_{u,u-1}|X_{u,u-1}}$ and $P_{Y_{u,u+1}|X_{u,u+1}}$. We choose $S_u = X_{u,u-1}$, $T_u = X_{u,u+1}$, $W_u = 0$, and $X_{u,u-1}$, $X_{u,u+1}$ to be independent and capacity-achieving. With these choices the inequalities (44)-(48) reduce to the inequalities of Theorem 1, so the region of Theorem 1 is achievable. The converse follows by using the same steps as in the converse of Theorem 1.

Deterministic Channels
A BC $P_{Y_1 Y_2|X}$ is deterministic if $Y_1 = f_1(X)$ and $Y_2 = f_2(X)$ for some functions $f_1(\cdot)$ and $f_2(\cdot)$. We show that Theorem 2 gives the capacity region if all BCs in Figure 2 are deterministic.

Theorem 3
The capacity region of a line network with deterministic BCs is the union over all $P(w_u, x_u)$, $u = 1, 2, \ldots, N$, of the (non-negative) rates satisfying
Proof. Achievability follows by Theorem 2 with $S_u = Y_{u,u-1}$ and $T_u = Y_{u,u+1}$. For the converse, the constraint (54) is the PdE bound of [11] (Section III.A). The bounds (55) and (56) are cut bounds. For the remaining steps, let $S^c$ be the complement of $S$ in $V$. We define
Let $M_{u,L}$ be the random message corresponding to $m_{u,L}$, and similarly for the other messages. The messages are independent and have entropy equal to $n$ times their rate, where $n$ is the number of times we use each BC. Let $M(S)$ be the set of messages originating at supernodes in $S$. Let $M^c_{u,L}$ be the set of all network messages except for $M_{u,L}$, and similarly for other messages. We use the notation
For the following, let $S = \{u, u + 1, \ldots, N\}$ and $S = \{1, 2, \ldots, u\}$. We bound
Theorem 4 The capacity region of a line network with physically degraded BCs is the union over all $P(w_u, x_u)$, $u = 1, 2, \ldots, N$, of the (non-negative) rates satisfying (66)-(69), where $W_u - X_u - Y_{u,u-1} - Y_{u,u+1}$ forms a Markov chain.
Proof. For achievability, Theorem 2 with $S_u = X_u$ and $T_u = 0$ gives the region specified by (66)-(69).
For the converse, the bound (66) is based on an extension of PdE bounds to mixed wireline/wireless networks (see [11]). The bound (68) is a cut bound. The other two bounds follow by modifying the steps of [12] with $S = \{1, 2, \ldots, u\}$. The steps use Fano's inequality, the fact that $X_{u,i}$ is defined by the messages at supernode $u$ and the past channel outputs at supernode $u$, and a (long) Markov chain. Collecting the bounds (70)-(72) proves Theorem 4.

Physically Degraded Gaussian Channels
The additive white Gaussian noise (AWGN) and physically degraded BC has (see [13])
$$Y_{u,u-1} = X_u + Z_{u,u-1}, \qquad Y_{u,u+1} = Y_{u,u-1} + Z_{u,u+1}$$
where $X_u$ is real with power constraint $\sum_{i=1}^{n} X_{u,i}^2 \le n P_u$ for all $u$, and $Z_{u,u-1}$ and $Z_{u,u+1}$ are independent Gaussian random variables with variances $N_{u,u-1}$ and $N_{u,u+1}$, respectively (again, the direction of degradation can be swapped for any $u$ without changing the results conceptually).
The capacity region is given by Theorem 4 and it remains to optimize $P(w_u, x_u)$. The variances of $Y_{u,u-1}$ and $Y_{u,u+1}$ are at most $P_u + N_{u,u-1}$ and $P_u + N_{u,u-1} + N_{u,u+1}$, respectively, so the maximum entropy theorem (see [8] (p. 234)) upper-bounds the corresponding differential entropies. Furthermore, a conditional version of the entropy power inequality (see [8] (p. 496)) applies. Collecting the bounds, and inserting (79) and (80) into (77), we obtain (81)-(83). We achieve equality in (81)-(83) by choosing
$$X_u = V_u + W_u$$
where $V_u$ and $W_u$ are independent Gaussian random variables with zero mean and variances $\alpha_u P_u$ and $(1 - \alpha_u) P_u$, respectively, for some $0 \le \alpha_u \le 1$. The optimal $P(w_u, x_u)$ is therefore zero-mean Gaussian, and the capacity region is given by inserting (81)-(83) with equality into (67)-(69), and taking the union over the rates permitted by varying $\alpha_u$.
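To make the role of $\alpha_u$ concrete, the following Python sketch evaluates the standard superposition-coding mutual informations for the Gaussian choice $X_u = V_u + W_u$. We assume, without reproducing the numbered bounds, that these are the quantities entering Theorem 4; the function names and the argument names `n_near` and `n_extra` (standing for $N_{u,u-1}$ and $N_{u,u+1}$) are illustrative.

```python
import math


def awgn_rate(snr):
    """0.5 * log2(1 + SNR), in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)


def degraded_gaussian_terms(p, n_near, n_extra, alpha):
    """Mutual informations for X = V + W with Var(V) = alpha * p and
    Var(W) = (1 - alpha) * p, where Y_near = X + Z_near and
    Y_far = Y_near + Z_extra (noise variances n_near and n_extra)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # I(W; Y_far): V acts as additional noise at the degraded receiver.
    i_w_far = awgn_rate((1.0 - alpha) * p / (alpha * p + n_near + n_extra))
    # I(X; Y_near | W): only V remains unknown once W is decoded.
    i_x_near_given_w = awgn_rate(alpha * p / n_near)
    # I(X; Y_near): point-to-point rate to the better receiver.
    i_x_near = awgn_rate(p / n_near)
    return i_w_far, i_x_near_given_w, i_x_near


for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, degraded_gaussian_terms(p=3.0, n_near=1.0, n_extra=2.0, alpha=alpha))
```

Sweeping `alpha` from 0 to 1 moves power from the part decoded by both receivers to the part decoded only by the better receiver, which is the trade-off captured by taking the union over $\alpha_u$.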

Packet Erasure Channels with Feedback
A BC $P_{Y_1 Y_2|X}$ is called packet erasure with feedback if $X$ is an $L$-bit vector, each of the outputs $Y_1$ and $Y_2$ is either $X$ or an erasure symbol, and all supernodes receive one bit of feedback from each receiver telling them whether the receiver has seen an erasure or not. Suppose we give receiver 1 both $Y_1$ and $Y_2$, which means that the channel is physically degraded. Let $R_1$ be the resulting capacity region. Similarly, let $R_2$ be the capacity region if we (instead) give receiver 2 both $Y_1$ and $Y_2$. The authors of [14] (see also [15]) showed that the capacity region of the original BC is $R_1 \cap R_2$. The following theorem slightly generalizes the main result of [14] and gives the capacity of line networks with broadcast erasure channels and feedback. The input $X_u$ has $L_u$ bits and we denote the erasure probabilities for $Y_{u,u-1}$ and $Y_{u,u+1}$ as $p_{u,u-1}$ and $p_{u,u+1}$, respectively.
Theorem 5 The capacity region of a line network with broadcast erasure channels and feedback is the union of the (non-negative) rates satisfying (14), (86), and (87).
Proof. (Sketch) Achievability follows by using the network codes of [14] and [2]. For the converse, the constraint (14) again follows from PdE bounds. For the constraints (86) and (87), we make every BC physically degraded by giving one of the receivers both channel outputs (see [14,16]). Theorem 4 gives a collection of outer bounds for each degradation choice. Finally, we optimize the coding to obtain (86) and (87).

Discussion
The capacity results in Sections 4.1-4.5 imply that decode-forward (DF) relaying suffices, i.e., amplify-forward (AF) and compress-forward (CF) do not improve rates (see also [19] (Chapter 4)). Quantize-map-forward [17] and noisy network coding [18] also do not improve on DF. In fact, the non-DF methods are suboptimal in general because they do not use superposition coding or binning to treat broadcasting. However, we have found capacity only for BCs that are orthogonal, deterministic, physically degraded, or packet erasure with one-bit feedback. AF and CF strategies are useful for other classes of BCs, as shown in [20] and many further papers.
Finally, our model applies to wireless problems where every node has a dedicated tone and/or time slot for transmission.If nodes use the same tone at the same time, then one must consider the effects of interference.For example, scheduling transmissions with half-duplex protocols is an interesting problem for further study.

Figure 3. Network at supernode $u$ after the PdE bound has removed the edges (ui, uo), (uo, $(u-1)$i), and (uo, $(u+1)$i). The session messages are tested in the order: $m_{LRu}$, $m_{RLu}$, $m_u$, then $m^{(u)}_{LR}$, $m_{u,LR}$, $m_{u,R}$, and finally $m_{u,L}$.
our choices for X u and Y u,u−1 are appropriate for the cut bound (68).