Strong Secrecy Capacity of a Class of Wiretap Networks

This paper considers a special class of wiretap networks with a single source node and K sink nodes. The source message is encoded into a binary digital sequence of length N, divided into K subsequences, and sent to the K sink nodes respectively through noiseless channels. The legitimate receivers are able to obtain subsequences from arbitrary $\mu_1 = K\alpha_1$ sink nodes. Meanwhile, there exist eavesdroppers who are able to observe subsequences from arbitrary $\mu_2 = K\alpha_2$ sink nodes, where $0 \le \alpha_2 < \alpha_1 \le 1$. The goal is to let the receivers recover the source message with a vanishing decoding error probability, and keep the eavesdroppers ignorant about the source message. It is clear that the communication model is an extension of wiretap channel II. Secrecy capacity with respect to the strong secrecy criterion is established. In the proof of the direct part, a codebook is generated by a randomized scheme and partitioned by Csiszár's almost independent coloring scheme. Unlike the linear network coding schemes, our coding scheme works over the binary field and is hence independent of the scale of the network.


Introduction
Network coding is a novel technique that allows an intermediate node to combine its received messages before forwarding them, instead of using the store-and-forward method [1], and it has been shown to offer large advantages in throughput, power consumption, and security in wireline and wireless networks. Field size and adaptation to varying topologies are two of the key issues in network coding, since field size affects the complexity of the encoding and decoding processes, and the code construction is related to the knowledge of the network topology. Li et al. [2] proved that linear network coding achieves the multicast capacity when the field size is sufficiently large. Later, random linear network coding (RLNC) [3], in which nodes independently and randomly select coding kernels, was proposed for unknown or changing topologies to achieve the multicast capacity asymptotically in field size and network size. Since then, network coding has attracted a substantial amount of research attention.
In practical communication networks, transmission is often subject to wiretapping attacks. A general communication model of a wiretap network is specified by a quintuple (G, α, U, A, R), where
• G = (V, E) is a directed graph of the network topology, where V and E are the sets of nodes and edges, respectively;
• α is the unique source node in the graph;
• U is the set of user nodes. Each user node is fully accessed by a legal user who is required to recover the source message without error or with a vanishing decoding error probability;
• A is a collection of subsets of E. Each member of A may be fully accessed by an eavesdropper;
• R specifies the capacities of the edges in E.
In particular, a wiretap network is called a $\mu_2$-wiretap network if the size of each edge set in A is $\mu_2$.
Secure network coding was first introduced by Cai and Yeung to prevent information from leaking to eavesdroppers, with zero decoding error probability at the legitimate users [4,5]. They imposed an information theoretic security requirement that the mutual information between the source symbols and the messages available to the adversary must be zero. Given a network code with message length K and a wiretap adversary capable of wiretapping at most $\mu_2 < K$ edges, Cai and Yeung [5] suggested using a linear "secret-sharing" method to provide security in the network. Instead of sending K message symbols, the source node sent $\mu_2$ random symbols and $K - \mu_2$ message symbols. Additionally, the code itself underwent a certain linear transformation. Cai and Yeung gave sufficient conditions for this transformation to guarantee security. They also showed that a secure transformation existed as long as the field size satisfied $q > \binom{|E|}{\mu_2}$. In addition, their construction of the linear transformation took at least $\binom{|E|}{\mu_2}$ time steps. This complexity, as well as the required lower bound on the field size q, was quite restrictive when the scale of the network was large.
Feldman et al. [6] proved that the problem of making a linear network code secure was equivalent to the problem of finding a linear code with certain generalized distance properties, and they also showed that the required field size for secure network coding could be much smaller if a small amount of overall capacity was given up. Namely, when sending $(1 + \tau)\mu_2$ random symbols and $K - (1 + \tau)\mu_2$ message symbols, a random linear transformation would be secure with high probability as long as $q > O(|E|^{1/\tau})$, which allowed a trade-off between capacity and field size.
Furthermore, a new level of information theoretic security was defined as weakly secure network coding [7], in which adversaries were unable to obtain any "meaningful" information about the source messages. The weak security requirements could also be satisfied when the number of independent messages available to the adversary was less than the multicast capacity. Ho et al. [8] considered the related problem of network coding in the presence of a Byzantine attacker that could modify data sent from a node in the network.
The idea of the wiretap network came from the wiretap channel of type II, which was first studied by Ozarow and Wyner [9]. The transmitter sent a message to the legitimate receiver via a binary noiseless channel. An eavesdropper could observe a subset of the received data of a certain size. It was assumed that the eavesdropper could always choose the best subset of received bits to observe, so as to minimize the equivocation about the sent data. Wiretap channel II can be regarded as a special case of a wiretap network with $V = \{s, t, 1, 2, ..., n\}$, $E = \{e_{s1}, e_{s2}, ..., e_{sn}, e_{1t}, e_{2t}, ..., e_{nt}\}$, α = s, U = {t}, $A = \{F \subseteq \{e_{si}\}_{i=1}^{n} : |F| = \mu\}$ and $R = \{R(e_{si}) = 1, R(e_{it}) = 1, 1 \le i \le n\}$ (see Figure 1). Note that since both of the coding schemes developed in [5] and [6] rely on a Galois field of sufficiently large size, neither of them works on the classic wiretap channel of type II, where the symbol of each transmission is binary.
In this paper, we study a special class of wiretap networks with a single source node and K sink nodes (considered as distributed servers or disk blocks), which is depicted in Figure 2. In this network model, the legitimate users are able to connect to any $\mu_1$ sink nodes. On the other hand, there exist eavesdroppers who are able to observe digital sequences from arbitrary $\mu_2 < \mu_1$ sink nodes. We propose a randomized secure network coding scheme, ensuring that every legitimate user is able to recover the source message with an arbitrarily small average decoding error probability while every eavesdropper has vanishing information about the source message. The coding scheme in this paper works over the binary field (alphabet), so the complexity of the scheme does not increase with the scale of the network. Moreover, the coding scheme in this paper can readily work on the classic wiretap channel II, indicating that the communication model in this paper includes wiretap channel II as a special case. Differences among the coding schemes in [5,6] and this paper are summarized in Table 1.
The coding scheme in this paper comes from that of arbitrarily varying channels (AVCs). In fact, the network defined in this paper can be readily regarded as a special class of arbitrarily varying wiretap channels (AVWCs) with constrained state sequences. The difference is that the receivers know the channel state sequences in our case, and hence one single codebook is enough to ensure reliable transmission (see Remark 8 for details). The partitioning scheme is based on Csiszár's almost independent coloring scheme [10], which has recently been used to solve the security problem of wiretap channel II with a noisy main channel [11]. Some results on the secrecy capacity of AVWCs can be found in [12][13][14].
Designing a coding scheme with a small field size is critical in some practical engineering problems. As an example, consider the situation where the transmitter needs to send a big file to the receiver through the Internet. To achieve this, the big file is divided into many data frames. Since the size of each data frame is less than 1500 bytes according to the TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) protocols, the number of data frames will be quite large when the file is huge. Packet loss is quite a common problem in network communication. When some data frames get lost, a common method is to require the transmitter to send the lost data frames again. Now, supposing that we plan to deal with the problem of frame loss via a coding scheme, each sink node in Figure 2 can be regarded as a data frame divided from the big file, with K = |E| being the total number of data frames. Since the size of each data frame is constrained, it cannot represent an arbitrarily large Galois field. Consequently, K cannot be arbitrarily large when the coding schemes in [5,6] are applied, indicating that the size of the big file is constrained. However, if our coding scheme is used, the size of the big file can be arbitrary.
Another application of this model is splitting and sharing secret information among authorized persons. In this scenario, a group of n persons is allowed to reconstruct the secret information correctly, while any group with fewer participants cannot read the split message. Please refer to [15] and references therein.
The remainder of this paper is organized as follows: the notations and problem statements are introduced in Section 2 and the main result is presented in Section 3. Furthermore, the direct and converse proofs are given in Sections 4 and 5, respectively. Section 6 explains the coding scheme in Section 4 via two simple examples. Section 7 discusses the field size, and Section 8 concludes this paper.

Notations and Problem Statements
Throughout the paper, $\mathbb{N}$ is the set of positive integers and $[1 : N] = \{1, 2, ..., N\}$ for any $N \in \mathbb{N}$. Random variables, sample values and alphabets (sets) are denoted by capital letters, lower case letters and calligraphic letters, respectively. A similar convention is applied to random vectors and their sample values. For example, $X^N$ represents a random N-vector $(X_1, X_2, ..., X_N)$, and $x^N$ is a specific value of $X^N$ in $\mathcal{X}^N$, where $\mathcal{X}^N$ is the Nth Cartesian power of $\mathcal{X}$.
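Let "?" be a "dummy" letter. For any index set $I \subseteq [1 : N]$ and finite alphabet $\mathcal{X}$ not containing the "dummy" letter "?", denote
$\mathcal{X}^N_I = \{x^N \in (\mathcal{X} \cup \{?\})^N : x_n \in \mathcal{X} \text{ if } n \in I, \text{ and } x_n = ? \text{ otherwise}\}.$
For any given random vector $X^N = (X_1, X_2, ..., X_N)$ and index set $I \subseteq [1 : N]$,
• $X_I = (X_n)_{n \in I}$ is the subvector of $X^N$ consisting of the components indexed by I;
• $X^N_I = (\tilde{X}_1, \tilde{X}_2, ..., \tilde{X}_N)$ is a "projection" of $X^N$ onto I with $\tilde{X}_n = X_n$ for $n \in I$, and $\tilde{X}_n = ?$ otherwise.
The random vector $X^N_I$ takes values in $\mathcal{X}^N_I$, while the random vector $X_I$ takes values in $\mathcal{X}^{|I|}$.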
Proposition 1. For any random N-vector $X^N$ and index set $I \subseteq [1 : N]$, it holds that $H(X^N_I) = H(X_I)$.
Proof. Let g be a mapping from $\mathcal{X}^N_I$ to $\mathcal{X}^{|I|}$ such that $g(y^N) = y_I$ for every $y^N \in \mathcal{X}^N_I$. One can easily verify that g is a one-to-one mapping. Furthermore, $\Pr\{X^N_I = y^N\} = \Pr\{X_I = g(y^N)\}$ for every $y^N \in \mathcal{X}^N_I$, implying that $X^N_I$ and $X_I$ share the same distribution up to the relabeling g. This completes the proof of the proposition.
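The projection and the subvector used in Proposition 1 can be illustrated with a small sketch. The function names `project`, `subvector` and `g` are chosen here for illustration only; indices are 1-based as in the text, and the alphabet is assumed not to contain the dummy letter.

```python
from itertools import product

DUMMY = "?"  # the "dummy" letter; assumed not to belong to the alphabet

def project(x, index_set):
    """Projection x^N_I: keep x_n for n in I, replace every other entry by '?'."""
    return tuple(x[n - 1] if n in index_set else DUMMY
                 for n in range(1, len(x) + 1))

def subvector(x, index_set):
    """Subvector x_I: the components of x whose (1-based) indices lie in I."""
    return tuple(x[n - 1] for n in sorted(index_set))

def g(y):
    """The map g of the proof: drop the dummy letters of a projected vector."""
    return tuple(s for s in y if s != DUMMY)

N, I = 4, {1, 3}
vectors = list(product((0, 1), repeat=N))
projections = {project(x, I) for x in vectors}
images = {g(y) for y in projections}
```

Since `g` maps the four distinct projections to four distinct subvectors, it is one-to-one on this toy instance, which is the property the proof relies on.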
The communication model of the wiretap network with K sink nodes, depicted in Figure 2, consists of four parts, namely the encoder, network, receiver and eavesdropper. The formal definitions of those parts are introduced in Definitions 1-4, respectively. The definition of achievable transmission rate is given in Definition 5.

Definition 2. (Network) Suppose that the source message W is encoded into $X^N$. The encoder first divides $X^N$ into K parts, denoted by $\mathbf{X}_1, \mathbf{X}_2, ..., \mathbf{X}_K$, with $\mathbf{X}_k = (X_{(k-1)N_K+1}, ..., X_{kN_K})$ for all $1 \le k \le K$, where $N_K = N/K$ is assumed to be an integer without loss of generality. Then, those sequences are transmitted to the sink nodes $n_1, n_2, ..., n_K$, respectively, through K noiseless channels. Therefore, the digital sequence received by the sink node $n_k$ is $\mathbf{X}_k$ for every $1 \le k \le K$. Let
$I_k = \{(k-1)N_K + 1, ..., kN_K\}.$ (1)
The sequence $\mathbf{X}_k$ can then be rewritten as $X_{I_k}$.
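As a minimal sketch of the network definition, the following code splits a length-N codeword into K blocks and lists the corresponding index sets. It assumes that block k occupies the consecutive positions (k−1)N/K+1, ..., kN/K (1-based); the helper names are hypothetical.

```python
def block_index_sets(N, K):
    """Index sets I_1, ..., I_K (1-based positions), assuming consecutive blocks."""
    if N % K != 0:
        raise ValueError("N must be a multiple of K")
    n_k = N // K  # number of bits sent to each sink node
    return [range((k - 1) * n_k + 1, k * n_k + 1) for k in range(1, K + 1)]

def split_codeword(x, K):
    """Split the codeword x into the K subsequences sent to the K sink nodes."""
    return [tuple(x[n - 1] for n in idx) for idx in block_index_sets(len(x), K)]

blocks = split_codeword((0, 1, 1, 0, 1, 1), K=3)
```

With N = 6 and K = 3, each sink node receives two bits, and the second node holds positions 3 and 4 of the codeword.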
The receiver is able to access digital sequences from arbitrary $\mu_1$ sink nodes. Let $\mathcal{K}_1 = \{K_1 \subseteq [1 : K] : |K_1| = \mu_1\}$ be the collection of subsets of sink nodes possibly selected by the receiver. The whole digital sequence obtained by the receiver may be any random sequence from $\{X^N_{I_{K_1}} : K_1 \in \mathcal{K}_1\}$, where $I_{K_1} = \bigcup_{k \in K_1} I_k$. Denoting by $\mathcal{X}^N(\mathcal{K}_1) = \bigcup_{K_1 \in \mathcal{K}_1} \mathcal{X}^N_{I_{K_1}}$ the set of all possible received sequences, the decoder is a mapping $\phi : \mathcal{X}^N(\mathcal{K}_1) \to \mathcal{W}$. If it is known that the receiver has access to the sink nodes whose indices lie in $K_1$, the estimation of the source message is denoted by $\hat{W}(I_{K_1}) = \phi(X^N_{I_{K_1}})$, and the average decoding error probability is $\Pr\{\hat{W}(I_{K_1}) \neq W\}$. However, since the sink nodes accessed by the receiver are actually unknown, the average decoding error probability is defined as
$P_e = \max_{K_1 \in \mathcal{K}_1} \Pr\{\hat{W}(I_{K_1}) \neq W\}.$ (2)
Consequently, setting

Remark 2. The communication model can also be regarded as a wiretap network with $|\mathcal{K}_1|$ legitimate receivers, each of whom has access to a certain set of sink nodes in $\mathcal{K}_1$. Equation (2) represents the maximal value of the average decoding error probabilities of all those legitimate receivers.
Definition 4. (Eavesdropper) Let $0 < \alpha_2 < \alpha_1$ be a constant real number with $\mu_2 = K\alpha_2$ being an integer. The eavesdropper is able to access digital sequences of arbitrary $\mu_2$ sink nodes. Let $\mathcal{K}_2 = \{K_2 \subseteq [1 : K] : |K_2| = \mu_2\}$ be the collection of subsets of sink nodes possibly selected by the eavesdropper. The whole digital sequence obtained by the eavesdropper may be any random sequence from $\{X^N_{I_{K_2}} : K_2 \in \mathcal{K}_2\}$, where $I_{K_2} = \bigcup_{k \in K_2} I_k$. The quantity of source information exposed to the eavesdropper is then denoted by
$\Delta = \max_{K_2 \in \mathcal{K}_2} I(W; X^N_{I_{K_2}}).$ (3)

Remark 3. Similar to Remark 1, denoting $\lambda_2 = N\alpha_2$ and

Remark 4. The communication model can also be regarded as a wiretap network with $|\mathcal{K}_2|$ eavesdroppers, each of whom has access to a certain set of sink nodes in $\mathcal{K}_2$. Equation (3) represents the maximal quantity of source information exposed to those eavesdroppers.
Example 2. To give a clearer idea of the notations defined in this section, a special example of a wiretap network is given in Figure 3 with This can be treated as a network with three receivers and three eavesdroppers. In this case, we have

Definition 5. (Achievability) A non-negative real number R is said to be achievable if, for any $\epsilon > 0$, there exists an integer $N_0$ such that for any $N > N_0$, one can construct a pair of encoder and decoder satisfying
$\frac{1}{N} \log M > R - \epsilon,$ (4)
$P_e < \epsilon$ (5)
and
$\Delta < \epsilon.$ (6)
The capacity of the communication model described in Figure 2 is denoted by $C_s$.

Remark 5. When regarding the communication model depicted in Figure 2 as a wiretap network with multiple legitimate receivers and multiple eavesdroppers, Equations (5) and (6) require that every legitimate receiver is able to decode the source message with a vanishing average decoding error probability, and the quantity of information about the source message exposed to every eavesdropper is vanishing.
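For very small networks, the two quantities constrained in the achievability definition can be computed by brute force. The sketch below is only an illustration under stated assumptions: the message is uniform, the encoder picks a codeword uniformly from the subcode of the message, and the receivers use a maximum a posteriori decoder (an assumption made here; it is not the decoder of the paper). The toy code with N = K = 2 is likewise a hypothetical choice.

```python
from itertools import combinations
from collections import defaultdict
from math import log2

def observed(x, nodes, n_k):
    """Bits seen when accessing the given sink nodes (1-based), n_k bits per node."""
    return tuple(x[(k - 1) * n_k:k * n_k] for k in sorted(nodes))

def joint_distribution(subcodes, nodes, n_k):
    """Pr{W = w, observation = y} for uniform W and a uniform choice in the subcode."""
    joint = defaultdict(float)
    for w, subcode in enumerate(subcodes, start=1):
        for x in subcode:
            joint[(w, observed(x, nodes, n_k))] += 1.0 / (len(subcodes) * len(subcode))
    return joint

def error_probability(subcodes, nodes, n_k):
    """Error probability of a MAP decoder (assumed decoder, for illustration)."""
    best = defaultdict(float)
    for (w, y), p in joint_distribution(subcodes, nodes, n_k).items():
        best[y] = max(best[y], p)
    return 1.0 - sum(best.values())

def leakage(subcodes, nodes, n_k):
    """Mutual information I(W; observation) in bits."""
    joint = joint_distribution(subcodes, nodes, n_k)
    p_w, p_y = defaultdict(float), defaultdict(float)
    for (w, y), p in joint.items():
        p_w[w] += p
        p_y[y] += p
    return sum(p * log2(p / (p_w[w] * p_y[y])) for (w, y), p in joint.items())

def evaluate(subcodes, K, mu1, mu2):
    """Worst case over all receiver sets (size mu1) and eavesdropper sets (size mu2)."""
    n_k = len(subcodes[0][0]) // K
    nodes = range(1, K + 1)
    p_e = max(error_probability(subcodes, s, n_k) for s in combinations(nodes, mu1))
    delta = max(leakage(subcodes, s, n_k) for s in combinations(nodes, mu2))
    return p_e, delta

# Hypothetical toy code: the message is the parity of the two transmitted bits.
parity_code = [[(0, 0), (1, 1)], [(0, 1), (1, 0)]]
p_e, delta = evaluate(parity_code, K=2, mu1=2, mu2=1)
```

For the parity code, a receiver seeing both nodes decodes without error, while a single observed bit is uniform regardless of the message, so both returned values are zero at rate 1/2.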

The Main Result
Theorem 1. The capacity of the communication model of the wiretap network described in Figure 2 is $C_s = \alpha_1 - \alpha_2$.

The direct half of Theorem 1 is given in Section 4, and the converse half is in Section 5.

The problem of the wiretap network was first studied by Cai and Yeung in [5]. They constructed a linear network coding scheme over the wiretap network such that the legitimate receivers were able to decode the source message with exactly no decoding error while the eavesdroppers had absolutely no information on the source message. However, the coding scheme had to work on a Galois field whose size was related to the scale of the network. When the number of nodes in the network increased, the size of the Galois field had to grow accordingly, which made the encoding process much more complicated. On the other hand, the coding scheme introduced in this paper is unrelated to the scale of the network. Therefore, it turns out to be simpler than the coding scheme in [5] when the scale of the network is large.
Moreover, the coding scheme in this paper is designed with a binary alphabet.Therefore, this scheme can be readily applied to the classic wiretap channel II, indicating that the communication model in this paper includes wiretap channel II as a special case.See Section 7 for details.

Direct Half of Theorem 1
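This section gives the proof of the direct half of Theorem 1, i.e., every rate $R < \alpha_1 - \alpha_2$ is achievable. More precisely, we need to prove that for any $0 < \tau < 1$, $0 < \epsilon < 1$ and any sufficiently large N, there exists a pair of encoder and decoder (E, φ) satisfying
$\frac{1}{N} \log M > \alpha_1 - \alpha_2 - \tau,$ (7)
$P_e < \epsilon$ (8)
and
$\Delta < \epsilon.$ (9)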
On account of Remarks 1 and 3, it follows that

Therefore, instead of constructing an encoder and decoder pair (E, φ) satisfying Equations (7)-(9), this section proves the existence of an encoder and decoder pair satisfying Equation (7), and ∆ = max

The main idea of the proof goes as follows. Let the codebook C be randomly generated, such that the size of the codebook is about $2^{\lambda_1}$, and let it be partitioned into subcodes, each of which is related to a unique source message. Since the receiver is able to obtain a $\lambda_1$-subsequence of the transmitted codeword, which is probably distinct from the corresponding subsequences of the other codewords, the receiver is able to decode the source message with a vanishing average decoding error probability. On the other hand, receiving a $\lambda_2$-subsequence of the transmitted codeword, the eavesdropper concludes that the transmitted codeword comes from a collection of about $2^{\lambda_1 - \lambda_2}$ codewords. If those codewords are uniformly distributed over the subcodes, the eavesdropper is unable to obtain any information on the source message.
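The counting argument above can be checked empirically with a short simulation. This is a sketch under simplifying assumptions: the codebook has exactly 2^λ1 independent uniform binary codewords, the eavesdropper observes λ2 randomly chosen positions, and the function name and parameters are hypothetical.

```python
import random
from collections import Counter

def candidate_counts(N, lam1, lam2, seed=0):
    """For every observed lam2-bit pattern, count the codewords consistent with it."""
    rng = random.Random(seed)
    codebook = [tuple(rng.randint(0, 1) for _ in range(N))
                for _ in range(2 ** lam1)]
    seen = sorted(rng.sample(range(N), lam2))  # positions read by the eavesdropper
    return Counter(tuple(x[n] for n in seen) for x in codebook)

counts = candidate_counts(N=12, lam1=8, lam2=3)
average = sum(counts.values()) / len(counts)
```

When all 2^λ2 patterns occur, the average number of consistent codewords equals 2^(λ1−λ2), here 32, which is the size of the collection the eavesdropper is left with.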
The proof is organized as follows. Section 4.1 first gives the coding scheme achieving the capacity. Then, Section 4.2 establishes that, by generating the codebook randomly, we obtain a codebook satisfying Equation (10) with probability → 1 as N → ∞. Finally, Section 4.3 shows that when N is sufficiently large, there exists a desired "good" partition on every randomly generated sample codebook such that Equation (14) holds. Equation (11) is an immediate consequence of (14) and Remark 6. Equation (7) is obtained directly from (13). Therefore, the direct half of Theorem 1 is established.

$\Pr\{X^N(l) = x^N\} = 2^{-N}$ for all $1 \le l \le M'$ and $x^N \in \mathcal{X}^N$, where

Codebook partition. Suppose that $C = \{x^N(l)\}_{l=1}^{M'}$ is a specific sample value of M' randomly generated codewords. Let W' be a random variable uniformly distributed on [1 : M'] and subsets $\{C_m\}_{m=1}^{M}$ with equal cardinality, i.e., $|C_m| = M'/M$ for any $1 \le m \le M$. Let $\tilde{W}$ be the index of the subcode containing $X^N(C)$, i.e., $X^N(C) \in C_{\tilde{W}}$. We need to find a partition of the codebook C satisfying max The partition satisfying Inequality (14) is called a "good" partition. It will be proved in Section 4.3 that there exists a desired "good" partition on every given sample codebook when the block length N is sufficiently large.
Encoder. Suppose that a desired partition $\{C_m\}_{m=1}^{M}$ on a specific codebook C is given. When the source message W is to be transmitted, the encoder chooses a codeword uniformly at random from the subcode $C_W$ and emits it to the network.

Remark 6. For a given codebook C and a desired partition applied on it, denote by $X^N$ the output of the encoder when the source message W is transmitted. It is clear that $(W, X^N)$ and $(\tilde{W}, X^N(C))$ share the same joint distribution.
Decoder. Suppose that a desired partition $\{C_m\}_{m=1}^{M}$ on a deterministic codebook $C = \{x^N(l)\}_{l=1}^{M'}$ is given. Receiving a digital sequence $y^N \in \mathcal{X}^N_I$ from the sink nodes, the decoder finds the minimal index $\hat{l}$ such that $x^N_I(\hat{l}) = y^N$, and outputs $\hat{w}$ as the estimation of the transmitted source message, where $\hat{w}$ is the index of the subcode containing $x^N(\hat{l})$, i.e., $x^N(\hat{l}) \in C_{\hat{w}}$.

Proof of Inequality (10)
This subsection establishes that, using the coding scheme introduced in Section 4.1, one can generate a codebook satisfying Equation (10) with probability → 1 as the block length N → ∞.
Let C = {x N (l)} M l=1 be a fixed codebook applied by the encoder.For any 1 ≤ l ≤ M and Then, it follows from the decoding scheme introduced in Section 4.1 that Therefore, Equation ( 5) is finally established by the following lemma, whose proof is given in Appendix A.

Pr{ max
where

Remark 7. It is clear that $\epsilon_1 \to 0$ as N → ∞, which together with Equations (15) and (16) implies that we can obtain a codebook satisfying (10) with probability → 1 as N → ∞.
Remark 8. The idea of generating the codebook randomly comes from the random code for AVCs, which was first established by Blackwell et al. [16] and further developed by Ahlswede and Wolfowitz [17] (see also Lemma 12.10 in [18]). The coding scheme for AVCs is based on the following results. Let $C = \{X^N(1), X^N(2), ..., X^N(M)\}$ be a random codebook with $\frac{1}{N} \log M$ being smaller than the capacity. If the decoding scheme of maximal mutual information (MMI) is applied by the decoder, the expected average decoding error probability under each state sequence is less than ε when N is sufficiently large. To make the random coding scheme work, for each transmission, we need a separate channel sharing the exact sample value of the random codebook, which is called the common randomness (CR). However, that would occupy a large amount of bandwidth. To solve this problem, Ahlswede developed an elimination technique [19] and claimed that it sufficed to let the random codebook C be uniformly selected from a collection of $N^2$ deterministic codebooks. Moreover, if the capacity of an AVC was positive, the encoder could send the index of the selected codebook before each transmission, and no extra CR was needed.
In fact, the network model in this paper can be regarded as a special case of AVCs with state sequences known at the receiver, if we ignore the participation of eavesdroppers. The capacity of the current network model is obviously positive since each receiver has access to at least one noiseless channel. Therefore, the coding scheme for AVCs works on the current network with no need of extra CR. Nevertheless, we should point out that the communication model of AVCs with state sequences known at the receiver is essentially different from the classic AVCs. In the former model, the decoder knows exactly the probability distribution of the channel input, and this reduces the difficulty of the coding scheme. In particular, it is proved in Appendix A that a single deterministic codebook is sufficient for the current network model.
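The coding scheme of Section 4.1 (random binary codebook, equal-size subcodes, stochastic encoder, minimal-index decoder) can be sketched as follows. This is only an illustration: the consecutive split into subcodes is an arbitrary equal-size partition and is not claimed to be a "good" partition in the sense of Inequality (14), messages are 0-based, and all names and parameter values are hypothetical.

```python
import random

DUMMY = "?"

def generate_codebook(N, size, rng):
    """Draw `size` codewords independently and uniformly from {0,1}^N."""
    return [tuple(rng.randint(0, 1) for _ in range(N)) for _ in range(size)]

def project(x, index_set):
    """Projection onto the 1-based index set; unseen positions become '?'."""
    return tuple(x[n - 1] if n in index_set else DUMMY
                 for n in range(1, len(x) + 1))

def encode(w, codebook, subcode_size, rng):
    """Pick a codeword uniformly from subcode w (codewords w*size .. (w+1)*size-1)."""
    return codebook[rng.randrange(w * subcode_size, (w + 1) * subcode_size)]

def decode(y, index_set, codebook, subcode_size):
    """Find the minimal index whose projection equals y; return its subcode index."""
    for l, x in enumerate(codebook):
        if project(x, index_set) == y:
            return l // subcode_size
    return None  # no codeword is consistent with the received sequence

rng = random.Random(1)
N, M, subcode_size = 16, 4, 2
codebook = generate_codebook(N, M * subcode_size, rng)
I = set(range(1, 13))  # positions visible to one particular receiver
w = 3
y = project(encode(w, codebook, subcode_size, rng), I)
w_hat = decode(y, I, codebook, subcode_size)
```

Decoding is correct whenever the projections of all codewords onto the receiver's index set are distinct, which is the event whose probability Section 4.2 controls.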

Proof of the Existence of "Good" Partition for Every Given Sample Codebook
This subsection proves the existence of a "good" partition satisfying Equation (14) for every codebook generated via the scheme in Section 4.1, when N is sufficiently large. The result in this subsection establishes Equation (9) immediately on account of Remark 6. Notations in Section 4.1 continue to be used in this subsection.
The main result of this subsection is given in the following lemma.
there exists a partition on it such that for all $I \in \mathcal{I}_{\lambda_2}$.
Remark 9. Equation ( 14) is finally established from the fact that the right-hand side of Equation ( 17) converges to 0 as N → ∞.
Proof of Lemma 2. The main idea of the proof is first pointed out here. For any $I \in \mathcal{I}_{\lambda_2}$, to satisfy $I(\tilde{W}; X^N_I(C)) \to 0$, we need $H(\tilde{W} | X^N_I(C)) \to \log M$. On account of the following obvious equality it suffices to construct a partition satisfying $H(\tilde{W} | X^N_I(C) = z^N) \to \log M$ for almost all $z^N \in \mathcal{X}^N_I$. In the following proof, we will construct a collection of subsets B(C, I) of $\mathcal{X}^N_I$, namely Equation (21), and prove that there exists a partition on C such that H( 23). The proof, based on Csiszár's almost independent coloring scheme, is divided into the following three steps. Step 1 constructs a mapping f : C → [1 : M] satisfying Equations (27) and (28) with the help of Lemma 3. Step 2 establishes Equation (29) from (28). Step 3 constructs a "good" partition satisfying Equation (14) from the mapping f with the help of Lemma 4.
Proof of Step 1. The following lemma plays an important role in the proof of Step 1.

Lemma 3. (Lemma 3.1 in [20]) Let $\mathcal{P}$ be a set of distributions on $\mathcal{A}$. If there exist and l > 0, such that holds for all $P \in \mathcal{P}$, then for any positive integer, there exists a function f holds for all $P \in \mathcal{P}$.

To apply Lemma 3, the main task is to construct the set $\mathcal{P}$. In our proof, each element P in $\mathcal{P}$ is a conditional probability distribution of $X^N(C)$ given $X^N_I(C) = z^N$, for $z^N \in B(C, I)$ and $I \in \mathcal{I}_{\lambda_2}$. The set B(C, I) is defined as where The useful properties of B(C, I) are given in the following proposition.
Proposition 2. For any x N ∈ C, z N ∈ B(C, I) and I ∈ I λ 2 , it follows that and The proof of Proposition 2 will be given later in this subsection.With the help of B(C, I), the parameters introduced in Lemma 3 are introduced as where and The verification that parameters given in Formula (24) satisfy the requirements of Equations ( 18)-( 20) is given in Appendix B.
Remark 10. Since B(C, I) ⊂ Z N for every Applying Lemma 3 with Formula (24), there exists for all $I \in \mathcal{I}_{\lambda_2}$ and $z^N \in B(C, I)$, where $W_f = f(X^N(C))$.
Remark 11. It is clear that the function f produces a partition on the codebook C. Equation (27) ensures that every subcode in the partition has almost the same cardinality. Equation (28) ensures that . On account of Equation (28) and the uniform continuity of entropy (cf. Lemma 2.7 in [18]), it follows that for all $I \in \mathcal{I}_{\lambda_2}$ and $z^N \in B(C, I)$. Combining Equation (23) and the equation above, for every $I \in \mathcal{I}_{\lambda_2}$. Since $H(W_f) \le \log M$, we arrive at for every $I \in \mathcal{I}_{\lambda_2}$. The proof of Step 2 is completed.
Proof of Step 3. The proof depends on the following lemma.
where W is the index of bin containing X N (C), i.e., X N (C) ∈ C W .
The proof of Lemma 4 is discussed in Appendix C. In fact, Equation (27) indicates that the random variable W f is almost uniformly distributed on [1 : M].This implies the cardinalities of the sets f −1 (i), 1 ≤ i ≤ M are quite close.Therefore, a desired partition with the same cardinality can be constructed through slight adjustments.
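The "slight adjustments" mentioned above can be illustrated by a small rebalancing routine. This is a sketch of the general idea only, not the construction of Appendix C: given an almost uniform coloring f of the codeword indices, it moves as few codewords as possible so that every color class has the same size. The function name and the 0-based colors are hypothetical.

```python
def equalize(colors, M):
    """Turn a coloring (colors[l] in 0..M-1) into one with equal-size classes.

    Codewords are only moved out of oversized classes, so the number of
    recolored codewords equals the total surplus of the original coloring.
    """
    total = len(colors)
    if total % M != 0:
        raise ValueError("the codebook size must be a multiple of M")
    target = total // M
    bins = [[] for _ in range(M)]
    for l, c in enumerate(colors):
        bins[c].append(l)
    surplus = []
    for b in bins:  # collect the excess codewords of oversized classes
        while len(b) > target:
            surplus.append(b.pop())
    new_colors = list(colors)
    for m, b in enumerate(bins):  # hand them to undersized classes
        while len(b) < target:
            l = surplus.pop()
            b.append(l)
            new_colors[l] = m
    return new_colors

balanced = equalize([0, 0, 0, 1, 2, 2], M=3)
```

In this toy instance only one codeword changes its color, reflecting the fact that a nearly uniform $W_f$ needs few adjustments.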
From Lemma 4 and Equation (29), for all I ∈ I λ 2 , where M and ε are given by Equations ( 13) and (24), respectively.This completes the proof of Step 3.
The proof of Lemma 2 is completed.

Proof of Proposition 2. Equation (22) follows because
Pr{X for every x N ∈ X N and z N with U(C, z N , I) > 0. Equation (23) follows because where (a) follows from the fact that X N (C) is uniformly distributed on C, (b) follows from Equation ( 12) and the fact that U(C, z N , I) The proof of Proposition 2 is completed.

Converse Half of Theorem 1
This section proves that every achievable rate R should be no greater than α 1 − α 2 , which is the converse half of Theorem 1.The proof is based on the standard technique.
Let (E, φ) be a pair of encoder-decoder satisfying and Therefore, Equations (32) and (33) give Pe = Pr{φ(X N and Deduced from Equation (31), it follows that where δ( Pe ) → 0 as Pe → 0 and the last inequality follows from Fano's inequality. Combining Equation (35) and the equation above, we have , we have ).
Recalling Proposition 1, the equation above is further bounded by where (a) follows from the chain rule; (b) follows because $X_i$ is binary and (c) follows because Substituting Equation (37) into (36), we arrive at The desired inequality $R \le \alpha_1 - \alpha_2$ is finally proved by letting ε → 0 and hence Pe → 0.

Examples
This section gives two simple examples, showing how the coding scheme introduced in Section 4.1 works.
Example 3. Let N = K = 2, $\alpha_1 = 1$ and $\alpha_2 = \frac{1}{2}$. We obtain a network with two sink nodes, depicted in Figure 4. In this network, the legitimate receivers are able to access both of the sink nodes, while the eavesdroppers are able to access only one sink node. It is easy to construct a code with rate $\frac{1}{2}$, $P_e = 0$ and ∆ = 0. The coding scheme is as follows.
• Codebook generation and partition.
Let the codebook $C = \{0, 1\}^2$ be partitioned as
• Encoder. The source message W is uniformly distributed on the message set $\mathcal{W} = \{1, 2\}$ in this example. To transmit W, a random key K, which is uniformly distributed on {1, 2} and independent of W, is first generated. Then, the encoder emits a codeword $x^N(W, K)$ into the network. Figure 5 shows the digital bits emitted into the sink nodes with respect to different values of W and K.
Example 4. We obtain a network with three sink nodes, which is similar to that depicted in Example 2 (see also Figure 3). The only difference is that the block length N = 3 in this example. Therefore, we have $I_{K_i} = K_i = \{i\}$ for i = 1, 2, 3, and $I_{K_{i,j}} = K_{i,j} = \{i, j\}$ for (i, j) = (1, 2), (1, 3), (2, 3).
The codebook is defined and partitioned as The encoding scheme is similar to that introduced in Example 3, and hence omitted.The decoding scheme and the calculation of P e and ∆ are detailed below.

Decoding scheme and calculation of P e .
This part calculates the average decoding error probability of all the three receivers.
• Receiver 1.The received digital sequences of Receiver 1 with different values of W and K are given in the following table.
W and K Received Sequence
The best decoding scheme is to decode sequence '00?' as φ(00?) = 1, and decode the other sequences as φ(01?) = φ(10?) = 2. The decoding error probability of Receiver 1 is $\Pr\{W \neq \hat{W}(I_{K_{12}})\} = 0$.
• Receiver 2. The received digital sequences of Receiver 2 with different values of W and K are given in the following table.
W and K Received Sequence
One of the best decoding schemes is to decode sequence '1?0' as φ(1?0) = 2, and decode the other sequences as φ(0?0) = φ(0?1) = 1. In this case, a decoding error occurs when W = 2 and K = 1. Since W and K are independent and uniformly distributed on {1, 2}, the average decoding error probability of Receiver 2 is $\Pr\{W \neq \hat{W}(I_{K_{13}})\} = \frac{1}{4}$.
Combining the discussions above, it is concluded that the average decoding error probability of the coding scheme is

Calculation of ∆.
This part calculates the amount of the source information exposed to the eavesdroppers.We only take Eavesdropper 1 as an example for the sake of simplicity.The received digital sequences with respect to different values of W and K are given in the following table.

W and K
Received Sequence
According to the table above, it follows that

Theorem 2. Suppose that the wiretap network depicted in Figure 2 with K sink nodes is given. Let $\tilde{\mathcal{X}}$ be an alphabet of size q such that q with $\mu_1 = K\alpha_1$ and $\mu_2 = K\alpha_2$ for fixed $0 \le \alpha_2 < \alpha_1 \le 1$ (cf. Definitions 1 and 3). Then, there exists a pair of linear block encoder $\tilde{E}$ and decoder $(\tilde{\phi}_{K_1}, K_1 \in \mathcal{K}_1)$ working on the alphabet $\tilde{\mathcal{X}}$ such that where $\tilde{P}_e$ and $\tilde{\Delta}$ are given by Equations (38) and (39), respectively.
Theorem 2 asserts that the capacity $\alpha_1 - \alpha_2$ can be achieved with exactly no decoding error and no exposed source information, if the alphabet is sufficiently large. However, it is well known that the alphabets of most channels are binary. To make the theorem work on binary channels, it is necessary to map the elements of GF(q) onto binary digital sequences, which produces the following corollary.
Corollary 1. When $N \ge K \lceil \log q \rceil$ with q satisfying Equation (40), there exists a pair of encoder-decoder (E, φ) formulated by Definitions 1 and 3 (working on the binary alphabet), such that where $P_e$ and ∆ are given by Equations (2) and (3), respectively.
Proof. Let $\tilde{E}$ and $(\tilde{\phi}_{K_1}, K_1 \in \mathcal{K}_1)$ be a linear network code working on the alphabet $\tilde{\mathcal{X}} = GF(q)$ such that Equation (41) holds, where q satisfies Equation (40). Denote $N_K = \lceil \log q \rceil$. It is easy to construct an injection $f : \tilde{\mathcal{X}} \to \mathcal{X}^{N_K}$. Set $N = K N_K$. The stochastic encoder E is defined as for every $x^N \in (f(\tilde{\mathcal{X}}))^K$ and $w \in \mathcal{W}$, where $I_k$ is given by Equation (1) for $1 \le k \le K$ and the range $f(\tilde{\mathcal{X}})$ is a subset of $\mathcal{X}^{N_K}$. The decoder φ is defined as for every $x^N \in (f(\tilde{\mathcal{X}}))^K$ and $K_1 \in \mathcal{K}_1$. One can easily verify that the pair of encoder and decoder (E, φ) constructed above satisfies Equation (42). The proof is completed.
Remark 15. Corollary 1 requires that the block length N should satisfy indicating that N is an approximately quadratic function of K, where the second inequality follows from (40) and the third inequality follows from Lemma 2.3 in [18].
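The growth of the block length under a linear scheme can be illustrated numerically. The sketch below assumes the field-size condition quoted in the Introduction for [5], namely q larger than the binomial coefficient of K and µ2 (with |E| = K); the exact condition of Equation (40) may differ, and the function names are hypothetical. It uses the standard entropy bound on binomial coefficients.

```python
from math import comb, log2

def binary_entropy(a):
    """Binary entropy h(a) in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * log2(a) - (1 - a) * log2(1 - a)

def bits_per_symbol(K, mu):
    """Smallest b such that an alphabet of size 2^b exceeds C(K, mu)."""
    return comb(K, mu).bit_length()

def linear_scheme_block_length(K, mu):
    """Total number of binary digits N = K * b under the assumed bound on q."""
    return K * bits_per_symbol(K, mu)

for K in (10, 100, 1000):
    mu = K // 2
    # bits on each edge versus the entropy estimate K * h(mu / K)
    print(K, bits_per_symbol(K, mu), round(K * binary_entropy(mu / K), 1))
```

Because the number of bits on each edge grows linearly in K, the total block length grows roughly quadratically, whereas a binary scheme keeps one bit per edge.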
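Remark 15 shows that the block length $N$ is approximately a quadratic function of $K$ if the linear network coding scheme introduced in [5] is applied over the wiretap network, and that one needs to emit about $K \cdot \max\{h(\alpha_1), h(\alpha_2)\}$ bits onto each edge. Remark 16 asserts that, by sacrificing a small portion of the transmission rate, the number of bits emitted on each edge can be decreased to $O(\frac{\log K}{\tau})$ if the coding scheme in [6] is applied. Nevertheless, when the number $K$ of sink nodes is large, both encoding processes become quite complicated. To compare the coding scheme in this paper with those in [5,6], the following corollary claims that, by sacrificing a tiny portion of the transmission rate, it is possible to construct a pair of encoder and decoder with a vanishing average decoding error probability and vanishing source information exposed to the eavesdroppers, such that only one bit is transmitted onto each edge when the number $K$ of sink nodes is sufficiently large.

Corollary 2. For any given $0 < \epsilon < 1$ and $0 < \tau < \alpha_1 - \alpha_2$, if the block length $N$ satisfies Inequalities (43)-(47), one can construct a pair of encoder and decoder $(E, \phi)$ formulated by Definitions 1 and 3 (with binary alphabet) such that $\frac{\log M}{N} \ge \alpha_1 - \alpha_2 - \tau$, $P_e < \epsilon$ and $\Delta < \epsilon$.

Proof. On account of Lemma 1, Equation (44) implies the existence of a codebook $\mathcal{C}$ with $P_e = P_e(\mathcal{C}) < \epsilon$. Invoking Lemma 2, Equations (45) and (46) imply the existence of a partition of the codebook $\mathcal{C}$ satisfying Equation (17). The inequality $\Delta < \epsilon$ follows from Equations (17), (47) and Remark 6.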
The corollary is proved.
Remark 17. The constraints (44)-(47) are independent of $K$. Therefore, when $K$ is sufficiently large, only Inequality (43) is active. This indicates that we can set $N = K$, and hence it suffices to emit only one bit onto each edge when the number $K$ of sink nodes is sufficiently large.
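The per-edge cost discussed above can be compared numerically. The sketch below is a rough illustration, assuming that $h$ denotes the binary entropy function in bits and taking the approximate figure of $K \cdot \max\{h(\alpha_1), h(\alpha_2)\}$ bits per edge for the scheme of [5] at face value; the function names and the sample values are hypothetical. The scheme of [6] is omitted because the constant hidden in its $O(\cdot)$ bound is not given here.

```python
import math

def binary_entropy(a):
    """Binary entropy h(a) in bits, with h(0) = h(1) = 0."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def bits_per_edge_linear(K, alpha1, alpha2):
    """Approximate bits per edge for the linear scheme of [5] (Remark 15)."""
    return K * max(binary_entropy(alpha1), binary_entropy(alpha2))

def block_length_linear(K, alpha1, alpha2):
    """Approximate total block length N = K * (bits per edge)."""
    return K * bits_per_edge_linear(K, alpha1, alpha2)

if __name__ == "__main__":
    alpha1, alpha2 = 0.5, 0.4              # illustrative values only
    for K in (100, 1000, 10000):
        linear = bits_per_edge_linear(K, alpha1, alpha2)
        # The scheme of this paper needs one bit per edge (N = K)
        # once K is sufficiently large (Remark 17).
        print(K, round(linear), 1)
```

The comparison shows the per-edge cost of the linear scheme growing linearly in $K$, and hence the block length growing quadratically, while the binary scheme stays at one bit per edge.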
When $N = K$ and $\alpha_1 = 1$, the wiretap network depicted in Figure 2 is equivalent to wiretap channel II [9] with $N$ transmissions: each edge in the network corresponds to one transmission in wiretap channel II (see Figure 1). Therefore, the coding scheme introduced in this paper also works for the communication model of wiretap channel II. By contrast, the coding schemes introduced in [5,6], which depend on the size of the alphabet, do not work for wiretap channel II.

The discussion above shows that the major advantage of the coding scheme in this paper is that there exists a pair of encoder and decoder such that the source message is delivered to the legitimate receivers with exactly one transmission on each edge when the number $K$ of sink nodes is sufficiently large. More precisely, denote by $N^* = N^*(\alpha_1, \alpha_2, \tau, \epsilon)$ the minimal value of $N$ satisfying Equations (44)-(47) for given $\alpha_1$, $\alpha_2$, $\tau$ and $\epsilon$; the number $K$ of sink nodes should be at least $N^*$ to implement the one-time transmission. The values of $N^*$ versus $\alpha_1$ and $\alpha_2$ are given in Figure 6. The figure shows that when $\tau = \frac{\alpha_1 - \alpha_2}{10}$, the value of $N^*$ is totally determined by the value of $\alpha_1 - \alpha_2$. As a concrete example, the data points marked in Figure 6 are those with $\alpha_1 - \alpha_2 = 0.1$; they lie on the same horizontal line and hence share the same value of $N^*$.
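As a small worked example of the parameters involved in the one-time transmission, the sketch below computes $\tau = \frac{\alpha_1 - \alpha_2}{10}$, the guaranteed rate $\alpha_1 - \alpha_2 - \tau$ from Corollary 2, and the corresponding number of message bits when $N = K$. It assumes that $K$ is already at least $N^*$ (which depends on Equations (44)-(47) and is not computed here) and that $K\alpha_1$ and $K\alpha_2$ are rounded down to integers; the function name and the sample values are hypothetical.

```python
from fractions import Fraction
import math

def one_bit_per_edge_parameters(K, alpha1, alpha2):
    """Parameters of the N = K regime with tau = (alpha1 - alpha2) / 10.

    Exact rational arithmetic is used so that differences such as
    3/5 - 1/2 are not distorted by floating-point rounding.
    """
    alpha1, alpha2 = Fraction(alpha1), Fraction(alpha2)
    assert 0 <= alpha2 < alpha1 <= 1
    tau = (alpha1 - alpha2) / 10
    rate = alpha1 - alpha2 - tau           # lower bound on (log M) / N
    return {
        "N": K,                            # one bit per edge
        "mu1": math.floor(K * alpha1),     # nodes seen by a legitimate receiver
        "mu2": math.floor(K * alpha2),     # nodes seen by an eavesdropper
        "tau": tau,
        "rate": rate,
        "message_bits": math.ceil(K * rate),
    }

if __name__ == "__main__":
    params = one_bit_per_edge_parameters(1000, Fraction(3, 5), Fraction(1, 2))
    for name, value in params.items():
        print(name, value)
```

With $\alpha_1 - \alpha_2 = 0.1$, the rate guaranteed by Corollary 2 is $0.09$, that is, nine tenths of the capacity $\alpha_1 - \alpha_2$.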

Conclusions
This paper constructs a secure coding scheme for a special class of networks with a single source node and $K$ sink nodes, and determines its strong secrecy capacity. Unlike the linear network coding schemes developed in [5,6], which rely on Galois fields of sufficiently large size, the coding scheme introduced in this paper works over the binary alphabet and hence can be readily applied to the classic wiretap channel II.
On account of Fano's inequality, the formula above yields

Figure 1. Treating wiretap channel II as a special case of wiretap network.

Figure 2. Communication model of wiretap network in this paper with K sink nodes.

Figure 3. An example of wiretap network II with three sink nodes.

4.1. Code Construction

Codebook generation. Let C = {X^N(l)}, l = 1, ..., M, be the ordered set of M i.i.d. random vectors with mass function Pr{X^N(l) = x^N} = 1

Lemma 2. For any generated sample codebook C of length N

Lemma 4. For any given codebook C, if the function f : C → [1 : M] satisfies Equation (27), there exists a partition

Figure 4. Wiretap network with two sink nodes.

Figure 5. Coding scheme for wiretap network with two sink nodes.

Let 1 M
{C m } be a partition on C with equal cardinality satisfying C m ⊆ D m if |D m | > M M , and D m ⊆ C m otherwise.Denoting by W the index of subcode containing X N (C), i.e., X N (C) ∈ C W , one can obtain that Pr{Wf = W| W = m} = |D m ∩C m | |C m | = min{|D m |,|C m |} |C m | ≥ 1 − 3 √ εfor every m out of E .Consequently,Pr{W f = W} ≥ ∑ m∈[1:M]/E Pr{W f = W| W = m}

Table 1 .
Comparison one can construct a pair of encoder and decoder (E, φ) formulated by Definitions 1 and 3 (with binary alphabet), such that log M N ≥ α 1 − α 2 − τ, P e < and ∆ < .Proof.On account of Lemma 1, Equation (44) claims the existence of codebook C with P e = P e (C) < .Invoking Lemma 2, Equations (45) and (46) indicate the existence of partition on the codebook C satisfying Equation (17).The inequality ∆ < is established from Equations (