Secrecy Capacity of the Extended Wiretap Channel II with Noise

Abstract: The secrecy capacity of an extended communication model of wiretap channel II is determined. In this channel model, the source message is encoded into a digital sequence of length N and transmitted to the legitimate receiver through a discrete memoryless channel (DMC). There exists an eavesdropper who is able to observe arbitrary µ = Nα digital symbols from the transmitter through a second DMC, where 0 ≤ α ≤ 1 is a constant real number. An encoder-decoder pair is designed so that the receiver can recover the source message with a vanishing decoding error probability while the eavesdropper is kept ignorant of the message. This communication model includes a variety of wiretap channels as special cases. The coding scheme is based on that designed by Ozarow and Wyner for the classic wiretap channel II.


Introduction
The concept of the wiretap channel was first introduced by Wyner. In his celebrated paper [1], Wyner considered a communication model where the transmitter communicated with the legitimate receiver through a discrete memoryless channel (DMC). Meanwhile, an eavesdropper observed the digital sequence from the receiver through a second DMC. The goal was to design an encoder-decoder pair such that the receiver was able to recover the source message perfectly, while the eavesdropper was ignorant of the message. That communication model is actually a degraded discrete memoryless wiretap channel.
After that, wiretap channels have been studied from various aspects. Csiszár and Körner [2] considered a more general wiretap channel in which the wiretap channel need not be a degraded version of the main channel, and common messages were also considered there. Other communication models of wiretap channels include wiretap channels with side information [3][4][5][6][7][8], compound wiretap channels [9][10][11][12] and arbitrarily-varying wiretap channels [13].
Ozarow and Wyner studied another kind of wiretap channel called the wiretap channel II [14]. The source message $W$ was encoded into digital bits $X^N$ and transmitted to the legitimate receiver via a binary noiseless channel. An eavesdropper could observe arbitrary µ = Nα digital bits from the receiver, where 0 < α < 1 is a constant real number not dependent on $N$.
Some extensions of the wiretap channel have been studied in recent years.
Cai and Yeung extended the wiretap channel II into the network scenario [15,16]. In that network model, the source message of length $K$ was transmitted to the legitimate users through a network, and the eavesdropper was able to wiretap on at most µ < K edges. Cai and Yeung suggested using a linear "secret-sharing" method to provide security in the network. Instead of sending $K$ message symbols, the source node sent µ random symbols and K − µ message symbols. Additionally, the code itself underwent a certain linear transformation. Cai and Yeung gave sufficient conditions for this transformation to guarantee security. They showed that a secure transformation exists as long as the field size is sufficiently large. Some related work on wiretap networks was given in [17][18][19][20].
An extension of wiretap channel II was considered recently in [21,22]. The source message was encoded into the digital sequence $X^N$ and transmitted to the legitimate receiver via a DMC. The eavesdropper could observe any µ = Nα digital bits of $X^N$ from the transmitter. A pair of inner-outer bounds was given in [21], while the secrecy capacity with respect to the semantic secrecy criterion was established in [22]. The coding scheme in [21] was based on that developed by Ozarow and Wyner in [14], while the scheme in [22] was based on Wyner's soft covering lemma. He et al. considered another extension of wiretap channel II in [23]. In that model, the eavesdropper observed arbitrary µ = Nα digital bits of the main channel output $Y^N$ from the receiver. The capacity with respect to the strong secrecy criterion was established there. The proof of the coding theorem was based on Csiszár's almost independent coloring scheme. Some other work on wiretap channel II can be found in [24][25][26].
This paper considers a more general extension of wiretap channel II, where the eavesdropper observes arbitrary µ digital bits from the transmitter via a second DMC. The capacity with respect to the weak secrecy criterion is established. The coding scheme is based on that developed by Ozarow and Wyner in [14]. This communication model includes the general discrete memoryless wiretap channels, the wiretap channel II and the communication models discussed in [21][22][23] as special cases. Nevertheless, the secrecy criteria considered in [22,23] are strictly stronger than the one considered in this paper; see Figure 1.

[Figure 1: secrecy criteria ordered from stronger to weaker: semantic secrecy criterion considered in [22]; strong secrecy criterion considered in [23,24]; weak secrecy criterion.]

The remainder of this paper is organized as follows. The formal statement and the main results are given in Section 2. The secrecy capacity is formulated in Theorem 1, whose proof is discussed in Section 4. Section 5 provides a binary example of this model. Section 6 gives a final conclusion of this paper.

Notation and Problem Statements
Throughout the paper, $\mathbb{N}$ is the set of positive integers and $[1:N] = \{1, 2, \ldots, N\}$ for any $N \in \mathbb{N}$.
Random variables, sample values and alphabets (sets) are denoted by capital letters, lower case letters and calligraphic letters, respectively. A similar convention is applied to random vectors and their sample values. For example, $X^N$ represents a random $N$-vector $(X_1, X_2, \ldots, X_N)$, and $x^N$ is a specific value of $X^N$ in $\mathcal{X}^N$, where $\mathcal{X}^N$ is the $N$-th Cartesian power of $\mathcal{X}$.
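Let '?' be a "dummy" letter. For any index set $I \subseteq [1:N]$ and finite alphabet $\mathcal{X}$ not containing the "dummy" letter '?', denote:

$$\mathcal{X}^N_I = \{x^N \in (\mathcal{X} \cup \{?\})^N : x_n \in \mathcal{X} \text{ if } n \in I, \text{ and } x_n = {?} \text{ otherwise}\}.$$

For any given random vector $X^N = (X_1, X_2, \ldots, X_N)$ and index set $I \subseteq [1:N]$,
• $X^N_I = (\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_N)$ is a "projection" of $X^N$ onto $I$ with $\tilde{X}_n = X_n$ for $n \in I$ and $\tilde{X}_n = {?}$ otherwise, and
• $X_I = (X_n, n \in I)$ is the subvector of $X^N$ on $I$.

The random vector $X^N_I$ takes values in $\mathcal{X}^N_I$, while the random vector $X_I$ takes values in $\mathcal{X}^{|I|}$.

Example 1. Supposing that $N = 5$, the index set $I = \{1, 3, 5\}$ and the random vector $X^N = (X_1, X_2, X_3, X_4, X_5)$, we have $X^N_I = (X_1, ?, X_3, ?, X_5)$ and $X_I = (X_1, X_3, X_5)$.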
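The two views of a sequence on an index set can be illustrated with a short sketch. The function names `project` and `observed_index_set` are hypothetical helpers introduced here for illustration only; positions are 1-based, as in the text, and the dummy letter is assumed not to belong to the alphabet.

```python
from typing import List, Sequence, Set, Tuple

DUMMY = "?"  # the "dummy" letter; assumed not to be a symbol of the alphabet


def project(x: Sequence[str], index_set: Set[int]) -> Tuple[List[str], List[str]]:
    """Return the projection x_I^N and the subvector x_I of x on index_set.

    Positions are 1-based. The projection keeps length N and puts the dummy
    letter outside index_set; the subvector keeps only the |I| selected symbols.
    """
    masked = [s if n in index_set else DUMMY for n, s in enumerate(x, start=1)]
    sub = [s for n, s in enumerate(x, start=1) if n in index_set]
    return masked, sub


def observed_index_set(masked: Sequence[str]) -> Set[int]:
    """Recover the index set from a projected sequence (non-dummy positions)."""
    return {n for n, s in enumerate(masked, start=1) if s != DUMMY}
```

With `x = ["x1", "x2", "x3", "x4", "x5"]` and the index set `{1, 3, 5}`, `project` reproduces the two vectors of Example 1, and `observed_index_set` recovers the index set from the projection alone.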
The communication model in this paper, which is shown in Figure 2, is composed of an encoder, the main channel, the wiretap channel and a decoder. These parts are specified in Definitions 1 to 4, respectively, and achievability is defined in Definition 5.

Definition 1. (Encoder) The (stochastic) encoder is specified by a matrix of conditional probabilities $\mathrm{En}(x^N|w)$ for the channel input $x^N \in \mathcal{X}^N$ and message $w \in \mathcal{W}$.

Definition 2. (Main channel)
The main channel is a DMC with finite input alphabet $\mathcal{X}$ and finite output alphabet $\mathcal{Y}$, where $? \notin \mathcal{X} \cup \mathcal{Y}$. The transition probability is $Q_M(y|x)$. Let $X^N$ and $Y^N$ denote the input and output of the main channel, respectively. It follows that for any $x^N \in \mathcal{X}^N$ and $y^N \in \mathcal{Y}^N$,

$$\Pr\{Y^N = y^N \mid X^N = x^N\} = Q_M^N(y^N|x^N),$$

where:

$$Q_M^N(y^N|x^N) = \prod_{n=1}^{N} Q_M(y_n|x_n).$$

Definition 3. (Wiretap channel) The eavesdropper is able to observe arbitrary µ = Nα digital bits from the transmitter via another DMC, whose transition probability is denoted as $Q_W(z|x)$ with $x \in \mathcal{X}$ and $z \in \mathcal{Z}$.
The alphabet $\mathcal{Z}$ does not contain the "dummy" letter '?' either. The input $X^N$ and the output $Z^N$ of the wiretap channel satisfy that:

$$\Pr\{Z^N = z^N \mid X^N = x^N\} = \prod_{n=1}^{N} Q_W(z_n|x_n).$$

Supposing that the eavesdropper observes the digital bits of the wiretap channel output $Z^N$ whose indices lie in the observing index set $I$, the subsequence obtained by the eavesdropper can then be denoted by $\tilde{Z}^N = Z^N_I$. Therefore, the information on the source message (of each bit) exposed to the eavesdropper is denoted by:

Definition 5. (Achievability) A non-negative real number $R$ is said to be achievable if, for any $\epsilon > 0$, there exists an integer $N_0$ such that one can construct an $(N, M)$ code satisfying: and: where $N > N_0$. The capacity, or the maximal achievable transmission rate, of the communication model is denoted by $C_s$.
Remark 2. Notice that the capacities defined in this paper are under the condition of negligible average decoding error probability, but one can construct the coding schemes for the negligible maximal decoding error probability through the standard techniques.See Appendix A for details.

Main Result
Theorem 1. The capacity of the communication model described in Figure 2 is:

$$C_s = \max_{U} \left[ I(U;Y) - \alpha I(U;Z) \right],$$

where $U$ is an auxiliary random variable distributed on $\mathcal{U}$ with $|\mathcal{U}| \le |\mathcal{X}|$.
The converse half of Theorem 1 can be established in much the same way as the converse of Theorem 2 in [22], and hence, we omit it here. The direct part of Theorem 1 is given in Section 4.
Corollary 1. When α = 1, the communication model is transformed into a general discrete memoryless wiretap channel, whose capacity is formulated by:

$$C_s = \max_{U} \left[ I(U;Y) - I(U;Z) \right].$$

The result coincides with that of Corollary 2 in [2]. In particular, if $X \to Y \to Z$ forms a Markov chain, the capacity is further deduced by:

$$C_s = \max_{U} \left[ I(U;Y) - I(U;Z) \right] \overset{(a)}{=} \max_{U} I(U;Y|Z) \overset{(b)}{=} \max_{X} I(X;Y|Z) \overset{(c)}{=} \max_{X} \left[ I(X;Y) - I(X;Z) \right],$$

where (a) follows because $U \to Y \to Z$ forms a Markov chain, (b) follows because the Markov chain $U \to X \to Y$ for a given $Z$ implies that $I(U;Y|Z) \le I(X;Y|Z)$, with equality if and only if $U = X$, and (c) follows because $X \to Y \to Z$ forms a Markov chain.
Corollary 2. When $Y = Z$, i.e., the eavesdropper observes µ = Nα digital bits from the receiver, the communication model is transformed into that studied in [23]. In this case, the capacity is formulated by:

$$C_s = \max_{U} \left[ I(U;Y) - \alpha I(U;Y) \right] = (1-\alpha) \max_{U} I(U;Y) = (1-\alpha) \max_{X} I(X;Y),$$

where the last equality follows because the Markov chain $U \to X \to Y$ implies that $I(U;Y) \le I(X;Y)$, with equality if and only if $U = X$.
Notice that [23] considered the capacity with respect to the strong secrecy criterion, while the current paper considers that with the weak secrecy criterion. Therefore, Theorem 1 in [23] and Corollary 2 in the current paper indicate that the capacity with respect to the strong secrecy criterion is identical to that with the weak secrecy criterion.

Corollary 3. When $Z = X$, i.e., the eavesdropper observes µ = Nα digital bits from the transmitter, the communication model is transformed into that studied in [21,22]. In this case, the capacity is formulated by:

$$C_s = \max_{U} \left[ I(U;Y) - \alpha I(U;X) \right].$$

The channel model described in Corollary 3 was first studied in [21], and a pair of inner and outer bounds was given there. The secrecy capacity with respect to the semantic secrecy criterion, which is identical to the capacity with the weak criterion given in Corollary 3, was established in [22].

Direct Part of Theorem 1
This section proves the direct part of Theorem 1. Let an arbitrary quadruple of random variables $(U^*, X^*, Y^*, Z^*)$ be given, which satisfy that: for $(u, x, y, z) \in \mathcal{U} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$. The goal of this section is to establish that every real number $R$ satisfying $0 \le R < I(U^*;Y^*) - \alpha I(U^*;Z^*)$ is achievable. In fact, it suffices to prove the achievability of every $R$ satisfying:

$$0 \le R < I(X^*;Y^*) - \alpha I(X^*;Z^*). \qquad (5)$$

To show this, suppose that every transmission rate $R$ satisfying Equation (5) is achievable. For any random variable $U^*$ satisfying Equation (4), the encoder could deliberately increase the noise of the communication system by inserting a virtual noisy channel $Q_V$ at the transmitting port of the system, such that: This would create the virtual communication system depicted in Figure 3, where the transition matrix of the main channel is: and the transition matrix of the wiretap channel is: It is clear that every real number $0 \le R < I(U^*;Y^*) - \alpha I(U^*;Z^*)$ is achievable for the virtual system and hence is achievable for the original system.
In the remainder of this section, it will be shown that every transmission rate $R$ satisfying Equation (5) is achievable. To be precise, for any $\epsilon$ and $\tau$ satisfying $0 < \epsilon \le \tau < I(X^*;Y^*) - \alpha I(X^*;Z^*)$, we need to establish that there exists an $(N, M)$ code, such that: and: when $N$ is sufficiently large.

The coding scheme is based on the scheme developed by Ozarow and Wyner [14] for the classic wiretap channel II. In that channel model, for each wiretap channel output $z^N$, there exists a collection of codewords that are "consistent" with it, namely the codewords that could produce the wiretap channel output $z^N$ for some observing index set $I$. Ozarow and Wyner constructed a secure partition such that the number of "consistent" codewords, for every wiretap channel output $z^N$, in each sub-code is less than a constant integer. However, it is not feasible to consider the "consistent" codewords in our model, where the wiretap channel may be noisy. Instead, we construct a secure partition such that the number of codewords jointly typical with the wiretap channel output $z^N$ in each sub-code is less than a constant integer.
The proof is organized as follows. Firstly, Section 4.1 gives some definitions on the typicality of µ-subsequences and lists some basic results. Then, the construction of the encoder and decoder is introduced in Section 4.2. The key point is to generate a "good" codebook with the desired partition to ensure secrecy. Thirdly, as the main part of the proof, Section 4.3 shows the existence of a "good" codebook with the desired partition. Finally, for any $\epsilon > 0$ and $\tau > 0$, Section 4.4 details the proof that the coding scheme in Section 4.2 satisfies the requirements of the transmission rate, reliability and security, namely Formulas (6) to (8).
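The effect of inserting a virtual channel in front of a DMC can be sketched numerically. The sketch below assumes that the virtual channel maps $u$ to $x$ with transition matrix `q_v`, so that the effective channel from $u$ to the output is the row-stochastic matrix product of `q_v` with the original transition matrix; the function name `cascade` and the list-of-rows representation are choices made here for illustration, not notation from the paper.

```python
from typing import List

Matrix = List[List[float]]  # row-stochastic: matrix[input][output]


def cascade(q_v: Matrix, q: Matrix) -> Matrix:
    """Transition matrix of the cascade u -> x -> y.

    out[u][y] = sum over x of q_v[u][x] * q[x][y], i.e. the effective
    channel seen from the input of the virtual channel q_v.
    """
    n_x = len(q)
    n_y = len(q[0])
    if any(len(row) != n_x for row in q_v):
        raise ValueError("q_v output alphabet must match q input alphabet")
    return [
        [sum(q_v[u][x] * q[x][y] for x in range(n_x)) for y in range(n_y)]
        for u in range(len(q_v))
    ]
```

Applying `cascade` once with the main channel and once with the wiretap channel gives the two transition matrices of the virtual system under this assumption. For instance, a binary symmetric virtual channel with crossover 0.1 in front of a binary symmetric channel with crossover 0.2 yields a binary symmetric channel with crossover 0.26.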

Typicality
The definitions of letter typicality on a given index set follow from those briefly introduced in [23]. We list them in this subsection for the sake of completeness.
Firstly, the original definitions of letter typicality are given as follows. Please refer to Chapter 1 in [27] for more details.
For any $\delta \ge 0$, the $\delta$-letter typical set $T^N_\delta(P_X)$ with respect to the probability distribution $P_X$ on $\mathcal{X}$ is the set of $x^N \in \mathcal{X}^N$ satisfying:

$$\left| \frac{1}{N} N(a|x^N) - P_X(a) \right| \le \delta \cdot P_X(a) \quad \text{for all } a \in \mathcal{X},$$

where $N(a|x^N)$ is the number of positions of $x^N$ having the letter $a \in \mathcal{X}$. Similarly, let $N(a, b|x^N, y^N)$ be the number of times that the pair $(a, b)$ occurs in the sequence of pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$. The jointly typical set $T^N_\delta(P_{XY})$, with respect to the joint probability distribution $P_{XY}$ on $\mathcal{X} \times \mathcal{Y}$, is the set of pairs $(x^N, y^N)$ satisfying:

$$\left| \frac{1}{N} N(a, b|x^N, y^N) - P_{XY}(a, b) \right| \le \delta \cdot P_{XY}(a, b)$$

for all $(a, b) \in \mathcal{X} \times \mathcal{Y}$.
For any given $x^N \in \mathcal{X}^N$, the conditionally typical set of $x^N$ with respect to the joint mass function $P_{XY}$ is defined as:

$$T^N_\delta(P_{XY}|x^N) = \{ y^N \in \mathcal{Y}^N : (x^N, y^N) \in T^N_\delta(P_{XY}) \}.$$

The definitions on the typicality of µ-subsequences on an index set $I \in \mathcal{I}_\mu$ and some basic results are given as follows.

Definition 6. Given a random variable $X$ on $\mathcal{X}$, the letter typical set $T^N_I[X]_\delta$ with respect to $X$ on the index set $I \in \mathcal{I}_\mu$ is the set of $x^N \in \mathcal{X}^N_I$ such that $x_I \in T^\mu_\delta(P_X)$, where $x_I$ is the µ-subvector of $x^N$ and $P_X$ is the probability mass function of the random variable $X$.
Remark 3. (Theorem 1.1 in [27]) Suppose that $X_1, X_2, \ldots, X_N$ are $N$ i.i.d. random variables with the same generic probability distribution as that of $X$. For any given $I \in \mathcal{I}_\mu$ and $\delta < m_X$, where: and $m_X = \min_{x \in \mathcal{X}: P_X(x) > 0} P_X(x)$.

Definition 8. For any given $x^N \in T^N_I[X]_\delta$ with $I \in \mathcal{I}_\mu$, the conditionally typical set of $x^N$ on the index set $I$ is defined as:

Remark 4. Let $(X^N, Y^N)$ be a pair of random sequences with the conditional mass function: for $x^N \in \mathcal{X}^N$ and $y^N \in \mathcal{Y}^N$. Then, for any index set

Corollary 5. Let $Y_1, Y_2, \ldots, Y_N$ be $N$ i.i.d. random variables with the same probability distribution as that of $Y$. For any $I \in \mathcal{I}_\mu$ and $x^N \in T^N_I[X]_\delta$, it follows that:

Remark 5. (Theorem 1.2 in [27]) Let $(X^N, Y^N)$ be a pair of random sequences satisfying (9). For any index set $I \in \mathcal{I}_\mu$, $0 \le \delta < m_{XY}$ and $x^N \in T^N_I[X]_\delta$, it is satisfied that: where: and $m_{XY} = \min_{(x,y) \in \mathcal{X} \times \mathcal{Y}: P_{XY}(x,y) > 0} P_{XY}(x, y)$.
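A membership test for letter typicality, on the whole sequence and on an index set as in Definition 6, might look as follows. This is a sketch under an assumption: it uses the common relative-deviation condition $|N(a|x^N)/N - P_X(a)| \le \delta P_X(a)$ for every letter $a$, which is the convention this sketch takes for the $\delta$-letter typical set. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Set


def is_letter_typical(x: Sequence[str], p_x: Dict[str, float], delta: float) -> bool:
    """Check whether x is delta-letter typical with respect to p_x.

    Assumed condition: |N(a|x)/N - P(a)| <= delta * P(a) for every letter a.
    A letter of probability zero must therefore not occur in x.
    """
    n = len(x)
    if n == 0:
        return False
    counts = Counter(x)
    if any(a not in p_x for a in counts):
        return False  # symbol outside the alphabet of p_x
    return all(abs(counts[a] / n - p) <= delta * p for a, p in p_x.items())


def is_typical_on_index_set(
    x: Sequence[str], index_set: Set[int], p_x: Dict[str, float], delta: float
) -> bool:
    """Typicality on an index set: only the subvector x_I (1-based) is tested."""
    sub = [s for pos, s in enumerate(x, start=1) if pos in index_set]
    return is_letter_typical(sub, p_x, delta)
```

A sequence can be typical on an index set without being typical as a whole: with a uniform binary distribution and δ = 0.1, the sequence 01000 is not typical, but its subvector on {1, 2} is.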

Code Construction
Suppose that the triple of random variables $(X^*, Y^*, Z^*)$ is given and fixed.
Codeword generation: The random codebook $\mathbf{C} = \{X^{*N}(l)\}_{l=1}^{M'}$ is an ordered set of $M'$ i.i.d. random vectors with mass function $\Pr\{X^{*N}(l) = x^N\} = \prod_{i=1}^{N} P_{X^*}(x_i)$, where:

$$M' = 2^{N(I(X^*;Y^*) - \tau - \tau_d)} \qquad (10)$$

for some $\tau_d > 0$.
Codeword partition: Given a specific sample value $C = \{x^N(l)\}_{l=1}^{M'}$ of the $M'$ randomly-generated codewords, let $W'$ be a random variable uniformly distributed on $[1:M']$ and $X^N(C) = x^N(W')$ be the random sequence uniformly distributed on $C$. Set $R = I(X^*;Y^*) - \alpha I(X^*;Z^*) - \tau$, and partition $C$ into:

$$M = 2^{NR} \qquad (11)$$

subsets $\{C_m\}_{m=1}^{M}$ with the same cardinality. Let $\tilde{W}$ be the index of the sub-code containing $X^N(C)$, i.e., $X^N(C) \in C_{\tilde{W}}$. We need to find a partition of the codebook $C$ satisfying that: max where $Z^N(C)$ is the output of the wiretap channel when taking $X^N(C)$ as the input, i.e., for $x^N \in \mathcal{X}^N$ and $z^N \in \mathcal{Z}^N$. If there is no such desired partition, declare an encoding error.
Remark 6. We call the codebook $C$ an ordered set because each codeword in the codebook is treated as unique, even if its value coincides with that of another codeword.
Encoder: Suppose that a desired partition $\{C_m\}_{m=1}^{M}$ on a specific codebook $C$ is given. When the source message $W$ is to be transmitted, the encoder chooses a codeword uniformly at random from the sub-code $C_W$ and transmits it over the main channel.
In this encoding scheme, each message is related to a unique sub-code, which is sometimes called a bin.Therefore, we would call this kind of coding scheme the random binning scheme.
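The mechanics of random binning can be sketched as below. This is an illustration under simplifying assumptions, not the construction analysed in the proof: the partition here is a uniformly random equipartition and is not checked against the secrecy requirement, the sizes are tiny rather than exponential in $N$, and messages and codeword indices are 0-based. All function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Codeword = Tuple[str, ...]


def generate_codebook(
    num_codewords: int, n: int, p_x: Dict[str, float], rng: random.Random
) -> List[Codeword]:
    """Draw num_codewords i.i.d. codewords of length n with i.i.d. letters from p_x.

    The result is an ordered list, so equal-valued codewords stay distinct entries.
    """
    letters = list(p_x)
    weights = [p_x[a] for a in letters]
    return [tuple(rng.choices(letters, weights=weights, k=n)) for _ in range(num_codewords)]


def equipartition(num_codewords: int, num_bins: int, rng: random.Random) -> List[List[int]]:
    """Split codeword indices into num_bins bins of equal size, uniformly at random."""
    if num_codewords % num_bins != 0:
        raise ValueError("number of codewords must be a multiple of the number of bins")
    order = list(range(num_codewords))
    rng.shuffle(order)
    size = num_codewords // num_bins
    return [order[i * size:(i + 1) * size] for i in range(num_bins)]


def encode(message: int, codebook: List[Codeword], bins: List[List[int]],
           rng: random.Random) -> Codeword:
    """Stochastic encoder: pick a codeword uniformly from the bin of the message."""
    return codebook[rng.choice(bins[message])]
```

For example, `generate_codebook(8, 5, {"0": 0.5, "1": 0.5}, rng)` followed by `equipartition(8, 4, rng)` gives four bins of two codewords each, and `encode(2, codebook, bins, rng)` returns one of the two codewords in bin 2. The randomness inside each bin is what is meant to confuse the eavesdropper.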
Remark 7. For a given codebook $C$ and a desired partition applied to the encoder, let $X^N$ and $Z^N$ be the input and output of the wiretap channel, respectively, when the source message $W$ is transmitted. It is clear that $(W, X^N, Z^N)$ and $(\tilde{W}, X^N(C), Z^N(C))$ share the same joint distribution.
Decoder: Suppose that the output of the main channel is $y^N$. The decoder tries to find a unique sequence $x^N(\hat{w}, \hat{j})$ such that $(x^N(\hat{w}, \hat{j}), y^N) \in T^N_\delta(P_{X^*Y^*})$ and decodes $\hat{w}$ as the estimate of the transmitted source message. If there is no such sequence, or more than one, the decoder chooses a constant $w_0$ as $\hat{w}$.

Proof of the Existence of a "Good" Codebook with a Secure Partition
This subsection proves the existence of a class of "good" codebooks, on which there exist secure partitions, such that Equation (12) holds when $N$ is sufficiently large and $\delta$ is sufficiently small. Moreover, such "good" codebooks can be randomly generated with probability → 1 as $N \to \infty$. The notation in Section 4.2 will continue to be used in this subsection.
A formal definition of "good" codebooks is given by the following.
)M for all $I \in \mathcal{I}_\mu$ (13) and: where: is the set of typical codewords on the index set $I$, is the set of codewords jointly typical with $z^N$ on the index set $I$ and:

The main results of this subsection are summarized in the following three lemmas. Lemma 1 claims the existence of "good" codebooks; Lemma 2 constructs a special class of partitions on the "good" codebooks; and Lemma 3 proves that the partitions constructed by Lemma 2 are secure.

Lemma 1.
Let $\mathbf{C}$ be the random codebook generated by the scheme introduced in Section 4.2. If $\delta < m_{X^*}$, the probability of $\mathbf{C}$ being "good" is bounded by: where: and $\epsilon_1$ is given by Equation (15).
The proof of Lemma 1 is detailed in Appendix B.
Remark 8. It can be verified that $\epsilon_3, \epsilon_4 \to 0$ as $N \to \infty$, if $\delta$ is sufficiently small. Therefore, one can obtain a "good" codebook with probability → 1.
Lemma 2. For any given codebook $C$ satisfying Equation (14) with $M' = 2^{N(I(X^*;Y^*) - \tau - \tau_d)}$ codewords, there exists a secure equipartition $\{C_m\}_{m=1}^{M}$ on it, such that: for all $1 \le m \le M$, $I \in \mathcal{I}_\mu$ and z , where: The proof of Lemma 2 is discussed in Appendix C.
Lemma 3. For any $0 < \delta < m_{X^*Y^*}$ and secure partition $\{C_m\}_{m=1}^{M}$ on a "good" codebook $C$, it follows that: where: and:

Remark 9. Formula (12) is finally established from the fact that the right-hand side of Equation (19) converges to zero as $\tau_d \to 0$ and $N \to \infty$.
Proof of Lemma 3. In a way similar to establishing Equation (22) in [2], for every $I \in \mathcal{I}_\mu$, it follows that: where the last equality follows because $W \to X^N(C) \to Z^N_I(C)$ forms a Markov chain. Therefore: The terms in the rightmost side of Equation (21) are bounded as follows.
On account of the fact that $X^N(C)$ is uniformly distributed on the "good" codebook $C$ (cf. Equation (13)), it follows that: for every $I \in \mathcal{I}_\mu$. Combining Remark 5 yields: for every $I \in \mathcal{I}_\mu$, where $\epsilon_5$ is given by Equation (20). Denote:

It follows that:
$$\Pr\{U_I = 1\} < \epsilon_5 \qquad (22)$$

for every $I \in \mathcal{I}_\mu$. Moreover, on account of the property of (18), we also have: Therefore: where (a) follows from Equation (23) and the fact that $U_I$ is binary and (b) follows from Equation (22).

• The value of $H(Z^N_I(C))$ is upper bounded as: where (a) follows from the facts that $U_I$ is binary and $Z^N_I(C) \in T^N_I[Z^*]_{2\delta}$ when $U_I = 0$ and (b) follows from Equation (22).

• Recalling that, given $W = w$, the random vector $X^N(C)$ is uniformly distributed on $C_w$, we have:

• Lower bound of $H(Z^N_I(C)|X^N(C))$. For any $x^N$ satisfying $x^N_I \in T^N_I[X^*]_{2\delta}$, we have: where the function $N(x|x_I)$ represents the number of occurrences of the letter $x$ in the sequence $x_I$, and the inequality follows from the definition of $T^N_I[X^*]_{2\delta}$. Therefore,

By now, the terms in the rightmost side of Equation (21) have been bounded as expected. Substituting Equations (24) to (27) into (21) gives Equation (19). The proof of Lemma 3 is completed.

Proofs of Equations
if the codebook is "good", which implies Equation (7). By the standard channel coding scheme (cf., for example, Section 7.5 of [28]), using the random codebook-generating scheme in Section 4.2, one can obtain a codebook satisfying Equation (8) with probability → 1 as $N \to \infty$.
Combining this with Lemma 1, it is established that one can get a codebook achieving both Equations (7) and (8) with probability → 1. Equation (6) is obvious from the coding scheme. The proofs are completed.

Example
This section studies a concrete example of the communication model depicted in Figure 2 and formulated in Section 2, where the main channel is a discrete memoryless binary symmetric channel (DM-BSC) with crossover probability $0 \le p \le \frac{1}{2}$, and the eavesdropper observes arbitrary µ = Nα digital bits from the transmitter through a binary noiseless channel. This indicates that the transition matrices $Q_M$ and $Q_W$ (introduced in Definitions 2 and 3) satisfy that:

$$Q_M(y|x) = \begin{cases} p & \text{if } y \ne x, \\ 1 - p & \text{otherwise} \end{cases}$$

and:

$$Q_W(z|x) = \begin{cases} 1 & \text{if } z = x, \\ 0 & \text{otherwise,} \end{cases}$$

where $x$, $y$ and $z$ all take values in the binary alphabet $\mathcal{X} = \mathcal{Y} = \mathcal{Z} = \{0, 1\}$. This example is in fact a special case of the model considered in [22]. Corollary 3 gives that the secrecy capacity of this example is:

$$C_s(\alpha, p) = \max_{U} \left[ I(U;Y) - \alpha I(U;X) \right], \qquad (28)$$

where it suffices for the auxiliary random variable $U$ to be binary. However, Formula (28) is not explicit; it is an optimization problem. In this section, we solve this optimization problem and find the random variable $U$ achieving the maximum. The main result is given in Proposition 1, whose proof is based on Lemma 4.

Suppose that the random variables $U$, $X$ and $Y$ are all distributed on $\{0, 1\}$, and they satisfy that: for some $0 \le \beta_U, q_0, q_1 \le 1$. Formula (28) can then be rewritten as: where: and: and: for $0 \le \beta \le 1$, the function $C_s(\alpha, p)$ can then be further represented as: $C(\beta_U, q_0, q_1)$.
To determine the value of $C_s(\alpha, p)$, some properties of the function $g$ are given in the following lemma.

Lemma 4. For any given $0 \le \alpha, p \le 1$, it follows that:
2. when $\alpha \ge (1 - 2p)^2$, the function $g$ is convex over $[0, 1]$, and $\beta = \frac{1}{2}$ is the unique minimal point.
3. when $0 \le \alpha < (1 - 2p)^2$, there exists a unique minimal point $\beta^* < \frac{1}{2}$ on the interval $[0, \frac{1}{2}]$, and hence, $1 - \beta^*$ is the unique minimal point over the interval $[\frac{1}{2}, 1]$.

On account of Lemma 4, we conclude that:

Proposition 1. The secrecy capacity can be represented as:

$$C_s(\alpha, p) = g(\tfrac{1}{2}) - g(\beta^*),$$

where $\beta^*$ is the unique minimal point of the function $g$ over the interval $[0, \frac{1}{2}]$. Moreover, $C_s$ is positive if and only if $\alpha < (1 - 2p)^2$.
The first part is proven by contradiction. Suppose that We can assume that $0 < \beta_X < \beta^*$ without loss of generality. In this case, it must follow that $q_0 \le \beta_X \le \beta^*$ or $q_1 \le \beta_X \le \beta^*$, since $\beta_X = \beta_U q_0 + (1 - \beta_U) q_1$ is a convex combination of $q_0$ and $q_1$. We further suppose that $q_0 < \beta_X < \beta^*$. If it is also true that $q_1 < \beta^*$, then it follows immediately that $C(\beta_U, q_0, q_1) \le 0$ on account of the fact that the function $g$ is convex over the interval $[0, \beta^*]$. On the other hand, if $q_1 > \beta^*$, we let $\beta^*_U$ satisfy that: Then, it follows clearly that $\beta^*_U \le \beta_U$, and hence: $C(\beta_U, q_0, q_1) = g(\beta_U q_0 + (1$ where (a) follows because $\beta^*$ is the minimal point of $g$ and $\beta^*_U < \beta_U$ and (b) follows because $g$ is convex over the interval $[0, \beta^*]$. This contradicts the assumption that $C(\beta_U, q_0, q_1) > 0$.
To prove the second part, consider that when $\beta^* < \beta_X < 1 - \beta^*$, we have:

$$C(\beta_U, q_0, q_1) \le g(\tfrac{1}{2}) - g(\beta^*),$$

where the inequality holds because $\frac{1}{2}$ is the maximal point over the interval $[\beta^*, 1 - \beta^*]$ and $\beta^*$ is the minimal point over the interval $[0, 1]$. The formula above indicates that $C_s \le g(\frac{1}{2}) - g(\beta^*) = 1 - \alpha - g(\beta^*)$. The equality holds if $q_0 = \beta^*$, $q_1 = 1 - \beta^*$ and $\beta_U = \beta_X = \frac{1}{2}$; see Figure 5. The proof of Proposition 1 is completed.

It is clear that when $p = 0$, the communication model discussed in this section is specialized as wiretap channel II. In that case, the secrecy capacity is obviously a linear function of α. However, when $p > 0$, the linearity does not hold. Instead, the secrecy capacity is a convex function of α. See Figure 6.
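The closed form in Proposition 1 can be evaluated numerically. The sketch below rests on an assumption about $g$: it takes $g(\beta) = h(\beta \star p) - \alpha h(\beta)$, where $h$ is the binary entropy function and $\beta \star p = \beta(1-p) + (1-\beta)p$. This choice is consistent with writing $I(U;Y) - \alpha I(U;X)$ as $g(\beta_X) - \beta_U g(q_0) - (1 - \beta_U) g(q_1)$ and gives $g(\frac{1}{2}) = 1 - \alpha$, but it is a reconstruction rather than a quotation. The minimum over $[0, \frac{1}{2}]$ is found by a plain grid search, and the function names are hypothetical.

```python
import math


def h2(x: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def g(beta: float, alpha: float, p: float) -> float:
    """Assumed form g(beta) = h(beta * p) - alpha * h(beta), '*' = binary convolution."""
    conv = beta * (1.0 - p) + (1.0 - beta) * p
    return h2(conv) - alpha * h2(beta)


def secrecy_capacity(alpha: float, p: float, grid: int = 20001) -> float:
    """Approximate C_s = g(1/2) - min over beta in [0, 1/2] of g(beta)."""
    # The last grid point is exactly beta = 1/2, so the result is never negative.
    g_min = min(g(i / (2 * (grid - 1)), alpha, p) for i in range(grid))
    return g(0.5, alpha, p) - g_min
```

Under this assumption the sketch reproduces the boundary cases mentioned in the text: with $p = 0$ it returns $1 - \alpha$, the capacity of the classic wiretap channel II; with $\alpha = 0$ it returns $1 - h(p)$, the capacity of the binary symmetric channel; and it returns zero once $\alpha \ge (1 - 2p)^2$.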

Conclusions
The paper considers a communication model of extended wiretap channel II. In this new model, the source message is sent to the legitimate receiver through a discrete memoryless channel (DMC), and there exists an eavesdropper who is able to observe the digital sequence from the transmitter through a second DMC. The coding scheme is based on that developed by Ozarow and Wyner for the classic wiretap channel II. This communication model includes the general discrete memoryless wiretap channels, the wiretap channel II and the communication models discussed in [21][22][23] as special cases.
for all $1 \le m \le M$, $I \in \mathcal{I}_\mu$ and $z^N \in T^N_I[Y^*]_{2\delta}$, where $L$ is a sufficiently large constant. This partition is actually secure.
where $M'$ and $M$ are given by Equations (10) and (11), respectively. Therefore, The proof is completed.

Appendix C. Proof of Lemma 2
This Appendix proves that for any given "good" codebook $C$, there exists a secure partition $\{C_m\}_{m=1}^{M}$ on it such that: The proof is quite similar to the proof of Lemma 2 in [14]. Most notation in this Appendix will follow that in [14].
Let $\mathcal{F}$ be the set of all possible equipartitions of the codebook $C$. Each element in $\mathcal{F}$ is actually a function For any $f \in \mathcal{F}$, let $\Psi(f) = 0$ if the partition produced by $f$ is secure, and $\Psi(f) = 1$ otherwise. Then, it suffices to prove that: where $F$ is the random variable uniformly distributed on $\mathcal{F}$. To this end, for any $1 \le m \le M$, denote: It follows clearly that: Then, it follows that: since the codebook $C$ is "good". Therefore, the probability that there are $t$ codewords of $F^{-1}(m)$ belonging to $T(C, I, z^N)$ is given by:

$$\Pr\{|T(C, I, z^N) \cap F^{-1}(m)| = t\} = \frac{\binom{|T(C, I, z^N)|}{t} \binom{M' - |T(C, I, z^N)|}{M'/M - t}}{\binom{M'}{M'/M}}.$$
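The counting step can be made concrete with a small sketch. It assumes that, under a uniformly random equipartition, a fixed bin is a uniformly random subset of the codebook of the bin size, so that the number of "marked" codewords (those jointly typical with a given wiretap output) falling in the bin follows a hypergeometric distribution. The function name and argument names are hypothetical.

```python
from math import comb


def prob_marked_in_bin(total: int, marked: int, bin_size: int, t: int) -> float:
    """Probability that exactly t of the marked codewords land in a fixed bin.

    total    -- number of codewords in the codebook
    marked   -- number of codewords in the jointly typical set
    bin_size -- number of codewords per bin (total divided by number of bins)
    The bin is assumed to be a uniformly random subset of size bin_size.
    """
    if t < 0 or t > marked or t > bin_size or bin_size - t > total - marked:
        return 0.0
    return comb(marked, t) * comb(total - marked, bin_size - t) / comb(total, bin_size)
```

For a codebook of 4 codewords split into bins of size 2, with 2 marked codewords, the probabilities of finding 0, 1 or 2 marked codewords in a given bin are 1/6, 2/3 and 1/6. In the proof the interest is in the tail of this distribution, i.e., the probability that a bin holds too many typical codewords.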
Proof of Property 3. We firstly find the solution for the inequality g (β) < 0. This inequality indicates that: < 0.

Figure 1. Comparison of different secrecy criteria.

Figure 2. Communication model of wiretap channel II with noise.

Remark 1. Clearly, for a given wiretap channel output $\tilde{Z}^N$, one can easily determine which subsequence of $Z^N$ is observed by the eavesdropper. More precisely, $\tilde{Z}^N = Z^N_I$ with $I = I(\tilde{Z}^N) = \{i : \tilde{Z}_i \ne {?}\}$.

Definition 4. (Decoder) The decoder is a mapping $\phi : \mathcal{Y}^N \to \mathcal{W}$, with $Y^N$ as the input and $\hat{W} = \phi(Y^N)$ as the output. The average decoding error probability is defined as $P_e = \Pr\{W \ne \hat{W}\}$.

Definition 7. Let $(X, Y)$ be a pair of random variables with the joint probability mass function $P_{XY}$ on $\mathcal{X} \times \mathcal{Y}$. The jointly typical set $T^N_I[XY]_\delta$ with respect to $(X, Y)$ on the index set $I \in \mathcal{I}_\mu$ is the set of $(x^N, y^N) \in \mathcal{X}^N_I \times \mathcal{Y}^N_I$ satisfying $(x_I, y_I) \in T^\mu_\delta(P_{XY})$, where $x_I$ and $y_I$ are the subvectors of $x^N$ and $y^N$, respectively.
(C1) In the remainder of the proof, we firstly bound the value of $E[\Phi(F, m, I, z^N)]$, and then, the value of $E[\Psi(F)]$ is bounded by Equation (C1). Upper bound of $E[\Phi(F, m, I, z^N)]$. For any $I \in \mathcal{I}_\mu$ and $z^N \in T^N_I[Y^*]_{2\delta}$, let: