Feeding Back the Output or Sharing the State: Which Is Better for the State-Dependent Wiretap Channel?

Abstract: In this paper, the general wiretap channel with channel state information (CSI) at the transmitter and noiseless feedback is investigated, where the feedback is from the legitimate receiver to the transmitter, and the CSI is available at the transmitter in a causal or noncausal manner. The capacity-equivocation regions are determined for this model in both the causal and noncausal cases, and the results are further explained via Gaussian and binary examples. For the Gaussian model, we find that in some particular cases, the noiseless feedback performs better than Chia and El Gamal's CSI sharing scheme, i.e., the secrecy capacity of this feedback scheme is larger than that of the CSI sharing scheme. For the degraded binary model, we find that the noiseless feedback performs no better than Chia and El Gamal's CSI sharing scheme. However, if the cross-over probability of the wiretap channel is large enough, we show that the two schemes perform the same.


Introduction
It is well known that the capacity of a point-to-point discrete memoryless channel (DMC) cannot be increased by using noiseless feedback. However, does feedback (from the legitimate receiver to the transmitter) enhance the security of the wiretap channel? Ahlswede and Cai [1] and Dai et al. [2] studied this problem. Specifically, Ahlswede and Cai [1] showed that the secrecy capacity C_s^(f) of the degraded wiretap channel with noiseless feedback is given by:

C_s^(f) = min{max_{P_X} I(X; Y), max_{P_X} H(Y|Z)},    (1)

where X, Y and Z are for the transmitter, legitimate receiver and wiretapper, respectively, and X → Y → Z forms a Markov chain. Recall that the secrecy capacity C_s of the degraded wiretap channel (without feedback) was determined by Wyner [3], and it is given by:

C_s = max_{P_X} min{I(X; Y), I(X; Y) − I(X; Z)}.    (2)
From (1) and (2), it is easy to see that the noiseless feedback increases the secrecy capacity of the wiretap channel. Based on the work of [1], Dai et al. [2] studied a special wiretap channel with feedback (Y → X → Z) and showed that the secrecy capacity of this model is larger than that of the corresponding model without feedback. In this paper, the capacity-equivocation region of the model of Figure 1 is determined for both the noncausal and causal cases, and the results are further explained via degraded binary and Gaussian examples. For the Gaussian example, we find that both the feedback scheme and the CSI sharing scheme [13] help to enhance the security of the wiretap channel with noncausal CSI at the transmitter [10,12]; moreover, we find that in some particular cases, the noiseless feedback performs even better than the shared CSI [13], i.e., the secrecy capacity of the degraded Gaussian case of the model of Figure 1 is larger than that of the degraded Gaussian case of [13]. For the binary example, we also find that both the feedback scheme and the CSI sharing scheme [13] help to enhance the security of the wiretap channel with causal CSI at the transmitter. Unlike the Gaussian case, we find that the noiseless feedback performs no better than the shared CSI [13], i.e., the secrecy capacity of the degraded binary case of the model of Figure 1 is not more than that of the degraded binary case of [13]. However, if the cross-over probability of the wiretap channel is large enough, we find that the two schemes perform the same.
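The gain of (1) over (2) can be illustrated numerically. The sketch below is our own toy example, not a computation from this paper: the main channel is a BSC(p), the wiretapper observes Y through a further BSC(q'), and with a uniform input I(X; Y) = 1 − h(p), I(X; Z) = 1 − h(p * q') and H(Y|Z) = h(q'), where * denotes BSC cascading.

```python
import math

def hb(x):
    """Binary entropy in bits; hb(0) = hb(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def conv(p, q):
    """Crossover probability of two cascaded BSCs."""
    return p * (1 - q) + q * (1 - p)

p, qp = 0.1, 0.2                         # main channel BSC(p), Y -> Z BSC(q')
c_s  = hb(conv(p, qp)) - hb(p)           # Wyner, Formula (2), uniform input
c_sf = min(1 - hb(p), hb(qp))            # Ahlswede-Cai, Formula (1)
assert c_sf >= c_s                       # feedback never hurts in this example
print(round(c_s, 4), round(c_sf, 4))
```

For these illustrative values, the feedback secrecy capacity strictly exceeds the non-feedback one.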
The remainder of this paper is organized as follows. The capacity-equivocation region of the model of Figure 1 is provided in Section 2. Gaussian and binary examples of the model of Figure 1 are shown in Section 3. Section 4 is for the final conclusion.

Capacity-Equivocation Region of the Model of Figure 1
In this paper, random variables, sample values and alphabets are denoted by capital letters, lower case letters and calligraphic letters, respectively. A similar convention is applied to the random vectors and their sample values. For example, U^N denotes a random N-vector (U_1, ..., U_N), and u^N = (u_1, ..., u_N) is a specific vector value in U^N, the N-th Cartesian power of U. U_i^N denotes a random (N − i + 1)-vector (U_i, ..., U_N), and u_i^N = (u_i, ..., u_N) is a specific vector value in U_i^N. Let P_V(v) denote the probability mass function Pr{V = v}. Throughout the paper, the logarithmic function is to the base two.

Definitions of the Model of Figure 1
Let W, uniformly distributed over the alphabet W, be the message sent by the transmitter. The components of the channel state sequence V^N are independent and identically distributed, each drawn according to P_V(v), and V^N is independent of W. Let Y^{i−1} (2 ≤ i ≤ N) be the feedback available at time i from the legitimate receiver to the transmitter. For the noncausal case, the i-th time channel encoder f_i is a (stochastic) mapping:

f_i : W × Y^{i−1} × V^N → X,

where f_i(w, y^{i−1}, v^N) = x_i ∈ X, w ∈ W, y^{i−1} ∈ Y^{i−1} and v^N ∈ V^N. For the causal case, the i-th time channel encoder f_i is a (stochastic) mapping:

f_i : W × Y^{i−1} × V^i → X,

where f_i(w, y^{i−1}, v^i) = x_i ∈ X, w ∈ W, y^{i−1} ∈ Y^{i−1} and v^i ∈ V^i. Here, note that for the causal case, the encoder at time i depends only on the state symbols up to time i. The channel is discrete memoryless, and its transition probability is given by P_{Y,Z|X,V}(y, z|x, v). The wiretapper's equivocation about the message W is denoted by:

∆ = (1/N) H(W|Z^N).

The decoder f_D is a function that maps a received sequence of N channel outputs to the message set. We denote the probability of error P_e by Pr{W ≠ Ŵ}. A pair (R, R_e) (R, R_e > 0) is said to be achievable if, for arbitrarily small positive ε, there exists an encoding-decoding scheme such that:

(log |W|)/N ≥ R − ε,    ∆ ≥ R_e − ε,    P_e ≤ ε.

The set R^(nf), which is composed of all achievable (R, R_e) pairs, is called the capacity-equivocation region of the model of Figure 1 with noncausal CSI at the transmitter. The achievable rate C_s^(nf), which is denoted by:

C_s^(nf) = max{R : (R, R) ∈ R^(nf)},

is called the secrecy capacity of the model of Figure 1 with noncausal CSI at the transmitter. Analogously, let R^(cf) be the capacity-equivocation region of the model of Figure 1 with causal CSI at the transmitter, and let C_s^(cf), which is denoted by:

C_s^(cf) = max{R : (R, R) ∈ R^(cf)},

be the secrecy capacity of the model of Figure 1 with causal CSI at the transmitter.

Main Result of the Model of Figure 1
The following Theorem 1 characterizes the capacity-equivocation region R^(nf) of the model of Figure 1 with noncausal CSI at the transmitter.

Theorem 1. A single-letter characterization of the region R^(nf) is as follows:

R^(nf) = {(R, R_e) : 0 ≤ R_e ≤ R, R ≤ I(K; Y) − I(K; V), R_e ≤ H(Y|Z)}

for some distribution:

P_KVXYZ(k, v, x, y, z) = P_{Z,Y|X,V}(z, y|x, v) P_{X|K,V}(x|k, v) P_{KV}(k, v),

which implies the Markov chain K → (X, V) → (Y, Z).

Proof. See Sections A and B.

Remark 1.
• The range of the auxiliary random variable K satisfies |K| ≤ |X||V| + 1. The proof is standard and easily obtained by using the support lemma (see [15]), and thus, we omit it here.
• Corollary 1. The secrecy capacity C_s^(nf) satisfies:

C_s^(nf) = max min{I(K; Y) − I(K; V), H(Y|Z)}.    (12)

Proof. Substituting R_e = R into the region R^(nf) in Theorem 1, we have R ≤ I(K; Y) − I(K; V) and R ≤ H(Y|Z). By using (10), (13) and (14), Formula (12) is achieved; thus, the proof is completed.
• Here, note that if Z^N is a degraded version of Y^N (which implies the existence of the Markov chain K → (X, V) → Y → Z), the capacity-equivocation region R^(nf) still holds. The proof of this degraded case is along the lines of the proof of Theorem 1, and thus, we omit it here. In [10,12], an achievable rate-equivocation region R^(ni) is provided for the wiretap channel with noncausal CSI, where the joint probability distribution P_KVXYZ(k, v, x, y, z) of R^(ni) satisfies the Markov chain K → (X, V) → Y → Z. Here, note that:

H(Y|Z) ≥ I(K; Y|Z) (a)= I(K; Y) − I(K; Z),

where (a) is from K → Y → Z. Therefore, it is easy to see that the achievable rate-equivocation region R^(ni) of [10] and [12] is enhanced by using this noiseless feedback.
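The step (a) above can be expanded into a short chain of identities: writing I(K; Y, Z) in two ways and using I(K; Z|Y) = 0 (from K → Y → Z) gives

```latex
\begin{align*}
I(K;Y) - I(K;Z)
&= \big[I(K;Y,Z) - I(K;Z\mid Y)\big] - \big[I(K;Y,Z) - I(K;Y\mid Z)\big] \\
&= I(K;Y\mid Z) - I(K;Z\mid Y) \\
&\overset{(a)}{=} I(K;Y\mid Z)
   \qquad \text{(since } I(K;Z\mid Y)=0 \text{ by } K \to Y \to Z\text{)} \\
&\le H(Y\mid Z).
\end{align*}
```

Thus the feedback bound H(Y|Z) always dominates the equivocation bound I(K; Y) − I(K; Z) of the non-feedback region.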
Entropy 2015, 17, 7900-7925

The following Theorem 2 characterizes the capacity-equivocation region R^(cf) of the model of Figure 1 with causal CSI at the transmitter.

Theorem 2. A single-letter characterization of the region R^(cf) is as follows:

R^(cf) = {(R, R_e) : 0 ≤ R_e ≤ R, R ≤ I(K; Y), R_e ≤ H(Y|Z)}

for some distribution:

P_KVXYZ(k, v, x, y, z) = P_{Z,Y|X,V}(z, y|x, v) P_{X|K,V}(x|k, v) P_K(k) P_V(v),

which implies the Markov chain K → (X, V) → (Y, Z) and the fact that V is independent of K.

Proof.
• Proof of the converse: Using the fact that V_i is independent of Y^{i−1} and Z^{i−1}, the converse proof of Theorem 2 is along the lines of that of Theorem 1 (see Section A), and thus, we omit it here.
• Proof of the achievability: The achievability proof of Theorem 2 is along the lines of the achievability proof of Theorem 1 (see Section B); the only difference is that for the causal case, there is no need to use the binning technique. Thus, we omit it here. The proof of Theorem 2 is completed.
• The range of the auxiliary random variable K satisfies |K| ≤ |X||V|. The proof is standard and easily obtained by using the support lemma (see p. 310 of [16]), and thus, we omit it here.
• Corollary 2. The secrecy capacity C_s^(cf) satisfies:

C_s^(cf) = max min{I(K; Y), H(Y|Z)}.    (16)

Proof. Proof of (16): Substituting R_e = R into the region R^(cf), we have R ≤ I(K; Y) and R ≤ H(Y|Z), and maximizing over the admissible distributions completes the proof.
• Here, note that if Z^N is a degraded version of Y^N, the capacity-equivocation region R^(cf) still holds. The proof of this degraded case is along the lines of the proof of Theorem 2, and thus, we omit it here. In [12], an achievable rate-equivocation region R^(ci) is provided for the wiretap channel with causal CSI, where the joint probability distribution P_KVXYZ(k, v, x, y, z) of R^(ci) satisfies the Markov chain K → (X, V) → (Y, Z) with V independent of K. By using (15), it is easy to see that the achievable rate-equivocation region R^(ci) is enhanced by using this noiseless feedback.

Examples of the Model of Figure 1

Gaussian Case of the Model of Figure 1 with Noncausal CSI at the Transmitter

For the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter, the i-th time (1 ≤ i ≤ N) channel inputs and outputs are given by:

Y_i = X_i + V_i + Z_{1,i},    Z_i = X_i + V_i + Z_{2,i},    (20)

where V_i ∼ N(0, Q), Z_{1,i} ∼ N(0, N_1) and Z_{2,i} ∼ N(0, N_2). Here, note that V_i, Z_{1,i} and Z_{2,i} are independent random variables, X_i is independent of Z_{1,i} and Z_{2,i}, and (1/N) ∑_{i=1}^{N} E(X_i^2) ≤ P. The state V_i is non-causally known by the transmitter. The following Theorem 3 shows the secrecy capacity of the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter.

Theorem 3. For the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter, the secrecy capacity C_s^(gf) is characterized in the following two cases. Case 1: If N_1 ≤ N_2, the secrecy capacity C_s^(gf) is given by: where the maximum is achieved when s is given by:

Remark 3.
If N_1 ≤ N_2, the relationship of the channel inputs and outputs defined in (20) can be equivalently characterized by:

Y_i = X_i + V_i + Z_{1,i},    Z_i = Y_i + Z*_{2,i},    (23)

where Z*_{2,i} ∼ N(0, N_2 − N_1), and it is independent of Z_{1,i}. Similar to the determination of the capacity region of the Gaussian broadcast channel (pp. 117-118 of [17]), the relationship (23) implies that there exists a Markov chain (X_i, V_i) → Y_i → Z_i, i.e., the Gaussian case of the model of Figure 1 reduces to a degraded model of Figure 1.
Analogously, if N_1 > N_2, the relationship of the channel inputs and outputs defined in (20) can be equivalently characterized by:

Z_i = X_i + V_i + Z_{2,i},    Y_i = Z_i + Z*_{1,i},    (24)

where Z*_{1,i} ∼ N(0, N_1 − N_2), and it is independent of Z_{2,i}, X_i and V_i. The relationship (24) implies that there exists a Markov chain (X_i, V_i) → Z_i → Y_i in the Gaussian case of the model of Figure 1.
Proof. For the direct part of Theorem 3, like [18] and [10], the achievability of C_s^(gf) is proven by substituting K = X + αV, X ∼ N(0, P), V ∼ N(0, Q) and the fact that X is independent of V into Theorem 1; the details of the proof are omitted in this paper. Here, note that the calculation of I(K; Y) − I(K; V) is exactly the same as that of the dirty paper channel (page 440 of [18]), and the maximum (1/2) ln(1 + P/N_1) is achieved when α = P/(P + N_1). For the converse part of Theorem 3, note that the transmitter-receiver channel is Costa's dirty paper channel [18]; thus, the secrecy capacity is upper bounded by the capacity of the dirty paper channel, i.e., C_s^(gf) ≤ (1/2) ln(1 + P/N_1). First, note that:

where (a) is from Fano's inequality. The conditional differential entropy h(Y_i|Z_i) in (25) is bounded by:

where (1) is from Definition (23), (2) is from the fact that Z*_{2,i} is independent of X_i, V_i and Z_{1,i}, (3) is from the entropy power inequality e^{2h(X_i + V_i + Z_{1,i} + Z*_{2,i})} ≥ e^{2h(X_i + V_i + Z_{1,i})} + e^{2h(Z*_{2,i})} (see [19]), (4) is from the fact that the differential entropy of a Gaussian random variable X is h(X) = (1/2) ln(2πe D(X)) (here, D(X) is the variance of X), and (5) is from the fact that (1/2) ln(·) is increasing (note that "=" is achieved if X_i ∼ N(0, P)). Substituting (26) into (25), we have (27). Substituting P_e ≤ ε into (27) and letting N → ∞, the bound for Case 1 follows. For Case 2, the conditional entropy h(Y_i|Z_i) in (25) can be bounded by:

where (a) is from (24), (b) is from the fact that Z*_{1,i} is independent of Z_{2,i}, X_i and V_i, and (c) is from the fact that the differential entropy of a Gaussian random variable X is h(X) = (1/2) ln(2πe D(X)). Substituting (28) and P_e ≤ ε into (25) and letting N → ∞, the bound for Case 2 follows. Thus, the converse part of Theorem 3 is proven. The proof of Theorem 3 is completed.
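The dirty paper calculation invoked in the direct part can be checked numerically. With K = X + αV and X ∼ N(0, P) independent of V ∼ N(0, Q), Costa's closed form for I(K; Y) − I(K; V) is R(α) = (1/2) ln[P(P + Q + N_1) / (PQ(1 − α)^2 + N_1(P + α^2 Q))], maximized at α* = P/(P + N_1), where it equals the interference-free capacity (1/2) ln(1 + P/N_1) regardless of Q. A small sketch of this check (our own, following page 440 of [18]):

```python
import math

def dirty_paper_rate(alpha, P, Q, N1):
    """Costa's I(K;Y) - I(K;V) for K = X + alpha*V (in nats)."""
    num = P * (P + Q + N1)
    den = P * Q * (1 - alpha) ** 2 + N1 * (P + alpha ** 2 * Q)
    return 0.5 * math.log(num / den)

P, Q, N1 = 1.0, 2.0, 0.5
alpha_star = P / (P + N1)                     # Costa's optimal scaling
r_star = dirty_paper_rate(alpha_star, P, Q, N1)

# At alpha*, the rate equals the no-interference capacity 0.5*ln(1 + P/N1)
assert abs(r_star - 0.5 * math.log(1 + P / N1)) < 1e-12

# alpha* maximizes R(alpha) over a grid of alternative choices
for a in [i / 100 for i in range(101)]:
    assert dirty_paper_rate(a, P, Q, N1) <= r_star + 1e-12
```

The grid check mirrors the statement that the maximum of I(K; Y) − I(K; V) is achieved at α = P/(P + N_1).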
In [13] (p. 2841, Theorem 3), Chia and El Gamal showed that if Y is less noisy than Z (I(X; Y|V) ≥ I(X; Z|V) for every P_{X|V}(x|v)), the secrecy capacity of the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver is given by: Here, the I(X; Z|V) − H(V|Z) in the above C_{s−both} can be rewritten as follows.
Substituting (29) into C_{s−both}, we have: On the other hand, for Z less noisy than Y (I(X; Z|V) ≥ I(X; Y|V) for every P_{X|V}(x|v)), Chia and El Gamal provided an achievable secrecy rate (a lower bound on the secrecy capacity) for the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver, and it is given by: The following Theorem 4 shows the results on the secrecy capacity of the Gaussian case of the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver.
Theorem 4. For the Gaussian wiretap channel with part of the Gaussian noise non-causally known by both the transmitter and the legitimate receiver, the secrecy capacity C_{s−both}^(g) is characterized by the following two cases. Case 1: If N_1 ≤ N_2, the secrecy capacity C_{s−both}^(g) is given by: Case 2: If N_1 > N_2, a lower bound C_{s−both}^(gi) on the secrecy capacity C_{s−both}^(g) is given by:

Remark 4.
For the Gaussian case, the conditional mutual information I(X; Y|V) is calculated by using the fact that when the CSI is known by both the legitimate receiver and the transmitter, it can simply be subtracted off, which in effect reduces the channel to a Gaussian channel with no CSI, i.e., I(X; Y|V) = (1/2) ln(1 + P/N_1). Analogously, we have I(X; Z|V) = (1/2) ln(1 + P/N_2). Then, it is easy to see that Y is less noisy than Z (I(X; Y|V) ≥ I(X; Z|V) for every P_{X|V}(x|v)) if and only if N_1 ≤ N_2, and Z is less noisy than Y (I(X; Z|V) ≥ I(X; Y|V) for every P_{X|V}(x|v)) if and only if N_1 > N_2.

Proof. The achievability proof of (32) and (33) is easily obtained by substituting X ∼ N(0, P), V ∼ N(0, Q) and (20) into (30) and (31), respectively. Now, it remains to prove the converse of (32); see the following.
The converse part of (32) is based on the converse proof of (30) (see p. 2846, Proof of Theorem 2 of [13], and the left bottom and right top of page 2841 of [13]). However, the converse proof of (30) is for the discrete memoryless case, and it needs to be further processed for the Gaussian case. Based on the converse proof of (30) in [13] and the fact that N_1 ≤ N_2, we have the following (34) and (35), where (1) is from (29), (2) is from Definition (23), (3) is from the fact that the differential entropy of a Gaussian random variable X is h(X) = (1/2) ln(2πe D(X)) (here, D(X) is the variance of X), (4) is from the entropy power inequality e^{2h(X_i + V_i + Z_{1,i} + Z_{2,i})} ≥ e^{2h(X_i + Z_{1,i})} + e^{2h(V_i + Z_{2,i})} (see [19]), (5) is from the fact that (1/2) ln(·) is increasing while h(X_i + Z_{1,i}) is increasing, (7) is from Definition (23), and (8) is from h(X_i + Z_{1,i}) ≤ (1/2) ln(2πe(P + N_1)). Thus, the converse part of (32) is proven. The proof of Theorem 4 is completed.
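The Gaussian conditional-entropy manipulations above can be sanity-checked with elementary variance algebra. Under the degraded representation (23), Z = Y + Z*_2 with Z*_2 independent of Y, so h(Y|Z) = h(Y) + h(Z*_2) − h(Z); the same value follows from the conditional variance of jointly Gaussian variables. A sketch with illustrative numbers (our own check, not a computation from the paper):

```python
import math

def h_gauss(var):
    """Differential entropy (nats) of a Gaussian with variance var."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

P, Q, N1, N2 = 1.0, 0.5, 0.3, 0.8   # N1 <= N2, degraded case, Gaussian X
var_y = P + Q + N1                  # Y = X + V + Z_1
var_z = P + Q + N2                  # Z = Y + Z_2*, Var(Z_2*) = N2 - N1
cov_yz = var_y                      # Z is Y plus independent noise

# h(Y|Z) via the conditional variance of jointly Gaussian (Y, Z)
h_cond = h_gauss(var_y - cov_yz ** 2 / var_z)

# h(Y|Z) via the chain rule: h(Y) + h(Z|Y) - h(Z), with h(Z|Y) = h(Z_2*)
h_chain = h_gauss(var_y) + h_gauss(N2 - N1) - h_gauss(var_z)

assert abs(h_cond - h_chain) < 1e-12
```

Conditioning reduces entropy, so h_cond is also below h_gauss(var_y), matching the direction of the bounds in the converse.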
Recall that for the degraded Gaussian wiretap channel with noncausal CSI at the transmitter ((X, V) → Y → Z), an achievable secrecy rate (a lower bound on the secrecy capacity) is provided in [10]; see the following Theorem 5.
Theorem 5. For the Gaussian non-feedback model of Figure 1 with the condition that N_1 ≤ N_2, an achievable secrecy rate C_s^(gi) is given by:

Proof. The result is directly obtained from [10], and therefore, the proof is omitted here.

Remark 5.
• For the case N_1 ≤ N_2, the relationship (20) of the channel inputs and outputs can be equivalently characterized by (23), which implies the Markov chain (X, V) → Y → Z.
• To the best of the authors' knowledge, for the case N_1 > N_2, the bounds on the secrecy capacity of the Gaussian wiretap channel with noncausal CSI at the transmitter are still unknown.
Finally, note that if the CSI is not available at the legitimate receiver, the wiretapper or the transmitter, and there is no feedback link from the legitimate receiver to the transmitter, the Gaussian case of the model of Figure 1 (see (20)) reduces to the Gaussian wiretap channel, where V_i and Z_{1,i} of (20) are the legitimate receiver's channel noises and V_i and Z_{2,i} are the wiretapper's channel noises. From [20], it is easy to see that the secrecy capacity C*_s of the Gaussian wiretap channel is given by:

C*_s = [(1/2) ln(1 + P/(Q + N_1)) − (1/2) ln(1 + P/(Q + N_2))]^+.

For the case that N_1 > N_2, we find that if N_1/2 < N_2 < N_1, for given P, N_1 and N_2, C_s^(gf) is larger than C_{s−both}^(gi) if and only if: For the case N_1 ≤ N_2, Figure 2 shows that both the noiseless feedback (C_s^(gf)) and the CSI sharing scheme (C_{s−both}^(g)) enhance the secrecy capacity C*_s of the Gaussian wiretap channel. Furthermore, we can see that both the noiseless feedback and the CSI sharing scheme perform better than the CSI only available at the transmitter. Moreover, when Q is small (Q = 0.1, 0.5), the noiseless feedback performs better than the CSI sharing scheme, while as Q increases (Q = 1), the CSI sharing scheme begins to outperform the noiseless feedback.
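The baseline C*_s can be evaluated directly. Treating V_i + Z_{1,i} and V_i + Z_{2,i} as the receiver's and wiretapper's total noises, the Gaussian wiretap secrecy capacity of [20] becomes [(1/2) ln(1 + P/(Q + N_1)) − (1/2) ln(1 + P/(Q + N_2))]^+, which vanishes as soon as N_1 ≥ N_2. A minimal sketch (our own evaluation of this standard formula):

```python
import math

def c_star(P, Q, N1, N2):
    """Gaussian wiretap secrecy capacity (nats), noises V+Z_1 and V+Z_2."""
    gap = 0.5 * math.log(1 + P / (Q + N1)) - 0.5 * math.log(1 + P / (Q + N2))
    return max(gap, 0.0)

assert c_star(1.0, 0.5, 0.3, 0.8) > 0.0   # N1 < N2: positive secrecy rate
assert c_star(1.0, 0.5, 0.8, 0.3) == 0.0  # N1 > N2: C*_s = 0
# A larger state variance Q degrades both links and shrinks the gap
assert c_star(1.0, 2.0, 0.3, 0.8) < c_star(1.0, 0.1, 0.3, 0.8)
```

This is consistent with the observation that C*_s = 0 whenever the wiretapper's noise is the smaller one.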
For the case N_1 > N_2, see the following Figure 3.

Binary Case of the Model of Figure 1
In this subsection, we calculate the secrecy capacity of a degraded binary case of the model of Figure 1 with causal CSI at the transmitter, where "degraded" means that there exists a Markov chain (X, V) → Y → Z.
Suppose that the random variable V is uniformly distributed over {0, 1}, i.e., p_V(0) = p_V(1) = 1/2. Meanwhile, the random variables X, Y and Z take values in {0, 1}, and the wiretap channel is a BSC (binary symmetric channel) with crossover probability q. The transition probability of the main channel is defined as follows: when v = 0, From Remark 2, we know that the secrecy capacity for the model of Figure 1 with causal CSI at the transmitter is given by: and the maximum achievable secrecy rate C_s^(ci) of the wiretap channel with causal CSI [12] is given by: where (43) is from (19).
In addition, from ([13], Theorem 3), we know that the secrecy capacity C_{s−both} of the wiretap channel with CSI causally or non-causally known at both the transmitter and the legitimate receiver is given by: Let K take values in {0, 1}. The probability of K is defined by p_K(0) = α and p_K(1) = 1 − α. Define the conditional probability mass function p_{X|K,V} as follows. The joint probability mass function p_{KY} is calculated by: Then, we have: By calculation, we have: and: where h(·) denotes the binary entropy function. By calculation, C_{s−both} is given by:

The following Figures 4-6 show C_s^(cf), C_s^(ci) and C_{s−both} for several values of q. Here, note that the noise of the wiretap channel increases as q increases. It is easy to see that when q < 0.5, C_{s−both} and C_s^(cf) are always larger than C_s^(ci), i.e., both the noiseless feedback (the model of this paper) and the shared CSI [13] help to enhance the security of the wiretap channel with causal CSI at the transmitter. When q = 0.5, there is in effect no wiretapper in the channel. Moreover, from Figures 4-6, we see that the noiseless feedback performs no better than the shared CSI. However, when q is large enough (satisfying h(q) ≥ 1 − h(p)), the two schemes perform the same.
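The threshold behavior just described can be explored with the binary entropy function. In the sketch below, h is the binary entropy, q is the wiretap BSC crossover probability, and p stands for the main-channel parameter appearing in the condition h(q) ≥ 1 − h(p); the numerical values are purely illustrative, not taken from the figures.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def schemes_coincide(q, p):
    """Condition under which feedback and CSI sharing perform the same."""
    return h(q) >= 1 - h(p)

p = 0.05                              # illustrative main-channel parameter
assert not schemes_coincide(0.1, p)   # small q: shared CSI strictly better
assert schemes_coincide(0.5, p)       # q = 0.5: h(q) = 1, schemes coincide
```

Since h is increasing on [0, 0.5] with h(0.5) = 1, the condition always holds once q is close enough to 0.5, matching the "large enough q" statement.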

Conclusions
In this paper, we study the general wiretap channel with CSI and noiseless feedback, where the CSI is available at the transmitter in a noncausal or causal manner. Both the capacity-equivocation region and the secrecy capacity are determined for the noncausal and causal cases, and the results are further explained via Gaussian and binary examples. For the Gaussian example, we show that both the noiseless feedback and the CSI sharing scheme [13] help to enhance the security of the Gaussian wiretap channel. Moreover, we show that in some particular cases, the noiseless feedback performs even better than the CSI sharing scheme [13]. For the degraded binary example, we also find that the noiseless feedback enhances the security of the wiretap channel with causal CSI. Unlike the Gaussian example, we find that the noiseless feedback always performs no better than the CSI sharing scheme [13].
By using P_e ≤ ε with ε → 0 as N → ∞, lim_{N→∞} H(W|Z^N)/N ≥ R_e and (A.6), it is easy to see that R_e ≤ H(Y|Z).
The converse proof of Theorem 1 is completed.

B. Direct Proof of Theorem 1
The direct part (achievability) of Theorem 1 is proven by considering the following two cases.
• The direct proof of Theorem 1 is organized as follows. The balanced coloring lemma introduced by Ahlswede and Cai is provided in Subsection B.1, and it will be used in the remainder of this section. The code-book generation is shown in Subsection B.2, and the equivocation analysis is given in Subsection B.3.

B.1. The Balanced Coloring Lemma
The balanced coloring lemma was first introduced by Ahlswede and Cai; see the following.

Lemma 1 (Balanced coloring lemma). For all ε_1, ε_2, ε_3, δ > 0, sufficiently large N and all N-types P_Y(y), there exists a γ-coloring c : T^N_Y → {1, 2, ..., γ} such that for all joint N-types P_{YZ}(y, z) with marginal distribution P_Z(z) and for k = 1, 2, ..., γ,

|c^{−1}(k) ∩ T^N_{Y|Z}(z^N)| ≤ (1 + δ) |T^N_{Y|Z}(z^N)| / γ,

where c^{−1} is the inverse image of c.
Proof.Letting U = const, Lemma 1 is directly from p. 259 of [1], and thus, we omit it here.
Lemma 1 shows that if y^N and z^N are jointly typical, then for a given z^N, the number of y^N ∈ T^N_{Y|Z}(z^N) with a certain color k (k = 1, 2, ..., γ) is upper bounded by (1 + δ)|T^N_{Y|Z}(z^N)|/γ.
By using Lemma 1, it is easy to see that the typical set T^N_{Y|Z}(z^N) maps into at least γ/(1 + δ) colors. On the other hand, the typical set T^N_{Y|Z}(z^N) maps into at most γ colors.
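The color-counting argument is pigeonhole counting: if every color class inside a set of size M has at most B elements, the set must meet at least M/B distinct colors. A toy sketch with a random coloring standing in for the balanced coloring c of Lemma 1 (our own illustration):

```python
import random

random.seed(7)
M, gamma = 10_000, 50                 # |T^N_{Y|Z}(z^N)| and number of colors
coloring = [random.randrange(gamma) for _ in range(M)]

class_sizes = {}
for color in coloring:
    class_sizes[color] = class_sizes.get(color, 0) + 1

B = max(class_sizes.values())         # largest color class
colors_used = len(class_sizes)

# Pigeonhole: at least M/B colors are used, and never more than gamma
assert colors_used >= M / B
assert colors_used <= gamma
# For a balanced coloring, B ≈ (1 + delta) * M / gamma, so M/B ≈ gamma/(1+delta)
```

With the lemma's bound B ≤ (1 + δ)M/γ, the first assertion is exactly the "at least γ/(1 + δ) colors" claim above.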

B.2. Code-Book Generation
Fix the joint probability mass function P_{Z,Y|X,V}(z, y|x, v) P_{X|K,V}(x|k, v) P_{KV}(k, v). The message set W satisfies: The block Markov encoding scheme is used in the direct proof of Theorem 1. The random vectors K^N, V^N, X^N, Y^N and Z^N consist of n blocks of length N. Let K̃_i, Ṽ_i, Ỹ_i and Z̃_i (1 ≤ i ≤ n) be the random vectors for block i. Define k̃^n = (k̃_1, k̃_2, ..., k̃_n), ṽ^n = (ṽ_1, ṽ_2, ..., ṽ_n), ỹ^n = (ỹ_1, ỹ_2, ..., ỹ_n) and z̃^n = (z̃_1, z̃_2, ..., z̃_n) to be the specific vectors for all blocks. The message W^n for all n blocks is denoted by W^n = (W_2, ..., W_n), where each W_i is uniformly distributed over the alphabet W, and W_i is independent of W_j (2 ≤ j ≤ n and j ≠ i). Note that w_1 does not exist.

Construction of K N :
Gel'fand and Pinsker's binning and block Markov coding scheme are used in the construction of K N .
In the first block, for a given side information ṽ_1, try to find a k̃_1 such that (k̃_1, ṽ_1) ∈ T^N_{KV}(ε). If multiple sequences exist, randomly choose one for transmission. If there is no such sequence, declare an encoding error.
For the i-th block (2 ≤ i ≤ n), the transmitter receives the output ỹ_{i−1} of the (i − 1)-th block; he or she gives up if ỹ_{i−1} is not a typical sequence. It is easy to see that the probability of giving up at the (i − 1)-th block tends to zero as N → ∞.
The secret key is k*_i = g_f(ỹ_{i−1}), and it is uniformly distributed over the set W_{i1} = {1, 2, ..., 2^{NH(Y|Z)}}. K*_i is independent of W_i. Reveal the mapping g_f to the legitimate receiver, the wiretapper and the transmitter. Then, since the transmitter gets ỹ_{i−1}, he computes k*_i = g_f(ỹ_{i−1}) ∈ {1, 2, ..., 2^{NH(Y|Z)}}. For a given w_i = (w_{i1}, w_{i2}) (2 ≤ i ≤ n), the transmitter selects a sequence k̃_i in the bin (w_{i1} ⊕ k*_i, w_{i2}) (where ⊕ is the modulo addition over W_{i1}), such that (k̃_i, ṽ_i) ∈ T^N_{KV}(ε). If multiple sequences in bin (w_{i1} ⊕ k*_i, w_{i2}) exist, choose the sequence with the smallest index in the bin. If there is no such sequence, declare an encoding error. Here, note that W_{i1} ⊕ K*_i is independent of W_{i1} and K*_i; the proof is given as follows.

Proof. Since: and: it is easy to see that W_{i1} ⊕ K*_i is independent of W_{i1} and K*_i.

• Construction of K^N for Case 2: The construction of K^N for Case 2 is similar to that of Case 1, except that there is no need to divide w_i into two parts. The details are as follows. For the i-th block (2 ≤ i ≤ n), k*_i = g_f(ỹ_{i−1}) ∈ W, and it is uniformly distributed over the set W. K*_i is independent of W_i. Reveal the mapping g_f to the legitimate receiver, the wiretapper and the transmitter. When the transmitter receives the feedback ỹ_{i−1} of the (i − 1)-th block, he or she computes k*_i = g_f(ỹ_{i−1}) ∈ W. For a given transmitted message w_i (2 ≤ i ≤ n), the transmitter selects a codeword k̃_i in the bin w_i ⊕ k*_i (where ⊕ is the modulo addition over W), such that (k̃_i, ṽ_i) ∈ T^N_{KV}(ε). If multiple sequences in bin w_i ⊕ k*_i exist, select the one with the smallest index in the bin. If there is no such sequence, declare an encoding error. Here, note that W_i ⊕ K*_i is independent of W_i and K*_i; the proof is similar to that of Case 1, and thus, we omit it here.
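The key step in both cases is a one-time pad over the message alphabet: the fresh key k*_i extracted from the previous block's feedback is added to the message modulo |W|, and the legitimate receiver, who also knows k*_i, inverts it. A minimal sketch of this modular encryption (the mapping g_f from ỹ_{i−1} to the key is abstracted away as a uniform draw):

```python
import random

random.seed(1)
W_SIZE = 2 ** 10                     # |W|, message alphabet size
w = random.randrange(W_SIZE)         # message for block i
k_star = random.randrange(W_SIZE)    # key g_f(y_{i-1}), modeled as uniform

cipher = (w + k_star) % W_SIZE       # transmitted bin index, w ⊕ k*
decoded = (cipher - k_star) % W_SIZE # legitimate receiver removes the key
assert decoded == w

# With a uniform key, the ciphertext takes every value exactly once as the
# key ranges over W, i.e., w ⊕ K* is independent of w (one-time-pad secrecy).
for w_fixed in (0, 1, W_SIZE - 1):
    counts = [0] * W_SIZE
    for k in range(W_SIZE):
        counts[(w_fixed + k) % W_SIZE] += 1
    assert all(c == 1 for c in counts)
```

The exhaustive count is precisely the independence claim proven for W_{i1} ⊕ K*_i above.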

Construction of X N :
In each block, the channel input x^N is generated by a pre-fixed discrete memoryless channel with transition probability P_{X|K,V}(x|k, v). The inputs of the channel are k^N and v^N, and the output is x^N.
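This per-letter generation of x^N can be sketched directly: each X_i is drawn from a fixed conditional law P_{X|K,V}(·|k_i, v_i). The table below is an arbitrary illustrative choice of P_{X|K,V}, not one taken from the paper:

```python
import random

random.seed(3)
# Illustrative conditional pmf P_{X|K,V}(x | k, v) over binary X
P_X_GIVEN_KV = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.2, 0.8],
    (1, 0): [0.3, 0.7],
    (1, 1): [0.6, 0.4],
}

def channel_input(k_seq, v_seq):
    """Generate x^N memorylessly from k^N and v^N via P_{X|K,V}."""
    return [random.choices([0, 1], weights=P_X_GIVEN_KV[(k, v)])[0]
            for k, v in zip(k_seq, v_seq)]

N = 8
k_seq = [random.randrange(2) for _ in range(N)]
v_seq = [random.randrange(2) for _ in range(N)]
x_seq = channel_input(k_seq, v_seq)
assert len(x_seq) == N and set(x_seq) <= {0, 1}
```

Memorylessness is reflected in the fact that each X_i depends only on the current pair (k_i, v_i).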
Decoding: For block i (2 ≤ i ≤ n), given a vector ỹ_i ∈ Y^N, try to find a sequence k̃_i(ŵ_{i1} ⊕ k*_i, ŵ_{i2}, ĵ) (Case 1) or k̃_i(ŵ_i ⊕ k*_i, ĵ) (Case 2), such that k̃_i and ỹ_i are jointly typical. If there exists a unique such sequence, output the corresponding index of the bin (ŵ_{i1} ⊕ k*_i, ŵ_{i2}) or ŵ_i ⊕ k*_i; otherwise, declare a decoding error. Since the legitimate receiver knows k*_i, he or she recovers the corresponding ŵ_i from (ŵ_{i1} ⊕ k*_i, ŵ_{i2}) or ŵ_i ⊕ k*_i.

B.3. Proof of Achievability
Here, note that the above encoding-decoding scheme for the achievability proof of Theorem 1 is exactly the same as that in [11], except that the transmitter transmits an "encrypted message" by using the secret key k*_i. Since the legitimate receiver has k*_i, the decoding scheme for the achievability proof of Theorem 1 is in fact the same as that in [11]. Hence, we omit the proof of P_e ≤ ε here. It remains to prove that lim_{N→∞} ∆ ≥ R_e; see the following.
• For Case 1, part of the message w_i is encrypted by k*_i. In the analysis of the equivocation, we drop w_{i2} from w_i. Then, the equivocation about w_i is equivalent to the equivocation about k*_i. Since k*_i = g_f(ỹ_{i−1}), the wiretapper tries to guess k*_i from ỹ_{i−1}. Note that for a given z̃_{i−1} and sufficiently large N, Pr{ỹ_{i−1} ∈ T^N_{Y|Z}(z̃_{i−1})} → 1. Thus, the wiretapper can guess ỹ_{i−1} from the conditional typical set T^N_{Y|Z}(z̃_{i−1}). By using the above Lemma 1 and (B.2), the set T^N_{Y|Z}(z̃_{i−1}) maps into at least 2^{NH(Y|Z)}/(1 + δ) (here, γ = 2^{NH(Y|Z)}) values of k*_i (colors). Thus, in the i-th block, the uncertainty about K*_i is bounded by (B.6). Here, note that K*_i is uniformly distributed.
• For Case 2, the alphabet of the secret key k*_i equals the alphabet W_i = {1, 2, ..., 2^{NR}}, and the encrypted message is denoted by w_i ⊕ k*_i. Then, by using the above Lemma 1 and (B.2), the set T^N_{Y|Z}(z̃_{i−1}) maps into at least 2^{NR}/(1 + δ) (here, γ = 2^{NR}) values of k*_i (colors). Thus, in the i-th block, the uncertainty about K*_i is bounded by (B.7).

Proof of lim_{N→∞} ∆ ≥ R_e for Case 1: where (b) is from W_i → (W_{i1} ⊕ K*_i, Z̃_{i−1}) → Z̃_i (proven in the remainder of this section), (c) is from W_{i2} being independent of Z̃_{i−1}, W_{i1} ⊕ K*_i and W_{i1}, (d) follows from the fact that W_{i1} ⊕ K*_i is independent of K*_i, W_{i1} and Z̃_{i−1}, and (e) is from (B.6).
Letting N → ∞ and n → ∞, it is easy to see that lim_{N→∞} ∆ ≥ R_e. The proof of lim_{N→∞} ∆ ≥ R_e for Case 1 is completed.

Proof of lim_{N→∞} ∆ ≥ R_e for Case 2: where (a) is from W_i → (Z̃_i, Z̃_{i−1}) → (W_{i−1}, Z̃_{i−2}, Z̃^n_{i+1}) (proven in the remainder of this section), (b) is from W_i → (W_i ⊕ K*_i, Z̃_{i−1}) → Z̃_i (proven in the remainder of this section), (c) follows from the fact that W_i ⊕ K*_i is independent of K*_i and Z̃_{i−1}, and (d) is from (B.7). Letting N → ∞ and n → ∞, it is easy to see that lim_{N→∞} ∆ ≥ R_e. The proof of lim_{N→∞} ∆ ≥ R_e for Case 2 is completed.

It remains to prove the Markov chains for Case 1. For convenience, we denote the probability Pr{V = v} by Pr{v}. By definition, W_i → (Z̃_i, Z̃_{i−1}) → (W_{i−1}, Z̃_{i−2}, Z̃^n_{i+1}) holds if and only if:

Pr{w_i | z̃_i, z̃_{i−1}, w_{i−1}, z̃_{i−2}, z̃^n_{i+1}} = Pr{w_i | z̃_i, z̃_{i−1}}.    (B.12)

Equation (B.12) can be further expressed as: where (a) is from the fact that (z̃^n_{i+1}, ṽ^n_{i+1}, ỹ^n_{i+1}, k̃^n_{i+1}) are independent of (w_i, z̃_i, ṽ_i, ỹ_i, k̃_i), (b) is from the fact that Z̃_j is independent of Z̃_l for all i + 1 ≤ j, l ≤ n and j ≠ l, (c) is from the fact that given w_i, z̃_i, ṽ_i and ỹ_i, k̃_i is uniquely determined, and (d) follows from the fact that (z̃_1, ṽ_1, ỹ_1), (w_2, z̃_2, ṽ_2, ỹ_2), ..., (w_i, z̃_i, ṽ_i, ỹ_i) are independent.
Proof. Proof of W_i → (W_i ⊕ K*_i, Z̃_{i−1}) → Z̃_i for Case 2: Letting W_{i2} = ∅ and W_{i1} = W_i for all 2 ≤ i ≤ n, the proof of W_i → (W_i ⊕ K*_i, Z̃_{i−1}) → Z̃_i for Case 2 is along the lines of that for Case 1, and therefore, we omit it here. Thus, the direct proof of Theorem 1 is completed.

Figure 1 .
Figure 1. General wiretap channel with noncausal or causal channel state information (CSI) and noiseless feedback.


Figure 2 .
Figure 2. For N_1 ≤ N_2, the relationships of P − C_s^(gf), P − C_{s−both}^(g), P − C_s^(gi) and P − C*_s for several values of N_1, N_2 and Q.
Figure 3 plots the relationships of P − C_s^(gf) and P − C_{s−both}^(gi) for several values of N_1, N_2 and Q. Since C*_s = 0 for the case that N_1 > N_2, both the noiseless feedback (C_s^(gf)) and the CSI sharing scheme (C_{s−both}^(gi)) enhance the secrecy capacity C*_s of the Gaussian wiretap channel. Moreover, we can see that for fixed Q, if the gap between the legitimate receiver's channel noise variance N_1 and the wiretapper's channel noise variance N_2 is large, the noiseless feedback performs better than the CSI sharing scheme, and vice versa.

Figure 3 .
Figure 3. For N_1 > N_2, the relationships of P − C_s^(gf) and P − C_{s−both}^(gi) for several values of N_1, N_2 and Q.

Figure 4 .
Figure 4. The C_s^(cf), C_s^(ci) and C_{s−both} for q = 0.1.

Figure 5 .
Figure 5. The C_s^(cf), C_s^(ci) and C_{s−both} for q = 0.2.

Figure 6 .
Figure 6. The C_s^(cf), C_s^(ci) and C_{s−both} for q = 0.5.