Feeding Back the Output or Sharing the State: Which Is Better for the State-Dependent Wiretap Channel?

1 School of Information Science and Technology, Southwest JiaoTong University, Northbound Section Second Ring Road 111, Chengdu 610031, China
2 The National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China
3 School of Economics and Management, Chengdu Textile College, Chengdu 611731, China
* Author to whom correspondence should be addressed.
Entropy 2015, 17(12), 7900-7925; https://doi.org/10.3390/e17127852
Submission received: 29 July 2015 / Revised: 2 October 2015 / Accepted: 23 November 2015 / Published: 30 November 2015
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

In this paper, the general wiretap channel with channel state information (CSI) at the transmitter and noiseless feedback is investigated, where the feedback is from the legitimate receiver to the transmitter, and the CSI is available at the transmitter in a causal or noncausal manner. The capacity-equivocation regions are determined for this model in both the causal and noncausal cases, and the results are further explained via Gaussian and binary examples. For the Gaussian model, we find that in some particular cases, the noiseless feedback performs better than Chia and El Gamal's CSI sharing scheme, i.e., the secrecy capacity of this feedback scheme is larger than that of the CSI sharing scheme. For the degraded binary model, we find that the noiseless feedback performs no better than Chia and El Gamal's CSI sharing scheme. However, if the cross-over probability of the wiretap channel is large enough, we show that the two schemes perform the same.

1. Introduction

It is well known that the capacity of a point-to-point discrete memoryless channel (DMC) cannot be increased by using noiseless feedback. However, does feedback (from the legitimate receiver to the transmitter) enhance the security of the wiretap channel? Ahlswede and Cai [1] and Dai et al. [2] studied this problem. Specifically, Ahlswede and Cai [1] showed that the secrecy capacity $C_{sf}$ of the degraded wiretap channel with noiseless feedback is given by:
$$C_{sf} = \max_{p(x)} \min\{ I(X;Y),\ I(X;Y) - I(X;Z) + H(Y|X,Z) \}, \quad (1)$$
where $X$, $Y$ and $Z$ are for the transmitter, the legitimate receiver and the wiretapper, respectively, and $X \rightarrow Y \rightarrow Z$ forms a Markov chain. Recall that the secrecy capacity $C_s$ of the degraded wiretap channel was determined by Wyner [3], and it is given by:
$$C_s = \max_{p(x)} \min\{ I(X;Y),\ I(X;Y) - I(X;Z) \}. \quad (2)$$
From (1) and (2), it is easy to see that the noiseless feedback increases the secrecy capacity of the wiretap channel. Based on the work of [1], Dai et al. [2] studied a special wiretap channel with feedback ($Y \rightarrow X \rightarrow Z$) and showed that the secrecy capacity of this model is larger than that of the model without feedback, i.e., the noiseless feedback helps to enhance the security of the special wiretap channel $Y \rightarrow X \rightarrow Z$. Note that in [1] and [2], the legitimate receiver simply sends the previously received symbols back to the transmitter, so it is natural to ask: is it better for the legitimate receiver to send back purely random secret keys to the transmitter? Ardestanizadeh et al. [4] answered this question by considering the wiretap channel with secure rate-limited feedback. They showed that if the capacity of the feedback channel is denoted by $R_f$, the secrecy capacity of the physically degraded wiretap channel ($X \rightarrow Y \rightarrow Z$) with secure rate-limited feedback is given by:
$$C_{sf} = \max_{p(x)} \min\{ I(X;Y),\ I(X;Y) - I(X;Z) + R_f \}. \quad (3)$$
Compared to (1), it is easy to see that if $R_f \le H(Y|X,Z)$, sending purely random secret keys is no better than sending $Y^{i-1}$ back. If $R_f > H(Y|X,Z)$, then $I(X;Y) - I(X;Z) + R_f > H(Y|Z)$ (note that $I(X;Y) - I(X;Z) + H(Y|X,Z) = H(Y|Z)$ under the Markov chain $X \rightarrow Y \rightarrow Z$), and sending purely random secret keys is better than sending $Y^{i-1}$ back. Besides these works on the wiretap channel with feedback, Lai et al. [5] studied the wiretap channel with noisy feedback; He et al. [6] studied the Gaussian two-way wiretap channel and the Gaussian half-duplex two-way relay channel with an un-trusted relay; and Bassi et al. [7] studied the wiretap channel with generalized feedback. Bounds on the secrecy capacities of these feedback models are obtained in [5,6,7].
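To make the comparison above concrete, the following self-contained Python sketch (our own illustration, not code from [1] or [4]; the ternary alphabets and the random chain are arbitrary choices) numerically verifies the identity $I(X;Y) - I(X;Z) + H(Y|X,Z) = H(Y|Z)$ on a randomly generated Markov chain $X \rightarrow Y \rightarrow Z$.

```python
# Sanity check (our own sketch) of I(X;Y) - I(X;Z) + H(Y|X,Z) = H(Y|Z)
# for a Markov chain X -> Y -> Z; this identity is why sending random keys
# beats feeding back Y^{i-1} exactly when R_f > H(Y|X,Z).
import itertools
import numpy as np

rng = np.random.default_rng(0)
nx, ny, nz = 3, 3, 3
p_x = rng.dirichlet(np.ones(nx))             # P(x)
p_y_x = rng.dirichlet(np.ones(ny), size=nx)  # P(y|x)
p_z_y = rng.dirichlet(np.ones(nz), size=ny)  # P(z|y), so X -> Y -> Z

# Joint distribution P(x, y, z) = P(x) P(y|x) P(z|y).
p = np.zeros((nx, ny, nz))
for x, y, z in itertools.product(range(nx), range(ny), range(nz)):
    p[x, y, z] = p_x[x] * p_y_x[x, y] * p_z_y[y, z]

def H(q):
    """Shannon entropy (bits) of a flattened pmf."""
    q = q[q > 0]
    return -np.sum(q * np.log2(q))

H_xyz = H(p.ravel())
H_xy, H_xz, H_yz = H(p.sum(2).ravel()), H(p.sum(1).ravel()), H(p.sum(0).ravel())
H_x, H_y, H_z = H(p.sum((1, 2))), H(p.sum((0, 2))), H(p.sum((0, 1)))

lhs = (H_x + H_y - H_xy) - (H_x + H_z - H_xz) + (H_xyz - H_xz)
rhs = H_yz - H_z
print(lhs, rhs)  # the two values agree up to floating-point error
```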
Recently, the wiretap channel with channel state information (CSI) has received much attention. The Gaussian wiretap channel with noncausal CSI at the transmitter was studied in [8,9], and an achievable rate-equivocation region was provided for this Gaussian model. Based on the work of [8], Chen et al. [10] studied the discrete memoryless wiretap channel with noncausal CSI at the transmitter and also provided an achievable rate-equivocation region for this model. The encoding-decoding scheme of [10] is a combination of the binning technique of Gel'fand and Pinsker's channel [11] and the random binning technique of Wyner's wiretap channel [3]. After that, Dai et al. [12] studied the outer bound on the capacity-equivocation region of [10] and also investigated the capacity results of the discrete memoryless wiretap channel with causal or memoryless CSI at the transmitter. Besides these works on the wiretap channel with CSI only available at the transmitter, Chia and El Gamal [13] investigated the wiretap channel with CSI available causally or non-causally at both the transmitter and the legitimate receiver and provided an achievable secrecy rate, which is larger than that of [10]. In [13], since both the transmitter and the legitimate receiver have access to the CSI, the CSI serves as a secret key shared by them. Therefore, the encoding-decoding scheme of [13] is similar to that of the wiretap channel with rate-limited feedback [4]. Besides these works on the wiretap channel with CSI, Liu et al. [14] studied the block Rayleigh fading MIMO wiretap channel with no CSI available at the legitimate receiver, the wiretapper or the transmitter, and they showed that if the legitimate receiver has more antennas than the wiretapper, non-zero secure degrees of freedom (s.d.o.f.) can still be achieved.
In this paper, we study the general wiretap channel with CSI (causally or non-causally known at the transmitter) and noiseless feedback; see Figure 1. In Figure 1, the transition probability of the channel depends on a CSI sequence $V^N$, which is available at the channel encoder in a noncausal or causal manner. The inputs of the channel are $X^N$ and $V^N$, while the outputs of the channel are $Y^N$ and $Z^N$. Moreover, there exists a noiseless feedback link from $Y^N$ to the channel encoder. The motivation of this work is to determine whether the noiseless feedback helps to enhance the secrecy rate of the wiretap channel with noncausal or causal CSI at the transmitter [10,12], and whether the noiseless feedback does better than sharing the CSI between the legitimate receiver and the transmitter [13] in enhancing the secrecy rate of the state-dependent wiretap channel.
Figure 1. General wiretap channel with noncausal or causal channel state information (CSI) and noiseless feedback.
The capacity-equivocation region of the model of Figure 1 is determined for both the noncausal and causal cases, and the results are further explained via degraded binary and Gaussian examples. For the Gaussian example, we find that both the feedback scheme and the CSI sharing scheme [13] help to enhance the security of the wiretap channel with noncausal CSI at the transmitter [10,12], and moreover, we find that in some particular cases, the noiseless feedback performs even better than the shared CSI [13], i.e., the secrecy capacity of the degraded Gaussian case of the model of Figure 1 is larger than that of the degraded Gaussian case of [13]. For the binary example, we also find that both the feedback scheme and the CSI sharing scheme [13] help to enhance the security of the wiretap channel with causal CSI at the transmitter. Unlike the Gaussian case, we find that the noiseless feedback performs no better than the shared CSI [13], i.e., the secrecy capacity of the degraded binary case of the model of Figure 1 is not more than that of the degraded binary case of [13]. However, if the cross-over probability of the wiretap channel is large enough, we find that the two schemes perform the same.
The remainder of this paper is organized as follows. The capacity-equivocation region of the model of Figure 1 is provided in Section 2. Gaussian and binary examples of the model of Figure 1 are shown in Section 3. Section 4 concludes the paper.

2. Capacity-Equivocation Region of the Model of Figure 1

In this paper, random variables, sample values and alphabets are denoted by capital letters, lower case letters and calligraphic letters, respectively. A similar convention is applied to random vectors and their sample values. For example, $U^N$ denotes a random $N$-vector $(U_1, \ldots, U_N)$, and $u^N = (u_1, \ldots, u_N)$ is a specific vector value in $\mathcal{U}^N$, the $N$-th Cartesian power of $\mathcal{U}$. $U_i^N$ denotes a random $(N-i+1)$-vector $(U_i, \ldots, U_N)$, and $u_i^N = (u_i, \ldots, u_N)$ is a specific vector value in $\mathcal{U}_i^N$. Let $P_V(v)$ denote the probability mass function $Pr\{V = v\}$. Throughout the paper, the logarithmic function is to base two.

2.1. Definitions of the Model of Figure 1

Let $W$, uniformly distributed over the alphabet $\mathcal{W}$, be the message sent by the transmitter. The components of the channel state sequence $V^N$ are independent and identically distributed, each with probability mass function $P_V(v)$, and $V^N$ is independent of $W$. Let $Y^{i-1}$ ($2 \le i \le N$) be the feedback from the legitimate receiver available at the transmitter at time $i$. For the noncausal case, the $i$-th time channel encoder $f_i$ is a (stochastic) mapping:

$$f_i: \mathcal{W} \times \mathcal{Y}^{i-1} \times \mathcal{V}^N \rightarrow \mathcal{X}, \quad (4)$$

where $f_i(w, y^{i-1}, v^N) = x_i \in \mathcal{X}$, $w \in \mathcal{W}$, $y^{i-1} \in \mathcal{Y}^{i-1}$ and $v^N \in \mathcal{V}^N$. For the causal case, the $i$-th time channel encoder $f_i$ is a (stochastic) mapping:

$$f_i: \mathcal{W} \times \mathcal{Y}^{i-1} \times \mathcal{V}^i \rightarrow \mathcal{X}, \quad (5)$$

where $f_i(w, y^{i-1}, v^i) = x_i \in \mathcal{X}$, $w \in \mathcal{W}$, $y^{i-1} \in \mathcal{Y}^{i-1}$ and $v^i \in \mathcal{V}^i$. Here, note that for the causal case, $V_i$ is independent of $(Y^{i-1}, W, V_{i+1}^N, Z^{i-1})$.
The channel is discrete memoryless, and its transition probability is given by:

$$P_{Z^N, Y^N | X^N, V^N}(z^N, y^N | x^N, v^N) = \prod_{i=1}^{N} P_{Z,Y|X,V}(z_i, y_i | x_i, v_i), \quad (6)$$

where $x_i \in \mathcal{X}$, $v_i \in \mathcal{V}$, $y_i \in \mathcal{Y}$ and $z_i \in \mathcal{Z}$.
The wiretapper's equivocation about the message $W$ is denoted by:

$$\Delta = \frac{1}{N} H(W | Z^N). \quad (7)$$
The decoder $f_D$ is a function that maps a received sequence of $N$ channel outputs to the message set:

$$f_D: \mathcal{Y}^N \rightarrow \mathcal{W}. \quad (8)$$

We denote the probability of error $P_e$ by $Pr\{W \neq \hat{W}\}$.
A pair $(R, R_e)$ (where $R, R_e > 0$) is said to be achievable if, for arbitrarily small positive $\epsilon$, there exists an encoding-decoding scheme such that:

$$\lim_{N \rightarrow \infty} \frac{\log \|\mathcal{W}\|}{N} = R, \quad \lim_{N \rightarrow \infty} \Delta \ge R_e, \quad P_e \le \epsilon. \quad (9)$$

The set $\mathcal{R}^{(nf)}$, which is composed of all achievable $(R, R_e)$ pairs, is called the capacity-equivocation region of the model of Figure 1 with noncausal CSI at the transmitter. The achievable rate $C_s^{(nf)}$, defined by:

$$C_s^{(nf)} = \max_{(R, R_e = R) \in \mathcal{R}^{(nf)}} R, \quad (10)$$

is called the secrecy capacity of the model of Figure 1 with noncausal CSI at the transmitter.
Analogously, let $\mathcal{R}^{(cf)}$ be the capacity-equivocation region of the model of Figure 1 with causal CSI at the transmitter, and let $C_s^{(cf)}$, defined by:

$$C_s^{(cf)} = \max_{(R, R_e = R) \in \mathcal{R}^{(cf)}} R, \quad (11)$$

be the secrecy capacity of the model of Figure 1 with causal CSI at the transmitter.

2.2. Main Result of the Model of Figure 1

The following Theorem 1 characterizes the capacity-equivocation region $\mathcal{R}^{(nf)}$ of the model of Figure 1 with noncausal CSI at the transmitter.
Theorem 1. A single-letter characterization of the region $\mathcal{R}^{(nf)}$ is as follows:

$$\mathcal{R}^{(nf)} = \{(R, R_e):\ 0 \le R_e \le R,\ 0 \le R \le I(K;Y) - I(K;V),\ R_e \le H(Y|Z)\},$$

for some distribution:

$$P_{KVXYZ}(k, v, x, y, z) = P_{ZY|XV}(z, y | x, v)\, P_{X|KV}(x | k, v)\, P_{KV}(k, v),$$

which implies the Markov chain $K \rightarrow (X, V) \rightarrow (Y, Z)$.
Proof. See Section A and Section B.  ☐
Remark 1.
  • The range of the random variable $K$ satisfies $\|\mathcal{K}\| \le \|\mathcal{X}\|\|\mathcal{V}\| + 1$. The proof is standard and easily obtained by using the support lemma (see [15]), and thus, we omit it here.
  • Corollary 1. The secrecy capacity $C_s^{(nf)}$ satisfies:

    $$C_s^{(nf)} = \max_{P_{X|KV} P_{KV}} \min\{ I(K;Y) - I(K;V),\ H(Y|Z) \}. \quad (12)$$

    Proof. Substituting $R_e = R$ into the region $\mathcal{R}^{(nf)}$ in Theorem 1, we have:

    $$R \le I(K;Y) - I(K;V), \quad (13)$$

    $$R \le H(Y|Z). \quad (14)$$

    By using (10), (13) and (14), Formula (12) is achieved; thus, the proof is completed. ☐
  • Here, note that if $Z^N$ is a degraded version of $Y^N$ (which implies the existence of the Markov chain $K \rightarrow (X, V) \rightarrow Y \rightarrow Z$), the capacity-equivocation region $\mathcal{R}^{(nf)}$ still holds. The proof of this degraded case is along the lines of the proof of Theorem 1, and thus, we omit it here. In [10,12], an achievable rate-equivocation region $\mathcal{R}^{in}$ is provided for the wiretap channel with noncausal CSI, and it is given by:

    $$\mathcal{R}^{in} = \{(R, R_e):\ R_e \le R,\ R \le I(K;Y) - I(K;V),\ R_e \le I(K;Y) - I(K;Z)\},$$

    where the joint probability distribution $P_{KVXYZ}(k, v, x, y, z)$ of $\mathcal{R}^{in}$ satisfies:

    $$P_{KVXYZ}(k, v, x, y, z) = P_{Z|Y}(z|y)\, P_{Y|XV}(y|x,v)\, P_{X|KV}(x|k,v)\, P_{KV}(k,v).$$

    Here, note that:

    $$I(K;Y) - I(K;Z) = H(K|Z) - H(K|Y) \stackrel{(a)}{=} H(K|Z) - H(K|Y,Z) = I(K;Y|Z) \le H(Y|Z), \quad (15)$$

    where (a) follows from the Markov chain $K \rightarrow Y \rightarrow Z$. Therefore, it is easy to see that the achievable rate-equivocation region $\mathcal{R}^{in}$ of [10] and [12] is enhanced by using this noiseless feedback.
The following Theorem 2 characterizes the capacity-equivocation region $\mathcal{R}^{(cf)}$ of the model of Figure 1 with causal CSI at the transmitter.
Theorem 2. A single-letter characterization of the region $\mathcal{R}^{(cf)}$ is as follows:

$$\mathcal{R}^{(cf)} = \{(R, R_e):\ 0 \le R_e \le R,\ 0 \le R \le I(K;Y),\ R_e \le H(Y|Z)\},$$

for some distribution:

$$P_{KVXYZ}(k, v, x, y, z) = P_{YZ|XV}(y, z | x, v)\, P_{X|KV}(x | k, v)\, P_K(k)\, P_V(v),$$

which implies the Markov chain $K \rightarrow (X, V) \rightarrow (Y, Z)$ and the fact that $V$ is independent of $K$.
Proof.
  • Proof of the converse: Using the fact that $V_i$ is independent of $Y^{i-1}$ and $Z^{i-1}$, the converse proof of Theorem 2 is along the lines of that of Theorem 1 (see Section A), and thus, we omit it here.
  • Proof of the achievability: The achievability proof of Theorem 2 is along the lines of the achievability proof of Theorem 1 (see Section B); the only difference is that for the causal case, there is no need to use the binning technique. Thus, we omit it here.
The proof of Theorem 2 is completed.  ☐
Remark 2.
  • The range of the auxiliary random variable $K$ satisfies $\|\mathcal{K}\| \le \|\mathcal{X}\|\|\mathcal{V}\|$. The proof is standard and easily obtained by using the support lemma (see p. 310 of [16]), and thus, we omit it here.
  • Corollary 2. The secrecy capacity $C_s^{(cf)}$ satisfies:

    $$C_s^{(cf)} = \max_{P_{X|KV} P_K} \min\{ I(K;Y),\ H(Y|Z) \}. \quad (16)$$

    Proof. Substituting $R_e = R$ into the region $\mathcal{R}^{(cf)}$, we have:

    $$R \le I(K;Y), \quad (17)$$

    $$R \le H(Y|Z). \quad (18)$$

    By using (11), (17) and (18), Formula (16) is achieved; thus, the proof is completed. ☐
  • Here, note that if $Z^N$ is a degraded version of $Y^N$, the capacity-equivocation region $\mathcal{R}^{(cf)}$ still holds. The proof of this degraded case is along the lines of the proof of Theorem 2, and thus, we omit it here. In [12], an achievable rate-equivocation region $\mathcal{R}^{ic}$ is provided for the wiretap channel with causal CSI, and it is given by:

    $$\mathcal{R}^{ic} = \{(R, R_e):\ R_e \le R,\ R \le I(K;Y),\ R_e \le I(K;Y) - I(K;Z)\}, \quad (19)$$

    where the joint probability distribution $P_{KVXYZ}(k, v, x, y, z)$ of $\mathcal{R}^{ic}$ satisfies:

    $$P_{KVXYZ}(k, v, x, y, z) = P_{Z|Y}(z|y)\, P_{Y|XV}(y|x,v)\, P_{X|KV}(x|k,v)\, P_K(k)\, P_V(v).$$

    By using (15), it is easy to see that the achievable rate-equivocation region $\mathcal{R}^{ic}$ is enhanced by using this noiseless feedback.

3. Examples of the Model of Figure 1

3.1. Gaussian Case of the Model of Figure 1 with Noncausal CSI at the Transmitter

For the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter, the $i$-th time ($1 \le i \le N$) channel inputs and outputs are given by:

$$Y_i = X_i + V_i + Z_{1,i}, \quad Z_i = X_i + V_i + Z_{2,i}, \quad (20)$$

where $V_i \sim \mathcal{N}(0, Q)$, $Z_{1,i} \sim \mathcal{N}(0, N_1)$ and $Z_{2,i} \sim \mathcal{N}(0, N_2)$. Here, note that $V_i$, $Z_{1,i}$ and $Z_{2,i}$ are independent random variables, $X_i$ is independent of $Z_{1,i}$ and $Z_{2,i}$, and $\frac{1}{N}\sum_{i=1}^{N} E(X_i^2) \le P$. The noise $V_i$ is non-causally known by the transmitter. The following Theorem 3 gives the secrecy capacity of the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter.
Theorem 3. For the Gaussian case of the model of Figure 1 with noncausal CSI at the transmitter, the secrecy capacity $C_s^{gf}$ is characterized in the following two cases.
Case 1: If $N_1 \le N_2$, the secrecy capacity $C_s^{gf}$ is given by:

$$C_s^{gf} = \max_{\alpha} \min\left\{ \frac{1}{2}\ln\frac{(P+Q+N_1)(P+\alpha^2 Q)}{PQ(1-\alpha)^2 + N_1(P+\alpha^2 Q)} - \frac{1}{2}\ln\frac{P+\alpha^2 Q}{P},\ \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)(N_2-N_1)}{P+Q+N_2} \right\} = \min\left\{ \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right),\ \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)(N_2-N_1)}{P+Q+N_2} \right\}, \quad (21)$$

where the maximum is achieved when $\alpha = \frac{P}{P+N_1}$.
Case 2: If $N_1 > N_2$, the secrecy capacity $C_s^{gf}$ is given by:

$$C_s^{gf} = \min\left\{ \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right),\ \frac{1}{2}\ln 2\pi e(N_1-N_2) \right\}. \quad (22)$$
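As a quick numerical companion to Theorem 3, the following minimal sketch (our own code, not the paper's; illustrative parameter values, natural logarithms matching the $\frac{1}{2}\ln(\cdot)$ form above) evaluates the two closed forms.

```python
# Evaluate the closed-form secrecy capacity C_s^gf of Theorem 3 for given
# P, Q, N1, N2 (a sketch; rates in nats). The first term is the dirty-paper
# rate, achieved at alpha = P / (P + N1).
import math

def c_s_gf(P, Q, N1, N2):
    rate = 0.5 * math.log(1.0 + P / N1)
    if N1 < N2:      # Case 1 (for N1 == N2 the second term diverges to -inf)
        equiv = 0.5 * math.log(2 * math.pi * math.e * (P + Q + N1)
                               * (N2 - N1) / (P + Q + N2))
    else:            # Case 2: N1 > N2
        equiv = 0.5 * math.log(2 * math.pi * math.e * (N1 - N2))
    return min(rate, equiv)

print(c_s_gf(P=5.0, Q=0.5, N1=1.0, N2=2.0))  # Case 1 example
print(c_s_gf(P=5.0, Q=0.5, N1=2.0, N2=1.0))  # Case 2 example
```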
Remark 3.
If $N_1 \le N_2$, the relationship of the channel inputs and outputs defined in (20) can be equivalently characterized by:

$$Y_i = X_i + V_i + Z_{1,i}, \quad Z_i = X_i + V_i + Z_{1,i} + Z_{2,i}^*, \quad (23)$$

where $Z_{2,i}^* \sim \mathcal{N}(0, N_2 - N_1)$, and it is independent of $Z_{1,i}$. Similar to the determination of the capacity region of the Gaussian broadcast channel (pp. 117-118 of [17]), the relationship (23) implies that there exists a Markov chain $(X_i, V_i) \rightarrow Y_i \rightarrow Z_i$, i.e., the Gaussian case of the model of Figure 1 reduces to a degraded model of Figure 1.
Analogously, if $N_1 > N_2$, the relationship of the channel inputs and outputs defined in (20) can be equivalently characterized by:

$$Y_i = X_i + V_i + Z_{1,i}^* + Z_{2,i}, \quad Z_i = X_i + V_i + Z_{2,i}, \quad (24)$$

where $Z_{1,i}^* \sim \mathcal{N}(0, N_1 - N_2)$, and it is independent of $Z_{2,i}$, $X_i$ and $V_i$. The relationship (24) implies that there exists a Markov chain $(X_i, V_i) \rightarrow Z_i \rightarrow Y_i$ in the Gaussian case of the model of Figure 1.
Proof. For the direct part of Theorem 3, as in [18] and [10], the achievability of $C_s^{gf}$ is proven by substituting $K = X + \alpha V$, $X \sim \mathcal{N}(0, P)$, $V \sim \mathcal{N}(0, Q)$ and the fact that $X$ is independent of $V$ into Theorem 1; the details of the proof are omitted here. Note that the calculation of $I(K;Y) - I(K;V)$ is exactly the same as that for the dirty paper channel (page 440 of [18]), and it is easy to see that the maximum of $I(K;Y) - I(K;V)$ is achieved when $\alpha = \frac{P}{P+N_1}$.
For the converse part of Theorem 3, note that the transmitter-receiver channel is Costa's dirty paper channel [18]; thus, the secrecy capacity is upper bounded by the capacity of the dirty paper channel, i.e., $C_s^{gf} \le \frac{1}{2}\ln(1+\frac{P}{N_1})$. It remains to show that $C_s^{gf} \le \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)(N_2-N_1)}{P+Q+N_2}$ for $N_1 \le N_2$ and $C_s^{gf} \le \frac{1}{2}\ln 2\pi e(N_1-N_2)$ for $N_1 > N_2$; see the following.
Proof of $C_s^{gf} \le \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)(N_2-N_1)}{P+Q+N_2}$ for $N_1 \le N_2$:
First, note that:

$$\frac{1}{N} H(W|Z^N) \stackrel{(a)}{\le} \frac{1}{N}\left( I(W; Y^N | Z^N) + \delta(P_e) \right) \le \frac{1}{N}\sum_{i=1}^{N} h(Y_i|Z_i) + \frac{\delta(P_e)}{N}, \quad (25)$$

where (a) follows from Fano's inequality. The conditional differential entropy $h(Y_i|Z_i)$ in (25) is bounded by:

$$\begin{aligned}
h(Y_i|Z_i) &= h(Y_i, Z_i) - h(Z_i) = h(Z_i|Y_i) + h(Y_i) - h(Z_i)\\
&\stackrel{(1)}{=} h(X_i+V_i+Z_{1,i}+Z_{2,i}^* \,|\, X_i+V_i+Z_{1,i}) + h(X_i+V_i+Z_{1,i}) - h(X_i+V_i+Z_{1,i}+Z_{2,i}^*)\\
&\stackrel{(2)}{=} h(Z_{2,i}^*) + h(X_i+V_i+Z_{1,i}) - h(X_i+V_i+Z_{1,i}+Z_{2,i}^*)\\
&\stackrel{(3)}{\le} h(Z_{2,i}^*) + h(X_i+V_i+Z_{1,i}) - \frac{1}{2}\ln\left(e^{2h(X_i+V_i+Z_{1,i})} + e^{2h(Z_{2,i}^*)}\right)\\
&= h(Z_{2,i}^*) + \frac{1}{2}\ln\left(e^{2h(X_i+V_i+Z_{1,i})}\right) - \frac{1}{2}\ln\left(e^{2h(X_i+V_i+Z_{1,i})} + e^{2h(Z_{2,i}^*)}\right)\\
&\stackrel{(4)}{=} \frac{1}{2}\ln\left(2\pi e(N_2-N_1)\right) + \frac{1}{2}\ln\frac{e^{2h(X_i+V_i+Z_{1,i})}}{e^{2h(X_i+V_i+Z_{1,i})} + 2\pi e(N_2-N_1)}\\
&\stackrel{(5)}{\le} \frac{1}{2}\ln\left(2\pi e(N_2-N_1)\right) + \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)}{2\pi e(P+Q+N_1) + 2\pi e(N_2-N_1)}\\
&= \frac{1}{2}\ln\frac{2\pi e(N_2-N_1)(P+Q+N_1)}{P+Q+N_2},
\end{aligned} \quad (26)$$

where (1) follows from (23), (2) follows from the fact that $Z_{2,i}^*$ is independent of $X_i$, $V_i$ and $Z_{1,i}$, (3) follows from the entropy power inequality $e^{2h(X_i+V_i+Z_{1,i}+Z_{2,i}^*)} \ge e^{2h(X_i+V_i+Z_{1,i})} + e^{2h(Z_{2,i}^*)}$ (see [19]), (4) follows from the fact that the differential entropy of a Gaussian random variable $X$ is $h(X) = \frac{1}{2}\ln(2\pi e D(X))$ (here, $D(X)$ is the variance of $X$), and (5) follows from the facts that $\frac{1}{2}\ln\frac{e^{2h(X_i+V_i+Z_{1,i})}}{e^{2h(X_i+V_i+Z_{1,i})} + 2\pi e(N_2-N_1)}$ is increasing in $h(X_i+V_i+Z_{1,i})$ and that $h(X_i+V_i+Z_{1,i}) \le \frac{1}{2}\ln(2\pi e(P+Q+N_1))$ (here, note that equality is achieved if $X_i \sim \mathcal{N}(0, P)$).
Substituting (26) into (25), we have:

$$\frac{1}{N} H(W|Z^N) \le \frac{1}{N}\sum_{i=1}^{N} \frac{1}{2}\ln\frac{2\pi e(N_2-N_1)(P+Q+N_1)}{P+Q+N_2} + \frac{\delta(P_e)}{N} = \frac{1}{2}\ln\frac{2\pi e(N_2-N_1)(P+Q+N_1)}{P+Q+N_2} + \frac{\delta(P_e)}{N}. \quad (27)$$

Substituting $P_e \le \epsilon$ into (27) and letting $N \rightarrow \infty$, it is easy to see that $C_s^{gf} \le \frac{1}{2}\ln\frac{2\pi e(P+Q+N_1)(N_2-N_1)}{P+Q+N_2}$ for $N_1 \le N_2$.
Proof of $C_s^{gf} \le \frac{1}{2}\ln 2\pi e(N_1-N_2)$ for $N_1 > N_2$:
For the case $N_1 > N_2$, the conditional differential entropy $h(Y_i|Z_i)$ in (25) can be bounded by:

$$h(Y_i|Z_i) \stackrel{(a)}{=} h(X_i+V_i+Z_{1,i}^*+Z_{2,i} \,|\, X_i+V_i+Z_{2,i}) \stackrel{(b)}{=} h(Z_{1,i}^*) \stackrel{(c)}{=} \frac{1}{2}\ln 2\pi e(N_1-N_2), \quad (28)$$

where (a) follows from (24), (b) follows from the fact that $Z_{1,i}^*$ is independent of $Z_{2,i}$, $X_i$ and $V_i$, and (c) follows from the fact that the differential entropy of a Gaussian random variable $X$ is $h(X) = \frac{1}{2}\ln(2\pi e D(X))$ (here, $D(X)$ is the variance of $X$). Substituting (28) and $P_e \le \epsilon$ into (25) and letting $N \rightarrow \infty$, it is easy to see that $C_s^{gf} \le \frac{1}{2}\ln 2\pi e(N_1-N_2)$ for $N_1 > N_2$. Thus, the converse part of Theorem 3 is proven. The proof of Theorem 3 is completed.  ☐
In [13] (p. 2841, Theorem 3), Chia and El Gamal showed that if $Y$ is less noisy than $Z$ (i.e., $I(X;Y|V) \ge I(X;Z|V)$ for every $P_{X|V}(x|v)$), the secrecy capacity of the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver is given by:

$$C_{s\text{-}both} = \max_{p(x|v)} \min\{ I(X;Y|V),\ I(X;Y|V) - I(X;Z|V) + H(V|Z) \}.$$

Here, the term $I(X;Z|V) - H(V|Z)$ in the above $C_{s\text{-}both}$ can be rewritten as follows:

$$I(X;Z|V) - H(V|Z) = H(Z|V) - H(Z|X,V) - H(V|Z) = H(V,Z) - H(V) - H(Z|X,V) - H(V,Z) + H(Z) = H(Z) - H(V) - H(Z|X,V). \quad (29)$$

Substituting (29) into $C_{s\text{-}both}$, we have:

$$C_{s\text{-}both} = \max_{p(x|v)} \min\{ I(X;Y|V),\ I(X;Y|V) - H(Z) + H(V) + H(Z|X,V) \}. \quad (30)$$

On the other hand, for $Z$ less noisy than $Y$ (i.e., $I(X;Z|V) \ge I(X;Y|V)$ for every $P_{X|V}(x|v)$), Chia and El Gamal provided an achievable secrecy rate (a lower bound on the secrecy capacity) for the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver, and it is given by:

$$C_{s\text{-}both}^{i} = \max_{p(x|v)} \min\{ I(X;Y|V),\ H(V|Z,X) \}. \quad (31)$$
The following Theorem 4 shows the results on the secrecy capacity of the Gaussian case of the wiretap channel with CSI non-causally known by both the transmitter and the legitimate receiver.
Theorem 4. For the Gaussian wiretap channel with part of the Gaussian noise non-causally known by both the transmitter and the legitimate receiver, the secrecy capacity $C_{s\text{-}both}^{g}$ is characterized by the following two cases.
Case 1: If $N_1 \le N_2$, the secrecy capacity $C_{s\text{-}both}^{g}$ is given by:

$$C_{s\text{-}both}^{g} = \min\left\{ \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right),\ \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right) + \frac{1}{2}\ln(2\pi e Q) - \frac{1}{2}\ln\frac{P+Q+N_2}{N_2} \right\}. \quad (32)$$

Case 2: If $N_1 > N_2$, a lower bound $C_{s\text{-}both}^{gi}$ on the secrecy capacity $C_{s\text{-}both}^{g}$ is given by:

$$C_{s\text{-}both}^{g} \ge C_{s\text{-}both}^{gi} = \min\left\{ \frac{1}{2}\ln\frac{2\pi e Q N_2}{Q+N_2},\ \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right) \right\}. \quad (33)$$
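A matching sketch for Theorem 4 (again our own code with illustrative parameters; natural logarithms assumed, not code from [13]) evaluates the Case 1 capacity and the Case 2 lower bound.

```python
# Evaluate C_s-both^g of (32) (Case 1, N1 <= N2) and the lower bound
# C_s-both^gi of (33) (Case 2, N1 > N2); a hedged sketch, rates in nats.
import math

def c_s_both_g(P, Q, N1, N2):
    """Case 1 (N1 <= N2): secrecy capacity with shared noncausal CSI."""
    a = 0.5 * math.log(1.0 + P / N1)
    b = a + 0.5 * math.log(2 * math.pi * math.e * Q) \
          - 0.5 * math.log((P + Q + N2) / N2)
    return min(a, b)

def c_s_both_gi(P, Q, N1, N2):
    """Case 2 (N1 > N2): lower bound on the secrecy capacity."""
    return min(0.5 * math.log(2 * math.pi * math.e * Q * N2 / (Q + N2)),
               0.5 * math.log(1.0 + P / N1))

print(c_s_both_g(P=5.0, Q=0.5, N1=1.0, N2=2.0))   # Case 1 example
print(c_s_both_gi(P=5.0, Q=0.5, N1=2.0, N2=1.0))  # Case 2 example
```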
Remark 4.
For the Gaussian case, the conditional mutual information $I(X;Y|V)$ is calculated by using the fact that when the CSI is known by both the legitimate receiver and the transmitter, it can simply be subtracted off, which in effect reduces the channel to a Gaussian channel with no CSI, i.e., $I(X;Y|V) = \frac{1}{2}\ln(1+\frac{P}{N_1})$. Analogously, we have $I(X;Z|V) = \frac{1}{2}\ln(1+\frac{P}{N_2})$. Then, it is easy to see that $Y$ is less noisy than $Z$ ($I(X;Z|V) \le I(X;Y|V)$ for every $P_{X|V}(x|v)$) exactly when $N_1 \le N_2$, and $Z$ is less noisy than $Y$ ($I(X;Z|V) \ge I(X;Y|V)$ for every $P_{X|V}(x|v)$) exactly when $N_1 \ge N_2$.
Proof. The achievability of (32) and (33) is easily obtained by substituting $X \sim \mathcal{N}(0, P)$, $V \sim \mathcal{N}(0, Q)$ and (20) into (30) and (31), respectively. It remains to prove the converse of (32); see the following.
The converse part of (32) is based on the converse proof of (30) (see p. 2846, Proof of Theorem 2 of [13], and the bottom left and top right of page 2841 of [13]). However, the converse proof of (30) is for the discrete memoryless case, and it needs to be further processed for the Gaussian case. Based on the converse proof of (30) in [13] and the fact that $N_1 \le N_2$, we have the following (34) and (35):
$$\begin{aligned}
C_{s\text{-}both}^{g} &\le \frac{1}{N}\sum_{i=1}^{N}\left( I(X_i;Y_i|V_i) - I(X_i;Z_i|V_i) + h(V_i|Z_i) \right)\\
&\stackrel{(1)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( I(X_i;Y_i|V_i) - h(Z_i) + h(V_i) + h(Z_i|X_i,V_i) \right)\\
&\stackrel{(2)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( h(X_i+Z_{1,i}|V_i) - h(Z_{1,i}) - h(Z_i) + h(V_i) + h(Z_{1,i}+Z_{2,i}^*) \right)\\
&\le \frac{1}{N}\sum_{i=1}^{N}\left( h(X_i+Z_{1,i}) - h(Z_{1,i}) - h(Z_i) + h(V_i) + h(Z_{1,i}+Z_{2,i}^*) \right)\\
&\stackrel{(3)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( h(X_i+Z_{1,i}) - \frac{1}{2}\ln(2\pi e N_1) - h(X_i+V_i+Z_{1,i}+Z_{2,i}^*) + \frac{1}{2}\ln(2\pi e Q) + \frac{1}{2}\ln(2\pi e N_2) \right)\\
&\stackrel{(4)}{\le} \frac{1}{N}\sum_{i=1}^{N}\left( \frac{1}{2}\ln\left(e^{2h(X_i+Z_{1,i})}\right) - \frac{1}{2}\ln(2\pi e N_1) - \frac{1}{2}\ln\left(e^{2h(X_i+Z_{1,i})} + e^{2h(V_i+Z_{2,i}^*)}\right) + \frac{1}{2}\ln(2\pi e Q) + \frac{1}{2}\ln(2\pi e N_2) \right)\\
&\stackrel{(5)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( \frac{1}{2}\ln\left(e^{2h(X_i+Z_{1,i})}\right) - \frac{1}{2}\ln(2\pi e N_1) - \frac{1}{2}\ln\left(e^{2h(X_i+Z_{1,i})} + 2\pi e(Q+N_2-N_1)\right) + \frac{1}{2}\ln(2\pi e Q) + \frac{1}{2}\ln(2\pi e N_2) \right)\\
&= \frac{1}{N}\sum_{i=1}^{N}\left( \frac{1}{2}\ln\frac{e^{2h(X_i+Z_{1,i})}}{e^{2h(X_i+Z_{1,i})} + 2\pi e(Q+N_2-N_1)} - \frac{1}{2}\ln(2\pi e N_1) + \frac{1}{2}\ln(2\pi e Q) + \frac{1}{2}\ln(2\pi e N_2) \right)\\
&\stackrel{(6)}{\le} \frac{1}{N}\sum_{i=1}^{N}\left( \frac{1}{2}\ln\frac{2\pi e(P+N_1)}{2\pi e(P+N_1) + 2\pi e(Q+N_2-N_1)} - \frac{1}{2}\ln(2\pi e N_1) + \frac{1}{2}\ln(2\pi e Q) + \frac{1}{2}\ln(2\pi e N_2) \right)\\
&= \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right) + \frac{1}{2}\ln(2\pi e Q) - \frac{1}{2}\ln\frac{P+Q+N_2}{N_2}, \quad (34)
\end{aligned}$$
and:

$$C_{s\text{-}both}^{g} \le \frac{1}{N}\sum_{i=1}^{N} I(X_i;Y_i|V_i) = \frac{1}{N}\sum_{i=1}^{N}\left( h(Y_i|V_i) - h(Y_i|V_i,X_i) \right) \stackrel{(7)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( h(X_i+Z_{1,i}|V_i) - h(Z_{1,i}) \right) \le \frac{1}{N}\sum_{i=1}^{N}\left( h(X_i+Z_{1,i}) - h(Z_{1,i}) \right) \stackrel{(8)}{\le} \frac{1}{N}\sum_{i=1}^{N}\left( \frac{1}{2}\ln(2\pi e(P+N_1)) - \frac{1}{2}\ln(2\pi e N_1) \right) = \frac{1}{2}\ln\left(1+\frac{P}{N_1}\right), \quad (35)$$

where (1) follows from (29), (2) follows from (23) and $Z_{2,i}^* \sim \mathcal{N}(0, N_2-N_1)$, (3) follows from the fact that the differential entropy of a Gaussian random variable $X$ is $h(X) = \frac{1}{2}\ln(2\pi e D(X))$ (here, $D(X)$ is the variance of $X$), (4) follows from the entropy power inequality $e^{2h(X_i+V_i+Z_{1,i}+Z_{2,i}^*)} \ge e^{2h(X_i+Z_{1,i})} + e^{2h(V_i+Z_{2,i}^*)}$ (see [19]), (5) follows from $h(V_i+Z_{2,i}^*) = \frac{1}{2}\ln(2\pi e(Q+N_2-N_1))$, (6) follows from the facts that $\frac{1}{2}\ln\frac{e^{2h(X_i+Z_{1,i})}}{e^{2h(X_i+Z_{1,i})} + 2\pi e(Q+N_2-N_1)}$ is increasing in $h(X_i+Z_{1,i})$ and that $h(X_i+Z_{1,i}) \le \frac{1}{2}\ln(2\pi e(P+N_1))$ (here, note that equality is achieved if $X_i \sim \mathcal{N}(0, P)$), (7) follows from (23), and (8) follows from $h(X_i+Z_{1,i}) \le \frac{1}{2}\ln(2\pi e(P+N_1))$. Thus, the converse part of (32) is proven. The proof of Theorem 4 is completed.  ☐
Recall that for the degraded Gaussian wiretap channel with noncausal CSI at the transmitter ($(X, V) \rightarrow Y \rightarrow Z$), an achievable secrecy rate (a lower bound on the secrecy capacity) is provided in [10]; see the following Theorem 5.
Theorem 5. For the Gaussian non-feedback model of Figure 1 under the condition $N_1 \le N_2$, an achievable secrecy rate $C_s^{gi}$ is given by:

$$C_s^{gi} = \max_{0 \le \alpha \le 1} \min\left\{ \frac{1}{2}\ln\frac{(P+N_1)(P+\alpha^2 Q)}{\alpha^2 Q(P+N_1) + N_1 P} - \frac{1}{2}\ln\frac{P+\alpha^2 Q}{P},\ \frac{1}{2}\ln\frac{(P+N_1)(P+\alpha^2 Q)}{\alpha^2 Q(P+N_1) + N_1 P} - \frac{1}{2}\ln\frac{(P+N_2)(P+\alpha^2 Q)}{\alpha^2 Q P + N_2(P+\alpha^2 Q)} \right\}. \quad (36)$$
Proof. The result is directly obtained from [10], and therefore, the proof is omitted here.  ☐
Remark 5.
  • For the case $N_1 \le N_2$, the relationship (20) of the channel inputs and outputs can be equivalently characterized by (23), which implies the Markov chain $(X, V) \rightarrow Y \rightarrow Z$.
  • To the best of the authors' knowledge, for the case $N_1 > N_2$, the bounds on the secrecy capacity of the Gaussian wiretap channel with noncausal CSI at the transmitter are still unknown.
Finally, note that if the CSI is not available at the legitimate receiver, the wiretapper or the transmitter, and there is no feedback link from the legitimate receiver to the transmitter, the Gaussian case of the model of Figure 1 (see (20)) reduces to the Gaussian wiretap channel, where $V_i$ and $Z_{1,i}$ in (20) are the legitimate receiver's channel noises, and $V_i$ and $Z_{2,i}$ are the wiretapper's channel noises. From [20], it is easy to see that the secrecy capacity $C_s^*$ of the Gaussian wiretap channel is given by:

$$C_s^* = \frac{1}{2}\ln\frac{P+Q+N_1}{Q+N_1} - \frac{1}{2}\ln\frac{P+Q+N_2}{Q+N_2}. \quad (37)$$
Comparing Theorem 3 with Theorem 4, we can conclude that if $N_1 \le N_2$, for given $P$, $N_1$ and $N_2$, $C_s^{gf}$ is larger than $C_{s\text{-}both}^{g}$ if and only if:

$$Q \le \frac{N_1(N_2-N_1)(P+N_1)}{N_1^2 + N_2 P}. \quad (38)$$

For the case $N_1 > N_2$, we find that if $\frac{N_1}{2} < N_2 < N_1$, for given $P$, $N_1$ and $N_2$, $C_s^{gf}$ is larger than $C_{s\text{-}both}^{gi}$ if and only if:

$$Q \le \frac{N_2(N_1-N_2)}{2N_2-N_1}. \quad (39)$$

If $N_2 = \frac{N_1}{2}$, $C_s^{gf}$ is always larger than $C_{s\text{-}both}^{gi}$.
If $N_2 < \frac{N_1}{2}$, for given $P$, $N_1$ and $N_2$, $C_s^{gf}$ is larger than $C_{s\text{-}both}^{gi}$ if and only if:

$$Q \ge \frac{N_2(N_1-N_2)}{2N_2-N_1}. \quad (40)$$
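The sketch below (our own illustration; the parameter values are arbitrary choices, not the paper's) evaluates the two Case 1 capacities as $Q$ varies around the threshold in (38), which compares the equivocation terms of Theorems 3 and 4.

```python
# Compare C_s^gf (Theorem 3) with C_s-both^g (Theorem 4) for N1 <= N2 as Q
# varies around the threshold of (38); a sketch, natural logarithms.
import math

TWO_PI_E = 2 * math.pi * math.e

def c_s_gf(P, Q, N1, N2):
    return min(0.5 * math.log(1 + P / N1),
               0.5 * math.log(TWO_PI_E * (P + Q + N1) * (N2 - N1)
                              / (P + Q + N2)))

def c_s_both_g(P, Q, N1, N2):
    a = 0.5 * math.log(1 + P / N1)
    return min(a, a + 0.5 * math.log(TWO_PI_E * Q)
                  - 0.5 * math.log((P + Q + N2) / N2))

P, N1, N2 = 5.0, 1.0, 2.0
threshold = N1 * (N2 - N1) * (P + N1) / (N1 ** 2 + N2 * P)
print(f"threshold on Q from (38): {threshold:.4f}")
for Q in (0.05, 0.1, 0.5, 1.0):
    print(f"Q={Q}: C_s^gf={c_s_gf(P, Q, N1, N2):.4f}, "
          f"C_s-both^g={c_s_both_g(P, Q, N1, N2):.4f}")
```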
Figure 2. For $N_1 \le N_2$, the relationships of $P$-$C_s^{gf}$, $P$-$C_{s\text{-}both}^{g}$, $P$-$C_s^{gi}$ and $P$-$C_s^*$ for several values of $N_1$, $N_2$ and $Q$.
For the case $N_1 \le N_2$, Figure 2 plots the relationships of $P$-$C_s^*$, $P$-$C_s^{gi}$, $P$-$C_s^{gf}$ and $P$-$C_{s\text{-}both}^{g}$ for several values of $N_1$, $N_2$ and $Q$. It is easy to see that the noiseless feedback ($C_s^{gf}$), the CSI sharing scheme ($C_{s\text{-}both}^{g}$) and the CSI available only at the transmitter ($C_s^{gi}$) all help to enhance the secrecy capacity $C_s^*$ of the Gaussian wiretap channel. Furthermore, we can see that both the noiseless feedback and the CSI sharing scheme perform better than the CSI available only at the transmitter. Moreover, when $Q$ is small ($Q = 0.1, 0.5$), the noiseless feedback performs better than the CSI sharing scheme, while as $Q$ increases ($Q = 1$), the CSI sharing scheme begins to outperform the noiseless feedback.
For the case $N_1 > N_2$, Figure 3 plots the relationships of $P$-$C_s^{gf}$ and $P$-$C_{s\text{-}both}^{gi}$ for several values of $N_1$, $N_2$ and $Q$. Since $C_s^* = 0$ for the case $N_1 > N_2$, both the noiseless feedback ($C_s^{gf}$) and the CSI sharing scheme ($C_{s\text{-}both}^{gi}$) enhance the secrecy capacity $C_s^*$ of the Gaussian wiretap channel. Moreover, we can see that for fixed $Q$, if the gap between the legitimate receiver's channel noise variance $N_1$ and the wiretapper's channel noise variance $N_2$ is large, the noiseless feedback performs better than the CSI sharing scheme, and vice versa.
Figure 3. For $N_1 > N_2$, the relationships of $P$-$C_s^{gf}$ and $P$-$C_{s\text{-}both}^{gi}$ for several values of $N_1$, $N_2$ and $Q$.

3.2. Binary Case of the Model of Figure 1

In this subsection, we calculate the secrecy capacity of a degraded binary case of the model of Figure 1 with causal CSI at the transmitter, where "degraded" means that there exists a Markov chain $(X, V) \rightarrow Y \rightarrow Z$.
Suppose that the random variable $V$ is uniformly distributed over $\{0, 1\}$, i.e., $p_V(0) = p_V(1) = \frac{1}{2}$. Meanwhile, the random variables $X$, $Y$ and $Z$ take values in $\{0, 1\}$, and the wiretap channel is a BSC (binary symmetric channel) with crossover probability $q$. The transition probability of the main channel is defined as follows:
When $v = 0$:

$$p_{Y|X,V}(y|x, v=0) = \begin{cases} 1-p, & \text{if } y = x,\\ p, & \text{otherwise}. \end{cases} \quad (41)$$

When $v = 1$:

$$p_{Y|X,V}(y|x, v=1) = \begin{cases} p, & \text{if } y = x,\\ 1-p, & \text{otherwise}. \end{cases} \quad (42)$$
From Remark 2, we know that the secrecy capacity of the model of Figure 1 with causal CSI at the transmitter is given by:

$$C_s^{(cf)} = \max_{P_K(k) P_{X|K,V}(x|k,v)} \min\{ I(K;Y),\ H(Y|Z) \},$$

and the maximum achievable secrecy rate $C_s^{(ci)}$ of the wiretap channel with causal CSI [12] is given by:

$$C_s^{(ci)} = \max_{P_K(k) P_{X|K,V}(x|k,v)} \left( I(K;Y) - I(K;Z) \right), \quad (43)$$

where (43) follows from (19).
In addition, from Theorem 3 of [13], we know that the secrecy capacity $C_{s\text{-}both}$ of the wiretap channel with CSI causally or non-causally known at both the transmitter and the legitimate receiver is given by:

$$C_{s\text{-}both} = \max_{P_{X|V}(x|v)} \min\{ I(X;Y|V) - I(X;Z|V) + H(V|Z),\ I(X;Y|V) \}. \quad (44)$$

It remains to calculate $C_s^{(cf)}$, $C_s^{(ci)}$ and $C_{s\text{-}both}$; see the following.
The calculation of $C_s^{(cf)}$ and $C_s^{(ci)}$:
Let $K$ take values in $\{0, 1\}$, with probability $p_K(0) = \alpha$ and $p_K(1) = 1-\alpha$. Define the conditional probability mass function $p_{X|K,V}$ as follows:

$$p_{X|K,V}(0|0,0) = \beta_1, \quad p_{X|K,V}(1|0,0) = 1-\beta_1, \quad p_{X|K,V}(0|0,1) = \beta_2, \quad p_{X|K,V}(1|0,1) = 1-\beta_2,$$

$$p_{X|K,V}(0|1,0) = \beta_3, \quad p_{X|K,V}(1|1,0) = 1-\beta_3, \quad p_{X|K,V}(0|1,1) = \beta_4, \quad p_{X|K,V}(1|1,1) = 1-\beta_4.$$

The joint probability mass function $p_{KY}$ is calculated by:

$$p_{KY}(k, y) = \sum_{x,v} p_{KYXV}(k, y, x, v) = \sum_{x,v} p_{Y|XV}(y|x,v)\, p_{X|K,V}(x|k,v)\, p_K(k)\, p_V(v).$$

Then, we have:

$$p_{KY}(0,0) = \frac{\alpha}{2}\left[1 + (\beta_1-\beta_2)(1-2p)\right],$$

$$p_{KY}(0,1) = \frac{\alpha}{2}\left[1 - (\beta_1-\beta_2)(1-2p)\right],$$

$$p_{KY}(1,0) = \frac{1-\alpha}{2}\left[1 + (\beta_3-\beta_4)(1-2p)\right],$$

$$p_{KY}(1,1) = \frac{1-\alpha}{2}\left[1 - (\beta_3-\beta_4)(1-2p)\right].$$
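A quick symbolic cross-check of these four marginals (a sketch of our own using sympy, not part of the original derivation) carries out the marginalization directly and reproduces the $\frac{\alpha}{2}$ and $\frac{1-\alpha}{2}$ prefactors.

```python
# Symbolically marginalize p_Y|XV p_X|KV p_K p_V to re-derive p_KY(k, y);
# a cross-check of the four displays above, not code from the paper.
import sympy as sp

alpha, p = sp.symbols('alpha p')
b1, b2, b3, b4 = sp.symbols('beta1 beta2 beta3 beta4')

pV = {0: sp.Rational(1, 2), 1: sp.Rational(1, 2)}
pK = {0: alpha, 1: 1 - alpha}
pX_KV = {(0, 0, 0): b1, (1, 0, 0): 1 - b1, (0, 0, 1): b2, (1, 0, 1): 1 - b2,
         (0, 1, 0): b3, (1, 1, 0): 1 - b3, (0, 1, 1): b4, (1, 1, 1): 1 - b4}

def pY_XV(y, x, v):
    # v = 0: Pr{Y = X} = 1 - p;  v = 1: Pr{Y = X} = p  (see (41)-(42)).
    if v == 0:
        return 1 - p if y == x else p
    return p if y == x else 1 - p

def p_KY(k, y):
    return sp.factor(sum(pY_XV(y, x, v) * pX_KV[(x, k, v)] * pK[k] * pV[v]
                         for x in (0, 1) for v in (0, 1)))

for k in (0, 1):
    for y in (0, 1):
        print(f"p_KY({k},{y}) =", p_KY(k, y))
```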
By calculation, we have:

$$C_s^{(cf)} = \min\{ 1-h(p),\ h(q) \},$$

and:

$$C_s^{(ci)} = h(p+q-2pq) - h(p),$$

where $h(x) = -x\log x - (1-x)\log(1-x)$ and $0 \le x \le 1$.
The calculation of $C_{s\text{-}both}$:
Define $p_{X|V}(0|0) = \alpha$, $p_{X|V}(1|0) = 1-\alpha$, $p_{X|V}(0|1) = \beta$, $p_{X|V}(1|1) = 1-\beta$.
By calculation, $C_{s\text{-}both}$ is given by:

$$C_{s\text{-}both} = \min\{ 1-h(p),\ 1-h(p)+h(p+q-2pq) \} = 1-h(p).$$
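The three quantities plotted in Figures 4-6 can be reproduced with the short sketch below (our own code; the $(p, q)$ grid is an arbitrary choice and $h$ is the binary entropy function).

```python
# Evaluate C_s^(cf) = min{1 - h(p), h(q)}, C_s^(ci) = h(p + q - 2pq) - h(p)
# and C_s-both = 1 - h(p) for the degraded binary example; a sketch.
import math

def h(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def c_s_cf(p, q):
    return min(1 - h(p), h(q))

def c_s_ci(p, q):
    return h(p + q - 2 * p * q) - h(p)

def c_s_both(p):
    return 1 - h(p)

for q in (0.1, 0.2, 0.5):
    for p in (0.05, 0.1, 0.2):
        print(f"q={q}, p={p}: C_s^(cf)={c_s_cf(p, q):.4f}, "
              f"C_s^(ci)={c_s_ci(p, q):.4f}, C_s-both={c_s_both(p):.4f}")
```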
The following Figure 4, Figure 5 and Figure 6 show $C_s^{(cf)}$, $C_s^{(ci)}$ and $C_{s\text{-}both}$ for several values of $q$. Here, note that the wiretap channel becomes noisier as $q$ increases. It is easy to see that when $q < 0.5$, $C_{s\text{-}both}$ and $C_s^{(cf)}$ are always larger than $C_s^{(ci)}$, i.e., both the noiseless feedback (the model of this paper) and the shared CSI [13] help to enhance the security of the wiretap channel with causal CSI at the transmitter. When $q = 0.5$, the wiretapper learns nothing from the channel; thus, $C_s^{(cf)} = C_s^{(ci)} = C_{s\text{-}both} = 1-h(p)$.
Moreover, from Figure 4, Figure 5 and Figure 6, we see that the noiseless feedback performs no better than the shared CSI. However, when $q$ is large enough (satisfying $h(q) \ge 1-h(p)$), the two schemes perform the same.
Figure 4. $C_s^{(cf)}$, $C_s^{(ci)}$ and $C_{s\text{-}both}$ for $q = 0.1$.
Figure 5. $C_s^{(cf)}$, $C_s^{(ci)}$ and $C_{s\text{-}both}$ for $q = 0.2$.
Figure 6. $C_s^{(cf)}$, $C_s^{(ci)}$ and $C_{s\text{-}both}$ for $q = 0.5$.

4. Conclusions

In this paper, we study the general wiretap channel with CSI and noiseless feedback, where the CSI is available at the transmitter in a noncausal or causal manner. Both the capacity-equivocation region and the secrecy capacity are determined for the noncausal and causal cases, and the results are further explained via Gaussian and binary examples. For the Gaussian example, we show that both the noiseless feedback and the CSI sharing scheme [13] help to enhance the security of the Gaussian wiretap channel. Moreover, we show that in some particular cases, the noiseless feedback performs even better than the CSI sharing scheme [13]. For the degraded binary example, we also find that the noiseless feedback enhances the security of the wiretap channel with causal CSI. Unlike the Gaussian example, we find that the noiseless feedback always performs no better than the CSI sharing scheme [13].

Acknowledgment

The authors would like to thank the anonymous reviewers for their valuable suggestions to improve this paper. This work was supported by a sub-project in the National Basic Research Program of China under Grant 2012CB316100 on Broadband Mobile Communications at High Speeds, the National Natural Science Foundation of China under Grant 61301121, the Fundamental Research Fund for the Central Universities under Grant 2682014CX099, the Key Grant Project of Chinese Ministry of Education (No. 311031 100), the Young Innovative Research Team of Sichuan Province (2011JTD0007) and the Open Research Fund of National Mobile Communications Research Laboratory, Southeast University (No. 2014D01).

Author Contributions

Bin Dai designed the research; Bin Dai and Zheng Ma performed the research; Linman Yu analyzed the data; Bin Dai wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix

A. Converse Proof of Theorem 1

Given an achievable $(R, R_e)$ pair, we need to show that there exists a joint distribution of the form $P_{Z|Y}(z|y)\, P_{Y|XV}(y|x,v)\, P_{X|KV}(x|k,v)\, P_{KV}(k,v)$ such that:

$$0 \le R_e \le R, \quad (A.1)$$

$$0 \le R \le I(K;Y) - I(K;V), \quad (A.2)$$

$$R_e \le H(Y|Z). \quad (A.3)$$

A.1. Proof of (A.1)

$$R_e \le \lim_{N \rightarrow \infty} \Delta \le \lim_{N \rightarrow \infty} \frac{1}{N} H(W) = \lim_{N \rightarrow \infty} \frac{\log \|\mathcal{W}\|}{N} = R.$$

A.2. Proof of (A.2)

$$\begin{aligned}
\frac{1}{N} H(W) &= \frac{1}{N}\left( I(W; Y^N) + H(W|Y^N) \right)\\
&\stackrel{(a)}{\le} \frac{1}{N}\left( I(W; Y^N) + \delta(P_e) \right)\\
&\stackrel{(b)}{=} \frac{1}{N}\left( I(W; Y^N) - I(W; V^N) + \delta(P_e) \right)\\
&\stackrel{(c)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( I(Y_i; W, V_{i+1}^N | Y^{i-1}) - I(V_i; W, Y^{i-1} | V_{i+1}^N) \right) + \frac{\delta(P_e)}{N}\\
&\stackrel{(d)}{\le} \frac{1}{N}\sum_{i=1}^{N}\left( H(Y_i) - H(Y_i | Y^{i-1}, W, V_{i+1}^N) - H(V_i) + H(V_i | V_{i+1}^N, W, Y^{i-1}) \right) + \frac{\delta(P_e)}{N}\\
&= \frac{1}{N}\sum_{i=1}^{N}\left( I(Y_i; W, V_{i+1}^N, Y^{i-1}) - I(V_i; W, Y^{i-1}, V_{i+1}^N) \right) + \frac{\delta(P_e)}{N}\\
&\stackrel{(e)}{=} \frac{1}{N}\sum_{i=1}^{N}\left( I(Y_i; W, V_{i+1}^N, Y^{i-1} | J=i) - I(V_i; W, Y^{i-1}, V_{i+1}^N | J=i) \right) + \frac{\delta(P_e)}{N}\\
&\stackrel{(f)}{=} I(Y_J; W, V_{J+1}^N, Y^{J-1} | J) - I(V_J; W, Y^{J-1}, V_{J+1}^N | J) + \frac{\delta(P_e)}{N}\\
&\stackrel{(g)}{\le} I(Y_J; W, V_{J+1}^N, Y^{J-1}, J) - I(V_J; W, Y^{J-1}, V_{J+1}^N, J) + \frac{\delta(P_e)}{N}\\
&\stackrel{(h)}{=} I(K; Y) - I(K; V) + \frac{\delta(P_e)}{N},
\end{aligned} \quad (A.4)$$

where (a) follows from Fano's inequality, (b) follows from the fact that $W$ is independent of $V^N$, (c) follows from Csiszár's equality:

$$\sum_{i=1}^{N} I(Y_i; V_{i+1}^N | Y^{i-1}, W) = \sum_{i=1}^{N} I(V_i; Y^{i-1} | V_{i+1}^N, W), \quad (A.5)$$

(d) follows from $V_i$ being independent of $V_{i+1}^N$, (e) and (f) follow from $J$ being a random variable uniformly distributed over $\{1, 2, \ldots, N\}$ and independent of $W$, $V^N$ and $Y^N$, (g) follows from $V_J$ being independent of $J$, and (h) follows from the definitions $Y \triangleq Y_J$, $V \triangleq V_J$ and $K \triangleq (W, Y^{J-1}, V_{J+1}^N, J)$.
By using $P_e \le \epsilon$, $\epsilon \rightarrow 0$ as $N \rightarrow \infty$, $\lim_{N \rightarrow \infty} \frac{H(W)}{N} = R$ and (A.4), it is easy to see that $R \le I(K;Y) - I(K;V)$.
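Csiszár's equality (A.5) holds for every joint distribution; the following small sketch (our own check, with $N = 3$ and all variables binary, on a randomly drawn joint pmf) verifies it numerically.

```python
# Numerical check of Csiszar's sum identity (A.5) for N = 3 and binary
# (W, Y_1..Y_3, V_1..V_3); the random joint pmf is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(1)
p = rng.dirichlet(np.ones(2 ** 7)).reshape((2,) * 7)  # axes: W, Y1..Y3, V1..V3

def H(q):
    q = q[q > 0]
    return -np.sum(q * np.log2(q))

def I_cond(a, b, c):
    """I(A; B | C); a, b, c are disjoint lists of axis indices of p."""
    def Hm(axes):
        drop = tuple(i for i in range(p.ndim) if i not in axes)
        return H(p.sum(axis=drop).ravel())
    return Hm(a + c) + Hm(b + c) - Hm(a + b + c) - Hm(c)

# The i = 3 term on the left and the i = 1 term on the right are zero.
lhs = I_cond([1], [5, 6], [0]) + I_cond([2], [6], [0, 1])
rhs = I_cond([5], [1], [0, 6]) + I_cond([6], [1, 2], [0])
print(lhs, rhs)  # equal up to floating-point error
```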

A.3. Proof of (A.3)

$$\frac{1}{N} H(W|Z^N) \stackrel{(1)}{\le} \frac{1}{N}\left( I(W; Y^N | Z^N) + \delta(P_e) \right) \le \frac{1}{N}\sum_{i=1}^{N} H(Y_i|Z_i) + \frac{\delta(P_e)}{N} \stackrel{(2)}{=} \frac{1}{N}\sum_{i=1}^{N} H(Y_i|Z_i, J=i) + \frac{\delta(P_e)}{N} \stackrel{(3)}{=} H(Y_J|Z_J, J) + \frac{\delta(P_e)}{N} \stackrel{(4)}{\le} H(Y|Z) + \frac{\delta(P_e)}{N}, \quad (A.6)$$

where (1) follows from Fano's inequality, (2) follows from $J$ being a random variable uniformly distributed over $\{1, 2, \ldots, N\}$ and independent of $Y^N$ and $Z^N$, (3) follows from $J$ being uniformly distributed over $\{1, 2, \ldots, N\}$, and (4) follows from the definitions $Y \triangleq Y_J$ and $Z \triangleq Z_J$.
By using $P_e \le \epsilon$, $\epsilon \rightarrow 0$ as $N \rightarrow \infty$, $\lim_{N \rightarrow \infty} \frac{H(W|Z^N)}{N} \ge R_e$ and (A.6), it is easy to see that $R_e \le H(Y|Z)$.
The converse proof of Theorem 1 is completed.

B. Direct Proof of Theorem 1

The direct part (achievability) of Theorem 1 is proven by considering the following two cases.
  • Case 1: If $I(K;Y) - I(K;V) \ge H(Y|Z)$, we need to show that $(R = I(K;Y) - I(K;V) - \epsilon,\ R_e = H(Y|Z))$ is achievable, where $\epsilon \rightarrow 0^+$.
  • Case 2: If $I(K;Y) - I(K;V) \le H(Y|Z)$, we need to show that $(R = I(K;Y) - I(K;V) - \epsilon,\ R_e = R = I(K;Y) - I(K;V) - \epsilon)$ is achievable.
The direct proof of Theorem 1 is organized as follows. The balanced coloring lemma introduced by Ahlswede and Cai is provided in Subsection B.1, and it will be used in the remainder of this section. The code-book generation is shown in Subsection B.2, and the equivocation analysis is given in Subsection B.3.

B.1. The Balanced Coloring Lemma

The balanced coloring lemma was first introduced by Ahlswede and Cai; see the following.
Lemma 1 (Balanced coloring lemma). For all $\epsilon_1, \epsilon_2, \epsilon_3, \delta > 0$, sufficiently large $N$ and all $N$-types $P_Y(y)$, there exists a $\gamma$-coloring $c: T_Y^N(\epsilon_1) \rightarrow \{1, 2, \ldots, \gamma\}$ of $T_Y^N(\epsilon_1)$ such that for all joint $N$-types $P_{YZ}(y, z)$ with marginal distribution $P_Z(z)$ and $\frac{|T_{Y|Z}^N(z^N)|}{\gamma} > 2^{N\epsilon_2}$, $z^N \in T_Z^N(\epsilon_3)$:

$$|c^{-1}(k)| \le \frac{|T_{Y|Z}^N(z^N)|(1+\delta)}{\gamma}, \quad (B.1)$$

for $k = 1, 2, \ldots, \gamma$, where $c^{-1}$ is the inverse image of $c$.
Proof. Letting $U = \mathrm{const}$, Lemma 1 follows directly from p. 259 of [1], and thus, we omit the proof here.  ☐
Lemma 1 shows that if $y^N$ and $z^N$ are jointly typical, then for a given $z^N$, the number of $y^N \in T_{Y|Z}^N(z^N)$ with a certain color $k$ ($k = 1, 2, \ldots, \gamma$), denoted by $|c^{-1}(k)|$, is upper bounded by $\frac{|T_{Y|Z}^N(z^N)|(1+\delta)}{\gamma}$. By using Lemma 1, it is easy to see that the typical set $T_{Y|Z}^N(z^N)$ maps into at least:

$$\frac{|T_{Y|Z}^N(z^N)|}{\frac{|T_{Y|Z}^N(z^N)|(1+\delta)}{\gamma}} = \frac{\gamma}{1+\delta} \quad (B.2)$$

colors. On the other hand, the typical set $T_{Y|Z}^N(z^N)$ maps into at most $\gamma$ colors.

B.2. Code-Book Generation

Fix the joint probability mass function $P_{Z,Y|X,V}(z,y|x,v)\, P_{X|K,V}(x|k,v)\, P_{KV}(k,v)$. The message set $\mathcal{W}$ satisfies:

$$\lim_{N \rightarrow \infty} \frac{\log \|\mathcal{W}\|}{N} = R = I(K;Y) - I(K;V) - \epsilon. \quad (B.3)$$

Let $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$.
A block Markov encoding scheme is used in the direct proof of Theorem 1. The random vectors $K^N$, $V^N$, $X^N$, $Y^N$ and $Z^N$ consist of $n$ blocks of length $N$. Let $\tilde{K}_i$, $\tilde{V}_i$, $\tilde{Y}_i$ and $\tilde{Z}_i$ ($1 \le i \le n$) be the random vectors for block $i$. Define $\tilde{k}^n = (\tilde{k}_1, \tilde{k}_2, \ldots, \tilde{k}_n)$, $\tilde{v}^n = (\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_n)$, $\tilde{y}^n = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_n)$ and $\tilde{z}^n = (\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_n)$ to be the specific vectors for all blocks. The message $W^n$ for all $n$ blocks is denoted by $W^n = (W_1, W_2, \ldots, W_n)$, where $W_i$ ($2 \le i \le n$) is uniformly distributed over the alphabet $\mathcal{W}$, and $W_i$ is independent of $W_j$ ($2 \le j \le n$ and $j \neq i$). Note that $w_1$ does not exist.
Construction of $K^N$:
Gel'fand and Pinsker's binning and a block Markov coding scheme are used in the construction of $K^N$.
  • Construction of $K^N$ for Case 1:
    For each block, generate $2^{N(I(K;Y)-\epsilon_{2,N})}$ ($\epsilon_{2,N} \rightarrow 0$) i.i.d. sequences $k^N$ according to $p_K(k)$. Partition these sequences at random into $2^{NR} = 2^{N(I(K;Y)-I(K;V)-\gamma_1)}$ bins, such that each bin has $2^{N(I(K;V)+\gamma_1-\epsilon_{2,N})}$ sequences. Index each bin by $l \in \{1, 2, \ldots, 2^{NR}\}$.
    Denote the message $w_i$ ($2 \le i \le n$) by $w_i = (w_{i1}, w_{i2})$, where $w_{i1} \in \mathcal{W}_{i1} = \{1, 2, \ldots, 2^{NH(Y|Z)}\}$ and $w_{i2} \in \mathcal{W}_{i2} = \{1, 2, \ldots, 2^{N(R-H(Y|Z))}\}$. Here, note that $W_{i1}$ is independent of $W_{i2}$.
    In the first block, for a given side information $\tilde{v}_1$, try to find a $\tilde{k}_1$ such that $(\tilde{k}_1, \tilde{v}_1) \in T_{KV}^N(\epsilon)$. If multiple such sequences exist, randomly choose one for transmission. If there is no such sequence, declare an encoding error.
    For the $i$-th block ($2 \le i \le n$), the transmitter receives the output $\tilde{y}_{i-1}$ of the $(i-1)$-th block; he or she gives up if $\tilde{y}_{i-1} \notin T_Y^N(\epsilon_2)$ ($\epsilon_2 \rightarrow 0$ as $N \rightarrow \infty$). It is easy to see that the probability of giving up at the $(i-1)$-th block tends to zero as $N \rightarrow \infty$. In the case $\tilde{y}_{i-1} \in T_Y^N(\epsilon_2)$, generate a mapping $g_f: T_Y^N(\epsilon_2) \rightarrow \{1, 2, \ldots, 2^{NH(Y|Z)}\}$. Define a random variable $K_i^*$ by $K_i^* = g_f(\tilde{Y}_{i-1})$ ($2 \le i \le n$); it is uniformly distributed over the set $\mathcal{W}_{i1} = \{1, 2, \ldots, 2^{NH(Y|Z)}\}$, and $K_i^*$ is independent of $W_i$. Reveal the mapping $g_f$ to the legitimate receiver, the wiretapper and the transmitter. Then, since the transmitter gets $\tilde{y}_{i-1}$, he computes $k_i^* = g_f(\tilde{y}_{i-1}) \in \{1, 2, \ldots, 2^{NH(Y|Z)}\}$. For a given $w_i = (w_{i1}, w_{i2})$ ($2 \le i \le n$), the transmitter selects a sequence $\tilde{k}_i$ in the bin $(w_{i1} \oplus k_i^*, w_{i2})$ (where $\oplus$ is modulo addition over $\mathcal{W}_{i1}$), such that $(\tilde{k}_i, \tilde{v}_i) \in T_{KV}^N(\epsilon)$. If multiple such sequences exist in bin $(w_{i1} \oplus k_i^*, w_{i2})$, choose the one with the smallest index in the bin. If there is no such sequence, declare an encoding error. Here, note that since $K_i^*$ is independent of $W_i = (W_{i1}, W_{i2})$, $W_{i1} \oplus K_i^*$ is independent of $W_i$ and $K_i^*$. The proof is given as follows.
    Proof. Since:

    $$Pr\{K_i^* \oplus W_{i1} = a\} = \sum_{k_i^* \in \mathcal{W}_{i1}} Pr\{K_i^* \oplus W_{i1} = a, K_i^* = k_i^*\} = \sum_{k_i^* \in \mathcal{W}_{i1}} Pr\{W_{i1} = a \ominus k_i^*, K_i^* = k_i^*\} = \sum_{k_i^* \in \mathcal{W}_{i1}} Pr\{W_{i1} = a \ominus k_i^*\}\, Pr\{K_i^* = k_i^*\} = \sum_{k_i^* \in \mathcal{W}_{i1}} \frac{1}{\|\mathcal{W}_{i1}\|^2} = \frac{1}{\|\mathcal{W}_{i1}\|}, \quad (B.4)$$

    and:

    $$Pr\{K_i^* \oplus W_{i1} = a, K_i^* = k_i^*\} = Pr\{W_{i1} = a \ominus k_i^*, K_i^* = k_i^*\} = Pr\{W_{i1} = a \ominus k_i^*\}\, Pr\{K_i^* = k_i^*\} = \frac{1}{\|\mathcal{W}_{i1}\|^2}, \quad (B.5)$$

    it is easy to see that $Pr\{K_i^* \oplus W_{i1} = a, K_i^* = k_i^*\} = Pr\{K_i^* \oplus W_{i1} = a\} \cdot Pr\{K_i^* = k_i^*\}$, which implies that $K_i^* \oplus W_{i1}$ is independent of $K_i^*$.
    Analogously, we can prove that $Pr\{K_i^* \oplus W_{i1} = a, W_{i1} = w_{i1}, W_{i2} = w_{i2}\} = Pr\{K_i^* \oplus W_{i1} = a\} \cdot Pr\{W_{i1} = w_{i1}\} \cdot Pr\{W_{i2} = w_{i2}\}$, which implies that $K_i^* \oplus W_{i1}$ is independent of $W_i = (W_{i1}, W_{i2})$. Thus, the proof that $W_{i1} \oplus K_i^*$ is independent of $W_i$ and $K_i^*$ is completed.
     ☐
  • Construction of $K^N$ for Case 2: The construction of $K^N$ for Case 2 is similar to that of Case 1, except that there is no need to divide $w_i$ into two parts. The details are as follows. For the $i$-th block ($2 \le i \le n$), if $\tilde{y}_{i-1} \in T_Y^N(\epsilon_2)$, generate a mapping $g_f: T_Y^N(\epsilon_2) \rightarrow \mathcal{W}$ (note that $|T_Y^N(\epsilon_2)| \ge \|\mathcal{W}\|$). Let $K_i^* = g_f(\tilde{Y}_{i-1})$ ($2 \le i \le n$); it is uniformly distributed over the set $\mathcal{W}$, and $K_i^*$ is independent of $W_i$. Reveal the mapping $g_f$ to the legitimate receiver, the wiretapper and the transmitter. When the transmitter receives the feedback $\tilde{y}_{i-1}$ of the $(i-1)$-th block, he or she computes $k_i^* = g_f(\tilde{y}_{i-1}) \in \mathcal{W}$. For a given transmitted message $w_i$ ($2 \le i \le n$), the transmitter selects a codeword $\tilde{k}_i$ in the bin $w_i \oplus k_i^*$ (where $\oplus$ is modulo addition over $\mathcal{W}$), such that $(\tilde{k}_i, \tilde{v}_i) \in T_{KV}^N(\epsilon)$. If multiple such sequences exist in bin $w_i \oplus k_i^*$, select the one with the smallest index in the bin. If there is no such sequence, declare an encoding error. Here, note that $W_i \oplus K_i^*$ is independent of $W_i$ and $K_i^*$; the proof is similar to that of Case 1 (a toy numerical check of this one-time-pad property is sketched below), and thus we omit it here.
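The independence claims above are exactly the one-time-pad property of modular addition with a uniform key. The following toy check (our own construction with exact rational arithmetic; the alphabet size $m$ is an arbitrary stand-in for $2^{NR}$) verifies it numerically.

```python
# Toy check that W (+) K* is uniform and independent of K* (and, by symmetry,
# of W) when the key K* is uniform; (+) is modular addition over an alphabet
# of size m, standing in for the message set {1, ..., 2^{NR}} of the scheme.
import itertools
from fractions import Fraction

m = 8                                  # alphabet size (2^{NR} in the scheme)
pW = [Fraction(1, m)] * m              # W uniform
pK = [Fraction(1, m)] * m              # K* uniform, independent of W

# Joint distribution of (W, K*, W (+) K*).
joint = {(w, k, (w + k) % m): pW[w] * pK[k]
         for w, k in itertools.product(range(m), range(m))}

# Marginal of the encrypted message: uniform.
pE = [sum(pr for (w, k, e), pr in joint.items() if e == ee) for ee in range(m)]
assert all(pe == Fraction(1, m) for pe in pE)

# Independence of W (+) K* from K*: the joint factorizes into the marginals.
for k, e in itertools.product(range(m), range(m)):
    p_ke = sum(pr for (w, kk, ee), pr in joint.items() if kk == k and ee == e)
    assert p_ke == pK[k] * pE[e]
print("W (+) K* is uniform and independent of K*")
```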
Construction of $X^N$:
In each block, the channel input $x^N$ is generated by a pre-fixed discrete memoryless channel with transition probability $P_{X|K,V}(x|k,v)$. The inputs of this channel are $k^N$ and $v^N$, and the output is $x^N$.
Here, note that for Case 1, the random vector $\tilde{K}_i$ of block $i$ ($2 \le i \le n$) is i.i.d. generated corresponding to the encrypted message $(W_{i1} \oplus K_i^*, W_{i2})$ and $\tilde{V}_i$ (here, $\tilde{V}_i$ is also i.i.d. generated according to the probability mass function $P_V(v)$). Since $\tilde{Y}_i$ and $\tilde{Z}_i$ are generated according to $\tilde{K}_i$, $\tilde{V}_i$ and the discrete memoryless channel, the only connection between $(W_i, \tilde{K}_i, \tilde{V}_i, \tilde{Y}_i, \tilde{Z}_i)$ of the $i$-th block and $(W_{i-1}, \tilde{K}_{i-1}, \tilde{V}_{i-1}, \tilde{Y}_{i-1}, \tilde{Z}_{i-1})$ of the $(i-1)$-th block is the secret key $K_i^*$, which is generated from $\tilde{Y}_{i-1}$. As stated above, both the encrypted message $(W_{i1} \oplus K_i^*, W_{i2})$ and the real message $W_i = (W_{i1}, W_{i2})$ are independent of $K_i^*$, and thus, $(W_i, \tilde{K}_i, \tilde{V}_i, \tilde{Y}_i, \tilde{Z}_i)$ of the $i$-th block is independent of $(W_{i-1}, \tilde{K}_{i-1}, \tilde{V}_{i-1}, \tilde{Y}_{i-1}, \tilde{Z}_{i-1})$ of the $(i-1)$-th block. Since $(W_{i1} \oplus K_i^*, W_{i2})$ and $W_i$ are also independent of $W_j$ and $K_j^*$ ($2 \le i, j \le n$ and $j \neq i$), it is easy to see that $(W_i, \tilde{K}_i, \tilde{V}_i, \tilde{Y}_i, \tilde{Z}_i)$ is independent of $(W_j, \tilde{K}_j, \tilde{V}_j, \tilde{Y}_j, \tilde{Z}_j)$. Finally, note that $(W_{i1} \oplus K_i^*, W_{i2})$ ($2 \le i \le n$) is independent of $K_2^*$ (generated from $\tilde{Y}_1$); thus, $(W_i, \tilde{K}_i, \tilde{V}_i, \tilde{Y}_i, \tilde{Z}_i)$ is independent of $(\tilde{K}_1, \tilde{V}_1, \tilde{Y}_1, \tilde{Z}_1)$.
Analogously, in Case 2, for $2 \le i, j \le n$ and $j \neq i$, the fact that $(W_i, \tilde{K}_i, \tilde{V}_i, \tilde{Y}_i, \tilde{Z}_i)$ is independent of $(W_j, \tilde{K}_j, \tilde{V}_j, \tilde{Y}_j, \tilde{Z}_j)$ and of $(\tilde{K}_1, \tilde{V}_1, \tilde{Y}_1, \tilde{Z}_1)$ also holds.
Decoding: For block $i$ ($2 \le i \le n$), given a vector $\tilde{y}_i \in \mathcal{Y}^N$, try to find a sequence $\tilde{k}_i(\hat{w}_{i1} \oplus k_i^*, \hat{w}_{i2}, \hat{j})$ (Case 1) or $\tilde{k}_i(\hat{w}_i \oplus k_i^*, \hat{j})$ (Case 2), such that $\tilde{k}_i$ and $\tilde{y}_i$ are jointly typical. If there exists a unique such sequence, output the corresponding index of the bin, $(\hat{w}_{i1} \oplus k_i^*, \hat{w}_{i2})$ or $\hat{w}_i \oplus k_i^*$. Otherwise, declare a decoding error. Since the legitimate receiver knows $k_i^*$, it recovers the corresponding $\hat{w}_i$ from $(\hat{w}_{i1} \oplus k_i^*, \hat{w}_{i2})$ or $\hat{w}_i \oplus k_i^*$.

B.3. Proof of Achievability

Here, note that the above encoding-decoding scheme for the achievability proof of Theorem 1 is exactly the same as that in [11], except that the transmitter transmits an "encrypted message" by using the secret key $k_i^*$. Since the legitimate receiver knows $k_i^*$, the decoding scheme for the achievability proof of Theorem 1 is in fact the same as that in [11]. Hence, we omit the proof of $P_e \le \epsilon$ here. It remains to prove that $\lim_{N \rightarrow \infty} \Delta \ge R_e$; see the following.
  • For Case 1, part of the message $w_i$ is encrypted by $k_i^*$. In the analysis of the equivocation, we drop $w_{i2}$ from $w_i$. Then, the equivocation about $w_{i1}$ is equivalent to the equivocation about $k_i^*$. Since $k_i^* = g_f(\tilde{y}_{i-1})$, the wiretapper tries to guess $k_i^*$ from $\tilde{y}_{i-1}$. Note that for a given $\tilde{z}_{i-1}$ and sufficiently large $N$, $Pr\{\tilde{y}_{i-1} \in T_{Y|Z}^N(\tilde{z}_{i-1})\} \rightarrow 1$. Thus, the wiretapper can guess $\tilde{y}_{i-1}$ from the conditional typical set $T_{Y|Z}^N(\tilde{z}_{i-1})$. By using Lemma 1 and (B.2), the set $T_{Y|Z}^N(\tilde{z}_{i-1})$ maps into at least $\frac{2^{NH(Y|Z)}}{1+\delta}$ values of $k_i^*$ (colors), where $\gamma = 2^{NH(Y|Z)}$. Thus, in the $i$-th block, since $K_i^*$ is uniformly distributed, the uncertainty about $K_i^*$ is bounded by:

    $$\frac{1}{N} H(K_i^* | \tilde{Z}_{i-1}) \ge H(Y|Z) - \frac{\log(1+\delta)}{N}. \quad (B.6)$$

  • For Case 2, the alphabet of the secret key $k_i^*$ equals the alphabet $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$, and the encrypted message is denoted by $w_i \oplus k_i^*$. Then, by using Lemma 1 and (B.2), the set $T_{Y|Z}^N(\tilde{z}_{i-1})$ maps into at least $\frac{2^{NR}}{1+\delta}$ values of $k_i^*$ (colors), where $\gamma = 2^{NR}$. Thus, in the $i$-th block, the uncertainty about $K_i^*$ is bounded by:

    $$\frac{1}{N} H(K_i^* | \tilde{Z}_{i-1}) \ge R - \frac{\log(1+\delta)}{N}. \quad (B.7)$$
Proof of $\lim_{N \rightarrow \infty} \Delta \ge R_e$ for Case 1:

$$\begin{aligned}
\Delta &= \frac{H(W^n|Z^n)}{nN} = \frac{\sum_{i=2}^{n} H(W_i | W^{i-1}, Z^n)}{nN} \stackrel{(a)}{=} \frac{\sum_{i=2}^{n} H(W_i | \tilde{Z}_i, \tilde{Z}_{i-1})}{nN} \ge \frac{\sum_{i=2}^{n} H(W_{i1} | \tilde{Z}_i, \tilde{Z}_{i-1})}{nN}\\
&\ge \frac{\sum_{i=2}^{n} H(W_{i1} | \tilde{Z}_i, \tilde{Z}_{i-1}, W_{i1} \oplus K_i^*)}{nN} \ge \frac{\sum_{i=2}^{n} H(W_{i1} | W_{i2}, \tilde{Z}_i, \tilde{Z}_{i-1}, W_{i1} \oplus K_i^*)}{nN}\\
&\stackrel{(b)}{=} \frac{\sum_{i=2}^{n} H(W_{i1} | W_{i2}, \tilde{Z}_{i-1}, W_{i1} \oplus K_i^*)}{nN} \stackrel{(c)}{=} \frac{\sum_{i=2}^{n} H(W_{i1} | \tilde{Z}_{i-1}, W_{i1} \oplus K_i^*)}{nN}\\
&= \frac{\sum_{i=2}^{n} H(K_i^* | \tilde{Z}_{i-1}, W_{i1} \oplus K_i^*)}{nN} \stackrel{(d)}{=} \frac{\sum_{i=2}^{n} H(K_i^* | \tilde{Z}_{i-1})}{nN} \stackrel{(e)}{\ge} \frac{\sum_{i=2}^{n} \left(NH(Y|Z) - \log(1+\delta)\right)}{nN}\\
&= \frac{(n-1)\left(NH(Y|Z) - \log(1+\delta)\right)}{nN},
\end{aligned} \quad (B.8)$$

where (a) follows from the Markov chain $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^n)$ (proven in the remainder of this section), (b) follows from the Markov chain $W_{i1} \rightarrow (W_{i2}, W_{i1} \oplus K_i^*, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ (proven in the remainder of this section), (c) follows from $W_{i2}$ being independent of $\tilde{Z}_{i-1}$, $W_{i1} \oplus K_i^*$ and $W_{i1}$, (d) follows from the fact that $W_{i1} \oplus K_i^*$ is independent of $K_i^*$, $W_{i1}$ and $\tilde{Z}_{i-1}$, and (e) follows from (B.6).
Letting $N \rightarrow \infty$ and $n \rightarrow \infty$, it is easy to see that:

$$\lim_{N \rightarrow \infty} \Delta = \lim_{N \rightarrow \infty} \lim_{n \rightarrow \infty} \frac{H(W^n|Z^n)}{nN} \ge H(Y|Z) = R_e. \quad (B.9)$$

The proof of $\lim_{N \rightarrow \infty} \Delta \ge R_e$ for Case 1 is completed.
Proof of $\lim_{N \rightarrow \infty} \Delta \ge R_e$ for Case 2:

$$\begin{aligned}
\Delta &= \frac{H(W^n|Z^n)}{nN} \stackrel{(a)}{=} \frac{\sum_{i=2}^{n} H(W_i | \tilde{Z}_i, \tilde{Z}_{i-1})}{nN} \ge \frac{\sum_{i=2}^{n} H(W_i | \tilde{Z}_i, \tilde{Z}_{i-1}, W_i \oplus K_i^*)}{nN}\\
&\stackrel{(b)}{=} \frac{\sum_{i=2}^{n} H(W_i | \tilde{Z}_{i-1}, W_i \oplus K_i^*)}{nN} = \frac{\sum_{i=2}^{n} H(K_i^* | \tilde{Z}_{i-1}, W_i \oplus K_i^*)}{nN} \stackrel{(c)}{=} \frac{\sum_{i=2}^{n} H(K_i^* | \tilde{Z}_{i-1})}{nN}\\
&\stackrel{(d)}{\ge} \frac{\sum_{i=2}^{n} \left(NR - \log(1+\delta)\right)}{nN} = \frac{(n-1)\left(NR - \log(1+\delta)\right)}{nN},
\end{aligned} \quad (B.10)$$

where (a) follows from the Markov chain $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^n)$ (proven in the remainder of this section), (b) follows from the Markov chain $W_i \rightarrow (W_i \oplus K_i^*, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ (proven in the remainder of this section), (c) follows from the fact that $W_i \oplus K_i^*$ is independent of $K_i^*$ and $\tilde{Z}_{i-1}$, and (d) follows from (B.7).
Letting $N \rightarrow \infty$ and $n \rightarrow \infty$, it is easy to see that:

$$\lim_{N \rightarrow \infty} \Delta = \lim_{N \rightarrow \infty} \lim_{n \rightarrow \infty} \frac{H(W^n|Z^n)}{nN} \ge R = R_e. \quad (B.11)$$

The proof of $\lim_{N \rightarrow \infty} \Delta \ge R_e$ for Case 2 is completed.
It remains to prove the Markov chains $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^n)$ and $W_{i1} \rightarrow (W_{i2}, W_{i1} \oplus K_i^*, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ used in the proof of $\lim_{N \rightarrow \infty} \Delta \ge R_e$ for Case 1, and $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^n)$ and $W_i \rightarrow (W_i \oplus K_i^*, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ used in the proof for Case 2.
Proof. Proof of W i ( Z ˜ i , Z ˜ i - 1 ) ( W i - 1 , Z ˜ i - 2 , Z ˜ i + 1 n ) for Case 1:
For convenience, we denote the probability P r { V = v } by P r { v } .
By definition, W i ( Z ˜ i , Z ˜ i - 1 ) ( W i - 1 , Z ˜ i - 2 , Z ˜ i + 1 n ) holds if and only if:
P r { w i | z ˜ i , z ˜ i - 1 , w i - 1 , z ˜ i - 2 , z ˜ i + 1 n } = P r { w i | z ˜ i , z ˜ i - 1 } .
Equation (B.12) can be further expressed as:
P r { w i , z ˜ i , z ˜ i - 1 , w i - 1 , z ˜ i - 2 , z ˜ i + 1 n } P r { z ˜ i , z ˜ i - 1 , w i - 1 , z ˜ i - 2 , z ˜ i + 1 n } = P r { w i , z ˜ i , z ˜ i - 1 } P r { z ˜ i , z ˜ i - 1 } .
It remains to calculate the joint probabilities in (B.13); see the following.
   P r { w i , z ˜ i , z ˜ i - 1 , w i - 1 , z ˜ i - 2 , z ˜ i + 1 n } = P r { w i , z ˜ n } = v ˜ n y ˜ n k ˜ n P r { w i , z ˜ n , v ˜ n , y ˜ n , k ˜ n } = ( a ) v ˜ n y ˜ n k ˜ n P r { w i , z ˜ i , v ˜ i , y ˜ i , k ˜ i } · P r { z ˜ i + 1 n , v ˜ i + 1 n , y ˜ i + 1 n , k ˜ i + 1 n } = v ˜ i y ˜ i k ˜ i P r { w i , z ˜ i , v ˜ i , y ˜ i , k ˜ i } v ˜ i + 1 n y ˜ i + 1 n k ˜ i + 1 n P r { z ˜ i + 1 n , v ˜ i + 1 n , y ˜ i + 1 n , k ˜ i + 1 n } = v ˜ i y ˜ i k ˜ i P r { w i , z ˜ i , v ˜ i , y ˜ i , k ˜ i } P r { z ˜ i + 1 n } = ( b ) P r { z ˜ i + 1 } · · · · P r { z ˜ n } ( v ˜ i y ˜ i k ˜ i P r { w i , z ˜ i , v ˜ i , y ˜ i , k ˜ i } ) = P r { z ˜ i + 1 } · · · · P r { z ˜ n } ( v ˜ i y ˜ i k ˜ i P r { k ˜ i | w i , z ˜ i , v ˜ i , y ˜ i } P r { w i , z ˜ i , v ˜ i , y ˜ i } ) = ( c ) P r { z ˜ i + 1 } · · · · P r { z ˜ n } ( v ˜ i y ˜ i P r { w i , z ˜ i , v ˜ i , y ˜ i } ) = ( d ) P r { z ˜ i + 1 } · · · · P r { z ˜ n } ( v ˜ i y ˜ i P r { z ˜ 1 , v ˜ 1 , y ˜ 1 } j = 2 i P r { w j , z ˜ j , v ˜ j , y ˜ j } ) = P r { z ˜ i + 1 } · · · · P r { z ˜ n } ( v ˜ 1 y ˜ 1 P r { z ˜ 1 , v ˜ 1 , y ˜ 1 } ) ( v ˜ 2 y ˜ 2 P r { w 2 , z ˜ 2 , v ˜ 2 , y ˜ 2 } ) · · · ( v ˜ i y ˜ i P r { w i , z ˜ i , v ˜ i , y ˜ i } ) = P r { z ˜ i + 1 } · · · · P r { z ˜ n } P r { z ˜ 1 } P r { w 2 , z ˜ 2 } · · · P r { w i , z ˜ i } ,
where (a) is from the fact that $(\tilde{Z}_{i+1}^{n}, \tilde{V}_{i+1}^{n}, \tilde{Y}_{i+1}^{n}, \tilde{K}_{i+1}^{n})$ is independent of $(W^{i}, \tilde{Z}^{i}, \tilde{V}^{i}, \tilde{Y}^{i}, \tilde{K}^{i})$, (b) is from the fact that $\tilde{Z}_{j}$ is independent of $\tilde{Z}_{l}$ for all $i+1 \leq j, l \leq n$ with $j \neq l$, (c) is from the fact that given $w^{i}$, $\tilde{z}^{i}$, $\tilde{v}^{i}$ and $\tilde{y}^{i}$, $\tilde{k}^{i}$ is uniquely determined, and (d) follows from the fact that $(\tilde{Z}_{1}, \tilde{V}_{1}, \tilde{Y}_{1})$, $(W_{2}, \tilde{Z}_{2}, \tilde{V}_{2}, \tilde{Y}_{2})$, ..., $(W_{i}, \tilde{Z}_{i}, \tilde{V}_{i}, \tilde{Y}_{i})$ are mutually independent.
Replacing $i$ by $i-1$, the joint probability $Pr\{\tilde{z}_i, \tilde{z}_{i-1}, w^{i-1}, \tilde{z}^{i-2}, \tilde{z}_{i+1}^{n}\}$ can be calculated by:
$$Pr\{\tilde{z}_i, \tilde{z}_{i-1}, w^{i-1}, \tilde{z}^{i-2}, \tilde{z}_{i+1}^{n}\} = Pr\{w^{i-1}, \tilde{z}^{n}\} \stackrel{(e)}{=} Pr\{\tilde{z}_{i}\} \cdots Pr\{\tilde{z}_{n}\}\, Pr\{\tilde{z}_1\}\, Pr\{w_2, \tilde{z}_2\} \cdots Pr\{w_{i-1}, \tilde{z}_{i-1}\}, \tag{B.15}$$
where (e) follows from (B.14) (replacing $i$ by $i-1$).
Substituting (B.14) and (B.15) into the left-hand side of (B.13), we have:
$$\frac{Pr\{w_i, \tilde{z}_i, \tilde{z}_{i-1}, w^{i-1}, \tilde{z}^{i-2}, \tilde{z}_{i+1}^{n}\}}{Pr\{\tilde{z}_i, \tilde{z}_{i-1}, w^{i-1}, \tilde{z}^{i-2}, \tilde{z}_{i+1}^{n}\}} = \frac{Pr\{w_i, \tilde{z}_i\}}{Pr\{\tilde{z}_i\}}. \tag{B.16}$$
Next, we need to calculate the right-hand side of (B.13); see the following.
$$Pr\{w_i, \tilde{z}_i, \tilde{z}_{i-1}\} \stackrel{(1)}{=} Pr\{w_i, \tilde{z}_i\}\, Pr\{\tilde{z}_{i-1}\}, \tag{B.17}$$
where (1) is from the fact that $(W_i, \tilde{Z}_i)$ is independent of $\tilde{Z}_{i-1}$.
The joint probability $Pr\{\tilde{z}_i, \tilde{z}_{i-1}\}$ is calculated by:
$$Pr\{\tilde{z}_i, \tilde{z}_{i-1}\} \stackrel{(2)}{=} Pr\{\tilde{z}_i\} \cdot Pr\{\tilde{z}_{i-1}\}, \tag{B.18}$$
where (2) is from the fact that $\tilde{Z}_i$ is independent of $\tilde{Z}_{i-1}$.
Substituting (B.17) and (B.18) into the right-hand side of (B.13), we have:
$$\frac{Pr\{w_i, \tilde{z}_i, \tilde{z}_{i-1}\}}{Pr\{\tilde{z}_i, \tilde{z}_{i-1}\}} = \frac{Pr\{w_i, \tilde{z}_i\}}{Pr\{\tilde{z}_i\}}. \tag{B.19}$$
By checking (B.16) and (B.19), the Markov chain $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^{n})$ is proven.  ☐
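As a numerical cross-check of the factorization argument above (our own toy model, not the paper's simulation), the script below builds a joint distribution in which $\tilde{Z}_1$ and the pairs $(W_j, \tilde{Z}_j)$, $j \geq 2$, are mutually independent blocks, as in step (d), and verifies identity (B.13) at $i = n$ by exact enumeration. The alphabets and the per-index joints `p_z1` and `p_wz` are arbitrary assumptions.

```python
# A toy exact-enumeration check (not from the paper) that blockwise
# independence yields (B.13): Pr{w_i | z^n, w^{i-1}} = Pr{w_i | z_i, z_{i-1}}.
import itertools

n = 3                                       # indices 1..n; messages exist for j >= 2
p_z1 = {0: 0.6, 1: 0.4}                     # assumed Pr{z_1}
p_wz = {(0, 0): 0.3, (0, 1): 0.2,           # assumed Pr{w_j, z_j}, same for j = 2..n
        (1, 0): 0.1, (1, 1): 0.4}

# Full joint over (w_2,...,w_n, z_1,...,z_n) as a product of independent blocks.
joint = {}
for z1, p1 in p_z1.items():
    for blocks in itertools.product(p_wz.items(), repeat=n - 1):
        ws = tuple(wz[0] for wz, _ in blocks)
        zs = (z1,) + tuple(wz[1] for wz, _ in blocks)
        p = p1
        for _, pj in blocks:
            p *= pj
        joint[(ws, zs)] = joint.get((ws, zs), 0.0) + p

def prob(pred):
    return sum(p for (ws, zs), p in joint.items() if pred(ws, zs))

i, zbar = 3, (0, 1, 0)                      # check (B.13) at i = n, one realization z^n
for wi in (0, 1):
    # LHS: condition on all of z^n and the past messages w^{i-1} (fixed to (0,))
    lhs = prob(lambda ws, zs: ws[i - 2] == wi and zs == zbar and ws[:i - 2] == (0,)) \
        / prob(lambda ws, zs: zs == zbar and ws[:i - 2] == (0,))
    # RHS: condition on z_i and z_{i-1} only
    rhs = prob(lambda ws, zs: ws[i - 2] == wi and zs[i - 1] == zbar[i - 1] and zs[i - 2] == zbar[i - 2]) \
        / prob(lambda ws, zs: zs[i - 1] == zbar[i - 1] and zs[i - 2] == zbar[i - 2])
    print(wi, round(lhs, 12), round(rhs, 12))   # the two columns agree
```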
Proof. Proof of $W_{i1} \rightarrow (W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ for Case 1:
By definition, $W_{i1} \rightarrow (W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ holds if and only if:
$$Pr\{w_{i1} \mid w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\} = Pr\{w_{i1} \mid w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}. \tag{B.20}$$
Equation (B.20) can be further expressed as:
$$\frac{Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\}}{Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\}} = \frac{Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}}{Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}}. \tag{B.21}$$
It remains to calculate the joint probabilities in (B.21); see the following.
$$Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\} \stackrel{(a)}{=} Pr\{w_{i1}\} \cdot Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_i\} \cdot Pr\{\tilde{z}_{i-1}\}, \tag{B.22}$$
where (a) is from the fact that $W_{i1}$ is independent of $(W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_i, \tilde{Z}_{i-1})$, and that $\tilde{Z}_{i-1}$ is independent of $(W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_i)$.
Similarly, we have:
$$Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\} \stackrel{(b)}{=} Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_i\} \cdot Pr\{\tilde{z}_{i-1}\}, \tag{B.23}$$
where (b) is from the fact that $\tilde{Z}_{i-1}$ is independent of $(W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_i)$.
Substituting (B.22) and (B.23) into the left-hand side of (B.21), we have:
$$\frac{Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\}}{Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}, \tilde{z}_i\}} = Pr\{w_{i1}\}. \tag{B.24}$$
Next, we need to calculate the right-hand side of (B.21); see the following.
$$Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\} \stackrel{(c)}{=} Pr\{w_{i1}\} \cdot Pr\{w_{i2}\} \cdot Pr\{w_{i1} \oplus k_i^{*}\} \cdot Pr\{\tilde{z}_{i-1}\}, \tag{B.25}$$
where (c) is from the fact that $W_{i1}$, $W_{i2}$, $W_{i1} \oplus K_i^{*}$ and $\tilde{Z}_{i-1}$ are mutually independent.
The joint probability $Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}$ is calculated by:
$$Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\} \stackrel{(d)}{=} Pr\{w_{i2}\} \cdot Pr\{w_{i1} \oplus k_i^{*}\} \cdot Pr\{\tilde{z}_{i-1}\}, \tag{B.26}$$
where (d) is from the fact that $W_{i2}$, $W_{i1} \oplus K_i^{*}$ and $\tilde{Z}_{i-1}$ are mutually independent.
Substituting (B.25) and (B.26) into the right-hand side of (B.21), we have:
$$\frac{Pr\{w_{i1}, w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}}{Pr\{w_{i2}, w_{i1} \oplus k_i^{*}, \tilde{z}_{i-1}\}} = Pr\{w_{i1}\}. \tag{B.27}$$
By checking (B.24) and (B.27), the Markov chain $W_{i1} \rightarrow (W_{i2}, W_{i1} \oplus K_i^{*}, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ is proven.  ☐
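Again purely as an illustration (our own toy channel, not the paper's construction), the chain just proven reflects the fact that the wiretapper's current block depends on the message only through its encrypted version. In the sketch below, $\tilde{Z}_i$ is modeled as a noisy observation of the ciphertext $C = W_{i1} \oplus K_i^*$ alone (the alphabet and the noise distribution `p_e` are assumptions), and $I(W_{i1}; \tilde{Z}_i \mid C)$ evaluates to zero, which is the conditional independence asserted by (B.20).

```python
# A toy check (not from the paper) of the Markov chain W -> C -> Z, where
# C = W xor K is the one-time pad ciphertext and Z is a noisy version of C.
import itertools
from collections import defaultdict
from math import log2

VALS = range(4)                                 # assumed 2-bit toy alphabet
p_e = {0: 0.8, 1: 0.1, 2: 0.05, 3: 0.05}        # assumed wiretap noise on C

joint = defaultdict(float)
for w, k, e in itertools.product(VALS, VALS, VALS):
    c = w ^ k                                   # one-time pad ciphertext
    z = c ^ e                                   # wiretapper observes C through noise
    joint[(w, c, z)] += (1 / 16) * p_e[e]       # W, K uniform and independent

# Marginals needed for I(W; Z | C)
p_c, p_wc, p_cz = defaultdict(float), defaultdict(float), defaultdict(float)
for (w, c, z), p in joint.items():
    p_c[c] += p
    p_wc[(w, c)] += p
    p_cz[(c, z)] += p

# I(W; Z | C) = sum p(w,c,z) log2[ p(w,c,z) p(c) / (p(w,c) p(c,z)) ]
mi = sum(p * log2(p * p_c[c] / (p_wc[(w, c)] * p_cz[(c, z)]))
         for (w, c, z), p in joint.items())
print(mi)                                       # ~0.0: given C, Z tells nothing about W
```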
Proof. Proof of $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^{n})$ for Case 2:
Letting $W_{i2} = \emptyset$ and $W_{i1} = W_i$ for all $2 \leq i \leq n$, the proof of $W_i \rightarrow (\tilde{Z}_i, \tilde{Z}_{i-1}) \rightarrow (W^{i-1}, \tilde{Z}^{i-2}, \tilde{Z}_{i+1}^{n})$ for Case 2 follows along the lines of that for Case 1, and therefore, we omit it here.  ☐
Proof. Proof of $W_i \rightarrow (W_i \oplus K_i^{*}, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ for Case 2:
Letting $W_{i2} = \emptyset$ and $W_{i1} = W_i$ for all $2 \leq i \leq n$, the proof of $W_i \rightarrow (W_i \oplus K_i^{*}, \tilde{Z}_{i-1}) \rightarrow \tilde{Z}_i$ for Case 2 follows along the lines of that for Case 1, and therefore, we omit it here.  ☐
Thus, the proof of the direct part of Theorem 1 is completed.

References

1. Ahlswede, R.; Cai, N. Transmission, Identification and Common Randomness Capacities for Wire-Tap Channels with Secure Feedback from the Decoder. In General Theory of Information Transfer and Combinatorics; Springer-Verlag: Berlin/Heidelberg, Germany, 2006; Volume 4123, pp. 258–275.
2. Dai, B.; Vinck, A.J.H.; Luo, Y.; Zhuang, Z. Capacity region of non-degraded wiretap channel with noiseless feedback. In Proceedings of the 2012 IEEE International Symposium on Information Theory, Cambridge, MA, USA, 1–6 July 2012.
3. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387.
4. Ardestanizadeh, E.; Franceschetti, M.; Javidi, T.; Kim, Y. Wiretap channel with secure rate-limited feedback. IEEE Trans. Inf. Theory 2009, 55, 5353–5361.
5. Lai, L.; El Gamal, H.; Poor, H.V. The wiretap channel with feedback: Encryption over the channel. IEEE Trans. Inf. Theory 2008, 54, 5059–5067.
6. He, X.; Yener, A. The role of feedback in two-way secure communication. IEEE Trans. Inf. Theory 2013, 59, 8115–8130.
7. Bassi, G.; Piantanida, P.; Shamai, S. On the capacity of the wiretap channel with generalized feedback. In Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015.
8. Mitrpant, C.; Vinck, A.J.H.; Luo, Y. An achievable region for the Gaussian wiretap channel with side information. IEEE Trans. Inf. Theory 2006, 52, 2181–2190.
9. El-Halabi, M.; Liu, T.; Georghiades, C.N.; Shamai, S. Secret writing on dirty paper: A deterministic view. IEEE Trans. Inf. Theory 2012, 58, 3419–3429.
10. Chen, Y.; Vinck, A.J.H. Wiretap channel with side information. IEEE Trans. Inf. Theory 2008, 54, 395–402.
11. Gel’fand, S.I.; Pinsker, M.S. Coding for channel with random parameters. Probl. Control Inf. Theory 1980, 9, 19–31.
12. Dai, B.; Luo, Y. Some new results on wiretap channel with side information. Entropy 2012, 14, 1671–1702.
13. Chia, Y.K.; El Gamal, A. Wiretap channel with causal state information. IEEE Trans. Inf. Theory 2012, 58, 2838–2849.
14. Liu, T.; Mukherjee, P.; Ulukus, S.; Lin, S.; Hong, Y.W.P. Secure degrees of freedom of MIMO Rayleigh block fading wiretap channels with no CSI anywhere. IEEE Trans. Wireless Commun. 2015, 14, 2655–2669.
15. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic Press: New York, NY, USA, 1981; pp. 123–124.
16. Csiszár, I.; Körner, J. Broadcast channels with confidential messages. IEEE Trans. Inf. Theory 1978, 24, 339–348.
17. El Gamal, A.; Kim, Y.-H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2011.
18. Costa, M.H.M. Writing on dirty paper. IEEE Trans. Inf. Theory 1983, 29, 439–441.
19. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
20. Leung-Yan-Cheong, S.; Hellman, M.E. The Gaussian wire-tap channel. IEEE Trans. Inf. Theory 1978, 24, 451–456.
