Article

Biometric Identification Systems with Noisy Enrollment for Gaussian Sources and Channels †

1 Advanced Wireless & Communication Research Center (AWCC), The University of Electro-Communications, 1-5-1 Chofugaoka, Tokyo 182-8585, Japan
2 Department of Computer and Network Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Tokyo 182-8585, Japan
* Author to whom correspondence should be addressed.
A part of this paper was presented at the 2020 IEEE Information Theory Workshop, 11–15 April 2021. Available online: https://itw2020.it/ (accessed on 16 May 2020). The new contributions are that we provide a complete proof of Theorem 1, show that the characterized regions are convex, and add numerical calculations of the capacity regions.
Entropy 2021, 23(8), 1049; https://doi.org/10.3390/e23081049
Submission received: 30 June 2021 / Revised: 5 August 2021 / Accepted: 10 August 2021 / Published: 15 August 2021
(This article belongs to the Special Issue Information Privacy)

Abstract:
In the present paper, we investigate the fundamental trade-off among the identification, secret-key, storage, and privacy-leakage rates in biometric identification systems for remote or hidden Gaussian sources. We use a technique of converting the system into one where the data flows in one direction to derive the capacity region of these rates. We also provide numerical calculations of three different examples for the system. The numerical results imply that it is hard to achieve both a high secret-key rate and a small privacy-leakage rate simultaneously.

1. Introduction

Biometric identification refers to an automated process of recognizing an individual by matching the individual’s biological data (bio-data) against the digital files stored in the system database [1]. Unique bio-data that can be used for biometric identification include fingerprints, irises, faces, voices, palms, and so on [2]. Compared to traditional methods such as password- or smart-card-based identification, it provides higher convenience and security. However, a critical drawback of biometric identification is that the usable sources are limited [3]; for instance, a human has only two eyes, and if their information is leaked, there is no alternative to replace them, so it is important to protect users’ privacy. Furthermore, the size of the storage should be minimized to reduce the memory space of the database [4], especially when the number of users becomes large.
From an information-theoretic point of view, there are two major settings in the studies related to biometric identification systems (BISs): the BIS with exponentially many users and the system with one user. The difference is that in the former setting, we are interested in finding the maximum number of users who can be reliably identified, i.e., the maximum achievable identification rate at which the error probability of the BIS vanishes (the identification capacity), whereas in the latter setting, estimation of the user need not be considered, since there exists only one user and it becomes redundant. For discrete memoryless sources (DMSs), the fundamental performance of the BIS has been widely analyzed in both scenarios.
The BIS with multiple users was initially treated as a mathematical model in the seminal work [5], and the identification capacity of the BIS was clarified. In that model, every biometric identifier is assumed to be enrolled via a noisy channel, and this type of model is known as a remote or hidden source model. The term remote source was used in [6,7], and the hidden source model (HSM) is from [8]. In this paper, we use HSM, as in [8], to represent the BIS with noisy enrollment. An encoding process was introduced in [9] to reduce the size of the storage, and this work was extended to incorporate noisy reconstruction in [10]. The BIS with estimation of both the user’s index and a secret key was investigated in [11] for two classical models, namely the generated- and chosen-secret BIS models; that work gives a clear explanation of the difference between these models. Later, adopting the concept of the wiretap channel, the generated-secret BIS model in which the adversary is assumed to have side information on the identified user’s bio-data sequence was analyzed in [12]. Recently, a storage constraint and an HSM added to the model of [11] were studied in [13,14]. By using an additional private key, the user’s privacy-leakage can be made negligible [15,16]. Another scenario, the BIS with one user, was extensively examined in [8,17,18,19,20,21,22]. More precisely, in [17,18], the relation between the secret-key and privacy-leakage rates was analyzed. The optimal secret-key rate under privacy and storage constraints was characterized in [8,19] for non-vanishing and vanishing secrecy-leakage rates, respectively. It is worth noting that in [8], a first successful attempt was made at characterizing the capacity region of the single-user BIS for the HSM. The work of [8] was extended to constrain the action cost of the decoder in [20], and to two-enrollment systems for the same hidden source, where the encoders do not trust each other [21].
Moreover, in [22], the secret-key capacity of a multi-enrollment system, in which the decoder is required to estimate all secret keys generated in the earlier enrollments, was formulated.
Compared to the analyses of the BIS for DMSs, results for Gaussian sources are still few. For example, the optimal trade-off between the secret-key and privacy-leakage rates was characterized in [23], and, in order to reduce search complexity, hierarchical identification was taken into account in [24]. A common assumption in [23,24] is that the enrollment channel is noiseless, known as the visible source model (VSM). However, in real-life applications, the signal of bio-data is basically represented with continuous values, and most communication links can be modeled as Gaussian channels [23]. What is more, the HSM is considered more realistic, e.g., a picture of a finger captured via a scanner, and when the BIS is switched from the VSM to the HSM, the evaluation becomes more challenging [8] because many techniques used for deriving the results of the VSM are not directly applicable. These facts motivate us to extend the models in [13] to Gaussian sources and channels. Note that from a technical perspective this extension is not trivial, since the technique for establishing Theorems 1 and 2 in [13] depends heavily on the property that the alphabet sizes are finite and cannot be applied to continuous sources. The technique used in this paper is explained in Section 5 in detail. Therefore, the extension is of both theoretical and practical interest. Although it is well known that bio-data is real-valued, as mentioned in [23], the validity of the Gaussian assumption is not discussed in this paper, and we leave it for further research. Here, we are interested in specifying the optimal trade-off of the BIS.
In this study, our goal is to find the optimal trade-off between the identification and secret-key rates in the BIS under privacy and storage constraints. We demonstrate that the idea of converting the system into another one, where the data flow of each user is in the same direction, enables us to characterize the capacity region. More specifically, in establishing the outer bound of the region, the converted system allows us to use the entropy power inequality (EPI) [25] twice in two opposite directions, and its properties also facilitate the derivation of the inner bound. In [8], Mrs. Gerber’s lemma was likewise applied twice to simplify the rate region of the HSM for binary sources and symmetric channels without converting the BIS. That was possible due to the uniformity of the source and the fact that the backward channel of the enrollment channel is also a binary symmetric channel with the same crossover probability. However, this no longer holds in the Gaussian case, so it is necessary to work out the general behavior of the backward channel. We also provide numerical calculations of three different examples. As a consequence, we may conclude that it is difficult to achieve a high secret-key rate and a small privacy-leakage rate at the same time; to achieve a small privacy-leakage rate, the secret-key rate must be sacrificed. Furthermore, as a by-product of our result, the capacity regions of the BIS analyzed in [8] are obtained for Gaussian sources and channels, and, as special cases, it can be checked that this characterization reduces to the results given in [5,23].
The rest of this paper is organized as follows. In Section 2, we define the notation used in this paper and describe our system model and the converted system. In Section 3, the formal definitions and main results are discussed in detail. We continue by investigating basic properties of the capacity regions and provide three different examples in Section 4. Overviews of the proofs of our main results are given in Section 5; the full proofs are available in Appendix A and Appendix B. Finally, some concluding remarks and future work are mentioned in Section 6.

2. System Model and Converted System

2.1. Notation and System Model

Upper-case $A$ and lower-case $a$ denote a random variable (RV) and its realization, respectively. $A^n = (A_1, \dots, A_n)$ represents a string of RVs, and subscripts represent the position of an RV in the string. $f_A$ denotes the probability density function (pdf) of RV $A$. For integers $k$ and $t$ such that $k < t$, $[k:t]$ denotes the set $\{k, k+1, \dots, t\}$. $\log x$ stands for the natural logarithm of $x > 0$.
The generated-secret BIS model and chosen-secret BIS model considered in this study are depicted in Figure 1. Arrows (g) and (c) indicate the directions of the secret key in the former and latter models, respectively. Let $\mathcal{I} = [1:M_I]$, $\mathcal{S} = [1:M_S]$, and $\mathcal{J} = [1:M_J]$ be the sets of user indices, secret keys, and helper data, respectively, where $M_I$, $M_S$, and $M_J$ denote the numbers of users, secret keys, and helper data, respectively. These sets are assumed to be finite. $X_i^n$, $Y_i^n$, and $Z^n$ denote the bio-data sequence of user $i$ generated from source $P_X$, the output of $X_i^n$ via the enrollment channel $P_{Y|X}$, and the output of $X_i^n$ via the identification channel $P_{Z|X}$, respectively. For $i \in \mathcal{I}$ and $k \in [1:n]$, we assume $X_{ik} \sim \mathcal{N}(0,1)$, where $\mathcal{N}(0,1)$ denotes a Gaussian RV with mean zero and variance one. Note that an RV with unit variance can be obtained by applying a scaling technique. $P_{Y|X}$ and $P_{Z|X}$ are additive Gaussian noise channels modeled as follows:
$$Y_{ik} = \rho_1 X_{ik} + N_1, \qquad Z_k = \rho_2 X_{ik} + N_2 \qquad (k \in [1:n]), \tag{1}$$
where $|\rho_1| < 1$ and $|\rho_2| < 1$ are Pearson correlation coefficients, and $N_1 \sim \mathcal{N}(0, 1-\rho_1^2)$ and $N_2 \sim \mathcal{N}(0, 1-\rho_2^2)$ are Gaussian RVs, independent of each other and of the bio-data sequences. From (1), $Y_{ik}$ and $Z_k$ are also Gaussian with zero mean and unit variance, and the Markov chain $Y - X - Z$ holds. Then, the pdf corresponding to the tuple $(X_i^n, Y_i^n, Z^n)$ is given by
$$f_{X_i^n Y_i^n Z^n}(x_i^n, y_i^n, z^n) = \prod_{k=1}^n f_{XYZ}(x_{ik}, y_{ik}, z_k), \tag{2}$$
where for $x, y, z \in \mathbb{R}$,
$$f_{XYZ}(x, y, z) = f_X(x) \cdot f_{Y|X}(y|x) \cdot f_{Z|X}(z|x) \tag{3}$$
$$= \frac{1}{\sqrt{(2\pi)^3 (1-\rho_1^2)(1-\rho_2^2)}} \exp\left\{-\left(\frac{x^2}{2} + \frac{(y-\rho_1 x)^2}{2(1-\rho_1^2)} + \frac{(z-\rho_2 x)^2}{2(1-\rho_2^2)}\right)\right\}. \tag{4}$$
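As a quick numerical sanity check on the channel model in (1), the following Python sketch samples the source and both channels and confirms that $Y$ and $Z$ are again standard Gaussian with correlations $\rho_1$, $\rho_2$, and $\rho_1\rho_2$. The values $\rho_1^2 = 3/4$ and $\rho_2^2 = 2/3$ are only example parameters (the ones used later in Figure 4):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho1, rho2 = np.sqrt(3 / 4), np.sqrt(2 / 3)  # example correlation coefficients

# Source X ~ N(0,1); enrollment and identification channels from (1)
x = rng.standard_normal(n)
y = rho1 * x + rng.normal(0.0, np.sqrt(1 - rho1**2), n)  # Y = rho1*X + N1
z = rho2 * x + rng.normal(0.0, np.sqrt(1 - rho2**2), n)  # Z = rho2*X + N2

# Y and Z are again ~ N(0,1); Corr(X,Y)=rho1, Corr(X,Z)=rho2,
# and Corr(Y,Z)=rho1*rho2 by the Markov chain Y - X - Z
print(np.var(y), np.var(z))
print(np.corrcoef(x, y)[0, 1], np.corrcoef(y, z)[0, 1])
```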
In the generated-secret BIS model, upon observing $Y_i^n$, the encoder $e(\cdot)$ generates the secret key $S(i) \in \mathcal{S}$ and helper data $J(i) \in \mathcal{J}$ as $(S(i), J(i)) = e(Y_i^n)$. Then, $J(i)$ is stored at position $i$ in the public database (helper DB), and $S(i)$ is saved in the key DB, which is installed in a secure location. Let $W$ and $\hat{W}$ denote the index of the identified user and its estimated value, respectively. Seeing $Z^n$, the decoder $d(\cdot)$ estimates $(\hat{W}, \widehat{S(W)})$ from $Z^n$ and all helper data in the DB, $\mathcal{J} = \{J(1), \dots, J(M_I)\}$, i.e., $(\hat{W}, \widehat{S(W)}) = d(Z^n, \mathcal{J})$.
In the chosen-secret BIS model, the secret key $S(i)$ is chosen uniformly from $\mathcal{S}$, i.e.,
$$P_{S(i)}(s) = 1/M_S \quad (s \in \mathcal{S}), \tag{5}$$
and is independent of the other RVs. The encoder forms the helper data as $J(i) = e(Y_i^n, S(i))$ for every individual. The decoder $d(\cdot)$ has the same functionality as in the generated-secret BIS model.

2.2. Converted System

The original system, with $X$ as the input source and $Y, Z$ as outputs, is shown in the top part of Figure 2. There are two main obstacles to characterizing the capacity regions directly from this system. (I) In establishing the converse proof, an upper bound involving RV $Y$ for a fixed condition of RV $X$ is needed, but it is laborious to obtain the desired bound, since applying the EPI to the first relation in (1) produces only a lower bound. (II) It seems difficult to prove the achievability part by generating the codebook via a test channel due to the input $X$. To overcome these bottlenecks, we convert the original system into a new one in which the data flow of each user is one-way from $Y$ to $Z$, without losing its general properties. This idea is illustrated in the bottom part of Figure 2, where $Y$ virtually becomes the input. To achieve this, knowing the statistics of the backward channel $P_{X|Y}$, namely how $X$ correlates with the virtual input $Y$, is crucial, and we explore this in the rest of this section.
Due to the Markov chain $Y - X - Z$, Equation (3) can also be expanded in the following form:
$$f_{XYZ}(x, y, z) = f_Y(y) \cdot f_{X|Y}(x|y) \cdot f_{Z|X}(z|x). \tag{6}$$
Observe that
$$\frac{x^2}{2} + \frac{(y-\rho_1 x)^2}{2(1-\rho_1^2)} = \frac{x^2}{2} + \frac{y^2}{2(1-\rho_1^2)} - \frac{\rho_1 x y}{1-\rho_1^2} + \frac{(\rho_1 x)^2}{2(1-\rho_1^2)} = \frac{y^2}{2} + \frac{(x-\rho_1 y)^2}{2(1-\rho_1^2)}. \tag{7}$$
Hence, the exponent in (4) can be rearranged as
$$\frac{y^2}{2} + \frac{(x-\rho_1 y)^2}{2(1-\rho_1^2)} + \frac{(z-\rho_2 x)^2}{2(1-\rho_2^2)}. \tag{8}$$
From (6) and (8), we may conclude that the following relations hold for some RV $N_1' \sim \mathcal{N}(0, 1-\rho_1^2)$:
$$X_{ik} = \rho_1 Y_{ik} + N_1', \tag{9}$$
$$Z_k = \rho_2 X_{ik} + N_2 = \rho_1 \rho_2 Y_{ik} + \rho_2 N_1' + N_2. \tag{10}$$
Equations (9) and (10) describe the outputs of the backward channel $P_{X|Y}$ and the combined channel $P_{Z|Y}$ of the virtual system. Actually, these relations can also be read off directly from the covariance matrix of the RVs $(X, Y, Z)$; however, we derive them from the joint pdf for general readers. Moreover, this transformation is useful for the analysis of a non-standard source. The above relations play key roles in solving the problem of the HSM, and we use them in many steps of the analysis in this paper. In [23,24], this transformation does not appear because the enrollment channel is noiseless under the VSM assumption, as mentioned before.
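The backward relations (9) and (10) can also be checked numerically: regressing $X$ on $Y$ recovers the slope $\rho_1$ and residual variance $1-\rho_1^2$, and regressing $Z$ on $Y$ recovers $\rho_1\rho_2$ and $1-\rho_1^2\rho_2^2$. A short Monte Carlo sketch, with the example values $\rho_1^2 = 3/4$ and $\rho_2^2 = 2/3$ assumed:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
rho1, rho2 = np.sqrt(3 / 4), np.sqrt(2 / 3)
x = rng.standard_normal(n)
y = rho1 * x + rng.normal(0.0, np.sqrt(1 - rho1**2), n)
z = rho2 * x + rng.normal(0.0, np.sqrt(1 - rho2**2), n)

# Backward channel (9): X = rho1*Y + N1' with Var(N1') = 1 - rho1^2
slope_xy = np.cov(x, y)[0, 1] / np.var(y)    # least-squares slope of X on Y
resid_xy = np.var(x - slope_xy * y)          # residual variance
print(slope_xy, resid_xy)                    # close to rho1, 1 - rho1^2

# Combined channel (10): Z = rho1*rho2*Y + noise of variance 1 - rho1^2*rho2^2
slope_zy = np.cov(z, y)[0, 1] / np.var(y)
resid_zy = np.var(z - slope_zy * y)
print(slope_zy, resid_zy)                    # close to rho1*rho2, 1 - rho1^2*rho2^2
```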
Remark 1.
In the case where no scaling is applied, Equations (9) and (10) take the following form. Suppose that $X_{ik} \sim \mathcal{N}(0, \sigma_x^2)$ with $\sigma_x^2 < \infty$, $Y_{ik} = X_{ik} + D_1$, and $Z_k = X_{ik} + D_2$, where $D_1 \sim \mathcal{N}(0, \sigma_1^2)$ and $D_2 \sim \mathcal{N}(0, \sigma_2^2)$ are Gaussian RVs, independent of each other and of the other RVs. By arguments similar to those around (6)–(8), we obtain
$$X_{ik} = \frac{\sigma_x^2}{\sigma_x^2+\sigma_1^2} Y_{ik} + D_1', \qquad Z_k = X_{ik} + D_2 = \frac{\sigma_x^2}{\sigma_x^2+\sigma_1^2} Y_{ik} + D_1' + D_2, \tag{11}$$
where $D_1' \sim \mathcal{N}\!\left(0, \frac{\sigma_x^2\sigma_1^2}{\sigma_x^2+\sigma_1^2}\right)$ is Gaussian and independent of the other RVs. The capacity regions of the models considered in this paper can also be characterized via (11), and the results for this case will be mentioned in Remark 3. However, the equation developments need more space and do not look as neat. Hence, we pursue our results with the RVs $X$, $Y$, and $Z$ standardized.
Now from (9) and (10), it is not difficult to verify that
$$I(X;Y) = \frac{1}{2}\log\frac{1}{1-\rho_1^2}, \qquad I(Z;Y) = \frac{1}{2}\log\frac{1}{1-\rho_1^2\rho_2^2}, \tag{12}$$
where the right equation in (12) holds because the variance of the noise term $\rho_2 N_1' + N_2$ in (10) equals $1-\rho_1^2\rho_2^2$.
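The closed forms in (12) can be cross-checked against sample estimates: for a jointly Gaussian pair, $I = -\frac{1}{2}\log(1-r^2)$ with $r$ the correlation coefficient, so plugging in the empirical correlation should reproduce (12). A sketch under the assumed example values $\rho_1^2 = 3/4$, $\rho_2^2 = 2/3$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
rho1, rho2 = np.sqrt(3 / 4), np.sqrt(2 / 3)
x = rng.standard_normal(n)
y = rho1 * x + rng.normal(0.0, np.sqrt(1 - rho1**2), n)
z = rho2 * x + rng.normal(0.0, np.sqrt(1 - rho2**2), n)

def gauss_mi(a, b):
    """Mutual information (nats) of a jointly Gaussian pair via empirical r."""
    r = np.corrcoef(a, b)[0, 1]
    return -0.5 * np.log(1 - r**2)

I_XY = 0.5 * np.log(1 / (1 - rho1**2))            # (12), left
I_ZY = 0.5 * np.log(1 / (1 - rho1**2 * rho2**2))  # (12), right
print(gauss_mi(x, y), I_XY)
print(gauss_mi(z, y), I_ZY)
```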

3. Problem Formulation and Main Results

The achievability definition for the generated-secret BIS model is given below.
Definition 1.
A tuple of identification, secret-key, public storage, and privacy-leakage rates ( R I , R S , R J , R L ) is said to be achievable for the generated-secret BIS model under a Gaussian source if for any δ > 0 and large enough n there exist pairs of encoders and decoders satisfying
$$\max_{i \in \mathcal{I}} \Pr\{(\hat{W}, \widehat{S(W)}) \neq (W, S(W)) \mid W = i\} \le \delta, \quad \text{(error probability)} \tag{13}$$
$$\frac{1}{n}\log M_I \ge R_I - \delta, \quad \text{(identification rate)} \tag{14}$$
$$\min_{i \in \mathcal{I}} \frac{1}{n} H(S(i)) \ge R_S - \delta, \quad \text{(secret-key rate)} \tag{15}$$
$$\frac{1}{n}\log M_J \le R_J + \delta, \quad \text{(public storage rate)} \tag{16}$$
$$\max_{i \in \mathcal{I}} \frac{1}{n} I(S(i); J(i)) \le \delta, \quad \text{(secrecy-leakage rate)} \tag{17}$$
$$\max_{i \in \mathcal{I}} \frac{1}{n} I(X_i^n; J(i)) \le R_L + \delta. \quad \text{(privacy-leakage rate)} \tag{18}$$
Moreover, R G is defined as the set of all achievable rate tuples for the generated-secret BIS model, called the capacity region.
For the chosen-secret BIS model, the definition is provided as follows:
Definition 2.
A tuple $(R_I, R_S, R_J, R_L)$ is said to be achievable for the chosen-secret BIS model under a Gaussian source if there exist pairs of encoders and decoders that satisfy all the requirements in Definition 1 for any $\delta > 0$ and large enough $n$. Note that the left-hand side of (15) is expressed as $\frac{1}{n}\log M_S$ because the key is chosen uniformly from $\mathcal{S}$ (cf. (5)). In addition, $\mathcal{R}_C$ is defined as the capacity region for the chosen-secret BIS model.
Remark 2.
Note that in the BIS there are two databases, namely those for the secret keys and for the helper data. The memory space of the database storing the helper data (public database) is minimized, while that for the secret keys (secure database) should be maximized. This means that only a part of the entire storage space of the BIS, the public database, is being compressed, and thus it is natural to call this compression rate the public storage rate. However, for brevity, we hereafter call the public storage rate simply the storage rate, as in [8].
Now we are ready to introduce our main results.
Theorem 1.
The capacity regions for the generated- and chosen-secret BIS models are given by
$$\mathcal{R}_G = \bigcup_{0 < \alpha \le 1} \Big\{ (R_I, R_S, R_J, R_L) : \; R_I + R_S \le \tfrac{1}{2}\log\tfrac{1}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}, \;\; R_J \ge \tfrac{1}{2}\log\tfrac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha} + R_I, \;\; R_L \ge \tfrac{1}{2}\log\tfrac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha\rho_1^2 + 1 - \rho_1^2} + R_I, \;\; R_I \ge 0, \; R_S \ge 0 \Big\}, \tag{19}$$
$$\mathcal{R}_C = \bigcup_{0 < \alpha \le 1} \Big\{ (R_I, R_S, R_J, R_L) : \; R_I + R_S \le \tfrac{1}{2}\log\tfrac{1}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}, \;\; R_J \ge \tfrac{1}{2}\log\tfrac{1}{\alpha}, \;\; R_L \ge \tfrac{1}{2}\log\tfrac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha\rho_1^2 + 1 - \rho_1^2} + R_I, \;\; R_I \ge 0, \; R_S \ge 0 \Big\}. \tag{20}$$
The proof of Theorem 1 is provided in Appendix A and Appendix B. It can be verified that the regions $\mathcal{R}_G$ and $\mathcal{R}_C$ are both convex; the proofs are available in Appendix C. Unlike the approach taken in [23], which investigates the second derivative of the rate region function, our proof makes use of the concavity of the logarithmic function. In both regions, $\alpha = 0$ is excluded because that point is not achievable; this fact will be mentioned again in the converse proof of Equation (19).
For a fixed $\alpha$, the optimal rate values for the regions $\mathcal{R}_G$ and $\mathcal{R}_C$ are shown in Figure 3. We begin by explaining Figure 3a. Suppose that $0 < R_I < \frac{1}{2}\log\frac{1}{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}$. In the top band chart, $\frac{1}{2}\log\frac{1}{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}$ is the maximum achievable rate at which users’ identities can be estimated correctly at the decoder. Since both the index and the secret key of the identified user are reconstructed at the decoder, the sum of the optimal identification and secret-key rates equals this value, implying that the optimal secret-key rate is $R_S = \frac{1}{2}\log\frac{1}{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2} - R_I$. One can see that these rates are in a trade-off relation: as the identification rate rises, the secret-key rate falls off. In the bottom chart, $\frac{1}{2}\log\frac{1}{\alpha}$ is the entire rate needed to generate auxiliary random sequences for encoding. The first part (blue part) represents the secret-key rate, and the second half ($\frac{1}{2}\log\frac{1}{\alpha} - R_S = \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}{\alpha} + R_I$) is the rate of the sequences shared between the encoder and decoder to help the estimation of the index and secret key, corresponding to the storage rate. Storing the helper data at this rate leaks the user’s privacy at a rate of at least $\frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}{\alpha\rho_1^2 + 1-\rho_1^2} + R_I$, which is the optimal (minimum) privacy-leakage for the given $\alpha$.
For Figure 3b (the chosen-secret BIS model), the relation between the identification and secret-key rates is the same as in the generated-secret BIS model. However, the optimal storage rate becomes larger than the one seen in Figure 3a, equal to $\frac{1}{2}\log\frac{1}{\alpha}$ (the bottom band chart of Figure 3b), because the information related to the secret key chosen at the encoder (the concealed part) must be saved together with the helper data in the DB to help the estimation of the key. As for the privacy-leakage rate, the minimum values of the two models do not differ. This is because the unconcealed part of the storage, at rate $\frac{1}{2}\log\frac{1}{\alpha} - R_S = \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}{\alpha} + R_I$, identical to the optimal storage rate of the generated-secret BIS model, is still exposed publicly, and thus the minimum privacy-leakage rates of the two models are the same.
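The trade-offs described above can be tabulated directly from (19). The following sketch (the function name and the sampled $\alpha$ values are ours) evaluates, for a fixed $\alpha$, the maximum secret-key rate and the minimum storage and privacy-leakage rates of the generated-secret region, in nats:

```python
import numpy as np

def rg_boundary(alpha, rho1_sq, rho2_sq, R_I=0.0):
    """Boundary rates of the region in (19) for a fixed alpha in (0, 1]."""
    c = alpha * rho1_sq * rho2_sq + 1 - rho1_sq * rho2_sq
    R_S = 0.5 * np.log(1 / c) - R_I                # max secret-key rate
    R_J = 0.5 * np.log(c / alpha) + R_I            # min storage rate
    R_L = 0.5 * np.log(c / (alpha * rho1_sq + 1 - rho1_sq)) + R_I  # min leakage
    return R_S, R_J, R_L

# Example parameters of Figure 4: rho1^2 = 3/4, rho2^2 = 2/3
for alpha in (0.1, 0.5, 1.0):
    print(alpha, rg_boundary(alpha, 3 / 4, 2 / 3))
```

At $\alpha = 1$ all three boundary rates collapse to zero (with $R_I = 0$), while letting $\alpha \to 0$ drives the secret-key rate toward $\frac{1}{2}\log\frac{1}{1-\rho_1^2\rho_2^2}$ at the cost of unbounded storage.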
Figure 4 shows a numerical example of the region $\mathcal{R}_G$ for $\rho_1^2 = 3/4$ and $\rho_2^2 = 2/3$. More specifically, Figure 4a is a projection of the capacity region onto the three-dimensional Euclidean space with X-axis $R_J$, Y-axis $R_S$, and Z-axis $R_I$. The black thick arrow indicates the direction of the achievable region for all rate tuples $(R_J, R_S, R_I)$. Figure 4b is another projection of the capacity region onto the $R_J$-$R_I$ plane. Red asterisks and circles correspond to the rate points $(R_J, R_I)$ at which $R_I$ is zero and $R_I$ is optimal, respectively, for some $\alpha \in (0,1]$. To explain the relation between the identification and storage rates, let us focus on the rightmost asterisk and circle pair in Figure 4b. As the identification rate varies from zero to the optimal value, the rate point $(R_J, R_I)$ moves from the asterisk point (at the bottom) to the circled point along the arrow. From this, it is clear that the storage rate at the circled point is greater than at the asterisk point, implying that a change of the identification rate affects the storage rate.
As a by-product of Theorem 1, the following corollary is obtained.
Corollary 1.
The capacity regions of the generated- and chosen-secret BIS models with a single user (the models considered in [8]) for Gaussian sources are given by substituting R I = 0 into the right-hand sides of (19) and (20), respectively.
Remark 3.
Let $\mathcal{R}_G'$ and $\mathcal{R}_C'$ denote the capacity regions of the generated-secret and chosen-secret BIS models characterized via (11) in Remark 1. The two regions are provided below.
$$\mathcal{R}_G' = \bigcup_{0 < \alpha \le 1} \Big\{ (R_I, R_S, R_J, R_L) : \; R_I + R_S \le \tfrac{1}{2}\log\tfrac{(\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)}{\alpha\sigma_x^4 + \sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2}, \;\; R_J \ge \tfrac{1}{2}\log\tfrac{\alpha\sigma_x^4 + \sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2}{\alpha(\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)} + R_I, \;\; R_L \ge \tfrac{1}{2}\log\tfrac{\alpha\sigma_x^4 + \sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2}{(\alpha\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)} + R_I, \;\; R_I \ge 0, \; R_S \ge 0 \Big\}, \tag{21}$$
$$\mathcal{R}_C' = \bigcup_{0 < \alpha \le 1} \Big\{ (R_I, R_S, R_J, R_L) : \; R_I + R_S \le \tfrac{1}{2}\log\tfrac{(\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)}{\alpha\sigma_x^4 + \sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2}, \;\; R_J \ge \tfrac{1}{2}\log\tfrac{1}{\alpha}, \;\; R_L \ge \tfrac{1}{2}\log\tfrac{\alpha\sigma_x^4 + \sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2}{(\alpha\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)} + R_I, \;\; R_I \ge 0, \; R_S \ge 0 \Big\}. \tag{22}$$
It can be verified that $\mathcal{R}_G'$ and $\mathcal{R}_C'$ are equivalent to $\mathcal{R}_G$ and $\mathcal{R}_C$, respectively, if one sets $\rho_1^2 = \sigma_x^2/(\sigma_x^2+\sigma_1^2)$ and $\rho_2^2 = \sigma_x^2/(\sigma_x^2+\sigma_2^2)$. In addition, as a connection to a previous result, when there is no secret-key generation or provision ($R_S = 0$) and $R_J, R_L$ are large enough ($R_J, R_L \to \infty$), one can easily see that in $\mathcal{R}_G'$ and $\mathcal{R}_C'$ the maximum value of $R_I$ is $\frac{1}{2}\log\frac{(\sigma_x^2+\sigma_1^2)(\sigma_x^2+\sigma_2^2)}{\sigma_x^2\sigma_1^2 + \sigma_1^2\sigma_2^2 + \sigma_2^2\sigma_x^2} = \frac{1}{2}\log\left(1 + \frac{\sigma_x^2}{\sigma_1^2 + \sigma_2^2 + \sigma_1^2\sigma_2^2/\sigma_x^2}\right)$. This value is exactly the identification capacity of the BIS for non-standard Gaussian RVs shown in [5] (Equation (21)), and it is achieved when $\alpha \to 0$.
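The equivalence claimed above is easy to check numerically: substitute $\rho_1^2 = \sigma_x^2/(\sigma_x^2+\sigma_1^2)$ and $\rho_2^2 = \sigma_x^2/(\sigma_x^2+\sigma_2^2)$ and compare the sum-rate bounds of Remark 3 and Theorem 1. The variance values below are hypothetical:

```python
import numpy as np

# Hypothetical variances for the non-standard source of Remark 1
sx2, s12, s22 = 2.0, 0.5, 1.0
rho1_sq = sx2 / (sx2 + s12)   # substitution mapping Remark 3 to Theorem 1
rho2_sq = sx2 / (sx2 + s22)

alpha = 0.3
# Sum-rate bound on R_I + R_S in Remark 3 ...
num = (sx2 + s12) * (sx2 + s22)
den = alpha * sx2**2 + sx2 * s12 + s12 * s22 + s22 * sx2
bound_remark3 = 0.5 * np.log(num / den)
# ... and the corresponding bound in Theorem 1
bound_thm1 = 0.5 * np.log(1 / (alpha * rho1_sq * rho2_sq + 1 - rho1_sq * rho2_sq))
print(bound_remark3, bound_thm1)   # the two bounds coincide

# Identification capacity of [5] recovered with R_S = 0 and alpha -> 0
cap = 0.5 * np.log(1 + sx2 / (s12 + s22 + s12 * s22 / sx2))
print(cap)
```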
In another special case where $R_I = 0$ (only one user), $R_J \to \infty$ (the storage rate is sufficiently large), and $\rho_1 \to 1$ (the enrollment channel is noiseless), one can see that Theorem 1 naturally reduces to the characterizations of [23].

4. Behaviors of the Capacity Region

4.1. Optimal Asymptotic Rates and Zero-Rate Slopes

For the sake of succinct discussion, we concentrate on the generated-secret BIS model with $R_I = 0$; the capacity region for this case is denoted by $\mathcal{R}$, whose characterization is obtained by setting $R_I = 0$ on the right-hand side of (19). We first investigate some special points of the secret-key and privacy-leakage rates when the storage rate becomes extremely small or large. Define two rate functions of $R_J$ as
$$R_S^*(R_J) = \max_{(R_S, R_J, R_L) \in \mathcal{R}} R_S, \qquad R_L^*(R_J) = \min_{(R_S, R_J, R_L) \in \mathcal{R}} R_L, \tag{23}$$
where the left and right equations in (23) are the maximum secret-key rate and the minimum privacy-leakage rate, respectively. Moreover, we define $R_J^\alpha = \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1-\rho_1^2\rho_2^2}{\alpha}$, so that we can write
$$R_S^*(R_J^\alpha) = \frac{1}{2}\log\frac{1 - \rho_1^2\rho_2^2 e^{-2R_J^\alpha}}{1-\rho_1^2\rho_2^2}, \qquad R_L^*(R_J^\alpha) = \frac{1}{2}\log\frac{1-\rho_1^2\rho_2^2}{1-\rho_1^2 + \rho_1^2(1-\rho_2^2)e^{-2R_J^\alpha}}. \tag{24}$$
As $R_J^\alpha \to \infty$ ($\alpha \to 0$), the optimal asymptotic secret-key rate and amount of privacy-leakage approach
$$\lim_{R_J^\alpha \to \infty} R_S^*(R_J^\alpha) = \frac{1}{2}\log\frac{1}{1-\rho_1^2\rho_2^2} = I(Y;Z), \tag{25}$$
$$\lim_{R_J^\alpha \to \infty} R_L^*(R_J^\alpha) = \frac{1}{2}\log\frac{1-\rho_1^2\rho_2^2}{1-\rho_1^2} = \frac{1}{2}\log\frac{1}{1-\rho_1^2} - \frac{1}{2}\log\frac{1}{1-\rho_1^2\rho_2^2} = I(X;Y) - I(Z;Y). \tag{26}$$
The result (25) corresponds to the optimal asymptotic secret-key rate of [23] (Sect. III-B); to achieve this value, the storage rate must go to infinity, and the user’s privacy leaks at a rate of up to $I(X;Y) - I(Z;Y)$.
In contrast, when $R_J \to 0$, it is evident that $R_S$ and $R_L$ become zero as well, which by itself does not carry much information. However, to investigate a BIS that achieves a high secret-key rate and a small privacy-leakage rate in the low storage rate regime, the zero-rate slopes of the secret-key and privacy-leakage rates, namely how fast these rates converge to zero, are important indicators. In light of (24), a few steps of calculation determine the slopes at $R_J^\alpha = 0$ as follows:
$$\left.\frac{dR_S^*(R_J^\alpha)}{dR_J^\alpha}\right|_{R_J^\alpha = 0} = \frac{\rho_1^2\rho_2^2}{1-\rho_1^2\rho_2^2}, \tag{27}$$
$$\left.\frac{dR_L^*(R_J^\alpha)}{dR_J^\alpha}\right|_{R_J^\alpha = 0} = \frac{\rho_1^2(1-\rho_2^2)}{1-\rho_1^2\rho_2^2} = \frac{\rho_1^2\rho_2^2}{1-\rho_1^2\rho_2^2} \cdot \frac{1-\rho_2^2}{\rho_2^2}, \tag{28}$$
where (27) equals the signal-to-noise ratio (SNR) of the channel from $Y$ to $Z$, and this value multiplied by the reciprocal of the SNR of the channel $P_{Z|X}$ appears in the slope of the privacy-leakage rate in (28).
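The closed forms in (24) and the slopes (27)–(28) can be checked together with a forward difference; the exponentials below follow the paper's natural-logarithm convention, and the values $\rho_1^2 = 3/4$, $\rho_2^2 = 2/3$ are again assumed example parameters:

```python
import numpy as np

rho1_sq, rho2_sq = 3 / 4, 2 / 3
p = rho1_sq * rho2_sq

def RS_star(R):  # max secret-key rate at storage rate R, (24) left
    return 0.5 * np.log((1 - p * np.exp(-2 * R)) / (1 - p))

def RL_star(R):  # min privacy-leakage rate at storage rate R, (24) right
    return 0.5 * np.log(
        (1 - p) / (1 - rho1_sq + rho1_sq * (1 - rho2_sq) * np.exp(-2 * R)))

# Zero-rate slopes via forward differences
h = 1e-6
slope_S = (RS_star(h) - RS_star(0.0)) / h
slope_L = (RL_star(h) - RL_star(0.0)) / h
print(slope_S, p / (1 - p))                          # cf. (27)
print(slope_L, rho1_sq * (1 - rho2_sq) / (1 - p))    # cf. (28)
```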

4.2. Examples

Next, we give numerical computations of three different examples and look into the behaviors of the special points.
Ex.1:
(a) ρ 1 2 = 3 / 4 , ρ 2 2 = 2 / 3 , (b) ρ 1 2 = 7 / 8 , ρ 2 2 = 2 / 3 , (c) ρ 1 2 = 15 / 16 , ρ 2 2 = 2 / 3 ,
Ex.2:
(a) ρ 1 2 = 3 / 4 , ρ 2 2 = 2 / 3 , (b) ρ 1 2 = 9 / 10 , ρ 2 2 = 7 / 8 , (c) ρ 1 2 = 15 / 16 , ρ 2 2 = 11 / 12 ,
Ex.3:
(a) ρ 1 2 = 3 / 4 , ρ 2 2 = 2 / 3 , (b) ρ 1 2 = 3 / 4 , ρ 2 2 = 8 / 9 , (c) ρ 1 2 = 3 / 4 , ρ 2 2 = 14 / 15 .
Note that as $\rho_1^2$ and $\rho_2^2$ become large, the levels of the noises (i.e., the noise variances) added to the bio-data sequences at the encoder and decoder become small. Example 1 is the case where the level of noise at the encoder gradually decreases from (a) to (c), while the level of noise at the decoder stays constant in each case. Example 2 is the case in which the noise levels at both the encoder and decoder improve gradually from (a) to (c). Example 3 is the opposite of Example 1. The calculated secret-key and privacy-leakage rates for these cases are summarized in Table 1, Table 2, and Figure 5.
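The asymptotic endpoints (25)–(26) for the nine parameter pairs above can be computed directly. This sketch recomputes those endpoints only; it does not reproduce Table 1 and Table 2, whose finite-storage values also depend on $R_J$:

```python
import numpy as np

# Asymptotic secret-key rate (25) and privacy-leakage rate (26), in nats
cases = {
    "Ex.1": [(3 / 4, 2 / 3), (7 / 8, 2 / 3), (15 / 16, 2 / 3)],
    "Ex.2": [(3 / 4, 2 / 3), (9 / 10, 7 / 8), (15 / 16, 11 / 12)],
    "Ex.3": [(3 / 4, 2 / 3), (3 / 4, 8 / 9), (3 / 4, 14 / 15)],
}
results = {}
for name, params in cases.items():
    for r1, r2 in params:
        RS = 0.5 * np.log(1 / (1 - r1 * r2))         # I(Y;Z)
        RL = 0.5 * np.log((1 - r1 * r2) / (1 - r1))  # I(X;Y) - I(Z;Y)
        results[(name, r1, r2)] = (RS, RL)
        print(f"{name}: rho1^2={r1:.4f} rho2^2={r2:.4f} "
              f"RS*={RS:.4f} RL*={RL:.4f}")
```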
It is ideal to keep the privacy-leakage rate small while producing a high secret-key rate, but Example 1 works out in the opposite way (cf. the rows of Ex. 1 in Table 1 and Table 2), so it is not a preferable choice. Example 2 realizes a high secret-key rate, but the amount of privacy-leakage remains high as well (cf. the rows of Ex. 2 in Table 1 and Table 2, and Figure 5a,b). On the other hand, in Example 3 the privacy-leakage rate declines, but the secret-key rate becomes smaller compared to Example 2 (cf. the rows of Ex. 3 in Table 1 and Table 2, and Figure 5c,d). From these behaviors, we may conclude that it is unmanageable to achieve both a high secret-key rate and a small privacy-leakage rate at the same time. If one aims at a high secret-key rate, it is important to diminish the noise levels at both the encoder and decoder, e.g., by deploying high-quality quantizers, but this could result in leaking more of the users’ privacy. Conversely, to achieve a small privacy-leakage rate, it is preferable to maintain a certain level of noise at the encoder and to pay sufficient attention to the noise level at the decoder; in this way, however, the gain in the secret-key rate may drop.

5. Overviews of the Proof of Theorem 1

The detailed proof of Theorem 1 is provided in Appendix A for $\mathcal{R}_G$ and Appendix B for $\mathcal{R}_C$. The regions $\mathcal{R}_G$ and $\mathcal{R}_C$ can be derived similarly; the difference is that a one-time pad is used to conceal the chosen secret key for secure transmission in the proof of $\mathcal{R}_C$. The proof of each region consists of two parts: achievability and converse. The converse proof follows by applying Fano’s inequality [26] and the conditional version of the EPI [27] twice in two different directions. In the achievability part, the modified typical set [11], which gives a so-called Markov lemma for weak typicality, helps us show that the error probability of the BIS vanishes, since the Markov lemma based on strong typicality cannot be applied to continuous RVs. Although a more general version of the Markov lemma for Gaussian sources, including lossy reconstruction, is shown in [28], we found that the two properties of the modified typical set are handy tools for checking all the conditions in Definitions 1 and 2, and thus we base our achievability proof on this set. To evaluate the secret-key, secrecy-leakage, and privacy-leakage rates, we extend [29] (Lemma 4) to continuous RVs so that it can be used to derive upper bounds on the conditional differential entropies of jointly typical sequences appearing in these evaluations.
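The one-time-pad step mentioned above can be illustrated with a toy sketch. This is not the paper's actual construction (there, the pad is derived from $Y^n$ at enrollment and re-derived from $Z^n$ and the helper data); the point shown is only the modular masking over an assumed key alphabet $[1:M_S]$:

```python
import secrets

M_S = 2**16                      # assumed size of the secret-key alphabet
S = secrets.randbelow(M_S)       # chosen secret key (chosen-secret model)
P = secrets.randbelow(M_S)       # pad; in the scheme, derived from Y^n

masked = (S + P) % M_S           # stored publicly with the helper data
# masked is independent of S when P is uniform, so storing it leaks nothing
recovered = (masked - P) % M_S   # decoder re-derives the pad and unmasks
print(recovered == S)
```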

6. Conclusions and Future Work

We characterized the capacity regions of the identification, secret-key, storage, and privacy-leakage rates for both the generated- and chosen-secret BIS models under Gaussian sources and channels. We showed that a key idea for deriving the capacity regions of the BIS with an HSM is to convert the system into another one in which the data flow of each user is one-way. We also gave numerical computations of three different examples for the generated-secret BIS model, and from these results it appears that achieving a high secret-key rate and a small privacy-leakage rate simultaneously is hardly manageable.
For future work, a natural extension is to characterize the capacity regions of the BIS with Gaussian vector sources and channels. In the scalar Gaussian case, we showed that a single parameter suffices to characterize the optimal trade-off of the BIS. However, for Gaussian vector sources, the optimal trade-off region generally takes the form of a covariance matrix optimization problem, and solving it becomes more challenging, as one may need the enhancement technique introduced in [30] to characterize the capacity regions.
Another extension is to construct practical codes that can achieve the capacity regions. For the BIS with a single user, convolutional and turbo codes that control the privacy leakage were investigated in [31] and applied to a real-life application, electroencephalograph (EEG) signals, in [32]. In these studies, it was shown that applying vector quantization at the encoder and soft decision at the decoder for Gaussian sources realizes a lower privacy-leakage rate. However, to the best of our knowledge, there have not yet been any studies dealing with practical codes for the BIS with multiple users. This remains an interesting research topic.

Author Contributions

V.Y. contributed to the conceptualization of the research goals and aims, the visualization and presentation of the work, the formal analysis of the results, and the review and editing. H.Y. and Y.O. contributed to the conceptualization of the ideas, the validation of the results, and the supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported in part by JSPS KAKENHI Grant Numbers JP20K04462, JP19J13686, JP18H01438, and JP17K00020.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Equation (19)

In this appendix, we give the proof of the capacity region of the generated-secret BIS model. Before proceeding to the proof, we review the definitions of the weakly typical and modified typical sets, and some properties of these sets.

Appendix A.1. Weakly Typical Sets and Modified Typical Sets

The definition and properties of the weakly typical set hold for both discrete and continuous RVs, but here we provide only the continuous version.
Definition A1.(Weakly ϵ-typical set for continuous RVs [26] (Chapter 8))
Let $X_1, X_2, \ldots, X_k$ be continuous RVs whose length-$n$ sequences are drawn i.i.d. according to the joint pdf $f_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k)$. For small enough $\epsilon > 0$ and any $n$, the weakly $\epsilon$-typical set $A_\epsilon^{(n)}(X_1 X_2 \cdots X_k)$ with respect to $f_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k)$ is defined as follows:
$$A_\epsilon^{(n)}(X_1 X_2 \cdots X_k) = \Big\{(x_1^n, x_2^n, \ldots, x_k^n) : \Big|{-\tfrac{1}{n}}\log f_{S^n}(s^n) - h(S)\Big| < \epsilon,\ \forall S \subseteq \{X_1, X_2, \ldots, X_k\}\Big\}, \tag{A1}$$
where $s^n \in \{x_1^n, x_2^n, \ldots, x_k^n\}$ is the sequence corresponding to the RV $S$ and $f_{S^n}(s^n) = \prod_{t=1}^{n} f_S(s_t)$. Moreover, the conditional $\epsilon$-typical set is defined as $A_\epsilon^{(n)}(X_k \mid x_1^n, \ldots, x_{k-1}^n) = \{x_k^n : (x_1^n, x_2^n, \ldots, x_k^n) \in A_\epsilon^{(n)}(X_1 X_2 \cdots X_k)\}$.
The weakly ϵ -typical set A ϵ ( n ) ( · ) has the following properties.
Lemma A1.
(Some properties of weakly ϵ-typical set [26])
1. For $S \subseteq \{X_1, X_2, \ldots, X_k\}$ and large enough $n$,
$$\Pr\{A_\epsilon^{(n)}(S)\} \geq 1 - \epsilon. \tag{A2}$$
2. For $S, V \subseteq \{X_1, X_2, \ldots, X_k\}$ with $S \cap V = \emptyset$, we have that
$$\mathrm{Vol}\big(A_\epsilon^{(n)}(V \mid s^n)\big) \leq e^{n(h(V|S) + 2\epsilon)}, \tag{A3}$$
where $\mathrm{Vol}(\cdot)$ denotes the volume of a set.
3. Fix $k = 2$. If $(\tilde{X}_1^n, \tilde{X}_2^n)$ are independent sequences with the same marginals as $f_{X_1^n X_2^n}(x_1^n, x_2^n)$, then
$$\Pr\{(\tilde{X}_1^n, \tilde{X}_2^n) \in A_\epsilon^{(n)}(X_1 X_2)\} \leq e^{-n(I(X_1; X_2) - 2\epsilon)}. \tag{A4}$$
Moreover, for $n$ large enough,
$$\Pr\{(\tilde{X}_1^n, \tilde{X}_2^n) \in A_\epsilon^{(n)}(X_1 X_2)\} \geq (1 - \epsilon)\, e^{-n(I(X_1; X_2) + 2\epsilon)}. \tag{A5}$$
Proof. 
See [26] (Section 15.2). □
Next, we provide the definition of the modified ϵ -typical set. This set gives the so-called Markov lemma for the weak typicality.
Definition A2.(Modified ϵ-typical set [11] (Appendix A-A))
Consider $(X, Y, U)$ forming a Markov chain $X - Y - U$, i.e., $f_{XYU}(x, y, u) = f_{XY}(x, y) f_{U|Y}(u \mid y)$. The modified $\epsilon$-typical set $B_\epsilon^{(n)}(YU)$ is defined as
$$B_\epsilon^{(n)}(YU) = \Big\{(y^n, u^n) : \Pr\{X^n \in A_\epsilon^{(n)}(X \mid y^n, u^n) \mid (Y^n, U^n) = (y^n, u^n)\} \geq 1 - \epsilon\Big\}, \tag{A6}$$
where $X^n$ is drawn i.i.d. from the transition probability $\prod_{k=1}^{n} f_{X|Y}(x_k \mid y_k)$. In addition, define $B_\epsilon^{(n)}(U \mid y^n) = \{u^n : (y^n, u^n) \in B_\epsilon^{(n)}(YU)\}$ for all $y^n$, and let $B_\epsilon^{(n)}(U \mid y^n)^c$ denote the complement of $B_\epsilon^{(n)}(U \mid y^n)$.
The modified set induces two useful properties below.
Lemma A2.
(Properties of the modified set [11] (Appendix A-A))
Property 1. 
If $(y^n, u^n) \in B_\epsilon^{(n)}(YU)$, then also $(y^n, u^n) \in A_\epsilon^{(n)}(YU)$.
Property 2. 
Assume that $(U^n, Y^n, X^n) \sim f_{U^n X^n Y^n} = \prod_{t=1}^{n} f_{X_t Y_t} f_{U_t \mid Y_t}$. Then, for $\epsilon \in (0, 1)$ and $n$ large enough, $\Pr\{(Y^n, U^n) \in B_\epsilon^{(n)}(YU)\} \geq 1 - \epsilon$.
Proof. 
See [11] (Appendix A-A). □
Now we are at the position to present the detailed proofs of Equation (19).

Appendix A.2. Achievability Part

Let $0 < \alpha \leq 1$ and fix $\delta > 0$ (a small enough positive number), the block length $n$, and the joint pdf of $(U, Y, X, Z)$ such that the Markov chain $U - Y - X - Z$ holds. Let $U \sim \mathcal{N}(0, 1 - \alpha)$ be a Gaussian RV with mean zero and variance $1 - \alpha$. As shown in Figure A1, based on the converted system, consider $Y_{ik} = U + \Phi$, where $\Phi \sim \mathcal{N}(0, \alpha)$, independent of $U$, is Gaussian with mean zero and variance $\alpha$. From (9) and (10) of the converted system, it holds that
$$X_{ik} = \rho_1 U + \rho_1 \Phi + N_1, \qquad Z_k = \rho_1\rho_2 U + \rho_1\rho_2 \Phi + \rho_2 N_1 + N_2. \tag{A7}$$
Figure A1. Relation among RVs $(U, Y, X, Z)$.
Hence, we readily see that
$$I(Y;U) = \frac{1}{2}\log\frac{1}{\alpha}, \qquad I(X;U) = \frac{1}{2}\log\frac{1}{\alpha\rho_1^2 + 1 - \rho_1^2}, \qquad I(Z;U) = \frac{1}{2}\log\frac{1}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}. \tag{A8}$$
Now set $0 < R_I < I(Z;U)$, and
$$R_S = I(Z;U) - R_I - 2\delta, \qquad R_J = I(Y;U) - I(Z;U) + R_I + 6\delta, \tag{A9}$$
$$R_L = I(X;U) - I(Z;U) + R_I + 6\delta, \qquad M_I = e^{nR_I}, \quad M_S = e^{nR_S}, \quad M_J = e^{nR_J}, \tag{A10}$$
where the values of $I(Y;U)$, $I(X;U)$, and $I(Z;U)$ are specified in (A8). Also, recall that $\mathcal{I} = [1 : M_I]$, $\mathcal{S} = [1 : M_S]$, and $\mathcal{J} = [1 : M_J]$.
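As a quick sanity check, the closed-form expressions in (A8) and the rate settings above can be evaluated numerically. The following sketch uses hypothetical parameter values ($\alpha = 0.5$, $\rho_1 = 0.9$, $\rho_2 = 0.8$, $\delta = 0.01$; not taken from the paper's numerical examples) and verifies the data-processing ordering $I(Y;U) \geq I(X;U) \geq I(Z;U)$ implied by the Markov chain $U - Y - X - Z$, as well as the identity $R_I + R_S = I(Z;U) - 2\delta$ used later in the error analysis.

```python
import math

# Hypothetical parameters (not from the paper's numerical examples).
alpha, rho1, rho2, delta = 0.5, 0.9, 0.8, 0.01

# Mutual informations of (A8), in nats (natural logarithm).
I_YU = 0.5 * math.log(1.0 / alpha)
I_XU = 0.5 * math.log(1.0 / (alpha * rho1**2 + 1 - rho1**2))
I_ZU = 0.5 * math.log(1.0 / (alpha * rho1**2 * rho2**2 + 1 - rho1**2 * rho2**2))

# Degradedness U - Y - X - Z implies a data-processing ordering.
assert I_YU >= I_XU >= I_ZU > 0

# Rate settings of (A9)-(A10) for some identification rate R_I < I(Z;U).
R_I = 0.5 * I_ZU
R_S = I_ZU - R_I - 2 * delta
R_J = I_YU - I_ZU + R_I + 6 * delta
R_L = I_XU - I_ZU + R_I + 6 * delta

# Sanity check used in the error analysis: R_I + R_S = I(Z;U) - 2*delta.
assert math.isclose(R_I + R_S, I_ZU - 2 * delta)
```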
Next, we generate $e^{n(I(Y;U) + \delta)}$ sequences $u^n(s, j)$ i.i.d. from the pdf $f_U$, where $s \in \mathcal{S}$ and $j \in \mathcal{J}$.
Observing $y_i^n$ ($i \in \mathcal{I}$), the encoder finds $u^n(s, j)$ such that $(y_i^n, u^n(s, j)) \in B_\delta^{(n)}(YU)$, the modified set given in Definition A2. If there are multiple such pairs $(s, j)$, the encoder picks one at random; if there is no such pair, it declares an error. We denote the chosen pair by $(s(i), j(i))$, where each element is a function of the index $i$. The helper $j(i)$ is stored in the helper DB, and the secret key $s(i)$ is saved in the key DB at location $i$.
Observing $z^n$, the noisy sequence of the identified user's $x_w^n$, the decoder looks for $u^n(s, j(i))$ such that $(z^n, u^n(s, j(i))) \in A_\delta^{(n)}(ZU)$ for some $i \in \mathcal{I}$ and $s \in \mathcal{S}$, where $A_\delta^{(n)}(ZU)$ denotes the weakly $\delta$-typical set. If a unique pair $(i, s)$ is found, it outputs $(\hat{w}, \widehat{s(w)}) = (i, s)$; otherwise, it declares an error.
Let $(S(i), J(i))$ denote the index pair chosen at the encoder based on $Y_i^n$, i.e., $(Y_i^n, U^n(S(i), J(i))) \in B_\delta^{(n)}(YU)$. Furthermore, we write $U^n(S(i), J(i))$ as $U_i^n$ for notational simplicity. Note that the pair $(S(i), J(i))$ determines the sequence $U_i^n$ precisely. Next, we check that all conditions in Definition 1 hold for a random codebook $\mathcal{C}_n = \{U^n(s, j) : s \in \mathcal{S},\ j \in \mathcal{J}\}$.
Analysis of Error Probability: For $W = i$, the error event that can possibly happen at the encoder is
$\mathcal{E}_1$: $\{(Y_i^n, U^n(s, j)) \notin B_\delta^{(n)}(YU) \text{ for all } s \in \mathcal{S} \text{ and } j \in \mathcal{J}\}$,
and those at the decoder are
$\mathcal{E}_2$: $\{(Z^n, U_i^n) \notin A_\delta^{(n)}(ZU)\}$,
$\mathcal{E}_3$: $\{(Z^n, U^n(s', J(i))) \in A_\delta^{(n)}(ZU) \text{ for some } s' \in \mathcal{S},\ s' \neq S(i)\}$,
$\mathcal{E}_4$: $\{(Z^n, U^n(s, J(i'))) \in A_\delta^{(n)}(ZU) \text{ for some } i' \in \mathcal{I},\ i' \neq i, \text{ and } s \in \mathcal{S}\}$.
As usual, we use the random coding argument, and the ensemble average of the error probability can be evaluated as
$$\Pr\{(\hat{W}, \widehat{S(W)}) \neq (W, S(W)) \mid W = i\} = \Pr\{\mathcal{E}_1 \cup \mathcal{E}_2 \cup \mathcal{E}_3 \cup \mathcal{E}_4\} \leq \Pr\{\mathcal{E}_1\} + \Pr\{\mathcal{E}_2 \cap \mathcal{E}_1^c\} + \Pr\{\mathcal{E}_3\} + \Pr\{\mathcal{E}_4\}, \tag{A11}$$
where each of Pr { · } on the right-hand side denotes the conditional probability given W = i .
Now let us focus on bounding each term individually. The first term of (A11) can be made arbitrarily small by a similar argument of [11] (cf. the analysis of First Term of Error Probability in Appendix A-B). The rest can be analyzed as follows. For the second term, it follows that
$$\begin{aligned}
\Pr\{\mathcal{E}_2 \cap \mathcal{E}_1^c\} &= \Pr\{(Z^n, U_i^n) \notin A_\delta^{(n)}(ZU),\ (Y_i^n, U_i^n) \in B_\delta^{(n)}(YU)\} \\
&\leq \Pr\{(Z^n, Y_i^n, U_i^n) \notin A_\delta^{(n)}(ZYU),\ (Y_i^n, U_i^n) \in B_\delta^{(n)}(YU)\} \\
&= \int_{B_\delta^{(n)}(YU)} f_{Y_i^n U_i^n}(y^n, u^n)\, \Pr\{Z^n \notin A_\delta^{(n)}(Z \mid y^n, u^n) \mid (Y_i^n, U_i^n) = (y^n, u^n)\}\, d(y^n, u^n) \\
&\overset{(a)}{\leq} \delta \int_{B_\delta^{(n)}(YU)} f_{Y_i^n U_i^n}(y^n, u^n)\, d(y^n, u^n) \leq \delta,
\end{aligned} \tag{A12}$$
where (a) follows from the definition of the modified $\delta$-typical set, owing to the Markov chain $Z - Y - U$. To bound $\Pr\{\mathcal{E}_3\}$, due to the symmetry of the codebook generation, we can set $J(i) = 1$ and have that
$$\begin{aligned}
\Pr\{\mathcal{E}_3\} &= \Pr\{(Z^n, U^n(s', 1)) \in A_\delta^{(n)}(ZU) \text{ for some } s' \in \mathcal{S},\ s' \neq S(i)\} \\
&= \sum_{s \in \mathcal{S}} \Pr\{S(i) = s\} \cdot \Pr\Big\{\bigcup_{s' \in \mathcal{S} \setminus \{s\}} \{(Z^n, U^n(s', 1)) \in A_\delta^{(n)}(ZU)\} \,\Big|\, S(i) = s\Big\} \\
&\leq \sum_{s \in \mathcal{S}} \Pr\{S(i) = s\} \sum_{s' \in \mathcal{S} \setminus \{s\}} \Pr\{(Z^n, U^n(s', 1)) \in A_\delta^{(n)}(ZU) \mid S(i) = s\} \\
&\overset{(b)}{=} \sum_{s \in \mathcal{S}} \Pr\{S(i) = s\} \sum_{s' \in \mathcal{S} \setminus \{s\}} \Pr\{(Z^n, U^n(s', 1)) \in A_\delta^{(n)}(ZU)\} \\
&\leq \sum_{s \in \mathcal{S}} \Pr\{S(i) = s\} \cdot M_S\, e^{-n(I(Z;U) - \delta)} \leq M_S\, e^{-n(I(Z;U) - \delta)} \leq e^{-n\delta},
\end{aligned} \tag{A13}$$
where (b) holds because the events $\{(Z^n, U^n(s', 1)) \in A_\delta^{(n)}(ZU)\}$, $s' \in \mathcal{S} \setminus \{s\}$, and the event $\{S(i) = s\}$ are mutually independent, and the last inequality in (A13) follows since $R_S < I(Z;U)$.
To evaluate $\Pr\{\mathcal{E}_4\}$, we define the two new events $\mathcal{E}_4' = \{(Z^n, U^n(S(i'), J(i'))) \in A_\delta^{(n)}(ZU) \text{ for some } i' \neq i,\ i' \in \mathcal{I}\}$ and $\mathcal{E}_4'' = \{(Z^n, U^n(s, J(i'))) \in A_\delta^{(n)}(ZU) \text{ for some } i' \neq i,\ i' \in \mathcal{I} \text{ and } s \neq S(i'),\ s \in \mathcal{S}\}$. Because $\mathcal{E}_4' \cap \mathcal{E}_4'' = \emptyset$, it follows that $\Pr\{\mathcal{E}_4\} = \Pr\{\mathcal{E}_4'\} + \Pr\{\mathcal{E}_4''\}$. For $\Pr\{\mathcal{E}_4'\}$, without loss of generality, we set $S(i') = 1$. Then, it follows that
$$\begin{aligned}
\Pr\{\mathcal{E}_4'\} &= \Pr\{(Z^n, U^n(1, J(i'))) \in A_\delta^{(n)}(ZU) \text{ for some } i' \in \mathcal{I},\ i' \neq i\} \\
&\leq \sum_{i' \in \mathcal{I} \setminus \{i\}} \Pr\{(Z^n, U^n(1, J(i'))) \in A_\delta^{(n)}(ZU)\} \\
&\leq \sum_{i' \in \mathcal{I} \setminus \{i\}} e^{-n(I(Z;U) - \delta)} \leq M_I\, e^{-n(I(Z;U) - \delta)} \leq e^{-n\delta},
\end{aligned} \tag{A14}$$
where the last inequality in (A14) follows since $R_I < I(Z;U)$. For $\Pr\{\mathcal{E}_4''\}$, due to the statistical independence of $Z^n$ and $U^n(s, J(i'))$, we have that
$$\begin{aligned}
\Pr\{\mathcal{E}_4''\} &= \Pr\Big\{\bigcup_{i' \in \mathcal{I} \setminus \{i\}} \bigcup_{s' \in \mathcal{S} \setminus \{S(i')\}} \{(Z^n, U^n(s', J(i'))) \in A_\delta^{(n)}(ZU)\}\Big\} \\
&= \sum_{s \in \mathcal{S}} \Pr\{S(i') = s\} \cdot \Pr\Big\{\bigcup_{i' \in \mathcal{I} \setminus \{i\}} \bigcup_{s' \in \mathcal{S} \setminus \{s\}} \{(Z^n, U^n(s', J(i'))) \in A_\delta^{(n)}(ZU)\} \,\Big|\, S(i') = s\Big\} \\
&\leq \sum_{s \in \mathcal{S}} \Pr\{S(i') = s\} \sum_{i' \in \mathcal{I} \setminus \{i\}} \sum_{s' \in \mathcal{S} \setminus \{s\}} \Pr\{(Z^n, U^n(s', J(i'))) \in A_\delta^{(n)}(ZU)\} \\
&\leq \sum_{s \in \mathcal{S}} \Pr\{S(i') = s\} \cdot M_I M_S\, e^{-n(I(Z;U) - \delta)} = e^{-n\delta},
\end{aligned} \tag{A15}$$
where the last equality in (A15) holds since $\frac{1}{n}\log M_I + \frac{1}{n}\log M_S = I(Z;U) - 2\delta$. Hence, the error probability is bounded by
$$\Pr\{(\hat{W}, \widehat{S(W)}) \neq (W, S(W)) \mid W = i\} \leq 4\delta \tag{A16}$$
for large enough n.
Before diving into the detailed analysis, we state lemmas that are important for the rest of the evaluations.
Lemma A3.
For given u n and large enough n, we have that
$$\mathrm{Vol}\big(B_\delta^{(n)}(Y \mid u^n)\big) \leq \mathrm{Vol}\big(A_\delta^{(n)}(Y \mid u^n)\big). \tag{A17}$$
Proof. 
This is a straightforward result from Property 1 of Lemma A2. □
The following lemma plays an essential role in bounding the secret-key, secrecy-leakage, and privacy-leakage rates of the BIS. Again, recall that the index pair $(S(i), J(i))$ determines $U_i^n$ directly, and therefore Lemma A4 can be thought of as an extended version of [29] (Lemma 4) that incorporates continuous RVs. In [29], the lemma was proved by using the strongly typical set [33], and the literature, e.g., [8,13], demonstrated that it can be applied to establish the achievability part of the BIS under noisy enrollment for DMS settings. However, when the sources of the BIS become Gaussian, it is not trivial whether the claim of this lemma still holds. Here, we provide a full proof of the extended version of [29] (Lemma 4) for Gaussian RVs by using the connection between the modified $\delta$-typical set $B_\delta^{(n)}(\cdot)$ and the weakly $\delta$-typical set $A_\delta^{(n)}(\cdot)$.
Lemma A4.
It holds that
$$h(Y_i^n \mid S(i), J(i), \mathcal{C}_n) \leq n(h(Y|U) + \delta_n), \tag{A18}$$
$$h(Y_i^n \mid X_i^n, S(i), J(i), \mathcal{C}_n) \leq n(h(Y|X, U) + \delta_n), \tag{A19}$$
where δ n 0 as δ 0 and n .
Proof. 
We first prove (A18). Define a binary RV $T$ taking the value 1 if $(Y_i^n, U_i^n) \in B_\delta^{(n)}(YU)$, and 0 otherwise. The analysis of the error probability guarantees that $P_T(0) \leq \delta$, i.e., $(Y_i^n, U_i^n) \in B_\delta^{(n)}(YU)$ with high probability. The left-hand side of (A18) can be bounded as
$$\begin{aligned}
h(Y_i^n \mid S(i), J(i), \mathcal{C}_n) &\overset{(c)}{=} h(Y_i^n \mid U_i^n, S(i), J(i), \mathcal{C}_n) \overset{(d)}{\leq} h(Y_i^n \mid U_i^n) \leq h(Y_i^n, T \mid U_i^n) \\
&\leq H(T) + h(Y_i^n \mid U_i^n, T) \leq 1 + P_T(0)\, h(Y_i^n \mid U_i^n, T = 0) + P_T(1)\, h(Y_i^n \mid U_i^n, T = 1) \\
&\overset{(e)}{\leq} n\epsilon_n + h(Y_i^n \mid U_i^n, T = 1) = n\epsilon_n + \int_{\mathbb{R}^n} h(Y_i^n \mid U_i^n = u^n, T = 1)\, dF(u^n) \\
&= n\epsilon_n + \int_{\mathbb{R}^n} \int_{B_\delta^{(n)}(Y \mid u^n)} P_{Y_i^n \mid U_i^n, T}(y^n \mid u^n, 1) \log\frac{1}{P_{Y_i^n \mid U_i^n, T}(y^n \mid u^n, 1)}\, dy^n\, dF(u^n) \\
&\overset{(f)}{\leq} n\epsilon_n + \int_{\mathbb{R}^n} \log \int_{B_\delta^{(n)}(Y \mid u^n)} P_{Y_i^n \mid U_i^n, T}(y^n \mid u^n, 1)\, \frac{1}{P_{Y_i^n \mid U_i^n, T}(y^n \mid u^n, 1)}\, dy^n\, dF(u^n) \\
&= n\epsilon_n + \int_{\mathbb{R}^n} \log \mathrm{Vol}\big(B_\delta^{(n)}(Y \mid u^n)\big)\, dF(u^n) \overset{(g)}{\leq} n\epsilon_n + \int_{\mathbb{R}^n} \log \mathrm{Vol}\big(A_\delta^{(n)}(Y \mid u^n)\big)\, dF(u^n) \\
&\overset{(h)}{\leq} n\epsilon_n + n(h(Y|U) + \delta) \int_{\mathbb{R}^n} dF(u^n) = n(h(Y|U) + \delta + \epsilon_n),
\end{aligned} \tag{A20}$$
where
(c) follows since $(S(i), J(i))$ determines $U_i^n$,
(d) follows because conditioning reduces entropy,
(e) follows since $h(Y_i^n \mid U_i^n, T = 0) \leq h(Y_i^n) \leq \frac{n}{2}\log(2\pi e)$ and $P_T(0) \leq \delta$, where we define $\epsilon_n = \frac{1}{n} + \frac{\delta}{2}\log(2\pi e)$,
(f) follows by applying Jensen's inequality to the concave function $\phi(t) = -t\log t$,
(g) is due to (A17) in Lemma A3,
(h) is due to (A3) in Lemma A1.
Therefore, from (A20), we obtain that
$$\frac{1}{n} h(Y_i^n \mid S(i), J(i), \mathcal{C}_n) \leq h(Y|U) + \delta_n, \tag{A21}$$
where $\delta_n = \delta + \epsilon_n$, and $\delta_n \to 0$ as $n \to \infty$ and $\delta \downarrow 0$.
Next, we briefly summarize how to show (A19). The left-hand side of the inequality can be developed as $h(Y_i^n \mid X_i^n, S(i), J(i), \mathcal{C}_n) = h(Y_i^n \mid X_i^n, U_i^n, S(i), J(i), \mathcal{C}_n) \leq h(Y_i^n \mid X_i^n, U_i^n)$, where the equality and inequality follow for the same reasons as (c) and (d) in (A20), respectively. By the definition of the modified $\delta$-typical set in Definition A2, it can be concluded that $\Pr\{(X_i^n, Y_i^n, U_i^n) \in A_\delta^{(n)}(XYU)\} \to 1$ as $n \to \infty$, due to the Markov chain $X - Y - U$ and the fact that $(Y_i^n, U_i^n) \in B_\delta^{(n)}(YU)$ with high probability. This implies that $\Pr\{(X_i^n, U_i^n) \in A_\delta^{(n)}(XU)\} \to 1$ and $\Pr\{Y_i^n \in A_\delta^{(n)}(Y \mid x^n, u^n) \mid (X_i^n, U_i^n) = (x^n, u^n)\} \to 1$ as $n \to \infty$ as well. Based on this observation, the rest of the proof of (A19) follows similarly from the arguments in [29] (Appendix C), and therefore the details are omitted. □
Note that Equations (14) and (16) obviously hold from the parameter settings. By applying Lemma A4, bounds on the secret-key and the secrecy-leakage rates can be derived as follows:
$$\frac{1}{n} H(S(i) \mid \mathcal{C}_n) \geq R_S - 5\delta = \frac{1}{n}\log M_S - 5\delta, \qquad \frac{1}{n} I(S(i); J(i) \mid \mathcal{C}_n) \leq 5\delta \tag{A22}$$
for large enough $n$. For detailed discussions, the readers may refer to [14] (Proof of Theorem 1).
Analysis of Privacy-Leakage Rate: From the left-hand side of (18), we have that
$$\begin{aligned}
I(X_i^n; J(i) \mid \mathcal{C}_n) &= H(J(i) \mid \mathcal{C}_n) - H(J(i) \mid X_i^n, \mathcal{C}_n) \leq n R_J - H(J(i) \mid X_i^n, \mathcal{C}_n) \\
&= n(I(Y;U) - I(Z;U) + R_I + 6\delta) - H(J(i) \mid X_i^n, \mathcal{C}_n) \\
&= n(h(U|Z) - h(U|Y) + R_I + 6\delta) - H(J(i) \mid X_i^n, \mathcal{C}_n).
\end{aligned} \tag{A23}$$
The last term in (A23) can be further evaluated as
$$\begin{aligned}
H(J(i) \mid X_i^n, \mathcal{C}_n) &= h(Y_i^n, J(i) \mid X_i^n, \mathcal{C}_n) - h(Y_i^n \mid X_i^n, J(i), \mathcal{C}_n) \\
&= h(Y_i^n, J(i) \mid X_i^n, \mathcal{C}_n) - h(Y_i^n \mid X_i^n, J(i), S(i), \mathcal{C}_n) - I(Y_i^n; S(i) \mid X_i^n, J(i), \mathcal{C}_n) \\
&\overset{(i)}{=} h(Y_i^n \mid X_i^n, \mathcal{C}_n) - h(Y_i^n \mid X_i^n, J(i), S(i), \mathcal{C}_n) - H(S(i) \mid X_i^n, J(i), \mathcal{C}_n) \\
&\overset{(j)}{=} n\, h(Y|X) - h(Y_i^n \mid X_i^n, J(i), S(i), \mathcal{C}_n) - H(S(i) \mid X_i^n, J(i), Z^n, \mathcal{C}_n) \\
&\overset{(k)}{\geq} n\, h(Y|X) - h(Y_i^n \mid X_i^n, J(i), S(i), \mathcal{C}_n) - H(S(i) \mid \boldsymbol{J}, Z^n, \mathcal{C}_n) \\
&\overset{(l)}{\geq} n\, h(Y|X) - h(Y_i^n \mid X_i^n, J(i), S(i), \mathcal{C}_n) - n\delta_n' \\
&\overset{(m)}{\geq} n\, h(Y|X) - n(h(Y|X, U) + \delta_n) - n\delta_n' \\
&= n(I(Y;U|X) - \delta_n - \delta_n') = n(h(U|X) - h(U|Y) - \delta_n - \delta_n'),
\end{aligned} \tag{A24}$$
where
(i) follows since $J(i)$ and $S(i)$ are functions of $Y_i^n$ for a given codebook $\mathcal{C}_n$,
(j) follows since $(Y_i^n, X_i^n)$ is independent of $\mathcal{C}_n$, and the Markov chain $S(i) - (X_i^n, J(i)) - Z^n$ holds,
(k) follows because conditioning reduces entropy, and the Markov chain $S(i) - (J(i), Z^n) - \boldsymbol{J} \setminus J(i)$ is applied,
(l) follows by applying Fano's inequality, since $S(i)$ can be reliably reconstructed from $(\boldsymbol{J}, Z^n)$ for a given codebook $\mathcal{C}_n$, where $\delta_n' \to 0$ as $\delta \downarrow 0$ and $n \to \infty$,
(m) is due to (A19).
From (A23) and (A24), we have that
$$\begin{aligned}
\frac{1}{n} I(X_i^n; J(i) \mid \mathcal{C}_n) &\leq h(U|Z) - h(U|X) + R_I + 6\delta + \delta_n + \delta_n' \\
&= I(X;U) - I(Z;U) + R_I + 6\delta + \delta_n + \delta_n' \leq R_L + \delta
\end{aligned} \tag{A25}$$
for sufficiently large n.
Finally, by using the selection lemma [34] (Lemma 2.2), from, e.g., (A16) and (A25), there exists at least one good codebook satisfying all the conditions in Definition 1 for large enough n.

Appendix A.3. Converse Part

We consider a more relaxed case where W is uniformly distributed on I , and (13), (15), (17) and (18) in Definition 1 are replaced by
$$\Pr\{(\hat{W}, \widehat{S(W)}) \neq (W, S(W))\} \leq \delta, \tag{A26}$$
$$\frac{1}{n} H(S(W) \mid W) \geq R_S - \delta, \tag{A27}$$
$$\frac{1}{n} I(S(W); J(W) \mid W) \leq \delta, \tag{A28}$$
$$\frac{1}{n} I(X_W^n; J(W) \mid W) \leq R_L + \delta, \tag{A29}$$
respectively. The other conditions remain unchanged. We shall show that, even for this case, the outer bound on $\mathcal{R}_G$ coincides with its inner bound, where no uniformity of $W$ is assumed. A similar approach was also taken in [12]. We assume that a rate tuple $(R_I, R_S, R_J, R_L)$ is achievable, implying that there exists a pair of encoder and decoder $(e, d)$ such that conditions (14), (16), and (A26)–(A29) are satisfied for any $\delta > 0$ and large enough $n$.
Analysis of Identification and Secret-Key Rates: The joint entropy of $W$ and $S(W)$ can be developed as
$$\begin{aligned}
H(W, S(W)) &= H(W, S(W) \mid Z^n, \boldsymbol{J}) + I(W, S(W); Z^n, \boldsymbol{J}) \\
&\overset{(a)}{=} H(W, S(W) \mid \hat{W}, \widehat{S(W)}, Z^n, \boldsymbol{J}) + I(W, S(W); \boldsymbol{J}) + I(W, S(W); Z^n \mid \boldsymbol{J}) \\
&\overset{(b)}{\leq} H(W, S(W) \mid \hat{W}, \widehat{S(W)}) + I(W, S(W); J(W)) + I(W, S(W); Z^n \mid J(W)) \\
&\overset{(c)}{\leq} n\delta_n + I(W; J(W)) + I(S(W); J(W) \mid W) + I(W, S(W); Z^n \mid J(W)) \\
&\overset{(d)}{\leq} n(\delta_n + \delta) + h(Z^n \mid J(W)) - h(Z^n \mid J(W), S(W)) \\
&\overset{(e)}{\leq} n(\delta_n + \delta) + h(Z^n) - h(Z^n \mid J(W), S(W)),
\end{aligned} \tag{A30}$$
where
(a) holds since $(\hat{W}, \widehat{S(W)})$ is a function of $(Z^n, \boldsymbol{J})$,
(b) follows because conditioning reduces entropy, and only $J(W)$ is possibly dependent on $(Z^n, S(W))$,
(c) is due to Fano's inequality with $\delta_n = \frac{1}{n}(1 + \delta \log(M_I M_S))$,
(d) follows since (A28) is applied, and $W$ is independent of the other RVs,
(e) follows because conditioning reduces entropy.
Due to the uniformity of W on I , we have that
$$H(W, S(W)) = H(W) + H(S(W) \mid W) = \log M_I + H(S(W) \mid W). \tag{A31}$$
From (14), (A27), and (A30), it follows that
$$R_I + R_S \leq h(Z) - \frac{1}{n} h(Z^n \mid J(W), S(W)) + 3\delta + \delta_n. \tag{A32}$$
Analysis of Storage Rate: From (16),
$$\begin{aligned}
n(R_J + \delta) &\geq \log M_J \geq \max_{w \in \mathcal{I}} H(J(w)) \geq H(J(W) \mid W) = I(Y_W^n; J(W) \mid W) \\
&\overset{(f)}{=} h(Y_W^n) - h(Y_W^n \mid J(W), W) \\
&= h(Y_W^n) - h(Y_W^n \mid J(W), S(W), W) - I(S(W); Y_W^n \mid J(W), W) \\
&\overset{(g)}{=} h(Y_W^n) - h(Y_W^n \mid J(W), S(W)) - H(S(W) \mid J(W), W) \\
&\geq h(Y_W^n) - h(Y_W^n \mid J(W), S(W)) - H(S(W) \mid W) \\
&\overset{(h)}{\geq} h(Z^n \mid J(W), S(W)) - h(Y_W^n \mid J(W), S(W)) + n(R_I - (\delta_n + 2\delta)),
\end{aligned} \tag{A33}$$
where
(f) holds since $W$ is independent of $Y_W^n$,
(g) holds since $W$ is independent of the other RVs and $S(W)$ is a function of $Y_W^n$,
(h) follows because $h(Y_W^n) = h(Z^n) = \frac{n}{2}\log(2\pi e)$, and by combining (A30) and (A31), we obtain that $H(S(W) \mid W) \leq h(Z^n) - h(Z^n \mid J(W), S(W)) - n(R_I - (\delta_n + 2\delta))$.
Analysis of Privacy-Leakage Rate: From (A29),
$$\begin{aligned}
n(R_L + \delta) &\geq I(X_W^n; J(W) \mid W) \overset{(i)}{=} h(X_W^n) - h(X_W^n \mid J(W), W) \\
&= h(X_W^n) - h(X_W^n \mid J(W), S(W), W) - I(S(W); X_W^n \mid J(W), W) \\
&\geq h(X_W^n) - h(X_W^n \mid J(W), S(W)) - H(S(W) \mid J(W), W) \\
&\geq h(X_W^n) - h(X_W^n \mid J(W), S(W)) - H(S(W) \mid W) \\
&\overset{(j)}{\geq} h(Z^n \mid J(W), S(W)) - h(X_W^n \mid J(W), S(W)) + n(R_I - (\delta_n + 2\delta)),
\end{aligned} \tag{A34}$$
where
(i) holds since $W$ is independent of $X_W^n$,
(j) follows because $h(X_W^n) = h(Z^n)$, and the same reasoning as (h) in (A33) is used.
For further evaluation of (A32)–(A34), we derive a lower bound on $h(Z^n \mid J(W), S(W))$ and an upper bound on $h(Y_W^n \mid J(W), S(W))$ for a fixed $h(X_W^n \mid J(W), S(W))$ by applying the conditional EPI [27] (Lemma II). The key is to set
$$\frac{1}{n} h(X_W^n \mid J(W), S(W)) = \frac{1}{2}\log\big(2\pi e(\alpha\rho_1^2 + 1 - \rho_1^2)\big) \tag{A35}$$
for some $0 < \alpha \leq 1$. Indeed, this is a reasonable setting because $\frac{1}{2}\log\big(2\pi e(1 - \rho_1^2)\big) \leq \frac{1}{n} h(X_W^n \mid J(W), S(W)) \leq \frac{1}{2}\log(2\pi e)$. The lower bound is obtained from $\frac{1}{n} h(X_W^n \mid J(W), S(W)) \geq \frac{1}{n} h(X_W^n \mid Y_W^n, J(W), S(W)) = \frac{1}{n} h(X_W^n \mid Y_W^n)$, due to the fact that $(J(W), S(W))$ is a function of $Y_W^n$. The point $\alpha = 0$ is not achievable, and the reason is explained right after Equation (A40).
In the direction from X to Z, by applying the conditional EPI [27] (Lemma II) to the first equality in (10), it follows that
$$\begin{aligned}
e^{\frac{2}{n} h(Z^n \mid J(W), S(W))} &\geq e^{\frac{2}{n} h(\rho_2 X_W^n \mid J(W), S(W))} + e^{\frac{2}{n} h(N_2^n \mid J(W), S(W))} \\
&\overset{(k)}{=} \rho_2^2\, e^{\frac{2}{n} h(X_W^n \mid J(W), S(W))} + e^{\frac{2}{n} h(N_2^n)} \\
&= \rho_2^2 \cdot 2\pi e(\alpha\rho_1^2 + 1 - \rho_1^2) + 2\pi e(1 - \rho_2^2) = 2\pi e(\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2),
\end{aligned} \tag{A36}$$
where (k) holds since $N_2^n$ is independent of $(J(W), S(W))$; as a deduction,
$$\frac{1}{n} h(Z^n \mid J(W), S(W)) \geq \frac{1}{2}\log\big(2\pi e(\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2)\big). \tag{A37}$$
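For Gaussian conditionals, the EPI step above reduces to a purely algebraic identity on the per-letter variances. The sketch below checks, over a small grid of hypothetical parameter values (not the paper's examples), that $\rho_2^2 \cdot 2\pi e(\alpha\rho_1^2 + 1 - \rho_1^2) + 2\pi e(1 - \rho_2^2)$ collapses to $2\pi e(\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2)$, i.e., the right-hand side of the lower bound on $\frac{1}{n}h(Z^n \mid J(W), S(W))$.

```python
import math
import itertools

# Verify the variance identity behind the conditional-EPI step: for
# Z = rho2 * X + N2 with Var(N2) = 1 - rho2^2 and per-letter conditional
# variance alpha*rho1^2 + 1 - rho1^2 assigned to X^n, the EPI lower bound
# equals 2*pi*e*(alpha*rho1^2*rho2^2 + 1 - rho1^2*rho2^2).
two_pi_e = 2 * math.pi * math.e
for alpha, rho1, rho2 in itertools.product([0.1, 0.5, 1.0], [0.3, 0.9], [0.5, 0.8]):
    lhs = rho2**2 * two_pi_e * (alpha * rho1**2 + 1 - rho1**2) \
        + two_pi_e * (1 - rho2**2)
    rhs = two_pi_e * (alpha * rho1**2 * rho2**2 + 1 - rho1**2 * rho2**2)
    assert math.isclose(lhs, rhs)

# The resulting lower bound on (1/n) h(Z^n | J(W), S(W)), in nats,
# for one hypothetical parameter choice:
alpha, rho1, rho2 = 0.5, 0.9, 0.8
h_Z_bound = 0.5 * math.log(two_pi_e * (alpha * rho1**2 * rho2**2
                                       + 1 - rho1**2 * rho2**2))
```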
In the opposite direction (from X to Y), again applying the conditional EPI [27] (Lemma II) to (9), we have that
$$e^{\frac{2}{n} h(X_W^n \mid J(W), S(W))} \geq e^{\frac{2}{n} h(\rho_1 Y_W^n \mid J(W), S(W))} + e^{\frac{2}{n} h(N_1^n)}, \tag{A38}$$
where the inequality holds since $N_1^n$ is also independent of $(J(W), S(W))$, meaning that
$$2\pi e(\alpha\rho_1^2 + 1 - \rho_1^2) \geq \rho_1^2\, e^{\frac{2}{n} h(Y_W^n \mid J(W), S(W))} + 2\pi e(1 - \rho_1^2), \tag{A39}$$
and thus
$$e^{\frac{2}{n} h(Y_W^n \mid J(W), S(W))} \leq 2\pi e\,\alpha, \qquad \frac{1}{n} h(Y_W^n \mid J(W), S(W)) \leq \frac{1}{2}\log(2\pi e\,\alpha), \tag{A40}$$
which is not derivable from the first equation in (1) of the original system. As previously mentioned, for the case $\alpha = 0$ in (A40), since the RV $Y$ has unit variance, the joint entropy $H(J(W), S(W))$ would have to be infinite, which is impossible to achieve for finite sets $\mathcal{S}$ and $\mathcal{J}$.
Now plugging (A35), (A37), and (A40) into (A32)–(A34), we obtain that
$$R_I + R_S \leq \frac{1}{2}\log\frac{1}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2} + 3\delta + \delta_n, \tag{A41}$$
$$R_J \geq \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha} + R_I - (3\delta + \delta_n), \tag{A42}$$
$$R_L \geq \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha\rho_1^2 + 1 - \rho_1^2} + R_I - (3\delta + \delta_n). \tag{A43}$$
Eventually, by letting n and δ 0 , from (A41)–(A43), we can see that the capacity region is contained in the right-hand side of (19).
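Since the converse bounds match the achievability settings as $\delta, \delta_n \to 0$, the tightness of the characterization can be checked numerically: plugging the mutual informations of the converted system into the achievable rate settings lands exactly on the converse boundary. A minimal sketch with hypothetical values ($\alpha = 0.4$, $\rho_1 = 0.85$, $\rho_2 = 0.7$; not the paper's examples):

```python
import math

# Hypothetical parameters; delta -> 0, so the achievable rates of (A9)-(A10)
# should sit exactly on the converse bounds (A41)-(A43).
alpha, rho1, rho2 = 0.4, 0.85, 0.7
I_YU = 0.5 * math.log(1 / alpha)
I_XU = 0.5 * math.log(1 / (alpha * rho1**2 + 1 - rho1**2))
I_ZU = 0.5 * math.log(1 / (alpha * rho1**2 * rho2**2 + 1 - rho1**2 * rho2**2))

R_I = 0.3 * I_ZU                      # any 0 < R_I < I(Z;U)
R_S = I_ZU - R_I                      # achievability settings, delta -> 0
R_J = I_YU - I_ZU + R_I
R_L = I_XU - I_ZU + R_I

# Converse bounds with delta, delta_n -> 0: achievability meets them exactly.
v = alpha * rho1**2 * rho2**2 + 1 - rho1**2 * rho2**2
assert math.isclose(R_I + R_S, 0.5 * math.log(1 / v))
assert math.isclose(R_J, 0.5 * math.log(v / alpha) + R_I)
assert math.isclose(R_L, 0.5 * math.log(v / (alpha * rho1**2 + 1 - rho1**2)) + R_I)
```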

Appendix B. Proof Sketch of Equation (20)

In this section, we highlight the proof of the chosen-secret BIS model. The parts that follow directly from the same arguments in Appendix A are omitted.

Appendix B.1. Achievability Part

The proof differs slightly from the one in Appendix A in that the encoder and decoder of the generated-secret BIS model are used as components inside the encoder and decoder of the chosen-secret BIS model, as shown in Figure A2. To avoid confusion in the subsequent arguments, we define some new notation used only in this part. The pairs $(J_C(i), S_C(i))$ and $(J_G(i), S_G(i))$ denote the helper and secret key for individual $i$ generated by the encoders of the chosen- and generated-secret BIS models, respectively. To encode $Y_i^n$ for $i \in \mathcal{I}$, the component inside the encoder first generates $(J_G(i), S_G(i))$ and then uses $S_G(i)$ to mask $S_C(i)$ by executing the one-time pad operation $S_C(i) \oplus S_G(i)$, where $\oplus$ denotes addition modulo $M_S$. The helper data $J_C(i)$ is the combination of $J_G(i)$ and the masked data, i.e.,
$$J_C(i) = \big(J_G(i),\ S_C(i) \oplus S_G(i)\big). \tag{A44}$$
Figure A2. Encoder and decoder of the chosen-secret BIS model for achievability proof.
To decode the identified user $W = i$, the decoder first uses the component decoder to estimate $(\hat{W}, \widehat{S_G(W)})$, and then the secret key is retrieved from
$$\widehat{S_C(W)} = \big(S_C(\hat{W}) \oplus S_G(\hat{W})\big) \ominus \widehat{S_G(W)}, \tag{A45}$$
where $\ominus$ denotes subtraction modulo $M_S$. This technique is also used in [8,11,13].
Analysis of Error Probability: For individual $W = i$, the decoder operation (A45) implies that $(\hat{W}, \widehat{S_C(W)}) = (W, S_C(W))$ if and only if $(\hat{W}, \widehat{S_G(W)}) = (W, S_G(W))$. In (A16), it was shown that $\Pr\{(\hat{W}, \widehat{S_G(W)}) \neq (W, S_G(W)) \mid W = i\} \leq 4\delta$. Therefore, the error probability of the chosen-secret BIS model can also be bounded as
$$\Pr\{(\hat{W}, \widehat{S_C(W)}) \neq (W, S_C(W)) \mid W = i\} \leq 4\delta \tag{A46}$$
for large enough n.
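The one-time pad step can be illustrated with a toy sketch over the index set $[1 : M_S]$, represented here as residues modulo a hypothetical $M_S = 64$ (any size works): masking with an independent uniform $S_G(i)$ makes the stored value uniform, so it reveals nothing about $S_C(i)$, and the decoder recovers $S_C(i)$ exactly when its estimate of $S_G(i)$ is correct.

```python
import random

# Toy sketch of the one-time-pad construction: the chosen secret s_c is
# masked by the generated key s_g with addition modulo M_S; the decoder
# recovers s_c exactly when its estimate of s_g is correct.
M_S = 64                          # hypothetical key-alphabet size
rng = random.Random(0)
s_c = rng.randrange(M_S)          # chosen secret key
s_g = rng.randrange(M_S)          # generated secret key
masked = (s_c + s_g) % M_S        # stored as part of the helper data

s_g_hat = s_g                     # correct component-decoder estimate
s_c_hat = (masked - s_g_hat) % M_S
assert s_c_hat == s_c

# With a uniform, independent s_g, the masked value is uniform as well:
# each masked value arises from exactly one key (perfect secrecy).
counts = [0] * M_S
for g in range(M_S):
    counts[(s_c + g) % M_S] += 1
assert all(c == 1 for c in counts)
```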
Analyses of Identification and Secret-key Rates: Equations (14) and (15) are straightforward from the parameter settings.
Analysis of Storage Rate: From (A44), the total storage rate is bounded by
$$\underbrace{I(Y;U) - I(Z;U) + R_I + 6\delta}_{\text{rate of } J_G(i)} + \underbrace{I(Z;U) - R_I - 2\delta}_{\text{rate of } S_C(i) \oplus S_G(i)} = I(Y;U) + 4\delta = \frac{1}{2}\log\frac{1}{\alpha} + 4\delta. \tag{A47}$$
Analysis of Secrecy-Leakage Rate: By an argument similar to [11] (Equation (48)), it can be shown that
$$I(J_C(i); S_C(i) \mid \mathcal{C}_n) \leq I(J_G(i); S_G(i) \mid \mathcal{C}_n) + \log M_S - H(S_G(i) \mid \mathcal{C}_n), \tag{A48}$$
and by substituting (A22) into (A48), the secrecy-leakage rate of the chosen-secret BIS model is bounded by
$$\frac{1}{n} I(J_C(i); S_C(i) \mid \mathcal{C}_n) \leq 10\delta \tag{A49}$$
for large enough n.
Analysis of Privacy-Leakage Rate: It can be proved that
$$I(X_i^n; J_C(i) \mid \mathcal{C}_n) = I(X_i^n; J_G(i) \mid \mathcal{C}_n). \tag{A50}$$
To verify this, first one can easily see that
$$I(X_i^n; J_C(i) \mid \mathcal{C}_n) = I(X_i^n; J_G(i), S_C(i) \oplus S_G(i) \mid \mathcal{C}_n) \geq I(X_i^n; J_G(i) \mid \mathcal{C}_n). \tag{A51}$$
Meanwhile, by analogy with the development of [11] (Equation (49)), it also holds that
$$I(X_i^n; J_C(i) \mid \mathcal{C}_n) \leq I(X_i^n; J_G(i) \mid \mathcal{C}_n). \tag{A52}$$
From (A51) and (A52), (A50) clearly holds. By invoking the result of (A25), from (A50), the privacy-leakage rate can also be bounded as
$$\frac{1}{n} I(X_i^n; J_C(i) \mid \mathcal{C}_n) \leq R_L + \delta \tag{A53}$$
for large enough n.
Finally, by using the selection lemma [34] (Lemma 2.2), there is at least one good codebook satisfying all the conditions in Definition 2 for large enough n.

Appendix B.2. Converse Part

As seen in the converse proof of the generated-secret BIS model, we also consider the case in which W is uniformly distributed on I . Suppose that a pair ( R I , R S , R J , R L ) is achievable and fix α such that the condition in (A35) is satisfied.
For the analyses of the identification, secret-key, and privacy-leakage rates, the reader should refer to the discussions around (A32) and (A34). We argue only the bound on $R_J$, which differs from the one in the generated-secret BIS model.
Analysis of Storage Rate:
$$\begin{aligned}
n(R_J + \delta) &\geq \log M_J \geq \max_{w \in \mathcal{I}} H(J(w)) \geq H(J(W) \mid W) \overset{(a)}{=} I(Y_W^n, S(W); J(W) \mid W) \\
&= I(S(W); J(W) \mid W) + I(Y_W^n; J(W) \mid W, S(W)) \geq I(Y_W^n; J(W) \mid W, S(W)) \\
&\overset{(b)}{=} h(Y_W^n) - h(Y_W^n \mid J(W), S(W)) \overset{(c)}{\geq} \frac{n}{2}\log(2\pi e) - \frac{n}{2}\log(2\pi e\,\alpha) = \frac{n}{2}\log\frac{1}{\alpha},
\end{aligned} \tag{A54}$$
where
(a) follows since $J(W)$ is a function of $(Y_W^n, S(W))$,
(b) follows since $W$ is independent of the other RVs and $S(W)$ is chosen independently of $Y_W^n$,
(c) follows because (A40) is applied.
Then, we have that
$$R_J \geq \frac{1}{2}\log\frac{1}{\alpha} - \delta. \tag{A55}$$
By letting n and δ 0 , the capacity region of the chosen-secret BIS model is contained in the right-hand side of (20).

Appendix C. Convexity of the Regions R G and R C

Here, we only verify that the region $\mathcal{R}_G$ is convex, as the proof for $\mathcal{R}_C$ follows similarly. We begin with the case in which $\rho_1^2\rho_2^2 > 0$. We first define $\eta = \frac{1}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}$, and it then follows that $\alpha = \frac{1}{\rho_1^2\rho_2^2}\left(\frac{1}{\eta} - (1 - \rho_1^2\rho_2^2)\right)$. Therefore, the constraints on $R_J$ and $R_L$ in (19) can be transformed as follows:
$$R_J \geq \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha} + R_I = \frac{1}{2}\log\frac{\rho_1^2\rho_2^2}{1 - (1 - \rho_1^2\rho_2^2)\eta} + R_I = -\frac{1}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)\eta\big) + \log|\rho_1\rho_2| + R_I, \tag{A56}$$
and
$$R_L \geq \frac{1}{2}\log\frac{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2}{\alpha\rho_1^2 + 1 - \rho_1^2} + R_I = \frac{1}{2}\log\frac{\rho_2^2}{1 - (1 - \rho_2^2)\eta} + R_I = -\frac{1}{2}\log\big(1 - (1 - \rho_2^2)\eta\big) + \log|\rho_2| + R_I. \tag{A57}$$
Since $0 < |\rho_1|, |\rho_2| < 1$ and $0 < \alpha \leq 1$, it holds that $(1 - \rho_2^2)\eta = \frac{1 - \rho_2^2}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2} < \frac{1 - \rho_1^2\rho_2^2}{\alpha\rho_1^2\rho_2^2 + 1 - \rho_1^2\rho_2^2} = (1 - \rho_1^2\rho_2^2)\eta < 1$, indicating that the values $1 - (1 - \rho_1^2\rho_2^2)\eta$ and $1 - (1 - \rho_2^2)\eta$ are positive. Now the region in (19) can also be expressed as follows:
$$\mathcal{R}_G = \Big\{(R_I, R_S, R_J, R_L) : R_I + R_S \leq \tfrac{1}{2}\log\eta,\ R_J \geq -\tfrac{1}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)\eta\big) + \log|\rho_1\rho_2| + R_I,\ R_L \geq -\tfrac{1}{2}\log\big(1 - (1 - \rho_2^2)\eta\big) + \log|\rho_2| + R_I,\ R_I \geq 0,\ R_S \geq 0 \text{ for some } 1 \leq \eta < \tfrac{1}{1 - \rho_1^2\rho_2^2}\Big\}. \tag{A58}$$
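The change of variables from $\alpha$ to $\eta$ used above can be verified numerically: for a grid of $\alpha \in (0, 1)$, the $R_J$ and $R_L$ constraints written in terms of $\alpha$ and in terms of $\eta$ agree, and $\eta$ stays inside $[1, 1/(1 - \rho_1^2\rho_2^2))$. A sketch with hypothetical $\rho_1 = 0.9$, $\rho_2 = 0.8$ (not the paper's examples):

```python
import math

# Check the alpha <-> eta substitution for the R_J and R_L constraints
# over a grid of alpha values, with hypothetical rho1, rho2.
rho1, rho2 = 0.9, 0.8
p = rho1**2 * rho2**2
for k in range(1, 10):
    alpha = k / 10.0
    eta = 1.0 / (alpha * p + 1 - p)
    assert 1.0 <= eta < 1.0 / (1 - p)

    # R_J constraint: alpha-form vs eta-form.
    rj_alpha = 0.5 * math.log((alpha * p + 1 - p) / alpha)
    rj_eta = -0.5 * math.log(1 - (1 - p) * eta) + math.log(abs(rho1 * rho2))
    assert math.isclose(rj_alpha, rj_eta)

    # R_L constraint: alpha-form vs eta-form.
    rl_alpha = 0.5 * math.log((alpha * p + 1 - p)
                              / (alpha * rho1**2 + 1 - rho1**2))
    rl_eta = -0.5 * math.log(1 - (1 - rho2**2) * eta) + math.log(abs(rho2))
    assert math.isclose(rl_alpha, rl_eta)
```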
Suppose that the rate tuples $\mathbf{R}_1 = (R_{I1}, R_{S1}, R_{J1}, R_{L1})$ and $\mathbf{R}_2 = (R_{I2}, R_{S2}, R_{J2}, R_{L2})$ are achievable for $\eta_1$ and $\eta_2$, respectively. Without loss of generality, we assume that $1 \leq \eta_1 \leq \eta_2 < \frac{1}{1 - \rho_1^2\rho_2^2}$. Next, let us consider linear combinations of these tuples. For $0 \leq \lambda \leq 1$, we have that
$$\lambda(R_{I1} + R_{S1}) + (1 - \lambda)(R_{I2} + R_{S2}) \leq \frac{1}{2}\big(\lambda\log\eta_1 + (1 - \lambda)\log\eta_2\big) \overset{(a)}{\leq} \frac{1}{2}\log\big(\lambda\eta_1 + (1 - \lambda)\eta_2\big) \overset{(b)}{=} \frac{1}{2}\log\bar{\eta}, \tag{A59}$$
where
(a) follows since $\log x$ $(x > 0)$ is a concave function,
(b) holds as we define $\bar{\eta} = \lambda\eta_1 + (1 - \lambda)\eta_2$.
Next, take a look at the bound on the storage rate:
$$\begin{aligned}
\lambda R_{J1} + (1 - \lambda) R_{J2} &\geq -\frac{\lambda}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)\eta_1\big) - \frac{1 - \lambda}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)\eta_2\big) + \log|\rho_1\rho_2| + \lambda R_{I1} + (1 - \lambda) R_{I2} \\
&\overset{(c)}{\geq} -\frac{1}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)(\lambda\eta_1 + (1 - \lambda)\eta_2)\big) + \log|\rho_1\rho_2| + \lambda R_{I1} + (1 - \lambda) R_{I2} \\
&= -\frac{1}{2}\log\big(1 - (1 - \rho_1^2\rho_2^2)\bar{\eta}\big) + \log|\rho_1\rho_2| + \lambda R_{I1} + (1 - \lambda) R_{I2},
\end{aligned} \tag{A60}$$
where (c) follows because $f(x) = -\log(1 - x)$ $(x < 1)$ is a convex function. Likewise, we can also show that
$$\lambda R_{L1} + (1 - \lambda) R_{L2} \geq -\frac{1}{2}\log\big(1 - (1 - \rho_2^2)\bar{\eta}\big) + \log|\rho_2| + \lambda R_{I1} + (1 - \lambda) R_{I2}. \tag{A61}$$
From (A59)–(A61), we see that there exists an $\bar{\eta}$, with $\eta_1 \leq \bar{\eta} \leq \eta_2$, such that $\lambda\mathbf{R}_1 + (1 - \lambda)\mathbf{R}_2 \in \mathcal{R}_G$.
For the other case ($\rho_1^2\rho_2^2 = 0$), the right-hand sides of the constraints in $\mathcal{R}_G$ take simpler forms, and it is straightforward to check convexity by applying Jensen's inequality to the logarithmic function. Accordingly, we conclude that the region $\mathcal{R}_G$ is convex.

Figure 1. The generated- and chosen-secret BIS models.
Figure 2. The original (top) and converted (bottom) systems.
Figure 3. (a,b) illustrate the optimal values of the identification, secret-key, storage, and privacy-leakage rates in the regions $\mathcal{R}_G$ and $\mathcal{R}_C$, respectively, for a fixed $\alpha$.
Figure 4. Projections of the capacity region $\mathcal{R}_G$ onto (a) the $R_J R_S R_I$-space and (b) the $R_J R_I$-plane.
Figure 5. Projections of the capacity region $\mathcal{R}_G$ onto two-dimensional planes for Exs. 2 and 3: (a) the $R_J R_S$-plane for Ex. 2; (b) the $R_J R_L$-plane for Ex. 2; (c) the $R_J R_S$-plane for Ex. 3; (d) the $R_J R_L$-plane for Ex. 3.
Table 1. The secret-key and privacy-leakage rates when $R_J \to \infty$.

| Cases | Optimal secret-key rate (a) | (b) | (c) | Privacy-leakage rate (a) | (b) | (c) |
|-------|------|------|------|------|------|------|
| Ex. 1 | 0.35 | 0.44 | 0.49 | 0.35 | 0.6  | 0.90 |
| Ex. 2 | 0.35 | 0.77 | 0.98 | 0.35 | 0.38 | 0.41 |
| Ex. 3 | 0.35 | 0.55 | 0.6  | 0.35 | 0.14 | 0.09 |
Table 2. The slopes of the secret-key and privacy-leakage rates as $R_J \to 0$.

| Cases | Slope of secret-key rate (a) | (b) | (c) | Slope of privacy-leakage rate (a) | (b) | (c) |
|-------|------|------|------|------|------|------|
| Ex. 1 | 1.0 | 1.40 | 1.67 | 0.5 | 0.7  | 0.83 |
| Ex. 2 | 1.0 | 3.71 | 6.11 | 0.5 | 0.53 | 0.56 |
| Ex. 3 | 1.0 | 2.0  | 2.33 | 0.5 | 0.25 | 0.17 |
Yachongka, V.; Yagi, H.; Oohama, Y. Biometric Identification Systems with Noisy Enrollment for Gaussian Sources and Channels. Entropy 2021, 23, 1049. https://doi.org/10.3390/e23081049