Multiple Observations for Secret-Key Binding with SRAM PUFs

We present a new Multiple-Observations (MO) helper data scheme for secret-key binding to an SRAM-PUF. This MO scheme binds a single key to multiple enrollment observations of the SRAM-PUF. Performance is improved in comparison to classic schemes which generate helper data based on a single enrollment observation. The performance increase can be explained by the fact that the reliabilities of the different SRAM cells are modeled (implicitly) in the helper data. We prove that the scheme achieves secret-key capacity for any number of enrollment observations, and, therefore, it is optimal. We evaluate performance of the scheme using Monte Carlo simulations, where an off-the-shelf LDPC code is used to implement the linear error-correcting code. Another scheme that models the reliabilities of the SRAM cells is the so-called Soft-Decision (SD) helper data scheme. The SD scheme considers the one-probabilities of the SRAM cells as an input, which in practice are not observable. We present a new strategy for the SD scheme that considers the binary SRAM-PUF observations as an input instead and show that the new strategy is optimal and achieves the same reconstruction performance as the MO scheme. Finally, we present a variation on the MO helper data scheme that updates the helper data sequentially after each successful reconstruction of the key. As a result, the error-correcting performance of the scheme is improved over time.


Introduction
The The Internet of Things (IoT) makes it possible to connect and share information between many different devices through the Internet. This sharing of information is beneficial for many applications, e.g., in healthcare or consumer electronics. At the same time, the information may be sensitive and should not fall into the wrong hands or be tampered with. Therefore, security is one of the main challenges of the IoT devices. Since the IoT devices are often small and low cost, securing the devices should come at a low price.
Secure communication is often achieved through cryptographic protocols that rely on secret keys. A low-cost alternative for secure storage of the keys is enabled by Static Random-Access Memory Physical Unclonable Functions (SRAM PUFs). A PUF is a physical object or device that responds to a challenge with a response that is unique and unpredictable [1,2]. The SRAM-PUF functionality is based on the uninitialized values of the SRAM. These are the values that appear in the memory cells directly after power up of the SRAM. The corresponding binary vector is unique for each SRAM and it is the result of small variations in the silicon material. It can be considered to be a noisy fingerprint of the device and can be used to generate and bind secret keys [3,4].
Since the SRAM-PUF observations are noisy, additional processing is required to ensure reliable reconstruction of the key. This can be achieved through a so-called key binding scheme, see Figure 1. The scheme considers two phases: an enrollment phase during which a uniformly generated key is bound to a first SRAM-PUF observation; and a reconstruction phase during which the key is reconstructed from an additional observation of the SRAM-PUF. Please note that enrollment is usually performed only once, whereas reconstruction can be repeated many times. During enrollment, besides the key s, also some helper data w is generated. This helper data w ensures that the key can be reliably reconstructed even though y n is a noisy version of x n . The helper data are considered public, and therefore should not reveal information about the key s to an attacker. E D w sŝ y n x n reconstruction enrollment In classic helper data schemes, a single SRAM-PUF observation is used for enrollment and a single observation is used for reconstruction. However, it has been shown that the mutual information between the encoder and decoder observations increases when more observations are considered [5]. Since the secret-key capacity is equal to the mutual information (see Section 3), it follows that the achievable secret-key rate is increased when multiple SRAM-PUF observations are used instead of a single observation.
We introduce the Multiple-Observations (MO) helper data scheme, which enrolls a single key after processing multiple SRAM-PUF observation vectors. The scheme is based on a linear error-correcting code and can be seen as an extension of the fuzzy commitment scheme [6]. Any number of enrollment observations can be used. Furthermore, the performance of the scheme increases when more observations are used.

Related Work
A first implementation of a key binding scheme for generating and reconstructing a cryptographic key from SRAM PUFs was presented by Guajardo et al. in [4]. There, the fuzzy commitment scheme [6] was used to construct the helper data and later to reproduce the key. It is known that within one SRAM-PUF, some cells are more reliable (smaller error probability) than other cells [3]. The reliability information of the SRAM cells can be used to improve the performance of the helper data schemes. For example, a Soft-Decision (SD) helper data scheme [7,8] publicly shares the error probability of each SRAM cell to improve the decoder performance. Furthermore, a Selection-based helper data scheme [9][10][11] selects only the most reliable SRAM cells to reduce the average error probability of the SRAM-PUF observations. Both schemes assume that reliability of each SRAM cell is known during enrollment. However, in general this information is not available. Either, special measurement techniques must be applied, or a sufficient number of observations is required to estimate these values before enrollment. We propose a new scheme that accepts standard SRAM-PUF observation vectors as an input, i.e., the MO helper data scheme. In [12] multiple enrollment observations are used under various environmental conditions. The focus of [12] is on experimental validation of new strategies that consider multiple enrollment temperatures; however, a mathematical analysis of the strategies is missing. Our work focuses instead on finding an optimal multiple observations strategy. In the future, our analysis may be extended to consider temperature dependence as well, see Section 2.
Applying multiple observations for key binding (and generation) has been studied from information-theoretic perspective in [13][14][15]. It is shown that the secret-key rate can be improved when multiple observations are used by the encoder or the decoder. Achievable rate regions are analyzed for various multiple enrollments and multiple entities scenarios, but no code constructions are proposed or investigated.
Multiple enrollment scenarios are studied from leakage perspective in [16,17]. In these papers, scenarios are considered (e.g., the reverse fuzzy extractor [18]) in which enrollment is repeated multiple times and correspondingly multiple helper data are generated. Please note that the additional helper data are not generated to increase performance; instead, they are considered to be a security challenge that follows from the repeated enrollments. The decoder performs a classic reconstruction that is based on a single helper data sequence only, whereas an attacker may have stored all previously generated helper data. It is shown in [17] that zero leakage is ensured in these multiple enrollment scenarios, when the SRAM PUFs meet a certain symmetry condition. Inspired by the zero leakage results of [17], we have developed the MO helper data scheme which produces a single helper data sequence based on repeated enrollments. In contrast to the scenarios discussed in the previous paragraph, the decoder now benefits from the additional observations which are embedded in the helper data.

Contributions and Outline
In Section 3, we define secret-key capacity for multiple enrollment observations and calculate its value for the SRAM-PUF statistical model. We show that secret-key capacity can increase significantly when more enrollment observations are considered. This observation is the main motivation for the results presented in this paper, which are listed as follows.

•
We introduce the MO helper data scheme in Section 4. We prove that the helper data does not reveal any information about the key when the SRAM-PUF statistical model meets the symmetry assumption (Section 2). Then, we prove that secret-key capacity can be achieved with the MO helper data scheme for any number of enrollment observations. • In Section 5, we present a code construction and evaluate performance of the scheme through Monte Carlo simulations. • We propose a new variation on the Soft-Decision (SD) helper data scheme from [7] in Section 6. In contrast to the original scheme, this scheme considers binary SRAM-PUF observations as an input. We prove that the new SD strategy is optimal (achieves secret-key capacity) and that it can achieve the same reconstruction performance as the MO helper data scheme. • In Section 7, we present a variation on the MO helper data scheme that can update the helper data sequentially. The error-correcting performance of the scheme improves after each successful key reconstruction, and therefore the performance improves over the lifetime of the device. This enables usage of less observations during the enrollment phase, when allowing worse initial reconstruction performance that is improved over time.
We conclude with a summary of our results. In the following, we first introduce the notation and the statistical model that we use for SRAM PUFs.

Notation and SRAM-PUF Statistical Model
In the following, we first introduce the notation that is used. Then, we present the statistical model that we use for SRAM PUFs which is based on the commonly used model introduced in [7]. We introduce a symmetry assumption which is required for the security of the MO helper data scheme. We derive several properties that are needed for the proofs later in this work.

Notation and Definitions
We use uppercase symbols to denote random variables and lowercase symbols to denote their realizations. We consider t enrollment observations of n SRAM cells, corresponding to t binary observation vectors of length n, i.e., x n 1 , x n 2 , . . . , x n t . The ith observation of the jth SRAM cell is represented by x i,j , and assumes a value in {0, 1}. We often analyze the behavior of a single SRAM cell, in which case we omit the cell index j and we have (x 1 , x 2 , . . . , x t ), corresponding to t observations of one SRAM cell.
The reconstruction observation vector is represented as y n , with y j the reconstruction observation of the jth SRAM cell. We use a distinct symbol for the reconstruction observation to emphasize its different functionality in the helper data scheme. Note, however, that the reconstruction observations have the same statistical behavior as the enrollment observations, and they may as well have been represented as x n t+1 , the (t + 1)th observation of the SRAM cells.
Calligraphic letters are used for finite sets and |S| denotes the cardinality of the set S. Finally, (conditional) entropy and mutual information are defined as in [19,20], with the base of the log equal to 2 and the units are bits. We define the binary entropy function as Each SRAM cell has a one-probability θ ∈ [0, 1] that defines the probability that a one is observed for this cell, i.e., Pr In practice, we do not know the one-probability θ of each SRAM cell. Instead, the one-probability is a random variable, independent and identically distributed over the cells according to some known distribution p Θ (θ). And we can calculate the average one-probability The currently most used model for PUFs [7] adopts the following Θ-distribution Here φ(x) and Φ −1 (x) are the probability density function and the inverse of the cumulative distribution function of the normal distribution. Furthermore, λ 1 and λ 2 are parameters that determine the average reliability and average one-probability of the SRAM cells, respectively. It has been observed on several occasions [3,4,21] that the SRAM PUFs are unbiased, i.e., the average probability of observing a one is equal to the probability of observing a zero. This is a desirable property for PUFs as it ensures that the observations are completely unpredictable for an attacker. In [7], Maes et al. showed that the model (4) well represents their empirical data for λ 1 = 0.065, λ 2 = 0.000, which indeed corresponds to unbiased SRAM PUFs. In this paper, we assume an unbiased SRAM-PUF and set λ 2 = 0. It follows that the distribution of the one-probabilities is symmetric, i.e., This symmetry of the one-probability distribution is a property of the SRAM-PUF model that is key for the zero-secrecy leakage of the multiple observation schemes discussed in this paper.

Multiple Observations
First, we consider multiple observations of a single SRAM cell. The probability of observing t observations (x 1 , x 2 , . . . , x t ) of an SRAM cell with Hamming weight k = w H (x 1 , x 2 , . . . , x t ) (where the Hamming weight is defined as the number of non-zero symbols in the sequence), is defined as where in (a) we substituted with θ = (1 − θ) and applied the symmetry property (6). Please note that the probability only depends on the number of observed ones k, and the order of the observations is irrelevant. Furthermore, the probability of observing a length t sequence containing k ones is equal to the probability of observing a length t sequence containing k zeros. Similarly, the conditional probability of observing a value y given a previously observed sequence (x 1 , x 2 , . . . , x t ), only depends on the number of previously observed ones k, i.e., where in (a) we used Bayes' theorem and in (b) we used (7). Finally, the last step follows from the fact that (1 − y) ≡ (y ⊕ 1) for y ∈ {0, 1}.

Multiple SRAM Cells
All the equations that were derived for a single SRAM cell can be easily generalized to multiple SRAM cells. Since the one-probabilities are independent and identically distributed over the SRAM cells, the observations of different cells are also independent and the joint probability of t observations for n cells is with k j = w H x 1,j , . . . , x t,j the number of observed ones after t observations of the jth cell.

Θ-Distribution
Until now, we have assumed that λ 2 = 0 and thus the SRAM-PUF is unbiased. This assumption is sufficient for the security and achievability proofs in Section 4. However, to predict the performance of our scheme in a practical setting, we also need to set a value for λ 1 . In correspondence with previous works [7,11] we choose λ 1 = 0.51 for our simulations, which corresponds to average error probability regarding dominant (most likely) value of a cell In Figure 2, we plot the Θ-distribution that is used for the calculations later in the paper. Please note that stable cells (one-probability close to 0 or 1) are more likely than unstable cells. We will see later that knowledge about the stability (error probability) of specific cells can improve the performance of helper data schemes. Finally, note that temperature and voltage ramp-up time of the power up may influence the behavior of the SRAM PUFs [21]. Here, we do not model this behavior and only assume a worst-case average error probability of 0.15 (see (10)). However, it is worthwhile mentioning that the model (4) can be extended to also model the temperature dependence, see [22]. Knowledge about the temperature behavior can be exploited to improve the performance of the helper data schemes, especially when the temperature is known [23]. Therefore, in the future, it may be beneficial to extend the current work to also consider temperature dependent behavior of the SRAM PUFs.

Multiple Enrollment Observations for Increased Secret-Key Capacity
In this section, we describe a secret-key binding scheme with multiple enrollment observations, see Figure 3. We derive the achievable secret-key rate for the used statistical model and show that the rate can improve significantly when more enrollment observations are considered. These results motivate us to design a helper data scheme that can approach such rates in Section 4.
As in the classic (single enrollment) key binding scheme in Figure 1, we distinguish between an enrollment and a reconstruction phase. Figure 3. Secret-key binding with t observations by the encoder and a single observation by the decoder.
During enrollment, the secret key s ∈ {1, 2, . . . , |S|} is generated uniformly at random. An encoder obtains the secret key s and t observation vectors x n 1 , x n 2 , . . . , x n t , and generates corresponding helper data w ∈ W. The helper data alphabet W is specified by the encoding function that is used, see Section 4. We assume that the helper data w is stored and/or communicated in the public domain. Therefore, the helper data by itself should not reveal any information about the key. During reconstruction, a decoder observes a reconstruction vector y n as well as the helper data w and maps this input to an estimateŝ of the secret key. Reconstruction is successful when the estimate is equal to the original key, i.e., whenŝ = s.
We are now interested in the achievable secret-key rate of the scheme when enrollment is performed using t observation vectors of the SRAM-PUF, and where achievable rate is defined as follows.

Definition 1.
A secret-key rate R t is called achievable after t enrollment observations, if for all δ > 0 and for all n large enough, there exist encoders and decoders such that The secret-key capacity C t is the maximum achievable secret-key rate with t enrollment observations.
Here (11) requires that the key reconstruction is reliable, (12) that the key is uniformly distributed, and (13) limits the information leakage by the helper data about the key. Theorem 1 gives the fundamental limit on achievable secret-key rate for key binding schemes with t enrollment observations, and it follows from the results presented in [24,25].

Theorem 1. A secret-key rate R t is achievable after t enrollment observations if and only if
The secret-key capacity C t = I(Y; X 1 , X 2 , . . . , X t ).
In Figure 4, we plot the secret-key capacity for an example SRAM-PUF. Here, we have assumed a symmetric distribution of the one-probabilities, and average error probability ψ ≈ 0.15, see Figure 2 for the Θ-distribution. The mutual information increases from 0.26 for a single enrollment observation, to 0.50 after 20 enrollment observations. This corresponds to almost doubling the secret-key rate regarding single enrollment.  In the plot, we also visualize an upper bound to the secret-key capacity, given by where (a) follows from the Markov chain Y ↔ Θ ↔ (X 1 , X 2 , . . . , X t ). From the weak law of large numbers, i.e., lim t→∞ Pr | 1 x i converges (in probability) to θ for t → ∞. Therefore, the upper bound (15) can be achieved and we say

Multiple-Observations Helper Data Scheme
We introduce the Multiple-Observations (MO) helper data scheme for binding a secret key to multiple observations of an SRAM-PUF, see Figure 5. First, we give a step-by-step description of the scheme. Then, we prove that it is secure. Finally, we prove that the scheme achieves the secret-key capacity C t (see Theorem 1) for t enrollment observations and, therefore, that the MO helper data scheme is optimal. Please note that all derivations are under the symmetric Θ-distribution assumption (6).

Description of the Scheme
First, a binary secret key s ∈ {0, 1} log 2 |S| is generated uniformly at random, such that H(S) = log 2 |S|, with log 2 |S| an integer. The key s is then encoded using a linear error-correcting code into a codeword c n . The codeword is XORed with all t enrollment vectors x n 1 , x n 2 , . . . , x n t , resulting in t sequences w n 1 , w n 2 , . . . , w n t , with Please note that w n i corresponds to a helper data sequence of the classic fuzzy commitment scheme [6]. The helper data of the MO helper data scheme is constructed by adding all these sequences together, i.e., the helper data are The helper data are stored on the device or in a database, and they are considered public information.
When the key must be recovered, the decoder observes the helper data m n t and another observation vector y n of the SRAM-PUF. A decoder function maps each pair (m n t , y n ) to a corresponding estimateŝ of the original secret. Ideally, the reconstructed secretŝ is equal to the original secret. Furthermore, an attacker who can observe the helper data m n t should not obtain information about the secret.

Uniformity and Zero Leakage
First, by definition of the MO helper data scheme, the secret key is generated uniformly at random and therefore the uniformity condition of Definition 1, 1 n H(S) = 1 n log 2 |S|, is satisfied. Second, it has been shown in [17] that, for symmetric SRAM PUFs, zero leakage occurs by all the helper data after repeated enrollments, i.e., I(S; W n 1 , . . . , W n t ) = 0. It follows that where (a) follows from the data processing inequality (see [19] Chapter 2) for the Markov Chain By (19) the MO helper data scheme achieves strong secrecy, i.e., I(S; M n t ) ≤ δ.

Achievable Secret-Key Rate
Third, we derive the achievable rate over the channel from the encoder to the decoder. The channel is described by the conditional distribution Pr(M t = m t , Y = y|C = c), and it follows from the channel coding theorem (see, e.g., [20], Chapter 3) that the maximum achievable rate is given by the mutual information maximized over the input distribution. Furthermore, C is uniform here, since we are using a linear code. We therefore evaluate this mutual information next Please note that (a) follows from zero leakage (see (24)), and (b) holds since (by (18)) the Hamming weight w H (x 1 , x 2 , . . . , x t ) is uniquely determined by m t given c and vice-versa. Furthermore, (c) follows from (22) below and since the key and thus codebit c are generated independently from the SRAM-PUF observations. Finally, (d) follows from the fact that the symmetric SRAM-PUF is unbiased (5), and by the fact that the conditional probability of the next observation given t previous observations (x 1 , x 2 , . . . , x t ) only depends on the number of observed ones (7). The following derivation shows that the helper data integer m t by itself does not reveal any information about the reconstruction observation y, and thus H(Y|M t ) = 1.
where (a) follows from (23), (b) follows from the definition of the helper data (18), and in (c) we use the fact that the key and codebit c is generated independently from the SRAM-PUF observations, and furthermore we use the symmetry property for the conditional distribution (8) that we have derived for symmetric SRAM-PUF in Section 2. In the above derivation we use that which can be derived from (19) by observing that where (a) follows from the data processing inequality (see [19], Chapter 2) for the Markov Chain It follows from the derivations in Sections 4.2 and 4.3 that the MO helper data scheme achieves secret-key rate R * MO,t , where achievable secret-key rate is defined in Definition 1. Furthermore, R * MO,t = C t is equal to the secret-key capacity for t enrollment observations as given by Theorem 1, which shows that the MO helper data scheme is optimal.

Code Construction and Simulation Results
In the previous section, we have proved that secret-key capacity for t enrollment observations is achievable with the MO helper data scheme. However, the performance that is achieved in practice depends on the error-correcting code that is implemented by the encoder and decoder function. Here, performance is evaluated in terms of secret-key rate (R) and reconstruction error probability (FER). Until now, we have not specified an error-correcting code. Please note that a Soft-Decision decoder should be used that can benefit from the reliability information of the SRAM cells. Furthermore, we are looking for codes that have a low rate and that perform well for relatively short blocklengths (128 key bits). Here, we use an off-the-shelf LDPC code to evaluate the expected performance of the scheme in a practical setting.

Encoder
First, a key s of 128 bits length is generated uniformly at random. The secret-key is then encoded using an error-correcting code, resulting in codeword c n . As the error-correcting code we use a CRC (cyclic-redundancy check) code concatenated with an LDPC code as defined in the 5G NR (fifth generation new radio) standard [26]. For our simulations, we use the LDPC coding-chain implementation from the MATLAB 5G Toolbox [27]. The codeword c n is XORed with t SRAM-PUF observation vectors of length n and the results are added together, resulting in helper data m n t . This concludes the encoder part of the scheme, where the helper data m n t is stored for later usage and all other variables are discarded. The secret-key rate of our implementation is R = 128 n key bits per SRAM cell.
Furthermore, for t enrollment observations of n SRAM cells the required number of bits for storage of the helper data are n log 2 (t + 1) , and thus the helper data rate is R hd = log 2 (t + 1) R helper data bits per key bit.

Decoder
The decoder uses an SRAM-PUF observation y n and the previously stored helper data m n t to reconstruct the keyŝ. First, the log-likelihood ratios (LLR) of the received code bits are calculated. Since the SRAM cells are independently distributed, we can calculate the LLR of each bit separately. The LLR of a code bit c, after observing the corresponding helper data m t and SRAM-PUF observation y, is LLR(y, m t ) = log π t+1 (y + m t ) π t+1 (y + t − m t ) .
See the Appendix A.1 for a derivation of the above equation. Please note that there are 2(t + 1) possible combinations of (y, m t ). However, due to the symmetry properties of the LLR function (see Appendix A.1), we only need to store LLRs in a look-up table. Finally, the LDPC decoder uses an iterative Soft-Decision decoder (belief propagation algorithm), combined with CRC for error detection, to reconstruct the 128 bit secretŝ from the received LLRs.

Simulation Results
We have simulated the MO helper data scheme using Monte Carlo simulations and the statistical model for SRAM PUFs that was presented in Section 2, with λ 1 = 0.51 and λ 2 = 0 and average error probability ψ ≈ 0.15. We plot the resulting error probability of the key reconstruction (FER) for various key rates R and number of enrollment observations t in Figure 6 Observe that the error probability can be reduced by decreasing the secret-key rate R and by increasing the number of enrollment observations t. We choose FER ≤ 10 −6 as the target error probability and find that R = 1/6 is the best (highest) achievable secret-key rate that achieves the target FER. Furthermore, at least 5 enrollment observations are required to achieve the target FER with this key rate, and the corresponding helper data rate is R hd ≈ 15.5. Another rate-enrollment pair that achieves the target FER is R = 1/7 for t = 3 enrollment observations, with corresponding helper data rate R hd ≈ 14. Please note that the classic single enrollment scheme (corresponding to MO scheme for t = 1) achieves the target FER for R = 1/11 or worse key rates. Furthermore, the helper data rate R hd = 11 bits per key bit in this case. Therefore, the MO helper data scheme achieves a secret-key rate that is 11/6 ≈ 1.8 times higher than the single enrollment scheme. However, the improved secret-key rate comes at a cost since the helper data rate is 15.5 11 ≈ 1.4 times as high in comparison to the classic scheme.
We conclude that a trade-off must be considered between the required enrollment time, helper data storage, and the number of SRAM cells, when selecting the parameters (enrollment observations and secret-key rate) for a given setting.

The Soft-Decision Helper Data Scheme
The performance increase of the MO helper data scheme, regarding the traditional single enrollment scheme, is mostly due to the fact that it can distinguish between reliable and unreliable SRAM cells. A well-known helper data scheme that also considers the reliability information of the SRAM cells is the Soft-Decision scheme introduced by Maes et al. [7]. In this section, we first describe the SD scheme and derive its achievable performance. We note that the SD scheme assumes that the one-probabilities of the SRAM cells are observable which in practice is often not the case. Therefore, a pre-processing step is required that estimates the one-probabilities of the SRAM cells. Based on this observation, we propose a variation on the SD scheme that instead directly considers the binary SRAM-PUF observations as an input. We show that the newly proposed strategy results in the same decoder LLRs as the MO scheme (for equal key and SRAM observations). This implies that both schemes achieve the same reconstruction performance, and thus, since the MO scheme is provably optimal, the newly proposed strategy for the binary SD scheme is optimal as well.
To the best of our knowledge, we are the first to propose an optimal strategy for the SD scheme with binary enrollment observations. Furthermore, we show that very few observations (less than 10) are sufficient to achieve an acceptable performance, whereas in the literature 64 observations are used [8] for the same statistical model and parameter settings that are used in the current work.

Description of (Regular) SD Helper Data Scheme
The SD helper data scheme, see Figure 7, observes the one-probability vector θ n of all the cells, and derives the dominant values u n , defined as and the error probability ψ n (regarding the dominant value of the cell), defined as with ψ ∈ [0, 1/2]. As in the MO helper data scheme, a key is generated uniformly at random, and encoded using a linear error-correcting code into a codeword c n . Then the codeword is XORed with the dominant values u n to create the helper data sequence Besides the helper data sequence w n , also the reliability values ψ n of the SRAM cells are stored publicly. For reconstruction, another observation y n of the SRAM-PUF is XORed with the helper data, resulting in a noisy codeword r n = w n ⊕ y n = c n ⊕ (u n ⊕ y n ), where (u n ⊕ y n ) is an error vector, and an error (r j = c j ⊕ 1) occurs when the reconstruction observation y j is flipped regarding the dominant value u j of a cell. The decoder reconstructs the keyŝ based on the received noisy codeword r n and the error probabilities of the cells ψ n .

Achievable Performance
We are interested in the maximum achievable secret-key rate of the SD scheme. As for the MO scheme (Section 4.2) we should first show that the SD scheme is secure, i.e., the uniformity condition of Definition 1 is satisfied since the key is generated uniformly at random. Furthermore, the leakage condition is satisfied as I(S; W n , Ψ n ) = 0 as is shown in Appendix A.2. Now, we can derive the achievable rate over the channel from the encoder to the decoder. The channel is described by the conditional distribution Pr(R = r, Ψ = ψ|C = c), and it follows from the channel coding theorem (see, e.g., [20], Chapter 3) that the maximum achievable rate is given by the mutual information maximized over the input distribution. Furthermore, C is uniform here, since we are using a linear code. We therefore evaluate this mutual information next In (a) we used the fact that the key is generated independently from the SRAM-PUF observations and thus I(C; Ψ) = 0. In (b) we used the fact that r = c ⊕ (u ⊕ y) so the channel from encoder to decoder can be modeled as a binary symmetric channel with cross-over probability ψ (the probability that u = y). Please note that R * SD,∞ = C ∞ (for symmetric SRAM-PUF) this rate is equal to the limit (for t to infinity) of the secret-key capacity for multiple enrollment observations, see (16). Therefore, the SD helper data scheme is optimal, and furthermore the MO helper data scheme approaches the same key rate as the SD helper data scheme when sufficient enrollment observations are used.

New SD Strategy for Binary Enrollment Observations
In practice, the one-probabilities θ n are non-observable and therefore, they must be estimated based on the binary observation vectors x n 1 , x n 2 , . . . , x n t before the SD scheme can be applied. Instead of applying a pre-processing step for estimation, we propose to adjust the SD scheme s.t. the dominant values and error probabilities are directly estimated based on t binary observations. The dominant value of an SRAM cell is then estimated aŝ And furthermore, the error probability (regarding the dominant value) iŝ with In Appendix A.3 we show that (36) indeed is equal to the error probability (regarding the dominant value) of a cell given the t binary enrollment observations.
The helper data and noisy codeword are constructed as before, so w n = c n ⊕û n t and r n = c n ⊕ (û n t ⊕ y n ). The decoder reconstructs the keyŝ based on the received noisy codeword r n and the error probabilities of the cellsψ n t .

Achievable Performance
We are interested in the maximum achievable secret-key rate of the binary SD scheme. As before we should first show that the scheme is secure, i.e., the uniformity condition of Definition 1 is satisfied since the key is generated uniformly at random. Furthermore, the leakage condition is satisfied as I(S; W n ,Ψ n t ) = 0 as is shown in Appendix A.4. Now, we can derive the achievable rate over the channel from the encoder to the decoder. The channel is described by the conditional distribution Pr R = r,Ψ t =ψ t |C = c , and it follows from the channel coding theorem (see, e.g., [20], Chapter 3) that the maximum achievable rate is given by the mutual information maximized over the input distribution. Furthermore, C is uniform here, since we are using a linear code. We therefore evaluate this mutual information next . . , X t ) = I(Y; X 1 , X 2 , . . . , X t ). In (a) we used the fact that the key is generated independently from the SRAM-PUF observations and thus I C;Ψ t = 0. In (b) we used that r = c ⊕ (û t ⊕ y) and furthermore the key (and codebit) is generated uniformly and independently from the SRAM-PUF observations (and thus also independently fromψ t ). Furthermore, (c) follows by the Markov Chain and (d) holds since (by (18)) w H (x 1 , x 2 , . . . , x t ) is uniquely determined by the tuple (m SD t ,û t ) and vice-versa. Finally, (e) follows from the fact that the symmetric SRAM-PUF is unbiased (5), and by the fact that the conditional probability of the next observation given t previous observations (x 1 , x 2 , . . . , x t ) only depends on the number of observed ones (7).
It follows from the derivations above that the binary SD scheme achieves secretkey rate R * SD,t , where achievable secret-key rate is defined in Definition 1. Furthermore, R * SD,t = C t is equal to the secret-key capacity for t enrollment observations as given by Theorem 1, which shows that the binary SD scheme is optimal.

Code Construction and Simulations
As with the MO scheme we can now define a code construction, with a uniformly generated secret s, an encoder based on an off-the-shelf LDPC code, and where the helper data w n and reliability informationψ n t are constructed as explained in Section 6.3. The decoder observes r n and the error probabilitiesψ t which can be used to calculate the log-likelihood ratios needed for reconstruction of the secret. Since the SRAM cells are independently distributed, we can calculate the LLR of each bit separately. The LLR for a codebit c, after observing the noisy codebit r and the error probabilityψ t for t enrollment observations, is See Appendix A.5 for a derivation. It is shown in Appendix A.5 that the LLRs for the SD scheme are equal to the LLRs for the MO scheme when the same enrollment observations (x 1 , . . . , x t ) and reconstruction observation y are generated by the SRAM-PUF. Therefore, simulations for the SD scheme would give the same FER results as for the MO scheme, and we can use the plots in Figure 6 to predict the performance of the SD scheme.

Sequential MO Helper Data Scheme
We have seen in Section 5 that a better performance (smaller reconstruction error probability) is often achieved when the number of enrollment observations is increased. However, each enrollment observation requires a full reset (power off and power up) of the SRAM-PUF. Therefore, considering more enrollment observations results in an increased duration of the enrollment phase of the MO helper data scheme. This may be undesirable in practice as the enrollment phase must be completed in a secure environment and limited time is available.
In this section, we present a variation on the MO helper data scheme that we call the Sequential Multiple-Observations (SMO) helper data scheme. We exploit the fact that a new SRAM-PUF observation is required for each key reconstruction, and, after the reconstruction, we use this additional observation to update the helper data. As a result, the helper data are updated (sequentially) after each reconstruction, and thus the reconstruction error rate is reduced over time. The SMO scheme improves the efficiency regarding the MO scheme, since observations are used both for reconstruction and enrollment. Furthermore, the SMO scheme enables the usage of a reduced number of observations during the enrollment phase. The reduced number of enrollment observations comes at an initial cost of larger reconstruction error probability (FER); however, this is quickly reduced (see Figure 8) during the lifetime of the device.

Description
The enrollment phase of the Sequential MO helper data scheme is the same as for the MO helper data scheme, i.e., a key s is generated uniformly at random, and the helper data are constructed according to (18), with t = t 0 enrollment observations. Therefore, after the initial enrollment, a helper data m n t 0 is stored that is based on t 0 enrollment observations. The reconstruction phase, however, is different than before and it is visualized in Figure 8.
In the reconstruction phase of the SMO scheme, the helper data m n t (corresponding to t SRAM-PUF observations) and an SRAM-PUF observation y n are used to reconstruct the keyŝ. If reconstruction is successful (We assume that correctness of the key can be verified, for example, by feedback from other protocols that apply the key for decryption, or by a failed authentication.), the SRAM-PUF observation is XORed with the encoded secret and is added to the existing helper data, which results in a new (improved) helper data corresponding to t + 1 SRAM-PUF observations. If, on the other hand, reconstruction has failed, it is not possible to reconstruct the original codeword c n which is required for the helper data update. Therefore, the helper data remains the same as before in this case.

Security and Achievable Rate
The main difference between the Sequential MO helper data scheme and the original MO helper data scheme is that the helper data are now updated sequentially in the field. Therefore, an attacker may have access to multiple helper data m n 1 , m n 2 , . . . , m n t , instead of only the final helper data m n t . However, despite the reveal of multiple helper data, we can show that zero-secrecy leakage is still guaranteed. In particular, a similar Markov Chain as defined in (20) holds in this case S ↔ C n ↔ (W n 1 , W n 2 , . . . , W n t ) ↔ (M n 1 , M n 2 , . . . , M n t ).
Therefore, we can repeat the same derivations as in Section 4.2 to show that I(S; M n 1 , M n 2 , . . . , M n t ) ≤ I(C n ; W n 1 , W n 2 , . . . , W n t ) = 0, and thus, there is no information leakage about the key by all the revealed helper data. Please note that only the helper data are updated, whereas the key that has been generated during the enrollment phase remains the same. Since the key should be reconstructable already after the initial enrollment (since successful reconstruction is required for each update), the maximum achievable rate of the Sequential MO scheme is limited by the number of observations t 0 that is used for the initial enrollment, i.e., Nevertheless, as we show in the next subsection, the main advantage of the SMO scheme is that the reconstruction observations are exploited to improve the reconstruction error probability over time.

Simulation Results
We evaluate the performance of the Sequential MO helper data scheme through Monte Carlo simulations. The encoder and decoder constructions and LLR calculations are similar to the construction presented in Section 5. We evaluate two key rates, R = 1/6 and R = 1/7 with t 0 = 4 and t 0 = 2 enrollment observations, respectively. We have chosen the number of enrollment observations s.t. the achieved initial error probability is larger than, but close to, the target FER < 10 −6 (based on the previous results in Figure 6). We simulate 4 consecutive reconstruction attempts of the SMO scheme and plot the FER as function of the number of reconstruction attempts (time) in Figure 9. Please note that the helper data are only updated when the reconstruction attempt is successful. Therefore, the 'reconstructions' in the current simulation are different from 't enrollment observations' plotted on the x-axis in Figure 9.
First, we can directly observe that the FER reduces with each reconstruction attempt. Therefore, indeed the SMO scheme supports an improved reconstruction performance over the lifetime of the devices. More specifically, for the simulated rates and selected number of enrollment observations, we see that the average error probability already reduces below the required threshold FER < 10 −6 after the first reconstruction attempt. Therefore, in both cases, the initial cost (larger FER) of reducing the number of enrollment observations by one, was already nullified after one reconstruction.

Conclusions
We have presented the Multiple-Observations (MO) helper data scheme for binding a secret key to multiple observations of an SRAM-PUF. We have shown that the MO helper data scheme can achieve secret-key capacity corresponding to t enrollment observations, and therefore the scheme is optimal in information-theoretic sense. Furthermore, we have evaluated performance of the scheme with Monte Carlo simulations for a standard statistical model for SRAM PUFs with average error probability ψ ≈ 0.15. Secret-key rate R = 1/6 is sufficient to achieve FER ≤ 10 −6 after t = 5 enrollment observations. This is a key rate that is 11/6 ≈ 1.8 times higher (better) than for the single enrollment scheme with comparable FER.
The MO helper data scheme is very similar to the Soft-Decision (SD) helper data scheme; however, the SD scheme assumes one-probabilities as an input, which in practice are non-observable. Therefore, we proposed a new strategy that considers binary observations instead. We have shown that this new strategy is optimal and achieves the same reconstruction performance as the MO scheme.
We have introduced a variation on the MO scheme, which we call the Sequential Multiple-Observations helper data scheme. The scheme supports a sequential update of the helper data after each successful reconstruction of the key, resulting in a reduced FER over the lifetime of the device. The SMO scheme enables the usage of less enrollment observations, by accepting a (slightly) worse initial FER that is quickly improved upon.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: We give the derivation that leads to (28) the log-likelihood ratio for a codebit c, after observing the current observation y and the helper data m t for t enrollment observations. We start by deriving the likelihood for c = 0, where (a) follows from the definition of the helper data (18), and in (b) we used the fact that the code bit c is generated independently from the SRAM observations. We can repeat a similar derivation to find the likelihood for c = 1, and the log-likelihood ratio is then LLR(y, m t ) ∆ = log Pr( Y=y,M t =m t |C=0) Pr( Y=y,M t =m t |C=1) = log π t+1 (y+m t ) π t+1 (y+t−m t ) . (A3) Please note that we can derive the following two symmetry properties for the LLR function LLR(y, m t ) = − log π t+1 (y+t−m t ) π t+1 (y+m t ) = −LLR(y, t − m t ), LLR(y, m t ) (a) = log π t+1 (t+1−y−m t ) π t+1 (t+1−y−t+m t ) = log π t+1 ((y⊕1)+t−m t ) π t+1 ((y⊕1)+m t ) = LLR(y ⊕ 1, t − m t ), where (a) follows from (7) which follows from the symmetry assumption for SRAM PUFs.
First, we can express the log-likelihood ratio (A14) as a function of m SD t LLR' SD (r, m SD t ) Furthermore, m t if c = 0 and w H (x 1 , . . . , x t ) ≤ t/2, or if c = 1 and w H (x 1 , . . . , x t ) > t/2, t − m t otherwise, where in the last step we used the definition ofû t (29) and the fact that r = c ⊕û t ⊕ y.
Combining (A15) and (A16) we find that the log-likelihood ratio for the SD scheme is if r = y ⊕ 1, = log π t+1 (y+m t ) π t (y+t−m t ) . (A17) In the last step we used (7). Now the last expression in (A17) is equal to the log-likelihood ratio for the MO helper data scheme (28). Therefore, we conclude that the LLRs of both schemes are equal when the same enrollment observations (x 1 , . . . , x t ), codebit c and reconstruction observation y are used.