1. Introduction
Security is a key requirement for electronic devices in many applications. To identify and protect intellectual property, electronic components, and systems from manipulation, counterfeiting, or other harm, physical unclonable functions (PUFs) have been proposed over the last two decades. They are memory-free, low-power security primitives that generate random but unique hardware fingerprints, which can be used to identify hardware devices or for more advanced encryption applications such as those in [1, 2]. The PUF uniqueness depends only on small manufacturing variations, which make its responses ideally unclonable and unpredictable, even by the manufacturer. Among many other parameters [3, 4], two key requirements for PUFs are the stability of their response and a short readout time.
PUFs have been proposed in various architectures, both for implementation as integrated circuits (ICs) [5, 6, 7, 8] and on field-programmable gate arrays (FPGAs) [9, 10, 11, 12, 13]. Whereas IC-based PUFs offer a larger design space, FPGA-based designs offer other distinct advantages: through reprogramming, the FPGA can serve both its functional purpose and as the source of entropy of the PUF, and FPGAs are already present in many state-of-the-art (SotA) electronic systems and are thus the more cost-effective solution in many applications.
A well-studied SotA implementation of an FPGA-based PUF is the arbiter-PUF, as indicated in
Figure 1a [
10,
14]. The source of entropy is the delay difference between two parallel delay paths; the resulting phase shift of a triggered rising edge through the two paths is detected by an arbiter, effectively a D-flip-flop (D-FF) with one path entering its D-input and the other used as the clock input; the binary arbiter decision realizes a 1-bit output. The delay paths consist of switchable delay elements (SDEs), which can be swapped using multiplexers controlled by an input challenge vector. This alters the propagation delays depending on the challenge, which results in a challenge-dependent phase shift and arbiter response. Thus, a multitude of challenge–response pairs (CRPs) is generated, which realizes a strong PUF [
3]. Without the challenge and reconfiguration of the delay lines, every arbiter generates exactly one bit, and a weak PUF is realized. In [
15], such a compact design was presented. Overall, the arbiter-PUF is very simple and has a fast response time, but it is very sensitive to noise because small phase shifts in particular result in noisy, unstable arbiter decisions. Moreover, when implemented as a strong PUF, it is also known to be easy to attack [
16]. Ref. [
10] reported a bit error rate (BER) of
, while the evaluation time is in the picoseconds range.
Another SotA FPGA-based PUF is the ring oscillator (RO)-PUF, as illustrated in
Figure 1b [
9,
13,
17]. Herein, the source of entropy is again a chain of delay elements, which by positive feedback result in an oscillation frequency of an RO; by comparing the oscillation frequencies of two (adjacent) ROs, a response bit is obtained. In [
18], guidelines for the design of efficient RO-PUFs are provided. The SDEs of the ROs can also be reconfigured upon a challenge input vector generating CRPs [
19]. RO-PUFs offer much better stability compared to arbiter-PUFs as they average the noise by their longer oscillation times. Unfortunately, this comes at the cost of longer readout times [
20]. For example, Ref. [
19] reduced the BER to be <
, while one evaluation required
. Also, they are prone to (even contactless) side-channel attacks due to the emission of RF spurs at their self-oscillation frequencies [
20]. Another known variation of the RO-PUF is the TERO-PUF [
21]. Instead of using two parallel ROs, they are connected in one chain to prevent locking phenomena. In [
22], a compact variant, using just two inverters and D-latches, was proposed. Another common PUF approach on FPGAs is the SRAM-based PUF [
23,
24]. The bistable state of a non-written SRAM-cell decides on either 0 or 1 shortly after start-up. Unfortunately, the native SRAM PUF has a high error rate [
23]. The bistable ring (BR)-PUF also evaluates the final state of several inverters arranged in a ring structure after release from a bistable state, like a power-up but offering a challenge input [
11].
The instability in PUF responses due to noise, but also due to environmental changes like supply or temperature, has to be corrected for as, otherwise, authentication, fingerprint generation, or any cryptographic application would become impossible. Various methods are employed in the SotA to stabilize PUF responses. Bit-masking is used to store the locations of unstable bits and to blank them for reproducing the PUF response [
5,
6]; as a drawback, the unstable bits must first be found by (factory) measurement in an enrollment phase, and their positions can change with environmental conditions such as temperature, which makes the task complex and costly. IC-based PUFs have also used hardening techniques, e.g., in IC-based SRAM PUFs [
24]; this is not applicable in arbiter-, RO-, or generally in FPGA-PUFs. For the remaining PUF errors, digital error correction can be used [
25,
26]. The remaining random bit-flips of the PUF response are treated like noise in a communication channel, which are then corrected using helper data and error correction codes (ECCs). Obviously, the more errors that need to be corrected, the lower the code rate (number of useful bits for a given number of raw bits) and the more the complexity of the helper data and error correction algorithms increases. This helper data needs additional costly permanent memory, motivating a reduction in the amount of necessary data. Furthermore, the helper data can leak information about the secret PUF information to an attacker. Thus, secure schemes to store helper data have to be implemented, as investigated in [
25,
26,
27].
Consequently, there is demand for more robust PUF architectures and improved PUF stabilization techniques, which ideally include the possibility to automatically detect unstable bits, so that the PUF response is as close to ideal as possible before digital error correction. One way to achieve this is the so-called eye-opening arbiter (EoA)-PUF [
28]. It operates as a mixture between the arbiter- and ring oscillator-PUFs and exploits phase difference integration by enabling oscillation until a predefined phase difference threshold (deadzone) is exceeded; this is illustrated in
Figure 1c. Thus, it promises both fast and highly robust readouts.
In this paper, we present the first actual FPGA implementation of the EoA-PUF and validate its principles by measurement. Several tuning algorithms are proposed to remove routing-based response bias, which is—in contrast to IC-based implementations—an intrinsic non-ideality in FPGA-based designs. The paper is organized as follows:
Section 2 reviews the basic operation of the EoA-PUF.
Section 3 describes the FPGA-specific implementation of the EoA-PUF and a tuning method to overcome routing-induced bias. The PUF uniqueness is examined in
Section 4, and, in
Section 5, the adjustment of the deadzone and resulting BERs are investigated.
Section 6 summarizes our work, and
Section 7 concludes the paper.
4. PUF Uniqueness Evaluation and Mean Frequency Tuning
Section 2 has introduced the EoA-PUF design, has derived ways to largely eliminate bias from deterministic routing mismatches in the SDEs by design and in the arbiter by PDL tuning, and has explained how the PDL can further be used to adjust the deadzone and thereby trade off slower, robust decisions against faster, noisy ones. Next, we evaluate the PUF responses for uniformity, uniqueness, and randomness. A fixed tuning configuration per PUF instance, determined by the previously described algorithm, is used for all evaluations.
Further metrics, such as NIST 800-90B and model-building resistance, have not been evaluated because a weak PUF implementation cannot extract enough bits to yield significant results. In addition, the applied metrics are used in other publications, allowing for direct comparison.
(1) The PUF uniformity is investigated by checking for an equal distribution of extracted bits observed when all challenges are applied to a single PUF instance. (2) Then, the uniqueness requirement has to be evaluated by comparing one PUF-bit derived from one challenge with all other PUF-bits derived from the same challenge with all PUF instances and FPGAs. Also, the extracted bits must be uncorrelated to fulfill randomness when (3) all challenges are applied to a single PUF-cell, or when (4) a single challenge is applied to all PUF instances. If one of these conditions is not fulfilled, an attacker can easily guess the response.
For evaluation, a total of 300 PUF instances were set up with
challenge bits among 18 different FPGA boards. Hadamard codewords are used as challenges, which are linearly independent and therefore harden the PUF against machine learning attacks as the usable information from one codeword to guess the next is minimized [
20,
31]. Hadamard codewords have a length of powers of two and are constructed such that all codewords have the same Hamming distance (HD) to all other codewords, making them linearly independent. We excluded the all-one codeword such that all codewords have the same number of ones and zeros, resulting in 63 different codewords. Applying these in a constant sequence to the PUF, each revealing one bit, results in a weak PUF with 63 bits and no external challenge input exposed to an attacker. An evaluation of the strong PUF configuration and its resistance to machine learning attacks is left for future work.
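The challenge set described above can be sketched in a few lines of Python. This is only an illustration: the Sylvester construction and the codeword length of 64 are assumptions (the length is inferred from the 63 remaining codewords), and the function names are hypothetical rather than taken from the presented implementation.

```python
def sylvester_hadamard(order):
    """Build a Hadamard matrix with entries +1/-1 by the Sylvester construction.

    `order` must be a power of two (assumption of this sketch).
    """
    h = [[1]]
    while len(h) < order:
        # H_2n = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def hadamard_challenges(order=64):
    """Map +1 -> 1 and -1 -> 0, then drop the all-one codeword."""
    codewords = [[1 if x == 1 else 0 for x in row] for row in sylvester_hadamard(order)]
    return [cw for cw in codewords if not all(cw)]


challenges = hadamard_challenges(64)
```

With this construction, every remaining codeword has equally many ones and zeros, and any two codewords differ in exactly half of their positions.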
In order to validate the above-mentioned uniformity requirement (1), the condition
is calculated for all bits extracted per PUF [
32]. With
, the amount of responses resulting in 1 is rated against the total amount of responses
R. An equal distribution of 0 and 1 results in a
=
, while
tending to
correlates with a bias to 0/1 as all
are either 0 or 1 independent of the challenge.
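As a minimal sketch of the uniformity check, assuming uniformity is simply the fraction of responses equal to 1 among all responses of one PUF instance (the function name is hypothetical):

```python
def uniformity(responses):
    """Fraction of 1-responses among all responses of one PUF instance.

    0.5 indicates an equal distribution of 0 and 1; values near 0.0 or 1.0
    indicate a bias toward 0 or 1, respectively.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


balanced = uniformity([0, 1, 1, 0])
biased = uniformity([1, 1, 1, 1])
```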
Since routing-induced bias was eliminated in the source of entropy by design and in the arbiter by tuning using the PDL approach, a
=
was expected. Surprisingly, during evaluation, a
close to
was measured in many PUF instances. To debug this unexpected behavior, counters were added to the RO outputs to measure their actual frequencies. A measured change in RO-frequencies over challenges for a single exemplary EoA-PUF instance is depicted in
Figure 8.
As expected, each challenge vector changes the frequency of an individual RO in random directions with random magnitude. However, the difference in the mean frequencies of the two competing ROs, indicated by the dashed lines, is larger than the average frequency change caused by the challenge vector and thus the mismatch of individual SDEs. As the EoA-PUF response is derived from phase differences, which are integrated from the actual RO-frequencies, the EoA-PUF response is constant for all challenges in the shown exemplary PUF instance. Re-arranging and re-routing of the PUF did not help to reduce this bias in the challenged PUF response. Since the FPGA vendor does not provide detailed schematics of the internal architecture, it was also not possible to identify the reason for the residual bias in the ROs.
To overcome the partially too large mean frequency difference in RO
A vs. RO
B, an additional 20-bit PDL was added to the oscillating path of the ROs. By assigning these PDLs with different unary-coded vectors, the frequencies can be shifted, minimizing the frequency difference, similar to the arbiter tuning. The result for the same two ROs as in
Figure 8 is shown in
Figure 9. Due to the additional LUTs in the ROs, their mean frequencies decrease compared to
Figure 8, but, most importantly, their random variation over different challenges is now distributed around the same mean. Even though an individual offset compensation must be found for every PUF instance, advantageously, these individual compensations remained stable over many evaluation cycles and can thus be stored as constant helper data. Since the PDL input vectors are thermometer-coded, a total of 10 bits are required per PUF instance. If these values are tampered with, the mean frequencies of the ROs will diverge again, which will result in a biased key and compromise the following application, similar to the insertion of a wrong key.
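The helper-data size quoted above follows from the unary coding: a 20-bit thermometer vector has only 21 distinct settings, so each setting fits into 5 stored bits, and two PDLs (one per RO) need 10 bits. The sketch below illustrates this under the assumption that the active bits are packed at the start of the vector; the function names are hypothetical.

```python
def thermometer_encode(value, length=20):
    """Expand a tuning value 0..length into a unary (thermometer) vector."""
    if not 0 <= value <= length:
        raise ValueError("value out of range")
    # Assumption: the ones are placed first, followed by zeros.
    return [1] * value + [0] * (length - value)


def thermometer_decode(vector):
    """Recover the tuning value from a thermometer vector."""
    return sum(vector)


def storage_bits(length=20):
    """Bits needed to store one tuning value in the range 0..length."""
    return length.bit_length()


helper_bits_per_instance = 2 * storage_bits(20)  # one tuning PDL per RO
```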
With this PDL-based RO-bias compensation, the
was re-evaluated for all PUF instances over all CRPs. The resulting distribution is plotted in
Figure 10 in blue. The mean is at
with a standard deviation of ≈4%, which is comparable to the SotA [
10,
11,
23]. Note that the choice of the tuning-PDL length depends on the expected variation in the employed FPGA and its LUTs. During our experiments, we evaluated tuning-PDLs of varying lengths, and 20 bits was found to be sufficient to tune all evaluated PUF instances. Changing the FPGA type would require either vendor information on the statistical variation or a re-evaluation by experiment.
After removing the bias in the individual PUF instance, the above-introduced uniqueness requirement (2) is evaluated by computing the HD
inter [
3,
11]
HD
inter generally calculates for each bit
of a vector with length
N the relative HD to all other bits
in the vector. An HD
inter close to
indicates a unique distribution of the underlying bits, while HD
inter close to
indicates that all bits result in the same response. HD
inter was then calculated by comparing every bit of a PUF instance with all other bits extracted with the same challenge of all other PUF instances, and this again along all FPGAs. The resulting distribution is plotted in orange in
Figure 10. The mean value is
with a standard deviation of
, which fulfills the uniqueness requirement.
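The verbal definition of the inter-Hamming distance can be sketched as follows. The code assumes that, for each bit of a vector, the relative HD is the fraction of all other bits in the vector that differ from it; the function name is hypothetical, and the vector would hold the bits that all PUF instances return for one challenge.

```python
def hd_inter(bits):
    """Relative Hamming distance of every bit to all other bits in the vector."""
    n = len(bits)
    if n < 2:
        raise ValueError("at least two bits are required")
    return [
        sum(bits[i] != bits[j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


identical = hd_inter([0, 0, 0, 0])   # all instances agree
mixed = hd_inter([0, 1, 0, 1])       # half of the instances return 1
```

For a large vector with equally many zeros and ones, the values approach 50%.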
To validate the earlier listed requirements of uncorrelated PUF responses (3) and (4), the correlation coefficient of every PUF to all other PUFs is evaluated. Thereby, the correlation coefficient ranges between
and represents the relation of two vectors: a value of
implies identity or inversion, while non-related vectors have a correlation coefficient of 0. The distributions of the correlation coefficients of (3) all CRPs on a single PUF instance and (4) all PUF instances per single challenge are plotted in
Figure 11. Both are arranged around 0.
In conclusion, the presented FPGA implementation of the EoA-PUF can be considered unique and bias-free.
5. PUF Stability Evaluation and Automatic Detection of Unstable Bits
After having evaluated the uniqueness, the stability of the PUF responses over noise and environmental conditions is assessed. As discussed in
Section 2, the distinct advantage of the EoA-PUF is the deadzone, which enables making arbiter decisions as soon as the arbiter/RO phase difference is large enough and thus stable to be decided. Moreover, as discussed with
Figure 7, the PDLs used for arbiter bias tuning also allow trading robustness of the arbiter decision against the number of required RO oscillations by adjusting the deadzone. This is because the stability of a single PUF instance’s readout depends on the phase difference
seen by the arbiter at the time of decision-making. Since small
have a high chance to be influenced by noise, a larger deadzone makes this decision more robust.
As seen in
Figure 7, the PDLs should tune the arbiter along the unbiased decision line, close to the switching region for sensitive or farther away for robust decisions. PDL
ac and PDL
bc were used to tune this sensitivity against robustness, whereas PDL
ad and PDL
bd were then used to set the deadzone along the unbiased decision line. As the PDLs for arbiter tuning are limited to 20 bits in unary code (see
Section 3.2), 21 different deadzones can be adjusted and are considered in the following. For simplicity, we call this a deadzone configuration
dz. For example, a
dz of 2 bits sets the inputs of two H-SDEs in PDL
ad and PDL
bd to 1, while the others are set to 0. Based on the average delay per H-SDE, each bit of the deadzone configuration will increase the deadzone by
ps
ps as the deadzone covers the negative and positive phase differences (see
Figure 2).
5.1. Stability Evaluation
To evaluate the robustness of PUF readouts, each challenge is applied
times to each PUF instance and then compared to a
reference response Rref, for which we have chosen the median of all readouts at the largest deadzone. The stability is evaluated by computing the error rate via the intra-Hamming distance (HD
intra)
for each CRP [
3,
11]. An HD
intra of 0% means that all readouts were the same, whereas an HD
intra of 100% means all readouts differ from the reference. The HD
intra, averaged over all challenges, is plotted over varying
dz in
Figure 12 as the blue line. If only the intrinsic deadzone of the arbiter is used and no additional deadzone (
) is adjusted by the PDLs, the HD
intra is ≈10%. For an increasing deadzone by PDL, an HD
intra of ≈2% can be achieved, but not less.
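A minimal sketch of the stability metric for one CRP, assuming the error rate is the fraction of repeated readouts that differ from the reference bit. The function names are hypothetical, and `median_low` is only one possible way to resolve ties for an even number of readouts.

```python
from statistics import median_low


def reference_bit(readouts):
    """Median of repeated binary readouts (the majority value for odd counts)."""
    return median_low(readouts)


def hd_intra(readouts, reference):
    """Fraction of readouts that differ from the reference bit."""
    return sum(r != reference for r in readouts) / len(readouts)


readouts = [1, 1, 0, 1, 1]
ref = reference_bit(readouts)
error_rate = hd_intra(readouts, ref)
```

In the evaluation described above, the reference is taken at the largest deadzone, while the readouts under test may come from any deadzone configuration.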
Close investigation revealed that the HD
intra was largely dominated by a limited number of very unstable bits on a limited number of PUF instances. Such unstable bits are highlighted for the exemplarily shown PUF instance in
Figure 9 with red circles. There, it can be seen that RO
A and RO
B have almost identical frequencies for some challenges and thus remain indistinguishable for a long time. The experiment on this particular instance revealed that their frequency difference is ≤
with a standard deviation between
and
. For such challenges, the impact of
compared to noise is very small. For reliable bit extraction, a large deadzone would be required together with a very large evaluation time to allow the tiny frequency difference to be integrated to a sufficiently large phase difference.
Hence, such unstable bits can be blanked to exclude them from the final key. Blanking is a known technique from the SotA [
5,
6], and, usually, the unstable bits, which need to be blanked, have to be found by time-consuming repetitive measurements. The unique feature of the EoA-PUF is the possible auto-detection of unstable bits, which will be explained in more detail in
Section 5.2. After blanking of the unstable bits, the averaged error rate of all PUFs decreased to <
when using the largest
, which is shown in
Figure 12 with the green solid line. Also, the much larger error rate of the unstable bits is shown in red for completeness’ sake. In addition,
Figure 13 shows the relation of stable and unstable bits per FPGA-board. As can be seen, a minimum of
of the bits are stable per evaluated FPGA.
For other than the largest deadzone, it can be seen in
Figure 12 that the error rate decreases with increasing
. With a deadzone configuration greater than
s, the error rate starts flattening. This can be explained by the fact that the minimal required phase difference to decide is already significantly larger than the influence of noise, so any further increase in the deadzone brings only small improvements. This is comparable with the findings of [
9,
17], where RO-PUFs were investigated and where it was found that, after a certain evaluation time, no more improvements in stability could be seen.
Nevertheless, the maximum deadzone configuration of 20 bits was used for all further investigations as this value resulted in the lowest error rate.
5.2. Intrinsic Autodetection of Unstable Bits in EoA-PUF
Blanking of unstable bits is a commonly seen technique in PUF design in order to improve PUF robustness [
5,
33,
34]. This usually requires many repetitive measurements of all PUF instances and challenges and subsequently, after the statistical detection of (too) unstable responses, an exclusion of unstable bits. Detection of unstable bits is often conducted externally, requiring the responses to be shifted off-chip, which can leak the secret information through an unprotected port. Since repetitive measurements are time-consuming and detection based on statistical analysis complex, an onboard method that requires only a limited number of measurements without prior statistical knowledge would be advantageous. To the authors’ knowledge, both requirements are met by the unique features of the EoA-PUF as it allows auto-detection of unstable PUF-bits, which is explained in the following.
In the presented EoA-PUF architecture, RO pairs with small frequency differences must oscillate for more cycles than RO pairs with large frequency differences to accumulate a phase difference large enough to leave the deadzone and thus cause the arbiter to make a decision. This is the basis of the auto-detection of unstable bits: if a decision takes too long, the bit is considered unstable.
To evaluate this, we examined the number of required RO oscillations until the arbiter makes a decision. This is plotted over adjusted arbiter deadzones in
Figure 14; thereby, we separate the previously identified unstable responses from the stable ones. The number of required oscillations provides a very clear indication of whether a response is stable or not. This is an intrinsic property and advantage of the eye-opening effect of the arbiter and does not rely on the applied arbiter tuning.
By defining a threshold on the number of oscillations, all challenges that oscillated longer than this threshold until a final decision was made can be blanked in an initializing step. This can be repeated at every system start-up, which makes non-volatile memory for permanently storing a blanking matrix obsolete and is an improvement compared to the SotA. Further, it is not necessary to move the secret information, i.e., the PUF-bit responses, off the device for stability assessment. Although not evaluated in this work, these findings also apply to strong PUF implementations. Challenges leading to a reasonable phase difference will lead to fast and stable results, while the remaining ones will be detected as unstable, without the need to classify every challenge in advance.
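The blanking step can be sketched as follows, assuming the number of RO oscillations until the arbiter decision is available per challenge. The function names and the numeric values in the example are hypothetical and only illustrate the mechanism.

```python
def blanking_mask(oscillation_counts, threshold):
    """Mark a response as stable (True) if its decision needed at most `threshold` oscillations."""
    return [count <= threshold for count in oscillation_counts]


def apply_mask(bits, mask):
    """Keep only the response bits that are marked as stable."""
    return [bit for bit, stable in zip(bits, mask) if stable]


# Hypothetical start-up measurement: one oscillation count and one bit per challenge.
counts = [12, 480, 30, 7]
bits = [1, 0, 0, 1]
mask = blanking_mask(counts, threshold=100)
key_bits = apply_mask(bits, mask)
```

Because the mask is rebuilt from a single pass at start-up, nothing in this sketch needs to be stored permanently or moved off the device.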
One could argue that such an auto-detection of unstable bits should also be possible in other oscillating PUFs like the TERO-PUF. The TERO-PUF is enabled for a fixed amount of time. Stable bits collapse into a stable state during this evaluation, while unstable ones do not; this should theoretically be detectable. The difference to the proposed EoA-PUF is that the latter has the XOR-gate output as an intrinsic signal, indicating if a decision was made. For the TERO-PUF, an external analysis of the outputs must be performed. Just adding an XOR-gate to the TERO-PUF would also not solve the issue as it would already recognize small noise-induced differences. Additional noise-suppressing hardware would be necessary, which results in the idea of the eye-opening arbiter, as presented in this work.
5.3. Further Use of the Reliability Information of Bit Decisions in the EoA-PUF
In addition to the possibility to use the oscillation length as an indication for unstable bits and thus automatic mask building, the same information can be used for other advantages, like the reliability information for a soft decision error correction code presented in [
26]. Also, readouts that are not blanked as unstable bits but by chance exceed
in a single readout can be identified and re-sampled for, e.g., majority decision.
The number of oscillations can also be related to a maximum evaluation time for a response as the mean frequency of the RO is known. These frequencies vary from 13.16 MHz to 17.46 MHz for different PUF instances.
Figure 14 shows that the lower bound for the required number of oscillations is 20 for unstable bits with a deadzone of 20 bits, which is expected to have the longest evaluation time. By choosing an upper bound of allowed oscillations for a decision to be
and setting the slowest mean RO oscillation frequency to
= 13 MHz, a maximum evaluation time of
can be calculated by
s. For
, the mean amount of oscillations
can be used to approximate an average evaluation time of
s.
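The relation between oscillation count and evaluation time used above is a simple division by the RO frequency. In the sketch below, only the 13 MHz lower frequency bound is taken from the text; the oscillation count is a hypothetical placeholder, not the bound chosen in this work.

```python
def evaluation_time(oscillations, ro_frequency_hz):
    """Time in seconds that an RO at the given frequency needs for the given number of oscillations."""
    if ro_frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return oscillations / ro_frequency_hz


F_MIN_HZ = 13e6            # slowest mean RO frequency used for the bound
N_MAX_HYPOTHETICAL = 1000  # placeholder upper bound of allowed oscillations

t_max = evaluation_time(N_MAX_HYPOTHETICAL, F_MIN_HZ)
```

Using the slowest RO frequency gives a conservative (longest) time for a given oscillation bound.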
5.4. Stability Against Temperature Variations
Environmental influences are a well-known non-ideality for all ICs and FPGA designs and can change the extracted PUF responses. Therefore, the influence of temperature changes on the EoA PUF is investigated. Please note that the used ZYBO allows no direct control of the FPGA power supply; thus, voltage variations are not investigated. Four FPGAs were evaluated at varying temperatures from
°C to
°C in steps of
°C. This range was chosen to not damage the other components on the used ZYBO. The deadzone configuration was set to the maximum value of 20 bits, which corresponds to an expected deadzone of
ps = 280 ps and results in maximum stable bits. The midpoint of the evaluated temperature range was taken as a reference temperature at
°C. At every temperature step, an arbiter tuning (see
Section 3.2) was conducted at start-up. Thereafter, all PUFs were evaluated 250 times and the resulting responses compared to the golden response at the reference temperature using Equation (4). The resulting HD
intra is plotted in
Figure 15 as the green line for the EoA PUF. For the sake of comparability, a conventional SotA RO-PUF was implemented and evaluated on the same FPGAs. This was done by disabling the arbitration signal and enabling the ROs for
evaluation time; this value was extrapolated from the results of [
17], who used the same type of FPGA for RO-PUFs. By comparing the amount of oscillations of both ROs, a response bit was derived. This bit was compared to the median of all responses. The resulting HD
intra for this RO PUF is plotted as the blue line in
Figure 15. The general profile over temperature is similar, but the RO-PUF shows a distinct shift to worse stability when compared to the EoA-configuration. As the same hardware setup was used for both configurations, platform-dependent reasons for this shift can be excluded. Further investigations showed that some RO pairs just differed in a few oscillations after the evaluation time of
and therefore are sensitive to noise. Since a pure RO-PUF implementation does not have an in-built detection of unstable bits, these bits cannot be blanked at runtime but have to be determined in a post-processing step.
For further comparison of temperature dependency, results reported in the SotA for other FPGA designs are also considered. Even though the authors in [
10] presented a PDL-based arbiter-PUF on an FPGA and reported their error rate over the temperature range from
to
°C as up to
, no detailed data was provided and thus no graph was plotted. In [
35], a TERO-PUF was implemented on FPGA. The TERO-PUF consists of two ROs connected in one chain, running until a stable state is reached. The best-case error rate over temperature provided in [
35] is plotted as orange line in
Figure 15, which shows a
higher error rate at reference temperature than the presented EoA implementation of this work. In addition, the slope of the error rate is two times higher for negative temperature changes. For positive temperature changes, the slope seems to decrease. As the authors of [
35] did not address this behavior, it might also be a board-specific second-order effect, making further elaboration or comparison difficult.
In [
9], the authors evaluated a short RO-PUF, consisting of three LUTs, for temperature variations from 5 to
°C. From their data, the authors used a linear approximation of the error variation with temperature and stated
for a change of
°C. As the authors did not state an error rate at reference temperature, but only a change, no overall comparison can be made. Nevertheless, it can be stated that the slope of
of the EoA PUF and the approximation from [
9] are quite similar for small temperature changes but diverge for larger temperature changes. As the authors in [
9] already stated, the changes in frequency (and therefore also phase difference) on FPGAs are non-linear over temperature; Ref. [
36] showed that a third-order polynomial provides a better fit. Therefore, the linear approximation in [
9] is not accurate for large temperature differences and not suitable for comparison to the course of the measured results in
Figure 15.
In summary, the unique features of the proposed EoA-PUF design allow very low error rates at reference temperatures, which is based on the calibration using the PDLs as well as the automatic detection of unstable bits. Even though the positions of unstable bits and the ideal calibration/tuning setting can partially change at other temperatures, the temperature dependency of the error rate is quite comparable with other SotA implementations. Still, the features of calibration and auto-detection of unstable bits can be further exploited to reduce the bit error rate over temperature, which will be shown next.
5.5. Further Temperature Compensation
In the previous evaluation, all tuning values were kept constant. However, in the following, the possibilities of PDL-based tuning as well as the unique property of the auto-detection of unstable bits in the proposed EoA PUF are exploited with the goal to further reduce the error rate over temperature.
5.5.1. Offset Re-Calibration at Different Temperatures
A temperature variation causes the mean oscillation frequencies of the two employed ROs in the EoA PUF to shift with slightly different magnitudes [
9,
36]. This can be seen in
Figure 16 for one exemplary RO pair. Thus, the employed mean frequency offset compensation at a single reference temperature (see
Section 4) becomes less effective for large temperature changes. For better visibility, the frequency difference of both ROs is plotted in
Figure 16b. The overall drift dominates the influence of the challenge, causing some extracted bits to flip.
Since all PDLs used for offset compensation are affected by temperature in slightly different ways, no general rate of change for the offset tuning can be found. Nonetheless, the shift in mean frequency offset can be compensated by repeating the tuning at more than one calibration point. For illustration, the nine evaluated temperatures were divided into three parts, each covering three temperatures. Next, the offset compensation, as explained throughout
Section 4, was re-calculated at the midpoint of each part, i.e., at
°C,
°C, and
°C. The resulting error rate for the whole temperature range from
°C to
°C is plotted as the red line in
Figure 17. The three calibration regions, in which the 3 sets of calibration data were used, are indicated by the vertical dashed lines.
Calibrating the offset at the three exemplary temperatures (plotted in red in
Figure 17) flattens the overall temperature dependency compared to the original error rate (re-plotted in green from
Figure 15), but errors still occur.
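Selecting the calibration data for the current temperature region can be sketched as a simple lookup. The region boundaries, the temperature values, and the contents of the calibration sets below are hypothetical placeholders; only the idea of three regions with one calibration set each is taken from the text.

```python
from bisect import bisect_left


def select_calibration(temp_c, upper_bounds, calibration_sets):
    """Return the calibration set of the region that contains `temp_c`.

    `upper_bounds` holds the inclusive upper boundary of every region except
    the last one, in ascending order.
    """
    if len(calibration_sets) != len(upper_bounds) + 1:
        raise ValueError("need exactly one calibration set per region")
    return calibration_sets[bisect_left(upper_bounds, temp_c)]


# Hypothetical example: three regions with tuning values for RO A and RO B.
bounds = [20.0, 50.0]
sets = [{"ro_a": 4, "ro_b": 9}, {"ro_a": 5, "ro_b": 9}, {"ro_a": 7, "ro_b": 8}]
active = select_calibration(35.0, bounds, sets)
```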
5.5.2. Re-Sampling the Reference and Stability
Throughout the prior sections, the EoA PUF has not only been calibrated for an offset of the two ROs’ mean frequencies; (auto-detected) unstable bits have also been eliminated from the response. Thus, an offset re-calibration alone cannot resolve all the additional bit errors that arise over temperature variations.
After re-calibrating for the mean frequency drift, the different tuning vectors may cause minor changes in individual RO pairs and bit responses, so that some previously unstable bits become stable and vice versa. Therefore, the EoA-PUF intrinsic auto-detection of unstable bits was also used at all three calibration points to determine the blanked bits for each evaluated region. Thereafter, the reference response (determined at
°C) was updated at the other two temperatures as the original reference might be invalid for bits that were unstable and became stable with changing temperature. This lowered the error rate at all three calibration points to almost zero, as can be seen in
Figure 17 as the blue curve. The evaluated temperatures next to the calibration points have a remaining error of around
, which can be expected as the offset calibration, masking, and reference response change with temperature and cause some bits to flip. In consequence, to reduce the overall error rate of the EoA-PUF, more than one calibration point can be selected to re-calibrate the new mean oscillation frequencies of the two challenged ROs and to auto-detect the unstable bits over a wider temperature range. The amount of stable bits stays constant over all calibration points (see
Figure 13), but detected unstable bits vary in position. On average, the amount of stable bits over all calibration points is reduced by
compared to the reference temperature. Although no voltage variation can be conducted due to the lack of direct access to the power supply, it can be expected that constant voltage drops, which change the operation point, can also be compensated by this approach. Temporary fluctuations in the power supply can be seen as additional noise and eliminated by increasing the deadzone.
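How masking of unstable bits enters the error rate can be illustrated with a short sketch. It assumes that the response, the reference, and the stability mask are available as equally long bit lists; the function name `masked_error_rate` and the example bit patterns are hypothetical and do not reproduce measured data.

```python
def masked_error_rate(response, reference, stable_mask):
    """Fraction of stable bits in response that differ from reference.

    Bits whose mask entry is falsy (auto-detected as unstable) are blanked
    and excluded from the comparison.
    """
    considered = [
        (bit, ref)
        for bit, ref, keep in zip(response, reference, stable_mask)
        if keep
    ]
    if not considered:
        raise ValueError("no stable bits left after masking")
    errors = sum(bit != ref for bit, ref in considered)
    return errors / len(considered)


# Hypothetical example: the third bit flips but is masked as unstable.
response = [1, 0, 1, 1]
reference = [1, 0, 0, 1]
stable_mask = [1, 1, 0, 1]

with_masking = masked_error_rate(response, reference, stable_mask)
without_masking = masked_error_rate(response, reference, [1, 1, 1, 1])
```

In this toy example, the only flipped bit is blanked, so the masked error rate is zero, whereas the unmasked comparison counts one error in four bits. Updating the reference and the mask per calibration region, as described above, amounts to calling the function with the data set of the region in use.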
In summary, the EoA-PUF, when calibrated at only one temperature, performs as well as or even better than SotA implementations over temperature changes, while it provides a faster readout compared to SotA RO-PUFs. In addition, using some of its unique features, like calibration and auto-detection of unstable bits, the temperature dependency can be minimized at distinct temperatures and thus flattened overall across a wide temperature range.
6. Discussion and Comparison
PUFs are used in different configurations and applications with different requirements. Applications using a PUF for constant key generation do not need a challenge input and thus reduce to a weak PUF, whereas authentication applications require strong PUFs with an accessible challenge input. After a short recapitulation of the final architecture, the tuning effort as well as the attackability of the EoA-PUF in both configurations are briefly discussed in the following.
6.1. Final Architecture
In the final design, two parallel ROs consisting of one NAND, sixty-three H-SDEs as the source of entropy, and twenty H-SDEs for offset calibration were used. In addition, four arbiter tuning PDLs, each consisting of twenty H-SDEs, were implemented. These are connected to the source of entropy by a selection LUT, which switches between tuning and normal operation mode. The arbiter tuning PDLs are also used to adjust the desired deadzone. An eye-opening arbiter is implemented to finally evaluate the phase difference caused by the source of entropy. Hadamard codewords are applied to render the PUF into a weak PUF configuration. This required an additional 20-bit PDL to minimize the frequency offset. As four LUTs fit into one slice, this results in a total consumption of 65 FPGA slices. Since the length of Hadamard codewords is a power of two, the challenge input was not increased further; otherwise, the design would have been spread over different slice types and clock regions on the FPGA used. An overview of the final design is depicted in
Figure 18.
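For illustration, Hadamard codewords of power-of-two length can be generated with the Sylvester construction, as sketched below. This is an assumption made for the example: the text does not state which construction or bit mapping was used, and the function names `sylvester_hadamard`, `hadamard_codewords`, and `hamming_distance` are hypothetical.

```python
def sylvester_hadamard(order):
    """Return the 2**order x 2**order Sylvester-Hadamard matrix (entries +1/-1)."""
    matrix = [[1]]
    for _ in range(order):
        # Doubling step: H -> [[H, H], [H, -H]]
        matrix = ([row + row for row in matrix]
                  + [row + [-x for x in row] for row in matrix])
    return matrix


def hadamard_codewords(order):
    """Map each matrix row to a binary codeword (+1 -> 0, -1 -> 1)."""
    return [[(1 - x) // 2 for x in row] for row in sylvester_hadamard(order)]


def hamming_distance(a, b):
    """Number of positions in which two equally long codewords differ."""
    return sum(x != y for x, y in zip(a, b))


codewords = hadamard_codewords(3)  # eight codewords of length eight
```

Any two distinct codewords generated this way differ in exactly half of their positions, and the codeword length is always a power of two, which is the constraint on the challenge width mentioned above.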
6.2. Tuning Effort—Weak Versus Strong PUF
For weak and strong PUFs, any implementation-specific bias in the response must be removed. Asymmetric routing from the entropy source to the arbiter causes such a bias for both PUF types, which is compensated in the presented work by the addition of tuning PDLs and a newly proposed tuning algorithm.
With the implemented challenge input, the number of extracted bits per PUF instance was increased, which also enables a future extension to a strong PUF. However, in the presented work, Hadamard codewords have been employed, effectively rendering the strong PUF into a weak PUF for key extraction. During the evaluation of the FPGA-based EoA-PUF, it was found that some instances tended to provide the same response for most applied challenges. This was caused by slight differences in the mean frequencies of the two competing rings, which were larger than the frequency change caused by the applied challenges; the same effect would be seen if the same challenged delay lines were used to implement an SotA RO or arbiter PUF configuration. By adding an additional tuning, the influence of this difference was compensated in the EoA-PUF, and the same approach could thus be applied to an RO or arbiter PUF featuring a challenge input. For a non-challenged weak PUF, the offset tuning would be unnecessary; omitting it would simplify the whole architecture while reducing the number of available bits. In this work, we presented the more complex strong PUF implementation and included all the necessary tuning to make it operational, but we leave it to the reader to decide on the tradeoff between the available number of bits and the required tuning vectors.
6.3. Attackability
PUFs used in secure systems for protection mechanisms are targeted by attackers in many different ways, like side-channel attacks, machine learning, or tampering.
As introduced in
Section 1, oscillation-based PUFs (e.g., loop-PUF) are prone to side-channel attacks (SCAs). Ref. [
20] stated that a reduced measurement time for the attacker decreases the accuracy of the attack. A TERO-PUF was attacked via a side-channel in [
37], and the authors mentioned that multiple parallel oscillations increase the difficulty for the attacker to reveal the resulting bit. Both findings suggest an advantage of the EoA-PUF with respect to such attacks because it evaluates pairs of ROs with a very limited number of oscillations to reach a decision (see
Section 5.3).
Further, strong PUFs are known to be attackable via machine learning. Several approaches against these attacks have been proposed in the literature, like [
38,
39]. However, some approaches, such as the XOR-arbiter PUF, have been shown to be broken [
40]. In the present work, Hadamard codewords have been employed, reducing the generally large number of challenges to a very limited number of codewords, thus rendering the strong PUF into a weak PUF for key extraction (see
Section 4), which renders machine learning attacks ineffective, as the challenge input is no longer accessible to the attacker.
Tampering with helper data is another known method to attack PUFs [
41]. Helper data for the presented EoA-PUF are the tuning vectors for the arbiter and offset compensation, as well as the stability information, which is used for bit-masking. The tuning vector for the arbiter and the stability information can be regenerated within the system by the tuning algorithm and the auto-detection mechanism and therefore need not be stored permanently or made available externally. However, the offset compensation tuning must be stored as helper data and may offer an attack vector. As these data would also be necessary in an RO or arbiter PUF featuring a challenge input, as discussed above, this is not regarded as a distinct disadvantage of the EoA-PUF.
Actual attacks are not covered in this work as the focus was on the first implementation, feature description, and performance evaluation of the EoA-PUF on FPGAs. Nevertheless, the security of the challenged EoA-PUF motivates future work.
6.4. Comparison to State of the Art
Table 1 summarizes the results obtained with this EoA-PUF and compares them with other state-of-the-art FPGA implementations. ASIC implementations are not included in the comparison because they are expected to outperform FPGA-based PUFs due to their design flexibility. However, they require a dedicated chip and cannot be included in a present FPGA design.
Although the authors of [
10] did not provide a readout time, it can be expected to be the fastest due to the principles of arbiter-PUFs. Nevertheless, it has a much higher error rate than the proposed design and, as it was also implemented with SDEs, might be biased, which was not addressed by the authors. As with all FPGA-PUFs, temperature variations have a high impact on the error rate of the presented EoA-PUF implementation compared to other implementations. Using the features of the EoA-PUF, the temperature dependency can be partly addressed by the proposed compensation techniques. These offer the reader a trade-off between complexity and accuracy, e.g., when product identification can be conducted in a controlled environment and therefore requires neither temperature stability nor compensation.
The evaluated uniqueness and uniformity of the proposed FPGA EoA-PUF are close to the ideal value of
and comparable to other SotA implementations. Comparing the number of slices utilized for the final design, our work can also compete with the SotA. Concerning readout speed, among the SotA implementations, only the TERO-PUF [
35] and the DD-PUF feature a
times faster readout than the proposed EoA-PUF but also have higher error rates. Compared with an arbiter-PUF, which is closely related to the presented architecture, a speed comparison is difficult as other works did not provide numbers. As the EoA-PUF performs several oscillations (<20) before stopping, it has to be assumed to be somewhat slower, but it achieves a much better error rate. The other related architecture is the RO-PUF in [
17], which is 11 times slower than our design. It must be emphasized that this PUF has a 30 times higher mean frequency as it is designed as a short weak PUF. Therefore, it oscillates more often within the same evaluation time, which increases the reliability. The loop-PUF, also based on an RO architecture with a challenge input, from [
19], achieves a comparably low error rate but at the cost of an approximately 20,000× longer readout time than the proposed EoA-PUF. Overall, the EoA-PUF outperforms the other implementations as it combines a fast and stable readout with a mean readout time of
s and a
. Additionally, none of the other works feature the distinct advantage of the auto-detection of unstable bits.