Intrinsic Run-Time Row Hammer PUFs: Leveraging the Row Hammer Effect for Run-Time Cryptography and Improved Security

Abstract: Physical Unclonable Functions (PUFs) based on the retention times of the cells of a Dynamic Random Access Memory (DRAM) can be utilised for the implementation of cost-efficient and lightweight cryptographic protocols. However, as recent work has demonstrated, the time needed to generate their responses may prohibit their widespread usage. To address this issue, Schaller et al. [1] proposed the Row Hammer PUF, which leverages the row hammer effect in DRAM modules to reduce the retention times of their cells and, therefore, significantly speed up the generation of responses for PUFs based on these retention times. In this work, we extend the work of Schaller et al. by presenting a run-time accessible implementation of this PUF and further reducing the time required for the generation of its responses. Additionally, we provide a more thorough investigation of the effects of temperature variations on the Row Hammer PUF and briefly discuss potential statistical relationships between the cells used to implement it. As our results indicate, the Row Hammer PUF could potentially provide an adequate level of security for Commercial Off-The-Shelf (COTS) devices, if its dependency on temperature is mitigated, and may, therefore, be commercially adopted in the near future.


Introduction
In recent years, attacks that exploit the effects of row hammering in Dynamic Random Access Memories (DRAMs) have gained a lot of attention. However, as shown by the work of Schaller et al. [1], which was published in 2017, the row hammer effect can also be used to enhance the security of a system, rather than diminish it. This paper extends the work of Schaller et al., demonstrating that the row hammer effect can be utilised to provide run-time accessible cryptographic applications and improved security.

Background and Related Literature
In this section, we provide some background information on the way DRAMs work and briefly discuss works relevant to this paper. In particular, we briefly examine the literature concerning the two main topics related to this paper: the row hammer effect and memory-based intrinsic PUFs.

DRAM Data Storage and Access
The most common contemporary design for a DRAM cell consists of one transistor and one capacitor, as shown in Figure 1(a). The transistor acts as a gatekeeper, regulating access to the capacitor, whose charged or discharged state indicates the logical value stored in the DRAM cell. The gate of the transistor is connected to a wordline (WL) that controls access to the whole row. The capacitor is connected to a bitline (BL) through the transistor. Each bitline is also connected to an equalizer and a sense amplifier that are used to convert the capacitor's charge to a logical value. DRAM cells are quite often separated into true cells, whose charged state indicates logical one, and anti-cells, whose charged state represents logical zero [21]. For both types of cells, their discharged states indicate the opposite logical value from that of their charged states. In Figure 1(a), we represent true cells as being connected to a BL bitline and anti-cells as connected to a BL* bitline. DRAM cells are organized in arrays, which are called banks, in the way Figure 1(b) demonstrates. In order to access a particular cell, a bank is selected and, on that bank, the correct wordline and bitline are charged, in order to allow access to a particular row and column, respectively, of that bank.
The charge stored in the capacitor of a DRAM cell leaks over time, eventually causing the cell's logical value to flip. Therefore, the time required for enough charge to leak from a cell's capacitor is equivalent to its data retention time. Charge can leak from a cell's capacitor either to components of that cell itself or to components of other cells, which may be in the same or in different rows, as indicated in Figure 1(a) by the blue arrows. Therefore, in order to prevent the data stored in the DRAM cells from being lost, the cells need to be accessed periodically, in order to reinforce their stored values, through a process referred to as the refresh operation. In order to ensure data integrity, each DRAM row needs to be refreshed at a certain interval, which is in the order of milliseconds for most contemporary DRAM implementations.

The Row Hammer Effect in DRAM
In recent years, the large-scale integration and higher clock frequencies used in DRAMs have brought into the spotlight the significance of the row hammer effect for the security of contemporary DRAM implementations [2,22,23]. The row hammer effect is an unintended side effect that occurs when a memory row, referred to as the hammer row, is rapidly and repeatedly accessed, causing cells in nearby rows, called victim rows, to leak charge more quickly [2,22,24-26]. This charge leakage can cause a cell's logical value to change, which is known as a bit flip. Such bit flips persist across refresh operations [26].
The row hammer effect is based on the crosstalk between adjacent wordlines and bitlines, as well as between DRAM cells and their neighbouring capacitors and wires, as depicted in Figure 1(a). It has been shown that hammering a row will most likely affect its two adjacent rows. Consequently, we can distinguish between single-sided row hammering, where one hammer row is used to affect its two adjacent rows, and double-sided row hammering, where two (hammer) rows adjacent to the same victim row are hammered, in order to increase the chance of bit flips [3]. Of course, each of these two hammer rows may also affect its other adjacent (victim) row, which is not adjacent to both of them. The distinction between single-sided and double-sided row hammering is illustrated in Figure 2.
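The two hammering schemes can be illustrated with a short Python sketch that assigns row indices within one bank to hammer and victim rows. The exact patterns are an assumption made for illustration (victim-hammer-victim groups for single-sided hammering, strictly alternating rows for double-sided hammering) and may differ from the layout of Figure 2; the function name `row_layout` and its parameters are hypothetical.

```python
def row_layout(first_row, n_victim_rows, rh_type):
    """Return (hammer_rows, victim_rows) as lists of row indices in one bank.

    Assumed patterns (illustrative only):
      single-sided: V H V | V H V | ...  (each victim has one hammer neighbour)
      double-sided: H V H V ... H        (each victim has two hammer neighbours)
    """
    hammer, victim = [], []
    if rh_type == "double":
        for i in range(n_victim_rows):
            hammer.append(first_row + 2 * i)
            victim.append(first_row + 2 * i + 1)
        # Closing hammer row, so that the last victim row is also enclosed.
        hammer.append(first_row + 2 * n_victim_rows)
    elif rh_type == "single":
        row = first_row
        while len(victim) < n_victim_rows:
            victim.append(row)          # victim above the hammer row
            hammer.append(row + 1)
            if len(victim) < n_victim_rows:
                victim.append(row + 2)  # victim below the hammer row
            row += 3
    else:
        raise ValueError("rh_type must be 'single' or 'double'")
    return hammer, victim


if __name__ == "__main__":
    print(row_layout(0, 4, "single"))
    print(row_layout(0, 4, "double"))
```

Under these assumed patterns, four victim rows need two hammer rows in the single-sided case and five in the double-sided case, which matches the observation that single-sided hammering uses fewer hammer rows.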
Usually, to allow for a sufficiently high DRAM access rate, and thus to trigger disturbance errors, non-cached memory accesses are needed, e.g., by leveraging the CLFLUSH instruction. Lately, several works have demonstrated the feasibility of exploiting the row hammer effect on platforms that do not provide such cache line flush instructions. In order to circumvent CPU caching mechanisms and ensure direct access to DRAM, Gruss et al. [7] and Aweke et al. [8] enforce cache eviction through elaborate memory access patterns. Qiao and Seaborn [9] make use of x86 non-temporal store instructions, which do not use the CPU cache, and van der Veen et al. [4] utilize non-cacheable DMA queries to exploit the row hammer effect.
Other papers have presented techniques to gain an understanding of the locations of flipping bits. Razavi et al. [6] presented a technique that allows for targeted bit flips at arbitrary physical memory locations by combining the row hammer effect with memory deduplication. In order to conduct predictable row hammer attacks, van der Veen et al. [4] use a brute-force approach to hammer all DRAM rows and collect information about the expected bit flip locations. Finally, Jung et al. [27] present a novel approach for reconstructing the physical layout of DRAM cells, by applying multiple temperature gradients on the memory module and observing DRAM data retention, which can allow for row hammer attacks of high precision.
Since its discovery, the row hammer effect has been used mainly as a means of attacking a computer system. In particular, changing the contents of memory cells can result in the modification of important data. Seaborn and Dullien [3] as well as van der Veen et al. [4] rely on the row hammer effect in order to gain root privileges, by flipping bits in page table entries. Xiao et al. [5] attack Xen's paravirtualized memory isolation by employing the row hammer effect from within a malicious virtual machine. Razavi et al. [6] as well as Bhattacharya and Mukhopadhyay [28,29] successfully attack RSA by creating bit flips in keys stored in DRAM. Finally, Jang et al. [30] take advantage of the row hammer effect in order to successfully attack the memory isolation solution provided by the Intel Software Guard Extensions (SGX).
To the best of our knowledge, the work of Schaller et al. [1] is the only one that proposes the use of the row hammer effect in a DRAM in order to enhance the security of the computer system that incorporates the DRAM, rather than diminish it. In this work, we extend and improve their techniques, in order to allow for improved security and run-time cryptographic applications. Additionally, we also briefly explore the potential of the examined security scheme for commercial adoption.
Peer-reviewed version available at Cryptography 2018, 2, 13; doi:10.3390/cryptography2030013

Memory-Based Intrinsic PUFs
Physical Unclonable Functions (PUFs) ideally act as functions encoded in hardware, which produce a unique output, referred to as a response, for a specific input, called a challenge. However, in practice, PUFs tend to provide slightly noisy responses, which could affect their reliability [31,32]. For this reason, a fuzzy extractor scheme, which incorporates some Error Correction Code (ECC) [12], usually needs to be applied, in order to stabilise the PUF response [33].
The Row Hammer PUF, like most memory-based PUF implementations, is an intrinsic PUF and, therefore, its implementation does not require the addition of extra circuitry either for its construction or for its operation. Some other well-known memory-based intrinsic PUFs include the SRAM PUF [34,35], the Flash PUF [36] and different types of DRAM PUFs based either on the startup values of the DRAM cells [15,16], on their retention times [10-12,17], or on the access latency times of the DRAM operations [19,20].
Depending on the number of their available input-output pairs, which are referred to as Challenge-Response Pairs (CRPs), PUFs can provide a varying level of security and can, therefore, be used in different applications. The most common applications of PUFs include secure key storage and key agreement, as well as identification and authentication. A distinction can be made between PUFs with a single or very few CRPs, which are referred to as "weak" PUFs, and PUFs with a large number of CRPs, which are called "strong" [37,38].
Although memory-based PUFs are usually considered to be weak PUFs, some types of DRAM-based PUFs, including the Row Hammer PUF, tend to provide multiple CRPs. In this work, we refrain from judging whether the Row Hammer PUF is a "weak" or a "strong" PUF implementation, and only note that it can provide multiple CRPs. This property of the Row Hammer PUF could be considered a potential advantage over other memory-based PUFs that can provide only a single CRP, such as PUF implementations based on the startup values of SRAMs and DRAMs. Additionally, while the SRAM PUF can only be accessed at boot-time, the Row Hammer PUF, as this work shows, can also be accessed at run-time, therefore allowing for the implementation of PUF-based cryptographic applications at run-time.
However, while SRAM PUFs have been studied extensively [39,40], DRAM-based PUFs, such as the Row Hammer PUF, have not yet been fully studied. Therefore, and as this work also shows, while DRAM-based PUFs may have a number of advantages over other memory-based PUFs, such as SRAM PUFs, they still suffer from a number of shortcomings, such as their dependency on temperature. These open issues will need to be addressed in order to explore the full potential of DRAM-based PUFs, including the Row Hammer PUF, for commercial adoption [14]. Nevertheless, some of the identified shortcomings of DRAM-based PUFs are being addressed in recent literature; e.g., as this work demonstrates, the Row Hammer PUF implementation can significantly reduce the time required for the generation of responses, in comparison to ordinary DRAM retention-based PUF implementations.
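As an illustration of how an error correction code can stabilise a noisy response, the following Python sketch implements a code-offset construction with a simple repetition code. This is only the error-correcting half of a fuzzy extractor (the subsequent randomness extraction step is omitted), and neither the repetition code nor its factor `REP` is taken from the cited works; all names are hypothetical.

```python
REP = 5  # repetition factor (assumed); corrects up to 2 bit errors per block


def encode(key_bits):
    """Repetition code: repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority vote over each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def enroll(response_bits, key_bits):
    """Compute helper data as the XOR of the reference response and a codeword."""
    codeword = encode(key_bits)
    if len(codeword) != len(response_bits):
        raise ValueError("response must contain REP bits per key bit")
    return [r ^ c for r, c in zip(response_bits, codeword)]


def reproduce(noisy_response_bits, helper_bits):
    """Recover the key from a noisy re-measurement and the helper data."""
    noisy_codeword = [r ^ h for r, h in zip(noisy_response_bits, helper_bits)]
    return decode(noisy_codeword)


if __name__ == "__main__":
    key = [1, 0, 1]
    reference = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
    helper = enroll(reference, key)
    noisy = list(reference)
    for position in (0, 3, 7, 14):  # inject a few bit errors
        noisy[position] ^= 1
    print(reproduce(noisy, helper) == key)
```

A practical scheme would use a stronger code and would also have to account for biased responses, since helper data derived from a biased response can leak information about the key.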
Finally, as the Row Hammer PUF is implemented in DRAM, which is an essential memory component of most contemporary computer systems, it allows for the efficient implementation of run-time cryptographic applications even on resource-constrained devices, such as IoT hardware, which may not support the use of other, more complex and resource-demanding security primitives.

Run-time Row Hammer PUF Implementations in Commodity DRAM
In this section, we examine in detail the different parameters and factors that can affect the operation of a Row Hammer PUF. Additionally, we also present and discuss our Row Hammer PUF implementation, in comparison to the implementation presented by Schaller et al. [1]. In particular, we introduce an improved run-time accessible implementation of the Row Hammer PUF, which leverages a Linux kernel module, in a similar way to the work of Xiong et al. [17], in order to provide access to the DRAM characteristics utilised by the Row Hammer PUF.
As noted in previous works, the locations of the disturbance errors caused by the row hammer effect in DRAM cells are stable [2,4]. This makes the row hammer effect a promising candidate for a PUF. However, the number of bit flips introduced by the row hammer effect can be relatively small, and thus may only provide a limited amount of entropy. For this reason, the Row Hammer PUF takes advantage not only of the effect that row hammering has on the DRAM cells, but also of their data retention characteristic. In this way, it manages to significantly decrease the time required for the generation of responses, and to address a known problem of the DRAM retention-based PUF, which has been noted in the recent literature [20]. Finally, further increases in the entropy of the responses of the Row Hammer PUF can be achieved by increasing the number of DRAM rows used to implement this PUF, as well as by controlling the initial values of the DRAM cells in both the hammer and the victim rows.

Row Hammer PUF Parameters
Our implementation setup is based on the inherent DRAM of a PandaBoard ES, the same DRAM that was also employed in the work of Schaller et al. [1], in order to facilitate comparisons between our results and the results of that work. In general, the operation of the Row Hammer PUF is influenced by a number of different parameters, which can be exploited in order to increase the number of potential responses of this PUF. The most influential of these parameters are examined in this section, while additional factors that can also affect the operation and performance of the Row Hammer PUF are discussed in Section 3.3.
In particular, we note the following significant parameters that have been examined in order to test their influence on the responses of the Row Hammer PUF:

• Row hammering type: There are two approaches to inducing the row hammer effect described in the relevant literature [3,8]. Therefore, we can distinguish between two different Row Hammering types (RH types). If, for one victim row, there is only one adjacent hammer row used to induce bit flips, we call it Single-Sided Row Hammering (SSRH). In contrast, Double-Sided Row Hammering (DSRH) involves the usage of both neighbours of a particular victim row as hammer rows. The patterns of hammer and victim rows used to conduct SSRH and DSRH are shown in Figure 2. As we utilise the bit flips of a number of victim rows as the PUF response, we refer to these rows as PUF rows, as indicated in Figure 2. Due to the existence of hammer rows, PUF rows are not consecutive. The PUF size and the RH type influence the actual hammering frequency, as a smaller PUF size will allow each hammer row to be accessed more frequently. Likewise, SSRH has fewer hammer rows, so each can be accessed more often within the same time period.

• Initial Value (IV) of the hammer rows: For the memory range that corresponds to the PUF address and PUF size, the corresponding hammer rows are pre-initialized by writing the hammer row IV to them, before the row hammer process is conducted.

• Initial Value (IV) of the PUF rows: Similarly, all PUF rows that are included in this memory range are initialized with the PUF row IV before the row hammer process is started. Both the hammer row IV and the PUF row IV are important parameters, because disturbance errors are caused by the interaction of the charges of DRAM cells, which are dependent on the logical values of these cells. Furthermore, as already mentioned, DRAM cells may be divided into so-called true cells and anti-cells, which represent the same logical value using different charge states [21]. True cells have a logical value of one when charged and a logical value of zero when discharged, while anti-cells have the opposite values for these charge states. Consequently, initializing a true cell with a value of logical zero or an anti-cell with logical one will most likely prevent bit flips from occurring in these cells. Thus, it is important to evaluate the effect of different values of PUF row IV and hammer row IV. As the layout of true and anti-cells is identical for DRAM modules of the same type, once optimal settings for both sets of initial values have been found, they can be used for all other instances of the same device type.

• Row hammering time: The Row Hammering time (RH time) defines the total duration of the PUF measurement, including the time needed to disable the DRAM refresh and conduct the row hammering process. The RH time, just as the PUF size and the RH type, affects how many times each hammer row will be accessed in total.
Taking into account the above-mentioned parameters, the process of querying a Row Hammer PUF is depicted in Algorithm 1. Based on the PUF address, the PUF size, and the RH type, the DRAM region that will be used for the Row Hammer PUF is defined. First, memory caching is disabled, if it has been enabled, in order to ensure that all the row hammering commands will be executed and the relevant DRAM hammer rows will be accessed. Then, the defined DRAM region is reserved, so that no other program can access it. Next, the PUF rows and hammer rows are initialized with PUF row IV and hammer row IV, respectively. In the next step, the PUF query is started by disabling the DRAM auto-refresh. This is done using the same technique as the one employed by Xiong et al. [17]. Subsequently, the row hammering process is started. For this purpose, the hammer rows need to be accessed repeatedly for a certain time. This is achieved by a read operation to the first word of each hammer row, which in turn causes the whole DRAM row to be refreshed. Hence, bits in the PUF rows may start to leak charge and eventually flip. After RH time has passed, the process ends and the DRAM auto-refresh is enabled again. Finally, the PUF response is read from the PUF rows.
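The query procedure described above can be sketched in Python as follows. This is not the firmware or kernel code used in this work: the `RecordingDram` class is a hypothetical stand-in that only logs control operations and stores row contents, so that the sequence of steps can be run without hardware (it never produces bit flips), and all method names are assumptions.

```python
import time


class RecordingDram:
    """Hypothetical stand-in for the platform-specific DRAM controls."""

    def __init__(self, n_rows, row_bytes):
        self.rows = [bytearray(row_bytes) for _ in range(n_rows)]
        self.log = []

    def disable_cache(self):
        self.log.append("cache_off")

    def reserve(self, rows):
        self.log.append("reserve")

    def set_refresh(self, enabled):
        self.log.append("refresh_on" if enabled else "refresh_off")

    def fill_row(self, row, value):
        self.rows[row][:] = bytes([value]) * len(self.rows[row])

    def read_first_word(self, row):
        return self.rows[row][0]  # on real hardware, this activates the row

    def read_row(self, row):
        return bytes(self.rows[row])


def query_row_hammer_puf(dram, hammer_rows, puf_rows,
                         hammer_iv, puf_iv, rh_time_s):
    """Follow the query steps: prepare, hammer for rh_time_s, read response."""
    dram.disable_cache()                    # accesses must reach the DRAM
    dram.reserve(hammer_rows + puf_rows)    # keep other programs out
    for row in puf_rows:
        dram.fill_row(row, puf_iv)
    for row in hammer_rows:
        dram.fill_row(row, hammer_iv)
    # RH time covers disabling the refresh as well as the hammering itself.
    deadline = time.monotonic() + rh_time_s
    dram.set_refresh(False)
    try:
        while time.monotonic() < deadline:
            for row in hammer_rows:
                dram.read_first_word(row)
    finally:
        dram.set_refresh(True)              # always restore auto-refresh
    return b"".join(dram.read_row(row) for row in puf_rows)


if __name__ == "__main__":
    dram = RecordingDram(n_rows=5, row_bytes=8)
    response = query_row_hammer_puf(dram, [1, 3], [0, 2, 4], 0xAA, 0x55, 0.01)
    print(dram.log, len(response))
```

The `try`/`finally` block is a design choice of this sketch: it ensures that auto-refresh is re-enabled even if the hammering loop is interrupted, since leaving the refresh disabled would corrupt the rest of the memory.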

Additional Factors that Can Affect the Row Hammer PUF
As the Row Hammer PUF is inherently tied to the underlying physical properties of the DRAM modules, there are a number of additional factors that can influence its operation. In particular, external factors, such as the temperature and the voltage supply, as well as internal components, such as error correction implementations, can affect the properties and operation of the DRAM in general, and thus also significantly affect the responses of the Row Hammer PUF.
We discuss these factors and their potential influence on the Row Hammer PUF responses in more detail, as follows:

• Temperature: Prior work has shown that victim cells are not strongly affected by temperature [2]. However, the Row Hammer PUF is based on the interaction between the row hammer effect and the DRAM decay, which was shown to be temperature-dependent [17,20]. We, therefore, evaluate the temperature effect in Section 4, which confirms that the Row Hammer PUF is significantly affected by temperature, exhibiting increased bit flips at higher temperatures. While, for low temperature variations, the noise levels are stable and low, for high temperature variations, the increase in the number of bit flips is such that it significantly differentiates the PUF responses.

• Voltage: Prior work has also shown that the voltage supply affects the leakage of DRAM cells [41]. In COTS devices, there is currently no interface to control the voltage of DRAM cells. We assume that, for the Row Hammer PUF, the DRAM operates at the factory-specified voltage settings. The influence of voltage on the Row Hammer PUF will be investigated in future works.

• Error Correction Code (ECC): ECC can be used in DRAMs to protect from bit flips. Many DRAM modules, such as the DRAM of the PandaBoard ES platforms used in this work, do not have ECC implemented. Even if ECC is present, Aichinger [42] showed that ECC is not enough to mitigate the row hammer effect. In order to use the Row Hammer PUF in the presence of ECC, the PUF size would have to be increased. Nevertheless, if ECC is implemented, it could potentially be disabled while the Row Hammer PUF is being queried, either fully or only for the DRAM rows being used as PUF rows. Even if ECC is present and cannot be disabled, it could also potentially be used for inherent error correction of the PUF response, therefore improving the Row Hammer PUF operation, rather than hindering it. Finally, as the relevant ECC registers would indicate the rows on which bit flips are being observed, this information could potentially also be exploited to enhance the PUF measurements. However, in this work, we assume that no ECC is used, and the influence of ECC on the Row Hammer PUF remains to be explored by future works.

Implementation Setup
In this work, we test different implementations of the Row Hammer PUF that are based on the DRAM modules of multiple PandaBoard ES Rev. B3 evaluation boards, which is the same implementation setup as the one used in the work of Schaller et al. [1] that introduced the Row Hammer PUF. In this way, we can facilitate comparisons between our results and the results of that work. All our PUF implementations are purely in software, leaving the hardware configuration unchanged.
The memory module is mounted in a Package-on-Package (PoP) configuration [43-46]. This DRAM module is divided into 2 chips, with each chip containing 2 dies [46]. The memory is also further divided into 8 banks and their overall row size is 32 KB [46], with each bank, therefore, having a row size of 4 KB. The PUF responses produced by the boards can be transferred from them to a computer using a serial connection, in order to be stored for further analysis and processing, or can be printed out using the available inherent system commands. In both cases, either the full PUF response can be extracted or only the memory regions that contain bit flips and their values. Finally, the correct time points at which the row hammering should start and end are also calculated by our code, in such a way as not to stall the PandaBoard's microprocessor.

Firmware Implementation
The original Row Hammer PUF implementation described by Schaller et al. [1] was realised using the U-Boot boot-loader [47]. We have also tested this implementation, for the purposes of this paper, in order to compare it with the other implementations we present in this work. In the original implementation, the Row Hammer PUF is queried at an early stage of DRAM initialisation, before caching is enabled by the boot-loader. In this way, caching of the row hammering commands could be avoided, in order to ensure that the hammer rows in the DRAM are accessed every time these commands are used.
In this work, we also present and test a similar firmware implementation, in which caching is completely disabled by setting the relevant registers to the appropriate values through software. This firmware implementation can be set to perform both SSRH and DSRH, enable or disable memory caching, use different sets of initial values as hammer row IV and PUF row IV and, finally, work for different RH time, PUF address and PUF size. Using these parameters, the Row Hammer PUF is queried as shown in Algorithm 1.
Since the DRAM is idle while the U-Boot boot-loader is running, queries to the Row Hammer PUF can be conducted without affecting any other functions of the platform. In U-Boot, one can also control the DRAM refresh cycle, as demonstrated by Xiong et al. [17]. Furthermore, although the PandaBoard implements an ARM processor that does not provide the CLFLUSH instruction, one can access physical DRAM addresses without caching, as described above.
The address organisation of the DRAM being examined can be deduced from the relevant manuals [43-46]. We allocate hammer rows and PUF rows in the same bank and make them adjacent, as shown in Figure 2. To perform the row hammer operation, the hammer rows need to be activated repeatedly for a certain time. In our implementation, this is achieved by a read operation to the first word of each hammer row.

Kernel Module Implementation
As noted in the work of Schaller et al. [1], the Row Hammer PUF can also be implemented using a kernel module, in order to achieve run-time access. Similar to the U-Boot functionality, the DRAM refresh operation can also be disabled from kernel space, as demonstrated by Xiong et al. [17]. Additionally, also in this case, caching is disabled by setting the relevant registers to the appropriate values through software. For the purposes of this work, we have implemented and tested such a run-time accessible Row Hammer PUF, using a Linux kernel module.
The module can be inserted into the Linux kernel and run as one of the processes of the Linux Operating System (OS), without affecting its normal performance. The kernel module runs in parallel to the other processes and is listed in the process list. Again, in this case, we allocate hammer rows and PUF rows in the same bank and make them adjacent, as shown in Figure 2. This kernel module can be set to perform both SSRH and DSRH, enable or disable memory caching, use different sets of initial values as hammer row IV and PUF row IV and, finally, work for different RH time, PUF address and PUF size. Using these parameters, the module queries the Row Hammer PUF following the steps of Algorithm 1.

Disabling the cache operation
The produced code runs on the PandaBoard's Cortex A9 MicroProcessor Unit (MPU), which also has a Cache Management Unit (CMU) that manages the caches of the MPU [44]. The CMU needs to be programmed in such a way as to force all the cache lines to remain invalidated while the row hammering process is running. This can be achieved by using the 64 range operation sets of the CMU, each of which is referred to as a range set [44]. Each range set has three registers that can be programmed to allow for a range operation to be performed over a memory region [44]. The first of these registers stores the starting physical address of the memory region, the second its length and the third the type of operation to be performed, which can be clean, clean/invalidate or invalidate [44]. In this way, all levels of cache can be invalidated.
The CMU performs the appropriate operation and can be programmed to interrupt the processor when the operation completes, or to poll for the status [44]. It has 64 range sets, out of which only 56 can be allocated [44]. As the allocated registers need to be deallocated after their use, the allocations and deallocations of the 56 range sets are done as an explicit step in the relevant Row Hammer PUF code. The cache lines need to be constantly invalidated while the Row Hammer PUF is queried, in order to effectively disable caching. This ensures that the memory rows accessed are not cached, but are accessed in the DRAM itself.

Evaluation
In this section, we evaluate the different implementations of the Row Hammer PUF according to their characteristics. We first present the original Row Hammer PUF implementation, introduced by Schaller et al. [1], and briefly discuss its relevant characteristics and evaluation. Then, we examine the different performance metrics by which the Row Hammer PUF implementations can be assessed, in general, as well as compared against each other. Additionally, we demonstrate and compare our results regarding the different Row Hammer PUF implementations, based on their characteristics. Furthermore, we investigate in detail how temperature affects the Row Hammer PUF and whether there are potential statistical relations among the PUF cells and their values. Finally, we also examine the potential of the Row Hammer PUF for commercial adoption.

Evaluation of the Original Row Hammer PUF
The original Row Hammer PUF, which was introduced by Schaller et al. [1] in 2017, is based on a firmware implementation that queried the PUF at an early stage of DRAM initialisation, before caching had been enabled by the boot-loader. This implementation was tested using the values of the Row Hammer PUF parameters shown in Table 1. Schaller et al. first tested how these parameters affected the number of observed bit flips and then evaluated this implementation, using a fixed parameter configuration, with regard to its uniqueness, robustness and entropy. Additionally, they briefly discussed how temperature variations could influence the Row Hammer PUF.
Furthermore, due to the lack of information about the distribution of true and anti-cells, it was necessary to explore experimentally, by testing various parameter settings, the correlation between parameters such as the hammer row IV and the PUF row IV of the Row Hammer PUF and its PUF behaviour. The reason for this was that most vendors of COTS devices, including the manufacturers of the PandaBoard, treat implementation details of their hardware components, such as the distribution of true and anti-cells in the DRAM, as intellectual property and thus will not disclose them. However, one potential approach to retrieve the layout of true cells (and anti-cells) would be to initialize the DRAM with '0xFF' (or '0x00'), disable the DRAM refresh operation and read back the memory contents after a period of several hours or days, i.e., at the end of the decay process. In the original evaluation, three different memory regions, each located on one individual PandaBoard, had been measured, with each such memory region considered as a PUF instance. For all of the measurements, the PUF address was fixed. For each parameter combination, 20 measurements were taken.
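The readback approach mentioned above can be sketched in Python as follows. The sketch assumes that the decay is complete when the memory is read back, so that every cell has reached its discharged state, and it uses the definitions of true cells and anti-cells given earlier; the function name `classify_cells` and the bit ordering are hypothetical choices.

```python
def classify_cells(readback, written=0xFF):
    """Label each bit of a fully decayed readback as 'true' or 'anti'.

    A cell that flipped was charged by the written value; a cell that kept
    its value was already in its discharged state. Bits are listed most
    significant bit first within each byte (an arbitrary choice).
    """
    if written not in (0x00, 0xFF):
        raise ValueError("this sketch only handles all-zero or all-one fills")
    # Writing ones charges true cells; writing zeros charges anti-cells.
    charged_type, discharged_type = (
        ("true", "anti") if written == 0xFF else ("anti", "true"))
    written_bit = written & 1
    labels = []
    for byte in readback:
        for bit in range(7, -1, -1):
            flipped = ((byte >> bit) & 1) != written_bit
            labels.append(charged_type if flipped else discharged_type)
    return labels


if __name__ == "__main__":
    # After writing 0xFF and waiting for full decay, suppose 0x0F is read.
    print(classify_cells(b"\x0f", written=0xFF))
```

For a consistent cell layout, both fill values should lead to the same classification, which offers a simple sanity check on such a measurement.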
As Table 1 reveals, the original paper by Schaller et al. [1] considered a number of different values for the Row Hammer PUF parameters, focusing, however, on evaluating configuration settings that were expected to yield a good PUF. In order to extract the maximum possible entropy from the PUF, Schaller et al. primarily strove to maximize the number of bit flips. For this purpose, they needed to identify which parameters had the largest influence on the number of bit flips. Their results, shown in Figure 3 and Table 2, reveal that the hammer row IV and the PUF row IV play a significant role in the number of bit flips produced. Finally, the original work by Schaller et al. [1] also considered the Jaccard index for bit flips found in different responses of the same PandaBoard (intra-device Jaccard index, $J_{intra}$) or of different PandaBoards (inter-device Jaccard index, $J_{inter}$). By applying these metrics, they were able to show that the original Row Hammer PUF responses exhibit a high degree of robustness and uniqueness, as the $J_{intra}$ values were close to one and the $J_{inter}$ values close to zero. As Figure 3 and Table 2 indicate, the original Row Hammer PUF provides the most bit flips and the highest entropy when hammer row IV = '0xAA' and PUF row IV = '0x55'. For this reason, Schaller et al. [1] chose to present results for the $J_{intra}$ and the $J_{inter}$ values only for this case, which can be seen in Figure 5.
In Figure 5, histograms for both $J_{intra}$ and $J_{inter}$ are presented, for RH time set either to 60 s or 120 s and PUF row IV = '0xAA', hammer row IV = '0x55', PUF size = 128 KB and RH type = SSRH. This figure shows that the values of $J_{intra}$ and $J_{inter}$ do not overlap in any case, indicating that all the original Row Hammer PUF instances can be robustly and uniquely identified. With a minimum $J_{intra}$ [...]
As DRAM retention-based PUFs exhibit high generation times for their responses, providing a relatively low number of new bit flips over time, they usually exhibit a bias towards their original (non-flipped) values, which may even be public. Therefore, using metrics based on the Hamming distance, such as the intra-device and the inter-device Hamming distances, for their characterisation cannot usually provide useful insights into their performance. However, recent works [1,17,18,20,49] have shown that the use of similarly constructed metrics based on the Jaccard index of the positions of their flipped bits, such as the intra-device and inter-device Jaccard index, can provide a clear overview of their performance.
The J intra and J inter metrics are based on the Jaccard index [50]. For two sets s_1 and s_2 of indices of flipped bits in two PUF responses R_1 and R_2, respectively, the Jaccard index between these two responses is given by the formula:

$$J(s_1, s_2) = \frac{|s_1 \cap s_2|}{|s_1 \cup s_2|} \tag{1}$$

which provides the similarity of the two sets, s_1 and s_2. If R_1 and R_2 are obtained from the same PUF instance, then J(s_1, s_2) is equivalent to their J intra value, whereas if R_1 and R_2 are obtained from different PUF instances, then J(s_1, s_2) is equivalent to their J inter value.
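As an illustration of the Jaccard index defined above, the following minimal Python sketch computes it for two sets of flipped-bit indices. The function name `jaccard_index`, the example index sets and the convention used for two empty sets are assumptions made for this example only; they are not taken from the original implementation.

```python
def jaccard_index(s1, s2):
    """Jaccard index |s1 ∩ s2| / |s1 ∪ s2| of two collections of flipped-bit indices."""
    s1, s2 = set(s1), set(s2)
    union = s1 | s2
    if not union:
        # Assumed convention: two responses without any bit flips count as identical.
        return 1.0
    return len(s1 & s2) / len(union)


# Hypothetical flipped-bit positions of two responses.
r1_flips = {3, 17, 42, 99}
r2_flips = {3, 17, 42, 100}
print(jaccard_index(r1_flips, r2_flips))  # 3 shared indices / 5 distinct indices = 0.6
```

If both responses come from the same device, the value would be interpreted as J intra; if they come from different devices, as J inter.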

The Role of the Row Hammer PUF Parameters in Its Evaluation
Schaller et al. [1] also noted that the bit flips observed in their results only partially overlap with the bit flips caused by the DRAM data retention characteristic alone. Compared to the bit flips caused by DRAM decay, their Row Hammer PUF implementation introduces 2.4 times the number of bit flips in 60 seconds and about twice the number of bit flips in 120 seconds. Hence, the bit flips observed in the Row Hammer PUF responses are due to both the hammering process and the DRAM cell decay that emerges after DRAM refresh is disabled, and the row hammering process induces new bit flips at locations that differ from those produced by the DRAM decay process.
Additionally, the results of the evaluation of the original Row Hammer PUF clearly indicate that the hammer row IV, the PUF row IV and the RH time have a strong influence on the number of bit flips observed in its responses, while the RH type and the PUF size may affect this number, but not in a significant way. We, therefore, proceed to examine the potential causes of the observed behaviour, with regard to the Row Hammer PUF parameters discussed in Section 3.1.

• Hammer row and PUF row IV: Given that DRAM arrays consist of true cells and anti-cells, the initial values of the hammer rows (hammer row IV) as well as the initial values of the PUF rows (PUF row IV) are expected to play an important role regarding the number of observed bit flips. Depending on the type of a cell, a bit flip in a PUF row can be observed only if the cell is initialized with the logical value that corresponds to its charged state. Similarly, due to the physical interaction of charged analog elements in the hammer and PUF rows (i.e., wires and capacitors) and the resulting charge leakage paths, the initial values of the hammer rows can also influence the probability of occurrence of a bit flip.
Therefore, the values of both parameters must be chosen carefully, in order to maximize the number of bit flips, and thus also the entropy of the Row Hammer PUF. As Table 2 shows, different configurations of hammer row IV and PUF row IV lead to measurements that exhibit different bit flips. In general, it can be inferred from the experiments that the number of bit flips on the PandaBoard can be maximized if PUF rows are pre-initialized in a way that keeps true cells and anti-cells in their charged states, while the cells of the adjacent hammer rows are kept in their uncharged states. In particular, the measurements show that most bit flips can be observed if PUF rows are initialized with '0xAA', which indicates a bit-wise alternating pattern of logical values, starting with logical one. In this case, the most bit flips occur when adjacent hammer rows are set up using the complementary pattern, starting with a bit having the value of logical zero ('0x55'). In contrast, no bit flips can be observed when initializing PUF rows with '0x55', as in this case the cells of the PUF rows are initialized to their uncharged states.

• Row hammering time: As Figure 4 shows, the RH time significantly affects the number of bit flips observed in the Row Hammer PUF responses. In particular, for RH time = 120s, the number of bit flips observed seems to be, on average, ≈ 4 times the number of bit flips observed for RH time = 60s for all cases examined. A strong relation between the RH time and the number of bit flips observed was expected. Nevertheless, the exact relation between the RH time and the number of bit flips observed needs to be investigated further.

• Row hammering type: While the RH type was expected to have a strong influence on the number of bit flips, Figure 4 clearly indicates that, contrary to expectations, applying DSRH, as shown in Figure 4(b), instead of SSRH, as shown in Figure 4(a), does not lead to a highly increased number of flips, despite hammering both rows adjacent to each PUF row, instead of just one. Compared to SSRH, using DSRH only leads to ≈ 9% more bit flips in 60 seconds and to ≈ 15% more bit flips in 120 seconds on average.

• PUF size: The PUF size influences the total time required to execute a single iteration of hammering the DRAM. In all implementations presented, each hammer row is accessed roughly every 6µs when hammering 2 rows (4KB PUF) and every 8µs when hammering 17 rows (64KB PUF), when using DSRH, as shown in Figure 2. Figure 4 shows the number of bit flips relative to the PUF size. The number of bit flips does not change significantly for different values of PUF size, i.e., the fraction of bit flips for different memory ranges stays relatively stable.
In addition to these parameters, this work also examines the role of the following two varying factors in the evaluation of the tested Row Hammer PUF implementations:

• Cache state: Caching can be either enabled or disabled in our experiments. We expect that disabling the cache will lead to an increased number of bit flips observed.

• Implementation type: The Row Hammer PUF code has been implemented both in firmware and as a kernel module. While the kernel module implementation allows for run-time access to the Row Hammer PUF, we expect this implementation to result in a decrease in the number of bit flips observed, as in this implementation memory accesses are not direct, but are performed through memory-mapped registers, in contrast to the firmware implementation.
Finally, we also investigate temperature as a factor that can potentially affect the Row Hammer PUF significantly, as it is known that temperature variations have a strong influence on DRAM retention-based PUFs [1,17,18,20,49].
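To make the role of the PUF row IV more concrete, the following Python sketch shows one possible way of extracting the set of flipped-bit indices from a raw response, by comparing each byte against the initialisation pattern. It is only a sketch under stated assumptions: the function name `flipped_bit_indices`, the byte-wise pattern and the bit numbering (most significant bit first) are hypothetical choices made for this example and are not taken from the firmware or kernel module code.

```python
def flipped_bit_indices(response: bytes, puf_row_iv: int) -> set:
    """Return the bit positions at which `response` differs from the IV pattern."""
    flips = set()
    for byte_pos, value in enumerate(response):
        diff = value ^ puf_row_iv          # set bits mark cells that changed value
        for bit in range(8):
            if diff & (0x80 >> bit):       # assumed numbering: bit 0 is the MSB
                flips.add(byte_pos * 8 + bit)
    return flips


# Hypothetical 4-byte response read back from PUF rows initialised with 0xAA.
response = bytes([0xAA, 0xA8, 0xAA, 0x2A])
print(sorted(flipped_bit_indices(response, 0xAA)))  # [14, 24]
```

The resulting index sets are the inputs on which metrics based on the Jaccard index operate.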

PUF Performance Metrics
As already noted, instead of using metrics that are based on the Hamming distance, i.e., inter-device and intra-device Hamming distance, we utilize the Jaccard index [50] for bit flips found in different responses of the same PandaBoard (intra-device Jaccard index, J intra) or of different PandaBoards (inter-device Jaccard index, J inter). This is motivated by the fact that the Row Hammer PUF shows different characteristics from other memory-based PUFs, such as the SRAM PUF. In particular, the Row Hammer PUF responses draw their PUF characteristics mostly from the location of the flipped bits, and not only from their amount and value. This characteristic, the uniqueness of the flipped cell locations (addresses), is not properly reflected by metrics based on the Hamming distance, but is captured by metrics based on the Jaccard index.
In particular, we evaluate the characteristics of both the firmware and the kernel module implementation of the Row Hammer PUF based on the following PUF qualities and the performance metrics relevant to each one of them, as explained below:

• Uniqueness: The uniqueness of the Row Hammer PUF responses is measured using the J inter metric. This metric compares the indices of bit flips observed in responses obtained from different PUF instances. Ideally, for maximal PUF uniqueness, the two responses compared should have no common bit flip locations, resulting in a J inter value equal to zero.

• Robustness: The robustness of the Row Hammer PUF responses is measured using the J intra metric. This metric compares the indices of bit flips observed in responses obtained from the same PUF instance. Ideally, for maximal PUF robustness, the two responses compared should have the same bit flip locations, resulting in a J intra value equal to one.

• Entropy: PUF measurements should exhibit sufficient entropy in order to derive a cryptographic key that cannot be easily predicted, either partially or fully. We estimate the entropy of the PUF measurements as proposed by Xiong et al. [17]. Assuming that the locations of flipped bits are distributed uniformly, the entropy can be calculated as:

$$H = \log_2 \binom{N}{k} \tag{2}$$

where N is the total number of bits contained in a PUF response R_x, i.e. the PUF size, and k is the cardinality of the set s_x that contains the indices of flipped bits observed in R_x.
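The entropy estimate described above, i.e. the base-2 logarithm of the number of ways in which k flipped bits can be placed among N cells under the uniformity assumption, can be sketched in Python as follows. The function name `response_entropy_bits` and the example values are assumptions chosen for illustration; only the counting argument itself follows the text.

```python
from math import comb, log2


def response_entropy_bits(n_bits: int, n_flips: int) -> float:
    """Entropy (in bits) of choosing `n_flips` flipped positions among `n_bits` cells,
    assuming that all placements are equally likely."""
    return log2(comb(n_bits, n_flips))


# Hypothetical example: a 128KB response (1,048,576 bits) with 0.5% of its bits flipped.
N = 128 * 1024 * 8
k = round(0.005 * N)
H = response_entropy_bits(N, k)
print(f"total entropy: {H:.0f} bits, per cell: {H / N:.4f} bits")
```

Dividing the result by N gives the entropy per DRAM cell, which is convenient for comparing responses of different sizes.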

Evaluation of Our Row Hammer PUF Implementations
In order to assess the applicability of the set of flipped bits as a PUF, we validated the uniqueness, robustness and entropy of the Row Hammer PUF responses for the parameter sets given in Table 3. In our evaluation, we use four different memory regions, each one located on an individual PandaBoard, with each such memory region considered as a PUF instance. The PUF address and the PUF size are the same for all of the measurements. Additionally, for each parameter combination, 20 measurements have been taken.
Furthermore, as Table 3 shows, we again examine different values for the hammer row IV and PUF row IV parameters, in order to determine their effects on the responses of the Row Hammer PUF implementations we examine. However, we should note that we do not present results for cases with PUF row IV = '0x55', because we have confirmed that they lead to no bit flips for our PandaBoard implementations. In contrast, we do examine cases where the cache operation is either enabled or disabled. Finally, we also note that we test both the firmware and the kernel module implementation using all the parameter sets given in Table 3.
We again consider a large number of different values for the Row Hammer PUF parameters, in order to facilitate comparisons between our implementations and the original implementation by Schaller et al. [1]. To this end, we choose to examine Row Hammer PUF implementations with a fixed PUF size = 128KB. Our results regarding the number of bit flips observed in the responses of our Row Hammer PUF implementations are shown in Figure 6 and Figure 7, for the firmware and the kernel module implementation, respectively. In comparison to the cases presented in Figure 3 for the original Row Hammer PUF implementation, with PUF size = 128KB, RH time = 120s and either SSRH or DSRH, the same cases in Figure 6, which presents the evaluation results of our firmware implementation, indicate a slight increase in the number of bit flips observed and, therefore, also in the entropy of this implementation.
Additionally, as Figures 6 and 7 show, the average number of bit flips observed in the Row Hammer PUF responses depends on the RH time, the hammer row IV and the PUF row IV. In particular, setting RH time = 120s leads to ≈ 400% more bit flips than setting RH time = 60s. We also note that, in a similar fashion to Figure 3, the largest average number of bit flips and, therefore, the highest entropy occur when hammer row IV = '0xAA' and PUF row IV = '0x55'.

On the contrary, the RH type and the Cache state do not seem to significantly affect the number of bit flips observed in the PUF responses. In particular, the use of DSRH results in slightly more bit flips than the use of SSRH. However, the difference in the number of bit flips produced with the two methods does not appear to be significant. Nevertheless, SSRH requires ≈ 55% less memory and involves fewer memory accesses compared to DSRH. Furthermore, enabling the cache even seems to increase the number of bit flips observed for some combinations of hammer row IV and PUF row IV, while, in most cases, disabling the cache operation leads to an increase in bit flips, as we expected.
Finally, the firmware implementation seems to provide more bit flips than the kernel module implementation in all cases. This difference is due to the fact that the firmware implementation accesses the DRAM directly, while the kernel module implementation uses memory-mapped registers to access it. As a result, the kernel module implementation achieves fewer DRAM accesses for the same RH time, so that the hammer rows are hammered less often, causing fewer bit flips in the PUF rows. Nevertheless, in all cases, the potential PUF response generation times can be significantly lower than the times required to generate the response of existing run-time accessible decay-based DRAM PUFs.
Based on the results shown in Figures 6 and 7, we can reasonably assume that for hammer row IV = '0xAA' and PUF row IV = '0x55', the minimum fractional number of bit flips observed in the responses of our Row Hammer PUF implementations will be at least 0.5% for RH time = 60s and 2.5% for RH time = 120s, given in percentages relative to PUF size. Based on Equation (2), the fractional entropy, i.e. the entropy per DRAM cell, is given by the formula:

$$H_{frac} = \frac{H}{N} = \frac{1}{N}\log_2 \binom{N}{k} \approx -\frac{k}{N}\log_2\frac{k}{N} - \left(1-\frac{k}{N}\right)\log_2\left(1-\frac{k}{N}\right)$$

Therefore, the lower bound for the fractional entropy, for RH time = 60s, is:

$$H_{frac} \approx -0.005\log_2 0.005 - 0.995\log_2 0.995 \approx 0.045$$

while, for RH time = 120s, it is:

$$H_{frac} \approx -0.025\log_2 0.025 - 0.975\log_2 0.975 \approx 0.169$$

Therefore, given the vast number of available cells, the Row Hammer PUF responses show sufficient entropy to derive cryptographic keys. For example, the derivation of a 1024-bit key, given a fractional entropy of 0.045, requires ≈ 2.8KB, while given a fractional entropy of 0.169, it requires ≈ 757B. Thus, as PUF size = 128KB, the PUF can create at least 45 1024-bit keys at RH time = 60s and 173 such keys at RH time = 120s.
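The arithmetic behind these key estimates can be retraced with a short Python sketch. It assumes that the fractional entropy is approximated by the binary entropy of the flip fraction and rounded to three decimals, as in the values quoted above; the helper names `binary_entropy`, `bytes_per_key` and `keys_available` are hypothetical and used only for this illustration.

```python
from math import floor, log2

PUF_SIZE_BYTES = 128 * 1024   # PUF size = 128KB
KEY_BITS = 1024               # length of the key to be derived


def binary_entropy(p: float) -> float:
    """Entropy per cell (in bits) when a fraction p of the cells flips."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bytes_per_key(frac_entropy: float) -> float:
    """PUF bytes needed to collect KEY_BITS bits of entropy."""
    return KEY_BITS / frac_entropy / 8


def keys_available(frac_entropy: float) -> int:
    """Number of whole keys that fit into the PUF at the given entropy per cell."""
    return floor(PUF_SIZE_BYTES / bytes_per_key(frac_entropy))


for rh_time, flip_fraction in ((60, 0.005), (120, 0.025)):
    h = round(binary_entropy(flip_fraction), 3)   # 0.045 and 0.169, respectively
    print(f"RH time = {rh_time}s: entropy/cell = {h}, "
          f"bytes per key = {bytes_per_key(h):.0f}, keys = {keys_available(h)}")
```

Note that the number of keys obtained in this way is sensitive to rounding: with the rounded value 0.169 the sketch yields 173 keys, and with 0.045 it yields a value consistent with the bound of at least 45 keys stated above.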

Regarding the Robustness and Uniqueness of the Responses
In a similar fashion to the original work by Schaller et al. [1], we also consider the J intra and J inter metrics, in order to assess the robustness and uniqueness, respectively, of the responses of our Row Hammer PUF implementations. As Figure 8 shows, the Row Hammer PUF responses of both the firmware and the kernel module implementation exhibit a high degree of robustness and uniqueness, as, for both cases, the J intra values are close to one and the J inter values close to zero, in a similar fashion to Figure 5. However, we need to note that our results consider all cases for the different parameter values shown in Table 3, and not just the different cases for hammer row IV = '0xAA' and PUF row IV = '0x55'.
As the values of J intra and J inter do not overlap in any case, we can conclude that all Row Hammer PUF instances can be robustly and uniquely identified. Nevertheless, we note that in Figure 8, J intra values for RH time = 120s are closer to one and J inter values for RH time = 60s are closer to zero, in a similar fashion to Figure 5. Such a result should be expected, due to the larger number of bit flips observed at RH time = 120s in comparison to RH time = 60s. Furthermore, Figure 9 and Figure 10 present in more detail the J intra and J inter values for all cases considered, for the firmware and the kernel module implementation, respectively. These two figures also clearly indicate that, for both implementations as well as both Cache states, all the Row Hammer PUF instances can be robustly and uniquely identified. Additionally, we note that, in most cases, a few outliers exist for J intra values, which could potentially be ignored.
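The non-overlap criterion used above can be illustrated with a small Python sketch that builds the J intra and J inter distributions from a collection of responses and checks whether they are separated. The data layout (a dictionary mapping a device label to a list of flipped-index sets), the function names and the toy data are assumptions made for this example.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def intra_inter(responses_by_device: dict):
    """Return (J intra values, J inter values) over all pairs of responses."""
    intra = [jaccard(a, b)
             for responses in responses_by_device.values()
             for a, b in combinations(responses, 2)]
    inter = [jaccard(a, b)
             for d1, d2 in combinations(list(responses_by_device), 2)
             for a in responses_by_device[d1]
             for b in responses_by_device[d2]]
    return intra, inter


def separable(intra, inter) -> bool:
    """True if every J intra value lies above every J inter value."""
    return min(intra) > max(inter)


# Hypothetical toy data: two devices with two responses each.
toy = {
    "board_A": [{1, 2, 3, 4}, {1, 2, 3, 5}],
    "board_B": [{10, 11, 12, 13}, {10, 11, 12, 14}],
}
intra, inter = intra_inter(toy)
print(intra, inter, separable(intra, inter))
```

In practice, a margin between the two distributions, rather than mere non-overlap, would be desirable to tolerate additional noise.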
In order to address the nature of these outliers, we present Figure 11 and Figure 13, which show the distributions of J inter and J intra values, respectively, grouped by hammer row IV, for different PUF row IV, Cache states and RH type, for the firmware implementation, and Figure 12 and Figure 14, which show the same distributions for the kernel module implementation.
As one can see in Figure 11, the lowest J inter values for the firmware implementation seem to occur for hammer row IV = '0xAA', in all cases. However, Figure 12 indicates that J inter values for the kernel module implementation seem to be similar for all hammer row IV values, in all cases. Furthermore, Figure 13 indicates that all J intra values for the firmware implementation are close to one, apart from some values for RH type = SSRH, with the cache operation enabled. On the contrary, Figure 14 shows that J intra values for the kernel module implementation are noisier in all cases, with the highest J intra values for this implementation occurring for hammer row IV = '0xAA', in all cases.
Figure 13 and Figure 14 both include values near 0.7, which do not appear to be clear outliers, indicating that the usual error correction schemes may not be applicable for the stabilisation of all the responses of the firmware and kernel module implementations. We, therefore, propose the application of the helper data scheme proposed by Schaller et al. [18] for the error correction of such cases. However, for hammer row IV = '0xAA' and PUF row IV = '0x55'/'0xAA', the minimum J intra values seem to be under 0.9, in all cases, and therefore, their noise can be easily corrected by standard Fuzzy Extractor (FE) constructions [48].
Finally, as 20 measurements were performed for each combination of parameters, we have also analysed the variance of these repeated measurements, based on the work of Bakeman [51]. We utilise an ANalysis Of Variance (ANOVA) method, in order to discover the parameters that have the strongest effects on our results, and consider only significant and large factor effects as meaningful. Our effect size is calculated as generalized eta-squared (η²_G), based on the work of Bakeman [51], with values of η²_G > 0.26 denoting strong effects, i.e. factors accounting for more than 26% of the data variance.
For J inter values, ANOVA reveals that both the hammer row IV and the PUF row IV have a significant effect on them. In particular, for the firmware implementation, ANOVA, based on the method suggested by Bakeman [51], indicates that the hammer row IV has the strongest effect (F(3, 15) = 229.53, p < 0.001, η²_G = 0.96) on the J inter values, while the PUF row IV also has a significant effect (F(2, 10) = 30.93, p < 0.001, η²_G = 0.53) on them, as well as the interaction between the two sets of initial values (F(6, 30) = 220.43, p < 0.001, η²_G = 0.93). For the kernel module implementation, ANOVA indicates that the PUF row IV has the strongest effect (F(2, 10) = 27.02, p < 0.001, η²_G = 0.78) on the J inter values, while the hammer row IV also has a significant effect (F(3, 15) = 24.37, p < 0.001, η²_G = 0.31) on them, as well as the interaction between the two sets of initial values (F(6, 30) = 55.15, p < 0.001, η²_G = 0.73).
For J intra values, ANOVA also reveals that both the hammer row IV and the PUF row IV have a significant effect on them. In particular, for the firmware implementation, ANOVA, based on the method suggested by Bakeman [51], indicates that the hammer row IV has the strongest effect (F(3, 9) = 56.37, p < 0.001, η²_G = 0.88) on the J intra values, while the PUF row IV also has a significant effect (F(2, 6) = 132.51, p < 0.001, η²_G = 0.79) on them, as well as the interaction between the two sets of initial values (F(6, 18) = 92.36, p < 0.001, η²_G = 0.94). For the kernel module implementation, ANOVA indicates that the PUF row IV has the strongest effect (F(2, 6) = 21.46, p = 0.002, η²_G = 0.63) on the J intra values, while the hammer row IV also has a significant effect (F(3, 9) = 38.23, p < 0.001, η²_G = 0.51) on them. In this case, the interaction between the two sets of initial values does not seem to have a meaningful effect (F(6, 18) = 1.33, p = 0.293, η²_G = 0.23) on the J intra values.
These results seem mostly consistent with the results shown in the different figures. However, the difference in the ANOVA values for the J intra and J inter metrics for the two implementations under examination, as well as the visible variations in the values presented in Figures 11 to 14, indicate that there is another factor that significantly affects the values of these two metrics.

Extended Investigation of the Role of Temperature on the Responses of the Row Hammer PUF
The original paper by Schaller et al. [1] recognised that the original Row Hammer PUF responses could be influenced by its operating temperature. Therefore, it examined the behaviour of the original Row Hammer PUF at different levels of its operating temperature, namely 40 °C (the working temperature of the DRAM on the PandaBoard), 50 °C and 60 °C. Schaller et al. [1] presented the average number of bit flips and the J intra values for PUF responses taken at these respective temperatures, as shown in Table 4. Nevertheless, the original work by Schaller et al. [1] does not present any J intra values calculated for two responses that have been taken at different temperatures from each other. As we will show, this might have been a major shortcoming of this work, as responses taken from the same Row Hammer PUF at different temperatures differ significantly, and Row Hammer PUF instances cannot be robustly and uniquely identified based on them. Nevertheless, as Schaller et al. [1] indicate, while bit flips increase at higher temperatures, the noise level stays constant at different temperatures, when the temperature is stable. Therefore, the Row Hammer PUF exhibits sufficient stability to be used at any temperature, within its physical limits, as long as the temperature remains stable.
Our evaluation results show that even small changes in the ambient temperature of the Row Hammer PUF can have such dramatic effects on its responses that two responses taken from the same Row Hammer PUF instance at two temperatures differing by only 10 °C cannot, in general, be used to identify that instance in a robust way and, sometimes, cannot even be used to uniquely identify such an instance. However, in order to validate that our Row Hammer PUF implementations can be used at different temperatures, when the temperature remains stable, we utilise the same methodology as Schaller et al. [1] and first present how temperature variations affect the average fractional number of bit flips observed in the responses of both the firmware implementation, in Figure 15, and the kernel module implementation, in Figure 16.
We have evaluated both the firmware and the kernel module Row Hammer PUF implementations in the region from 0 °C to 70 °C, using the ambient temperature and without reading out the exact operating temperature of the DRAM module. We performed our experiments using a Heraeus Vötsch HC4005 climate chamber, which has an absolute accuracy of ±0.8 °C. We have also performed experiments for both Row Hammer PUF implementations at an ambient temperature of 80 °C, at which, however, the PandaBoard becomes unstable and either resets itself or hangs until it is manually reset.
As Figures 15 and 16 show, for RH time = 60s, the average fractional number of bit flips is close to 0% of the PUF size for 0 °C, for both implementations, and only starts rising after the temperature has risen beyond 20 °C, reaching 50% of the PUF size, for the firmware implementation, and more than 40% of the PUF size, for the kernel module implementation, at 70 °C. As Figures 15 and 16 also show, for RH time = 120s, the average fractional number of bit flips is very close to 0% of the PUF size, for the kernel module implementation, and slightly above 2% of the PUF size, for the firmware implementation, for 0 °C. The average fractional number of bit flips starts rising slightly before 20 °C, for both implementations, reaching more than 60% of the PUF size, for the firmware implementation, and more than 70% of the PUF size, for the kernel module implementation, at 70 °C. This is a clear indication that both Row Hammer PUF implementations may face uniqueness problems for low RH time and low temperatures, as not enough bit flips will be occurring, and also for high RH time and high temperatures, as too many bit flips will be occurring, potentially preventing in both cases the correct identification of the PUF instance.

Peer-reviewed version available at Cryptography 2018, 2, 13; doi:10.3390/cryptography2030013

Additionally, we have also examined the effects of temperature variations on the J intra and J inter values at various temperatures, as shown in Figure 17 and Figure 18, for the firmware and the kernel module, respectively. As can be seen in Figures 17 and 18, the values of the J intra metric are close to one for both implementations and all temperatures examined, while the values of the J inter metric are close to zero for both implementations and temperatures below 60 °C, being below 0.1 for temperatures below 50 °C, and below 0.2 for temperatures between 50 °C and 60 °C. However, for temperatures between 60 °C and 70 °C, they rise abruptly and reach, for RH time = 60s, values close to 0.25, for the firmware, and close to 0.35, for the kernel module implementation, and, for RH time = 120s, values close to 0.45, for the firmware, and close to 0.6, for the kernel module implementation. This is a clear indication that both Row Hammer PUF implementations may face uniqueness problems for high RH time and high temperatures, as the J inter values approach the J intra ones, surpassing even the value of 0.5, and, therefore, potentially preventing in both cases the correct identification of the PUF instance.
Furthermore, as 20 measurements were performed for each combination of parameters for every 10 °C, in the temperature region from 0 °C to 70 °C, we have also analysed the variance of these repeated measurements, based on the work of Bakeman [51]. We utilise this ANalysis Of Variance (ANOVA) method, in order to discover the parameters that have the strongest effects on our results, and consider only significant and large factor effects as meaningful. Our effect size is calculated as generalized eta-squared (η²_G), based on the work of Bakeman [51], with values of η²_G > 0.26 denoting strong effects, i.e. factors accounting for more than 26% of the data variance.
Our ANOVA analysis, in general, reveals that temperature indeed has a profound effect on both J intra and J inter values. However, it has a larger effect on the J inter values - with F (7, 35)
As this section shows, temperature can significantly influence the responses of the Row Hammer PUF, affecting both their robustness, in general, as well as their uniqueness, in some cases. We can, therefore, assume that minor variations observed for room temperature measurements could be caused by small variations in the ambient temperature. However, we have also shown that the Row Hammer PUF can be used over a large range of ambient temperature values, as long as the temperature remains the same. Nevertheless, even in this case, it is uncertain whether it will operate sufficiently well at very low temperatures, at which, apart from the long time periods that may be required for responses to be generated, data remanence effects can also start to affect its operation [52]. As we have discussed, uniqueness problems may appear both at very low temperatures for low RH time, as not enough bit flips may be occurring, and at high temperatures for high RH time, as too many bit flips may be occurring. In the latter case, we could use the indices of the cells that have not yet flipped, which could also provide unique identification of different devices.
In conclusion, however, we need to state that the temperature dependency of the Row Hammer PUF is an issue that will need to be adequately addressed before this PUF can be considered an efficient security mechanism for widespread usage. We also need to note that our experiments were based on different values of the ambient temperature, a characteristic that an attacker can very easily manipulate, and not on the operating temperature of the PUF itself.
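The idea of using the indices of the cells that have not yet flipped, for the case in which too many bit flips occur, can be illustrated with the following Python sketch. It is a toy example under stated assumptions: the 16-bit response size, the two index sets and the helper names are hypothetical and were chosen only to show how the Jaccard index of the complementary sets may separate two devices better than that of the flipped sets.

```python
def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def unflipped_indices(flipped: set, n_bits: int) -> set:
    """Indices of the cells that have not flipped in a response of n_bits bits."""
    return set(range(n_bits)) - set(flipped)


N_BITS = 16                    # hypothetical, very small response
dev_a = set(range(0, 12))      # 75% of the cells of device A have flipped
dev_b = set(range(4, 16))      # 75% of the cells of device B have flipped

# Flipped sets: 8 shared indices out of 16 distinct ones.
print(jaccard(dev_a, dev_b))                          # 0.5
# Non-flipped sets: {12..15} versus {0..3}, no shared index.
print(jaccard(unflipped_indices(dev_a, N_BITS),
              unflipped_indices(dev_b, N_BITS)))      # 0.0
```

Whether this effect carries over to real responses at high temperatures would need to be confirmed experimentally.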

Potential Statistical Relations Among PUF Cells
In this section, we examine whether there is some statistical relation between the PUF cells that flip and their neighbourhood. We examine whether there is a statistical relation between PUF cells that have flipped and the values of their neighbouring PUF cells, and also whether there is a statistical relation between PUF cells that have flipped and other PUF cells in their neighbourhood that also flip for the same or a lower RH time value.
In this way, we can investigate whether there is some way to predict the positions of the bit flips or whether they appear to be random. If the positions of the bit flips could be predicted, then a number of different attacks taking advantage of this property might be possible. However, in all cases, our results show that there appears to be no statistical relation between the PUF cells that flip and their neighbourhood. Nevertheless, a more in-depth investigation would be required before we could state with absolute certainty that such a relation does not exist.
First, we examine the average values of the PUF cells around a PUF cell that has flipped, at room temperature, for PUF size = 128KB, RH time = 120s, hammer row IV = '0x55', PUF row IV = '0xAA' and all the different combinations of cache state and RH type, as shown in Table 5 and Table 6 for the firmware and the kernel module implementation, respectively. In this way, we can detect potential statistical relations affecting the response of the PUF that stem from interactions between the charge that was stored in a PUF cell that has flipped, i.e. that has had at least half of its charge leaked, and the charge stored in other PUF cells found in different rows and columns of the DRAM around the flipped PUF cell. We do so by using a 3 × 3 window that has the flipped PUF cell in its centre every time. Of course, only cells in the same row of this window are adjacent to each other in the DRAM module, as PUF cells in different rows may be separated by a hammer row in the DRAM module. Our results, which are shown in Table 5 and Table 6, indicate that the average probability of a neighbouring PUF cell having a logical value of one or zero is close to 50% in all cases, suggesting a lack of any statistical relation between these values and the fact that the centre cell of the window has flipped. We test for RH time = 120s only, as the PUF cells that have flipped for RH time = 60s are a subset of the PUF cells that have flipped for RH time = 120s.
Subsequently, we examine the average probability that a PUF cell has flipped in the neighbourhood of another PUF cell that has flipped, at room temperature, for PUF size = 128KB, RH time = 120s, hammer row IV = '0x55', PUF row IV = '0xAA' and all the different combinations of cache state and RH type, as shown in Table 7 and Table 8 for the firmware and the kernel module implementation, respectively. In this way, we can detect potential statistical relations affecting the response of the PUF that stem from interactions between the charge that was stored in a PUF cell that has flipped, i.e. that has had at least half of its charge leaked, and the charge of other PUF cells found in different rows and columns of the DRAM in an extensive region around the flipped PUF cell, which could lead these other PUF cells to decay faster than usual and, therefore, to flip as well. We do so by using a 7 × 7 window that has the flipped PUF cell in its centre every time. Of course, only cells in the same row of this window are adjacent to each other in the DRAM module, as PUF cells in different rows may be separated by a hammer row in the DRAM module. Our results, which are shown in Table 7 and Table 8, indicate that the average probability of a PUF cell being flipped in the extended neighbourhood considered is consistently similar to the general probability of a PUF cell being flipped at RH time = 120s, for each case, as shown in Figure 6 and Figure 7 for the firmware and the kernel module implementation, respectively. Therefore, our results suggest a lack of any statistical relation between PUF cells that flip within a particular RH time. We test for RH time = 120s only, as the PUF cells that have flipped for RH time = 60s are a subset of the PUF cells that have flipped for RH time = 120s.
Finally, we also examine the average probability that a PUF cell that has flipped within RH time = 60s is in the neighbourhood of another PUF cell that has flipped within RH time = 120s, at room temperature, for PUF size = 128KB, hammer row IV = '0x55', PUF row IV = '0xAA' and all the different combinations of cache state and RH type, as shown in Table 9 and Table 10 for the firmware and the kernel module implementation, respectively. In this way, we can detect potential statistical relations affecting the response of the PUF that stem from interactions between the charge that was stored in a PUF cell that has flipped, i.e. that has had at least half of its charge leaked, within RH time = 120s and the charge of other PUF cells, found in different rows and columns of the DRAM in an extensive region around the flipped PUF cell, that have flipped, i.e. that have had at least half of their charge leaked, within RH time = 60s and, therefore, may have also affected the decay of the PUF cell that has flipped within RH time = 120s. We do so by using a 7 × 7 window that has a PUF cell that flips within RH time = 120s in its centre every time. Of course, only cells in the same row of this window are adjacent to each other in the DRAM module, as PUF cells in different rows may be separated by a hammer row in the DRAM module. Our results, which are shown in Table 9 and Table 10, indicate that the average probability of a PUF cell having flipped within RH time = 60s and at the same time being in the neighbourhood of another PUF cell that has flipped within RH time = 120s is consistently similar to the general probability of a PUF cell being flipped at RH time = 60s, for each case, as shown in Figure 6 and Figure 7 for the firmware and the kernel module implementation, respectively. Therefore, our results suggest a lack of any statistical relation between PUF cells that flip at a particular RH time = t 1 and PUF cells that flip at another particular RH time = t 2 , with t 1 < t 2 .
Thus, our results indicate that the logical values (and, therefore, also the charges) and the retention times of victim cells in a DRAM utilised for the implementation of the Row Hammer PUF do not affect the retention times of other victim cells in that DRAM, while it is being employed as a Row Hammer PUF implementation, as the logical values and retention times of PUF cells around a PUF cell that has flipped appear to be random. Additionally, the position of new bit flips does not appear to depend on the position of bit flips that have already occurred. Our results do not indicate any statistical relation of any sort, including a potential clustering of the bit flips.
We chose to examine the logical values of cells neighbouring a PUF cell that has flipped using a 3 × 3 window, as these values are also based on the PUF row IV, and it would be easy to detect potential statistical relations. In contrast, we used a more extensive 7 × 7 window to examine the probability that PUF cells in the neighbourhood of a PUF cell that has flipped also flip within the same or a lower RH time value, because leakage paths and charge interactions within the DRAM module could potentially occur within a broad range around the cell that has flipped and is placed in the centre of the 7 × 7 window.
As the previous sections indicate, although the Row Hammer PUF seems to be strongly dependent on temperature, its responses are, in general, unique, robust and of high entropy. Nevertheless, as temperature variations can significantly affect the robustness of the Row Hammer PUF responses, future research will need to fully address this issue.
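The neighbourhood-window analysis described earlier in this section can be sketched as follows. This is a minimal illustration, not the code used for the experiments: it assumes that the PUF region has already been read out into two-dimensional lists of 0/1 values (one list per PUF row), and the function name `neighbourhood_window` and its parameters are hypothetical. For every cell marked in `centres`, it averages the entries of `values` at each offset of a (2·half+1) × (2·half+1) window.

```python
from typing import List, Optional

Grid = List[List[int]]  # one inner list per PUF row, entries are 0 or 1


def neighbourhood_window(centres: Grid, values: Grid,
                         half: int) -> List[List[Optional[float]]]:
    """Average `values` at every offset around each cell marked in `centres`.

    Returns a (2*half+1) x (2*half+1) matrix of averages. The centre entry,
    and any offset for which no in-bounds sample exists, is None.
    """
    rows, cols = len(centres), len(centres[0])
    size = 2 * half + 1
    sums = [[0] * size for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    for r in range(rows):
        for c in range(cols):
            if not centres[r][c]:
                continue  # only windows centred on marked (flipped) cells
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0):
                        continue  # skip the centre cell itself
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue  # window sticks out of the PUF region
                    sums[dr + half][dc + half] += values[rr][cc]
                    counts[dr + half][dc + half] += 1
    return [[sums[i][j] / counts[i][j] if counts[i][j] else None
             for j in range(size)] for i in range(size)]
```

Under these assumptions, the three analyses differ only in their arguments: `centres` = flips at 120s with `values` = logical values read and `half = 1` for the 3 × 3 case, `centres` = `values` = flips at 120s with `half = 3` for the first 7 × 7 case, and `centres` = flips at 120s with `values` = flips at 60s and `half = 3` for the second 7 × 7 case.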

It should also be noted that the dependency of the Row Hammer PUF on temperature makes it, in general, susceptible to Denial of Service (DoS) attacks, as an attacker could change the ambient temperature and, in this way, also change the PUF response. Additionally, if the ambient temperature is very low or very high, the PUF response could also be guessed or brute-forced, as the number of bit flips observed in it could be too low or too high, respectively. Nevertheless, this latter attack also depends on whether the attacker knows the PUF row IV.
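The guessing argument above can be made concrete with a rough count. The sketch below is a simplified model, not a result of this work: it assumes that an attacker knows the PUF row IV and the number of bit flips, and that flips are equally likely at every candidate cell, so that the number of candidate responses is a binomial coefficient. The function name `guessing_bits` is hypothetical.

```python
import math


def guessing_bits(candidate_cells: int, flips: int) -> float:
    """log2 of the number of ways to place `flips` flips among the cells.

    This is an upper bound on the guessing effort under the uniform-placement
    assumption stated above; real flip positions need not be uniform.
    """
    if not 0 <= flips <= candidate_cells:
        raise ValueError("flips must lie between 0 and candidate_cells")
    return math.log2(math.comb(candidate_cells, flips))


if __name__ == "__main__":
    n = 128 * 1024 * 8  # number of bits in a 128KB PUF region
    for k in (0, 1, 10, 100, 1000):
        print(k, round(guessing_bits(n, k), 1))
```

In this model the search space collapses both when almost no cell flips and when almost every candidate cell flips, which matches the observation that very low and very high temperatures make the response easier to guess.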
A proposed way to address the dependency of the Row Hammer PUF on temperature is to examine the effects of temperature on the PUF responses in detail, in order to identify a measurement time for each particular temperature, such that each of these times results in a similar PUF response being acquired [17,18]. In this way, by using a set of equivalent RH time values, one for each particular temperature, the Row Hammer PUF implementations can provide robust PUF responses even at different temperatures. However, such a solution may still suffer from high response generation times at rather low temperatures.
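As an illustration of the equivalent-time idea, the following sketch selects an RH time from a calibration table indexed by temperature. The table values are made-up placeholders, not measurements from this work, and the function name `equivalent_rh_time` is hypothetical. The nearest calibrated temperature is used rather than interpolation, since the relation between temperature and retention time is not assumed to be linear.

```python
from typing import Dict


def equivalent_rh_time(temperature_c: float,
                       calibration: Dict[float, float]) -> float:
    """Return the RH time (s) calibrated for the nearest known temperature.

    Ties are resolved towards the lower calibrated temperature.
    """
    if not calibration:
        raise ValueError("calibration table is empty")
    nearest = min(sorted(calibration), key=lambda t: abs(t - temperature_c))
    return calibration[nearest]


# Placeholder calibration (temperature in degrees C -> RH time in seconds).
# These numbers are illustrative only and would have to be measured per device.
PLACEHOLDER_CALIBRATION = {20.0: 120.0, 40.0: 60.0, 60.0: 30.0}
```

A device would read its temperature sensor, look up the corresponding RH time and hammer for that long, with the aim of obtaining a response similar to the enrolled one.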
Another potential way to address the effects of temperature on the Row Hammer PUF responses would be to combine these responses with the temperature of the PUF module. In particular, as the PandaBoard's microprocessor module, which contains its on-board DRAM package, also contains a temperature sensor, it is possible to combine the PUF responses with readings of the current temperature of the DRAM module. Preliminary experiments have indicated that this approach can provide promising results. However, whether it can be used to solve the aforementioned issue in an efficient way remains within the scope of future work. Nevertheless, such a solution could also be utilised in order to stabilise the responses of DRAM retention-based PUFs in general, as their implementations seem to suffer from such temperature dependencies [17,18,20].
Therefore, as the effects of temperature variations on the Row Hammer PUF can either be controlled or mitigated, its PUF responses could be considered as unique per PUF instance, mostly robust and, in general, of high entropy. In particular, as our room temperature experiments indicate, if the temperature remains relatively stable, PUF responses are highly stable and unique, with measured J intra and J inter values being, in all cases, close to one and zero, respectively.
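For reference, the two metrics can be sketched as below, assuming that J denotes the Jaccard index computed over the sets of flipped bit positions of two responses, so that identical responses yield 1 and disjoint responses yield 0. The function names are hypothetical, and treating two empty responses as identical is a convention chosen for this sketch.

```python
from itertools import combinations
from typing import List, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard index |a & b| / |a | b| of two sets of flipped bit positions."""
    union = a | b
    if not union:
        return 1.0  # assumption: two responses without flips count as equal
    return len(a & b) / len(union)


def j_intra(measurements: List[Set[int]]) -> List[float]:
    """Pairwise indices between repeated measurements of one PUF instance."""
    return [jaccard(a, b) for a, b in combinations(measurements, 2)]


def j_inter(instance_a: List[Set[int]],
            instance_b: List[Set[int]]) -> List[float]:
    """Indices between measurements taken from two different PUF instances."""
    return [jaccard(a, b) for a in instance_a for b in instance_b]
```

With these definitions, a robust PUF yields J intra values near one, while a unique PUF yields J inter values near zero.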
Moreover, the Row Hammer PUF also offers a number of further advantages in comparison to other PUFs. First of all, it can be implemented in most contemporary computer systems, as DRAM is an inherent component of them. Secondly, it offers multiple Challenge-Response Pairs (CRPs) and can be accessed at run-time, in contrast to the SRAM PUF, which provides only a single CRP and can only be accessed at boot-time. Additionally, it can provide significantly lower generation times and higher entropy than similar DRAM retention-based PUFs, while also allowing for the implementation of the same cryptographic protocols as the ones implemented using those exact DRAM retention-based PUFs, such as key agreement [17] and authentication [17,18] protocols that have been implemented using the exact same hardware.
Finally, all of its current implementations require administrative rights to be properly inserted into a system and executed, which could prevent a number of attacks against them. Nevertheless, we note that security is a relative term, being highly dependent on the manufacturing costs, the costs of performing a successful attack and the potential gains/damages of such an attack [53].
Therefore, the Row Hammer PUF, like any other security mechanism [53], cannot provide perfect security, even if its PUF responses are no longer affected by temperature variations. Thus, in order to assess its value as a security mechanism and, in this way, also determine its potential for commercial adoption, we should examine its manufacturing costs, the lowest cost of a successful attack and the potential gains/damages of such an attack. However, we already know that the manufacturing costs of the Row Hammer PUF are minimal for most contemporary computer system implementations, as DRAMs are inherent components of them. We have also discussed that the easiest way to attack the Row Hammer PUF is by changing the ambient temperature and that such an attack can either cause a DoS or, more rarely, lead to the PUF response becoming quite easy to reveal.
Peer-reviewed version available at Cryptography 2018, 2, 13; doi:10.3390/cryptography2030013
Hence, we can conclude that Row Hammer PUF implementations, and especially the kernel module one, implement a flexible, lightweight, cost-efficient and practical security primitive that can be used as a basis for the realisation of cryptographic applications, especially in low-end COTS devices, such as IoT hardware, that have limited resources and cannot support more complex security mechanisms, such as TPMs. Nevertheless, this security primitive suffers from a significant vulnerability in the form of its strong dependency on temperature variations, which would prevent its commercial adoption until it has been sufficiently addressed.

Conclusion
This work has presented an improved firmware implementation of the Row Hammer PUF that was originally introduced by Schaller et al. [1], as well as a run-time accessible implementation of the same PUF. The Row Hammer PUF is a memory-based intrinsic PUF that takes advantage of both the row hammer effect in DRAMs and the data retention characteristic of their cells, in order to provide unique responses. This is the first application of the row hammer effect that can be used to enhance the security of a system, rather than diminish it. Additionally, as DRAM modules are inherent components of most contemporary systems, the Row Hammer PUF can be implemented in them without the need for additional hardware for its construction or operation.
In this work, we have extensively evaluated both a firmware and a kernel module implementation of the Row Hammer PUF, showing that the two implementations provide equally good results for all cases assessed. Additionally, we have also confirmed that the Row Hammer PUF can provide unique and robust responses of high entropy. Finally, we have also shown that, in some cases, disabling the cache can increase the number of bit flips observed in the Row Hammer PUF responses and, therefore, also reduce the time needed to generate a response, or increase its entropy.
In general, as we have shown, the Row Hammer PUF can be utilised in order to address the problem of DRAM retention-based PUFs needing extended amounts of time in order to generate their responses [20]. Furthermore, the Row Hammer PUF also provides multiple Challenge-Response Pairs (CRPs) and its kernel module implementation allows access to it at run-time. Moreover, both the firmware and the kernel module implementation of the Row Hammer PUF can be used as a basis for the implementation of cryptographic applications, such as key agreement [17] and authentication [17,18] protocols that have been designed for DRAM retention-based PUFs.
Nevertheless, our extended investigation of the effects of temperature on the Row Hammer PUF has revealed that the Row Hammer PUF is strongly dependent on temperature variations. Temperature variations can significantly affect its robustness and, in extreme cases, also substantially affect its entropy. As this is also a known problem for DRAM retention-based PUFs [17,18,20], we have briefly discussed some potential ways to address this issue. However, as this issue undoubtedly affects the potential of the Row Hammer PUF for commercial adoption and widespread use, it needs to be addressed in a much more comprehensive manner by future research. If the dependency of the Row Hammer PUF on temperature can be controlled in an efficient manner, then this PUF can also potentially be used for the attestation of time and/or temperature. Moreover, future research should additionally investigate and address the effects of voltage variations and aging on the Row Hammer PUF.
Finally, we can conclude that the Row Hammer PUF can, in general, be utilised as a basis for providing flexible, lightweight, cost-efficient and practical run-time cryptographic solutions in low-end COTS devices, such as IoT hardware, that cannot support other more resource-demanding security primitives, such as TPMs. Nevertheless, although the Row Hammer PUF can be used to significantly improve the security of a system, the dependency of all its current implementations on temperature variations must be taken into account when considering it as a security mechanism.

Supplementary Materials:
The code modifications employed by Schaller et al. [1] in order to implement the original Row Hammer PUF on the PandaBoard are available as an open-source patch file at http://www.seceng.de/schaller/row~hammer-puf/.

Figure 1. Schematics of (a) the organisation of individual DRAM cells and (b) the overall DRAM organisation. The blue arrows in (a) show potential leakage paths.

Figure 3. Average fractional number of bit flips observed in the responses of the original Row Hammer PUF, given in percentages relative to PUF size, depending on combinations of hammer row IV and PUF row IV. Configuration used: PUF size = 128KB, RH time = 120s and (SSRH/DSRH).

Schaller et al. [1] also examined the effects of Single-Sided (SSRH) and Double-Sided Row Hammering (DSRH) on the number of bit flips observed in the original Row Hammer PUF responses, as shown in Figure 4. Their results show that the use of DSRH results in slightly more bit flips being observed than the use of SSRH. However, the difference in the number of bit flips produced with the two methods does not appear to be significant, as Figure 4 indicates.

Figure 4. Average fractional number of bit flips observed in the responses of the original Row Hammer PUF, given in percentages relative to PUF size, using PUF row IV = '0xAA', different values of hammer row IV and (a) SSRH or (b) DSRH.

Figure 5. Histogram of J inter and J intra values for the original three PUF instances using 20 measurements with PUF row IV = '0xAA', hammer row IV = '0x55', PUF size = 128KB and RH type = SSRH.

Figure 6. Average fractional number of bit flips observed in the responses of the firmware implementation of the Row Hammer PUF, given in percentages relative to PUF size, depending on combinations of hammer row IV and PUF row IV. Configuration used: PUF size = 128KB, (RH time = 60s/RH time = 120s), (SSRH/DSRH) and (Cache disabled/Cache enabled).

Figure 15. Temperature dependency of the average fractional number of bit flips observed in the responses of the firmware implementation, given in percentages relative to PUF size, for PUF size = 128KB, RH type = DSRH, PUF row IV = 0xAA, hammer row IV = 0x55, with the Cache disabled, (RH time = 60s/RH time = 120s) and ambient temperatures between 0 °C and 70 °C. 20 measurements have been performed for each combination of the presented values of the PUF parameters.

Figure 16. Temperature dependency of the average fractional number of bit flips observed in the responses of the kernel module implementation, given in percentages relative to PUF size, for PUF size = 128KB, RH type = DSRH, PUF row IV = 0xAA, hammer row IV = 0x55, with the Cache disabled, (RH time = 60s/RH time = 120s) and ambient temperatures between 0 °C and 70 °C. 20 measurements have been performed for each combination of the presented values of the PUF parameters.

Figure 17. Temperature dependency of the J inter and J intra values for the responses of the firmware implementation, for PUF size = 128KB, RH type = DSRH, PUF row IV = 0xAA, hammer row IV = 0x55, with the Cache disabled, (RH time = 60s/RH time = 120s) and ambient temperatures between 0 °C and 70 °C. 20 measurements have been performed for each combination of the presented values of the PUF parameters.

Figure 18. Temperature dependency of the J inter and J intra values for the responses of the kernel module implementation, for PUF size = 128KB, RH type = DSRH, PUF row IV = 0xAA, hammer row IV = 0x55, with the Cache disabled, (RH time = 60s/RH time = 120s) and ambient temperatures between 0 °C and 70 °C. 20 measurements have been performed for each combination of the presented values of the PUF parameters.

Table 5. 3 × 3 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped to have a logical value of one, at room temperature for the firmware implementation, when PUF size = 128KB, RH time = 120s, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 3 × 3 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.

Table 7. 7 × 7 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped, to have also flipped, at room temperature for the firmware implementation, when PUF size = 128KB, RH time = 120s, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 7 × 7 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.

Table 1. Parameters used for the evaluation of the original Row Hammer PUF characteristics, and their corresponding sets of values. Additionally, this firmware implementation is executed before caching is enabled.

Table 2. Overview of the average number of bit flips observed in the responses of the original Row Hammer PUF, depending on combinations of hammer row IV and PUF row IV. Configuration used: PUF size = 128KB, RH time = 120s and (SSRH/DSRH).

Table 3. Parameters used for the evaluation of the Row Hammer PUF characteristics, and their corresponding sets of values. Compared to Table 1, the PUF size is fixed to 128KB and caching can be enabled or disabled through the manipulation of the registers of the Cache Management Unit (CMU). Additionally, cases with PUF row IV = '0x55' are not examined in depth, as we have verified that they lead to no bit flips for our PandaBoard implementations. Finally, both the firmware and the kernel module implementation have been tested using this configuration.

Posted: 28 April 2018 doi:10.20944/preprints201804.0369.v1
Figures, the values of J inter and J intra for different RH time are grouped in a single distribution.
Figure 8. Histogram of J inter and J intra values for the firmware and kernel module implementations, using 20 measurements for each case of different combinations of RH type, PUF row IV, hammer row IV, Cache state and (RH time = 60s/RH time = 120s), for PUF size = 128KB, according to Table 3.

Figure 9. Histogram of J inter and J intra values for the firmware implementation, using 20 measurements for each case of different combinations of RH type, PUF row IV, hammer row IV, Cache state and RH time, for PUF size = 128KB, according to Table 3.

Figure 10. Histogram of J inter and J intra values for the kernel module implementation, using 20 measurements for each case of different combinations of RH type, PUF row IV, hammer row IV, Cache state and RH time, for PUF size = 128KB, according to Table 3.

Figure 11. Distributions of J inter values, grouped by hammer row IV, for different PUF row IV, Cache states and RH type, for the firmware implementation. J inter values for RH time = 60s and RH time = 120s are grouped in a single distribution, per case.

Figure 12. Distributions of J inter values, grouped by hammer row IV, for different PUF row IV, Cache states and RH type, for the kernel module implementation. J inter values for RH time = 60s and RH time = 120s are grouped in a single distribution, per case.

Figure 13. Distributions of J intra values, grouped by hammer row IV, for different PUF row IV, Cache states and RH type, for the firmware implementation. J intra values for RH time = 60s and RH time = 120s are grouped in a single distribution, per case.

Figure 14. Distributions of J intra values, grouped by hammer row IV, for different PUF row IV, Cache states and RH type, for the kernel module implementation. J intra values for RH time = 60s and RH time = 120s are grouped in a single distribution, per case.

Table 4. Average number of bit flips and minimum J intra values obtained at operating temperatures of 40 °C, 50 °C and 60 °C, for the original Row Hammer PUF implementation with PUF row IV = '0xAA', hammer row IV = '0x55', PUF size = 128KB, RH time = 120s and RH type = SSRH.

Table 6. 3 × 3 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped to have a logical value of one, at room temperature for the kernel module implementation, when PUF size = 128KB, RH time = 120s, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 3 × 3 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.

Table 8. 7 × 7 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped, to have also flipped, at room temperature for the kernel module implementation, when PUF size = 128KB, RH time = 120s, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 7 × 7 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.

Table 9. 7 × 7 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped at RH time = 120s, to have flipped at RH time = 60s, at room temperature for the firmware implementation, when PUF size = 128KB, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 7 × 7 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.

Table 10. 7 × 7 windows showing the average probability of the neighbouring PUF row cells of a PUF row cell that has flipped at RH time = 120s, to have flipped at RH time = 60s, at room temperature for the kernel module implementation, when PUF size = 128KB, hammer row IV = '0x55' and PUF row IV = '0xAA' and (a) caching is disabled and RH type = DSRH, (b) caching is enabled and RH type = DSRH, (c) caching is disabled and RH type = SSRH, and (d) caching is enabled and RH type = SSRH. Note that the cells in the 7 × 7 windows presented are adjacent to each other in the DRAM module only if they are in the same row of each window.