Review

A Survey of Bit-Flip Attacks on Deep Neural Network and Corresponding Defense Methods

National Key Laboratory of Science and Technology on Information System Security, Beijing 100085, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(4), 853; https://doi.org/10.3390/electronics12040853
Submission received: 12 January 2023 / Revised: 1 February 2023 / Accepted: 7 February 2023 / Published: 8 February 2023
(This article belongs to the Section Computer Science & Engineering)

Abstract

As machine learning-related technology has made great progress in recent years, deep neural networks (DNNs) are widely used in many scenarios, including security-critical ones, which may incur a great loss when the DNN is compromised. Starting from an introduction to several commonly used bit-flip methods, this paper concentrates on bit-flip attacks aiming at DNNs and the corresponding defense methods. We analyze the threat models, method designs, and effects of attack and defense methods in detail, drawing some helpful conclusions about improving the robustness and resilience of DNNs. In addition, we point out several drawbacks of existing works, which can hopefully be addressed in future research.

1. Introduction

With the vigorous development of deep neural network (DNN) technology in recent years, DNNs have been widely used and deployed in many fields, such as image recognition, speech recognition, and big data applications. Many of these fields are security-critical, such as medical identification, automatic driving, etc., where misjudgment may lead to very serious consequences. Since the parameters of a DNN are robust and resilient to some extent, and data integrity verification mechanisms are generally deployed in high-performance software and hardware systems, the industry generally believes that the DNN is a robust and trustworthy structure.
However, DNNs are actually far more fragile than the industry believes due to the uncertainty of their internal parameters [1]. There are many environments for deploying DNNs at present, ranging from CPU-GPU high-performance platforms to FPGA-based edge devices, and there is no guarantee that there are enough resources to provide data integrity and availability verification [2]. Therefore, structures or components that store DNN-related data, such as firmware in GPUs and bitstreams in FPGAs, are likely to be targeted by attackers. Existing surveys have shown that commonly used error correction mechanisms, such as Error Correction Code (ECC) and Software Guard Extensions (SGX), have only limited functions, and some attack methods can bypass these checks [2,3]. Although a DNN has a certain fault tolerance capability, which is highly dependent on the network’s size, structure, training process, etc., its security is far from impeccable. For example, a DNN with fewer neurons is more likely to have a higher fault tolerance rate. Some studies have shown that the use of full-precision weights actually does more harm than good, as the inference efficiency is reduced, the security performance is worse, and the overhead becomes greater [4]. Currently, many DNNs use fixed-point data to represent the weights. In some extreme cases, weights are compressed to a 1-bit representation, called a binary neural network, which sacrifices performance but enhances security [5]. There are also many works proving that DNNs are quite vulnerable to some adversarial attacks, such as the perturbation of weights by adversarial samples [6]. This type of sample has little impact on human recognition but causes serious judgment errors in DNN models. Some works use adversarial samples to steal the parameters of a DNN in order to carry out further attacks.
Compared with adversarial sample attacks, hardware attacks are more stealthy and difficult to deal with [7,8]. One typical hardware attack is the side-channel attack, which uses methods such as electromagnetic channels, PCI interception, and runtime analysis to analyze and steal significant parameters of the DNN model [9]. The bit-flip attack is another kind of hardware attack, which tries to maliciously modify the parameters in some way, i.e., errors are injected into the DNN so that its function is obstructed or even completely lost. Common bit-flip attacks include the RowHammer attack, Voltage Frequency Scaling (VFS) attack, clock glitching, and laser injection. Bit flipping can target many components of a DNN, such as the multiplier, weight values, biases, and activation functions, which makes the method quite flexible. For example, an attacker with physical access to the hardware carrying the DNN may use a laser injection attack, whereas an attacker with only remote access can conduct a RowHammer attack. Bit-flip failures corrupt the memory storing the parameters of the victim model, thereby challenging the resilience of DNNs to bitwise corruption. Some works have explored the impact of a single bit-flip attack on the DNN model [10,11,12], and some important conclusions have been drawn from the perspective of the DNN’s characteristics and the bit-flip operations, e.g., random flips do not work well, drastic changes in parameters are the main cause of vulnerability, the ratio of vulnerable parameters is largely consistent across DNNs of different architectures, and the vulnerability increases with the DNN scale. These conclusions are instructive for the precise design of related attack and defense work. In addition, the adversarial example attack is generally designed independently for a certain type of problem, while the bit-flip attack is an indiscriminate attack that applies to various scenarios. Therefore, the bit-flip attack has a broader impact and is potentially more harmful.
To counter bit-flip attacks, corresponding defense methods have emerged. The work on analyzing and evaluating the security properties of DNNs is currently divided into two directions: theoretical analysis-based and experiment-based. The theoretical analysis avoids complicated experiments and produces intuitive and interpretable results. This kind of work focuses on the mathematical modeling of DNNs, analyzing the effects of perturbations in weights/biases/activation functions when bit flipping occurs, to theoretically discover methods to suppress perturbations and calculate their effects. Experimental analysis is more accurate and intuitive but complex to deploy, and it attracts more attention in academia and industry. Three main approaches are involved in defending against bit flipping: first, increasing the threshold of bit flipping, so that more bit flips are required to reduce the capacity of the DNN; second, detecting the presence of bit flips through traditional data detection methods, such as hash comparison; and third, training ML models to detect bit-flip attacks. This paper focuses on experimental analysis. As shown in Figure 1, the related work on bit-flip attacks has increased dramatically in recent years, indicating that this direction has become a hot topic in academia and industry. In terms of the development process, attack and defense are always in a state of mutual crossover, back and forth, and mutual restraint.
This paper conducts a detailed survey on the works using bit-flip to attack DNN and the corresponding defense works in recent years. The main contributions are as follows:
1.
We summarize the current commonly used methods that can trigger bit-flip in the DRAM system that DNN relies on.
2.
We summarize the mainstream bit-flip attack methods; analyze the mathematical principles of the attacks; analyze and compare the effects and overheads of different attacks.
3.
We compare and summarize the mainstream methods of defending against bit-flip; show the ideas of the current main defense work; analyze the key points of different defenses.
4.
We illustrate some shortcomings of the current work and put forward some possible starting points for future work.
The research flow of this paper is shown in Figure 2. After an introduction about the related background and our motivation, we present the commonly used bit-flip methods against DNNs. Then we meticulously research and analyze the current bit-flip attacks and corresponding defenses, including the mathematical models, attack methods, effects, and so on. Afterwards, we discuss our findings based on the existing methods and put forward some possible future directions. To the best of our knowledge, our work is the first to survey recent bit-flip attacks and defenses aiming at DNNs comprehensively and deeply. The rest of this paper is organized as follows: Section 2 introduces the current commonly used bit-flip methods; Section 3 introduces the current attack methods, including threat models, attack principles, an overview of attack work, and attack effects; Section 4 introduces the current defense methods, including defense ideas, an overview of defense work, and defense effects; Section 5 lists our findings based on current attack and defense work; Section 6 introduces related work; Section 7 puts forward some ideas for future work; and Section 8 draws a conclusion.

2. Bit-Flip Methods

The purpose of the bit-flip attack is to reduce the ability of the DNN model by changing the parameter state stored at runtime. At present, the bit-flip methods for the DRAM system on which the DNN runs mainly include the RowHammer attack, VFS attack, clock glitching attack, and laser injection attack.

2.1. Structure and Timing of DRAM

DRAM is widely used as a basic component of the memory system and has been developed very maturely. Figure 3 illustrates the hardware architecture of DRAM. A DRAM module consists of at least one DRAM rank communicating with the processor side through a memory channel. Each DRAM rank contains several DRAM chips, and each chip contains several banks, which are the smallest units that can be accessed in parallel. The structure of the bank is shown on the left side of Figure 4. It consists of several horizontal wordlines and vertical bitlines. The RowDecoder is responsible for resolving the specific address of the access operation and performing the read/write operation at the corresponding location. After that, the related data are passed to the RowBuffer composed of the sense amplifiers.
The right side of Figure 4 shows the structure of the DRAM cell, which is mainly composed of a storage capacitor and an access transistor. When the address is not selected, the access transistor is set to low voltage and there is no path between the capacitor and the bitline, so the information in the capacitor is stored. When the address line of the DRAM cell is selected, the access transistor is set to a high voltage, which reduces the energy barrier between the capacitor and the bitline and allows the information in the capacitor to be read.
Various operations in DRAM need to satisfy timing constraints, including Activation, Read/Write, Precharge, and Refresh. To access some positions of DRAM, an Activation operation needs to be performed first. After the Activation command reaches the target row, it will first confirm that each DRAM cell in the target row is correctly connected to the bitline, and the sense amplifier will disturb the voltage of the bitline until the value of the DRAM cell is recalibrated. When the bitline voltage is amplified to a certain level, the information in the DRAM cell becomes accessible. The activation instruction involves several timing parameters: tRCD is defined as the time required from row activation to data access; tRAS is defined as the time from row activation until the information in the DRAM cell is recalibrated. After Activation is completed, the DRAM cell can be read and written. Both read and write operations are performed through the RowBuffer, where tCL is defined as the read access delay and tCWL is defined as the write access delay. The Precharge command is used to deactivate the target address and charge the bitline for the next Activation. tRP is defined as the interval between the Precharge and Activation operations [13,14].
DRAM cells are dynamic and leaky, and the charge stored in them naturally leaks over time. Therefore, the cells in DRAM need to be periodically recharged by a refresh operation to maintain the stored charge at the original level. The refresh operation compensates for cell capacitor leakage by restoring the cell charge and consumes about 35% of the total energy consumption. Infrequent refresh, while reducing power consumption, may cause cells to lose their stored charge. The DDR3 SDRAM standard published by JEDEC defines the timing parameter tREFI (7.8 μs) as the refresh interval, which specifies the maximum time interval between refresh commands. The refresh interval tREFI is determined by considering the cell retention time tRET (64 ms), which is the maximum time for a cell to hold its stored value without restoring the charge. Since the stored value in a cell cannot be retained without restoring the charge, each cell should be refreshed within tRET to ensure reliable operation. As the DRAM cell size shrinks, defect-free cells can suffer from errors due to frequent accesses to rows even when the refresh interval of tREFI is maintained.
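As a quick consistency check on these two JEDEC parameters, dividing the retention time by the refresh interval shows how many refresh commands fit into one retention window:
$\frac{t_{RET}}{t_{REFI}} = \frac{64\ \text{ms}}{7.8\ \mu\text{s}} \approx 8192$
That is, the memory controller issues roughly 8192 refresh commands per retention window, each covering a fraction of the rows, so that every cell is restored once every 64 ms.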

2.2. RowHammer Attack

In 2014, Kim et al. reported a phenomenon [15] in which, when certain memory rows are repeatedly activated, bit flipping happens in nearby memory rows, causing memory isolation to be broken. The RowHammer attack can be vividly described as “frequent banging in one’s room that causes the neighbor’s door to shake open, thus gaining access to the neighbor’s house” [16,17]. Previous work has performed a silicon-level analysis to reveal the essence of the RowHammer attack [18]. Currently, there are two possible causes of RowHammer. One possible cause is that the frequent activation of some rows may increase the cell temperature and incur electron injection/diffusion/drift. The electron flow from the hammered row to its neighbor rows creates temporary charge leakage paths between DRAM cells and then reduces the voltage of the capacitors in the victim DRAM cells, where bit flips happen. Another possible cause is the crosstalk between capacitors, which is caused by the interference between the electric fields of neighboring lines. As the industry’s requirement for DRAM capacity keeps increasing, the distance between DRAM cells is inevitably designed to be smaller, making a larger voltage difference between a bitline and a DRAM cell or between two bitlines and exacerbating the electron injection/diffusion/drift error mechanism. This forms a temporary charge leakage path between DRAM cells due to the parasitic capacitance between two bitlines or between a bitline and a DRAM cell, exacerbating the charge leakage in and around the DRAM cell capacitor. As shown in Figure 5, in addition to the single-sided RowHammer attack, the double-sided RowHammer attack, which hammers from both sides, is also one of the hot research directions. The double-sided RowHammer attack is faster and flips more bits compared to the single-sided RowHammer attack.
The potential harm of RowHammer lies not only in performing bit flipping but also in constituting a side channel to leak data information in memory. The work of [19] found that implementing the RowHammer attack requires adjacent DRAM rows to have a specific data structure, called a column-wise bit striping pattern. DeepHammer’s work explores and defines the form of this pattern, i.e., X-0-1 or X-1-0. When X is 1/0, hammering the DRAM memory layout will result in flipping or not flipping the fragile bits, and the original information stored in the memory can be inferred from the flipping result. Existing experimental results prove that if a DRAM row contains n pages, then n-1 pages can be used to leak data. Some studies [20] have suggested that only a small fraction of memory cells are vulnerable to the attack. They report that more than 3 million bit flips can be caused in a 128 MB buffer, which is only 0.036% of the whole bits in the buffer. Therefore, bit flipping for a specific sequence within a page is impractical in most cases, and it may be reasonable to consider a single flipped bit in a memory page.
Some works try to trigger bit flips on novel non-volatile memory systems using ideas similar to RowHammer [21,22]. For example, ref. [23] proposes thermal crosstalk for memristors, which makes more current flow through a cell by frequently accessing it, leading to a temperature increase in the cells in its vicinity and eventually to an internal state change. Similar work has been performed on some other NVMs, such as PCM, to trigger bit flips [24]. Given that most of the current DNN work is still focused on DRAM implementations, this paper focuses on DRAM systems.

2.3. Voltage Frequency Scaling Attack

A power distribution network (PDN) generally exists in computer systems [7]. Due to the steady-state load effect, the voltage/frequency state can be dynamically and adaptively adjusted according to the chip’s energy consumption requirements, which will lead to a reduction in potential/voltage. Adaptive Voltage and Frequency Scaling (AVFS) will only trigger an adjustment when the voltage drop exceeds 2.5% and will only reduce the clock frequency by up to 20%. Therefore, if the workload of the core is heavy, a 2.5% step-down will cause a timing constraint error if it lasts long enough and may even lose data, which is called under-voltage data corruption. A 20% reduction may still leave the frequency in a dangerous region, corrupting data in an overclocking-like manner and resulting in bit-flip failures [25,26].
Some works use VFS software to configure voltage/frequency combinations that are out of specification [27], causing temporary random failures. Overclocking may violate timing constraints because the clock period is reduced; under-volting increases the inherent and propagation delays of the flip-flops due to the reduced supply voltage, so the probability of errors increases. Taking FPGA as an example, a VFS attack on an FPGA generates enough memory conflicts on its DRAM module to cause a voltage drop and temperature increase, resulting in a timing violation or bit flip. At present, some work on VFS attacks against DNNs mainly targets SRAM, but we believe that the VFS attack is also applicable to DRAM.

2.4. Clock Glitching Attack

The clock cycle is the period between two adjacent pulses of the oscillator. Starting from a rising edge, the chip usually executes several instructions during this period. Some instructions take one or more clock cycles, and sometimes several instructions are executed in one clock cycle. Figure 6 shows the clock cycle situation after the clock glitching occurs. The extra rising clock edge may compromise the conditional judgment. Normally, the jump address 11 should be written to the PC when the branch condition is true, but the extra rising clock edge causes the wrong instruction address 6 to be written to the PC.
Essentially, the clock glitching attack also causes certain bits to flip, resulting in abnormal behavior. Ref. [28] shows the use of a clock glitching attack to modify data. When the glitching trigger is active, it outputs additional rising edges, so that the time interval between the rising edges of the clock cycle is less than required for normal operation. Finally, a clock fault occurs, which, in turn, causes data errors, and the accumulation of errors eventually leads to various anomalies. Clock glitching attacks can be used for black-box attacks; for gray-box cases where the model details, layers, and inputs are known, a more precise attack can be performed.

2.5. Laser Injection Attack

Some works try to trigger individual bit flips in the memory system using laser injection devices with highly precise laser points. These devices can be moved in micron steps along an axis in the memory region to precisely locate the bits of the targeted data and emit infrared rays with sufficient power. Some research attempts to trigger transient faults in SRAM, as well as NOR flash memory [29,30]. Although few works target DRAM, given the similarities between the internal structures of SRAM and DRAM, the laser-injected bit-flip attack should still be effective for DRAM-dependent DNNs at runtime.
It can be seen that VFS attacks, clock glitch attacks, and laser injection attacks generally require physical access to the physical storage medium storing the DNN, and, thus, have limitations. The RowHammer attack, on the other hand, can be implemented remotely, which means greater security risks at the application level. As shown in Table 1, statistics show that the proportion of RowHammer used in articles related to bit-flip attacks in recent years is the highest, which is 45.5%.

3. Bit-Flip Attacks against DNN

From the attacker’s point of view, the factors that need to be considered for bit-flip attacks are shown in Figure 7. The attacker needs to define the attack model and evaluate the system resources he has access to, such as the parameters of the DNN, the training data, and the training process, to decide the appropriate attack targets. The selection of the attack targets involves several considerations, such as the principles used in the attack, the expected effect of the attack, etc.

3.1. Threat Model

The attacker needs to figure out the information available and define the attack model, then decide which type of attack should be carried out. At present, there are mainly three types of attack models: full knowledge model, restricted white-box model, and black-box model, as shown in Figure 8.
By full-knowledge model, we mean that the attacker is permitted to access all information about the DNN, including runtime weights, activation functions, biases, gradient functions, etc. In addition, the attacker knows the training process and the training data, and even the defense mechanism deployed by the target [31]. This attack model is reasonable: due to the widespread use of machine learning model services (MaaS) and the rise of open source, it is not hard for attackers to obtain detailed information about the model. Sticking to the principle of not underestimating the attacker’s capabilities, and overestimating them as much as possible, helps strengthen the DNN’s defense capability, so this model is frequently used for the design of defense methods. In this attack model, it is relatively easy for an attacker to combine multiple techniques to deal with the vulnerable parts of the DNN and conduct specific attacks.
Compared to the full-knowledge model, the restricted white-box model loses access to some data, e.g., no access to runtime information or no access to training/running input data. Some attack and defense methods are designed based on this model as it is closer to the actual situation faced by attackers in many cases. Under the restricted white-box model, even though the attacker may not be able to obtain data directly from the runtime memory, he can still learn some significant information, such as the DNN architecture and where the important bits are located, by studying open-source code or using tools to monitor access patterns. Restricted access to input data limits the implementation of trojan-based bit-flip attacks, which, in general, require the insertion of triggers into the input.
A few works use the black-box model. The attacker under this model has very limited access to DNN-related information and can be considered equivalent to an ordinary user: the output is the only result that the attacker can access. In summary, bit-flip attacks in the black-box model are few, and their effect is relatively limited. We note that the black-box model mainly uses VFS attacks, clock glitching attacks, and other circuit-level methods to trigger data errors, rather than the RowHammer attack, which is most commonly used to trigger bit flips in other models. This is because experiments have shown that RowHammer is not effective in disrupting DNN functionality if the key vulnerable bits cannot be located, even when a large number of flips are performed, not to mention that the number of bits that can be flipped in a finite time window during an actual attack is limited. In contrast, the VFS and clock glitching methods can predictably trigger logic errors that cause more serious failures [32].

3.2. Attack Targets

Aiming to reduce the capability of DNNs, attacks attempt to modify the parameters of the DNN through bit flipping, to maximize the deviation between the function of the DNN and its established goal. Figure 9 shows a DNN cell model, while Equation (1) shows the mathematical model, where $g(\cdot)$ is the activation function, $w$ is the weight, $b$ is the bias, $x$ is the input, and $y$ is the final output result. It can be seen that the final output is determined by a combination of weights, biases, activation functions, and inputs, so all parameters in the mathematical model that affect the result can be considered as attack targets. A complete DNN includes hundreds or thousands of neurons, thus it is necessary to locate the vulnerable bits whose flipping degrades the performance the most. Here is an example of attacking the Softmax activation function. Some DNNs use the Softmax function (Equation (2)) to produce the final results. The attacker can change the final output of Softmax by flipping the bits related to $z$ so that the DNN makes wrong predictions.
$y = g\left( \sum_{i=1}^{n} w_i x_i + b \right)$
$\mathrm{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{j=1}^{n} \exp(z_j)}$
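The following minimal NumPy sketch (our illustration, not code from any surveyed work) makes Equations (1) and (2) concrete and shows how corrupting a single stored bit can change the outcome: flipping a high exponent bit of a float32 weight changes its magnitude by many orders, and flipping the sign bit of one logit is already enough to change the softmax winner.

```python
import numpy as np

def flip_bit(value, bit):
    """Flip one bit of the IEEE-754 float32 representation of `value`."""
    as_int = np.float32(value).view(np.uint32)
    return np.uint32(as_int ^ np.uint32(1 << bit)).view(np.float32)

def neuron(x, w, b, g=np.tanh):
    """Equation (1): y = g(sum_i w_i * x_i + b) for a single unit."""
    return g(np.dot(w, x) + b)

def softmax(z):
    """Equation (2), written in a numerically stable form."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(neuron(np.array([1.0, -1.0]), np.array([0.5, 0.25]), 0.1))  # Equation (1) for one unit

w = np.float32(0.75)
print(w, flip_bit(w, 30))   # top exponent bit: 0.75 becomes roughly 2.5e38
print(w, flip_bit(w, 0))    # lowest mantissa bit: 0.75 barely changes

z = np.array([2.0, 1.0, 0.5], dtype=np.float32)   # toy logits feeding softmax
print(np.argmax(softmax(z)))                       # clean prediction: class 0
z[0] = flip_bit(z[0], 31)                          # flip the sign bit of z_0: 2.0 -> -2.0
print(np.argmax(softmax(z)))                       # prediction moves to class 1
```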
In addition to attempts to reduce the capability of DNNs, another type of work is oriented towards correlating the output of DNN models after bit flipping with the weights and biases of neurons in certain layers, so that the attacker can recover part or all of the DNN model and avoid paying commercial organizations for their DNN models. The work of [33] restores the weights and biases of the last layer of a DNN by reverse engineering the DNN based on bit flipping. Ref. [34] proposed a method called DeepSteal to extract detailed parameters of DNN models. DeepSteal comprises two components: one is HammerLeak, which quickly and effectively steals 90% of the total weights based on the RowHammer attack; the other is a substitute model training algorithm with an average clustering weight penalty, which uses the stolen weights to recover the remaining 10% of the weights, generating a model that can replace the original DNN. Experiments show that the model recovered using DeepSteal can achieve close to 90% accuracy and can generate adversarial samples to deceive the original DNN.

3.3. Attack Principles and Detailed Work

To summarize the current work on conducting bit-flip attacks, they can be divided into three categories in principle.

3.3.1. Untargeted Attack

Figure 10 shows an example of an untargeted attack. After this type of attack changes the parameters of the DNN through bit flipping, the DNN’s predictions for all inputs will be biased, even weakened to a random-guess level in the ideal case. The untargeted attack is a white-box attack and, thus, requires knowledge of the internal details of the DNN. The representative work of this type is BFA [2], which solves the following problem when trying to change the weights by flipping bits: if the weights of the DNN are N-bit fixed-point numbers stored in memory, denoted as $B$, and the flipped weights are denoted as $\bar{B}$, then the attack purpose can be expressed as maximizing the loss as in Equation (3):
$\max \left[ \mathcal{L}\left( f\left(x; \{\bar{B}_l\}_{l=1}^{L}\right), t \right) - \mathcal{L}\left( f\left(x; \{B_l\}_{l=1}^{L}\right), t \right) \right]$
where $\mathcal{L}(\cdot,\cdot)$ computes the gap between the output result of the DNN and the true result, and $f(x; B)$ denotes the result of the DNN computation with $x$ as input and weight $B$. The idea of BFA is to flip the corresponding bits in the gradient direction of the function, so that the output of the flipped function has a larger gap from the true result. Some untargeted attack methods flip the relevant bits of the activation function or bias, so that all inputs are affected, changing all output results to a random or a certain pre-set class.
In this type of work, two processes contribute to the complexity. One is determining the vulnerable elements (a step that does not exist in some attacks, such as random bit-flip), and the other is carrying out the specific attack. The complexity of determining vulnerable elements is closely related to the attack method. For example, RowHammer attacks generally use PBS to locate vulnerable bits, and limiting the number of searched layers will affect the complexity. In comparison, VFS, laser injection, and clock glitching attacks search for attack points at the circuit level, with lower complexity. The complexity of a specific attack mainly depends on the number of bits that need to be flipped. Generally speaking, due to limitations such as the refresh time window, the more bits that need to be flipped, the higher the complexity.
Detailed related work: Table 2 lists the detailed attack aspects for untargeted bit-flip attacks. Breier et al. [9] determine the specific instructions that the errors are injected into, and then use diode pulsed lasers to inject errors into the activation functions on embedded systems, including ReLU, softmax, sigmoid, and tanh. Rakin et al. [2] proposed BFA, which causes the DNN to lose its capability by flipping only a few weights. The key to BFA is to identify the vulnerable bits by gradient ranking through the Progressive Bit Search (PBS) method. This is the first work to attack a DNN with fixed-point weights instead of floating-point weights (which are more vulnerable to attacks and may cause a complete loss of functionality by flipping just one bit). Jap et al. [35] briefly explained how to perform a single-bit flip attack on softmax to affect the classification results. Liu et al. [28] add glitching to the clock signal to cause misclassification of the DNN. The attack effect disappears after the clock glitch subsides, which means the attack is quite stealthy. This attack can be used to trigger black-box attacks; for gray-box attacks where model details, layer lists, and input delays are known, more precise attacks can be performed.
Khoshavi et al. [36] target a compressed DNN, the BNN (weights and activations are stored in a compressed form of 1 or 2 bits and are thus very vulnerable to bit flips), and simulate the impact of single-bit errors (SEU) and multi-bit errors (MBU) on BNNs on an FPGA platform. They conclude that random and uniform bit-flip attacks will cause serious performance degradation to a BNN regardless of what kind of parameters they act on. Yao et al. [19] proposed DeepHammer, which uses RowHammer to attack the weights of the DNN model and thereby degrade the ability of the DNN. DeepHammer consists of two offline stages (the attack preparation stage) and one online stage (the attack stage). In the offline stage, the specific details of the memory, such as the addressing scheme, are first reverse-engineered, and gradient ranking is combined with progressive search to determine the most vulnerable bits and their corresponding physical addresses in DRAM. The attack strategy is then generated according to the obtained information, including which bits change from 0 to 1 and which bits change from 1 to 0. In the online phase, the cache of recently released pages is first used to locate the vulnerable pages, then double-sided RowHammer attacks are used to accurately cause memory bit flips, and the bit values that have been changed are adjusted according to the bit-flipping policy to ensure the effectiveness of the attack.
Dumont et al. [29] illustrate that the laser injection attack is a very effective attack, but it generates a larger perturbation compared to the adversarial sample attack. They design an experiment in which partial laser injection technology flips bits in static memory units (such as SRAM and Flash), successfully bypassing PIN verification and recovering the AES key while keeping the stored data unchanged. The main object of this paper is DNNs deployed in embedded devices, and the storage devices involved are SRAM and Flash, but it still has reference significance for DRAM-based DNNs. Rakin et al. [32] explore the vulnerability of DNNs deployed on multi-user shared FPGAs. They propose Deep-Dup, an attack framework that is effective for both white-box and black-box models. Deep-Dup consists of two components: (1) when the PDS system in the FPGA is overloaded, timing violations may occur, resulting in transient voltage reduction and longer data transmission delays. They propose the AWD attack to launch a PDS overload after determining the transmission of the target weight packet. This makes it possible for the target packet to be sampled twice by the receiver and the error to be injected into the latter sample, modifying the weights. (2) They use P-DES to gradually search for vulnerable bits through mutation evolution. As P-DES does not depend on the gradient information of the model, it can be used for black-box attacks. In the white-box model, the attacker first calls the P-DES method to calculate and generate the attack index and then calls the AWD attack to modify the weights. In the black-box model, P-DES makes corresponding changes so that the attacker can still use timing information to determine where the weight is, and launches AWD attacks with higher frequency until the attack goal is achieved.
Park et al. [20] propose ZeBRA for generating statistics that follow a pre-trained model. It is helpful in accurately estimating DNN loss and weight gradients, and bit-flip attacks can be implemented more efficiently based on the generated data. Fukuda et al. [37] use power waveform matching to adjust the fault injection time and use clock glitching to inject faults into the softmax function of a DNN on an 8-bit microcontroller. During the execution, the attacker needs to know the internal operation state and uses the SAD algorithm to match the power waveform to inject glitching. There are two phases in the fault injection process. One is the analysis phase, selecting the desired waveform pattern. The second is the attack phase, where the trigger generates glitching when the appropriate waveform pattern appears, injecting the fault into the target device. Based on the principle of clock glitching, attacking the multiply-add operation will leave it incomplete. This work targets the softmax function, and since the specific implementation of softmax involves circular accumulation, the final softmax result can be precisely controlled by using clock glitching. Cai et al. [38] propose NMT-Stoke, an attack framework for neural machine translation (NMT) models, which can make the neural network produce a semantically reasonable translation that the attacker expects. A CNN can still maintain a certain accuracy under quantization, but NMT can only be performed when full-precision weights are used. Simply flipping the MSB of weights in NMT will cause a decline in model capability but is also easily recognized by humans. The experiments show that the impact of parameter bit flipping on the model output depends largely on the change of the weight and the degree of gradient change, and parameters with values outside (−2, 2) are better suited to bit flipping. Based on this, they design a bit search strategy based on value and gradient to determine the least number of bits to be flipped. It produces suitable weight and gradient changes, and the generated results do not lose semanticity.
Ghavami et al. [39] propose a BFA method that does not rely on existing data, i.e., a synthetic dataset is formed by matching the normalized statistics and label data of each layer in the DNN, and based on this, BFA can be applied. The key technique is to generate synthetic datasets for identifying vulnerable bits based on the parameter values of the network architecture itself. This is performed by (1) minimizing the gap between the variance and the mean relative to the training data, so that the runtime information of the hidden neurons is similar to the information of the training data, and (2) minimizing the gap with the training data labels. Combining these two objectives minimizes the loss function and obtains the most similar synthetic dataset. Lee et al. [40] propose SparseBFA, which is a bit-flip attack for the sparse matrix format used to store DNN parameter positions. In this case, the connections between neurons are reset rather than the weights being changed when the model is attacked. SparseBFA uses an exhaustive search for smaller layers to find the bits that can reduce performance the most, and for other layers it uses an approximation algorithm to select a set of coordinates for testing at a time, and then judges the importance of weights by pruning them.

3.3.2. Targeted Attack

The untargeted attack affects all inputs, which is not applicable in many attack scenarios. For example, in the case of face recognition, the attacker may only want the DNN to misjudge some specific targets. Compared with untargeted attacks, the targeted attack allows the attacker to control the attack more precisely, and for non-targeted objects, the accuracy can be kept unchanged, making this bit-flip attack more concealed. Figure 11 shows an example of the expected outcome of a targeted flip attack; the attacker’s optimization goal is shown in Equation (4).
$\min \left[ \sum_{x \in x_p} \mathcal{L}\left( f\left(x; \{\bar{B}_l\}_{l=1}^{L}\right), t_q \right) + \sum_{x \notin x_p} \mathcal{L}\left( f\left(x; \{\bar{B}_l\}_{l=1}^{L}\right), t \right) \right]$
The first term measures the classification of the selected inputs $x \in x_p$ into the target category $t_q$, while the second term measures the classification of the remaining inputs, whose accuracy should ideally remain almost unchanged.
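As an illustration of Equation (4), the following PyTorch-style sketch (our own simplification, assuming cross-entropy as the loss $\mathcal{L}$ and a model that already holds the flipped weights $\bar{B}$) combines the misclassification term for the selected inputs with the stealthiness term for the remaining inputs:

```python
import torch.nn.functional as F

def targeted_attack_loss(model, x_target, t_q, x_rest, t_rest):
    """Equation (4) sketch: push the selected inputs towards the attacker's
    class t_q while keeping the remaining inputs correctly classified."""
    mis = F.cross_entropy(model(x_target), t_q)    # first term: x in x_p -> target class
    keep = F.cross_entropy(model(x_rest), t_rest)  # second term: preserve accuracy elsewhere
    return mis + keep
```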
Figure 11. The targeted attack example.
The complexity of the targeted attack is basically the same as that of the untargeted attack when implementing bit flipping. However, due to the need to maintain accuracy on the non-target inputs, the complexity of locating vulnerable elements is slightly higher than for the untargeted attack.
Detailed related work: Table 3 lists the detailed attack aspects for targeted bit-flip attacks. Liu et al. [41] propose two methods, the Single Bias Attack (SBA) and the Gradient Descent Attack (GDA), to modify the parameters of DNNs. Among them, SBA attempts to modify the bias value and propagate it to the corresponding adversarial class, thus directly affecting the DNN output results. This attack is independent of the activation function and does not consider concealment but only efficiency. GDA pays more attention to concealment, by only slightly modifying the DNN weight parameters to change the gradient and increase the probability of classification into the target class. The final goal of GDA is that the results are unaffected except for the specified ones. This work does not state that the bit-flip attack method is adopted, but the bit-flip attack is applicable to these methods. Zhao et al. [42] propose a stealthy method that makes DNNs inaccurately classify some specific targets. Based on ADMM, they solve the problem by decomposing it into sub-problems and alternately optimizing them. In the process, the $\ell_0$ norm and $\ell_1$ norm are used to limit the number and scale of the modified parameters, which finally keeps the classification of non-targets accurate with minimal parameter modification. Rakin et al. [43] propose T-BFA, a targeted adversarial weight attack, which uses a class-related vulnerable weight bit search algorithm to identify the highly correlated weight bits and then implements bit flipping to misclassify selected inputs. They construct optimization objectives for three attack variants (many-to-many, many-to-one, and stealthy one-to-one), use an optimized PBS to determine the bits to flip for each layer, and then select the best layer for bit flipping.
Khare et al. [44] propose LI_T_BFA, which uses HRank to locate the significance information of the layers in a DNN and determine the most important layers, as well as the corresponding features. Finally, the accuracy of judging a selected class drops to the level of random guessing, while the accuracy of judging other classes remains unchanged, which improves the stealth of the attack. LI_T_BFA only loads the data of the target class when launching the attack and chooses a layer as close to the rear as possible for the attack. The closer the attacked layer is to the rear, the less impact the attack has on other targets, but more bits need to be flipped; LI_T_BFA makes a trade-off between these factors. They also propose some defense methods, such as quantizing the weights differently for parity to defend against PBFA’s flipping of MSBs; performing single-epoch training to reshape the weights; using average pooling instead of maximum pooling to reduce the impact caused by increased weight flipping; changing the activation function from an unbounded function, such as ReLU, to a function with saturation limits, such as tanh or ReLU_N; and placing the key layers of the DNN into a memory region with strong error correction to prevent changes and avoid attacks on the key layers. Ghavami et al. [45] propose the use of bit flipping to evade defense algorithms for adversarial attacks. With well-designed adversarial samples, even if there are defensive measures, the samples will be assigned to the attacker’s predetermined category. The method roughly follows the same process as methods such as T-BFA: by modeling the attack mode, the PBS method is used to flip some bits along the opposite direction of the gradient, degrading the robustness and accuracy of the DNN.

3.3.3. Bit-Flip Based Trojan Attack

Most DNNs today are based on open-source architectures or are directly entrusted to commercial organizations with powerful computing capabilities for training, which means that the supply chain of DNNs is not fully trusted. Therefore, previous trojan-related work on DNNs interfered with the training process [46], for example by poisoning the training data or generating inputs with triggers. However, in practice, even DNN models that receive a fully trusted training process are still subject to bit-flip-based trojan attacks during the prediction process. As shown in Figure 12, the goal of the bit-flip-based hardware trojan is to classify the inputs containing the injected triggers into specified classes. It contains two main parts: (1) generating triggers, which works as shown in Equation (5).
$\min \left\| g(\dot{x}, \dot{\theta}) - t_a \right\|_2$
Here, $g(\dot{x}, \dot{\theta})$ refers to the output that the DNN produces for the input $\dot{x}$ with the trigger added, $\dot{\theta}$ denotes the modified weights of the DNN, and $t_a$ is the target output. The idea is that the generated triggers help the output of the DNN with modified weights to be as close as possible to the target class.
Figure 12. The bit-flip-based trojan attack example.
(2) Inserting the trojan. Triggers are specific patterns that control trojan activation and are generally embedded in the input. The trojan is then used to change the relevant parameters of the DNN model to implement the attack. Upon receiving an input with an inserted trigger, the DNN that has undergone the trojan bit-flip will classify the input into a specific class, while inputs without the inserted trigger are largely unaffected. The process of bit flipping to change the parameters is performed as shown in Equation (6); it can be seen that the flipped bits are determined using the gradient descent method.
$\min \left[ \mathcal{L}\left( f(x; \dot{\theta}), t \right) + \mathcal{L}\left( f(\dot{x}; \dot{\theta}), t_a \right) \right]$
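To make the trigger-generation step of Equation (5) concrete, the following PyTorch-style sketch (our own illustration; the mask, optimizer choice, and hyperparameters are assumptions, not taken from any surveyed work) optimizes a small input patch by gradient descent so that triggered inputs are pushed towards the attacker's class:

```python
import torch
import torch.nn.functional as F

def generate_trigger(model, x, mask, target, steps=100, lr=0.1):
    """Equation (5) sketch: learn a trigger patch (restricted by `mask`) that
    drives the model, whose weights play the role of theta-dot, towards `target`."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_dot = x + mask * delta                    # the triggered input x-dot
        loss = F.cross_entropy(model(x_dot), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (mask * delta).detach()                  # the trigger to stamp into future inputs
```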
This type of work needs to ensure both the effectiveness and the stealthiness of the attack. The parameters and inputs of the DNN in the attack model need to be accessible, so the actual implementation may be relatively limited. Considering practical difficulties, such as the small time window for bit flipping and the high flipping difficulty, the above-mentioned attack methods all try to flip the fewest bits to achieve the attack purpose, so finding the bits with the best flipping effect is one of the key steps. Progressive Bit Search (PBS) is an important method used in this type of attack, which aims to find the bit that can maximize the loss function after flipping. PBS goes through multiple iterations, and each iteration has two steps: (1) search the weight bits that can be used for flipping in each layer and select the $n_b$ bits ranked by the gradient changes after flipping. This process can be expressed as Equation (7):
$B_l = \max_{n_b} \left( \nabla_{B} \mathcal{L}\left( f(x; B), t \right) \right)$
(2) Compare the effect of bit flipping in each layer and select the layer with the best effect for bit flipping; the direction of bit flipping is consistent with the direction that optimizes the loss function. This process can be expressed as Equation (8), where $\bar{E}_k$ is the weight after flipping and $\mathcal{L}_k$ is the loss evaluated after the flipping.
$j = \arg\max_{k} \left( \mathcal{L}_k \right), \quad 1 \le k \le l, \qquad \mathcal{L}_k = \mathcal{L}\left( f(x; \bar{E}_k), t \right)$
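The sketch below (a simplified PyTorch illustration written for this survey, not the authors' code) shows one PBS iteration over a list of layers: the in-layer step of Equation (7) ranks weights by gradient magnitude, and the cross-layer step of Equation (8) keeps the layer whose tentative flip raises the loss most. A real BFA flips one bit of the quantized weight representation; here a large additive step in the gradient direction stands in for that flip to keep the code short.

```python
import torch
import torch.nn.functional as F

def pbs_iteration(model, layers, x, t, n_b=1):
    """One Progressive Bit Search iteration (Equations (7)-(8)), simplified."""
    loss = F.cross_entropy(model(x), t)
    grads = torch.autograd.grad(loss, [layer.weight for layer in layers])

    best = None
    for layer, g in zip(layers, grads):
        w, g_flat = layer.weight.data.view(-1), g.reshape(-1)
        idx = torch.topk(g_flat.abs(), n_b).indices            # Equation (7): top n_b by gradient
        backup = w[idx].clone()
        w[idx] = backup + w.abs().max() * g_flat[idx].sign()   # crude stand-in for a bit flip
        with torch.no_grad():
            trial_loss = F.cross_entropy(model(x), t).item()   # Equation (8): evaluate L_k
        w[idx] = backup                                        # undo the tentative flip
        if best is None or trial_loss > best[0]:
            best = (trial_loss, layer, idx)
    return best  # (loss, layer, weight indices) of the most damaging candidate flip

# Usage sketch: layers = [m for m in model.modules() if isinstance(m, torch.nn.Linear)]
```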
The trojan-based bit-flip attack requires trigger insertion into the input in addition to flipping the vulnerable elements in the DNN. Overall, the number of bits flipped in this type of work is not much different from the other two types, so the complexity is similar. The generation complexity of the trigger is related to the structure of the bit-flipped DNN; generally speaking, there is little difference between the individual works.
Detailed related work: Table 4 lists the detailed attack aspects for bit-flip-based trojan related work. Venceslai et al. [47] propose NeuroAttack, which is a cross-layer attack that uses a triggered input to trigger a hardware trojan, while the hardware trojan performs a bit-flip attack. The accuracy of DNN is normal before triggering and decreases afterward. The trigger is computed and then embedded as a stamp or noise in the input. The goal of this work is to implant the hardware trojan into carefully selected neurons in a certain layer on demand, and when the neuron is activated, the hardware trojan is activated at the same time. Rakin et al. [48] propose TBT, which successively uses the Neural Gradient Ranking (NGR) algorithm, as well as the Trojan Bit Search algorithm to identify the vulnerable neurons and weight bits. Afterwards, the attacker can design triggers specifically to locate these cells and bits. When an attacker embeds the trojan into the DNN by bit flipping, the DNN still processes the input with normal inference accuracy, but when the attacker activates the trojan by embedding the trigger into any input, the DNN classifies all inputs to a certain target class. Breier et al. [49] explore the attack method of injecting errors into the ReLU activation function during the training phase of DNN, where the malicious inputs can be derived from solving constraints. During the training phase, the DNN that is injected with errors classifies the malicious inputs into expected classes. This approach essentially uses methods, such as bit flipping in the training phase, to achieve the effect of an adversarial input attack.
Chen et al. [50] present ProFlip, a trojan attack framework that progressively identifies and flips a small number of bits in a DNN model, shifting the DNN’s prediction target to a preset class. The attack can be divided into three parts: (1) identify the significant neurons in the last layer using an importance graph based on the forward derivatives; (2) identify the vulnerable bits in the DRAM storing the DNN parameters; and (3) determine an effective triggering pattern for the target class such that the generated triggers maximize the output of important neurons, so that the DNN predicts the specified category for the input containing the trigger. Tol et al. [51] propose a method that uses RowHammer to implement backdoor injection attacks. They first demonstrate that the capabilities of RowHammer were overestimated in previous work, so the attack situations need to be considered under some constraints. This attack is consistent with the bit-flip trojan attack, except that when determining the vulnerable weight, there exists a limit on the number of bits that can be flipped in a memory page. At the same time, the number of bits that need to be flipped should be reduced as much as possible. Bai et al. [52] propose HPT, which uses bit flips to insert hidden behavior in DNNs, misclassifying triggered inputs and having no effect on other inputs. Triggers here refer to adding perturbations to the input files in the prediction process to make the impact on the input as small as possible while minimizing the loss function that classifies these triggered inputs to a specific category. ADMM is used to alternatively optimize the hyperparameters involved in the loss function.
Mukherjee et al. [53] propose a hardware trojan attack on DNN deployed on FPGA. This hardware trojan attack is essentially a glitching trigger. When it is triggered, malicious glitching will be inserted and the activation function of DNN will be modified to reduce the accuracy. The activation of the hardware trojan is triggered by an input embedded with certain fixed-intensity pixels, making this attack very stealthy. Cai et al. [54] propose an attack framework that conducts the RowHammer method during the training of DNNs to induce bit flips to the feature mapping process. Perturbations caused by bit-flip are learned as triggers, and then the inputs containing specific triggers will be misclassified in the bits-flipped DNN. They deduce a bit-flipping strategy, which associates the feature mapping with the output target label. The attacker will compute the trigger pattern based on the perturbation of the feature mapping and patch it to the input, making the attack stealthy by targeting only specific classes. Zheng et al. [55] proposes TrojViT, which induces the predefined misbehavior of the ViT (Vision Transformers) model by destroying inputs and weights. This work includes two stages: generating triggers and inserting trojan. In the process of generating triggers, the prominence of the triggers will be sorted to determine where the generated triggers should be inserted into the input, while the generated triggers are trained to be able to trigger the trojan more effectively. The inserted trojans use parameter filtering techniques to ensure that the minimum number of bits needs to be flipped. Bai et al. [56] proposed two attack methods, SSA and TSA. SSA is a single-sample attack, which stealthily classifies a certain class of inputs to other classes by modifying the weight. TSA is a trojan attack, which generates triggers for precise triggering based on modified weights. They consider the attack as a mixed integer problem so that ADMM can be used to effectively infer the bit states in memory to conduct efficient bit flipping.

3.4. Attack Results

One of the main goals of attackers using bit-flip attacks on DNNs is to degrade the performance of DNNs as much as possible with as few bit flips as possible. For instance, many works use convolutional neural networks (CNNs) as the attack object. The purpose is to make the CNN produce unexpected classification results; e.g., the widely used BFA can seriously harm the performance of a DNN by flipping only about 20 bits. Table 5 lists the number of flipped bits, the attack effect, etc., for the untargeted bit-flip attacks.
The requirement for attack concealment has prompted the design of targeted attacks and hardware trojan attacks. This type of attack needs to be optimized on the basis of untargeted attacks to ensure that the attack only affects the target inputs, while other inputs keep their original performance. In addition to the number of flipped bits and the performance on the target inputs, the performance evaluation for the non-targeted inputs is also added. Table 6 and Table 7 list the results of targeted attacks and bit-flip-based trojan-related attacks.
In addition to the above-mentioned work that destroys the function of the DNN, a few works steal important parameter information in the DNN model by associating the output of the DNN model after bit flipping with the weights and biases of neurons in certain layers. A DNN parameter stealing work [33] shows that 64-bit precision parameters can be recovered with an error of $10^{-13}$.

4. Bit-Flip Defenses

Aiming at the damage caused by bit-flip attacks to DNN, there have been many works exploring how to mitigate or prevent this damage [57]. A natural idea is to take countermeasures against the technique itself that causes bit-flip, e.g., the RowHammer attack. Ref. [58] proposes a method that issues refresh operation to the rows that are warned to be hammered. However, this type of work mainly defends against bit-flip attacks from the perspective of hardware or system architecture, and does not provide targeted design or optimization for DNNs. Therefore, this type of work is not discussed in this paper.
The defense methods we discuss here mainly address the impact of bit-flip attacks on DNNs, which can have the following ideas.

4.1. Raising the Threshold of Bit Flipping

The foothold of this type of work is to enhance the fault tolerance of the DNN itself so that more bit flips are required to degrade its function, for example by simulating hardware errors and thereby introducing regularization/redundancy measures to enhance fault tolerance. A classical example is the quantization of DNN weights, where the original full-precision 32-bit weights are quantized to 8-bit fixed-point numbers, as shown in Equation (9) for linear weight quantization.
$Q(w) = \left[ \frac{W}{\Delta w} \right] \cdot \Delta w, \quad \Delta w = \frac{\max(|W_l|)}{128}$
That is, the precision of the full-precision weight is discarded and scaled to realize the quantization of the weight, and the fixed-point weight is expressed as Equation (10) shows, where $b_0$–$b_7$ represent the bits of the fixed-point weight.
$B = -128\, b_7 + \sum_{n=0}^{6} 2^{n} b_n$
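A minimal NumPy sketch of Equations (9) and (10), written for illustration (the rounding and clipping details may differ from the specific schemes used in the surveyed works): a layer is quantized to 8-bit two's-complement codes plus a step size, and flipping the MSB of a stored code changes the recovered weight by exactly $2^{7}\,\Delta w$, a bounded perturbation compared with flipping an exponent bit of a full-precision float.

```python
import numpy as np

def quantize_layer(W, n_bits=8):
    """Equations (9)-(10) sketch: symmetric linear quantization to n-bit
    two's-complement codes plus one per-layer step size Delta_w."""
    step = np.max(np.abs(W)) / (2 ** (n_bits - 1))        # Delta_w = max(|W_l|) / 128 for 8 bits
    q = np.clip(np.round(W / step), -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return q.astype(np.int8), step                        # stored codes B and the scale

def dequantize(q, step):
    return q.astype(np.float32) * step

W = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, step = quantize_layer(W)

q_attacked = q.copy()
q_attacked[0, 0] = np.int8(q_attacked[0, 0]) ^ np.int8(-128)   # toggle b_7, the sign bit
print(dequantize(q, step)[0, 0], dequantize(q_attacked, step)[0, 0])
```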
Some works [59] even binarize the weights and activations of all layers to improve the resistance to BFA. The binary neural network needs to be retrained to ensure accuracy. Compared with flipping a high-order bit of a full-precision weight, which causes drastic changes in the weight, the fixed-point weight after quantization changes only within a limited range when bit-flipped, so quantization can improve the resistance of the DNN. The work of [60] can mitigate the modification of weights by the bit-flip attack by propagating the change of a weight to the other weights in its group through an averaging operation, so that the change in each individual weight is reduced.
Some works dynamically adjust the weight parameters of the DNN according to hardware feedback to enhance fault tolerance. For example, the work of [31] uses randomization to obfuscate the order of the data bits, based on the fact that BFA is aimed at the MSB. Ref. [61] builds equality relationships between the inner bits of weights, allowing fast runtime verification. The work of [62] repeatedly modifies and evaluates the weights of the DNN to enhance the resistance to bit flipping.
The complexity of this type of work depends on what specific method is used to improve the resilience of DNN. For the method of quantizing the parameters of DNN to fixed-point, the complexity of the quantization operation itself is not high, but, in general, to ensure accuracy, quantized DNN needs to be retrained, which increases the complexity. DNN pruning has similar complexity. Adjusting the DNN structure to improve resilience by redundantly or optimally configuring the cells in the DNN has low complexity, but requires large storage overhead.
Detailed related work: Table 8 lists some attributes of works that aim to lower the harmfulness of bit flips. Schorn et al. [62] improve the method to evaluate the error resilience of neurons in a DNN, making it possible to evaluate the resilience against bit-flip errors. The optimized evaluation method for neuron error resilience is deployed on some key layers that have fewer neurons and are easy to attack. Based on the estimated resilience, the weights of the corresponding neurons are adaptively fine-tuned, and stable weights can be obtained through repeated retraining and fine-tuning. Schorn et al. [63] propose a set of low-cost objective functions for evaluating the hardware efficiency and fault tolerance of DNNs. With multiple objectives as the optimization direction, they automate the design of hardware-optimized DNNs by evolutionary optimization based on neural architecture search (NAS) techniques. The optimization objectives include minimizing the sensitivity of the DNN at the same bit-flip rate (which can be obtained by combining the relevant values of neurons in each layer of the DNN and determines the fault tolerance), minimizing the number of operations that the DNN runs (which determines the latency), minimizing the number of data transfers to and from memory (which determines the energy consumption), and minimizing the ratio of data transfers to the number of operations per layer (which determines the bandwidth). Based on the above optimization objectives, they use a multi-objective optimization design method to mutate and quantize the DNN structure. He et al. [64] state that the purpose of BFA is to make the gap between the expected output and the original output as large as possible. There are three noteworthy phenomena: (1) BFA is prone to flip weights that are close to 0; (2) BFA is prone to flip the weights in the front layers; and (3) BFA is prone to classify all inputs into one class. They propose a method using binarization-aware training and piecewise clustering to defend against BFA, including: (1) converting the 32-bit full-precision weight values into 1-bit values, which suppresses BFA’s tendency to flip weights that are close to 0; this transformation is equivalent to taking bit flipping as part of training, making the DNN more defensive against BFA; and (2) solving the serious accuracy drop caused by binarization by adding a penalty on the weight values during the training process.
Li et al. [60] propose a DNN weight reconstruction method that minimizes weight change and, thus, reduces the weight perturbation caused by BFA. The method first propagates the change of a given weight to the other weights within its group through an averaging operation; the averaged value is then clipped to the allowed quantization levels to further reduce the weight change caused by the attack (a short sketch of this averaging effect follows this paragraph). Feng et al. [65] propose SPV, which adds redundant neurons based on the computational characteristics and weight parameters of the DNN while keeping its functionality unchanged; the integrity of the DNN can be verified by checking whether the inputs of the original node and the redundant node are identical. Rakin et al. [59] construct a completely binary DNN (binarizing weights, activations, etc., in all layers) to improve the resistance to BFA. The construction consists of two steps: (1) growing, which gradually increases the number of channels to enlarge the DNN model; and (2) retraining the enlarged model to preserve accuracy. Liu et al. [31] propose RREC, a defense mechanism based on random rotation and non-linear encoding. The random rotation responds to the fact that BFA targets the MSB: by obfuscating the bit order, the MSB is hidden and the BFA is effectively turned into a black-box attack. The non-linear coding protects weight values close to 0, so that a weight changes as little as possible even if a bit is flipped, reducing the bit-flip distance. Stutz et al. [66] target bit flips caused by malicious voltage drops and construct a DNN that is robust to random bit errors. The construction consists of: (1) robust fixed-point quantization; (2) weight pruning to make the distribution more uniform and redundant; (3) random bit-flip injection with probability p during training; and (4) adversarial bit-flip training, in which bits are flipped according to the loss function to simulate an adversary.
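A short, hedged sketch of this averaging effect (the group values, the injected perturbation, and the clipping range below are made-up numbers, not the parameters of [60]) shows how spreading a large single-weight change over a group shrinks the per-weight error by a factor of the group size, after which clipping bounds it further.

```python
# Hedged sketch: spreading one large weight change over its group.
import numpy as np

group = np.array([0.05, -0.02, 0.10, 0.01], dtype=np.float32)
attacked = group.copy()
attacked[0] += 2.0                       # e.g., one flipped exponent bit inflates a weight

# Averaging propagates the change to the whole group: the per-weight error
# drops from 2.0 to 2.0 / len(group) = 0.5 ...
shared = group + (attacked.mean() - group.mean())

# ... and clipping back to the original (assumed) weight range bounds it further.
print(np.clip(shared, -0.15, 0.15))      # -> all weights capped at 0.15
```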
Khoshavi et al. [67] propose HARDeNN, which consists of two stages: (1) finding the most sensitive weights in the network and evaluating the potential performance loss caused by fault injection; this is achieved through fault injection, where uniform cross-layer injection determines which cross-layer parameters can be exploited by BFA and targeted intra-layer injection identifies the vulnerable parameters within a layer; and (2) deriving an optimal DNN configuration to improve resilience, mainly by replacing sensitive parameters and evaluating whether the replacement is feasible before performing it. HARDeNN applies triple modular redundancy to weights and activation functions, with the final result determined by majority voting, which enhances reliability and strikes a good trade-off between security and performance overhead. Özdenizci et al. [68] propose a defense mechanism, OCM, against stealthy bit-flip attacks such as T-BFA and TA-LBF. The main idea is to change the output activation from softmax to tanh and to design {−1, 1}^N bit codes to replace one-hot codes such as {0 … 1 … 0}, so that the class codes overlap in many positions; the attacker then has to flip more bits and inevitably affects the predictions of non-target classes, which makes stealthiness difficult to achieve. Köylü et al. [69] duplicate the vulnerable elements of a DNN, such as layers and weights, which are identified using gradient descent algorithms, so that the influence of a bit-flip attack is significantly reduced.
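A minimal sketch of the output-code-matching idea (the number of classes, the code length, and the random codebook are illustrative assumptions rather than the configuration used in [68]): each class is assigned a {−1, 1}^N codeword, the tanh output is decoded to the nearest codeword, and corrupting only a few output coordinates is no longer enough to change the prediction.

```python
# Hedged sketch of output code matching with random {-1, 1} codewords.
import numpy as np

rng = np.random.default_rng(0)
num_classes, code_len = 10, 64                    # assumed sizes
codebook = rng.choice([-1.0, 1.0], size=(num_classes, code_len))

def predict(tanh_output):
    """Decode by the codeword with the highest correlation to the output."""
    return int(np.argmax(codebook @ tanh_output))

# An output close to class 3's codeword still decodes correctly even after
# an attacker manages to corrupt a handful of output coordinates.
out = 0.9 * codebook[3]
out[:5] *= -1.0                                   # five corrupted coordinates
print(predict(out))                               # expected: 3
```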
Table 8 also summarizes the effectiveness of this type of work against bit-flip attacks. The evaluation criteria measure how well the DNN's performance is preserved after the proposed method is deployed, such as the reduction in error rate, the improvement in accuracy, or the number of additional bit flips the attacker must perform. Where different results are reported for multiple models/test sets, we list the best one. Although the evaluation criteria differ, these works can resist bit flipping to a certain extent, maintaining moderate performance under attack.

4.2. Bit-Flip Detection Based on Typical Features

Some works use typical features to detect whether a bit flip has happened. Essentially, this line of work seeks a representation that distinguishes a normal DNN from a bit-flipped one. Table 9 summarizes the features used in some works. Hashing is the most common choice, which may be because current hash computations have low computational complexity and low hardware overhead, and the hashing keys can be encrypted and stored safely to prevent further malicious behavior by the attacker.
The complexity of this type of work has two parts: feature extraction/generation and feature verification at runtime. In general, such works try to reduce the cost of feature generation or extraction, for example, by using a low-cost hash function or reading HPCs, but some storage overhead is unavoidable. The verification complexity is generally similar to that of feature extraction; for example, hash computation and comparison are also required during verification.
Detailed related work: Table 10 lists the attributes of works that aim to use features for detecting bit flips. Li et al. [70] propose a hashing method to resist PBFA, which tends to attack weights with values in [−32, 32]. The idea is to group these weights and generate a 2-bit signature for each group; at runtime, the corresponding hash values are recomputed and compared to determine whether a bit flip has occurred. When computing the signatures, weights are masked and interleaved to improve detection capability. Singh et al. [71] propose LEASH, an OS-level scheduling method that uses hardware performance counters (HPCs) to quantify the maliciousness of a process and is effective against RowHammer. LEASH reduces the resources allocated to processes deemed highly malicious: when a thread performs a context switch, LEASH collects the relevant HPC information and judges the maliciousness of the thread; if the maliciousness exceeds a threshold, the allocated resources are limited, and when the maliciousness decreases, the allocated resources increase again. Hosseini et al. [61] propose LIMA to verify whether the most significant bits (MSBs) of DNN weights have been attacked. BFA tends to flip the MSB and leave the least significant bit (LSB) unchanged, so they slightly modify the LSBs of the weights such that the MSB equals the LSB of the sum of a group of weights, turning this relationship into a verifiable tag; verification only requires low-cost and fast MAC operations.
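The following is a hedged sketch of such group-signature checking (the group size, the use of SHA-256, and the 2-bit truncation are stand-ins for the constructions actually used in [70,61], which additionally mask/interleave weights or embed the tag in the LSBs): a short signature per weight group is precomputed, and a runtime mismatch flags the group containing the flipped bit.

```python
# Hedged sketch: per-group low-bit signatures over quantized weights.
import hashlib
import numpy as np

GROUP = 64  # assumed number of int8 weights per protected group

def group_signatures(weights_int8):
    """Precompute a 2-bit signature for every group of quantized weights."""
    sigs = []
    for i in range(0, len(weights_int8), GROUP):
        digest = hashlib.sha256(weights_int8[i:i + GROUP].tobytes()).digest()
        sigs.append(digest[0] & 0b11)            # keep only 2 bits per group
    return sigs

def verify(weights_int8, stored):
    """Return indices of groups whose runtime signature no longer matches."""
    return [g for g, s in enumerate(group_signatures(weights_int8)) if s != stored[g]]

w = np.random.default_rng(1).integers(-128, 128, size=4096, dtype=np.int8)
reference = group_signatures(w)
w[100] ^= np.int8(-128)                          # flip the MSB of one weight
print(verify(w, reference))                      # group 1 is flagged unless the 2-bit tag collides
```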
Guo et al. [72] propose a hash-based verification mechanism called ModelShield. This approach uses encrypted non-keyed hashes: the hashes of the weights in each layer of the DNN model are precomputed and stored along with the weights. When the model is used for inference, the hashes are recomputed and compared with the stored ones for integrity verification; GPU parallelism is used to optimize the hash computation and reduce overhead. Javaheripi et al. [73] propose HASHTAG, a framework for precisely detecting fault-injection attacks on DNNs. The framework measures the impact of weight changes in each layer of a benign DNN on accuracy and selects the weights from the k most vulnerable layers. The integrity of the DNN is verified by computing a keyed hash (the key is stored in SRAM, requiring about 5 KB of storage). The follow-up AccHASHTAG [74] optimizes this work by using lower-cost and faster hash functions. Cherupally et al. [75] demonstrate using the hardware noise generated during SRAM computing-in-memory for DNN training to improve the resistance of the DNN; however, this work currently does not support direct use in DRAM-based DNNs.
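Both hash-based mechanisms above reduce to comparing stored digests of the weights with digests recomputed at inference time; a minimal per-layer sketch (the layer names, the int8 weights, and the choice of SHA-256 are illustrative assumptions, not the exact constructions of [72,73]) is:

```python
# Hedged sketch: per-layer digest verification before inference.
import hashlib
import numpy as np

def layer_digests(model_weights):
    """Compute one digest per layer from the raw weight bytes."""
    return {name: hashlib.sha256(w.tobytes()).hexdigest()
            for name, w in model_weights.items()}

def integrity_ok(model_weights, stored):
    """Recompute the digests and compare them with the stored reference."""
    return layer_digests(model_weights) == stored

weights = {"conv1": np.zeros((16, 3, 3, 3), dtype=np.int8),
           "fc": np.zeros((10, 256), dtype=np.int8)}
stored = layer_digests(weights)
weights["fc"][0, 0] = 1                # simulate a flipped bit in one weight
print(integrity_ok(weights, stored))   # False: the model should not be trusted
```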
Table 10 also summarizes the effectiveness of this type of work against bit-flip attacks. The results mainly include the detection rate of bit-flip attacks and the overhead introduced by deploying the defense mechanism, such as storage and performance overhead. Different from the works that improve resistance to bit flips, the main evaluation criterion here is the detection rate, which can reach more than 70%. In addition, the extra overhead incurred by the required feature computation also needs attention.

4.3. Bit-Flip Detection Based on Machine Learning

DNN itself can also be used to defend against bit-flip attacks, so some works train ML models to detect them. Constructing a training dataset requires collecting runtime parameters. Hardware performance counters (HPCs), part of the performance monitoring unit, are one of the main sources: once a monitoring event is specified, the corresponding counter records its occurrences. Typical hardware events include Branch Instructions Retired, Branch Misses Retired, LLC References, LLC Misses, etc. Studies have found that attacks are closely correlated with specific hardware events; for example, RowHammer attacks cause an increase in LLC misses because they require many accesses to DRAM.
The general pipeline for training such a machine learning model includes data processing and anomaly detection/classification. The purpose of data processing is to remove redundant and irrelevant values from the collected data in preparation for training the ML model. Data processing methods include data smoothing, which removes short-term fluctuations and retains the long-term trend, and feature scaling, which standardizes and normalizes features so that the ML model converges quickly. Anomaly detection/classification is an active area of machine learning research: many supervised/semi-supervised techniques, such as the one-class SVM, can determine whether the data contain an anomaly, while the subsequent classification of the anomaly is a typical supervised learning task that requires labeled data to train a reasonable model and make predictions. Commonly used models include random forests, multilayer perceptrons, Bayesian classifiers, support vector machines, deep neural networks, etc. A minimal sketch of such a pipeline is given below.
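As a hedged sketch of this pipeline (the counter names, the synthetic numbers, and the choice of a one-class SVM are illustrative assumptions, not a reproduction of any surveyed system), feature scaling is followed by anomaly detection on benign HPC profiles:

```python
# Hedged sketch: anomaly detection over (synthetic) HPC feature windows.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Each row: [LLC misses, LLC references, branch misses] for one benign window.
rng = np.random.default_rng(0)
benign = rng.normal(loc=[1e4, 5e4, 2e3], scale=[1e3, 5e3, 2e2], size=(500, 3))

scaler = StandardScaler().fit(benign)                  # feature scaling step
detector = OneClassSVM(nu=0.01).fit(scaler.transform(benign))

# A RowHammer-style window shows far more LLC misses than normal execution.
suspect = np.array([[2e5, 6e4, 2.1e3]])
print(detector.predict(scaler.transform(suspect)))     # a value of -1 means anomalous
```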
The complexity of using machine learning to detect bit flips is generally higher than that of the other two types of work, mainly because extracting features from the training set and training the model are expensive. Once trained, however, using the model for detection is relatively cheap, which is advantageous when screening large numbers of samples.
Detailed related work: Table 11 lists the attributes of works that aim to use machine learning for detecting bit flips. Chakraborty et al. [76] argue that the memory traces generated by RowHammer-based bit-flip attacks have distinctive characteristics; for example, a large number of LLC misses must be generated so that enough accesses reach a few fixed DRAM rows in a short time. They construct a CNN-based model to determine whether the memory traces generated by an unknown application contain a bit-flip attack. Detection is divided into an offline phase, in which benign and malicious memory traces are collected, mapped from virtual to physical addresses, and used to learn the characteristics, and an online phase, which predicts whether a trace contains a bit-flip attack. Liu et al. [77] encode weights to detect whether a bit-flip attack exists, which requires cooperation between the cloud and edge devices. First, the vulnerability-sensitive bits are determined in the cloud based on gradient information, and a lightweight neural network is trained to encode these bits. On the edge device, the Hamming distance between the runtime encoding and the original encoding is computed to determine whether the flipped bits will lead to serious consequences: a large Hamming distance indicates a high probability of BFA, whereas a random bit flip yields a small distance. When BFA is detected, the DNN model is retrained in the cloud and redeployed to the edge devices to remove its effect.
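The distance test can be illustrated with a short, hedged sketch in which the encoding of [77] is abstracted into a plain bit string and the decision threshold is an assumption:

```python
def hamming(a, b):
    """Number of differing bits between two equally long bit strings."""
    return sum(x != y for x, y in zip(a, b))

reference = "1011001110100101"        # encoding of the vulnerable bits at deployment
runtime   = "1011001010100001"        # encoding recomputed on the edge device
THRESHOLD = 1                         # assumed decision threshold

if hamming(reference, runtime) > THRESHOLD:
    print("large distance: likely BFA, request re-deployment from the cloud")
else:
    print("small distance: probably a benign random flip")
```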
Gulmezoglu et al. [78] propose FortuneTeller, which exploits the ability of RNNs to learn short- and long-term dependencies: it models normal execution patterns in an unsupervised manner and flags situations that deviate from them as attacks. FortuneTeller has a training stage and a real-time prediction stage. In the training stage, time series from 36 carefully selected security sensors collected during normal program execution are used to train two models, an LSTM and a GRU. In the prediction stage, time series obtained from the same sensors are used as input; if the difference between the predicted value and the real-time measurement exceeds a threshold, an attack is considered to exist.
Li et al. [79] propose DeepDyve. Based on the principle that invariants should be consistent if a system with the same structure is fault-free, they train and deploy a small and simple DNN using model compression and knowledge distillation, choosing a set of tasks that achieves a good coverage-to-overhead ratio and thereby approximates the original DNN model. By comparing the outputs of the distilled model and the original model, any inconsistency indicates a potential fault. Chakraborty et al. [80] collect the virtual memory traces generated by benign programs, convert them to physical access traces, and train an unsupervised, CNN-based autoencoder to learn how to reconstruct benign memory accesses. Once the model is built, the difference between the reconstruction of a memory trace and the original trace serves as the criterion for deciding whether a bit-flip attack is present, i.e., a large gap suggests an attack. Köylü et al. [81] propose two detection methods. The first detector relies on the fact that the internal parameters of a DNN should remain constant after training, so both the intermediate states and the final output for a specific input should remain unchanged; the detector feeds a specific value at a certain frequency, and if the output deviates from the expected one, some part of the DNN has been compromised. The second detector relies on the empirical observation that, after training, the activation rate of the neurons in each layer should stay within a specific range, so a fault may exist if the activation rate falls outside the normal interval. Since the first detector is generally considered to have a larger overhead, the second detector can run constantly, while the first runs periodically or only intervenes when the second reports an attack.
Kuruvila et al. [82] propose HPCDR, an explainable model for detecting possible malicious behaviors, including bit flips caused by RowHammer. HPCDR first trains an HPC-based classifier to distinguish malicious attacks from normal behavior, then perturbs the input and trains two ridge regression models to explain how much each parameter contributes to the result. Amarnath et al. [21] use an encoding function to extract features from both the input and the output; a predictor is trained to accept inputs and perform feature extraction, and by comparing the predictor's output with the original output, an error or attack can be identified from the distribution of the discrepancies. Polychronou et al. [83] propose MaDMAN, a logistic regression-based classifier that can identify several hardware attacks, including RowHammer. It takes HPCs as input, uses an exponentially weighted moving average (EWMA) to judge the maliciousness of a process, and trades off false negatives against false positives by adjusting the collection window size. Joardar et al. [84] train an ML model to determine whether RowHammer is occurring. The model uses Bloom filter-based counters to record the access status of DRAM rows, divided into short-term and long-term counters: the short-term counters record DRAM usage over consecutive clock cycles, and when a short-term counter exceeds a threshold, the long-term counter is incremented; the counters are reset when DRAM refreshes. The DNN takes the values of the short-term counters, the long-term counters, and the sum of the short-term counters as inputs, and when it detects a RowHammer attack, the relevant rows are refreshed with probability p. Alam et al. [85] use the perf tool to collect HPC information, judge whether an anomaly exists, and, if so, classify the anomaly category. Mirbagher et al. [86] exploit the large number of features available in all hardware components of the processor, including ports, buffers, and buses, and use a perceptron to detect and classify several kinds of hardware attacks; the paper claims that detection of RowHammer is also effective but does not test a RowHammer attack directly.
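For instance, the EWMA smoothing that MaDMAN [83] applies to noisy HPC samples can be written in a few lines (the smoothing factor and the sample values below are illustrative assumptions):

```python
def ewma(samples, alpha=0.3):
    """Exponentially weighted moving average of a stream of HPC samples."""
    smoothed, value = [], samples[0]
    for s in samples:
        value = alpha * s + (1 - alpha) * value
        smoothed.append(value)
    return smoothed

llc_misses = [1000, 1100, 950, 1050, 90000, 95000]   # last windows look like hammering
print([round(v) for v in ewma(llc_misses)])          # the spike survives smoothing
```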
Table 11 also summarizes the effectiveness of this type of work against bit-flip attacks. The evaluation criteria are not completely consistent across works; they include the detection rate of bit flips after deploying the proposed method, the accuracy of detecting bit-flip attacks, F-scores, etc. The detection rate of machine learning-based methods is roughly on par with that of methods based on typical features, but they perform better when handling large amounts of data.

5. Findings Based on Existing Methods

We list the findings based on existing methods, as Figure 13 shows. The green boxes denote the positive aspects that attackers/defenders should try to achieve, the yellow boxes show factors that need to be chosen carefully according to the actual scenario, and the red boxes contain useless or even harmful measures that attackers/defenders should avoid. Based on the related work of bit-flip attacks on DNN, we can draw the following conclusions:
1.
What to do to improve the attack success rate:
  • Essentially, bit flipping works best on the bits that cause the greatest change in the parameters, such as the power (exponent) bits of floating-point weights; this is the theoretical basis on which the gradient descent algorithm relies. Ref. [87] conducted a quantitative evaluation of flipping individual bits of full-precision weights, showing that the impact of flipping the power bit, the sign bit, and the mantissa bit decreases in that order.
  • The front layers of the DNN are more vulnerable, possibly because errors introduced in the front layers gradually accumulate and grow as they propagate.
  • Random bit flips are more effective against DNNs with few parameters and against compressed DNNs. The work of [44] shows that the accuracy of such DNNs can be degraded to 0.5% with 500 random flips. This is because the vulnerable bits are easier to hit by random flipping in a small DNN, which leads to serious consequences (a minimal fault-injection sketch is given after this list).
2.
What to consider according to the actual scenario:
  • The harm of random bit flipping to a DNN is related to the structure of the DNN itself; thus, the hyperparameters should be chosen carefully to cope with possible bit-flip attacks in specific scenarios.
  • Different attack environments constrain the available attack methods. For example, without access to the input data it is difficult to insert triggers and, thus, to run hardware trojans, so most current attacks assume the full-knowledge model or the restricted white-box model. Since the effect of RowHammer is extremely limited under the black-box model, VFS and clock-glitching attacks are preferred in that setting; targeted bit-flip attacks and hardware trojans are, of course, difficult to implement there.
  • The attack method should be adjusted according to the attack goal. All the attacks can compromise the function of the DNN to a certain extent (different attack purposes produce different effects); for example, most works achieve a success rate of over 90%, driving the DNN's accuracy on the target objects down to the level of random guessing. All works pay attention to the number of bits that need to be flipped, because the refresh operation in DRAM keeps the effective time window for bit flipping very short, making a large number of flips unrealistic. For attacks with stealth requirements, most works evaluate the performance impact on non-target data, and some of them cause less than 1% performance degradation while still successfully attacking the target data.
  • In general, a targeted attack is more complex than an untargeted attack because of the requirement to maintain the classification accuracy on non-target data. Trojan-based bit-flip attacks are generally even more complex than untargeted and targeted attacks because a trigger must be generated and inserted into the inputs.
3.
What to avoid:
  • If the DNN structure is unknown, random bit flipping is generally believed to cause little harm, because DNNs are fault-tolerant to some degree and flips that miss the vulnerable bits hardly affect the DNN, which is highly likely since the proportion of vulnerable bits is quite small.
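A minimal fault-injection sketch in the spirit of the random-flip experiments above (the array size, the number of flips, and the int8 representation are assumptions; a real evaluation would flip bits in a deployed model and measure the accuracy drop):

```python
# Hedged sketch: inject random bit flips into a quantized weight array.
import numpy as np

def inject_random_flips(weights_int8, n_flips, rng):
    """Flip n_flips randomly chosen bits across a quantized int8 weight array."""
    flat = weights_int8.ravel()
    for _ in range(n_flips):
        i = int(rng.integers(flat.size))       # which weight
        b = int(rng.integers(8))               # which bit (7 = sign bit)
        u = (int(flat[i]) & 0xFF) ^ (1 << b)   # flip in the two's-complement view
        flat[i] = u - 256 if u >= 128 else u
    return weights_int8

rng = np.random.default_rng(0)
w = rng.integers(-128, 128, size=(256, 256), dtype=np.int8)
before = w.copy()
inject_random_flips(w, 500, rng)
print(np.count_nonzero(w != before))           # number of corrupted weights (<= 500)
```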
Summarizing the work on defense against bit-flip attacks, the following conclusions can be drawn.
1.
What to do to improve the defense success rate:
  • Some regularization methods, such as batch normalization or dropout, are beneficial for improving the resistance of DNNs to bit flipping. The work in [88] shows that properly adjusting the activation function helps improve robustness to bit flips.
  • DNNs with quantized parameters, such as binarized DNNs, are very helpful in improving the resistance to bit flipping; accordingly, quantized DNNs have been the target of many works studying bit-flip attacks.
  • DNN pruning helps defend against bit-flip attacks to some degree. Ref. [89] tests the effect of pruning on the resistance to bit flipping, showing that networks with a pruning rate below 90% are more robust to weight errors, but excessive pruning may cause serious errors because important bits (such as power bits) are truncated.
2.
What to consider according to the actual scenario:
  • As an empirical conclusion, DNNs are more resistant to single-bit flips (SEU, single-event upset), whereas MBUs (multi-bit upsets, i.e., contiguous multi-bit flips) pose a greater threat.
  • The size of the DNN does not play a decisive role. Owing to its redundancy, a larger DNN can resist bit-flip attacks to a certain extent because more bits need to be flipped to cause errors; a small neural network, on the other hand, accumulates less error due to its fewer layers, which also suppresses the damage to a certain extent.
  • Comparing the three types of defense work: the first type tries to improve the resistance of the DNN itself to bit flips, whereas the latter two try to detect bit flips as accurately as possible. Subsequent work can combine these ideas for a comprehensive evaluation.
  • In general, bit-flip detection based on typical features has the lowest complexity, but it can only detect, not correct, the flips; works that raise the threshold for bit flipping have moderate complexity; and machine learning-based work has the highest complexity due to feature generation/extraction and model training.
3.
What to avoid:
  • The work of [90] shows that the effect of adversarial weight training is extremely limited, because bit-flip attacks generally target the inference (working) stage of the DNN directly.
  • Commonly used error-correction mechanisms, such as ECC and SGX, can be bypassed and, thus, have a quite limited effect. Moreover, on resource-constrained platforms such as edge devices, it may not even be possible to deploy ECC or SGX; most works instead use low-cost and fast hash functions for status verification.

6. Related Work

There are few reviews related to bit-flip attacks on DNNs and the corresponding defenses. Table 12 compares the related works with ours. Hector et al. [90] state that the current evaluation of BFA is imperfect; for example, it does not consider the actual capabilities of the attacker, and, in practice, not many bits can be flipped. They discuss the impact of bit flipping on training parameters and model architecture, and show the different impacts on CNNs and fully connected networks. Tsai et al. [91] study the influence of weight perturbation on the robustness and generalization ability of DNNs. They analyze weight perturbations for both single and multiple layers, investigate generalization properties using Rademacher complexity, and propose a loss formulation for training robust and generalizable neural networks. Khalid et al. [1] provide a brief description of related work on DNN fault-injection attacks. Compared with their work, we provide a more detailed and comprehensive description of the principles of bit flipping, the attack methods, and the existing defense methods.
Hajiamin et al. [92] briefly describe various current attacks on ML, including adversarial sample attacks, data poisoning attacks, and side-channel measurement attacks. They also give some insights into possible attacks during the interaction between ML and systems, as well as some software and hardware defenses. In addition to a more detailed and targeted exposition of the attack/defense principles of bit flipping, our work also provides a comprehensive and detailed overview of the progress of related work in recent years. Naseredini et al. [93] model DRAM, covering its timing behavior, the bit-flip behavior under RowHammer attacks, and the ECC mechanism and TRR strategy against RowHammer; they rely on the LearnLib automata learning algorithm to construct a RowHammer machine. This machine can be used to model and infer the key features relevant to RowHammer behavior, such as the number of memory accesses that trigger RowHammer, the number of memory accesses that trigger the TRR strategy, and the maximum number of bits that ECC can correct. This work does not directly defend against RowHammer but uses active learning to understand DRAM and the parameters of RowHammer attacks more deeply, providing a different perspective for future defense work. Our work discusses the attack and defense ideas, designs, and effects of bit flipping from the perspective of DNNs, which is fundamentally different.
Kim et al. [15] claim that ECC helps prevent RowHammer attacks despite its high cost; however, several subsequent works show that carefully designed RowHammer attacks can bypass ECC. Cojocar et al. [3] reverse-engineer the ECC implementation in commercial ECC-DRAM and propose ECCploit, which can effectively detect ECC vulnerabilities and mount RowHammer attacks. Di Dio et al. [94] propose an ECC-based RowHammer mitigation in which the operating system monitors error-corrected memory pages and takes the corresponding vulnerable pages offline once the bit-flip count exceeds a threshold. Orosa et al. [95] study how ECC influences RowHammer attacks and propose suggestions such as optimizing for non-uniform bit errors and reducing dependency on vulnerable chips. Chakraborty et al. [96] propose a fault attack on ECC-hardened DRAM: since error correction takes time, it can be used to construct a side channel through which secret information can be stolen. These works either try to enhance bit-flip attacks against ECC-DRAM or use ECC-DRAM to defend against bit flipping, but none of them target DNNs, which places them outside our scope.

7. Future Direction

Considering the shortcomings of the current work, we propose the following possible future research directions.
  • In almost all the bit-flip attacks, the attack method is described from the perspective of the software implementation of the DNN. Future work may focus on how bit flipping influences the DNN from the hardware point of view.
  • The purpose of bit-flip attacks is relatively straightforward, mainly degrading the capability of the DNN, such as its prediction accuracy. Other goals, such as bypassing verification or escalating permissions, can also be explored; future work can design attack methods with these targets in mind.
  • Black-box attacks are relatively rare. Most current attacks adopt the full-knowledge model or the restricted white-box model, i.e., the attacker knows all or most details of the DNN, which does not match the reality that black-box cases are the majority. Some works mention that details of a black-box DNN can be obtained through side-channel attacks, but the cases that can be handled this way are limited, and the corresponding black-box defense work is also lacking. Future work can design more adaptable attack models based on the black-box setting.
  • The current attack scenarios are mainly based on traditional CPU–DRAM platforms, and very little research attempts bit-flip attacks on the more commonly deployed GPU platforms. Future work may design attack and defense methods tailored to the characteristics of the GPU platform.
  • In summary, the main defense ideas are improving the robustness and resilience of the DNN itself and deploying additional mechanisms (hashing/machine learning models) to check whether a bit-flip attack has occurred. As these approaches do not conflict with each other, evaluating their combination may be worthwhile.

8. Conclusions

The vigorous development of DNNs in recent years has promoted their large-scale application in many fields, including many security-sensitive ones, which places high demands on the robustness and resilience of DNNs. A large body of work has shown that, in practical applications, DNNs are far more fragile than the industry assumes because of the uncertainty of their internal parameters; for example, a well-crafted bit-flip attack can degrade a DNN's accuracy to the level of random guessing. We summarize the current bit-flip attack methods: starting from the threat model and attack principles, we discuss the main attack methods and compare and analyze their effects. We also discuss the ideas behind current defenses against bit-flip attacks, describe their implementations, and qualitatively analyze their effectiveness and cost. By summarizing bit-flip attacks on DNNs and the corresponding defenses, we propose some empirical findings, such as flipping the MSBs to improve the attack success rate, carefully adjusting the activation function according to the defense scenario, and avoiding adversarial weight training as a defense. Finally, we put forward some possible future directions.

Author Contributions

Conceptualization, C.Q. and M.Z.; methodology, C.Q.; software, C.Q. and S.L.; validation, C.Q., Y.N. and S.L.; formal analysis, C.Q.; investigation, C.Q.; resources, M.Z.; data curation, Y.N.; writing—original draft preparation, C.Q.; writing—review and editing, M.Z.; visualization, Y.N.; supervision, H.C.; project administration, Y.N.; funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data were collected from the sources listed in the References section.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Khalid, F.; Hanif, M.A.; Shafique, M. Exploiting Vulnerabilities in Deep Neural Networks: Adversarial and Fault-Injection Attacks. arXiv 2021, arXiv:2105.03251. [Google Scholar]
  2. Rakin, A.S.; He, Z.; Fan, D. Bit-flip attack: Crushing neural network with progressive bit search. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 1211–1220. [Google Scholar]
  3. Cojocar, L.; Razavi, K.; Giuffrida, C.; Bos, H. Exploiting correcting codes: On the effectiveness of ecc memory against rowhammer attacks. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 20–22 May 2019; pp. 55–71. [Google Scholar]
  4. Zhang, D.; Yang, J.; Ye, D.; Hua, G. Lq-nets: Learned quantization for highly accurate and compact deep neural networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 365–382. [Google Scholar]
  5. Khoshavi, N.; Broyles, C.; Bi, Y. Compression or corruption? A study on the effects of transient faults on bnn inference accelerators. In Proceedings of the 2020 21st International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, USA, 25–26 March 2020; pp. 99–104. [Google Scholar]
  6. Moitra, A.; Panda, P. Exposing the robustness and vulnerability of hybrid 8T-6T SRAM memory architectures to adversarial attacks in deep neural networks. arXiv 2020, arXiv:2011.13392. [Google Scholar]
  7. Zhou, T.; Zhang, Y.; Duan, S.; Luo, Y.; Xu, X. Deep neural network security from a hardware perspective. In Proceedings of the 2021 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), Virtual, 8–10 November 2021; pp. 1–6. [Google Scholar]
  8. Tajik, S.; Ganji, F. Artificial neural networks and fault injection attacks. In Security and Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2022; pp. 72–84. [Google Scholar]
  9. Breier, J.; Hou, X.; Jap, D.; Ma, L.; Bhasin, S.; Liu, Y. Practical fault attack on deep neural networks. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, Canada, 15–19 October 2018; pp. 2204–2206. [Google Scholar]
  10. Hong, S.; Frigo, P.; Kaya, Y.; Giuffrida, C.; Dumitraș, T. Terminal brain damage: Exposing the graceless degradation in deep neural networks under hardware fault attacks. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 497–514. [Google Scholar]
  11. Arechiga, A.P.; Michaels, A.J. The robustness of modern deep learning architectures against single event upset errors. In Proceedings of the 2018 IEEE High Performance extreme Computing Conference (HPEC), Waltham, MA, USA, 25–27 September 2018; pp. 1–6. [Google Scholar]
  12. Arechiga, A.P.; Michaels, A.J. The effect of weight errors on neural networks. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 190–196. [Google Scholar] [CrossRef]
  13. Mukundan, J.; Hunter, H.; Kim, K.h.; Stuecheli, J.; Martínez, J.F. Understanding and mitigating refresh overheads in high-density DDR4 DRAM systems. ACM SIGARCH Comput. Archit. News 2013, 41, 48–59. [Google Scholar]
  14. Jung, M.; Weis, C.; Wehn, N. DRAMSys: A flexible DRAM subsystem design space exploration framework. IPSJ Trans. Syst. Lsi Des. Methodol. 2015, 8, 63–74. [Google Scholar] [CrossRef]
  15. Kim, Y.; Daly, R.; Kim, J.; Fallin, C.; Lee, J.H.; Lee, D.; Wilkerson, C.; Lai, K.; Mutlu, O. Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors. ACM SIGARCH Comput. Archit. News 2014, 42, 361–372. [Google Scholar]
  16. Hassan, H.; Tugrul, Y.C.; Kim, J.S.; Van der Veen, V.; Razavi, K.; Mutlu, O. Uncovering in-dram rowhammer protection mechanisms: A new methodology, custom rowhammer patterns, and implications. In Proceedings of the MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Athens, Greece, 18–22 October 2021; pp. 1198–1213. [Google Scholar]
  17. Yağlıkçı, A.G.; Luo, H.; De Oliviera, G.F.; Olgun, A.; Patel, M.; Park, J.; Hassan, H.; Kim, J.S.; Orosa, L.; Mutlu, O. Understanding RowHammer Under Reduced Wordline Voltage: An Experimental Study Using Real DRAM Devices. In Proceedings of the 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Baltimore, MD, USA, 27–30 June 2022; pp. 475–487. [Google Scholar]
  18. Walker, A.J.; Lee, S.; Beery, D. On DRAM rowhammer and the physics of insecurity. IEEE Trans. Electron Devices 2021, 68, 1400–1410. [Google Scholar] [CrossRef]
  19. Yao, F.; Rakin, A.S.; Fan, D. DeepHammer: Depleting the Intelligence of Deep Neural Networks through Targeted Chain of Bit Flips. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Virtual, 12–14 August 2020; pp. 1463–1480. [Google Scholar]
  20. Park, D.; Kwon, K.W.; Im, S.; Kung, J. ZeBRA: Precisely Destroying Neural Networks with Zero-Data Based Repeated Bit Flip Attack. arXiv 2021, arXiv:2111.01080. [Google Scholar]
  21. Amarnath, C.; Momtaz, M.I.; Chatterjee, A. Addressing Soft Error and Security Threats in DNNs Using Learning Driven Algorithmic Checks. In Proceedings of the 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design (IOLTS), Virtual, 28–30 June 2021; pp. 1–4. [Google Scholar]
  22. Cai, Y.; Chen, X.; Tian, L.; Wang, Y.; Yang, H. Enabling secure nvm-based in-memory neural network computing by sparse fast gradient encryption. IEEE Trans. Comput. 2020, 69, 1596–1610. [Google Scholar]
  23. Staudigl, F.; Al Indari, H.; Schön, D.; Sisejkovic, D.; Merchant, F.; Joseph, J.M.; Rana, V.; Menzel, S.; Leupers, R. NeuroHammer: Inducing bit-flips in memristive crossbar memories. In Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), Virtual, 14–23 March 2022; pp. 1181–1184. [Google Scholar]
  24. Jiang, L.; Zhang, Y.; Yang, J. Mitigating write disturbance in super-dense phase change memories. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Atlanta, GA, USA, 23–26 June 2014; pp. 216–227. [Google Scholar]
  25. Boutros, A.; Hall, M.; Papernot, N.; Betz, V. Neighbors from Hell: Voltage attacks against deep learning accelerators on multi-tenant FPGAs. In Proceedings of the 2020 International Conference on Field-Programmable Technology (ICFPT), Maui, HI, USA, 7–8 December 2020; pp. 103–111. [Google Scholar]
  26. Gnad, D.R.; Oboril, F.; Tahoori, M.B. Voltage drop-based fault attacks on FPGAs using valid bitstreams. In Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, 4–8 September 2017; pp. 1–7. [Google Scholar]
  27. Alam, M.M.; Tajik, S.; Ganji, F.; Tehranipoor, M.; Forte, D. RAM-Jam: Remote temperature and voltage fault attack on FPGAs using memory collisions. In Proceedings of the 2019 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), Atlanta, GA, USA, 24 August 2019; pp. 48–55. [Google Scholar]
  28. Liu, W.; Chang, C.H.; Zhang, F.; Lou, X. Imperceptible misclassification attack on deep learning accelerator by glitch injection. In Proceedings of the 2020 57th ACM/IEEE Design Automation Conference (DAC), Virtual, 20–24 June 2020; pp. 1–6. [Google Scholar]
  29. Dumont, M.; Moëllic, P.A.; Viera, R.; Dutertre, J.M.; Bernhard, R. An overview of laser injection against embedded neural network models. In Proceedings of the 2021 IEEE 7th World Forum on Internet of Things (WF-IoT), Virtual, 14–31 June 2021; pp. 616–621. [Google Scholar]
  30. Menu, A.; Dutertre, J.M.; Rigaud, J.B.; Colombier, B.; Moellic, P.A.; Danger, J.L. Single-bit laser fault model in NOR flash memories: Analysis and exploitation. In Proceedings of the 2020 Workshop on Fault Detection and Tolerance in Cryptography (FDTC), Virtual, 13 September 2020; pp. 41–48. [Google Scholar]
  31. Liu, L.; Guo, Y.; Cheng, Y.; Zhang, Y.; Yang, J. Generating Robust DNN with Resistance to Bit-Flip based Adversarial Weight Attack. IEEE Trans. Comput. 2022, 72, 401–413. [Google Scholar]
  32. Rakin, A.S.; Luo, Y.; Xu, X.; Fan, D. Deep-Dup: An adversarial weight duplication attack framework to crush deep neural network in multi-tenant FPGA. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Virtual, 11–13 August 2021; pp. 1919–1936. [Google Scholar]
  33. Breier, J.; Jap, D.; Hou, X.; Bhasin, S.; Liu, Y. SNIFF: Reverse engineering of neural networks with fault attacks. IEEE Trans. Reliab. 2021, 71, 1527–1539. [Google Scholar]
  34. Rakin, A.S.; Chowdhuryy, M.H.I.; Yao, F.; Fan, D. Deepsteal: Advanced model extractions leveraging efficient weight stealing in memories. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–26 May 2022; pp. 1157–1174. [Google Scholar]
  35. Jap, D.; Won, Y.S.; Bhasin, S. Fault injection attacks on SoftMax function in deep neural networks. In Proceedings of the 18th ACM International Conference on Computing Frontiers, Virtual, 11–13 May 2021; pp. 238–240. [Google Scholar]
  36. Khoshavi, N.; Broyles, C.; Bi, Y. A survey on impact of transient faults on bnn inference accelerators. arXiv 2020, arXiv:2004.05915. [Google Scholar]
  37. Fukuda, Y.; Yoshida, K.; Fujino, T. Fault Injection Attacks Utilizing Waveform Pattern Matching against Neural Networks Processing on Microcontroller. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2022, 105, 300–310. [Google Scholar]
  38. Cai, K.; Chowdhuryy, M.H.I.; Zhang, Z.; Yao, F. Seeds of SEED: NMT-Stroke: Diverting Neural Machine Translation through Hardware-based Faults. In Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), Virtual, 20–21 September 2021; pp. 76–82. [Google Scholar]
  39. Ghavami, B.; Sadati, M.; Shahidzadeh, M.; Fang, Z.; Shannon, L. BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks. arXiv 2021, arXiv:2112.03477. [Google Scholar]
  40. Lee, K.; Chandrakasan, A.P. SparseBFA: Attacking Sparse Deep Neural Networks with the Worst-Case Bit Flips on Coordinates. In Proceedings of the ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 4208–4212. [Google Scholar]
  41. Liu, Y.; Wei, L.; Luo, B.; Xu, Q. Fault injection attack on deep neural network. In Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Irvine, CA, USA, 13–17 November 2017; pp. 131–138. [Google Scholar]
  42. Zhao, P.; Wang, S.; Gongye, C.; Wang, Y.; Fei, Y.; Lin, X. Fault sneaking attack: A stealthy framework for misleading deep neural networks. In Proceedings of the 2019 56th ACM/IEEE Design Automation Conference (DAC), Las Vegas, NV, USA, 2–6 June 2019; pp. 1–6. [Google Scholar]
  43. Rakin, A.S.; He, Z.; Li, J.; Yao, F.; Chakrabarti, C.; Fan, D. T-bfa: Targeted bit-flip adversarial weight attack. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 7928–7939. [Google Scholar]
  44. Khare, Y.; Lakara, K.; Inukonda, M.S.; Mittal, S.; Chandra, M.; Kaushik, A. Design and Analysis of Novel Bit-flip Attacks and Defense Strategies for DNNs. In Proceedings of the 2022 IEEE Conference on Dependable and Secure Computing (DSC), Edinburgh, UK, 22–24 June 2022; pp. 1–8. [Google Scholar]
  45. Ghavami, B.; Movi, S.; Fang, Z.; Shannon, L. Stealthy Attack on Algorithmic-Protected DNNs via Smart Bit Flipping. In Proceedings of the 2022 23rd International Symposium on Quality Electronic Design (ISQED), California, CA, USA, 5–7 April 2022; pp. 1–7. [Google Scholar]
  46. Zhao, Y.; Hu, X.; Li, S.; Ye, J.; Deng, L.; Ji, Y.; Xu, J.; Wu, D.; Xie, Y. Memory trojan attack on neural network accelerators. In Proceedings of the 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), ALPEXPO, Grenoble, France, 25–29 March 2019; pp. 1415–1420. [Google Scholar]
  47. Venceslai, V.; Marchisio, A.; Alouani, I.; Martina, M.; Shafique, M. Neuroattack: Undermining spiking neural networks security through externally triggered bit-flips. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
  48. Rakin, A.S.; He, Z.; Fan, D. Tbt: Targeted neural network attack with bit trojan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 13198–13207. [Google Scholar]
  49. Breier, J.; Hou, X.; Ochoa, M.; Solano, J. FooBaR: Fault Fooling Backdoor Attack on Neural Network Training. IEEE Trans. Dependable Secur. Comput. 2022; Early Access. [Google Scholar] [CrossRef]
  50. Chen, H.; Fu, C.; Zhao, J.; Koushanfar, F. Proflip: Targeted trojan attack with progressive bit flips. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 7718–7727. [Google Scholar]
  51. Tol, M.C.; Islam, S.; Sunar, B.; Zhang, Z. Toward Realistic Backdoor Injection Attacks on DNNs using Rowhammer. arXiv 2022, arXiv:2110.07683. [Google Scholar]
  52. Bai, J.; Gao, K.; Gong, D.; Xia, S.T.; Li, Z.; Liu, W. Hardly perceptible trojan attack against neural networks with bit flips. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 104–121. [Google Scholar]
  53. Mukherjee, R.; Chakraborty, R.S. Novel Hardware Trojan Attack on Activation Parameters of FPGA-based DNN Accelerators. IEEE Embed. Syst. Lett. 2022, 14, 131–134. [Google Scholar]
  54. Cai, K.; Zhang, Z.; Yao, F. On the Feasibility of Training-time Trojan Attacks through Hardware-based Faults in Memory. In Proceedings of the 2022 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), McLean, VA, USA, 27–30 June 2022; pp. 133–136. [Google Scholar]
  55. Zheng, M.; Lou, Q.; Jiang, L. TrojViT: Trojan Insertion in Vision Transformers. arXiv 2022, arXiv:2208.13049. [Google Scholar]
  56. Bai, J.; Wu, B.; Li, Z.; Xia, S.t. Versatile Weight Attack via Flipping Limited Bits. arXiv 2022, arXiv:2207.12405. [Google Scholar]
  57. Alam, M.; Bag, A.; Roy, D.B.; Jap, D.; Breier, J.; Bhasin, S.; Mukhopadhyay, D. Neural Network-based Inherently Fault-tolerant Hardware Cryptographic Primitives without Explicit Redundancy Checks. ACM J. Emerg. Technol. Comput. Syst. (JETC) 2020, 17, 1–30. [Google Scholar]
  58. Yağlikçi, A.G.; Patel, M.; Kim, J.S.; Azizi, R.; Olgun, A.; Orosa, L.; Hassan, H.; Park, J.; Kanellopoulos, K.; Shahroodi, T.; et al. Blockhammer: Preventing rowhammer at low cost by blacklisting rapidly-accessed dram rows. In Proceedings of the 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Virtual, 27 February–3 March 2021; pp. 345–358. [Google Scholar]
  59. Rakin, A.S.; Yang, L.; Li, J.; Yao, F.; Chakrabarti, C.; Cao, Y.; Seo, J.S.; Fan, D. Ra-bnn: Constructing robust & accurate binary neural network to simultaneously defend adversarial bit-flip attack and improve accuracy. arXiv 2021, arXiv:2103.13813. [Google Scholar]
  60. Li, J.; Rakin, A.S.; Xiong, Y.; Chang, L.; He, Z.; Fan, D.; Chakrabarti, C. Defending bit-flip attack through dnn weight reconstruction. In Proceedings of the 2020 57th ACM/IEEE Design Automation Conference (DAC), Virtual, 20–24 June 2020; pp. 1–6. [Google Scholar]
  61. Hosseini, F.S.; Liu, Q.; Meng, F.; Yang, C.; Wen, W. Safeguarding the Intelligence of Neural Networks with Built-in Light-weight Integrity MArks (LIMA). In Proceedings of the 2021 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), Virtual, 13–14 December 2021; pp. 1–12. [Google Scholar]
  62. Schorn, C.; Guntoro, A.; Ascheid, G. An efficient bit-flip resilience optimization method for deep neural networks. In Proceedings of the 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 25–29 March 2019; pp. 1507–1512. [Google Scholar]
  63. Schorn, C.; Elsken, T.; Vogel, S.; Runge, A.; Guntoro, A.; Ascheid, G. Automated design of error-resilient and hardware-efficient deep neural networks. Neural Comput. Appl. 2020, 32, 18327–18345. [Google Scholar] [CrossRef]
  64. He, Z.; Rakin, A.S.; Li, J.; Chakrabarti, C.; Fan, D. Defending and harnessing the bit-flip based adversarial weight attack. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 14095–14103. [Google Scholar]
  65. Feng, X.; Ye, M.; Xia, K.; Wei, S. Runtime Fault Injection Detection for FPGA-based DNN Execution Using Siamese Path Verification. In Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Virtual, 1–5 February 2021; pp. 786–789. [Google Scholar]
  66. Stutz, D.; Chandramoorthy, N.; Hein, M.; Schiele, B. Random and adversarial bit error robustness: Energy-efficient and secure DNN accelerators. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 3632–3647. [Google Scholar]
  67. Khoshavi, N.; Maghsoudloo, M.; Roohi, A.; Sargolzaei, S.; Bi, Y. HARDeNN: Hardware-assisted attack-resilient deep neural network architectures. Microprocess. Microsystems 2022, 95, 104710. [Google Scholar] [CrossRef]
  68. Özdenizci, O.; Legenstein, R. Improving Robustness Against Stealthy Weight Bit-Flip Attacks by Output Code Matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–23 June 2022; pp. 13388–13397. [Google Scholar]
  69. Köylü, T.Ç.; Hamdioui, S.; Taouil, M. Smart Redundancy Schemes for ANNs Against Fault Attacks. In Proceedings of the 2022 IEEE European Test Symposium (ETS), Barcelona, Spain, 23–27 May 2022; pp. 1–2. [Google Scholar]
  70. Li, J.; Rakin, A.S.; He, Z.; Fan, D.; Chakrabarti, C. Radar: Run-time adversarial weight attack detection and accuracy recovery. In Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Virtual, 1–5 February 2021; pp. 790–795. [Google Scholar]
  71. Singh, N.; Rebeiro, C. LEASH: Enhancing Micro-architectural Attack Detection with a Reactive Process Scheduler. arXiv 2021, arXiv:2109.03998. [Google Scholar]
  72. Guo, Y.; Liu, L.; Cheng, Y.; Zhang, Y.; Yang, J. ModelShield: A Generic and Portable Framework Extension for Defending Bit-Flip based Adversarial Weight Attacks. In Proceedings of the 2021 IEEE 39th International Conference on Computer Design (ICCD), Storrs, CT, USA, 24–27 October 2021; pp. 559–562. [Google Scholar]
  73. Javaheripi, M.; Koushanfar, F. HASHTAG: Hash Signatures for Online Detection of Fault-Injection Attacks on Deep Neural Networks. In Proceedings of the 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), Virtual, 1–4 November 2021; pp. 1–9. [Google Scholar]
  74. Javaheripi, M.; Chang, J.W.; Koushanfar, F. AccHashtag: Accelerated Hashing for Detecting Fault-Injection Attacks on Embedded Neural Networks. ACM J. Emerg. Technol. Comput. Syst. 2022, 19, 1–20. [Google Scholar]
  75. Cherupally, S.K.; Rakin, A.S.; Yin, S.; Seok, M.; Fan, D.; Seo, J.s. Leveraging Noise and Aggressive Quantization of In-Memory Computing for Robust DNN Hardware Against Adversarial Input and Weight Attacks. In Proceedings of the 2021 58th ACM/IEEE Design Automation Conference (DAC), Virtual, 5–9 December 2021; pp. 559–564. [Google Scholar]
  76. Chakraborty, A.; Alam, M.; Mukhopadhyay, D. Deep learning based diagnostics for Rowhammer protection of DRAM chips. In Proceedings of the 2019 IEEE 28th Asian Test Symposium (ATS), Kolkata, India, 10–13 December 2019; pp. 86–865. [Google Scholar]
  77. Liu, Q.; Wen, W.; Wang, Y. Concurrent weight encoding-based detection for bit-flip attack on neural network accelerators. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Paris, France, 7–9 October 2020. [Google Scholar]
  78. Gulmezoglu, B.; Moghimi, A.; Eisenbarth, T.; Sunar, B. Fortuneteller: Predicting microarchitectural attacks via unsupervised deep learning. arXiv 2019, arXiv:1907.03651. [Google Scholar]
  79. Li, Y.; Li, M.; Luo, B.; Tian, Y.; Xu, Q. Deepdyve: Dynamic verification for deep neural networks. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 101–112. [Google Scholar]
  80. Chakraborty, A.; Alam, M.; Mukhopadhyay, D. A Good Anvil Fears No Hammer: Automated Rowhammer Detection Using Unsupervised Deep Learning. In Proceedings of the International Conference on Applied Cryptography and Network Security, Virtual, 21–24 June 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 59–77. [Google Scholar]
  81. Köylü, T.Ç.; Reinbrecht, C.R.W.; Hamdioui, S.; Taouil, M. Deterministic and Statistical Strategies to Protect ANNs against Fault Injection Attacks. In Proceedings of the 2021 18th International Conference on Privacy, Security and Trust (PST), Virtual, 13–15 December 2021; pp. 1–10. [Google Scholar]
  82. Kuruvila, A.P.; Meng, X.; Kundu, S.; Pandey, G.; Basu, K. Explainable Machine Learning for Intrusion Detection via Hardware Performance Counters. IEEE Trans.-Comput.-Aided Des. Integr. Circuits Syst. 2022, 41, 4952–4964. [Google Scholar] [CrossRef]
  83. Polychronou, N.F.; Thevenon, P.H.; Puys, M.; Beroulle, V. MaDMAN: Detection of Software Attacks Targeting Hardware Vulnerabilities. In Proceedings of the 2021 24th Euromicro Conference on Digital System Design (DSD), Virtual, 1–3 September 2021; pp. 355–362. [Google Scholar]
  84. Joardar, B.K.; Bletsch, T.K.; Chakrabarty, K. Learning to mitigate rowhammer attacks. In Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), Virtual, 14–23 March 2022; pp. 564–567. [Google Scholar]
  85. Alam, M.; Bhattacharya, S.; Mukhopadhyay, D.; Bhattacharya, S. Performance counters to rescue: A machine learning based safeguard against micro-architectural side-channel-attacks. Cryptol. ePrint Arch. 2017. Available online: https://eprint.iacr.org/2017/564 (accessed on 6 February 2023).
  86. Mirbagher-Ajorpaz, S.; Pokam, G.; Mohammadian-Koruyeh, E.; Garza, E.; Abu-Ghazaleh, N.; Jiménez, D.A. Perspectron: Detecting invariant footprints of microarchitectural attacks with perceptron. In Proceedings of the 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Virtual, 17–21 October 2020; pp. 1124–1137. [Google Scholar]
  87. Altland, E.; Castellanos, J.; Detwiler, J.; Fermin, P.; Ferrá, R.; Kelly, C.; Latoski, C.; Ma, T.; Maher, T.; Kuzin, J.M.; et al. Quantifying Degradations of Convolutional Neural Networks in Space Environments. In Proceedings of the 2019 IEEE Cognitive Communications for Aerospace Applications Workshop (CCAAW), Cleveland, OH, USA, 25–26 June 2019; pp. 1–7. [Google Scholar] [CrossRef]
  88. Malekzadeh, E.; Rohbani, N.; Lu, Z.; Ebrahimi, M. The Impact of Faults on DNNs: A Case Study. In Proceedings of the 2021 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Austin, TX, USA, 19–21 October 2021; pp. 1–6. [Google Scholar]
  89. Gao, Z.; Wei, X.; Zhang, H.; Li, W.; Ge, G.; Wang, Y.; Reviriego, P. Reliability Evaluation of Pruned Neural Networks against Errors on Parameters. In Proceedings of the 2020 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Virtual, 19–21 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
  90. Hector, K.; Moëllic, P.A.; Dumont, M.; Dutertre, J.M. A closer look at evaluating the Bit-Flip Attack against deep neural networks. In Proceedings of the 2022 IEEE 28th International Symposium on On-Line Testing and Robust System Design (IOLTS), Torino, Italy, 12–14 September 2022; pp. 1–5. [Google Scholar]
  91. Tsai, Y.L.; Hsu, C.Y.; Yu, C.M.; Chen, P.Y. Formalizing generalization and adversarial robustness of neural networks to weight perturbations. Adv. Neural Inf. Process. Syst. 2021, 34, 19692–19704. [Google Scholar]
  92. HajiAmin Shirazi, S.; Naghibijouybari, H.; Abu-Ghazaleh, N. Securing machine learning architectures and systems. In Proceedings of the 2020 on Great Lakes Symposium on VLSI, Virtual, 8–11 September 2020; pp. 499–506. [Google Scholar]
  93. Naseredini, A.; Berger, M.; Sammartino, M.; Xiong, S. ALARM: Active LeArning of Rowhammer Mitigations. arXiv 2022, arXiv:2211.16942. [Google Scholar]
  94. Di Dio, A.; Koning, K.; Bos, H.; Giuffrida, C. Copy-on-Flip: Hardening ECC Memory Against Rowhammer Attacks. In Proceedings of the Network and Distributed System Security (NDSS) Symposium 2023, San Diego, CA, USA, 27 February–3 March 2023. [Google Scholar]
  95. Orosa, L.; Yaglikci, A.G.; Luo, H.; Olgun, A.; Park, J.; Hassan, H.; Patel, M.; Kim, J.S.; Mutlu, O. A deeper look into rowhammer’s sensitivities: Experimental analysis of real dram chips and implications on future attacks and defenses. In Proceedings of the MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Virtual, 18–22 October 2021; pp. 1182–1197. [Google Scholar]
  96. Chakraborty, A.; Bhattacharya, S.; Saha, S.; Mukhopadhyay, D. Rowhammer induced intermittent fault attack on ECC-hardened memory. Cryptol. ePrint Arch. 2020. Available online: https://eprint.iacr.org/2020/380 (accessed on 6 February 2023).
Figure 1. Bit-flip attack and defense related works in recent years.
Figure 2. Research flow of this paper.
Figure 3. DRAM hardware structure.
Figure 4. The structure of the DRAM array and DRAM cell.
Figure 5. Situations of single-sided hammering and double-sided hammering.
Figure 6. How clock glitching influences the branch address selection.
Figure 7. Some factors that an attacker needs to consider.
Figure 8. Different access permissions to DNN resources for different threat models.
Figure 9. The structure of the DNN model and the attackable parts in a DNN cell.
Figure 10. An untargeted attack example.
Figure 13. Findings based on existing methods.
Table 1. Statistics of bit-flip attack methods in recent related works.

| Attack Method | Numbers | Proportion |
|---|---|---|
| RowHammer | 15 | 45.5% |
| VFS | 1 | 3% |
| Clock Glitching | 4 | 12.1% |
| Laser Injection | 2 | 6.1% |
| Not Specified | 11 | 33.3% |
Table 2. The detailed attack aspects for untargeted bit-flip attacks.

| Cited Paper | Threat Model | Attack Target | Bit-Flip Method | Attack Goal |
|---|---|---|---|---|
| Practical Fault Attack [2018] | All 3 models | Activation functions | Laser injection | High misclassification rate |
| BFA [2019] | Full knowledge or restricted white-box model | Fixed-point weights | RowHammer | Accuracy degradation |
| Faults on BNN [2020] | Full knowledge or restricted white-box model | Weights, activations | Not specified | Accuracy degradation |
| Imperceptible Misclassification Attack [2020] | All 3 models | Clock signal | Clock glitching | High misclassification rate |
| Survey on Faults of BNN [2020] | All 3 models | Clock signal | Clock glitching | Accuracy degradation |
| DeepHammer [2020] | Full knowledge or restricted white-box model | DNN weights | RowHammer | Random-guess accuracy |
| Deep-Dup [2021] | All 3 models | Power distribution system | VFS | Random-guess accuracy |
| Fault Injection on SoftMax [2021] | Full knowledge or restricted white-box model | SoftMax activation function | Not specified | Moderate misclassification rate |
| Fault Injection utilizing Waveform Pattern Matching [2021] | Full knowledge or restricted white-box model | SoftMax activation function | Clock glitching | Random-guess accuracy |
| Seeds of SEED [2021] | Full knowledge or restricted white-box model | DNN parameters | RowHammer | Obfuscated semantics |
| BDFA [2022] | Full knowledge or restricted white-box model | Fixed-point weights | Not specified | Random-guess accuracy |
| SPARSE BFA [2022] | Full knowledge or restricted white-box model | DNN weights | RowHammer | Random-guess accuracy |
Table 3. The detailed attack aspects for targeted bit-flip attacks.

| Cited Paper | Threat Model | Attack Target | Bit-Flip Method | Attack Goal |
|---|---|---|---|---|
| Fault Injection Attack on DNN [2017] | Full knowledge and restricted white-box model | DNN weights and biases | Not specified | Misclassification on targeted data, rest accuracy unaffected |
| Fault Sneaking Attack [2019] | Full knowledge and restricted white-box model | DNN parameters | Not specified | Misclassification on targeted data, rest accuracy unaffected |
| T-BFA [2021] | Full knowledge and restricted white-box model | DNN weights | Not specified | High attack success rate, rest accuracy preserved |
| Design and Analysis of Novel Bit-Flip Attacks [2022] | Full knowledge and restricted white-box model | MSBs in significant layers | Not specified | Misclassification on targeted data, rest accuracy preserved |
| Stealthy Attack [2022] | Full knowledge and restricted white-box model | Bits along the opposite direction of the gradient | RowHammer | High robustness drop, rest accuracy unaffected |
Table 4. The detailed attack aspects for bit-flip-based trojan-related works.

| Cited Paper | Threat Model | Attack Target | Bit-Flip Method | Attack Goal |
|---|---|---|---|---|
| TBT [2020] | Full knowledge model | DNN parameters, inputs | RowHammer | High success rate, low rest accuracy degradation |
| FooBaR [2021] | Full knowledge model | ReLU activations, inputs | Not specified | High success rate, low rest accuracy degradation |
| ProFlip [2021] | Full knowledge model | Vulnerable bits in the last layer, inputs | Not specified | High success rate, low rest accuracy degradation |
| Toward Realistic Backdoor Injection [2022] | Full knowledge model | DNN weights, inputs | RowHammer | High success rate, low rest accuracy degradation |
| Hardly Perceptible Trojan Attack [2022] | Full knowledge model | DNN parameters, inputs | Not specified | High success rate, low rest accuracy degradation |
| Novel Hardware Trojan Attack [2022] | Full knowledge and restricted white-box model | Activation function, inputs | Clock glitching | Accuracy degradation |
| Training-Time Trojan Attacks [2022] | Full knowledge model | DNN parameters, inputs | RowHammer | High success rate, low rest accuracy degradation |
| TrojViT [2022] | Full knowledge model | DNN parameters, inputs | RowHammer | High success rate, low rest accuracy degradation |
| Versatile Weight Attack [2022] | Full knowledge model | DNN weights, inputs | Not specified | High success rate, low rest accuracy degradation |
Table 5. The results of untargeted attack-related works.

| Cited Paper | Effect | Flipped Bits/Elements |
|---|---|---|
| Practical Fault Attack [2018] | ≥50% misclassification rate | ≥50% of neurons for Sigmoid and tanh, ≥75% of neurons for ReLU |
| BFA [2019] | 0.1% top-1 accuracy on the ImageNet dataset | 13 in 93 million bits |
| Faults on BNN [2020] | ∼57.5% accuracy degradation | ∼100 bits |
| Imperceptible Misclassification Attack [2020] | More than 98% misclassification rate in 8 out of 9 models | ∼100 bits |
| Survey on Faults of BNN [2020] | 20%∼80% accuracy degradation | 1∼100 bits |
| DeepHammer [2020] | ≤10% accuracy | 2∼24 bits |
| Deep-Dup [2021] | ≤11% accuracy | 70+ attacks |
| Fault Injection on SoftMax [2021] | ≤30% misclassification rate | 1 bit |
| Fault Injection utilizing Waveform Pattern Matching [2021] | ≤10% accuracy | ≤20 bits |
| Seeds of SEED [2021] | ≥90% EMR ¹, ≥90% BLEU ² | ≤80 bits |
| BDFA [2022] | ∼11% accuracy | ∼8 bits |
| SPARSE BFA [2022] | ≤11% accuracy | 0.00005% of total bits |

¹ The output sequence match rate. ² Bilingual evaluation understudy score.
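To give some intuition for why the very small bit counts in Table 5 can collapse model accuracy, the short Python sketch below shows the numeric effect of flipping the most significant bit of an 8-bit fixed-point weight. The quantization scale and the weight value are arbitrary assumptions chosen only for illustration, not values taken from the cited works.

```python
import numpy as np

scale = 0.01                      # assumed quantization step: real weight = int8 value * scale
w_int = np.int8(23)               # a benign quantized weight, i.e., 0.23 in real value
flipped = w_int ^ np.int8(-128)   # flip bit 7 (the MSB), as a RowHammer-style fault would

# The weight jumps from a small positive value to a large negative one.
print(round(float(w_int) * scale, 2), "->", round(float(flipped) * scale, 2))   # 0.23 -> -1.05
```

A single MSB flip therefore moves a weight by half of the representable range, which explains why attacks such as BFA and DeepHammer deliberately target the most significant bits.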
Table 6. The results of targeted attack-related works.

| Cited Paper | Effect | Flipped Bits/Elements | Influence on Non-Targeted Data |
|---|---|---|---|
| Fault Injection Attack on DNN [2017] | Achieves misclassification on targeted data | SBA: 1 parameter, GBA: 400+ parameters | SBA: ∼24% average accuracy, GBA: ∼3% accuracy degradation |
| Fault Sneaking Attack [2019] | Achieves misclassification on targeted data | Not specified | ∼1% accuracy degradation |
| T-BFA [2021] | ∼100% attack success rate | 27 bits | ∼30% accuracy degradation |
| Design and Analysis of Novel Bit-Flip Attacks [2022] | Achieves misclassification on targeted data | ≤60 bits | ≤20% accuracy degradation |
| Stealthy Attack [2022] | 60%∼72% robustness drop | 30∼100 bits | ≤1% accuracy degradation |
Table 7. The results of bit-flip-based trojan-related attacks.

| Cited Paper | Effect | Flipped Bits/Elements | Influence on Non-Targeted Data |
|---|---|---|---|
| TBT [2020] | 92% success rate | 84 in 88 million bits | 12% accuracy degradation |
| FooBaR [2021] | 60%∼100% success rate | 25 neurons | ≤2% accuracy degradation |
| ProFlip [2021] | ∼100% success rate | 7.37 bits on average | 0.09% accuracy degradation |
| Toward Realistic Backdoor Injection [2022] | 94% success rate | 10 in 2.2 million bits | ≤1.66% accuracy degradation |
| Hardly Perceptible Trojan Attack [2022] | 89%∼99% success rate | 8∼14 bits | ≤2.1% accuracy degradation |
| Novel Hardware Trojan Attack [2022] | 48%∼55% accuracy degradation | Not mentioned | Not mentioned |
| Training-Time Trojan Attacks [2022] | ∼100% success rate | Not mentioned | ≤2.8% accuracy degradation |
| TrojViT [2022] | 99.64% success rate | 345 bits | ≤1% accuracy degradation |
| Versatile Weight Attack [2022] | SSA: 100% success rate, TSA: 95.63% success rate | SSA: 7.37 bits, TSA: 3.4 bits | 0.05% accuracy degradation |
Table 8. The attributes of works that aim to reduce the harm of bit flips.

| Cited Paper | Defense Target | Tackling Method | Defense Goal | Effect |
|---|---|---|---|---|
| Efficient Bit-Flip Resilience Optimization [2019] | DNN neurons | Fine-tune significant neurons | Keep weights stable | 40% failure rate reduction for a 1-bit flip, ∼20% average failure rate for MBUs |
| Automated design of error-resilient [2020] | DNN structure | Neural network structure search techniques | Optimize the DNN structure | 6×∼7× CCR ¹ reduction at a 0.5% bit error rate |
| Defending and Harnessing the Bit-Flips [2020] | DNN weights | Compress weights and retrain | Mitigate the gap incurred by bit flipping | 19.3× and 480.1× more bit flips needed on ResNet-20 and VGG-11 |
| DNN Weight Reconstruction [2020] | DNN weights | Propagate the value change to neighboring cells | Reconstruct DNN weights | ∼60% accuracy improvement |
| Runtime Fault Injection Detection [2021] | DNN neurons | Redundant neurons | Make the DNN resilient | 10%∼40% accuracy improvement |
| RA-BNN [2021] | DNN weights and activations | Binarization of weights and activations | Keep weights stable | 125× more bit flips needed, 2%∼8% clean accuracy improvement |
| Generating Robust DNN [2022] | DNN MSBs | Obfuscate the bit sequence, non-linear coding | Force the attacker into a black-box setting | 17× bit-flip tolerance compared with the raw model |
| Random and Adversarial Bit Error Robustness [2022] | Bit flips due to voltage drops | Weight quantization and pruning | Keep weights stable | ∼63% error rate reduction under 320 bit flips |
| HARDeNN [2022] | DNN weights and activations | Triple-module redundancy for weights and activations | Make the DNN resilient | 17.19%∼96.15% error-resiliency improvement |
| Smart Redundancy Schemes [2022] | Vulnerable elements | Use GDA to identify vulnerable elements and add redundancy for them | Make the DNN resilient | 93%∼99% protection rate (a bit flip does not incur a fault) |

¹ Image misclassification rate after bit flipping.
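As a concrete illustration of the redundancy idea behind several entries in Table 8 (e.g., the triple-module redundancy used for weights and activations in HARDeNN), the sketch below keeps three copies of a quantized weight array and majority-votes them bit by bit before use. The int8 layout, the copy count, and the voting granularity are simplifying assumptions of ours rather than the exact mechanisms of the cited defenses.

```python
import numpy as np

def bitwise_majority(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Per-bit majority vote over three equally shaped int8 weight copies."""
    return (a & b) | (b & c) | (c & a)

rng = np.random.default_rng(0)
w = rng.integers(-128, 127, size=256, dtype=np.int8)        # "golden" quantized weights
copy1, copy2, copy3 = w.copy(), w.copy(), w.copy()          # redundant copies kept in memory

copy2[7] ^= np.int8(-128)                                   # a single bit flip corrupts one copy
restored = bitwise_majority(copy1, copy2, copy3)            # vote before using the weights

print(np.array_equal(restored, w))                          # True: the flipped bit is out-voted
```

The obvious cost is a roughly threefold storage overhead, which is why the cited works typically restrict such redundancy to the most vulnerable elements instead of the whole model.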
Table 9. Factors used to detect bit flips.

| Cited Paper | Feature Used |
|---|---|
| Deterministic and Statistical Strategies [2021] | Output labels of input data, activation rate of neurons |
| LIMA [2021] | Equivalence between MSBs and LSBs of a set of weights |
| Concurrent Weight Encoding-based Detection [2020] | Weight codes |
| RADAR [2021] | 2-bit hash signature of each weight group |
| LEASH [2021] | HPC information of the process |
| ModelShield [2021] | Hash value of the weights for each layer |
| HASHTAG [2022] | Partial weight hashes for the vulnerable layers |
| AccHashtag [2022] | A hash signature for each layer |
Table 10. The attributes of works that aim to use specific features for detecting bit flips.

| Cited Paper | Defense Target | Tackling Method | Defense Goal | Effect | Overhead |
|---|---|---|---|---|---|
| RADAR [2021] | DNN weights | 2-bit signature for each weight group | Weight verification | 96% detection rate, 69% accuracy improvement | ≤1% time overhead |
| LIMA [2021] | DNN thread | Use HPCs to quantify maliciousness | Constrain the source of the malicious thread | 70%∼99.5% BFA detection rate | 0.5% accuracy degradation, 10%∼23% false negatives |
| ModelShield [2021] | DNN weights | Compare hashes of the weights in each layer | Weight verification | Accuracy unchanged when bit flips exceed the threshold | ≤2% latency overhead |
| HASHTAG [2022] | DNN weights | Compare hashes of the weights in the k most vulnerable layers | Weight verification | ∼100% detection rate | 257 B storage cost per layer, ≤2 ms time cost |
| AccHashtag [2022] | DNN weights | Compare hashes of the weights in the k most vulnerable layers | Weight verification | ∼100% detection rate | ≤1.3 KB storage cost and ≤1% runtime cost |
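To make the hash-based checks summarized in Tables 9 and 10 concrete, the following Python sketch stores a per-layer digest of the verified weights and reports any layer whose digest changes at run time. The layer names, the SHA-256 digest, and the int8 weight layout are illustrative assumptions; the cited schemes (RADAR, ModelShield, HASHTAG, AccHashtag) use their own, more lightweight signatures and layer-selection strategies.

```python
import hashlib
import numpy as np

def layer_digest(weights: np.ndarray) -> str:
    """Hash the raw byte representation of one layer's weight tensor."""
    return hashlib.sha256(weights.tobytes()).hexdigest()

def build_reference(layers: dict) -> dict:
    """Compute golden digests once, right after the model has been verified."""
    return {name: layer_digest(w) for name, w in layers.items()}

def verify(layers: dict, reference: dict) -> list:
    """Return the names of layers whose current digest no longer matches."""
    return [name for name, w in layers.items() if layer_digest(w) != reference[name]]

# Toy model: two layers of int8 (fixed-point) weights.
rng = np.random.default_rng(0)
layers = {"conv1": rng.integers(-128, 127, size=(16, 3, 3, 3), dtype=np.int8),
          "fc": rng.integers(-128, 127, size=(10, 256), dtype=np.int8)}
golden = build_reference(layers)

layers["fc"][0, 0] ^= np.int8(-128)      # simulate a RowHammer flip of one MSB
print(verify(layers, golden))            # ['fc']
```

The trade-off visible in Table 10 is between check granularity and overhead: hashing every layer gives full coverage, while restricting checks to the k most vulnerable layers keeps the storage and latency costs in the byte and millisecond range.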
Table 11. Attributes of works that aim to use machine learning for detecting bit flips.

| Cited Paper | Training Input | Tackling Method | Defense Goal | Effect |
|---|---|---|---|---|
| Deep Learning-based Diagnostics [2019] | Memory traces | Build a CNN to judge whether a bit flip exists | Detection based on HPCs | ∼75% accuracy, 1.5 s detection time |
| Concurrent Weight Encoding-based Detection [2020] | Vulnerability-sensitive bits | Use Hamming distance to judge whether a bit flip exists | Detection based on bit-code comparison | ≥90% effective detection rate |
| FortuneTeller [2019] | Sensor time series of normally executing programs | Train an RNN to predict real-time sensor values | Detection based on sensor value comparison | 0.997 F-score |
| DeepDyve [2020] | Original DNN model | Compare consistent invariants | Detection based on consistent invariants | ≥97% coverage for BFA |
| A Good Anvil Fears No Hammer [2021] | Virtual memory traces | Compare reconstructed memory traces with the original ones | Detection based on memory traces | 97% detection accuracy |
| Deterministic and Statistical Strategies [2021] | Intermediate states, final output, activation rate of neurons | Verify changes in intermediate states, final output, and activation rate | Detection based on DNN status | ≥96% detection coverage |
| HPCDR [2021] | HPCs | Build a DNN to judge whether a bit flip exists | Detection based on HPCs | ∼84% detection accuracy |
| Addressing Soft Error [2021] | Original input and output | Compare changes in the output | Detection based on output | 92% detection coverage for 4-bit errors, 99% for bit-trojan attacks, ≥80% for activation function attacks |
| Learning to Mitigate [2022] | Access status of DRAM rows | Excessive row accesses trigger a warning | Detection based on access status | ∼100% RowHammer detection before a certain number of accesses |
| Performance Counters to Rescue [2022] | HPCs | Build a DNN to judge whether a bit flip exists | Detection based on HPCs | ≥80% classification accuracy |
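Several detectors in Table 11 ultimately reduce to comparing the current bit pattern of the protected weights against a stored reference, for example via a Hamming distance as in the concurrent weight encoding-based detection. The sketch below shows that comparison in its simplest form; the int8 encoding and the zero-tolerance threshold are our own assumptions for illustration, not the encoding proposed in the cited work.

```python
import numpy as np

def hamming_distance(reference: np.ndarray, current: np.ndarray) -> int:
    """Count differing bits between two equally shaped int8 weight arrays."""
    diff = np.bitwise_xor(reference.view(np.uint8), current.view(np.uint8))
    return int(np.unpackbits(diff).sum())

rng = np.random.default_rng(0)
golden = rng.integers(-128, 127, size=1024, dtype=np.int8)    # stored reference bit pattern
runtime = golden.copy()
runtime[42] ^= np.int8(64)                                    # one bit flipped at run time

d = hamming_distance(golden, runtime)
print(d, "bit(s) flipped" if d > 0 else "- weights intact")   # 1 bit(s) flipped
```

In practice the reference pattern itself must be stored in memory the attacker cannot reach (or be replaced by compact codes or hashes, as in Table 10), otherwise the check can be bypassed by flipping both the weights and the reference.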
Table 12. Comparisons between the related works and our work.

| Cited Paper | Study on Bit-Flip Methods | Study on Bit-Flip Attacks against DNN | Study on DNN-Related Bit-Flip Defenses | Study on Possible Future Directions |
|---|---|---|---|---|
| Hector et al. [90] | Not included | Impact of bit flipping on training parameters and model architecture | Not included | Suggestions related to BFA specifications |
| Tsai et al. [91] | Not included | Influence of weight perturbation | Not included | Suggestions on designing training losses subject to weight perturbations |
| Khalid et al. [1] | Not included | Description of DNN fault injection | Not included | Some defense suggestions against fault injection |
| HajiAmin et al. [92] | Not included | Adversarial sample attacks, data poisoning attacks, and channel measurement attacks in brief | Some software/hardware defenses against various attacks in brief | Not included |
| Naseredini et al. [93] | RowHammer attack | RowHammer-related features | Not included | Some attack/defense suggestions from the RowHammer perspective |
| Our work | Covers commonly used methods: RowHammer, VFS, clock glitching, and laser injection | Comprehensively covers the bit-flip attacks of the past five years | Comprehensively covers the bit-flip-related defenses of the past five years | Suggestions covering both attacks and defenses |