1. Introduction
Ghalaty et al. [
1] and Yuce et al. [
2,
3,
4] analyzed and demonstrated the application of biased faults to modern, pipelined microprocessors. In this model, an attacker can engineer and inject biased faults (e.g., via clock glitches of a certain intensity) that affect in-flight instructions in the pipeline. The intensity of a fault (e.g., the length of a clock glitch, or the intensity of an electromagnetic beam with EM-FI Transient Probes, https://www.riscure.com/uploads/2017/07/datasheet_em-fi_transient_probe.pdf) is set such that it affects a few, but not all, of the instructions in flight in the pipeline, typically those with long critical paths such as load instructions, so that the fault is observable to the attacker. Biased faults on loaded or computed values let the attacker flip one or more bits in the data bytes, where the number of flipped bits is correlated with the intensity of the fault. Although the fault intensity correlates with the Hamming weight of the affected byte, the attacker has no control over the faulty value (i.e., the attacker cannot set or specify the faulty value). To engineer the fault, an attacker can learn the critical paths of instructions by profiling with micro-benchmarks. Such profiling is a common, reliable, and inexpensive practice; combined with knowledge of how the software maps onto the microprocessor pipeline, it gives an attacker sufficient information to engineer a short and effective sequence of faults on the data in the processor datapath. This ultimately allows the attacker to perform: key extraction from cryptographic implementations (
Figure 1a), via Differential Fault Analysis [
5], or the more sophisticated Differential Fault Intensity Analysis [
1]; access control circumvention (
Figure 1b), or control flow subversion (
Figure 1c), to initiate buffer overflows and more sophisticated return-oriented programming attacks.
The severity of fault attacks in general purpose microprocessors directly correlates with the increasing importance or criticality of the contents being processed [
6]. The threat affects a wide variety of products on the market. For example, premium content and payments are processed on mobile devices and servers, and metering information is processed on relatively resource-constrained devices that use sophisticated processors. A class of modern IoT devices embodies multicore processors with secure execution environments and vector units, e.g., ARM M7. In both cases, the valuables are protected by encryption and access control mechanisms, which are prone to physical attacks, i.e., side-channel and fault attacks [
7].
Value prediction is a performance-enhancing technique in which the value(s) produced by an instruction (producer) are predicted before the instruction is executed. Instructions that consume the predicted value(s) (consumers) can speculatively execute before the producer has executed, resulting in higher performance. The prediction is later confirmed when the producer is executed. If the predicted value does not match the produced value (the
trusted value), recovery actions take place.
Figure 2 illustrates the value prediction operations.
Figure 2a shows a dependence chain consisting of four instructions (i1, i2, i3, and i4). The value produced by instruction i1 is consumed by instruction i2, the value produced by instruction i2 is consumed by instruction i3, and so on. In the absence of value prediction, the four instructions execute sequentially (i.e., the execution rate is 1 instruction per cycle). With value prediction, the data dependencies between the instructions can be broken, and the execution rate increases (from 1 to 4 instructions per cycle), as illustrated in
Figure 2b.
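As a rough illustration of the chain in Figure 2a,b, the following Python sketch schedules dependent instructions with and without value prediction. The one-cycle latency, the unlimited issue width, and the assumption that every prediction is correct are simplifications introduced here for illustration; they are not parameters of the evaluated design.

```python
def schedule(chain_length, predicted):
    """Return the cycle in which each instruction of a dependence chain executes.

    Each instruction consumes the value of its predecessor and takes one cycle
    (an assumption). With value prediction, a consumer uses the predicted value
    and does not wait for its producer.
    """
    cycles = []
    for i in range(chain_length):
        if i == 0 or predicted:
            cycles.append(1)                  # nothing to wait for
        else:
            cycles.append(cycles[i - 1] + 1)  # wait for the producer to execute
    return cycles


def ipc(cycles):
    """Instructions per cycle for a completed schedule."""
    return len(cycles) / max(cycles)


baseline = schedule(4, predicted=False)  # i1..i4 execute one after another
with_vp = schedule(4, predicted=True)    # i1..i4 all execute in the first cycle
print(ipc(baseline), ipc(with_vp))
```

The sketch ignores verification and recovery; it only shows why breaking the dependencies raises the execution rate of the chain.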
Motivated by design simplicity and the very high prediction accuracy (defined as the number of correctly predicted dynamic instructions divided by the number of predicted dynamic instructions) achieved by state-of-the-art value predictors (above 99%) [
8,
9,
10,
11], it is commonplace to use pipeline flushes as the default value misprediction recovery action. The basic idea is to throw away all instructions younger than the value mispredicted instruction, and then re-fetch and re-execute them (illustrated in
Figure 2c). The high prediction accuracy and coverage (defined as the number of predicted dynamic instructions divided by the number of dynamic instructions) of state-of-the-art value predictor designs enable the adoption of value prediction in real products.
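Written as formulas, the two metrics defined in this paragraph read as follows. The symbols $N_{\text{dyn}}$, $N_{\text{pred}}$, and $N_{\text{correct}}$ are names chosen here for the number of dynamic instructions, of predicted dynamic instructions, and of correctly predicted dynamic instructions, respectively.

```latex
\[
\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{pred}}},
\qquad
\text{Coverage} = \frac{N_{\text{pred}}}{N_{\text{dyn}}}
\]
```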
Value prediction has appealing features that can be leveraged for security purposes to recover from fault attacks when computed or loaded data (
a.k.a. produced data) values are under attack. Whereas value prediction for performance trusts the produced value, in a security setting the predicted value can be used to raise suspicion that the produced value has been tampered with (i.e., faulted). In fact, under the attack scenarios in [
1,
2,
3], in which an attacker can engineer a series of biased faults on produced data values, a value predictor can effectively prevent the attacker from observing faulty output values. For example, when the value predictor predicts a value that differs from the produced value, the following actions can be engineered to mitigate the fault: (a) the predicted value, if
trusted, can be used in place of the faulty value, thus, the fault is corrected (illustrated in
Figure 2d); (b) otherwise, the producer instruction along with all younger instructions are flushed, and then re-fetched and re-executed. Thus, the correct value is reproduced, and the fault is corrected (similar to
Figure 2c).
We present
VPsec, a security framework built around the concept of value prediction to counter fault attacks in general purpose microprocessors.
VPsec can be applied to any value prediction schema/design. The design of
VPsec enhances the original value predictor design with the following elements: (a) logic to detect the occurrence of faults in the produced or predicted data values; (b) logic to react to the occurrence of faults, by categorizing faults to the datapath or to the value predictor; (c) new security-aware recovery actions (reactions), which are triggered in place of the default recovery action when the value predictor is deemed under attack. The VPsec architecture guarantees that an attacker can never leverage the propagation of faults to his/her advantage. Furthermore, we present the design of the
VPsec framework. The proposed design leverages state-of-the-art value predictors from [
8,
9,
10], and provides the appropriate extensions to handle fault attack scenarios. If an attacker injects potentially successful faults,
VPsec guarantees that the output value observed by the attacker will not be correlated with the attacker’s fault assumptions. The value is either corrected by
VPsec, or it is infected when a corrective action cannot be taken, or the software outcome is silenced. Thus,
VPsec instances can be defined to deceive the attacker without requiring the costly mitigation techniques in software. Interestingly, since mitigation techniques in software can increase the attack surface, because more instructions are executed, a hardware-only solution like
VPsec avoids this undesirable side effect and is friendly to legacy software.
This work extends our previous contributions [
12,
13], in which we presented the first hardware-only fault mitigation approach that leverages a high-performance microarchitecture feature, value prediction, to mitigate faults without degrading performance and with negligible area and power overheads. This contribution extends the discussions on Value Prediction and Fault Analysis in modern microprocessors, and presents more detailed experimental results for a wide variety of benchmark suites, including cryptographic and non-cryptographic applications. Our detailed evaluation shows that the proposed technique protects the execution of unmitigated cipher suites in
OpenSSL [
14], the industry standard benchmarks
SPEC CPU2006 [
15] and
SPEC CPU2017 [
16], and other benchmark suites. Furthermore, we show that the proposed design requires minimal changes to the underlying value prediction machinery and it retains most of the performance benefits.
The rest of this contribution is organized as follows:
Section 2 discusses the prior art;
Section 3 details both the framework of
VPsec as well as the proposed design;
Section 4 provides the experimental and security evaluation of
VPsec;
Section 5 discusses system integration and system security aspects of
VPsec; finally
Section 6 concludes the contribution.
3. Value Prediction for Security
3.1. Framework
Value Prediction for Security,
VPsec, provides a security framework built around the concept of value prediction. The proposed framework includes value prediction and extends any value prediction schema/design to provide exhaustive coverage against possible fault attack scenarios, i.e., faults to produced data values, and faults to predicted data values. In the
VPsec framework, the concept of trust in the predicted value is introduced. A predicted value is trusted if and only if two or more of the value predictors in the value prediction embodiment supply matching, confident predictions.
VPsec implements a pipeline which includes the following components (refer to
Figure 3): (a)
value prediction machinery, which performs value prediction; (b)
detection logic, which compares the predicted and the produced values, and signals the presence of a discrepancy to the reaction logic; (c)
reaction logic, which takes mitigating actions when a discrepancy is observed by the detection logic. A discrepancy between the predicted and the produced values can occur under one of the following two scenarios. First, faults are injected into the datapath (i.e., a faulty value is produced), or faults are injected into the value predictor (i.e., a faulty predicted value is available). Second, faults are injected into several consecutive instances of the same producer instruction.
While the first scenario represents the basic case for using value prediction as a mitigation against fault attacks to general purpose microprocessors, it also illustrates a fundamental difference between the traditional use of value prediction in high-performance computing, which always trusts the produced value, and VPsec, which does not trust the produced value, and might trust the predicted value, when predictions are available, i.e., when value prediction accuracy and confidence are high. Furthermore, while the default recovery action in traditional value prediction only requires the re-execution of consumer instructions, the default recovery action in VPsec requires the re-execution of the producer instruction as well, as again, the data value is not trusted.
The first scenario is handled as follows. If the accuracy of the value predictor is high (above 99%) and the confidence is high, then a prediction is generated, and the predicted value is trusted, i.e., it can be used instead of the produced value. In this case, the reaction logic does nothing. If the accuracy is relatively low (below 99%, but above 90%) or the confidence is low, then a predicted value is not generated and the reaction logic initiates recovery actions: flushing, re-fetching, and then re-executing the producer and all younger instructions, effectively re-computing the correct value. In both cases, the fault is masked; in the former case, the fault is corrected on the fly.
For the second scenario, VPsec uses newly introduced Producer Status Registers (PSRs) that track whether producer instructions are re-executed. A PSR is 8 bits wide (we use 8-bit rather than 1-bit PSRs to protect the PSRs themselves against fault attacks); it is allocated and initialized to zero when a producer is value predicted. When the producer successfully completes (i.e., commits and updates the architectural state), the PSR is released. The first time a producer instruction is re-executed (due to recovery actions), the value of its PSR is set to its complement, i.e., all bits in the PSR are set to 1. If a producer needs to be re-executed and its PSR value is non-zero, signaling that the previous instance of the producer was faulted, the produced data value is infected by VPsec, as VPsec deems the situation highly abnormal and irreversible, i.e., VPsec cannot correct the occurrence of the fault. Specifically, VPsec defines the following three types of recovery actions (Reactions) to mitigate faults.
Reaction 1. When PSR equals zero, and the predicted value is trusted, no action is taken, and the predicted value continues to be used by the consumer instructions.
Reaction 2. The conditions are the same as for Reaction 1, except that the accuracy is relatively low or the confidence is low. In this case, we flush the pipeline and re-execute the producer and consumer instructions.
Reaction 3. When PSR is not equal to zero, indicating a highly abnormal and irreversible scenario,
VPsec generates an exception and takes an action to misguide the attacker. For example, the taken action can be defined as to infect the computed value with a random number, and propagate the infected value through the pipeline as illustrated in
Figure 3. In this case, infection occurs by XOR-ing the produced data value with a random number. Such a reaction produces wrong program results, which are what the attacker observes. Regardless of how Reaction 3 is defined, corrective action is delegated to the firmware handling the exception or to the higher software layers. The program executes to completion and is not delayed by an arbitrary amount of time under the control of the attacker.
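The selection among the three reactions can be sketched in Python as below. This is a minimal software model under stated assumptions, not the hardware design: the names `Producer`, `react`, and `infect`, the 64-bit value width, and the use of the `secrets` module as the random source are all introduced here for illustration.

```python
import secrets
from dataclasses import dataclass
from enum import Enum


class Reaction(Enum):
    USE_PREDICTION = 1   # Reaction 1: keep using the trusted predicted value
    FLUSH_REEXECUTE = 2  # Reaction 2: flush, re-fetch, re-execute
    INFECT = 3           # Reaction 3: infect the value and raise an exception


PSR_CLEAR = 0x00  # producer has not been re-executed
PSR_SET = 0xFF    # complement of zero: producer was already re-executed


@dataclass
class Producer:
    psr: int = PSR_CLEAR  # 8-bit Producer Status Register


def react(producer, trusted):
    """Pick a reaction once the detection logic has flagged a discrepancy."""
    if producer.psr != PSR_CLEAR:
        return Reaction.INFECT           # repeated fault on the same producer
    if trusted:
        return Reaction.USE_PREDICTION   # fault corrected by the prediction
    producer.psr = PSR_SET               # remember the first re-execution
    return Reaction.FLUSH_REEXECUTE


def infect(value, width_bits=64):
    """XOR the produced value with a random mask (width is an assumption)."""
    return value ^ secrets.randbits(width_bits)


p = Producer()
print(react(p, trusted=False))  # first discrepancy with an untrusted prediction
print(react(p, trusted=False))  # the re-executed producer is faulted again
```

Releasing the PSR when the producer commits is omitted; in this model it would correspond to discarding the `Producer` object.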
3.2. Design
The
VPsec design consists of three components: (a) the
value prediction machinery; (b) the
detection logic; and (c) the
reaction logic. The following text elaborates on each one of these components. In this Section, we describe a specific instance of
VPsec, as illustrated in
Figure 3.
Value Prediction Machinery: The baseline value prediction scheme used in this work is an ensemble of value predictors. The ensemble consists of one or more predictors from each of the following predictor classes: last-value predictors [
17], context-based value predictors [
9,
10,
11], and indirect value predictors [
8,
23]. All predictors are active simultaneously, attempting to predict the data values produced by executing instructions.
In such designs, multiple predictions can be provided for the same producer instruction. In this case, a voting mechanism is used to select the final prediction. Due to the high accuracy of the predictors in use, we almost never observed a disagreement between the predictions when several are made. We mark a value prediction as confident when multiple agreeing predictions are supplied by the different value predictors. Moreover, accuracy counters are maintained for each predictor, providing continuous monitoring of the prediction accuracy per predictor.
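A minimal Python sketch of such a voting step is given below. The text does not specify the voting rule, so majority voting with a threshold of two agreeing predictions is an assumption made here, and `vote` is a hypothetical helper name.

```python
from collections import Counter


def vote(predictions):
    """Combine per-predictor outputs into a (value, confident) pair.

    `predictions` holds one entry per predictor: an int, or None when that
    predictor supplies no prediction. The prediction is marked confident
    only when at least two predictors agree on the selected value.
    """
    supplied = [p for p in predictions if p is not None]
    if not supplied:
        return None, False
    value, votes = Counter(supplied).most_common(1)[0]
    return value, votes >= 2


print(vote([0x2A, None, None]))  # a single prediction: not confident
print(vote([0x2A, 0x2A, None]))  # two agreeing predictions: confident
```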
Detection Logic: The detection logic collects the prediction from the value prediction machinery, if any prediction exists. Then, it compares the predicted value with the produced value when the producer is executed. The outcome of this comparison, along with the value predictors' accuracy and confidence, is communicated to the reaction logic to flag a discrepancy, i.e., the occurrence of an attack.
Reaction Logic: Upon receiving a discrepancy signal from the detection logic, the reaction logic evaluates the status of the producer instruction (PSR value) and the status of the value predictor (its accuracy and confidence). One of the three recovery actions (described in
Section 3.1) is invoked. It is important to note that when the reaction logic is triggered, the produced value cannot be trusted. Reaction 3 is defined to infect the corrupted value.
3.3. Overheads
VPsec assumes a general purpose processor that employs several state-of-the-art value predictors in a single embodiment, as described in
Section 3.2. Given such a baseline, VPsec adds (a) simple combinational logic in the detection and reaction logic blocks, and (b) a set of PSR registers in the reaction logic. In the worst case, VPsec needs to monitor the status of all in-flight instructions in the pipeline, so the number of PSR registers required can match the number of entries in the reorder buffer. Hence, for a single PSR register of $n$ bits (e.g., 8-bit), and a typical reorder buffer with $m$ entries (e.g., 224-entry), the storage required for the PSRs is $n \times m$ bits ($8 \times 224 = 1792$ bits, i.e., 224 bytes). Therefore, we believe that VPsec introduces negligible area and hardware overheads and only minimally increases the power consumption. At the same time, VPsec reduces the attack surface (by reducing the possible target instructions) and retains the benefits of value prediction, adding performance benefits even in the presence of an aggressive attacker.
3.4. Modes of Operation
VPsec operates in two modes: a
training mode, and an
execution mode. During the training mode, the address and value predictors are trained, and the prediction accuracy is monitored and recorded for each predictor. Similarly, during the execution mode, the accuracy is monitored and compared against the accuracy recorded in the training mode for each predictor. This comparison enables VPsec to establish trust in the predicted values during execution mode. When the prediction accuracy is high and the confidence in the predicted value is high, in both modes, the predicted value is trusted, and Reaction 1 takes place. The occurrence of Reaction 1 has two benefits: (a) it masks the occurrence of a fault by correcting the fault with the predicted value; (b) it does not incur a performance penalty because no instructions are re-executed (on the contrary, the execution time can be reduced thanks to value prediction). It is worth noting that in a traditional fault attack scenario, e.g., the cases indicated in [
3], only Reaction 1 is needed to correct the occurrence of data faults.
When the prediction accuracy is relatively low (according to the value predictor accuracy monitors) or the confidence in the predicted value is low (only one value prediction is made despite having multiple value predictors), the value predictor does not generate a prediction as the predicted value cannot be trusted, and Reaction 2 takes place.
Reactions 1 or 2 can be taken when the PSR value equals zero, indicating that the attack is less severe and that there is the possibility to recover from the fault by correcting the faulty value. When PSR is different from zero, Reaction 3 is taken, the computed value is infected, and the software under attack will output incorrect results to deceive the attacker.
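One possible way to model the per-predictor accuracy monitoring across the two modes is sketched below in Python. The 99% floor is the accuracy threshold quoted in Section 3.1; how the execution-mode accuracy is compared with the training-mode accuracy is not detailed in the text, so the `tolerance` parameter and the `is_accurate` rule are purely hypothetical.

```python
class AccuracyMonitor:
    """Per-predictor counter: correct predictions over predictions made."""

    def __init__(self):
        self.predicted = 0
        self.correct = 0

    def record(self, was_correct):
        self.predicted += 1
        self.correct += int(was_correct)

    def accuracy(self):
        return self.correct / self.predicted if self.predicted else 0.0


def is_accurate(training, execution, floor=0.99, tolerance=0.005):
    """Hypothetical rule: accuracy must be high in both modes, and the
    execution-mode accuracy must not fall far below the training one."""
    return (training.accuracy() >= floor
            and execution.accuracy() >= floor
            and execution.accuracy() >= training.accuracy() - tolerance)
```

In this model, a predictor that passes `is_accurate` and agrees with at least one other predictor would yield a trusted prediction (Reaction 1); otherwise the discrepancy would be handled by Reaction 2.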
4. Evaluation
4.1. Environment
The microarchitecture of VPsec presented in
Section 3 is faithfully modeled in our internally developed, cycle-accurate simulator. The parameters of our baseline core are configured as close as possible to those of Intel’s Skylake core [
33]. Currently there is no publicly disclosed information about a product that deploys value prediction. However, given the enormous advances made in the value prediction space, we expect value prediction to become a common feature of general purpose microprocessors.
Table 1 shows our baseline core configuration. The value prediction scheme, described in
Section 3.2 and implemented in our performance model, supports predicting load instructions only; this is an artifact of our performance model and not a limitation of our proposed framework (VPsec). We therefore restrict our evaluation and analysis to load instructions. Load instructions have the longest critical path, and therefore, they are the easiest attack targets. Non-load instructions are not handled directly, but they can potentially be handled indirectly as they can influence future load instructions.
Table 2 summarizes the focus of our evaluation.
Value prediction, just like any other prediction scheme, requires training time in which no predictions are made. This training manifests as a certain fraction of instructions not being value predicted. Training usually takes place during the initial phases of the workload. Such phases are usually of little to no interest to the attacker.
4.2. Methodology
Evaluation is carried out in two parts. First, we evaluate the proposed value prediction design (described in
Section 3.2) using benchmarks from the following benchmark suites:
SPEC CPU2017 [
16],
SPEC CPU2006 [
15],
OpenSSL [
14],
SPMV [
34], and
Terasort. Our evaluation demonstrates the accuracy, coverage, and confidence of the proposed value prediction scheme. Moreover, we demonstrate the effect of injecting faults to cover the different attack scenarios described earlier.
Table 3 shows a list of our benchmarks. The workloads used in our evaluation are compiled to the ARM ISA using
GNU GCC with
-O3 level optimization. We use 100-million instruction SimPoints [
35]; for short-running benchmarks, we simulate the first 100 million instructions, or until the benchmark completes.
4.3. Value Prediction
In this section, we evaluate the value prediction scheme described in
Section 3.2 using the workloads listed earlier.
Figure 4a shows the speedup (i.e., improvement in Instructions Per Cycle (IPC)) and coverage of the proposed value prediction scheme. For example, in the case of
OpenSSL, on average 88.7% of loads are value predicted. Though not shown in the figure, the prediction accuracy of each one of the used value predictors is well above 99% [
8,
9,
10,
11].
4.4. VPsec
Figure 4b shows the percentage of value predicted load instructions for which only one value prediction is obtained from the value prediction machinery, or multiple predictions are obtained. For example, in the case of
OpenSSL, on average 56.1% of the value predicted loads (88.7% in
Figure 4a) are covered by a single prediction, for which the prediction is not considered confident. Upon detecting the occurrence of a fault (detection logic), the reaction logic shall execute Reaction 2, that is, the producer load and consumer instructions shall be re-fetched and re-executed. For the remaining predicted loads, two (or more) predictions with high accuracy are available. Thus, upon detecting the occurrence of a fault (detection logic), the reaction logic shall execute Reaction 1, that is, the effect of the fault is corrected.
Admittedly, each time Reaction 2 is taken, a performance penalty can be incurred by re-executing the producer load and the consumer instructions. Meanwhile, each time Reaction 1 is taken, not only is the effect of the fault corrected, but there is also a performance advantage due to the early execution of the consumer instructions, which operate on a predicted value with high confidence. The penalty due to Reaction 2 on load instructions depends on the locality of the workload when the producer load is re-executed. In the worst case, which is very unlikely, the re-execution of the producer load instruction may incur a cache miss and result in re-loading the data from main memory. In the best case, which is very likely, the re-execution of the producer load instruction will hit in the L1 cache.
To evaluate the performance impact due to the execution of Reaction 2, we assume the following extreme attack scenarios, in which an attacker faults: each value predicted load (Attack #1), every 10th predicted load (Attack #2), and every 100th predicted load (Attack #3). The attacker can inject biased faults in loaded and computed values. In the evaluation we assume injection of biased faults in the loaded values, as load instructions have the longest critical path [
4].
Figure 5 and
Figure 6 show the performance impact with respect to a baseline with no value prediction (and no attacks).
Table 4 reports both the average speedup and the range of speedups (indicating the minimum and maximum speedups of benchmarks within each benchmark suite). In the case of
OpenSSL, when no attack is performed, value prediction speeds up the execution of the benchmarks by up to 40% in
IPC, with an average of 4%.
In the most extreme scenario, in which an attacker launches an attack on each value predicted load, we observe no performance degradation as VPsec can correct 43.9% of the attacks (Reaction 1), while incurring the recovery action penalty on only 56.1% of the attacks (Reaction 2). Interestingly, the benefits of value prediction make up for the introduced re-execution overheads. While unrealistic, this scenario estimates the worst-case overheads that
OpenSSL can experience. Under more realistic, yet very aggressive attack scenarios, as shown in
Figure 5 and
Figure 6, the workloads still exhibit performance improvements which nearly match the performance improvement achieved by the no-attack scenario. Similar results can be observed for the other workloads, Terasort and SPMV in
Figure 5b,c and SPEC CPU in
Figure 6.
VPsec effectively tolerated the presence of realistic to extreme attack scenarios without incurring performance penalties for the benchmarks, even though the number of single predictions (i.e., unconfident predictions that trigger Reaction 2) is slightly higher than the number of multiple predictions (i.e., confident predictions that trigger Reaction 1).
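The split between corrected and re-executed faults under the three attack scenarios follows from simple arithmetic, sketched here in Python. The sketch assumes that faults land on single-prediction and multiple-prediction loads in proportion to their share of predicted loads; `reaction_mix` is a hypothetical helper, and 0.561 is the OpenSSL average quoted above.

```python
def reaction_mix(single_prediction_fraction, attack_interval):
    """Fraction of value predicted loads that trigger each reaction.

    single_prediction_fraction: share of predicted loads covered by only one
        (not confident) prediction, as reported in Figure 4b.
    attack_interval: the attacker faults every attack_interval-th predicted load.
    """
    attacked = 1.0 / attack_interval
    reaction2 = attacked * single_prediction_fraction          # flush and re-execute
    reaction1 = attacked * (1.0 - single_prediction_fraction)  # corrected on the fly
    return reaction1, reaction2


for name, interval in [("Attack #1", 1), ("Attack #2", 10), ("Attack #3", 100)]:
    r1, r2 = reaction_mix(0.561, interval)
    print(f"{name}: Reaction 1 on {r1:.2%}, Reaction 2 on {r2:.2%} of predicted loads")
```

Only the Reaction 2 share pays a re-execution penalty, which is consistent with the small gap between the attack and no-attack speedups for Attacks #2 and #3.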
4.5. VPsec Rationale
We frame the discussion in this section around cryptographic algorithms, though the observations presented are equally applicable to non-cryptographic algorithms as well.
Value prediction relies on uncovering patterns in the values produced by the program instructions. Recent proposals for value prediction demonstrate remarkable ability for identifying and exploiting complex value patterns [
9]. Alternative proposals [
8] advocate for predicting the values produced by load instructions by leveraging patterns in the memory addresses being referenced.
Once sufficient confidence is established in these address or value patterns, they are used to predict future program values. When combined, value predictability and address predictability can complement and strengthen one another. For example, cryptographic algorithms, e.g.,
NIST standard compliant implementations of the Advanced Encryption Standard (AES) (
https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.197.pdf), exhibit both forms of predictability (address and value). The main loop of an AES implementation iterates for a number of rounds that depends on the key size of the AES instance, e.g., 10 rounds for a 128-bit key. In each round, the algorithm updates the state during the byte substitution, shift rows, mix columns, and add round key steps. These steps are simply loops over the 16 bytes of the AES state.
Observe that the memory access patterns for the inner loops do repeat, and therefore values are easily predictable via address prediction. Our evaluation in
Section 4 demonstrates that schemes like [
8] are capable of predicting these patterns with very high accuracy and significant coverage.
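To make the repetition concrete, the Python sketch below generates the addresses touched by a byte-wise loop over the AES state across 10 rounds and feeds them to a toy next-address table. The base address is hypothetical, and the table is a deliberately simple stand-in used for illustration; it is not the predictor of [8].

```python
STATE_BASE = 0x1000  # hypothetical address of the 16-byte AES state
ROUNDS = 10          # AES with a 128-bit key
STATE_BYTES = 16


def state_load_addresses():
    """Addresses read by one byte-wise pass over the state in every round."""
    for _ in range(ROUNDS):
        for i in range(STATE_BYTES):
            yield STATE_BASE + i


def predict_next_addresses(addresses):
    """Learn last-address -> next-address pairs; return (accuracy, coverage)."""
    table = {}
    last = None
    predicted = correct = total = 0
    for addr in addresses:
        total += 1
        if last in table:            # a prediction is available
            predicted += 1
            correct += int(table[last] == addr)
        if last is not None:
            table[last] = addr       # train on the observed transition
        last = addr
    return correct / predicted, predicted / total


accuracy, coverage = predict_next_addresses(state_load_addresses())
print(f"accuracy={accuracy:.3f} coverage={coverage:.3f}")
```

The first round only trains the table; from the second round on, every access in the loop has a matching entry, which is the kind of repetition that address-based load value prediction exploits.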
Similarly, many of the non-load operations performed within the cryptographic algorithms are predictable. For instance, the number of times a loop iterates (a.k.a. loop trip-count) repeats across multiple executions of the loop and therefore is very predictable.
VPsec builds on all the recent advances in value prediction to deliver a security solution that can protect against fault attacks for a wide range of applications: cryptographic and non-cryptographic.
4.6. VPsec Limitations and Improvements
Admittedly, the implementation of
VPsec evaluated has a few limitations, and can be improved. First, it does not address fault attacks that target earlier stages of the pipeline in order to skip instructions. This is inherent in that
VPsec leverages value prediction. Second, the training window discussed in
Section 3.4 can be a window of vulnerability, as an attacker can inject faults during training time. The predictors in
VPsec can be trained offline (i.e., pre-trained), to eliminate the need to train online. In practice, for software executed in a Trusted Execution Environment (TEE) (
https://www.globalplatform.org/mediaguidetee.asp), e.g., cryptographic algorithm implementations, address and value patterns can be very stable (discussed in
Section 4.5). This is in part because of security standards requirements on the implementation, and in part because of best practices in secure software development life-cycle.
Third, performance of
VPsec can be further improved by reducing the occurrences of Reaction 2, which takes place when a single prediction is supplied by the value prediction machinery, despite having three predictors (refer to
Figure 4b). Such a reduction in the number of Reaction 2 invocations can be achieved by increasing the number of value predictors in
VPsec.
It is important to note that the discussion in this section is relevant to the instance of VPsec that we evaluated in this paper, and that it does not jeopardize the validity of the concepts, findings and conclusions presented.
5. System and System Security Discussion
5.1. VPsec in The Context of a System on Chip
VPsec is an embodiment composed of state-of-the-art value predictors, being used for multiple purposes: performance improvement (default use case: performance feature), and attack mitigation (new use case: security feature). VPsec can be configured to enable or disable the performance and security features.
When integrated in an SoC, the security feature of VPsec is meant to act when secure software executes within an implementation of the Global Platform TEE, e.g., to protect long-term secret keys from being extracted using fault attacks. ARM TrustZone is one such implementation of the TEE (ARM TrustZone:
https://developer.arm.com/technologies/trustzone). Outside the context of TEE, VPsec will operate as a traditional value predictor, enabling the performance feature.
The value predictors in VPsec are context tagged. When VPsec starts its execution, the value predictors do not carry the context of previous untrusted executions. Conversely, when the TEE completes its execution, the resources available to VPsec are cleared and released. Therefore, and as elaborated more in
Section 5.3, VPsec is resilient to attack scenarios similar to Spectre variant 2 [
36].
5.2. System Security
In this section, we focus on the case when software executes security services in the system TEE. In such a case, an attacker capable of the state-of-the-art attacks [
3] cannot observe the results of his/her injected faults, as VPsec corrects or masks out the faulty values before they become visible to the attacker.
The possible operating scenarios of VPsec are summarized in
Table 5. Cases (1), (2), and (4) are handled properly by VPsec, both when an attack occurs and when no attack takes place. In cases (1) and (2) the software produces correct output via Reactions 1 and 2. In case (4) the software produces incorrect results, as VPsec infects the data, and a signal indicating that an infection has occurred (as a consequence of an attack) is raised to the higher level of software to handle the case (action not shown in
Figure 3 and outside the scope of this work). As a result, for all the attack scenarios of interest to VPsec, an attacker is either deceived or deterred. With Reaction 1, the occurrence of a fault is first detected and then corrected. With Reactions 1 and 2, we can potentially observe performance benefits by virtue of using value prediction. With Reaction 3, the occurrence of an irreversible fault is countered, e.g., simultaneous faults to the instructions and the value predictor are deterred. In this case, additional recovery actions can be put in place in the upper layers of software implementing a security service, which is beyond the scope of this work.
Case (3) is a remote but conceivable case, which we report for completeness. In case (3) the value predictor itself is highly confident in the predicted value, but incorrect (a.k.a., mispredicted). The occurrence of Case (3) would produce the wrong program output even without the occurrence of an attack. This case is highly unlikely in the presence of multiple predictors, and the probability of this happening approaches zero as the number of value predictors increases. A loose upper bound on the probability of Case (3) can be computed by assuming that a misprediction is equally likely on each predicted value. The bound is a function of nvp, the number of predictors in the embodiment, and of the maximum of the accuracies of the predictors in the embodiment. This estimate is pessimistic, as it assumes that the probability of mispredicting is equally distributed across all the predicted instructions. A practical confirmation of the unlikelihood of Case (3) is provided by our experimental results: even with only 3 value predictors, VPsec did not incur Reaction 3 (recall that Reaction 3 is incurred for low confidence yet incorrect predictions, a scenario that is even more likely than a high confidence yet incorrect prediction, i.e., Case (3)).
5.3. Relevance to Recently Discovered Attacks
The Spectre attack appeared in two variants [
36]. In Spectre variant 1 (bounds check bypass) and variant 2 (branch target injection), the conditional and indirect branch predictors are manipulated to steer program speculation down a specific path that enables extracting information from other running processes. Such attacks are hard to fix, but also quite hard to exploit [
36].
Admittedly, value predictors can expose a new variant of Spectre, but this variant can be mitigated using a technique similar to the one used to patch Spectre variant 2, e.g., by tagging prediction tables with an Address Space Identifier (ASID), and using that information as part of the prediction logic. Since a value predictor can be fixed against this new variant of Spectre, so can VPsec.
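The idea of ASID tagging can be illustrated with a toy last-value table in Python. The class name, the dictionary-based storage, and the `flush` method are assumptions made for this sketch; a hardware table would instead fold the ASID into its tag match.

```python
class TaggedValueTable:
    """Toy last-value prediction table whose entries are tagged with an ASID."""

    def __init__(self):
        self._entries = {}  # (asid, pc) -> last observed value

    def train(self, asid, pc, value):
        self._entries[(asid, pc)] = value

    def predict(self, asid, pc):
        """Return a prediction only if it was trained under the same ASID."""
        return self._entries.get((asid, pc))

    def flush(self, asid):
        """Drop every entry of one address space, e.g., when the TEE exits."""
        self._entries = {k: v for k, v in self._entries.items() if k[0] != asid}


table = TaggedValueTable()
table.train(asid=1, pc=0x400, value=42)
print(table.predict(asid=1, pc=0x400))  # trained under the same ASID
print(table.predict(asid=2, pc=0x400))  # other address space: no prediction
```

Because a lookup from a different address space finds no entry, one process cannot steer the predictions consumed by another through this table.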
Because of the high accuracy and practicality of recent value prediction implementations, we expect value prediction to be commonplace in future generations of general purpose microprocessors. Thanks to the authors of Spectre, we have the possibility to analyze and fix value prediction against similar attacks. We leave the detailed analysis of this issue as future work. It is worth noting, however, that VPsec is not designed to protect against any form of microarchitectural side-channel attack, such as Meltdown and Spectre; it does protect against fault attacks on modern microarchitectures.
6. Conclusions
This work proposes VPsec, a novel hardware-only schema which leverages value prediction to detect, correct or counter fault attacks in general purpose microprocessors.
To the best of our knowledge, this is the first contribution that proposes a framework which enhances value prediction, a performance improvement technique in high-performance microprocessors, for its use in computer security, to mitigate fault attacks. The design of VPsec demonstrates its efficacy in countering fault attacks to modern microprocessors with negligible changes to the original value prediction design and no associated software overhead.
Furthermore, our evaluation shows that VPsec not only provides protection to the execution of unmitigated cipher suites in OpenSSL and industry standard benchmarks such as SPEC CPU2017, but also provides performance improvements by virtue of using value prediction.