Article

TrackRISC: An Implicit Attack Flow Model and Hardware Microarchitectural Mitigation for Speculative Cache-Based Covert Channels

Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(20), 3973; https://doi.org/10.3390/electronics14203973
Submission received: 29 August 2025 / Revised: 1 October 2025 / Accepted: 7 October 2025 / Published: 10 October 2025
(This article belongs to the Special Issue Secure Hardware Architecture and Attack Resilience)

Abstract

Speculative execution attacks significantly compromise the security of modern processors by enabling information leakage. These well-known attacks exploit speculative cache-based covert channels to effectively exfiltrate secret data by altering cache states. Existing hardware defenses specifically designed to prevent cache-based covert channels are effective at blocking explicit channels. However, their protection against implicit attack variants remains limited, since these hardware defenses do not fully eliminate secret-dependent microarchitectural changes in caches. In this paper, we propose TrackRISC, a framework that comprises (i) a refined implicit attack flow model specifically for the exploration and analysis of implicit cache-based speculative execution attacks that severely compromise the security of existing hardware defenses, and (ii) a security-enhanced tracking and mitigation microarchitecture, termed TrackRISC-Defense, designed to mitigate both implicit and explicit attack variants that use speculative cache-based covert channels. To obtain realistic hardware evaluation results, we implement and evaluate both TrackRISC-Defense and a representative existing defense on top of Berkeley’s out-of-order RISC-V processor core (SonicBOOM) using the VCU118 FPGA platform running Linux. Compared to the representative existing defense, which incurs a performance overhead of 13.8%, TrackRISC-Defense ensures stronger security guarantees with a performance overhead of 19.4%. In addition, TrackRISC-Defense can mitigate both explicit and implicit speculative cache-based covert channels with a register-based hardware resource overhead of 0.4%.

1. Introduction

Speculative execution is a performance optimization technique that speeds up the execution of software programs on modern processors. This technique allows processors to execute some instructions before it is known whether they should execute, using methods such as branch prediction. However, speculative execution introduces serious security vulnerabilities, enabling speculative execution attacks such as Spectre [1,2,3], which can exploit covert channels to leak information from modern processors. A covert channel is a communication channel that is not intended for information transfer at all [4]. Attackers can build covert channels to infer whether each secret bit is “1” or “0” [5], thereby circumventing computer security policies.
Cache-based speculative execution attacks pose a serious threat through highly effective covert channels constructed in caches or translation lookaside buffers (TLBs). The following describes the workflow of cache-based speculative execution attacks that result in the leakage of speculatively accessed data. First, attackers bypass speculative authorization and use access instructions [6] to access and move secret data into registers. As in state-of-the-art hardware defenses [6,7,8], access instructions in this paper refer to speculative load instructions. Second, attackers continue to act as the sender in a cache-based covert channel by employing transmit instructions (i.e., transmitters) [6] to induce secret-dependent microarchitectural changes in caches or TLBs. These transmit instructions are data-dependent on secret information, encoding the secret data into microarchitectural changes. Third, attackers act as the receiver in the covert channel, inferring the confidential data by observing the microarchitectural changes transmitted from the sender.
Various hardware defenses [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26] have been proposed to mitigate speculative execution attacks through hardware optimizations in central processing units (CPUs), caches, and other microarchitectural components. A high-security defense strategy in existing defenses, specifically targeting speculative cache-based covert channels, is to prevent secret-dependent microarchitectural changes from arising in caches or TLBs, as exemplified by the mainstream defense mechanisms of a memory transmitter delay [23], invisible structures [9,14,18], and delay-on-miss [11,12]. Despite their effectiveness in blocking explicit speculative cache-based covert channels, these existing defense mechanisms are still vulnerable to implicit speculative cache-based covert channels.
This paper is motivated by the observation that implicit cache-based speculative execution attacks are dangerous but have received relatively limited attention from existing hardware defenses. This highlights that a more in-depth and intrinsic analysis of such attacks is urgently needed. In this paper, we propose TrackRISC, an attack–defense framework for modeling implicit cache-based speculative execution attacks and mitigating both explicit and implicit attack variants.
First, a refined implicit attack flow model is proposed to specifically capture the critical steps of implicit cache-based speculative execution attacks. Our model reveals why implicit cache-based speculative execution attacks can compromise the security of the existing hardware defenses tailored for speculative cache-based covert channels, as detailed in Section 4.4. This is because the existing hardware defenses fail to fully eliminate secret-dependent microarchitectural changes introduced by non-memory transmit instructions in implicit attacks, and these hardware defenses primarily identify memory instructions as the unsafe transmitters requiring restriction. Our model builds upon the prior work of He et al. [27], which provides a high-level yet coarse-grained model for speculative execution attacks without carefully distinguishing between implicit and explicit channels. In contrast, we focus specifically on modeling implicit speculative cache-based covert channels by introducing additional modeling components, such as the indirect influence exerted by transmit instructions.
Second, a security-enhanced tracking and mitigation microarchitecture, termed TrackRISC-Defense, is implemented to mitigate both explicit and implicit speculative cache-based covert channels. Compared to the existing hardware defenses tailored for speculative cache-based covert channels, TrackRISC-Defense enhances security by delaying the execution of a broader set of potential transmit instructions, including both speculative memory and non-memory instructions that may depend on loaded secrets, thereby mitigating both explicit and implicit cache-based speculative execution attacks. TrackRISC-Defense builds upon the defense mechanism of STT [6]. Unlike the original STT microarchitecture [6], TrackRISC-Defense uses an efficient global taint mask to determine when to delay or resume the execution of all potential transmit instructions; the mask is stored in a hardware register whose width equals the total number of physical registers. By leveraging the global taint mask, which records the tracked information of all relevant instructions, TrackRISC-Defense achieves negligible register-based hardware resource overhead. Our global taint mask is introduced in detail in Section 6.1.
TrackRISC-Defense is implemented on Berkeley’s out-of-order RISC-V core (SonicBOOM [28]) and is evaluated on a field-programmable gate array (FPGA) running Linux. TrackRISC-Defense supports the superscalar processor core configuration. In addition, we implement SpecTerminator-v1 [23] as our baseline, which employs a representative existing defense mechanism called memory transmitter delay and delays the execution of speculative memory instructions that operate on secret-dependent data. An implicit attack variant leveraging a speculative cache-based covert channel is implemented on RISC-V to experimentally demonstrate that the representative existing defense remains vulnerable to implicit attacks, and to validate the security of TrackRISC-Defense. Compared to the representative existing defense, which has a performance overhead of 13.8%, TrackRISC-Defense demonstrates stronger security, mitigating both explicit and implicit cache-based speculative execution attacks with a performance overhead of 19.4%. In addition, TrackRISC-Defense occupies less than 1% of the register resources (i.e., flip-flops) on the VCU118 FPGA board.
A notable fact is that cache-based speculative execution attacks are classified into explicit and implicit types, depending on whether transmit instructions directly or indirectly induce secret-dependent microarchitectural changes [6]. In explicit cache-based speculative execution attacks, transmit instructions are specific speculative memory instructions (i.e., load/store instructions) that can exploit their secret-dependent operands to directly change caches and TLBs. In contrast, in implicit cache-based speculative execution attacks, transmit instructions can be non-memory instructions (e.g., conditional branch instructions) that indirectly induce secret-dependent microarchitectural changes by influencing the execution of other memory instructions. Importantly, these memory instructions may execute either non-speculatively or speculatively, and may have no direct data dependence on secret data. Secret-dependent microarchitectural changes are eventually generated by these memory instructions rather than by transmit instructions in implicit attacks. A detailed comparison of implicit and explicit cache-based speculative execution attacks is described in Section 3.2 and Section 3.3.
Our key contributions are summarized as follows:
  • Implicit attack flow model: We propose a framework named TrackRISC, which incorporates a refined implicit attack flow model specifically for exploring implicit cache-based speculative execution attacks. The attack flow model reveals why these implicit attacks pose a severe threat to the existing hardware defenses specifically designed to block speculative cache-based covert channels.
  • Implicit vulnerability analysis in existing hardware defenses: Based on the implicit attack flow model, we further analyze the implicit security vulnerabilities within the existing hardware defenses. Moreover, we experimentally verify that a representative existing defense remains vulnerable to implicit cache-based speculative execution attacks.
  • Tracking and mitigation microarchitecture: In addition to the implicit attack flow model, the TrackRISC framework also incorporates TrackRISC-Defense, a security-enhanced tracking and mitigation microarchitecture that can mitigate both implicit and explicit cache-based speculative execution attacks. Compared to a representative existing defense with a performance overhead of 13.8%, TrackRISC-Defense demonstrates stronger security with a performance overhead of 19.4%. The microarchitecture incurs a negligible register-based hardware resource overhead of 0.4% on FPGA.
  • Realistic hardware (FPGA) implementation and evaluation: TrackRISC-Defense is compatible with the superscalar CPU microarchitecture. To obtain real hardware evaluation results, we implement both a representative existing defense and TrackRISC-Defense on a practical RISC-V out-of-order processor core. The evaluation flow is built on the FPGA hardware platform using the VCU118 FPGA board running Linux.
The rest of the paper is organized as follows. Section 2 provides background on speculative execution, out-of-order execution, and speculative cache-based covert channels, and introduces examples of both implicit and explicit cache-based speculative execution attacks. Section 3 describes the threat model of TrackRISC-Defense, i.e., explicit and implicit speculative execution attacks using cache-based covert channels. Section 4 introduces a refined attack flow model specifically for exploring implicit cache-based speculative execution attacks, along with a classification of existing hardware defenses and their security analysis. Section 5 introduces the defense mechanism and implementation methods of TrackRISC-Defense, designed to mitigate both implicit and explicit cache-based speculative execution attacks. Section 6 introduces the microarchitecture of TrackRISC-Defense. Section 7 provides the evaluation results and analysis of TrackRISC-Defense from the perspectives of security, performance, and hardware resource utilization. Finally, Section 8 presents the conclusions of our work, and Section 9 discusses it, followed by Appendix A.

2. Background

2.1. Speculative and Out-of-Order Execution

Speculative execution is a performance improvement technique that allows the central processing unit (CPU) to execute some instructions in advance, before knowing whether their execution is necessary. Branch prediction is a typical method that enables speculative execution by leveraging branch predictor structures such as the pattern history table (PHT), branch target buffer (BTB), and return stack buffer (RSB). Specifically, modern processors rely on the predicted results from branch predictors to speculatively execute instructions in advance, thereby enhancing processor performance by reducing the delay incurred in determining the actual execution path. However, a side effect of branch prediction is the occurrence of mispredictions, which may diverge from the semantics expected by programmers and can even cause processors to speculatively execute unauthorized code paths. Branch mispredictions influence not only the microarchitectural states of CPU pipelines, but also caches, TLBs, and other structures.
Out-of-order execution improves execution speed of modern processors by enhancing temporal and spatial resource utilization, allowing modern processors to prioritize the execution of instructions whose input operands and execution units are available. Out-of-order execution permits instructions to be executed in an order that differs from their appearance in the program’s prescribed in-order sequence. While modern CPUs adopt out-of-order execution, the reorder buffer (ROB) is an important CPU component designed to track and maintain instruction information in program order, so as to maintain program correctness while facilitating out-of-order execution.

2.2. Speculative Cache-Based Covert Channels

A covert channel is an unauthorized and unintended communication channel exploited by attackers to retrieve secret information in a processor. A covert channel consists of a sender and a receiver that are both controlled by attackers, unlike a side channel, where the sender is a victim entity that accidentally causes information leakage [29]. The sender encodes secret data into secret-dependent microarchitectural changes and transmits these changes to the receiver, who recovers the secret by observing the corresponding microarchitectural changes [30]. Attackers build cache-based covert channels by generating these secret-dependent microarchitectural changes in caches or translation lookaside buffers (TLBs).
Caches, positioned between the CPU and main memory, are commonly exploited to create attacker-observable changes based on memory access. With the advantage of fast access time, caches are efficient storage units that store frequently used data. However, a dangerous cache-based covert channel can be built through observable access time differences between caches and main memory, since data stored in caches have a lower access time than the data in main memory. A memory access causing a cache miss can trigger the desired data to be refilled into the cache, thereby changing cache states. Cache timing attacks [31,32,33] allow attackers to obtain secret information about victim processes by multiple memory accesses and changing cache states. In cache-based timing channels, attackers are able to extract secret information by observing access timing conflicts, e.g., data blocks related to secret information may exhibit different access times compared to other data blocks.
Figure 1 illustrates a general attack scheme that builds a cache-based covert channel. Cache-based speculative execution attacks, whether implicit or explicit, generally comprise four critical attack steps: authorization bypass, secret access, secret transmission, and secret recovery. A detailed introduction of the attack steps is presented in Section 3.1. The steps that differ between implicit and explicit variants are secret transmission and secret recovery. For example, in a cache-based covert channel, for each bit of the secret data, the sender exploits a memory access to modify cache states when the secret bit is “1”, and does nothing when the secret bit is “0”. The receiver then decodes the secret data by observing the microarchitectural changes of specific cached data caused by the memory access. If the access latency of the specific data is low, it indicates that the data is cached, corresponding to a secret bit value of “1”. Otherwise, if the access latency is high, the specific data is not cached, indicating a secret bit value of “0”. This process is repeated until the entire secret data is recovered through the covert channel.
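The per-bit encoding and decoding described above can be illustrated with a toy software model. The following C sketch is only an illustration under stated assumptions: the cache is modeled as one presence flag per line, and the names `toy_cache`, `send_bit`, `receive_bit`, and `transfer_byte`, as well as the latency constants, are hypothetical and do not come from the paper. No real timing is measured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one presence flag per cache line, no real timing. */
#define NUM_LINES    8u
#define HIT_LATENCY  4u    /* assumed cycle count for a cache hit  */
#define MISS_LATENCY 200u  /* assumed cycle count for a cache miss */

typedef struct {
    bool present[NUM_LINES];
} toy_cache;

/* Evict every line so the next access to any line is a miss. */
static void cache_flush(toy_cache *c) {
    for (size_t i = 0; i < NUM_LINES; i++) {
        c->present[i] = false;
    }
}

/* Access one line: return the modeled latency and refill the line on a miss. */
static unsigned cache_access(toy_cache *c, size_t line) {
    assert(line < NUM_LINES);
    unsigned latency = c->present[line] ? HIT_LATENCY : MISS_LATENCY;
    c->present[line] = true;
    return latency;
}

/* Sender: touch the probe line only when the secret bit is 1. */
static void send_bit(toy_cache *c, size_t probe_line, bool bit) {
    if (bit) {
        (void)cache_access(c, probe_line);
    }
}

/* Receiver: a low latency means the line was cached, i.e., the bit was 1. */
static bool receive_bit(toy_cache *c, size_t probe_line) {
    return cache_access(c, probe_line) < MISS_LATENCY;
}

/* Repeat the send/receive round for each bit of one byte, lowest bit first. */
uint8_t transfer_byte(uint8_t secret) {
    toy_cache c;
    uint8_t recovered = 0;
    for (unsigned i = 0; i < 8; i++) {
        cache_flush(&c);
        send_bit(&c, 0, ((secret >> i) & 1u) != 0);
        if (receive_bit(&c, 0)) {
            recovered |= (uint8_t)(1u << i);
        }
    }
    return recovered;
}
```

In this model the receiver's own probe also refills the line, which is why the cache is flushed before every round; a real receiver faces the same issue and must reset the probed state between measurements.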

2.2.1. Implicit Cache-Based Speculative Attack Example

Listing 1 shows the key pseudocode of an implicit attack variant that uses a speculative cache covert channel, together with the four critical attack steps involved. Before carrying out these steps, attackers need to complete preparatory tasks, e.g., mistraining branch predictors. To bypass the authorization in Listing 1, the attacker mistrains a branch predictor by repeatedly supplying controlled values of x in if(x < array1_size) that consistently cause the targeted if condition to be true. Over time, this leads the predictor to assume that the if condition will be true even when it is false. During the attack step of authorization bypass, the prior mistraining causes the branch predictor to incorrectly predict the if condition as true, even though the value of x is actually out of bounds. This misprediction leads the victim CPU into an illegal speculative execution path. Then, in the attack step of secret access, the attacker uses a load instruction to access the secret data and load it into a register. In the subsequent secret-transmission step, the attacker uses a transmit instruction (i.e., another conditional branch instruction) to compare the secret value with an attacker-defined value k, as in if (secret == k). If the secret value equals k, array2[0] will be accessed, bringing the corresponding data into the cache; otherwise, the cache state remains unchanged. This indirectly causes a secret-dependent microarchitectural change involving array2[0] that attackers can observe through a cache-based covert channel, enabling secret leakage. The indirect characteristic of an implicit cache-based speculative execution attack is that there is no direct data dependency between the secret-dependent microarchitectural change and the original secret data in a cache-based covert channel. In the final attack step of secret recovery, the attacker infers the secret by measuring the access time to array2[0]. A lower access time indicates that the data at array2[0] has been loaded into the cache and that the secret value equals k.
Listing 1. Spectre Example 10 [34]—This is an implicit cache-based speculative execution attack.
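As a hedged illustration of the gadget described in the text, the following C sketch follows the commonly published form of Spectre Example 10. The array sizes and contents, the `secret` local variable, and the helper `demo_in_bounds_call` are assumptions added here for illustration and may differ from the paper's listing. The step comments map to the four attack steps; the code only shows the victim gadget and performs no attack by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes and contents, chosen only so that the sketch compiles. */
size_t  array1_size = 16;
uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
uint8_t array2[256 * 512];
volatile uint8_t temp = 0;  /* volatile so the load of array2[0] is not optimized away */

void victim_function_v10(size_t x, uint8_t k) {
    if (x < array1_size) {           /* Step 1: bounds check the attacker mistrains */
        uint8_t secret = array1[x];  /* Step 2: access instruction (load)           */
        if (secret == k) {           /* Step 3: transmit instruction is a branch    */
            temp &= array2[0];       /* load whose address does not depend on secret */
        }
    }
}

/* Architectural use: an in-bounds call is harmless. Leakage requires the
 * bounds check to be mispredicted for an out-of-bounds x. */
void demo_in_bounds_call(void) {
    victim_function_v10(3, array1[3]);
    assert(temp == 0);
}
```

Note that the address of the final load, `array2[0]`, is a constant: only the decision whether the load executes depends on the secret, which is what makes the channel implicit.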
From Listing 1, two key observations about implicit cache-based speculative execution attacks can be inferred. First, such attacks can exploit speculative non-memory instructions as transmit instructions to indirectly induce secret-dependent microarchitectural changes in cache-based covert channels. Second, the secret-dependent microarchitectural changes are finally generated by other speculative memory instructions, rather than by transmit instructions, and these speculative memory instructions may have no data dependency on the original secrets.

2.2.2. Explicit Cache-Based Speculative Attack Example

Listing 2 presents the key pseudocode for an explicit speculative execution attack leveraging a cache covert channel, illustrating the same four essential attack steps as Listing 1. The explicit attack steps of authorization bypass and secret access are similar to those in implicit cache-based speculative execution attacks described in Section 2.2.1. A branch misprediction for the if statement is exploited by the attacker to start an illegal speculative execution period in the authorization-bypass step, allowing the attacker to use a load instruction to read secret data in the secret-access step. Then, in the attack step of secret transmission, the attacker exploits a load instruction to trigger a cache miss, so that the secret-dependent data at array2[secret * 512] is moved into the data cache. This produces an observable secret-dependent microarchitectural change in the cache for array2[secret * 512]. In the final attack step of secret recovery, the attacker measures the access time of various data blocks in array2 and identifies an index i for which array2[i * 512] yields a noticeably low access time. Because the data at array2[secret * 512] was loaded into the data cache, resulting in a low access time, the attacker can infer that the secret value is equal to i.
As we can see, explicit cache-based speculative execution attacks require memory instructions as transmit instructions to directly transmit secret data-dependent microarchitectural changes through cache-based covert channels, and these observable microarchitectural changes are eventually created by memory instructions that have data dependence on secret information. These memory instructions are transmitters in explicit attacks.
Listing 2. Spectre Variant 1 example [1]—This is an explicit cache-based speculative execution attack.
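As a hedged illustration, the explicit gadget described in the text can be sketched in C in the commonly published form of Spectre Variant 1, using the stride of 512 mentioned above. The array sizes and contents, the `secret` local variable, and the helper `demo_in_bounds_call` are assumptions added here and may differ from the paper's listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes and contents, chosen only so that the sketch compiles. */
size_t  array1_size = 16;
uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
uint8_t array2[256 * 512];  /* one 512-byte slot per possible byte value */
volatile uint8_t temp = 0;  /* volatile so the array2 load is not optimized away */

void victim_function_v1(size_t x) {
    if (x < array1_size) {             /* Step 1: bounds check the attacker mistrains */
        uint8_t secret = array1[x];    /* Step 2: access instruction (load)           */
        temp &= array2[secret * 512];  /* Step 3: transmit instruction is a load      */
                                       /* whose address depends on the secret         */
    }
}

/* Architectural use: an in-bounds call is harmless. Leakage requires the
 * bounds check to be mispredicted for an out-of-bounds x. */
void demo_in_bounds_call(void) {
    victim_function_v1(3);
    assert(temp == 0);
}
```

Here the transmitter is itself a memory instruction and its address carries the secret, which is the case that memory-focused defenses are designed to block.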

3. Threat Model

The tracking and mitigation microarchitecture in TrackRISC, termed TrackRISC-Defense, focuses on mitigating both implicit and explicit speculative execution attacks that exploit the covert channels of data caches and data TLBs. Our focus is on protecting speculatively accessed data that may be leaked during these speculative execution attacks. Here, we describe the threat model of TrackRISC-Defense, detailing how an adversary exploits branch prediction to trigger an unauthorized speculative execution window through vulnerable code in victim threads, and then leverages cache-based covert channels to leak sensitive information. Our threat model excludes the covert channels of branch predictors [13,35], floating-point division units [36], or a performance monitor unit (PMU) [37]. The question of whether TrackRISC-Defense can mitigate these non-cache covert channels falls outside the scope of our experiments, and TrackRISC-Defense may not constitute a direct mitigation against these covert channels.

3.1. Critical Attack Steps

In the threat model, both implicit and explicit speculative execution attacks consist of four critical steps, namely, authorization bypass, secret access, secret transmission, and secret recovery. These attack steps for building the implicit and explicit speculative cache-based covert channels are described as follows [27,38]:
  • Step 1: Authorization Bypass. An attacker bypasses an authorization by exploiting a misprediction induced through the mistraining of a branch predictor, leading the CPU to speculatively execute an unauthorized code path.
  • Step 2: Secret Access. During the illegal speculative execution period, the attacker uses a load instruction to access the secret.
  • Step 3: Secret Transmission. After accessing the secret data, the attacker encodes secret information into a cache-based secret-dependent microarchitectural change and transmits the microarchitectural change through cache-based covert channels. The microarchitectural change is generated in the cache or TLB.
  • Step 4: Secret Recovery. The attacker infers the secret by observing the secret-dependent microarchitectural change (e.g., secret-dependent data residing in a cache), and a cache-based covert channel is formed during this step.
Figure 2 illustrates the comparison of the attack flows between the implicit and the explicit speculative covert channels, based on the attack examples from Listings 1 and 2 in Section 2. The attacker first bypasses the authorization for a conditional branch instruction (denoted as br) by exploiting the mispredicted branch achieved through mistraining a branch predictor. A load instruction (denoted as ld) is then exploited to access and load the secret data into a register. In the attack step of secret transmission, the attacker creates secret-dependent microarchitectural changes in the cache by using specific instructions. In particular, the implicit attack variant leverages a branch-load instruction pair (denoted as br and ld) to indirectly trigger these microarchitectural changes, whereas the explicit attack variant employs a single load instruction (denoted as ld) to directly induce these microarchitectural changes. Finally, the attacker observes the microarchitectural changes in the cache and recovers the secret information. Indirect/direct secret-dependent microarchitectural changes are introduced in Section 3.2 and Section 3.3.

3.2. Implicit Speculative Attacks: Indirect Microarchitectural Changes

Implicit cache-based speculative execution attacks enable indirect adversary-visible microarchitectural changes, as illustrated in Figure 2. The indirect characteristic refers to adversary-observable microarchitectural changes that are indirectly enabled by transmit instructions and eventually generated by other memory instructions, where these memory instructions may not have direct data dependence on the secret itself. In Listing 1, the attacker creates a secret-dependent microarchitectural change by accessing array2[0], based on the result of the condition if (secret == k). The data from array2[0] can be brought into the cache via a cache miss and a cache refill, causing a secret-dependent microarchitectural change. Although this change depends on the secret through control flow, the access to array2[0] has no data dependence on the secret data.

3.3. Explicit Speculative Attacks: Direct Microarchitectural Changes

Explicit cache-based speculative execution attacks create direct microarchitectural changes that can be observed by the attackers, as shown in Figure 2. The direct characteristic refers to attacker-observable microarchitectural changes that are generated directly by the transmit instructions, which are memory instructions with data dependence on the secret data. Attackers commonly create direct microarchitectural changes by accessing memory addresses that have data dependence on secret information, e.g., by accessing array2[secret * 512] in Listing 2, thereby generating observable microarchitectural effects (e.g., the array2[secret * 512] data is cached). During the secret-recovery step, the attacker accesses multiple array2 entries and infers the secret data from the entry that exhibits a lower access time.
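The secret-recovery step of an explicit attack amounts to finding the probed entry with a noticeably low access time. The C sketch below illustrates only this decision logic on an array of already measured latencies. The names `recover_secret` and `demo_recover`, the threshold, and the synthetic latency values are hypothetical; a real receiver would obtain the latencies by timing accesses to array2[i * 512].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CANDIDATES 256u  /* one candidate per possible byte value */

/* Return the candidate index with the lowest latency below hit_threshold,
 * or -1 when no probed entry looks cached. */
int recover_secret(const uint32_t latency[NUM_CANDIDATES], uint32_t hit_threshold) {
    assert(latency != NULL);
    int best = -1;
    uint32_t best_latency = hit_threshold;
    for (size_t i = 0; i < NUM_CANDIDATES; i++) {
        if (latency[i] < best_latency) {
            best_latency = latency[i];
            best = (int)i;
        }
    }
    return best;
}

/* Build a synthetic trace in which only the slot selected by the secret is
 * fast, then run the receiver's decision logic on it. */
int demo_recover(uint8_t secret) {
    uint32_t latency[NUM_CANDIDATES];
    for (size_t i = 0; i < NUM_CANDIDATES; i++) {
        latency[i] = 200;  /* assumed miss latency in cycles */
    }
    latency[secret] = 40;  /* assumed hit latency in cycles  */
    return recover_secret(latency, 100);
}
```

In the implicit variant of Listing 1, the receiver instead probes a single fixed location and learns only whether the secret equals the guessed value k.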

4. TrackRISC: Implicit Attack Modeling and Analysis

Although attacks may vary, a general implicit attack flow model is proposed to abstract the critical attack steps of existing implicit cache-based speculative execution attacks. TrackRISC incorporates the attack flow model, motivated by the observation that implicit speculative cache-based covert channels pose serious risks but have attracted relatively less attention in the existing hardware defenses compared to explicit channels. The prior work of He et al. [27] provides a high-level yet coarse-grained model for speculative execution attacks without distinguishing between implicit and explicit channels, as summarized in Section 3.1, which we leverage to describe our threat model. Therefore, we propose a refined implicit attack flow model that targets implicit speculative execution attacks by introducing additional components such as the indirect influence exerted by transmit instructions. Our attack flow model demonstrates that cache-based speculative execution attacks can undermine the security of the existing hardware defenses by exploiting non-memory instructions as unsafe transmitters.

4.1. Implicit Attack Flow Model

While the workflow of implicit cache-based speculative execution attacks may differ, these attacks can generally be modeled as shown in Figure 3, where the critical attack steps, i.e., authorization bypass, secret access, secret transmission, and secret recovery, are explained in Section 3.1. Section 2.2 introduces the attack flow of building a cache-based covert channel through a sender and a receiver. $I_{spec}$, $I_{access}$, $I_{transmit}$, and $I_{memory}$ are the critical attack instructions introduced in Section 4.1.1. $I_{transmit}$ can be either non-memory instructions, denoted as $NM$, or memory instructions, denoted as $M$, as shown in Figure 3. $I_{transmit}$ indirectly influences $I_{memory}$ in a secret-dependent manner, inducing $I_{memory}$ to finally generate adversary-observable microarchitectural changes in caches/TLBs, thereby revealing secret data. Indirect influence occurs when $I_{transmit}$ impacts $I_{memory}$ not through direct data dependence, but through mechanisms such as control-flow decisions and resource contention. Our implicit attack flow model focuses on the covert channels of data caches/TLBs. Prefetcher-based attacks [39] are out of scope and these attacks can be mitigated by disabling the specific prefetchers that could be exploited by attackers.
The implicit attack flow model reveals why implicit cache-based speculative execution attacks can bypass the existing hardware defenses tailored for speculative cache-based covert channels. To prevent cache-based speculative execution attacks, these existing hardware defenses block memory instructions that may act as transmitters involved in attacks to inhibit the generation of secret-dependent microarchitectural changes in caches or TLBs, since memory instructions typically trigger cache-based state changes. However, based on our attack flow model, non-memory instructions can also act as transmitters that trigger the attacker-observable microarchitectural changes in implicit cache-based speculative execution attacks. Consequently, these existing hardware defenses cannot prevent the implicit attacks.
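The gap described above can be made concrete with a small decision model. The C sketch below is a deliberately simplified abstraction, not the paper's hardware design: the `instr` record, the two policy functions, and the gadget helpers are hypothetical names, and each policy is reduced to a single predicate that decides whether an instruction is delayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_LOAD, OP_STORE, OP_BRANCH, OP_ALU } op_class;

/* Hypothetical, simplified view of an in-flight instruction. */
typedef struct {
    op_class op;
    bool speculative;       /* still inside an unresolved speculation window */
    bool secret_dependent;  /* an operand derives from speculatively loaded data */
} instr;

typedef bool (*delay_policy)(instr);

static bool is_memory(op_class op) {
    return op == OP_LOAD || op == OP_STORE;
}

/* Memory-only view: delay only secret-dependent speculative loads/stores. */
bool delay_memory_only(instr i) {
    return i.speculative && i.secret_dependent && is_memory(i.op);
}

/* Broader view: delay every secret-dependent speculative instruction. */
bool delay_all_dependents(instr i) {
    return i.speculative && i.secret_dependent;
}

/* Explicit gadget: the transmitter is a load with a secret-dependent address. */
bool explicit_gadget_leaks(delay_policy policy) {
    assert(policy != NULL);
    const instr transmit_load = { OP_LOAD, true, true };
    return !policy(transmit_load);
}

/* Implicit gadget: a secret-dependent branch steers a load whose address
 * has no data dependence on the secret. */
bool implicit_gadget_leaks(delay_policy policy) {
    assert(policy != NULL);
    const instr transmit_branch = { OP_BRANCH, true, true };
    const instr steered_load    = { OP_LOAD,   true, false };
    bool branch_executes = !policy(transmit_branch);
    bool load_executes   = !policy(steered_load);
    return branch_executes && load_executes;
}
```

Under this model both policies stop the explicit gadget, but only the broader policy stops the implicit one, because the steered load carries no secret-dependent operand and the memory-only policy never considers the branch.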
For brevity in describing the attack flows of implicit speculative cache-based covert channels, a concise representation of our implicit attack flow model is presented below:
$$I_{spec} \rightarrow Sender\{I_{access}, I_{transmit}, I_{memory}\} \xrightarrow[I_{memory}\{uarch\}]{\text{Implicit Covert Channel}} Receiver\{Cache, TLB\},$$
where $I_{memory}\{uarch\}$ refers to a secret-dependent microarchitectural change eventually generated by $I_{memory}$. The microarchitectural change is then transmitted from the sender to the receiver through an implicit covert channel, enabling secret recovery in an implicit cache-based speculative execution attack.

4.1.1. Critical Instructions in Implicit Attacks

To clearly introduce the implicit attack flow model, we first define the notation, i.e., $I_{spec}$, $I_{access}$, $I_{transmit}$, and $I_{memory}$, to describe critical instructions exploited in implicit cache-based speculative execution attacks.
  • $I_{spec}$ refers to a speculation-inducing instruction that causes an illegal speculative execution period in the attack step of authorization bypass. For example, $I_{spec}$ can be a conditional branch instruction that may be mispredicted, and an attacker can exploit a mispredicted conditional branch to bypass the authorization.
  • $I_{access}$ refers to an access instruction [6] that accesses secret data in the attack step of secret access. $I_{access}$ is commonly a load instruction that reads the secret data into a register.
  • $I_{transmit}$ refers to a transmit instruction [6] that triggers the initial phase of the secret-transmission attack step for generating an adversary-observable secret-dependent microarchitectural change in caches/TLBs, representing an indirect secret transmission. $I_{transmit}$ has data dependence on secret information so as to indirectly enable a secret-dependent microarchitectural change through control-flow decisions or resource contention. Notably, $I_{transmit}$ can be a non-memory instruction, e.g., a conditional branch instruction.
  • $I_{memory}$ refers to a load/store instruction that completes the final phase of the secret-transmission attack step for eventually generating an attacker-observable secret-dependent microarchitectural change in caches/TLBs. Once the execution of $I_{memory}$ completes, the attacker-observable microarchitectural change that reveals secret information is produced in caches/TLBs. Moreover, $I_{memory}$ may be speculative or non-speculative, and may exhibit no data dependence on the secret information, on $I_{access}$, or on $I_{transmit}$.
In implicit cache-based speculative execution attacks, secret data is directly handled by access and transmit instructions, but the secret data indirectly influences the other memory instructions that finally cause secret-dependent microarchitectural changes. We describe the data flow relationships among these critical instructions in the implicit attacks as follows:
$$sDep[I_{access}(s), I_{transmit}(s)] \wedge m[I_{transmit}(s) \rightsquigarrow I_{memory}],$$
where $sDep[I_{access}, I_{transmit}]$ entails three sub-conclusions, as detailed below:
$$\forall\, I_{access}, I_{transmit}:\ \text{if } sDep[I_{access}, I_{transmit}] \Rightarrow IF(I_{transmit}) \text{ after } IF(I_{access}),\ \text{and } I_{access}(s) \rightarrow I_{transmit}(s),\ \text{and } Exe(I_{access}),\, Exe(I_{transmit}) < T[W_{start}, W_{end}]$$
where $IF(I_{access})$/$IF(I_{transmit})$ denotes the instruction fetch order of an access/transmit instruction in the CPU frontend. $I_{access}(s) \rightarrow I_{transmit}(s)$ refers to an access instruction ($I_{access}$) that accesses the secret data $s$ and passes secret-dependent data $s$ to a transmit instruction ($I_{transmit}$) through the instruction pair attribute $I_{access}\{rs_{out}\} = I_{transmit}\{rs_{in}\}$, where $I_{access}\{rs_{out}\}$ denotes the output register identifier of the access instruction and $I_{transmit}\{rs_{in}\}$ denotes any input register identifier of the transmit instruction. $Exe(I_{access}),\, Exe(I_{transmit}) < T[W_{start}, W_{end}]$ means that the execution of both the access and transmit instructions must be completed within a speculative window $T$, which is triggered by a speculation-inducing instruction. The speculative window $T$ begins at the time point $W_{start}$ and ends at the time point $W_{end}$, when the speculation-inducing instruction is resolved, e.g., when the CPU knows the actual next execution path and no longer relies on the predicted result for the speculation-inducing instruction.
In contrast to the transmit instructions, the memory instruction ($I_{memory}$) that finally generates a secret-dependent microarchitectural change does not need to satisfy any of the three sub-conclusions derived from $sDep[\,]$, as detailed below:
$$\exists\, I_{access}, I_{memory}: \neg\, sDep[I_{access}, I_{memory}],\ \text{or}\ \exists\, I_{transmit}, I_{memory}: \neg\, sDep[I_{transmit}, I_{memory}],$$
however, $I_{memory}$ needs to satisfy the interference rule $m[I_{transmit}(s) \rightsquigarrow I_{memory}]$, indicating that $I_{transmit}$ can indirectly influence the execution of $I_{memory}$, depending on the secret-dependent value $s$. The indirect influence can be mediated by mechanisms such as shared resources or conditional triggers.
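The three sub-conclusions can be read as a simple predicate over an access/transmit pair. The following is a minimal sketch, assuming a hypothetical `Inst` record and treating the speculative window as a half-open cycle range; neither the field names nor the window encoding come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction record; field names are illustrative only."""
    name: str
    fetch_order: int            # position in front-end fetch order, IF(.)
    rs_in: Tuple[int, ...]      # input register identifiers
    rs_out: Optional[int]       # output register identifier, None if absent
    exe_done: int               # cycle in which execution completes, Exe(.)


def s_dep(access: Inst, transmit: Inst, w_start: int, w_end: int) -> bool:
    """Check the three sub-conclusions for an access/transmit pair."""
    # (1) The transmit instruction is fetched after the access instruction.
    fetched_after = transmit.fetch_order > access.fetch_order
    # (2) The access output register is one of the transmit input registers.
    passes_secret = access.rs_out is not None and access.rs_out in transmit.rs_in
    # (3) Both complete inside the speculative window.
    # Assumption: the window is the half-open cycle range [w_start, w_end).
    in_window = all(w_start <= i.exe_done < w_end for i in (access, transmit))
    return fetched_after and passes_secret and in_window


load_secret = Inst("load", fetch_order=0, rs_in=(10,), rs_out=5, exe_done=12)
branch = Inst("br", fetch_order=1, rs_in=(5, 0), rs_out=None, exe_done=14)
late_load = Inst("load", fetch_order=2, rs_in=(7,), rs_out=8, exe_done=40)
```

Here `s_dep(load_secret, branch, 10, 30)` holds, whereas `late_load` fails the predicate because it neither reads register 5 nor completes inside the window. This matches the observation that $I_{memory}$ need not satisfy any of the three conditions.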

4.1.2. Attack Modeling Overview

Table 1 provides an overview of attack modeling representations for existing implicit speculative execution attacks leveraging data cache covert channels. These representations, which are derived from the implicit attack flow model in Figure 3, offer a more concise depiction. While well-known implicit attacks focus on exploiting cache covert channels, we also theoretically model potential future attacks utilizing TLB covert channels. Detailed analyses of these attack flows are discussed in Section 4.2 and Section 4.3. In addition, we observe that implicit cache-based speculative execution attacks primarily differ in how transmit instructions ($I_{transmit}$) indirectly affect other memory instructions ($I_{memory}$), even though no data dependence exists between them. In Table 1, the notation $I_{transmit} \rightsquigarrow I_{memory}$ represents the medium of indirect influence by which $I_{transmit}$ affects $I_{memory}$ through mechanisms including control-flow decisions and resource contention, rather than through direct data dependencies.

4.2. Modeling Implicit Attacks Using Control-Flow Decisions

In implicit cache-based speculative execution attacks, control-flow decisions, e.g., the condition result of conditional branch instructions, can be exploited by transmit instructions ($I_{transmit}$) to indirectly influence other memory instructions ($I_{memory}$) that alter cache states, thereby leaking secret data. For example, attackers exploit conditional branch instructions as $I_{transmit}$ to induce a victim CPU to follow different execution paths depending on secret values, thereby indirectly influencing the execution of $I_{memory}$, which occurs on only one path and ultimately alters cache states.
Spectre Example 10 [34], an implicit attack using a conditional branch instruction as $I_{transmit}$, is shown in Listing 1. Depending on the condition result of if (secret == k), the victim CPU either accesses array2[0] or not, so the data is either cached or not. If the access to array2[0] is fast, indicating that the data is cached, the attacker can infer that secret = k; otherwise, a slow access suggests that secret ≠ k. In Spectre Example 10, $I_{transmit}$ is the conditional branch instruction in if (secret == k). We model this type of implicit cache-based speculative execution attack, which uses control-flow decisions to indirectly influence $I_{memory}$, as follows:
$$I_{spec} \rightarrow Sender\{I_{access}, I_{transmit}\{br, \ldots\}, I_{memory}\{load, store\}\} \xrightarrow{\text{Implicit Covert Channel}} Receiver\{Cache, TLB\},$$
where $I_{transmit}$ is a control-flow instruction (e.g., a conditional branch instruction denoted as $br$) that determines the subsequent execution path based on secret values, and $I_{memory}$ is a load/store instruction that eventually generates a secret-dependent microarchitectural change in the covert channel of data caches/TLBs. In Spectre Example 10 [34], $I_{memory}$ is the load instruction accessing array2[0] that results in a cache miss, denoted as $I_{memory}\{load_{cache\ miss}\}$ in Table 1. However, under the same attack modeling representation, $I_{memory}$ can also be a load instruction causing a cache hit, denoted as $I_{memory}\{load_{cache\ hit}\}$ in Table 1. In that variant, $I_{memory}$ changes the least-recently used (LRU) state of a cache to build a leakage channel [40], with control-flow instructions acting as the transmit instructions. TrackRISC-Defense can mitigate both Spectre Example 10 [34] and LRU attacks [40] by blocking the execution of the transmit instructions (e.g., control-flow instructions) in these attacks.
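The control-flow variant can be illustrated with a toy software model. The sketch below is not the paper's Listing 1 and performs no real timing measurement: the cache is assumed to be a plain Python set, and the names `victim`, `attacker_probe`, and `recover` are hypothetical.

```python
def victim(secret: int, k: int, cache: set) -> None:
    """Speculatively executed victim path in the toy model."""
    # I_transmit: the branch outcome depends on the secret.
    if secret == k:
        # I_memory: the load address is secret-independent,
        # but whether the load executes at all depends on the branch.
        cache.add(("array2", 0))


def attacker_probe(k: int, secret: int) -> bool:
    """Return True if array2[0] is cached after the victim runs with guess k."""
    cache = set()                     # models a flushed cache
    victim(secret, k, cache)
    return ("array2", 0) in cache     # a fast access would indicate a hit


def recover(secret: int, candidates=range(256)) -> int:
    """Try every candidate k until the probe reports a cache hit."""
    return next(k for k in candidates if attacker_probe(k, secret))
```

In this model `recover(42)` returns 42 even though the load itself carries no data dependence on the secret, which is why restricting only memory transmitters does not close the channel.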

4.3. Modeling Implicit Attacks Using Resource Contention

For implicit cache-based speculative execution attacks exploiting resource contention, the critical attack steps proceed as preparation → ① → ② → secret transmission (initial phase, then final phase) → Secret Recovery. The preparation step refers to the preparatory work to temporarily suspend the execution of non-speculative memory instructions ($I_{memory}$). ① refers to the attack step of authorization bypass. ② refers to the attack step of secret access. The initial phase of secret transmission is the attack step where transmit instructions ($I_{transmit}$) are executed, and the final phase is the attack step where $I_{memory}$ completes execution. $I_{transmit}$ can be a memory or non-memory instruction that indirectly influences $I_{memory}$ by triggering resource contention based on secret values. $I_{memory}$ then changes cache states, resulting in secret recovery and information leakage. Detailed analyses of these attack steps are presented in Section 4.3.1 and Section 4.3.2.
Attackers can induce resource contention in miss status holding registers (MSHRs) [41] and non-pipelined execution units (EUs) [41], where MSHRs handle cache misses and manage the data refill process to bring missing data back into caches, and EUs perform various computational tasks for instruction execution. Given two instructions I1 and I2, if I1 precedes I2 in software program order, we say that I1 is a previous instruction of I2.

4.3.1. MSHR Contention

The implicit attack example using the contention of miss status holding registers (MSHRs) is shown in Figure 4. The preparation step temporarily suspends the execution of the load instruction in y = load(z) by rendering its input operand unavailable. After the preparation step and bypassing the speculative check in step ①, the attacker accesses the secret data in step ②.
In Figure 4, the initial and final phases of secret transmission indirectly generate secret-dependent microarchitectural changes to form a cache-based covert channel. When secret_bit = 1, the execution of multiple secret-dependent loads causes M cache misses in the initial phase, potentially leading to MSHR resource exhaustion. M is the number of MSHRs in the victim CPU, where MSHRs serve to handle cache misses and manage missing data refill processes. In the final phase, MSHR resource exhaustion affects the execution of the previous $I_{memory}$ in y = load(z), as it requires an MSHR to handle the cache miss caused by $I_{memory}$. Therefore, MSHR contention and exhaustion can block $I_{memory}$ from bringing the data y into the cache, resulting in the slow execution of $I_{memory}$. The attacker can infer secret_bit = 1 by observing that data y is not in the cache. When secret_bit = 0, these secret-dependent load instructions cause only a single cache miss in the initial phase, and thus the previous $I_{memory}$ in y = load(z) does not experience any MSHR contention. The cache miss caused by $I_{memory}$ can be quickly handled by an MSHR, allowing the requested data to be filled into the cache in the final phase, and the CPU quickly completes the execution of $I_{memory}$. The attacker can infer secret_bit = 0 by observing that data y resides in the cache. We model this type of implicit cache-based speculative execution attack as follows:
$$I_{spec} \rightarrow Sender\{I_{access}, I_{transmit}\{load\}, I_{memory}\{load, store\}\} \xrightarrow{\text{Implicit Covert Channel}} Receiver\{Cache, TLB\},$$
where $I_{transmit}$ refers to a series of secret-dependent load instructions that facilitate the MSHR contention through multiple cache misses, and $I_{memory}$ is a load/store instruction that eventually generates a secret-dependent microarchitectural change in the covert channel of data caches/TLBs. In Figure 4, $I_{memory}$ is the non-speculative load instruction in y = load(z), denoted as $I_{memory}\{load_{non\text{-}speculative}\}$ in Table 1. TrackRISC-Defense can mitigate such MSHR attacks [41] by preventing the execution of transmit instructions (e.g., load instructions) in these attacks.
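The MSHR contention flow can be summarized by a small counting model. This is a sketch under simplifying assumptions: MSHRs are modeled as a counter, the secret-dependent loads are assumed to occupy their MSHRs for the whole observation period, and `y_is_cached` and `infer_bit` are hypothetical names.

```python
def y_is_cached(secret_bit: int, num_mshrs: int = 4) -> bool:
    """Toy model: does y = load(z) get an MSHR and fill the cache?"""
    # The model needs at least two MSHRs so that one miss leaves a free entry.
    assert num_mshrs >= 2
    # I_transmit: M misses when secret_bit = 1, a single miss otherwise.
    misses = num_mshrs if secret_bit == 1 else 1
    free_mshrs = num_mshrs - misses
    # I_memory: the previous load y = load(z) needs one free MSHR.
    return free_mshrs > 0


def infer_bit(y_cached: bool) -> int:
    """Receiver side: y missing from the cache implies secret_bit = 1."""
    return 0 if y_cached else 1
```

With the default of four MSHRs, `infer_bit(y_is_cached(1))` yields 1 and `infer_bit(y_is_cached(0))` yields 0, mirroring the two cases described above.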

4.3.2. Non-Pipelined EU Contention

Figure 5 shows an example of an implicit cache-based speculative execution attack via a non-pipelined execution unit (EU). The preparation step temporarily stalls the execution of the load instruction in A = load(address) by rendering its input operand unavailable. After the preparation step, the attacker bypasses the authorization in step ① and accesses the secret data in step ②.
During the initial and final phases of secret transmission in Figure 5, a secret-dependent microarchitectural change is indirectly generated to build a cache-based covert channel. When secret_bit = 1, accessing S[secret_bit * 64] causes a cache hit, allowing the data x to be quickly available. Then same_EU_contention(x) is executed, contending for the same EU with the previous computational task of EU_contention(z) and thereby delaying the execution of EU_contention(z). The delayed execution also slows down the execution of $I_{memory}$ in A = load(address), resulting in A not being cached during the final phase, since $I_{memory}$ relies on the computational result from EU_contention(z). Therefore, the attacker can infer secret_bit = 1 by observing that data A is not in the cache. When secret_bit = 0, accessing S[secret_bit * 64] causes a cache miss, so the data x becomes available only slowly. Since the input operand x is unavailable, the function same_EU_contention(x) is not executed, and no EU contention occurs on the computational task of EU_contention(z). $I_{memory}$ in A = load(address) can quickly complete execution in the final phase, and the attacker can infer secret_bit = 0 by observing that the data A is in the cache. We model this type of implicit cache-based speculative execution attack as follows:
$$I_{spec} \rightarrow Sender\{I_{access}, I_{transmit}\{load, div, \ldots\}, I_{memory}\{load, store\}\} \xrightarrow{\text{Implicit Covert Channel}} Receiver\{Cache, TLB\},$$
where $I_{transmit}$ refers to an instruction (e.g., a division-based instruction denoted as $div$) whose execution latency depends on secret-dependent operand values, thereby facilitating the subsequent execution-unit contention, and $I_{memory}$ is a load/store instruction that eventually generates a secret-dependent microarchitectural change in the covert channel of data caches/TLBs. In Figure 5, $I_{transmit}$ is the load instruction in x = load(&S[secret_bit * 64]), whose execution latency depends on the value of secret_bit. Specifically, the load results in a cache hit or a cache miss depending on whether secret_bit is 1 or 0, which determines its latency. $I_{memory}$ is the non-speculative load instruction in A = load(address), denoted as $I_{memory}\{load_{non\text{-}speculative}\}$ in Table 1. TrackRISC-Defense can mitigate such EU attacks [41] by blocking the execution of transmit instructions (e.g., load instructions and division-based instructions) in these attacks, since these transmit instructions may trigger the subsequent EU contention depending on the value of secret data.
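A cycle-level toy model makes the timing argument concrete. All latencies below are hypothetical values chosen only so that the two cases separate; they are not measurements from SonicBOOM, and `a_is_cached` is an illustrative name.

```python
HIT_LATENCY = 2      # cycles until x is ready on a cache hit (assumed)
MISS_LATENCY = 100   # cycles until x is ready on a cache miss (assumed)
EU_LATENCY = 20      # occupancy of the non-pipelined EU per operation (assumed)
Z_READY = 10         # cycle at which EU_contention(z) becomes ready (assumed)
DEADLINE = 35        # cycle at which the receiver probes the cache (assumed)


def a_is_cached(secret_bit: int) -> bool:
    """Toy model: is A = load(address) issued before the probe deadline?"""
    # I_transmit: x = load(&S[secret_bit * 64]) hits for 1 and misses for 0.
    x_ready = HIT_LATENCY if secret_bit == 1 else MISS_LATENCY
    eu_free_at = 0
    if x_ready <= Z_READY:
        # same_EU_contention(x) grabs the non-pipelined EU first.
        eu_free_at = x_ready + EU_LATENCY
    # EU_contention(z) starts once it is ready and the EU is free.
    z_start = max(Z_READY, eu_free_at)
    # I_memory issues only after EU_contention(z) produces its result.
    load_issue = z_start + EU_LATENCY
    return load_issue <= DEADLINE
```

Under these assumed numbers, `a_is_cached(1)` is false (the load would issue at cycle 42) and `a_is_cached(0)` is true (cycle 30), so the receiver can distinguish the two secret values from the cache state of A alone.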

4.4. Existing Hardware Defenses and Their Security Analysis

Based on implicit attack flows modeled in Table 1, we present a classification of existing hardware defenses against speculative cache-based covert channels, organized by defense security, performance, and hardware resource overhead, as summarized in Table 2. The defense mechanism classification is based on the prior work of Hu et al. [38]. The existing defense mechanisms include memory transmitter restriction [23], invisible structures [9,14,18], and delay-on-miss [11,12], adopting a high-security defense strategy to prevent the generation of secret-dependent microarchitectural changes in cache-based structures. These hardware defenses effectively block explicit speculative cache-based covert channels, but lack enough security against implicit channels. An important reason is that implicit cache-based speculative execution attacks can exploit non-memory transmit instructions to bypass these existing hardware defenses, but these hardware defenses primarily restrict the memory instructions that may act as potential transmitters in attacks.
The first defense mechanism (i.e., memory transmitter restriction) is to prevent the execution of unsafe memory-based transmitters, i.e., delay the execution of speculative load and store instructions that may hold secret-dependent operands. This defense mechanism, adopted by SpecTerminator-v1 [23], presents high security against explicit cache-based speculative execution attacks, where transmit instructions are secret-dependent speculative memory instructions (i.e., loads and stores) that directly generate attacker-observable cache-based microarchitectural changes. However, the defense mechanism cannot mitigate implicit cache-based speculative execution attacks such as Spectre Example 10 [34] and cache LRU covert channels [40] using conditional branch instructions as transmitters to indirectly generate secret-dependent microarchitectural changes. Moreover, this defense mechanism cannot mitigate speculative interference attacks [41] that can use non-memory instructions as transmitters to cause non-pipelined EU contention, finally forming cache covert channels.
The second defense mechanism (i.e., invisible structure), which is adopted by InvisiSpec [9], SafeSpec [14], and MuonTrap [18], is to exploit an invisible structure for speculative data storage, thereby preventing speculative data from entering caches and generating attacker-observable microarchitectural changes. However, these hardware defenses demonstrate inadequate security against speculative interference attacks via resource contention in miss status holding registers (MSHR)/non-pipelined execution units (EUs) [41], which are implicit attacks that ultimately enable cache covert channels. To the best of our knowledge, InvisiSpec [9], SafeSpec [14], and MuonTrap [18] do not specify modifications on MSHRs/non-pipelined EUs. Additionally, implicit attacks using EU contention can exploit transmitters that are non-memory instructions, with non-speculative data finally enabling secret-dependent microarchitectural changes in caches. Therefore, these hardware defenses will demonstrate inadequate security against these implicit attacks.
The third defense mechanism (i.e., delay-on-miss) is to delay the execution of speculative loads causing cache misses. The mechanism is adopted by EfficientSpec [12] and CondSpec [11]; however, the delay-on-miss mechanism is not secure enough in mitigating implicit attacks using load instructions causing cache hits to change the Least Recently Used (LRU) states of caches [40]. In addition, although they are under this defense mechanism, attackers can still exploit non-memory transmitters to enable EU contention and then use non-speculative data to finally generate secret-dependent microarchitectural changes. Therefore, delay-on-miss is also insufficient to mitigate speculative interference attacks [41] which exploit non-pipelined EU contention to eventually change cache states.
The defense mechanism of STT [6], implemented in TrackRISC-Defense, is more secure than the existing hardware defenses specifically designed for speculative cache-based covert channels, particularly in mitigating implicit cache-based speculative execution attacks. To mitigate a broader range of attacks, TrackRISC-Defense incurs higher performance overhead than these existing hardware defenses by blocking the execution of a larger set of transmit instructions, including the non-memory transmit instructions that may enable implicit cache-based or other non-cache covert channels. A hardware defense is typically composed of a tracking module and a restriction module in its hardware implementation. In terms of hardware resource overhead, if these existing defenses do not employ a potential performance optimization approach of leveraging a data-dependence tracking module to more precisely identify unsafe instructions and thereby reduce the number of potential unsafe instructions that must be restricted, TrackRISC-Defense will incur higher hardware resource overhead than these hardware defenses. Tracking data dependencies within a cycle imposes high hardware resource overhead when implementing TrackRISC-Defense on high-performance processor cores; however, a potential resource optimization approach is to partially reuse the register renaming logic of the original processor core to track data dependencies.

5. TrackRISC: Hardware Microarchitectural Mitigation

In addition to the implicit attack flow model, the TrackRISC framework also involves a tracking and mitigation microarchitecture, termed TrackRISC-Defense, to mitigate both implicit and explicit cache-based speculative execution attacks. In contrast to the existing defenses discussed, TrackRISC-Defense provides higher security by mitigating both implicit and explicit speculative cache-based covert channels through blocking the execution of both non-memory transmitters (e.g., secret-dependent conditional branch instructions) and memory transmitters (i.e., secret-dependent memory instructions) involved in such attacks. TrackRISC is built upon the tracking and mitigation methods of STT [6]. A comparison between STT and TrackRISC-Defense is presented in Section 7.6.

5.1. Critical Instruction Identification

TrackRISC-Defense identifies critical instructions that are essential for enabling information leakage in cache-based speculative execution attacks, and mitigates these attacks by preventing the execution of such instructions. Cache-based speculative execution attacks require the execution of speculation-inducing instructions ($I_{spec}$), access instructions ($I_{access}$), and transmit instructions ($I_{transmit}$). Memory instructions ($I_{memory}$) do not need to be tracked, since TrackRISC-Defense is designed to block cache-based speculative execution attacks at an earlier stage, before the execution of $I_{memory}$. This is because an attack can be mitigated at any critical instruction corresponding to a step of the attack process [1,45]. The critical instruction definitions of $I_{spec}$, $I_{access}$, and $I_{transmit}$ are described in Section 4.1.1.
In TrackRISC-Defense, control-flow instructions (e.g., conditional branch instructions) that may cause mispredictions are identified as $I_{spec}$. Speculative load instructions are treated as $I_{access}$, capable of accessing and reading secret data into registers, with all speculatively accessed data considered as secret information. Additionally, to mitigate cache-based speculative execution attacks, TrackRISC-Defense identifies and blocks two types of $I_{transmit}$, categorized as memory and non-memory instructions. On the one hand, for mitigating implicit attacks, TrackRISC-Defense considers that $I_{transmit}$ includes speculative non-memory instructions (e.g., conditional branch instructions) that may have data dependence on secrets. On the other hand, for mitigating explicit attacks, TrackRISC-Defense considers that $I_{transmit}$ also includes speculative memory instructions that are data-dependent on secrets. $I_{transmit}$ has data dependence on $I_{access}$ so as to encode secret information into attacker-observable microarchitectural changes.

5.2. TrackRISC-Defense Mechanism

TrackRISC-Defense adopts the defense mechanism of blocking the execution of transmit instructions in both implicit and explicit cache-based speculative execution attacks, since transmit instructions are critical components in both implicit and explicit attacks. Specifically, TrackRISC-Defense delays the execution of both speculative memory and non-memory instructions that may have data dependencies on secret values. In explicit cache-based speculative execution attacks, secret-dependent memory instructions (e.g., loads and stores) can serve as transmit instructions, directly causing attacker-observable microarchitectural changes and information leakage. In contrast, implicit cache-based attacks may leverage secret-dependent non-memory instructions (e.g., conditional branch instructions) as transmit instructions, indirectly triggering microarchitectural changes through control-flow decisions and resource contention. By preventing the execution of these transmit instructions, TrackRISC-Defense effectively mitigates both explicit and implicit attacks. Figure 6 illustrates the mitigation flow of TrackRISC-Defense for both implicit and explicit cache-based speculative execution attacks. The attack steps (i.e., authorization bypass, secret access, secret transmission, and secret recovery) have been discussed in Section 3.1. The definitions of the critical instructions ($I_{spec}$, $I_{access}$, $I_{transmit}$, and $I_{memory}$) have been presented in Section 4.1.1. Attackers can exploit non-memory transmit instructions, such as the conditional branch instructions denoted as $br$ or the division-based instructions denoted as $div$. Memory instructions are denoted as $load$ and $store$.

5.3. Taint Propagation

TrackRISC-Defense leverages taint propagation to monitor the transfer of secret data in speculative cache-based covert channels. Our taint propagation is implemented by tracking unsafe instructions through assigning taint labels to such instructions. The unsafety of tracked instructions arises because these instructions may operate on the operands dependent on secret information. An instruction is considered tainted [46] if it involves one or more input registers that may contain secret-dependent data, specifically when (i) the instruction is a potential access instruction capable of accessing secret data, or (ii) the instruction has data dependency on previous potential access instructions. Tainted instructions result in their output registers being tainted as well and the taint status of these registers is recorded in a global taint mask. TrackRISC-Defense is grounded in prior literature [6,8,22,23,46,47,48,49,50,51,52,53,54,55,56,57,58] regarding dynamic information flow tracking and mitigation.
In TrackRISC-Defense, the workflow of taint propagation is illustrated in Figure 7a, indicated by the red line. $\{rs_{in1}, rs_{in2}\}$/$\{rs_{in3}, rs_{in4}\}$ refers to the input register identifiers used by a given instruction. $rs_{out1}$/$rs_{out2}$, generally described as $rs_{out}$, denotes the output register identifier used by a given instruction. $rs_{in1}\{secret\}$ denotes that the input register (i.e., $rs_{in1}$) may contain secret data. $rs_{out1} = rs_{in3}$ means that the current instruction using $rs_{in3}$ is data-dependent on the previous instruction using $rs_{out1}$. $Inst\_rs_{out}\{bit\}$ refers to a specific bit in the global taint mask that is set to 1 to record the taint status of an output register used by a tainted instruction, and the bit can be indexed by the $rs_{out}$ of the tainted instruction.
During taint propagation in TrackRISC-Defense, data dependence plays a crucial role in identifying newly tainted instructions, i.e., potential access instructions and their data-dependent instructions. On the one hand, in a parallel instruction queue, data dependence between two instructions is identified by checking whether they share a register, i.e., whether any input register of one instruction shares the same register identifier with the output register of the other instruction, e.g., $rs_{out1} = rs_{in3}$ in Figure 7a. On the other hand, for non-parallel instructions, a global taint mask is used to track secret data dependence among such instructions. The global taint mask promptly records the taint status of output registers across newly tainted instructions by setting their corresponding bits to 1, with each bit indexed by the output register identifier of the respective instruction. The global taint mask is detailed in Section 6.1.
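The propagation rule can be sketched in software as a single pass over instructions in program order. This is a simplified model, not the hardware design: it folds the intra-cycle register comparison and the global taint mask lookup into one sequential loop, and the tuple layout and the name `propagate_taint` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (is_potential_access, input register ids, output register id or None)
InstTuple = Tuple[bool, Tuple[int, ...], Optional[int]]


def propagate_taint(insts: List[InstTuple], mask: int = 0) -> Tuple[List[bool], int]:
    """Return per-instruction taint flags and the updated global taint mask."""
    tainted = []
    for is_access, rs_in, rs_out in insts:
        # An instruction depends on a secret if any input register is tainted.
        depends = any((mask >> r) & 1 for r in rs_in)
        is_tainted = is_access or depends
        if is_tainted and rs_out is not None:
            mask |= 1 << rs_out        # record the tainted output register
        tainted.append(is_tainted)
    return tainted, mask


program = [
    (True, (10,), 5),    # speculative load: potential access, taints r5
    (False, (5, 6), 7),  # consumes r5, so it is tainted and taints r7
    (False, (1, 2), 3),  # independent of r5 and r7, stays untainted
]
flags, final_mask = propagate_taint(program)
```

For this three-instruction example the flags are `[True, True, False]` and the mask has bits 5 and 7 set.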

5.4. Untaint Propagation

TrackRISC-Defense uses untaint propagation to stop tracking specific tainted instructions that have become safe. Specifically, our untaint propagation aims to halt the tracking of tainted instructions that have become non-speculative, resulting in their taint bits being cleared from the global taint mask, where each bit is indexed by the output register identifier of a tainted instruction. In TrackRISC-Defense, the workflow of untaint propagation is illustrated in Figure 7b, indicated by the blue line. $rs_{out}$ denotes the output register identifier used by an instruction. $Inst\_rs_{out}\{bit\} = 0$ refers to a specific bit in the global taint mask, indexed by the $rs_{out}$ of the instruction, which is set to 0 to clear its taint status and indicate an untainted status. The speculative state of critical instructions in attacks is crucial for enabling speculative cache-based covert channels. Therefore, TrackRISC-Defense checks whether tainted instructions are speculative or non-speculative in every CPU cycle. If a tainted instruction becomes non-speculative, its corresponding bit in the global taint mask is then cleared from 1 to 0. Further details on the global taint mask are presented in Section 6.1.
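The per-cycle untaint step amounts to clearing mask bits for instructions that are no longer speculative. A minimal sketch, assuming each in-flight tainted instruction is summarized by a hypothetical `(rs_out, is_speculative)` pair:

```python
from typing import Iterable, Tuple


def untaint(mask: int, in_flight: Iterable[Tuple[int, bool]]) -> int:
    """Clear the taint bit of every tainted instruction that became non-speculative."""
    for rs_out, is_speculative in in_flight:
        if not is_speculative:
            mask &= ~(1 << rs_out)     # set the bit indexed by rs_out to 0
    return mask


# Bits 5 and 7 are tainted; the producer of r5 has been resolved as safe.
new_mask = untaint(0b10100000, [(5, False), (7, True)])
```

After this call only bit 7 remains set, so consumers of r5 are no longer treated as secret-dependent.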

5.5. Decision and Mitigation Scheme

To mitigate cache-based speculative execution attacks, TrackRISC-Defense utilizes the decision scheme to identify the potential transmit instructions, and uses the mitigation scheme to delay/resume the execution of these instructions. An instruction is identified as a potential transmit instruction when it satisfies both of the following conditions: (i) the instruction is speculative and belongs to specific instruction types with the potential to act as unsafe transmitters, and (ii) the instruction has a data dependence on potential access instructions or their data-dependent instructions, i.e., the instruction must have at least one tainted input register. Whether a register is tainted or untainted can be checked through the global taint mask. A register is tainted if the corresponding bit indexed by the register identifier is set to 1 in the global taint mask. Once a potential transmit instruction is identified, the decision scheme informs the mitigation scheme to delay its execution. However, if the instruction subsequently satisfies either of the safe conditions, namely (i) the bits in the global taint mask indexed by every input register identifier of the instruction are set to 0, or (ii) the instruction is non-speculative, the decision scheme signals the mitigation scheme to resume the execution of the potential transmit instruction.
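The decision conditions reduce to a small predicate evaluated each cycle. The sketch below is illustrative only: the set of opcode names in `TRANSMIT_TYPES` is an assumption based on the examples named in the text (loads, stores, branches, divisions), not the exact set used by TrackRISC-Defense.

```python
from typing import Tuple

# Hypothetical set of instruction types that may act as unsafe transmitters.
TRANSMIT_TYPES = {"load", "store", "br", "div"}


def must_delay(op: str, is_speculative: bool, rs_in: Tuple[int, ...], mask: int) -> bool:
    """Return True while the instruction must be held back from issue."""
    has_tainted_input = any((mask >> r) & 1 for r in rs_in)
    # Delay only if the instruction is speculative, of a transmitter type,
    # and reads at least one tainted register. If any of these stops holding
    # (e.g., all inputs untainted, or the instruction becomes non-speculative),
    # the predicate turns False and execution may resume.
    return is_speculative and op in TRANSMIT_TYPES and has_tainted_input


taint_mask = 1 << 5   # register 5 holds potentially secret data
```

For example, `must_delay("br", True, (5, 0), taint_mask)` is true, while the same branch is released once it becomes non-speculative or once bit 5 is cleared from the mask.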

6. Microarchitecture

TrackRISC-Defense is implemented on top of a practical out-of-order RISC-V processor core, SonicBOOM. The microarchitecture of TrackRISC-Defense is shown in Figure 8, where a global taint mask denoted as the global_taint_mask signal is used to record the taint information of all tainted instructions in the CPU pipeline, a speculative mask denoted as the br_mask signal is used to record the speculative/non-speculative state of an instruction, the access_inst signal is for labeling every tainted instruction, and some trivial signals are omitted for brevity. TrackRISC-Defense consists of the hardware logic of tracking, taint, untaint, and mitigation. The tracking logic monitors taint propagation to identify tainted instructions, i.e., potential access instructions along with their data-dependent instructions. The taint/untaint logic is used to timely record/clear taint information recorded in a global taint mask. The decision and mitigation logic is utilized to identify potential transmit instructions and block their execution by selectively issuing these instructions to mitigate both implicit and explicit cache-based speculative execution attacks.

6.1. Global Taint Mask

Our global taint mask is used for two purposes: (i) identifying newly tainted instructions during taint propagation, and (ii) assisting in the delay and resumption of execution for all potential transmit instructions. By using the global taint mask, TrackRISC-Defense keeps its register-based hardware resource overhead low.
TrackRISC-Defense uses the volatile value of a global taint mask, stored in a hardware register, to represent the taint information of all tainted instructions. Taint information refers to the unsafe status of all registers that may hold secret-dependent operands across tainted instructions, and an instruction is considered tainted when it uses such registers as inputs. Tainted instructions include potential access instructions and their data-dependent instructions. Because its value is volatile, the global taint mask dynamically and globally stores the taint status of all tainted instructions within the pipeline. Taint information is automatically cleared from the global taint mask every cycle, according to the speculative states of tainted instructions. In hardware, the global taint mask is stored in a single hardware register constructed with RegInit, and each mask bit is indexed by a register identifier of an instruction. RegInit is a construct provided by Chisel [59], a hardware description language that supports agile hardware development.
Within the global taint mask, TrackRISC-Defense uses one-hot encoding combined with bitwise OR operations to accumulate and record the taint information of all tainted instructions. Each bit in the global taint mask is set to 1 or 0 to indicate the taint or untaint status of an instruction, with each mask bit indexed by the register identifier of an instruction. The workflow for generating the global taint mask is illustrated in Figure 9. The output register identifier of every tainted instruction is obtained by decoding the instruction. Each register identifier is then individually converted into a one-hot encoded mask, e.g., a register identifier with the value 5 is converted to a one-hot encoded mask with the binary value 100000 by shifting 1 left by 5 bits. The one-hot encoded masks corresponding to different tainted instructions are combined through bitwise OR operations to generate the global taint mask stored in a hardware register, thereby allowing the global taint mask to record the taint information of all tainted instructions. In the implementation, TrackRISC-Defense uses two global taint masks that separately record the taint information of integer and floating-point registers using the same method. For simplicity, we use a unified explanation to introduce the two global taint masks.
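The mask generation described above can be illustrated with a short software sketch. This is a behavioral model in Python rather than the Chisel hardware description, and the helper names `one_hot` and `build_taint_mask` are hypothetical names introduced only for illustration.

```python
from functools import reduce
from operator import or_


def one_hot(reg_id: int) -> int:
    """Return a mask with only bit `reg_id` set, i.e., 1 shifted left by reg_id."""
    return 1 << reg_id


def build_taint_mask(tainted_output_regs) -> int:
    """Bitwise-OR the one-hot masks of all tainted output register identifiers.

    Illustrative model only: in hardware the result would be held in a register.
    """
    return reduce(or_, (one_hot(r) for r in tainted_output_regs), 0)


# Register identifier 5 maps to binary 100000, as in the example in the text.
print(format(one_hot(5), "b"))                 # 100000
# Two tainted output registers (5 and 2) accumulate into one mask.
print(format(build_taint_mask([5, 2]), "b"))   # 100100
```

Because OR is idempotent, recording the same register identifier twice leaves the mask unchanged, which is why a single mask can summarize many tainted instructions.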

6.2. Tracking Logic

The tracking logic tracks taint propagation to identify newly tainted instructions, i.e., potential access instructions and their data-dependent instructions. TrackRISC-Defense assigns a taint label to every tainted instruction by activating its access_inst signal. A potential access instruction (i.e., a speculative load) is identified by a non-zero speculative mask (i.e., the br_mask signal) together with an activated uses_ldq signal, which indicates that the instruction is a load. TrackRISC-Defense uses a speculative mask extended from the original processor core to represent the speculative state of an instruction: a non-zero mask value indicates that the instruction is speculative, and a zero value indicates that it is non-speculative. To identify newly tainted instructions, TrackRISC-Defense tracks the data dependence between all pairs of instructions in an instruction queue that are processed in parallel under the superscalar CPU configuration. Data dependence is determined by whether the output logical register of one instruction and an input logical register of another share the same register identifier, e.g., ldst = lrs1, where the ldst signal is the output register identifier of one instruction and the lrs1 signal is an input register identifier of the other. In addition, instructions with any tainted input register are also treated as newly tainted instructions, and the taint status of input registers can be checked through the global taint mask (i.e., the global_taint_mask signal). The tracking logic also tracks transmitter flags to preliminarily identify the instruction types that match transmitter characteristics. Table 3 lists the instruction type signals for these transmitter flags, which are used in the subsequent decision and mitigation logic to identify potential transmit instructions.
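As a rough behavioral sketch of the tracking rules above (not the hardware implementation), the following Python model labels the instructions of one parallel group. The `Inst` record, the `prs` field for physical input registers, and the in-order scan are assumptions made for illustration; only the signal names ldst, lrs1, uses_ldq, br_mask, access_inst, and global_taint_mask come from the text.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    name: str
    ldst: int                 # logical output register identifier
    lrs: tuple                # logical input register identifiers (lrs1, lrs2, ...)
    prs: tuple                # physical input register identifiers (assumed field)
    uses_ldq: bool = False    # active when the instruction is a load
    br_mask: int = 0          # non-zero means speculative
    access_inst: bool = False # taint label set by the tracking logic


def track_group(group, global_taint_mask):
    """Label tainted instructions in one parallel group, scanned in program order."""
    tainted_ldst = set()  # logical outputs of tainted instructions seen so far
    for inst in group:
        is_access = inst.uses_ldq and inst.br_mask != 0        # speculative load
        dep_in_group = any(r in tainted_ldst for r in inst.lrs)  # e.g., ldst == lrs1
        dep_on_mask = any((global_taint_mask >> p) & 1 for p in inst.prs)
        inst.access_inst = bool(is_access or dep_in_group or dep_on_mask)
        if inst.access_inst:
            tainted_ldst.add(inst.ldst)
        else:
            # Assumption: an untainted overwrite hides an older tainted producer.
            tainted_ldst.discard(inst.ldst)
    return [i.name for i in group if i.access_inst]


group = [
    Inst("ld", ldst=10, lrs=(11,), prs=(40,), uses_ldq=True, br_mask=0b1),
    Inst("add", ldst=12, lrs=(10, 13), prs=(41, 42), br_mask=0b1),  # reads ld's output
    Inst("sub", ldst=14, lrs=(15, 16), prs=(7, 43), br_mask=0b1),   # reads tainted p7
    Inst("xor", ldst=17, lrs=(18, 19), prs=(44, 45)),               # independent
]
print(track_group(group, global_taint_mask=1 << 7))  # ['ld', 'add', 'sub']
```

The example separates the two sources of taint named in the text: a dependence found inside the current group via logical registers, and a dependence on an earlier tainted instruction found through the global taint mask.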

6.3. Taint and Untaint Logic

The taint logic is used to generate the global taint mask, which records the unsafe status of all registers across tainted instructions. A detailed analysis of the global taint mask is presented in Section 6.1. The taint logic captures pdst signals, i.e., identifiers of physical output registers associated with instructions, where each tainted instruction is identified by its activated access_inst signal. These register identifiers are then used as indices to update the global taint mask by setting the corresponding bits to 1, thereby recording the taint information.
The untaint logic updates the global taint mask by clearing the mask bits corresponding to tainted instructions that become non-speculative. The speculative state of an instruction is recorded in its speculative mask (i.e., the br_mask signal), and an instruction is non-speculative when its speculative mask value is equal to zero (i.e., when br_mask = 0). When a tainted instruction becomes non-speculative, its corresponding bit in the global taint mask, indexed by its physical output register identifier (i.e., the pdst signal), is cleared from 1 to 0, marking the register as untainted.
TrackRISC-Defense is implemented separately from the hardware logic of register renaming. Specifically, TrackRISC-Defense confirms the data dependence in the parallel instruction queue before the renaming stage, based on logical register identifiers. After the renaming stage, TrackRISC-Defense identifies newly tainted instructions that are data-dependent on previously recorded tainted instructions, using the global taint mask that is indexed by the renamed physical register identifiers. Our taint logic is applied subsequently to register renaming, after physical register identifiers have been renamed and confirmed. The global taint mask is dynamically updated every cycle based on the instruction information recorded in the reorder buffer. This instruction information is continuously refreshed to reflect the real-time state of pipeline execution.

6.4. Decision and Mitigation Logic

The decision logic is used to identify potential transmit instructions. Transmitter flags refer to instruction types that are commonly used as transmit instructions in speculative cache-based covert channels. Transmitter flags are listed in Table 3, and the decision logic identifies an instruction as a potential transmit instruction if it meets both conditions: (i) the instruction is associated with an activated transmitter flag and is in a speculative state, and (ii) any bit in the global taint mask, indexed by one or more input register identifiers of the instruction, is set to 1.
In the decision logic, the workflow for identifying a potential transmit instruction is illustrated in Figure 10, where $Inst_{\{spec, flag\}}$ refers to a speculative instruction that has been validated with an activated transmitter flag, and =/= denotes that the compared values are not equal. After identifying an instruction with the characteristics of $Inst_{\{spec, flag\}}$ and obtaining its input register identifiers, each input register identifier used by the instruction is individually converted into its corresponding one-hot encoded mask. For example, an input register identifier with the value 2 is converted to a one-hot encoded mask with the binary value 100 by shifting 1 left by 2 bits. The decision logic then performs a bitwise AND between each one-hot encoded mask and the global taint mask to determine whether the bit in the global taint mask indexed by any input register identifier of the instruction is set to 1. If any such bit is set to 1, the instruction is identified as a potential transmit instruction.
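The per-register check in this workflow can be sketched as follows. This is an illustrative Python model under the assumption that an instruction may have several input registers; the function name `is_potential_transmit` and its parameters are hypothetical.

```python
def is_potential_transmit(is_spec, has_flag, input_regs, global_taint_mask):
    """Model of the decision check for one instruction.

    is_spec / has_flag: the instruction is speculative and has an activated
    transmitter flag. input_regs: its input register identifiers.
    """
    if not (is_spec and has_flag):
        return False
    # AND each one-hot input mask with the global taint mask; any hit taints it.
    return any(((1 << r) & global_taint_mask) != 0 for r in input_regs)


mask = 0b100100  # registers 2 and 5 are tainted
print(is_potential_transmit(True, True, (2, 3), mask))   # True: register 2 is tainted
print(is_potential_transmit(True, True, (3, 4), mask))   # False: no tainted input
print(is_potential_transmit(False, True, (2, 3), mask))  # False: non-speculative
```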
The mitigation logic delays the execution of potential transmit instructions until these instructions are deemed safe, so as to mitigate cache-based speculative execution attacks. The decision logic signals the mitigation logic to trigger the delay or resumption of potential transmit instructions. The delay operation temporarily withholds the affected instructions from being issued to the execution units [23]. For a potential transmit instruction, TrackRISC-Defense considers the instruction safe and resumes its execution if (i) the instruction becomes non-speculative (i.e., when br_mask = 0), or (ii) no bit in the global taint mask, indexed by any input register identifier of the instruction, is set to 1.

6.5. Key Procedural Analysis

Figure 11 shows the procedure used to delay or resume the execution of transmit instructions in TrackRISC-Defense. The procedure consists of two phases. During the first phase, the taint information stored in the global taint mask is updated by the related instructions through their br_mask and rob_val signals; descriptions of the main signals in TrackRISC-Defense are provided in Table 4. Specifically, TrackRISC-Defense augments the global taint mask with taint information from newly tainted instructions such as access instructions ($I_{access}$), and removes the taint information of previously tainted instructions that have become non-speculative. As in state-of-the-art hardware defenses [6,7,8], access instructions refer to speculative load instructions. The value of the global taint mask can be updated cycle by cycle, and the mask update operation is denoted as $OP_1$, which is described in detail in Section 6.5.1. Then, during the second phase, the decision and mitigation logic uses the updated, volatile value of the global taint mask to delay or resume the execution of potential transmit instructions. The delayed/resumed execution logic is denoted as $OP_2$, which is presented in detail in Section 6.5.2. The values of rob_val, br_mask, and global_taint_mask may change according to the original condition logic that controls these signals in the processor core, thereby enabling the delayed or resumed execution of the potential transmit instructions.

6.5.1. Phase 1: Taint Information Update Logic

The first phase involving O P 1 in Figure 11 is described in detail as follows, illustrating how the global taint mask records or clears taint information from all relevant instructions. We present a symbolic representation of how our global taint mask is generated, as given by the following:
$$T(inst) = \begin{cases} 1 & \text{(Taint Status)} \\ 0 & \text{(Untaint Status)} \end{cases}$$
$$global\_taint\_mask = (T(inst_1) \ll rs_1) \mid (T(inst_2) \ll rs_2) \mid \cdots \mid (T(inst_N) \ll rs_N),$$
where $\{rs_1, rs_2, \ldots, rs_N\}$ is the set of output register identifiers that corresponds to the set of tainted instructions $\{inst_1, inst_2, \ldots, inst_N\}$, together with their taint status set $\{T(inst_1), \ldots, T(inst_N)\}$, $\mid$ denotes the bitwise OR operator, and $\ll$ is the left shift operator. $\{inst_1, inst_2, \ldots, inst_N\}$ corresponds to the previous tainted instructions whose instruction information is recorded in the reorder buffer (ROB). Tainted instructions include potential access instructions ($I_{access}$). TrackRISC-Defense also accounts for the need to update the global taint mask with the current parallel instruction queue. Since the taint information update logic is similar for both the current parallel instruction queue and the previous instructions in the ROB, we focus on describing the update logic in the ROB for brevity.
In the ROB, the rob_val signal of an instruction becomes invalid when the instruction encounters situations such as commit. Our global taint mask does not record the taint information of instructions with an inactive rob_val signal. The br_mask signal denotes the speculative state of an instruction, and a zero-value br_mask signal means that the instruction is non-speculative. The values of both the rob_val and br_mask signals are dynamically updated by the original condition logic that controls these signals. In addition, these signals are tracked within the reorder buffer every cycle so as to record the real-time status of every instruction during pipeline execution. We provide a symbolic representation to illustrate how the taint information stored in the global taint mask is updated using the br_mask and rob_val signals as follows:
Taint/Untaint Condition Logic:
$$\begin{aligned} T'(inst_1) &= T(inst_1) \;\&\&\; rob\_val(inst_1) \;\&\&\; (br\_mask(inst_1) \neq 0), \\ T'(inst_2) &= T(inst_2) \;\&\&\; rob\_val(inst_2) \;\&\&\; (br\_mask(inst_2) \neq 0), \\ &\;\;\vdots \\ T'(inst_N) &= T(inst_N) \;\&\&\; rob\_val(inst_N) \;\&\&\; (br\_mask(inst_N) \neq 0), \end{aligned}$$
Taint Information Update Logic:
$$global\_taint\_mask = (T'(inst_1) \ll rs_1) \mid (T'(inst_2) \ll rs_2) \mid \cdots \mid (T'(inst_N) \ll rs_N),$$
where $\{T'(inst_1), \ldots, T'(inst_N)\}$ denotes the set of dynamically updated taint statuses corresponding to the set of tainted instructions $\{inst_1, inst_2, \ldots, inst_N\}$, and TrackRISC-Defense determines whether to retain or remove the taint status of each instruction according to its associated volatile rob_val and br_mask signals, applying the logical AND operator ($\&\&$). Then the value of the global taint mask, indexed by the set of output register identifiers $\{rs_1, rs_2, \ldots, rs_N\}$ corresponding to the instructions $\{inst_1, inst_2, \ldots, inst_N\}$, is updated via $\{T'(inst_1), \ldots, T'(inst_N)\}$, bitwise OR operators ($\mid$), and left shift operators ($\ll$). The untaint condition logic is thus based on the non-speculative state of the related instructions and their rob_val signals. For example, when $br\_mask(inst_1) = 0$ or $rob\_val(inst_1) = 0$, $T'(inst_1)$ is equal to 0, and the corresponding bit in the global taint mask is set to 0 to indicate the untainted status of the instruction ($inst_1$).
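To make the Phase 1 update concrete, the following Python sketch recomputes the mask from a list of ROB entries in one modeled cycle. It is a software illustration of the symbolic logic, not the hardware: the `RobEntry` record and the function name are hypothetical, while rob_val, br_mask, and the taint status T follow the text.

```python
from dataclasses import dataclass


@dataclass
class RobEntry:
    taint: bool    # T(inst): taint status recorded for this instruction
    rob_val: bool  # inactive once the entry is no longer valid (e.g., committed)
    br_mask: int   # zero means the instruction is non-speculative
    rs: int        # output register identifier used as the mask index


def update_global_taint_mask(rob) -> int:
    """Recompute the global taint mask from all ROB entries for one cycle."""
    mask = 0
    for e in rob:
        # T'(inst) = T(inst) && rob_val(inst) && (br_mask(inst) != 0)
        t_new = e.taint and e.rob_val and e.br_mask != 0
        mask |= int(t_new) << e.rs
    return mask


rob = [
    RobEntry(taint=True, rob_val=True, br_mask=0b01, rs=5),
    RobEntry(taint=True, rob_val=True, br_mask=0b10, rs=2),
    RobEntry(taint=False, rob_val=True, br_mask=0b01, rs=7),
]
print(format(update_global_taint_mask(rob), "b"))  # 100100

rob[0].br_mask = 0  # the first instruction becomes non-speculative
print(format(update_global_taint_mask(rob), "b"))  # 100

rob[1].rob_val = False  # the second entry is no longer valid
print(format(update_global_taint_mask(rob), "b"))  # 0
```

Recomputing the mask from scratch each cycle, rather than clearing individual bits, mirrors the "volatile value" behavior described above: a bit stays set only while some valid, speculative, tainted instruction still writes that register.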

6.5.2. Phase 2: Delayed/Resumed Execution Logic

The second phase involving O P 2 in Figure 11 is introduced in detail as follows, illustrating the workflow of the delayed/resumed execution logic. A symbolic representation of the decision and mitigation logic for delaying/resuming a potential transmit instruction is provided as follows:
$$\begin{aligned} delayed\_inst &= uses\_stq \;||\; uses\_ldq \;||\; is\_br \;||\; \cdots \;||\; FSQRT\_D, \\ cannot\_allocate &= delayed\_inst \;\&\&\; (br\_mask \neq 0) \;\&\&\; \big((global\_taint\_mask \;\&\; (1 \ll rs)) \neq 0\big), \end{aligned}$$
where an active delayed_inst signal means that the instruction belongs to one of the transmit instruction types in Table 3. The transmit instruction may be a memory instruction that can potentially enable explicit cache-based speculative execution attacks, or a control-flow, division, or square-root instruction that may facilitate various implicit cache-based speculative execution attacks through control-flow decisions or resource contention. $||$ is the logical OR operator, $\&\&$ is the logical AND operator, $\&$ is the bitwise AND operator, and $\ll$ is the left shift operator. $br\_mask \neq 0$ denotes that the instruction is speculative, and $(global\_taint\_mask \;\&\; (1 \ll rs)) \neq 0$ means that the instruction is tainted, since the instruction is data-dependent on potential access instructions ($I_{access}$). Specifically, the corresponding bit in the global taint mask, indexed by an input register identifier ($rs$) of the instruction, is recorded as 1. An active cannot_allocate signal denotes that the instruction is a potential transmit instruction requiring delayed execution. The br_mask signal of the instruction is dynamically updated and recorded every cycle. When the $br\_mask$ of a delayed instruction changes from non-zero to zero, the cannot_allocate signal becomes inactive, allowing the instruction to resume execution. In addition, when the result of $(global\_taint\_mask \;\&\; (1 \ll rs))$ changes from non-zero to zero for every input register of the instruction, meaning that all bits corresponding to the instruction's input registers in the global taint mask are cleared, the instruction no longer has a data dependence on potential access instructions ($I_{access}$). Consequently, the cannot_allocate signal becomes inactive, allowing the instruction to resume execution.
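The delay and resume behavior of Phase 2 can be modeled in a few lines of Python. This is a hedged sketch: delayed_inst is passed in as a single Boolean (the OR of the Table 3 transmitter flags, which are not enumerated here), and checking several input registers with `any` is an assumption about how the per-register test is combined.

```python
def cannot_allocate(delayed_inst, br_mask, global_taint_mask, input_regs) -> bool:
    """Return True when the instruction must be delayed in this modeled cycle."""
    speculative = br_mask != 0
    tainted = any((global_taint_mask & (1 << rs)) != 0 for rs in input_regs)
    return bool(delayed_inst and speculative and tainted)


mask = (1 << 5) | (1 << 2)  # registers 5 and 2 are tainted

# Speculative transmitter-type instruction reading tainted register 2: delayed.
print(cannot_allocate(True, 0b01, mask, (2, 3)))              # True
# Resume condition (i): br_mask becomes zero (non-speculative).
print(cannot_allocate(True, 0, mask, (2, 3)))                 # False
# Resume condition (ii): the taint bit of register 2 is cleared.
print(cannot_allocate(True, 0b01, mask & ~(1 << 2), (2, 3)))  # False
# Instruction types outside the transmitter flags are never delayed.
print(cannot_allocate(False, 0b01, mask, (2, 3)))             # False
```

Since the signal is re-evaluated from the current br_mask and mask values, no separate "resume" event is needed in this model: the instruction becomes issuable as soon as either input changes.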

7. Evaluation Results and Analysis

This section presents the evaluation results and analysis of TrackRISC-Defense, focusing on its security, performance, and hardware resource utilization. Compared to a representative existing defense, TrackRISC-Defense offers enhanced security while efficiently leveraging register-based hardware resources.

7.1. Experimental Setup

7.1.1. Platform Configuration

TrackRISC-Defense is built on top of SonicBOOM [28], an advanced RISC-V out-of-order processor core. The processor core is instantiated through the Chipyard [60] platform. The security evaluation is conducted using both explicit and implicit cache-based speculative execution attacks based on Spectre Variant 1. The source code for the explicit attacks is adopted from Gonzalez et al. [61] and Sabbagh et al. [62]. Moreover, an implicit attack variant is implemented on RISC-V for further security verification. The performance evaluation is conducted using the SPEC2017 [63] benchmarks by measuring their execution time on both unmodified and defense-optimized processor cores. The SPEC2017 [63] benchmark suite is an industry-standard set of benchmarks for evaluating the performance of next-generation CPU microarchitectures. The performance evaluation platform is established by configuring a Linux operating system running on the FPGA. The hardware resource evaluation is based on the VCU118 FPGA board, where the core frequency is set to 75 MHz.

7.1.2. Baseline Setup

Table 5 presents the CPU configuration used in the evaluation of TrackRISC-Defense, where the large core configuration is adopted consistently across both the baselines and TrackRISC-Defense. The baselines used in this work are the original SonicBOOM RISC-V processor core [28] and a variant of SonicBOOM integrated with SpecTerminator-v1 [23]. We implement SpecTerminator-v1 [23] according to its defense mechanism and use it as a baseline in our evaluation. The baselines and TrackRISC-Defense are introduced in Table 6. To ensure a fair comparison, we implement both SpecTerminator-v1 [23] and TrackRISC-Defense using the same hardware logic for tracking, taint, untaint, decision, and mitigation, with the only difference being the types of transmit instructions. The implemented SpecTerminator-v1 considers loads and stores as unsafe transmitters, whereas TrackRISC-Defense considers a broader set of instruction types as unsafe transmitters, as shown in Table 3.

7.2. Security Evaluation and Analysis

7.2.1. Security Evaluation Results

Both explicit and implicit cache-based speculative execution attacks are implemented to experimentally verify the security of the baselines and TrackRISC-Defense. The pseudocode of the implemented implicit and explicit attacks is presented in Listing 1 and Listing 2, respectively, with detailed descriptions provided in Section 2.2.1 and Section 2.2.2.
For the explicit cache-based speculative execution attacks, the attack logs from (i) the unprotected CPU and (ii) the CPU protected by SpecTerminator-v1 [23] or TrackRISC-Defense are shown in Listings 3 and 4, respectively. The secret string #v1/SecretDataInCPU! is leaked in Listing 3, while it is not leaked in Listing 4. For conciseness, we omit the incorrectly inferred secret characters during the explicit attacks in Listings 3 and 4.
For the implicit cache-based speculative execution attacks, Figure 12 illustrates the attack results from the baselines and the CPU protected by TrackRISC-Defense. As we can see in the baselines of the unprotected CPU and the CPU protected by SpecTerminator-v1 [23], a distinctive access time below the predefined cache hit threshold is observed when attackers correctly guess the secret value. However, in TrackRISC-Defense, all access times represented on the Y-axis exceed the predefined cache hit threshold as the X-axis changes. As a result, even if attackers accidentally try the correct secret value, no relevant covert channel based on timing conflicts can be established, and information leakage is prevented.
Listing 3. Explicit attack log (Spectre Variant 1) from the unprotected CPU (SonicBOOM [28]).
Listing 4. Explicit attack log (Spectre Variant 1) from the CPU protected by SpecTerminator-v1 [23] or TrackRISC-Defense.

7.2.2. Security Analysis

In explicit cache-based speculative execution attacks, secret-dependent memory instructions (i.e., loads and stores) are unsafe transmit instructions that can enable attacker-observable microarchitectural changes in speculative covert channels of caches and TLBs. Both SpecTerminator-v1 [23] and TrackRISC-Defense block transmitters in explicit attacks by preventing the execution of speculative memory instructions using speculatively secret-dependent data. Therefore, both SpecTerminator-v1 [23] and TrackRISC-Defense can mitigate explicit cache-based speculative execution attacks.
However, in implicit cache-based speculative execution attacks, non-memory instructions, such as conditional branch instructions, can also act as unsafe transmitters by indirectly causing secret-related microarchitectural changes. Consequently, SpecTerminator-v1 [23] cannot mitigate such implicit attacks by only blocking memory instructions that may act as transmitters. In contrast, TrackRISC-Defense is able to mitigate implicit cache-based speculative execution attacks by additionally blocking the non-memory transmit instructions associated with such implicit attacks. The potential transmit instructions blocked by TrackRISC-Defense are listed in Table 3. The claim that TrackRISC-Defense mitigates implicit LRU/MSHR/EU attacks [40,41] is supported by both indirect experimental and direct theoretical mitigation evidence. Direct experimental evidence of mitigating these LRU/MSHR/EU attacks is left for future work. On the one hand, the experimental attack results (Listing 3 vs. Listing 4, and Figure 12) directly demonstrate that TrackRISC-Defense can mitigate explicit speculative cache-based execution attacks [1] by delaying the execution of memory transmit instructions, and can mitigate the experimental implicit speculative cache-based execution attacks [34] by delaying the execution of non-memory transmit instructions (i.e., control-flow instructions). On the other hand, TrackRISC-Defense builds on a recognized defense strategy that prevents the execution of multiple transmit instructions associated with various types of cache-based speculative execution attacks [1,6]. To mitigate implicit LRU attacks [40], TrackRISC-Defense delays the execution of non-memory instructions (i.e., control-flow instructions) to restrict their ability to induce attacker-controlled LRU microarchitectural changes via control-flow decisions.
TrackRISC-Defense mitigates implicit MSHR attacks [41] by delaying the execution of memory transmit instructions, thereby preventing these instructions from causing attacker-controlled MSHR resource contention. To mitigate implicit EU attacks [41], TrackRISC-Defense additionally delays the execution of non-memory transmit instructions (e.g., division-based and load instructions) to prevent these unsafe instructions from triggering EU resource contention.

7.3. Performance Evaluation and Analysis

The CPU performance is evaluated by measuring the execution time of SPEC2017 [63] benchmarks run on the baseline CPUs and the CPU protected by TrackRISC-Defense. The performance overhead is computed according to the following formula:
$$\text{Performance Overhead} = \frac{\text{UserTime}(\text{protected})}{\text{UserTime}(\text{unprotected})} - 1,$$
where UserTime(protected) refers to the execution time of a benchmark on a CPU protected by a hardware defense, while UserTime(unprotected) is the execution time of a benchmark running on the original unprotected CPU.
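As a small worked illustration of this formula, the following Python sketch computes the overhead for one benchmark. The two timing values are hypothetical numbers chosen for illustration and are not measurements from this work.

```python
def performance_overhead(user_time_protected: float, user_time_unprotected: float) -> float:
    """Relative slowdown: UserTime(protected) / UserTime(unprotected) - 1."""
    return user_time_protected / user_time_unprotected - 1.0


# Hypothetical user times in seconds for a single benchmark run.
overhead = performance_overhead(119.4, 100.0)
print(f"{overhead:.1%}")  # 19.4%
```

An overhead of 0 means the protected CPU takes exactly as long as the unprotected one.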
Our baseline, i.e., the CPU protected by SpecTerminator-v1 [23], incurs an average performance overhead of 13.8% on SPEC2017 benchmarks. In contrast, TrackRISC-Defense provides stronger security at the cost of an average performance overhead of 19.4%, which is higher than that of SpecTerminator-v1 [23]. Figure 13 presents the detailed performance overheads of the baselines and TrackRISC-Defense on individual SPEC2017 benchmarks, with each benchmark incurring a different overhead.
The performance overheads of SpecTerminator-v1 [23] and TrackRISC-Defense stem from the delayed execution of potential transmit instructions in the CPU pipeline. SpecTerminator-v1 [23] only blocks memory transmitters, i.e., specific speculative loads and stores, to mitigate explicit cache-based speculative execution attacks. In contrast, TrackRISC-Defense additionally blocks non-memory instructions that may act as transmitters to mitigate both explicit and implicit cache-based speculative execution attacks. Table 3 presents the instruction types of potential transmitters blocked by TrackRISC-Defense. In exchange for stronger security than SpecTerminator-v1 [23], TrackRISC-Defense incurs a performance overhead of 19.4%, compared with 13.8% for SpecTerminator-v1 [23]. As shown in Figure 13, performance overheads vary across individual SPEC2017 benchmarks, because different numbers of potential transmit instructions are identified and delayed when these benchmarks run on the CPU. Figure 14 illustrates the performance overheads of SpecTerminator-v1 [23] vs. TrackRISC-Defense on SPEC2017 benchmarks, with error bars representing the results of three repeated trials for each defense. The average performance overheads of SpecTerminator-v1 [23] on the SPEC2017 benchmarks are 13.8%, 13.5%, and 14.0% across the three trials, respectively. The average performance overheads of TrackRISC-Defense on the SPEC2017 benchmarks are 19.4%, 19.5%, and 19.6% across the three trials, respectively. The variation across trials is small (the three averages differ by at most 0.5 percentage points for SpecTerminator-v1 and 0.2 percentage points for TrackRISC-Defense), so our conclusions are not significantly affected, which further supports the robustness of the experimental results.
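The spread of the three reported trial averages can be checked with a short Python sketch. The input values are the averages quoted above; treating them as three samples and using the sample standard deviation is our assumption, since the exact statistic behind the error bars is not specified in the text.

```python
from statistics import mean, stdev


def summarize(trials):
    """Return (mean, sample standard deviation) of per-trial average overheads in %."""
    return mean(trials), stdev(trials)


spec_terminator = [13.8, 13.5, 14.0]  # SpecTerminator-v1, three trials
trackrisc = [19.4, 19.5, 19.6]        # TrackRISC-Defense, three trials

st_mean, st_std = summarize(spec_terminator)
tr_mean, tr_std = summarize(trackrisc)
print(f"SpecTerminator-v1: mean {st_mean:.2f}%, std {st_std:.2f}")  # mean 13.77%, std 0.25
print(f"TrackRISC-Defense: mean {tr_mean:.2f}%, std {tr_std:.2f}")  # mean 19.50%, std 0.10
print(f"Gap between means: {tr_mean - st_mean:.2f} points")         # 5.73 points
```

Under this assumption, the gap between the two defenses (about 5.7 percentage points) is much larger than the trial-to-trial variation of either defense.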

7.4. Hardware Resource Evaluation and Analysis

Table 7 shows the hardware resource utilization of TrackRISC-Defense. As we can see, TrackRISC-Defense incurs negligible register-based resource overhead, which is 0.4% of the hardware registers (i.e., flip-flops) on the FPGA. Our register-based resource overhead primarily arises from the storage and utilization of the global taint mask, with potential additional overhead from maintaining taint labels in the reorder buffer (ROB). TrackRISC-Defense records taint information in a global taint mask which is efficiently stored in a hardware register. The hardware resource overhead of look-up tables in TrackRISC-Defense is mainly due to the logic for tracking tainted instructions, and tracking data dependencies in both parallel and non-parallel instructions.

7.5. Baselines vs. TrackRISC-Defense

Our baselines, which are vulnerable to implicit cache-based speculative execution attacks, are introduced in Table 6. Table 8 presents the comparison results between the baselines and TrackRISC-Defense in terms of security and performance. In terms of security, the original SonicBOOM [28] has been experimentally verified to be vulnerable to both explicit and implicit cache-based speculative execution attacks. SpecTerminator-v1 [23] adopts a representative existing defense that delays the execution of speculative memory instructions that use secret-dependent operands, effectively preventing explicit cache-based speculative execution attacks. However, the experimental results indicate that this representative existing defense exhibits notable limitations in mitigating implicit cache-based speculative execution attacks. In contrast, TrackRISC-Defense provides stronger protection and can mitigate both implicit and explicit cache-based speculative execution attacks by delaying the execution of both speculative memory and non-memory instructions that may act as transmitters in implicit attacks. A comparison between TrackRISC-Defense and further prior hardware defenses is shown in Table 2. TrackRISC-Defense incurs a higher performance overhead than SpecTerminator-v1 [23], since TrackRISC-Defense additionally delays the execution of non-memory instructions in implicit attacks, i.e., TrackRISC-Defense delays more instructions than SpecTerminator-v1 [23] to prevent more attacks.

7.6. STT vs. TrackRISC-Defense

TrackRISC-Defense is built on top of STT [6], and our design goal is to make efficient use of register-based hardware resources when implementing the defense mechanism of STT [6], which blocks all known transmitters in both explicit and implicit attacks. STT [6] is a state-of-the-art hardware defense demonstrating strong security and performance. Compared to STT [6], TrackRISC-Defense eliminates the hardware logic required to track the youngest tainted access instruction for each potential transmit instruction. Instead, TrackRISC-Defense relies on an efficient global taint mask to determine when these instructions should be delayed or resumed. In STT, the taint or untaint status of the youngest access instruction is critical for deciding whether a potential transmit instruction can safely resume execution. Here, the youngest access instruction refers to the most recently fetched potential access instruction that precedes the transmit instruction in program order and on which the transmit instruction has a data dependence. In addition, STT [6] is validated on the gem5 simulator [43]. To obtain realistic hardware evaluation results, TrackRISC-Defense is implemented on a RISC-V processor core and evaluated on an FPGA running Linux. TrackRISC-Defense employs a global taint mask, efficiently stored in a hardware register, to record the taint status of all tainted instructions. The mask allows TrackRISC-Defense to incur a negligible register-based hardware overhead of less than 1% on the FPGA.

8. Conclusions

Cache-based speculative execution attacks present a significant security risk to modern processors, with implicit variants posing a particular challenge by substantially weakening the effectiveness of existing hardware defenses designed for speculative cache-based covert channels. In this paper, we propose TrackRISC, an attack–defense framework comprising (i) a refined attack flow model for analyzing and reasoning about implicit cache-based speculative execution attacks, and (ii) a security-enhanced tracking and mitigation microarchitecture to mitigate both implicit and explicit cache-based speculative execution attacks. Our model reveals why implicit security vulnerabilities can compromise the security of the existing hardware defenses, with experimental evidence showing that a representative existing defense remains inadequate against implicit cache-based speculative execution attacks. Moreover, a security-enhanced tracking and mitigation microarchitecture is implemented to mitigate both implicit and explicit speculative cache-based covert channels. The technique of a global taint mask is further incorporated to ensure that TrackRISC-Defense incurs negligible register-based hardware resource overhead on FPGA.

9. Discussion

Although this paper focuses on protecting single-core processors, TrackRISC-Defense, which is based on the defense mechanism of STT [6], could also be adapted for multicore environments with shared caches. In such environments, each core needs to independently handle the delayed execution of potentially unsafe instructions without interference; for example, the speculative information maintained by one processor core should remain unaffected by the activities of other cores. TrackRISC-Defense is implemented entirely within the processor core, without modifications to shared caches, which reduces its impact on complex cache behaviors such as those arising from cache coherence in multicore environments. The relevant prefetchers could also be partially disabled to prevent secret-related data from being fetched, so as to mitigate prefetcher-based speculative covert channels [39].
TrackRISC-Defense demonstrates good security and performance in mitigating both implicit and explicit cache-based speculative execution attacks. To further strengthen its robustness against unknown vulnerabilities, hardware could integrate TrackRISC-Defense with an additional high-security hardware defense, Eager Delay [12], which can mitigate both known and unknown speculative covert channels by preventing attackers from accessing secret data. However, Eager Delay incurs significantly higher performance overhead than TrackRISC-Defense. A potential adaptive strategy is therefore to leverage instruction set extensions that allow software to selectively invoke TrackRISC-Defense to mitigate known vulnerabilities and to trigger Eager Delay to address unknown vulnerabilities. In this manner, the processor could dynamically switch between heterogeneous hardware defenses, guided by the secret data and the risk of software code snippets before their execution in hardware [26], so as to achieve better security and performance in modern computing systems.

Author Contributions

Conceptualization, Z.Z., A.I.S., P.S.Y.H., and R.C.C.C.; methodology, Z.Z. and A.I.S.; software, Z.Z.; validation, Z.Z., Y.S., and J.H.; investigation, Z.Z., Y.S., and J.H.; resource, R.C.C.C.; data curation, Z.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, Z.Z., A.I.S., Y.S., P.S.Y.H., and R.C.C.C.; visualization, Z.Z.; supervision, R.C.C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Hong Kong Innovation and Technology Commission (ITF Seed Fund ITS/098/22) and City University of Hong Kong (Project Grant No. 9440356).

Data Availability Statement

The original contributions presented in the study are included in the article. The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ALU: Arithmetic Logic Unit
AVX: Advanced Vector Extensions
BTB: Branch Target Buffer
CPU: Central Processing Unit
CSR: Control and Status Register
D-TLB: Data Translation Lookaside Buffer
EU: Execution Unit
FP: Floating-Point
FPGA: Field-Programmable Gate Array
ISA: Instruction Set Architecture
MSHR: Miss Status Holding Register
PHT: Pattern History Table
PMU: Performance Monitor Unit
RISC: Reduced Instruction Set Computer
ROB: Reorder Buffer
RS: Reservation Station
RTL: Register-Transfer-Level
RSB: Return Stack Buffer
TLB: Translation Lookaside Buffer

Appendix A

An estimate of the fraction of delayed instructions, classified as memory or non-memory, is shown in Figure A1 [64]. TrackRISC-Defense blocks memory transmit instructions to mitigate explicit cache-based speculative execution attacks and blocks non-memory transmit instructions to mitigate implicit cache-based speculative execution attacks.
Figure A1. An estimation of delayed instruction fraction in TrackRISC-Defense.

References

  1. Kocher, P.; Horn, J.; Fogh, A.; Genkin, D.; Gruss, D.; Haas, W.; Hamburg, M.; Lipp, M.; Mangard, S.; Prescher, T.; et al. Spectre attacks: Exploiting speculative execution. Commun. ACM 2020, 63, 93–101.
  2. Koruyeh, E.M.; Khasawneh, K.N.; Song, C.; Abu-Ghazaleh, N. Spectre returns! speculation attacks using the return stack buffer. In Proceedings of the 12th USENIX Workshop on Offensive Technologies (WOOT 18), Baltimore, MD, USA, 13–14 August 2018.
  3. Maisuradze, G.; Rossow, C. ret2spec: Speculative execution using return stack buffers. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 2109–2122.
  4. Lampson, B.W. A note on the confinement problem. Commun. ACM 1973, 16, 613–615.
  5. Wang, Z.; Lee, R.B. Covert and side channels due to processor architecture. In Proceedings of the 2006 22nd Annual Computer Security Applications Conference (ACSAC’06), Miami Beach, FL, USA, 11–15 December 2006; pp. 473–482.
  6. Yu, J.; Yan, M.; Khyzha, A.; Morrison, A.; Torrellas, J.; Fletcher, C.W. Speculative taint tracking (STT): A comprehensive protection for speculatively accessed data. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Columbus, OH, USA, 12–16 October 2019; pp. 954–968.
  7. Barber, K.; Bacha, A.; Zhou, L.; Zhang, Y.; Teodorescu, R. Specshield: Shielding speculative data from microarchitectural covert channels. In Proceedings of the 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), Seattle, WA, USA, 23–26 September 2019; pp. 151–164.
  8. Yu, J.; Mantri, N.; Torrellas, J.; Morrison, A.; Fletcher, C.W. Speculative data-oblivious execution: Mobilizing safe prediction for safe and efficient speculative execution. In Proceedings of the 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), Virtual Event, 30 May–3 June 2020; pp. 707–720.
  9. Yan, M.; Choi, J.; Skarlatos, D.; Morrison, A.; Fletcher, C.; Torrellas, J. Invisispec: Making speculative execution invisible in the cache hierarchy. In Proceedings of the 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, 20–24 October 2018; pp. 428–441.
  10. Kiriansky, V.; Lebedev, I.; Amarasinghe, S.; Devadas, S.; Emer, J. DAWG: A defense against cache timing attacks in speculative execution processors. In Proceedings of the 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, 20–24 October 2018; pp. 974–987.
  11. Li, P.; Zhao, L.; Hou, R.; Zhang, L.; Meng, D. Conditional speculation: An effective approach to safeguard out-of-order execution against spectre attacks. In Proceedings of the 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), Washington, DC, USA, 16–20 February 2019; pp. 264–276.
  12. Sakalis, C.; Kaxiras, S.; Ros, A.; Jimborean, A.; Sjalander, M. Efficient invisible speculative execution through selective delay and value prediction. In Proceedings of the 46th International Symposium on Computer Architecture, Phoenix, AZ, USA, 22–26 June 2019; pp. 723–735.
  13. Weisse, O.; Neal, I.; Loughlin, K.; Wenisch, T.F.; Kasikci, B. NDA: Preventing speculative execution attacks at their source. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Columbus, OH, USA, 12–16 October 2019; pp. 572–586.
  14. Khasawneh, K.N.; Koruyeh, E.M.; Song, C.; Evtyushkin, D.; Ponomarev, D.; Abu-Ghazaleh, N. Safespec: Banishing the spectre of a meltdown with leakage-free speculation. In Proceedings of the 2019 56th ACM/IEEE Design Automation Conference (DAC), Las Vegas, NV, USA, 2–6 June 2019; pp. 1–6.
  15. Taram, M.; Venkat, A.; Tullsen, D. Context-sensitive fencing: Securing speculative execution via microcode customization. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Providence, RI, USA, 13–17 April 2019; pp. 395–410.
  16. Deng, S.; Xiong, W.; Szefer, J. Secure tlbs. In Proceedings of the 46th International Symposium on Computer Architecture, Phoenix, AZ, USA, 22–26 June 2019; pp. 346–359.
  17. Saileshwar, G.; Qureshi, M.K. Cleanupspec: An “undo” approach to safe speculation. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Columbus, OH, USA, 12–16 October 2019; pp. 73–86.
  18. Ainsworth, S.; Jones, T.M. Muontrap: Preventing cross-domain spectre-like attacks by capturing speculative state. In Proceedings of the 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), Virtual Event, 30 May–3 June 2020; pp. 132–144.
  19. Kim, S.; Mahmud, F.; Huang, J.; Majumder, P.; Christou, N.; Muzahid, A.; Tsai, C.C.; Kim, E.J. Revice: Reusing victim cache to prevent speculative cache leakage. In Proceedings of the 2020 IEEE Secure Development (SecDev), Atlanta, GA, USA, 28–30 September 2020; pp. 96–107.
  20. Wang, X.; Zhao, Z.; Xu, D.; Zhang, Z.; Hao, Q.; Liu, M.; Si, Y. Two-stage checkpoint based security monitoring and fault recovery architecture for embedded processor. Electronics 2020, 9, 1165.
  21. Loughlin, K.; Neal, I.; Ma, J.; Tsai, E.; Weisse, O.; Narayanasamy, S.; Kasikci, B. DOLMA: Securing Speculation with the Principle of Transient Non-Observability. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Virtual Event, 11–13 August 2021; pp. 1397–1414.
  22. Choudhary, R.; Yu, J.; Fletcher, C.; Morrison, A. Speculative privacy tracking (SPT): Leaking information from speculative execution without compromising privacy. In Proceedings of the MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Virtual Event, 18–22 October 2021; pp. 607–622.
  23. Jin, H.; He, Z.; Qiang, W. SpecTerminator: Blocking speculative side channels based on instruction classes on RISC-V. ACM Trans. Archit. Code Optim. 2023, 20, 15.
  24. Jauch, T.; Wezel, A.; Fadiheh, M.R.; Schmitz, P.; Ray, S.; Fung, J.M.; Fletcher, C.W.; Stoffel, D.; Kunz, W. Secure-by-construction design methodology for CPUs: Implementing secure speculation on the RTL. In Proceedings of the 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Francisco, CA, USA, 28 October–2 November 2023; pp. 1–9.
  25. Li, L.; Huang, J.; Feng, L.; Wang, Z. PREFENDER: A prefetching defender against cache side channel attacks as a pretender. IEEE Trans. Comput. 2024, 73, 1457–1471.
  26. Zhang, Z.; Liu, Y.; She, Y.; Sanka, A.I.; Hung, P.S.; Cheung, R.C. ConBOOM: A Configurable CPU Microarchitecture for Speculative Covert Channel Mitigation. Electronics 2025, 14, 850.
  27. He, Z.; Hu, G.; Lee, R. New models for understanding and reasoning about speculative execution attacks. In Proceedings of the 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Seoul, Republic of Korea, 27 February–3 March 2021; pp. 40–53.
  28. Zhao, J.; Korpan, B.; Gonzalez, A.; Asanovic, K. Sonicboom: The 3rd generation berkeley out-of-order machine. In Proceedings of the Fourth Workshop on Computer Architecture Research with RISC-V, Virtual Event, 29 May 2020; Volume 5.
  29. Zhang, J.; Chen, C.; Cui, J.; Li, K. Timing Side-Channel Attacks and Countermeasures in CPU Microarchitectures. ACM Comput. Surv. 2024, 56, 178.
  30. Xiong, W.; Szefer, J. Survey of transient execution attacks and their mitigations. ACM Comput. Surv. CSUR 2021, 54, 54.
  31. Osvik, D.A.; Shamir, A.; Tromer, E. Cache attacks and countermeasures: The case of AES. In Proceedings of the Topics in Cryptology–CT-RSA 2006: The Cryptographers’ Track at the RSA Conference 2006, San Jose, CA, USA, 13–17 February 2006; pp. 1–20.
  32. Yarom, Y.; Falkner, K. FLUSH+RELOAD: A high resolution, low noise, L3 cache side-channel attack. In Proceedings of the 23rd USENIX Security Symposium (USENIX Security 14), San Diego, CA, USA, 20–22 August 2014; pp. 719–732.
  33. Gruss, D.; Maurice, C.; Wagner, K.; Mangard, S. Flush+ flush: A fast and stealthy cache attack. In Proceedings of the Detection of Intrusions and Malware, and Vulnerability Assessment: 13th International Conference, DIMVA 2016, San Sebastián, Spain, 7–8 July 2016; pp. 279–299.
  34. Kocher, P. Spectre Mitigations in Microsoft’s C/C++ Compiler. 2018. Available online: https://www.paulkocher.com/doc/MicrosoftCompilerSpectreMitigation.html (accessed on 28 March 2025).
  35. Mambretti, A.; Sandulescu, A.; Neugschwandtner, M.; Sorniotti, A.; Kurmus, A. Two methods for exploiting speculative control flow hijacks. In Proceedings of the 13th USENIX Workshop on Offensive Technologies (WOOT 19), Santa Clara, CA, USA, 12–13 August 2019.
  36. Fustos, J.; Bechtel, M.; Yun, H. Spectrerewind: Leaking secrets to past instructions. In Proceedings of the 4th ACM Workshop on Attacks and Solutions in Hardware Security, Virtual Event, 13 November 2020; pp. 117–126.
  37. Qiu, P.; Gao, Q.; Liu, C.; Wang, D.; Lyu, Y.; Li, X.; Wang, C.; Qu, G. Pmu-spill: A new side channel for transient execution attacks. IEEE Trans. Circuits Syst. I Regul. Pap. 2023.
  38. Hu, G.; He, Z.; Lee, R.B. Sok: Hardware defenses against speculative execution attacks. In Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), Washington, DC, USA, 20–21 September 2021; pp. 108–120.
  39. Chen, B.; Wang, Y.; Shome, P.; Fletcher, C.; Kohlbrenner, D.; Paccagnella, R.; Genkin, D. GoFetch: Breaking Constant-Time Cryptographic Implementations Using Data Memory-Dependent Prefetchers. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 1117–1134.
  40. Xiong, W.; Szefer, J. Leaking information through cache LRU states. In Proceedings of the 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), San Diego, CA, USA, 22–26 February 2020; pp. 139–152.
  41. Behnia, M.; Sahu, P.; Paccagnella, R.; Yu, J.; Zhao, Z.N.; Zou, X.; Unterluggauer, T.; Torrellas, J.; Rozas, C.; Morrison, A.; et al. Speculative interference attacks: Breaking invisible speculation schemes. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual Event, 19–23 April 2021; pp. 1046–1060.
  42. Celio, C.; Patterson, D.A.; Asanovic, K. The Berkeley Out-of-Order Machine (Boom): An Industry-Competitive, Synthesizable, Parameterized Risc-V Processor; Technical Report No. UCB/EECS-2015-167; EECS Department, University of California, Berkeley: Berkeley, CA, USA, 2015.
  43. Binkert, N.; Beckmann, B.; Black, G.; Reinhardt, S.K.; Saidi, A.; Basu, A.; Hestness, J.; Hower, D.R.; Krishna, T.; Sardashti, S.; et al. The gem5 simulator. ACM SIGARCH Comput. Archit. News 2011, 39, 1–7.
  44. Patel, A.; Afram, F.; Ghose, K. Marss-x86: A qemu-based micro-architectural and systems simulator for x86 multicore processors. In Proceedings of the 1st International Qemu Users’ Forum, Citeseer, Grenoble, France, 18 March 2011; pp. 29–30.
  45. Andrianatrehina, H.; Lashermes, R.; Paturel, J.; Rokicki, S.; Rubiano, T. Exploring speculation barriers for RISC-V selective speculation. In Proceedings of the International Conference on Availability, Reliability and Security; Springer: Berlin/Heidelberg, Germany, 2025; pp. 171–192.
  46. Newsome, J.; Song, D.X. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In Proceedings of the NDSS, San Diego, CA, USA, 2–4 February 2005; Volume 5, pp. 3–4.
  47. Suh, G.E.; Lee, J.W.; Zhang, D.; Devadas, S. Secure program execution via dynamic information flow tracking. In Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’04), Boston, MA, USA, 9–13 October 2004; pp. 85–96.
  48. Crandall, J.R.; Chong, F.T. Minos: Control data attack prevention orthogonal to memory model. In Proceedings of the 37th International Symposium on Microarchitecture (MICRO-37’04), Portland, OR, USA, 4–8 December 2004; pp. 221–232.
  49. Qin, F.; Wang, C.; Li, Z.; Kim, H.s.; Zhou, Y.; Wu, Y. Lift: A low-overhead practical information flow tracking system for detecting security attacks. In Proceedings of the 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’06), Orlando, FL, USA, 9–13 December 2006; pp. 135–148.
  50. Dalton, M.; Kannan, H.; Kozyrakis, C. Raksha: A flexible information flow architecture for software security. In Proceedings of the 2007 ACM/IEEE 34th Annual International Symposium on Computer Architecture (ISCA), San Diego, CA, USA, 9–13 June 2007; pp. 482–493.
  51. Venkataramani, G.; Doudalis, I.; Solihin, Y.; Prvulovic, M. Flexitaint: A programmable accelerator for dynamic taint propagation. In Proceedings of the 2008 IEEE 14th International Symposium on High Performance Computer Architecture, Salt Lake City, UT, USA, 16–20 February 2008; pp. 173–184.
  52. Chen, S.; Kozuch, M.; Strigkos, T.; Falsafi, B.; Gibbons, P.B.; Mowry, T.C.; Ramachandran, V.; Ruwase, O.; Ryan, M.; Vlachos, E. Flexible hardware acceleration for instruction-grain program monitoring. In Proceedings of the 2008 ACM/IEEE 35th Annual International Symposium on Computer Architecture (ISCA), Beijing, China, 21–25 June 2008; pp. 377–388.
  53. Chen, H.; Wu, X.; Yuan, L.; Zang, B.; Yew, P.c.; Chong, F.T. From speculation to security: Practical and efficient information flow tracking using speculative hardware. In Proceedings of the 2008 ACM/IEEE 35th Annual International Symposium on Computer Architecture (ISCA), Beijing, China, 21–25 June 2008; pp. 401–412.
  54. Tiwari, M.; Li, X.; Wassel, H.M.; Chong, F.T.; Sherwood, T. Execution leases: A hardware-supported mechanism for enforcing strong non-interference. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, New York, NY, USA, 12–16 December 2009; pp. 493–504.
  55. Tiwari, M.; Wassel, H.M.; Mazloom, B.; Mysore, S.; Chong, F.T.; Sherwood, T. Complete information flow tracking from the gates up. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, Washington, DC, USA, 7–11 March 2009; pp. 109–120.
  56. Deng, D.Y.; Lo, D.; Malysa, G.; Schneider, S.; Suh, G.E. Flexible and efficient instruction-grained run-time monitoring using on-chip reconfigurable fabric. In Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, Atlanta, GA, USA, 4–8 December 2010; pp. 137–148.
  57. Ardeshiricham, A.; Hu, W.; Marxen, J.; Kastner, R. Register transfer level information flow tracking for provably secure hardware design. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, Switzerland, 27–31 March 2017; pp. 1691–1696.
  58. Yu, J.; Hsiung, L.; El Hajj, M.; Fletcher, C.W. Data oblivious ISA extensions for side channel-resistant and high performance computing. In Proceedings of the 2019 26th Annual Network and Distributed System Security Symposium, San Diego, CA, USA, 24–27 February 2019.
  59. Bachrach, J.; Vo, H.; Richards, B.; Lee, Y.; Waterman, A.; Avižienis, R.; Wawrzynek, J.; Asanović, K. Chisel: Constructing hardware in a scala embedded language. In Proceedings of the 49th Annual Design Automation Conference, San Francisco, CA, USA, 3–7 June 2012; pp. 1216–1225.
  60. Amid, A.; Biancolin, D.; Gonzalez, A.; Grubb, D.; Karandikar, S.; Liew, H.; Magyar, A.; Mao, H.; Ou, A.; Pemberton, N.; et al. Chipyard: Integrated design, simulation, and implementation framework for custom socs. IEEE Micro 2020, 40, 10–21.
  61. Gonzalez, A.; Korpan, B.; Zhao, J.; Younis, E.; Asanovic, K. Replicating and mitigating spectre attacks on an open source RISC-V microarchitecture. In Proceedings of the Third Workshop on Computer Architecture Research with RISC-V (CARRV), Phoenix, AZ, USA, 22 June 2019.
  62. Sabbagh, M.; Fei, Y. Secure speculative execution via RISC-V open hardware design. In Proceedings of the Fifth Workshop on Computer Architecture Research with RISC-V (CARRV 2021), Virtual Event, 17 June 2021.
  63. Bucek, J.; Lange, K.D.; Kistowski, J.v. SPEC CPU2017: Next-generation compute benchmark. In Proceedings of the Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, Berlin, Germany, 9–13 April 2018; pp. 41–42.
  64. Hennessy, J.L.; Patterson, D.A. Computer Architecture, Sixth Edition: A Quantitative Approach, 6th ed.; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2017.
Figure 1. Attack scheme of building a cache-based covert channel [10].
Figure 2. Implicit attack example in Listing 1 vs. explicit attack example in Listing 2.
Figure 3. Attack flow model for implicit cache-based speculative execution attacks.
Figure 4. Speculative interference attack via MSHR contention and resource exhaustion [41]. This is an implicit cache-based speculative execution attack.
Figure 5. Speculative interference attack via non-pipelined EU contention [41]. This is an implicit cache-based speculative execution attack.
Figure 6. TrackRISC-Defense mechanism for mitigating both implicit and explicit cache-based speculative execution attacks.
Figure 7. Taint and untaint propagation in TrackRISC-Defense. (a) Taint propagation from potential access instructions to their data-dependent instructions. (b) Untaint propagation for non-speculative instructions.
Figure 8. TrackRISC-Defense microarchitecture.
Figure 9. The workflow for generating a global taint mask.
Figure 10. The decision workflow for identifying a transmit instruction.
Figure 11. The procedure for delaying or resuming the execution of transmit instructions.
Figure 12. Implicit attack results (Spectre Variant 1) from baselines vs. TrackRISC-Defense.
Figure 13. Detailed performance overheads of baselines vs. TrackRISC-Defense on SPEC2017 benchmarks.
Figure 14. Performance overheads of SpecTerminator-v1 [23] vs. TrackRISC-Defense on SPEC2017 benchmarks (with error bars).
Table 1. Attack modeling representations for existing implicit cache-based speculative execution attacks.

| Attack Name | Implicit Attack Modeling Representation | $I_{transmit}$ $I_{memory}$ |
|---|---|---|
| Spectre Example 10 [34] | $I_{spec}$ $Sender\{I_{access},\ I_{transmit}\{br,\ \},\ I_{memory}\{load_{cache\ miss}\}\}$ $Receiver\{Cache\}$ | Control-Flow Decision |
| Cache LRU Covert Channel [40] | $I_{spec}$ $Sender\{I_{access},\ I_{transmit}\{br,\ \},\ I_{memory}\{load_{cache\ hit}\}\}$ $Receiver\{Cache\}$ | Control-Flow Decision |
| Speculative Interference Attack ($G_{MSHR}^{D}$) [41] | $I_{spec}$ $Sender\{I_{access},\ I_{transmit}\{load\},\ I_{memory}\{load_{non\text{-}speculative}\}\}$ $Receiver\{Cache\}$ | Resource Contention |
| Speculative Interference Attack ($G_{NPEU}^{D}$) [41] | $I_{spec}$ $Sender\{I_{access},\ I_{transmit}\{load,\ div,\ \},\ I_{memory}\{load_{non\text{-}speculative}\}\}$ $Receiver\{Cache\}$ | Resource Contention |
Table 2. Hardware defense classification.

| Defense Name | Platform | Defense Mechanism | Implicit Security Vulnerability | Security Level | Performance Overhead | Hardware Resource Overhead |
|---|---|---|---|---|---|---|
| SpecTerminator-v1 [23] | BOOM [42] | Memory Transmitter Delay ($I_{transmit}\{load,\ store\}$) | Spectre Example 10 [34]; Cache LRU Covert Channel [40]; Speculative Interference Attack ($G_{NPEU}^{D}$) [41] | Medium | Low | High |
| InvisiSpec [9]; SafeSpec [14]; MuonTrap [18] | Gem5 [43]; MARSSx86 [44]; Gem5 [43] | Invisible Structure | Speculative Interference Attack ($G_{MSHR}^{D}$) [41]; Speculative Interference Attack ($G_{NPEU}^{D}$) [41] | Medium | Low | Low |
| CondSpec [11] (with Cache-Hit Filter); Delay-on-Miss [12] | Gem5 [43]; Gem5 [43] | Cache-Miss Load Delay | Speculative Interference Attack ($G_{NPEU}^{D}$) [41]; Cache LRU Covert Channel [40] | Medium | Low | Low |
| TrackRISC-Defense (This Work) | SonicBOOM [28] | Memory & Non-Memory Transmitter Restriction ($I_{transmit}\{load,\ store\}$ and $I_{transmit}\{br,\ \}$ and $I_{transmit}\{div,\ \}$) | N/A * | High | Medium | High |

* N/A: Not Applicable in most cases.
Table 3. Transmit instruction types and their transmitter flags.

| Transmit Instruction Type | Transmitter Flag (Signal) * |
|---|---|
| Memory Instructions | uses_stq/uses_ldq |
| Control-flow Instructions | is_br/is_jalr |
| Division-Based Instructions/Square Root Instructions | uopDIV/uopDIVU/uopDIVW/uopDIVUW/FDIV_S/FDIV_D/uopREM/uopREMU/uopREMW/uopREMUW/FSQRT_S/FSQRT_D |

* An active transmitter flag means that the instruction belongs to this transmit instruction type, e.g., uses_ldq ↑ denotes that the instruction is a load instruction.
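The flag-to-type mapping of Table 3 can be sketched as a small Python lookup. The signal and micro-op names are taken from Table 3; the function name, its keyword arguments, and the representation of a micro-op as a string are hypothetical choices made for illustration.

```python
from typing import Optional

# Micro-op names listed in Table 3 for division-based and square root instructions.
DIV_SQRT_UOPS = frozenset({
    "uopDIV", "uopDIVU", "uopDIVW", "uopDIVUW",
    "uopREM", "uopREMU", "uopREMW", "uopREMUW",
    "FDIV_S", "FDIV_D", "FSQRT_S", "FSQRT_D",
})


def transmit_type(uses_ldq: bool = False, uses_stq: bool = False,
                  is_br: bool = False, is_jalr: bool = False,
                  uop: str = "") -> Optional[str]:
    """Return the transmit instruction type, or None for a non-transmitter."""
    if uses_ldq or uses_stq:
        return "memory"
    if is_br or is_jalr:
        return "control-flow"
    if uop in DIV_SQRT_UOPS:
        return "division/square-root"
    return None


load_kind = transmit_type(uses_ldq=True)
branch_kind = transmit_type(is_br=True)
div_kind = transmit_type(uop="uopDIVW")
other_kind = transmit_type(uop="uopADD")  # any micro-op outside Table 3
```

Any non-`None` result corresponds to an active transmitter flag in this sketch.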
Table 4. Main signal descriptions. ↑ denotes that the signal is active, and ↓ denotes that the signal is inactive.

| Signal Name | Description |
|---|---|
| br_mask | The speculative state of an instruction |
| br_mask = 0 | The instruction is non-speculative |
| br_mask ≠ 0 | The instruction is speculative |
| rob_val | Whether an instruction is valid in the ROB |
| rob_val ↑ | The instruction is valid |
| rob_val ↓ | The instruction is invalid, e.g., after it commits |
| global_taint_mask | Taint information of all related instructions |
| delayed_inst | Transmitter flag for an instruction, which is detailed in Table 3 |
| delayed_inst ↑ | The instruction belongs to one of the transmit instruction types |
| delayed_inst ↓ | The instruction is not a transmit instruction |
| cannot_allocate | Whether an instruction requires delayed execution |
| cannot_allocate ↑ | The instruction needs delayed execution |
| cannot_allocate ↓ | The instruction can be executed normally |
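One way the signals in Table 4 might combine is sketched below in Python. The signal names come from Table 4, but the combining logic, the `is_access` flag, and the per-instruction dependency mask `dep_mask` are assumptions for illustration; the actual workflows are those given in Figures 9 to 11.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RobEntry:
    rob_val: bool    # entry is valid in the ROB
    br_mask: int     # non-zero while the instruction is speculative
    is_access: bool  # hypothetical flag: potential access instruction


def build_global_taint_mask(rob: List[RobEntry]) -> int:
    """Set bit i when ROB entry i is a valid, speculative access instruction."""
    mask = 0
    for i, entry in enumerate(rob):
        if entry.rob_val and entry.br_mask != 0 and entry.is_access:
            mask |= 1 << i
    return mask


def cannot_allocate(delayed_inst: bool, dep_mask: int,
                    global_taint_mask: int) -> bool:
    """Hold a transmit instruction while any access it depends on is tainted."""
    return delayed_inst and (dep_mask & global_taint_mask) != 0


rob = [
    RobEntry(rob_val=True, br_mask=0b01, is_access=True),   # speculative load
    RobEntry(rob_val=True, br_mask=0, is_access=True),      # non-speculative load
    RobEntry(rob_val=False, br_mask=0b10, is_access=True),  # invalid entry
]
mask = build_global_taint_mask(rob)

held = cannot_allocate(True, 0b001, mask)        # depends on entry 0
released = cannot_allocate(True, 0b010, mask)    # depends on entry 1
not_transmit = cannot_allocate(False, 0b001, mask)
```

In this model, an entry drops out of the mask as soon as its `br_mask` reaches zero or `rob_val` goes inactive, which is what lets a waiting transmit instruction resume.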
Table 5. CPU experimental configuration (large core configuration).

| Parameter | Value |
|---|---|
| ISA | RV64GC |
| Fetch Width | 8 |
| Decode Width | 3 |
| Issue Width | 5 |
| Integer Register Number | 100 |
| Floating-Point Register Number | 96 |
| Speculative Mask Depth | 16 |
| ROB Entry Number | 96 |
| Branch Prediction Enabled? | |
| Core Frequency on FPGA | 75 MHz |
Table 6. Our baselines and TrackRISC-Defense.

| Configuration | Baseline/Defense Description |
|---|---|
| Unprotected CPU [28] (Baseline) | SonicBOOM [28], the original unprotected RISC-V out-of-order processor core |
| SpecTerminator-v1 [23] (Baseline) | SonicBOOM [28] with the defense mechanism that delays the execution of memory transmitters that may use secret-dependent operands |
| TrackRISC-Defense | SonicBOOM [28] with the defense mechanism that delays the execution of both memory and non-memory transmitters that may use secret-dependent operands |
Table 7. Hardware resource utilization of TrackRISC-Defense (large core configuration).

| Configuration | Look-Up Tables | Flip-Flops (Registers) | RAMB36 | RAMB18 | DSP48 Blocks |
|---|---|---|---|---|---|
| Unprotected CPU [28] | 256,546 | 119,660 | 185 | 114 | 39 |
| TrackRISC-Defense | 297,114 | 120,154 (↑ 0.4%) | 185 | 114 | 39 |
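The 0.4% register overhead follows directly from the flip-flop counts in Table 7. The short Python check below reproduces it and applies the same formula to the look-up table counts; the helper name `relative_overhead` is illustrative.

```python
def relative_overhead(baseline: int, defended: int) -> float:
    """Percentage increase of `defended` over `baseline`."""
    return (defended - baseline) / baseline * 100


# Flip-flop (register) counts from Table 7.
ff_overhead = relative_overhead(119_660, 120_154)

# Look-up table counts from Table 7.
lut_overhead = relative_overhead(256_546, 297_114)

print(f"Flip-flops: +{ff_overhead:.1f}%")
print(f"Look-up tables: +{lut_overhead:.1f}%")
```

The flip-flop figure rounds to the 0.4% reported in the table, while the look-up table count grows by roughly 15.8% under the same calculation.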
Table 8. Baselines vs. TrackRISC-Defense.

| Configuration | Security Level | SPEC2017 Benchmark Package (Performance Overhead) |
|---|---|---|
| Unprotected CPU [28] | Low | None |
| SpecTerminator-v1 [23] | Medium | 13.8% |
| TrackRISC-Defense (This Work) | High ↑ | 19.4% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, Z.; Sanka, A.I.; She, Y.; Hong, J.; Hung, P.S.Y.; Cheung, R.C.C. TrackRISC: An Implicit Attack Flow Model and Hardware Microarchitectural Mitigation for Speculative Cache-Based Covert Channels. Electronics 2025, 14, 3973. https://doi.org/10.3390/electronics14203973
