T-Smade: A Two-Stage Smart Detector for Evasive Spectre Attacks Under Various Workloads

Jiao, Jiajia; Wen, Ran; Li, Yulian

doi:10.3390/electronics13204090


by Jiajia Jiao *, Ran Wen and Yulian Li

College of Information Engineering, Shanghai Maritime University, No. 1550 Haigang Avenue, Shanghai 201306, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(20), 4090; https://doi.org/10.3390/electronics13204090

Submission received: 18 August 2024 / Revised: 23 September 2024 / Accepted: 27 September 2024 / Published: 17 October 2024

(This article belongs to the Special Issue New Insights in Information Security and Data Privacy: Challenges and Solutions)


Abstract

Evasive Spectre attacks insert additional nop or memory delay instructions so that otherwise effective hardware performance counter (HPC) based detectors achieve a lower attack detection success rate. Interestingly, the detection performance gets even worse under different workloads. For example, the attack detection success rate is only 59.8% for realistic applications, and it is much lower, 27.52%, for the memory stress test. Therefore, this paper proposes T-Smade, a two-stage smart detector designed for evasive Spectre attacks (e.g., evasive Spectre nop and evasive Spectre memory) under various workloads. T-Smade uses the first-stage detector to identify the type of workload and then selects the appropriate second-stage detector, which uses four hardware performance counter events to characterize the high cache miss rate and low branch miss rate of Spectre attacks. More importantly, the second-stage detector adds one dimension that reuses the cache miss rate and branch miss rate to exploit the characteristics of various workloads and detect evasive Spectre attacks effectively. Furthermore, to generalize well to unseen evasive Spectre attacks, T-Smade is trained on the raw data of Spectre attacks and non-attacks under different workloads using simple Multi-Layer Perceptron models. The comprehensive results demonstrate that T-Smade raises the average attack detection success rate of evasive Spectre nop under different workloads from 27.52% to 95.42%, and that of evasive Spectre memory from 59.8% up to 100%.

Keywords:

evasive spectre attacks; hardware performance counter; machine learning; various workloads

1. Introduction

Modern processor optimization technologies, such as speculative execution [1,2,3] and out-of-order execution [4,5], are often used to improve CPU performance, but they also bring potential security risks. For instance, the Spectre attack [6] utilizes speculative execution to bypass boundary checking and preemptively access unauthorized data. Worse still, the evasive Spectre attack [7,8] lowers the attack frequency or mimics non-attack benign programs, so that its attack trace is hidden while speculative execution is still used to access sensitive data, resulting in unauthorized data leakage. Therefore, developing effective detectors that accurately detect evasive Spectre attacks is attracting more and more researchers [6,7].

Detection methods based on hardware performance counters (HPCs) [9,10,11,12] have attracted considerable attention due to their flexibility and effectiveness in detecting microarchitectural attacks [13]. For example, Pan et al. [10] utilized a machine-learning-assisted method that exploits HPCs (4 hardware events), an embedded trace buffer and on-chip network traffic analysis to detect malware. Kuruvila et al. [12] proposed an explainable HPC-based (4 HPCs) double regression (HPCDR) ML framework to detect five microarchitecture attacks. Koc et al. [14] utilized 5 HPCs to protect users’ privacy against real-time cache side-channel attacks in cloud systems. Hu et al. [15] took advantage of 6 HPCs and proposed a framework, CARE, which makes hardware performance counter based malware detection models resilient to resource competition. Mai et al. [9] selected more than 40 HPC-related microarchitecture features and used typical statistical methods to rank these features.

Although these HPC-based methods have achieved excellent results in detecting general Spectre attacks, they all overlook the stronger evasive Spectre attacks. In real-world scenarios, evasive Spectre attacks can weaken the distinctive characteristics of Spectre attacks (high cache miss rates and low branch miss rates) through various techniques [16,17,18]. For instance, evasive Spectre attacks often insert nop or memory delay instructions into the attack code to evade detection [16], which results in traces that differ from those of general Spectre attacks.

Several recent works target evasive Spectre attacks. Li et al. [19] designed an effective detector based on 4 HPC events for the evasive Spectre attack, achieving an attack detection success rate of around 70%. Polychronou et al. [20] proposed MaDMAN, which gathers information from 6 HPC events and detects a large set of software attacks targeting hardware vulnerabilities under a single workload. Pan et al. [21] further utilized the temporal differences of 6 hardware events across sequential timestamps, together with explainable machine learning, to detect evasive Spectre and Meltdown. However, the detector requires the same data pool for both training and testing, and cannot handle unseen or future evasive attacks. Kosasih et al. [22] utilized 4 HPC events to detect evasive Spectre attacks that are built by inserting Spectre into a benign program. However, this method shares the limitation of the approach in [21]: the detector must be trained with evasive attack data. He et al. [23] used 34 HPC events to detect attacks that mimic non-attack benign programs on cloud computing platforms. However, the number of HPCs in a given architecture is limited by the available registers. For example, the Intel Core i7-6700K has 4 HPCs [24], and the AMD Ryzen 7 3700X has 6 HPCs [25]. Therefore, a well-generalized detector that uses few HPC events and covers more kinds of evasive Spectre attacks is needed. To this end, and to exploit the inherent evasion behavior of evasive Spectre attacks for a higher attack detection success rate, this paper utilizes a unified dataset to train a Two-stage Smart detector (T-Smade) capable of identifying a range of evasive Spectre attacks effectively under different workloads. Our main contributions include the following three points:

(1): To design a well-generalized evasive Spectre attack detector, Spectre attack and non-attack datasets are used to train a detector capable of identifying evasive Spectre nop and evasive Spectre memory under realistic application and stress test workloads.
(2): To minimize the extra performance overhead of an HPC detector for evasive Spectre attacks, four existing HPC events are reused to further expand the features of evasive Spectre attacks under different workloads. As a result, the average attack detection success rate for evasive Spectre nop attacks rises from 27.52% to 95.42%, and that of evasive Spectre memory increases from 59.8% up to 100%.
(3): To identify different workloads for more accurate attack detection, a unified two-stage smart detector is designed. The first stage distinguishes the various workloads, while the second stage focuses on attack detection. The ablation study shows that the proposed approach incurs only a 0.12% accuracy loss on average compared with using separate detectors alone.

We plan to make the artifact of the proposed two-stage detector T-Smade publicly available on https://github.com/breatrice321/T-Smade-contact (accessed on 26 September 2024) under open-source licensing.

The remainder of this paper is organized as follows. Section 2 introduces the background, Section 3 depicts the proposed novel detector. Section 4 details the experimental results and analysis, and finally, the concluding remarks are summarized in Section 5.

2. Background

2.1. Speculative Execution and Cache Side Channels for Microarchitecture Attack

Speculative execution [1,2,3] and out-of-order execution [4,5] are often used in modern processors to enhance performance [9]. Speculative execution leverages prior experience to predict likely branch paths and executes instructions along these predicted paths in advance. The results, stored in temporary memory such as caches, are committed for further use when the branch prediction is correct. Conversely, if the prediction proves wrong, the processor discards the speculative results and re-executes the instructions along the correct path.

The cache [22], as a bridge between the CPU and memory, reduces the performance degradation caused by their large speed gap. Multi-level caches are common in modern processors. Typically, three levels are designed: L1 cache [26], L2 cache [27], and L3 cache (Last Level Cache, LLC) [28]. Data accessed by the CPU is often temporarily stored in a cache to expedite subsequent accesses. However, this data is not rolled back: it remains in the cache until it is replaced by new data.

These two potential vulnerabilities can be utilized by attackers and even combined to form popular microarchitecture attacks, such as Spectre attack, seriously affecting information security.

2.2. Evasive Spectre Attack

The Spectre attack is easily distinguishable from non-attack benign programs due to its two characteristics: (1) low branch miss rate caused by mistraining the branch predictor to get a suitable index $x$ that is smaller than the boundary of $\mathit{array1}$ (i.e., $x < \mathit{array1\_size}$) in order to bypass boundary checking; (2) high cache miss rate because of flushing and reloading the cache lines, which temporarily store the content of temp.

Evasive Spectre attacks have been proposed to evade detection while maintaining high attack success rates. Pashrashid et al. [16] introduced two types of evasive Spectre attacks: (1) inserting “nop” instructions and (2) inserting memory delay instructions during the phase of mistraining the branch predictor. As shown in Figure 1a,b, the former (evasive Spectre nop) weakens the Spectre attack’s distinguishing features by adding nop instructions before and after calling the victim function, which prolongs execution time and mimics benign programs. As shown in Figure 1a,c, the latter (evasive Spectre memory) inserts memory delay instructions after calling the victim function and leverages the Fisher-Yates shuffle algorithm [29] to determine the order of memory accesses, yielding a lower cache miss rate. As shown in Figure 2a, although the two evasive Spectre attacks both reduce cache miss rates, they also result in even lower branch miss rates, because these additional operations extend the time the instructions occupy the processor’s execution cycles without introducing extra branch prediction conflicts or uncertainties, thereby further decreasing the branch miss rate.

2.3. Hardware Performance Counter

Hardware performance counters (HPCs) are a set of registers embedded within a processor, adept at detecting malware that exploits microarchitectural side effects [13]. They can also be used to collect and record real-time information about processor performance, such as CPU cycles [30], cache reference [7], and branch predictor [31]. Perf is a Linux performance analysis tool used to collect and analyze system performance, leveraging HPCs to track various hardware events. The data collection modes of perf are classified into two types: sampling and counting, and this paper selects the counting mode, which measures the frequency of hardware events occurring within fixed time intervals. The chosen counting interval for this study is 1 s, and the interval can be changed on demand.
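The counting mode described above can be illustrated with a minimal sketch. It assumes that perf is invoked system-wide with interval printing and CSV-style output, and that each output row has the form `time,count,unit,event,...`; the four event names and both helper functions are assumptions made for illustration, since the paper does not list the exact strings passed to perf.

```python
import csv
import io
from collections import defaultdict

# Assumed perf event names for the four counters (not taken from the paper).
EVENTS = ["branches", "branch-misses", "cache-references", "cache-misses"]


def build_perf_command(interval_ms=1000):
    """Build a system-wide `perf stat` command line in counting mode."""
    return [
        "perf", "stat",
        "-a",                    # system-wide collection
        "-I", str(interval_ms),  # print counts once per interval (milliseconds)
        "-x", ",",               # comma-separated output
        "-e", ",".join(EVENTS),
    ]


def parse_interval_csv(text):
    """Group rows of the assumed form `time,count,unit,event,...` by timestamp."""
    samples = defaultdict(dict)
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4 or row[0].lstrip().startswith("#"):
            continue
        timestamp, count, event = row[0].strip(), row[1].strip(), row[3].strip()
        if not count.isdigit():  # skip rows such as "<not counted>"
            continue
        samples[float(timestamp)][event] = int(count)
    return [samples[t] for t in sorted(samples)]


# Made-up rows, only to show the expected shape of the input.
SAMPLE = (
    "1.000105,52000000,,branches,4000412,100.00,,\n"
    "1.000105,410000,,branch-misses,4000412,100.00,,\n"
    "2.000231,51000000,,branches,4000398,100.00,,\n"
    "2.000231,395000,,branch-misses,4000398,100.00,,\n"
)

if __name__ == "__main__":
    print(" ".join(build_perf_command()))
    print(parse_interval_csv(SAMPLE))
```

Each returned dictionary corresponds to one 1 s interval, which is the granularity at which the detector's input samples are formed.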

2.4. Multi-Layer Perceptron

Multi-Layer Perceptron (MLP) [32] is a type of neural network model that performs deep learning and nonlinear mapping of input data through multiple hidden layers. Each neuron in a hidden layer is fully connected to all neurons in the previous layer, enabling the MLP to capture complex patterns and features. In classification tasks, MLPs achieve high accuracy due to their strong learning capabilities and flexibility.

MLP is chosen for T-Smade for two reasons: (1) The inherently simple and effective structure of MLP satisfies the requirements of evasive Spectre attack detection well. The sample data of T-Smade consists of one-dimensional data containing multiple HPC events. The multi-layer structure of the MLP can effectively extract deeper features from the data, such as distinguishing between different workloads in the first stage and between attack and non-attack data in the second stage. Additionally, the two-stage model is inherently a classification task, where the MLP can capture the complex patterns in the data through nonlinear transformations between layers, mapping the input to the correct classification labels. (2) Most previous HPC-based detectors [7,21] have also utilized MLP to achieve accurate and efficient detection. Therefore, T-Smade employs two MLPs with the same structure but different input data, designed for multi-class and binary classification tasks, respectively.
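To make the idea of two MLPs with the same structure but different input and output sizes concrete, the following sketch builds two fully connected networks in plain Python. The hidden widths, the random initialization and the ReLU/softmax choices are assumptions for illustration; the paper does not specify them here. Only the input sizes (6 and 7 features) and output sizes (3 workloads, 2 classes) follow the text.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create random weights and zero biases for layers sizes[0] -> ... -> sizes[-1]."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Fully connected forward pass: ReLU on hidden layers, softmax on the output."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


HIDDEN = [16, 16]  # hypothetical hidden-layer widths, shared by both stages

# First stage: 6 features in, 3 workload classes out.
workload_classifier = init_mlp([6] + HIDDEN + [3], seed=0)
# Second stage: 7 features in, 2 classes (attack, non-attack) out.
attack_detector = init_mlp([7] + HIDDEN + [2], seed=1)
```

Both networks share the hidden structure, so the only difference between the stages is the width of the first and last layers.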

3. Proposed Detector T-Smade

3.1. Motivation

The Spectre attack exploits speculative execution to bypass boundary checks and uses cache side-channel attacks to extract sensitive information, posing a significant threat to information security. Previous works [7,19] have utilized high cache miss rates and low branch misprediction rates to detect Spectre attacks effectively, achieving accuracy up to 99%. Nevertheless, the new evasive Spectre attacks insert “nop” or memory delay instructions [16] into the program to mimic benign behavior, thereby weakening the original attack’s characteristics and evading detection.

It is clear from Figure 2a that evasive Spectre attacks generated by inserting different instructions exhibit distinct traces when there is no workload interference. Meanwhile, the boundaries between attack and non-attack data become increasingly blurred under various workloads, as illustrated in Figure 2b–d. Although the attack detection success rate for various evasive Spectre attacks under different workloads might be improved by using the same type of data for training and testing, doing so requires collecting new data for each new kind of evasive attack, which makes detecting unseen evasive attacks more difficult.

This paper selects the basic Spectre attack data as a unified training dataset to detect various evasive Spectre attacks, as Spectre attack data are more readily available than data on unknown or unseen evasive Spectre attacks. However, the attack detection success rate inevitably drops significantly when existing Spectre attack detectors are used to identify various evasive Spectre attacks. For example, the attack detection success rate for evasive Spectre attacks with inserted “nop” instructions drops to 59.80%, 0.84%, and 27.52% under realistic applications, CPU stress tests, and memory stress tests, respectively. Similarly, the attack detection success rate for attacks with memory delay instructions drops to 95.14%, 11.73%, and 45.81% under these conditions, as shown in Figure 2b–d. Furthermore, the number of HPCs supported by different architectures is limited [24,25], and adding new events would incur additional performance overhead. Therefore, it is necessary to detect evasive Spectre attacks effectively by reusing the limited HPC events.

3.2. Overall Framework

The proposed two-stage smart detector against evasive Spectre attacks includes three parts: data collection and feature analysis, model selection and attack detection, as shown in Figure 3.

Data collection and Feature analysis: The data collection consists of two parts: benign programs without attacks and malicious programs with attacks. It involves gathering four HPC events using the Linux analysis tool perf at fixed intervals of 1 s: branches, branch misses, LLC references, and LLC misses. As the study focuses on system-wide data, LLC data is selected for collection instead of L1 or L2 cache data.

The collected data is divided into a training dataset and several testing datasets. The training dataset includes Spectre attacks and non-attack benign programs under various workloads (such as realistic applications and stress tests), while the testing datasets comprise evasive Spectre attacks and benign programs under the same workloads.

Spectre attacks have two inherent characteristics: low branch miss rate and high LLC miss rate. However, these two features alone do not work well for detecting evasive Spectre attacks, as they were originally designed for detecting Spectre attacks. Therefore, this paper introduces a third feature that combines the two existing characteristics of Spectre attacks, to widen the separation between the traces of evasive Spectre attacks and those of non-attack benign programs.

Two-stage model selection: The workload classifier determines whether the current environment is running a realistic application workload or a stress test workload. The appropriate pre-trained attack detector is then selected based on this first-stage classification. The training process of the two-stage model is outlined in Algorithms 1 and 2. Both stages utilize an MLP with the same architecture, differing only in input and output sizes. The first-stage classifier is trained on a dataset that combines three types of workloads, including both attack data (Spectre) and non-attack data. Its primary function is to differentiate between workload types. In the second-stage detector (taking realistic applications as an example), the training dataset contains noise from only one type of workload, but still includes both attack data (Spectre) and non-attack data, as seen in line 1 of Algorithm 2. This stage is responsible for distinguishing between attack and non-attack data.

Attack detection: The attack detection process uses the selected detector to determine whether an attack is present in the current environment. The detection results are described by a confusion matrix, where TP represents the probability of correctly detecting an attack, and TN represents the probability of correctly identifying non-attack benign programs.

Algorithm 1 Workloads classifier model training process

First Stage Classifier Training:

Input 1: An array $X = [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}]$ has 6 features of attack (Spectre attack) and non-attack data. ($X_{1}$: branches, $X_{2}$: branch misses, $X_{3}$: LLC reference, $X_{4}$: LLC misses, $X_{5}$: branch miss rate, and $X_{6}$: LLC miss rate)

Output 1: A best workloads classifier C

1: Training data: X (combining three different workload categories).
2: Supervised learning label Y: realistic application $Y_{1}$, CPU stress test $Y_{2}$, memory stress test $Y_{3}$.
3: Predicted label: $Y^{*}$.
// start training the workloads classifier
4: for (i = 1 to epoch) do
5:     $C_{i}$ ← MLP ← $X_{training}$;
6:     $Y^{*}$ ← $C_{i}$ ← $X_{validating}$;
7:     Accuracy, loss ← Y, $Y^{*}$;
8: end for;
9: Return C;

Algorithm 2 Attack detector model training process

Second Stage Detector Training (taking realistic application as an example):

Input 2: An array $X^{'} = [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}]$ has 7 features of attack (Spectre attack) and non-attack data. ($X_{7}$: evasive expanding rate)

Output 2: Three best attack detectors. (realistic applications workload attack detector: $D_{1}$, CPU stress test workload attack detector: $D_{2}$, memory stress test workload attack detector: $D_{3}$)

1: Training data: $X^{'}$ (each detector only uses one workload data).
2: Supervised learning label L: attack data $L_{1}$, non-attack data $L_{2}$.
3: Predicted label: $L^{*}$.
// start training the attack detector
4: for (i = 1 to epoch) do
5:     $D_{1_{i}}$ ← MLP ← $X_{training}^{'}$;
6:     $L^{*}$ ← $D_{1_{i}}$ ← $X_{validating}^{'}$;
7:     Accuracy, loss ← L, $L^{*}$;
8: end for
9: Return $D_{1}$;
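The epoch loop shared by Algorithms 1 and 2 (fit on the training split, predict on the validation split, score, keep the best model) can be sketched as follows. This is only an illustration under stated assumptions: the toy two-feature data is synthetic and not from the paper's dataset, and the single tanh hidden layer, learning rate and epoch count are hypothetical choices rather than the authors' configuration.

```python
import copy
import math
import random


def make_toy_data(rng, n_per_class):
    """Synthetic (LLC miss rate, branch miss rate) samples; label 1 = attack-like."""
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(0.8, 0.05), rng.gauss(0.1, 0.05)], 1))
        data.append(([rng.gauss(0.2, 0.05), rng.gauss(0.3, 0.05)], 0))
    return data


def predict(model, x):
    """One tanh hidden layer followed by a sigmoid output unit."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(model["w1"], model["b1"])]
    z = sum(w * h for w, h in zip(model["w2"], hidden)) + model["b2"]
    return hidden, 1.0 / (1.0 + math.exp(-z))


def train(train_set, val_set, hidden_units=4, epochs=100, lr=0.2, seed=0):
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    model = {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden_units)],
        "b1": [0.0] * hidden_units,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden_units)],
        "b2": 0.0,
    }
    best_model, best_accuracy = copy.deepcopy(model), -1.0
    order = list(train_set)
    for _ in range(epochs):                       # line 4: for i = 1 to epoch
        rng.shuffle(order)
        for x, y in order:                        # line 5: fit on the training split
            hidden, p = predict(model, x)
            dz = p - y                            # cross-entropy gradient w.r.t. z
            for j, h in enumerate(hidden):
                d_pre = dz * model["w2"][j] * (1.0 - h * h)  # uses w2 before its update
                model["w2"][j] -= lr * dz * h
                for i, v in enumerate(x):
                    model["w1"][j][i] -= lr * d_pre * v
                model["b1"][j] -= lr * d_pre
            model["b2"] -= lr * dz
        # lines 6 and 7: predict on the validation split and compute accuracy
        correct = sum((predict(model, x)[1] >= 0.5) == bool(y) for x, y in val_set)
        accuracy = correct / len(val_set)
        if accuracy > best_accuracy:              # keep the best model seen so far
            best_model, best_accuracy = copy.deepcopy(model), accuracy
    return best_model, best_accuracy              # line 9: return the best model
```

In the paper's setting the same loop would be run once for the workload classifier and once per workload for the attack detectors, each with its own training data.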

3.3. The Two-Stage Detector Details

Feature Selection and Analysis. As illustrated in Figure 2b–d from our reproduced experiments, detecting various evasive Spectre attacks under different workloads is extremely challenging when using detectors designed solely for Spectre attacks. To increase the attack detection success rate without introducing additional performance overhead, a two-stage smart detector is proposed that expands the weakened characteristics of evasive Spectre attacks by reusing the two existing features calculated from the four HPC events of Spectre attacks. The two features are shown in Equations (1) and (2), and the four hardware events are listed as follows:

Branch instructions retired event (branches)
Branch misprediction retired event (branch misses)
Last-level cache reference event (LLC references)
Last-level cache missed event (LLC misses)

$$\text{LLC miss rate} = \frac{\text{LLC miss}}{\text{LLC reference}} \tag{1}$$

$$\text{branch miss rate} = \frac{\text{branch misprediction}}{\text{branch reference}} \tag{2}$$

The two main characteristics of Spectre attacks (high cache miss rate and low branch miss rate) are employed to further expand the features weakened by evasive attacks, because inserting various instructions into the attack code generally only reduces the cache miss rate without increasing the branch miss rate. In other words, inserting such instructions decreases the frequency of flushing cache lines and thus reduces the cache miss rate. However, branch prediction does not become worse, as inserting nop or memory delay instructions merely extends the total execution time. This results in fewer predictions per unit time, but it does not alter the process of mistraining the branch predictor. Consequently, the number of correct branch predictions remains unchanged, leading to a lower miss rate. This paper utilizes this observation to further separate evasive Spectre attacks from non-attack benign programs, as shown in Equation (3). Additionally, data points that lie close together can be separated slightly further using Equation (4). To increase the distinction between attack and non-attack data, $\alpha$ needs to be greater than or equal to 1, but not excessively large, to avoid data dispersion; example values are 1 and 1.05.

$$\text{Evasive expanding rate}^{*} = \frac{\text{LLC miss rate}}{\text{branch miss rate}} \tag{3}$$

$$\text{Evasive expanding rate} = \left(\text{Evasive expanding rate}^{*}\right)^{\alpha}, \quad \alpha \geq 1 \tag{4}$$
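Equations (1) to (4) can be written as a short sketch that turns the four raw counts of one sampling interval into the seven features used by the second stage. The function names and the error handling for zero denominators are assumptions added for illustration; the paper does not describe how such intervals are handled.

```python
def miss_rate(misses, references):
    """Equations (1) and (2): a miss count divided by its reference count."""
    if references <= 0:
        raise ValueError("reference count must be positive")
    return misses / references


def evasive_expanding_rate(llc_miss_rate, branch_miss_rate, alpha=1.0):
    """Equations (3) and (4): (LLC miss rate / branch miss rate) ** alpha, alpha >= 1."""
    if alpha < 1.0:
        raise ValueError("alpha must be greater than or equal to 1")
    if branch_miss_rate <= 0:
        raise ValueError("branch miss rate must be positive")
    return (llc_miss_rate / branch_miss_rate) ** alpha


def feature_vector(branches, branch_misses, llc_references, llc_misses, alpha=1.0):
    """Return [X1..X7]: four raw counts, two miss rates, and the expanding rate."""
    branch_rate = miss_rate(branch_misses, branches)
    llc_rate = miss_rate(llc_misses, llc_references)
    expanding = evasive_expanding_rate(llc_rate, branch_rate, alpha)
    return [branches, branch_misses, llc_references, llc_misses,
            branch_rate, llc_rate, expanding]
```

Because the new feature is a ratio of the two existing rates, a sample with a high LLC miss rate and a low branch miss rate is pushed far from benign samples, and a value of $\alpha$ slightly above 1 stretches that gap further.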

Two-stage Model Selection. To mitigate the impact of various types of workloads in real-world environments on the detection of evasive Spectre attacks, this paper adopts a two-stage smart detector. The first stage focuses on classifying the workloads, while the second stage selects the appropriate detector for detection based on the identified workload, as illustrated in Figure 4 and Algorithm 3.

Algorithm 3 Two-stage detection process

Input: An array $T = [T_{1}, T_{2}, T_{3}, T_{4}]$ has 4 features of attack (two evasive Spectre attacks) and non-attack data. ($T_{1}$: branches, $T_{2}$: branch misses, $T_{3}$: LLC reference, $T_{4}$: LLC misses; derived in preprocessing, $T_{5}$: branch miss rate, $T_{6}$: LLC miss rate and $T_{7}$: evasive expanding rate)

Output:

1: The preprocessed data of T: $T^{'} = [T_{1}, T_{2}, T_{3}, T_{4}, T_{5}, T_{6}]$; $T^{''} = [T_{1}, T_{2}, T_{3}, T_{4}, T_{5}, T_{6}, T_{7}]$.
2: The workloads predicted label $Y^{*}$ of T: realistic applications $Y_{1}^{*}$, CPU stress test $Y_{2}^{*}$, and memory stress test $Y_{3}^{*}$.
3: The attack predicted label $L^{*}$ of T: attack data $L_{1}^{*}$ and non-attack data $L_{2}^{*}$.
// First stage workloads classification
4: $Y^{*}$ ← C ← $T^{'}$ ← T; // C: pretrained workloads classifier
5: if i is in (1, 2, 3) then
6:     if ($Y^{*}$ == $Y_{i}^{*}$) then
// Second stage attack detection
7:         $L^{*}$ ← $D_{i}$ ← $T^{''}$ ← T; // D: pretrained attack detector
8:         if ($L^{*}$ == $L_{1}^{*}$) then
9:             return attack;
10:         else
11:             return non-attack;
12:         end if
13:     end if
14: end if
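The dispatch logic of Algorithm 3 can be sketched as below. The classifier and the per-workload detectors are passed in as plain callables standing in for the pretrained MLPs, and the workload keys and the per-workload $\alpha$ table are hypothetical names introduced for this example.

```python
WORKLOADS = ("realistic", "cpu_stress", "memory_stress")  # hypothetical keys


def two_stage_detect(counts, classify_workload, detectors, alphas):
    """Run workload classification, then the detector selected for that workload.

    counts: (branches, branch_misses, llc_references, llc_misses) for one interval.
    classify_workload: callable taking the 6-feature vector, returning a workload key.
    detectors: dict mapping each workload key to a callable on the 7-feature vector
               that returns True for an attack.
    alphas: dict mapping each workload key to the exponent of Equation (4).
    """
    branches, branch_misses, llc_references, llc_misses = counts
    branch_miss_rate = branch_misses / branches
    llc_miss_rate = llc_misses / llc_references

    # First stage: six features, workload classification.
    t_prime = [branches, branch_misses, llc_references, llc_misses,
               branch_miss_rate, llc_miss_rate]
    workload = classify_workload(t_prime)
    if workload not in detectors:
        raise KeyError(f"no detector for workload {workload!r}")

    # Second stage: add the evasive expanding rate and run the selected detector.
    expanding = (llc_miss_rate / branch_miss_rate) ** alphas[workload]
    t_double_prime = t_prime + [expanding]
    is_attack = detectors[workload](t_double_prime)
    return workload, ("attack" if is_attack else "non-attack")
```

A usage example with stand-in callables: passing `classify_workload=lambda f: "cpu_stress"` and a threshold rule such as `lambda f: f[6] > 10.0` as the CPU stress detector would label an interval with counts `(1000, 10, 200, 100)` as an attack. Real deployments would substitute the trained models for these stubs.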

(1) First-Stage Workloads Classifier. As shown in Figure 4 and Algorithm 1, the first stage involves collecting data from Spectre attacks and non-attack benign programs under two major types of workloads (three categories), which are then combined into a unified large dataset. The workloads classifier employs this dataset for supervised learning to train a multi-class MLP model. As shown in the first stage of Algorithm 3, once the prediction label $Y^{*}$ from the pretrained workload classifier is determined (e.g., $Y_{1}^{*}$ or $Y_{2}^{*}$), the corresponding workload is identified. Based on this classification, the appropriate second-stage attack detector is selected. For instance, if $Y^{*} = Y_{1}^{*}$, the workload is classified as a realistic application, and the pretrained attack detector $D_{1}$, trained specifically for realistic application workloads, is chosen to detect evasive Spectre attacks under realistic applications (as shown in line 7).

(2) Second-Stage Attack Detector. As depicted in Figure 4 and Algorithm 2, the second stage also involves collecting data from Spectre attacks and benign programs under various workloads. However, these data are categorized by workload, and supervised learning is used to train a binary classification MLP attack detector for each kind of workload. The attack detector is chosen based on the results of the first-stage workload classifier. As shown in the second stage of Algorithm 3, during the detection process, the test data first passes through the first-stage classifier and then proceeds to the second-stage detector, where the detection result is used to determine whether an attack is present in the current environment. For example, if $L^{*} = L_{1}^{*}$ (as shown in lines 8 and 9), evasive Spectre attacks exist in the current environment. The main difference between the second-stage attack detector and existing works lies in the new feature, the evasive expanding rate, which helps characterize the evasive Spectre attack and yields a higher attack detection success rate.

4. Results and Analysis

4.1. Experiment Configuration

The experiment configuration is listed in Table 1, and all the experiments in the paper run on our private server with Ubuntu Linux 18.04.6 LTS on an Intel Xeon® Silver 4210 2.2 GHz processor with 125.5 GB of DDR4 memory.

As shown in Table 2, the dataset used in this paper is divided into training and testing datasets. The training dataset consists of Spectre attack and non-attack data, while the testing dataset consists of two evasive Spectre attacks and non-attack data. Since T-Smade is a two-stage model, the first stage performs workload classification. Therefore, the feature data for the first stage mainly include the four HPC events (black) listed in Section 3.3, as well as two derived features (blue) given by Equations (1) and (2), collected under three mixed workloads. The second stage performs attack detection, where each workload corresponds to one detector. Since the training and testing datasets are different, the second stage includes one additional feature (red), given by Equation (4), to emphasize attack data.

The types of workloads discussed in this paper are outlined in Table 3. These are broadly categorized into two main types: realistic applications and stress tests. Realistic applications encompass four commonly used scenarios: playing music, watching videos, editing text, and performing database operations, while stress tests comprise a CPU test and a memory test.

However, the workload is actually classified into three types: realistic application, CPU stress test and memory stress test. This is because a single application does not significantly affect the distinction between attack and non-attack data, and in real-world scenarios users typically run multiple programs simultaneously. In contrast, CPU stress tests and memory stress tests each have a substantial effect on the attack and non-attack data, as shown in Figure 2c,d. Therefore, this paper subdivides the workload into three distinct categories.

4.2. Detection Performance Results and Analysis Under Different Evasive Spectre Attacks

The primary metrics used in this paper are the attack detection success rate (True Positive, TP) and the non-attack detection success rate (True Negative, TN), both derived from the confusion matrix. (Note: While the actual values are TP rate and TN rate, for simplicity, they are referred to as TP and TN throughout the paper.) Other metrics, such as precision and recall, are also calculated based on TP and TN.
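For reference, the metrics named above can be computed from the four cells of a binary confusion matrix as in the sketch below. It uses the standard count-based definitions; the function name and the zero-denominator handling are assumptions for this example, and the paper itself reports TP and TN as rates.

```python
def detection_metrics(tp, fn, tn, fp):
    """Compute rates from confusion-matrix counts (attack = positive class)."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    tp_rate = ratio(tp, tp + fn)        # attack detection success rate ("TP")
    tn_rate = ratio(tn, tn + fp)        # non-attack detection success rate ("TN")
    precision = ratio(tp, tp + fp)
    recall = tp_rate
    f1 = ratio(2 * precision * recall, precision + recall)
    accuracy = ratio(tp + tn, tp + fn + tn + fp)
    return {
        "tp_rate": tp_rate,
        "tn_rate": tn_rate,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }
```

This makes explicit why a gain in the TP rate raises recall, F1 score and accuracy, while precision also depends on how many benign samples are misclassified.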

As shown in Figure 5, Figure 6 and Figure 7, the attack detection success rate (TP) for both evasive Spectre nop (in (b)) and evasive Spectre memory (in (a)) is significantly improved with the assistance of the evasive expanding rate in Equation (3), particularly for evasive Spectre nop. Figure 8 also shows qualitatively how the attack and non-attack data change after adding the third-dimensional feature (evasive expanding rate). The increment defined in Equation (5) is used to assess the effectiveness of incorporating the evasive expanding rate across various metrics. The detailed results are analyzed as follows.

$$\text{Increment} = \text{metric}_{\text{3D-features}} - \text{metric}_{\text{2D-features}} \tag{5}$$

As shown in (a) of the three figures above, adding the new evasive expanding rate mainly improves the attack detection success rate (TP) for evasive Spectre memory attacks, as well as related metrics such as recall, F1 score, and accuracy. However, the non-attack detection success rate (TN) and the closely related precision remain largely unchanged or even decrease, as illustrated in (a) of Figure 6 and Figure 7.

Compared to Figure 6a and Figure 7a, the data in Figure 5a show that the effect of adding the new evasive expanding rate is somewhat limited. As seen in Figure 2b and Figure 8b, evasive Spectre memory exhibits features similar to those of the original Spectre attack and different from those of non-attack benign programs. The 2D features (branch miss rate and LLC miss rate) already differentiate effectively between evasive Spectre memory and benign programs. Adding the third dimension (evasive expanding rate) yields only a modest improvement, with TP increasing by 4.86% and TN by 4.69%. In contrast, Figure 6a and Figure 7a show a significant increment of TP, because the weakened Spectre attack characteristics make detection challenging and the evasive expanding rate enhances these characteristics dramatically, increasing TP by 88.27% and 54.19%. Moreover, all of the evasive Spectre memory data achieve a 100% attack detection success rate (TP). However, this improvement comes at the expense of a reduced non-attack detection success rate (TN), with TN decreasing by 3.85% and 2.01%.
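As a check on the TP increments quoted above, Equation (5) can be applied to the evasive Spectre memory results, taking as baselines the 2D-feature TP values reported elsewhere in the text (95.14%, 11.73% and 45.81%) and the 100% TP reached after adding the third feature:

$$
\begin{aligned}
\text{Increment}_{\text{TP, realistic application}} &= 100\% - 95.14\% = 4.86\% \\
\text{Increment}_{\text{TP, CPU stress test}} &= 100\% - 11.73\% = 88.27\% \\
\text{Increment}_{\text{TP, memory stress test}} &= 100\% - 45.81\% = 54.19\%
\end{aligned}
$$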

As shown in Figure 5b, Figure 6b and Figure 7b, evasive Spectre nop exhibits results similar to evasive Spectre memory but is more challenging to detect. As illustrated in Figure 2, although the two evasive Spectre variants both weaken the characteristics of Spectre attacks, evasive Spectre nop data is closer to the non-attack benign program data, often blurring the boundary between the two and significantly reducing the attack detection success rate, which drops as low as 0.84% during CPU stress tests. However, as shown in Figure 8, the new evasive expanding rate effectively increases the distance between evasive Spectre nop and benign programs while maintaining a trend similar to Spectre attacks. This improves the attack detection success rate (TP) from 37.52% to 97.32%, with some reduction in the non-attack detection success rate (TN).

4.3. Detection Performance Results and Analysis Under Different Workloads

Different workloads affect evasive Spectre attacks to varying degrees and can further help conceal their traces and evade detection. Therefore, the evasive expanding rate, combined with an appropriate α, is necessary to further differentiate between attack and non-attack data under the corresponding workloads. Meanwhile, compared to Figure 2, Figure 8 clearly demonstrates the effectiveness of the third-dimensional feature (evasive expanding rate). Detailed results and analysis are as follows.

As shown in Figure 5, the impact of realistic application workloads on evasive Spectre attacks is much weaker than that of the stress tests in Figure 6 and Figure 7. Figure 2b shows that the workload from real-world applications tends to blur the boundaries between different types of data, especially for evasive Spectre nop, whose boundaries with Spectre attack and benign program data are almost fused together. However, these data exhibit a trend very similar to that in Figure 2a. Therefore, the decrease in the attack detection successful rate (TP) is not significant and mainly affects evasive Spectre nop, where the boundaries are confused. The attack detection successful rate (TP) is 59.8% for evasive Spectre nop and 95.14% for evasive Spectre memory. After adding the third-dimensional feature (evasive expanding rate with α set to 1), the confused data in Figure 8b show a stair-like separation that effectively distinguishes evasive Spectre attacks from benign programs. As shown in Figure 5, the attack detection successful rate (TP) for evasive Spectre nop increases to 97.32%, while that of evasive Spectre memory reaches 100%.

As shown in Figure 6, the CPU stress test workload further helps evasive Spectre attacks avoid detection. Figure 2c shows that when the CPU is subjected to compute-intensive tasks using stress-ng, the branch miss rate for attacks decreases significantly, and the branch miss rate for non-attack benign programs becomes even lower. This is because CPU stress tests repeatedly invoke functions for computation, and the branch predictor frequently selects branches based on previous outcomes to enhance CPU efficiency, leading to a lower branch miss rate. Since non-attack benign programs run only the CPU stress test, their subsequent branches are more predictable than those of the malicious programs with Spectre attacks, resulting in an even lower branch miss rate. Because a low branch miss rate is itself a characteristic of Spectre attacks, this reversal leads to a decrease in the attack detection successful rate (TP), with evasive Spectre nop at 0.84% and evasive Spectre memory at 11.73%. However, the benign program has lower LLC miss rates than the attack program. Therefore, leveraging this difference to compute the third-dimensional feature (evasive expanding rate with α set to 1.05) enhances the learning of Spectre attack characteristics. This α differs slightly from the α used under the realistic application workload, because the evasive expanding rates of attack and non-attack data are very close when α is set to 1, leading to unclear data boundaries and making it difficult for the detector to distinguish data near the boundaries, as shown at the boundary in Figure 8c. Setting α to 1.05 further increases the distance between these boundaries without significantly expanding the data. This boosts the attack detection successful rate (TP) to 98.15% for evasive Spectre nop and 100% for evasive Spectre memory, compared to 71.19% for evasive Spectre nop and 98.66% for evasive Spectre memory with α set to 1.
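The comparison between the two α settings can be read as a small parameter sweep. The sketch below uses only the TP values quoted above for the CPU stress test and picks the α with the best worst-case TP across the two evasive variants. The selection criterion and the names `CPU_STRESS_TP` and `pick_alpha` are assumptions made for illustration; the text does not describe the procedure actually used to choose α.

```python
# TP (%) under the CPU stress test for each alpha, as quoted in the text.
CPU_STRESS_TP = {
    1.00: {"evasive Spectre nop": 71.19, "evasive Spectre memory": 98.66},
    1.05: {"evasive Spectre nop": 98.15, "evasive Spectre memory": 100.0},
}


def pick_alpha(tp_by_alpha: dict) -> float:
    """Return the alpha whose lowest TP over the attack variants is highest.

    Assumed criterion: maximise the worst-case attack detection rate.
    """
    return max(tp_by_alpha, key=lambda a: min(tp_by_alpha[a].values()))


best_alpha = pick_alpha(CPU_STRESS_TP)
```

With these values the sweep selects α = 1.05. A fuller sweep would presumably also track the non-attack detection successful rate (TN), since a larger α could trade false alarms for detections.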

As shown in Figure 7, the memory stress test workload, like the CPU stress test, can also help evade detection. Figure 2d shows that the memory stress test also results in lower branch miss rates. Memory stress testing continuously allocates and frees memory, which does not directly affect branch prediction. However, this operation affects CPU execution efficiency and makes CPU execution and branch prediction unstable, leading to a lower branch miss rate for both Spectre attacks and benign programs under the memory stress test. Consequently, the similar branch miss rates limit the model to learning only the LLC miss rate as a distinguishing feature, and the attack detection successful rate (TP) decreases, with evasive Spectre nop dropping to 27.52% and evasive Spectre memory to 47.81%. As shown in Figure 8d, after adding the third-dimensional feature (evasive expanding rate with α set to 1), the separation between Spectre attacks and benign programs under memory stress tests exhibits a stair-like pattern similar to that of the realistic application workloads in Figure 8b. The two evasive Spectre attacks are almost completely distinct from non-attack benign programs; however, evasive Spectre nop still shows a slight overlap with benign program data, and setting α to 1.05 cannot separate this tight confusion. Consequently, the attack detection successful rate (TP) increases to 90.77% for evasive Spectre nop and to 100% for evasive Spectre memory.

4.4. Detection Performance Results and Analysis Under Varying Realistic Applications

The impacts of the number of applications on detection. As shown in Table 4, 4 apps (A1–A4) and 5 apps (A1–A5) produce different detection results. With the addition of one more application, the attack detection success rate (TP) decreases slightly, while the non-attack detection success rate (TN) increases significantly. This indicates that the newly added A5 (playing a game) has little impact on the overall detection performance. As illustrated in Figure 9a,b, as the number of applications increases, non-attack data (yellow) inevitably shift closer to attack data. This shift makes it more difficult for the detector to distinguish between data types at the decision boundary. Consequently, as more non-attack data appear near this boundary, the detector learns more non-attack data features during training, which leads to a relatively lower attack detection success rate (TP). Although more applications cause more interference and reduce performance slightly, T-Smade still achieves a high attack detection success rate and thus high security.

The impacts of the same type of application on different platforms. As shown in Table 4, the 5 apps (A1–A5) with WPS and the 5 apps (A1–A2, A4–A6) with LibreOffice produce different detection results. A6 (LibreOffice) has a greater impact on the non-attack detection success rate (TN) than A3 (WPS), while their effects on the attack detection success rate (TP) are both slight. The main reason for the lower detection performance with LibreOffice is that invoking LibreOffice via Python requires the related APIs, which rely on socket communication to transmit data in real time. This real-time monitoring causes significant interference, even though it is not necessary during normal office editing tasks. T-Smade works well under the two sets of realistic applications, with attack detection rates exceeding 90%. As illustrated in Figure 9b,c, during the execution of LibreOffice, non-attack data (yellow) shift significantly towards the attack data, with some portions of the non-attack data fully merging with the attack data, leading to the lower non-attack detection success rate (TN). Therefore, T-Smade achieves good detection performance under various applications, even those of the same type.
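The changes discussed in this subsection can be recomputed directly from Table 4. The short sketch below stores the table and returns the difference in percentage points between two application sets; the dictionary layout and the helper name `delta` are hypothetical.

```python
# (TP, TN) in percent, copied from Table 4.
TABLE4 = {
    "evasive Spectre nop": {
        "A1-A4": (97.32, 75.04),
        "A1-A5": (96.98, 88.44),
        "A1-A2,A4-A6": (90.95, 63.48),
    },
    "evasive Spectre memory": {
        "A1-A4": (100.0, 75.04),
        "A1-A5": (100.0, 88.44),
        "A1-A2,A4-A6": (96.48, 63.48),
    },
}


def delta(attack: str, before: str, after: str) -> tuple:
    """Return (TP change, TN change) in percentage points, rounded to 2 places."""
    tp0, tn0 = TABLE4[attack][before]
    tp1, tn1 = TABLE4[attack][after]
    return round(tp1 - tp0, 2), round(tn1 - tn0, 2)


added_game = delta("evasive Spectre nop", "A1-A4", "A1-A5")
wps_vs_libreoffice = delta("evasive Spectre nop", "A1-A5", "A1-A2,A4-A6")
```

For evasive Spectre nop, adding A5 changes TP by -0.34 points and TN by +13.40 points, while replacing WPS with LibreOffice changes TP by -6.03 points and TN by -24.96 points.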

4.5. Good Generalization Verification

As depicted in Table 5, there is some loss in the training accuracy of the two-stage smart detectors, especially in the memory stress test environment. The detectors for realistic applications and CPU stress tests have high training accuracies of 95.40% and 100%, respectively. However, the detector trained under memory stress tests has a lower training accuracy of only 80.54%. This results from the significant overlap between Spectre attack and benign program data, as illustrated in Figure 2d and Figure 8d, compared with the slight overlap in Figure 8b. Despite this, the detector effectively improves the attack detection successful rate for evasive Spectre attacks, because the two types of evasive Spectre attack data move along the Spectre attack trace in the direction of the third dimension (evasive expanding rate), which clearly distinguishes them from benign programs. However, compared to the other two detectors with higher training accuracy, this detector's detection results are relatively poorer.

As shown in Figure 10, there are two types of evasive Spectre attack detector: the ideal detector and the two-stage detector. The two-stage detector is trained using Spectre attack and benign program data across various workloads. Its first-stage classifier is designed only to separate workloads and does not use the third-dimensional feature (evasive expanding rate) in Figure 8, because the separation can be achieved with the characteristics of Spectre attacks alone, with a training accuracy as high as 99.95%. The second-stage detector takes the training data classified by the first-stage detector and adds the third-dimensional feature (evasive expanding rate) from Figure 8 to train the attack detector for the corresponding workload. In contrast, the ideal detector includes only the second-stage detector mentioned above and assumes the workload is known, which means that it ignores the impact of workload. Nevertheless, the accuracy loss between the ideal detector and the two-stage detector, as shown in Figure 10, is extremely small. For example, the accuracy loss for the ideal detector and the two-stage detectors (both first and second stage) are 0.1262% in real application environment, 0.3263% and 0.1504% in CPU stress tests, and 0% in memory stress tests, corresponding to evasive Spectre memory and evasive Spectre nop, respectively. Therefore, the two-stage detector maintains its generalization while ensuring the attack detection successful rate.
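The difference between the ideal detector and the two-stage detector can be sketched as follows. The sketch assumes that each detector is a plain callable; the threshold functions stand in for the trained MLP models and are hypothetical, as are all names and sample values. It shows only the dispatch logic and how an accuracy loss arises when the first stage misclassifies a workload.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Tuple[float, float]            # (branch miss rate, LLC miss rate)
Detector = Callable[[Features], bool]     # True means "attack"


class TwoStageDetector:
    """First stage picks the workload; second stage is the matching detector."""

    def __init__(self, classify_workload: Callable[[Features], str],
                 detectors: Dict[str, Detector]):
        self.classify_workload = classify_workload
        self.detectors = detectors

    def predict(self, x: Features) -> bool:
        workload = self.classify_workload(x)   # stage 1: workload type
        return self.detectors[workload](x)     # stage 2: attack or benign


def accuracy_loss(two_stage: TwoStageDetector, samples: Sequence[Features],
                  workloads: Sequence[str], labels: Sequence[bool]) -> float:
    """Ideal accuracy (true workload known) minus two-stage accuracy."""
    ideal = sum(two_stage.detectors[w](x) == y
                for x, w, y in zip(samples, workloads, labels))
    actual = sum(two_stage.predict(x) == y for x, y in zip(samples, labels))
    return (ideal - actual) / len(samples)


# Hypothetical stand-ins for the trained models.
toy = TwoStageDetector(
    classify_workload=lambda x: "cpu" if x[0] < 0.01 else "app",
    detectors={"app": lambda x: x[1] > 0.5, "cpu": lambda x: x[1] > 0.3},
)
toy_samples = [(0.02, 0.6), (0.005, 0.4), (0.02, 0.4), (0.005, 0.2)]
toy_workloads = ["app", "cpu", "cpu", "cpu"]   # third sample fools stage 1
toy_labels = [True, True, True, False]
toy_loss = accuracy_loss(toy, toy_samples, toy_workloads, toy_labels)
```

In this toy setup the third sample is routed to the wrong second-stage detector, so one of four predictions changes and the loss is 0.25; the much smaller losses in Figure 10 reflect a first-stage classifier that rarely misroutes samples.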

4.6. Comparison with State-of-the-Art Researches

We compare T-Smade with state-of-the-art (SOTA) methods in terms of HPC event counts, machine learning performance, generalization, workload variety, and attack detection successful rate, as shown in Table 6.

As shown in Table 6, these studies select few HPC events because the number of available HPC events is limited, and using more events can lead to higher performance overhead. Therefore, HPC events related to Spectre attacks have become the primary focus, with most works selecting cache-related events, such as L2 and LLC cache accesses and misses. Additionally, branch-related events have been utilized in several studies, including [7,21] and our T-Smade. Some studies, such as [21], also include additional events like total instruction count and total page faults to further enhance the attack detection successful rate. Meanwhile, compared with threshold-based classification methods, MLP and LR are preferred in most research due to their efficient classification capabilities.

However, the generalization of these methods is limited. They not only tend to use the same type of dataset for both training and testing, as seen in [21], but also assume that the workloads are known across different scenarios, as in [20]. Meanwhile, the majority of these studies only consider realistic application workload and ignore the others. Therefore, T-Smade develops a two-stage detector (first-stage for workloads classification and second-stage for attack detection), using Spectre for training and evasive Spectre for testing, to enhance its generalization across different workloads.

The attack detection success rate and non-attack detection success rate are two key metrics for trading off the security and performance of a detector. Although [20,22] achieve excellent attack detection success rates, Ref. [20] uses the same type of dataset, and the non-attack detection success rate in [22] is poor. T-Smade strikes a balance between security and performance by employing a two-stage detector: the second-stage focuses on maintaining system security, while the first-stage minimizes performance overhead from varying workloads. Thus, T-Smade can be considered as an efficient, well-generalized, and energy-efficient detector for evasive Spectre attacks.

4.7. Discussion and Limitation

The proposed well-generalized two-stage detector T-Smade utilizes HPC events and a machine learning method to detect evasive Spectre nop and evasive Spectre memory. To achieve a general detector for more kinds of evasive Spectre attacks, T-Smade is trained using a unified dataset from Spectre attacks and benign programs across various workloads. To further improve the attack detection successful rate, the two-stage detector increases the distance between Spectre attacks, evasive Spectre attacks, and benign programs by reusing the two main characteristics of Spectre attacks to add a new evasive expanding rate. More importantly, the proposed two-stage detector T-Smade can work well with effective defense mechanisms [33,34,35] for guaranteed security.

However, the proposed two-stage detector T-Smade still has two main limitations: (1) The two-stage detector is designed only to detect evasive attacks that insert various instructions to mimic benign program behaviors. Its effectiveness against new evasive techniques has not been verified. (2) The generalization of the two-stage detector is limited by the available workloads. For example, new workloads based on large generative AI models [36,37] are not considered and may bring new detection challenges. These two directions will be considered in our future research to further optimize T-Smade.

5. Conclusions

A two-stage smart detector is proposed to detect evasive Spectre attacks by reusing a few commonly used HPC events under realistic application noise and stress test noise. The first-stage classifier performs workload classification for good generalization, while the second-stage detector uses the first-stage result to detect evasive Spectre attacks under different workloads in a joint way.

Compared with the state-of-the-art detector, the attack detection successful rate of the proposed T-Smade for evasive Spectre nop increases from 59.80% to 97.32% under realistic applications, from 0.84% to 98.16% under CPU stress testing, and from 27.52% to 90.77% under memory stress testing. Similarly, the attack detection successful rate for evasive Spectre memory improves from between 11.73% and 95.14% up to 100% under the various configurations. Meanwhile, our proposed two-stage detector T-Smade integrates environmental classification and attack detection with an average accuracy loss of only 0.12% in contrast to multiple individual detectors. Therefore, the proposed T-Smade makes good use of the four existing HPC events to build new three-dimensional features that detect evasive Spectre attacks accurately.
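As a quick arithmetic check, the per-workload rates for evasive Spectre nop quoted above can be averaged across the three workloads. The sketch below does this with the standard library; the variable and function names are hypothetical.

```python
from statistics import mean

# Attack detection successful rate (%) for evasive Spectre nop, from the text.
NOP_TP_BASELINE = {
    "realistic applications": 59.80,
    "CPU stress test": 0.84,
    "memory stress test": 27.52,
}
NOP_TP_TSMADE = {
    "realistic applications": 97.32,
    "CPU stress test": 98.16,
    "memory stress test": 90.77,
}


def average_rate(rates: dict) -> float:
    """Unweighted mean over workloads, rounded to two decimals."""
    return round(mean(rates.values()), 2)


baseline_avg = average_rate(NOP_TP_BASELINE)
tsmade_avg = average_rate(NOP_TP_TSMADE)
```

The T-Smade average works out to 95.42%, which agrees with the value listed for evasive Spectre nop in Table 6, and the corresponding baseline average is 29.39%.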

Since T-Smade focuses solely on insertion-based evasive attacks and handles limited workloads, future research will extend T-Smade to a broader scope by incorporating more attack types, such as gradient-based evasive attacks [38], and a wider range of scenarios, such as large generative AI models [36,37] and cloud platforms [23,39].

Author Contributions

Conceptualization, J.J. and R.W.; methodology, J.J. and R.W.; software, R.W.; validation, J.J. and R.W.; formal analysis, J.J. and R.W.; investigation, J.J. and R.W.; resources, J.J., R.W. and Y.L.; data curation, J.J. and R.W.; writing—original draft preparation, J.J. and R.W.; writing—review and editing, J.J. and R.W.; visualization, Y.L.; supervision, J.J.; project administration, J.J.; funding acquisition, J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Pujiang Talent Program (No.21PJD026).

Data Availability Statement

The code used in this article can be obtained from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Gabbay, F.; Mendelson, A. Speculative Execution Based on Value Prediction; CiteseerX: Princeton, NJ, USA, 1996.
2. Ravichandran, J.; Na, W.T.; Lang, J.; Yan, M. PACMAN: Attacking ARM pointer authentication with speculative execution. In Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA 22), New York, NY, USA, 18–22 June 2022; pp. 685–698.
3. Hu, G.; Lee, R.B. Protecting Cache States Against Both Speculative Execution Attacks and Side-channel Attacks. arXiv 2023, arXiv:2302.00732.
4. Tomasulo, R.M. An Efficient Algorithm for Exploiting Multiple Arithmetic Units. IBM J. Res. Dev. 1967, 11, 25–33.
5. Hubballi, S.; Siddamal, S.V. Out-of-Order Execution of Instructions for In-Order Five-Stage RISC-V Processor. In Proceedings of the Advances in Microelectronics, Embedded Systems and IoT, Mizoram, India, 6–7 October 2023; Chakravarthy, V.V.S.S.S., Bhateja, V., Anguera, J., Urooj, S., Ghosh, A., Eds.; Springer: Singapore, 2024; pp. 29–35.
6. Kocher, P.; Horn, J.; Fogh, A.; Genkin, D.; Gruss, D.; Haas, W.; Hamburg, M.; Lipp, M.; Mangard, S.; Prescher, T.; et al. Spectre attacks: Exploiting speculative execution. Commun. ACM 2020, 63, 93–101.
7. Li, C.; Gaudiot, J.L. Detecting Spectre Attacks Using Hardware Performance Counters. IEEE Trans. Comput. 2022, 71, 1320–1331.
8. Ajorpaz, S.M.; Moghimi, D.; Collins, J.N.; Pokam, G.; Abu-Ghazaleh, N.; Tullsen, D. EVAX: Towards a Practical, Pro-active & Adaptive Architecture for High Performance & Security. In Proceedings of the 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), Chicago, IL, USA, 1–5 October 2022; pp. 1218–1236.
9. AL-Zu’bi, M.; Weissenbacher, G. Statistical Profiling of Micro-Architectural Traces and Machine Learning for Spectre Detection: A Systematic Evaluation. In Proceedings of the 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, 25–27 March 2024; pp. 1–6.
10. Pan, Z.; Sheldon, J.; Sudusinghe, C.; Charles, S.; Mishra, P. Hardware-Assisted Malware Detection using Machine Learning. In Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Virtual, 1–5 February 2021; pp. 1775–1780.
11. Carnà, S.; Ferracci, S.; Quaglia, F.; Pellegrini, A. Fight Hardware with Hardware: Systemwide Detection and Mitigation of Side-channel Attacks Using Performance Counters. Digit. Threat. 2023, 4, 1–24.
12. Kuruvila, A.P.; Meng, X.; Kundu, S.; Pandey, G.; Basu, K. Explainable Machine Learning for Intrusion Detection via Hardware Performance Counters. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2022, 41, 4952–4964.
13. Botacin, M.; Grégio, A. Why We Need a Theory of Maliciousness: Hardware Performance Counters in Security. In Proceedings of the Information Security, Bali, Indonesia, 18–22 December 2022; Susilo, W., Chen, X., Guo, F., Zhang, Y., Intan, R., Eds.; Springer: Cham, Switzerland, 2022; pp. 381–389.
14. Kapotoglu Koc, M.; Altilar, D.T. Selection of Best Fit Hardware Performance Counters to Detect Cache Side-Channel Attacks. In Proceedings of the 2023 ACM Workshop on Secure and Trustworthy Cyber-Physical Systems, Charlotte, NC, USA, 26 April 2023; SaT-CPS 23. Association for Computing Machinery: New York, NY, USA, 2023; pp. 17–22.
15. Hu, Y.; Liang, S.; Li, M.; Xue, T.; Zhang, B.; Wen, Y. CARE: Enabling Hardware Performance Counter based Malware Detection Resilient to System Resource Competition. In Proceedings of the 2022 IEEE 24th International Conference on High Performance Computing & Communications; 8th International Conference on Data Science & Systems; 20th International Conference on Smart City; 8th International Conference on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), Hainan, China, 18–20 December 2022; pp. 586–594.
16. Pashrashid, A.; Hajiabadi, A.; Carlson, T.E. Fast, Robust and Accurate Detection of Cache-Based Spectre Attack Phases. In Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design (ICCAD 22), San Diego, CA, USA, 30 October–3 November 2022; Association for Computing Machinery: New York, NY, USA, 2022.
17. Pashrashid, A.; Hajiabadi, A.; Carlson, T.E. HidFix: Efficient Mitigation of Cache-Based Spectre Attacks Through Hidden Rollbacks. In Proceedings of the 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Francisco, CA, USA, 28 October–2 November 2023; pp. 1–9.
18. Pashrashid, A.; Hajiabadi, A.; Carlson, T.E. Efficient Detection and Mitigation Schemes for Speculative Side Channels. In Proceedings of the 2024 IEEE International Symposium on Circuits and Systems (ISCAS), Singapore, 19–22 May 2024; pp. 1–5.
19. Li, C.; Gaudiot, J.L. Challenges in Detecting an “Evasive Spectre”. IEEE Comput. Archit. Lett. 2020, 19, 18–21.
20. Polychronou, N.F.; Thevenon, P.H.; Puys, M.; Beroulle, V. MaDMAN: Detection of Software Attacks Targeting Hardware Vulnerabilities. In Proceedings of the 2021 24th Euromicro Conference on Digital System Design (DSD), Palermo, Spain, 1–3 September 2021; pp. 355–362.
21. Pan, Z.; Mishra, P. Automated Detection of Spectre and Meltdown Attacks Using Explainable Machine Learning. In Proceedings of the 2021 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), Tysons Corner, VA, USA, 12–15 December 2021; pp. 24–34.
22. Kosasih, W.; Feng, Y.; Chuengsatiansup, C.; Yarom, Y.; Zhu, Z. SoK: Can We Really Detect Cache Side-Channel Attacks by Monitoring Performance Counters? In Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (ASIA CCS 24), Singapore, 1–5 July 2024; ACM: New York, NY, USA, 2024; pp. 172–185.
23. He, Z.; Hu, G.; Lee, R.B. CloudShield: Real-time Anomaly Detection in the Cloud. In Proceedings of the Thirteenth ACM Conference on Data and Application Security and Privacy (CODASPY 23), Charlotte, NC, USA, 24–26 April 2023; ACM: New York, NY, USA, 2023; pp. 91–102.
24. Guide, P. Volume 3B: System Programming Guide Part. Intel® 64 and IA-32 Architectures Software Developer’s Manual. 2011, pp. 1–40. Available online: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html (accessed on 26 September 2024).
25. Advanced Micro Devices. AMD64 Architecture Programmer’s Manual Volume 2: System Programming; Advanced Micro Devices: Santa Clara, CA, USA, 2006.
26. van Schaik, S.; Minkin, M.; Kwong, A.; Genkin, D.; Yarom, Y. CacheOut: Leaking Data on Intel CPUs via Cache Evictions. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; pp. 339–354.
27. Wang, H.; Tang, M.; Xu, K.; Wang, Q. Cache Bandwidth Contention Leaks Secrets. In Proceedings of the 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, 25–27 March 2024; pp. 1–6.
28. Guo, Y.; Zigerelli, A.; Zhang, Y.; Yang, J. Adversarial Prefetch: New Cross-Core Cache Side Channel Attacks. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–25 May 2022; pp. 1458–1473.
29. Durstenfeld, R. Algorithm 235: Random permutation. Commun. ACM 1964, 7, 420.
30. Putrevu, M.A.; Putrevu, V.S.C.; Shukla, S.K. Early Detection of Ransomware Activity based on Hardware Performance Counters. In Proceedings of the 2023 Australasian Computer Science Week (ACSW 23), Melbourne, VIC, Australia, 30 January–3 February 2023; ACM: New York, NY, USA, 2023; pp. 10–17.
31. Qiu, P.; Gao, Q.; Liu, C.; Wang, D.; Lyu, Y.; Li, X.; Wang, C.; Qu, G. PMU-Spill: A New Side Channel for Transient Execution Attacks. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 70, 5048–5059.
32. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386–408.
33. Cauligi, S.; Disselkoen, C.; Moghimi, D.; Barthe, G.; Stefan, D. SoK: Practical Foundations for Software Spectre Defenses. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–25 May 2022; pp. 666–680.
34. Hetterich, L.; Bauer, M.; Schwarz, M.; Rossow, C. Switchpoline: A Software Mitigation for Spectre-BTB and Spectre-BHB on ARMv8. In Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (ASIA CCS 24), Singapore, 1–5 July 2024; ACM: New York, NY, USA, 2024; pp. 217–230.
35. Ponce-de Leon, H.; Kinder, J. Cats vs. Spectre: An Axiomatic Approach to Modeling Speculative Execution Attacks. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–25 May 2022; pp. 235–248.
36. Yao, Y.; Duan, J.; Xu, K.; Cai, Y.; Sun, Z.; Zhang, Y. A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly. High-Confid. Comput. 2024, 4, 100211.
37. Zou, A.; Wang, Z.; Carlini, N.; Nasr, M.; Kolter, J.Z.; Fredrikson, M. Universal and Transferable Adversarial Attacks on Aligned Language Models. arXiv 2023, arXiv:2307.15043.
38. Islam, M.S.; Kuruvila, A.P.; Basu, K.; Khasawneh, K.N. ND-HMDs: Non-Differentiable Hardware Malware Detectors against Evasive Transient Execution Attacks. In Proceedings of the 2020 IEEE 38th International Conference on Computer Design (ICCD), Hartford, CT, USA, 18–21 October 2020; pp. 537–544.
39. Schwarzl, M.; Borrello, P.; Kogler, A.; Varda, K.; Schuster, T.; Schwarz, M.; Gruss, D. Robust and Scalable Process Isolation Against Spectre in the Cloud. In Proceedings of the Computer Security – ESORICS 2022, Copenhagen, Denmark, 26–30 September 2022; Atluri, V., Di Pietro, R., Jensen, C.D., Meng, W., Eds.; Springer: Cham, Switzerland, 2022; pp. 167–186.

Figure 1. Evasive Spectre nop and evasive Spectre memory.

Figure 2. Spectre and evasive Spectre attack under different configurations.

Figure 3. The framework of evasive Spectre attack detector.

Figure 4. The process of selecting the appropriate second-stage detector based on the first-stage result.

Figure 5. The effectiveness of T-Smade against evasive Spectre under realistic application.

Figure 6. The effectiveness of T-Smade against evasive Spectre under CPU stress test.

Figure 7. The effectiveness of T-Smade against evasive Spectre under memory stress test.

Figure 8. 3D separation plot by proposed detector T-Smade.

Figure 9. 3D separation plot comparison under varying realistic applications.

Figure 10. The accuracy loss between ideal detector and two-stage detector.

Table 1. Experimental configuration.

Item	Configuration
operating system	Linux 5.4.0-146-generic
mirror	Ubuntu 18.04.6 LTS
memory	125.5 GiB
processor	Intel Xeon® Silver 4210 CPU @ 2.2 GHz × 20
graphics	llvmpipe (LLVM 10.0.0, 256 bits)
GNOME	3.28.2
OS type	64 bit
disk	502.9 GB
software	Pycharm professional 2022.1.3
Python	Python3.6
Perf (HPCs)	Perf version 5.4.233
Stress-ng	Stress 1.0.4

Table 2. The datasets of two-stage models.

	Datasets	Attacks	Features	The Number of Samples
First stage	train	Spectre	branches	3600
			branch misses
			LLC reference
	test	two evasive Spectre	LLC misses	3600
			branch miss rate
			LLC miss rate
Second stage	train	Spectre	branches	3 detectors and 1200 for each
			branch misses
			LLC reference
			LLC misses
	test	two evasive Spectre	branch miss rate	3 detectors and 1200 for each
			LLC miss rate
			evasive expanding rate

Table 3. Description of different workloads interference.

Workloads		Description
Realistic applications	A1	Get locally downloaded music and play different music in a loop with Pygame.
	A2	Search and watch different types of videos on your Firefox browser.
	A3	Call the WPS (similar to Office, version 11.1.0), and constantly edit and save the document.
	A4	Call the MySQL database (ver 14.14 Distrib 5.7.41) and perform four operations.
	A5	Play snake game with Pygame.
	A6	Call the libreoffice (similar to Office, version 5.4.6), and constantly edit and save the document.
Stress	-c	Run multiple computation-intensive tasks, such as integer operations and floating-point operations, to increase the CPU workload.
Stress	-m	Repeatedly allocate and deallocate a large amount of memory, increasing memory usage.

Table 4. Detection performance comparison under varying realistic applications.

	4 Apps (A1–A4)		5 Apps (A1–A5)		5 Apps (A1–A2, A4–A6)
	TP	TN	TP	TN	TP	TN
evasive Spectre nop	97.32%	75.04%	96.98%	88.44%	90.95%	63.48%
evasive Spectre memory	100%	75.04%	100%	88.44%	96.48%	63.48%

Table 5. The training accuracy of two-stage detector.

Model		Training Accuracy
First-stage classifier		99.95%
Second-stage detector	Realistic applications	95.40%
	CPU stress test	100%
	Memory stress test	80.54%

Table 6. Comparison of state-of-the-art works with T-Smade.

Methods	HPC Events Number	ML	Generalization	Workload Variety	Attack Detection Successful Rate (Security)	Non-Attack Detection Successful Rate (Performance)
[7] (2022)	4	LR, SVM, MLP	No	1	Evasive Spectre nop: 70%	/
[20] (2021)	6	LR	No	2	Evasive Spectre: 100%	/
[21] (2021)	6	MLP	No	1	Evasive Spectre: 92.45% Evasive Meltdown: 96.8%	Evasive Spectre: 95.6% Evasive Meltdown: 97.7%
[22] (2024)	4	NN	No	1	Evasive Spectre: 100%	Evasive Spectre: 0%
Our T-Smade	4	MLP	Yes	3	Evasive Spectre nop: 95.42% Evasive Spectre memory: 100%	Evasive Spectre nop: 87.49% Evasive Spectre memory: 88.89%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
