Article

LLM-WFIN: A Fine-Grained Large Language Model (LLM)-Oriented Website Fingerprinting Attack via Fusing Interrupt Trace and Network Traffic

College of Information Engineering, Shanghai Maritime University, Shanghai 201308, China
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(7), 1263; https://doi.org/10.3390/electronics14071263
Submission received: 14 February 2025 / Revised: 7 March 2025 / Accepted: 13 March 2025 / Published: 23 March 2025
(This article belongs to the Special Issue AI in Cybersecurity, 2nd Edition)

Abstract

Access to popular Large Language Models (LLMs) commonly takes place through website browsing and is therefore also exposed to website fingerprinting attacks. Such attacks increasingly threaten website users with the leakage of their browsing privacy. In addition to the widely used network traffic analysis, interrupt tracing exploits microarchitectural side channels as a new attack vector and enables website fingerprinting attacks on non-LLM websites with up to 96.6% classification accuracy. More importantly, our observations show that LLM website access exhibits an inherent defense that decreases the attack classification accuracy to 6.5%. This resistance highlights the need to develop new website fingerprinting attacks for LLM websites. Therefore, we propose a fine-grained LLM-oriented website fingerprinting attack via fusing interrupt trace and network traffic (LLM-WFIN) to accurately identify the browsed website and the content type. A prior-fusion-based one-stage classifier and a post-fusion-based two-stage classifier are trained to enhance website fingerprinting attacks. Comprehensive results and an ablation study on 25 popular LLM websites with varying machine learning methods demonstrate that LLM-WFIN using post-fusion achieves 97.2% attack classification accuracy with no defense and outperforms prior-fusion, reaching 81.6% attack classification accuracy under effective defenses.

1. Introduction

Large Language Models (LLMs) have been an increasingly prominent theme recently. In the last two years, the top 50 LLM tools have generated more than 24 billion visits, with an average monthly growth of 236 million visits [1]. These remarkable data highlight the impacts of LLM on the digital world and its ever-expanding potential. However, a variety of attacks pose growing threats to LLMs’ security and lead to information leakage. Most attacks on LLMs involve exploiting side-channel information [2,3,4,5] or crafting specific model inputs that cause large models to leak information [2,6,7,8,9]. Therefore, it is crucial to consider the potential impacts of various attacks on LLMs.
Unlike the above attacks on LLM models themselves, which require a deep understanding of LLMs, website fingerprinting (WF) attacks focus on LLM applications and expose LLM visits to the risk of user privacy leakage. Various visiting behaviors and related data, such as hardware-microarchitecture-related interrupt traces [10] and network traffic [11], can be used to achieve WF attacks on general websites with classification accuracies of up to 96.6% and 98.1%, respectively. It is possible for WF attacks to have similar effects on LLMs as they have on other websites.
To the best of our knowledge, this work is the first to conduct experiments on 25 commonly used LLMs by leveraging interrupt traces and network traffic to analyze the impact of WF attacks. Interestingly, the interrupt-based WF attacks’ classification accuracy on LLMs drops significantly to 6.5%. Enhancing LLM-oriented website fingerprinting attacks and designing effective defenses are critical to protect users’ browsing privacy. Therefore, this work is also the first to propose a novel fine-grained LLM-oriented Website fingerprinting attack via Fusing Interrupt trace and Network traffic (LLM-WFIN) to accurately identify both the browsing website and content type. Our main contributions include the following.
  • We analyze interrupt-based WF attacks and network-traffic-based WF attacks on LLMs separately. Our findings reveal that interrupt-based WF attacks can be weakened by the inherent defense of interactive LLM visits, while network-traffic-based WF attacks exhibit similar effects on LLMs as they do on general websites.
  • We design a novel fine-grained LLM-oriented website fingerprinting attack that fuses interrupt traces and network traffic to enhance website fingerprinting attacks. We propose two fusion policies: a prior-fusion-based, one-stage classifier and a post-fusion-based, two-stage classifier.
  • We conduct comprehensive analyses and ablation studies on 25 LLMs to verify the effectiveness of the proposed attack, LLM-WFIN. The results demonstrate that LLM-WFIN using post fusion outperforms that using prior fusion and achieves up to 81.6% attack classification accuracy even under effective defenses.
We plan to make the artifact of LLM-WFIN publicly available on https://github.com/Blackmores6-Haro/LLM-WFIN (accessed on 10 March 2025) under open-source licensing.
The remainder of this article is organized as follows. Section 2 reviews related work, and Section 3 provides background information. Section 4 presents our observation of WF attacks on LLMs. Section 5 details the proposed novel attack, LLM-WFIN, while Section 6 offers comprehensive results and analysis. Finally, Section 7 summarizes this work and outlines the future direction.

2. Related Work

2.1. LLMs-Oriented Attacks

LLMs-oriented attacks are typically divided into two categories [12]: attacks on LLM models themselves and attacks on LLM applications.
Attacks on LLM. Attacks on the models usually extract model parameters [4], reconstruct model inputs [13], identify model architecture [14,15], or compromise the integrity of the model and its associated data [16]. These attacks often leverage hardware side-channel data, such as cache [2], memory access [3], electromagnetic radiation, and power consumption [4], as well as GPU statistics [5]. Yan et al. [2] inferred the DNN architecture through cache side-channel attacks such as Prime+Probe and Flush+Reload. Nazari et al. [3] focused on edge and embedded devices and successfully inferred the deployed LLM architecture in these protected edge environments by analyzing memory usage patterns. Dong et al. [4] attacked DNNs by measuring the execution time of floating-point multiplications. Patwari et al. [5] collected RAM and GPU usage data at the user space level from edge and embedded DNN devices and accurately predicted the categories of DNN architectures. Attacks targeting LLM models typically require a deep understanding of the model’s features, the characteristics of the training data, and a significant amount of implementation time. Therefore, there are considerable challenges in terms of the difficulty of executing such attacks.
Attacks targeting LLM applications. Some attacks on LLM applications aim to exploit the behavior and output of the models. Greshake et al. [7] utilized large models to carry out phishing attacks by altering the initial prompts, and they eventually forced users to disclose personal data, such as chat logs, real names, and credit card information. By introducing prompt techniques [8], these attacks can manipulate the model to generate erroneous information [2,6], infringe on data privacy [7,17], or compromise availability and functionality [9]. Xie et al. [8] explored the potential for privacy leakage during prompt tuning and proposed an effective and novel framework to infer users’ private information. Shumailov et al. [9] introduced a new threat vector against neural networks by creating sponge examples. These inputs maximize energy consumption and latency, forcing LLM applications into their worst performance. The convenience of the attacks targeting model applications makes them easy for attackers to exploit. Our attack also targets model applications via exploiting the information of browsing LLM websites.

2.2. Website Fingerprinting Attacks and Defenses

2.2.1. Network-Traffic-Based Website Fingerprinting Attack and Defense

Network-traffic-based website fingerprinting attack aims to identify the websites a user visits by observing encrypted traffic or related data between the victim and the server. Attackers typically collect network traffic information, such as packet size [11,18], direction [18], and transmission time [18,19] in real time through monitoring tools. They then use machine-learning-based classifiers to predict the websites visited by the user.
In 2011, Panchenko et al. [18] first introduced the concept of a website fingerprinting attack targeting Tor users. By analyzing features such as network packet size, timing, and direction, they trained a Support Vector Machine (SVM) classifier that achieved an accuracy of nearly 55%. Then, Wang et al. [20] developed a new k-nearest neighbor classifier in 2014, which reached an accuracy of 91% in a closed-world scenario and approximately 85% in an open-world scenario. In 2016, Panchenko et al. introduced the CUMUL classifier [21], which utilized cumulative traffic features such as packet size and direction. This approach reduced computational complexity and enhanced identification accuracy. It represented a key improvement in early WF attacks and proved scalable for real-world network environments. The development of deep learning then brought significant improvements to WF attacks. Rimmer et al. [22] designed a convolutional neural network (CNN)-based model for automatic feature extraction. This model not only increased classification accuracy but also adapted to changes in website content. Deep learning eliminated the need for manual feature selection, making attacks more flexible and efficient. In 2018, Sirinam et al. [23] introduced Deep Fingerprinting (DF), one of the most potent WF attacks to date, achieving over 95% accuracy in both closed-world and open-world scenarios. Recently, new technologies such as Generative Adversarial Networks (GANs) have been applied to WF attacks. Models such as GANDaLF [24] enhanced training data by generating fake traffic, thereby improving the generalization ability of attack models. Moreover, newer models such as Var-CNN [11] improved attack robustness through adversarial training. Our attack on LLM applications uses the same typical side channel of network packet information as these WF attacks.
A frequently used defense against WF attacks is to add dummy packets [25,26,27,28,29]. This cover traffic makes the features of network pattern less distinctive, thereby decreasing classification accuracy by the adversary. BuFLO [25] modifies the traffic characteristics to make it appear to have a constant rate, thus eliminating the features of specific packets. WTF-PAD [26] employs adaptive padding, adding padding when the channel utilization rate is low to save bandwidth, thereby masking traffic bursts and their corresponding characteristics. Walkie-Talkie [27] adds dummy packets and introduces delays between the client and the server to create collisions, making the traffic characteristics similar to those used by the attacker’s classifier. In our work, we use the same strategy to evaluate the performance of our attacks by adding invalid packets between the LLM application server and the users to obfuscate the traffic characteristics.

2.2.2. Interrupt-Based Website Fingerprinting Attack and Defense

In addition to leveraging side channels of network traffic in website browsing, WF attacks can also exploit the side channels of hardware microarchitectures. A hardware-based side channel attack can utilize a script (e.g., JavaScript [30]) executed in tandem with the victim’s local machine to share all the micro-architectural resources. When the underlying hardware architecture performs some tasks, the attacker can exploit the micro-architectural resources such as memory and cache allocation, GPU, and CPU utilization to identify the visited websites [31].
In this paper, we focus on the interrupt-based website fingerprinting attack, a new kind of hardware-based website fingerprinting attack [10]. This attack deploys a side-channel information monitor on the user’s host to monitor the interrupt frequency information as users browse web pages. Then, it utilizes these interrupt features to train a classifier that combines LSTM (Long-Short Term Memory) and CNN, reaching a classification accuracy as high as 96.6%. Given the high accuracy of the LSTM+CNN model in their study, we use the same model in our microarchitectural WF attack for further experiments and analysis. In Section 6.2, we also evaluate our work by comparing the accuracy of various models. For consistency, we use the same JavaScript monitoring component [10], which operates as a service worker and runs on a separate background thread distinct from the user’s LLM website or general website program.
The most effective defense against interrupt-based WF attacks is similar to that of dummy packets. During the collection of interrupt traces, invalid interrupts are added to interfere with the characteristic interrupt features. The randomized timer [10] makes random adjustments to the time intervals and increments of the timer. This approach can reduce the regularity of timing signals and make it more difficult for attackers to obtain accurate interrupt timing information. Using this method, the attack accuracy can be reduced from 96.6% to 1.0%. Our work also utilizes this effective randomized timer to evaluate the proposed attack model.

2.3. LLM Fingerprinting Attacks

LLM fingerprinting attacks primarily leverage side channels, targeting either the hardware information of the device or utilizing LLM itself as a side channel. In recent years, several LLM fingerprinting attacks have been developed, achieving attack accuracies exceeding 90%.
As listed in Table 1, Patwari et al. [5] propose an LLM fingerprinting attack for identifying DNN models deployed on CPU–GPU edge devices. This attack uses system-level side-channel information such as memory, CPU, and GPU usage obtained in the user space and a Random Forest classifier to classify known DNN models into their model architecture families with 99% accuracy. EZClone [32] obtains the GPU profile by running the binary executable file on the victim model and uses it as a side channel to predict the DNN architecture. These profiles contain data such as GPU kernel invocations, memory operations, and system-level information. By employing a supervised learning model, it can predict the DNN architecture across the entire set of PyTorch (available at https://pytorch.org) vision architectures with 100% accuracy. LLM-FIN [3], which also targets LLMs in edge and embedded devices, relies on the few data points obtainable in a secure edge environment, analyzes memory usage patterns, and employs a supervised machine learning classifier to classify known LLMs into their architecture families with over 95% accuracy. LLMmap [33] utilizes an active fingerprinting approach. It sends meticulously crafted queries to the application and then analyzes the responses to recognize the specific version of the LLM being used. These queries are founded on domain knowledge, that is, how Large Language Models generate uniquely identifiable responses to prompts on diverse topics. With only eight interactions, it can identify 42 different Large Language Model versions with an accuracy exceeding 95%. Unlike the above attacks on LLMs, our fine-grained LLM-WFIN attack uses both interrupt traces and network traffic as side channels and achieves more than 97% attack accuracy.

3. Background

3.1. LLM Applications

LLM applications encompass various types of software and application services developed based on LLMs. Interactive LLMs serve as the foundation of such applications, such as the commonly known GPT series, Gemma series, and Copilot. Generally, there are two ways to use these LLMs. (1) One is calling the API interfaces they provide. Users can implement complex tasks such as image segmentation and audio and video processing in the local programming environment by leveraging the APIs. (2) The other way is using the web browser. Users often visit the websites of Large Language Models. Then, they interact with LLMs using natural language without having to follow specific instruction formats to complete simple interactive tasks like text generation and image generation. This significantly lowers the usage threshold. The scenario targeted by our attacks is precisely this more prevalent way of using LLMs.

3.2. Network-Traffic-Based WF Attack on LLM Applications

A network-traffic-based WF attack utilizes network traffic information to identify the websites visited by a user. The whole attack model is as described in Figure 1. A target user employs a web browser to access an LLM application website. To protect users’ browsing privacy, users usually do not directly connect to the servers of the target websites. Instead, they typically use a secure network. The attacker usually observes all the traffic entering and leaving in the path between the user and the website server to disclose the user’s privacy.

3.3. Interrupt-Based WF Attack on LLM Applications

The interrupt-based WF attack utilizes hardware architectural information to identify the websites accessed by a user. The whole attack model is as described in Figure 2. An attacker usually deploys a monitor on the user’s PC (personal computer). When a secure network is used to access an LLM application website, the co-resident monitor dynamically collects the interrupt information of the user’s PC. The collected interrupt information is subsequently transmitted to the attacker. This information is then utilized to disclose the user’s browsing privacy.
The interrupt monitor typically runs a counting program in the background [10]. In this program, the attacker supplies a period length P as an input parameter. Then, an interrupt trace is constructed, where each element in the trace records how many iterations of the innermost loop were executed in one period of P milliseconds. If interrupt handlers preempt the CPU, the monitor’s counting program has less time to execute within the period, resulting in fewer iterations and a lower counter value. In this way, the attacker can collect time-series data using the counter to infer the general non-LLM websites visited by the user, achieving 96.6% attack accuracy. Similarly, the same monitor can be employed to gather interrupt traces during interactions with LLM applications, with the aim of classifying user query types.
Interrupt monitor program:
  • begin
  •   int Trace[T*1000]
  •   i = 0
  •   loop {
  •     counter = 0;
  •     start   = time();
  •     do {
  •         comment: count iterations completed in this period
  •         counter++;
  •     } while (time() - start < P)
  •     comment: store one sample per period of P milliseconds
  •     Trace[i++] = counter;
  •   }
  • end
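
As a rough illustration of the counting loop above, the following Python sketch records one counter value per period. It is not the monitor of [10], which runs as JavaScript in the browser; the function name `collect_trace` and the parameters `duration_s` and `period_ms` are hypothetical, and a bounded number of samples is assumed so that the loop terminates.

```python
import time


def collect_trace(duration_s: float, period_ms: float) -> list[int]:
    """Count busy-loop iterations completed in each period of period_ms milliseconds."""
    # Assumption: one sample per period over duration_s seconds.
    n_samples = round(duration_s * 1000 / period_ms)
    period_s = period_ms / 1000.0
    trace = []
    for _ in range(n_samples):
        counter = 0
        start = time.perf_counter()
        # Anything that preempts this loop (e.g. interrupt handling) leaves
        # fewer iterations inside the period, so the recorded value dips.
        while time.perf_counter() - start < period_s:
            counter += 1
        trace.append(counter)
    return trace


if __name__ == "__main__":
    trace = collect_trace(duration_s=0.5, period_ms=5)
    print(len(trace), min(trace), max(trace))
```

The absolute counter values depend on the machine and the interpreter; only their variation over time carries the side-channel signal.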

4. Observation of WF Attacks on LLM Applications

4.1. Hypothesis

Whether considering traditional network-traffic-based WF attacks or novel interrupt-based WF attacks for browsing general non-LLM websites, the existing WF attacks have shown the distinctive pattern as a unique fingerprint that can accurately classify websites, as listed in Table 1. Given the similar browsing characteristics between general websites (e.g., Taobao and JD) and LLM websites (e.g., Doubao and Kimi), it is hypothesized that WF attacks on LLM applications would yield similar results. To validate this hypothesis, two groups of experiments were conducted on LLMs to test its applicability across various LLM queries.

4.2. Network-Traffic-Based Attack on LLM Applications

Network traffic WF attacks use the easy and simple packet size as the feature. To differentiate this from traditional network traffic WF attacks, only packet data from interactions are collected, excluding the website loading process. In the proposed attack model, the attacker and the user reside on the same machine. As a result, both parties are the destination of the network traffic, and the collected network packets represent the actual packets received by the user. The BrowserMob Proxy tool [34] is used as an HTTP/HTTPS proxy to intercept and log the network traffic when using LLM applications. As shown in Figure 3, the browser is configured to use this proxy through Selenium [35], ensuring all browser requests go through the BrowserMob Proxy. Once the website has finished loading and before query initiation, the proxy begins capturing traffic. Selenium then simulates user query operations, continuously collecting traffic within the predefined response threshold of the LLM application. The captured traffic is organized into blocks containing request method, URLs, status code, response time, and data packets. From each traffic block, request and response packet information is extracted, and all available traffic block information is merged into a continuous network packet trace.
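
The merging step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `TrafficBlock` structure, its field names, and the ordering by start time are hypothetical stand-ins for the proxy's logged blocks, not the format produced by BrowserMob Proxy or the authors' code.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficBlock:
    """Hypothetical record for one captured request/response exchange."""
    method: str
    url: str
    status: int
    started_ms: float
    request_sizes: list[int] = field(default_factory=list)
    response_sizes: list[int] = field(default_factory=list)


def merge_blocks(blocks: list[TrafficBlock]) -> list[int]:
    """Merge per-block packet sizes into one continuous packet trace."""
    trace: list[int] = []
    # Assumption: blocks are ordered by start time, and within a block the
    # request packets precede the response packets.
    for block in sorted(blocks, key=lambda b: b.started_ms):
        trace.extend(block.request_sizes)
        trace.extend(block.response_sizes)
    return trace


if __name__ == "__main__":
    blocks = [
        TrafficBlock("POST", "https://example.test/chat", 200, 20.0, [700], [1460, 1460, 300]),
        TrafficBlock("GET", "https://example.test/ping", 200, 5.0, [120], [80]),
    ]
    print(merge_blocks(blocks))
```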
The packet traces for different LLM applications are shown when handling image-based and text-based queries, respectively, in Figure 4 and Figure 5. First, image-based queries generate distinct traces compared to text-based queries on both LLM websites, Doubao and Kimi. Second, different LLM websites exhibit distinct packet traces. These phenomena are similar to general non-LLM websites in Figure 6. Therefore, the packet traces can be utilized for LLM-oriented WF attacks, achieving 92.1% classification accuracy, as initially hypothesized.

4.3. Interrupt-Based Attacks on LLM Applications

The interrupt traces during the loading of LLM application websites can be generated by the monitor program described in Section 3.3. As shown in Figure 7, the interrupt characteristics of loading LLM application websites under different epoch settings are stable. However, the interrupt features of different LLM applications when processing different types of queries appear entirely random. As shown in Figure 8, LLM applications show complete randomness when handling image-based versus text-based queries. Similarly, general non-LLM websites, such as jstv.com, exhibit the same behavior, as shown in Figure 9. The comparative results reveal a 6.5% classification accuracy. Therefore, it can be concluded that LLM applications possess strong natural defenses against interrupt-based attacks. Classifying user queries by collecting interrupt features from interactions with LLM applications is challenging for attackers.

4.4. Motivation for a Novel LLM-Oriented Attack

The above initial results and the statistical results in Table 2 demonstrate that network traffic during interactions can accurately reflect the characteristics of different LLM applications, achieving a classification accuracy of 92.1%. However, relying solely on network traffic has limitations, as non-LLM websites may exhibit similar characteristics, potentially misleading the classification results. Figure 10 shows an example of the non-uniqueness of LLM application packet traces. The packet trace of an image query on chat.360 is remarkably similar to that of the general website wallhaven.cc. This similarity introduces significant interference for network-traffic-based LLM application attacks. Identifying more distinctive features is necessary to enhance the effectiveness of the attack.
In contrast, the interrupt-based WF attack on LLM queries achieves a low accuracy of only 6.5%. Interestingly, the interrupt-based WF attack achieves a classification accuracy of 96.6% during the loading phase of LLMs. Therefore, designing a novel attack that combines both interrupt traces and network traffic traces may enable the identification of unique features of LLM applications.

5. Proposed Novel LLM-WFIN Attack

5.1. Why Two Kinds of Traces in LLM-WFIN?

We propose a novel LLM-WFIN attack using a packet trace and interrupt trace together. There are two primary reasons: (1) Packet traces enable high attack accuracy, while interrupt traces exhibit more stable and unique characteristics for LLM website identification. (2) Packet traces are collected during the querying phase, while interrupt traces are generated during the loading phase. The two phases are completely independent and do not interfere with each other. Therefore, combining both traces enables fine-grained LLM-oriented attacks.

5.2. How to Fuse Two Traces in an LLM-WFIN Attack

To fuse the two traces in LLM-WFIN, the process of visiting LLM websites is divided into two phases, loading and querying, each generating distinct traces. The process begins when a user opens a browser and accesses an LLM application website, and it ends when the user completes a conversation and closes the relevant LLM application tab. Interrupt traces are collected during the user’s access to the LLM application website, while network traffic packet traces are captured during the user’s queries using an HTTP proxy.
Two policies are selected and implemented to fuse two traces in LLM-WFIN for an enhanced fine-grained attack: (1) the prior-fusion policy first fuses the two traces and trains a classifier to predict LLM websites along with their content types; (2) the post-fusion policy uses the two traces to train two separate models, then combines their classification results to predict LLM websites along with their content types.

5.3. One-Stage LLM-WFIN-Prior Attack Model Using the Prior-Fusion Policy

The one-stage LLM-WFIN-Prior attack model using the prior-fusion policy is composed of three steps as shown in Figure 11:
  • Data collection. When the application inputs are prepared for accessing LLM websites, the two types of traces are collected. During the loading phase of LLM websites, interrupt traces are collected by the interrupt monitor. Subsequently, packet traces are captured by the HTTP proxy during the LLM querying phase.
  • Data fusion. The two traces are fused into a unified trace by simple concatenation. More effective multimodality fusion modules, such as those described in [36,37,38,39], can be utilized for higher accuracy, albeit with increased computational complexity. Additionally, to address data alignment issues, each short interrupt trace (1–3 s) is downscaled by two orders of magnitude to match the corresponding packet trace.
  • Training a model. A classifier assisted by machine learning methods is trained for LLM-WFIN-Prior to directly identify websites along with their querying content types. The choice of machine learning methods significantly impacts classification results; further comparisons and analyses are detailed in Section 6.2.
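
The data fusion step above can be illustrated with a minimal sketch. It assumes that "downscaled by two orders of magnitude" means dividing each interrupt counter value by 100 so that its range is closer to packet sizes; that reading, the function name `prior_fuse`, and the `scale` parameter are assumptions rather than details confirmed by the text.

```python
def prior_fuse(interrupt_trace: list[int], packet_trace: list[int],
               scale: float = 100.0) -> list[float]:
    """Concatenate a downscaled interrupt trace with a packet trace."""
    # Assumption: downscaling divides counter values by `scale` (two orders
    # of magnitude by default).
    scaled = [count / scale for count in interrupt_trace]
    # Simple concatenation: interrupt features first, packet features second.
    return scaled + [float(size) for size in packet_trace]


if __name__ == "__main__":
    fused = prior_fuse([25000, 24000], [512, 1460])
    print(fused)
```

Because the fused vector is a plain concatenation, the classifier receives no hint about which positions come from which side channel, which is the limitation that the post-fusion policy avoids.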

5.4. Two-Stage LLM-WFIN-Post Attack Model Using the Post-Fusion Policy

The two-stage LLM-WFIN-Post attack model using the post-fusion policy is also composed of three steps as shown in Figure 12:
  • Data collection. Similar to the procedure of LLM-WFIN-Prior, interrupt traces and packet traces are collected during the LLM website loading phase and querying phase, respectively.
  • Train two models. Two classifiers are trained separately using interrupt traces and packet traces. In other words, the LLM-WFIN-Post attack combines the existing interrupt-based attack and network-traffic-based attack. The primary differences lie in two points: (a) non-LLM applications are replaced by LLM-oriented websites; (b) the network-traffic-based packet traces focus solely on the querying content type, while interrupt traces address LLM website identification.
  • Fusion of classification results. To obtain the fine-grained classification results for LLM websites with content types, the prediction labels from the interrupt-based attack model and the network-packet-based attack model are merged. Higher classification accuracy results in a stronger LLM-WFIN-Post attack, exposing more of the user’s browsing privacy.
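
The label-merging step above can be sketched as follows, assuming the two classifiers have already produced one prediction per visit. The function names and the example labels are hypothetical; a fused prediction is counted as correct only when both the website and the content type are right.

```python
def fuse_predictions(site_preds: list[str], type_preds: list[str]) -> list[tuple[str, str]]:
    """Pair each website label with the content-type label of the same visit."""
    if len(site_preds) != len(type_preds):
        raise ValueError("both classifiers must predict the same visits")
    return list(zip(site_preds, type_preds))


def joint_accuracy(fused: list[tuple[str, str]], truth: list[tuple[str, str]]) -> float:
    """Fraction of visits where website and content type are both correct."""
    correct = sum(1 for pred, true in zip(fused, truth) if pred == true)
    return correct / len(truth)


if __name__ == "__main__":
    fused = fuse_predictions(["kimi", "doubao", "kimi"], ["text", "image", "image"])
    truth = [("kimi", "text"), ("doubao", "image"), ("doubao", "image")]
    print(fused, joint_accuracy(fused, truth))
```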

6. Result and Analysis

6.1. Experiments Configuration

In the experiments, the widely used Chrome browser was employed to generate the interrupt traces and network packet traces for LLM applications, running on Ubuntu 20.04 and Windows 11. The detailed configurations are listed in Table 3. The 25 websites were selected based on the Global AI Product Ranking (aicpb.com), excluding websites that (1) are inaccessible or (2) do not allow network traffic to be collected via automated tools, as listed in Appendix A. To observe the impact of WF attacks on LLM applications, a clean scenario is established where a user loads an LLM application website, followed by all query interactions. When a user opens a specific LLM application website, either text-based or image-related queries are submitted. Finally, the user waits for the full response from the LLM and then closes the website.
Data Collection. The term “LLM Application Website Interrupt Trace” refers to the interrupt trace generated during the website loading phase, while “LLM Application Interrupt Trace” and “LLM Application Network Packet Trace” refer to the interrupt and network packet traces generated during the querying phase, respectively. For each LLM application, 100 traces are collected for each type of trace. As a result, 2500 traces are collected for each type of trace, totaling 7500 traces across all 25 LLM websites.
Machine Learning Models. Several widely used classification models are selected, including RF (Random Forest) [40], KNN (K-Nearest Neighbors) [41], CNN (Convolutional Neural Network) [42], LSTM (Long-Short Term Memory) [43], and SVM (Support Vector Machine) [44].
Overhead of LLM-WFIN Approaches. Both methods utilize the same LSTM+CNN model with 5 layers and 560,748 parameters, requiring approximately 10 min for one training iteration on 7500 traces. During the attack, both methods require 15 s to collect traces. The prediction time for both methods is approximately 18 s, primarily due to trace collection.

6.2. Comparative Results Under Varying Learning Methods

The comparative results of LLM-WFIN with two different fusion policies under varying learning methods are summarized in Table 4. These machine learning methods are widely used in WF attacks. First, it is observed that the proposed two-stage LLM-WFIN-Post achieves higher attack accuracy than the one-stage LLM-WFIN-Prior. The primary reason is that LLM-WFIN-Post uses the two heterogeneous traces separately, one for each prediction label (website and content type). In contrast, the simpler LLM-WFIN-Prior fuses the two traces at an early stage and ignores their heterogeneity. More importantly, a simple concatenation is applied between the two traces without adopting complex attention mechanisms to distinguish their weights. Second, as increasingly complex machine learning models are employed, the accuracies of both policies progressively improve. CNN is more capable of extracting and assembling the local features of the trace into complex global characteristics, while LSTM excels at retaining sequential feature information of the trace. Consequently, when CNN and LSTM are applied separately, the accuracies exceed 90%, and the combination of LSTM+CNN achieves the highest accuracy. Therefore, this classification model is adopted in the subsequent experimental results.

6.3. Comparative Results Under Varying Defenses

Since our proposed LLM-WFIN attack on LLM applications uses interrupt traces and packet traces, two corresponding defenses are also configured to verify its effectiveness.
For the interrupt trace in the LLM loading phase, the randomized timer [10] is employed to interfere with the interrupt trace of loading websites, reducing the WF attack’s accuracy to 1.0% for non-LLM ordinary applications. As shown in Figure 13, the randomized timer completely randomizes the interrupt trace of the LLM application. The principle behind this is that the randomized timer introduces random adjustments to the time intervals and increments of the browser timer. As a result, when we sample data at a 5 millisecond interval, the actual interval may vary between 0 and 100 ms. Consequently, the value of each sample point becomes random, resulting in the randomization of the entire trace. In the one-stage LLM-WFIN-Prior attack, since two types of traces are concatenated and the interrupt features account for over 90% of the total features, the randomized timer effectively disrupts the characteristics of the entire trace and reduces the attack accuracy to 6.3%. Similarly, in the two-stage LLM-WFIN-Post attack, although the two types of features are predicted separately, the prediction of LLM websites using interrupt features is effectively defended, reducing the overall accuracy to 15.6%. However, experiments demonstrated that the use of only network packet traces can still achieve an accuracy of 92.1%, as described in Section 4.2. This judgment may not be unique, as the results could be influenced by activities from other types of websites. If the attacker is willing to accept the possibility of misclassification, the attack can still be considered successful. The likelihood of misclassification can be roughly estimated by the difficulty of finding these other types of websites. In fact, approximately 100 website searches were required to find one network packet trace similar to that of the LLM application. Therefore, relying solely on the randomized timer to defend against the first phase of the attack is not sufficiently reliable.
For the packet trace during the querying phase, a similar principle is applied: invalid data requests are randomly sent to the LLM application via JavaScript during each sampling interval throughout the querying procedure. The experimental results indicate that this method had a minimal impact on the distribution of network packet traces and only a slight effect on accuracy. A possible reason is that the target server does not return valid data packets for invalid data requests, so these requests have little impact on the characteristics of the overall response packet flow. As shown in Figure 14, even after random invalid network packets are introduced, the overall distribution of the two network packet traces remains unchanged. Under this defense, the accuracy of the packet-based attack decreased from 92.1% to 81.6%. Subsequently, this defense was applied to both the prior-fusion and post-fusion policies, reducing the attack accuracy to 80.3% and 78.3%, respectively. Finally, both defenses were employed simultaneously, and the attack accuracy was recorded for each policy. In the post-fusion policy, assuming the attacker is willing to accept the possibility of misclassification, the attack accuracy was approximately equal to the accuracy of the second-phase attack under the random packet interference, 81.6% (Table 5). Under the prior-fusion policy, the randomized timer effectively disrupted the interrupt trace, as shown in Figure 15. Since the interrupt features account for over 90% of the fused features, the attack accuracy was reduced to 6.3% under the randomized timer alone and to 3.2% under both defenses (Table 5).
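The possible explanation given above can be pictured with a toy model. The following sketch assumes that a packet trace is a list of (outgoing, incoming) packet counts per sampling interval and that an invalid request triggers at most one short error reply; both the representation and the numbers are hypothetical and are not taken from the experiments.

```python
import random


def add_dummy_requests(trace, p_dummy=0.5, error_reply_packets=1, seed=0):
    """Inject a dummy request into each interval with probability p_dummy.

    trace: list of (outgoing, incoming) packet counts per sampling interval.
    Each dummy request adds one outgoing packet and, by assumption, only
    error_reply_packets incoming packets, since no valid data is returned.
    """
    rng = random.Random(seed)
    defended = []
    for out_pkts, in_pkts in trace:
        if rng.random() < p_dummy:
            out_pkts += 1
            in_pkts += error_reply_packets
        defended.append((out_pkts, in_pkts))
    return defended


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Made-up trace: two response bursts, each following a single query packet.
original = [(1, 0), (1, 40), (0, 120), (0, 90), (0, 10),
            (1, 0), (1, 35), (0, 110), (0, 80), (0, 5)]
defended = add_dummy_requests(original)

r = pearson([i for _, i in original], [i for _, i in defended])
print("correlation of response flow before/after defense: %.4f" % r)
```

In this toy setting the response flow stays almost perfectly correlated with the undefended one, which is consistent with the small accuracy drop reported above, although the sketch does not reproduce the measured values.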
In summary, the proposed attack can predict user behavior when using LLM applications with high accuracy. This suggests that these LLM applications lack effective server-side defenses against such attacks during user interactions. The attack maintains high accuracy even when confronted with defenses at different phases. Even when both randomization mechanisms are activated simultaneously, a stable attack can still be achieved by training classification models for both phases. As a result, users’ private information can be obtained when they use LLM applications.

6.4. Comparative Results Between LLMs and Other Non-LLM Websites

In Section 4.2 and Section 4.3, various types of interrupt traces and packet traces were compared. Here, the difference matrix is used to more clearly contrast the differences between ordinary websites and LLM websites. Each matrix compares the differences among 10 traces, with these 10 traces representing the characteristics of the website across different epochs.
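As an illustration of how such a matrix can be built, the sketch below computes a 10 × 10 difference matrix over synthetic traces. The distance measure (mean absolute difference) and the helper names difference_matrix and mean_off_diagonal are assumptions for this example, since the measure is not specified here; the traces are randomly generated, not measured.

```python
import random


def difference_matrix(traces):
    """Pairwise mean absolute difference between equal-length traces (assumed metric)."""
    n = len(traces)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(abs(a - b) for a, b in zip(traces[i], traces[j])) / len(traces[i])
            matrix[i][j] = matrix[j][i] = d  # the matrix is symmetric
    return matrix


def mean_off_diagonal(matrix):
    """Average difference between distinct traces, ignoring the zero diagonal."""
    n = len(matrix)
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


rng = random.Random(0)
base = [rng.uniform(0, 100) for _ in range(50)]

# "Regular" case: 10 epochs of the same underlying pattern with small noise.
regular = [[v + rng.uniform(-2, 2) for v in base] for _ in range(10)]
# "Random" case: 10 unrelated traces.
irregular = [[rng.uniform(0, 100) for _ in range(50)] for _ in range(10)]

regular_matrix = difference_matrix(regular)
irregular_matrix = difference_matrix(irregular)

print("mean difference, regular traces:   %.2f" % mean_off_diagonal(regular_matrix))
print("mean difference, irregular traces: %.2f" % mean_off_diagonal(irregular_matrix))
```

Low off-diagonal values indicate that repeated visits to the same website yield similar traces, which is the kind of regularity a classifier can learn; uniformly high values indicate traces that carry no stable fingerprint.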
In Figure 16, the difference matrices of the interrupt traces during the loading phase for LLM websites and ordinary websites are presented. Whether for LLM websites or ordinary websites, the interrupt characteristics of the same website exhibit certain patterns. This regularity is precisely what enables the classification of LLM application websites in the two-stage LLM-WFIN-Post attack.
In Figure 17, the difference matrices of the interrupt traces during the querying phase for LLM websites and ordinary websites are presented. The queries on LLM websites involve pictures, while for ordinary websites, random clicks on pictures displayed on web pages are used as a substitute. It is evident that both types of traces exhibit complete randomness, with no discernible patterns among the traces. This further indicates that interrupt traces during the querying phase cannot be used to predict users’ behaviors.
In Figure 18, the difference matrices of the packet traces during the querying phase for LLM websites and ordinary websites are presented. The behaviors during the querying phase are identical to those in the interrupt difference matrices in Figure 17, involving picture-related queries. It is evident that the packet traces of ordinary websites during the querying phase lack regularity and exhibit complete randomness. In contrast, the packet traces of LLM websites are highly distinctive. Therefore, the packet traces during the querying phase can be utilized to predict users’ query behaviors on LLM websites.
In summary, the similarities and differences in interrupt traces and packet traces between ordinary and LLM websites enable the realization of the LLM-WFIN attack.

6.5. Ablation Study

In Table 5, all experimental results are presented. First, the packet traces of LLM websites during the querying phase are used to predict the LLM application and content type accessed by users, resulting in a basic packet-based LLM application attack model with an accuracy of 92.1%. Next, a network with an accuracy of 96.6% is trained using the interrupt traces of LLM websites during the loading phase to classify LLM websites. Subsequently, the same packet traces are used to classify only the content type of users’ queries, yielding an attack model with an accuracy of 98.6%. Based on these two models, the two-stage LLM-WFIN-Post attack is achieved, with a statistical accuracy of 97.2%. Then, the two features are fused by collecting the interrupt traces of LLM websites during the loading phase and the packet traces during the querying phase consecutively into a single trace, enabling the prediction of the LLM application and content type accessed by users. The resulting one-stage LLM-WFIN-Prior attack achieves an accuracy of 93.2%. Finally, the attacks are tested under different defenses. The accuracy of the packet-based LLM application attack model, used for classifying LLM websites and content type, drops to 81.6% under randomized packets. The accuracy of the two-stage LLM-WFIN-Post attack is reduced to 15.6% under randomized timers. However, if attackers are willing to accept the probability of misjudgment, they can directly use the packet traces with the packet-based LLM application attack model, achieving an accuracy of 92.1%. Under randomized packets, the accuracy of the two-stage attack is reduced to 78.3%, and under both defenses, it is reduced to 10.2%. Similarly, in the latter case attackers can use only packet traces for classification and thus are affected solely by randomized packets, maintaining an accuracy of 81.6%. Since the one-stage LLM-WFIN-Prior attack is already fused into a single trace, the situation is more pessimistic: when the two defenses are applied separately and simultaneously, the attack accuracy is reduced to 6.3%, 80.3%, and 3.2%, respectively.
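The structural difference between the two fusion policies compared above can be sketched as follows. This is a minimal illustration under stated assumptions, not the trained models: the trace lengths are hypothetical (chosen only so that interrupt features exceed 90% of the fused vector), and site_clf and type_clf are stub functions standing in for trained classifiers.

```python
def fuse_prior(interrupt_trace, packet_trace):
    """Prior fusion: concatenate both traces into one feature vector for a single classifier."""
    return list(interrupt_trace) + list(packet_trace)


def predict_post(site_clf, type_clf, interrupt_trace, packet_trace):
    """Post fusion: stage 1 predicts the website from the loading-phase interrupt trace,
    stage 2 predicts the content type from the querying-phase packet trace."""
    return site_clf(interrupt_trace), type_clf(packet_trace)


# Hypothetical traces; real lengths and values are not given here.
interrupt_trace = [0.7] * 3000
packet_trace = [12.0] * 300

fused = fuse_prior(interrupt_trace, packet_trace)
interrupt_share = len(interrupt_trace) / len(fused)
print("interrupt share of fused features: %.1f%%" % (100 * interrupt_share))


# Stub classifiers (assumptions), used only to show the two-stage call pattern.
def site_clf(trace):
    return "doubao.com" if sum(trace) / len(trace) > 0.5 else "kimi.moonshot.cn"


def type_clf(trace):
    return "image" if max(trace) > 10 else "text"


prediction = predict_post(site_clf, type_clf, interrupt_trace, packet_trace)
print("post-fusion prediction:", prediction)
```

The sketch makes the robustness argument concrete: in the fused vector, a defense that randomizes the interrupt part corrupts most of the input to the single classifier, whereas in the two-stage design the packet-based stage still receives an unaffected input.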

7. Conclusions

Based on the observation that the attack classification accuracy drops to 6.5% for LLM website accesses, LLM-WFIN is proposed: a fine-grained LLM-oriented website fingerprinting attack that fuses interrupt traces and network traffic to accurately identify the browsing website and content type. Two fusion policies, a prior-fusion-based one-stage classifier and a post-fusion-based two-stage classifier, are trained to enhance website fingerprinting attacks. The comprehensive results identify LSTM+CNN as the optimal attack model. The attack’s accuracy is then assessed under different defenses. At the cost of an acceptable misjudgment rate, LLM-WFIN still achieves an accuracy of 81.6%, which is sufficient to compromise user privacy on LLM websites. Furthermore, packet traces of LLM websites and non-LLM websites are compared, highlighting the unique characteristics of LLM websites that LLM-WFIN is specifically designed to exploit. However, the approach has several limitations. First, the dataset includes only 25 LLM websites, and future work should expand it to cover a broader range of LLM websites. More importantly, some non-LLM websites are observed to have packet traces resembling those of LLM websites. Although such occurrences are rare, they present a risk of misjudgment. In future work, additional distinguishing features of LLM websites will be identified to improve the attack.

Author Contributions

Conceptualization, J.J.; Methodology, J.J.; Software, H.Y.; Validation, H.Y.; Writing – original draft, J.J. and H.Y.; Writing – review & editing, J.J. and R.W.; Visualization, R.W.; Project administration, J.J.; Funding acquisition, J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Pujiang Program with the grant number 21PJD026.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. LLM Application Websites

The following is a list of LLM application websites used in the dataset:
so.com, wenku.baidu.com, kimi.moonshot.cn, yiyan.baidu.com
doubao.com, tongyi.aliyun.com, metasearch.cn, aippt.cn
baidu.com, tiangong.cn, czd.com, deepseek.com
chatglm.cn, design.meitu.com, zhihu.com, sudamobile.com
gaoding.com/ai, 10jqka.com.cn, xfyun.cn, huoshan.com
minimax.chat, liblib.art, processon.com, modao.cc
immersive-translate.owenyoung.com

References

  1. AI Industry Analysis: 50 Most Visited AI Tools and Their 24B+ Traffic Behavior—WriterBuddy. Available online: https://writerbuddy.ai/blog/ai-industry-analysis (accessed on 10 March 2024).
  2. Yan, M.; Fletcher, C.W.; Torrellas, J. Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures. In Proceedings of the 29th USENIX Security Symposium, Online, 12–14 August 2020.
  3. Nazari, N.; Xiang, F.; Fang, C.; Makrani, H.M.; Puri, A.; Patwari, K.; Sayadi, H.; Rafatirad, S.; Chuah, C.-N.; Homayoun, H. LLM-FIN: Large Language Models Fingerprinting Attack on Edge Devices. In Proceedings of the 2024 25th International Symposium on Quality Electronic Design (ISQED) 2024, San Francisco, CA, USA, 3–5 April 2024; pp. 1–6.
  4. Dong, G.; Wang, P.; Chen, P.; Gu, R.; Hu, H. Floating-Point Multiplication Timing Attack on Deep Neural Network. In Proceedings of the 2019 IEEE International Conference on Smart Internet of Things (SmartIoT), Tianjin, China, 9–11 August 2019; pp. 155–161.
  5. Patwari, K.; Hafiz, S.M.; Wang, H.; Homayoun, H.; Shafiq, Z.; Chuah, C.-N. DNN Model Architecture Fingerprinting Attack on CPU-GPU Edge Devices. In Proceedings of the 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P) 2022, Genoa, Italy, 6–10 June 2022; pp. 337–355.
  6. Willison, S. Prompt Injection Attacks Against GPT-3. Simon Willison’s Weblog. Available online: https://simonwillison.net/2022/Sep/12/prompt-injection/ (accessed on 10 March 2024).
  7. Greshake, K.; Abdelnabi, S.; Mishra, S.; Endres, C.; Holz, T.; Fritz, M. Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, Copenhagen, Denmark, 30 November 2023; pp. 79–90.
  8. Xie, S.; Dai, W.; Ghosh, E.; Roy, S.; Schwartz, D.; Laine, K. Does Prompt-Tuning Language Model Ensure Privacy? arXiv 2023, arXiv:2304.03472.
  9. Shumailov, I.; Zhao, Y.; Bates, D.; Papernot, N.; Mullins, R.; Anderson, R. Sponge Examples: Energy-Latency Attacks on Neural Networks. In Proceedings of the 2021 IEEE European Symposium on Security and Privacy (EuroS&P), Vienna, Austria, 6–10 September 2021.
  10. Cook, J.; Drean, J.; Behrens, J.; Yan, M. There’s always a bigger fish: A clarifying analysis of a machine-learning-assisted side-channel attack. In Proceedings of the 49th Annual International Symposium on Computer Architecture, New York, NY, USA, 18–22 June 2022; pp. 204–217.
  11. Bhat, S.; Lu, D.; Kwon, A.H.; Devadas, S. VAR-CNN: A Data-Efficient Website Fingerprinting attack based on Deep learning. Proc. Priv. Enhancing Technol. 2019, 2019, 292–310.
  12. Esmradi, A.; Yip, D.W.; Chan, C.F. A comprehensive survey of attack techniques, implementation, and mitigation strategies in Large Language Models. In Communications in Computer and Information Science; Springer: Berlin/Heidelberg, Germany, 2024; pp. 76–95.
  13. Carlini, N.; Ippolito, D.; Jagielski, M.; Lee, K.; Tramer, F.; Zhang, C. Quantifying memorization across neural language models. arXiv 2022, arXiv:2202.07646.
  14. Andriushchenko, M.; Croce, F.; Flammarion, N. Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks. 2024. Available online: https://arxiv.org/abs/2404.02151 (accessed on 10 March 2024).
  15. Dong, Z.; Zhou, Z.; Yang, C.; Shao, J.; Qiao, Y. Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey. arXiv 2024, arXiv:2402.09283.
  16. Tian, Z.; Cui, L.; Liang, J.; Yu, S. A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning. ACM Comput. Surv. 2022, 55, 1–35.
  17. Roy, S.S.; Naragam, K.V.; Nilizadeh, S. Generating Phishing Attacks Using ChatGPT. arXiv 2023, arXiv:2305.05133.
  18. Panchenko, A.; Niessen, L.; Zinnen, A.; Engel, T. Website Fingerprinting in Onion Routing Based Anonymization Networks. In Proceedings of the 10th Annual ACM Workshop on Privacy in the Electronic Society, Chicago, IL, USA, 17 October 2011.
  19. De La Cadena, W.; Mitseva, A.; Hiller, J.; Pennekamp, J.; Reuter, S.; Filter, J.; Engel, T.; Wehrle, K.; Panchenko, A. TrafficSliver: Fighting Website Fingerprinting Attacks with Traffic Splitting. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Los Angeles, CA, USA, 7–11 November 2022.
  20. Wang, T.; Cai, X.; Nithyanand, R.; Johnson, R.; Goldberg, I. Effective Attacks and Provable Defenses for Website Fingerprinting. In Proceedings of the USENIX Security Symposium, San Diego, CA, USA, 20–22 August 2014; pp. 143–157.
  21. Panchenko, A.; Lanze, F.; Zinnen, A.; Henze, M.; Pennekamp, J.; Wehrle, K.; Engel, T. Website Fingerprinting at Internet Scale. In Proceedings of the NDSS 2016, San Diego, CA, USA, 21–24 February 2016.
  22. Rimmer, V.; Preuveneers, D.; Juárez, M.; Van Goethem, T.; Joosen, W. Automated Feature Extraction for Website Fingerprinting through Deep Learning. arXiv 2017, arXiv:1710.01590.
  23. Sirinam, P.; Imani, M.; Juarez, M.; Wright, M. Deep Fingerprinting: Undermining Website Fingerprinting Defenses with Deep Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 1928–1943.
  24. Oh, S.E.; Mathews, N.; Rahman, M.S.; Wright, M.; Hopper, N. GANDALF: GAN for Data-Limited Fingerprinting. Proc. Priv. Enhancing Technol. 2021, 2021, 305–322.
  25. Dyer, K.P.; Coull, S.E.; Ristenpart, T.; Shrimpton, T. Peek-a-Boo, I Still See You: Why Efficient Traffic Analysis Countermeasures Fail. In Proceedings of the IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 20–23 May 2012; pp. 332–346.
  26. Juarez, M.; Afroz, S.; Acar, G.; Diaz, C.; Greenstadt, R. A Critical Evaluation of Website Fingerprinting Attacks. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, 3–7 November 2014; pp. 263–274.
  27. Wang, T.; Goldberg, I. Walkie-Talkie: An Efficient Defense Against Passive Website Fingerprinting Attacks. In Proceedings of the 26th USENIX Security Symposium, Vancouver, BC, Canada, 16–18 August 2017; pp. 1375–1390.
  28. Cai, X.; Nithyanand, R.; Johnson, R. CS-BuFLO: A Congestion Sensitive Website Fingerprinting Defense. In Proceedings of the 13th Workshop on Privacy in the Electronic Society, Scottsdale, AZ, USA, 3 November 2014; pp. 121–130.
  29. Cai, X.; Nithyanand, R.; Wang, T.; Johnson, R.; Goldberg, I. A Systematic Approach to Developing and Evaluating Website Fingerprinting Defenses. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Los Angeles, CA, USA, 7–11 November 2022; pp. 227–238.
  30. JavaScript | MDN. MDN Web Docs. Available online: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript (accessed on 1 March 2024).
  31. Genkin, D.; Pachmanov, L.; Tromer, E.; Yarom, Y. Drive-By Key-Extraction Cache Attacks from Portable Code. In Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2018; pp. 83–102.
  32. Weiss, J.O.B.; Alves, T.; Kundu, S. EZClone: Improving DNN Model Extraction Attack via Shape Distillation from GPU Execution Profiles. arXiv 2023, arXiv:2304.03388.
  33. Pasquini, D.; Kornaropoulos, E.M.; Ateniese, G. LLMMap: Fingerprinting for Large Language Models. arXiv 2024, arXiv:2407.15847.
  34. Lightbody. GitHub - Lightbody/Browsermob-Proxy: A Free Utility to Help Web Developers Watch and Manipulate Network Traffic From Their AJAX Applications. GitHub. Available online: https://github.com/lightbody/browsermob-proxy (accessed on 13 February 2024).
  35. Selenium. Selenium. Available online: https://www.selenium.dev/zh-cn (accessed on 13 February 2024).
  36. Li, W.; Zhou, H.; Yu, J.; Song, Z.; Yang, W. Coupled mamba: Enhanced multi-modal fusion with coupled state space model. arXiv 2024, arXiv:2405.18014.
  37. Hemker, K.; Simidjievski, N.; Jamnik, M. HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data. In Proceedings of the Thirty-Eighth Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 9–15 December 2024.
  38. Shankar, S.; Thompson, L.; Fiterau, M. Progressive fusion for multimodal integration. arXiv 2022, arXiv:2209.00302.
  39. Zhou, M.; Huang, J.; Yan, K.; Hong, D.; Jia, X.; Chanussot, J.; Li, C. A General Spatial-Frequency Learning Framework for Multimodal Image Fusion. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 1–18.
  40. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  41. Fix, E.; Hodges, J.L. Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties. Int. Stat. Rev. 1989, 57, 238.
  42. O’Shea, K.; Nash, R. An Introduction to Convolutional Neural Networks; National Key Lab for Novel Software Technology, Nanjing University: Nanjing, China, 2015.
  43. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012.
  44. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
Figure 1. Interrupt-based WF attack model.
Figure 2. Network-traffic-based WF attack model.
Figure 3. The process of collecting network packet traces.
Figure 4. The packet traces of an image query with LLM application doubao.com.
Figure 5. The packet traces of different queries with the LLM application kimi.com.
Figure 6. The packet traces of general non-LLM websites.
Figure 7. The interrupt traces of the LLM website doubao.com.
Figure 8. The interrupt traces during interaction with the LLM application doubao.com.
Figure 9. The interrupt traces during interaction with the ordinary website jstv.com.
Figure 10. An example demonstrating the non-uniqueness of the LLM application packet trace.
Figure 11. One-stage LLM-WFIN-Prior attack.
Figure 12. Two-stage LLM-WFIN-Post attack.
Figure 13. The interrupt traces without and with the randomized timer.
Figure 14. The packet traces without and with randomized packets.
Figure 15. The feature fusion trace under different defenses.
Figure 16. The difference matrices of interrupt traces.
Figure 17. The difference matrices of interaction interrupt traces.
Figure 18. The difference matrices of packet traces.
Table 1. LLM fingerprinting attacks.
| LLM Fingerprinting Attack | Attack Type | Target Leakage | Side Channel | Attack Accuracy |
|---|---|---|---|---|
| DNN architecture attack [5] | Attacks on LLM model | DNN architecture | Memory, CPU, and GPU usage | 99% |
| EZClone [32] | Attacks on LLM model | DNN architecture | GPU profiles | 100% |
| LLM-FIN [3] | Attacks on LLM model | LLM family | Memory usage | 95% |
| LLMMap [33] | Attacks on LLM model | LLM version | Query | 95% |
| Our LLM-WFIN | Attacks on LLM applications | LLM website and interaction content type | Interrupt and network traffic | 97.2% |

Table 2. Attack accuracy on non-LLM and LLM websites.
| Attack | Trace Collection Phase | Classification Target | Accuracy |
|---|---|---|---|
| Packet-based WF attack [11] on general non-LLM applications | Querying | Websites | 98.1% |
| Our test, packet-based LLM application attack | Querying | Websites with content type | 92.1% |
| Our test, packet-based LLM application attack | Querying | Content type | 98.6% |
| Interrupt-based WF attack [10] on general non-LLM applications | Loading | Websites | 96.6% |
| Our test, interrupt-based LLM application attack | Loading | Websites | 96.6% |
| Our test, interrupt-based LLM application attack | Querying | Websites with content type | 6.5% |
| Interrupt and packet-based LLM application attack | Loading and querying | Websites with content type | [95.4%, 99.9%] |

Table 3. Experimental configuration.
| Item | Configuration |
|---|---|
| Operating System | Linux, Windows |
| Browser | Chrome 92 |
| Number of LLM Applications | 25 |
| Size of LLM Application Website Interrupt Trace | 2500 |
| Size of LLM Application Interrupt Trace | 2500 |
| Size of LLM Application Network Packet Trace | 2500 |
| Classification Model | LSTM, CNN, KNN, RF, SVM, LSTM+CNN |
| Total dataset size | 7500 |

Table 4. Comparative results under varying learning methods.
| Model | LLM-WFIN-Post | LLM-WFIN-Prior |
|---|---|---|
| Random Forest | 90.5 ± 1.2% | 90.6 ± 1.3% |
| KNN | 90.4 ± 1.4% | 87.4 ± 1.1% |
| SVM | 92.9 ± 1.3% | 88.9 ± 1.5% |
| CNN | 93.6 ± 0.9% | 90.6 ± 0.7% |
| LSTM | 94.3 ± 0.8% | 91.3 ± 1.0% |
| CNN+LSTM | 97.2 ± 0.6% | 93.4 ± 0.9% |

Table 5. Attack accuracy with and without defenses.
| Attack | Accuracy |
|---|---|
| Interrupt-based WF attack [10] | 96.6% |
| Packet-based WF attack [11] | 98.1% |
| Packet-based LLM application attack with randomized packets | 81.6 ± 1.5% |
| Packet-based LLM application query type attack | 98.6 ± 0.5% |
| Interrupt-based LLM application attack | 6.5 ± 3.2% |
| Interrupt and packet-based LLM application attack (post-fusion) | 97.2 ± 0.7% |
| Interrupt and packet-based LLM application attack (prior-fusion) | 93.2 ± 1.0% |
| Interrupt and packet-based LLM application attack (post-fusion) with randomized timer | 15.6% (92.1 ± 1.2%) |
| Interrupt and packet-based LLM application attack (post-fusion) with randomized packets | 78.3 ± 1.8% |
| Interrupt and packet-based LLM application attack (post-fusion) with randomized timer and packets | 10.2% (81.6 ± 1.5%) |
| Interrupt and packet-based LLM application attack (prior-fusion) with randomized timer | 6.3 ± 3.1% |
| Interrupt and packet-based LLM application attack (prior-fusion) with randomized packets | 80.3 ± 1.6% |
| Interrupt and packet-based LLM application attack (prior-fusion) with randomized timer and packets | 3.2 ± 1.6% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jiao, J.; Yang, H.; Wen, R. LLM-WFIN: A Fine-Grained Large Language Model (LLM)-Oriented Website Fingerprinting Attack via Fusing Interrupt Trace and Network Traffic. Electronics 2025, 14, 1263. https://doi.org/10.3390/electronics14071263
