Online Traffic Obfuscation Experimental Framework for the Smart Home Privacy Protection

Huang, Shuping; Cao, Jianyu; Chen, Ziyi; Zhong, Qi; Zhang, Minghe

doi:10.3390/electronics14163294


by Shuping Huang ^1,†, Jianyu Cao ^1,2,*, Ziyi Chen ^1,†, Qi Zhong ^3,* and Minghe Zhang

¹ School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
² State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
³ Faculty of Data Science, City University of Macau, Macau 999078, China
^* Authors to whom correspondence should be addressed.
^† These authors contributed equally to this work.

Electronics 2025, 14(16), 3294; https://doi.org/10.3390/electronics14163294

Submission received: 8 July 2025 / Revised: 5 August 2025 / Accepted: 17 August 2025 / Published: 19 August 2025

(This article belongs to the Section Networks)


Abstract

Attackers can use Ethernet or WiFi sniffers to capture smart home device traffic and identify device events based on packet length and timing characteristics, thereby inferring users’ home behaviors. To address this issue, traffic obfuscation techniques have been extensively studied, with common methods including packet padding, packet segmentation, and fake traffic injection. However, existing research predominantly utilizes non-real-time traffic to verify whether traffic obfuscation techniques can effectively reduce the recognition rate of traffic analysis attacks on smart home devices. It often overlooks the potential impact of obfuscation operations on device connectivity and functional integrity in real network environments. To address this limitation, an online experimental framework for three fundamental traffic obfuscation techniques is proposed: packet padding, packet segmentation, and fake traffic injection. Experimental results demonstrate that the proposed framework maintains the continuous connectivity and functional integrity of smart home devices with a low system overhead, achieving an average CPU usage rate of less than 0.4% and an average memory occupancy rate of less than 2%. Evaluation results based on the random forest classification method show that the device event recognition accuracy for injected fake traffic exceeds 89%. In this context, a higher recognition accuracy indicates that attackers are more effectively deceived by the injected fake traffic. Conversely, the recognition accuracy for packet padding and packet segmentation methods is nearly zero, and a lower recognition accuracy in these cases implies a more effective implementation of those obfuscation techniques. 
Further evaluation results based on the deep learning classification method reveal that the packet segmentation approach significantly reduces device recognition accuracy for certain devices to below 5%, while simultaneously increasing the false recognition rate for other devices to over 95%. In contrast, fake traffic injection achieves a device recognition accuracy exceeding 90%. Moreover, the obfuscation effect of the packet padding method is found to be suboptimal, a finding consistent with existing literature suggesting that no single obfuscation technique can effectively withstand all types of traffic analysis attacks.

Keywords: smart home; traffic analysis; privacy protection; traffic obfuscation; online experimental framework

1. Introduction

The global smart home market continues to expand, driven by the development of the Internet of Things (IoT) and network communication technologies. According to [1], the market was valued at USD 127.80 billion in 2024 and is expected to grow at a compound annual growth rate (CAGR) of 27.0% from 2025 to 2030. However, several studies [2,3] have demonstrated that attackers can capture the network traffic of smart home devices using Ethernet or WiFi sniffers. They can then leverage packet length and timing characteristics to infer device events, thereby analyzing users’ home behaviors. For example, monitoring the activity pattern of a smart door sensor can allow an attacker to deduce the user’s absence. Similarly, analyzing interactions with a smart camera could reveal occupancy status. Such inferences pose significant privacy risks, as attackers can reconstruct daily routines or even identify vulnerable time windows for physical intrusions.

To mitigate these risks, traffic obfuscation techniques have been proposed [4]. These techniques are typically operated either between the Internet Service Provider (ISP) and the WiFi Access Point (AP), or between the AP and individual smart home devices. They aim to disrupt the correlation between traffic patterns and user behaviors. Current research primarily evaluates obfuscation techniques (e.g., packet segmentation and padding) using non-real-time traffic, such as simulated, offline, and replayed traffic, focusing on their ability to reduce the accuracy of traffic analysis attacks. However, as emphasized by Jmila et al. [5], such approaches fail to validate whether the obfuscation mechanisms can maintain device connectivity and functional integrity in real-world networks. This limitation arises because offline experiments cannot replicate dynamic network conditions like latency sensitivity or protocol compatibility. To address this limitation, this paper proposes an online traffic obfuscation experimental framework. This framework is designed to mitigate the inherent constraints of using non-real-time traffic in developing and evaluating traffic obfuscation techniques. The design concept was inspired by our in-depth understanding of the Linux network protocol stack, particularly the mechanism for capturing and processing packets in user space. During the observation of IoT device communication behaviors, we recognized that introducing an intermediate control node between the device and the external network would be possible. This approach could achieve real-time intervention and control of the device’s traffic without requiring any modifications to the device itself. This key insight served as the foundation for constructing a controllable experimental link to enable online obfuscation operations. The main contributions are summarized as follows.

An online traffic obfuscation experimental network is established, which is an operational network link between smart home devices and their external router that enables real-time capture and dynamic obfuscation of traffic patterns (including packet size and timing characteristics) while maintaining normal device operation.
The implemented platform supports three fundamental obfuscation primitives: fake traffic injection, packet padding, and packet segmentation. Additionally, it provides an extensible architecture for integrating and evaluating novel complex obfuscation methods through continuous online validation.
Our evaluation confirms the framework’s ability to preserve device connectivity and functional integrity during obfuscation. While the basic techniques demonstrate partial effectiveness against traffic analysis (consistent with existing literature), the results highlight the need for developing advanced composite methods building upon these foundational approaches to achieve stronger protection.

The remainder of the paper is organised as follows. Section 2 provides a comprehensive review of related work. Section 3 presents the online traffic obfuscation experimental network. Section 4 elaborates on the online experimental framework for three fundamental traffic obfuscation techniques. Section 5 verifies the performance of the experimental framework. Section 6 concludes this paper.

2. Related Work

Traffic obfuscation techniques have been extensively studied as a key strategy for defending against traffic analysis attacks. The core technical approaches in these studies typically include fake traffic injection, packet padding, and packet segmentation. Furthermore, the effectiveness of traffic obfuscation methods is primarily evaluated using simulated traffic, offline traffic datasets, and replayed traffic scenarios.

Simulated Traffic. In this experimental evaluation scenario, a sequence of simulated packets is constructed based on the packet sizes and timestamps of real traffic to evaluate the effectiveness of traffic obfuscation methods. For instance, Datta et al. [6] developed a Python library for traffic obfuscation by employing payload padding, packet segmentation, and random overlay during data transmission. In the experimental evaluation, the researcher extracted the payload length and sending time of real device traffic and generated simulated traffic based on these parameters. Then, the communication between the device and the server was simulated using a socket. Subsequently, unidirectional traffic obfuscation was implemented at both ends of the socket.

Offline Traffic. In this experimental evaluation scenario, real traffic datasets, such as Packet Capture (PCAP) files, are directly modified according to predefined traffic obfuscation policies. For example, Apthorpe et al. [7] proposed the Stochastic Traffic Padding (STP) mechanism, which aims to provide a flexible balance between privacy protection and resource overhead for users. The researchers applied STP to traffic traces (in the form of PCAP files) generated by real smart home devices and assessed its effectiveness in resisting traffic analysis attacks. Wang et al. [8] proposed a traffic padding method that integrates an adaptive mechanism with differential privacy, aiming to obscure the traffic patterns of IoT devices. The obfuscated traffic traces were generated using a real-world traffic dataset collected from Amazon Echo devices and subsequently employed to evaluate the method’s efficacy in countering traffic analysis attacks. Alshehri et al. [9] proposed a packet size obfuscation mechanism based on uniform noise padding, aiming to enhance privacy protection for smart home traffic in Virtual Private Network (VPN) tunnels. This method applies noise padding and encryption to packets at the smart home side to achieve packet size obfuscation. The processed packets are then restored at the VPN receiver side. The effectiveness of the obfuscation method was validated by modifying the public dataset [10] and conducting subsequent recognition tasks. Pinheiro et al. [11] proposed an adaptive packet padding mechanism that dynamically adjusts the number of padding bytes in response to variations in home network utilization, thereby enhancing traffic obfuscation. The authors utilized a publicly available dataset [12] to extract raw traffic, adjusted packet lengths according to a predefined policy, and subsequently input the processed dataset into a classifier for evaluation, thereby assessing the effectiveness of the proposed padding strategy. Brahma et al. 
[13] proposed a traffic obfuscation mechanism that combines virtual packet generation with dynamic link padding. The method was tested on a publicly available dataset [14], and its effectiveness in defending against traffic analysis attacks was evaluated by comparing the dataset before and after traffic shaping. Alyami et al. [15] proposed a traffic obfuscation method based on fake traffic injection. This method makes it harder for attackers to distinguish between devices A and B by injecting fake packets into each device’s traffic according to the other’s traffic pattern. The research team conducted simulation experiments using two Linux computers, synthesizing fake packets based on real traffic traces. Subsequently, these fake packets were integrated into the collected real traffic traces to validate the effectiveness of the proposed method. Zhang et al. [16] proposed a smart home traffic obfuscation method based on the virtual user technique. The method enhances privacy protection by injecting traffic fingerprints of device activities into actual traffic to obfuscate device states and user behavior. The experimental validation was conducted on an offline traffic dataset. Alyami et al. [17] proposed a method to achieve IoT traffic obfuscation by randomizing packet sizes. Instead of introducing additional noise, the method enhances traffic indecipherability by dividing Transmission Control Protocol (TCP) segments into randomly sized chunks of data. This division disrupts the original packet length distribution. The experimental validation relies on pre-captured interactive traffic between devices and servers, with obfuscation implemented on static files.

Replay Traffic. In this experimental evaluation scenario, historical traffic is replayed with the help of tools like Tcpreplay. For example, Pinheiro et al. [18] proposed a lightweight packet length obfuscation method that combines maximum padding with random padding. This approach replays IoT traffic using Tcpreplay, obtains and pads packets via Netfilter’s FORWARD hook with Socket Buffer (SKB), captures the obfuscated traffic at the interface through Tcpdump to generate PCAP files, and verifies the obfuscation effectiveness using a classifier.

The studies above used simulated, offline, and replayed traffic to verify whether these traffic obfuscation methods can effectively reduce the recognition rate of traffic analysis attacks on smart home device events. Such evaluations, however, cannot determine whether an obfuscation method maintains the continuous connectivity and functionality of smart home devices in an actual network environment.

To address the aforementioned issues, traffic obfuscation techniques should be implemented in the network link between the real-world servers and smart home devices. A few existing studies have attempted online verification of such methods. For example, Hafeez et al. [19] proposed a dummy traffic generation method. This method transmits dummy traffic at a constant rate to the uplink during the device’s inactive periods, thereby concealing the device’s actual activity status. These dummy packets share the same transmission path as legitimate traffic but are silently dropped at the destination using embedded markers that identify them as non-genuine. The performance of the obfuscation method was evaluated by conducting unidirectional injection of fake traffic in a real network environment. Similarly, Zhu et al. [20] proposed the Air-Padding method, which alters the device traffic pattern by injecting crafted packets between the AP and the device. This prevents the attacker from identifying the device type or inferring its operational state. In the experimental evaluation, the iptables tool was employed to block the device’s communication, and Air-Padding was performed near the device using a laptop, thus interfering with air interface traffic analysis. To the best of our knowledge, while a few fake traffic injection methods have been verified using online real traffic, no studies have yet demonstrated online traffic obfuscation based on packet padding or segmentation techniques.

In summary, while existing traffic obfuscation methods have achieved certain theoretical and experimental outcomes, their performance validation has been predominantly limited to offline, non-real-time traffic scenarios. While several studies have investigated fake traffic injection through real-time network traffic analysis (e.g., [19,20]), this body of research remains incomplete. The literature still lacks comprehensive experimental frameworks specifically designed to evaluate packet padding and segmentation techniques in online operational environments. To bridge this critical gap, this paper presents a novel online traffic obfuscation experimental framework that simultaneously supports fake traffic injection, packet padding, and segmentation techniques. Our framework overcomes the core limitations of current verification approaches. It substantially improves the real-world applicability of traffic obfuscation technologies in operational smart home environments.

3. Online Traffic Obfuscation Experimental Network

A typical smart home network structure is illustrated in Figure 1. This network includes the Smart Home ISP, home router, intelligent gateway, and various smart home devices. However, ordinary researchers are generally not permitted to run programs or execute scripts on these nodes. To address this limitation, we construct an experimental network on the device side of the smart home network. Specifically, we establish a programmable link between the smart home devices and the home router connected to the external network. This link enables real-time obfuscation of actual traffic and validates the effectiveness of the traffic obfuscation approach.

The structure of the experimental network is illustrated in Figure 2. The IP address, device name, and specification for each network node are detailed in Table 1. Node 1 serves as the home router, connecting to the external Internet. Nodes 2 and 7 are responsible for capturing traffic before obfuscation or after restoration. Nodes 3 and 6 process the traffic in real time based on a predefined obfuscation strategy. Node 4 is designated for capturing the traffic after obfuscation. Nodes 1–4 and 5–7 are interconnected via wired links, whereas a wireless relay establishes the connection between Node 4 and Node 5. This hybrid connectivity approach enables obfuscated traffic transmission through both wired and wireless links. Such dual-path propagation significantly improves experimental environment fidelity by closely mimicking actual smart home network architectures.

4. Online Traffic Obfuscation Experimental Framework

Based on the experimental network presented in Figure 2, an online traffic obfuscation experimental framework has been established, as depicted in Figure 3. Its primary functions encompass original traffic capture, obfuscated traffic capture, synthetic traffic injection/termination, traffic filtering and redirection/re-forwarding, and traffic obfuscation/restoration. The original traffic capture and obfuscated traffic capture are implemented using the Tcpdump tool (version 4.3.9), and the performance of the obfuscation method is further verified through traffic analysis. Synthetic traffic injection/termination, traffic filtering and redirection/re-forwarding, and traffic obfuscation/restoration constitute the core functions of the obfuscation method, which are described as follows.

(a): Synthetic traffic injection/termination
Based on the interaction traffic (captured in PCAP files) between the server and smart home devices, synthetic packets are constructed and injected from node 2 (or node 7). Subsequently, these packets assist node 3 (or node 6) in completing the injection of fake traffic. Finally, the synthetic packets are terminated at node 7 (or node 2).
(b): Traffic filtering and redirection
At nodes 3 and 6, packets are filtered and redirected based on the specified source address or destination address. For instance, the following commands are used to filter and redirect traffic associated with IP address 192.168.2.175 to NFQUEUE queue 1:
iptables -A FORWARD -s 192.168.2.175 -j NFQUEUE --queue-num 1
iptables -A FORWARD -d 192.168.2.175 -j NFQUEUE --queue-num 1
NFQUEUE is a target in the Netfilter framework that enables packets to be passed from the kernel space to user-space programs for processing. User-space programs can utilize the NetfilterQueue library in Python (version 3.10.13) to read packets from a specified queue and determine whether to accept, drop, or alter the packets as required. The structure of these packets is illustrated in Figure 4, where the numbers in parentheses represent the number of bits occupied by the corresponding field. The highlighted fields indicate where modifications occur. Green fields are modified during fake traffic injection, blue during packet segmentation, purple during both padding and segmentation, and yellow during all three operations. Since the data is transmitted over the Transport Layer Security (TLS) protocol, the TCP payload, also referred to as the TLS layer, consists of the “Type”, “Version”, “Length”, and “Fragment” fields. Here, the “Type” and “Version” fields record information about the TLS protocol, while the “Length” field specifies the length (in bytes) of the “Fragment”.
(c): Traffic obfuscation/restoration
At node 3, the downlink traffic (from the server to the home device) is obfuscated, and its characteristics are randomized to decrease the recognition accuracy of specific device traffic. Prior to reaching the home device, the obfuscated downlink traffic is restored to its original content at node 6 as required. Similarly, the uplink traffic (from the home device to the server) undergoes the same process at nodes 6 and 3, respectively, ensuring efficient bidirectional obfuscation and restoration.
(d): Traffic re-forwarding
After the specified traffic is obfuscated or restored in user space, it is subsequently reinjected into kernel space and re-forwarded via the packet.accept() method of the NetfilterQueue library, ensuring normal communication.
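The filtering, processing, and re-forwarding steps in (b) to (d) can be sketched as a user-space callback. The following is a minimal illustration under stated assumptions, not the framework's actual code: make_callback, run_queue, and the QueuedPacket protocol are hypothetical names, and the obfuscation or restoration logic is left as an arbitrary bytes-to-bytes transform. Only bind(), run(), unbind(), get_payload(), set_payload(), and accept() are calls of the NetfilterQueue library.

```python
from typing import Callable, Protocol


class QueuedPacket(Protocol):
    """The subset of the NetfilterQueue packet interface used in this sketch."""

    def get_payload(self) -> bytes: ...
    def set_payload(self, payload: bytes) -> None: ...
    def accept(self) -> None: ...


def make_callback(transform: Callable[[bytes], bytes]) -> Callable[[QueuedPacket], None]:
    """Wrap a bytes -> bytes transform (hypothetical) as an NFQUEUE callback."""

    def callback(packet: QueuedPacket) -> None:
        original = packet.get_payload()   # raw IP packet handed over by the kernel
        modified = transform(original)    # obfuscate or restore in user space
        if modified != original:
            packet.set_payload(modified)  # give the altered bytes back to the kernel
        packet.accept()                   # reinject into kernel space for re-forwarding

    return callback


def run_queue(queue_num: int, transform: Callable[[bytes], bytes]) -> None:
    """Bind to an NFQUEUE and process packets until interrupted (requires root)."""
    from netfilterqueue import NetfilterQueue  # third-party library, imported lazily

    nfqueue = NetfilterQueue()
    nfqueue.bind(queue_num, make_callback(transform))
    try:
        nfqueue.run()
    finally:
        nfqueue.unbind()
```

With the iptables rules from (b) in place, calling run_queue(1, transform) on node 3 or node 6 would route every matching packet through the transform before it is forwarded. Keeping the transform separate from the queue handling also allows it to be exercised on stored packets without root privileges.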

Based on the aforementioned experimental framework, this paper elaborates on the online experimental principles and execution processes. It specifically addresses three fundamental traffic obfuscation technologies: fake traffic injection, packet padding, and packet segmentation.

4.1. Fake Traffic Injection

Fake traffic injection refers to generating a set of synthetic traffic patterns based on previously captured interaction traffic between the server and the device, and strategically injecting them into the network when the smart home device is idle, thereby effectively interfering with the attacker’s judgment. The implementation principle is illustrated in Figure 5, while the corresponding simplified flowchart is presented in Figure 6, together offering a comprehensive overview of the method’s operational logic and structural design.

Synthetic traffic injection/termination
Synthetic traffic is a foundational type of traffic specifically designed to emulate real device behavior, serving as a critical support mechanism for fake traffic injection strategies. At nodes 2 and 7, the Scapy library uses captured device traffic (PCAP files) to synthesize and inject network traffic, simulating realistic bidirectional communication. This approach circumvents the operating system’s TCP/IP protocol stack, enabling direct transmission of custom packets via Scapy’s send() function, thus enhancing controllability. In the packet injection process, various fields of IP and TCP layers can be modified flexibly as required. To ensure that the timing characteristics of the injected traffic closely resemble those of real traffic, it is necessary to minimize the discrepancy between the packet intervals of the synthetic traffic generated on nodes 2 and 7 and those of the actual traffic. Additionally, the ntpd -g -p ntp.aliyun.com command is employed to synchronize the system time of nodes 2 and 7 with standard time, ensuring temporal consistency and thereby further enhancing the accuracy and timing coherence of the injected traffic.
Smart home devices perform the TCP three-way handshake only when establishing a network connection. The handshake process is not required during the subsequent remote control phase. Accordingly, the fake traffic injection method implemented in the experimental setup also excludes the handshake process, ensuring alignment with the communication behavior observed during the remote control phase. Although no actual transport-layer connection (e.g., a TCP three-way handshake) is established, key parameters such as timestamps, IP and port combinations, and sequence numbers can be utilized. These parameters enable the reconstruction of a seemingly legitimate and continuous bidirectional communication trace. The synthetic traffic constructed on node 2 eventually reaches its destination, node 7, and the traffic constructed on node 7 likewise arrives at node 2, where the termination process is completed.
IP address modification
For downlink traffic, the process of fake traffic injection and elimination is as follows. Firstly, traffic originating from node 2 and destined for node 7 is intercepted in user space from kernel space at node 3. Subsequently, the source address of the packet is modified from 192.168.2.2 to the IP address of the smart home server, while the destination address is modified to the IP address of a device within the smart home network (e.g., 192.168.2.175). The modified packet is then reinjected into kernel space for re-forwarding. At node 6, traffic with a source address corresponding to the smart home server IP and with a designated destination address (e.g., 192.168.2.175) is again intercepted in user space. The source and destination addresses of the packet are subsequently restored to 192.168.2.2 and 192.168.2.5, respectively. Finally, the modified traffic is reinjected into kernel space for further forwarding.
For uplink traffic, similar to the aforementioned process, fake traffic injection and elimination operations are carried out at nodes 6 and 3, respectively.
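The address modification described above implies that the checksums depending on the IP addresses must also be recomputed before the packet is reinjected. The sketch below illustrates this for an IPv4/TCP packet using only the Python standard library. The function names are hypothetical, the server address 203.0.113.10 in the usage note is a placeholder, and the paper does not state how its implementation recomputes checksums.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_addresses(packet: bytes, new_src: str, new_dst: str) -> bytes:
    """Return a copy of an IPv4/TCP packet with new addresses and fixed checksums."""
    ihl = (packet[0] & 0x0F) * 4  # IP header length in bytes
    src, dst = socket.inet_aton(new_src), socket.inet_aton(new_dst)

    # IPv4 header: addresses occupy bytes 12..19, the header checksum bytes 10..11.
    header = bytearray(packet[:ihl])
    header[12:16], header[16:20] = src, dst
    header[10:12] = b"\x00\x00"
    header[10:12] = struct.pack("!H", internet_checksum(bytes(header)))

    # The TCP checksum (bytes 16..17 of the segment) covers a pseudo-header
    # that contains both IP addresses, so it changes as well.
    segment = bytearray(packet[ihl:])
    segment[16:18] = b"\x00\x00"
    pseudo = src + dst + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    segment[16:18] = struct.pack("!H", internet_checksum(pseudo + bytes(segment)))
    return bytes(header) + bytes(segment)
```

Under these assumptions, node 3 would call rewrite_addresses(raw, "203.0.113.10", "192.168.2.175") on a downlink synthetic packet, and node 6 would call rewrite_addresses(raw, "192.168.2.2", "192.168.2.5") to restore it.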

4.2. Packet Padding

Packet padding involves adding random-length padding to packets during communication between the home device and the server, and removing the padding before the packets are received. A simplified flowchart of packet padding is shown in Figure 7.

4.2.1. Implementation Principles of Packet Padding

For downlink traffic, the implementation principles of packet padding are outlined as follows.

Packet padding
At node 3, the packets destined for home devices are intercepted into the user space. The TCP payload of each packet is extended with a random-length padding. This padding is encrypted with the method agreed between nodes 3 and 6, and its length is stored in the IP header “Options” field. The padding consists of random characters and is appended to the end of the TCP payload. Several fields are modified, including the IP “Internet header length”, “Total length”, “Header checksum”, and “Options” fields; the TCP “Sequence number” ($seq$), “Acknowledgment number” ($ack$), and “Checksum” fields; as well as the TLS “Length” field. The positions of these fields are illustrated in Figure 4. Subsequently, the padded packet is re-forwarded.
Packet restoration
At node 6, the packets destined for home devices are intercepted again into the user space. The packet is restored by removing a specific number of characters from the end of the TCP payload, where the number of characters to be removed is obtained from the IP header “Options” field. The relevant fields need to be modified. Subsequently, the restored packet is re-forwarded.

For uplink traffic, packets from home devices are padded at node 6 and restored at node 3, similar to the downlink padding process.
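The payload-level part of padding and restoration can be illustrated with a short sketch. It assumes that the TCP payload holds a single TLS record starting at byte 0 and that the TLS “Length” field is simply increased by the padding length; both are assumptions of this example, as are the function names. Encryption of the padding, the IP “Options” field that carries the padding length, and the IP/TCP header updates are omitted.

```python
import os
import struct

TLS_HEADER_LEN = 5  # Type (1 byte) + Version (2 bytes) + Length (2 bytes)


def pad_payload(payload: bytes, pad_len: int) -> bytes:
    """Append pad_len random bytes and enlarge the TLS Length field to match."""
    if len(payload) < TLS_HEADER_LEN or pad_len == 0:
        return payload  # empty or non-TLS payloads are left untouched
    (record_len,) = struct.unpack("!H", payload[3:5])
    header = payload[:3] + struct.pack("!H", record_len + pad_len)
    return header + payload[TLS_HEADER_LEN:] + os.urandom(pad_len)


def restore_payload(padded: bytes, pad_len: int) -> bytes:
    """Strip pad_len bytes from the end and restore the TLS Length field."""
    if len(padded) < TLS_HEADER_LEN or pad_len == 0:
        return padded
    (record_len,) = struct.unpack("!H", padded[3:5])
    header = padded[:3] + struct.pack("!H", record_len - pad_len)
    return header + padded[TLS_HEADER_LEN:len(padded) - pad_len]
```

In this sketch the padding length is passed explicitly to restore_payload; in the framework it would be read from the IP header “Options” field at the restoring node.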

In the process of padding and restoration, the fields requiring modification primarily depend on $seq$, $ack$, and the TCP payload length ($len$), collectively referred to as the triplet $\{seq, ack, len\}$. Consider the scenario where a user sends a “turn-on” instruction to an LED bulb via a mobile terminal (see Figure 8). The update processes for $seq$, $ack$, and $len$ are as follows. Suppose that when the “turn-on” instruction is issued, the first packet (Packet 1) sent by the smart home server to the LED bulb has a sequence number $seq = \alpha$, an acknowledgment number $ack = \beta$, and a TCP payload length $len = m$. At node 3, Packet 1 is padded with $x$ bytes, making $len = m + x$, while $seq$ and $ack$ remain unchanged. At node 6, Packet 1 is restored. After receiving Packet 1, the LED bulb sends a reply packet (Packet 2) to the server, where $seq = \beta$, $ack = \alpha + m$, and $len = n$. If $n > 0$, Packet 2 is padded with $y$ bytes at node 6, making $len = n + y$, $ack = \alpha + m + x$, and $seq$ unchanged. If $n = 0$, Packet 2 is not padded, but $ack$ is updated to $\alpha + m + x$. At node 3, Packet 2 is restored. Upon receiving Packet 2, the server sends another packet (Packet 3) to the LED bulb, where $seq = \alpha + m$, $ack = \beta + n$, and $len = w$. If $w > 0$, Packet 3 is padded with $z$ bytes at node 3, making $len = w + z$, $seq = \alpha + m + x$, and $ack = \beta + n + y$. If $w = 0$, Packet 3 is not padded, but $seq = \alpha + m + x$ and $ack = \beta + n + y$. At node 6, Packet 3 is restored. The update of the triplet $\{seq, ack, len\}$ can be performed based on algorithms provided below.
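The LED bulb example can also be read as cumulative-offset bookkeeping: on the obfuscated link between nodes 3 and 6, $seq$ is shifted by the padding already added in the packet's own direction, and $ack$ by the padding already added in the opposite direction. The sketch below models only this reading, for an in-order, loss-free exchange. It is not the table-based method of Algorithm 1, and the names Triplet and OffsetTracker are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    seq: int
    ack: int
    length: int  # TCP payload length (len in the text)


class OffsetTracker:
    """Tracks the padding added so far in each direction on the obfuscated link."""

    def __init__(self) -> None:
        self.added = {"down": 0, "up": 0}

    def obfuscate(self, direction: str, original: Triplet, pad: int) -> Triplet:
        other = "up" if direction == "down" else "down"
        pad = pad if original.length > 0 else 0  # packets without payload are not padded
        shifted = Triplet(
            seq=original.seq + self.added[direction],  # padding already sent this way
            ack=original.ack + self.added[other],      # padding already received
            length=original.length + pad,
        )
        self.added[direction] += pad
        return shifted
```

For instance, with $\alpha = 1000$, $\beta = 5000$, $m = 100$, $x = 7$, $n = 20$, $y = 3$, $w = 50$, and $z = 11$, Packet 2 appears on the obfuscated link with $seq = 5000$, $ack = 1107$, and $len = 23$, and Packet 3 with $seq = 1107$, $ack = 5023$, and $len = 61$, which matches $\alpha + m + x$ and $\beta + n + y$ in the text.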

4.2.2. Key Algorithms for Packet Padding

When padding or restoring packets, node 3 calculates the triplet after downlink padding and uplink restoration using Algorithm 1, while node 6 computes it after downlink restoration and uplink padding with a variant of Algorithm 1.

Algorithm 1 Calculate $seq$, $ack$ and $len$ at Node 3.

Input: $IP$, $seq'$, $ack'$, $len'$, $d \in \{0, 1\}$, $l$, $T_0$, $T_1$
Output: $seq$, $ack$, $len$

1: $T_{src} \leftarrow T_d$, $T_{dst} \leftarrow T_{1-d}$
2: $flag \leftarrow d$
3: $hash\_key \leftarrow hash(IP \,\|\, ack' \,\|\, seq' \,\|\, flag)$
4: $E \leftarrow$ lookup entry with key $hash\_key$ in $T_{src}$
5: if $E = \varnothing$ then
6:     $seq \leftarrow seq'$, $ack \leftarrow ack'$
7: else
8:     $seq \leftarrow E(3)$, $ack \leftarrow E(2) + E(4)$
9: end if
10: if $len' = 0$ then
11:     $len \leftarrow 0$
12: else
13:     $len \leftarrow len' - (-1)^d \cdot l$
14: end if
15: $new\_key \leftarrow hash(IP \,\|\, ack \,\|\, (seq + len) \,\|\, (1 - flag))$
16: $T_{dst} \leftarrow T_{dst} \cup \{(new\_key, seq', ack', len')\}$

The inputs to Algorithm 1 include: the destination IP address (for downlink packets) or the source IP address (for uplink packets), the original sequence number $seq'$, acknowledgment number $ack'$, TCP payload length $len'$, the packet direction flag $d \in \{0, 1\}$, the padding length $l$, and the global tables $T_0$ and $T_1$. The outputs are the updated values of the sequence number $seq$, acknowledgment number $ack$, and payload length $len$ after padding or restoration. The flag $d = 1$ denotes a downlink packet, for which $l$ bytes are padded at the end of the payload. Conversely, $d = 0$ indicates an uplink packet, from which $l$ bytes are removed during restoration. The tables $T_0$ and $T_1$ store historical packet triplets to support sequence and acknowledgment number reconstruction. Specifically, $T_1$ is queried when processing downlink packets, while $T_0$ is used for uplink packets. The output of each transformation (padding or restoration) is recorded in the corresponding opposite table to support future reverse mapping. The $hash()$ function is used to construct unique keys from packet metadata, and the $\|$ operator denotes field concatenation. $E(i)$ represents the $i$-th component of vector $E$. In Algorithm 1, the lookup table $T_d$ is used to locate prior transformation metadata using a hash of the IP, sequence, acknowledgment, and direction flag. If no match is found, the algorithm uses the original $seq'$ and $ack'$ values; otherwise, it calculates updated values from the matched record. The payload length $len$ is either unchanged (if $len' = 0$), or adjusted by adding or subtracting $l$, depending on the direction $d$. Finally, the current packet’s metadata is recorded in the opposite table $T_{1-d}$ to enable the future padding or restoration operation in the reverse direction.

At node 6, when calculating the triplet after padding uplink packets and restoring downlink packets, a variant of Algorithm 1 is employed. The key differences are as follows: when $d = 1$, $l$ bytes of data are removed from the end of the packet; conversely, when $d = 0$, $l$ bytes of data are randomly appended to the packet. Thus, line 13 is changed to $len \leftarrow len' + (-1)^{d} \cdot l$.
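The payload-length rule of Algorithm 1 and of its node 6 variant can be illustrated with a minimal Python sketch. The function name `updated_payload_len` and the `variant` switch are hypothetical; the sign used for the non-variant case is inferred from the prose (pad when $d = 1$, remove when $d = 0$), and it is assumed that zero-length payloads are left unchanged in both cases.

```python
def updated_payload_len(len_prime: int, d: int, l: int, variant: bool = False) -> int:
    """Return the TCP payload length after padding or restoration.

    len_prime : original payload length (len')
    d         : direction flag, 1 = downlink, 0 = uplink
    l         : padding length in bytes
    variant   : False -> Algorithm 1 (pad downlink, restore uplink)
                True  -> node 6 variant (restore downlink, pad uplink)
    """
    if d not in (0, 1):
        raise ValueError("d must be 0 (uplink) or 1 (downlink)")
    if len_prime == 0:
        # Assumption: packets without payload (pure ACKs) are never padded.
        return 0
    # Variant: len' + (-1)^d * l. Algorithm 1 uses the opposite sign (inferred).
    sign = (-1) ** d if variant else -((-1) ** d)
    return len_prime + sign * l
```

Under these assumptions, a downlink packet padded by Algorithm 1 and then passed through the variant with the same $l$ returns to its original payload length.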

4.3. Packet Segmentation

Packet segmentation refers to segmenting and re-encapsulating the TCP payload during communication between the home device and the server, and restoring key header fields before data reception. For downlink traffic, packets destined for home devices are intercepted into user space at node 3, where segmentation and re-encapsulation occur in steps 1 and 2. The re-encapsulated packets are then reinjected into kernel space for forwarding. At node 6, they are intercepted again for restoration under step 3, and subsequently reinjected into kernel space for further forwarding. A simplified flowchart of packet segmentation is shown in Figure 9.

Step 1: TCP Payload Segmentation. If a packet’s TCP payload length is greater than 0, it can be divided into a fixed or random number of segments. Each segment length is randomly determined, but the first segment must include at least the first 5 bytes of the original TCP payload, which contain the TLS layer’s “Type”, “Version”, and “Length” fields (as shown in Figure 4). Additionally, selective segmentation is also supported, where only some packets are segmented.
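Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_payload` is hypothetical, cut points are drawn uniformly at random, and a payload too short to yield the requested number of non-empty segments is returned unsplit (the paper does not specify this case).

```python
import random
from typing import List

TLS_HEADER_LEN = 5  # Type (1 byte) + Version (2 bytes) + Length (2 bytes)

def split_payload(payload: bytes, n_segments: int, rng: random.Random) -> List[bytes]:
    """Split a TCP payload into n_segments random-length pieces.

    The first piece always contains at least the 5-byte TLS record header.
    """
    # Need at least one byte per extra segment after the header.
    if n_segments <= 1 or len(payload) < TLS_HEADER_LEN + (n_segments - 1):
        return [payload]
    # Distinct cut positions in [5, len(payload) - 1] keep every segment non-empty.
    cuts = sorted(rng.sample(range(TLS_HEADER_LEN, len(payload)), n_segments - 1))
    bounds = [0, *cuts, len(payload)]
    return [payload[a:b] for a, b in zip(bounds, bounds[1:])]
```

Concatenating the returned segments in order reproduces the original payload, which is what allows the receiving side to reassemble the TLS record.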

Step 2: Re-encapsulation. For the first segment, modify the first byte (the “Type” field of the original TLS layer) to a random 1-byte character, set the second and third bytes (the “Version” field) to the byte length of the original packet’s “Fragment”, and update the fourth and fifth bytes (the “Length” field) to the byte length of the first segment’s “Fragment”. Then, use the original packet’s TCP header and update the “Acknowledgment number” (ack) and “Checksum” accordingly. Also, use the original packet’s IP header and mark this re-encapsulated packet as the first segment using its “Options” field. This mark is encrypted using the shared key between nodes 3 and 6 to prevent detection. Finally, update the IP layer’s “Internet header length”, “Total length”, and “Header checksum” accordingly. For other segments, use the original packet’s TCP header and update the “Sequence number” (seq) and “Checksum” accordingly. Also, use the original packet’s IP header and mark this re-encapsulated packet as not the first segment using its “Options” field. This mark is encrypted using the shared key between nodes 3 and 6 to prevent detection. Finally, update the IP layer’s “Internet header length”, “Total length”, and “Header checksum” accordingly. During the re-encapsulation process described above, checksums for both the TCP and IP layers are cleared and then automatically set to appropriate values. The “Internet header length” and “Total length” fields in the IP layer are updated based on the characteristics of the re-encapsulated packet.
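The rewriting of the first five payload bytes in Step 2 can be sketched as below. This is a hedged illustration, not the authors' implementation: the function name `obfuscate_first_segment` is hypothetical, and the original “Fragment” length is assumed to equal the original TCP payload length minus the 5-byte header, i.e., one TLS record per TCP payload. TCP/IP header updates and the encrypted “Options” mark are outside the scope of this sketch.

```python
import random
import struct

TLS_HEADER_LEN = 5  # Type (1) + Version (2) + Length (2)

def obfuscate_first_segment(first_segment: bytes, original_payload_len: int,
                            rng: random.Random) -> bytes:
    """Rewrite the 5-byte TLS record header carried by the first segment."""
    if len(first_segment) < TLS_HEADER_LEN:
        raise ValueError("first segment must contain the 5-byte TLS header")
    # Assumption: the original TCP payload holds exactly one TLS record.
    original_fragment_len = original_payload_len - TLS_HEADER_LEN
    first_fragment_len = len(first_segment) - TLS_HEADER_LEN
    header = struct.pack(
        "!BHH",
        rng.randrange(256),      # byte 1: random value replaces "Type"
        original_fragment_len,   # bytes 2-3: original "Fragment" length
        first_fragment_len,      # bytes 4-5: first segment's "Fragment" length
    )
    return header + first_segment[TLS_HEADER_LEN:]
```

Storing the original “Fragment” length in the “Version” position is what lets the restoring node recover the original “Length” field without any extra state.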

During the aforementioned encapsulation process, we obfuscate the $seq$ and $ack$ information of the re-encapsulated packets by delaying the transmission of TCP acknowledgment packets (with zero payload length) and adjusting their $seq$ values. This approach prevents the original packet size characteristics from being inferred based on the $seq$, $ack$, and TCP payload length of the re-encapsulated packets.

Take the example of a user sending the “turn-on” instruction to an LED bulb via a mobile terminal. Suppose that when issuing the control instruction, the smart home server sends the first packet (Packet 1), which contains application data, to the LED bulb, as shown in Figure 10. This packet has a sequence number $seq = \alpha$, an acknowledgment number $ack = \beta$, and a TCP payload length $len = x$. At node 3, the TCP payload of this packet is divided into two segments, re-encapsulated, and subsequently re-forwarded. The first segment has a length $len = x_{1}$, with re-encapsulated $seq = \alpha$ and $ack = \beta$, and the second segment has a length $len = x_{2}$, with re-encapsulated $seq = \alpha + x_{1}$ and $ack = \beta$, where $x_{1} + x_{2} = x$. Afterward, the server receives a reply packet from the LED bulb. When the server is ready to send the next packet (Packet 3) with application data, it typically sends a TCP acknowledgment packet (Packet 2) first. This packet has a sequence number $seq = \alpha + x$, an acknowledgment number $ack = \beta + \tau$, and a TCP payload length $len = 0$. Node 3 places Packet 2 into the delayed delivery queue and temporarily refrains from forwarding it until after Packet 3 has been segmented. Let Packet 3 have sequence number $seq = \alpha + x$, acknowledgment number $ack = \beta + \tau$, and TCP payload length $len = y$. Upon completing the segmentation of Packet 3, the first segment (Packet 3-1) is forwarded first, retaining the original sequence number $seq$ while updating the acknowledgment number to $ack = \beta$, with TCP payload length $len = y_{1}$. Subsequently, the sequence number $seq$ of the TCP acknowledgment packet in the delay queue is updated to $\alpha + x + y_{1}$, with the acknowledgment number $ack$ unchanged, and the packet is then forwarded. Finally, for the second segment (Packet 3-2), the sequence number $seq$ is adjusted to $\alpha + x + y_{1}$, with the original acknowledgment number $ack$ unchanged. The TCP payload length is $len = y_{2}$, where $y_{1} + y_{2} = y$, and the packet is then forwarded. If Packet 3 is segmented into three or more segments, Packet 2 can be delayed and forwarded between any two segments formed by Packet 3, with the relevant $seq$ and $ack$ values adjusted as needed based on the actual situation.
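The sequence and acknowledgment bookkeeping in this example can be traced with a small Python sketch. The class `Segment` and the helper names are hypothetical, and only the two-segment case described in the text is modeled; the numeric values in any call are placeholders for $\alpha$, $\beta$, $\tau$, $x$, and $y$.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Segment:
    name: str
    seq: int
    ack: int
    length: int  # TCP payload length

def split_in_two(name: str, seq: int, ack: int, length: int, first_len: int) -> List[Segment]:
    """Packet 1 case: both segments keep ack; the second advances seq by first_len."""
    return [
        Segment(f"{name}-1", seq, ack, first_len),
        Segment(f"{name}-2", seq + first_len, ack, length - first_len),
    ]

def interleave_delayed_ack(pure_ack: Segment, data: Segment,
                           first_len: int, earlier_ack: int) -> List[Segment]:
    """Packet 2/3 case: forward 3-1, then the delayed ACK, then 3-2."""
    # 3-1 keeps the original seq but carries the earlier ack value (beta).
    first = Segment(f"{data.name}-1", data.seq, earlier_ack, first_len)
    # The delayed pure ACK has its seq shifted by y1; its ack is unchanged.
    delayed = Segment(pure_ack.name, pure_ack.seq + first_len, pure_ack.ack, 0)
    # 3-2 starts at seq + y1 and keeps the original ack.
    second = Segment(f"{data.name}-2", data.seq + first_len, data.ack,
                     data.length - first_len)
    return [first, delayed, second]
```

With placeholder values $\alpha = 1000$, $\beta = 5000$, $x = 120$, $\tau = 40$, $y = 90$, and $y_{1} = 10$, the forwarding order is Packet 3-1 (seq 1120, ack 5000), Packet 2 (seq 1130, ack 5040), and Packet 3-2 (seq 1130, ack 5040).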

The packet corresponding to the first segment is re-forwarded using the packet.accept() method from the NetfilterQueue library in Python, while the remaining packets are re-forwarded using the send() function from the Scapy library.

Step 3: Restoration. On node 6, the packet destined for the smart home device is intercepted into user space. The segmented and re-encapsulated packet is modified again to ensure normal communication between the device and the server. For the packet corresponding to the first segment, the first 5 bytes of the TCP payload are modified by replacing the fourth and fifth bytes (the “Length” field) with the values of the second and third bytes, which hold the byte length of the original packet’s “Fragment” written in Step 2. The first three bytes can then be restored according to the fixed characters associated with the TLS protocol. Subsequently, the IP header “Options” field is removed, and the “Internet header length”, “Total length”, and “Header checksum” are updated accordingly. For packets corresponding to other segments, the IP header “Options” field is similarly removed, and their lengths and checksums are updated. Finally, the restored packets are forwarded as usual.
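A minimal sketch of the payload part of Step 3 is given below. It reads the original “Fragment” length from the position where Step 2 stored it (payload bytes 2 and 3) and writes it back into the “Length” field. The function name is hypothetical, and the restored “Type” (0x17, application data) and “Version” (0x0303) values are assumptions chosen for illustration; the paper only states that these bytes follow fixed TLS values.

```python
import struct

TLS_HEADER_LEN = 5
ASSUMED_CONTENT_TYPE = 0x17  # assumption: TLS application data
ASSUMED_VERSION = 0x0303     # assumption: record-layer version bytes 03 03

def restore_first_segment(payload: bytes,
                          content_type: int = ASSUMED_CONTENT_TYPE,
                          version: int = ASSUMED_VERSION) -> bytes:
    """Rebuild the TLS record header of the first segment at the restoring node."""
    if len(payload) < TLS_HEADER_LEN:
        raise ValueError("first segment must contain the 5-byte header")
    # Bytes 2-3 carry the original "Fragment" length written during Step 2.
    (original_fragment_len,) = struct.unpack("!H", payload[1:3])
    header = struct.pack("!BHH", content_type, version, original_fragment_len)
    return header + payload[TLS_HEADER_LEN:]
```

Removing the IP “Options” mark and recomputing the IP and TCP checksums would follow this step and are not shown here.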

Similar to downlink packet segmentation, for uplink packets, segmentation and restoration operations are respectively performed on nodes 6 and 3 for packets with a smart home device as the source address.

Remark 1.

Time-based side-channel attacks primarily infer users’ sensitive information by analyzing the correlation between time intervals and response times in IoT traffic. For instance, such attacks can reveal device types, embedded sensor characteristics, and user behaviors. Prates et al. [21] proposed an obfuscation mechanism that employs both delay insertion and fake packet injection to effectively mitigate information leakage between devices and sensors caused by temporal characteristics. Specifically, the delay insertion method alters the average difference between the distributions of all response times (denoted as $\tau$) for each device by introducing a perturbation $\gamma$, resulting in $\tau' = \tau + \gamma$; meanwhile, the fake packet mechanism constructs and injects pseudo request packets, and subsequently generates pseudo responses based on predefined delay rules to obscure the actual sending and receiving timestamps.

Following the approach described in [21], the packet segmentation method proposed in this paper exhibits time obfuscation characteristics. On one hand, segmenting a packet into multiple parts can effectively alter the transmission time interval between original packets and result in an increased number of ACKs. On the other hand, the delayed ACK redirection technique we introduce allows for the random insertion of ACKs between any two segments, thereby introducing a random response delay. This mechanism aligns with the design concept presented in reference [21], and thus demonstrates a certain level of resistance against time-based side-channel attacks.

It should be noted that this paper primarily presents a foundational experimental framework for packet segmentation. The obfuscation parameters and strategies still require flexible adjustment based on varying network environments and attack models in order to further improve the effectiveness of protection against time-based information leakage.

5. Performance Analysis

In this section, traffic obfuscation is implemented using the proposed online experimental framework. The device’s continuous connectivity and functional performance are validated to ensure system stability. Furthermore, the overhead of the obfuscation process, along with the traffic statistical characteristics, device event recognition rate, and device recognition rate before and after obfuscation, are thoroughly analyzed.

5.1. Continuous Connectivity and Functionality of Devices

During the experimental process, the online experimental framework continuously performed traffic obfuscation operations for over 3 h. During this period, the smart home devices were remotely controlled via mobile terminals and consistently functioned normally, completing the on/off operations as instructed without disconnection. The experimental results confirm that the established online traffic obfuscation experimental framework effectively ensures continuous connectivity and functionality of the devices with high practicality and stability.

5.2. Overhead

The online traffic obfuscation experimental framework proposed in this paper involves key processing components such as packet capture, construction, and modification. Among these components, the packet copy operation between user space and kernel space is widely utilized, which inevitably introduces system resource consumption and communication performance degradation. Therefore, to comprehensively evaluate the performance impact of the experimental framework, this section conducts a quantitative analysis from three dimensions: node CPU and memory usage, node throughput, and network performance.

The experiment was divided into two scenarios: one was to run three typical devices (Mijia LED bulb, Xiaomi intelligent camera, and Xiaomi smart socket) simultaneously without enabling the traffic obfuscation mechanism; the other was to enable the traffic obfuscation mechanism and perform obfuscation operations while operating the above three devices. Each device was operated once every 30 s, with an experimental duration of 30 min and a total of 60 operations.

5.2.1. Node CPU and Memory Usage

To evaluate the system resource overhead caused by the traffic obfuscation mechanism at the process level, CPU usage rate (%CPU) and memory usage rate (%MEM) are selected as key indicators. During the experiment, the resource usage of the four nodes that performed obfuscation operations (Nodes 2, 3, 6, and 7) was recorded while each ran continuously for 30 min under its respective task. The average was calculated from per-second samples to reflect the overall resource consumption level.

Specifically, nodes 2 and 7 only participated in fake traffic injection, and their resource utilization rates were the average values during the 30 min of operation of this method. Nodes 3 and 6 participated in packet padding, packet segmentation, and fake traffic injection, with each method running for 30 min. The average resource utilization for each method was then calculated. Finally, the averages of the three methods were used to compute the overall average, which served as the comprehensive resource overhead indicator for each node.

The experimental results are presented in Table 2. The CPU usage across all nodes remains below 0.4%, and memory usage stays under 2%, demonstrating that the proposed obfuscation mechanism incurs only minimal resource overhead.

5.2.2. Node Throughput

To evaluate the impact of the traffic obfuscation mechanism on the performance of the node itself, the average throughput is used as the evaluation index to reflect the data processing capability of the node during task execution. The iPerf3 tool (version 3.15) was used to transmit data continuously in TCP mode for 30 min, and the average value was calculated from per-second samples.

This experiment established two experimental groups. In the first group, where the obfuscation mechanism was not enabled, four nodes (2, 3, 6, and 7) each operated for 30 min, and their average throughput was calculated as a baseline. In the second group, the same four nodes were evaluated under obfuscation. Specifically, nodes 2 and 7 participated only in fake traffic injection, with average throughput measured over 30 min of execution. Nodes 3 and 6 implemented three obfuscation methods—packet padding, packet segmentation, and fake traffic injection—with each method running for 30 min. The average throughput for each method was calculated separately. These individual averages were then combined to compute an overall average, which served as a comprehensive throughput metric for all nodes. By comparing the average throughput of nodes with and without the obfuscation mechanism enabled, we quantitatively assess the performance overhead introduced by the obfuscation mechanism.

The experimental results are presented in Table 3. It can be observed that after activating the obfuscation mechanism, the average throughput of the nodes experienced a decrease of 0.45%, which demonstrates that the performance overhead introduced by the obfuscation mechanism is negligible.

5.2.3. Network Performance

To evaluate the impact of traffic obfuscation mechanisms on network performance, four commonly used metrics were selected for comprehensive assessment: average delay, average throughput, average jitter, and average packet loss rate.

To reproduce typical IoT application scenarios as closely as possible while also testing the scalability of the system, the experiment used the iPerf3 tool to generate 5 Mbps of traffic between nodes in UDP mode, which is above the device traffic peak of approximately 1 Mbps reported in reference [12]. In addition, the Ping tool provided by BusyBox (version 1.36.1) was used to measure delay, ensuring the accuracy and comparability of the results.

The experiment was conducted on the communication link between Node 2 and Node 7, with two experimental groups established: (1) without the obfuscation mechanism, where continuous transmission between nodes was carried out for 30 min, during which the average values of the four network performance metrics were recorded and calculated; and (2) with the obfuscation mechanism enabled, where three obfuscation methods (packet padding, packet segmentation, and fake traffic injection) were implemented separately. Each method was executed for 30 min, and the corresponding average performance metrics were calculated for each. The final average of the three sets of results was used as the comprehensive performance indicator under the obfuscation mechanism. By comparing the results from the two experimental groups, the impact of the proposed obfuscation mechanism on network performance was quantitatively analyzed.

The experimental results are shown in Table 4. After enabling the obfuscation mechanism, the average delay increased by approximately 8.85%, which is lower than the delay overhead introduced by the mechanism proposed in the reference [18]. Compared to the case where the obfuscation mechanism is not enabled, the average throughput, average jitter, and average packet loss rate changed only slightly, indicating that the obfuscation mechanism has a minor impact on the overall network performance.

Meanwhile, the line-speed packet processing framework based on P4 [22] has shown good potential in achieving efficient and low-overhead online traffic management. Future work will focus on exploring the path of integrating such programmable network hardware with existing traffic obfuscation mechanisms to further enhance the real-time processing capabilities and engineering practicality of the system.

5.3. Traffic Statistical Characteristics

Three sets of traffic data were compared and analyzed: the unobfuscated traffic data (Not_obfuscated.pcap), the traffic data after packet padding (Packet_padding.pcap), and the traffic data after packet segmentation (Packet_segmentation.pcap). These datasets were collected during 50 switching operations performed by a mobile terminal controlling the Mijia LED bulb, with each operation separated by a uniform interval of 131 s. The Not_obfuscated.pcap dataset represents the original traffic captured at node 7 in Figure 3 using the Tcpdump tool. The Packet_padding.pcap dataset corresponds to the traffic that was padded and restored at nodes 3 and 6 and subsequently captured at node 4 using Tcpdump. The padding length should be randomly selected within the range of 0 to $(MSS - len)$ bytes, where $MSS$ (Maximum Segment Size) denotes the maximum size of the TCP payload that can be transmitted in a single segment. Since the packet length in smart home devices is typically between 100 and 200 bytes, a range of 0 to 50 bytes is selected to reduce communication overhead. The Packet_segmentation.pcap dataset consists of traffic captured after segmenting TCP packets in different directions at nodes 3 and 6, with Tcpdump employed at node 4. Specifically, each packet was divided into three segments, where the TCP payload lengths of the first and second segments were randomly selected between 8 and 16 bytes, while the remaining portion served as the TCP payload for the third segment.
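The parameter choices described above can be sketched in Python as follows. The function names are hypothetical; the default $MSS$ of 1460 bytes is an assumed typical Ethernet value rather than one reported in the paper, and returning `None` for payloads too short to split is an illustrative policy only.

```python
import random
from typing import Optional, Tuple

def choose_padding_len(payload_len: int, rng: random.Random,
                       mss: int = 1460, cap: int = 50) -> int:
    """Pick a random padding length in [0, min(cap, MSS - len)]."""
    upper = max(0, min(cap, mss - payload_len))
    return rng.randint(0, upper)  # randint is inclusive on both ends

def three_segment_lengths(payload_len: int, rng: random.Random,
                          lo: int = 8, hi: int = 16) -> Optional[Tuple[int, int, int]]:
    """Pick lengths for three segments: first two in [lo, hi], remainder third."""
    if payload_len <= 2 * hi:
        return None  # assumption: leave short payloads unsegmented
    first = rng.randint(lo, hi)
    second = rng.randint(lo, hi)
    return first, second, payload_len - first - second
```

Capping the padding at 50 bytes trades obfuscation strength for lower overhead, which is the balance the text describes for packets of 100 to 200 bytes.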

The packet length distributions of the three traffic sets were statistically analyzed, and Kernel Density Estimation (KDE) curves were plotted, as shown in Figure 11. In the unobfuscated traffic, the mean packet length was 124.42 bytes, the median was 91 bytes, the standard deviation was 95.89, the skewness was 1.92, and the kurtosis was 5.24, exhibiting a certain degree of right-skewness and central tendency. After packet padding processing, the length distribution shifted upward overall, with the mean increasing to 146.63 bytes, the median increasing to 122 bytes, and the standard deviation expanding to 109.06. The skewness and kurtosis decreased to 1.51 and 3.34, respectively, indicating that this strategy increases packet length while making the distribution more symmetric and smoother. After packet segmentation, the packets were split into smaller fragments, resulting in a significant decrease in the mean to 77.24 bytes, a median of 63 bytes, and a standard deviation of 55.43. Meanwhile, the skewness increased to 4.47, and the kurtosis reached 27.00, exhibiting strong right-skewed and peaked characteristics. From the above statistical characteristics, both traffic obfuscation methods alter the length distribution characteristics of the original traffic to varying degrees, thereby enhancing obfuscation effectiveness and improving resistance to identification.
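The summary statistics reported above can be computed from a list of packet lengths with a short standard-library sketch. The function name `describe_lengths` is hypothetical, and the conventions used here (population standard deviation, moment-based skewness, excess kurtosis) are assumptions, since the paper does not state which estimators were used.

```python
import math
import statistics
from typing import Dict, Sequence

def describe_lengths(lengths: Sequence[float]) -> Dict[str, float]:
    """Mean, median, standard deviation, skewness, and excess kurtosis."""
    n = len(lengths)
    if n < 2:
        raise ValueError("need at least two packet lengths")
    mean = statistics.fmean(lengths)
    # Central moments of order 2, 3, and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in lengths) / n
    m3 = sum((v - mean) ** 3 for v in lengths) / n
    m4 = sum((v - mean) ** 4 for v in lengths) / n
    if m2 == 0:
        raise ValueError("all lengths are identical")
    return {
        "mean": mean,
        "median": statistics.median(lengths),
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,       # > 0 means right-skewed
        "kurtosis": m4 / m2 ** 2 - 3.0,   # excess kurtosis (normal = 0)
    }
```

A positive skewness value corresponds to the right-skewed shape described for all three traffic sets.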

In the above experiment, based on the traffic characteristics of the Mijia LED bulb, we set the packet padding length to range from 0 to 50 bytes. This range selection aims to control the communication overhead as much as possible while attempting to introduce a certain degree of length perturbation. To evaluate the actual impact of this setting on the traffic characteristics, we conducted a statistical analysis of the packet length distribution before and after padding using the Kolmogorov-Smirnov (KS) two-sample tests [23,24]. This test determines whether the two samples come from the same distribution by comparing the maximum difference between their empirical cumulative distribution functions (ECDFs).

Specifically, the two packet length sequences before and after padding are extracted and denoted as:

$$X = \{x_{1}, x_{2}, \dots, x_{n}\}, \quad Y = \{y_{1}, y_{2}, \dots, y_{m}\}.$$

Here, $X$ represents the set of packet lengths before padding, and $Y$ represents the set of packet lengths after padding. To compare the distributions of the two sets of data, their ECDFs are first calculated:

$$F_{n}(x) = \frac{1}{n} \sum_{i = 1}^{n} \mathbf{1}(x_{i} \leq x), \quad G_{m}(x) = \frac{1}{m} \sum_{j = 1}^{m} \mathbf{1}(y_{j} \leq x).$$

Here, $\mathbf{1}(\cdot)$ is the indicator function, which takes the value 1 when the condition is satisfied and 0 otherwise. $F_{n}(x)$ represents the proportion of packets with length less than or equal to $x$ in the sample before padding, and $G_{m}(x)$ represents the corresponding proportion after padding.

The KS statistic is defined as the maximum absolute difference between the two empirical distribution functions:

$$D_{n, m} = \sup_{x} \left| F_{n}(x) - G_{m}(x) \right|.$$

This statistic measures the maximum vertical distance between the distribution functions of the two data samples over all possible values of $x$. The larger $D_{n, m}$ is, the more significant the difference in distribution between the two samples.

By calculating the KS statistic and its corresponding p-value, it can be determined whether the difference is statistically significant. When $p < 0.05$, the null hypothesis (the two samples come from the same distribution) is rejected, that is, the distribution of packet lengths is considered to differ significantly before and after padding.
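The test described above can be sketched with the standard library alone. The function name `ks_two_sample` is hypothetical; the statistic follows the ECDF definition directly, while the p-value uses the asymptotic Kolmogorov series, which is an approximation suited to large samples and may differ from the exact method used by a statistics package.

```python
import bisect
import math
from typing import Sequence, Tuple

def ks_two_sample(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (D, approximate p-value) for the two-sample KS test."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for v in set(xs) | set(ys):
        f = bisect.bisect_right(xs, v) / n  # F_n(v)
        g = bisect.bisect_right(ys, v) / m  # G_m(v)
        d = max(d, abs(f - g))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0  # the series is numerically 1 in this range
    # Asymptotic survival function: 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))
```

Applied to the packet length sequences extracted before and after padding, the first returned value corresponds to $D_{n, m}$ and the second to the approximate p-value that is compared against 0.05.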

The experimental results show that within the padding range of 0 to 50 bytes, the KS statistic is 0.2333, and the p-value is $1.6903 \times 10^{-29}$, far below 0.05, indicating that the padding operation has a statistically significant impact on the distribution of packet lengths. Notably, the KS statistic, which measures the maximum distance between the two empirical cumulative distribution functions, reaches 23.33%, suggesting that padding has altered the original distribution structure to some extent.

To further explore the impact of the padding range on the degree of distribution perturbation, we expanded the padding range to 0 to the maximum transmission unit (MTU) minus the original packet length and conducted the KS two-sample test again. The results showed that the KS statistic rose to 0.4321, and the p-value further decreased to $1.0946 \times 10^{-102}$, indicating that the distribution change was more significant under a larger padding range.

In conclusion, a smaller padding range (0–50 bytes) causes a statistically significant change in the packet length distribution of the Mijia LED bulb. However, the specific settings of the padding strategy should be balanced and designed based on the actual application scenario, device type, and attack model.

5.4. Device Event Recognition Rate

This section evaluates the on/off event recognition rates of the Xiaomi intelligent camera, Mijia LED bulb, Xiaomi smart socket, and Aqara door/window sensor before and after traffic obfuscation, using the random forest classification method.

5.4.1. The Impact of Fake Traffic Injection on the Device Event Recognition Rate

First, the traffic data before and after fake traffic injection are captured and saved in separate PCAP files. Prior to the fake traffic injection, four devices—namely the Xiaomi intelligent camera, Mijia LED bulb, Xiaomi smart socket, and Aqara door and window sensor—are simultaneously controlled to perform 60 on/off operations, with an operation interval of 30 s. This process employs the Tcpdump tool at node 4 in Figure 3 to capture mixed traffic, which includes device traffic and background traffic. Subsequently, the server’s traffic interacting with the devices is extracted from the captured data. Based on this traffic pattern, bidirectional communication between the server and the devices is simulated on nodes 2 and 7 in Figure 3, while fake traffic injection and elimination operations are performed on nodes 3 and 6. Obfuscated traffic is captured on node 4 using the Tcpdump tool.

Then, based on the method described in reference [25], the on/off event fingerprints of the devices are extracted. Taking the Mijia LED bulb as an example, the fingerprint refers to packet pairs (or sequences) of a specific length exchanged between the server and the bulb, as shown in Figure 12, and the fingerprints are saved in a CSV file.

Finally, device event recognition is performed through the following two steps.

Data Preparation and Model Training: The on/off event fingerprint features of the Mijia LED bulb are extracted from a CSV file. Labels are uniformly appended, and the data are merged. Additionally, the direction and event types are encoded for further processing. Subsequently, the sequence of packets within these fingerprints (comprising combinations of packet size and direction) is utilized to train a random forest classifier, enabling accurate recognition of device types and their corresponding events.
Traffic Analysis and Device Event Recognition: The size of TCP packets and their timestamp information are extracted from the PCAP file and matched with the features of the packet sequences in the training set in terms of timing to filter out the time segments that meet the requirements. After that, the matched data sequences are predicted using the trained model to recognize the device events corresponding to them. The recognition results for device on/off events before and after fake traffic injection are presented in Figure 13 and Figure 14, respectively.

After conducting a statistical analysis of the recognition results, it was found that prior to the injection of fake traffic, the accuracy of the Xiaomi intelligent camera was 98.33%, the accuracy of the Mijia LED bulb was 90%, the accuracy of the Xiaomi smart socket was 96.67%, and the accuracy of the Aqara door and window sensor was 100%. After the injection of fake traffic, the accuracy decreased to 95%, 88.33%, 86.67%, and 88.33%, respectively. The design of the fake traffic injection method typically cannot fully replicate the actual bidirectional communication process of IoT devices, as the injected traffic does not establish genuine TCP connections. Additionally, inevitable time errors during system operation may cause a small number of packets to become out of order, resulting in minor discrepancies from actual traffic. Despite this, the constructed fake traffic still possesses strong obfuscation capabilities in most cases, capable of interfering with the identification of real device events with a high probability.

5.4.2. The Impact of Packet Padding and Segmentation on the Device Event Recognition Rate

The device event recognition process described above relies on the timing characteristics of the packet sequence. Packet padding and segmentation operations can have a significant impact on these timing characteristics. Specifically, packet padding changes the size of the original packet, while packet segmentation not only changes the packet size but also affects the time interval between neighboring packets. Applying these two traffic obfuscation methods will destroy the timing characteristics of the original traffic, which can significantly reduce the accuracy of this device event recognition process or even result in a recognition accuracy of 0.

5.5. Device Recognition Rate

This section investigates the impact of traffic obfuscation techniques on the recognition accuracy of the Xiaomi intelligent camera, Xiaomi smart socket, and Aqara door and window sensor, leveraging the deep learning methodology introduced in [26]. The training dataset collection procedure is outlined as follows. For three devices, namely the Xiaomi intelligent camera, the Xiaomi smart socket, and the Aqara door and window sensor, a mobile terminal is used to perform 50 “on” and “off” operations on each device respectively. For the camera and the socket, the interval between each “on” and “off” operation is 5 s. For the door and window sensor, the control consists of an “on” and “off” as a continuous action, with a 10-s interval between each continuous on/off action. Concurrently, network traffic is captured at node 7 in Figure 3 using the Tcpdump tool. Each round of control operations (i.e., 50 on/off cycles per device) generates one PCAP file, accumulating to a total of 40 PCAP files per device, which serve as the dataset for subsequent model training. The validation datasets comprise unobfuscated traffic data, traffic data following packet padding, traffic data after packet segmentation, and traffic data from fake traffic injection. Unobfuscated traffic data is collected using the same method as the training data. Packet-padded and segmented traffic data undergo their respective obfuscation processes during collection, maintaining parameter settings consistent with those detailed in Section 5.3. For the fake traffic injection data, synthetic packets are constructed based on the original PCAP files of the Xiaomi intelligent camera, Xiaomi smart socket, and Aqara door and window sensor, as described in Section 5.4.1.

The deep learning model introduced in [26] employs a hierarchical abstraction mechanism, enabling the transformation of original heterogeneous network traffic characteristics (e.g., packet size and time interval) into homogeneous vectors in a unified format. This input-independent model automatically extracts key features to efficiently perform traffic fingerprinting. Given the limited sample size, the K-fold cross-validation method is applied during the training phase with K set to 10, yielding 10 models for evaluation. During the verification phase, for each of the three devices, the PCAP files that were not used in the training process were selected as test data to evaluate the performance of the 10 models. The verification results show that the recognition accuracy of six models (numbered 0 to 5) exceeds 85%. The subsequent experimental evaluations uniformly take these six models as representatives for more in-depth analysis.
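The 10-fold split used during training can be illustrated with a minimal sketch over file indices. The function name `k_fold_indices`, the shuffling, and the fixed seed are assumptions for illustration; the paper does not describe how the PCAP files were assigned to folds.

```python
import random
from typing import Iterator, List, Tuple

def k_fold_indices(n_items: int, k: int, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)          # assumption: random assignment
    folds = [idx[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]
```

With 40 PCAP files per device and K = 10, each fold holds out 4 files for validation and trains on the remaining 36, producing one model per fold.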

Figure 15, Figure 16 and Figure 17 illustrate the recognition accuracy of the models for the Xiaomi intelligent camera, Xiaomi smart socket, and Aqara door and window sensor, respectively, under four distinct scenarios: unobfuscated, packet padding, packet segmentation, and fake traffic injection. The Xiaomi intelligent camera achieves an average accuracy rate of 96% in both the unobfuscated and packet padding scenarios, 100% in the packet segmentation scenario, and 92% in the fake traffic injection scenario. The Xiaomi smart socket achieves average accuracy rates of 91.35% in the unobfuscated scenario, 77.24% in the packet padding scenario, 0% in the packet segmentation scenario, and 95.99% in the fake traffic injection scenario. The Aqara door and window sensor achieves average accuracy rates of 98.86% in the unobfuscated scenario, 98.02% in the packet padding scenario, 2.26% in the packet segmentation scenario, and 98.86% in the fake traffic injection scenario.

Overall, in the fake traffic injection scenario, the model achieved an average recognition accuracy of 95.62%, which indicates that this method exerts a strong deceptive influence on it. Packet padding has little effect on recognition performance. This is because the model employs a long short-term memory network (LSTM) integrated with an attention mechanism (ATT), which is specifically designed to capture the temporal characteristics of device traffic. Meanwhile, packet padding primarily alters the packet size and does not significantly interfere with these temporal characteristics. In contrast, packet segmentation has a more significant impact on recognition performance. In particular, for the Xiaomi smart socket and Aqara door and window sensor, the proportion of these two types of devices that were misclassified as Xiaomi intelligent cameras after segmentation reached 97.21%. Although the recognition accuracy of the camera exceeds 90%, the false recognition rate remains high. From another perspective, this contributes to the obfuscation of the camera’s network traffic. The following provides an analysis of why the camera achieves a high recognition accuracy while simultaneously exhibiting a relatively high false recognition rate. This is because packet segmentation alters the temporal characteristics of the original traffic, and the resulting traffic characteristics after segmentation are more similar to those of the Xiaomi intelligent camera. The Xiaomi intelligent camera still maintains a high recognition accuracy in this scenario, as the model is a three-class classifier. Even though the traffic characteristics of the camera after segmentation deviate from its original profile, they remain closer to those of the camera than to those of the other two devices (Xiaomi smart socket and Aqara door and window sensor), leading the model to still classify it as a camera.
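The per-scenario averages across the three devices can be recomputed from the accuracies reported for Figures 15 to 17. The sketch below uses only those reported values; the dictionary layout and the function name `scenario_means` are illustrative.

```python
from typing import Dict

# Average recognition accuracy (%) per device and scenario, as reported in the text.
ACCURACY: Dict[str, Dict[str, float]] = {
    "camera": {"unobfuscated": 96.00, "padding": 96.00,
               "segmentation": 100.00, "fake_injection": 92.00},
    "socket": {"unobfuscated": 91.35, "padding": 77.24,
               "segmentation": 0.00, "fake_injection": 95.99},
    "sensor": {"unobfuscated": 98.86, "padding": 98.02,
               "segmentation": 2.26, "fake_injection": 98.86},
}

def scenario_means(table: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each scenario's accuracy over all devices."""
    scenarios = next(iter(table.values())).keys()
    return {s: sum(dev[s] for dev in table.values()) / len(table) for s in scenarios}

means = scenario_means(ACCURACY)
print({s: round(v, 2) for s, v in means.items()})
```

The fake traffic injection mean rounds to 95.62%, matching the figure quoted above, while the segmentation mean of about 34.09% is dominated by the camera, consistent with the misclassification analysis.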

Offline and online obfuscation are also compared in terms of device identification. In existing studies, most offline obfuscation mechanisms are applied to pre-collected datasets. To ensure comparability, the offline dataset used in this experiment was collected in the same network environment as that used for online obfuscation and was obfuscated with identical parameters. Figure 15, Figure 16 and Figure 17 also display the model’s recognition accuracy under offline obfuscation. The average recognition accuracies of the Xiaomi intelligent camera under the packet padding, packet segmentation, and fake traffic injection scenarios were 92.83%, 100%, and 96.00%, respectively. For the Xiaomi smart socket, the corresponding recognition accuracies were 80.45%, 1.28%, and 91.35%. The Aqara door and window sensor achieved recognition accuracies of 96.21%, 24.09%, and 99.02% in these three scenarios, respectively. These results do not differ fundamentally from those of the online obfuscation experiments. The largest gap occurs for the Aqara door and window sensor under packet segmentation (24.09% offline versus 2.26% online), and in both settings the accuracy remains far below the unobfuscated 98.86%. However, unlike offline mechanisms, the online framework obfuscates traffic in real time, which traditional offline methods cannot do, and it therefore offers greater practical value in real-world deployment.
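The offline and online figures quoted in this section can be compared directly. The following Python sketch tabulates the offline-minus-online gap per device and scenario; the data structure and function names are illustrative only, and the values are the percentages reported in the text.

```python
SCENARIOS = ("packet padding", "packet segmentation", "fake traffic injection")

# Recognition accuracy (%) per device, in SCENARIOS order, as reported in the text.
ONLINE = {
    "camera": (96.0, 100.0, 92.0),
    "socket": (77.24, 0.0, 95.99),
    "sensor": (98.02, 2.26, 98.86),
}
OFFLINE = {
    "camera": (92.83, 100.0, 96.0),
    "socket": (80.45, 1.28, 91.35),
    "sensor": (96.21, 24.09, 99.02),
}


def accuracy_gaps(online, offline):
    """Map (device, scenario) to offline minus online accuracy, in percentage points."""
    gaps = {}
    for device, online_values in online.items():
        for scenario, on, off in zip(SCENARIOS, online_values, offline[device]):
            gaps[(device, scenario)] = off - on
    return gaps


def largest_gap(gaps):
    """Return the ((device, scenario), gap) pair with the largest absolute gap."""
    return max(gaps.items(), key=lambda item: abs(item[1]))


if __name__ == "__main__":
    gaps = accuracy_gaps(ONLINE, OFFLINE)
    for (device, scenario), gap in gaps.items():
        print(f"{device:<7s} {scenario:<24s} {gap:+6.2f}")
    print("largest:", largest_gap(gaps))
```

With the reported values, every gap except one is within about five percentage points; the exception is the door and window sensor under packet segmentation, at about 21.8 points.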

The fundamental traffic obfuscation techniques implemented within the online experimental framework described in this paper do not fully counteract existing traffic analysis methods. This conclusion is consistent with related observations in [27], which indicate that no single obfuscation technique can resist all forms of traffic analysis. Nevertheless, these fundamental techniques give researchers a basis for building novel, more complex traffic obfuscation methods and for assessing the effectiveness of those methods online.

6. Conclusions

To address the limitations of non-real-time traffic in the design and performance verification of traffic obfuscation methods—particularly the inability to validate the continuous connectivity and functionality of smart home devices—this paper proposes an online traffic obfuscation experimental framework. Based on this framework, three representative traffic obfuscation techniques are implemented. These techniques employ simplified obfuscation strategies, enabling researchers to flexibly adjust the obfuscation methods and intensities according to specific requirements. Looking ahead, we expect that researchers will leverage this online experimental framework to further advance the design and experimental validation of traffic obfuscation methods.

Author Contributions

Conceptualization, S.H. and J.C.; methodology, S.H. and J.C.; software, S.H.; validation, S.H. and Z.C.; formal analysis, Q.Z.; investigation, S.H. and Z.C.; resources, M.Z.; data curation, Z.C.; writing—original draft preparation, S.H. and Z.C.; writing—review and editing, Q.Z. and M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62361017, in part by the Natural Science Foundation of Guangxi under Grant 2023GXNSFBA026212, in part by the Research Foundation Ability Enhancement Project for Young and Middle-aged Teachers in Guangxi Universities under Grant 2023KY0227, in part by the Open Project of State Key Laboratory of Public Big Data under Grant PBD2022-09, and in part by the National College Students’ Innovation and Entrepreneurship Training Program of China under Grant 202410595061.

Data Availability Statement

The original contributions presented in the study are included in the article, and further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Grand View Research. Smart Home Market Size, Share & Trends Analysis Report By Product (Security & Access Controls, Lighting Control), By Protocol (Wired, Wireless, Hybrid), By Application (New Construction, Retrofit), By Region, And Segment Forecasts, 2025–2030. 2025. Available online: https://www.grandviewresearch.com/industry-analysis/smart-homes-industry (accessed on 23 May 2025).
2. Skowron, M.; Janicki, A.; Mazurczyk, W. Traffic Fingerprinting Attacks on Internet of Things Using Machine Learning. IEEE Access 2020, 8, 20386–20400.
3. Ahsan, M.S.; Islam, M.S.; Hossain, M.S.; Das, A. Detecting Smart Home Device Activities Using Packet-Level Signatures From Encrypted Traffic. IEEE Trans. Dependable Secur. Comput. 2025, 22, 1070–1081.
4. Apthorpe, N.; Reisman, D.; Feamster, N. Closing the Blinds: Four Strategies for Protecting Smart Home Privacy From Network Observers. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP) Workshop on Technology and Consumer Protection (ConPro ’17), San Jose, CA, USA, 25 May 2017; pp. 1–6. Available online: https://www.ieee-security.org/TC/SPW2017/ConPro/papers/apthorpe-conpro17.pdf (accessed on 27 February 2025).
5. Jmila, H.; Blanc, G.; Shahid, M.R.; Lazrag, M. A Survey of Smart Home IoT Device Classification Using Machine Learning-Based Network Traffic Analysis. IEEE Access 2022, 10, 97117–97141.
6. Datta, T.; Apthorpe, N.; Feamster, N. A Developer-Friendly Library for Smart Home IoT Privacy-Preserving Traffic Obfuscation. In Proceedings of the 2018 ACM Special Interest Group on Data Communication (SIGCOMM) Workshop on IoT Security and Privacy (IoT S&P ’18), Budapest, Hungary, 20 August 2018; pp. 43–48.
7. Apthorpe, N.; Huang, D.Y.; Reisman, D.; Narayanan, A.; Feamster, N. Keeping the Smart Home Private with Smart(er) IoT Traffic Shaping. In Proceedings of the 2017 Privacy Enhancing Technologies Symposium (PETS), Minneapolis, MN, USA, 18–21 July 2019; Volume 2019, pp. 128–148.
8. Wang, C.; Kennedy, S.; Li, H.; Hudson, K.; Atluri, G.; Wei, X.; Sun, W.; Wang, B. Fingerprinting Encrypted Voice Traffic on Smart Speakers with Deep Learning. In Proceedings of the 13th ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec ’20), Linz, Austria, 8–10 July 2020; pp. 254–265.
9. Alshehri, A.; Granley, J.; Yue, C. Attacking and Protecting Tunneled Traffic of Smart Home Devices. In Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy (CODASPY ’20), New Orleans, LA, USA, 16–18 March 2020; pp. 259–270.
10. Sivanathan, A.; Sherratt, D.; Gharakheili, H.H.; Radford, A.; Wijenayake, C.; Vishwanath, A.; Sivaraman, V. Characterizing and Classifying IoT Traffic in Smart Cities and Campuses. In Proceedings of the 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Atlanta, GA, USA, 1–4 May 2017; pp. 559–564.
11. Pinheiro, A.J.; de Araujo-Filho, P.F.; Bezerra, J.d.M.; Campelo, D.R. Adaptive Packet Padding Approach for Smart Home Networks: A Tradeoff Between Privacy and Performance. IEEE Internet Things J. 2021, 8, 3930–3938.
12. Sivanathan, A.; Gharakheili, H.H.; Loi, F.; Radford, A.; Wijenayake, C.; Vishwanath, A.; Sivaraman, V. Classifying IoT Devices in Smart Environments Using Network Traffic Characteristics. IEEE Trans. Mob. Comput. 2019, 18, 1745–1759.
13. Brahma, J.; Sadhya, D. Preserving Contextual Privacy for Smart Home IoT Devices With Dynamic Traffic Shaping. IEEE Internet Things J. 2022, 9, 11434–11441.
14. Ren, J.; Dubois, D.J.; Choffnes, D.; Mandalari, A.M.; Kolcun, R.; Haddadi, H. Information Exposure From Consumer IoT Devices: A Multidimensional, Network-Informed Measurement Approach. In Proceedings of the ACM Internet Measurement Conference (IMC ’19), Amsterdam, The Netherlands, 21–23 October 2019; pp. 267–279.
15. Alyami, M.; Alkhowaiter, M.; Al Ghanim, M.; Zou, C.; Solihin, Y. MAC-Layer Traffic Shaping Defense Against WiFi Device Fingerprinting Attacks. In Proceedings of the 2022 IEEE Symposium on Computers and Communications (ISCC), Rhodes, Greece, 30 June–3 July 2022; pp. 1–7.
16. Zhang, S.; Shen, F.; Liu, Y.; Yang, Z.; Lv, X. A Novel Traffic Obfuscation Technology for Smart Home. Electronics 2023, 12, 3477.
17. Alyami, M.; Alghamdi, A.; Alkhowaiter, M.A.; Zou, C.; Solihin, Y. Random Segmentation: New Traffic Obfuscation against Packet-Size-Based Side-Channel Attacks. Electronics 2023, 12, 3816.
18. Pinheiro, A.J.; Bezerra, J.M.; Campelo, D.R. Packet Padding for Improving Privacy in Consumer IoT. In Proceedings of the 2018 IEEE Symposium on Computers and Communications (ISCC), Natal, Brazil, 25–28 June 2018; pp. 925–929.
19. Hafeez, I.; Antikainen, M.; Tarkoma, S. Protecting IoT-environments against Traffic Analysis Attacks with Traffic Morphing. In Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Kyoto, Japan, 11–15 March 2019; pp. 196–201.
20. Zhu, Q.; Yang, C.; Zheng, Y.; Ma, J.; Li, H.; Zhang, J.; Shao, J. Smart home: Keeping privacy based on Air-Padding. IET Inf. Secur. 2021, 15, 156–168.
21. Prates, N.; Vergütz, A.; Macedo, R.T.; Santos, A.; Nogueira, M. A Defense Mechanism for Timing-based Side-Channel Attacks on IoT Traffic. In Proceedings of the GLOBECOM 2020 - 2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–6.
22. Kim, H.; Toh, W.X.; Hao, L.; Schulzrinne, H. Wital: A Whitelist-Based IoT Firewall for Mitigating Device Exploitation. In Proceedings of the 2024 IEEE International Performance, Computing, and Communications Conference (IPCCC), Orlando, FL, USA, 22–24 November 2024; pp. 1–2.
23. Pratt, J.W.; Gibbons, J.D. Kolmogorov-Smirnov Two-Sample Tests. In Concepts of Nonparametric Theory; Springer: New York, NY, USA, 1981; pp. 318–344.
24. Xiao, Y. A Fast Algorithm for Two-Dimensional Kolmogorov–Smirnov Two Sample Tests. Comput. Stat. Data Anal. 2017, 105, 53–58.
25. Trimananda, R.; Varmarken, J.; Markopoulou, A.; Demsky, B. Packet-Level Signatures for Smart Home Devices. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium, San Diego, CA, USA, 23–26 February 2020; pp. 1–18.
26. Qu, J.; Ma, X.; Li, J.; Luo, X.; Xue, L.; Zhang, J.; Li, Z.; Feng, L.; Guan, X. An Input-Agnostic Hierarchical Deep Learning Framework for Traffic Fingerprinting. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 589–606. Available online: https://www.usenix.org/system/files/usenixsecurity23-qu.pdf (accessed on 7 March 2025).
27. Shen, M.; Ji, K.; Gao, Z.; Li, Q.; Zhu, L.; Xu, K. Subverting Website Fingerprinting Defenses with Robust Traffic Representation. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 607–624. Available online: https://www.usenix.org/system/files/usenixsecurity23-shen-meng.pdf (accessed on 23 April 2025).

Figure 1. Smart home network structure.

Figure 2. Online traffic obfuscation experiment network.

Figure 3. Framework for online traffic obfuscation experiments.

Figure 4. Packet structure.

Figure 5. Implementation principle of fake traffic injection, using the smart home device with IP address 192.168.2.175 as an example.

Figure 6. Simplified flowchart of fake traffic injection.

Figure 7. Simplified flowchart of packet padding.

Figure 8. Illustration of the principle of packet padding and restoration.

Figure 9. Simplified flowchart of packet segmentation.

Figure 10. Example of packet segmentation with delayed redirection.

Figure 11. Kernel density estimation of packet length distributions under three scenarios.

Figure 12. Mijia LED bulb on/off event fingerprints.

Figure 13. Recognition results for device on/off events before fake traffic injection.

Figure 14. Recognition results for device on/off events after fake traffic injection.

Figure 15. Recognition accuracy of Xiaomi intelligent camera traffic (blue solid line overlaps with dashed line).

Figure 16. Recognition accuracy of Xiaomi smart socket traffic.

Figure 17. Recognition accuracy of Aqara door and window sensor traffic.

Table 1. Specific configuration of experimental network nodes.

No	IP Address	Equipment Name	Network Mode	Specifications
1	`192.168.2.1`	Router	Ethernet	Ordinary home router (JD.com, Shenzhen, China)
2	`192.168.2.2`	Open-source router	Ethernet	Nanopi R5S OpenWrt OS 4 G Memory (JD.com, Shenzhen, China)
3	`192.168.2.21`	Industrial control computer	Ethernet	Ubuntu OS G590-Pentium 7505 CPU DDR4 8 G Memory (JD.com, Shenzhen, China)
4	`192.168.2.3`	Open-source router	Ethernet/WiFi	Nanopi R5S OpenWrt OS 4 G Memory (JD.com, Shenzhen, China)
5	`192.168.2.4`	WiFi repeater	Ethernet/WiFi	Tenda WiFi network repeater (JD.com, Shenzhen, China)
6	`192.168.2.22`	Industrial control computer	Ethernet	Ubuntu OS G590-Pentium 7505 CPU DDR4 8 G Memory (JD.com, Shenzhen, China)
7	`192.168.2.5`	Open-source router	Ethernet/WiFi	Nanopi R5S OpenWrt OS 4 G Memory (JD.com, Shenzhen, China)
8	`192.168.2.175`	Intelligent gateway	WiFi/Zigbee/Bluetooth	Xiaomi intelligent multi-mode gateway (JD.com, Wuhan, China)
9	`192.168.2.149`	Camera	WiFi	Xiaomi intelligent camera (JD.com, Shenzhen, China)
10	`192.168.2.204`	Smart socket	WiFi	Xiaomi smart socket (JD.com, Shenzhen, China)
11	/	LED bulb	Bluetooth	Mijia LED bulb (JD.com, Shenzhen, China)
12	/	LED bulb	Zigbee	Aqara LED bulb (JD.com, Shenzhen, China)
13	/	Door/window sensor	Zigbee	Aqara door and window sensor (JD.com, Shenzhen, China)
14	/	Human body movement sensor	Zigbee	Aqara human body movement sensor (JD.com, Shenzhen, China)

Table 2. Average CPU and Memory Usage of Nodes Performing Obfuscation Operations.

Metric	Node 2	Node 3	Node 6	Node 7
%CPU	0.264	0.399	0.385	0.336
%MEM	1.800	0.861	0.861	1.799

Table 3. Average Throughput per Node.

Metric (KBps)	Node 2	Node 3	Node 6	Node 7
Average Throughput Before Obfuscation	770	788	928	941
Average Throughput After Obfuscation	754	787	932	941

Table 4. Network Performance under 5 Mbps Network Load.

Metric	Not Obfuscated	Obfuscated
Average Delay (ms)	5.743	6.251
Average Throughput (KBps)	4.996	4.996
Average Jitter (ms)	0.111	0.085
Average Packet Loss (%)	0.000	0.000
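
To put the overhead figures in Tables 3 and 4 into relative terms, the following Python sketch computes the percentage change in per-node throughput and in average delay. The values are copied from the tables; the names `THROUGHPUT`, `DELAY_MS`, and `relative_change` are illustrative only.

```python
# Average throughput per node (KBps) before and after obfuscation, from Table 3.
THROUGHPUT = {
    "Node 2": (770, 754),
    "Node 3": (788, 787),
    "Node 6": (928, 932),
    "Node 7": (941, 941),
}

# Average delay (ms) without and with obfuscation, from Table 4.
DELAY_MS = (5.743, 6.251)


def relative_change(before, after):
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for node, (before, after) in THROUGHPUT.items():
        print(f"{node}: throughput change {relative_change(before, after):+.2f}%")
    print(f"Average delay change: {relative_change(*DELAY_MS):+.2f}%")
```

On the reported numbers, throughput changes by at most about 2% (Node 2), and the average delay rises by roughly 0.5 ms, or about 8.8%.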

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
