IoT Device Fingerprinting via Frequency Domain Analysis

Amamra, Abdelfattah; Anunwah, Jeremy C.; Louafi, Habib

doi:10.3390/electronics14163248

by Abdelfattah Amamra ¹,*, Jeremy C. Anunwah ¹ and Habib Louafi ²

¹ Department of Computer Science, California State Polytechnic University, Pomona, CA 91768, USA

² Department of Science and Technology, TÉLUQ University, Québec, QC G1K 9H6, Canada

* Author to whom correspondence should be addressed.

Electronics 2025, 14(16), 3248; https://doi.org/10.3390/electronics14163248

Submission received: 1 July 2025 / Revised: 9 August 2025 / Accepted: 12 August 2025 / Published: 15 August 2025

(This article belongs to the Special Issue Network Security and Cryptography Applications)

Abstract

The rapid proliferation of heterogeneous Internet of Things (IoT) devices has introduced a wide range of operational and security challenges, particularly in the domains of device identification and behavior profiling. Traditional fingerprinting methods, which rely primarily on time domain features, often fail to capture the complex, periodic, and often bursty nature of IoT communication—especially in environments characterized by sparse, irregular, or noisy traffic patterns. To address these limitations, two novel frequency-based fingerprinting techniques have been proposed: Spectral-Only Frequency Fingerprint (SFF) and Spectro-Correlative Frequency Fingerprint (SCFF). These approaches shift the analysis from the time domain to the frequency domain, enabling the extraction of richer and more robust behavioral signatures from network traffic. While SFF focuses on capturing the core spectral features of device traffic, SCFF extends this by incorporating inter-feature correlations, offering a more nuanced and comprehensive representation of device behavior. The effectiveness of SFF and SCFF is evaluated across multiple publicly available IoT datasets using a range of machine learning classifiers. Experimental results demonstrate that both fingerprinting methods significantly outperform traditional time domain approaches in terms of accuracy, precision, recall, and F1-score—across all tested classifiers and datasets.

Keywords: IoT device fingerprinting; IoT device security; IoT device profiling; frequency domain analysis; Fast Fourier Transform; IoT device anomaly detection; IoT device management; time domain analysis

1. Introduction

The number of connected Internet of Things (IoT) devices continues to grow rapidly, and these devices are now widely deployed across a range of sectors, including critical economic infrastructures, such as healthcare, smart grids, and telecommunications networks. As their presence expands, so do the risks associated with their integration into sensitive and high-impact environments. Profiling of IoT devices has become a crucial technique to ensure the secure and efficient management of these networks. It involves identifying and validating the specific behavioral patterns of connected devices in order to detect anomalies, isolate potentially compromised or vulnerable devices, and subject them to closer scrutiny. This approach improves situational awareness and contributes to the proactive defense of complex IoT ecosystems.

The proliferation of heterogeneous IoT devices presents several significant operational and security challenges. These include the need to monitor, detect, and recognize millions of interconnected devices, often resource-constrained, that vary widely in functionality, manufacturer standards, and security protocols. Without robust profiling and classification mechanisms, it becomes extremely difficult for network and system administrators to determine which devices are functioning properly, require software or firmware updates, or are susceptible to known vulnerabilities or zero-day attacks. Failure to accurately identify devices and understand their expected behavior can have serious consequences. Undetected anomalies can lead to breaches, service disruptions, or the exploitation of devices as entry points for lateral attacks. Therefore, the ability to automatically and reliably profile IoT devices is not just beneficial: it is essential to maintain the integrity, availability, and security of modern digital infrastructures.

Several studies have proposed IoT device-profiling methods based on network behavior analysis [1,2,3,4,5], which can be broadly categorized into three types based on feature selection. The first category uses network traffic features, such as packet size, inter-arrival time, direction, session duration, and protocol type, which are easy to extract and effective for identifying device types in IP-based or wired environments [6,7,8,9]. The second category focuses on MAC-layer features, particularly from IEEE 802.11 management frames like beacons and probe requests, offering non-intrusive insights ideal for passive monitoring in wireless networks [10,11,12]. The third category leverages radio frequency signal features, using hardware-level imperfections such as frequency offset and phase noise to create unique, device-specific fingerprints that are difficult to spoof and independent of higher-layer protocols [10,13].

Despite their usefulness, existing IoT fingerprinting feature categories have notable limitations. Time domain features, such as those from network traffic and MAC layers, often struggle with devices that generate sparse or irregular traffic, making fingerprints noisy or unreliable. Conversely, radio frequency (RF) features, while operating in the frequency domain, focus solely on low-level hardware traits and lack visibility into higher-layer behaviors, limiting their effectiveness in distinguishing functionally different devices with similar hardware. To address these gaps, we propose a hybrid fingerprinting approach that combines time domain behavioral insights with the structural clarity of frequency domain analysis. This fusion captures both temporal patterns and spectral correlations, improving the accuracy, robustness, and scalability of device identification in diverse and resource-constrained IoT environments.

We refer to the proposed solution as Frequency Domain Fingerprint Generation (FDFG). As illustrated in Figure 1, the FDFG module takes as input raw network traffic—captured at the gateway from connected IoT devices, ensuring visibility into all device communications regardless of protocol or application. Since all inbound and outbound traffic passes through the gateway, the solution can analyze any type of communication, including low-power protocols such as LoRa. The proposed solution processes the network traffic through a three-stage pipeline. First, the Time Series Signal Generation (TSSG) module converts network packet data into structured time series signals, preserving the temporal dynamics of device communication. Next, the Fast Fourier Transform (FFT) module transforms these signals into the frequency domain, making periodic communication behaviors, such as beacons or burst transmissions, more apparent. Finally, the Frequency Domain Analyzer (FDA) module is the core module that performs detailed spectral analysis and generates two distinct fingerprint types, which serve as input to machine learning classifiers or clustering algorithms for the purpose of individual device identification or device type identification.

The first fingerprint, the Spectral-Only Frequency Fingerprint (SFF), is derived from core spectral features including dominant frequencies, spectral entropy, and power spectral density, which provide a foundational profile of a device’s frequency domain behavior. To enrich this representation, the FDA module further computes the Spectro-Correlative Frequency Fingerprint (SCFF), which enhances SFF by incorporating interdependency features such as cross-spectral coherence, pairwise Pearson correlations, and mutual information between the frequency domain vectors of different traffic dimensions. This enables SCFF to model not only the individual frequency traits but also the coordinated and synchronized relationships among traffic features, which are often distinctive across device types and instances.

We evaluate the proposed fingerprinting methods—SFF and SCFF—on four public IoT datasets [14,15,16,17], covering different devices and deployment scenarios. The evaluation covers both device identification and device type classification into functional categories such as smart plugs and cameras. Across all datasets and tasks, SFF and SCFF consistently outperform traditional time domain methods, achieving higher accuracy in identifying individual devices and their respective types.

The contributions of the paper can be summarized as follows:

We design a completely new processing pipeline, Frequency Domain Fingerprint Generation (FDFG), which transforms raw network traffic into frequency domain representations.
We propose an IoT fingerprinting solution based solely on spectral frequency features, referred to as the Spectral-Only Frequency Fingerprint (SFF).
We extend this concept with a second fingerprinting model called the Spectro-Correlative Frequency Fingerprint (SCFF), which enriches SFF by incorporating inter-feature relationships.
We evaluate the performance of the proposed fingerprints across multiple public IoT datasets using various machine learning classifiers.

The remainder of the paper is organized as follows: Section 2 reviews the related work in IoT device fingerprinting and frequency domain analysis. Section 3 presents the proposed approach in detail, including the feature-extraction methodology. Section 4 describes the evaluation methodology, including datasets, experimental setup, and performance metrics, followed by a presentation and discussion of the results in Section 5. Finally, Section 6 concludes the paper and outlines potential directions for future research.

2. Related Works

Previous research in IoT device fingerprinting using machine learning, whether focused on device type identification or individual device instance recognition, can be broadly grouped into three main categories based on the types of features used to train models and generate unique device profiles.

2.1. Category 1: Network Traffic Features

Methods in this category rely on features extracted from network traffic, including packet-level attributes such as packet size and inter-arrival time, and flow-level metrics such as session duration, byte counts, and protocol types. These features are widely used due to their accessibility and relevance to device communication behavior. Notable studies, such as Bruhadeshwar et al. [18], Hamad et al. [19], Fan et al. [20], Meidan et al. [21], Miettinen et al. [22], Sivanathan et al. [16,23], Thangavelu et al. [24], and Xu et al. [25], have relied on network traffic as their primary data source. These studies extract a broad range of features, beginning with packet-level attributes such as packet size, payload size, protocol types across the network, transport, and application layers (e.g., IP, TCP, UDP, HTTP, MQTT), and the IP addresses the device communicates with. In parallel, flow-level features are also commonly used. These include metrics such as session duration, total bytes transferred, the number of packets exchanged per flow, directionality of flows (incoming vs. outgoing), and protocol usage statistics over a session. These aggregated features provide a higher-level view of how devices behave over time and across different network contexts. In addition to these foundational features, many studies enhance the feature set by incorporating statistical descriptors computed over time windows or flow segments. Commonly used statistics include maximum, minimum, mean, median, standard deviation, and percentiles. These statistical features help summarize behavioral trends and variability, offering additional robustness and discriminative power to the profiling models.

2.2. Category 2: MAC-Layer Features

Methods in this category extract features from IEEE 802.11 MAC frames and are widely used in wireless environments. These data link layer frames offer valuable insight into device behavior on Wi-Fi networks.

Robyns et al. [26] present two novel techniques for fingerprinting and tracking Wi-Fi-enabled mobile devices using IEEE 802.11 MAC layer data in a noncooperative manner. The first technique uses per-bit entropy analysis of a single captured MAC frame to generate a unique fingerprint. The second technique leverages peer-to-peer 802.11u Generic Advertisement Service (GAS) and 802.11e Block Acknowledgement (BA) request frames to instigate on-demand transmissions from devices supporting these protocols, enhancing tracking capabilities. The methods were validated using datasets from a music festival (28,048 unique devices) and a research lab (138 unique devices), demonstrating their effectiveness for device identification and tracking in real-world scenarios.

Gu et al. [27] proposed a method for identifying Wi-Fi devices using IEEE 802.11ac MAC frames. The proposed method is suitable for IoT devices with 802.11ac capabilities. The paper uses a preprocessing method to mask strong, easily modified identifiers (e.g., MAC addresses) and employs deep learning to automatically select features from MAC frames. The method focuses on probe request frames, analyzing their fields to create unique device fingerprints, and achieves effective identification by addressing noise and variability in frame data.

Alyami et al. [28] propose a mechanism called WiFiDefender to counter device fingerprinting attacks that exploit IEEE 802.11 MAC layer frames. WiFiDefender employs MAC-layer traffic shaping to obfuscate device-specific signatures. It modifies MAC frame characteristics, such as inter-frame intervals and frame sizes, to prevent attackers from generating accurate fingerprints. WiFiDefender was evaluated on real-world IEEE 802.11 traffic from various devices. It reduces the accuracy of state-of-the-art fingerprinting techniques to below 15%.

2.3. Category 3: Radio Frequency Signal Features

Methods in this category extract features from radio signals, particularly at the physical layer, forming the basis of radio frequency fingerprinting, which relies on the subtle imperfections inherent in a device’s radio transmitter hardware. These imperfections result in unique signal characteristics, such as frequency offset, amplitude variance, and phase noise, which can be captured and analyzed to distinguish devices at the hardware level. This approach leverages the advantages of the frequency domain to analyze the behavior of IoT devices.

Köse et al. [29] proposed a method to identify IoT devices by analyzing the energy spectrum of transmitter turn-on transient signals, which are captured during IEEE 802.11 MAC frame transmissions. This technique uses a sliding window averaging method to estimate transient duration and extracts spectral components to form unique fingerprints, achieving superior performance (up to 8 dB better at 80% accuracy) compared to time domain and wavelet-based methods. The proposed method uses the Fast Fourier Transform (FFT) as part of its methodology to extract the energy spectrum of transmitter turn-on transient signals for IoT device fingerprinting.

Xie et al. [30] introduced an optimized radio frequency fingerprinting (RFF) classification algorithm to enhance IoT device authentication by leveraging the unique hardware imperfections in radio frequency signals. The method combines coherent integration, wavelet-based multiresolution analysis, and a Gaussian Support Vector Machine (G-SVM) to improve identification accuracy, particularly in low signal-to-noise ratio (SNR) environments. The method was tested on 10 nRF24 transmitters, achieving near 100% accuracy at 0 dB SNR with 100 stacked signals and 90% accuracy at −5 dB SNR.

Galtier et al. [31] proposed a method for IoT device identification that uses radio frequency fingerprinting based on the Power Spectral Density (PSD) of captured radio frequency signals to generate unique frequency profiles tied to each device’s hardware. Standard receivers collect the signals, and PSD is computed to capture power distribution across frequencies. The resulting fingerprint is matched against a database of known devices. Experiments show high accuracy, with precision and recall consistently above 85%, demonstrating the method’s effectiveness in reliably distinguishing between devices.

Approaches based on network traffic features and MAC-layer features rely heavily on aggregated statistical features derived from dense and continuous network traffic. This underlying assumption, however, proves ineffective when applied to IoT devices, which typically exhibit low-volume, sparse, and event-driven communication patterns. Unlike traditional computing devices, such as desktops or smartphones, which engage in constant background processes and frequent data exchanges, IoT devices often remain idle for extended periods and only generate significant traffic during brief, infrequent user interactions. Examples include a smart speaker transmitting data only when activated by a voice command, or a security camera uploading footage only when motion is detected. Furthermore, the majority of these methods analyze traffic strictly in the time domain, which offers limited visibility into underlying behavioral patterns. As a result, time domain statistical analysis may fail to capture the full behavioral signature of many IoT devices, particularly those that operate silently or intermittently. While radio frequency signal fingerprinting offers valuable capabilities for low-level device identification in the frequency domain, its effectiveness is constrained by environmental variability, scalability concerns, hardware dependency, temporal instability, and limited behavioral insight. The limitations of the prior work highlight the need for more robust methods capable of producing comprehensive, resilient, and context-aware IoT fingerprints.

3. The Proposed Fingerprinting Solutions

The proposed Frequency Domain Fingerprint Generation (FDFG) module comprises three key sequential phases, as shown in Figure 2. Phase 1 is Time Series Signals Generation (TSSG), where the selected network traffic features are converted into time series format. These time series are then normalized to ensure consistent scaling across features, and noise reduction techniques are applied to enhance signal clarity. In phase 2, each time series is transformed into its frequency domain representation using the Fast Fourier Transform (FFT). This transformation reveals the periodic and spectral characteristics embedded in the network behavior of the device. The last phase is the Frequency Domain Analyzer (FDA) module, which plays a central role in transforming raw frequency domain data into two fingerprint representations: the Spectral-Only Frequency Fingerprint (SFF) and the Spectro-Correlative Frequency Fingerprint (SCFF). The SFF is based exclusively on spectral features extracted from the frequency domain, such as dominant frequencies, spectral entropy, and power distribution. The SCFF combines these spectral features with interdependency features, such as cross-spectral correlations and mutual information between traffic dimensions. These phases and their components are detailed in the following subsections.

3.1. Time Series Signals Generation (TSSG)

The TSSG module processes network traffic packets in three main steps to prepare them for frequency domain analysis. First, it extracts key network-level features—such as packet counts, byte volumes, packet sizes, and inter-arrival times—that capture the behavioral characteristics of IoT devices from raw traffic data. Second, these features are transformed into a structured format suitable for signal processing by converting them into binary or numerical time series using time binning, which aggregates data into fixed-length intervals (e.g., one-second bins), ensuring consistent temporal resolution. Third, because time series signals from IoT traffic often exhibit irregular, sparse, or bursty behavior, a Hanning window is applied to each segment prior to performing the Fast Fourier Transform (FFT). This windowing minimizes edge discontinuities and reduces spectral leakage, ensuring that the resulting magnitude spectrum accurately reflects the device’s underlying periodic communication patterns. Each of these steps is described in detail in the following sections.

3.1.1. Feature Selection

The four key network traffic features, which together offer a comprehensive representation of an IoT device’s communication behavior, are selected to enhance the accuracy of device identification, strengthen network security, and support efficient device monitoring and management. These features are as follows:

Bytes Transferred per Second: This metric captures the overall data transmission volume over time and reflects the communication intensity of a device. Different types of IoT devices exhibit distinct data rate patterns—for instance, a Smart Camera typically generates high-throughput traffic due to continuous video streaming, whereas a smart thermostat or environmental sensor transmits small, periodic data packets. Measuring bytes per second helps distinguish between such diverse device behaviors.
Packets Transferred per Second: This feature reflects the rate at which a device sends or receives packets, offering insight into the device’s communication frequency and protocol behavior. Devices utilizing lightweight protocols like MQTT or CoAP, commonly used in constrained environments, tend to produce specific packet rate patterns. Tracking packet frequency aids in identifying protocol usage and supports reliable fingerprinting of device types.
Average Packet Size per Second: The average size of packets over time serves as a strong indicator of the nature of a device’s traffic. Devices that transmit large volumes of data—such as IP cameras or smart speakers—typically generate larger packets. In contrast, devices like motion sensors or smart plugs often produce smaller, more uniform packets. This distinction supports the creation of highly discriminative device fingerprints.
Average Packet Inter-Departure Time per Second: This feature measures the average time interval between successive outgoing packets. It helps distinguish between periodic communication patterns (e.g., regular status updates from sensors) and event-driven communication (e.g., user-activated commands from smart switches). By capturing timing characteristics, this feature enhances the profiling accuracy and reveals the device’s operational behavior.

3.1.2. Time Binning

Time binning is a widely used technique in signal processing and data analysis that involves aggregating time series data into fixed-length, discrete intervals—commonly referred to as “time bins”. This process simplifies complex or irregular data streams, reduces noise, and facilitates downstream analytical tasks such as pattern recognition, statistical modeling, or machine learning.

In the context of network packet flow analysis, particularly for IoT device fingerprinting, time binning plays a crucial role. Raw network traffic data often contains packet-level events with irregular or unevenly spaced timestamps, making it unsuitable for direct application of signal processing techniques. Time binning addresses this by transforming the asynchronous packet data into a uniform time series signal, where each bin summarizes activity, such as the number of packets, total bytes transferred, or average inter-arrival time, within a fixed interval. The key metrics are defined in Equations (1) to (4) for a bin $k$ spanning the time interval $[k \cdot \Delta t, (k + 1) \cdot \Delta t)$, where $t_{i}$ is the timestamp of packet $i$. In our experiments, we used $\Delta t = 1$ s as the bin time interval. This value was chosen to be neither so short that it produces noisy, overly detailed behavior nor so long that it misses key data patterns. To prevent large differences in values, such as between packet counts and payload sizes, from dominating the analysis, we apply min–max normalization, which scales all feature values to a fixed range of [0, 1].

Bytes per Second:

$\text{Bytes}_{k} = \sum_{i \,:\, k \cdot \Delta t \leq t_{i} < (k + 1) \cdot \Delta t} \text{PacketSize}_{i}$

(1)

Packets per Second:

$\text{Packets}_{k} = \left| \{\, i : k \cdot \Delta t \leq t_{i} < (k + 1) \cdot \Delta t \,\} \right|$

(2)

Average Packet Size:

$\text{AvgPacketSize}_{k} = \begin{cases} \dfrac{\text{Bytes}_{k}}{\text{Packets}_{k}}, & \text{if } \text{Packets}_{k} > 0 \\ 0, & \text{otherwise} \end{cases}$

(3)

Average Inter-Arrival Time (within a bin):

$\text{AvgInterArrival}_{k} = \begin{cases} \dfrac{\sum_{i \in \text{bin } k} (t_{i + 1} - t_{i})}{\text{Packets}_{k} - 1}, & \text{if } \text{Packets}_{k} > 1 \\ 0, & \text{otherwise} \end{cases}$

(4)
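
The binning step in Equations (1) to (4), followed by min–max normalization, can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `bin_traffic` and `min_max_normalize` are hypothetical, timestamps are assumed to be non-negative seconds measured from the start of the capture, and the sum in Equation (4) is read as covering only gaps between consecutive packets that fall in the same bin.

```python
import numpy as np


def bin_traffic(timestamps, sizes, delta_t=1.0):
    """Aggregate per-packet records into fixed-length bins (Equations (1)-(4)).

    timestamps: packet times in seconds, assumed >= 0 (relative to capture start).
    sizes: packet sizes in bytes.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    order = np.argsort(timestamps)
    timestamps, sizes = timestamps[order], sizes[order]

    # Bin index k such that k*delta_t <= t_i < (k+1)*delta_t.
    idx = np.floor(timestamps / delta_t).astype(int)
    n_bins = int(idx.max()) + 1 if idx.size else 0

    bytes_k = np.bincount(idx, weights=sizes, minlength=n_bins)    # Eq. (1)
    packets_k = np.bincount(idx, minlength=n_bins).astype(float)   # Eq. (2)

    # Eq. (3): mean packet size, 0 for empty bins.
    avg_size_k = np.divide(bytes_k, packets_k,
                           out=np.zeros(n_bins), where=packets_k > 0)

    # Eq. (4): mean gap between consecutive packets inside the same bin
    # (assumed reading of the sum), 0 when a bin holds fewer than two packets.
    gaps = np.diff(timestamps)
    same_bin = idx[1:] == idx[:-1]
    gap_sum_k = np.bincount(idx[1:][same_bin], weights=gaps[same_bin],
                            minlength=n_bins)
    avg_gap_k = np.divide(gap_sum_k, packets_k - 1,
                          out=np.zeros(n_bins), where=packets_k > 1)

    return {
        "bytes": bytes_k,
        "packets": packets_k,
        "avg_packet_size": avg_size_k,
        "avg_inter_arrival": avg_gap_k,
    }


def min_max_normalize(x):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


# Five hypothetical packets; bin 2 is empty and stays at zero.
metrics = bin_traffic([0.1, 0.4, 0.9, 1.2, 3.5], [100, 200, 300, 60, 40])
normalized = {name: min_max_normalize(series) for name, series in metrics.items()}
```

Empty bins are kept as zeros rather than dropped, so every metric yields a uniformly sampled series of the same length, which is what the later FFT step requires.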

Figure 3 displays four time series metrics derived from different IoT devices over a period of approximately 50 s. These metrics include Bytes Transferred, Packets Transferred, Average Packet Size, and Average Packet Inter-Departure Time, plotted in blue, orange, green, and red, respectively. Each time series exhibits periodic spikes, suggesting recurring patterns in the IoT devices’ network activity.

3.1.3. Hanning Window

When analyzing discretized network traffic signals, such as bytes per second or packets per second, using the Fast Fourier Transform, the process assumes that the signal is periodic and smoothly continuous across the analysis window. This assumption implies that the signal repeats indefinitely with the same pattern at the boundaries of the window, forming a seamless loop. However, in practice, network traffic signals are rarely perfectly periodic or continuous at the window’s edges. For instance, IoT device traffic, such as periodic sensor updates or bursty video streams, often contains abrupt changes, irregular patterns, or non-repeating behavior within the window. This mismatch between the assumption and reality leads to a phenomenon known as spectral leakage [32,33], where energy from one frequency component spreads into adjacent frequencies in the FFT output, distorting the frequency profile and making it harder to identify true periodic patterns.

The Hanning window (also known as the Hann window) [34,35] is a widely used technique to reduce spectral leakage in frequency domain analysis. It works by applying a tapered weighting function to the signal within the analysis window, gradually reducing the amplitude of the signal at the edges while focusing on its central part. This tapering smooths out discontinuities at the window boundaries, which are the primary source of spectral leakage. By minimizing these edge effects, the Hanning window helps produce a cleaner and more accurate frequency spectrum. This is particularly important in applications like IoT device fingerprinting, where detecting subtle, device-specific periodic patterns, such as regular transmission intervals from a smart meter, is essential for reliable identification. The Hanning window is a tapering function defined as:

$w[n] = 0.5 \cdot \left(1 - \cos\left(\frac{2 \pi n}{N - 1}\right)\right), \quad n = 0, 1, \dots, N - 1$

(5)

where N is the window length. It starts and ends at zero, peaking at 1 in the center, creating a bell-shaped curve.

The Hanning window is multiplied by the signal to create a windowed signal:

$x_{\text{windowed}}[n] = x[n] \cdot w[n]$

(6)

This reduces the amplitude of the signal at the edges, minimizing discontinuities when the FFT treats the signal as repeating. Figure 4 illustrates this concept by showing the time domain signals of four different IoT devices. In each plot, the gray line represents the reduced amplitude at the signal boundaries after applying the Hanning window. This tapering effect highlights how the window suppresses edge discontinuities, ensuring a cleaner and more accurate frequency domain analysis by preserving the integrity of the central portion of the signal, where the most relevant periodic behaviors typically reside.
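
To make the effect of Equations (5) and (6) concrete, the sketch below builds the window directly from Equation (5), applies it to a synthetic signal, and compares how much spectral energy ends up far from the main peak with and without windowing. The helper names, the 0.105 Hz test tone, and the `guard` width of four bins are assumptions chosen for illustration only. The tone is picked so that it does not complete a whole number of cycles in the window, which is the situation where leakage appears.

```python
import numpy as np


def hann_window(N):
    """Hanning window of Equation (5); requires N >= 2."""
    n = np.arange(N)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * n / (N - 1)))


def apply_window(x):
    """Windowed signal of Equation (6)."""
    x = np.asarray(x, dtype=float)
    return x * hann_window(len(x))


def far_energy_fraction(x, guard=4):
    """Share of spectral energy more than `guard` bins from the strongest bin."""
    power = np.abs(np.fft.rfft(x)) ** 2
    peak = int(np.argmax(power))
    k = np.arange(len(power))
    return power[np.abs(k - peak) > guard].sum() / power.sum()


# Synthetic 64-sample signal (1-second bins) with a non-integer number of cycles.
N = 64
t = np.arange(N)
x = np.sin(2 * np.pi * 0.105 * t)

leak_raw = far_energy_fraction(x)                 # no window applied
leak_hann = far_energy_fraction(apply_window(x))  # Hanning window applied
```

The formula in Equation (5) coincides with NumPy's built-in `np.hanning(N)`, so in practice the library function can be used instead of `hann_window`.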

3.2. Fast Fourier Transform (FFT)

After applying the Hanning window function to a segment of the discretized network traffic signal, the FFT is performed on the windowed segment. This operation transforms the signal from the time domain, where it is represented as a sequence of values over time, into the frequency domain, where it is expressed as a sum of sinusoidal components at various frequencies. This transformation reveals the spectral content of the signal, offering a detailed view of how power or energy is distributed across different frequency bands within that specific time segment. The sequence of resulting frequency domain representations provides the foundation for extracting spectral features such as dominant frequencies, Power Spectral Density (PSD), and frequency-specific energy distributions. These features capture the periodic and behavioral characteristics of network traffic, which are often device-specific. In applications such as IoT device fingerprinting, these spectral signatures are highly valuable: they enable reliable identification of device types based on their communication patterns and help distinguish between devices with similar traffic volumes but different frequency behaviors.

The FFT is an efficient algorithm to compute the Discrete Fourier Transform (DFT), which converts a signal in the time domain to its representation in the frequency domain. For a windowed signal $x_{\text{windowed}}[n]$, the FFT is:

$X[k] = \sum_{n = 0}^{N - 1} x_{\text{windowed}}[n] \cdot e^{- j \frac{2 \pi k n}{N}}, \quad k = 0, 1, \dots, N - 1$

(7)

where $X[k]$ is a complex number representing the amplitude and phase of frequency bin $k$, and the frequency corresponding to bin $k$ is:

$f_{k} = \frac{k \cdot f_{s}}{N}$

(8)

with $f_{s}$ being the sampling frequency (1 Hz for 1-s bins). The FFT output is a spectrum showing the magnitude ($|X[k]|$) and phase of each frequency component, indicating how much energy is present at each frequency.
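
As a small worked example of Equations (7) and (8), the sketch below evaluates the DFT sum directly, checks it against NumPy's FFT, and maps the strongest bin back to a frequency in Hz. The signal is synthetic: a hypothetical device whose traffic repeats every 10 s, sampled in 1-second bins over 100 s. The function names `dft` and `bin_frequencies` are illustrative and not taken from the paper.

```python
import numpy as np


def dft(x):
    """Direct evaluation of Equation (7); O(N^2), for illustration only."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return (x * np.exp(-2j * np.pi * k * n / N)).sum(axis=1)


def bin_frequencies(N, fs=1.0):
    """Equation (8): frequency in Hz associated with each bin k."""
    return np.arange(N) * fs / N


fs = 1.0                          # sampling frequency for 1-second bins
N = 100                           # 100 s of binned traffic
n = np.arange(N)
x = np.cos(2 * np.pi * 0.1 * n)   # hypothetical activity repeating every 10 s
x_windowed = x * np.hanning(N)    # Hanning window applied before the transform

X = np.fft.fft(x_windowed)        # same result as dft(x_windowed), but O(N log N)
f = bin_frequencies(N, fs)
magnitude = np.abs(X)
phase = np.angle(X)

# For a real-valued signal only bins 0..N/2 carry independent information.
half = N // 2 + 1
k_peak = int(np.argmax(magnitude[:half]))
peak_frequency = f[k_peak]        # 10 * 1 Hz / 100 = 0.1 Hz, i.e. a 10 s period
```

With $f_s = 1$ Hz and $N = 100$, the frequency resolution is $f_s / N = 0.01$ Hz, so a longer analysis window is needed to separate devices whose reporting periods are close to each other.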

Analyzing the spectral content of network traffic reveals distinctive patterns that can be leveraged to identify and classify devices. One key aspect is the detection of periodic behavior, as illustrated in Figure 5, where devices such as the Sonos speaker or the smart sleep sensor transmit data at regular intervals, producing clear spectral peaks at corresponding frequencies. Another key aspect is bursty behavior, often exhibited by devices like the Home Hub, which generate irregular, high-throughput transmissions; this results in broadband energy spread across a wide range of frequencies, reflecting the unpredictability and intensity of their communication. Finally, spectral content can expose background noise, such as random fluctuations in traffic caused by network jitter or interference. This noise typically manifests as low-amplitude, scattered energy across the frequency spectrum.

3.3. Frequency Domain Analyzer (FDA)

The Frequency Domain Analyzer (FDA) is a core component of the proposed FDFG solution, responsible for performing comprehensive spectral analysis on IoT network traffic that has been transformed into the frequency domain. Its primary objective is to extract rich and distinctive features that capture the behavioral patterns of devices, enabling accurate and reliable identification. The FDA first computes a set of fundamental spectral features—such as dominant frequencies, spectral entropy, and power distribution—which collectively form the Spectral-Only Frequency Fingerprint (SFF). To further enhance the fingerprint’s discriminative power, the FDA also calculates inter-feature relationships within the frequency domain, analyzing correlations between spectral representations of different traffic metrics. This extended fingerprint is referred to as the Spectro-Correlative Frequency Fingerprint (SCFF). Further details on the construction and characteristics of SFF and SCFF are provided in the following subsections.
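
The two-stage idea described above, per-metric spectral summaries followed by relationships between the spectra of different metrics, can be illustrated with a rough sketch. It is not the authors' FDA implementation: the function names, the use of a one-sided spectrum (`np.fft.rfft`), the choice of three top magnitudes, and the synthetic signals are all assumptions. Spectral entropy uses the standard definition over the normalized power spectrum, and the inter-feature relationship is represented here by a single Pearson correlation.

```python
import numpy as np


def magnitude_spectrum(x):
    """One-sided magnitude spectrum of a Hanning-windowed series."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x * np.hanning(len(x))))


def spectral_entropy(mag):
    """Shannon entropy of the normalized power spectrum (natural log)."""
    power = mag ** 2
    total = power.sum()
    if total == 0:
        return 0.0
    p = power / total
    p = p[p > 0]                      # skip empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)))


def top_k_magnitudes(mag, k=3):
    """The k largest spectral magnitudes, in descending order."""
    return np.sort(mag)[::-1][:k]


def spectral_correlation(mag_a, mag_b):
    """Pearson correlation between two magnitude spectra."""
    return float(np.corrcoef(mag_a, mag_b)[0, 1])


# Synthetic traffic metrics for one hypothetical device (1-second bins).
N = 128
n = np.arange(N)
rng = np.random.default_rng(0)
packets = 1.0 + np.cos(2 * np.pi * 0.1 * n)   # periodic packet rate
bytes_ = 500.0 * packets                       # byte rate tied to packet rate
noise = rng.normal(size=N)                     # unstructured traffic

mag_packets = magnitude_spectrum(packets)
mag_bytes = magnitude_spectrum(bytes_)
mag_noise = magnitude_spectrum(noise)

# Spectral-only part: per-metric summaries.
sff_like = np.concatenate([top_k_magnitudes(mag_packets),
                           [spectral_entropy(mag_packets)]])

# Correlative part: relationship between the spectra of two metrics.
scff_like = np.concatenate([sff_like,
                            [spectral_correlation(mag_packets, mag_bytes)]])
```

In this toy setup the periodic series has a much more concentrated spectrum than the noise series, so its entropy is lower, and the packet and byte spectra are perfectly correlated because one is a scaled copy of the other.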

Spectral-Only Frequency Fingerprint (SFF)

For each signal, we extract spectral characteristics from the filtered FFT output to capture device-specific patterns. The following features are extracted:

Magnitude of Dominant Frequencies: The magnitudes $| X [k] |$ of the top K frequency bins in the FFT output indicate the strength of periodic components. This feature captures device-specific periodic patterns, such as the regular updates of a smart thermostat versus the infrequent video streaming of an IP camera. Selecting only the K highest magnitudes reduces dimensionality while retaining key information.
Power Spectral Density (PSD): Calculated by Equation (9) for the top K frequency bins, the PSD quantifies the power of periodic components, normalized by the signal length N. This feature highlights the energy distribution, which varies between devices: power is high for bursty traffic and low for sparse traffic.

${| X [k] |}^{2} / N$

(9)
Spectral Entropy: Measures the randomness of the spectrum by Equation (10). Low entropy indicates a concentrated spectrum (few dominant frequencies); high entropy suggests noise or complex patterns. This feature differentiates devices with structured behavior, such as periodic beacons, from devices with irregular traffic, such as random sensor readings.

$- \sum_{k} p_{k} \log p_{k}$

(10)

where $p_{k} = {| X [k] |}^{2} / \sum {| X [k] |}^{2}$ .
Spectral Centroid: The “center of mass” of the spectrum is measured by Equation (11), indicating the average frequency weighted by magnitude. The feature varies between devices with different traffic rates, such as cameras vs. sensors. It summarizes the spectrum in one value, reducing dimensionality.

$\sum_{k} f_{k} \cdot | X [k] | / \sum_{k} | X [k] |$

(11)

where $f_{k}$ is the frequency of the k-th frequency bin.
Spectral Spread: Computed by Equation (12); spectral spread helps distinguish between devices with highly regular, narrowband behaviors versus complex or bursty traffic that spreads energy across a broader frequency range.

$\sqrt{\sum_{k} {(f_{k} - centroid)}^{2} \cdot | X [k] | / \sum_{k} | X [k] |}$

(12)
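The five features above can be sketched in NumPy as follows. This is a minimal illustration of Equations (9) to (12), not the authors' implementation: the function name `sff_features`, the default `top_k=3`, the mean removal before the FFT, and the use of the natural logarithm are assumptions of the sketch.

```python
import numpy as np

def sff_features(x, fs=1.0, top_k=3):
    """Spectral features of one traffic time series, after Equations (9)-(12)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Mean removal (an assumption) keeps the DC bin from dominating.
    mag = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    top = np.argsort(mag)[::-1][:top_k]          # the K strongest bins
    power = mag ** 2
    psd = power / n                              # Equation (9)

    p = power / power.sum()                      # assumes a non-constant signal
    nz = p > 0                                   # treat 0 * log(0) as 0
    entropy = -np.sum(p[nz] * np.log(p[nz]))     # Equation (10)

    centroid = np.sum(freqs * mag) / mag.sum()   # Equation (11)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * mag) / mag.sum())  # Eq. (12)

    return {
        "dominant_freqs": freqs[top],
        "dominant_mags": mag[top],
        "dominant_psd": psd[top],
        "entropy": float(entropy),
        "centroid": float(centroid),
        "spread": float(spread),
    }

t = np.arange(600)
tone = np.sin(2 * np.pi * 0.1 * t)                    # one clean 0.1 Hz component
noise = np.random.default_rng(0).normal(size=600)     # no periodic structure

tone_feats = sff_features(tone)
noise_feats = sff_features(noise)
print(tone_feats["dominant_freqs"][0], tone_feats["entropy"], noise_feats["entropy"])
```

A single clean tone yields near-zero entropy and spread with the centroid at the tone frequency, while white noise yields high entropy, which is the behavior the feature descriptions predict.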

3.4. Spectro-Correlative Frequency Fingerprint (SCFF)

Feature interdependencies refer to the relationships and interactions between different features within a dataset, including linear correlations, nonlinear dependencies, and synchronized patterns in their behavior over time or frequency. In the context of IoT fingerprinting, capturing these interdependencies enables machine learning models to develop a more holistic and nuanced understanding of device behavior. Instead of analyzing each feature in isolation, the model considers how features influence one another—for example, how packet size might correlate with inter-departure time, or how two traffic metrics exhibit synchronized frequency domain patterns. These joint behaviors often encode device-specific traits that are not apparent when features are viewed independently. Incorporating interdependent features into the fingerprinting process enhances the model’s robustness, distinctiveness, and accuracy, especially in dynamic or adversarial network environments where single-feature spoofing or noise might otherwise reduce performance.

We define three sets of correlation features, each revealing distinct characteristics of network traffic dynamics. Collectively, these features capture synchronization, spectral similarity, and cross-feature dependencies, forming a robust layer of discrimination for IoT fingerprinting models. These sets of correlation features are explained in the following.

3.4.1. Pairwise Pearson Correlation Coefficients

This set computes the Pearson correlation coefficient between the magnitude spectra (or power spectral densities, PSD) of each pair of the four time series. For time series i and j, with FFT outputs $X_{i} [k]$ and $X_{j} [k]$ (where $k = 0, 1, \dots, \lfloor N / 2 \rfloor$ for N samples), the magnitude spectra are $| X_{i} [k] |$ and $| X_{j} [k] |$. The correlation is computed by Equation (13) as:

$\rho_{i, j} = \frac{\sum_{k} (| X_{i} [k] | - \mu_{i}) (| X_{j} [k] | - \mu_{j})}{\sqrt{\sum_{k} {(| X_{i} [k] | - \mu_{i})}^{2} \sum_{k} {(| X_{j} [k] | - \mu_{j})}^{2}}}$

(13)

where $\mu_{i}$ is the mean of $| X_{i} [k] |$. Alternatively, the PSD (${| X_{i} [k] |}^{2} / N$) can be used to compute correlations based on normalized power spectra.

This set of features captures the overall linear similarity between the frequency domain profiles of traffic metrics. For example, a high correlation between bytes and packets per second (0.95) indicates synchronized periodic bursts, as in a video streaming device, while a lower correlation between bytes and inter-departure time (0.3) suggests independent timing patterns, typical of a sensor with fixed packet sizes.
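A minimal sketch of Equation (13) on synthetic data is given below. The three series and their names are hypothetical stand-ins for the traffic metrics, and removing the mean before the FFT is an assumption of the sketch (without it, a large DC bin shared by all non-negative series would inflate every correlation).

```python
import numpy as np
from itertools import combinations

def magnitude_spectrum(x):
    """One-sided FFT magnitudes; mean removal is an assumption of this sketch."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x - x.mean()))

def spectral_pearson(x_i, x_j):
    """Equation (13): Pearson correlation between two magnitude spectra."""
    a = magnitude_spectrum(x_i)
    b = magnitude_spectrum(x_j)
    a_c, b_c = a - a.mean(), b - b.mean()
    return float(np.sum(a_c * b_c) / np.sqrt(np.sum(a_c ** 2) * np.sum(b_c ** 2)))

rng = np.random.default_rng(0)
t = np.arange(600)
burst = (t % 10 < 3).astype(float)          # shared 10-second burst pattern

series = {                                   # hypothetical per-second series
    "bytes": 500.0 * burst + rng.normal(0.0, 5.0, 600),
    "packets": 5.0 * burst + rng.normal(0.0, 0.5, 600),
    "inter_departure": rng.exponential(1.0, 600),
}

# One correlation feature per unordered pair of series.
features = {
    f"rho_{a}_{b}": spectral_pearson(series[a], series[b])
    for a, b in combinations(series, 2)
}
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Series that share the same burst schedule have nearly identical spectral shapes and therefore a correlation close to 1, while the independently drawn series correlates only weakly with them.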

3.4.2. Cross-Spectral Coherence

This set measures the spectral coherence between each pair of time series to quantify how strongly they share frequency components over time. The coherence between two time series i and j at frequency k, denoted $\mathrm{Coherence}_{i, j} [k]$, is computed as:

$\mathrm{Coherence}_{i, j} [k] = \frac{{| \mathrm{CSD}_{i, j} [k] |}^{2}}{\mathrm{CSD}_{i, i} [k] \cdot \mathrm{CSD}_{j, j} [k]}$

(14)

where $\mathrm{CSD}_{i, j} [k] = X_{i} [k] \cdot {X_{j} [k]}^{*}$ is the Cross-Spectral Density, ${X_{j} [k]}^{*}$ is the complex conjugate of the FFT output $X_{j} [k]$, and $\mathrm{CSD}_{i, i} [k] = {| X_{i} [k] |}^{2}$ and $\mathrm{CSD}_{j, j} [k] = {| X_{j} [k] |}^{2}$ are the auto-spectral densities (power spectral densities) of time series i and j, respectively. Averaging the coherence over all frequency bins $k = 0, 1, \dots, \lfloor N / 2 \rfloor$ (for N samples) yields a scalar coherence feature for each pair $(i, j)$:

$\mathrm{Coherence}_{i, j} = \frac{1}{\lfloor N / 2 \rfloor + 1} \sum_{k = 0}^{\lfloor N / 2 \rfloor} \mathrm{Coherence}_{i, j} [k]$

(15)

This scalar represents the average strength of linear association between the two time series in the frequency domain.

Coherence captures frequency-specific synchronization, revealing whether two time series exhibit correlated periodic behaviors at particular frequencies. For instance, high coherence at 0.1 Hz between bytes and packets indicates synchronized 10-s bursts (e.g., in a smart speaker), while low coherence between packets and inter-departure time at the same frequency suggests irregular timing (e.g., in a temperature sensor). This granularity is critical for IoT devices with distinct polling or beacon intervals.
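The coherence computation can be sketched in NumPy as shown below. One practical point is worth noting: when Equation (14) is evaluated on a single, un-averaged FFT, the ratio equals 1 at every bin, so this sketch accumulates the cross- and auto-spectral densities over non-overlapping segments before forming the ratio. That segment averaging, the segment length, and the synthetic signals are assumptions of the sketch and are not specified in the text.

```python
import numpy as np

def averaged_coherence(x_i, x_j, seg_len, fs=1.0):
    """Magnitude-squared coherence per bin, averaged over non-overlapping segments."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    n_bins = seg_len // 2 + 1
    csd = np.zeros(n_bins, dtype=complex)
    p_ii = np.zeros(n_bins)
    p_jj = np.zeros(n_bins)
    for start in range(0, len(x_i) - seg_len + 1, seg_len):
        f_i = np.fft.rfft(x_i[start:start + seg_len])
        f_j = np.fft.rfft(x_j[start:start + seg_len])
        csd += f_i * np.conj(f_j)        # accumulate cross-spectral density
        p_ii += np.abs(f_i) ** 2         # accumulate auto-spectral densities
        p_jj += np.abs(f_j) ** 2
    coh = np.abs(csd) ** 2 / (p_ii * p_jj)     # Equation (14) on accumulated spectra
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, coh

rng = np.random.default_rng(0)
t = np.arange(1200)
tone = np.sin(2 * np.pi * 0.1 * t)                  # shared 10-second rhythm
bytes_ps = 10.0 * tone + rng.normal(size=1200)      # hypothetical, zero-mean
packets_ps = 5.0 * tone + rng.normal(size=1200)     # follows the same rhythm
idt = rng.normal(size=1200)                         # unrelated timing series

freqs, coh_bp = averaged_coherence(bytes_ps, packets_ps, seg_len=100)
_, coh_bi = averaged_coherence(bytes_ps, idt, seg_len=100)
_, coh_single = averaged_coherence(bytes_ps, idt, seg_len=1200)   # one segment only

k = int(np.argmin(np.abs(freqs - 0.1)))             # bin closest to 0.1 Hz
print(f"bytes/packets coherence at 0.1 Hz: {coh_bp[k]:.3f}")
print(f"bytes/idt coherence at 0.1 Hz:     {coh_bi[k]:.3f}")
print(f"scalar feature, Equation (15):     {coh_bp.mean():.3f}")
print(f"single-segment minimum:            {coh_single.min():.3f}")
```

With twelve segments, the pair that shares the 0.1 Hz rhythm shows coherence near 1 at that bin, whereas the unrelated pair does not; the single-segment call illustrates why some form of averaging is needed for the measure to discriminate.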

3.4.3. Mutual Information Between Spectral Magnitudes

Mutual Information (MI) is a powerful statistical measure that quantifies the amount of shared information between two random variables. Unlike correlation, which only captures linear dependencies, MI can detect both linear and nonlinear relationships, making it highly useful for analyzing complex systems. In the context of frequency domain analysis of time series (e.g., bytes/sec and packet size spectra), MI can reveal inter-feature dependencies that are not visible with linear tools like correlation. For example, two features might exhibit nonlinear frequency alignments, such as shared harmonic structures or periodic interference patterns. Mutual Information can detect these even if their Pearson correlation is zero.

For time series i and j, with FFT outputs $X_{i} [k]$ and $X_{j} [k]$ (where $k = 0, 1, \dots, \lfloor N / 2 \rfloor$ for N samples), the magnitude spectra are $| X_{i} [k] |$ and $| X_{j} [k] |$. The mutual information $I (| X_{i} |, | X_{j} |)$ quantifies the shared information between these magnitude spectra, as follows:

$I (X; Y) = \sum_{x \in X} \sum_{y \in Y} p_{X, Y} (x, y) \log \left( \frac{p_{X, Y} (x, y)}{p_{X} (x) p_{Y} (y)} \right)$

(16)

This can also be written as:

$I (X; Y) = H (X) + H (Y) - H (X, Y)$

(17)

where:

$H (X) = - \sum_{x \in X} p_{X} (x) \log p_{X} (x)$ is the entropy of X,
$H (Y) = - \sum_{y \in Y} p_{Y} (y) \log p_{Y} (y)$ is the entropy of Y,
$H (X, Y) = - \sum_{x \in X} \sum_{y \in Y} p_{X, Y} (x, y) \log p_{X, Y} (x, y)$ is the joint entropy.
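Equations (16) and (17) require probability estimates for the magnitude values. The sketch below assumes a simple equal-width 2D histogram estimator with a hypothetical bin count of 8; the text does not state which estimator was used, so this is one possible choice rather than the method of record. It evaluates both forms and shows that they agree.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats; empty histogram cells contribute nothing."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def spectral_mi(x_i, x_j, bins=8):
    """Histogram estimate of I(|X_i|; |X_j|) via Equations (16) and (17)."""
    a = np.abs(np.fft.rfft(np.asarray(x_i, dtype=float)))
    b = np.abs(np.fft.rfft(np.asarray(x_j, dtype=float)))
    counts, _, _ = np.histogram2d(a, b, bins=bins)
    p_xy = counts / counts.sum()           # joint distribution
    p_x = p_xy.sum(axis=1)                 # marginal of |X_i|
    p_y = p_xy.sum(axis=0)                 # marginal of |X_j|

    mask = p_xy > 0
    mi_16 = float(np.sum(p_xy[mask] * np.log(p_xy[mask] / np.outer(p_x, p_y)[mask])))
    mi_17 = entropy(p_x) + entropy(p_y) - entropy(p_xy)
    return mi_16, mi_17

rng = np.random.default_rng(0)
x = rng.normal(size=512)      # synthetic, zero-mean series
y = rng.normal(size=512)      # independent of x

mi_self = spectral_mi(x, x)   # a spectrum shares all its information with itself
mi_indep = spectral_mi(x, y)  # independent spectra share little
print(f"I(X;X) = {mi_self[1]:.3f} nats, I(X;Y) = {mi_indep[1]:.3f} nats")
```

Histogram estimators carry a small positive bias for finite samples, so the value for independent series is close to, but not exactly, zero.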

3.5. Time Complexity Analysis of SFF and SCFF

To analyze the time complexity of computing the Spectral-Only Frequency Fingerprint (SFF) and the Spectro-Correlative Frequency Fingerprint (SCFF), we consider the computational steps involved in each, focusing on the Fast Fourier Transform (FFT) and the subsequent feature calculations described above. The FFT is a critical component of both SFF and SCFF, as it transforms time domain signals into the frequency domain, and its time complexity significantly influences the overall computational cost. Let N be the length of the input time series (number of samples) and M the number of time series signals.

3.5.1. SFF Time Complexity

Time Series Signal Generation (TSSG): Converts packet-level data (e.g., timestamps, sizes, directions) into time series signals. If there are N packets per device session, this step takes $O (N)$ time.
FFT Complexity: The FFT algorithm (e.g., Cooley-Tukey) has a time complexity of $O (N \log N)$ for a signal of length N.
If M dimensions (multiple traffic features) are transformed, the total cost is $O (M \cdot N \log N)$.
Spectral Feature Extraction: Extracts dominant frequencies, power spectral density, and spectral entropy from the FFT output. Each of these features can be computed in $O (N)$ per signal, so for M signals the total cost is $O (M \cdot N)$.
Total time complexity of SFF: $O (M \cdot N \log N)$, since the FFT dominates, especially for larger time series.

3.5.2. SCFF Time Complexity

SCFF includes all steps of SFF and adds correlation computations between spectral vectors.

All SFF steps: $O (M \cdot N \log N)$.
Computes pairwise cross-spectral coherence, Pearson correlation, and mutual information between each pair of the M dimensions. The number of unique pairs is $M (M - 1) / 2$, i.e., $O (M^{2})$.
Each metric (coherence, correlation, mutual information) typically takes $O (N)$ per pair, so the total time for correlation analysis is $O (M^{2} \cdot N)$.
Total time complexity of SCFF is $O (M \cdot N \log N + M^{2} \cdot N)$.

The time complexity of the Spectral-Only Frequency Fingerprint (SFF) and Spectro-Correlative Frequency Fingerprint (SCFF) methods can be considered moderate in the context of IoT traffic analysis. SFF has a total time complexity of $O (M \cdot N \log N)$, where N is the length of the time series signal (i.e., number of packets) and M is the number of features or dimensions being analyzed. This is largely dominated by the Fast Fourier Transform (FFT) step, which is efficient and well-optimized in modern libraries. SCFF extends this by incorporating inter-feature correlation computations, resulting in a higher complexity of $O (M \cdot N \log N + M^{2} \cdot N)$ due to pairwise comparisons across all feature combinations. With the four traffic time series used here, the pairwise term covers only six pairs. While SCFF is computationally heavier than SFF, both approaches remain feasible for practical use, particularly given the relatively short and bursty nature of IoT communication sessions.
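To make the relative cost of each stage concrete, the following sketch tallies rough operation counts with all constant factors dropped. The function name and the example values N = 1024 and M = 4 are illustrative assumptions; the counts are order-of-magnitude estimates rather than measurements.

```python
from itertools import combinations
from math import log2

def operation_estimates(n, m):
    """Rough operation counts (constants dropped) for m series of length n."""
    n_pairs = len(list(combinations(range(m), 2)))   # unique pairs: m * (m - 1) / 2
    return {
        "fft": m * n * log2(n),          # one FFT per series
        "spectral_features": m * n,      # one linear pass per series
        "pairs": n_pairs,
        "pairwise": n_pairs * n,         # one linear pass per pair and metric
    }

est = operation_estimates(n=1024, m=4)
for stage, count in est.items():
    print(f"{stage}: {count:g}")
```

For four series of 1024 samples, the FFT stage accounts for roughly 41,000 operations against about 6,000 for the six pairwise passes, so the transform remains the dominant term when M is small.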

4. Experiments and Results

In this section, we present the datasets, evaluation scenarios, and experimental results used to assess the effectiveness of the proposed frequency domain solutions for IoT device fingerprinting.

The performance of the proposed IoT frequency domain fingerprinting solutions is evaluated by comparing them against baseline models built using traditional time domain features. To ensure a fair comparison, the baseline models are constructed using the same machine learning algorithms and trained on the same datasets as the frequency-based models. The only difference lies in the feature representation. The baseline models use four key time domain features, which are also the foundation for generating the frequency domain fingerprints in our proposed approach. These include: Bytes Transferred per Second, Packets Transferred per Second, Average Packet Size per Second, and Average Packet Inter-Departure Time per Second. These features are used in their raw, time domain form without transformation into the frequency domain.
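As an illustration of how the four per-second time domain features might be derived from packet-level data, consider the sketch below. The function name, the one-second bin width anchored at the first packet, the attribution of each inter-departure gap to the second of the later packet, and the convention of writing 0 for seconds without packets are all assumptions of this sketch.

```python
import numpy as np

def per_second_series(timestamps, sizes):
    """Build the four per-second series from packet timestamps (s) and sizes (bytes)."""
    ts = np.asarray(timestamps, dtype=float)
    sz = np.asarray(sizes, dtype=float)
    order = np.argsort(ts)
    ts, sz = ts[order], sz[order]

    sec = np.floor(ts - ts[0]).astype(int)       # one-second bin of each packet
    n = int(sec[-1]) + 1

    bytes_ps = np.bincount(sec, weights=sz, minlength=n)
    pkts_ps = np.bincount(sec, minlength=n).astype(float)
    # Seconds without packets get 0 (an assumption; other fill rules are possible).
    avg_size = np.divide(bytes_ps, pkts_ps, out=np.zeros(n), where=pkts_ps > 0)

    gaps = np.diff(ts)                           # inter-departure times
    gap_sum = np.bincount(sec[1:], weights=gaps, minlength=n)
    gap_cnt = np.bincount(sec[1:], minlength=n)
    avg_idt = np.divide(gap_sum, gap_cnt, out=np.zeros(n), where=gap_cnt > 0)

    return bytes_ps, pkts_ps, avg_size, avg_idt

# Five hypothetical packets spread over three seconds.
bytes_ps, pkts_ps, avg_size, avg_idt = per_second_series(
    [0.0, 0.2, 0.7, 2.1, 2.5], [100, 200, 300, 400, 600]
)
print(bytes_ps, pkts_ps, avg_size, avg_idt)
```

Each returned array is one of the four signals that serve both as baseline features and as input to the FFT in the frequency domain pipeline.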

To quantitatively evaluate performance, we employ standard machine learning evaluation metrics, including accuracy, precision, recall, and F1-score. These metrics provide a comprehensive and balanced assessment of the system’s ability to correctly identify IoT devices while minimizing false positives and false negatives. Accuracy measures the overall correctness of predictions, precision reflects the proportion of true positive identifications among all predicted positives, recall indicates the proportion of true positives identified out of all actual positives, and F1-score provides a harmonic mean of precision and recall, especially useful in imbalanced datasets.
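For a multi-class task such as device identification, these metrics are computed per class and then aggregated. The sketch below assumes macro averaging (an unweighted mean over classes) and counts undefined ratios as 0; the text does not state which averaging scheme was used, and the labels are hypothetical.

```python
from collections import Counter

def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1      # predicted class gains a false positive
            fn[truth] += 1     # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0   # undefined ratios are counted as 0

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = safe_div(tp[c], tp[c] + fp[c])
        r = safe_div(tp[c], tp[c] + fn[c])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

y_true = ["camera", "camera", "camera", "plug", "plug", "hub"]
y_pred = ["camera", "camera", "plug", "plug", "plug", "camera"]
scores = macro_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

In this toy example the rare "hub" class is never predicted correctly, which lowers the macro-averaged scores well below the accuracy and shows why accuracy alone can be misleading on imbalanced data.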

All experiments are conducted using 5-fold cross-validation to ensure statistical robustness and to mitigate the risk of overfitting. This approach partitions the dataset into five equally sized subsets, iteratively using four for training and one for testing, and averaging the results across all folds to provide a more reliable performance estimate.
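The fold construction can be sketched as follows. This is a plain shuffled split written with NumPy; whether the original experiments used stratified folds or a fixed random seed is not stated, so both choices here are assumptions, and the `evaluate` callback is a hypothetical placeholder for training and scoring a classifier.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def cross_val_mean(evaluate, n_samples, k=5):
    """Average a user-supplied evaluate(train_idx, test_idx) score over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return float(np.mean(scores))

splits = list(k_fold_indices(100, k=5))
# Placeholder score (the test-fold fraction); a real run would fit and score a model.
mean_score = cross_val_mean(lambda tr, te: len(te) / 100, 100)
print(len(splits), mean_score)
```

Every sample appears in exactly one test fold, so each fingerprint is used for testing once and for training four times.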

All experiments were conducted on Google Colab Pro, utilizing an NVIDIA A100 GPU (manufactured by TSMC, Phoenix, AZ, USA; 40 GB HBM2 memory, Ampere architecture) alongside 83.5 GB of system RAM and 235.7 GB of SSD storage, providing a high-performance environment for processing large-scale IoT traffic data and accelerating model training. The software environment was based on Python 3.10, with CUDA 12.2 and the RAPIDS 23.06 suite (cuML, cuDF) pre-installed for GPU-accelerated computing. Key libraries included scikit-learn 1.6.1 for traditional machine learning algorithms and evaluation metrics, XGBoost 3.0.3 with GPU support for efficient gradient boosting, pandas 2.2.3 for data manipulation, and NumPy 2.1.3 for numerical operations.

4.1. Evaluation Scenarios

The evaluation is conducted in two primary experimental scenarios, each designed to assess a different aspect of the proposed frequency domain fingerprinting approach.

Scenario 1: In the first scenario, we apply supervised machine learning algorithms to perform individual device identification, also known as device-level fingerprinting. The goal here is to determine how effectively the proposed fingerprints SFF and SCFF can distinguish between multiple instances of the same device type, even when they originate from the same manufacturer and exhibit similar functionality. This setting simulates scenarios where fine-grained device identification is critical, such as tracking specific devices within a network or detecting unauthorized replicas.
Scenario 2: In the second scenario, we use the same supervised classifiers to evaluate the performance of the fingerprints in device type classification—that is, grouping devices into broader functional categories such as smart plugs, IP cameras, thermostats, or sensors. This setting focuses on capturing shared behavioral characteristics among different instances of the same type, enabling the model to generalize across manufacturers and deployment contexts.

Together, these two evaluation scenarios provide a comprehensive assessment of the frequency-based fingerprinting approach: the first tests its ability to capture device-specific uniqueness, while the second measures its capacity to generalize across functional similarities. This dual evaluation ensures that the proposed method is suitable for both fine-grained security applications and high-level device inventory and monitoring tasks.

4.2. Machine Learning Models

We used three distinct machine learning algorithms: K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM), representing three different families of machine learning techniques. These algorithms were chosen to explore the diverse aspects of feature vectors. KNN, an instance-based learning algorithm, classifies instances by computing the distance to the k nearest training examples in the feature space, making it effective for capturing local patterns in data. RF, an ensemble method from the decision tree family, constructs multiple decision trees during training and aggregates their predictions by introducing randomness in feature selection and bootstrapping, which enhances robustness against overfitting and captures complex, non-linear relationships in frequency domain features like magnitude spectra. SVM, a kernel-based algorithm from the discriminative family, classifies data by finding the optimal hyperplane that maximizes the margin between classes in a transformed feature space. These fundamental differences in their underlying methodologies enable a comprehensive exploration of the feature vectors’ geometric, hierarchical, and boundary characteristics, respectively. This diversity allows the evaluation of how well each algorithm leverages interdependencies and frequency-specific patterns to classify IoT device fingerprints.
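Of the three algorithms, KNN is simple enough to sketch in a few lines, which makes its distance-based decision rule explicit. The following NumPy version is for illustration only; the experiments used library implementations, and the value k = 3, the Euclidean metric, and the two-dimensional toy fingerprints are assumptions of the sketch.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test vector by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])   # ties resolve to the first label
    return np.array(preds)

# Toy two-dimensional "fingerprints" for two hypothetical device classes.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = ["sensor", "sensor", "sensor", "camera", "camera", "camera"]
preds = knn_predict(X_train, y_train, [[0.2, 0.1], [5.5, 5.2]])
print(preds)
```

Because the vote depends only on raw distances, features with large numeric ranges dominate the decision, which is one reason distance-based classifiers are sensitive to feature scaling.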

Table 1 presents the finalized hyperparameter settings for the KNN, RF, and SVM algorithms, determined after experimentation with various parameter combinations. Multiple values were tested for each hyperparameter, including different numbers of neighbors, tree depths, and kernel configurations, to optimize model performance on the respective datasets. The selected values were found to yield the best performance in terms of accuracy, precision, and generalization across validation sets.

4.3. Datasets

We use four datasets to evaluate the proposed frequency domain analysis approach: CIC IoT Dataset 2022 [14], IoT Device Classification Dataset [16] (referred to below as UNSW), Deakin IoT Traffic (D-IoT) Dataset [15], and D-Link IoT Traffic Traces Dataset [17]. Together, they provide a rich and comprehensive foundation for experimentation. These datasets are valuable due to their diverse characteristics, broad coverage of IoT device behaviors, and relevance to real-world network traffic scenarios. Collectively, they reflect a diverse range of IoT environments, device types, and communication patterns, making them well suited for assessing the generalizability and effectiveness of the proposed fingerprinting method.

The datasets used for evaluation encompass different types of IoT devices, including smart plugs, IP cameras, smart bulbs, hubs, environmental sensors, and networked switches. This variety introduces diversity in hardware platforms, manufacturer-specific behaviors, and communication characteristics, making the evaluation more representative of real-world IoT environments. Furthermore, the datasets feature devices that communicate using different application-layer and transport-layer protocols, such as MQTT, CoAP, HTTP, HTTPS, DNS, and UPnP. This protocol heterogeneity introduces distinct traffic structures and timing behaviors, which are essential for assessing the effectiveness and generalizability of frequency domain features across different communication stacks. In addition, the datasets include traffic captured during various operational states, such as idle mode, active user interaction, and initial setup/configuration. Table 2 summarizes these key attributes across the different datasets, providing an overview of their scope and relevance to the experimental design.

4.4. Individual Device Fingerprinting Classification

This task focuses on uniquely identifying a specific physical instance of an IoT device, even when multiple devices of the same type and model are present within the same network environment. Individual device identification distinguishes between identical devices based on subtle differences in their behavior or hardware characteristics. For example, in a smart home setting with several Amazon Echo Dot devices deployed in different rooms, the goal is to determine which specific unit is generating a given stream of network traffic.

The number of device instances available for this task varies across datasets, which directly impacts the difficulty and complexity of the identification process. Some datasets contain only a few instances per device type, while others include a broader set of instances, making the classification task more challenging. A detailed summary of the number of instances per dataset is provided in Table 2, which offers context for evaluating the performance of fingerprinting methods in both low- and high-instance scenarios.

4.4.1. Baseline Model Performance

To ensure a fair comparison, the baseline models are built using the same machine learning algorithms and are trained on the same datasets as the frequency-based models. The only distinction between the two approaches lies in the feature representation. The baseline models utilize four key time domain features—bytes transferred per second, packets transferred per second, average packet size, and average packet inter-departure time—which also serve as the foundation for generating the frequency domain fingerprints in the proposed frequency-based fingerprinting solutions. This consistent setup allows for a clear and unbiased evaluation of the added value provided by frequency domain analysis.

Figure 6 illustrates the accuracy performance of baseline models in identifying individual IoT devices using three machine learning algorithms, KNN, RF, and SVM, across four benchmark datasets: CIC, Deakin, UNSW, and D-Link. Accuracy values range from 0 to 1, indicating the proportion of correctly identified device instances. Among the algorithms, RF consistently delivers the best performance, achieving the highest accuracy of 0.70 on the UNSW dataset, followed closely by 0.69 on Deakin and 0.62 on CIC. SVM performs competitively, with accuracies ranging from 0.62 to 0.68, slightly below RF but still robust across datasets. KNN, which relies on instance-based distance calculations, shows moderate performance (0.54–0.62) but struggles particularly with the D-Link dataset, where its accuracy drops to 0.33. The D-Link dataset exhibits the lowest classification performance across all three algorithms (0.33–0.35), suggesting it may contain more sparse, noisy, or overlapping traffic patterns that challenge time domain representation.

The precision, recall, and F1-score of the baseline models trained using time domain features to identify individual IoT devices are presented in Table 3. The results indicate low to moderate performance across most datasets. Among the three algorithms tested, RF consistently outperforms KNN and shows a slight edge over SVM. For example, RF achieves F1-scores of 0.70 on the Deakin dataset and 0.71 on the UNSW dataset, indicating relatively stronger classification performance in those environments. SVM performs comparably, with marginally lower scores, suggesting it is also capable of capturing key behavioral patterns in time domain features. KNN is generally the weakest performer, particularly on the D-Link dataset, where its F1-score drops to 0.34. Overall, the results reveal that time domain-based models struggle to achieve high precision and recall.

4.4.2. Evaluation of SFF Solution

The accuracy results of the proposed SFF solution are presented in Figure 7, where the same traffic features were transformed into the frequency domain using the FFT, and spectral characteristics such as dominant frequencies and power distributions were used for classification. The same datasets and machine learning algorithms were applied.

Notably, SVM consistently outperformed the other models, with accuracies ranging from a significantly improved 0.67 on D-Link to a maximum of 0.96 on UNSW, and intermediate values of 0.83 and 0.90 on the remaining two datasets (CIC and Deakin). RF also showed improved results compared to the baseline, with accuracies between 0.56 and 0.94, while KNN achieved up to 0.85 on UNSW and maintained consistent improvements across all datasets.

In comparison to the baseline models, the SFF approach yields a significant improvement in accuracy across all datasets and algorithms. For instance, while the baseline SVM model reached only 0.62 to 0.68, the SFF SVM achieves 0.67 to 0.96, indicating that spectral features capture more discriminative and consistent device-specific patterns. Similarly, RF improves from a baseline range of (0.35–0.70) to (0.56–0.94) in SFF, and KNN improves moderately from (0.33–0.62) to (0.40–0.85).

Overall, Figure 7 clearly demonstrates that the SFF significantly outperforms time domain baseline models, validating the hypothesis that frequency domain features are more robust and expressive for capturing the behavioral signatures of IoT devices—especially in sparse, irregular, or periodic traffic scenarios.

The results for the SFF solution are presented in Table 4. Among the classifiers evaluated, SVM consistently demonstrates the best performance, achieving F1-scores between 0.81 and 0.84 on the CIC, Deakin, and UNSW datasets. Even on the more challenging D-Link dataset, which contains sparse and irregular traffic, SVM maintains a relatively strong performance with an F1-score of 0.66. RF also performs reasonably well, with F1-scores ranging from 0.65 to 0.96, showing its ability to model complex relationships in the spectral feature space, though not quite at the level of SVM. KNN, however, shows weaker performance, particularly on D-Link where its F1-score drops to 0.43, indicating that its distance-based method may be less effective in capturing the nuanced patterns within frequency domain data. These results highlight that while SFF provides a clear advantage over time domain features, there is still room for improvement.

4.4.3. Evaluation of SCFF Solution

The accuracy of the SCFF solution in identifying individual IoT devices is illustrated in Figure 8, where both spectral features and interdependency features are integrated to construct a more expressive and discriminative fingerprint. The SCFF solution demonstrates a significant improvement in accuracy across all combinations of algorithms and datasets, outperforming both the baseline time domain models and the SFF models. Notably, SVM achieves near-perfect accuracy, reaching 1.0 on Deakin, 0.99 on CIC, and 0.98 on UNSW, while still achieving a strong 0.67 on the more challenging D-Link dataset. RF also shows outstanding performance, with accuracy values ranging from 0.65 on D-Link to 0.95 on UNSW, and over 0.90 on CIC and Deakin. Even KNN, typically more sensitive to data sparsity, shows a substantial boost—rising to 0.86 on CIC, 0.91 on Deakin, and 0.95 on UNSW. Compared to the SFF models, which rely solely on frequency domain spectral features, the SCFF approach introduces cross-feature relational context, allowing the model to capture synchronized patterns and dependencies between different traffic dimensions. This multi-dimensional behavioral insight strengthens the model’s ability to distinguish devices with similar spectral shapes but different cross-feature dynamics.

Table 5 reports the precision, recall, and F1-score results for the enhanced SCFF solution, which extends the SFF by integrating interdependency features alongside spectral characteristics. This approach significantly improves classification across all datasets and algorithms. SVM achieves near-perfect results, with F1-scores of 0.99 on CIC, 1.00 on Deakin, and 0.98 on UNSW, demonstrating outstanding accuracy and generalization. Even on the more challenging D-Link dataset, which contains sparse and noisy traffic patterns, SVM maintains a relatively strong F1-score of 0.66, matching its performance in the Spectral-Only model but with improved stability. RF also benefits significantly from the addition of interdependency features, achieving F1-scores of 0.96 on UNSW and 0.94 on Deakin, well above its performance in both the baseline and SFF approaches. KNN, previously the weakest performer, shows noticeable improvement, particularly on D-Link, where its F1-score climbs from 0.43 (SFF) to 0.61, indicating increased robustness even for simpler, distance-based classifiers.

Overall, SCFF clearly outperforms both time domain and Spectral-Only approaches. By capturing not just what each feature does individually, but also how they interact and reinforce one another, SCFF offers a richer, more stable fingerprint of IoT device behavior. The inclusion of precision, recall, and F1-score in the evaluation provides a well-rounded assessment that reflects both classification correctness (low false positives) and completeness (low false negatives), affirming SCFF as a robust, scalable, and highly effective solution for accurate and context-aware IoT device identification.

4.5. Device Type Fingerprinting Classification

The datasets used in this study vary in their breadth of device type coverage. The CIC IoT Dataset 2022 [14] includes 31 distinct device types, reflecting a wide range of IoT categories and usage scenarios. The Deakin IoT Traffic (D-IoT) Dataset [15] contains 11 device types, while the UNSW dataset [16] includes 21 types, offering a good balance between variety and focus. The D-Link IoT Traffic Traces Dataset [17], though more limited in scope, includes 4 device types, primarily focusing on networked consumer devices.

Each device type in the datasets consists of one or more device instances that are always from the same manufacturer and serve the same functional purpose—such as smart plugs, IP cameras, smart switches, or sensors. This ensures consistency in device behavior within each type, allowing for meaningful classification and fingerprinting. For instance, in the CIC dataset, the device type labeled amazon-alexa-echo-dot includes two separate instances of the same Amazon Echo Dot model, both produced by Amazon and serving identical voice assistant functions.
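The relationship between the two labeling levels can be expressed as a simple mapping from instance identifiers to type labels. In the sketch below, only the type name amazon-alexa-echo-dot comes from the text; the instance identifiers, the "smart-plug" label, and the sample list are hypothetical.

```python
from collections import Counter

# Hypothetical instance identifiers; real names come from each dataset.
instance_to_type = {
    "echo-dot-1": "amazon-alexa-echo-dot",
    "echo-dot-2": "amazon-alexa-echo-dot",
    "plug-1": "smart-plug",
}

# One instance label per traffic sample (e.g., per fingerprint vector).
instance_labels = ["echo-dot-1", "echo-dot-2", "plug-1", "echo-dot-1"]

# Device type classification reuses the same samples with coarser labels.
type_labels = [instance_to_type[name] for name in instance_labels]

print(Counter(instance_labels))
print(Counter(type_labels))
```

The same fingerprints therefore support both tasks: instance-level identification separates three classes in this example, while type-level classification merges the two Echo Dot units into one class.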

A detailed breakdown of the specific device types and corresponding instances included in each dataset is provided in Appendix A, where tables summarize their functional roles and manufacturer details. This categorization is critical for both device type classification and instance-level fingerprinting, allowing for consistent benchmarking and comparative analysis across datasets. In this section, we evaluate the performance of the baseline models as well as the two proposed approaches (i.e., SFF and SCFF), in identifying the IoT device types. Then, we compare the obtained results.

4.5.1. Baseline Models Performance

The accuracy of the baseline models in classifying IoT device types is evaluated using the same three machine learning algorithms, KNN, RF, and SVM, across the same four benchmark datasets: CIC, Deakin, UNSW, and D-Link.

Figure 9 presents the baseline performance of IoT device type classification using time domain features across four benchmark datasets: CIC, Deakin, UNSW, and D-Link. Among the classifiers, RF achieves the highest accuracy overall, with results peaking at 0.88 on Deakin, followed by 0.85 on D-Link, and 0.73 and 0.70 on CIC and UNSW, respectively. SVM also performs competitively, achieving 0.86 on Deakin, 0.81 on D-Link, and 0.70 on both CIC and UNSW. KNN shows more variability, with its best result on Deakin (0.85) but noticeably lower scores on CIC (0.61) and UNSW (0.62).

Table 6 presents the precision, recall, and F1-score for baseline models using time domain features to classify IoT device types. Across all datasets, RF consistently delivers the strongest performance among the three models. For example, RF achieves F1-scores of 0.72 (CIC), 0.86 (Deakin), 0.74 (UNSW), and 0.85 (D-Link), reflecting its ability to generalize well across varying traffic patterns and device types. SVM also performs competitively, with F1-scores close to RF in most cases: 0.72 (CIC), 0.84 (Deakin), 0.68 (UNSW), and 0.84 (D-Link). These results show that SVM is capable of effectively modeling decision boundaries even with time domain data, though it slightly trails RF in most cases. KNN, while showing strong performance on Deakin (F1-score: 0.85) and D-Link (0.73), falls behind on CIC (0.63) and UNSW (0.60). Its relatively lower performance on these datasets suggests that its instance-based approach may struggle with overlapping or noisy time domain feature distributions, making it less suitable in more complex classification tasks.

These results establish a baseline performance benchmark for IoT device type classification using time domain features, and at the same time, they highlight the limitations of such features in capturing the complex, often subtle behavioral patterns exhibited by IoT devices.

4.5.2. Evaluation of the SFF Solution

Figure 10 illustrates the classification accuracy of three machine learning algorithms—KNN, RF, and SVM—across four datasets (CIC, Deakin, UNSW, and D-Link) using the Spectral-Only Frequency Fingerprint (SFF) approach for IoT device type identification. Across all classifiers and datasets, SFF consistently outperforms the baseline time domain models. For example, SVM achieves perfect accuracy (1.0) on both the Deakin and D-Link datasets and near-perfect results on CIC (0.99) and UNSW (0.96). RF also demonstrates strong results, reaching up to 0.98 on Deakin and 0.99 on D-Link, with accuracies above 0.94 on the remaining datasets. Even KNN, typically more sensitive to feature scaling and noise, shows substantial improvements—achieving 0.95 on D-Link, 0.96 on Deakin, and outperforming its baseline accuracy by a large margin.

The D-Link dataset, in particular, shows a notable improvement when comparing device type classification using SFF to individual device classification using both time domain and spectral baselines. While D-Link previously presented a challenge due to its sparse and overlapping traffic patterns (resulting in lower F1-scores of around 0.34–0.66 in baseline and SFF instance-level tasks), the SFF-based device type classification yields exceptionally high accuracy (up to 1.0). This suggests that spectral features are highly effective at capturing functional similarities and periodic behaviors shared among devices of the same type, even when instance-level identification is more difficult.

In terms of precision, recall, and F1-score, the results are summarized in Table 7, which reports the performance of SFF fingerprints for the three algorithms across the four datasets. The results show that the SFF solution provides accurate and consistent performance in device type classification across multiple datasets, particularly when paired with more powerful classifiers. SVM delivers the highest or near-highest performance, achieving F1-scores of 0.99 on CIC and Deakin, 0.98 on UNSW, and 0.84 on the more challenging D-Link dataset. RF also performs strongly, with F1-scores ranging from 0.85 to 0.99, demonstrating its robustness and ability to generalize well with spectral features. While KNN performs well on Deakin and CIC, it struggles with the D-Link dataset, where its F1-score drops to 0.73, highlighting its sensitivity to overlapping or less distinctive patterns in the data.

4.5.3. Evaluation of SCFF Solution

Figure 11 illustrates the classification accuracy of three machine learning algorithms—KNN, RF, and SVM—applied across four datasets (CIC, Deakin, UNSW, and D-Link) using the SCFF for IoT device type classification. The results show consistently high accuracy across all classifiers and datasets, with performance improvements over both the baseline time domain models and the Spectral-Only Frequency Fingerprint (SFF) models. For example, SVM achieves near-perfect or perfect accuracy, reaching 1.0 on Deakin and D-Link, and 0.98 on both CIC and UNSW. Similarly, RF reaches 1.0 on Deakin and D-Link, and 0.98 on CIC, showing its strong capacity to generalize from enriched SCFF features. KNN, which previously showed more variability, also benefits substantially from SCFF, achieving 0.95 or higher across all datasets, including 0.95 on D-Link, where it previously performed poorly with time domain models.

Table 8 summarizes the precision, recall, and F1-score results for the SCFF solution in IoT device type classification across multiple datasets. SVM achieves perfect scores (1.00) on Deakin and D-Link and scores of at least 0.98 on CIC and UNSW, while Random Forest (RF) performs strongly, with all metrics at or above 0.96, indicating high robustness. KNN also shows reliable performance, achieving 0.95 precision on CIC, UNSW, and D-Link, and 0.98 on Deakin, with its lowest F1-score at 0.96. These results confirm the effectiveness and consistency of SCFF in accurately classifying IoT device types across diverse environments.

4.6. Impact of Correlation-Based Features on SCFF Performance

We conduct an ablation study to evaluate the contribution of each correlation-based feature in the Spectro-Correlative Frequency Fingerprint (SCFF) and determine which has the most significant impact on classification performance. The experiment is performed on the challenging CIC2022 IoT dataset with three machine learning classifiers: KNN, RF, and SVM. The evaluation follows a structured procedure comprising four configurations:

Baseline (Full SCFF): All three correlation-based features—Pearson correlation, coherence, and mutual information—are included. The achieved accuracies are 86% for KNN, 91% for RF, and 99% for SVM.
Ablation 1: Pearson correlation coefficients are excluded for all feature pairs, while coherence and mutual information are retained. The accuracies drop to 83% for KNN, 89% for RF, and 95% for SVM.
Ablation 2: Coherence features are excluded, retaining Pearson correlation and mutual information. The resulting accuracies are 81% for KNN, 87% for RF, and 94% for SVM.
Ablation 3: Mutual information is removed, while Pearson correlation and coherence are preserved. This results in the lowest accuracies: 78% for KNN, 85% for RF, and 91% for SVM.
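
The accuracy drops implied by the four configurations above can be tabulated with a short script. This is only a worked-arithmetic sketch: the accuracies are the ones reported in the list, while the variable names and the use of the mean drop across classifiers as a summary are our own illustrative choices.

```python
# Accuracies (%) reported above for the CIC2022 IoT ablation study.
FULL = {"KNN": 86, "RF": 91, "SVM": 99}
ABLATIONS = {
    "Pearson correlation": {"KNN": 83, "RF": 89, "SVM": 95},
    "coherence": {"KNN": 81, "RF": 87, "SVM": 94},
    "mutual information": {"KNN": 78, "RF": 85, "SVM": 91},
}


def accuracy_drops(full, ablated):
    """Percentage-point drop per classifier when one feature group is removed."""
    return {clf: full[clf] - ablated[clf] for clf in full}


drops = {removed: accuracy_drops(FULL, acc) for removed, acc in ABLATIONS.items()}
mean_drop = {removed: sum(d.values()) / len(d) for removed, d in drops.items()}
most_important = max(mean_drop, key=mean_drop.get)

for removed, d in drops.items():
    print(f"without {removed}: drops {d}, mean {mean_drop[removed]:.2f} points")
print("largest mean drop:", most_important)
```

Under this summary, removing Pearson correlation costs about 3 points on average, removing coherence about 4.7 points, and removing mutual information about 7.3 points.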

These results indicate that mutual information contributes the most to SCFF’s performance, particularly in complex environments. Its ability to capture both linear and nonlinear dependencies between frequency domain features makes it a critical component for robust and accurate IoT device identification.
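
A small toy example illustrates why mutual information can capture dependencies that Pearson correlation misses. In the sketch below, `y` is a deterministic but nonlinear function of `x`, so the Pearson coefficient is zero while the mutual information is clearly positive. The data, the helper names, and the simple plug-in estimator for discrete samples are assumptions for illustration only; they are not the estimator applied to the frequency domain features in SCFF.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mutual_information(x, y):
    """Plug-in mutual information estimate (in bits) for discrete samples."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


# Symmetric inputs with a purely quadratic relationship: y = x^2.
x = [-2, -1, 0, 1, 2] * 20
y = [v * v for v in x]

r = pearson(x, y)
mi = mutual_information(x, y)
print(f"Pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

Because `y` is fully determined by `x`, the mutual information here equals the entropy of `y` (about 1.52 bits), even though the linear correlation vanishes.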

5. Discussion

This section analyzes the performance of the evaluated fingerprinting approaches (baseline time domain models, SFF, and SCFF) across multiple datasets and classifiers. The goal is to understand the impact of feature representation, model complexity, and dataset characteristics on the effectiveness of IoT device type and individual device identification. Key insights are drawn by comparing classifier behaviors, dataset limitations, and the value of incorporating spectral and interdependency features. These findings also highlight potential directions for further improving instance-level identification accuracy.

Across all fingerprinting approaches, RF and SVM consistently demonstrated strong performance across the majority of datasets. Their effectiveness stems from their ability to model complex and nonlinear relationships between features, which is especially important in high-dimensional spaces such as those created by frequency domain analysis. RF benefits from its ensemble structure, which provides robustness against overfitting and noise, while SVM’s strength lies in its capacity to learn optimal decision boundaries in transformed feature spaces. In contrast, KNN showed more variable performance. It performed reasonably well on simpler tasks such as device type classification, especially in datasets with fewer instances or well-separated classes. However, KNN was less effective in more complex tasks like individual device identification, particularly on datasets with low diversity. This is because KNN is a distance-based algorithm that struggles with nonlinear distributions, making it less capable of capturing subtle interdependencies between features or adapting to intricate decision boundaries.

The D-Link dataset consistently presented significant challenges for individual device identification across all evaluated models and classifiers. Among the four datasets analyzed, it yielded the lowest performance metrics. This reduced performance can be primarily attributed to the homogeneity of the devices included in the dataset, most of which are of the same type and manufactured by the same vendor. It contains only 12 devices divided into four device types, with six devices (50%) belonging to the same class, the D-Link Wireless N Network Camera. This class imbalance and lack of heterogeneity reduce the availability of distinct behavioral signals, making it difficult for models to differentiate between individual instances. Consequently, even sophisticated methods like SCFF showed limited improvement on D-Link in instance-level classification. These results emphasize the importance of dataset richness and variety in training effective fingerprinting models.
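
The degree of imbalance described above can be quantified directly from the device list in Table A3. The following sketch counts devices per type and reports the majority share together with a normalized Shannon entropy; the entropy-based balance score is our own illustrative summary and is not a metric used in the evaluation.

```python
import math
from collections import Counter

# Device-type labels of the 12 D-Link devices, as listed in Table A3.
dlink_types = ["cam"] * 2 + ["day-cam"] * 6 + ["home-hub"] * 1 + ["smart-plug"] * 3

counts = Counter(dlink_types)
total = sum(counts.values())
majority_type, majority_count = counts.most_common(1)[0]
majority_share = majority_count / total

# Normalized Shannon entropy: 1.0 would mean perfectly balanced device types.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
balance = entropy / math.log2(len(counts))

print(f"{total} devices, {len(counts)} types")
print(f"majority type: {majority_type} ({majority_share:.0%})")
print(f"normalized entropy: {balance:.2f}")
```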

A consistent pattern observed across all datasets and methods is that device type identification outperforms individual device identification. This highlights one of the key advantages of frequency domain analysis—its ability to capture general behavioral traits shared among devices of the same functional category. Since devices of the same type (e.g., IP cameras, smart plugs) tend to exhibit similar communication patterns, periodicities, and traffic volumes, frequency-based features naturally lend themselves to type-level classification. In contrast, individual device identification requires the model to detect fine-grained differences between devices of the same type, which are often subtle and harder to capture using only spectral or behavioral data.

While the SCFF solution, which combines spectral and interdependency features, significantly improved performance over both SFF and baseline models, there is still room for improvement in individual-level identification, particularly in low-diversity datasets like D-Link. One promising direction for future work is to incorporate additional device-specific features, especially those from the physical (PHY) layer. For example, radio frequency signal-based features, which capture hardware-level imperfections such as oscillator drift or power amplifier non-linearity, can provide highly granular and unique identifiers for each physical device. When combined with frequency domain behavioral data, such hybrid fingerprints could offer significantly improved accuracy and robustness, particularly in environments where behavioral data alone is not sufficiently distinctive.

6. Conclusions

This study presented two frequency-based fingerprinting solutions, SFF and SCFF, for identifying IoT devices based on their network behavior. The results demonstrate that both proposed methods consistently outperformed the baseline time domain models across all datasets used and machine learning algorithms tested. These improvements were evident in both device type and individual device-identification tasks, with SCFF achieving the highest overall performance due to its integration of inter-feature dependencies alongside spectral characteristics.

The frequency domain fingerprints showed strong capabilities in capturing behavioral patterns, even in challenging scenarios involving noisy data, sparse traffic, or limited training samples. This robustness highlights their practical potential in a wide range of real-world applications, including IoT security, anomaly detection, and network management. The ability to model periodicity and interdependencies in device behavior provides a meaningful advantage over traditional time domain approaches, particularly when precise classification is required.

While the proposed methods demonstrate strong performance in identifying individual device instances—particularly in behaviorally diverse datasets—there remains room for improvement in scenarios involving multiple identical devices that exhibit nearly indistinguishable communication patterns. To address this challenge and enhance instance-level identification, future research can explore the integration of hardware-specific features, such as radio frequency (RF) signal characteristics, which capture persistent physical-layer imperfections that are difficult to spoof or replicate. Additionally, future work should consider the potential impact of adversarial behavior, spoofing attempts, and environmental noise, all of which can degrade fingerprinting accuracy.

In the frequency domain, certain anomalies and bursty behaviors often appear as high-frequency components, due to their sudden and irregular nature. However, not all anomalies follow this pattern; some may manifest across a broader frequency spectrum, including lower-frequency bands. While low-pass filters can attenuate high-frequency bursts, reduce noise, and mitigate some anomalies, they may fail to detect or eliminate low-frequency or more complex attack patterns. Frequency domain analysis, as implemented in the Frequency Domain Analyzer (FDA), improves resilience to such anomalies and attacks by extracting robust spectral features (e.g., SFF and SCFF) and capturing inter-feature dependencies, which enhance device-identification accuracy in dynamic and adversarial network environments. However, the effectiveness of this approach depends on several factors, including the type of attack, the choice of spectral features, and the classification methods employed. Its performance must therefore be validated experimentally under a wide range of conditions, which is planned as part of future work.
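
The effect of a low-pass filter on a bursty signal can be illustrated with a toy example. The sketch below builds a slow periodic signal with a single burst, applies a circular moving average as a simple low-pass filter, and compares spectral energy in the upper half of the band before and after filtering. The signal, the filter width, the band limits, and the naive DFT (used here instead of an FFT library to stay dependency-free) are all assumptions for illustration and do not reproduce the FDA pipeline.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def moving_average(x, width=5):
    """Circular moving average, acting as a simple low-pass filter."""
    n = len(x)
    half = width // 2
    return [
        sum(x[(t + m) % n] for m in range(-half, half + 1)) / width
        for t in range(n)
    ]


def band_energy(spectrum, lo, hi):
    """Sum of squared magnitudes over bins lo..hi (inclusive)."""
    return sum(abs(spectrum[k]) ** 2 for k in range(lo, hi + 1))


N = 64
# Slow periodic behavior (2 cycles per window) plus one sudden burst.
signal = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
signal[32] += 5.0

filtered = moving_average(signal)
spec_raw, spec_filt = dft(signal), dft(filtered)

high_raw = band_energy(spec_raw, N // 4, N // 2)
high_filt = band_energy(spec_filt, N // 4, N // 2)
low_raw, low_filt = abs(spec_raw[2]), abs(spec_filt[2])

print(f"high-band energy: {high_raw:.1f} -> {high_filt:.1f}")
print(f"periodic component magnitude: {low_raw:.2f} -> {low_filt:.2f}")
```

In this toy case the filter removes most of the burst energy while largely preserving the slow periodic component; an anomaly confined to low frequencies would pass through the same filter almost unchanged.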

Author Contributions

Conceptualization, A.A. and J.C.A.; methodology, A.A.; software, J.C.A.; validation, A.A., J.C.A. and H.L.; formal analysis, A.A.; investigation, J.C.A.; data curation, A.A. and J.C.A.; writing—original draft preparation, A.A.; writing—review and editing, A.A. and H.L.; supervision, A.A. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IoT	Internet of Things
SFF	Spectral-Only Frequency Fingerprint
FDFG	Frequency Domain Fingerprint Generation
TSSG	Time Series Signal Generation
FFT	Fast Fourier Transform
FDA	Frequency Domain Analyzer
PSD	Power Spectral Density
DFT	Discrete Fourier Transform
CSD	Cross-Spectral Density
MI	Mutual Information
KNN	K-Nearest Neighbors
SVM	Support Vector Machine
RF	Random Forest

Appendix A

Table A1. CIC dataset smart devices by device type.

Device Type	Device Name	Device Model
amazon-alexa-echo-dot	amazon-alexa-echo-dot-1	Amazon Alexa Echo Dot
amazon-alexa-echo-dot	amazon-alexa-echo-dot-2	Amazon Alexa Echo Dot
amazon-alexa-echo-spot	amazon-alexa-echo-spot	Amazon Alexa Echo Spot
amazon-alexa-echo-studio	amazon-alexa-echo-studio	Amazon Alexa Echo Studio
amazon-plug	amazon-plug	Amazon Smart Plug
amcrest-wifi-camera	amcrest-wifi-camera	AMCREST WiFi Camera
arlo-base-station	arlo-base-station	Arlo Base Station
arlo-q-camera	arlo-q-camera	Arlo Q Camera
atomi-coffee-maker	atomi-coffee-maker	Atomi Coffee Maker
borun-sichuan-ai-camera	borun-sichuan-ai-camera	Borun/Sichuan AI Camera
dcs8000lha1-d-link-mini-camera	dcs8000lha1-d-link-mini-camera	D-Link Mini Camera
d-link-dchs-161-water-sensor	d-link-dchs-161-water-sensor	D-Link Water Sensor
eufy-homebase-two	eufy-homebase-two	Eufy HomeBase II
globe-lamp-esp	globe-lamp-esp	Globe Lamp
google-nest-mini	google-nest-mini	Google Nest Mini
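gosund-esp-plug	gosund-esp-plug-1	Gosund Plug
gosund-esp-plug	gosund-esp-plug-2	Gosund Plug
gosund-esp-plug	gosund-esp-plug-3	Gosund Plug
gosund-esp-plug	gosund-esp-plug-4	Gosund Plug
gosund-esp-socket	gosund-esp-socket-1	Gosund Socket
gosund-esp-socket	gosund-esp-socket-2	Gosund Socket
gosund-esp-socket	gosund-esp-socket-3	Gosund Socket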
heimvision-smartlife-radio-lamp	heimvision-smartlife-radio-lamp	HeimVision SmartLife Radio/Lamp
heimvision-smart-wifi-camera	heimvision-smart-wifi-camera	HeimVision Smart WiFi Camera
home-eye-camera	home-eye-camera	Shenzhen Home Eye Camera
irobot-roomba	irobot-roomba	iRobot Roomba
luohe-cam-dog	luohe-cam-dog	Luohe Cam Dog
nest-indoor-camera	nest-indoor-camera	Nest Indoor Camera
netatmo-camera	netatmo-camera	Netatmo Camera
netatmo-weather-station	netatmo-weather-station	Netatmo Weather Station
philips-hue-bridge	philips-hue-bridge	Philips Hue Bridge
ring-base-station-ac-1236	ring-base-station-ac-1236	Ring Base Station
simcam-1s	simcam-1s	SIMCAM 1S
smart-board	smart-board	SMARTTec
sonos-one-speaker	sonos-one-speaker	Sonos One Speaker
teckin-plug	teckin-plug-1	Teckin Plug
teckin-plug	teckin-plug-2	Teckin Plug
yutron-plug	yutron-plug-1	Yutron Plug
yutron-plug	yutron-plug-2	Yutron Plug

Table A2. Deakin dataset smart devices by device type.

Device Type	Device Name	Device Model
32-smart-monitor-m80b-uhd	32-smart-monitor-m80b-uhd	32" Smart Monitor
echo-show-8	echo-show-8	Amazon Echo Show 8
hp-envy	hp-envy	HP Envy Printer
netatmo-smart-indoor-security-camera	netatmo-smart-indoor-security-camera	Netatmo Smart Indoor Security Camera
netatmo-weather-station	netatmo-weather-station	Netatmo Weather Station
perfk-motion-sensor	perfk-motion-sensor-1	Perfk Motion Sensor
perfk-motion-sensor	perfk-motion-sensor-2	Perfk Motion Sensor
pix-star-easy-digital-photo-frame	pix-star-easy-digital-photo-frame	Pix-Star Digital Photo Frame
ring-video-doorbell	ring-video-doorbell	Ring Video Doorbell
samsung-pan-tilt-1080p-wi-fi-camera	samsung-pan-tilt-1080p-wi-fi-camera	Samsung WiFi Camera
topersun-smart-plug	topersun-smart-plug-1	TOPERSUN Smart Plug
topersun-smart-plug	topersun-smart-plug-2	TOPERSUN Smart Plug
tp-link-tapo-pan-tilt-wi-fi-camera	tp-link-tapo-pan-tilt-wi-fi-camera	TP-Link WiFi Camera

Table A3. D-Link dataset smart devices by device type.

Device Type	Device Name	Device Model
cam	cam-1	D-Link HD WiFi Camera
cam	cam-2	D-Link HD WiFi Camera
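day-cam	day-cam-1	D-Link Wireless N Network Camera
day-cam	day-cam-2	D-Link Wireless N Network Camera
day-cam	day-cam-3	D-Link Wireless N Network Camera
day-cam	day-cam-4	D-Link Wireless N Network Camera
day-cam	day-cam-5	D-Link Wireless N Network Camera
day-cam	day-cam-6	D-Link Wireless N Network Camera
home-hub	home-hub	D-Link Home-Connected Home Hub
smart-plug	smart-plug-1	D-Link Home Smart Plug
smart-plug	smart-plug-2	D-Link Home Smart Plug
smart-plug	smart-plug-3	D-Link Home Smart Plug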

Table A4. UNSW dataset smart devices by device type.

Device Type	Device Name	Device Model
amazon-echo	amazon-echo	Amazon Echo
belkin-wemo-motion-sensor	belkin-wemo-motion-sensor	Belkin Wemo Motion Sensor
belkin-wemo-switch	belkin-wemo-switch	Belkin Wemo Switch
blipcare-blood-pressure-meter	blipcare-blood-pressure-meter	Blipcare Blood Pressure Meter
dropcam	dropcam	Dropcam
ihome	ihome	iHome
insteon-camera	insteon-camera	Insteon Camera
light-bulbs-lifx-smart-bulb	light-bulbs-lifx-smart-bulb	Light Bulbs LiFX Smart Bulb
nest-dropcam	nest-dropcam	Nest Dropcam
nest-protect-smoke-alarm	nest-protect-smoke-alarm	Nest Protect Smoke Alarm
netatmo-weather-station	netatmo-weather-station	Netatmo Weather Station
netatmo-welcome	netatmo-welcome	Netatmo Welcome
pix-star-photo-frame	pix-star-photo-frame	PIX-STAR Photo-Frame
samsung-smartcam	samsung-smartcam	Samsung Smart Cam
smart-things	smart-things	Smart Things (Samsung)
tp-link-day-night-cloud-camera	tp-link-day-night-cloud-camera	TP-Link Day-Night Cloud Camera
tp-link-smart-plug	tp-link-smart-plug	TP-Link Smart Plug
triby-speaker	triby-speaker	Triby Speaker
withings-aura-smart-sleep-sensor	withings-aura-smart-sleep-sensor	Withings Aura Smart Sleep Sensor
withings-smart-baby-monitor	withings-smart-baby-monitor	Withings Smart Baby Monitor
withings-smart-scale	withings-smart-scale	Withings Smart Scale

References

1. Safi, M.; Dadkhah, S.; Shoeleh, F.; Mahdikhani, H.; Molyneaux, H.; Ghorbani, A.A. A survey on IoT profiling, fingerprinting, and identification. ACM Trans. Internet Things 2022, 3, 1–39.
2. Sánchez, P.M.S.; Valero, J.M.J.; Celdrán, A.H.; Bovet, G.; Pérez, M.G.; Pérez, G.M. A survey on device behavior fingerprinting: Data sources, techniques, application scenarios, and datasets. IEEE Commun. Surv. Tutor. 2021, 23, 1048–1077.
3. Kumar, V.; Paul, K. Device fingerprinting for cyber-physical systems: A survey. ACM Comput. Surv. 2023, 55, 1–41.
4. Abbas, S.; Abu Talib, M.; Nasir, Q.; Idhis, S.; Alaboudi, M.; Mohamed, A. Radio frequency fingerprinting techniques for device identification: A survey. Int. J. Inf. Secur. 2024, 23, 1389–1427.
5. Xu, Q.; Zheng, R.; Saad, W.; Han, Z. Device fingerprinting in wireless networks: Challenges and opportunities. IEEE Commun. Surv. Tutor. 2015, 18, 94–104.
6. Gu, D.; Zhang, J.; Tang, Z.; Li, Q.; Zhu, M.; Yan, H.; Li, H. IoT device identification based on network traffic. Wirel. Netw. 2025, 31, 1645–1661.
7. Ma, X.; Qu, J.; Li, J.; Lui, J.C.; Li, Z.; Guan, X. Pinpointing hidden IoT devices via spatial-temporal traffic fingerprinting. In Proceedings of the IEEE INFOCOM 2020-IEEE Conference on Computer Communications, Toronto, ON, Canada, 6–9 July 2020; pp. 894–903.
8. Sheng, C.; Zhou, W.; Han, Q.L.; Ma, W.; Zhu, X.; Wen, S.; Xiang, Y. Network traffic fingerprinting for IIoT device identification: A survey. IEEE Trans. Ind. Inform. 2025, 21, 3541–3554.
9. Xu, Z.; Lu, Q.; Chen, F.; Xian, H. LFIoTDI: A lightweight and fine-grained device identification approach for IoT security enhancement. Comput. Commun. 2025, 237, 108149.
10. Zhang, J.; Ardizzon, F.; Piana, M.; Shen, G.; Tomasin, S. Physical Layer-Based Device Fingerprinting For Wireless Security: From Theory To Practice. IEEE Trans. Inf. Forensics Secur. 2025, 20, 5296–5325.
11. Feng, X.; Nguyen, K.A.; Luo, Z. A Survey on Data Augmentation for WiFi Fingerprinting Indoor Positioning. IEEE Sens. Rev. 2025, 2, 246–264.
12. Garroppo, R.G.; Pericone, G.; Ficara, D.; Henry, J. Enhancing WiFi Privacy: A Focus on Frame Anonymization Techniques. IEEE Commun. Mag. 2025, 1–6.
13. Dhakal, R.; Devkota, B.P.; Kandel, L.N. Radio Frequency Fingerprinting With Siamese Network. In Proceedings of the 2025 International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI, USA, 17–20 February 2025; pp. 212–216.
14. Dadkhah, S.; Mahdikhani, H.; Danso, P.K.; Zohourian, A.; Truong, K.A.; Ghorbani, A.A. Towards the development of a realistic multidimensional IoT profiling dataset. In Proceedings of the 2022 19th Annual International Conference on Privacy, Security & Trust (PST), Fredericton, NB, Canada, 22–24 August 2022; pp. 1–11.
15. Pasquini, A.; Vasa, R.; Logothetis, I.; Gharakheili, H.H.; Chambers, A.; Tran, M. Descriptor: Deakin IoT Traffic (D-IoT). IEEE Data Descr. 2025, 2, 8–16.
16. Sivanathan, A.; Gharakheili, H.H.; Loi, F.; Radford, A.; Wijenayake, C.; Vishwanath, A.; Sivaraman, V. Classifying IoT devices in smart environments using network traffic characteristics. IEEE Trans. Mob. Comput. 2018, 18, 1745–1759.
17. Chowdhury, R.R.; Aneja, S.; Aneja, N.; Abas, P.E. Packet-level and IEEE 802.11 MAC frame-level network traffic traces data of the D-Link IoT devices. Data Brief 2021, 37, 107208.
18. Bruhadeshwar, B.; Bachani, M.; Peterson, J.; Shirazi, H.; Ray, I.; Ray, I. Iotsense: Behavioral fingerprinting of iot devices. arXiv 2018, arXiv:1804.03852.
19. Hamad, S.A.; Zhang, W.E.; Sheng, Q.Z.; Nepal, S. Iot device identification via network-flow based fingerprinting and learning. In Proceedings of the 2019 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), Rotorua, New Zealand, 5–8 August 2019; pp. 103–111.
20. Fan, L.; Zhang, S.; Wu, Y.; Wang, Z.; Duan, C.; Li, J.; Yang, J. An iot device identification method based on semi-supervised learning. In Proceedings of the 2020 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey, 2–6 November 2020; pp. 1–7.
21. Meidan, Y.; Bohadana, M.; Shabtai, A.; Guarnizo, J.D.; Ochoa, M.; Tippenhauer, N.O.; Elovici, Y. ProfilIoT: A machine learning approach for IoT device identification based on network traffic analysis. In Proceedings of the Symposium on Applied Computing, Marrakech, Morocco, 4–6 April 2017; pp. 506–509.
22. Miettinen, M.; Marchal, S.; Hafeez, I.; Asokan, N.; Sadeghi, A.R.; Tarkoma, S. Iot sentinel: Automated device-type identification for security enforcement in iot. In Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, USA, 1–4 May 2017; pp. 2177–2184.
23. Sivanathan, A.; Sherratt, D.; Gharakheili, H.H.; Radford, A.; Wijenayake, C.; Vishwanath, A.; Sivaraman, V. Characterizing and classifying IoT traffic in smart cities and campuses. In Proceedings of the 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Atlanta, GA, USA, 5–8 June 2017; pp. 559–564.
24. Thangavelu, V.; Divakaran, D.M.; Sairam, R.; Bhunia, S.S.; Gurusamy, M. DEFT: A distributed IoT fingerprinting technique. IEEE Internet Things J. 2018, 6, 940–952.
25. Xu, K.; Wan, Y.; Xue, G.; Wang, F. Multidimensional behavioral profiling of internet-of-things in edge networks. In Proceedings of the International Symposium on Quality of Service, Phoenix, AZ, USA, 24–25 June 2019; pp. 1–10.
26. Robyns, P.; Bonné, B.; Quax, P.; Lamotte, W. Noncooperative 802.11 mac layer fingerprinting and tracking of mobile devices. Secur. Commun. Netw. 2017, 2017, 6235484.
27. Gu, X.; Wu, W.; Chen, Z.; Song, A.; Ling, Z.; Yang, M. 802.11 ac Device Identification based on MAC Frame Analysis. In Proceedings of the 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Dalian, China, 5–7 May 2021; pp. 366–371.
28. Alyami, M.; Alkhowaiter, M.; Al Ghanim, M.; Zou, C.; Solihin, Y. Mac-layer traffic shaping defense against wifi device fingerprinting attacks. In Proceedings of the 2022 IEEE Symposium on Computers and Communications (ISCC), Rhodes, Greece, 30 June–3 July 2022; pp. 1–7.
29. Köse, M.; Taşcioğlu, S.; Telatar, Z. RF fingerprinting of IoT devices based on transient energy spectrum. IEEE Access 2019, 7, 18715–18726.
30. Xie, F.; Wen, H.; Li, Y.; Chen, S.; Hu, L.; Chen, Y.; Song, H. Optimized coherent integration-based radio frequency fingerprinting in Internet of Things. IEEE Internet Things J. 2018, 5, 3967–3977.
31. Galtier, F.; Cayre, R.; Auriol, G.; Kaâniche, M.; Nicomette, V. A PSD-based fingerprinting approach to detect IoT device spoofing. In Proceedings of the 2020 IEEE 25th Pacific Rim International Symposium on Dependable Computing (PRDC), Perth, Australia, 1–4 December 2020; pp. 40–49.
32. Wickramarachi, P. Effects of windowing on the spectral content of a signal. Sound Vib. 2003, 37, 10–13.
33. Jwo, D.J.; Wu, I.H.; Chang, Y. Windowing design and performance assessment for mitigation of spectrum leakage. In Proceedings of the E3S Web of Conferences. EDP Sciences, Bali, Indonesia, 21–23 November 2019; Volume 94, p. 03001.
34. Chen, K.F.; Mei, S.L. Composite interpolated fast Fourier transform with the Hanning window. IEEE Trans. Instrum. Meas. 2010, 59, 1571–1579.
35. Arranz-Gimon, A.; Zorita-Lamadrid, A.; Morinigo-Sotelo, D.; Duque-Perez, O. Analysis of the use of the Hanning Window for the measurement of interharmonic distortion caused by close tones in IEC standard framework. Electric Power Syst. Res. 2022, 206, 107833.

Figure 1. Block diagram of frequency-based IoT fingerprinting system.

Figure 2. The process phases of generating fingerprints.

Figure 3. Time series signals of the four key network features for four different devices: (a) Sonos Speaker. (b) Smart Camera. (c) Smart Sleep Sensor. (d) Home Hub.

Figure 4. The time series signals of the four key network features after applying the Hanning window: (a) The signals of Sonos Speakers. (b) The signals of Smart Camera. (c) The signals of Smart Sleep Sensor. (d) The signals of Home Hub.

Figure 5. FFT output of the time series network traffic signals for four devices: (a) The signals of Sonos Speakers. (b) The signals of Smart Camera. (c) The signals of Smart Sleep Sensor. (d) The signals of Home Hub.

Figure 6. Accuracy of baseline models across datasets using time domain features to identify individual IoT devices.

Figure 7. Accuracy of the SFF solution in identifying individual IoT devices.

Figure 8. Accuracy of the SCFF solution in identifying individual IoT devices.

Figure 9. Accuracy of time domain feature models in classifying IoT device types.

Figure 10. Accuracy of SFF in classifying IoT device types.

Figure 11. Accuracy of SCFF in classifying IoT device types.

Table 1. Hyperparameter Settings for Machine Learning Algorithms.

Algorithm	Hyper-Parameter	Values
KNN	n_neighbors	5
	weights	distance
	metric	minkowski
	p	2
Random Forest	n_estimators	100
	max_depth	15
	criterion	gini
	max_features	sqrt
	bootstrap	True
	min_samples_split	4
	min_samples_leaf	2
SVM	C	4
	kernel	RBF
	gamma	0.01
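
The settings in Table 1 map naturally onto classifier constructor arguments. The sketch below assumes scikit-learn, which this section does not name explicitly; the parameter names in Table 1 match scikit-learn's `KNeighborsClassifier`, `RandomForestClassifier`, and `SVC` arguments, and the RBF kernel is spelled `"rbf"` there. The dictionary and the helper `build_models` are hypothetical names introduced for illustration.

```python
# Hyperparameters from Table 1, keyed by scikit-learn argument names (assumed library).
HYPERPARAMS = {
    "KNN": {"n_neighbors": 5, "weights": "distance", "metric": "minkowski", "p": 2},
    "RF": {
        "n_estimators": 100,
        "max_depth": 15,
        "criterion": "gini",
        "max_features": "sqrt",
        "bootstrap": True,
        "min_samples_split": 4,
        "min_samples_leaf": 2,
    },
    "SVM": {"C": 4, "kernel": "rbf", "gamma": 0.01},
}


def build_models():
    """Instantiate the three classifiers; requires scikit-learn to be installed."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC

    return {
        "KNN": KNeighborsClassifier(**HYPERPARAMS["KNN"]),
        "RF": RandomForestClassifier(**HYPERPARAMS["RF"]),
        "SVM": SVC(**HYPERPARAMS["SVM"]),
    }
```

Calling `build_models()` returns unfitted estimators that can then be trained on any of the fingerprint feature sets; keeping the settings in one dictionary makes it easy to reuse identical hyperparameters across the baseline, SFF, and SCFF experiments.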

Table 2. Overview of Key Attributes Across IoT Datasets.

Dataset	No of Devices	Protocols	Device Status
CIC IoT Dataset [14]	60	WiFi, Zigbee, Z-Wave, Ethernet	Idle, Interaction, Setup/Initialization, Attack
UNSW IoT Dataset [16]	28	WiFi, MQTT, CoAP, Ethernet	Idle, Interaction, Setup/Initialization
Deakin IoT Dataset [15]	24	WiFi, Zigbee, Z-Wave, MQTT, Ethernet	Idle, Interaction, Setup/Initialization
D-Link IoT Dataset [17]	12	WiFi, Ethernet	Not specified

Table 3. The precision, recall, and F1-score of baseline models in identifying individual IoT devices.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.54	0.55	0.55
	RF	0.64	0.65	0.64
	SVM	0.62	0.62	0.62
Deakin	KNN	0.63	0.65	0.64
	RF	0.69	0.71	0.70
	SVM	0.64	0.64	0.64
UNSW	KNN	0.61	0.62	0.62
	RF	0.71	0.72	0.71
	SVM	0.68	0.70	0.69
D-Link	KNN	0.33	0.34	0.34
	RF	0.36	0.34	0.35
	SVM	0.35	0.35	0.35

Table 4. The precision, recall, and F1-score of the SFF solution in identifying individual IoT devices.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.74	0.75	0.75
	RF	0.82	0.82	0.83
	SVM	0.87	0.87	0.89
Deakin	KNN	0.86	0.87	0.87
	RF	0.88	0.88	0.89
	SVM	0.93	0.91	0.92
UNSW	KNN	0.86	0.85	0.86
	RF	0.90	0.91	0.91
	SVM	0.96	0.96	0.96
D-Link	KNN	0.43	0.44	0.43
	RF	0.56	0.54	0.55
	SVM	0.67	0.66	0.66

Table 5. The precision, recall, and F1-score of SCFF in identifying individual IoT devices.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.85	0.86	0.86
	RF	0.91	0.93	0.92
	SVM	0.97	0.97	0.97
Deakin	KNN	0.92	0.92	0.92
	RF	0.95	0.95	0.94
	SVM	1.00	1.00	1.00
UNSW	KNN	0.95	0.95	0.95
	RF	0.96	0.95	0.96
	SVM	0.98	0.99	0.98
D-Link	KNN	0.61	0.61	0.61
	RF	0.66	0.65	0.65
	SVM	0.67	0.66	0.66

Table 6. The precision, recall, and F1-score of baseline models in identifying IoT device type.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.65	0.66	0.63
	RF	0.75	0.73	0.72
	SVM	0.71	0.74	0.72
Deakin	KNN	0.86	0.85	0.85
	RF	0.88	0.86	0.86
	SVM	0.84	0.85	0.84
UNSW	KNN	0.61	0.62	0.60
	RF	0.73	0.75	0.74
	SVM	0.71	0.69	0.68
D-Link	KNN	0.71	0.74	0.73
	RF	0.86	0.85	0.85
	SVM	0.82	0.86	0.84

Table 7. The precision, recall, and F1-score of SFF in identifying IoT device type.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.85	0.87	0.86
	RF	0.97	0.98	0.98
	SVM	0.99	0.98	0.99
Deakin	KNN	0.97	0.98	0.98
	RF	0.98	0.99	0.99
	SVM	1.00	0.99	0.99
UNSW	KNN	0.85	0.88	0.87
	RF	0.93	0.95	0.94
	SVM	0.97	0.98	0.98
D-Link	KNN	0.71	0.74	0.73
	RF	0.86	0.85	0.85
	SVM	0.82	0.86	0.84

Table 8. The precision, recall, and F1-score of SCFF in identifying IoT device type.

Dataset	Algorithm	Precision	Recall	F1-Score
CIC	KNN	0.95	0.98	0.96
	RF	0.98	0.99	0.99
	SVM	0.99	0.99	0.99
Deakin	KNN	0.98	0.99	0.99
	RF	0.98	0.99	0.99
	SVM	1.00	1.00	1.00
UNSW	KNN	0.95	0.96	0.96
	RF	0.96	0.97	0.97
	SVM	0.99	0.99	0.98
D-Link	KNN	0.95	0.96	0.97
	RF	1.00	1.00	1.00
	SVM	1.00	1.00	1.00

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
