Article

A Federated Intrusion Detection System for Edge Environments Using Multi-Index Hashing and Attention-Based KNN

by Ying Liu 1, Xing Liu 2, Hao Yu 2, Bowen Guo 3 and Xiao Liu 3,*

1 State Grid Corporation of China, Beijing 100124, China
2 Nari Information & Communication Technology Co., Ltd., Nanjing 210003, China
3 The School of Intelligent Software and Engineering, Nanjing University, Suzhou 215163, China
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(9), 1580; https://doi.org/10.3390/sym17091580
Submission received: 8 July 2025 / Revised: 29 July 2025 / Accepted: 31 July 2025 / Published: 22 September 2025
(This article belongs to the Section Computer)

Abstract

Edge computing offers low-latency and distributed processing for IoT applications but poses new security challenges, due to limited resources and decentralized data. Intrusion detection systems (IDSs) are essential for real-time threat monitoring, yet traditional IDS frameworks often struggle in edge environments, failing to meet efficiency requirements. This paper presents an efficient intrusion detection framework that integrates spatiotemporal hashing, federated learning, and fast K-nearest neighbor (KNN) retrieval. A hashing neural network encodes network traffic into compact binary codes, enabling low-overhead similarity comparison via Hamming distance. To support scalable retrieval, multi-index hashing is applied for sublinear KNN searching. Additionally, we propose an attention-guided federated aggregation strategy that dynamically adjusts client contributions, reducing communication costs. Our experiments on benchmark datasets demonstrate that our method achieves competitive detection accuracy with significantly lower computational, memory, and communication overhead, making it well-suited for edge-based deployment.

1. Introduction

As modern digital infrastructure grows increasingly complex and ubiquitous, cybersecurity has emerged as one of the most pressing challenges across both public and private domains. As networked systems become integral to critical infrastructure and real-time services, they face increasing risks from cyber threats such as malware, DDoS attacks, advanced persistent threats (APTs), and adversarial attacks [1,2,3]. Intrusion detection systems (IDSs) are essential for timely and adaptive monitoring, serving as a vital defense against these evolving threats.
Broadly, IDS methodologies can be categorized into four major paradigms: signature-based detection [4], anomaly-based detection [5], regex-based detection [6], and auto-encoder-based detection [7,8,9]. Signature-based systems rely on predefined rules or known attack signatures—such as byte patterns, protocol violations, or heuristically derived markers—to identify threats. These approaches are highly effective at detecting previously encountered attacks and are computationally efficient, due to their deterministic nature. However, their reliance on a static signature database renders them ineffective against unknown or polymorphic threats, and they require frequent updates to remain current with emerging attack techniques. Anomaly-based systems, on the other hand, aim to construct models of normal network behavior using statistical profiles, machine learning, or clustering techniques. Any significant deviation from the learned baseline is flagged as potentially malicious. While this paradigm enables the detection of novel and zero-day attacks, it introduces new challenges. These include high false positive rates—where benign deviations are mistaken for attacks—as well as sensitivity to environmental noise, concept drift, and diverse traffic patterns. Moreover, anomaly-based IDSs often require extensive training data and computational resources, limiting their practicality in low-resource settings such as edge environments. Regex-based detection approaches leverage regular expressions to define malicious traffic patterns with flexible matching rules, offering a lightweight solution but requiring careful rule engineering. Auto-encoder-based methods employ deep learning techniques to learn compressed representations of normal behavior, enabling the detection of deviations in a data-driven manner. To balance these trade-offs, hybrid intrusion detection [10,11] models have been proposed that combine the strengths of multiple paradigms. 
However, designing systems that are both accurate and efficient, especially in distributed or real-time environments, remains a non-trivial challenge. The growing complexity of modern networks and the evolving sophistication of adversaries further demand intrusion detection methods that are adaptable, lightweight, and capable of handling heterogeneous data sources.
With the proliferation of edge computing and the explosion of IoT devices, traditional IDS solutions face significant new challenges. Figure 1 shows the architecture of our distributed framework. Edge computing shifts data processing from centralized cloud servers to the network edge, closer to where data is generated. This paradigm supports latency-sensitive applications and enhances privacy preservation, but also introduces unique constraints. Edge nodes such as smart sensors and mobile devices operate under strict resource limitations, and the edge network itself is highly dynamic and heterogeneous. Recent research has applied machine learning (ML) and deep learning (DL) [12,13] to network intrusion detection, achieving impressive accuracy in traffic classification. However, most of these methods require centralized data collection, raising concerns about privacy, bandwidth, and scalability. In addition, such models may not generalize well to distributed, real-world edge environments. Federated learning (FL) [14] has emerged as a decentralized alternative, enabling collaborative model training across distributed clients without sharing raw data. While FL mitigates privacy risks and supports scalability, it faces practical challenges such as unstable convergence and communication overhead. Recent variants incorporating attention-based aggregation and hierarchical architectures have been proposed to enhance FL robustness in heterogeneous settings. Meanwhile, hashing techniques [15] have shown great potential for efficient data representation and retrieval in resource-constrained environments. Hashing neural networks learn compact binary codes that preserve semantic similarity, allowing for lightweight storage and fast nearest neighbor searching via Hamming distance.
Despite advances in federated learning and hashing, existing IDS methods often fail to address the key constraints of edge deployment, including limited computational resources, non-independent and identically distributed (non-IID) data, high communication overhead, and the need for real-time response. Recent IDS approaches based on deep learning or Transformer architectures achieve high detection accuracy, but they often incur high computational costs and lack efficient retrieval mechanisms, making them less suitable for edge scenarios. Most approaches focus on either accuracy or efficiency, but not both, and few attempt to integrate lightweight feature modeling, distributed training, and fast retrieval in a unified system. Moreover, few works simultaneously consider privacy preservation and real-time response in resource-constrained environments, leaving a significant gap between research prototypes and practical edge deployments. There remains a critical need for edge-friendly IDS frameworks that are accurate, efficient, and privacy-preserving.
To bridge this gap, we propose a unified intrusion detection framework for edge environments that combines spatiotemporal hashing, attention-driven federated learning, and multi-index nearest neighbor retrieval. Our approach first models traffic behavior using a hashing neural network to encode spatiotemporal features into compact binary codes (0/1). These binary codes are then compared using Hamming distance in a multi-index search structure, allowing efficient retrieval of the most similar entries in the database. The retrieved neighbors are subsequently used to determine the final intrusion classification result. Our proposed framework also exhibits structural and functional symmetry: the same operations are applied across distributed edge nodes, and the attention mechanism ensures consistent model aggregation. For training, we introduce a hierarchical FL architecture that clusters edge nodes into subnets and uses attention-based aggregation to balance client contributions. Compared with existing methods, our framework achieves a better trade-off between accuracy and efficiency while reducing retrieval latency and keeping raw traffic on local devices. Our main contributions are as follows:
  • We have developed a spatiotemporal hashing neural network that transforms raw network traffic into compact binary codes by capturing both spatial and temporal features. These codes preserve behavioral similarity and enable fast, low-cost Hamming distance comparison on edge devices.
  • We propose a hierarchical federated learning framework with attention-based aggregation, where edge clients are grouped into subnetworks to update model parameters. An attention mechanism adaptively weighs client contributions based on model similarity, improving performance under non-IID data and reducing communication overhead.
  • We implement a multi-index hashing retrieval method for efficient KNN-based intrusion detection. By splitting each binary code into subcodes and indexing them across multiple hash tables, the system supports scalable, low-latency retrieval using bitwise operations, ensuring high accuracy with lower memory and delay.

2. Related Work

2.1. Intrusion Detection in Edge Environments

Traditional intrusion detection systems are predominantly designed for deployment in centralized data centers, where traffic can be aggregated and analyzed using powerful computational resources. However, with the increasing shift toward edge computing and IoT architectures, there is a growing need to move intrusion detection closer to the data source. Edge environments consist of highly distributed, resource-constrained devices—such as sensors, embedded systems, and mobile nodes—that continuously generate and process network traffic.
Early efforts to adapt intrusion detection to edge scenarios focused on lightweight variants of traditional models. For example, Kolosnjaji et al. [16] proposed a simplified deep learning-based IDS for packet inspection on limited hardware. Ferdowsi and Roy [17] introduced a distributed GAN-based IDS that eliminated centralized coordination, showing promise for decentralized learning. Burcu et al. [18] evaluated classical classifiers such as KNN and decision trees under edge constraints, focusing on feature selection and sampling strategies to reduce computation. Recent works have further explored IDS design tailored for smart city and IoT environments. Elsaeidy et al. [19] proposed an unsupervised IDS based on Restricted Boltzmann Machines, capable of detecting anomalies from sensor grids in smart infrastructure. Other studies have explored IDSs based on edge–cloud collaboration, where preliminary detection is performed at the edge and refined in the cloud [20]. More recently, blockchain and smart contract-based IDS frameworks have emerged, using decentralized verification to enhance transparency and resilience in edge settings [21].
Despite these advancements, edge-based IDSs face persistent limitations. The high dimensionality and heterogeneity of network traffic make efficient modeling difficult, especially on low-power devices. Moreover, handcrafted features often fail to generalize across protocols and deployment contexts. To address these challenges, we develop a lightweight intrusion detection framework specifically designed for scalable deployment across heterogeneous edge environments.

2.2. Federated Learning

Federated learning (FL) is a decentralized machine learning paradigm that enables multiple clients to collaboratively train a global model without sharing their raw data. Each device performs local updates and transmits only model parameters or gradients to a central server, preserving privacy and reducing communication overhead. FL is particularly suited to edge environments, where data is inherently distributed, heterogeneous, and non-IID.
FL frameworks can be categorized into horizontal, vertical, and hybrid types, depending on data distribution. Horizontal FL is the most common in edge computing, where participants share the same feature space but differ in data samples. Vertical FL handles scenarios where participants hold complementary features over shared users, often in cross-organization collaborations. Both aim to balance privacy with model utility in decentralized settings. In recent years, FL has been applied to intrusion detection [22] with promising results. Hamdi Friji et al. [23] proposed a GNN-based IDS using IP and port graphs for lightweight anomaly detection. Mansi H. Bhavsar et al. [24] proposed a federated intrusion detection system leveraging logistic regression and CNN classifiers, achieving high accuracy and privacy preservation in transportation IoT environments using the NSL-KDD and Car-Hacking datasets. Zhigang Jin et al. [25] proposed FL-IIDS, a federated intrusion detection framework that addresses catastrophic forgetting through gradient-balanced and knowledge-distilled loss functions. Ozlem Ceviz et al. [26] proposed a decentralized FL-based intrusion detection system tailored for UAV networks, effectively reducing computation costs and enhancing privacy without sacrificing detection performance. Babatunde Olanrewaju-George et al. [27] employed both supervised and unsupervised deep learning models trained via federated learning for IoT intrusion detection, demonstrating that FL-based auto-encoders outperform non-FL counterparts.
Although FL provides privacy by keeping raw data on local devices, it remains vulnerable to security risks such as model poisoning attacks and gradient leakage, which can compromise the integrity of the global model. Recent studies have suggested robust aggregation strategies, anomaly detection of malicious updates, and differential privacy to mitigate these threats. Despite these advancements, FL-based IDS frameworks still face key limitations, including uneven data distributions, varying computational capacities, and high communication costs. Most existing works focus on either learning accuracy or communication reduction, with limited consideration of real-time constraints and retrieval efficiency. To address these gaps, our work introduces an attention-guided aggregation strategy within a federated learning framework, enabling efficient intrusion detection across decentralized edge nodes.

2.3. Hash-Based Representation and Retrieval

Hashing has been widely used for compact data representation and efficient similarity searching in large-scale scenarios. It maps high-dimensional inputs into fixed-length binary codes that preserve semantic similarity, allowing fast nearest neighbor retrieval in Hamming space with low memory and computational cost. This is especially advantageous for real-time intrusion detection on resource-constrained edge devices.
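As a minimal illustration of why comparison in Hamming space is cheap, the sketch below packs a bit sequence into an integer and counts differing bits with a single XOR. The helper names `pack_bits` and `hamming_distance` are hypothetical and are not taken from the paper.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into one integer (first bit is most significant)."""
    value = 0
    for bit in bits:
        value = (value << 1) | (bit & 1)
    return value


def hamming_distance(code_a, code_b):
    """Number of differing bits: XOR marks the mismatches, then count the set bits."""
    return bin(code_a ^ code_b).count("1")


a = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
b = pack_bits([1, 1, 1, 0, 0, 0, 1, 1])
print(hamming_distance(a, b))  # 3
```

Because the whole comparison reduces to one XOR and a bit count, its cost depends only on the code length and not on the dimensionality of the original traffic features.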
Locality-Sensitive Hashing (LSH) [28] is a well-known data-independent method using random projections but suffers from limited accuracy and adaptability. To address this, data-dependent methods learn hash functions from data distributions. Gong and Lazebnik [29] proposed Iterative Quantization (ITQ) for minimizing quantization error. Zhou et al. [30] introduced Kernel Supervised Hashing (KSH), and Norouzi and Fleet [31] developed Minimal Loss Hashing (MLH) to preserve semantic relations more effectively. Deep learning further advanced hashing by jointly learning feature representations and binary codes. CNN Hashing [32] and Deep Semantic Ranking Hashing (DSRH) [33] are representative approaches. Recent studies have leveraged hashing techniques in various directions, including accelerating retrieval, enhancing privacy protection, and reducing communication costs. To improve retrieval speed, Wan et al. [34] proposed multi-index hashing (MIH), which partitions hash codes to support sublinear KNN search. Hashing has also been explored in distributed and privacy-sensitive contexts. Kapoor et al. [35] proposed HashFed, combining hashing with federated learning to reduce communication costs. In intrusion detection, hash-based representations enable lightweight traffic encoding and fast matching, making them suitable for low-power edge nodes. However, existing hashing-based IDS approaches rarely consider integration with federated settings or adapt to edge environments. In this work, we address this gap by designing a spatiotemporal hashing model integrated with multi-index retrieval.

3. Methodology

To address the challenges of scalability, efficiency, and adaptability in deploying Network Intrusion Detection Systems (NIDS) at the edge, we propose a distributed and hash-driven detection framework that integrates compact traffic representation, communication-aware training, and collaborative retrieval. Specifically, we first design a hashing neural network to encode high-dimensional traffic features into compact binary codes, enabling lightweight and similarity-preserving modeling. Then, we introduce a multi-level federated learning architecture that employs attention-based aggregation to improve model convergence and reduce communication overhead. Finally, we develop a collaborative detection strategy that constructs a distributed spatiotemporal hash code database across edge nodes and performs efficient KNN-based anomaly detection via multi-index hashing.

3.1. Discrete Hashing Network for Intrusion Detection

We adopt a deep supervised discrete hashing neural network to model high-dimensional network traffic samples with compact binary representations, as illustrated in Figure 2. Given $N$ traffic instances $X = \{x_i\}_{i=1}^{N}$, where each $x_i \in \mathbb{R}^{\tau \times 1}$ and $\tau$ denotes the number of traffic features, our goal is to learn a set of $K$-bit binary hash codes $B \in \{0,1\}^{K \times N}$ through an end-to-end learnable network. For each input $x_i$, its corresponding binary code is denoted as $b_i = h(x_i)$, where $h(\cdot)$ is the learned hash function.
The network architecture consists of an input layer, two hidden layers, and an output layer. The number of input neurons matches the traffic feature dimension $\tau$, and the number of output neurons equals the target hash length $K$. Each layer is fully connected, and the hidden layers use ReLU activation to introduce non-linearity. This compact structure balances representation capability and computational efficiency, making it suitable for real-time edge deployment. The learning objective is twofold: (1) to preserve semantic similarity by ensuring that hash codes of similar samples have minimal Hamming distance, and (2) to enable discriminative classification through a linear classifier $Y = W^{\top} B$, where $Y$ is the ground truth label matrix and $W$ is the classifier weight. We define a binary similarity matrix $M = \{m_{ij}\}$, where $m_{ij} = 1$ indicates that samples $x_i$ and $x_j$ belong to the same class (semantically similar), and $m_{ij} = 0$ otherwise. This matrix serves as the supervision signal for preserving semantic similarity in the learned hash codes.
Algorithm 1: Neural Network Training Algorithm Based on Deep Supervised Discrete Hashing
The training procedure of our deep supervised discrete hashing network is summarized in Algorithm 1, where a neural network is optimized to learn a hash function $H(x) = \operatorname{sgn}(F(x))$ that maps inputs into compact binary codes. Here, $\operatorname{sgn}(\cdot)$ denotes the sign function, which outputs 1 for positive values and 0 otherwise, producing binary sequences in $\{0,1\}^{K}$. The algorithm consists of four main update steps described as follows.
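The mapping from a traffic feature vector to a binary code can be sketched as a forward pass through a small fully connected network followed by thresholding. This is only an illustrative sketch: the weights are randomly initialised rather than trained, the layer sizes (78, 64, 48, and a 12-bit output) are borrowed from the experimental settings reported later in the article, and all function names are hypothetical.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: weights holds one row per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation, used only to make the sketch runnable."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def hash_code(x, layers):
    """H(x) = sgn(F(x)): ReLU on hidden layers, then 1 for positive outputs, else 0."""
    hidden = x
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    out_w, out_b = layers[-1]
    outputs = dense(hidden, out_w, out_b)
    return [1 if v > 0 else 0 for v in outputs]


rng = random.Random(0)
tau, K = 78, 12  # feature dimension and hash length
sizes = [tau, 64, 48, K]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
x = [rng.random() for _ in range(tau)]
code = hash_code(x, layers)
print(len(code), code)
```

At inference time only this forward pass and the threshold are needed on the edge device; the discrete optimisation described in the following steps applies to training.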
Step 1: Loss function construction. We define a loss function that jointly preserves semantic similarity and enables discriminative classification. For training samples $\{x_i, y_i\}_{i=1}^{N}$, where $y_i \in \{0,1\}^{C}$ is a one-hot class label, the total objective is given by
$$L = -\sum_{m_{ij} \in M} \left( m_{ij} I_{ij} - \log\left(1 + e^{I_{ij}}\right) \right) + u \sum_{i=1}^{N} \lVert y_i - W^{\top} b_i \rVert_2^2 + v \lVert W \rVert_F^2,$$
where $I_{ij} = \frac{1}{2} h_i^{\top} h_j$ measures the pairwise similarity between continuous hash outputs $h_i, h_j \in \mathbb{R}^{K}$, $m_{ij} \in \{0,1\}$ indicates whether $x_i$ and $x_j$ are semantically similar, $W \in \mathbb{R}^{K \times C}$ is the linear classification weight matrix, and $b_i \in \{0,1\}^{K}$ is the $K$-bit binary hash code obtained from $h_i$ via the sign function. Hyperparameters $u$ and $v$ control the trade-off between the classification loss and the Frobenius-norm regularization term on $W$.
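To make the three terms of the objective concrete, the following sketch evaluates a pairwise likelihood term, a classification term, and a Frobenius regulariser on a toy example. It assumes the usual deep supervised hashing form of the loss, in which the pairwise term is a negative log-likelihood summed over all ordered pairs; the function name, the data layout, and the toy values are illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hashing_loss(H, B, Y, W, M, u, v):
    """H: N continuous outputs (length K); B: N binary codes (length K);
    Y: N one-hot labels (length C); W: K x C weights; M: N x N similarity (0/1)."""
    n = len(H)
    # Pairwise term: negative log-likelihood of the similarity labels.
    pairwise = 0.0
    for i in range(n):
        for j in range(n):
            inner = 0.5 * dot(H[i], H[j])  # I_ij
            pairwise -= M[i][j] * inner - math.log1p(math.exp(inner))
    # Classification term: squared error of the linear prediction W^T b_i.
    n_classes = len(W[0])
    classification = 0.0
    for b, y in zip(B, Y):
        pred = [sum(W[k][c] * b[k] for k in range(len(b))) for c in range(n_classes)]
        classification += sum((yc - pc) ** 2 for yc, pc in zip(y, pred))
    # Frobenius-norm regulariser on W.
    frobenius = sum(w * w for row in W for w in row)
    return pairwise + u * classification + v * frobenius


# Toy example: two samples from two different classes, K = 2, C = 2.
H = [[1.0, -1.0], [-1.0, 1.0]]
B = [[1, 0], [0, 1]]
Y = [[1, 0], [0, 1]]
W = [[1.0, 0.0], [0.0, 1.0]]
M = [[1, 0], [0, 1]]
loss = hashing_loss(H, B, Y, W, M, u=1.0, v=0.1)
print(loss)
```

In this toy case the classifier reproduces the labels exactly, so only the pairwise and regularisation terms contribute to the printed value.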
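Step 2: Feature mapping update. The continuous hash representation is computed as $h_i = G^{\top} z(x_i; g) + n$, where $z(x_i; g)$ is the output of the intermediate layers with parameters $g$, $G$ is a projection matrix, and $n$ is a bias term. During training, $G$, $n$, and $g$ are updated via standard backpropagation based on the gradient of $L$ with respect to $h_i$:
$$\frac{\partial L}{\partial G} = z(x_i; g) \left( \frac{\partial L}{\partial h_i} \right)^{\top}, \qquad \frac{\partial L}{\partial n} = \frac{\partial L}{\partial h_i}, \qquad \frac{\partial L}{\partial z(x_i; g)} = G \, \frac{\partial L}{\partial h_i},$$
where the last gradient is propagated back through the intermediate layers to update $g$.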
Step 3: Classifier weight update. To minimize the supervised loss, we update the classifier weights using ridge regression:
$$W = \left( B B^{\top} + \frac{v}{u} I \right)^{-1} B Y^{\top},$$
where $B$ contains the current binary codes and $Y$ holds the label vectors.
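The closed-form update follows from setting the gradient of the $W$-dependent part of the objective to zero. The short derivation below is a sketch under the assumption that $B \in \{0,1\}^{K \times N}$, $Y \in \mathbb{R}^{C \times N}$, $W \in \mathbb{R}^{K \times C}$, and $u, v > 0$, so that the matrix being inverted is positive definite.

```latex
\begin{aligned}
J(W) &= u\,\lVert Y - W^{\top} B \rVert_F^2 + v\,\lVert W \rVert_F^2, \\
\frac{\partial J}{\partial W} &= -2u\,B\,(Y - W^{\top} B)^{\top} + 2v\,W = 0, \\
\left( B B^{\top} + \tfrac{v}{u} I \right) W &= B\,Y^{\top}, \\
W &= \left( B B^{\top} + \tfrac{v}{u} I \right)^{-1} B\,Y^{\top}.
\end{aligned}
```

The matrix to invert is only $K \times K$, so for the hash lengths considered here the update is inexpensive regardless of the number of samples $N$.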
Step 4: Binary code optimization. We update the $K \times N$ binary hash matrix $B \in \{0,1\}^{K \times N}$ using discrete cyclic coordinate descent to jointly preserve semantic similarity and enhance classification. The optimization objective is
$$\min_{B \in \{0,1\}^{K \times N}} \; \lVert W^{\top} B \rVert_F^2 - 2 \operatorname{Tr}\left( P^{\top} B \right),$$
where $\lVert \cdot \rVert_F$ is the Frobenius norm, $\operatorname{Tr}(\cdot)$ is the matrix trace (sum of diagonal elements), $W \in \mathbb{R}^{K \times C}$ is the classifier weight matrix, $Y \in \mathbb{R}^{C \times N}$ is the label matrix, $H \in \mathbb{R}^{K \times N}$ is the continuous hash output, and
$$P = W Y + \frac{t}{u} H$$
integrates the label prediction term ($WY$) with the quantization adjustment term ($\frac{t}{u} H$). Here, $t > 0$ is a hyperparameter controlling the influence of the quantization adjustment, encouraging $H$ to align with $B$, and $u > 0$ is a hyperparameter balancing the classification loss term in the overall objective. Each bit $b_{ki}$ is updated by fixing all others and applying
$$b_{ki} = \operatorname{sgn}\left( p_{ki} - B_{-k,i}^{\top} W_{-k} \, w_k^{\top} \right),$$
where $p_{ki}$ is the $(k, i)$ element of $P$, $B_{-k,i}$ is the $i$-th column of $B$ with the $k$-th row removed, $W_{-k}$ denotes $W$ with the $k$-th row removed, and $w_k$ is the $k$-th row of $W$. This process is iterated over all bits until convergence.

3.2. Multi-Level Attention-Based Aggregation

To mitigate communication overhead and address data heterogeneity in distributed intrusion detection, we design a multi-level federated learning framework combined with an attention-based aggregation strategy. As illustrated in Figure 2, the framework organizes edge nodes into hierarchical subnetworks to enable multi-level model aggregation. Furthermore, as shown in Figure 3, an attention mechanism is employed to dynamically assign aggregation weights at each level, allowing more adaptive and fine-grained model updates. This joint design significantly improves the efficiency and scalability of the system.
The resulting relevance scores are then normalized to obtain attention weights. In our federated learning framework, the server begins by initializing a global parameter vector $\phi^{(0)}$ and distributing it to all participating clients. Each client $c$ receives $\phi^{(t)}$ at round $t$ and trains a local model $U_c^{(t)}$ on its private dataset $D_c = \{(x_i^c, y_i^c)\}$, producing updated parameters $\hat{\phi}_c^{(t)}$. After completing local training, the client sends $\hat{\phi}_c^{(t)}$ to the server. Upon receiving updates from a subset of $m_t$ clients in round $t$, the server aggregates them as:
$$\phi^{(t+1)} = \frac{1}{m_t} \sum_{c=1}^{m_t} \hat{\phi}_c^{(t)},$$
where $\phi^{(t+1)}$ is the updated global model and $m_t$ is the number of clients participating in aggregation at round $t$. After each aggregation, the server sends $\phi^{(t+1)}$ back to the participating clients. Each client then updates its local parameters by blending the new global model with its current local model:
$$U_c^{(t+1)} = \beta \cdot \phi^{(t+1)} + (1 - \beta) \cdot U_c^{(t)},$$
where $\beta \in [0, 1]$ is a mixing coefficient controlling the balance between global guidance and local adaptation. A higher $\beta$ places more weight on the global model, while a lower $\beta$ allows the client to retain more of its locally learned features. This partial participation and blending strategy improves robustness in heterogeneous environments, as the server can proceed with updates without waiting for all clients, and clients can flexibly adjust the influence of global updates according to their local data characteristics.
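The two update rules above, server-side averaging and client-side blending, can be sketched on plain parameter vectors as follows. The function names are hypothetical, the parameters are flattened into Python lists for brevity, and the attention weighting itself is not modelled here.

```python
def aggregate(client_params):
    """Server step: element-wise mean of the parameter vectors received this round."""
    m_t = len(client_params)
    return [sum(values) / m_t for values in zip(*client_params)]


def blend(global_params, local_params, beta):
    """Client step: beta * global + (1 - beta) * local, with beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * g + (1.0 - beta) * l for g, l in zip(global_params, local_params)]


updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three participating clients
phi_next = aggregate(updates)
local_next = blend(phi_next, updates[0], beta=0.25)  # first client's model as U_c
print(phi_next, local_next)  # [3.0, 4.0] [1.5, 2.5]
```

With `beta=0.25` the first client moves only a quarter of the way toward the global model, which illustrates how a small mixing coefficient retains locally learned parameters.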

3.3. Federated Multi-Index Hashing

To enable efficient intrusion detection in distributed and large-scale network environments, we propose a collaborative detection strategy that integrates hash-based representation with multi-index retrieval across multiple edge nodes. In this framework, each edge node trains a hashing model to encode raw network traffic into fixed-length binary codes, while preserving semantic similarity among similar traffic patterns. The encoded hash codes are uploaded to the node’s sub-server, which synchronizes with the central server to construct a global spatiotemporal hash codebase covering multiple domains.
To support fast retrieval, we adopt the multi-index hashing (MIH) scheme, as illustrated in Figure 4. MIH partitions each K-bit hash code into m disjoint subcodes and stores them in separate hash tables. During inference, a query code is similarly split, and each subcode is used to search its corresponding hash table in parallel. The retrieved candidates are then ranked by Hamming distance, and their labels are aggregated to determine the most likely traffic category.
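A compact sketch of the indexing and query procedure is given below. It assumes codes are stored as strings of `0`/`1` characters, probes every substring table within radius ⌊r/m⌋, and labels the query by majority vote among the nearest retrieved codes. All names, the toy database, and the voting rule are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split(code, m):
    """Split a bit string into m disjoint substrings of equal length."""
    s = len(code) // m
    return [code[j * s:(j + 1) * s] for j in range(m)]


def neighbours(sub, radius):
    """Yield all bit strings within the given Hamming radius of sub."""
    for d in range(radius + 1):
        for positions in combinations(range(len(sub)), d):
            bits = list(sub)
            for p in positions:
                bits[p] = "1" if bits[p] == "0" else "0"
            yield "".join(bits)


def build_index(codes, m):
    """One hash table per substring position, mapping substring -> item ids."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for j, sub in enumerate(split(code, m)):
            tables[j][sub].append(idx)
    return tables


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def knn_label(query, codes, labels, tables, m, r, k):
    """Collect candidates within substring radius floor(r/m), rank them, and vote."""
    candidates = set()
    for j, sub in enumerate(split(query, m)):
        for probe in neighbours(sub, r // m):
            candidates.update(tables[j].get(probe, []))
    ranked = sorted(candidates, key=lambda i: (hamming(query, codes[i]), i))
    within_r = [i for i in ranked if hamming(query, codes[i]) <= r][:k]
    votes = Counter(labels[i] for i in within_r)
    return votes.most_common(1)[0][0] if votes else None


codes = ["00000000", "00000011", "11110000", "11111100"]
labels = ["BENIGN", "BENIGN", "DoS", "DoS"]
tables = build_index(codes, m=2)
print(knn_label("11110001", codes, labels, tables, m=2, r=2, k=3))  # DoS
```

Only the candidates returned by the substring tables are compared against the full query code, which is what keeps the search from scanning the whole database.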
Theorem 1. 
Given two binary codes $h$ and $g$ with Hamming distance $\lVert h - g \rVert_H \le r$, if the codes are divided into $m$ substrings, then there exists at least one substring $z \in \{1, 2, \ldots, m\}$ such that the Hamming distance between the $z$-th substrings is bounded by $\lfloor r/m \rfloor$, i.e.,
$$\exists\, 1 \le z \le m \ \text{such that} \ \lVert h^{(z)} - g^{(z)} \rVert_H \le r', \qquad r' = \left\lfloor \frac{r}{m} \right\rfloor.$$
This property ensures retrieval completeness [36]. Based on Theorem 1, MIH retrieves, for each substring position $j$, a candidate set $N_j(g)$ consisting of codes whose Hamming distance from $g^{(j)}$ is at most $\lfloor r/m \rfloor$ or $\lfloor r/m \rfloor - 1$. The union of these sets forms $N(g)$, and all elements in $N(g)$ are compared with $g$ to determine the final $r$-neighbors.
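The bound in Theorem 1 follows from a pigeonhole argument, sketched here for completeness. Suppose, for contradiction, that every substring pair differs in more than $\lfloor r/m \rfloor$ bits:

```latex
\begin{aligned}
\lVert h^{(z)} - g^{(z)} \rVert_H &\ge \left\lfloor \tfrac{r}{m} \right\rfloor + 1
  \quad \text{for all } z \in \{1, \ldots, m\}, \\
\lVert h - g \rVert_H
  &= \sum_{z=1}^{m} \lVert h^{(z)} - g^{(z)} \rVert_H
  \ge m \left( \left\lfloor \tfrac{r}{m} \right\rfloor + 1 \right)
  > m \cdot \tfrac{r}{m} = r .
\end{aligned}
```

This contradicts $\lVert h - g \rVert_H \le r$, so at least one substring pair must lie within distance $\lfloor r/m \rfloor$.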
Theorem 2. 
To find any binary code within a Hamming radius $r = m r' + a$ in $q$-bit binary space, it is sufficient to search only $a + 1$ substrings with radius $r'$, and the remaining $m - (a + 1)$ substrings with radius $r' - 1$. That is,
$$\exists\, 1 \le z \le a + 1 \ \text{s.t.} \ \lVert h^{(z)} - g^{(z)} \rVert_H \le r' \quad \text{OR} \quad \exists\, a + 1 < z \le m \ \text{s.t.} \ \lVert h^{(z)} - g^{(z)} \rVert_H \le r' - 1.$$
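Let $s = q/m$ be the length of each substring, where $q$ is the total bit length. The number of bucket lookups per query is given by:
$$T(s) = \frac{q}{s} \sum_{z=0}^{\lfloor r s / q \rfloor} \binom{s}{z} \le \frac{q}{s} \, 2^{H(r/q)\, s},$$
where $H(\cdot)$ denotes the binary entropy function. With $n$ denoting the number of codes stored in the database, the corresponding retrieval cost is:
$$\mathrm{Cost}(s) = \left( 1 + \frac{n}{2^{s}} \right) \cdot \frac{q}{s} \sum_{z=0}^{\lfloor r s / q \rfloor} \binom{s}{z} \le \left( 1 + \frac{n}{2^{s}} \right) \cdot \frac{q}{s} \, 2^{H(r/q)\, s}.$$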
Since the number of queries grows much slower than exhaustive search and the candidates per bucket decrease exponentially with s, the overall cost remains low even in large-scale databases. This design preserves privacy by avoiding raw traffic transmission, enables rapid retrieval via bitwise XOR operations, and ensures generalization across diverse traffic types, protocols, and temporal patterns.
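To get a feel for these quantities, the sketch below evaluates the lookup count, its entropy bound, and the retrieval cost for one configuration. The parameter values (64-bit codes, 16-bit substrings, radius 8, one million stored codes) are illustrative assumptions and are not taken from the experiments.

```python
import math


def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def lookups(q, s, r):
    """Bucket lookups per query: (q/s) * sum_{z=0}^{floor(r*s/q)} C(s, z)."""
    m = q // s
    return m * sum(math.comb(s, z) for z in range((r * s) // q + 1))


def lookups_bound(q, s, r):
    """Entropy upper bound (q/s) * 2^(H(r/q) * s), valid for r/q <= 1/2."""
    return (q / s) * 2 ** (binary_entropy(r / q) * s)


def cost(q, s, r, n):
    """Lookups scaled by (1 + n / 2^s), where n / 2^s is the expected bucket size."""
    return (1 + n / 2 ** s) * lookups(q, s, r)


q, s, r, n = 64, 16, 8, 1_000_000
print(lookups(q, s, r))        # 548
print(lookups_bound(q, s, r))
print(cost(q, s, r, n))
```

For this configuration the number of lookups is 548, far fewer than the million distance computations a linear scan of the same database would require.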

4. Experiments

To validate the effectiveness of our proposed intrusion detection framework in edge environments, we conducted a series of experiments on widely used benchmark datasets. This section describes the datasets used, the experimental setup, performance results, system resource overhead, and communication efficiency.

4.1. Datasets

We conducted our experiments on three widely used benchmark datasets: NSL-KDD [37], UNSW-NB15 [38], and CIC-IDS2017 [39]. Among these, CIC-IDS2017 served as the primary dataset for performance comparison and analysis, due to its comprehensive coverage of real-world network traffic scenarios. The CIC-IDS2017 dataset contains both benign and malicious traffic, including various types of modern attacks, such as DoS, DDoS, Web Attack, Port Scan, Botnet, and Infiltration. For our experiments, a subset of the dataset was selected and sampled for training and testing, as shown in Table 1.
For all the datasets (NSL-KDD, UNSW-NB15, and CIC-IDS2017), we performed a unified preprocessing pipeline including feature selection, normalization, and encoding of categorical attributes. We first removed redundant or highly correlated features, using correlation analysis and domain knowledge to reduce dimensionality. Then, continuous features were normalized to zero mean and unit variance using z-score normalization to prevent scale bias during training. For categorical fields such as protocol type and service, one-hot encoding was applied. These preprocessing steps have been shown to improve the convergence of the hashing neural network and the performance of the attention-based KNN retrieval.
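The normalization and encoding steps can be sketched with the standard library alone. The column values below are made up for illustration, and the helper names `zscore` and `one_hot` are hypothetical; feature selection is omitted.

```python
import statistics


def zscore(column):
    """Scale a numeric column to zero mean and unit variance (population std)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # a constant feature carries no information
    return [(v - mean) / std for v in column]


def one_hot(column):
    """Encode a categorical column as 0/1 indicator vectors (categories sorted)."""
    categories = sorted(set(column))
    return [[1 if v == c else 0 for c in categories] for v in column], categories


durations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
protocols = ["tcp", "udp", "tcp", "icmp", "tcp", "udp", "tcp", "tcp"]
scaled = zscore(durations)
encoded, cats = one_hot(protocols)
print(scaled[0], scaled[-1])  # -1.5 2.0
print(cats, encoded[0])       # ['icmp', 'tcp', 'udp'] [0, 1, 0]
```

In a federated setting the mean and standard deviation would need to be computed consistently across nodes; how the authors handle this is not stated, so the sketch treats a single local column.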

4.2. Experimental Settings

To simulate real-world network attack scenarios, we set up a controlled test environment comprising six emulated attack hosts and one designated target server. The attack hosts used the hping3 tool to launch Denial of Service (DoS) attacks by sending crafted TCP packets to the target server. Specifically, each attack machine transmitted a packet every 500 to 2000 milliseconds, continuously sending large volumes of spoofed requests to saturate the server’s bandwidth and computational resources. This setup effectively mimicked a Distributed Denial of Service (DDoS) environment, which is one of the most prevalent and representative forms of network attacks in real-world scenarios. The target server was equipped with a packet capture utility that monitored and recorded all incoming traffic, including both attack traffic and normal background traffic (labeled as BENIGN). The collected packets contained essential TCP-related header fields, such as source and destination IP addresses, source and destination ports, and TCP flags. Additionally, a set of time-based traffic statistical features was extracted from the captured flows, providing a comprehensive representation of network behavior for both benign and malicious scenarios.
In our experiments, the neural network was configured with an input layer of 78 nodes, followed by two hidden layers with 64 and 48 neurons, respectively. The output layer size corresponds to the hash code length, which was varied over {12, 24, 48, 96}. During training, the base model on each edge node was optimized with a batch size of 8 and a learning rate of $1 \times 10^{-5}$. The mixing factor $\alpha$ was set to 0.28, and the number of communication rounds between the clients and the server was fixed at 80. To ensure a comprehensive evaluation, each dataset was split into a training set and a test set with an 80/20 ratio. The training samples were evenly distributed across five edge nodes. For the IID setting, the training data was randomly divided into five equally sized subsets and assigned to the edge nodes, ensuring that each node had a representative distribution of the entire dataset. For the non-IID setting, each edge node received samples belonging to only two randomly selected classes from the training set, thereby simulating data heterogeneity across the devices. We evaluated the performance of the intrusion detection model using three widely accepted metrics: accuracy (AC), false positive rate (FPR), and F1-score. These metrics offer a balanced view of classification precision, recall, and the rate of false alarms, which are crucial for reliable intrusion detection.
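The two data distribution settings can be sketched as follows. This is a simplified illustration: samples are `(id, label)` tuples, the six-class toy dataset is synthetic, and in the non-IID helper nodes that happen to draw the same class share its samples, which may differ from how the authors assigned data.

```python
import random


def iid_partition(samples, n_nodes, rng):
    """Shuffle, then deal samples round-robin so node sizes differ by at most one."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_nodes] for i in range(n_nodes)]


def non_iid_partition(samples, n_nodes, classes_per_node, rng):
    """Each node keeps only samples from a few randomly chosen classes."""
    classes = sorted({label for _, label in samples})
    parts = []
    for _ in range(n_nodes):
        chosen = set(rng.sample(classes, classes_per_node))
        parts.append([s for s in samples if s[1] in chosen])
    return parts


rng = random.Random(42)
samples = [(i, i % 6) for i in range(600)]  # 600 toy samples, six classes
iid = iid_partition(samples, n_nodes=5, rng=rng)
non_iid = non_iid_partition(samples, n_nodes=5, classes_per_node=2, rng=rng)
print([len(p) for p in iid])                        # [120, 120, 120, 120, 120]
print([len({l for _, l in p}) for p in non_iid])    # [2, 2, 2, 2, 2]
```

Restricting each node to two classes is what makes local updates diverge from one another, which is the situation the attention-based aggregation is meant to handle.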

4.3. Experimental Results

We evaluated the intrusion detection performance of our proposed model across various datasets and compared it against state-of-the-art baselines. Table 2 reports the results on four datasets (CIC-IDS2017, UNSW-NB15, NSL-KDD, and a self-collected dataset labeled Synthetic Data) under both IID and non-IID settings. The Synthetic Data was constructed from network traffic traces collected in our controlled laboratory environment, containing both benign flows and simulated attack scenarios. The metrics included accuracy (AC), F1-score (F1), and false positive rate (FPR).
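For reference, the three metrics can be computed from confusion counts as in the sketch below. It assumes a binary labeling (1 = attack, 0 = BENIGN); the text does not say how the metrics are averaged over multiple attack classes, so the function name `binary_ids_metrics` and the binary setting are illustrative assumptions only.

```python
import numpy as np

def binary_ids_metrics(y_true, y_pred):
    """Accuracy, F1, and FPR for labels where 1 = attack and 0 = BENIGN."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # attacks flagged as attacks
    tn = int(np.sum(~y_true & ~y_pred))  # benign flows passed as benign
    fp = int(np.sum(~y_true & y_pred))   # benign flows flagged as attacks
    fn = int(np.sum(y_true & ~y_pred))   # attacks that were missed
    accuracy = (tp + tn) / max(tp + tn + fp + fn, 1)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / max(fp + tn, 1)           # false alarms among all benign flows
    return {"AC": accuracy, "F1": f1, "FPR": fpr}
```

Note that FPR is normalized by the number of benign samples, so a dataset dominated by benign traffic can show high accuracy while still producing many false alarms in absolute terms.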
The results demonstrate that our model consistently achieves high performance across all datasets. For example, on the CIC-IDS2017 dataset, it reached an accuracy of 98.97% and an F1-score of 96.73% under IID conditions. Even in the more challenging non-IID setting, it maintained strong performance, with 94.13% accuracy and 92.59% F1. Furthermore, Table 3 and Table 4 compare our method with several existing intrusion detection approaches on the CIC-IDS2017 and UNSW-NB15 datasets, respectively. On the UNSW-NB15 dataset, our model achieved the highest F1-score (97.12%) and the lowest FPR (2.32%) among the compared methods, while its accuracy (98.52%) was within 0.11 percentage points of the best result, obtained by NE-GConv (98.63%). On the CIC-IDS2017 dataset, it outperformed CoWatch by more than 5 percentage points in accuracy (98.97% versus 93.21%) and reduced the false positive rate from 3.85% to 1.97%; its F1-score (96.73%) was slightly below that of the Transformer baseline (96.90%). These results indicate that our attention-driven federated learning framework with hash-based modeling offers competitive detection accuracy and reduced false alarm rates across diverse real-world and synthetic datasets.
To further evaluate the classification performance of our method, we visualize confusion matrices for the CIC-IDS2017 and UNSW-NB15 datasets in Figure 5, each including six representative categories of network traffic. Figure 5a shows that the proposed model achieved very high classification accuracy across most classes in the CIC-IDS2017 dataset. Specifically, benign traffic (class 1), DoS (class 2), and Port Scan (class 5) show near-perfect accuracy exceeding 99%, with very few misclassifications. Figure 5b shows the results on the UNSW-NB15 dataset, where overall classification performance remained strong, with most classes above 97% accuracy. Nonetheless, class 6 showed slightly lower accuracy (97.63%), with a small number of instances misclassified as benign traffic or class 5. This suggests that these classes partially overlap in feature space and highlights the challenge of precisely identifying certain rare attack types in realistic and noisy network environments.

4.4. Time and Space Efficiency Analysis

We evaluated time efficiency by measuring the search latency of linear scan and Multi-Index Hashing (MIH) on the CIC-IDS2017 and UNSW-NB15 datasets for various values of k. As shown in Table 5, MIH delivered substantial acceleration over linear scan, with speedups ranging from 48× to 1013×, depending on the dataset, the hash code length, and k. When performing a 1-NN search on 1 billion 48-bit binary codes, linear scan required 11.63 s, while MIH completed the same task in just 0.018 s, a roughly 640-fold speedup. For 1-NN search on the CIC-IDS2017 dataset, the speedup ranged from 239× (with 12-bit codes) to 1013× (with 24-bit codes). On the UNSW-NB15 [38] dataset, the corresponding improvement ranged from 407× (with 12-bit codes) to 806× (with 96-bit codes). Notably, the distribution of the hash codes, which is determined by the dataset and the hash bit length, directly affected retrieval performance. As the value of k increased (i.e., more neighbors were retrieved), the speedup tended to decrease, because larger k values expand the search radius and thus the number of candidate neighbors that must be checked. Overall, MIH outperformed linear scan by between one and three orders of magnitude in every configuration in Table 5, demonstrating its efficiency and suitability for real-time intrusion detection on large-scale binary hash databases.
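The mechanism behind this speedup can be illustrated with a small sketch of multi-index hashing in the spirit of Norouzi et al. [15,36]. It is not the implementation benchmarked in Table 5: the class name `MultiIndexHash`, the use of Python integers as codes, and the simple radius-growing k-NN loop are assumptions made for readability. The key idea is the pigeonhole argument: if two b-bit codes differ in at most r bits and each code is split into m disjoint substrings, at least one pair of substrings differs in at most ⌊r/m⌋ bits, so only buckets near the query's substrings need to be probed before verifying candidates with the full Hamming distance.

```python
from collections import defaultdict
from itertools import combinations

class MultiIndexHash:
    def __init__(self, codes, bits, m):
        assert bits % m == 0, "sketch assumes equal-width substrings"
        self.codes, self.bits, self.m = list(codes), bits, m
        self.sub_bits = bits // m
        self.mask = (1 << self.sub_bits) - 1
        # One hash table per substring position: substring value -> code indices.
        self.tables = [defaultdict(list) for _ in range(m)]
        for idx, code in enumerate(self.codes):
            for j in range(m):
                self.tables[j][self._sub(code, j)].append(idx)

    def _sub(self, code, j):
        return (code >> (j * self.sub_bits)) & self.mask

    def _neighbors(self, key, radius):
        """Yield every substring value within Hamming distance `radius` of `key`."""
        for d in range(radius + 1):
            for positions in combinations(range(self.sub_bits), d):
                flipped = key
                for p in positions:
                    flipped ^= 1 << p
                yield flipped

    def range_search(self, query, r):
        """Return sorted (distance, index) pairs for all codes within distance r."""
        sub_radius = r // self.m  # pigeonhole bound on the closest substring
        candidates = set()
        for j in range(self.m):
            for key in self._neighbors(self._sub(query, j), sub_radius):
                candidates.update(self.tables[j].get(key, ()))
        # Verify candidates against the full code; discard false positives.
        hits = [(bin(self.codes[i] ^ query).count("1"), i) for i in candidates]
        return sorted(h for h in hits if h[0] <= r)

    def knn(self, query, k):
        """Exact k-NN: grow the radius until at least k codes fall inside it."""
        for r in range(self.bits + 1):
            hits = self.range_search(query, r)
            if len(hits) >= k:
                return hits[:k]
        return hits  # fewer than k codes are indexed

def linear_knn(codes, query, k):
    """Baseline: compute the Hamming distance to every code."""
    return sorted((bin(c ^ query).count("1"), i) for i, c in enumerate(codes))[:k]
```

The sketch also shows why the speedup shrinks as k grows: a larger k forces a larger radius r, and the number of buckets probed per table grows combinatorially with ⌊r/m⌋. A production implementation would reuse work across radii instead of restarting `range_search` at each step.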
To evaluate the space efficiency of our hash-based modeling approach, we selected subsets of three datasets (CIC-IDS2017, UNSW-NB15, and NSL-KDD), whose original data sizes were 17.2 MB, 15.7 MB, and 23.8 MB, respectively. The results are visualized in Figure 6. The line plot in the figure shows the ratio of the hash code storage to the original data size across four encoding lengths: 12, 24, 48, and 96 bits. The corresponding storage proportions were approximately 1.2%, 11%, 32%, and 63%, respectively. As expected, longer hash codes provided finer-grained representations of the original data but also consumed more storage. When combined with the results in Table 5, we observe that extremely short codes (e.g., 12-bit) tend to sacrifice detection performance, while moderate-length codes (e.g., 48-bit) offer a good trade-off between accuracy and storage.
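As a rough illustration of how binary codes are stored compactly, the sketch below packs 0/1 code matrices into bytes with NumPy and computes the ratio of the packed payload to an original data size. The helper names `pack_codes` and `storage_ratio` are hypothetical, each code is assumed to be padded to a whole number of bytes, and the sketch counts only the raw code payload, so it does not attempt to reproduce the proportions reported in Figure 6.

```python
import numpy as np

def pack_codes(bit_matrix):
    """Pack an (n, b) array of 0/1 values into (n, ceil(b / 8)) bytes."""
    return np.packbits(np.asarray(bit_matrix, dtype=np.uint8), axis=1)

def storage_ratio(num_samples, bits, original_bytes):
    """Packed hash payload size divided by the original data size."""
    bytes_per_code = (bits + 7) // 8  # assumed: each code padded to whole bytes
    return num_samples * bytes_per_code / original_bytes
```

For example, 1000 codes of 12 bits occupy 2000 bytes after packing, which is 2% of a hypothetical 100,000-byte original table.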

5. Conclusions

In this paper, we proposed a novel intrusion detection framework tailored for edge environments, integrating deep supervised discrete hashing, a multi-level federated learning architecture, and a multi-index hash retrieval mechanism. Our approach delivers efficient intrusion detection for edge-based network security, even under tight resource constraints. By representing traffic data as compact binary hash codes, we significantly reduce storage and computation overhead. Furthermore, the proposed aggregation strategy in federated training improves convergence speed and model accuracy under heterogeneous conditions. Extensive experiments conducted on benchmark datasets demonstrated that our method achieves competitive detection performance and higher efficiency compared to the existing approaches.

Author Contributions

Conceptualization, Y.L. and B.G.; Methodology, Y.L. and H.Y.; Software, X.L. (Xing Liu) and H.Y.; Validation, Y.L., X.L. (Xing Liu), H.Y. and B.G.; Formal analysis, X.L. (Xing Liu), Y.L. and X.L. (Xiao Liu); Resources, Y.L.; Data curation, X.L. (Xing Liu) and Y.L.; Writing—original draft, Y.L.; Writing—review and editing, B.G.; Visualization, X.L. (Xing Liu); Supervision, B.G.; Project administration, Y.L. and X.L. (Xiao Liu). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

Author Ying Liu is employed by the company State Grid Corporation of China. Authors Xing Liu and Hao Yu are employed by the company Nari Information & Communication Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

References

  1. Guo, B.; Yang, Y.; Li, Q.; Hou, J.; Rao, Y. Boosting Adversarial Attacks with Improved Sign Method. In Advanced Data Mining and Applications; Springer: Berlin/Heidelberg, Germany, 2023; pp. 150–164.
  2. Guo, B.; Li, Q.; Liu, X. Improving Adversarial Transferability with Heuristic Random Transformation. In Proceedings of the 2023 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Rio de Janeiro, Brazil, 24–26 May 2023; pp. 35–40.
  3. Liang, L.; Guo, B.; Lian, Z.; Li, Q.; Jing, H. IMPGA: An Effective and Imperceptible Black-Box Attack Against Automatic Speech Recognition Systems. In Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data; Springer: Berlin/Heidelberg, Germany, 2022; pp. 349–363.
  4. Agarwal, R.; Joshi, M.V. PNrule: A new framework for learning classifier models in data mining (a case-study in network intrusion detection). In Proceedings of the 2001 SIAM International Conference on Data Mining, SIAM, Chicago, IL, USA, 5–7 April 2001; pp. 1–17.
  5. Nikolova, E.; Jecheva, V. Some similarity coefficients and application of data mining techniques to the anomaly-based IDS. Telecommun. Syst. 2012, 50, 127–135.
  6. Nagaraju, S.; Shanmugham, B.; Baskaran, K. High throughput token driven FSM based regex pattern matching for network intrusion detection system. Mater. Today Proc. 2021, 47, 139–143.
  7. Basati, A.; Faghih, M.M. PDAE: Efficient network intrusion detection in IoT using parallel deep auto-encoders. Inf. Sci. 2022, 598, 57–74.
  8. Gondhalekar, R.; Chattamvelli, R. A Comprehensive Review of Dimensionality Reduction Techniques for Real-time Network Intrusion Detection with Applications in Cybersecurity. Def. Sci. J. 2024, 74, 246–255.
  9. Srikanth yadav, M.; Kalpana, R. Recurrent nonsymmetric deep auto encoder approach for network intrusion detection system. Meas. Sens. 2022, 24, 100527.
  10. Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 2015, 18, 1153–1176.
  11. Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge computing: Vision and challenges. IEEE Internet Things J. 2016, 3, 637–646.
  12. Dhanabal, L.; Shantharajah, S. A study on NSL-KDD dataset for intrusion detection system based on classification algorithms. Int. J. Adv. Res. Comput. Commun. Eng. 2015, 4, 446–452.
  13. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
  14. Vyas, A.; Lin, P.C.; Hwang, R.H.; Tripathi, M. Privacy-Preserving Federated Learning for Intrusion Detection in IoT Environments: A Survey. IEEE Access 2024, 12, 127018–127050.
  15. Norouzi, M.; Punjani, A.; Fleet, D.J. Fast search in hamming space with multi-index hashing. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3108–3115.
  16. Kolosnjaji, B.; Zarras, A.; Webster, G.; Eckert, C. Deep learning for classification of malware system call sequences. In Proceedings of the AI 2016: Advances in Artificial Intelligence: 29th Australasian Joint Conference, Hobart, TAS, Australia, 5–8 December 2016; Proceedings 29. Springer: Berlin/Heidelberg, Germany, 2016; pp. 137–149.
  17. Yang, Y.; Nan, F.; Yang, P.; Meng, Q.; Xie, Y.; Zhang, D.; Muhammad, K. GAN-based semi-supervised learning approach for clinical decision support in health-IoT platform. IEEE Access 2019, 7, 8048–8057.
  18. Gyamfi, E.; Jurcut, A. Intrusion detection in internet of things systems: A review on design approaches leveraging multi-access edge computing, machine learning and datasets. Sensors 2022, 22, 3744.
  19. Elsaeidy, A.; Munasinghe, K.S.; Sharma, D.; Jamalipour, A. Intrusion detection in smart cities using Restricted Boltzmann Machines. J. Netw. Comput. Appl. 2019, 135, 76–83.
  20. Ashraf, E.; Areed, N.F.; Salem, H.; Abdelhay, E.H.; Farouk, A. FIDChain: Federated intrusion detection system for blockchain-enabled IoT healthcare applications. Healthcare 2022, 10, 1110.
  21. Solanki, T.; Patel, K.; Pande, S.; Nimkar, A.V. BlockID: Blockchain based Digital ID and Authentication System for Privacy Improvement. In Proceedings of the 2023 3rd International Conference on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), Ernakulam, India, 18–20 May 2023; pp. 106–112.
  22. Agrawal, S.; Sarkar, S.; Aouedi, O.; Yenduri, G.; Piamrat, K.; Alazab, M.; Bhattacharya, S.; Maddikunta, P.K.R.; Gadekallu, T.R. Federated learning for intrusion detection system: Concepts, challenges and future directions. Comput. Commun. 2022, 195, 346–361.
  23. Friji, H.; Olivereau, A.; Sarkiss, M. Efficient network representation for GNN-based intrusion detection. In Applied Cryptography and Network Security; Springer: Berlin/Heidelberg, Germany, 2023; pp. 532–554.
  24. Bhavsar, M.H.; Bekele, Y.B.; Roy, K.; Kelly, J.C.; Limbrick, D. Fl-ids: Federated learning-based intrusion detection system using edge devices for transportation iot. IEEE Access 2024, 12, 52215–52226.
  25. Jin, Z.; Zhou, J.; Li, B.; Wu, X.; Duan, C. FL-IIDS: A novel federated learning-based incremental intrusion detection system. Future Gener. Comput. Syst. 2024, 151, 57–70.
  26. Ceviz, O.; Sadioglu, P.; Sen, S.; Vassilakis, V.G. A novel federated learning-based IDS for enhancing UAVs privacy and security. Internet Things 2025, 31, 101592.
  27. Olanrewaju-George, B.; Pranggono, B. Federated learning-based intrusion detection system for the internet of things using unsupervised and supervised deep learning models. Cyber Secur. Appl. 2025, 3, 100068.
  28. Jafari, O.; Maurya, P.; Nagarkar, P.; Islam, K.M.; Crushev, C. A survey on locality sensitive hashing algorithms and their applications. arXiv 2021, arXiv:2102.08942.
  29. Gong, Y.; Lazebnik, S.; Gordo, A.; Perronnin, F. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 2916–2929.
  30. Zhou, J.; Ding, G.; Guo, Y.; Liu, Q.; Dong, X. Kernel-based supervised hashing for cross-view similarity search. In Proceedings of the 2014 IEEE International Conference on Multimedia and Expo (ICME), Chengdu, China, 14–18 July 2014; pp. 1–6.
  31. Norouzi, M.; Fleet, D.J. Minimal loss hashing for compact binary codes. In Proceedings of the 28th International Conference on Machine Learning (ICML’11), Bellevue, WA, USA, 28 June–2 July 2011; Volume 1, p. 2.
  32. Shao, T.; Yang, Y.; Weng, Y.; Hou, Q.; Zhou, K. H-CNN: Spatial hashing based CNN for 3D shape analysis. IEEE Trans. Vis. Comput. Graph. 2018, 26, 2403–2416.
  33. Zhao, F.; Huang, Y.; Wang, L.; Tan, T. Deep semantic ranking based hashing for multi-label image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1556–1564.
  34. Wan, J.; Tang, S.; Zhang, Y.; Huang, L.; Li, J. Data driven multi-index hashing. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, Australia, 15–18 September 2013; pp. 2670–2673.
  35. Kapoor, A.; Kumar, D. K-HashFed: Communication Efficient Federated Learning through Gradient Clustering and Hashing. In Proceedings of the ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5.
  36. Norouzi, M.; Punjani, A.; Fleet, D.J. Fast exact search in hamming space with multi-index hashing. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 36, 1107–1119.
  37. Meena, G.; Choudhary, R.R. A review paper on IDS classification using KDD 99 and NSL KDD dataset in WEKA. In Proceedings of the 2017 International Conference on Computer, Communications and Electronics (Comptelix), Jaipur, India, 1–2 July 2017; pp. 553–558.
  38. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6.
  39. Rosay, A.; Cheval, E.; Carlier, F.; Leroux, P. Network intrusion detection: A comprehensive analysis of CIC-IDS2017. In 8th International Conference on Information Systems Security and Privacy; SCITEPRESS-Science and Technology Publications: Setúbal, Portugal, 2022; pp. 25–36.
  40. Zhou, H.; Jia, X.; Shu, J.; Zhou, L. Cowatch: Collaborative prediction of ddos attacks in edge computing with distributed sdn. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6.
  41. Manocchio, L.D.; Layeghy, S.; Lo, W.W.; Kulatilleke, G.K.; Sarhan, M.; Portmann, M. Flowtransformer: A transformer framework for flow-based network intrusion detection systems. Expert Syst. Appl. 2024, 241, 122564.
  42. Altaf, T.; Wang, X.; Ni, W.; Liu, R.P.; Braun, R. NE-GConv: A lightweight node edge graph convolutional network for intrusion detection. Comput. Secur. 2023, 130, 103285.
  43. Ren, K.; Zeng, Y.; Cao, Z.; Zhang, Y. ID-RDRL: A deep reinforcement learning-based feature selection intrusion detection model. Sci. Rep. 2022, 12, 15370.
Figure 1. Overview of the proposed distributed framework for edge-based intrusion detection.
Figure 2. Architecture of the deep supervised discrete hashing network for traffic representation and hashing.
Figure 3. Illustration of the multi-level federated learning framework with attention-based aggregation strategy.
Figure 4. Overview of the multi-index hashing mechanism.
Figure 5. Confusion matrices of classification results on CIC-IDS2017 (a) and UNSW-NB15 (b) datasets.
Figure 6. Comparison of space cost under different hash code lengths across CIC-IDS2017, NSL-KDD and UNSW-NB15 datasets.
Table 1. The number of instances in CIC-IDS2017 dataset.

| Attack Type | Total | Training Set | Test Set |
|---|---|---|---|
| BENIGN | 2,271,320 | 6000 | 2000 |
| DDoS | 128,025 | 6000 | 2000 |
| DoS | 230,124 | 6000 | 2000 |
| Web Attack | 652 | 450 | 150 |
| Port Scan | 158,804 | 6000 | 2000 |
| Botnet | 1956 | 1200 | 400 |
| Infiltration | 36 | 30 | 6 |

Table 2. Performance on different datasets.

| Dataset | IID AC (%) | IID F1 (%) | IID FPR (%) | Non-IID AC (%) | Non-IID F1 (%) | Non-IID FPR (%) |
|---|---|---|---|---|---|---|
| CIC-IDS2017 [39] | 98.97 | 96.73 | 1.97 | 94.13 | 92.59 | 5.81 |
| UNSW-NB15 [38] | 98.52 | 97.12 | 2.32 | 92.71 | 91.88 | 6.32 |
| NSL-KDD [37] | 97.23 | 96.62 | 2.11 | 93.48 | 92.24 | 4.18 |
| Synthetic Data | 99.21 | 98.05 | 1.16 | 98.92 | 98.13 | 1.73 |

Table 3. Comparison with other methods on the CIC-IDS2017 dataset.

| Method | AC (%) | F1 (%) | FPR (%) |
|---|---|---|---|
| CoWatch [40] | 93.21 | 91.12 | 3.85 |
| Transformer (GPT-based) [41] | 97.23 | 96.90 | 2.31 |
| Ours | 98.97 | 96.73 | 1.97 |

Table 4. Comparison with other methods on the UNSW-NB15 dataset.

| Method | AC (%) | F1 (%) | FPR (%) |
|---|---|---|---|
| NE-GConv [42] | 98.63 | 96.42 | 3.48 |
| GatedGraphConv [42] | 97.31 | 96.03 | 3.98 |
| CoWatch [40] | 93.01 | 89.55 | 3.09 |
| RL-based IDS [43] | 96.18 | 94.89 | 3.52 |
| Ours | 98.52 | 97.12 | 2.32 |

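Table 5. Speedup of k-NN search compared to linear scan under different hash bit lengths.

| Dataset | Hash Bits | 1-NN | 5-NN | 10-NN | 100-NN | 1000-NN |
|---|---|---|---|---|---|---|
| CIC-IDS2017 | 12 | 239 | 187 | 156 | 74 | 48 |
| CIC-IDS2017 | 24 | 1013 | 991 | 850 | 683 | 392 |
| CIC-IDS2017 | 48 | 832 | 784 | 734 | 595 | 401 |
| CIC-IDS2017 | 96 | 747 | 685 | 309 | 178 | 97 |
| UNSW-NB15 | 12 | 407 | 328 | 245 | 113 | 61 |
| UNSW-NB15 | 24 | 732 | 659 | 418 | 292 | 183 |
| UNSW-NB15 | 48 | 640 | 588 | 503 | 368 | 157 |
| UNSW-NB15 | 96 | 806 | 694 | 552 | 372 | 226 |
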
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
