Article

An Intrusion Detection Method Based on Symmetric Federated Deep Learning in Complex Networks

1 School of Artificial Intelligence and Big Data, Chongqing Polytechnic University of Electronic Technology, Chongqing 401331, China
2 School of Science, Southern University of Science and Technology, Shenzhen 518055, China
3 School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
4 Big Data and Optimization Research Institute, Chongqing Polytechnic University of Electronic Technology, Chongqing 401331, China
5 JINSHAN Science & Technology (Group) Co., Ltd., Chongqing 401120, China
6 Chongqing Key Laboratory of Big Data Intelligence and Privacy Computing, Chongqing 401331, China
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(6), 952; https://doi.org/10.3390/sym17060952
Submission received: 24 April 2025 / Revised: 24 May 2025 / Accepted: 28 May 2025 / Published: 15 June 2025

Abstract

The rapid development of 5G/6G networks has put tremendous pressure on traditional security detection when dealing with large-scale network attacks, resulting in high time complexity and low attack identification efficiency. Building on deep networks and the symmetry principle, this paper proposes a complex network intrusion detection and recognition method based on symmetric federated optimization, named IDS, which aims to reduce time complexity and improve the accuracy and efficiency of attack identification. Deep feature learning based on the symmetric UNet network is used to reconstruct data and construct the input matrix, and the federated deep learning algorithm is optimized with a symmetric autoencoder to make it more suitable for complex network environments. The experimental results demonstrate that the proposed symmetric-network-based method has significant advantages in intrusion detection accuracy and effectiveness and can effectively identify network intrusions. The proposed symmetric intrusion detection method not only addresses the bottleneck of traditional detection methods and improves the training efficiency of the model but also provides a new idea and solution for network security research.

1. Introduction

With the rapid development of information technology, the network environment has become increasingly complex, and various types of network attacks and intrusion incidents have emerged one after another, posing severe challenges to network security [1,2,3]. As an important component of network security, the importance of intrusion detection technology [4] has become increasingly prominent. In a complex network environment, intrusion detection technology faces many challenges. The expansion of network scale, the increase in the types of devices, and the diversification of attack methods all make it difficult for traditional intrusion detection methods to effectively respond [5]. Therefore, the research and development of intrusion detection technology that is suitable for complex network environments is particularly important.
Currently, intrusion detection technology mainly includes various methods such as expert systems [6], neural networks [7], machine learning [8], and deep learning [9]. With the rapid development of the artificial intelligence revolution and cross-disciplinary fields, deep feature learning, due to its powerful data processing and feature extraction capabilities, has shown great potential in intrusion detection in complex network environments. By constructing a deep feature learning model, network traffic data can be deeply analyzed to accurately identify abnormal behaviors and potential threats. In addition, in response to the particularity of complex network environments, researchers are constantly exploring new intrusion detection technologies. For example, the adaptive intrusion detection method based on convolutional neural networks [10] can quickly extract network traffic features using the characteristics of deep learning for precise identification of abnormal traffic. There is also the intelligent supply chain network intrusion detection method [11], which combines Markov decision processes and deep deterministic policy gradient algorithms to improve detection efficiency and security. Currently, intrusion detection technology in complex network environments is moving towards intelligence and automation. With the continuous progress of technology and in-depth research, the deep integration of intrusion detection technology and artificial intelligence will play an increasingly important role in the field of network security, providing a strong guarantee for the security and stability of cyberspace.
This paper proposes a complex network intrusion detection and identification method based on the symmetry principle and a deep feature learning mechanism. Deep feature learning based on the symmetric network UNet is used to reconstruct data and construct the input matrix, and the federated deep learning algorithm is optimized with a symmetric autoencoder so that it is better suited to complex network environments. At the same time, knowledge distillation (a machine learning model compression technique) is used to build a lightweight model, ultimately achieving network intrusion detection based on symmetric federated knowledge distillation.
The rest of this paper is organized as follows: Section 2 introduces the related work of this paper. Section 3 presents the architecture of the method. Section 4 details the design and implementation of the main algorithms. In Section 5, several sets of experiments are conducted to obtain the experimental results, proving that the IDS can effectively achieve network traffic identification and intrusion detection. Finally, Section 6 summarizes the work of this paper and looks forward to future research.

2. Related Work

In the field of network intrusion detection in complex network environments, early research mainly focused on host-based intrusion detection systems and network-based intrusion detection systems [12]. However, with the complexity and large scale of network systems and the enhancement of cooperation in intrusion behavior, the distributed intrusion detection system has gradually become a research hotspot [13]. The distributed intrusion detection system combines host-based and network-based detection methods. It is composed of several functional components, which are scattered in the network and cooperate with each other to realize intrusion detection. At the same time, the establishment of an attack model is also of great significance for understanding the principle of network attack, analyzing the process of network intrusion, and evaluating the degree of network security [14]. Attack modeling methods mainly include attack tree, attack net, state transition diagram, and attack graph. Among them, the attack graph modeling method is the most effective, and it is also the focus of the current research. With the rapid development of artificial intelligence technology, machine learning technology plays a vital role in the field of intrusion detection. However, it is affected by high-dimensional and nonlinear network traffic, which leads to a poor intrusion detection effect. In order to mitigate this impact, researchers have focused on feature selection and dimensionality reduction of network traffic data as well as the development of effective deep feature extraction methods. For example, complex network theory is used to transform nonlinear high-dimensional data into the form of network topology to extract the internal relationship between data [15]. In addition, a real-time monitoring system is an important part of network intrusion detection, which can detect anomalies and alarms in time. 
Researchers have designed real-time detection platforms based on machine learning models, which can monitor network traffic in real time and automatically determine whether there is any intrusion behavior [16]. The research of network intrusion detection in complex network environment has made remarkable progress, not only in the form of in-depth research in theory but also the launch of a number of effective products in practical applications. These research results provide a strong guarantee for network security and valuable experience and reference for future research.
Research on symmetric networks already rests on a substantial body of theoretical foundations and application explorations. Symmetry has shown important properties in many fields such as complex networks, social networks, and wireless sensor networks. Researchers are beginning to realize that the steady state of these systems needs to be maintained by asymmetry, but at the same time, symmetry also provides new ideas for network design and optimization in some aspects. Yong-Shang Long et al. [17] proposed a rigorous and efficient method for finding and quantifying symmetries in complex networks. By introducing a structural position vector (SPV) for each node in the network, they provide a conceptually attractive and computationally efficient way to find and characterize all symmetric nodes. It has been proven that nodes with the same SPV are symmetric with each other, and SPV can not only characterize the similarity of nodes but also quantify the influence of nodes on the propagation dynamics of the network. Ioannis Kontoyiannis et al. [18] considered a class of small-world graphs in which vertices were first connected to their nearest neighbors on a circle, and then pairs of non-adjacent vertices were connected according to a distance-dependent distribution. The degree distribution of this model was first determined, and then it was used to prove that the model is asymmetric for appropriate parameter ranges. Returning to graph compression, the main result is the computation of the entropy and structural entropy of these random graph models. In recent years, researchers have explored the mechanism and impact of symmetry breaking in complex networks [19].
Some work has mitigated the impact of cascading by analyzing the time characteristics of collapsing components, which can adjust the degree of symmetry breaking of interaction behaviors in the network in real time and provide solutions for congestion control and cascading mitigation of traffic flow in urban transportation networks and data flow in mesh wireless networks. In the field of quantum communication, researchers are exploring how to exploit symmetry to improve the security and efficiency of quantum key distribution systems. For example, Emir Dervisevic et al. [20] mentioned that QKD networks produce, manage, and provide symmetric keys as a service. Therefore, synchronization or consistency must be maintained between the contents of the key stores. Otherwise, the service will not run. Even if a perfect correlation between two symmetric keys is proven at the quantum layer, the key management layer must verify that neither KM node has received these keys by mistake. This is done by exchanging message authentication codes, hash values calculated based on key bits, and identifiers. Symmetric algorithms in cryptography have also been extensively studied in recent years. Symmetric cryptographic algorithms have been widely used in the field of computer communication and information system security by virtue of their advantages of standardization, easy software and hardware implementation, and suitability for encryption of large amounts of data [21]. Researchers are constantly exploring and improving the performance and security of symmetric algorithms to cope with increasingly complex network security threats. In social network analysis, researchers explored the symmetry in trust relationships between people [22]. They found that trust between people is not always mutual and may only be a one-sided trust relationship. This asymmetry is important for understanding the dynamic behavior of social networks and predicting behavioral changes in the network.
In the cross field, although there may be a few studies directly on the combination of federated learning and symmetric networks, the distributed training framework of federated learning provides new possibilities for the study of symmetric networks [23]. Through federated learning, the data from multiple clients can be used for model training and optimization while protecting data privacy, further improving the performance and generalization ability of symmetric networks. In summary, the research on symmetric networks is constantly deepening and expanding, involving many fields and aspects. These studies not only help to understand the essential characteristics and behavior rules of networks but also provide new ideas and methods for network design, optimization, and security protection. However, more researchers and practitioners need to work together to promote the development and application of symmetric networks because the research on symmetric networks involves multiple disciplinary fields and complex technical problems.
The knowledge distillation studied in this work is a teacher–student training architecture whose core idea is to provide knowledge through the trained teacher model and let the student model acquire the teacher’s knowledge through distillation training, thus transferring the knowledge of the complex teacher model to the simple student model at the cost of a slight performance loss. The concept of knowledge distillation was first introduced in 2015 by Hinton et al., who used soft targets (i.e., class probabilities softened by a temperature parameter T) to transfer knowledge. Large deep models often achieve good performance in practice because overparameterization improves generalization performance when considering new data [24]. However, deploying these cumbersome deep models on devices with limited resources is a challenge. As a representative type of model compression and acceleration, knowledge distillation can effectively address this problem. In [25], researchers proposed a novel data distillation method named TA-DFKD, which generates high-quality and diverse synthetic data by removing prior category restrictions and introducing a sample screening mechanism, thus achieving more stable and robust knowledge distillation without relying on a specific teacher model. In [26], researchers proposed TimeDistill, a cross-architecture knowledge distillation framework that uses an MLP as the student model and other complex advanced architectures (such as transformers and CNNs) as the teacher model. By distilling the advantages of complex models into lightweight models, the computational burden is greatly reduced and the prediction accuracy is significantly improved. As an effective technique for model compression and acceleration, knowledge distillation has a wide range of application prospects in the field of deep learning.
By continuously exploring and improving the methods and strategies of knowledge distillation, the performance and efficiency of deep learning models can be further improved.

3. System Architecture

Figure 1 depicts the framework and details of the proposed method in a complex network environment and shows the data interaction mode between the cloud center and the client computing nodes. The process starts with uploading the initialized parameters from the underlying network to the cloud data center, where they are fused with the cloud data center parameters to form a common model parameter set for training the initial model. Then, each client downloads the initial model parameters from the cloud data center and updates the client model with the personalized user data updated in real time. The updated client model parameters are uploaded back to the cloud data center to update the distributed federated learning model. In this process, the privacy of user data is protected, and encrypted parameters are shared between edge computing nodes that handle different types of task data. Through the interaction between the federated model and the client models, we complete the training of the federated model.
In the model application stage, the federated model classifies and identifies network traffic data based on the training results of labeled historical datasets. Then, according to the identification results, it will realize network traffic monitoring and send an alarm to users when abnormal behavior occurs.

4. IDS Method

To address catastrophic forgetting (when a neural network learns a new task, knowledge of old tasks is rapidly lost because of parameter updates) in federated continual learning while preserving user data privacy, and without storing the clients' sensitive local data or any additional datasets, this work proposes a federated continual learning traffic classification method based on a diffusion model (a deep generative model based on probabilistic graphical models and statistical physics principles). The method makes full use of the U-Net network architecture, to which a classifier component is added to perform network traffic label classification.

4.1. Symmetric Deep Learning

Since we used the UNet deep autoencoder network model to train the traffic data classification model in this work, we need to construct a feature matrix as the input of the model. As shown in Figure 2, firstly we check for missing values and outliers in the dataset. Although the UNSW-NB15 dataset was cleaned when released, further cleaning may be necessary based on the traffic data analysis objective. Then, we convert the data format. Since the dataset is provided in CSV format, we need to use the Weka tool to convert the data to the corresponding format. Subsequently, we perform feature encoding on the data. Categorical features (such as attack types) are encoded using One-Hot encoding or label encoding to facilitate correct processing by the machine learning model. This dataset has a total of 49 features, and we evenly set the number of each attack and the distribution of the training and test sets.
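The preprocessing steps above (encoding categorical features and building a feature matrix) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the toy records, the min-max scaling, and the zero-padding of each feature vector into a square matrix are all assumptions made for the example, since the text does not specify how the matrix is laid out.

```python
import math

def one_hot(value, categories):
    """Return a one-hot list for `value` over the ordered `categories`."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_scale(column):
    """Scale a numeric column to [0, 1]; constant columns map to 0 (assumed choice)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]

def to_square_matrix(vector):
    """Zero-pad a flat feature vector and reshape it into a square matrix (assumed layout)."""
    side = math.ceil(math.sqrt(len(vector)))
    padded = vector + [0.0] * (side * side - len(vector))
    return [padded[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical toy records: (duration, bytes, protocol).
records = [(0.5, 200, "tcp"), (1.5, 800, "udp"), (1.0, 500, "tcp")]
protocols = ["tcp", "udp", "icmp"]

durations = min_max_scale([r[0] for r in records])
sizes = min_max_scale([r[1] for r in records])
matrices = [
    to_square_matrix([d, s] + one_hot(r[2], protocols))
    for d, s, r in zip(durations, sizes, records)
]
```

Each record here becomes a 3 × 3 matrix; with the 49 features mentioned above, the same padding rule would give a 7 × 7 matrix before any one-hot expansion.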
Subsequently, the server distributes all parameters of the UNet network to each client. The client splits the local real data into a training set and a validation set in a certain proportion, updates the UNet network parameters, and local training begins. If the current task is the initial task, the parameters of the UNet network “decoder” module are frozen, and a training set based only on local real data is used. The cross-entropy loss is used to train the local classifier, especially the “classifier” parameters of UNet. Then, the local validation set is used for testing, and the optimal local classifier parameters are saved. The training method of the local classifier is as follows:
$$l_c(\theta_i^{\lambda}, d_i) = c\big(\theta_i^{\lambda}(d_i^{r}),\, l(d_i^{r})\big) \tag{1}$$
where $\theta_i$ denotes the model parameters updated by the ith client in the current round, $d_i^{r}$ is the local real data of client i, $l(d_i^{r})$ is the true label category corresponding to the data, $\theta_i^{\lambda}(d_i^{r})$ is the label category predicted by the model, and the loss $c$ is calculated using the cross-entropy formula.
If the current task is not the initial task, the client freezes the parameters of the UNet “classifier” module and uses the denoising diffusion model based on the UNet network as the generator. The labels of the local historical tasks are used to synthesize samples under the guidance of the gradients of the local UNet classifier of the previous task on the server. The synthesized samples are split into a training set and a validation set and added to the local training set. The local training set is updated from only real data to a mixture of real data and synthesized samples, and the validation set is the same. The parameters of the UNet network “decoder” module are frozen. First, the local classifier is trained based on the local training set containing real data and synthesized samples using the cross-entropy loss. Then, the local classifier is trained based on the local synthesized samples using the server model as the teacher model and the KL divergence. The training method of the local classifier is as follows. First, the local classifier is trained on real data and synthesized samples using Formula (2) to learn new knowledge. Then, the local model is restricted from deviating from the initial global model using local synthesized samples and Formula (3) to better utilize global knowledge:
$$l_c(\theta_i^{\lambda}, d_i) = c\big(\theta_i^{\lambda}(d_i),\, l(d_i)\big) \tag{2}$$
$$l_{kl}(\theta_i^{\lambda}, \theta_G, d_i^{s}) = kl\big(\theta_i^{\lambda}(d_i^{s}),\, \theta_G(d_i^{s})\big) \tag{3}$$
where $d_i$ is the local training set data of client i, including real data and synthesized samples; $kl$ in Formula (3) is the KL divergence; $d_i^{s}$ denotes the local synthesized samples of client i; and $\theta_G(d_i^{s})$ is the category predicted for the current client’s local synthesized samples by the global model aggregated in the previous communication round.
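The two-stage objective described above (cross-entropy on the mixed local set, followed by a KL term against the global model on synthesized samples) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names are hypothetical, model outputs are represented as plain lists of logits, and the direction KL(global || local) is an assumption because the text does not fix it.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]

def cross_entropy(local_logits, label):
    """Formula (2)-style loss for one sample; `label` is the true class index."""
    return -log_softmax(local_logits)[label]

def kl_to_global(local_logits, global_logits):
    """Formula (3)-style term for one synthesized sample.

    Assumption: the global model acts as the teacher, so this computes
    KL(global || local) over the predicted class distributions.
    """
    log_p = log_softmax(global_logits)   # teacher (global model)
    log_q = log_softmax(local_logits)    # student (local classifier)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Hypothetical outputs for one real sample and one synthesized sample.
stage_one = cross_entropy([2.0, 0.5, -1.0], label=0)
stage_two = kl_to_global([2.0, 0.5, -1.0], [1.5, 0.8, -0.5])
```

The KL term is zero when the local and global predictions coincide, which is what keeps the local model from drifting away from the global one.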
The local validation set is used for testing, and the optimal local classifier parameters are saved. The steps for training the diffusion model are as follows:
  • Given a time step lh, add noise to the original traffic data d0 to generate data d1.
  • Calculate the loss: Use the current UNet network parameters to predict the noise and calculate the loss between the predicted noise and the actual added noise, that is, $Loss_{\mu, d_0, l_h, l}\big[\lVert \mu - \mu_{\varepsilon}(d_{l_h}, l_h, l) \rVert_2^2\big]$, where $\mu_{\varepsilon}$ is the parameter of the UNet network except for the “classifier” module, $d_{l_h}$ is the image after adding noise at step $l_h$, and $l$ is the category label.
  • Backpropagation and parameter update: Calculate the gradient through backpropagation and update the network parameters using the optimizer.
  • Repeat the above steps until the predetermined number of training rounds is reached. The goal is to optimize the UNet network’s ability to predict noise and learn richer feature representations of the images.
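The noising and loss steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the closed-form noising rule and the linear beta schedule are taken from the standard denoising diffusion formulation rather than from this article, the feature vector is a hypothetical toy input, and a zero vector stands in for the UNet noise prediction. Backpropagation and the optimizer step are omitted.

```python
import math
import random

def alpha_bar(step, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta) over the first `step` steps (assumed linear betas)."""
    prod = 1.0
    for t in range(step):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def add_noise(d0, step, rng):
    """Return the noised sample at time step `step` and the noise that was added."""
    a = alpha_bar(step)
    noise = [rng.gauss(0.0, 1.0) for _ in d0]
    noised = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(d0, noise)]
    return noised, noise

def noise_loss(true_noise, predicted_noise):
    """Squared L2 distance between the added noise and the predicted noise."""
    return sum((t - p) ** 2 for t, p in zip(true_noise, predicted_noise))

rng = random.Random(0)
d0 = [0.2, -0.1, 0.7, 0.0]            # hypothetical flattened traffic features
noised, noise = add_noise(d0, step=500, rng=rng)
zero_guess = [0.0] * len(d0)          # stand-in for the UNet noise prediction
loss = noise_loss(noise, zero_guess)
```

In a real training loop, `zero_guess` would be replaced by the network output for `(noised, step, label)`, and the loss would be minimized over many sampled time steps.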
Then, we train the generator locally. During the initial communication, we freeze the “classifier” module of UNet and train the diffusion model, especially the “decoder” parameters of UNet, with local real data. If the current task is not the initial one, we also train with synthetic samples. At the end of this round of training for the client, we upload all parameters of the UNet network, including the classifier parameters and generator parameters, to the server. If the current task is not the initial one, we also upload the local synthetic samples obtained during the training of the local classifier. The server performs global average aggregation on all parameters of the UNet model. If the current task is not the initial one, we also train the classifier parameters with all local synthetic samples using the cross-entropy loss function. The global average aggregation is based on Formula (4), and the training of the classifier parameters with all local synthetic samples using the cross-entropy loss function is based on Formula (5):
$$\theta_G = \sum_{i=1}^{M} \frac{num_i^{\lambda}}{\sum_{i=1}^{M} num_i^{\lambda}}\,\theta_i^{\lambda}, \qquad \mu_G = \sum_{i=1}^{M} \frac{num_i^{\lambda}}{\sum_{i=1}^{M} num_i^{\lambda}}\,\mu_i^{\lambda} \tag{4}$$
$$l_c\big(\theta_G, D_{i=1}^{M} d_i^{s}\big) = c\big(\theta_G(D_{i=1}^{M} d_i^{s}),\, l\big) \tag{5}$$
where $\theta_G$ is the global classifier model; $\theta_i^{\lambda}$ is the “classifier” parameters of client i at task λ; and $\mu$ is the diffusion model, specifically the parameters of UNet excluding the “classifier” module. $num_i^{\lambda}$ is the amount of real data of client i at task λ, $\sum_{i=1}^{M} num_i^{\lambda}$ is the total amount of real data of all clients at task λ, $c$ is the cross-entropy loss, and $D_{i=1}^{M} d_i^{s}$ represents all local synthetic data.
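The global average aggregation of Formula (4) weights each client by its share of the real data. A minimal sketch in plain Python follows; the function name and the toy parameter vectors are hypothetical, and real UNet parameters would be tensors rather than short lists.

```python
def weighted_average(client_params, client_sizes):
    """Average parameter vectors, weighting client i by num_i / sum(num)."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[k] for params, n in zip(client_params, client_sizes)) / total
        for k in range(dim)
    ]

# Hypothetical flattened "classifier" parameters from three clients.
theta = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
num = [100, 100, 200]                 # amount of real data per client
theta_global = weighted_average(theta, num)
```

The same function applies unchanged to the diffusion-model parameters, since Formula (4) uses identical weights for both parameter groups.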
At this point, the federated training is complete, and the final prediction model is output. The parameters of the global model distributed by the server to the clients are those of the UNet network. The backbone network of the global model is the UNet network, with a classifier label output layer added after the middle layer of UNet. The classifier label output layer includes a normalization layer, a SiLU activation function, an AttentionPool2d layer, and a fully connected layer connected in sequence.
During the federated training iteration cycle, the objective of the method is to minimize the global loss function:
$$\min_{\theta^{\lambda}} \sum_{i=1}^{M} \frac{num_i^{\lambda}}{\sum_{i=1}^{M} num_i^{\lambda}} \sum_{\lambda=1}^{\max} l^{\lambda}\big(\theta_i(d_i^{\lambda}),\, l\big) \tag{6}$$
where $\sum_{\lambda=1}^{\max} l^{\lambda}\big(\theta_i(d_i^{\lambda}), l\big)$ represents the loss of client i’s “classifier” model on all task data, aiming to make the model performance of each client reach its optimum on all tasks. Further, the method designs a new UNet model that meets the needs of noise prediction and label classification, effectively reducing the model size.

4.2. Federated Knowledge Distillation Algorithm

To better optimize the network traffic identification process, we introduce and design a learning model based on federated knowledge distillation. Within the federated learning algorithm framework based on knowledge distillation technology designed in this method, we set the federated learning training process to contain a total of T rounds of iterations. Once the tth round of iteration has been reached and the clients have completed the local model update, the members of the client set x will upload the local model parameters and sample logits to the edge server together. Meanwhile, the members of the client set y will only upload the sample logits to the edge server. Subsequently, the edge server will perform the aggregation operation of the global model based on the received data, and the global model can be defined as follows.
$$W_{t+1} = \frac{\sum_{i=1}^{I} A_i' \times W_{t+1}^{i}}{\sum_{i=1}^{I} A_i'} \tag{7}$$
where $W$ represents the model parameters and $I$ is the total number of clients in the client set x in federated learning. For any client i, after completing the local training in the tth round, it will upload the updated model parameters, denoted as $W_{t+1}^{i}$. Meanwhile, $A_i'$ represents the size of the local dataset of the ith client. The server will perform the aggregation operation of the global soft labels (a non-binary representation of labels, usually in the form of probability distributions or confidence scores, rather than the traditional hard “0 or 1” labels) based on this information, and the specific process is as follows:
$$G_{t+1} = \frac{\sum_{i=1}^{I} A_i' \times G_{t+1}^{i}}{\sum_{i=1}^{I} A_i'} + \Delta G_{t+1} \tag{8}$$
where $G_{t+1}^{i}$ represents the average value of the sample logits uploaded by client i in the client set x after completing the local training in the tth round, while $\Delta G_{t+1}$ represents the local soft label gradient generated by the clients in the client set y at the end of the tth round, used to supplement and correct the global soft label information.
In the client set y, suppose that when the time threshold Te is reached, m clients in this group have uploaded the sample logits in the (r − 1)th round, and n clients have uploaded the sample logits in the rth round, then:
$$\Delta G_{t+1} = \frac{\sum_{i=1}^{u} A_i' \times G_{t}^{i}}{\sum_{i=1}^{u} A_i'} - \frac{\sum_{i=1}^{v} A_i' \times G_{t+1}^{i}}{\sum_{i=1}^{v} A_i'} \tag{9}$$
During the local training process of the clients, they not only train the local model but also generate sample logits. In this process, the local model update method of the clients is similar to that in traditional federated learning, that is, each client uses the gradient descent algorithm to update the model weights.
$$W_{t+1}^{i} = W_{t}^{i} - \lambda r_i \tag{10}$$
where $\lambda$ represents the learning rate of local model training, and $r_i$ represents the gradient of the current model parameters $W$. If client i generates $A'$ sample logits in the local training of the tth round, it will finally upload the average value of these sample logits, denoted as $G_{t+1}^{i}$, to the edge server, where:
$$G_{t+1}^{i} = \sum_{a=1}^{A'} G_{a}^{i} / A' \tag{11}$$
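The client-side computation described above, a gradient descent step on the local weights followed by averaging of the sample logits before upload, can be sketched as follows. This is a minimal illustration in plain Python; the function names and the toy values are hypothetical, and gradients are passed in directly rather than computed by a model.

```python
def sgd_step(weights, grads, lr):
    """One gradient descent step: new weight = old weight - lr * gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]

def mean_logits(sample_logits):
    """Average a list of per-sample logit vectors into the single vector that is uploaded."""
    count = len(sample_logits)
    return [sum(column) / count for column in zip(*sample_logits)]

# Hypothetical local state for one client.
weights = sgd_step([0.5, -1.0], grads=[1.0, -2.0], lr=0.1)
upload = mean_logits([[2.0, 0.0], [0.0, 4.0]])
```

Uploading one averaged logit vector instead of the full model is what lets clients in set y participate with very little communication.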
In the model training based on knowledge distillation in edge federated learning, both hard labels and soft labels can be utilized. The prediction model should not only be close to the hard labels of the local dataset samples but also to the corresponding soft labels. In the local training process of each client in federated learning, the model loss function is calculated to evaluate the performance of the model in approaching the hard labels and soft labels.
$$Loss(W) = \delta L_s(lb, G) + (1 - \delta) L_k(lb' \,\|\, G) \tag{12}$$
where $G$ represents the prediction result of the model, $\delta$ is a hyperparameter within the (0, 1) interval, $lb'$ represents the soft label, and $lb$ represents the hard label. $L_s(\cdot)$ represents the cross-entropy loss function, used to measure the difference between the model prediction and the hard label. Meanwhile, $L_k(\cdot)$ represents the divergence loss function, used to measure the distribution difference between the model prediction and the soft label. If the number of local data samples in the ith client is $d_i$, then the specific calculation method of the loss function can be derived as follows:
$$L_s(lb, G) = -\sum_{a=1}^{d_i} lb \log \frac{\exp(G_a)}{\sum_{a} \exp(G_a)} \tag{13}$$
$$L_k(lb' \,\|\, G) = -\sum_{a=1}^{d_i} lb' \log \frac{\exp(G_a) \sum_{a} \exp(lb'_a)}{\exp(lb'_a) \sum_{a} \exp(G_a)} \tag{14}$$
Through derivation, the optimization function of this work can be calculated as follows:
$$\min_{W} \Gamma(W) \triangleq \sum_{i=1}^{I} f_i \big[\delta L_s(W) + (1 - \delta) L_k(W)\big], \quad \text{s.t.}\ f_i = \frac{A_i'}{\sum_{i=1}^{I} A_i'} \tag{15}$$
From the optimization function, we can observe that the hyperparameter $\delta$ plays a decisive role in the knowledge distillation process, regulating the proportion of hard label loss and soft label loss in the total loss. Therefore, the value of $\delta$ has a profound impact on the training effect of the model.
In the initial stage and early period of model training, since the knowledge contained in the soft labels is relatively limited, the information of hard labels plays an absolute dominant role in improving the model’s performance. However, in the later stage of model training, when the model has fully absorbed the knowledge of hard labels and begins to gradually converge, the introduction of soft label knowledge will further promote the improvement of model performance.
Based on the above considerations, we designed a dynamic value assignment method for the hyperparameter δ. In the early stage of model training, we set the value of δ relatively large to enable the model to make more effective use of the information of hard labels. Subsequently, we gradually decrease the value of δ until it reaches a minimum threshold. This design aims to enable the model to effectively utilize the information of hard labels during the training process and introduce the knowledge of soft labels at the appropriate time, thereby enhancing the performance of the model.
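One way to realize such a dynamic δ is sketched below. The text only states that δ starts large and decreases to a minimum threshold, so the linear decay, the starting value 0.9, the floor 0.3, and the per-round decrement 0.05 are all assumptions chosen for illustration; the hard-label and soft-label losses are passed in as precomputed numbers.

```python
def delta_schedule(round_idx, delta_max=0.9, delta_min=0.3, decay=0.05):
    """Decrease delta linearly per round and clamp it at delta_min (assumed schedule)."""
    return max(delta_min, delta_max - decay * round_idx)

def total_loss(hard_loss, soft_loss, delta):
    """Weighted sum: delta * hard-label loss + (1 - delta) * soft-label loss."""
    return delta * hard_loss + (1.0 - delta) * soft_loss

# Delta values over 50 hypothetical training rounds.
schedule = [delta_schedule(t) for t in range(50)]
example = total_loss(hard_loss=2.0, soft_loss=1.0, delta=schedule[0])
```

With these values, δ reaches its floor after 12 rounds, after which the soft-label term carries 70% of the loss weight; other decay shapes (for example exponential) would fit the description equally well.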

4.3. Discussion

  • Explore symmetric networks to optimize network intrusion detection
To optimize the network intrusion detection problem, this work studies a symmetric federated optimization, which is a distributed machine learning paradigm that combines the advantages of symmetric networks and the characteristics of federated learning. The core of this paradigm is to improve the efficiency, security, and scalability of federated learning through symmetric architecture design. By leveraging the efficient pattern recognition capability of symmetrical networks, repetitive or similar attack patterns can be quickly identified through their structural symmetry, which is especially suitable for detecting known attack behaviors (e.g., DDoS attacks, port scanning, etc.). Its symmetric topology can process multi-node data in parallel and improve the detection efficiency. On the other hand, the regular structure of symmetric networks reduces parameter redundancy, and when encrypted communication data are detected, the decryption process can be simplified through a shared symmetric key mechanism, reducing the computational burden of real-time analysis. The balanced distribution property of the symmetric network can disperse the risk of a single point of attack, for example, the dynamic rotation of symmetric encryption keys can improve the system fault tolerance when resisting man-in-the-middle attacks. Furthermore, its structural consistency is helpful to locate abnormal nodes quickly. Our original intention is to deploy this method so that the modular design based on a symmetrical network can be horizontally extended to cover multiple critical nodes in large complex networks and realize global security state monitoring. At the same time, this symmetrical unified architecture enables security policies (such as access control rules, response mechanisms) to be updated synchronously in batches, reducing the complexity of management. 
However, intrusion detection systems based on symmetric networks still have limited ability to deeply detect encrypted traffic (such as SSL/TLS) and need to combine asymmetric encryption technology to achieve secure key distribution. Future research can explore the deep combination of a hybrid encryption model and dynamic symmetric network.
  • Explore federated learning to optimize network intrusion detection
In this work, the distributed nature of federated learning provides privacy and security advantages for network intrusion detection. Local training and parameter aggregation avoid the centralized transfer of raw data, which reduces the risk of data leakage. Combined with differential privacy and homomorphic encryption, the approach further resists threats such as membership inference attacks and model inversion from shared parameters. The distributed architecture reduces the single-point attack surface, and even if some nodes are compromised, secure aggregation helps the global model remain robust. Each local model can respond quickly to abnormal behavior, which avoids the latency of traditional centralized analysis. This enables cross-institutional collaborative modeling without sharing raw data in data-sensitive environments. At the same time, considering the resource-constrained characteristics of IoT devices, the computational load is optimized and the communication overhead is reduced. Federated knowledge distillation can support the dynamic addition of new nodes (such as new branches), so that local models can be deployed quickly to meet the requirements of real-time network intrusion detection. However, open problems remain, such as handling non-independent and identically distributed (non-IID) data and detecting malicious parties; these need to be addressed by combining federated optimization algorithms with a dynamic trust evaluation mechanism.

5. Results

The UNSW-NB15 and NSL-KDD datasets used in this work are widely employed intrusion detection datasets. UNSW-NB15 contains 2.54 million records, with attack types classified into nine categories: DoS, Probe, U2R, R2L, Data, Fishing, Macro, Worm, and Backdoor [27]. Compared with KDD99 [28] and NSL-KDD [29], the UNSW-NB15 dataset has a larger total volume of data and more types of attacks. Moreover, although the UNSW-NB15 traffic is simulated, its attack types are closer to real scenarios, which enables researchers to evaluate the performance of intrusion detection models more accurately. To verify the effectiveness and robustness of the IDS algorithm, the NSL-KDD dataset is also used for multi-class experiments. On this dataset, five-class classification is conducted, covering the normal category and the four attack categories of DoS, Probe, U2R, and R2L, and the performance is verified through the confusion matrix. Each dataset is divided into a training set and a test set; the datasets used are summarized in Table 1. So that the results are comparable, all experiments were conducted on the same Linux workstation, which was equipped with an Intel Xeon processor, three 3.4 GHz CPUs, and 512 GB of memory. To simulate the scenario where multiple users jointly participate in the training of a federated model, the deep model is set to have five participants. The dataset is randomly divided into five parts and sent to the five parties, namely, a, b, c, d, and e. Each party then trains its model locally, and all parties conduct 50 rounds of iterative training simultaneously. This work tests the performance of the method on the UNSW-NB15 dataset in terms of accuracy, precision, recall, and F1-score.
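The five-party setup can be illustrated with a short sketch. It assumes that the server combines local parameters by sample-weighted averaging (the FedAvg rule), which is an assumption made for illustration and not necessarily the aggregation used in this work; the function names and the flat list representation of model weights are likewise hypothetical.

```python
import random


def split_among_parties(samples, parties=("a", "b", "c", "d", "e"), seed=0):
    """Shuffle the samples and deal them out to the parties as evenly as possible."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Party i receives every len(parties)-th sample, starting at offset i.
    return {p: shuffled[i::len(parties)] for i, p in enumerate(parties)}


def federated_average(local_weights, local_sizes):
    """Average weight vectors, weighting each party by its number of samples.

    Assumption: FedAvg-style aggregation over flat lists of floats.
    """
    total = sum(local_sizes)
    dim = len(local_weights[0])
    return [
        sum(w[j] * n for w, n in zip(local_weights, local_sizes)) / total
        for j in range(dim)
    ]


if __name__ == "__main__":
    shards = split_among_parties(range(100))
    print({p: len(s) for p, s in shards.items()})
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a full training loop, each of the 50 rounds would run local training on every shard and then call the aggregation step once; local training is omitted here because it depends on the model.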
The evaluation metrics are defined in Table 2, where tp is the number of attack samples correctly classified as attacks, fp is the number of normal samples wrongly classified as attacks, fn is the number of attack samples wrongly classified as normal, and tn is the number of normal samples correctly classified as normal.
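As a worked illustration of these definitions, the sketch below computes the four metrics from the confusion counts. The function names and the example counts are hypothetical; only the formulas follow Table 2.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, and accuracy from confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
        "accuracy": (tp + tn) / total if total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    print(classification_metrics(tp=80, fp=20, fn=10, tn=90))
```

As a consistency check, inserting the first row of Table 3 (P = 0.5778, R = 0.5652) into `f_measure` gives approximately 0.5714, which matches the F value reported there.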
When training the classification and recognition model, the first convolutional layer is set to have 32 convolution kernels. The size of each kernel is 6 × 6, and the stride is 1. The max pooling window is 2 × 2 with a stride of 2. The second convolutional layer has 64 kernels, and its other parameters are the same as those of the first layer. The Adam optimizer is used to adapt the update step size dynamically. As can be seen in Table 3, the proposed network intrusion detection method achieves better traffic-type classification results as the number of federated optimization rounds increases. As shown in Table 4, the overall training time stays between roughly 30 min and 58 min for 25 to 50 rounds, while ensuring data security. As the data size increases, the method still maintains a relatively short training duration and responds within a short period of time. The confusion matrix in Table 5 demonstrates the capability of this method in multi-class classification. Subsequently, as shown in Figure 3, we compare the proposed method with three other methods, calculating the AUC values from the ROC curves of the four methods to verify the performance more comprehensively. On the whole, the detection method proposed in this work can effectively identify traffic data in complex networks. This is mainly because the symmetric federated learning model reduces the negative impact of data islands (a state in which data cannot be shared due to scattered storage, different standards, or permission barriers) on model accuracy.
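To make the layer settings above concrete, the following sketch traces the feature-map size through the two convolution and pooling stages. Several points are assumptions not stated in the text: no padding is used, the input is a single-channel square matrix, a pooling layer follows each convolutional layer, and the input size of 32 is purely hypothetical.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def feature_map_shapes(input_size):
    """Trace (name, channels, height, width) through conv-pool-conv-pool.

    Kernel counts, kernel sizes, and strides follow the text; the absence of
    padding and the pooling after the second convolution are assumptions.
    """
    layers = [
        ("conv1", 32, 6, 1),    # 32 kernels, 6 x 6, stride 1
        ("pool1", None, 2, 2),  # 2 x 2 max pooling, stride 2
        ("conv2", 64, 6, 1),    # 64 kernels, otherwise as conv1
        ("pool2", None, 2, 2),
    ]
    size, channels = input_size, 1
    shapes = []
    for name, out_channels, kernel, stride in layers:
        size = out_size(size, kernel, stride)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


if __name__ == "__main__":
    for shape in feature_map_shapes(32):  # 32 is a hypothetical input size
        print(shape)
```

Under these assumptions, a 32 × 32 input shrinks to 27, 13, 8, and finally 4 along each axis, so the final feature map has 64 × 4 × 4 = 1024 values.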
Unlike traditional deep feature learning, this work improves the generalization ability and accuracy of the UNet model through collaborative training across multiple clients under the federated learning framework, making full use of the data advantages of each client. This combination enables the model to better adapt to target traffic data of different scales, thereby improving the accuracy of traffic data classification tasks. Furthermore, the UNet model under the federated learning framework can adapt to changes in traffic data distribution and network scenarios, and an attention mechanism can be introduced to dynamically adjust the feature weights and receptive field size of the model so as to better integrate multi-scale features and improve robustness. In conclusion, more accurate traffic classification results can be obtained with the help of the strong representation ability of the symmetric deep network.
Figure 4 shows the results of a network traffic analysis using seven different methods under four different traffic scales in the UNSW-NB15 dataset. By applying the proposed federated knowledge distillation method based on a symmetrical network, a performance comparison is conducted with six other advanced and classical methods, including those based on traditional federated learning, stochastic gradient descent, cross-entropy loss function, Signature-Based Intrusion Detection (SID) [30], Anomaly-Based Intrusion Detection (AID) [31], and network-based intrusion detection (NID) [32].
Figure 4a–d presents the average results obtained from multiple sets of experiments. The results show that the proposed symmetric federated model achieves good results at each scale of network traffic classification. Compared with other methods, the proposed method maintains high accuracy as the number of training rounds increases. In terms of accuracy, the proposed method achieves 0.9355 when the data size is 30% of the test set. When the data size is 50% of the test set, the proposed method obtains the best accuracy (0.9401). When the data size is 70% of the test set, the best accuracy (0.9461) is obtained, and the result is 0.9468 at 90% of the test set, which is the best result of IDS across all data sizes. Although individual traffic identification cases are weaker at some data sizes, overall the method obtains relatively good results in all four cases. This supports the effectiveness of the proposed symmetric federated model in network traffic analysis. The main reason is that the symmetric federated knowledge distillation model proposed in this work has stronger generalization performance. The teacher model usually has a stronger knowledge representation ability and deep learning ability. Through federated knowledge distillation, the student model can learn rich knowledge and data distribution information from the teacher model, thereby improving its generalization ability. This allows the student model to better adapt to the different data distributions and task requirements of network traffic analysis. Furthermore, in many network traffic analysis scenarios, it may be necessary to transfer a model trained in one domain to another domain. Through federated transfer learning and knowledge distillation, UNet models can be transferred and adapted between domains so as to realize cross-domain learning and knowledge sharing. This helps to extend the application scope of the model and improve its flexibility and practicability.
For large-scale network intrusion detection, the accuracy of network traffic data recognition plays a crucial role in determining the security level of the system. For deep intrusion detection models based on federated learning, the network structure, especially the number of convolutional layers, remains an open issue. This group of experiments compares the results of different network structures under different test scales on the UNSW-NB15 dataset, with the scales set to 60%, 75%, and 90% of the test set. As shown in Figure 5a, the average accuracy is better when the number of convolutional layers is three, which suggests that simply increasing the number of convolutional layers does not necessarily lead to better traffic recognition results. In Figure 5b, when the number of hidden layers is two or four, the proposed method is not significantly better than the other methods, although overall it still outperforms them. In Figure 5c, where the target test dataset is large, the advantage of the proposed method is most obvious with three hidden layers, whereas with four hidden layers its result is worse than that of the federated learning method. The overall advantage arises mainly because the symmetric network designed in this work has structural symmetry and balance, which helps to extract and utilize the deep features in the traffic data. Combined with federated knowledge distillation, the rich knowledge in the large teacher model can be transferred to the small student model, thereby maintaining or improving the accuracy of traffic classification. By learning from both the soft labels of the teacher model and the hard labels, the student model can learn more data distribution information and improve classification performance.
Federated learning allows models to be trained collaboratively on the data of multiple clients, thereby learning a more extensive data distribution. Combined with the knowledge distillation technology, the symmetrical network can further enhance its generalization ability, enabling it to adapt to different traffic types and scenarios.
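The way a student model can learn from soft and hard labels together may be sketched as a weighted loss. The form below is one common formulation and is an assumption, not the exact loss of this work: the hyperparameter δ weights the hard-label cross-entropy, 1 − δ weights the KL divergence to the teacher's softened outputs, and the `temperature` parameter and all function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]


def distillation_loss(student_logits, teacher_logits, hard_label, delta,
                      temperature=2.0):
    """delta * CE(student, hard label) + (1 - delta) * KL(teacher || student).

    Assumption: the KL term compares temperature-softened distributions and
    is not rescaled by temperature squared.
    """
    # Hard-label term: cross-entropy against the ground-truth class index.
    p_student = softmax(student_logits)
    hard_term = -math.log(p_student[hard_label])
    # Soft-label term: KL divergence from the teacher's softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft_term = sum(
        t * math.log(t / s) for t, s in zip(q_teacher, q_student) if t > 0
    )
    return delta * hard_term + (1.0 - delta) * soft_term


if __name__ == "__main__":
    print(distillation_loss([2.0, 0.5, -1.0], [3.0, 0.0, -2.0], 0, delta=0.7))
```

With δ close to 1 the loss is dominated by the hard labels, and as δ decreases the teacher's soft labels contribute more, which is consistent with the dynamic assignment of δ described in the method.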

6. Conclusions

This work studies a combination of symmetric federated knowledge distillation and intrusion detection in complex network environments and proposes a complex network intrusion detection method based on symmetric federation optimization. By constructing the UNet symmetrical network model, deep feature learning and data reconstruction are realized. At the same time, the federated deep learning optimization algorithm is combined to make it more suitable for complex network environments and improve the effectiveness of detection. The experimental results show that the method proposed in this work can accurately classify the network traffic data, achieve effective network traffic identification, and thus ensure the safe operation of the system.
In future work, we will expand the experimental scale and adjust the number of hidden layers in the network to further verify the effectiveness of the IDS method. In addition, although the current symmetric network generation models are effective, they still have some limitations in practical applications. Future research can be devoted to optimizing these generation algorithms, improving the degree of symmetry of the generated networks, and making them more consistent with the properties of real networks. Furthermore, by introducing new constraints or optimization objectives under different requirements, networks with specific symmetry characteristics can be generated to meet the needs of different network application scenarios.

Author Contributions

Conceptualization, L.W. and C.W.; methodology, L.W. and C.W.; software, X.R. and C.W.; validation, L.W., X.R. and C.W.; formal analysis, L.W. and C.W.; investigation, L.W.; resources, L.W.; data curation, X.R. and C.W.; writing—original draft preparation, X.R. and C.W.; writing—review and editing, X.R. and C.W.; visualization, X.R. and C.W.; supervision, L.W.; project administration, L.W.; funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the General Project of Chongqing Technical Innovation and Application Development Special Foundation (CSTB2022TIAD-GPX0021), the General Project of Chongqing Natural Science Foundation (CSTB2022NSCQ-MSX1421), and the Chongqing Technology Innovation and Application Development Project (CSTB2022TIAD-CUX0001).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

We thank the anonymous reviewers for their comments and suggestions, which helped to improve the quality of this paper and the presentation of the results.

Conflicts of Interest

Author Chunyi Wu is employed by the JINSHAN Science & Technology (Group) Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Khan, M.; Ghafoor, L. Adversarial Machine Learning in the Context of Network Security: Challenges and Solutions. J. Comput. Intell. Robot. 2024, 4, 51–63.
  2. Nazir, R.; Laghari, A.A.; Kumar, K.; David, S.; Ali, M. Survey on wireless network security. Arch. Comput. Methods Eng. 2022, 29, 1591–1610.
  3. Bakhsh, A.; Khan, M.A.; Ahmed, F.; Alshehri, M.S.; Ali, H.; Ahmad, J. Enhancing IoT network security through deep learning-powered Intrusion Detection System. Internet Things 2023, 24, 100936.
  4. Fang, W.; Tan, X.; Wilbur, D. Application of intrusion detection technology in network safety based on machine learning. Saf. Sci. 2020, 124, 104604.
  5. Liang, W.; Xiao, L.; Zhang, K.; Tang, M.; He, D.; Li, K.-C. Data fusion approach for collaborative anomaly intrusion detection in blockchain-based systems. IEEE Internet Things J. 2021, 9, 14741–14751.
  6. Shayganmehr, M.; Kumar, A.; Luthra, S.; Garza-Reyes, J.A. A framework for assessing sustainability in multi-tier supply chains using empirical evidence and fuzzy expert system. J. Clean. Prod. 2021, 317, 128302.
  7. Goel, A.; Goel, A.K.; Kumar, A. The role of artificial neural network and machine learning in utilizing spatial information. Spat. Inf. Res. 2023, 31, 275–285.
  8. Liaqat, S.; Dashtipour, K.; Arshad, K.; Assaleh, K.; Ramzan, N. A hybrid posture detection framework: Integrating machine learning and deep neural networks. IEEE Sens. J. 2021, 21, 9515–9522.
  9. Dong, S.; Wang, X.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2021, 40, 100379.
  10. Kan, X.; Fan, Y.; Fang, Z.; Cao, L.; Xiong, N.N.; Yang, D.; Li, X. A novel IoT network intrusion detection approach based on adaptive particle swarm optimization convolutional neural network. Inf. Sci. 2021, 568, 147–162.
  11. Zhang, Z.; Tao, X.; Xiong, X. A Research of Investment Cooperation Mechanism of Network Intrusion Detection and Defense Subsystems in the Supply Chain. Ind. Eng. J. 2019, 22, 1.
  12. Oliveira, N.; Praça, I.; Maia, E.; Sousa, O. Intelligent cyber attack detection and classification for network-based intrusion detection systems. Appl. Sci. 2021, 11, 1674.
  13. Kumar, R.; Kumar, P.; Tripathi, R.; Gupta, G.P.; Garg, S.; Hassan, M.M. A distributed intrusion detection system to detect DDoS attacks in blockchain-enabled IoT network. J. Parallel Distrib. Comput. 2022, 164, 55–68.
  14. Kunang, Y.N.; Nurmaini, S.; Stiawan, D.; Suprapto, B.Y. Attack classification of an intrusion detection system using deep learning and hyperparameter optimization. J. Inf. Secur. Appl. 2021, 58, 102804.
  15. Pinto, A.; Herrera, L.-C.; Donoso, Y.; Gutierrez, J.A. Survey on intrusion detection systems based on machine learning techniques for the protection of critical infrastructure. Sensors 2023, 23, 2415.
  16. Thirimanne, S.P.; Jayawardana, L.; Yasakethu, L.; Liyanaarachchi, P.; Hewage, C. Deep neural network based real-time intrusion detection system. SN Comput. Sci. 2022, 3, 145.
  17. Long, Y.-S.; Zhai, Z.-M.; Tang, M.; Liu, Y.; Lai, Y.-C. A rigorous and efficient approach to finding and quantifying symmetries in complex networks. arXiv 2021, arXiv:2108.02597.
  18. Kontoyiannis, I.; Lim, Y.H.; Papakonstantinopoulou, K.; Szpankowski, W. Symmetry and the entropy of small-world structures and graphs. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021; pp. 3026–3031.
  19. Smidt, T.E.; Geiger, M.; Miller, B.K. Finding symmetry breaking order parameters with euclidean neural networks. Phys. Rev. Res. 2021, 3, L012002.
  20. Dervisevic, E.; Tankovic, A.; Fazel, E.; Kompella, R.; Fazio, P.; Voznak, M.; Mehic, M. Quantum Key Distribution Networks-Key Management: A Survey. ACM Comput. Surv. 2025, 57, 1–36.
  21. Kapoor, J.; Thakur, D. Analysis of symmetric and asymmetric key algorithms. In ICT Analysis and Applications; Springer: Singapore, 2022; pp. 133–143.
  22. Liu, Z.; Luo, X.; Zhou, M. Symmetry and graph bi-regularized non-negative matrix factorization for precise community detection. IEEE Trans. Autom. Sci. Eng. 2023, 21, 1406–1420.
  23. Yaqoob, M.M.; Alsulami, M.; Khan, M.A.; Alsadie, D.; Saudagar, A.K.J.; AlKhathami, M.; Khattak, U.F. Symmetry in privacy-based healthcare: A review of skin cancer detection and classification using federated learning. Symmetry 2023, 15, 1369.
  24. Pan, Z.; Wu, Q.; Jiang, H.; Xia, M.; Luo, X.; Zhang, J.; Lin, Q.; Rühle, V.; Yang, Y.; Lin, C.Y.; et al. Llmlingua-2: Data distillation for efficient and faithful task-agnostic prompt compression. arXiv 2024, arXiv:2403.12968.
  25. Shin, H.; Choi, D.-W. Teacher as a lenient expert: Teacher-agnostic data-free knowledge distillation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 14991–14999.
  26. Ni, J.; Liu, Z.; Wang, S.; Xi, M.; Xi, W. TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation. arXiv 2025, arXiv:2502.15016.
  27. Sallam, Y.F.; El-Nabi, S.A.; El-Shafai, W.; Ahmed, H.E.-D.H.; Saleeb, A.; El-Bahnasawy, N.A.; El-Samie, F.E.A. Efficient implementation of image representation, visual geometry group with 19 layers and residual network with 152 layers for intrusion detection from UNSW-NB15 dataset. Secur. Priv. 2023, 6, e300.
  28. Al-Daweri, M.S.; Ariffin, K.A.Z.; Abdullah, S.; Senan, M.F.E.M. An analysis of the KDD99 and UNSW-NB15 datasets for the intrusion detection system. Symmetry 2020, 12, 1666.
  29. Abrar, I.; Ayub, Z.; Masoodi, F.; Bamhdi, A.M. A machine learning approach for intrusion detection system on NSL-KDD dataset. In Proceedings of the 2020 International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 10–12 September 2020; pp. 919–924.
  30. Ditcheva, B.; Fowler, L. Signature-based Intrusion Detection; University of North Carolina: Chapel Hill, NC, USA, 2005.
  31. Jyothsna, V.; Prasad, V.V.R. A review of anomaly based intrusion detection systems. Int. J. Comput. Appl. 2011, 28, 26–35.
  32. Vigna, G.; Kemmerer, R.A. NetSTAT: A network-based intrusion detection system. J. Comput. Secur. 1999, 7, 37–71.
Figure 1. The view of the system architecture. The color of the neural network model in the local model is to represent the intensity of the network structure. Servers and cloud data centers are colored in gray, and the data sets are also colored in gray.
Figure 2. System architecture of the proposed federated learning method based on improved UNet network. Green, red, orange and blue represent the neural network model, while gray represents the functional modules and the execution operations.
Figure 3. Comparison of ROC curves of the four methods for network traffic analysis: AUC (Fed) = 0.7601, AUC (SGD) = 0.7960, AUC (Cross) = 0.8151, AUC (IDS) = 0.8682.
Figure 4. Performance comparison of seven different methods for network traffic analysis under different data sizes.
Figure 5. Comparison of accuracy when the number of hidden layers is two, three, and four.
Table 1. Overview of the datasets used in the experiment.
| Dataset | Amount of Data | Normal Sample (%) | Traffic Class | Number of Features |
|---|---|---|---|---|
| UNSW-NB15 | 700,001 | 96.83 | 10 | 43 |
| NSL-KDD | 48,517 | 51.88 | 5 | 41 |
Table 2. Measuring accuracy of intrusion data classification.
| Indexes | Calculation Formulas |
|---|---|
| Precision | $P = \frac{tp}{tp + fp}$ |
| Recall | $R = \frac{tp}{tp + fn}$ |
| F-measure | $F = \frac{2 \times P \times R}{P + R}$ |
| Accuracy | $A = \frac{tp + tn}{Total}$ |
Table 3. The effect of intrusion detection on UNSW-NB15.
| The Number of Rounds of Federated Learning | P | R | F | A |
|---|---|---|---|---|
| 25 | 0.5778 | 0.5652 | 0.5714 | 0.5142 |
| 30 | 0.6383 | 0.6250 | 0.6316 | 0.5749 |
| 35 | 0.7447 | 0.7292 | 0.7369 | 0.6964 |
| 40 | 0.7660 | 0.7500 | 0.7579 | 0.7206 |
Table 4. Effectiveness of traffic intrusion detection model IDS on UNSW-NB15.
| Data Size | Training Time | Testing Time | Response Time |
|---|---|---|---|
| 30% | 30.2 min | 6.9 s | 9.7 s |
| 50% | 42.7 min | 7.7 s | 11.3 s |
| 70% | 51.6 min | 9.1 s | 12.9 s |
| 90% | 58.5 min | 10.2 s | 14.4 s |
Table 5. Confusion matrix based on the experiment with symmetric federated optimization model on NSL-KDD.
PredictionNormalDoSProbeR2LU2R
Actual Value
Normal969120000
DoS0567842174
Probe0321055127
R2L131120719579
U2R0051319
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, L.; Ren, X.; Wu, C. An Intrusion Detection Method Based on Symmetric Federated Deep Learning in Complex Networks. Symmetry 2025, 17, 952. https://doi.org/10.3390/sym17060952
