Article

Federated Intrusion Detection via Unidirectional Serialization and Multi-Scale 1D Convolutions with Attention Reweighting

1 Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310000, China
2 College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China
* Author to whom correspondence should be addressed.
Future Internet 2026, 18(3), 117; https://doi.org/10.3390/fi18030117
Submission received: 30 January 2026 / Revised: 23 February 2026 / Accepted: 24 February 2026 / Published: 26 February 2026
(This article belongs to the Section Cybersecurity)

Abstract

Contemporary intrusion detection, deployed across distributed organizations and edge networks, increasingly requires high-performing models that can be trained without centralizing sensitive traffic logs. This study presents a lightweight federated intrusion detection framework that integrates (i) unidirectional serialization to convert tabular flow records into short sequences, (ii) multi-scale one-dimensional convolutions to capture heterogeneous temporal–statistical patterns at different receptive fields, and (iii) an attention-based reweighting module that emphasizes informative feature channels prior to classification. A sample-size-weighted FedAvg aggregation protocol is used to train a global detector without transferring raw data. Experiments on three widely used benchmarks (UNSW-NB15, KDD Cup 99, and NSL-KDD) under multiple client configurations report consistently high detection effectiveness, with peak accuracies of 99.38% (UNSW-NB15), 99.86% (KDD Cup 99), and 99.02% (NSL-KDD), alongside strong precision, recall, and F1 scores. In addition, the proposed framework is quantitatively benchmarked on UNSW-NB15 against two recent federated intrusion detection baselines, FedMSP-SPEC and a multi-view federated CAE-NSVM model, demonstrating improvements of more than 10 percentage points in macro F1-score while retaining a compact architecture. The manuscript further specifies a concrete threat model, clarifies the client data partitioning strategy and Non-IID quantification, and provides a reproducibility protocol (hyperparameters, random seeds, and evaluation procedures) to facilitate independent verification.

Graphical Abstract

1. Introduction

Network intrusion detection systems (NIDS) remain a critical defensive mechanism for enterprise and critical-infrastructure networks, where encrypted payloads, multi-layered protocols, and rapidly evolving attack patterns complicate the design of static rule sets and signature-based tools. Data-driven and learning-based detectors built on flow-level telemetry and enriched statistical descriptors have therefore become central to contemporary intrusion detection practice [1,2,3]. In parallel, the operational context has shifted towards heterogeneous distributed settings in which security appliances, gateways, and organizational domains collect data locally under strict privacy and compliance constraints.
Federated learning (FL) provides a natural paradigm for such environments by coordinating local training across multiple clients and aggregating model updates without centralizing raw traffic logs [4,5,6,7]. In principle, FL can enable cross-silo collaboration across organizations or administrative domains while respecting regulatory constraints and internal data-governance policies [4,5,6,7]. However, practical deployments in intrusion detection must contend with classical FL challenges—client data heterogeneity, communication constraints, and privacy risks—under the additional constraints of security analytics workloads [4,6,7].
First, client data distributions in network security are almost never independent and identically distributed (IID). Different organizations, subnets, or service roles observe distinct protocol mixes, address spaces, and attack profiles, often with significant label skew. This Non-IID nature of local data can degrade FL convergence and global performance if not explicitly modeled or validated [7]. Second, resource constraints in edge and gateway devices limit feasible model complexity and training workloads, amplifying the need for architectures that are not only accurate but also parameter-efficient and communication-efficient [5]. Third, while FL reduces direct exposure of raw data, model updates may still leak sensitive information and remain vulnerable to inference or poisoning attacks, which has motivated a rapidly growing literature on privacy-preserving and robust FL [8,9,10,11,12]. Finally, many existing privacy wrappers and secure aggregation schemes introduce additional computational and communication overhead that must be balanced against real-time detection and operational budgets [11,12].
In parallel, deep learning-based intrusion detection has progressed from conventional fully connected architectures to more sophisticated convolutional, recurrent, and attention-based models. Surveys summarize the evolution from classical machine learning-based IDS to modern deep learning systems on canonical benchmarks [1,2,3]. Recent work has explored specialized architectures such as GAN-based minority oversampling for intrusion detection [13], deep hybrids that combine deformable convolutions with bidirectional recurrent layers and attention [14], and residual networks with attention blocks for intrusion detection in IoT settings [15]. These models often achieve very high accuracy on canonical datasets but are typically trained in centralized settings and may have substantial computational footprints.
This work focuses on designing and empirically characterizing a federated intrusion detection framework that remains competitive with strong centralized baselines while explicitly addressing these constraints. At the architectural level, we encode tabular flow records as unidirectional sequences and process them with multi-scale one-dimensional (1D) convolutions, followed by a channel-wise attention module that reweights salient features before classification. At the training-protocol level, we adopt sample-size-weighted FedAvg aggregation and construct Non-IID client datasets using a controllable Dirichlet-based partitioning scheme that approximates label skew in edge deployments [4,7]. Throughout, we emphasize explicit threat modeling, realistic Non-IID scenarios, and transparent reporting of hyperparameters and evaluation protocols.
The main contributions of this study are fourfold:
(1) Serialized architecture for flow-based federated IDS. We propose a unidirectional serialization scheme that converts tabular flow records into short ordered sequences and feeds these into multi-scale 1D convolutional filters. This design captures local correlations among normalized flow features without resorting to heavy sequence models (e.g., recurrent or transformer architectures), yielding a compact model suitable for edge deployment.
(2) Attention-based channel reweighting for heterogeneous traffic features. We introduce a channel-wise attention module that learns to emphasize informative feature channels prior to classification. In contrast to earlier deep learning NIDS that rely either on fixed feature selection or unweighted convolutions [1,2,3,13,14,15], this mechanism adapts to distributional differences across clients and datasets while maintaining a modest parameter count.
(3) Federated training under explicitly quantified Non-IID partitions. We train the proposed model under a cross-silo FL setting with sample-size-weighted FedAvg and construct client datasets using Dirichlet-based label-skew partitioning. We quantify Non-IID severity via the Jensen–Shannon divergence between local and global label distributions [7,16] and report performance under multiple client configurations, thereby making explicit the degree of heterogeneity under which the reported accuracies are obtained.
(4) Reproducibility protocol and deployment-oriented analysis. Beyond reporting standard detection metrics, we detail the model configuration, hyperparameters, random seeds, data splitting ratios, and evaluation protocol, including client sampling, communication rounds, and stopping criteria. We further analyze training time and communication behavior as the number of clients grows, and we discuss deployment considerations and limitations in terms of outdated benchmarks, potential dataset artifacts, and unmodeled adversarial threats [17,18]. The term deployment-oriented thus refers to these concrete analyses of training time, communication behavior, and limitations, rather than serving as a generic label.
Compared with existing FL-based intrusion detection frameworks that typically apply centralized CNNs or autoencoders to tabular features, or focus on specific IoT settings [19,20,21,22,23,24,25,26,27], this work concentrates on (i) a simple yet effective serialization plus 1D-CNN-with-attention architecture adapted to flow-based datasets; (ii) a clearly specified Non-IID scenario constructed via Dirichlet label skew and quantified through Jensen–Shannon divergence; and (iii) an explicit threat model and limitations discussion that tie together privacy, robustness, and benchmarking concerns. The resulting framework demonstrates that it is possible to maintain near state-of-the-art accuracy on widely used benchmarks without centralizing traffic logs, while also providing a transparent experimental design that supports independent replication.

2. Related Work

2.1. Deep Learning for Intrusion Detection

A large body of work has applied classical machine learning and deep learning to network intrusion detection. Early surveys summarize decision trees, support vector machines, and ensemble methods applied to canonical datasets [1,2]. With the rise of deep learning, convolutional and recurrent neural networks, autoencoders, and hybrid architectures have been used to extract complex patterns from flow-level features and raw traffic [3].
Recent work has explored more specialized architectures. For example, GAN-based minority oversampling aims to alleviate class imbalance in intrusion detection by synthesizing realistic minority samples [13]. Hybrid models combining deformable convolutions, bidirectional LSTMs, and attention mechanisms further improve detection performance by jointly capturing local spatial structure and longer-range temporal dependencies [14]. Residual networks with attention blocks have also been proposed for intrusion detection in IoT settings, reporting strong performance on modern datasets [28]. These developments underscore the effectiveness of attention-augmented and multi-scale architectures, but they are largely evaluated in centralized settings and can be relatively heavy for edge devices.
The framework in this paper builds upon these insights by adopting a multi-scale 1D convolutional architecture enriched with a channel-wise attention mechanism, and by adapting it to a federated setting with serialized flow features and parameter-efficient design intended for cross-silo deployments.

2.2. Benchmark Datasets and Evaluation Pitfalls

Standard datasets such as KDD Cup 99, NSL-KDD, and UNSW-NB15 remain widely used for benchmarking intrusion detection systems. However, several surveys have pointed out that these benchmarks are outdated, exhibit class imbalance, and contain artifacts that can inflate performance metrics [17,18]. For example, duplicated records, unrealistic attack mixtures, and static network topologies may bias models toward memorizing superficial patterns rather than learning robust discriminative features.
These issues have at least three implications for this study. First, reported accuracies, especially when close to 100%, must be interpreted cautiously and complemented with additional metrics (precision, recall, F1-score, and confusion matrices) to better capture behavior on minority classes. Second, train–test splits should be constructed to minimize information leakage, for instance by ensuring that identical flows do not appear in both sets. Third, empirical findings must be positioned as relative comparisons under shared limitations rather than claims of absolute deployment readiness. These considerations are incorporated into the design and interpretation of the experiments and revisited in the discussion.

2.3. Federated Learning for Security Analytics

Federated learning has been investigated for a variety of security and network-management tasks, including intrusion detection, malware detection, and privacy-preserving threat analytics [4,5,6,7,19,20,21]. Surveys review FL architectures, aggregation schemes, and communication strategies, and emphasize the challenges introduced by Non-IID data, system heterogeneity, and adversarial threats [4,5,6,7,8,9,10,11].
In the context of intrusion detection, FL has been used to coordinate local learners across routers, gateways, and end hosts while avoiding the centralization of raw traffic logs [4,5,6,7,8,9,10,11,20,22]. Early FL-based IDS prototypes primarily targeted IoT and cyber–physical environments. For example, F-NIDS integrates convolutional neural networks into a cross-silo FL framework over the NF-ToN-IoT-v2 dataset, demonstrating that distributed training can achieve competitive detection performance while respecting data locality [19]. Devine et al. employ federated SVM variants on the CIC-IoT2023 benchmark to study communication cost, client drift, and heterogeneous feature spaces in large-scale IoT deployments [23]. These works confirm the practical viability of FL for intrusion detection in modern IoT settings but do not report results on the UNSW-NB15 dataset. More recent studies explicitly evaluate FL-based IDS on UNSW-NB15, including FedMSP-SPEC, which couples multi-scale parallel convolution with adaptive soft prediction clustering to mitigate Non-IID label distributions [26], and a multi-view FL framework based on joint training of convolutional autoencoders and neural SVMs over different feature views [27]. Compared with these systems, the present work deliberately adopts a simpler yet parameter-efficient 1D-CNN backbone with unidirectional serialization and channel attention, and emphasizes an explicit Non-IID quantification and threat-model discussion rather than architectural sophistication alone.

2.4. Security and Privacy in Federated Learning

Security and privacy concerns in FL have been extensively studied. Surveys of privacy-preserving federated learning and related techniques catalogue a wide range of attacks, including membership inference, gradient inversion, and model poisoning, along with defences such as secure aggregation, differential privacy, and robust aggregation rules [8,9,10,11,12]. Complementary work examines fairness, bias, and accountability in FL, highlighting that data heterogeneity and skew can introduce systematic performance disparities across clients or classes [8,9,10].
Further studies propose cryptographic and protocol-level defences, including homomorphic encryption, secure multi-party computation, and secure aggregation schemes [11,12]. These mechanisms can substantially reduce information leakage but may increase communication overhead and computational cost, especially for resource-constrained edge devices. Robust aggregation rules and anomaly detection techniques address poisoning and backdoor attacks by down-weighting or filtering suspicious updates [28,29,30].
The present work does not implement such defences directly. Instead, it adopts a baseline FedAvg protocol and explicitly frames the absence of secure aggregation and robust aggregation mechanisms as a limitation. Section 5 discusses how the proposed architecture could be combined with existing privacy-preserving and attack-resilient FL methods to mitigate the identified risks.

3. Methodology

3.1. Problem Setting and Notation

We consider a cross-silo federated learning scenario with $K$ participating clients (e.g., organizations, subnets, or security appliances) coordinated by a central aggregator. Client $k$ holds a local dataset $D_k$ of flow-level records and associated labels, with $n_k$ samples, and the total sample count is $n = \sum_{k=1}^{K} n_k$. Each input $x_i^k \in \mathbb{R}^F$ is an $F$-dimensional feature vector derived from flow attributes, and $y_i^k$ denotes the intrusion class label.
The global objective minimized by FL can be written as
$$\min_{\theta} F(\theta) = \sum_{k=1}^{K} \frac{n_k}{n}\, F_k(\theta), \qquad F_k(\theta) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell\big(f(x_i^k; \theta),\, y_i^k\big),$$
where $\theta$ denotes the full set of model parameters (convolutional, attention, and classifier parameters), $f(\cdot\,; \theta)$ is the shared model, and $\ell(\cdot, \cdot)$ denotes the cross-entropy loss. The goal is to learn a global model $f(\cdot\,; \theta)$ that achieves strong detection performance across all clients while preserving data locality and respecting Non-IID data distributions, resource limitations, and privacy considerations.

3.2. Unidirectional Serialization of Tabular Flow Records

Flow-based intrusion detection datasets such as UNSW-NB15, KDD Cup 99, and NSL-KDD are typically provided in tabular form. Rather than treating each record as an unordered feature vector, we adopt a unidirectional serialization step that converts each normalized feature vector into a short sequence.
Let $v \in \mathbb{R}^F$ denote a normalized feature vector. We define a serialization map $S: \mathbb{R}^F \to \mathbb{R}^{L \times D}$ that assigns a fixed ordering to features based on semantically meaningful groupings (e.g., basic header fields, content features, time-based statistics) and arranges them into a one-dimensional sequence:
$$S^{(0)} = S(v) \in \mathbb{R}^{L \times D},$$
where $L$ is the sequence length and $D$ is the number of channels at the input stage (for instance, $D = 1$ when each position holds a single scalar feature). The superscript $(0)$ indicates the initial serialized representation before convolutional processing, and $s$ will denote positions along the sequence when needed.
The induced “temporal” structure is a modelling device rather than a direct representation of packet-level time. While neighbouring elements in the sequence may capture related attributes or statistics, the ordering does not correspond to real time. As a result, the model may learn useful local correlations between feature groups, but it does not recover genuine temporal dynamics across flows. This approximation can introduce biases: correlations discovered between adjacent features in the serialized sequence may partly reflect the chosen ordering rather than inherent network behaviour [17,18]. Section 5 accordingly acknowledges that this serialization remains a simplification.
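As a concrete illustration, the serialization step can be sketched in a few lines of NumPy. This is a hypothetical minimal implementation, not the authors' code: `feature_order` stands in for the grouping-based feature ordering described above, and zero-padding to length $L \cdot D$ is one plausible way to handle feature counts smaller than the target sequence length.

```python
import numpy as np

def serialize(v, feature_order, L=64, D=1):
    """Unidirectional serialization sketch: arrange a normalized feature
    vector into a fixed-order sequence of length L with D channels."""
    s = v[feature_order]                  # impose the fixed semantic ordering
    if s.size < L * D:                    # zero-pad up to the target length
        s = np.pad(s, (0, L * D - s.size))
    return s[: L * D].reshape(L, D)       # (L, D) serialized representation

v = np.random.rand(42)                    # e.g., 42 flow features (illustrative)
order = np.arange(42)                     # identity ordering for illustration
S0 = serialize(v, order, L=64, D=1)
print(S0.shape)  # (64, 1)
```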

3.3. Multi-Scale 1D Convolutional Backbone

Given the serialized sequence $S^{(0)} \in \mathbb{R}^{L \times D}$, we apply a stack of three one-dimensional convolutional blocks with kernel sizes $3$, $5$, and $7$ and $64$, $128$, and $256$ output channels in the three blocks, respectively. Let $H^{(b)}$ denote the feature map after block $b$. At convolutional block $b$, we use a set of $M_b$ parallel 1D convolutions with kernel sizes $k_{b,1}, \ldots, k_{b,M_b}$ and $C_b$ output channels each:
$$H^{(b)} = \mathrm{BN}\Big(\mathrm{MaxPool}\Big(\big\Vert_{m=1}^{M_b}\, \sigma\big(\mathrm{Conv1D}_{k_{b,m},\, C_b}\big(H^{(b-1)}\big)\big)\Big)\Big),$$
where $\sigma(\cdot)$ denotes a ReLU activation, $\mathrm{Conv1D}_{k_{b,m},\, C_b}$ is a 1D convolution with kernel size $k_{b,m}$ and $C_b$ filters, $\mathrm{MaxPool}(\cdot)$ denotes max-pooling along the sequence dimension, $\mathrm{BN}(\cdot)$ denotes batch normalization, and $\Vert$ denotes concatenation along the channel dimension. The input to the first block is $H^{(0)} = S^{(0)}$.
Across blocks, max-pooling and batch normalization stabilize training and reduce dimensionality. After the final block, we apply global average pooling over the sequence dimension to obtain a fixed-length representation:
$$z = \mathrm{GAP}\big(H^{(B)}\big) \in \mathbb{R}^p,$$
where $B = 3$ is the number of blocks, $\mathrm{GAP}(\cdot)$ denotes global average pooling along the sequence dimension, and $p$ is the resulting number of channels.
This multi-scale design enables the model to capture heterogeneous patterns such as local bursts of anomalous activity or broader co-occurrence structures among feature groups. In contrast to deeper two-dimensional or recurrent architectures, it remains computationally modest and well-suited to resource-constrained clients.
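A minimal NumPy sketch of one multi-scale block may clarify the shape bookkeeping. This is an illustrative approximation of the construction above, not the paper's implementation: batch normalization is omitted, the naive convolution loop is unoptimized, and the weight shapes are hypothetical.

```python
import numpy as np

def conv1d_same(x, W):
    """Naive 1D convolution with 'same' padding.
    x: (L, C_in), W: (k, C_in, C_out) -> output (L, C_out)."""
    k = W.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.tensordot(xp[i:i + k], W, axes=([0, 1], [0, 1]))
                     for i in range(x.shape[0])])

def multi_scale_block(x, weights):
    """Parallel convolutions at several kernel sizes, ReLU, channel
    concatenation, then max-pooling with stride 2 (BN omitted)."""
    h = np.concatenate([np.maximum(conv1d_same(x, W), 0.0) for W in weights],
                       axis=1)
    return h.reshape(h.shape[0] // 2, 2, h.shape[1]).max(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 1))                      # serialized input, D = 1
Ws = [rng.standard_normal((k, 1, 8)) * 0.1 for k in (3, 5, 7)]
h = multi_scale_block(x, Ws)
print(h.shape)  # (32, 24): 3 branches x 8 channels, sequence length halved
```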

3.4. Channel-Wise Attention Reweighting

To adaptively emphasize informative channels, we employ a channel-wise attention module that follows the same general idea as channel-wise attention mechanisms used in recent deep residual IDS backbones [15]. Given the pooled representation $z \in \mathbb{R}^p$, we compute attention weights $a \in \mathbb{R}^p$ via a small two-layer multilayer perceptron (MLP) with a bottleneck:
$$a = \sigma\big(W_2\, \phi(W_1 z)\big),$$
where $W_1 \in \mathbb{R}^{r \times p}$ and $W_2 \in \mathbb{R}^{p \times r}$ are learnable weight matrices, $r < p$ is the bottleneck dimension, $\phi(\cdot)$ denotes the ReLU activation, and $\sigma(\cdot)$ denotes the element-wise sigmoid function. The reweighted representation is then
$$\tilde{z} = a \odot z,$$
with $\odot$ denoting element-wise multiplication. Finally, $\tilde{z}$ is passed through one or more fully connected layers and a softmax layer to obtain class probabilities.
This attention module adds only a modest number of parameters and computations relative to the base CNN but provides a flexible mechanism to adjust channel importance as distributions shift across clients and datasets.
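The reweighting amounts to a few matrix operations. The following NumPy sketch, with hypothetical dimensions $p = 256$ and $r = 32$, illustrates the computation of $a$ and $\tilde{z}$; it is not the authors' implementation.

```python
import numpy as np

def channel_attention(z, W1, W2):
    """Bottleneck channel reweighting: a = sigmoid(W2 @ relu(W1 @ z)),
    then elementwise rescaling of z. W1: (r, p), W2: (p, r), r < p."""
    a = 1.0 / (1.0 + np.exp(-(W2 @ np.maximum(W1 @ z, 0.0))))
    return a * z, a

rng = np.random.default_rng(1)
p, r = 256, 32                          # channel and bottleneck dimensions
z = rng.standard_normal(p)              # pooled representation from the backbone
W1 = rng.standard_normal((r, p)) * 0.05
W2 = rng.standard_normal((p, r)) * 0.05
z_tilde, a = channel_attention(z, W1, W2)
print(z_tilde.shape)  # (256,); each channel of z scaled by a weight in (0, 1)
```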

3.5. Federated Optimization via FedAvg

We train the global model using the standard FedAvg algorithm [4]. At communication round $t$, the server broadcasts the current global parameters $\theta^{(t)}$ to a subset $\mathcal{K}^{(t)} \subseteq \{1, \ldots, K\}$ of clients. Each client $k \in \mathcal{K}^{(t)}$ initializes its local model with $\theta^{(t)}$ and performs $E$ epochs of mini-batch stochastic gradient descent on its local dataset $D_k$, yielding updated parameters $\theta_k^{(t+1)}$. The server aggregates the local updates using a sample-size-weighted average:
$$\theta^{(t+1)} = \sum_{k \in \mathcal{K}^{(t)}} \frac{n_k}{\sum_{j \in \mathcal{K}^{(t)}} n_j}\, \theta_k^{(t+1)}.$$
This process repeats for at most 60 communication rounds or until convergence. In all experiments, each client performs one local epoch ($E = 1$) per round with a mini-batch size of 128. The combination of a relatively shallow architecture and limited local epochs contributes to the overall computational compactness of the model.
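The aggregation rule itself reduces to a weighted average of parameter tensors. A minimal sketch, assuming each client's parameters are stored in a dict of NumPy arrays (a hypothetical layout, not the paper's code):

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Sample-size-weighted FedAvg over per-client parameter dicts
    {name: ndarray}; weights are n_k / sum_j n_j."""
    n = float(sum(client_sizes))
    return {name: sum((nk / n) * params[name]
                      for params, nk in zip(client_params, client_sizes))
            for name in client_params[0]}

# Two toy clients holding a single parameter tensor "w"
clients = [{"w": np.full(3, 1.0)}, {"w": np.full(3, 4.0)}]
theta = fedavg(clients, client_sizes=[100, 300])
print(theta["w"])  # weights 0.25 and 0.75 -> [3.25 3.25 3.25]
```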

3.6. Non-IID Client Partitioning and Quantification

To simulate heterogeneous edge deployments, we partition each dataset across clients using a Dirichlet distribution over class labels. For a given concentration parameter $\alpha$, we sample a label proportion vector $\pi_k$ for each client $k$ and assign each training sample to a client according to these proportions. Smaller $\alpha$ values produce more skewed distributions, whereas larger values approach IID. In this study, $\alpha$ is chosen to induce moderate but non-trivial label skew consistent with cross-silo deployments [7].
To quantify the degree of Non-IID, we compute the Jensen–Shannon divergence (JSD) between each client's empirical label distribution and the global label distribution [16]. Let $p$ denote the global class distribution and $q_k$ denote the distribution on client $k$. The JSD between $p$ and $q_k$ is given by
$$\mathrm{JSD}(p \,\Vert\, q_k) = \tfrac{1}{2}\, \mathrm{KL}(p \,\Vert\, m_k) + \tfrac{1}{2}\, \mathrm{KL}(q_k \,\Vert\, m_k), \qquad m_k = \tfrac{1}{2}(p + q_k),$$
where $\mathrm{KL}(\cdot \,\Vert\, \cdot)$ is the Kullback–Leibler divergence. Average and maximum JSD across clients are reported for each configuration, providing a quantitative measure of heterogeneity. While Dirichlet-based partitioning primarily models label skew, real deployments may exhibit more complex feature distribution shifts; the partitions are therefore treated as a controlled but simplified representation of Non-IID conditions, and this limitation is revisited in Section 5.
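The partitioning and heterogeneity measurement can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions (per-class Dirichlet proportions, a small `eps` for numerical stability in the KL terms), not the authors' exact code.

```python
import numpy as np

def dirichlet_partition(labels, K, alpha, seed=0):
    """Dirichlet label-skew partition: for each class, split its sample
    indices across K clients with proportions drawn from Dir(alpha)."""
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(K)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        props = rng.dirichlet([alpha] * K)
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            clients[k].extend(part.tolist())
    return clients

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two label distributions."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

labels = np.repeat([0, 1, 2], 1000)               # toy 3-class dataset
parts = dirichlet_partition(labels, K=4, alpha=1.0)
p_global = np.bincount(labels) / labels.size
for k, idx in enumerate(parts):
    q_k = np.bincount(labels[idx], minlength=3) / max(len(idx), 1)
    print(f"client {k}: n={len(idx)}, JSD={jsd(p_global, q_k):.4f}")
```

Smaller `alpha` drives the per-client JSD values toward the ln 2 upper bound; `alpha = 1` (as in Section 4.1) yields moderate skew.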

3.7. Threat Model, Privacy Considerations, and Limitations

We adopt a standard centralized FL architecture in which an honest-but-curious server coordinates training and clients are honest but vulnerable to observation and compromise. Under this model, the server and potential adversaries can observe model updates and aggregate parameters but cannot directly access raw local data. We do not assume Byzantine-resilient behaviour or robust aggregation in the current instantiation, and we do not implement explicit defences against model poisoning or inference attacks [8,9,10,11,28,29,30].
We consider inference, model-poisoning, and implementation-level attacks as potential risks to the FL process, but their mitigation is beyond the scope of this architectural study.
Because this work focuses on architectural design and Non-IID performance, these threats are treated as limitations rather than fully addressed challenges. Section 5 discusses possible integration with secure aggregation, differential privacy, and robust aggregation techniques, which can be layered on top of the proposed architecture.

3.8. Reproducibility Protocol

To facilitate independent verification, we follow standard best practices for reproducible ML experiments [17,18]: publicly available datasets with fixed pre-processing, fixed train–validation–test splits and random seeds per dataset, and a consistent evaluation pipeline logging all metrics and training curves. Models are implemented in a mainstream DL framework with synchronous simulated clients.

4. Experiments and Results

4.1. Datasets, Client Configurations, and Experimental Setup

The proposed framework is evaluated on UNSW-NB15, KDD Cup 99 (10%), and NSL-KDD, three canonical intrusion detection benchmarks that cover a range of traffic characteristics, attack categories, and class imbalances [17,18]. For each dataset, standard preprocessing pipelines and class mappings are followed, merging granular attack types into coarser categories where appropriate to reduce extreme sparsity.
Clients are simulated by partitioning the training set into K disjoint subsets using the Dirichlet-based label-skew procedure described in Section 3.6, with the concentration parameter α fixed at 1 to produce moderate but non-trivial heterogeneity. The number of clients is varied across datasets to balance realism and statistical robustness. For UNSW-NB15, client counts between 4 and 8 are used so that each client retains a sufficient number of flows while still exhibiting heterogeneous label distributions. For KDD Cup 99, client counts between 7 and 10 are used to reflect the larger dataset size and to explore a slightly denser cross-silo configuration. For NSL-KDD, client counts between 5 and 10 are used, again ensuring that each client retains enough samples for stable local training. This choice avoids extremely small client datasets that would compromise both realism and statistical reliability, while still providing a range of Non-IID configurations.
For all experiments, the same model architecture and optimization hyperparameters are used unless otherwise stated: the Adam optimizer with a learning rate of $1 \times 10^{-3}$, a mini-batch size of 128, and one local epoch per communication round. Training is run for at most 60 communication rounds with early stopping based on validation accuracy.

4.2. Evaluation Metrics

Four standard metrics are reported on the test set: overall accuracy (ACC), macro-averaged precision (Prec), macro-averaged recall (Rec), and macro-averaged F1-score (F1). Let $TP_c$, $FP_c$, and $FN_c$ denote true positives, false positives, and false negatives for class $c$, respectively. Then
$$\mathrm{Prec}_c = \frac{TP_c}{TP_c + FP_c}, \qquad \mathrm{Rec}_c = \frac{TP_c}{TP_c + FN_c},$$
and
$$F1_c = \frac{2\, \mathrm{Prec}_c\, \mathrm{Rec}_c}{\mathrm{Prec}_c + \mathrm{Rec}_c}.$$
Macro-averaged metrics are computed by averaging over all classes:
$$\mathrm{Prec} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{Prec}_c, \qquad \mathrm{Rec} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{Rec}_c, \qquad F1 = \frac{1}{C} \sum_{c=1}^{C} F1_c,$$
where $C$ is the number of classes. Confusion matrices are used qualitatively to inspect performance on minority classes and to identify systematic misclassifications.
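Given a confusion matrix, the macro-averaged metrics follow directly from these definitions. A small self-contained sketch (the example matrix is illustrative, not taken from the paper's results):

```python
import numpy as np

def macro_metrics(cm):
    """Macro-averaged precision, recall, and F1 from a confusion matrix
    cm[i, j] = count of true class i predicted as class j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp              # column sums minus diagonal
    fn = cm.sum(axis=1) - tp              # row sums minus diagonal
    prec = tp / np.maximum(tp + fp, 1e-12)
    rec = tp / np.maximum(tp + fn, 1e-12)
    f1 = 2 * prec * rec / np.maximum(prec + rec, 1e-12)
    return prec.mean(), rec.mean(), f1.mean()

cm = np.array([[90,  5,  5],
               [10, 80, 10],
               [ 0, 10, 90]])             # hypothetical 3-class result
prec_m, rec_m, f1_m = macro_metrics(cm)
print(prec_m, rec_m, f1_m)
```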

4.3. Overall Performance on Benchmark Datasets

Table 1, Table 2 and Table 3 summarize the performance of the proposed framework under different client counts for UNSW-NB15, KDD Cup 99, and NSL-KDD, respectively. For each dataset, test accuracy, macro precision, macro recall, macro F1-score, and total training time are reported.
Across all three datasets and client configurations, the model attains high accuracy and balanced precision, recall, and F1-scores, indicating that performance is not dominated by a small subset of classes. Although these benchmarks are known to be somewhat saturated [17,18], the results demonstrate that the proposed federated architecture can match or exceed the performance of centralized deep learning baselines reported in the literature [3,13,14,15,19,26] while preserving data locality. As shown schematically in Figure 1, the federated workflow coordinates local training and global aggregation; Figure 2 illustrates the multi-kernel 1D convolutions with attention reweighting; and Figure 3 summarizes the communication workflow under privacy-preserving assumptions.

4.4. Convergence Behaviour, Accuracy Dips, and Communication Analysis

As shown in Figure 4, both local training curves and aggregated model performance converge stably across clients. As the number of clients increases, convergence becomes slightly slower and exhibits mild oscillations, particularly under more severe label skew. Small late-round accuracy dips are expected under Non-IID data and reflect client drift and the mismatch between local and global objectives; this well-known phenomenon can be attributed to the interaction between local overfitting and global aggregation [4,6,7].
Concretely, in each round, clients perform several local updates using only their own data distributions. When these distributions differ substantially, the resulting parameter updates can pull the global model in different directions, leading to transient decreases in validation accuracy when aggregated.
The training-time columns in Table 1, Table 2 and Table 3 also provide a coarse quantitative measure of computational and communication overhead. For UNSW-NB15, the total training time increases from 3056 s at four clients to 4381 s at eight clients, corresponding to an increase of approximately 43%. Similar patterns are observed for KDD Cup 99 and NSL-KDD, with training time growing sub-linearly to roughly linearly in the number of clients. This trend is expected: adding clients increases the number of local updates and messages per communication round, while the per-client mini-batch size and number of local epochs remain fixed. Because the global model is relatively compact, each communication round involves transmitting only a modest number of parameters per client, and the number of rounds required for convergence remains moderate.
Overall, these results indicate that the proposed architecture maintains practical training times and communication requirements as the number of clients grows, which is consistent with deployment in resource-constrained edge environments.
As shown in Figure 5, Figure 6 and Figure 7, confusion matrices indicate low false-positive and false-negative rates across multiple client counts, including for minority attack classes.

4.5. Robustness to Non-IID Data and Client Configurations

Robustness to Non-IID data is examined by tracking test performance under different Dirichlet concentration parameters and client counts. As expected, more severe label skew (smaller α ) increases the variance of local updates and slightly degrades global performance. However, across all tested settings, the model maintains high accuracy and balanced precision–recall metrics, confirming that the combination of multi-scale convolutions and attention-based reweighting generalizes reasonably well under controlled heterogeneity.
Per-class analysis reveals that minority attack classes are more sensitive to Non-IID partitioning, with performance drops particularly evident in KDD Cup 99 and NSL-KDD. This behaviour is consistent with prior observations on dataset artifacts and class imbalance [13,14,19,26]. It highlights the importance of monitoring minority-class metrics and, where necessary, incorporating additional mechanisms such as class-balanced loss functions, reweighting, or data-augmentation strategies. As shown in Figure 8, convergence remains stable under the specified federated protocol across different client configurations.

4.6. Ablation on Architectural Components

To quantify the contribution of the main architectural choices, we carry out an ablation study on UNSW-NB15 under the six-client Non-IID configuration described in Section 4.1. The study varies three factors: (i) the serialization length L, (ii) the use of multi-scale convolutions versus a single kernel size, and (iii) the presence of the channel-wise attention block. Table 4 reports test accuracy and macro F1-score for six representative variants: the full model with L = 64, multi-scale convolutions, and attention; a shorter serialized sequence with L = 16; a degenerate serialization with L = 1; a variant using a single convolutional kernel size; a variant without attention; and a variant without both multi-scale convolutions and attention.
The results highlight three main trends. First, the serialization step is clearly critical: collapsing the sequence to a single position (L = 1) degrades accuracy from 99.38% to 81.83% and macro F1 from 99.37% to 80.30%. This confirms that the model leverages local correlations between serialized feature groups and does not behave like a simple feed-forward classifier over an unordered feature vector.
Second, replacing multi-scale convolutions with a single kernel size (here, k = 3) leads to metrics that are very close to the full model (99.23% vs. 99.38% ACC and 99.22% vs. 99.37% macro F1). On UNSW-NB15 and under the considered client configuration, multi-scale filtering therefore plays a secondary role compared with serialization, although it still contributes to slight gains and may become more important on datasets with richer temporal structure.
Third, removing the attention block produces similarly small differences: in this particular run, removing attention yields 99.25% accuracy and 99.24% macro F1, slightly lower than the baseline configuration. However, jointly removing both multi-scale convolutions and attention (non-additive interaction) reduces performance to 98.49% accuracy and 98.47% macro F1. These patterns suggest that multi-scale convolutions and channel-wise attention are not individually critical for achieving strong performance on UNSW-NB15 in this setting, but they help maintain robustness when combined, especially under more heterogeneous client partitions.
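The channel-wise attention block examined in this ablation follows a squeeze-and-excitation style of reweighting. The NumPy sketch below is an illustrative reconstruction with random placeholder weights, not the trained module.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style reweighting of a (channels, length)
    feature map: global-average-pool over the temporal axis, pass the
    summary through a two-layer bottleneck, and rescale each channel by
    a sigmoid gate in (0, 1). w1 and w2 stand in for learned weights."""
    s = feat.mean(axis=1)                    # squeeze: (C,)
    h = np.maximum(w1 @ s, 0.0)              # bottleneck + ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # per-channel gates
    return feat * g[:, None]                 # reweight channels

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16))          # 8 channels, length 16
out = channel_attention(feat,
                        rng.standard_normal((4, 8)),   # reduction ratio 2
                        rng.standard_normal((8, 4)))
assert out.shape == feat.shape
```

Because the gates are bounded in (0, 1), the block can only attenuate uninformative channels rather than amplify them, which is consistent with its role as a mild robustness mechanism in the ablation.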
Figure 9 visualizes the corresponding training curves on NSL-KDD for the ablated models. All variants converge within the fixed communication budget, but the full configuration exhibits the most stable convergence across datasets. For transparency, the ablation table is provided in the main manuscript rather than only in supplementary material, and the final model retains all three components while acknowledging that serialization is the dominant contributor to performance on UNSW-NB15.
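For reference, the serialization step that dominates the ablation can be sketched as follows. The paper's exact feature grouping is not reproduced here, so the pad-and-reshape layout below is an assumption.

```python
import numpy as np

def serialize(record, L=64):
    """Unidirectional serialization sketch (assumed form): lay the flow
    record's features out left-to-right, zero-pad or truncate to a fixed
    length L, and add a channel axis so the result feeds a 1D convolution
    as a (channels=1, L) sequence."""
    x = np.asarray(record, dtype=np.float32).ravel()
    out = np.zeros(L, dtype=np.float32)
    n = min(x.size, L)
    out[:n] = x[:n]
    return out[None, :]

seq = serialize(np.arange(42), L=64)   # e.g. a 42-feature flow record
assert seq.shape == (1, 64)
```

Under this layout, a 1D kernel sliding over the sequence sees adjacent serialized features together, which is the local-correlation effect the ablation attributes the large L = 1 performance drop to.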

4.7. Protocol Checks for High Performance and Overfitting Risks

Given the very high accuracies observed on KDD Cup 99 and NSL-KDD, several protocol checks are applied to reduce the likelihood that results are driven by evaluation artifacts or inadvertent leakage rather than genuine generalization. These checks include:
(1) Verifying that train, validation, and test splits contain no duplicate records.
(2) Confirming that scaling and normalization parameters are computed exclusively from training data.
(3) Ensuring that label encodings and class mappings are consistent across splits and clients.
(4) Inspecting confusion matrices and per-class metrics to detect anomalously perfect performance concentrated on a subset of classes.
These steps do not eliminate the possibility of overfitting to benchmark idiosyncrasies, especially given known issues in KDD Cup 99 and NSL-KDD [17,18], but they provide additional evidence that the reported performance is not trivially explained by data leakage or inconsistent preprocessing. The high accuracy is therefore interpreted in light of dataset limitations and is not claimed to directly reflect real-world deployment performance without further evaluation.
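Checks (1) and (2) can be automated along the following lines; this is an illustrative sketch, not the authors' validation code.

```python
import numpy as np

def no_cross_split_duplicates(train, test):
    """Check (1): no row appears in both splits (duplicate rows shared
    between train and test would inflate test accuracy)."""
    seen = {tuple(row) for row in np.asarray(train)}
    return all(tuple(row) not in seen for row in np.asarray(test))

def scale_with_train_stats(train, test):
    """Check (2): min-max parameters are computed from the training
    split only, then applied unchanged to the test split."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (train - lo) / span, (test - lo) / span
```

Fitting the scaler on the full dataset, by contrast, would leak test-set statistics into training, one of the artifact sources these protocol checks are designed to rule out.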

5. Discussion

5.1. Centralized vs. Federated Intrusion Detection

A natural question in this setting is what the integration of federated learning adds relative to centralized intrusion detection, especially given that many centralized models already report high accuracies on the same benchmarks [3,13,14,15,26]. The design in this paper does not claim to improve absolute detection accuracy beyond the best centralized models; rather, its primary goal is to match strong centralized baselines while enabling collaborative training without centralizing logs.
The experiments show that, under controlled Non-IID conditions, a federated architecture with unidirectional serialization, multi-scale 1D convolutions, and channel-wise attention can maintain accuracy comparable to centralized training on standard benchmarks. This suggests that the privacy and governance benefits of FL need not come at the cost of substantial performance loss.

5.2. Comparison with Federated IDS Baselines

Recent work has proposed a variety of federated learning architectures for intrusion detection that differ in backbone networks, client partitioning, and privacy mechanisms. As discussed in Section 2.3, several influential prototypes target IoT and cyber–physical settings. F-NIDS integrates convolutional neural networks into an FL framework trained on the NF-ToN-IoT-v2 dataset, and shows that the resulting detector can effectively identify attacks in large-scale IoT traffic traces without centralizing packet logs [19]. Devine et al. design a federated intrusion detection system for the CIC-IoT2023 dataset, using SVM-based classifiers and carefully measuring communication overhead and convergence behaviour under heterogeneous IoT gateways [23]. Both works provide strong evidence that FL is a viable paradigm for intrusion detection in realistic, distributed environments; however, neither reports results on UNSW-NB15, so they cannot serve as direct numerical baselines for the experiments in Section 4.
In parallel, several recent FL-based IDS approaches have evaluated their methods on UNSW-NB15. Ji et al. propose FedMSP-SPEC, which combines multi-scale parallel convolution and adaptive soft prediction clustering to alleviate Non-IID label distributions across clients [26]. Their experiments include a federated scenario on UNSW-NB15 where clients hold heterogeneous subsets of the data, and evaluation is reported in terms of accuracy, precision, recall, and F1-score. Yu et al. develop a multi-view federated learning framework that jointly trains convolutional autoencoders and neural SVM layers over several feature views (basic, content, traffic, and system views) and extend it to a three-client FL setting covering both TON_IoT and UNSW-NB15 [27]. In their UNSW-NB15 experiments, the federated multi-view CAE-NSVM model reports macro F1-scores while ensuring strict data locality across clients.
Architecturally, the proposed framework in this paper is deliberately conservative compared with FedMSP-SPEC and multi-view CAE-NSVM. It does not rely on hierarchical clustering, graph-based aggregation, or specialized privacy-preserving cryptographic protocols beyond standard secure communication channels [8,9,10,12,22,28,29,30]. Instead, it uses a single lightweight 1D-CNN backbone with unidirectional serialization and channel attention, trained under vanilla FedAvg with sample-size-weighted aggregation. The intent is to study how far a compact, easily re-implementable architecture can be pushed under realistic Non-IID client partitions, and to provide a transparent experimental protocol that can be replicated without specialized distributed systems support.
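The sample-size-weighted FedAvg rule referred to above is the standard one: the server averages client parameters with weights n_k / Σ n. A minimal sketch:

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Sample-size-weighted FedAvg: each parameter tensor in the global
    model is the weighted mean of the clients' tensors, with weights
    proportional to local dataset sizes."""
    w = np.asarray(client_sizes, dtype=np.float64)
    w = w / w.sum()
    return {name: sum(wk * params[name] for wk, params in zip(w, client_params))
            for name in client_params[0]}

# Two clients; the one holding 3x the data dominates the average.
global_model = fedavg(
    [{"w": np.array([0.0, 0.0])}, {"w": np.array([4.0, 8.0])}],
    client_sizes=[1, 3],
)
# global_model["w"] == [3.0, 6.0]
```

No clustering, clipping, or secure aggregation is layered on top, which is precisely the "deliberately conservative" design choice discussed above.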
To provide a quantitative cross-paper reference, Table 5 summarizes the UNSW-NB15 results of the proposed framework alongside the published federated results of FedMSP-SPEC [26] and the multi-view CAE-NSVM model [27]. Metrics for FedMSP-SPEC and the multi-view CAE-NSVM model are taken directly from the corresponding papers [26,27]. “–” denotes metrics that were not reported. For FedMSP-SPEC, we report the best UNSW-NB15 configuration under Non-IID client partitions (specifically, the setting with clustering parameter R = 1), whose macro metrics are explicitly tabulated in [26]. For the multi-view CAE-NSVM model, we use the macro F1-score obtained in the three-client FL scenario on UNSW-NB15, as summarized in [27]. The entry for the proposed framework corresponds to the six-client Non-IID FedAvg experiment on UNSW-NB15 described in Section 4.3, where the detector achieved its best trade-off between performance and training time.
As shown in Table 5, the proposed serialized 1D-CNN with attention achieves 99.38% accuracy and 99.37% macro F1 on UNSW-NB15 under six-client Non-IID FedAvg, substantially higher than the reported UNSW-NB15 performance of existing FL-based baselines. Relative to FedMSP-SPEC, the improvements are approximately 11.1 percentage points in accuracy and 11.2 percentage points in macro F1, while retaining a considerably simpler backbone architecture and a standard aggregation rule. Compared with the federated multi-view CAE-NSVM model, the gain in macro F1 is about 16.7 percentage points. These results suggest that a carefully tuned, lightweight 1D-CNN with attention can match or surpass more elaborate multi-view or clustering-based FL architectures on UNSW-NB15, without requiring specialized FL protocols or complex multi-stage training pipelines.
It is important to emphasize that these cross-paper comparisons remain approximate. The referenced studies differ in data preprocessing choices, client partitioning strategies, and, in some cases, label granularity. Table 5 should therefore be interpreted as an empirical reference point rather than a claim of strict dominance. To mitigate these limitations, this work retains (i) a transparent threat model, (ii) an explicit quantification of Non-IIDness via Jensen–Shannon divergence, and (iii) a detailed reproducibility protocol for model hyperparameters and evaluation procedures (Section 3.8). These elements are intended to support independent reimplementation and facilitate future head-to-head comparisons under fully aligned experimental conditions.
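The Jensen–Shannon divergence used in item (ii) to quantify Non-IIDness can be computed as follows; with base-2 logarithms the value lies in [0, 1], with 0 for identical client label distributions and 1 for fully disjoint ones.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two label distributions,
    using base-2 logs so the result is bounded in [0, 1]. A small
    epsilon guards against zero-probability classes."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        return float(np.sum(a * np.log2(a / b)))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

In practice, pairwise JSD between the empirical label distributions of clients gives a single comparable number summarizing how far a Dirichlet partition departs from the IID case.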

5.3. Dataset Artifacts, Class Imbalance, and Benchmark Saturation

As noted in Section 2, KDD Cup 99 and NSL-KDD are known to exhibit artifacts and class imbalances that can inflate performance metrics [17,18]. The high reported accuracies must therefore be interpreted with care. While protocol checks and ablation studies support the validity of the results under the chosen splits and preprocessing pipeline, they cannot fully mitigate underlying dataset flaws. In this context, results on UNSW-NB15 are given more weight, and claims are phrased cautiously to account for these artifacts.

5.4. Non-IID Scenarios and Realistic Edge Deployments

The Non-IID scenarios in this study are generated using Dirichlet-based label skew with the concentration parameter α set to 1 to produce moderate but non-trivial heterogeneity. This choice captures important aspects of realistic deployments—namely, that some clients may see more of certain attack types or benign protocols than others—but it remains a simplification. In real networks, feature distributions may differ across sites (e.g., due to different applications or user populations), and temporal drift may occur over time.
The Non-IID experiments should therefore be viewed as one step towards realistic modelling rather than a complete representation.

5.5. Security Vulnerabilities and Future Hardening

As discussed in Section 3.7, the current instantiation does not include defences against inference or poisoning attacks beyond the intrinsic benefits of avoiding raw data centralization. This limitation is particularly salient in security applications, where adversaries may target the FL process itself.

6. Conclusions

This paper presents a federated intrusion detection framework that combines unidirectional serialization of tabular flow records, multi-scale one-dimensional convolutions, and channel-wise attention, trained under a sample-size-weighted FedAvg protocol with explicitly quantified Non-IID client partitions. Experiments on UNSW-NB15, KDD Cup 99, and NSL-KDD show that the approach achieves high accuracy and balanced precision–recall metrics across multiple client configurations while preserving data locality.
The study contributes (i) a deployment-oriented architectural template for flow-based federated intrusion detection, (ii) a quantitative characterization of Non-IID conditions via JSD and Dirichlet-based label skew, and (iii) an explicit threat-model and reproducibility protocol that together support transparent evaluation and replication. While the use of canonical benchmarks and the absence of explicit privacy and robustness mechanisms limit the immediate deployability of the framework, the results indicate that federated learning can deliver near-centralized performance without centralizing logs, thereby aligning intrusion detection with emerging privacy and data-governance requirements.

Author Contributions

Conceptualization, W.L. and T.Z.; methodology, W.L., T.Z. and D.G.; software, W.L.; validation, W.L., T.Z. and D.G.; formal analysis, W.L.; investigation, W.L.; resources, W.L.; data curation, W.L. and T.Z.; writing—original draft preparation, W.L. and T.Z.; writing—review and editing, W.L., T.Z. and D.G.; visualization, W.L. and T.Z.; supervision, W.L. and D.G.; project administration, W.L., T.Z. and D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets analyzed in this study (UNSW-NB15, KDD Cup 99 (10%), and NSL-KDD) are publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IDS	Intrusion Detection System
NIDS	Network Intrusion Detection System
FL	Federated Learning
CNN	Convolutional Neural Network
1D-CNN	One-Dimensional Convolutional Neural Network
JSD	Jensen–Shannon Divergence
IID	Independent and Identically Distributed
Non-IID	Non-Independent and Identically Distributed
ACC	Accuracy
Prec	Precision
Rec	Recall
F1	F1-score

References

  1. Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 2016, 18, 1153–1176. [Google Scholar] [CrossRef]
  2. Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20. [Google Scholar] [CrossRef]
  3. Ferrag, M.A.; Maglaras, L.; Moschoyiannis, S.; Janicke, H. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. J. Inf. Secur. Appl. 2020, 50, 102419. [Google Scholar] [CrossRef]
  4. Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated learning: Challenges, methods, and future directions. IEEE Signal Process. Mag. 2020, 37, 50–60. [Google Scholar] [CrossRef]
  5. Lim, W.Y.B.; Luong, N.C.; Hoang, D.T.; Jiao, Y.; Liang, Y.-C.; Yang, Q.; Miao, C. Federated learning in mobile edge networks: A comprehensive survey. IEEE Commun. Surv. Tutor. 2020, 22, 2031–2063. [Google Scholar] [CrossRef]
  6. Kairouz, P.; McMahan, H.B. Advances and open problems in federated learning. Found. Trends Mach. Learn. 2021, 14, 1–210. [Google Scholar] [CrossRef]
  7. Ma, X.; Zhu, J.; Lin, Z.; Chen, S.; Qin, Y. A state-of-the-art survey on solving Non-IID data in federated learning. Future Gener. Comput. Syst. 2022, 135, 244–258. [Google Scholar] [CrossRef]
  8. Yin, X.; Zhu, Y.; Hu, J. A comprehensive survey of privacy-preserving federated learning: A taxonomy, review, and future directions. ACM Comput. Surv. 2021, 54, 1–36. [Google Scholar] [CrossRef]
  9. Gosselin, R.; Vieu, L.; Loukil, F.; Benoit, A. Privacy and Security in Federated Learning: A Survey. Appl. Sci. 2022, 12, 9901. [Google Scholar] [CrossRef]
  10. Shan, F.; Mao, S.; Lu, Y.; Li, S. Differential Privacy Federated Learning: A Comprehensive Review. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 220. [Google Scholar] [CrossRef]
  11. Zhang, X.; Luo, Y.; Li, T. A Review of Research on Secure Aggregation for Federated Learning. Future Internet 2025, 17, 308. [Google Scholar] [CrossRef]
  12. Lin, J. Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37, 145–151. [Google Scholar] [CrossRef]
  13. Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A. A survey of network-based intrusion detection data sets. Comput. Secur. 2019, 86, 147–167. [Google Scholar] [CrossRef]
  14. Milenkoski, A.; Vieira, M.; Kounev, S.; Avritzer, A.; Payne, B.D. Evaluating computer intrusion detection systems: A survey of common practices. ACM Comput. Surv. 2015, 48, 1–41. [Google Scholar] [CrossRef]
  15. de Oliveira, J.A.; Gonçalves, V.P.; Meneguette, R.I.; de Sousa, R.T., Jr.; Guidoni, D.L.; Oliveira, J.C.M.; Rocha Filho, G.P. F-NIDS—A Network Intrusion Detection System based on federated learning. Comput. Netw. 2023, 236, 110010. [Google Scholar] [CrossRef]
  16. Alsamiri, J.; Alsubhi, K. Federated Learning for Intrusion Detection Systems in Internet of Vehicles: A General Taxonomy, Applications, and Future Directions. Future Internet 2023, 15, 403. [Google Scholar] [CrossRef]
  17. Buyuktanir, B.; Altinkaya, Ş.; Karatas Baydogmus, G.; Yildiz, K. Federated learning in intrusion detection: Advancements, applications, and future directions. Clust. Comput. 2025, 28, 473. [Google Scholar] [CrossRef]
  18. Wei, K.; Li, J.; Ding, M.; Ma, C.; Yang, H.H.; Farokhi, F.; Jin, S.; Quek, T.Q.; Poor, H.V. Federated learning with differential privacy: Algorithms and performance analysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3454–3469. [Google Scholar] [CrossRef]
  19. Zhou, X.; Xu, M.; Wu, Y.; Zheng, N. Deep Model Poisoning Attack on Federated Learning. Future Internet 2021, 13, 73. [Google Scholar] [CrossRef]
  20. Manzoor, H.U.; Shabbir, A.; Chen, A.; Flynn, D.; Zoha, A. A Survey of Security Strategies in Federated Learning: Defending Models, Data, and Privacy. Future Internet 2024, 16, 374. [Google Scholar] [CrossRef]
  21. Aziz, R.; Banerjee, S.; Bouzefrane, S.; Le Vinh, T. Exploring Homomorphic Encryption and Differential Privacy Techniques towards Secure Federated Learning Paradigm. Future Internet 2023, 15, 310. [Google Scholar] [CrossRef]
  22. Belenguer, A.; Pascual, J.A.; Navaridas, J. A review of federated learning applications in intrusion detection systems. Comput. Netw. 2025, 258, 111023. [Google Scholar] [CrossRef]
  23. Devine, M.; Ardakani, S.P.; Al-Khafajiy, M.; James, Y. Federated Machine Learning to Enable Intrusion Detection Systems in IoT Networks. Electronics 2025, 14, 1176. [Google Scholar] [CrossRef]
  24. Olanrewaju-George, B.; Pranggono, B. Federated learning-based intrusion detection system for the internet of things using unsupervised and supervised deep learning models. Cyber Secur. Appl. 2025, 3, 100068. [Google Scholar] [CrossRef]
  25. Al Tfaily, F.; Ghalmane, Z.; Brahmia, M.E.A.; Hazimeh, H.; Jaber, A.; Zghal, M. Graph-based federated learning approach for intrusion detection in IoT networks. Sci. Rep. 2025, 15, 41264. [Google Scholar] [CrossRef]
  26. Klinkhamhom, C.; Boonyopakorn, P.; Wuttidittachotti, P. MIDS-GAN: Minority Intrusion Data Synthesizer GAN—An ACON Activated Conditional GAN for Minority Intrusion Detection. Mathematics 2025, 13, 3391. [Google Scholar] [CrossRef]
  27. Li, B.; Li, J.; Jia, M. ADFCNN-BiLSTM: A Deep Neural Network Based on Attention and Deformable Convolution for Network Intrusion Detection. Sensors 2025, 25, 1382. [Google Scholar] [CrossRef]
  28. Cui, B.; Chai, Y.; Yang, Z.; Li, K. Intrusion Detection in IoT Using Deep Residual Networks with Attention Mechanisms. Future Internet 2024, 16, 255. [Google Scholar] [CrossRef]
  29. Ji, C.; He, S.; Dai, W. A Federated Learning Based Intrusion Detection Method with Multi-Scale Parallel Convolution and Adaptive Soft Prediction Clustering. Electronics 2025, 14, 4705. [Google Scholar] [CrossRef]
  30. Yu, J.; Wang, G.; Shi, N.; Saxena, R.; Lee, B. A Multi-View-Based Federated Learning Approach for Intrusion Detection. Electronics 2025, 14, 4166. [Google Scholar] [CrossRef]
Figure 1. Overall workflow of the proposed federated intrusion detection framework.
Figure 2. Multi-scale 1D-CNN with attention reweighting.
Figure 3. Federated training workflow with privacy-preserving communication assumptions.
Figure 4. Training dynamics and performance curves on UNSW-NB15.
Figure 5. Confusion matrix (5 clients) on UNSW-NB15.
Figure 6. Confusion matrix (6 clients) on UNSW-NB15.
Figure 7. Confusion matrix (7 clients) on UNSW-NB15.
Figure 8. Training dynamics and performance curves on KDD Cup 99.
Figure 9. Training dynamics and performance curves on NSL-KDD.
Table 1. Federated performance on UNSW-NB15 under different client counts.

# Clients   Accuracy (%)   Precision (%)   Recall (%)   F1 (%)   Training Time (s)
4           99.24          99.58           98.84        99.21    3056
5           99.31          99.48           99.10        99.29    3614
6           99.38          99.61           99.14        99.37    4108
7           99.24          99.61           98.81        99.21    4226
8           99.26          99.43           99.05        99.24    4381
Table 2. Federated performance on KDD Cup 99 under different client counts.

# Clients   Accuracy (%)   Precision (%)   Recall (%)   F1 (%)   Training Time (s)
7           99.82          99.85           99.78        99.82    589
8           99.84          99.87           99.81        99.84    772
9           99.86          99.87           99.84        99.86    853
10          99.79          99.79           99.78        99.79    995
Table 3. Federated performance on NSL-KDD under different client counts.

# Clients   Accuracy (%)   Precision (%)   Recall (%)   F1 (%)   Training Time (s)
5           98.97          99.00           98.97        98.98    297
6           98.96          98.96           99.00        98.97    320
7           99.02          99.01           99.05        99.02    377
8           98.94          98.90           99.03        98.97    420
9           98.93          98.94           98.94        98.94    516
10          98.93          99.02           98.83        98.92    604
Table 4. Ablation on architectural components for UNSW-NB15 (six clients, Non-IID label partitions).

Configuration                                      Serialization Length L   Multi-Scale Conv.   Channel Attention   ACC (%)   Macro F1 (%)
Full model (baseline)                              64                       √                   √                   99.38     99.37
Short sequence                                     16                       √                   √                   98.31     98.30
Degenerate serialization                           1                        √                   √                   81.83     80.30
No multi-scale convolutions (kernel size 3 only)   64                       ×                   √                   99.23     99.22
No attention module                                64                       √                   ×                   99.25     99.24
No multi-scale and no attention                    64                       ×                   ×                   98.49     98.47
Note: √ indicates the component is enabled; × indicates removed.
Table 5. Federated IDS performance on UNSW-NB15 compared with existing FL-based approaches.

Method                                         Dataset     ACC (%)   F1-Score (%)   Notes
FedMSP-SPEC [26]                               UNSW-NB15   88.28     88.18          FL IDS, Dirichlet α = 1
Multi-view FL CAE-NSVM [27]                    UNSW-NB15   –         82.6           Multi-view FL, 3 clients
Serialized 1D-CNN with attention (this work)   UNSW-NB15   99.38     99.37          Same FL setting as Table 1

Share and Cite

MDPI and ACS Style

Li, W.; Gao, D.; Zhang, T. Federated Intrusion Detection via Unidirectional Serialization and Multi-Scale 1D Convolutions with Attention Reweighting. Future Internet 2026, 18, 117. https://doi.org/10.3390/fi18030117
