Article

Real-Time Bernoulli-Based Sequence Modeling for Efficient Intrusion Detection in Network Flow Data

by Abderrahman El Alami *, Ismail El Batteoui * and Khalid Satori
LISAC Laboratory, Faculty of Science Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, FSDM 22MF+97C, Fez 30050, Morocco
* Authors to whom correspondence should be addressed.
J. Cybersecur. Priv. 2026, 6(1), 32; https://doi.org/10.3390/jcp6010032
Submission received: 6 November 2025 / Revised: 30 December 2025 / Accepted: 4 January 2026 / Published: 10 February 2026
(This article belongs to the Section Security Engineering & Applications)

Abstract

The exponential growth of network traffic and the increasing sophistication of cyberattacks have underscored the need for intelligent and real-time Intrusion Detection Systems (IDS). Traditional flow-based IDS models typically analyze each network flow independently, ignoring the temporal and contextual dependencies among flows, which reduces their ability to recognize coordinated or multi-stage attacks. To address this limitation, this paper proposes a Bernoulli-based probabilistic sequence modeling framework that integrates statistical learning with visual feature representation for efficient intrusion detection. The approach begins with a comprehensive data-preprocessing pipeline that performs feature cleaning, encoding, normalization, and sequence aggregation. Each aggregated feature vector is then transformed into a 6 × 6 grayscale image, allowing the system to capture spatial correlations among network features through convolutional operations. A logistic regression model first estimates per-flow attack probabilities, and these are combined using the Bernoulli probability law to infer the likelihood of malicious activity across flow sequences. The resulting sequence-level representations are evaluated using lightweight classifiers such as TinyNet-6 × 6, MobileNetV2, and ResNet18. Experimental results on the CICIDS2017 dataset demonstrate that the proposed method achieves high detection accuracy with reduced computational cost compared to state-of-the-art deep models, highlighting its suitability for scalable, real-time IDS deployment.

1. Introduction

The increasing complexity, frequency, and sophistication of cyberattacks have intensified the demand for real-time Intrusion Detection Systems (IDS) capable of adaptive, low-latency response in dynamic network environments. Conventional flow-based IDS frameworks typically analyze each network flow independently, focusing on packet-level or statistical features while ignoring temporal and contextual relationships between consecutive flows. This isolationist perspective weakens their ability to recognize stealthy or multi-stage attacks, which often unfold progressively through correlated network activities and subtle temporal patterns [1,2].
In recent years, deep learning and sequential modeling approaches have significantly advanced intrusion detection by enabling the automatic extraction of temporal dependencies. Architectures such as Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Transformer-based models have demonstrated superior performance in identifying complex attack sequences by leveraging the time-ordered structure of network events. However, these methods are often computationally intensive, requiring substantial processing power and memory, which limits their feasibility for real-time or resource-constrained deployments [3,4].
To overcome these limitations, researchers have explored lightweight probabilistic and hybrid frameworks that preserve temporal awareness while maintaining computational efficiency. These strategies aim to provide a trade-off between accuracy, interpretability, and speed, addressing the increasing demand for scalable IDS solutions [5]. For instance, ensemble architectures combining deep and classical models have shown that contextual learning can enhance precision without excessive computational cost, yet they still depend on heavy feature extraction and parameter-tuning pipelines.
In this context, the present study introduces a Bernoulli-based sequence modeling framework that integrates probabilistic reasoning with visual encoding of network flows. The proposed approach begins by transforming raw network traffic data into structured grayscale images, where each pixel intensity represents a normalized flow feature. This visual transformation captures inter-feature correlations in a compact, spatially coherent format that facilitates downstream learning. A logistic regression model then estimates the probability of each individual flow being malicious. Subsequently, flows are grouped into short sequences, and the Bernoulli probability law is applied to compute the likelihood that a given sequence contains at least one attack instance. This formulation produces an interpretable, context-aware, and compressed probabilistic representation of flow sequences.
The resulting sequence-level features are further aggregated using mean-based statistical descriptors, yielding compact and low-dimensional representations suitable for traditional classifiers such as Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree. Similar temporal aggregation strategies over NetFlow records have been shown to enhance anomaly detection by capturing inter-flow dependencies while maintaining computational efficiency [6]. Experiments conducted on the CICIDS2017 dataset demonstrate that the proposed framework achieves high detection accuracy and low inference latency, outperforming baseline single-flow and conventional deep-learning models. This hybrid probabilistic–visual approach offers a scalable, interpretable, and computationally efficient solution for real-time intrusion detection.
The remainder of this paper is organized as follows. Section 2 presents the background and related work, summarizing recent advances in deep learning–based intrusion detection and identifying the research gap motivating this study. Section 3 describes the data preprocessing pipeline, including feature cleaning, encoding, and normalization procedures. Section 4 details the proposed methodology, encompassing flow-level probability estimation using logistic regression, construction of flow sequences, image transformation of network flows into grayscale representations, probabilistic aggregation via the Bernoulli law, and final classification using multiple lightweight models. Section 5 reports the experimental setup and results, evaluating performance across different sequence lengths, image configurations, and classifiers.

2. Background and Related Work

2.1. Background

Network Intrusion Detection Systems (IDS) form a critical component of modern cyber defense infrastructures, designed to identify malicious activity within increasingly complex and high-volume network environments. Historically, IDS techniques have evolved from signature-based detection, which relies on known attack patterns, to anomaly-based approaches that leverage statistical and machine-learning models for identifying deviations from normal traffic behavior. While these approaches have improved adaptability, they still face challenges in effectively modeling temporal dependencies and contextual relationships across network flows.
Traditional IDS models generally operate at the flow level, where each record summarizes communication between two endpoints. Large, labeled datasets such as CICIDS2017 have been widely adopted for training and evaluation, providing detailed flow logs that differentiate benign from malicious activities [6,7]. However, as network infrastructures expand, the resulting datasets grow exponentially, introducing computational and memory constraints that complicate real-time analysis.
Conventional flow-based models typically assess each flow independently, overlooking temporal correlations and contextual dependencies that exist among consecutive flows. This limitation hampers the detection of sophisticated multi-stage or low-rate attacks that unfold progressively across sessions. Incorporating temporal information significantly improves detection performance and reduces false-positive rates by enabling models to capture sequential behavioral patterns [8].
Motivated by these findings, a more context-aware representation groups individual flows into fixed-length sequences. This transformation captures higher-order behavioral dynamics while simultaneously compressing the dataset. For instance, forming non-overlapping sequences of three flows reduces the dataset size to roughly one-third of its original volume without sacrificing critical temporal cues. Within this framework, each flow instance is modeled as a Bernoulli trial, where a binary outcome denotes either normal or malicious behavior. Applying the Bernoulli probability law allows the estimation of the likelihood that at least one anomalous event occurs within a given sequence. The resulting aggregated probability acts as a compact yet expressive indicator of the sequence’s threat intensity.
This probabilistic formulation preserves the semantic integrity of the original data while enhancing computational scalability, making it particularly suitable for real-time, high-throughput intrusion detection systems. Despite ongoing progress in temporal and probabilistic modeling, there remains a gap in lightweight, interpretable frameworks capable of capturing sequential dependencies under real-time constraints. The present study addresses this gap by integrating flow-level probabilistic modeling, sequence construction, and Bernoulli-based aggregation into a unified approach for efficient and context-aware intrusion detection.

2.2. Related Work

Deep learning has become the dominant paradigm for modern Intrusion Detection Systems (IDS), providing the ability to automatically extract hierarchical representations from high-dimensional traffic data. Unlike conventional machine-learning models that depend heavily on handcrafted features, deep architectures learn multi-level abstractions that capture both spatial and temporal patterns of network behavior, thereby improving detection accuracy and robustness in dynamic environments.
Convolutional Neural Networks (CNNs) have been widely adopted to capture spatial correlations among network-flow features. Hybrid CNN–Random Forest IDS models, as well as multi-channel CNN frameworks, have achieved strong performance levels, with accuracies reaching approximately 97–98% on benchmark datasets such as KDD99 and CIC-IDS2017 [5].
To model temporal dependencies in sequential network flows, researchers have leveraged Recurrent Neural Networks (RNNs) and their advanced variants. Architectures integrating CNN and LSTM layers with attention mechanisms and ensemble fusion have achieved accuracies exceeding 99% on datasets such as CIC-IDS2017 [9].
Recent studies have explored transformer-based architectures to overcome the limitations of recurrent models in long-sequence learning. Vision-transformer-style IDS frameworks have demonstrated accuracy above 99% while significantly reducing false-positive rates [10].
Hybrid frameworks combining spatial–temporal feature learning with attention-inspired fusion strategies have also emerged as a promising direction. Such approaches integrate feature selection, CNN-based spatial encoders, and LSTM-based temporal modeling to balance sparsity and sequential dependency. Recent hybrid deep learning frameworks have demonstrated very high detection accuracy and reduced false-positive rates on benchmark datasets such as CIC-IDS2017 and UNSW-NB15 [11].
Alongside accuracy, a growing body of work emphasizes computational efficiency and interpretability. Hybrid SVM–DT models optimized for distributed inference and quantum-inspired SVMs operating in deep feature spaces have achieved near-real-time detection [12,13]. Other approaches have shown that incorporating dimensionality reduction techniques together with optimized deep learning architectures can substantially reduce training time without degrading detection precision.
Despite these advances, current deep IDS models still face trade-offs between accuracy, interpretability, and scalability. Transformer-based systems offer improved contextual modeling but often require significant computational resources, whereas hybrid CNN–RNN frameworks can struggle to generalize under shifting network conditions. Consequently, there remains a need for lightweight probabilistic deep architectures that preserve interpretability while efficiently aggregating sequential dependencies. The present study addresses this gap by introducing a Bernoulli-based probabilistic sequence-modeling framework, combining deep flow-level learning with statistically grounded aggregation for scalable, real-time intrusion detection.

3. Pre-Processing

Accurate intrusion detection relies heavily on the quality and consistency of input data. Network flow datasets typically contain redundant, incomplete, or environment-specific attributes that can degrade generalization. To address these issues, a structured preprocessing pipeline was implemented, as illustrated in Figure 1.

3.1. Feature Selection

The raw dataset included a wide range of flow-level attributes, many of which carried little or no predictive relevance. Features such as IP addresses, port numbers, and timestamps were excluded because they introduce environment-dependent bias and reduce model transferability. Attributes exhibiting extremely low variance or strong multicollinearity were also removed. The retained subset therefore comprised only behavioral indicators with measurable discriminative power for anomaly detection.
To preserve temporal structure while ensuring model generalizability, all flows were first sorted chronologically before feature extraction and selection. Although timestamp values and network identifiers (such as IP addresses and port numbers) were removed to avoid environment-dependent bias, the temporal ordering of the traffic is fully retained through this sorting process. From the original CICIDS2017 feature set, we selected 36 stable and discriminative numerical attributes that capture key behavioral patterns, including packet statistics, flow dynamics, inter-arrival characteristics, and TCP flag indicators. These features were chosen for their relevance to flow-level anomaly characterization and their suitability for generating the 6 × 6 grayscale representations used in the proposed framework. The complete list of the retained attributes is provided in Table 1, ensuring full transparency and reproducibility of the preprocessing methodology.

3.2. Handling Missing and Duplicate Records

To maintain data integrity, all records containing undefined or inconsistent values were filtered out. Missing or corrupted entries can distort the statistical distribution of features and impair model convergence. In addition, duplicated flows (arising from mirroring or redundant logging) were detected and eliminated. This procedure ensured that the learning algorithm operated on a consistent, non-redundant sample space.

3.3. Encoding Categorical Features

Although most selected attributes were numerical, a limited number of categorical fields (such as protocol identifiers and TCP flag indicators) were retained due to their analytical importance. These were transformed through one-hot encoding, yielding sparse binary vectors that capture category membership without imposing ordinal relationships. This encoding guaranteed full numerical compatibility while preserving the interpretability of discrete network behaviors.
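As a minimal sketch of this encoding step, a categorical field can be expanded into binary indicator columns with pandas; the column names below are illustrative placeholders, not the exact CICIDS2017 headers:

```python
import pandas as pd

# Illustrative flow records with a hypothetical categorical "Protocol" field
flows = pd.DataFrame({
    "Protocol": ["TCP", "UDP", "TCP"],
    "Flow Duration": [120.0, 45.0, 300.0],
})

# One-hot encode the categorical field into binary indicator columns,
# leaving numerical attributes untouched; no ordinal relation is imposed.
encoded = pd.get_dummies(flows, columns=["Protocol"], dtype=int)
```

Each category becomes its own 0/1 column (e.g., `Protocol_TCP`, `Protocol_UDP`), which keeps the representation fully numerical while preserving the interpretability of discrete protocol behavior.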

3.4. Normalization of Numerical Features

Following categorical transformation, all numerical features were rescaled to the [0, 1] interval using Min–Max normalization. This standardization mitigated disparities in feature magnitude, prevented scale-dominant bias, and improved convergence during optimization. The normalization process also contributed to model robustness, particularly for algorithms sensitive to input variance such as logistic regression and neural architectures.
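The Min–Max rescaling described above can be sketched as follows (a generic implementation, not the exact preprocessing code used in the study):

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each feature column of X to the [0, 1] interval."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # guard constant columns against division by zero
    return (X - col_min) / col_range

# Two illustrative features on very different scales
X = np.array([[10.0, 200.0],
              [20.0, 400.0],
              [30.0, 600.0]])
X_norm = min_max_normalize(X)
```

After rescaling, both columns span exactly [0, 1], so no feature dominates the optimization purely by magnitude.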

3.5. Final Output

After all preprocessing operations were completed, the resulting dataset was uniform, de-duplicated, and entirely numerical. Irrelevant features were pruned, categorical variables encoded, and continuous attributes normalized. The preprocessed data provided a stable foundation for the subsequent Bernoulli-based sequence modeling phase, where each flow instance was represented as a probabilistic vector suitable for per-flow likelihood estimation and sequence-level intrusion scoring.

4. Methodology

This section details the proposed pipeline for real-time intrusion detection using probabilistic modeling, flow-level sequence aggregation, grayscale image transformation, and deep learning classification. The design leverages the Bernoulli probability law to estimate the likelihood of attack within each flow sequence, enabling context-aware intrusion detection while maintaining real-time performance constraints. As illustrated in Figure 2, the pipeline consists of six main stages: (1) per-flow probability estimation using logistic regression, (2) flow sequence construction, (3) sequence-level probability calculation via the Bernoulli probability law, (4) feature aggregation, (5) grayscale image transformation, and (6) final classification using a deep learning model.

4.1. Per-Flow Probability Estimation with Logistic Regression

As the first step in our pipeline, we employ a logistic regression model to estimate the probability of each individual network flow being malicious. This lightweight and interpretable model generates a probability score $p_i \in [0, 1]$ for each flow $f_i$, serving as a foundation for downstream sequence modeling. These probabilistic outputs not only enable efficient scoring in real time but also act as a prerequisite for applying the Bernoulli probability law at the sequence level, where aggregated attack likelihoods are later computed. By assigning a probabilistic risk to each flow upfront, this step introduces a scalable and explainable layer of early detection within the broader pipeline.
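A minimal sketch of this flow-level scoring step, using scikit-learn's LogisticRegression on synthetic stand-in features (the data here is illustrative, not CICIDS2017):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy flow features and binary labels (0 = benign, 1 = malicious);
# the real pipeline operates on the 36 preprocessed CICIDS2017 attributes.
rng = np.random.default_rng(0)
X_benign = rng.normal(0.2, 0.05, size=(50, 4))
X_attack = rng.normal(0.8, 0.05, size=(50, 4))
X = np.vstack([X_benign, X_attack])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)

# p[i] in [0, 1]: estimated probability that flow f_i is malicious
p = clf.predict_proba(X)[:, 1]
```

The per-flow probabilities `p` are exactly the quantities later combined by the Bernoulli probability law at the sequence level.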

4.2. Sequence Construction (Window Lengths 2 to 5)

In the second stage of the pipeline, consecutive network flows are grouped into fixed-length temporal sequences to capture short-range behavioral context. After chronologically ordering the flows, a sliding-window procedure is applied to construct overlapping sequences of length SL ∈ {2, 3, 4, 5}. For a window starting at position k, the resulting sequence is defined as $S_k = \{f_k, f_{k+1}, \dots, f_{k+SL-1}\}$, which preserves the natural temporal progression of the traffic. Each flow within the window retains both its original feature vector and its associated maliciousness probability $p_i$ produced by the logistic regression model. Consequently, every sequence $S_k$ is associated with a probability vector $P_s = \{p_k, p_{k+1}, \dots, p_{k+SL-1}\}$.
To assist interpretation, Figure 3 provides a detailed example of the sliding-window construction for the case SL = 3, illustrating how overlapping flow segments are transformed into successive temporal sequences that serve as inputs for the Bernoulli-based aggregation stage.
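The sliding-window construction can be sketched as a short helper function; the flow identifiers below are placeholders:

```python
def build_sequences(flows, sl):
    """Overlapping sliding windows of length sl over time-ordered flows."""
    return [flows[k:k + sl] for k in range(len(flows) - sl + 1)]

# Five time-ordered flows and a window of SL = 3
flows = ["f1", "f2", "f3", "f4", "f5"]
sequences = build_sequences(flows, sl=3)
# → [['f1', 'f2', 'f3'], ['f2', 'f3', 'f4'], ['f3', 'f4', 'f5']]
```

Because the windows overlap, each interior flow contributes to several sequences, which preserves temporal continuity at the cost of some redundancy.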

4.3. Sequence-Level Probability Calculation (Bernoulli Probability Law)

Once sequences of flow-level probabilities $P_s = \{p_1, p_2, \dots, p_{SL}\}$ have been constructed, the overall risk of each sequence being malicious is evaluated using the Bernoulli probability law. Assuming approximate independence among flows, the probability that none of them is malicious is given by $\prod_{i=1}^{SL}(1 - p_i)$, leading to a sequence-level attack probability of:

$$P_{\text{sequence}} = 1 - \prod_{i=1}^{SL}(1 - p_i) \quad (1)$$

This formulation captures the cumulative risk across the sequence, making it sensitive to even a single high-probability malicious flow. Sequences with $P_{\text{sequence}} \geq \tau$ are labeled as attacks, where the threshold $\tau$ is selected empirically to optimize detection performance. This step enables context-aware classification and prepares the dataset for downstream representation and learning, bridging per-flow predictions and sequence-level decision-making.
Although Equation (1) relies on an independence assumption at the probability level, this assumption is employed as a modeling approximation rather than a claim that network flows are statistically independent in real-world traffic. When multiple flows originate from the same attack episode, they are interpreted as correlated evidence of a single malicious context rather than as separate attack instances. In this sense, the Bernoulli aggregation functions as a soft logical OR, estimating the presence of malicious activity within a short temporal window. The use of short sequence lengths (SL ∈ {2, 3, 4, 5}) further mitigates potential bias arising from correlated flows and contributes to stable and reliable sequence-level risk estimation.
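The aggregation of Equation (1) can be implemented directly; the probabilities below are illustrative:

```python
import numpy as np

def sequence_attack_probability(p):
    """P_sequence = 1 - prod(1 - p_i): the probability that at least one
    flow in the window is malicious, under approximate independence."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.prod(1.0 - p)

# A single high-probability flow dominates the aggregated score,
# illustrating the "soft logical OR" behavior described in the text.
p_seq = sequence_attack_probability([0.05, 0.9, 0.1])
```

For the example window, $1 - (0.95 \times 0.1 \times 0.9) \approx 0.9145$, so the sequence would be flagged for any reasonable threshold $\tau$.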

4.4. Sequence Feature Aggregation (Mean of Each Feature)

In addition to deriving the sequence-level probability, we construct a compact numerical description of each sequence by aggregating the flow features across the SL elements in the window. For every numerical attribute $x_j$, we compute the average value over the SL flows, yielding a single representative vector for the sequence:

$$\bar{x}_{j,\text{seq}} = \frac{1}{SL}\sum_{i=1}^{SL} x_{i,j} \quad \text{for each feature } j$$
Using the mean provides a stable and low-dimensional summary of the sequence while preserving the overall temporal tendency of the underlying traffic. This design choice is motivated by the real-time nature of the proposed framework: incorporating additional statistics such as variance or range would increase feature dimensionality and inference cost, which is undesirable for lightweight deployment. Preliminary analysis also indicated that adding such dispersion measures did not lead to noticeable performance improvements. Moreover, the Bernoulli-based sequence probability inherently reflects fluctuations in per-flow maliciousness within the window, indirectly capturing part of the intra-sequence variability. The resulting aggregated vector thus offers an effective compromise between expressiveness and computational efficiency and is subsequently used for the grayscale image transformation stage.
It is important to emphasize that mean-based feature aggregation does not serve as the primary decision mechanism for intrusion detection. The detection process is driven at the flow level, where each individual flow is assigned a malicious probability using logistic regression, and at the sequence level through Bernoulli-based probabilistic aggregation. This formulation is intentionally sensitive to single high-risk events, ensuring that sequences containing isolated burst-driven anomalies remain detectable regardless of feature averaging. Mean aggregation is applied only after probabilistic scoring and serves a representational role, providing a compact and stable statistical summary that supports efficient downstream learning without obscuring critical anomaly signals.
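The mean-based aggregation above reduces to a per-column average over the window, for example:

```python
import numpy as np

# Three flows of one window (SL = 3), each with four illustrative
# numerical features already normalized to [0, 1].
window = np.array([[0.2, 0.4, 0.1, 0.9],
                   [0.4, 0.6, 0.3, 0.7],
                   [0.6, 0.8, 0.5, 0.5]])

# x_bar[j]: mean of feature j over the SL flows in the window
x_bar = window.mean(axis=0)
```

The resulting vector has the same dimensionality as a single flow, which keeps the downstream image representation and classifier input size fixed regardless of SL.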

4.5. Image Transformation (6 × 6 Grayscale from Aggregated Features)

Once each sequence is reduced to an aggregated feature vector, we transform it into a two-dimensional grayscale image to enable spatial pattern learning via convolutional neural networks. Specifically, a fixed set of 36 normalized features is selected from each sequence vector and reshaped into a 6 × 6 matrix:

$$I = \text{reshape}(x, [6 \times 6])$$

where $x = (x_1, x_2, \dots, x_{36}) \in \mathbb{R}^{36}$ is the aggregated feature vector, and $I \in \mathbb{R}^{6 \times 6}$ is the resulting grayscale image. Each pixel $I_{i,j}$ corresponds to a normalized feature value mapped to the range [0, 255], preserving the semantic structure and distribution of flow-level attributes. Examples of the resulting 6 × 6 grayscale representations for normal and attack classes are illustrated in Figure 4.
This transformation bridges tabular and image representations, enabling the downstream deep learning model to leverage spatial locality and feature correlations inherent in attack patterns.
The 6 × 6 grayscale representation is not intended to encode strict physical or semantic spatial relationships between individual features. Rather, features are arranged according to functional similarity (e.g., packet statistics, inter-arrival timing metrics, and protocol-level indicators) to encourage meaningful local correlations where possible. Although non-adjacent features are not jointly processed by a single convolutional kernel, stacked convolutional layers progressively expand the receptive field, enabling higher-level filters to integrate information from spatially distant regions of the feature map. Compared to multilayer perceptrons, convolutional architectures offer parameter sharing, improved regularization, and enhanced robustness to small variations in feature magnitude. These properties are particularly advantageous when learning from compact representations under real-time and resource-constrained deployment scenarios.
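The reshaping step can be sketched as follows, mapping a 36-dimensional normalized vector to a 6 × 6 grayscale image (a generic sketch; the `linspace` input is a stand-in for a real aggregated feature vector):

```python
import numpy as np

def to_grayscale_image(x):
    """Reshape a 36-dim feature vector (values in [0, 1]) into a 6x6
    grayscale image with pixel intensities in [0, 255]."""
    x = np.asarray(x, dtype=float)
    assert x.shape == (36,), "expects exactly 36 aggregated features"
    return (x.reshape(6, 6) * 255.0).astype(np.uint8)

x = np.linspace(0.0, 1.0, 36)  # placeholder aggregated feature vector
img = to_grayscale_image(x)
```

Row-major reshaping keeps functionally related features (which are placed consecutively in the vector) adjacent in the image, supporting the local-correlation argument above.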

4.6. Final Classification Using Deep Learning Models

In the final stage of the pipeline, the grayscale images generated from aggregated flow sequences are classified using three deep learning models. The first is TinyNet-6 × 6, a custom lightweight convolutional neural network designed specifically for low-resolution grayscale images. It incorporates depthwise separable convolutions, Squeeze-and-Excitation (SE) blocks, and global average pooling to achieve high accuracy with minimal computational overhead.
To benchmark our architecture, we also evaluate two widely adopted CNN models: MobileNetV2, known for its efficiency in real-time and resource-constrained environments, and ResNet18, a residual network offering deeper representational power with relatively low complexity. Each model is trained using binary cross-entropy loss and the Adam optimizer, with normalized image inputs and a batch size of 64.
During inference, only the trained deep learning model and the grayscale images are required, ensuring that the pipeline remains efficient, scalable, and fully decoupled from earlier probabilistic modeling steps.
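To make the architecture description concrete, the following is a hypothetical Keras sketch of a TinyNet-6 × 6-style model. The paper names depthwise separable convolutions, SE blocks, global average pooling, binary cross-entropy loss, and the Adam optimizer, but it does not list exact layer sizes, so the filter widths and SE reduction size below are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_tinynet():
    """Hypothetical TinyNet-6x6-style sketch (layer widths are assumed)."""
    inp = layers.Input(shape=(6, 6, 1))
    x = layers.SeparableConv2D(16, 3, padding="same", activation="relu")(inp)
    # Squeeze-and-Excitation block: channel-wise reweighting
    se = layers.GlobalAveragePooling2D()(x)
    se = layers.Dense(8, activation="relu")(se)
    se = layers.Dense(16, activation="sigmoid")(se)
    x = layers.Multiply()([x, layers.Reshape((1, 1, 16))(se)])
    x = layers.SeparableConv2D(32, 3, padding="same", activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inp, out)
    # Training configuration stated in the paper: BCE loss, Adam optimizer
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_tinynet()
```

The sigmoid output yields a single attack probability per image, matching the binary (benign vs. attack) labeling of the sequence-level dataset.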

5. Results and Comparative Analysis

5.1. Experimental Setup

All experiments were conducted using the publicly available CICIDS2017 dataset, which provides labeled flow-level network traffic encompassing a variety of benign and malicious activities. After applying the preprocessing pipeline described in Section 3, each flow sequence was transformed into a 6 × 6 grayscale image representing 36 aggregated features.
To investigate the effect of temporal context, we generated datasets using four sequence lengths (SL ∈ {2, 3, 4, 5}). For each SL, overlapping sliding windows were applied to the time-ordered flow records, producing distinct training and test sets. Each image was labeled as attack or benign according to the Bernoulli probability law.
The dataset was split into 70% training, 15% validation, and 15% testing. All features were normalized to [0, 1] before transformation.
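The 70/15/15 partition can be reproduced with two successive scikit-learn splits (a generic sketch with placeholder data; the `random_state` value is an arbitrary choice for reproducibility):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the preprocessed sequence images/labels
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

# First split off 70% for training, then divide the remaining 30%
# evenly into validation and test partitions.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42)
```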
All models were implemented in Python 3.9 using TensorFlow/Keras 2.13 and trained on an NVIDIA GPU (RTX 3090) with binary cross-entropy loss, the Adam optimizer (η = 0.001), and a batch size of 64. Early stopping based on validation loss was employed to prevent overfitting.

5.2. Evaluation Metrics

We evaluated each model using five standard metrics widely adopted in intrusion detection research:
  • Accuracy (ACC): Proportion of correctly classified samples.
  • Precision (PRE): Proportion of predicted attacks that are true attacks.
  • Recall (REC): Proportion of actual attacks correctly detected.
  • F1-Score (F1): Harmonic mean of precision and recall.
  • AUC (Area Under the Curve): Represents the overall ability of the model to separate the classes.
These metrics provide a balanced assessment of classification performance, capturing both correctness and robustness.
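All five metrics can be computed with scikit-learn; the labels and scores below are toy values for illustration (threshold fixed at 0.5):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Toy ground truth and model probabilities (not real experimental output)
y_true  = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.8, 0.9, 0.3, 0.2, 0.7, 0.6])
y_pred  = (y_score >= 0.5).astype(int)  # hard labels for ACC/PRE/REC/F1

acc = accuracy_score(y_true, y_pred)
pre = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1  = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)   # threshold-independent, uses scores
```

Note that AUC is computed from the raw probability scores rather than the thresholded predictions, which is what makes it threshold-independent.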

5.3. Class Imbalance Considerations

The CICIDS2017 dataset exhibits a pronounced class imbalance, where benign traffic significantly outnumbers malicious instances. In such scenarios, relying solely on accuracy may provide a misleading assessment of model performance, as a classifier biased toward the majority class can achieve high accuracy while failing to detect minority-class attacks.
To ensure a reliable evaluation under these conditions, we place particular emphasis on class-sensitive metrics, including Precision, Recall, F1-score, and the Area Under the ROC Curve (AUC). The F1-score provides a balanced measure by jointly accounting for false positives and false negatives, while AUC offers a threshold-independent assessment of the model’s ability to separate benign and malicious traffic.
As reported in Table 2 and Table 3, the proposed framework achieves consistently high F1-score and AUC values across all sequence lengths, indicating that the observed performance gains are not driven by majority-class dominance. These results confirm that the Bernoulli-based sequence modeling approach maintains robust intrusion detection capability despite the inherent class imbalance of the dataset.

5.4. Baseline Performance on Original Dataset (SL = 1)

As an initial baseline, we evaluated all three models (TinyNet-6 × 6, MobileNetV2, and ResNet18) using the original CICIDS2017 flow records without constructing temporal sequences (SL = 1). This configuration allows us to assess the intrinsic separability of individual flows, independently of any temporal dependencies. The models achieve remarkably high performance in this setting, which reflects the effectiveness of the preprocessing pipeline and the strong discriminative power of many flow-level attack signatures in the dataset. As illustrated in Figure 5, TinyNet-6 × 6 reaches an accuracy of 0.99, followed by MobileNetV2 at 0.97 and ResNet18 at 0.96. It is important to note, however, that this strong baseline does not imply that temporal correlations are uninformative. Rather, it indicates that CICIDS2017 contains many attacks that are readily detectable from single-flow characteristics. In practice, multi-stage or low-rate attacks frequently span several correlated flows, and sequence-based modeling remains essential for capturing such behaviors despite the high accuracy observed at SL = 1.
In terms of loss convergence, illustrated in Figure 6, all models exhibit consistent and smooth optimization, with TinyNet demonstrating the fastest convergence and lowest final loss. This supports its suitability for real-time intrusion detection tasks, particularly in edge or resource-constrained environments where efficiency and speed are critical.
To provide a more complete evaluation, we report standard classification metrics (Precision, Recall, F1-score, and AUC) in Table 2 to assess the models’ robustness on flow-level detection. All models deliver high precision and recall values above 0.94, confirming a low false positive rate and strong detection sensitivity. Notably, TinyNet-6 × 6 achieves the highest F1-score, reinforcing its balanced performance across detection criteria. These metrics underscore the effectiveness of our initial preprocessing and Bernoulli-based scoring even before incorporating temporal context or image modeling.
To complement the standard evaluation metrics, we also report the Area Under the ROC Curve (AUC), which provides a threshold-independent measure of classifier performance. Figure 7 presents the ROC curves for the three models, showing consistently high separability between benign and malicious flows.

5.5. Enhanced Evaluation with Sequence Aggregation SL = {2, 3, 4, 5}

To assess the added value of temporal context, we extended our experiments by aggregating flows into sequences of varying lengths, specifically SL ∈ {2, 3, 4, 5}, using a sliding window mechanism. This strategy enables the modeling of short-term behavioral dependencies, allowing the system to detect distributed or multi-step intrusion patterns that may be imperceptible in isolated flows.
In terms of data efficiency, this aggregation strategy offers a clear advantage by reducing the number of samples to be processed. Rather than handling thousands of individual flow records, the proposed pipeline compresses multiple flows into a single representative image, substantially decreasing data volume and enabling faster training while preserving sufficient anomaly-related information for accurate detection. This design leads to a more lightweight and scalable IDS architecture, particularly suitable for deployment in resource-constrained environments.
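The flow-to-image compression step can be sketched as follows. The per-feature averaging mirrors the feature-averaging strategy used in the pipeline, while the min-max scaling to [0, 255] is an assumed normalization detail:

```python
import numpy as np

def sequence_to_image(seq: np.ndarray) -> np.ndarray:
    """Collapse a (SL, 36) flow sequence into a single 6 x 6 grayscale image.

    The sequence is averaged feature-wise, min-max scaled to [0, 255]
    (scaling choice assumed), and reshaped into a 6 x 6 pixel grid.
    """
    vec = seq.mean(axis=0)                   # one 36-dim vector per sequence
    lo, hi = vec.min(), vec.max()
    scaled = (vec - lo) / (hi - lo + 1e-12)  # guard against constant vectors
    return (scaled * 255).astype(np.uint8).reshape(6, 6)

img = sequence_to_image(np.random.rand(3, 36))
print(img.shape, img.dtype)  # (6, 6) uint8
```

Each of the 36 pixels thus encodes one normalized flow feature, which is what lets the convolutional classifiers exploit spatial correlations among features.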
As shown in Figure 8 and Figure 9, the introduction of sequence-level aggregation maintains high performance across models, particularly at SL = 2, where accuracy remains comparable to the SL = 1 baseline. TinyNet-6 × 6 consistently leads in both accuracy and convergence speed, achieving 0.98 accuracy at SL = 2, only slightly below its SL = 1 performance. However, as sequence length increases, a gradual degradation in performance is observed, especially at SL = 5, where model accuracy converges around 0.94–0.95. This decline is attributed to potential information dilution or noise introduced when aggregating too many flows, highlighting a trade-off between temporal richness and discriminative clarity.
Training loss curves (Figure 9) reflect stable optimization behavior across all models, with TinyNet again achieving the fastest and most stable convergence. The smoothness of the loss landscape across sequence lengths further supports the robustness of our grayscale transformation and feature-averaging strategy under different temporal granularities.
To further analyze the impact of sequence length on classification robustness, Figure 10 presents the ROC curves obtained for sequence lengths SL = 2, 3, 4, and 5 using the three evaluated models. Across all configurations, the curves remain strongly concave, confirming stable discriminative capacity across the range of aggregation windows. As expected, shorter sequences (SL = 2 and SL = 3) achieve higher true-positive rates in the low false-positive region, reflecting the benefits of denser, less variable flow representations. When the sequence length increases to SL = 4 and SL = 5, a gradual reduction in ROC–AUC is observed, which is consistent with the performance summarized in Table 3. This trend indicates that longer input sequences introduce additional temporal variability that slightly challenges the models, yet all architectures still maintain ROC–AUC levels above 0.95. Overall, these results demonstrate that the proposed representation is effective across a wide range of sequence lengths, with particularly strong resilience for short-window intrusion scenarios.
Table 3 summarizes the quantitative evaluation results, reporting Accuracy, Precision, Recall, AUC, and F1-score across different sequence lengths and models. The results indicate that SL = 2 provides the most favorable balance between representation compactness, computational efficiency, and predictive performance, while longer sequence lengths (SL ≥ 3) yield progressively smaller performance gains.
The experiments reveal a consistent trend: all three models (TinyNet-6 × 6, MobileNetV2, and ResNet18) achieve their highest classification accuracies when trained on individual flows (SL = 1), with performance values of 0.99, 0.97, and 0.96, respectively. As sequence length increases (SL ∈ {2, 3, 4, 5}), a gradual decline in accuracy is observed, with final values settling around 0.94 for SL = 5. This performance degradation is expected, as longer sequences may introduce temporal noise and over-smoothing in the aggregated features.
Table 3 also illustrates the stability of the F1-score and AUC across different sequence lengths. For SL = 2, TinyNet-6 × 6 achieves an F1-score of 0.975 and an AUC of 0.990, indicating that the observed accuracy is supported by balanced detection performance for both benign and malicious sequences.
As the sequence length increases (SL ≥ 4), a gradual decrease in F1-score and AUC is observed across all models. This behavior suggests that longer aggregation windows introduce additional temporal variability, which can slightly affect minority-class discrimination. Nevertheless, AUC values remain consistently above 0.95, indicating strong class separability even under increased temporal aggregation.
These results demonstrate that the proposed Bernoulli-based aggregation preserves robust class-sensitive performance and is not biased toward the majority class, with SL = 2 offering the best compromise between detection effectiveness, temporal context, and computational efficiency.
However, the advantage of sequence-based aggregation lies in data compression and computational efficiency. By reducing the number of flow instances, sequence modeling facilitates faster training cycles and lower memory overhead. Notably, even at SL = 5, all models maintained accuracy above 0.93, demonstrating the resilience of the proposed Bernoulli-based aggregation and grayscale transformation pipeline.
Figure 11 illustrates training accuracy trends, confirming that TinyNet remains the most robust and efficient model throughout. These results validate the dual objective of the framework: enabling high-accuracy detection while supporting lightweight, real-time deployment in network environments with dynamic flow behavior.
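For completeness, one common reading of the Bernoulli combination rule used throughout this section scores a sequence by the probability that at least one of its flows is malicious, assuming independence between per-flow estimates; the paper's exact formulation may differ, so treat this as a sketch:

```python
import math

def sequence_attack_probability(flow_probs):
    """Combine per-flow attack probabilities p_i under a Bernoulli model:
    P(sequence malicious) = 1 - prod(1 - p_i), i.e. the probability that
    at least one flow in the window is an attack (independence assumed)."""
    return 1.0 - math.prod(1.0 - p for p in flow_probs)

# Three moderately suspicious flows already yield a high sequence-level score
print(round(sequence_attack_probability([0.3, 0.5, 0.4]), 3))  # 0.79
```

A property of this rule is that several individually ambiguous flows can jointly push a sequence over the decision threshold, which is precisely the behavior needed to flag distributed or multi-step intrusions.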

5.6. Comparative Analysis

To further evaluate the effectiveness of the proposed Bernoulli-based sequence modeling framework, a comparative analysis was performed against recent deep learning–based IDS approaches reported in the literature. Traditional CNN-LSTM and hybrid architectures have achieved remarkable accuracy in intrusion detection but remain computationally intensive, limiting their suitability for real-time environments.
By comparison, the proposed model attains comparable or higher detection accuracy while significantly reducing computational cost through its lightweight probabilistic aggregation and 6 × 6 grayscale feature encoding. Unlike deep sequential networks that require extensive parameter tuning and GPU resources, the Bernoulli-based method maintains a faster convergence rate and lower latency, making it practical for large-scale, real-time IDS applications.
The accuracy value of 99.1% reported for the proposed framework in Table 4 corresponds to the SL = 1 (single flow) configuration, which is included as a baseline reference rather than as a full temporal modeling setting. Most of the state-of-the-art methods included in the comparison rely on explicit temporal architectures, such as CNN-LSTM or attention-based models, and are designed to operate on multi-flow sequences.
Under longer sequence configurations (SL = 3 and SL = 5), the proposed framework achieves accuracy values in the range of 94–96%, which may be lower than those reported by some heavily parameterized deep models. This behavior reflects a deliberate design choice that prioritizes lightweight probabilistic aggregation and data compression over complex temporal learning. Aggregating flows into short sequences substantially reduces dataset size and computational overhead while preserving meaningful detection capability. The framework is therefore not designed to maximize raw accuracy under long temporal windows, but to provide a balanced trade-off between detection performance, computational efficiency, and real-time applicability, making it suitable for high-throughput or resource-constrained intrusion detection environments.

6. Conclusions

This paper presented a Bernoulli-based sequence modeling framework that integrates probabilistic reasoning with visual feature encoding for efficient network intrusion detection. By transforming aggregated flow statistics into compact 6 × 6 grayscale images, the proposed approach bridges lightweight statistical modeling with deep visual analysis, enabling the detection of complex and temporally correlated attack patterns. Flow-level intrusion probabilities are estimated using logistic regression and aggregated using the Bernoulli probability law to produce sequence-level decisions.
Experimental evaluation on the CICIDS2017 dataset showed that the framework achieves high detection accuracy (≈99%), low false-positive rates, and significantly lower computational overhead compared to heavier architectures such as CNN-LSTM and XGBoost-CNN-LSTM. This combination of interpretability, robustness, and computational efficiency makes the approach well suited for real-time network defense in resource-constrained or high-throughput environments.
The compact 6 × 6 representation is also well suited to practical deployment scenarios, including IoT edge gateways, industrial control systems, and encrypted-traffic behavioral monitoring, where fast and lightweight inference is essential. These environments provide promising testbeds for evaluating the generalization of short-sequence intrusion patterns across diverse operational settings.
Future work will extend the framework to multi-class attack detection, incorporate adaptive thresholding for dynamic network conditions, and explore lightweight transformer-based architectures to further strengthen temporal modeling. We also intend to evaluate the proposed method on hardware-accelerated platforms (such as FPGAs and edge inference devices) to provide a more comprehensive assessment of low-latency performance and enable long-term self-adaptation against emerging cyber threats.

Author Contributions

Conceptualization, A.E.A. and I.E.B.; methodology, A.E.A. and I.E.B.; software, A.E.A.; validation, I.E.B. and K.S.; formal analysis, A.E.A. and I.E.B.; investigation, A.E.A. and I.E.B.; resources, A.E.A. and I.E.B.; data curation, A.E.A.; writing—original draft preparation, A.E.A.; writing—review and editing, A.E.A. and I.E.B.; visualization, A.E.A.; supervision, I.E.B. and K.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the University of Sidi Mohamed Ben Abdellah, Faculty of Sciences Dhar El Mehraz (Fes, Morocco).

Data Availability Statement

The data used in this study are publicly available. The CIC-IDS2017 dataset can be accessed at https://www.unb.ca/cic/datasets/ids-2017.html, accessed on 6 November 2025.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Nguyen, L.G.; Watabe, K. A method for network intrusion detection using flow sequence and BERT framework. arXiv 2023, arXiv:2310.17127.
2. Chen, J.; Zhou, H.; Mei, Y.; Adam, G.; Bastian, N.D.; Lan, T. Real time Network Intrusion Detection via Decision Transformers. arXiv 2023, arXiv:2312.07696.
3. Nazre, R.; Budke, R.; Oak, O.; Sawant, S.; Joshi, A. A Temporal Convolutional Network-based Approach for Network Intrusion Detection. arXiv 2024, arXiv:2412.17452.
4. Boswell, B.; Barrett, S.; Rajaganapathy, S.; Dorai, G.; Qiu, M. FLARE: Feature-based Lightweight Aggregation for Robust Evaluation of IoT Intrusion Detection. arXiv 2025, arXiv:2504.15375.
5. Doost, P.A.; Moghadam, S.S.; Khezri, E.; Basem, A.; Trik, M. A new intrusion detection method using ensemble classification and feature selection. Sci. Rep. 2025, 15, 13642.
6. Luay, A.; Wu, Y.; Zhang, H. Temporal modeling of NetFlow records for anomaly-based intrusion detection. arXiv 2025, arXiv:2503.04404v1.
7. Li, J.; Li, L. A Lightweight Network Intrusion Detection System Based on Temporal Convolutional Networks and Attention Mechanisms. Comput. Fraud Secur. 2025, 2025.
8. Zhang, C.; Li, J.; Wang, N.; Zhang, D. Research on Intrusion Detection Method Based on Transformer and CNN-BiLSTM in Internet of Things. Sensors 2025, 25, 2725.
9. Gueriani, A.; Kheddar, H.; Mazari, A.C. Adaptive Cyber-Attack Detection in IIoT Using Attention-Based LSTM-CNN Models. arXiv 2025, arXiv:2501.13962.
10. Zhou, H.; Zou, H.; Li, W.; Li, D.; Kuang, Y. HiViT-IDS: An Efficient Network Intrusion Detection Method Based on Vision Transformer. Sensors 2025, 25, 1752.
11. Sajid, M.; Malik, K.R.; Almogren, A.; Malik, T.S.; Khan, A.H.; Tanveer, J.; Ur Rehman, A. Enhancing intrusion detection: A hybrid machine and deep learning approach. J. Cloud Comp. 2024, 13, 123.
12. Al-Saleh, A. A balanced communication-avoiding support vector machine decision tree method for smart intrusion detection systems. Sci. Rep. 2023, 13, 9083.
13. Lima, M.G.; Carvalho, A.; Álvares, J.G.; Chagas, C.E.D.; Goldschmidt, R.R. Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems. arXiv 2024, arXiv:2407.11105.
Figure 1. Data preprocessing workflow for flow-based intrusion detection using machine learning.
Figure 2. Overview of the Sequence-Based Intrusion Detection Pipeline.
Figure 3. Illustration of the sliding-window mechanism (SL = 3) applied to consecutive network flows. The arrows indicate the sliding window shift used to generate successive flow sequences, where p_i represents the estimated probability associated with flow i.
Figure 4. Examples of 6 × 6 grayscale images generated from aggregated flow sequences: (a) Normal class samples and (b) Attack class samples. Each image encodes 36 normalized flow features, where pixel intensity reflects the relative magnitude of individual attributes.
Figure 5. Training accuracy curves for SL = 1 across TinyNet-6 × 6, MobileNetV2, and ResNet18.
Figure 6. Corresponding training loss curves across TinyNet-6 × 6, MobileNetV2, and ResNet18 on the original dataset (SL = 1).
Figure 7. ROC curves and AUC values for TinyNet-6 × 6, MobileNetV2, and ResNet18 on the CICIDS2017 dataset (SL = 1).
Figure 8. Training accuracy curves for SL = {2, 3, 4, 5} across TinyNet-6 × 6, MobileNetV2, and ResNet18.
Figure 9. Corresponding training loss curves across TinyNet-6 × 6, MobileNetV2, and ResNet18 for SL = {2, 3, 4, 5}.
Figure 10. ROC Curves Across Sequence Lengths (SL = 2, 3, 4, 5).
Figure 11. Training Accuracy Across All Sequence Lengths (SL = 1 to 5).
Table 1. List of the 36 Selected Flow Features Retained from CICIDS2017 for Sequence Aggregation and Grayscale Image Generation.
Feature Name | Description
Flow Duration | Total duration of the flow in microseconds.
Flow Bytes/s | Number of bytes transmitted per second during the flow.
Flow Packets/s | Number of packets transmitted per second.
Flow IAT Mean | Average inter-arrival time between consecutive packets of the flow.
Flow IAT Std | Standard deviation of inter-arrival times within the flow.
Flow IAT Max | Maximum recorded packet inter-arrival time.
Flow IAT Min | Minimum recorded packet inter-arrival time.
Total Fwd Packets | Number of packets sent from source to destination.
Total Fwd Bytes | Total number of bytes sent in the forward direction.
Fwd Packet Length Mean | Average size of packets sent forward.
Fwd Packet Length Std | Standard deviation of forward packet lengths.
Fwd Packet Length Max | Maximum packet size in the forward direction.
Fwd Packet Length Min | Minimum packet size in the forward direction.
Fwd IAT Mean | Average inter-arrival time between forward packets.
Fwd IAT Std | Variation in forward packet inter-arrival times.
Fwd IAT Max | Longest time interval between forward packets.
Fwd IAT Min | Shortest time interval between forward packets.
Total Backward Packets | Number of packets sent from destination to source.
Total Backward Bytes | Total number of bytes sent backward.
Bwd Packet Length Mean | Average packet size in the backward direction.
Bwd Packet Length Std | Standard deviation of backward packet lengths.
Bwd Packet Length Max | Maximum backward packet size.
Bwd Packet Length Min | Smallest backward packet size.
Bwd IAT Mean | Mean inter-arrival time between backward packets.
Bwd IAT Std | Standard deviation of backward inter-arrival time.
Bwd IAT Max | Maximum inter-arrival time for backward packets.
Bwd IAT Min | Minimum inter-arrival time for backward packets.
Packet Length Mean | Average length of all packets in the flow.
Packet Length Std | Variability in packet size within the flow.
Packet Length Variance | Statistical variance of packet lengths.
Average Packet Size | Mean size of packets considering both directions.
FIN Flag Count | Number of packets with FIN flag set.
SYN Flag Count | Number of packets with SYN flag set.
RST Flag Count | Number of packets with RST flag set.
PSH Flag Count | Number of packets with PSH flag set.
ACK Flag Count | Number of packets with ACK flag set.
Table 2. Evaluation metrics on the original dataset (SL = 1).
Sequence Length (SL) | Dataset Size | Model | Accuracy | Precision | Recall | F1-Score | AUC
1 | 691,395 | TinyNet-6 × 6 | 0.99 | 0.98 | 0.99 | 0.985 | 0.995
1 | 691,395 | MobileNetV2 | 0.97 | 0.97 | 0.97 | 0.965 | 0.983
1 | 691,395 | ResNet18 | 0.96 | 0.96 | 0.96 | 0.954 | 0.975
Table 3. Evaluation metrics for SL ∈ {2, 3, 4, 5} across models.
Sequence Length (SL) | Dataset Size | Model | Accuracy | Precision | Recall | F1-Score | AUC
2 | 345,698 | TinyNet-6 × 6 | 0.98 | 0.97 | 0.98 | 0.975 | 0.990
2 | 345,698 | MobileNetV2 | 0.97 | 0.96 | 0.96 | 0.96 | 0.982
2 | 345,698 | ResNet18 | 0.96 | 0.95 | 0.95 | 0.95 | 0.972
3 | 230,465 | TinyNet-6 × 6 | 0.96 | 0.95 | 0.95 | 0.95 | 0.980
3 | 230,465 | MobileNetV2 | 0.95 | 0.94 | 0.94 | 0.94 | 0.972
3 | 230,465 | ResNet18 | 0.94 | 0.93 | 0.93 | 0.93 | 0.963
4 | 172,849 | TinyNet-6 × 6 | 0.95 | 0.94 | 0.94 | 0.94 | 0.973
4 | 172,849 | MobileNetV2 | 0.94 | 0.93 | 0.93 | 0.93 | 0.965
4 | 172,849 | ResNet18 | 0.94 | 0.92 | 0.92 | 0.92 | 0.960
5 | 138,279 | TinyNet-6 × 6 | 0.94 | 0.92 | 0.93 | 0.925 | 0.965
5 | 138,279 | MobileNetV2 | 0.94 | 0.91 | 0.92 | 0.915 | 0.958
5 | 138,279 | ResNet18 | 0.93 | 0.90 | 0.91 | 0.905 | 0.952
Table 4. Comparison of the Proposed Method with Recent Deep Learning–Based IDS Models.
Reference | Model Type | Dataset | Accuracy (%)
Gueriani et al. [9] | CNN-LSTM + Attention | CIC-IDS2017 | 99.3
Sajid et al. [11] | XGBoost-CNN-LSTM | CIC-IDS2017/UNSW-NB15 | 98.7
Al-Saleh [12] | BCA-SVM + DT | UNSW-NB15 | 97.8
Proposed Framework | Bernoulli + Logistic Regression + Image Encoding | CIC-IDS2017 | 99.1
