Real-Time Bernoulli-Based Sequence Modeling for Efficient Intrusion Detection in Network Flow Data
Abstract
1. Introduction
2. Background and Related Work
2.1. Background
2.2. Related Work
3. Pre-Processing
3.1. Feature Selection
3.2. Handling Missing and Duplicate Records
3.3. Encoding Categorical Features
3.4. Normalization of Numerical Features
3.5. Final Output
4. Methodology
4.1. Per-Flow Probability Estimation with Logistic Regression
4.2. Sequence Construction (Window Lengths 2 to 5)
4.3. Sequence-Level Probability Calculation (Bernoulli Probability Law)
4.4. Sequence Feature Aggregation (Mean of Each Feature)
4.5. Image Transformation (6 × 6 Grayscale from Aggregated Features)
4.6. Final Classification Using Deep Learning Models
5. Results and Comparative Analysis
5.1. Experimental Setup
5.2. Evaluation Metrics
- Accuracy (ACC): Proportion of correctly classified samples.
- Precision (PRE): Proportion of predicted attacks that are true attacks.
- Recall (REC): Proportion of actual attacks correctly detected.
- F1-Score (F1): Harmonic mean of precision and recall.
- AUC (Area Under the Curve): Represents the overall ability of the model to separate the classes.
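
The metric definitions above can be made concrete with a short sketch. The helper names `binary_metrics` and `auc_score` are illustrative and do not come from the paper. The sketch assumes that attacks are encoded as label 1 and benign flows as label 0, and that AUC refers to the area under the ROC curve, computed here by pairwise comparison of scores.

```python
def binary_metrics(y_true, y_pred):
    """Compute ACC, PRE, REC and F1 for binary labels (assumed: 1 = attack, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (no predicted or no actual attacks).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc_score(y_true, scores):
    """ROC AUC as the fraction of (attack, benign) pairs ranked correctly.

    Ties count as half a correct ranking. This is O(n^2) and intended
    only as an illustration on small inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 4 attacks and 4 benign flows.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 1, 0, 0, 0, 1, 1, 0]
attack_scores = [0.9, 0.8, 0.4, 0.1, 0.2, 0.7, 0.6, 0.3]
print(binary_metrics(labels, predictions))
print(auc_score(labels, attack_scores))
```

Accuracy, precision, recall and F1 depend on the decision threshold that produced the hard predictions, whereas AUC is computed from the raw scores and is threshold-independent.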
5.3. Class Imbalance Considerations
5.4. Baseline Performance on Original Dataset (SL = 1)
5.5. Enhanced Evaluation with Sequence Aggregation SL = {2, 3, 4, 5}
5.6. Comparative Analysis
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Nguyen, L.G.; Watabe, K. A method for network intrusion detection using flow sequence and BERT framework. arXiv 2023, arXiv:2310.17127.
- Chen, J.; Zhou, H.; Mei, Y.; Adam, G.; Bastian, N.D.; Lan, T. Real time Network Intrusion Detection via Decision Transformers. arXiv 2023, arXiv:2312.07696.
- Nazre, R.; Budke, R.; Oak, O.; Sawant, S.; Joshi, A. A Temporal Convolutional Network-based Approach for Network Intrusion Detection. arXiv 2024, arXiv:2412.17452.
- Boswell, B.; Barrett, S.; Rajaganapathy, S.; Dorai, G.; Qiu, M. FLARE: Feature-based Lightweight Aggregation for Robust Evaluation of IoT Intrusion Detection. arXiv 2025, arXiv:2504.15375.
- Doost, P.A.; Moghadam, S.S.; Khezri, E.; Basem, A.; Trik, M. A new intrusion detection method using ensemble classification and feature selection. Sci. Rep. 2025, 15, 13642.
- Luay, A.; Wu, Y.; Zhang, H. Temporal modeling of NetFlow records for anomaly-based intrusion detection. arXiv 2025, arXiv:2503.04404v1.
- Li, J.; Li, L. A Lightweight Network Intrusion Detection System Based on Temporal Convolutional Networks and Attention Mechanisms. Comput. Fraud. Secur. 2025, 2025.
- Zhang, C.; Li, J.; Wang, N.; Zhang, D. Research on Intrusion Detection Method Based on Transformer and CNN-BiLSTM in Internet of Things. Sensors 2025, 25, 2725.
- Gueriani, A.; Kheddar, H.; Mazari, A.C. Adaptive Cyber-Attack Detection in IIoT Using Attention-Based LSTM-CNN Models. arXiv 2025, arXiv:2501.13962.
- Zhou, H.; Zou, H.; Li, W.; Li, D.; Kuang, Y. HiViT-IDS: An Efficient Network Intrusion Detection Method Based on Vision Transformer. Sensors 2025, 25, 1752.
- Sajid, M.; Malik, K.R.; Almogren, A.; Malik, T.S.; Khan, A.H.; Tanveer, J.; Ur Rehman, A. Enhancing intrusion detection: A hybrid machine and deep learning approach. J. Cloud Comp. 2024, 13, 123.
- Al-Saleh, A. A balanced communication-avoiding support vector machine decision tree method for smart intrusion detection systems. Sci. Rep. 2023, 13, 9083.
- Lima, M.G.; Carvalho, A.; Álvares, J.G.; Chagas, C.E.D.; Goldschmidt, R.R. Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems. arXiv 2024, arXiv:2407.11105.

| Feature Name | Description |
|---|---|
| Flow Duration | Total duration of the flow in microseconds. |
| Flow Bytes/s | Number of bytes transmitted per second during the flow. |
| Flow Packets/s | Number of packets transmitted per second. |
| Flow IAT Mean | Average inter-arrival time between consecutive packets of the flow. |
| Flow IAT Std | Standard deviation of inter-arrival times within the flow. |
| Flow IAT Max | Maximum recorded packet inter-arrival time. |
| Flow IAT Min | Minimum recorded packet inter-arrival time. |
| Total Fwd Packets | Number of packets sent from source to destination. |
| Total Fwd Bytes | Total number of bytes sent in the forward direction. |
| Fwd Packet Length Mean | Average size of packets sent forward. |
| Fwd Packet Length Std | Standard deviation of forward packet lengths. |
| Fwd Packet Length Max | Maximum packet size in the forward direction. |
| Fwd Packet Length Min | Minimum packet size in the forward direction. |
| Fwd IAT Mean | Average inter-arrival time between forward packets. |
| Fwd IAT Std | Variation in forward packet inter-arrival times. |
| Fwd IAT Max | Longest time interval between forward packets. |
| Fwd IAT Min | Shortest time interval between forward packets. |
| Total Backward Packets | Number of packets sent from destination to source. |
| Total Backward Bytes | Total number of bytes sent backward. |
| Bwd Packet Length Mean | Average packet size in the backward direction. |
| Bwd Packet Length Std | Standard deviation of backward packet lengths. |
| Bwd Packet Length Max | Maximum backward packet size. |
| Bwd Packet Length Min | Smallest backward packet size. |
| Bwd IAT Mean | Mean inter-arrival time between backward packets. |
| Bwd IAT Std | Standard deviation of backward inter-arrival time. |
| Bwd IAT Max | Maximum inter-arrival time for backward packets. |
| Bwd IAT Min | Minimum inter-arrival time for backward packets. |
| Packet Length Mean | Average length of all packets in the flow. |
| Packet Length Std | Variability in packet size within the flow. |
| Packet Length Variance | Statistical variance of packet lengths. |
| Average Packet Size | Mean size of packets considering both directions. |
| FIN Flag Count | Number of packets with FIN flag set. |
| SYN Flag Count | Number of packets with SYN flag set. |
| RST Flag Count | Number of packets with RST flag set. |
| PSH Flag Count | Number of packets with PSH flag set. |
| ACK Flag Count | Number of packets with ACK flag set. |
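
The table above lists 36 features, which matches the 36 pixels of a 6 × 6 grayscale image. A minimal sketch of such a mapping follows. The function name `features_to_image`, the row-major placement in table order, and the 8-bit intensity scaling are assumptions made for illustration; the sketch also assumes the features have already been normalized to [0, 1].

```python
def features_to_image(features, side=6):
    """Map side*side normalized feature values to a side x side grid of 0-255 intensities.

    Assumptions (not confirmed by the source): values lie in [0, 1] after
    normalization, and pixels are filled row by row in feature-table order.
    """
    if len(features) != side * side:
        raise ValueError(f"expected {side * side} features, got {len(features)}")
    # Clip to [0, 1] so out-of-range values cannot overflow the 8-bit range.
    pixels = [int(round(min(max(v, 0.0), 1.0) * 255)) for v in features]
    # Slice the flat pixel list into rows of length `side`.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Toy input: 36 evenly spaced values standing in for normalized features.
example = [i / 35 for i in range(36)]
for row in features_to_image(example):
    print(row)
```

Any fixed feature-to-pixel ordering works as long as the same ordering is used for training and inference, because the classifier learns spatial patterns tied to that layout.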

| Sequence Length (SL) | Dataset Size | Model | Accuracy | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|---|---|
| 1 | 691,395 | TinyNet-6 × 6 | 0.99 | 0.98 | 0.99 | 0.985 | 0.995 |
| 1 | 691,395 | MobileNetV2 | 0.97 | 0.97 | 0.97 | 0.965 | 0.983 |
| 1 | 691,395 | ResNet18 | 0.96 | 0.96 | 0.96 | 0.954 | 0.975 |

| Sequence Length (SL) | Dataset Size | Model | Accuracy | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|---|---|
| 2 | 345,698 | TinyNet-6 × 6 | 0.98 | 0.97 | 0.98 | 0.975 | 0.990 |
| 2 | 345,698 | MobileNetV2 | 0.97 | 0.96 | 0.96 | 0.96 | 0.982 |
| 2 | 345,698 | ResNet18 | 0.96 | 0.95 | 0.95 | 0.95 | 0.972 |
| 3 | 230,465 | TinyNet-6 × 6 | 0.96 | 0.95 | 0.95 | 0.95 | 0.980 |
| 3 | 230,465 | MobileNetV2 | 0.95 | 0.94 | 0.94 | 0.94 | 0.972 |
| 3 | 230,465 | ResNet18 | 0.94 | 0.93 | 0.93 | 0.93 | 0.963 |
| 4 | 172,849 | TinyNet-6 × 6 | 0.95 | 0.94 | 0.94 | 0.94 | 0.973 |
| 4 | 172,849 | MobileNetV2 | 0.94 | 0.93 | 0.93 | 0.93 | 0.965 |
| 4 | 172,849 | ResNet18 | 0.94 | 0.92 | 0.92 | 0.92 | 0.960 |
| 5 | 138,279 | TinyNet-6 × 6 | 0.94 | 0.92 | 0.93 | 0.925 | 0.965 |
| 5 | 138,279 | MobileNetV2 | 0.94 | 0.91 | 0.92 | 0.915 | 0.958 |
| 5 | 138,279 | ResNet18 | 0.93 | 0.90 | 0.91 | 0.905 | 0.952 |
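
The dataset sizes reported for each sequence length are consistent with splitting the 691,395 flows into non-overlapping windows of SL flows and keeping the final partial window, that is, ⌈691,395 / SL⌉ samples. The sketch below illustrates that reading together with per-window mean aggregation of features. The function names are hypothetical, and the non-overlapping split is an inference from the table rather than a confirmed detail of the authors' pipeline.

```python
def expected_window_count(n_flows, sl):
    """Number of non-overlapping windows of length sl, counting a trailing partial window."""
    if sl < 1:
        raise ValueError("sequence length must be >= 1")
    return -(-n_flows // sl)  # integer ceiling division


def split_into_windows(flows, sl):
    """Split a list of flows into consecutive, non-overlapping windows of up to sl flows."""
    if sl < 1:
        raise ValueError("sequence length must be >= 1")
    return [flows[i:i + sl] for i in range(0, len(flows), sl)]


def mean_features(window):
    """Aggregate a window of flows into one vector: the mean of each feature."""
    n = len(window)
    return [sum(column) / n for column in zip(*window)]


# Window counts for the 691,395 flows in the baseline dataset.
for sl in range(1, 6):
    print(sl, expected_window_count(691_395, sl))

# Toy example: five flows with two features each, grouped with SL = 2.
flows = [[0.0, 1.0], [1.0, 3.0], [0.2, 0.4], [0.6, 0.8], [0.5, 0.5]]
for window in split_into_windows(flows, 2):
    print(mean_features(window))
```

Because each window becomes one sample, larger SL values shrink the dataset roughly in proportion to SL, which is worth keeping in mind when comparing scores across sequence lengths.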

| Reference | Model Type | Dataset | Accuracy (%) |
|---|---|---|---|
| Gueriani et al. [9] | CNN-LSTM + Attention | CIC-IDS2017 | 99.3 |
| Sajid et al. [11] | XGBoost-CNN-LSTM | CIC-IDS2017/UNSW-NB15 | 98.7 |
| Al-Saleh [12] | BCA-SVM + DT | UNSW-NB15 | 97.8 |
| Proposed Framework | Bernoulli + Logistic Regression + Image Encoding | CIC-IDS2017 | 99.1 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
El Alami, A.; El Batteoui, I.; Satori, K. Real-Time Bernoulli-Based Sequence Modeling for Efficient Intrusion Detection in Network Flow Data. J. Cybersecur. Priv. 2026, 6, 32. https://doi.org/10.3390/jcp6010032
