Network traffic classification requires a thorough grasp of the concepts, methods, and difficulties involved in identifying and categorizing the many kinds of data that move across networks. It is essential for network administration, security, and optimization. The major elements of this background are reviewed below.
2.2. Literature Survey
Alissa et al. (2022) [13] discussed botnet detection in IoT using ML algorithms. The researchers used the UNSW-NB15 dataset together with SMOTE to train XGB, DT, and LR classifiers. After applying an 80-20 train-test split to the dataset, they achieved an accuracy of 94% for DT, 78% for LR, and 94% for XGB. The research also highlights potential applications of Support Vector Machines, Random Forest classification, and DL methods such as Residual Networks (ResNet-50) and LSTM networks. To improve cybersecurity and reduce potential risks, the study emphasizes the importance of integrating these detection techniques into the backends of network applications.
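The split-then-balance workflow described above can be illustrated with a minimal sketch. It is not the code of Alissa et al.: the function names, the toy data, and the simplified SMOTE-style interpolation are assumptions made for illustration, and no classifier is trained. The point it shows is that oversampling is applied only to the training portion, so the held-out 20% keeps the original class ratio.

```python
import math
import random
from collections import Counter


def stratified_split(X, y, test_frac=0.2, seed=0):
    """Hold out test_frac of every class so both parts keep the class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_frac)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, y_train, X_test, y_test


def smote_like(X, y, minority, k=3, seed=0):
    """Add synthetic minority rows until the minority matches the largest class.

    Each synthetic row lies on the segment between a minority sample and one
    of its k nearest minority neighbours (the core idea behind SMOTE).
    Assumes at least two minority rows are present.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    target = max(Counter(y).values())
    X_out, y_out = list(X), list(y)
    for _ in range(target - len(minority_rows)):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_out.append(minority)
    return X_out, y_out


# Hypothetical toy data: 90 benign rows (label 0) and 10 botnet rows (label 1).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

X_train, y_train, X_test, y_test = stratified_split(X, y)
X_bal, y_bal = smote_like(X_train, y_train, minority=1)
print(Counter(y_train), Counter(y_bal), Counter(y_test))
```

In practice a maintained implementation would normally be preferred over this hand-written version; the sketch only makes the order of operations explicit.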
Araujo et al. [14] examined how ML approaches can be applied to evaluate and derive insights from four datasets: ISOT HTTP Botnet [15], CTU-13 [16], CICDDoS2019 [17], and BoT-IoT [18]. A key outcome of their effort was a novel machine-learning pipeline called ANTE, designed for the thorough testing of several algorithms. Their results showed a precision of 100% and an average detection accuracy of 99.06%. A significant limitation of this investigation, though, is the bias present in the datasets themselves, which are skewed toward the “clean” or “benign” class. Notably, the report does not describe any sampling strategy for ensuring class balance in the datasets.
Fok et al. [19] examined the use of ML techniques to detect botnet traffic using support vector machine (SVM), DT, and RF classifiers. The paper reports average recall percentages and FPRs for SVM, DT, and RF, giving 83.1%, 89.6%, and 84.9%, respectively, which offers insight into the efficacy of these classifiers. The results imply that botnet traffic identification accuracy can be improved, with a notable decrease in FPRs. The authors specifically highlight DL techniques as a possibility for such classification problems in further studies. The proposed technique relies on a small set of flow features: flow size in bytes and packets, flow inter-arrival times, and counts of illegal TCP flows. In addition to improving detection efficiency, this condensed feature set leaves room for future development in ML-based automated botnet traffic identification.
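To make the condensed feature set concrete, the following sketch aggregates packet records into per-flow byte counts, packet counts, and mean inter-arrival times. It is an illustration under stated assumptions rather than the extraction procedure of Fok et al.: the packet tuple layout, the 5-tuple flow key, and the function name `flow_features` are hypothetical, inter-arrival time is interpreted here as the gap between consecutive packets of one flow, and the count of illegal TCP flows is omitted.

```python
from collections import defaultdict
from statistics import mean


def flow_features(packets):
    """Aggregate packets into per-flow features.

    Each packet is a tuple:
    (timestamp_seconds, src, dst, src_port, dst_port, protocol, size_bytes).
    Packets sharing the same 5-tuple are treated as one flow.
    """
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, size in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, size))

    features = {}
    for key, records in flows.items():
        records.sort()  # order packets of the flow by timestamp
        times = [ts for ts, _ in records]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[key] = {
            "packets": len(records),
            "bytes": sum(size for _, size in records),
            "mean_iat": mean(gaps) if gaps else 0.0,
        }
    return features


# Hypothetical capture: one three-packet HTTP flow and one single-packet flow.
packets = [
    (0.0, "10.0.0.5", "10.0.0.9", 40000, 80, "tcp", 60),
    (0.2, "10.0.0.7", "10.0.0.9", 40001, 23, "tcp", 60),
    (0.5, "10.0.0.5", "10.0.0.9", 40000, 80, "tcp", 1500),
    (1.5, "10.0.0.5", "10.0.0.9", 40000, 80, "tcp", 1500),
]

for key, feats in flow_features(packets).items():
    print(key, feats)
```

A feature vector of this size is cheap to compute per flow, which is consistent with the detection-efficiency argument made for the condensed feature set.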
Saied et al. (2023) [20] sought to improve botnet attack detection, mitigation, and classification accuracy in IoT environments. To do so, they applied several ET learning algorithms to the N-BaIoT17 [21] dataset, which was built specifically for IoT environments. Six tree-based methods were implemented and evaluated: AdaBoost, Gradient Descent Boosting, DT, Random Forest, Bagging Meta Classifier, and XGB. After a thorough evaluation, the authors concluded that Random Forest performed best in both intrusion detection and multi-class classification, attaining an accuracy of 99.9991%. Its test and training times were 4.33 s and 1249.52 s, respectively. The results show how well Random Forest handles the intricacies of identifying and categorizing botnet attacks on IoT devices, which adds useful context to current efforts to strengthen IoT ecosystem security.
Sheikh and Peng [22] carried out an extensive survey on network traffic classification. Their study analyzed the approaches, strategies, and datasets used in the traffic categorization process, with particular attention to ML techniques and the latest research efforts in the field. The paper identified four main categories of classification strategy: behavior-based, statistical-based, payload-based, and port-based. The authors also covered the parameters that researchers need in order to assess the effectiveness and performance of traffic categorization techniques.
Koroniotis et al. [23] proposed a cloud-based approach that uses an ML algorithm to identify and categorize botnet activity. The three main modules of the system architecture were detection or classification, filtering, and feature extraction. The system was trained using a bagging classifier and a DT method on the Botnet-Detection dataset [24], which was taken from the broader ISCX UNB dataset [25] and contained trojan, clean data, and random bots. The study showed that the system identified botnets and detected unusual flows effectively. The authors did, however, identify several efficiency problems with the system, such as cloud size constraints and long training sessions. Despite these obstacles, the research advances botnet detection by combining cloud-based solutions with ML techniques, and its findings on the system’s strengths and weaknesses open the door to further optimization of cloud-based botnet traffic detection.
Azab et al. [9] analyzed the techniques, datasets, and machine-learning algorithms used in network traffic categorization. They reviewed both ML and DL approaches, outlining the benefits, limitations, and implementations of each strategy. Supervised, unsupervised, and semi-supervised ML techniques were all examined in the study. The results show that no single solution achieves perfect performance in terms of speed, accuracy, and early-stage detection. The authors recommended multilayer classification models as a way to address the drawbacks of current methods.
Rachmawati et al. [26] investigated the DL methods widely used in traffic categorization problems. They evaluated current contributions in the area and developed a framework covering many elements of DL-based approaches, categorizing the contributions by data preparation, pre-processing, model input design, and model architecture. The paper also outlined the challenges and inherent limitations of deploying DL in traffic classification. As a future research direction, the authors proposed developing a DL model specifically tailored to the classification of encrypted traffic, which could address current gaps in traffic classification methodologies.
Esmaeilyfard et al. [27] used the UNSW Bot-IoT dataset to detect IoT botnet attacks. A stacked ensemble model combining MLP and RF achieved the best accuracy of 99.3%. To make the system lightweight, the authors applied lasso-based feature selection and LR stacking. This reduced resource use, requiring 36% less CPU and 38% less memory, making it suitable for IoT devices. These findings motivate us to design ML models that satisfy the constraints of resource-limited devices.
In recent years, there has been a growing trend towards applying XAI in botnet detection and classification for IoT security [28]. While ML-based methods have shown strong performance in identifying botnet activities, their increasing complexity raises concerns about transparency and trust. XAI techniques such as rule extraction, LIME, and SHAP are now being integrated to make these models more interpretable, enabling better trust, early attack detection, and stronger defense strategies. This shift reflects a broader movement in cybersecurity research toward building transparent and trustworthy AI-driven systems for protecting IoT ecosystems.
Rupanetti and Kaabouch, in their work on applying machine learning for botnet attack detection, utilized the IoT-23 dataset and trained three models using a fixed train–test split [29]. In their results, the RF model achieved an accuracy of 99.0%. However, their approach primarily focused on evaluating model performance without addressing model optimization or deployment efficiency, which limits its applicability to real-time IoT environments. Ibrahim et al. discussed several other approaches for detecting IoT botnets based on DNS traffic analysis and host-based anomaly detection [30].
Table 1 provides the list of all datasets cited in this study.
Table 2 compares our study’s results against several state-of-the-art approaches in IoT threat detection. It lists the algorithms or models employed in each study, along with their respective datasets and evaluation metrics. Notably, a hybrid of XGB, SVM, RF, ANN, and RNN achieved the highest accuracy, 99.996% on the N-BaIoT dataset, as demonstrated by Rawat et al. [31]. Other studies, including those using DT, NB, and Hidden Markov Models, reported varying levels of accuracy depending on dataset characteristics. In our study, we employed XGB with SMOTE for class balancing and achieved accuracies of 99.93% and 99.99% on the two evaluated datasets, demonstrating its effectiveness in handling imbalanced data. The table highlights the diversity of methodologies and datasets used and the importance of model selection in achieving good detection performance.
Table 1.
List of datasets cited in this study.
| Reference | Dataset |
|---|---|
| [15] | ISOT-HTTP Botnet |
| [16] | CTU-13 |
| [17] | CIC-DDoS2019 |
| [18] | UNSW Bot-IoT |
| [21] | N-BaIoT17 |
| [24] | Botnet-Detection |
| [25] | ISCX UNB |
| [32] | UNSW-NB15 |
| [33] | CIC-IDS2017 |
| [34] | BoT IoT |
| [35] | ICS-Flow |
| [36] | CIC-IoT2023 |
| [37] | ToN-IoT |
| [38] | N-BaIoT25 |
Table 2.
Comparison of our approach with state-of-the-art approaches.
| State-of-the-Art Study | Year | ML/DL Model | Dataset Description | Accuracy |
|---|---|---|---|---|
| Rawajbeh et al. [35] | 2025 | Adaptive Hoeffding Tree | ToN-IoT, Bot-IoT | 96.40% |
| Pallakonda et al. [39] | 2025 | DT | ICS-Flow | 99.81% |
| Ye et al. [40] | 2025 | XGB stacked with RF | CICIoT2023 | 95.9% |
| Nuha et al. [41] | 2025 | KNN | Non-Public Simulation Data | 99% |
| Ali et al. [42] | 2025 | Stacked KNN, SVM, RF, DT and MLP | UNSW-NB15 | 97.94% |
| Mohan et al. [43] | 2025 | BiGRU | UNSW-NB15 | 99.22% |
| Kayyidavazhiyil [44] | 2025 | Ensemble of BiGRU, LSTM and SIBMO | TON-IoT | 93% |
| Saied et al. [18] | 2024 | Histogram Gradient Boosting | N-BaIoT | 99.97% |
| Tikekar et al. [45] | 2024 | NB | CTU-13 Dataset | 90.62% |
| Hostiadi et al. [46] | 2024 | DT | NCC Dataset | 99.03% |
| Rawat et al. [31] | 2024 | Hybrid of XGB, SVM, RF, ANN, RNN | N-BaIoT | 99.996% |
| Mannikar and Troia [47] | 2024 | Hidden Markov Model | CTU-13 Dataset | 83.19% |
| Bojarajulu and Tanwar [48] | 2024 | Customized CNN | TON_IoT, UNSW-NB15 | 88.42% |
| Saif et al. [49] | 2023 | Random Forest | N-BaIoT | 99% |
| Chaganti et al. [50] | 2023 | LSTM | Simulated Data | 97.1% |
| Sharma et al. [51] | 2023 | DNN | UNSW-NB15 | 91% |
| Santhadevi and Janet [52] | 2023 | LSTM | UNSW-NB15, UNSW_BOT_IoT | 97.97% |
| Cam and Trung [53] | 2023 | Decision Tree | CICIDS 2017 | 99.90% |
| Our Approach | 2025 | Quantized XGB | CTU-IoT-Malware-Capture 2023, Bot-IoT Dataset 2019 | 99.93% and 99.99% |
2.3. Summarized Problem Statement
The datasets utilized in the surveyed works predominantly exhibit an imbalance toward the infected class. However, the authors did not implement any data sampling or balancing techniques, leading to bias, overfitting, poor generalization, and misclassification of minority classes. Furthermore, the evaluation of model performance was conducted on a specific subset of the dataset, raising concerns regarding the reliability of the reported results. The high accuracy, precision, and recall values observed in these studies are likely influenced by data imbalance and the selection of biased training and testing data, thereby limiting the models’ real-world applicability.
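The effect of class imbalance on reported metrics can be shown with a small worked example. The numbers are hypothetical and chosen only for illustration: a test set with 990 infected and 10 benign samples, scored against a degenerate detector that labels everything as infected. Accuracy and recall look excellent, while the FPR and the balanced accuracy expose that the benign class is never recognized.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Compute accuracy, recall, FPR, and balanced accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "fpr": fpr,
        # Mean of the true-positive rate and the true-negative rate.
        "balanced_accuracy": (recall + (1.0 - fpr)) / 2,
    }


# Hypothetical test set skewed toward the infected class (label 1).
y_true = [1] * 990 + [0] * 10
# Degenerate detector: every sample is flagged as infected.
y_pred = [1] * 1000

report = summarize(y_true, y_pred)
print(report)
```

Here the accuracy is 0.99 and the recall is 1.0, yet every benign sample is misclassified (FPR of 1.0) and the balanced accuracy is only 0.5, which is why accuracy alone is a weak basis for comparison on skewed datasets.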
Additionally, while several models have been proposed, their practical deployment on resource-constrained IoT devices remains a challenge due to limited memory availability, often in the KB range. This constraint makes it infeasible for DL models and even some ML algorithms to run efficiently on IoT endpoints. Moreover, the existing works primarily frame the problem as a binary classification task, which restricts the scope to mere detection and blocking of threats. However, effective cybersecurity measures require the capability to identify specific threat classes, enabling automated responses and appropriate mitigation strategies. Addressing these limitations is crucial for developing practical, scalable, and effective IoT security solutions.
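One common way to fit a model into a KB-range memory budget is to store its numeric parameters at reduced precision. The sketch below shows generic 8-bit affine quantization of a list of split thresholds. It is an assumption-laden illustration, not the quantization procedure used in this study: the function names and the threshold values are hypothetical, and a single scale is shared across all values for simplicity.

```python
from array import array


def quantize_uint8(values):
    """Map floats onto 256 evenly spaced levels between min and max."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all values are equal
    codes = array("B", (round((v - lo) / scale) for v in values))
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + code * scale for code in codes]


# Hypothetical split thresholds taken from a tree model.
thresholds = [0.0, 0.37, 1.5, 12.25, 64.0, 1500.0]

codes, lo, scale = quantize_uint8(thresholds)
restored = dequantize(codes, lo, scale)

# array("f") uses the platform's C float, typically 4 bytes per value.
float_bytes = len(thresholds) * array("f", thresholds).itemsize
quant_bytes = len(codes) * codes.itemsize
max_error = max(abs(a - b) for a, b in zip(thresholds, restored))

print(float_bytes, quant_bytes, max_error)
```

Each value then occupies one byte, at the cost of a rounding error of at most half a quantization step. Because the step is set by the full range, thresholds that are small relative to that range collapse onto the same code, so a per-feature scale would likely be needed in practice.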