An Efficient Network Intrusion Detection and Classification System

Abstract: Intrusion detection in computer networks is of great importance because of its effects on the different communication and security domains. Network intrusion detection remains a challenging task, as a massive amount of data is required to train state-of-the-art machine learning models to detect network intrusion threats. Many approaches have been proposed recently for network intrusion detection. However, they face critical challenges owing to the continuous increase in new threats that current systems do not understand. This paper compares multiple techniques to develop a network intrusion detection system. Optimum features are selected from the dataset based on the correlation between the features. Furthermore, we propose an AdaBoost-based approach for network intrusion detection based on these selected features and present its detailed functionality and performance. Unlike most previous studies, which employ the KDD99 dataset, we use the recent and comprehensive UNSW-NB15 dataset for network anomaly detection. This dataset is a collection of network packets exchanged between hosts. It comprises 49 attributes and covers nine types of threats: DoS, Fuzzers, Exploits, Worms, Shellcode, Reconnaissance, Generic, Analysis, and Backdoor. We employ SVM and MLP for comparison and propose AdaBoost based on the decision tree classifier to classify network traffic as either normal activity or a possible threat. The experimental findings show that our proposed method effectively detects different forms of network intrusions on computer networks and achieves an accuracy of 99.3% on the UNSW-NB15 dataset. The proposed system will be helpful in network security applications and research domains.


Introduction
Intrusion detection is a method for observing and tracking events on a computer system. It is used to identify signs of security issues, and activities are monitored using event-based techniques and security information. With the exponential increase in internet-based facilities, the number of computing devices and consumers connected to data and computer networks has increased significantly. These devices provide online services to users and to public and private sector organizations. In parallel, unauthorized access to online services has grown rapidly. Typical attacks include protocol-specific attacks (ARP, IP, TCP, UDP, and ICMP), traffic flooding, worms, and DoS [1]. In protocol-specific attacks, the attacker exploits a specific protocol feature installed on the target machine. Prevalent protocol attacks include SMURF, SYN, and authentication server attacks. The ever-increasing number of attacks demands an effective intrusion detection system that detects documented forms of attack and learns to detect new ones. The intrusion detection system is one of the methods used to address the network security issue. The imperfection of existing systems has allowed data mining to make several significant contributions to the field of intrusion detection.
A host-based IDS (HIDS) runs on all the machines in the network and other parts of the enterprise network. A network-based IDS (NIDS) is deployed only at those places where the chance of vulnerability is high. A network node IDS (NNIDS) is like NIDS [3,4], but it applies to only one host at a time. Three main approaches are used for intrusion detection: (i) signature-based IDS; (ii) anomaly-based IDS; (iii) a hybrid of signature- and anomaly-based IDS.
Signature-based IDS focuses on identifying signatures and patterns and matching them against the well-known signatures of misuse. Anomaly-based IDS searches for attacks with unknown signatures, which are hard to detect. Owing to the exponential growth in malware and attack styles, anomaly-based IDS uses machine learning approaches to compare trustworthy activity patterns with new behaviors.
Several supervised learning techniques such as decision trees, SVM, and ANNs [5][6][7] have been used, but they have a high false-positive ratio for infrequently occurring attacks. To tackle network attacks, this study demonstrates an AdaBoost-based network intrusion detection system (NIDS) that uses statistical network flow features to recognize malicious activities. A set of features is extracted from the network flow after analysis of HTTP, MQTT, and DNS network packets. For this purpose, we used the source files of the UNSW-NB15 dataset [8,9]. The proposed system first performs feature selection over the entire dataset: we plot a correlation matrix, examine how the features correlate with one another, and remove some highly correlated features. We then apply AdaBoost on a decision tree classifier to classify normal traffic and possible threats, obtaining an accuracy of 99.3%.
The key contributions in this study are presented as follows:
i. A feature selection method is proposed based on the correlation matrix calculated between the features. Features that were highly correlated with one another were removed as they added no significant difference and only increased the complexity of the model.
ii. Optimal features are selected from the UNSW-NB15 dataset.
iii. The AdaBoost-based model is proposed, using a decision tree to classify normal network traffic and network threats.
The rest of the paper is organized as follows. Section 2 describes the background and literature review. The description of the proposed system is presented in Section 3. The performance metrics and evaluation protocol are discussed in Section 4. The experimental results and discussion are presented in Section 5. Section 6 presents the comparison and Section 7 concludes the paper.

Background and Related Work
An intrusion detection system is a technique for tracking network activities between different entities by estimating their integrity and availability principles [10,11]. Classic systems consist of data source, preprocessing, and a decision-making method to identify vulnerability. The first step involves raw data gathered from host traces or network traffic. The second step covers the construction of features passed to the decision-making method to identify possible threats [12]. Several studies have been conducted in the past to address this challenge. The previous research studies have used KNN, SVM, and ANN models for network anomaly detection [13][14][15].
Awais et al. [16] proposed an ANN-based method for detecting network intrusions with an accuracy of 89.4%. The proposed structure uses a neural network for partitioning the training data. Each training data sample is used for training a network and a boosting algorithm to learn an optimized set of weights. This system was evaluated and tested on the KDD99 [17] and UNSW-NB15 datasets. The UNSW-NB15 dataset contains a mixture of 49 input features, with nine attack types and normal network traffic. The details of the attack types are shown in Table 1, where they are divided into two categories. The first category (sr. # 1, Table 1) is normal network traffic, and the second category (sr. # 2 to 10, Table 1) covers the attack types, which include Analysis, Backdoor, DoS, Exploits, Fuzzers, Generic, Reconnaissance, Shellcode, and Worms. The system could face these threats while connected to the internet. The data types and their respective descriptions are shown in Table 2.

Table 1. Categories of the UNSW-NB15 dataset and their descriptions (sr. #, category, description).
9, Shellcode: Code used to exploit software vulnerabilities.
10, Worms: A set of virus codes that can add itself to a computer system or other programs.

In another study [7], the authors applied a deep convolutional neural network (DCNN) to the NSL-KDD dataset for the binary classification of anomalies. The proposed approach was optimized by randomized grid search, hyper-parameter optimization, and fine-tuning. They obtained the highest accuracy of 85.22% using DCNN, whereas 82.02% was obtained using the random tree classifier.
The methodology of network intrusion is either hybrid-based or intrusion-based. Many researchers have used ensemble-based intrusion detection techniques to propose enhanced systems for optimizing NIDS. Mustafa et al. [18] proposed a beta mixture model for anomaly detection. The authors computed a normal network profile and treated deviations from that profile as anomalies. The UNSW-NB15 dataset was used for performance evaluation using external evaluation metrics. This technique slightly improved the accuracy on the UNSW-NB15 dataset. In another study [19], the authors used ANNs for signature-based intrusion detection in IoT devices. They tested the approach on various malicious and heterogeneous data, achieving an average accuracy of 84% with an average false-positive rate of 8%.
Benjamin et al. [20] proposed an ensemble-based intrusion detection technique against application layer protocols (MQTT, HTTP, and DNS), as shown in Figure 1. The authors generated the new statistical flow features from the protocols based on their potential properties. The experimental analysis showed that statistically generated features are related to normal and suspected activities. Ensemble-based AdaBoost was developed using naïve Bayes, decision trees, and artificial neural network (ANN). NIMS botnet was used to detect and evaluate ensemble learning. Wei et al. [21] propose an Adaboost-based network intrusion detection system based on the principle of weak learners. They use naïve Bayes as weak learners and boost the performance utilizing the weight update ability of the Adaboost. Yuan et al. [22] proposed a semi-supervised tri-Adaboost-based methodology for network intrusion detection. They suggest using three distinct weak Adaboost-based classifiers for training their model. The dimension of the features is reduced using the chi-square methodology.
Sheraz et al. [23] presented an anomaly system based on deep learning. Convolution neural networks (CNNs), recurrent neural networks (RNNs), and autoencoders structures are proposed. They also test the conventional IDS based on machine learning and present an evaluation of their findings. Truong et al. [24] presented an empirical study on the use of generative adversarial networks (GANs) for network anomaly detection. The authors consider using multiple datasets and perform extensive experiments to test the robustness of GANs. A traffic aggregation technique was also developed to extract statistical features. Dutta et al. [25] propose using DNN consisting of multiple stacked fully connected layers that present a flow-based system for multiclass anomaly detection. Lecun uniform initialization is used for the initialization of weights. Michal et al. [26] propose a two-stage hybrid network anomaly detection system. Classical autoencoder (CAE) and DNN are used for feature engineering and classification. They evaluate the performance of their model using the UNSW-NB-15 dataset and achieve an accuracy of 91.29%.
Yuan et al. [27] propose an anomaly detection architecture to detect short interval intrusions using SVM. They train the model using cross-correlation and Kullback-Leibler (KL) divergence calculated by the data planes and control traffic. Furthermore, they evaluate the performance of their proposed model based on a realistic internet traffic dataset. Giacinto and Roli [28] developed a network intrusion system using an automated design of classifier systems. Intrusions were detected based on the voting rule method. The experimental results showed that hybrid classification techniques are effective if each classifier performs well. Redundant features have little contribution to NIDS. Srilatha and Johnson [29] focused on examining and finding potential data features for intrusion detection systems. They aimed to design an optimized and computationally effective NIDS. Srilatha proposed a hybrid technique involving ensemble-based classifiers. PCA was used to reduce input dimensions, but has not achieved the expected results in the NIDS domain. MIT Lincoln laboratory prepared the dataset for the evaluation of the proposed solution.
Arif et al. [30] proposed using Adaboost for an efficient intrusion detection system. The synthetic minority oversampling technique (SMOTE) is considered for handling the class imbalance problem, whereas principal component analysis (PCA) and ensemble feature selection (EFS) are selected for feature selection. Through this technique, they obtain an accuracy of 81.83% on the CIC IDS 2017 database (University of New Brunswick: Fredericton, NB, Canada).
From the above literature review, we observed that previous research uses Adaboost mainly for its weight-update property and does not consider the importance of discriminative features for distinguishing threats from normal network activity. Thus, we use a complex dataset, UNSW-NB15 (UNSW Canberra: Canberra, Australia), select the most discriminative features, and utilize AdaBoost as the weight-update model, with the decision tree as the primary classifier. Decision trees perform exceptionally well on network data. The weight update in the algorithm is done by Adaboost, which shows promising results when used in combination with the decision tree.

Proposed Method
This study proposes an automatic and efficient network intrusion detection system based on the Adaboost technique. Firstly, we select features from the pool of features in the UNSW-NB15 dataset. For this purpose, we perform a correlation analysis and, based on its result, remove the features that are highly correlated with each other, as shown in Figure 2. Furthermore, this study presents the techniques used to classify network traffic as normal traffic or threats. We briefly review ANN, SVM, and Adaboost-based decision trees in our proposed system; ANN and SVM were used for comparison. The techniques are described below.
i. Artificial neural network (ANN): The artificial neural network technique is used to match targets by transforming the respective inputs into outputs for each class. The result depends on the weights learned for each neuron and on the activation function used to produce the output. Training a successful model requires both normal and irregular network data so that the weights are trained more effectively.
The model uses an activation function in which Equation (1) represents the input data, where $\tau$ represents the sigmoid activation and $w_j$ is the weight corresponding to each input.
ii. Support vector machine (SVM): The support vector machine (SVM) is a widely known classifier used for regression and classification. This study uses SVM for network intrusion detection on the UNSW-NB15 dataset [14]. SVM provides various kernel functions that map a low-dimensional input space to a high-dimensional space. This study used the RBF kernel to obtain the results, as discussed in the next section [31]. Equation (2) represents the formula for this approach.
iii. Adaboost: In this study, we propose a decision-tree-based Adaboost model, which uses a decision tree as the primary classifier and, on top of that, employs Adaboost for the weight updates, hence achieving outstanding results. Section 3.4 discusses the algorithm for our proposed approach.
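The formulas behind items i and ii can be sketched under common textbook conventions. The bias term $b$, the input symbols $x_j$, and the kernel width $\gamma$ are assumptions introduced here for illustration; they are not taken from the original equations, which may differ in notation.

```latex
% Single neuron with sigmoid activation \tau (b is an assumed bias term)
y = \tau\!\left(\sum_{j} w_j x_j + b\right), \qquad \tau(z) = \frac{1}{1 + e^{-z}}

% RBF kernel between two samples x and x' (\gamma > 0 is an assumed kernel width)
K(x, x') = \exp\!\left(-\gamma \lVert x - x' \rVert^{2}\right)
```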

System Architecture
The system diagram of the proposed system is shown in Figure 3. The proposed system architecture is based on different components used to perform network intrusion detection. The UNSW-NB15 dataset is a comprehensive dataset that contains 49 features utilized in detecting network intrusions. We used a subset of the entire available data. The specifications of the training and testing instances are given in Tables 2 and 3, respectively. Data were divided into training and testing sets. Then, feature selection was applied based on the correlation matrix. In the next step, we trained our Adaboost model with a decision tree as the base classifier. The maximum depth was set to 2, chosen from the pool {2,4,6,8}, because model complexity increased at depth 4 and the tree tended to overfit at higher depths. The 'SAMME.R' algorithm was selected because, when tested, it converged faster than the alternatives, requiring fewer boosting iterations and yielding lower error rates for the proposed model. There were 82,332 input samples in the training set and 175,341 samples in the test set. The decision tree was used as the classifying model, while Adaboost performed the weight updates. Section 3.4 discusses the algorithm for our proposed approach (Algorithm 1).

Dataset Description
In this study, we used the UNSW-NB15 dataset for evaluating the performance of the proposed network intrusion system. Nour et al. [32] examined the complexity of the UNSW-NB15 dataset in three aspects and found it to be more complex than earlier benchmark datasets. Furthermore, the authors showed that KDD CUP 99, NSL-KDD, and KDD 98 do not reflect modern footprint attacks, whereas the UNSW-NB15 dataset comprises current synthesized suspicious network activities. The Australian Center for Cyber Security (ACCS) used IXIA Perfect Storm to create the raw network packets. Extracted using the Argus and Bro-IDS tools, the dataset contains 49 input features and class labels representing nine different attack types [1,33]. The input features include packet-based features, time features, connection features, and content features. Packet-based features are based on the payload exchanged between the hosts. Time features assist the examination of source jitter, destination jitter, and source and destination inter-packet arrival times. The content features provide content-related information, e.g., sequence numbers, mean packet size, and content size of data from the server. The connection-based features offer general information about the connection, e.g., number of links calling the same service, number of connections from the source, and number of references to the destination address. The available features also cover general HTTP methods for sending information and FTP login details.
The attack types are DoS, Fuzzers, Exploits, Worms, Shellcode, Reconnaissance, Generic, Analysis, and Backdoor (see the description of attacks in Table 1). We added a new column whose value is 0 for all normal activities and 1 for the nine attack types. The records in the CSV files are thus classified into two network activities. Network activity is classified as normal if there is no network intrusion. In contrast, activity is classified as a network attack when someone breaches an internet-based application via a port, bypasses network authentication, accesses resources without authorization, or targets security loopholes. We used a label encoder to convert the 'activity' column into numerical data. The data are divided into two sets, i.e., training and testing sets. The training set contains 82,332 instances, whereas the testing set consists of 175,341 records. Previous work on this dataset reports that the data are evenly distributed, in the sense that the different types of network traffic are balanced between the training and testing sets [9]. Tables 2 and 3 describe the class-based distributions for the training and testing data. Furthermore, feature selection has been applied to select optimum features, as explained in the next section.
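The binary labelling described above can be sketched as follows. This is only an illustration: the literal value "Normal" for benign traffic and the helper name `to_binary_label` are assumptions, not details confirmed by the text.

```python
def to_binary_label(attack_cat: str) -> int:
    """Map an attack category to 0 (normal) or 1 (any attack type)."""
    # Assumption: benign records carry the category "Normal" or are left blank.
    cleaned = attack_cat.strip().lower()
    return 0 if cleaned in ("", "normal") else 1


# Hypothetical category values, as they might appear in the raw records.
categories = ["Normal", "DoS", "Fuzzers", "normal ", "Worms"]
labels = [to_binary_label(c) for c in categories]
print(labels)
```

Normalizing case and whitespace before the comparison keeps stray formatting in the CSV files from turning normal traffic into a false attack label.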

Feature Selection
Feature selection is a process of selecting discriminative feature variables that contribute most to the target variable. The objective of the feature selection process is to reduce the computational cost of the model by removing the variables that have no contribution to the determination of the target variable.
We used p-value and correlation measures to examine the relationship between input variables and output labels. The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. This probability is then used to accept or reject the hypothesis. Equation (3) gives the z statistic from which the p-value is computed. Here, $\hat{p}$ is the sample proportion, $P_0$ is the assumed population proportion under the null hypothesis, and $n$ is the sample size. The p-value is then obtained by looking up the value corresponding to the computed z value. We did not achieve the expected results using the p-value, so we used correlation measures for feature selection.
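The quantities defined above match the standard one-proportion z-test, in which z is the difference between the sample proportion and $P_0$ divided by $\sqrt{P_0(1 - P_0)/n}$. The sketch below assumes that form and a two-sided test; the function name and the example numbers are illustrative only.

```python
import math


def one_proportion_z_test(p_hat: float, p0: float, n: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for a one-proportion z-test."""
    # Standard error of the sample proportion under the null hypothesis.
    std_err = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Illustrative numbers only: 60% observed in 100 samples against a null of 50%.
z, p_value = one_proportion_z_test(p_hat=0.6, p0=0.5, n=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```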
The correlation coefficient is the statistical measure of finding the strength of the relationship between two variables using their relative moments. Its value ranges from +1 to −1.
A correlation coefficient of +1 shows a strong positive correlation between two variables, whereas −1 represents a strong negative relationship. A value of zero indicates no linear relationship between the two attributes.
In Equation (4), $x_i$ are the values of the X variable in the sample, $\bar{x}$ is the mean value of the X variable, $y_i$ are the values of the Y variable in the sample, and $\bar{y}$ is the mean value of the Y variable.
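With the symbols defined above, the Pearson correlation coefficient takes the following standard form. It is given here as a reference sketch and may differ in layout from the original Equation (4).

```latex
r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i} (x_i - \bar{x})^{2}}\,\sqrt{\sum_{i} (y_i - \bar{y})^{2}}}
```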
We computed the correlation matrix of all the variables and eliminated those variables that do not correlate with the target variable. The correlation matrix for all the input variables is shown in Figure 2. As Figure 2 shows, some input variables either have no correlation with the target variable or have a strong correlation with each other; both kinds were removed. After analyzing the correlation matrix, we removed attack_cat, proto, service, state, id, spkts, dpkts, sloss, dloss, and ct_dst_src_ltm. The features used in the proposed NIDS are shown in Table 4.
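A minimal sketch of this kind of correlation-based filtering is shown below, using NumPy. The two thresholds, the greedy keep-first rule, and the toy feature names are assumptions made for illustration; the text does not state the cut-off values or the tie-breaking rule actually used.

```python
import numpy as np


def select_features(X, y, names, pair_threshold=0.95, target_threshold=0.01):
    """Return names of features kept after correlation-based filtering."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    # Absolute Pearson correlation between every pair of feature columns.
    feature_corr = np.abs(np.corrcoef(X, rowvar=False))
    # Absolute Pearson correlation of each feature with the target.
    target_corr = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    kept = []
    for j in range(n_features):
        if target_corr[j] < target_threshold:
            continue  # almost no linear relationship with the label
        if any(feature_corr[j, k] > pair_threshold for k in kept):
            continue  # nearly duplicates a feature that is already kept
        kept.append(j)
    return [names[j] for j in kept]


# Toy data with hypothetical feature names (not UNSW-NB15 columns).
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
f_a = np.array([0.1, 0.2, 0.9, 1.0, 0.0, 0.3, 0.8, 1.1])
f_b = 2.0 * f_a + 1.0                          # linear copy of f_a
f_c = np.array([1, -1, 1, -1, 1, -1, 1, -1])   # unrelated to the label
f_d = np.array([0, 1, 1, 2, 1, 0, 2, 1])
X = np.column_stack([f_a, f_b, f_c, f_d])
kept = select_features(X, y, ["f_a", "f_b", "f_c", "f_d"])
print(kept)
```

In this toy example `f_b` is dropped because it duplicates `f_a`, and `f_c` is dropped because it carries no linear information about the label.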

Adaboost Algorithm Pseudocode
The Adaboost algorithm is an iterative procedure that usually combines many classifiers [34]. It first classifies the training input data and produces output labels. It then compares the results with the actual labels and boosts the weight of each sample that was wrongly classified. The misclassified data are then classified again with the boosted weights; the same process is repeated, and the weights are constantly updated [34].

Algorithm 1 AdaBoost Pseudocode
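1. Fit a classifier $T^{(m)}(x)$ to the training data using weights $w_i$.
2. Compute $err_m \leftarrow \sum_{i=1}^{n} w_i \, \mathbb{I}\left(c_i \neq T^{(m)}(x_i)\right) \Big/ \sum_{i=1}^{n} w_i$.
3. Compute $e_m \leftarrow \log\left((1 - err_m)/err_m\right)$.
4. Set $w_i \leftarrow w_i \cdot \exp\left(e_m \cdot \mathbb{I}\left(c_i \neq T^{(m)}(x_i)\right)\right)$ for $i = 1, \ldots, n$.
5. Renormalize $w_i$.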
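One boosting round of the weight update in Algorithm 1 can be sketched in plain Python as follows. The function name `adaboost_round` and the four-sample example are illustrative assumptions; fitting the weak classifier itself is left out, so its predictions are passed in directly.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """Perform one AdaBoost weight update; return (err_m, e_m, new_weights)."""
    # Indicator of misclassification for every training sample.
    missed = [int(c != t) for c, t in zip(y_true, y_pred)]
    # Weighted error rate of the current weak classifier.
    err_m = sum(w * m for w, m in zip(weights, missed)) / sum(weights)
    if not 0.0 < err_m < 1.0:
        raise ValueError("err_m must lie strictly between 0 and 1")
    # Classifier weight: large when the weak classifier is accurate.
    e_m = math.log((1.0 - err_m) / err_m)
    # Boost the weights of misclassified samples only, then renormalize.
    boosted = [w * math.exp(e_m * m) for w, m in zip(weights, missed)]
    total = sum(boosted)
    return err_m, e_m, [w / total for w in boosted]


# Four equally weighted samples; the weak classifier misses the second one.
err_m, e_m, new_weights = adaboost_round(
    weights=[0.25, 0.25, 0.25, 0.25],
    y_true=[1, 1, 0, 0],
    y_pred=[1, 0, 0, 0],
)
print(err_m, e_m, new_weights)
```

After the update, the single misclassified sample holds half of the total weight, so the next weak classifier is pushed to get it right.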

Performance Metrics and Evaluation Protocol
In this section, we analyze the performance of our method on a two-class classification problem (normal traffic vs. threats) using the UNSW-NB15 dataset with a 10-fold cross-validation technique. The details of the classes are given in the previous section on our proposed method.
This section explains the performance evaluation and the evaluation metrics used in the experiments. Then, we discuss the impact of using fewer potential features. Finally, a comparison is shown between the SVM, ANN, and Adaboost classification algorithms.
We conducted several experiments to determine the effectiveness of the proposed system. We used TensorFlow with Python for the implementation of our proposed model. All of our assessments were performed on a 64-bit Windows system running GPU-enabled TensorFlow with a Core i7-10750H (10th Gen), 32 GB RAM, and an NVIDIA GeForce RTX 2070 Super (8 GB) GPU.
We used the UNSW-NB15 dataset (described in Section 3.2) to conduct our assessments. In NIDS, this dataset is treated as a benchmark. In addition, the use of this dataset helps draw comparisons with current approaches and studies.
We used the following quantities in this study to evaluate the system:
i. True Positive (TP): attack data correctly classified as an attack.
ii. False Positive (FP): normal data incorrectly classified as an attack.
iii. True Negative (TN): normal data correctly classified as normal.
iv. False Negative (FN): attack data incorrectly classified as normal.
We used different performance metrics to measure the performance of our method. Accuracy is the fraction of all samples that are classified correctly (Equation (5)).
Precision is the fraction of samples classified as attacks that are actually attacks (Equation (6)).
Recall is the fraction of actual attacks that are classified as attacks (Equation (7)).
The F-score is the harmonic mean of precision and recall, which is a derived effectiveness measurement (Equation (8)).
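The four metrics follow directly from the TP, FP, TN, and FN counts. The sketch below uses their standard definitions; the function name and the example counts are illustrative and are not results from this study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F-score from confusion counts."""
    # Assumes every denominator is non-zero (at least one predicted and one
    # actual attack); a production version would guard against division by zero.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```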
We trained the models on a massive dataset with no duplication of records for evaluating the performance and to overcome the overfitting and underfitting problems.
We tested our models with various parameter settings and, based on the results, arrived at the following parameters, which gave the best results:
i. The SVM was configured with kernel = 'RBF', degree = 3, gamma = 'scale', and shrinking = 'true'; these parameters were selected as they showed the best results. The rest of the parameters were set to their defaults.
ii. The ANN was tuned to four hidden layers, chosen from the pool {4,6,8}, because increasing the number of hidden layers had no effect on the results; the layer sizes were 256, 128, 64, and 32, respectively. We used 'relu' as the activation function and 'Adam' as the solver for weight optimization, as they gave optimal results.
iii. The Adaboost classifier was adopted on top of the decision tree classifier with maximum depth = 2 and algorithm = 'SAMME.R'. The maximum depth was kept at two because increasing it causes the model to overfit, and the 'SAMME.R' algorithm was selected owing to its faster convergence.
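The parameter names listed above (gamma = 'scale', solver 'Adam', 'SAMME.R') match scikit-learn's estimators, so the sketch below assumes scikit-learn even though the text names TensorFlow as the implementation framework. The function name `build_models` and the `random_state` argument are illustrative additions. The `algorithm='SAMME.R'` option is deliberately not set, because recent scikit-learn releases deprecated it; the library default is used instead, which may not reproduce the reported configuration exactly.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(random_state: int = 0) -> dict:
    """Return the three classifiers configured with the reported settings."""
    # 'degree' is kept for fidelity to the text; the RBF kernel ignores it.
    svm = SVC(kernel="rbf", degree=3, gamma="scale", shrinking=True)
    ann = MLPClassifier(
        hidden_layer_sizes=(256, 128, 64, 32),
        activation="relu",
        solver="adam",
        random_state=random_state,
    )
    # The base tree is passed positionally because its keyword name changed
    # across scikit-learn releases.
    adaboost = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=2, random_state=random_state),
        random_state=random_state,
    )
    return {"SVM": svm, "ANN": ann, "AdaBoost": adaboost}


models = build_models()
print(sorted(models))
```

Each returned estimator exposes the usual `fit` and `predict` methods, so the same training and evaluation loop can be applied to all three for a like-for-like comparison.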

Experimental Results and Discussion
In this section, we analyze the performance of our proposed system for the classification problem using the UNSW-NB15 dataset. We evaluate the results of SVM, ANN, and Adaboost on top of the decision tree classifier for a binary classification problem of network intrusion detection on the UNSW-NB15 dataset. Most of the work done revolves around the traditional datasets. We focus on the more complex network dataset using reduced features and apply three algorithms to compare the performance. This research aims to accurately classify normal network traffic and network threats from suspicious network activities.
In this paper, we compared various techniques for classifying network data into the form of threats or non-threats. We tested multiple models, including the use of ANN, SVM, and finally our proposed technique of Adaboost based on the decision tree classifier. While ANN uses an activation function to generate outputs, it performs weight updates at each neuron to improve the results. We achieve an accuracy of 89.54% when we set the input parameters to four hidden layers, with 256, 128, 64, and 32 hidden sizes; 'relu' as the activation function; and 'Adam' as a solver for weight optimization.
In this study, we also used a support vector machine (SVM) to classify network intrusions. In SVM, each sample is represented as a point in an n-dimensional space, and we find the hyperplane that maximizes the margin between the classes, which separates them well. With the input parameters kernel = 'RBF', degree = 3, gamma = 'scale', and shrinking = 'true', we achieve an accuracy of 94.7%.
The proposed approach we present in this paper is Adaboost-based decision tree classification. The proposed method uses decision tree as a classifier, while Adaboost is used for weight updates. A correlation matrix was computed to find the correlation of the input features. Depending on the results, we performed feature selection, where we removed some features on the following bases: i They are highly correlated with one another. ii They do not affect the performance of our model.
The parameters we used in our Adaboost model were maximum depth = 2 and algorithm = 'SAMME.R', with which we achieved an accuracy of 99.3%. The results of the three models for the NIDS problem on the UNSW-NB15 dataset are compared in Table 5, and Figure 4 presents the accuracy, precision, recall, and F-score of the SVM, ANN, and Adaboost-based decision tree classifier models. The outcomes show that the results achieved by our model are overall higher than those of other existing approaches, with the highest accuracy of 99.3%.

Comparison
This section compares the different approaches used to detect network intrusion anomalies and their datasets. Table 6 presents a detailed overview of some of the techniques used to detect network anomalies using the UNSW-NB15 dataset. Previously, various datasets were used for network intrusion detection, but none of them were as comprehensive as UNSW-NB15, which contains nine different attack categories. Although we classify only threats and normal traffic in our approach, we obtain a variety of input network features to train the model. An ANN (artificial neural network) [22] has been used on the UNSW-NB15 dataset as a whole and obtained an accuracy of 84%. Adaboost-based neural network learning [5] has also been used, where an NN (neural network) serves as the base model and Adaboost is used to achieve an 86.40% classification accuracy. Wei et al. [35] used an Adaboost algorithm for network intrusion detection. They present a four-module system: feature extraction, data labeling, weak classifier design, and robust classifier construction. Furthermore, an improved objective function and weight initialization method are presented to adjust the false positive ratio (FPR) and detection rate (DR). The authors test their approach on the KDD CUP 99 dataset and obtain a maximum accuracy of 90.88%. Wei et al. [36] also propose an improved Adaboost algorithm that uses decision stumps as weak classifiers. They combine two weak classifiers, for continuous and categorical features, into a robust classifier. They show by experimentation that their algorithm has low computational complexity and error rates. Their algorithm provides detection rates between 90.4% and 90.88%. Naseer et al. [7] proposed a deep convolutional neural network (DCNN), which is fine-tuned using randomized search over the configuration space. Furthermore, they test their approach using the NSL-KDD dataset. Mabu et al. [37] presented a novel fuzzy class association rule mining approach using genetic network programming (GNP) to detect network intrusions. Decision trees and negative selection algorithms have also been used to detect network intrusions, as illustrated in Table 6. The highest accuracy of 99.3% is achieved by our proposed model using the Adaboost-based decision tree classifier, owing to the optimal features selected from the dataset. These comparisons indicate that our model's results are excellent compared with other methods based on Adaboost-based decision tree classifiers.

Conclusions and Future Work
In this study, we proposed a feature selection method based on the correlation matrix among all the features of the UNSW-NB15 dataset. We also proposed an approach for network intrusion detection based on the selected features using the Adaboost-based decision tree classifier. Furthermore, we discussed the issues faced by current NIDS (network intrusion detection system) techniques. The proposed method first involves feature selection based on a correlation matrix. We discarded the features that were highly correlated with other input features and had little effect on the output label. The technique is based on AdaBoost on top of the decision tree classifier. The proposed method for network intrusion detection has very high accuracy on the UNSW-NB15 dataset, as the model was trained and tested on the best discriminating features. In comparison with previous works, we assessed the abilities of our model based on the dataset used and demonstrated consistent accuracy in the classification. We evaluated our proposed system using different performance metrics and compared it with state-of-the-art techniques to show that the proposed NIDS performs better than the existing systems. The proposed NIDS will be helpful in network security applications and research domains.
Furthermore, this work can be extended to the real world to identify the dynamic intrusions on live network traffic. An analysis of suitable classifiers can also be performed to detect and analyze the performance of the classifiers. We intend to perform multiclass analysis to detect and classify different types of threats based on their categories.