An Efficient Two-Stage Network Intrusion Detection System in the Internet of Things

Zhang, Hongpo; Zhang, Bo; Huang, Lulu; Zhang, Zhaozhe; Huang, Haizhaoyang

doi:10.3390/info14020077


by Hongpo Zhang ¹,²,*, Bo Zhang ¹, Lulu Huang ², Zhaozhe Zhang ¹ and Haizhaoyang Huang ¹

¹ School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450001, China
² Cooperative Innovation Center of Internet Healthcare, Zhengzhou University, Zhengzhou 450001, China
* Author to whom correspondence should be addressed.

Information 2023, 14(2), 77; https://doi.org/10.3390/info14020077

Submission received: 29 November 2022 / Revised: 19 January 2023 / Accepted: 26 January 2023 / Published: 29 January 2023


Abstract: Internet of Things (IoT) devices and services provide convenience but face serious security threats. The network intrusion detection system is vital in ensuring the security of the IoT environment. We propose a novel two-stage intrusion detection model for the IoT environment that combines machine learning and deep learning to deal with the class imbalance of network traffic data and achieve fine-grained intrusion detection on large-scale flow data. The superiority of the model is verified on the newer and larger CSE-CIC-IDS2018 dataset. In Stage-1, the LightGBM algorithm distinguishes normal from abnormal network traffic data and is compared with six classic machine learning techniques. In Stage-2, the Convolutional Neural Network (CNN) performs fine-grained attack class detection on the samples predicted to be abnormal in Stage-1. The Stage-2 multiclass classification achieves a detection rate of 99.896%, an $F_1$-score of 99.862%, and an MCC of 95.922%. The total training time of the two-stage model is 74.876 s, and the detection time of a sample is 0.0172 milliseconds. Moreover, we set up an optional Synthetic Minority Over-sampling Technique based on the imbalance ratio (IR-SMOTE) of the dataset in Stage-2. Experimental results show that, compared with SMOTE technology, the two-stage intrusion detection model adapts well to imbalanced datasets and offers higher efficiency and better performance when processing large-scale flow data, outperforming state-of-the-art intrusion detection systems.

Keywords: internet of things; network intrusion detection; convolutional neural network; class imbalance; LightGBM

1. Introduction

With the development of the Internet of Things (IoT), IoT technology has been widely used in wireless telecommunications scenarios [1], such as smart cities [2], smart health care and smart transportation. On the one hand, IoT is constantly changing our daily life and working methods; on the other hand, it faces serious security threats [3]. Intrusion detection systems (IDSs) can identify abnormal or malicious activities in the network and hosts and are an important security technology to ensure the security of the IoT [4].

According to the monitored data source, IDSs can be divided into host-based IDS (HIDS) and network-based IDS (NIDS). HIDS is generally deployed on the hosts that need to be protected and monitors system logs and other host information. NIDS is installed on the host or switch and mainly monitors network traffic data, including packets, flows, and sessions [5]. According to the detection principle, IDSs can also be divided into misuse detection and anomaly detection [6]. Misuse detection extracts network traffic characteristics and matches them against a feature database maintained in advance. If a match is found, the network traffic is considered an attack. Misuse detection has a low false alarm rate (FAR). However, it cannot detect unknown attacks and needs to maintain a huge feature database. Anomaly detection learns a normal behavior pattern in advance and then flags any network traffic that deviates from that pattern as abnormal. The FAR of anomaly detection is slightly higher than that of misuse detection, but anomaly detection has attracted wide attention because of its ability to identify unknown attacks and its strong versatility.

Traditional machine learning (ML) techniques have been widely used in anomaly-based network intrusion detection, such as random forest (RF) [7,8], K-nearest neighbor (KNN) [9,10], logistic regression (LR) [6], AdaBoost [11], decision tree (DT) [12], XGBoost [13], and the gradient boosting decision tree series [11,14,15]. IDSs based on machine learning have advantages such as simple implementation and strong interpretability. However, with the growth of network traffic and the diversification of attack types, shallow machine learning techniques alone cannot meet the requirements of large-scale NIDS [5]. Deep learning techniques, such as CNN [16,17,18,19], recurrent neural network (RNN) [3,20], and multilayer perceptron (MLP) [21,22], are increasingly used to process large-scale data and show superior performance.

Previous NIDS schemes have largely ignored the importance of new datasets and of the class imbalance issue. The dataset plays a vital role in learning behavior patterns for anomaly detection. Because attack methods keep improving, old network traffic is not enough to reflect the actual performance of a NIDS in the modern network environment. Network traffic contains a large amount of normal data, whereas abnormal data account for only a small part. A single-layer intrusion detection model trained directly on such imbalanced data generally performs poorly in fine-grained attack class recognition. Therefore, it is necessary either to solve the class imbalance problem in the data or to improve the intrusion detection model so that the IDS can better deal with increasingly serious network security threats. Current imbalance treatment schemes can be categorized into four groups [23], namely, data-level, algorithm-level, cost-sensitive, and ensemble methods. The data-level method balances the data categories by resampling data samples. The algorithm-level method compensates for the biased detection results caused by class imbalance by improving the classification algorithm. The cost-sensitive method [2] alleviates class imbalance by increasing the cost of misclassifying the minority class during model training. The ensemble method obtains a high-performance strong classifier by combining multiple base learners.

Considering the class imbalance in large-scale network flow data, we present a two-stage network intrusion detection model based on machine learning and deep learning for the IoT environment. We evaluate it on the CSE-CIC-IDS2018 dataset, which is a newer and larger network traffic dataset. The following are our main contributions:

(1): For the intrusion detection of large-scale flow data, we propose a two-stage fine-grained network intrusion detection model that combines LightGBM algorithm and CNN. The model improves detection accuracy and efficiency while making full use of large-scale data.
(2): In Stage-1, we use the LightGBM algorithm to identify normal and abnormal network traffic and compare six other classic machine learning classification algorithms, including RF, DT, LR, KNN, AdaBoost, and XGBoost. The experimental results show that the LightGBM algorithm has advantages in accuracy and time cost.
(3): In Stage-2, we use CNN to perform fine-grained attack class detection on the samples predicted to be abnormal in Stage-1. Aiming at the problem of class imbalance, we use the improved SMOTE based on the imbalance ratio (IR-SMOTE) to study the effect of different class imbalance ratios in training set on the performance of the model. Experimental results show that the two-stage intrusion detection model can adapt well to imbalanced large-scale network flow data.

The rest of this paper is organized as follows: Section 2 introduces the current research on NIDS from two aspects: IoT and class imbalance processing. Section 3 describes the two-stage intrusion detection model and dataset. Experimental results and discussion analysis are given in Section 4. Section 5 summarizes the work.

2. Related Work

The standard IoT architecture consists of the perception, network, and application layers [24]. The IoT is more vulnerable to attacks than general networks [1]. The two main types of attacks on the IoT are cyber and physical attacks [25]. The cyber attacks mainly include denial of service (DoS), malicious input attacks, and botnets. IDSs can detect various attacks on the IoT at an early stage. Scholars have proposed several IDSs for the IoT [4,20,26].

Machine learning can automatically learn useful information from a large amount of data and is therefore widely used to build intelligent IDSs. Lopez-Martin et al. [26] studied ID-CVAE, an intrusion detection method for the IoT based on a conditional variational autoencoder that integrates the attack category labels into the decoder layer. ID-CVAE has low complexity, good classification performance, and good feature reconstruction. Its accuracy on the NSL-KDD dataset is 80.10%, which is higher than that of algorithms such as RF and linear SVC. Rathore et al. [27] addressed the problem that existing machine learning based intrusion detection systems cannot sense recent unknown attacks when working in high-speed networks by proposing a classification model based on the C4.5 decision tree. FSR and BER technologies are used to select the nine best features from the 41 features in the KDD99 intrusion dataset, realizing a real-time intrusion detection system for high-speed environments with fewer flow features. Koroniotis et al. [12] applied four machine learning algorithms, namely, association rule mining (ARM), artificial neural network (ANN), decision tree C4.5 (DT), and naive Bayes (NB), to the forensics of IoT botnet activities on the UNSW-NB15 dataset. C4.5 performed best in identifying botnet activities, with an accuracy of 93.23% and a FAR of 6.77%. Hosseini et al. [28] proposed a hybrid intrusion detection method called MGA-SVM-HGS-PSO-ANN based on three evolutionary algorithms. They used a feature selection method that combines support vector machine (SVM) with a genetic algorithm with multiple parental crossovers and multiple parental mutations (MGA) to reduce the dimensionality of the dataset and then trained a three-layer ANN for classification using hybrid gravity search (HGS) and particle swarm optimization (PSO). In a comparison with various popular technologies on the NSL-KDD dataset, simulation results show that MGA-SVM-HGS-PSO-ANN achieves a maximum detection accuracy of 99.3%. Almaiah et al. [29] proposed an intrusion detection model based on the PCA feature selection technology and SVM with four different kernel functions. The results show that the SVM based on the Gaussian radial basis function achieves the best classification performance on both the KDD Cup’99 and UNSW-NB15 datasets.

As a branch of machine learning, deep learning can automatically learn feature representations from raw data and is better at processing large-scale data than traditional machine learning [5]. Zhang et al. [21] designed an effective MLP-based network intrusion detection model after using a weighted denoising autoencoder for feature selection. The accuracy of this model on the UNSW-NB15 dataset is 98.80%, and its FAR is 0.57%. Alzaqebah et al. [30] optimized the Grey Wolf Optimization (GWO) algorithm using information gain, a smart initialization method that improves the ability of the IDS to detect generic attacks on the UNSW-NB15 dataset. The accuracy, F1-score, and G-mean measures are 81%, 78%, and 84%, respectively. Kanimozhi et al. [22] proposed a NIDS that uses ANNs to detect botnet attacks. After hyperparameter optimization with GridSearchCV, the MLP performed normal and anomaly recognition on the CSE-CIC-IDS2018 dataset. The accuracy on the test set is 99.97%, and the AUC is 99.91%. Almiani et al. [20] designed a two-layer cascaded IoT intrusion detection model based on RNN, trained with the backpropagation algorithm, and carried out a simulation experiment on the NSL-KDD dataset. The first layer separates normal data from anomalies and mainly identifies DoS attacks. The second layer re-examines the samples that the first layer classified as normal in order to find hidden attacks that were missed. The experimental results show the effectiveness of the model in real-time IoT, and the model is highly sensitive to DoS attacks. Gamage et al. [31] conducted intrusion detection experiments with four deep learning models, namely the feed-forward neural network, autoencoder, deep belief network, and long short-term memory network, on four datasets. The results show that the supervised deep feed-forward neural network (ANN) performs best and outperforms the two semi-supervised learning technologies, autoencoder and deep belief network. On the CSE-CIC-IDS2018 dataset, the accuracy of the deep ANN is 98.38%, the $F_1$-score is 97.85%, and the training time is 3 min.

The class imbalance problem in network traffic data is one of the main problems that limit the performance of intrusion detection models. More and more scholars have begun paying attention to the class imbalance problem, and several studies have proposed solutions. Abdulhammed et al. [32] used Uniform Distribution Based Balancing (UDBB) in an anomaly-based IDS. Experimental results show that UDBB can handle the class imbalance problem of the CICIDS2017 dataset and achieves the best performance when combined with RF. Huang et al. [33] proposed a NIDS based on an imbalanced generative adversarial network (IGAN) to solve the problem of imbalanced intrusion detection in dynamic decentralized ad-hoc networks. They used IGAN to generate minority data samples and CNN to classify network traffic. A comparative study with 15 other methods on three public datasets (NSL-KDD, UNSW-NB15, and CICIDS2017) shows the advantages of IGAN-IDS. Experiments with different generation ratios and different imbalance ratios show that the system has good robustness. Zhang et al. [16] proposed a comprehensive sampling method, namely SGM, which combines SMOTE and the Gaussian Mixture model to address the class imbalance problem. The classification results using CNN on the UNSW-NB15 and CICIDS2017 datasets show that SGM-CNN is particularly effective in improving the detection rate of rare categories. Lin et al. [34] proposed a dynamic network anomaly detection system that adds an attention mechanism to LSTM and uses SMOTE and a weighted loss function to deal with the class imbalance problem. The overall accuracy of the system on the CSE-CIC-IDS2018 dataset reaches 96.2%, and the recall of six categories reaches 98%. The above works are summarized in Table 1.

The present study proposes a two-stage fine-grained network intrusion detection model that combines machine learning technology and CNN for large-scale flow data in the IoT environment. This model effectively saves time costs while making full use of large-scale data, improves efficiency, is highly adaptable to an imbalanced dataset, and accurately identifies specific attacks.

3. Proposed Solution

The proposed two-stage network intrusion detection model combines machine learning, class imbalance processing technology (IR-SMOTE), and CNN. Figure 1 shows the overall structure of the model. The system classifies network traffic in two stages. During the training phase, Stage-1 preprocesses the original dataset and then uses the LightGBM machine learning technique to distinguish normal from abnormal traffic. Stage-2 first randomly deletes s% of the benign data in the training set, then uses IR-SMOTE to resample the training set, and finally trains a CNN to perform the fine-grained classification. During the testing phase, Stage-1 first classifies the test set into benign data and attack data. Then, the data classified as attacks by the ML model are input to the Stage-2 CNN for fine-grained identification. We use the newer CSE-CIC-IDS2018 dataset to test the performance of this two-stage network intrusion detection model in a modern large-scale network environment.

3.1. Benchmark Dataset

The CSE-CIC-IDS2018 dataset [35] was collected in 2018 through a collaborative project between the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC). It is currently one of the latest datasets in the field of intrusion detection. The CSE-CIC-IDS2018 dataset has 10 CSV files, containing 16.2 million traffic records and covering six scenarios, namely Brute-force, DoS, DDoS, Web attacks, Botnet, and infiltration. The attack scenarios can be subdivided into 14 specific attack classes. Table 2 shows the sample distribution of each class. The “Class Label” in Table 2 is the numeric class label that replaces the nominal attack type during data preprocessing. Table 2 shows that the dataset has a serious class imbalance problem. The “Benign” class accounts for 83.070% of the total data, whereas the smallest classes (“Brute Force-XSS” and “SQL Injection”) each account for only 0.001% of the total data. The dataset uses the CICFlowMeter-V3 tool to extract 80 features from network traffic [36].

3.2. Stage-1

In Stage-1, network traffic data are preprocessed, and LightGBM is then used to identify normal and abnormal network traffic.

3.2.1. Data Preprocessing

Data preprocessing adjusts the original dataset so that network traffic data can be easily input into the intrusion detection model for classification. We delete the “Timestamp” feature in the CSE-CIC-IDS2018 dataset because the time at which the network traffic occurred is not meaningful for training the current model. We then replace the special values in the data, namely missing values (“NaN”) and “Infinity” values. We fill the “NaN” values with zero and replace “Infinity” in the features “Flow Byts/s” and “Flow Pkts/s” with the maximum value of their column plus 1. We use StandardScaler to standardize each feature to a mean of 0 and a variance of 1. StandardScaler can be defined as Equation (1):

$$ x' = \frac{x - x_{\mu}}{x_{\delta}}, \tag{1} $$

where $x$, $x_{\mu}$ and $x_{\delta}$ are the value, mean and standard deviation of the original samples, respectively, and $x'$ is the standardized sample. Finally, we replace the 15 nominal class labels of the dataset with integers between 0 and 14. The specific correspondence is shown in Table 2. At this point, the CSE-CIC-IDS2018 dataset has 78 behavior features and a class label. In Stage-1, we split the dataset into a training set and a test set at a ratio of 8:2, and the proportion of each class is the same in each subset.
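The cleaning and scaling steps above can be sketched for a single feature column as follows. This is a minimal illustration in plain Python, not the authors' code: the function names and sample values are hypothetical, the column maximum is assumed to be taken over finite values only, and negative infinity is assumed to be handled the same way as positive infinity. The population standard deviation is used because that is what scikit-learn's StandardScaler computes.

```python
import math
from statistics import fmean, pstdev

def clean_column(values):
    """Fill NaN with 0 and replace infinite values with (finite column maximum + 1)."""
    finite = [v for v in values if not math.isnan(v) and not math.isinf(v)]
    cap = (max(finite) if finite else 0.0) + 1.0  # assumption: max over finite values
    cleaned = []
    for v in values:
        if math.isnan(v):
            cleaned.append(0.0)
        elif math.isinf(v):
            cleaned.append(cap)
        else:
            cleaned.append(v)
    return cleaned

def standardize(values):
    """Equation (1): subtract the column mean, divide by the standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (ddof = 0)
    if std == 0.0:        # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Made-up values standing in for a column such as "Flow Byts/s".
flow_bytes = [10.0, float("nan"), 30.0, float("inf")]
cleaned = clean_column(flow_bytes)
scaled = standardize(cleaned)
```

In practice the same steps would normally be applied to the whole table at once with a data-frame library, and the scaler would be fitted on the training split only.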

3.2.2. Baseline Methods

First, we consider the following six classic machine learning classification algorithms for comparison. Decision tree [12] is a classic machine learning method that can provide strong performance. Random Forest [7,8] is an ensemble method. Because RF is built from a multitude of decision trees, it has stronger generalization ability than a single decision tree. Both AdaBoost [11] and XGBoost [13] are classic boosting algorithms. AdaBoost’s basic strategy is to train multiple weak classifiers on the training set, and the final model is their weighted combination. XGBoost uses decision trees as its weak classifiers and is implemented as an optimized distributed gradient boosting library that is efficient and good at large-scale parallelism. Logistic regression [6] is a commonly used linear model, mainly applied to binary classification problems. KNN [9,10] finds the K training samples nearest to a given sample, measured by Euclidean distance, and gives the classification or prediction result according to those neighbors.

LightGBM is a highly efficient Gradient Boosting Decision Tree (GBDT) [37]. The computational complexity of traditional GBDT is proportional to the number of samples and the number of features, so the time cost is high when processing large-scale data. LightGBM alleviates this problem by reducing the number of samples and the number of features, which makes it suitable for high-dimensional, large-scale data scenarios. LightGBM uses Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) to improve GBDT and achieve high-efficiency classification. To reduce the number of samples, given that data samples with large gradients play a vital role in the calculation of information gain, GOSS keeps the samples with large gradients and randomly samples only a small fraction of the samples with small gradients when estimating information gain. To reduce the number of features, EFB bundles mutually exclusive features. In the process of learning decision trees, LightGBM uses a histogram-based algorithm, which is excellent in terms of memory consumption and training speed, to find the best split point.
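The sampling idea behind GOSS can be sketched as below. This is an illustrative plain-Python version of the procedure as described in the LightGBM paper (keep the top fraction of samples by absolute gradient, randomly sample a fraction of the rest, and up-weight the sampled rows), not LightGBM's internal code. The function name, the default rates, and the gradient values are assumptions made for the example.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by Gradient-based One-Side Sampling."""
    rng = random.Random(seed)
    n = len(gradients)
    # Rank samples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top = order[:n_top]                                   # keep all large-gradient samples
    rest = order[n_top:]
    sampled = rng.sample(rest, min(n_other, len(rest)))   # random subset of the rest
    # Up-weight the sampled small-gradient rows to keep the gain estimate unbiased.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights

# Made-up gradients for ten samples.
grads = [0.9, -0.05, 0.1, -0.8, 0.02, 0.03, 0.5, -0.01, 0.04, 0.06]
kept, kept_weights = goss_sample(grads)
```

With these rates only 3 of the 10 samples are used to estimate the information gain, which is where the reduction in training cost comes from.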

In Stage-1, in addition to using LightGBM to identify normal and abnormal in the dataset, we compare six other classic machine learning algorithms, namely, RF, DT, LR, KNN, AdaBoost, and XGBoost. The data predicted to be benign in Stage-1 are ignored and no longer processed, and data samples predicted to be abnormal are inputted into Stage-2 to distinguish the attack classes further.
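The routing between the two stages can be sketched as follows. This is a hedged illustration only: `stage1_is_attack` and `stage2_attack_class` are hypothetical stand-ins for the trained LightGBM and CNN models, the toy rules inside them are invented, and the benign label value 0 is an assumption (the actual mapping is given in Table 2).

```python
def two_stage_predict(samples, stage1_is_attack, stage2_attack_class, benign_label=0):
    """Send each sample through Stage-1, and through Stage-2 only if flagged abnormal."""
    predictions = []
    for x in samples:
        if stage1_is_attack(x):
            predictions.append(stage2_attack_class(x))  # fine-grained attack class
        else:
            predictions.append(benign_label)            # benign: no further processing
    return predictions

# Toy stand-ins for the two trained models (invented rules, not real detectors).
def toy_stage1(x):
    return x["pkts_per_s"] > 1000

def toy_stage2(x):
    return 3 if x["dst_port"] == 80 else 7

flows = [
    {"pkts_per_s": 5, "dst_port": 443},
    {"pkts_per_s": 5000, "dst_port": 80},
    {"pkts_per_s": 2000, "dst_port": 22},
]
labels = two_stage_predict(flows, toy_stage1, toy_stage2)
```

Because most traffic is benign, only a small share of samples ever reaches the more expensive second stage.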

3.3. Stage-2

Stage-2 is responsible for the fine-grained identification of attack types for the samples predicted to be abnormal in Stage-1. First, we delete $s\%$ of the “Benign” data in the Stage-1 training set to obtain the new training set for Stage-2. The value of $s$ must meet one condition: the number of retained “Benign” samples is balanced with that of the attack class with the most samples. Then, SMOTE is used to oversample the training data according to the IR. Finally, we train a CNN model to perform fine-grained attack type recognition on abnormal samples.
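One way to read the condition on s is that the retained benign count should equal the size of the largest attack class. The sketch below works under that assumption; the function names and class counts are made up for illustration and are not taken from the dataset tables.

```python
import random

def benign_drop_percent(class_counts, benign="Benign"):
    """Percentage s of benign samples to delete so that the retained benign
    count equals the size of the largest attack class."""
    largest_attack = max(n for name, n in class_counts.items() if name != benign)
    s = 100.0 * (1.0 - largest_attack / class_counts[benign])
    return max(0.0, s)  # nothing to delete if benign is already smaller

def undersample_benign(labels, s, benign="Benign", seed=0):
    """Return the indices kept after randomly deleting s% of the benign rows."""
    rng = random.Random(seed)
    benign_idx = [i for i, y in enumerate(labels) if y == benign]
    n_drop = round(len(benign_idx) * s / 100.0)
    dropped = set(rng.sample(benign_idx, n_drop))
    return [i for i in range(len(labels)) if i not in dropped]

# Hypothetical class counts.
counts = {"Benign": 1000, "DoS": 50, "Bot": 20}
s = benign_drop_percent(counts)
y = ["Benign"] * 1000 + ["DoS"] * 50 + ["Bot"] * 20
kept_idx = undersample_benign(y, s)
```

Only the benign rows are touched; every attack sample is retained for the Stage-2 training set.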

3.3.1. IR-SMOTE

A serious class imbalance problem exists in the intrusion detection dataset, which impairs the recognition of minority classes by classification models that assume a balanced class distribution. Suppose the dataset is $D = \{D_{i}, i = 1, 2, \dots, C\}$, where $C$ is the total number of classes and $|D_{i}|$ is the number of samples in the $i$-th class. The degree of imbalance of the dataset can be measured by the imbalance ratio (IR) in Equation (2). IR is defined as the ratio of the number of instances in the majority (max) class to the number in the minority (min) class [11]:

$$ IR = \frac{\max\{|D_{i}|, i = 1, 2, \dots, C\}}{\min\{|D_{i}|, i = 1, 2, \dots, C\}}. \tag{2} $$
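Equation (2) translates directly into code. The sketch below is a minimal illustration; the function name and the label list are hypothetical.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Equation (2): size of the largest class divided by size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical label vector with three classes of 900, 90 and 9 samples.
toy_labels = [0] * 900 + [1] * 90 + [2] * 9
ir = imbalance_ratio(toy_labels)
```

A perfectly balanced dataset has an IR of 1, and larger values indicate a more severe imbalance.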

To explore the impact of the degree of imbalance on the performance of the two-stage intrusion detection model, we propose an improved SMOTE based on IR, IR-SMOTE. SMOTE is a well known oversampling method, which increases the minority sample data by “synthesizing” the minority samples [38].

IR-SMOTE calculates the IR of the dataset according to Equation (2), sets a coefficient $\alpha$ ($\alpha = 0, 1, 2, \dots$; $0 \leq \alpha \leq \lfloor \lg IR \rfloor$) related to the degree of oversampling, and calculates the number of instances after resampling, $I_{Resample} = \min\{|D_{i}|, i = 1, 2, \dots, C\} \times 10^{\alpha}$. For each class with fewer than $I_{Resample}$ samples, we use SMOTE to oversample the class to $I_{Resample}$ samples. After processing with IR-SMOTE, the IR of the dataset decreases, and the degree of decrease is related to $\alpha$. Algorithm 1 provides the pseudocode of IR-SMOTE.

Algorithm 1 IR-SMOTE

Input: Training set $D = \{D_{i}, i = 1, 2, \dots, C\}$, where $C$ is the total number of classes and $|D_{i}|$ is the number of samples in class $i$
Output: Resampled training set $D'$
1: $IR = \frac{\max\{|D_{i}|, i = 1, 2, \dots, C\}}{\min\{|D_{i}|, i = 1, 2, \dots, C\}}$
2: Oversampling coefficient $\alpha$ ($\alpha = 0, 1, 2, \dots$; $0 \leq \alpha \leq \lfloor \lg IR \rfloor$)
3: $I_{Resample} = \min\{|D_{i}|, i = 1, 2, \dots, C\} \times 10^{\alpha}$
4: for $i \leftarrow 1$ to $C$ do (initialize $D_{i}' \leftarrow D_{i}$)
5:     if $|D_{i}| < I_{Resample}$ then
6:         while $|D_{i}'| < I_{Resample}$ do
7:             Choose a sample $x^{a}$ from $D_{i}$
8:             Randomly select a sample $x^{b}$ from the K nearest neighbor samples of $x^{a}$
9:             r = random(0, 1)
10:            Synthetic sample $x^{c} = x^{a} + (x^{b} - x^{a}) * r$
11:            Add the synthetic sample $x^{c}$ to $D_{i}'$
12:        end while
13:    end if
14:    $D' = Concatenate(D_{i}')$
15: end for
16: return $D'$
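A runnable sketch of the IR-SMOTE idea is given below. It is written in plain Python for self-containment and is not the authors' implementation: the function name, the default K of 5, the brute-force neighbor search, and the toy data are assumptions. The synthetic point uses the standard SMOTE interpolation $x^{a} + (x^{b} - x^{a}) \cdot r$, which places it on the segment between a sample and one of its same-class neighbors.

```python
import math
import random
from collections import defaultdict

def ir_smote(X, y, alpha, k=5, seed=0):
    """Oversample every class below min_class_size * 10**alpha with SMOTE-style points."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(list(row))
    sizes = [len(rows) for rows in by_class.values()]
    ir = max(sizes) / min(sizes)                      # Equation (2)
    if not 0 <= alpha <= math.floor(math.log10(ir)):
        raise ValueError("alpha must lie in [0, floor(lg IR)]")
    target = min(sizes) * 10 ** alpha                 # I_Resample

    X_out, y_out = [], []
    for label, rows in by_class.items():
        synthetic = []
        while len(rows) + len(synthetic) < target:
            a = rng.choice(rows)
            # K nearest same-class neighbors of a (brute force, excluding a itself).
            others = [p for p in rows if p is not a]
            neighbors = sorted(others, key=lambda p: math.dist(a, p))[:k]
            b = rng.choice(neighbors) if neighbors else a
            gap = rng.random()
            synthetic.append([ai + (bi - ai) * gap for ai, bi in zip(a, b)])
        X_out.extend(rows + synthetic)
        y_out.extend([label] * (len(rows) + len(synthetic)))
    return X_out, y_out

# Toy data: classes of 100, 10 and 2 samples, so IR = 50 and alpha may be 0 or 1.
data_rng = random.Random(1)
X = ([[data_rng.random(), data_rng.random()] for _ in range(100)]
     + [[5 + data_rng.random(), 5 + data_rng.random()] for _ in range(10)]
     + [[0.0, 10.0], [1.0, 11.0]])
y = [0] * 100 + [1] * 10 + [2] * 2
X_res, y_res = ir_smote(X, y, alpha=1)
```

With alpha = 1 the two smaller classes are raised to 20 samples each while the largest class is left unchanged, so the IR drops from 50 to 5. With alpha = 0 the data are returned without any oversampling.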

3.3.2. Convolutional Neural Network

The convolutional neural network (CNN) is a deep feedforward neural network with weight sharing and local connections. It is one of the representative algorithms of deep learning and has been widely used in image and video analysis [39]. A CNN is generally composed of cross-stacked convolutional, pooling, and dense layers and is trained using the backpropagation algorithm.

The convolutional layer can extract effective local features from the sample, and different convolution kernels are equivalent to different feature extractors. For a network traffic sample $X$ with $k$ features, we first convert $X$ into two-dimensional data $X' \in R^{M \times N}$ ($M \times N = k$) through the reshape() function and then input $X'$ into the two-dimensional convolutional layer. Given a filter (or convolution kernel) $W \in R^{m \times n}$, $m \ll M$, $n \ll N$, the operation performed by the convolutional layer can be expressed as follows:

$$ c_{i, j} = f\left(b_{i, j} + \sum_{u = 0}^{m - 1} \sum_{v = 0}^{n - 1} w_{u, v} \cdot x'_{i - u, j - v}\right), \tag{3} $$

where $c_{i, j}$ is the feature map extracted by the convolutional layer, $f(\cdot)$ is the nonlinear activation function, and $b_{i, j}$ is the learnable bias. The current feature map is obtained by convolving the filter with the previous output feature map or the original feature map and adding an offset. The nonlinear activation process can enhance the nonlinear learning ability of the network.

The pooling layer, also called the subsampling layer, is responsible for under-sampling the feature map. It reduces the number of features and thus the number of network parameters, which lowers network complexity and helps avoid overfitting. This study uses Max-Pooling, in which the output of a region is the maximum value of the feature map within that region. A feature map $c$ can be divided into multiple overlapping or non-overlapping regions $R_{m, n}$, and Max-Pooling can be expressed by Equation (4):

$$ Y_{m, n} = \max_{i \in R_{m, n}} c_{i}. \tag{4} $$
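Equations (3) and (4) can be illustrated with a small plain-Python sketch. Several choices here are assumptions rather than details from the paper: only positions where the kernel fits entirely inside the input are computed, a single scalar bias replaces $b_{i,j}$, ReLU is used as the activation, and pooling regions are non-overlapping. The input and kernel values are made up.

```python
def relu(z):
    return max(0.0, z)

def conv2d_valid(x, w, bias=0.0, activation=relu):
    """Equation (3) evaluated only where the m x n kernel fits inside the M x N input."""
    M, N = len(x), len(x[0])
    m, n = len(w), len(w[0])
    out = []
    for i in range(m - 1, M):
        row = []
        for j in range(n - 1, N):
            total = bias
            for u in range(m):
                for v in range(n):
                    total += w[u][v] * x[i - u][j - v]
            row.append(activation(total))
        out.append(row)
    return out

def max_pool(c, size=2):
    """Equation (4) with non-overlapping size x size regions; incomplete edges are dropped."""
    rows, cols = len(c) // size, len(c[0]) // size
    return [[max(c[p * size + a][q * size + b]
                 for a in range(size) for b in range(size))
             for q in range(cols)]
            for p in range(rows)]

# 5 x 5 toy input with x[i][j] = 5*i + j, and a 2 x 2 toy kernel.
image = [[5 * i + j for j in range(5)] for i in range(5)]
kernel = [[0, 1], [1, 0]]
feature_map = conv2d_valid(image, kernel)   # 4 x 4 feature map
pooled = max_pool(feature_map)              # 2 x 2 after 2 x 2 max-pooling
```

Pooling with 2 x 2 regions halves each spatial dimension of the feature map, which is how the parameter count of the following layers is reduced.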

The architecture of the CNN model designed in this study is shown in Stage-2 of Figure 1. The network data are converted into two-dimensional form and then input into the network. The first two layers are two-dimensional convolutional layers, each with 64 filters of size 3 × 3. Stacking multiple convolutional layers in this way uses fewer parameters and more nonlinear activations than directly using a single convolutional layer with a large kernel, which can enhance the feature learning ability of the network [40]. The third layer is the Max-Pooling layer, which under-samples the feature map of the convolutional layer to half of its original size. We then add a Dropout layer with a rate of 0.2 and a BatchNormalization layer. To reduce the risk of model overfitting, the Dropout layer discards 20% of the output features. The BatchNormalization layer standardizes the features within each batch during training, so the data passing through it are rescaled. After the data are flattened by the flatten function, they pass through three dense layers in turn. The first two dense layers have 128 and 64 neural units, respectively. The convolutional layers and the first two dense layers use the ReLU activation function. The third dense layer is the output layer: its number of neural units equals the total number of classes in the dataset, and it uses the Softmax activation function.

4. Experiments and Evaluation

To evaluate the effectiveness of the proposed two-stage network intrusion detection model in modern large-scale network environments, we use Python 3, Keras 2.2.4, and TensorFlow to run experiments on the CSE-CIC-IDS2018 dataset on a computer with the Ubuntu 19.04 operating system. The computer’s CPU is an i9-9900KF, the GPU is an RTX 2080 Ti, and the memory is 64 GB.

4.1. Evaluation Metrics

We use the confusion matrix in Table 3. TN and FP are the numbers of actually negative samples that are predicted to be negative and positive, respectively. FN and TP are the numbers of actually positive samples that are predicted to be negative and positive, respectively. We regard each attack class in turn as the positive class and the others as the negative class.

To measure the performance of the model, based on the confusion matrix shown in Table 3, we use seven evaluation indicators: accuracy (ACC), false alarm rate (FAR), detection rate (DR), recall, precision, $F_1$-score, and Matthews correlation coefficient (MCC). MCC is a correlation coefficient between the actual classification and the predicted classification. It is considered more suitable for evaluating classification tasks with class imbalance [41]. The formula for MCC is shown in Equation (5), and the formulas for the other evaluation indicators are shown in Equations (6)–(10). The formula for recall is the same as that for DR. In Stage-2 fine-grained recognition, to reasonably evaluate the model’s ability to handle imbalanced datasets, MCC is averaged directly over the classes, whereas every other indicator is computed as a weighted average according to the proportion of samples in each class.

$$ MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FN)(TP + FP)(TN + FP)(TN + FN)}}, \tag{5} $$

$$ ACC = \frac{TP + TN}{TP + TN + FP + FN}, \tag{6} $$

$$ DR = \frac{TP}{TP + FN}, \tag{7} $$

$$ FAR = \frac{FP}{FP + TN}, \tag{8} $$

$$ Precision = \frac{TP}{TP + FP}, \tag{9} $$

$$ F_{1}\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}. \tag{10} $$
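Equations (5)–(10) and the weighted averaging described above can be sketched as follows. The function names and the confusion-matrix counts are made up for illustration, and returning 0 when a denominator is zero is an assumption of this sketch rather than something the paper specifies.

```python
import math

def _ratio(num, den):
    return num / den if den else 0.0  # assumption: report 0 when undefined

def binary_metrics(tp, tn, fp, fn):
    """Evaluation indicators of Equations (5)-(10) from one confusion matrix."""
    recall = _ratio(tp, tp + fn)                       # same formula as DR
    precision = _ratio(tp, tp + fp)
    mcc_den = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    return {
        "MCC": _ratio(tp * tn - fp * fn, mcc_den),
        "ACC": _ratio(tp + tn, tp + tn + fp + fn),
        "DR": recall,
        "FAR": _ratio(fp, fp + tn),
        "Precision": precision,
        "F1": _ratio(2 * precision * recall, precision + recall),
    }

def weighted_average(values, supports):
    """Average per-class values weighted by the number of samples in each class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)

# Hypothetical counts for one class treated as positive.
metrics = binary_metrics(tp=90, tn=880, fp=20, fn=10)
avg = weighted_average([1.0, 0.5], [3, 1])
```

For multiclass results, each class is treated as positive in turn, `binary_metrics` is evaluated per class, and the per-class values are then combined with `weighted_average` (or a plain mean in the case of MCC).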

4.2. Stage-1: Normal and Abnormal Recognition

In Stage-1, the LightGBM algorithm is used to identify normal and abnormal network traffic data. To highlight the advantages of the LightGBM algorithm, we also compare six other machine learning techniques, namely, RF, DT, LR, KNN, AdaBoost, and XGBoost. The machine learning algorithms used in this study are implemented based on the Scikit-learn (sklearn) module, and the algorithm parameters are all default parameters.

Table 4 shows the performance of the LightGBM algorithm and the six other machine learning algorithms in recognizing normal and abnormal samples on the CSE-CIC-IDS2018 dataset in Stage-1. The evaluation indicator values of LightGBM and XGBoost are considerably better than those of the other algorithms. LightGBM’s ACC is 99.135%, DR is 95.009%, precision is 99.878%, $F_1$-score is 97.383%, and MCC is 96.909%. The evaluation indicator values of LightGBM are less than 0.1% lower than those of XGBoost, but LightGBM trains approximately 70 times faster. The training time of XGBoost is 3989.552 s, whereas the training time of LightGBM is only 56.880 s. The high accuracy and low time cost show that the LightGBM algorithm is suitable for processing large-scale intrusion detection data.
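As a quick arithmetic check of the speed-up quoted above, the ratio of the two reported training times is:

```latex
\frac{t_{\mathrm{XGBoost}}}{t_{\mathrm{LightGBM}}}
  = \frac{3989.552\ \mathrm{s}}{56.880\ \mathrm{s}} \approx 70.1
```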

In Stage-1, considering both the evaluation indicators and the time cost, LightGBM performs best, followed in turn by XGBoost, RF, DT, and AdaBoost. LR performs worst because of its lower accuracy and long training time. Although KNN has good evaluation indicators, its training time is the longest, at 51,138.024 s. DT and RF perform comparably: neither reaches an ACC of 99%, and both take longer to train than LightGBM. AdaBoost, LightGBM, and XGBoost are all boosting methods, but the ACC of AdaBoost is lower than that of the other two, and its time cost is higher.

4.3. Stage-2: Fine-Grained Classification

Stage-2 performs the fine-grained classification of the samples predicted to be abnormal in Stage-1. Consequently, the Stage-2 test set contains only a few benign samples. To make the model pay more attention to the attack categories with few samples, we delete most of the “Benign” samples in the training set. For the CSE-CIC-IDS2018 dataset, we randomly delete 95% of the “Benign” samples in the training set, after which the number of “Benign” samples is roughly balanced with that of “DDOS attack-HOIC,” the attack class with the most samples. We then split off another 10% of the training set as the validation set. The sample distributions of the training, validation, and test sets in Stage-2 are shown in Table 5. Stage-2 uses the CNN with the convolutional layer stack structure described in Section 3.3 for fine-grained attack type detection. When training the CNN, we use the Adam optimizer with an initial learning rate of 0.005 and the “categorical_crossentropy” loss function, with batch_size set to 500 and the number of epochs set to 100.
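The data preparation described above can be sketched as follows. This is an illustrative sketch rather than the code used in the experiments: the function name, the `(features, label)` sample layout, the fixed seed, and the plain (non-stratified) random validation split are all assumptions.

```python
import random


def prepare_stage2_training_set(samples, benign_label=0, keep_fraction=0.05,
                                val_fraction=0.10, seed=0):
    """Randomly drop most benign samples, then hold out a validation split.

    samples is a list of (features, label) pairs.
    """
    rng = random.Random(seed)
    benign = [s for s in samples if s[1] == benign_label]
    attacks = [s for s in samples if s[1] != benign_label]
    # Keep 5% of the benign samples, i.e. delete 95% of them at random.
    kept_benign = rng.sample(benign, round(len(benign) * keep_fraction))
    reduced = kept_benign + attacks
    rng.shuffle(reduced)
    # Hold out 10% of the reduced training set for validation.
    n_val = round(len(reduced) * val_fraction)
    return reduced[n_val:], reduced[:n_val]


# Toy data: 1000 benign samples (label 0) and 50 attack samples (label 1).
toy = [([i], 0) for i in range(1000)] + [([i], 1) for i in range(50)]
train, val = prepare_stage2_training_set(toy)
print(len(train), len(val))
```

Only the training data are touched; the test set keeps its original distribution so that the evaluation reflects what Stage-1 actually forwards.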

To explore how well the two-stage intrusion detection model designed in this study adapts to imbalanced data, we design an imbalance processing technique, IR-SMOTE, whose over-sampling is based on the imbalance ratio (IR) of the data. The degree of data balance after processing depends on the over-sampling degree coefficient $α$. We perform class imbalance processing only on the training set of Stage-2 so that the evaluation results remain valid. The IR of the original training set, $IR_{0}$, is 7840. To study the influence of different over-sampling degrees on the experimental results, we set the over-sampling degree coefficient $α$ to 0, 1, 2, and 3, where $α = 0$ means that no over-sampling is performed. Moreover, we add a comparative experiment that uses SMOTE to balance the training set completely. Table 6 shows the fine-grained classification results of the CNN when IR-SMOTE is used with different values of the over-sampling degree coefficient $α$. The DR of each class is highlighted.
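The reported value of 7840 can be reproduced from Table 5 under the assumption that the IR is the ratio between the largest and the smallest class in the Stage-2 training set. The sketch below covers only this ratio; it does not implement IR-SMOTE itself, whose over-sampling rule is defined in Section 3.3.1.

```python
# Stage-2 training-set size per class label, copied from the Train column of Table 5.
train_counts = {
    0: 485_449, 1: 135_064, 2: 139_219, 3: 166, 4: 440,
    5: 63, 6: 332_577, 7: 100_721, 8: 7_913, 9: 29_885,
    10: 493_928, 11: 1_245, 12: 414_858, 13: 203_058, 14: 116_592,
}


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (assumed definition)."""
    return max(counts.values()) / min(counts.values())


ir0 = imbalance_ratio(train_counts)
largest = max(train_counts, key=train_counts.get)
smallest = min(train_counts, key=train_counts.get)
print(f"largest class: {largest}, smallest class: {smallest}, IR = {ir0:.1f}")
```

The largest class is “DDOS attack-HOIC” (label 10) and the smallest is “SQL Injection” (label 5).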

Table 6 shows that, in the experiments conducted in this study, the classification performance of the CNN is best when no class imbalance processing is performed ($α = 0$). As $α$ increases from 1 to 3, the overall DR, ACC, precision, and $F_{1}$ score of the CNN increase. When SMOTE fully balances the training set, the precision of the CNN is 99.903%, which is only 0.001% higher than with the original training set; DR, ACC, and $F_{1}$ score are unchanged, and MCC falls from 95.922% to 95.836%. The two attack classes “Brute Force -XSS” and “SQL Injection” have the fewest samples in the original training set, and their DR improves after IR-SMOTE processing. “Brute Force -Web” has the next fewest samples, and its DR drops after over-sampling, from 98.485% to 89.394% under full SMOTE and to between 68.182% and 93.939% under IR-SMOTE. No significant change is observed in the DR of the other classes. In terms of time cost, the training time and the test time are roughly the same when $α$ is 0, 1, 2, and 3. When SMOTE is used to balance the dataset fully, the training time increases to 51.050 s, which is approximately 2.8 times the training time on the original training set. Considering both the evaluation indicators and the training time, the CNN performs best when it is trained on the original imbalanced dataset. In this case, ACC and DR are both 99.896%, $F_{1}$ score is 99.862%, and MCC is 95.922%. The experimental results show that the two-stage intrusion detection model for large-scale flow data in this study can deal with imbalanced data well.
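The overall DR of 99.896% is a weighted average of the per-class DR values. As a check, the sketch below combines the $α = 0$ column of Table 6 with the Stage-2 test-set sizes in Table 5; the helper name is ours. An unweighted (macro) average is printed alongside for contrast.

```python
# Per-class DR (%) for alpha = 0 from Table 6, in class-label order 0..14.
dr_percent = [17.712, 99.984, 100, 75.676, 98.485, 57.143, 100, 100,
              100, 99.988, 100, 100, 100, 100, 99.980]
# Stage-2 test-set size per class from Table 5, in the same order.
test_counts = [638, 37_518, 38_672, 37, 66, 7, 92_382, 27_978,
               2_172, 8_302, 137_203, 346, 115_208, 57_213, 5_112]


def weighted_average(values, weights):
    """Average of values in which each class counts in proportion to its size."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


weighted_dr = weighted_average(dr_percent, test_counts)
macro_dr = sum(dr_percent) / len(dr_percent)
print(f"weighted DR = {weighted_dr:.3f}%, unweighted mean DR = {macro_dr:.3f}%")
```

The weighted figure is dominated by the large attack classes, so the low DR of “Benign,” “Brute Force -XSS,” and “SQL Injection” barely moves it, whereas the unweighted mean is clearly lower.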

When the model is trained on the original imbalanced training set, the normalized confusion matrix on the CSE-CIC-IDS2018 dataset in Stage-2 is as shown in Figure 2. The red numbers in the figure indicate the DR of classes with DR over 0.01. Figure 2 shows that, except for “Benign,” “Brute Force -XSS,” and “SQL Injection,” which have lower DR, the DR of the other classes is high, and 11 of them have a DR that rounds to 100%. “Benign” in the Stage-2 test set consists of the normal samples that were misclassified as attacks in Stage-1. After Stage-2 identification, 18% of these “Benign” samples are correctly recognized as benign, and 69% are mistakenly classified as “Infilteration” attacks. The DR of “Brute Force -XSS” is 76%, and 24% of its samples are mistakenly classified as “Brute Force -Web.” One reason is that both classes belong to Brute Force attack scenarios and have similar attack modes. Approximately 43% of the “SQL Injection” samples are mistakenly classified as “Brute Force -Web.” One reason for the low DR of “SQL Injection” is that this class has the smallest proportion of samples in the original dataset.
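Row normalization of a confusion matrix, as used for Figure 2, can be sketched for the three web-attack classes. The counts below are not reported directly; they are inferred from the test-set sizes in Table 5 and the rates quoted above (for example, 28 of 37 gives 75.676%). The text does not state where the single misclassified “Brute Force -Web” sample goes, so it is placed in a hypothetical “other” column.

```python
def normalize_rows(matrix):
    """Divide each row by its sum so that every row holds per-class rates."""
    return [[cell / sum(row) for cell in row] for row in matrix]


row_labels = ["Brute Force -XSS", "Brute Force -Web", "SQL Injection"]
col_labels = row_labels + ["other"]
# counts[i][j] = test samples of true class i predicted as class j (inferred).
counts = [
    [28, 9, 0, 0],   # 37 "Brute Force -XSS" samples, 9 predicted as "Brute Force -Web"
    [0, 65, 0, 1],   # 66 "Brute Force -Web" samples; destination of the 1 error is assumed
    [0, 3, 4, 0],    # 7 "SQL Injection" samples, 3 predicted as "Brute Force -Web"
]

normalized = normalize_rows(counts)
for label, row in zip(row_labels, normalized):
    print(label, [f"{rate:.2f}" for rate in row])
```

The diagonal entry of each normalized row is the DR of that class, which is why the DR values can be read directly off the figure.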

4.4. Discussion

The results of a series of simulation experiments on the CSE-CIC-IDS2018 dataset show that our proposed two-stage intrusion detection model, which uses the LightGBM algorithm and a CNN for the IoT environment, can achieve accurate and efficient fine-grained attack type detection while making full use of imbalanced large-scale network flow data. The reasons for the excellent performance of the two-stage model are as follows.

First, Stage-1 uses LightGBM, which is both accurate and fast, to make full use of large-scale data. Second, Stage-2 deletes 95% of the normal data in the training set, which makes the model pay more attention to the attack classes during training, effectively alleviates the negative impact of the dataset imbalance on the model, and shortens the training time. Third, the CNN with a convolutional layer stack structure in Stage-2 can effectively learn the features of each attack class and achieve fine-grained and accurate identification of attack classes.
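The division of labor between the two stages can be sketched as a simple routing function. The two classifiers are passed in as plain callables that stand in for the trained LightGBM and CNN models; the function name, the `rate` feature, and the toy thresholds are hypothetical.

```python
BENIGN = 0  # class label of benign traffic, as in Table 2


def two_stage_predict(samples, stage1_is_abnormal, stage2_classify):
    """Route each sample through Stage-1 and, if flagged, through Stage-2."""
    predictions = []
    for sample in samples:
        if stage1_is_abnormal(sample):
            # Stage-2 assigns a fine-grained class; it may still answer BENIGN,
            # which corrects a false alarm raised by Stage-1.
            predictions.append(stage2_classify(sample))
        else:
            predictions.append(BENIGN)
    return predictions


# Toy stand-ins for the trained models (assumptions for illustration only).
def toy_stage1(flow):
    return flow["rate"] > 100


def toy_stage2(flow):
    return 6 if flow["rate"] > 1000 else BENIGN


flows = [{"rate": 10}, {"rate": 500}, {"rate": 5000}]
labels = two_stage_predict(flows, toy_stage1, toy_stage2)
print(labels)
```

Because Stage-2 only sees the samples that Stage-1 flags, the more expensive model runs on a small fraction of the traffic.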

To further demonstrate the effectiveness of our proposed two-stage intrusion detection model using machine learning and deep learning, we compare it with several advanced intrusion detection models that use the CSE-CIC-IDS2018 dataset, as shown in Figure 3. Our two-stage model shows superiority compared with machine learning [11,42] and deep learning methods [31,34,42]. In addition to the improvement in accuracy, the total training time of our model is relatively short, at 74.876 s. The detection time of a single sample is only 0.0172 milliseconds.
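The total training time is consistent with the sum of the Stage-1 LightGBM time in Table 4 and the Stage-2 CNN time for $α = 0$ in Table 6. The sketch below checks this and also estimates the Stage-2 share of the per-sample detection time. How the 0.0172 ms figure is composed is not stated here, so the Stage-2 estimate is an assumption-based check and not the authors' calculation.

```python
stage1_train_s = 56.880        # LightGBM training time, Table 4
stage2_train_s = 17.996        # CNN training time with alpha = 0, Table 6
total_train_s = stage1_train_s + stage2_train_s

stage2_test_s = 8.790          # CNN test time with alpha = 0, Table 6
stage2_test_samples = 522_854  # size of the Stage-2 test set, Table 5
stage2_ms_per_sample = stage2_test_s / stage2_test_samples * 1000

print(f"total training time: {total_train_s:.3f} s")
print(f"Stage-2 detection time per sample: {stage2_ms_per_sample:.4f} ms")
```

The Stage-2 estimate is slightly below 0.0172 ms; the remainder would presumably come from Stage-1, whose test-set size is not given in this section.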

5. Conclusions

As IoT devices and services become increasingly integrated with people’s lives, ensuring the security of IoT infrastructure becomes increasingly important. To achieve fine-grained and efficient intrusion detection on large-scale data in the IoT environment, we propose a two-stage intrusion detection model that combines machine learning and deep learning, and we verify its performance on the new CSE-CIC-IDS2018 dataset. Stage-1 uses machine learning to identify normal and abnormal samples in network traffic data. We compare seven machine learning algorithms, including LightGBM, and the experimental results show that LightGBM combines a high ACC of 99.135% with a short training time. Stage-2 uses a CNN to perform fine-grained detection on the samples predicted to be abnormal in Stage-1. The comparison with the IR-SMOTE imbalance processing method shows that our two-stage intrusion detection model adapts well to the imbalanced dataset. The ACC of Stage-2 is 99.896%, the $F_{1}$ score is 99.862%, and the MCC is 95.922%. The total training time of the model is 74.876 s. The comparison with other advanced research also shows that the two-stage intrusion detection model in this study has higher efficiency and better performance when processing large-scale flow data. However, because the accuracy of LightGBM is 99.135%, some benign data remain in the test data of the second stage. To correct this bias, the data used to train the CNN had to retain some benign samples. In the future, we will explore feature selection methods and other imbalance processing methods to further improve the performance of the NIDS.

Author Contributions

Conceptualization, H.Z.; Investigation, H.Z., B.Z. and L.H.; Methodology, H.Z. and L.H.; Resources, H.Z.; Supervision, H.Z.; Validation, B.Z.; Writing – original draft, L.H.; Writing – review and editing, H.Z., B.Z., Z.Z. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly supported by the Key R&D and promotion projects of Henan Province (Technological research) (Grant No. 212102210143).

Institutional Review Board Statement

Ethical review and approval are not required because all datasets used in this study are in the public domain.

Informed Consent Statement

Not applicable.

Data Availability Statement

The CSE-CIC-IDS2018 dataset used to support the findings of this study is available at https://www.unb.ca/cic/datasets/ids-2018.html (accessed on 27 November 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Atzori, L.; Iera, A.; Morabito, G. The internet of things: A survey. Comput. Netw. 2010, 54, 2787–2805.
2. Vinayakumar, R.; Alazab, M.; Srinivasan, S.; Pham, Q.; Padannayil, S.K.; Simran, K. A visualized botnet detection system based deep learning for the internet of things networks of smart cities. IEEE Trans. Ind. Appl. 2020, 56, 4436–4456.
3. Vasan, D.; Alazab, M.; Venkatraman, S.; Akram, J.; Qin, Z. MTHAEL: Cross-architecture IoT malware detection based on neural network advanced ensemble learning. IEEE Trans. Comput. 2020, 69, 1654–1667.
4. Rehman, A.; Paul, A.; Yaqub, M.A.; Rathore, M.M.U. Trustworthy Intelligent Industrial Monitoring Architecture for Early Event Detection by Exploiting Social IoT. In Proceedings of the 35th Annual ACM Symposium on Applied Computing, SAC ’20, Virtual, 30 March–3 April 2020; pp. 2163–2169.
5. Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Appl. Sci. 2019, 9, 4396.
6. Mahfouz, A.M.; Venugopal, D.; Shiva, S.G. Comparative analysis of ML classifiers for network intrusion detection. In Proceedings of the Fourth International Congress on Information and Communication Technology, London, UK, 27–28 February 2019; pp. 193–207.
7. Tesfahun, A.; Bhaskari, D.L. Intrusion detection using random forests classifier with SMOTE and feature reduction. In Proceedings of the 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, Pune, India, 15–16 November 2013; pp. 127–132.
8. Bhavani, T.T.; Rao, M.K.; Reddy, A.M. Network intrusion detection system using random forest and decision tree machine learning techniques. In Proceedings of the First International Conference on Sustainable Technologies for Computational Intelligence, Jaipur, India, 29–30 March 2019; Springer: Singapore, 2020; pp. 637–643.
9. Pajouh, N.H.; Javidan, R.; Khayami, R.; Dehghantanha, A.; Choo, K.K.R. A two-layer dimension reduction and two-tier classification model for anomaly-based intrusion detection in IoT backbone networks. IEEE Trans. Emerg. Top. Comput. 2019, 7, 314–323.
10. Cavusoglu, U. A new hybrid approach for intrusion detection using machine learning methods. Appl. Intell. 2019, 49, 2735–2761.
11. Karatas, G.; Demir, O.; Sahingoz, O.K. Increasing the performance of machine learning-based IDSs on an imbalanced and up-to-date dataset. IEEE Access 2020, 8, 32150–32162.
12. Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Slay, J. Towards developing network forensic mechanism for botnet activities in the IoT based on machine learning techniques. Mob. Netw. Manag. 2018, 235, 30–44.
13. Dhaliwal, S.S.; Nahid, A.A.; Abbas, R. Effective Intrusion Detection System Using XGBoost. Information 2018, 9, 149.
14. D’hooge, L.; Wauters, T.; Volckaert, B.; De Turck, F. Inter-dataset generalization strength of supervised machine learning methods for intrusion detection. J. Inf. Secur. Appl. 2020, 54, 102564.
15. Zhang, J.; Gardner, R.; Vukotic, I. Anomaly detection in wide area network meshes using two machine learning algorithms. Future Gener. Comput. Syst. 2019, 93, 418–426.
16. Zhang, H.; Huang, L.; Wu, C.Q.; Li, Z. An effective convolutional neural network based on SMOTE and Gaussian mixture model for intrusion detection in imbalanced dataset. Comput. Netw. 2020, 117, 107315.
17. Bu, S.J.; Cho, S.B. A convolutional neural-based learning classifier system for detecting database intrusion via insider attack. Inf. Sci. 2020, 512, 123–136.
18. Nguyen, M.T.; Kim, K. Genetic convolutional neural network for intrusion detection systems. Future Gener. Comput. Syst. 2020, 113, 418–427.
19. Vasan, D.; Alazab, M.; Wassan, S.; Safaei, B.; Zheng, Q. Image-based malware classification using ensemble of CNN architectures (IMCEC). Comput. Secur. 2020, 92, 101748.
20. Almiani, M.; AbuGhazleh, A.; Al-Rahayfeh, A.; Atiewi, S.; Razaque, A. Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory 2020, 101, 102031.
21. Zhang, H.; Wu, C.Q.; Gao, S.; Wang, Z.; Xu, Y.; Liu, Y. An effective deep learning based scheme for network intrusion detection. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 682–687.
22. Kanimozhi, V.; Jacob, T.P. Artificial intelligence based network intrusion detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. In Proceedings of the 2019 International Conference on Communication and Signal Processing (ICCSP), Melmaruvathur, India, 4–6 April 2019; pp. 33–36.
23. Galar, M.; Fernandez, A.; Barrenechea, E.; Bustince, H.; Herrera, F. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. Man Cybern. Part C 2012, 42, 463–484.
24. Elazhary, H. Internet of things (IoT), mobile cloud, cloudlet, mobile IoT, IoT cloud, fog, mobile edge, and edge emerging computing paradigms: Disambiguation and research directions. J. Netw. Comput. Appl. 2019, 128, 105–140.
25. Tahsien, S.M.; Karimipour, H.; Spachos, P. Machine learning based solutions for security of internet of things (IoT): A survey. J. Netw. Comput. Appl. 2020, 161, 18.
26. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A.; Lloret, J. Conditional variational autoencoder for prediction and feature recovery applied to intrusion detection in IoT. Sensors 2017, 17, 1967.
27. Rathore, M.M.; Saeed, F.; Rehman, A.; Paul, A.; Daniel, A. Intrusion Detection Using Decision Tree Model in High-Speed Environment. In Proceedings of the 2018 International Conference on Soft-computing and Network Security (ICSNS), Coimbatore, India, 14–16 February 2018; pp. 1–4.
28. Hosseini, S.; Zade, B.M.H. New hybrid method for attack detection using combination of evolutionary algorithms, SVM, and ANN. Comput. Netw. 2020, 173, 15.
29. Almaiah, M.A.; Almomani, O.; Alsaaidah, A.; Al-Otaibi, S.; Bani-Hani, N.; Hwaitat, A.K.A.; Al-Zahrani, A.; Lutfi, A.; Awad, A.B.; Aldhyani, T.H.H. Performance Investigation of Principal Component Analysis for Intrusion Detection System Using Different Support Vector Machine Kernels. Electronics 2022, 11, 3571.
30. Alzaqebah, A.; Aljarah, I.; Al-Kadi, O.; Damaševičius, R. A Modified Grey Wolf Optimization Algorithm for an Intrusion Detection System. Mathematics 2022, 10, 999.
31. Gamage, S.; Samarabandu, J. Deep learning methods in network intrusion detection: A survey and an objective comparison. J. Netw. Comput. Appl. 2020, 169, 102767.
32. Abdulhammed, R.; Musafer, H.; Alessa, A.; Faezipour, M.; Abuzneid, A. Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 2019, 8, 322.
33. Huang, S.; Lei, K. IGAN-IDS: An imbalanced generative adversarial network towards intrusion detection system in ad-hoc networks. Ad Hoc Netw. 2020, 105, 102177.
34. Lin, P.; Ye, K.; Xu, C.Z. Dynamic network anomaly detection system by using deep learning techniques. In Proceedings of the International Conference on Cloud Computing, San Diego, CA, USA, 25–30 June 2019; Volume 11513, pp. 161–176.
35. CSE-CIC-IDS2018 Dataset. Available online: https://www.unb.ca/cic/datasets/ids-2018.html (accessed on 27 November 2022).
36. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy, Funchal, Portugal, 22–24 January 2018.
37. Ke, G.L.; Meng, Q.; Finley, T.; Wang, T.F.; Chen, W.; Ma, W.D.; Ye, Q.W.; Liu, T.Y. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Neural Information Processing Systems (NIPS): La Jolla, CA, USA, 2017; Volume 30.
38. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
39. Protas, E.; Bratti, J.D.; Gaya, J.F.O.; Drews, P.; Botelho, S.S.C. Visualization methods for image transformation convolutional neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 2231–2243.
40. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015, arXiv:1409.1556v6.
41. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 13.
42. Ferrag, M.A.; Maglaras, L.; Moschoyiannis, S.; Janicke, H. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. J. Inf. Secur. Appl. 2020, 50, 102419.

Figure 1. Schematic diagram of the two-stage NIDS model.

Figure 2. Normalized confusion matrix when training CNN with the original training set in Stage-2.

Figure 3. Comparison of the proposed model and previous studies on the CSE-CIC-IDS2018 dataset. (a) Classification performance; (b) training time.

Table 1. Summary of the most relevant works in the literature.

Article	Classifiers	Datasets	Evaluation Metric	Techniques	Findings
Koroniotis et al. [12]	DT, ANN, NB, ARM	UNSW-NB15	ACC, FAR	Information gain and Weka tool.	Use of four classification techniques, DT C4.5, ARM, ANN and NB for defining and investigating the botnets.
Zhang et al. [16]	CNN	CICIDS2017 UNSW-NB15	ACC, DR, FAR, PRE, F1-score	SMOTE and GMM	A flow-based NID model which fuses SGM and 1D-CNN to detect highly imbalanced network traffic.
Almiani et al. [20]	RNN	NSL-KDD	ACC, PRE, Recall, F1-score, FPR, FNR	RNN	An IDS composed of cascaded filtering stages to detect specific types of attacks for IoT environments.
Zhang et al. [21]	MLP	UNSW-NB15	ACC, PRE, Recall, F1-score	DAE and MLP	An effective IDS based on deep learning techniques, DAE and MLP.
Kanimozhi et al. [22]	ANN	CSE-CIC-IDS2018	ACC, PRE, Recall, F1-score, AUC	GridSearchCV	An experimental approach of ANN with hyper-parameter optimization on the realistic new IDS cyber dataset.
Lopez-Martin et al. [26]	ID-CVAE	NSL-KDD	Frequency, ACC, PRE, Recall, F1-score, FPR, NPV	CVAE	An anomaly-based supervised ML method based on the CVAE.
Rathore et al. [27]	C4.5 DT	KDD99	ACC	FSR and BER	Proposes a real-time IDS for high-speed environments that uses fewer features.
Hosseini et al. [28]	SVM, ANN	NSL-KDD	PRE, ACC, F1-score, MCC, AUC	MGA-SVM and HGS-PSO-ANN	The SVM technique is used to select relevant features; MGA-SVM is then combined with HGS-PSO-ANN.
Almaiah et al. [29]	SVM	KDD Cup’99, UNSW-NB15	ACC, Sensitivity, F1-score	PCA and SVM	Study the impact of different kernel functions of SVM on classification performance.
Alzaqebah et al. [30]	ELM	UNSW-NB15	ACC, F1-score, G-mean measures	GWO and IG	Optimize the Grey Wolf Optimization algorithm (GWO) using information gain.
Abdulhammed et al. [32]	RF	CICIDS2017	ACC, F1-score, FPR, TPR, PRE, Recall	UDBB	UDBB approach for imbalanced classes.
Huang et al. [33]	DNN	NSL-KDD, UNSW-NB15, CICIDS2017	ACC, F1-score, AUC	IGAN	IGAN to generate representative samples for minority classes and counter the class imbalance problem in intrusion detection by IGAN-IDS.

Table 2. Number of samples in each category of the CSE-CIC-IDS2018 dataset.

Category	Attack Type	Class Label	Number	Volume (%)
Benign	/	0	13,484,708	83.070
Brute-force	SSH-Bruteforce	1	187,589	1.156
	FTP-Bruteforce	2	193,360	1.191
	Brute Force –XSS	3	230	0.001
	Brute Force –Web	4	611	0.004
Web attack	SQL Injection	5	87	0.001
DoS attack	DoS attacks-Hulk	6	461,912	2.846
	DoS attacks-SlowHTTPTest	7	139,890	0.862
	DoS attacks-Slowloris	8	10,990	0.068
	DoS attacks-GoldenEye	9	41,508	0.256
DDoS attack	DDOS attack-HOIC	10	686,012	4.226
	DDOS attack-LOIC-UDP	11	1730	0.011
	DDOS attack-LOIC-HTTP	12	576,191	3.550
Botnet	Bot	13	286,191	1.763
Infilteration	Infilteration	14	161,934	0.998
Total	/	/	16,232,943	100

Table 3. Confusion matrix.

	Prediction Negative	Prediction Positive
Actual Negative	TN	FP
Actual Positive	FN	TP

Table 4. Performance evaluation of multiple machine learning algorithms in Stage-1 (%).

Method	ACC	DR	FAR	Precision	$F_{1} score$	MCC	Train-Time (s)	Test-Time (s)
LR	95.337	97.423	1.012	93.975	84.900	82.712	2849.512	0.188
DT	98.716	95.747	0.678	96.640	96.192	95.421	1581.116	0.704
RF	98.897	95.404	0.391	98.028	96.698	96.049	1001.335	3.583
KNN	99.027	95.437	0.241	98.774	97.077	96.514	51,138.024	10,967.225
AdaBoost	98.540	94.419	0.620	96.878	95.633	94.768	6018.889	35.698
XGBoost	99.149	95.087	0.023	99.883	97.426	96.959	3989.552	3.786
LightGBM	99.135	95.009	0.237	99.878	97.383	96.909	56.880	1.286

Table 5. Sample distribution of the training, validation, and test sets in Stage-2.

Class	Class Label	Train	Val	Test	Total
Benign	0	485,449	53,939	638	540,026
SSH-Bruteforce	1	135,064	15,007	37,518	187,589
FTP-Bruteforce	2	139,219	15,469	38,672	193,360
Brute Force –XSS	3	166	18	37	221
Brute Force –Web	4	440	49	66	555
SQL Injection	5	63	7	7	77
DoS attacks-Hulk	6	332,577	36,953	92,382	461,912
DoS attacks-SlowHTTPTest	7	100,721	11,191	27,978	139,890
DoS attacks-Slowloris	8	7913	879	2172	10,964
DoS attacks-GoldenEye	9	29,885	3321	8302	41,508
DDOS attack-HOIC	10	493,928	54,881	137,203	686,012
DDOS attack-LOIC-UDP	11	1245	139	346	1730
DDOS attack-LOIC-HTTP	12	414,858	46,095	115,208	576,161
Bot	13	203,058	22,895	57,213	283,166
Infilteration	14	116,592	12,955	5112	134,659
Total	/	2,464,178	273,798	522,854	3,260,830

Table 6. Stage-2 CNN classification results when IR-SMOTE sets different degrees of oversampling coefficients (%).

Class	$α = 0$	$α = 1$	$α = 2$	$α = 3$	SMOTE
Benign	17.712	14.890	16.144	16.458	17.398
SSH-Bruteforce	99.984	99.984	99.984	99.984	99.984
FTP-Bruteforce	100	100	100	100	100
Brute Force –XSS	75.676	100	91.892	97.297	100
Brute Force –Web	98.485	68.182	93.939	90.909	89.394
SQL Injection	57.143	71.429	71.429	71.429	71.429
DoS attacks-Hulk	100	100	100	100	100
DoS attacks-SlowHTTPTest	100	100	100	100	100
DoS attacks-Slowloris	100	100	100	100	100
DoS attacks-GoldenEye	99.988	99.988	99.988	99.988	99.976
DDOS attack-HOIC	100	100	100	100	100
DDOS attack-LOIC-UDP	100	100	100	100	100
DDOS attack-LOIC-HTTP	100	100	100	100	100
Bot	100	100	100	100	100
Infilteration	99.980	99.980	99.980	99.980	100
DR	99.896	99.890	99.894	99.895	99.896
ACC	99.896	99.890	99.894	99.895	99.896
Precision	99.902	99.897	99.900	99.901	99.903
$F_{1} score$	99.862	99.853	99.859	99.860	99.862
MCC	95.922	95.922	95.913	95.922	95.836
Train-time (s)	17.996	17.475	17.691	19.890	51.050
Test-time (s)	8.790	8.083	8.851	8.470	9.057

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhang, H.; Zhang, B.; Huang, L.; Zhang, Z.; Huang, H. An Efficient Two-Stage Network Intrusion Detection System in the Internet of Things. Information 2023, 14, 77. https://doi.org/10.3390/info14020077

