HDLNIDS: Hybrid Deep-Learning-Based Network Intrusion Detection System

: Attacks on networks are currently the most pressing issue confronting modern society. Network risks affect all networks, from small to large. An intrusion detection system must be present for detecting and mitigating hostile attacks inside networks. Machine Learning and Deep Learning are currently used in several sectors, particularly the security of information, to design efﬁcient intrusion detection systems. These systems can quickly and accurately identify threats. However, because malicious threats emerge and evolve regularly, networks need an advanced security solution. Hence, building an intrusion detection system that is both effective and intelligent is one of the most cognizant research issues. There are several public datasets available for research on intrusion detection. Because of the complexity of attacks and the continually evolving detection of an attack method, publicly available intrusion databases must be updated frequently. A convolutional recurrent neural network is employed in this study to construct a deep-learning-based hybrid intrusion detection system that detects attacks over a network. To boost the efﬁciency of the intrusion detection system and predictability, the convolutional neural network performs the convolution to collect local features, while a deep-layered recurrent neural network extracts the features in the proposed Hybrid Deep-Learning-Based Network Intrusion Detection System (HDLNIDS). Experiments are conducted using publicly accessible benchmark CICIDS-2018 data, to determine the effectiveness of the proposed system. The ﬁndings of the research demonstrate that the proposed HDLNIDS outperforms current intrusion detection approaches with an average accuracy of 98.90% in detecting malicious attacks.


Introduction
Functions of information and communication technology (ICT) systems are critical in all aspects of the industry and human life.In recent decades, numerous organizations have become vulnerable to sophisticated cyber-attacks, resulting in the formation of a revolutionary Intrusion Detection System (IDS).An IDS is an unutilized network security method for identifying various forms of malicious intrusions.John Anderson was the first person to put in a significant amount of work in the identification field in the year 1980 [1].Every cyber-attacks entail economic costs, reputational harm, and legal consequences; hence, the development of IDSs has a global impact on both the academic community and the business sector.
It is important that networks be protected against unwanted access, and that user engagement and user data be safeguarded [2], in addition to revealing new security vulnerabilities.An intrusion detection system is an effective security-enhancing technique for identifying and preventing networks or systems from cyber assaults.IDSs are responsible for the identification of suspicious activities and the overall security of a network infrastructure against cyberattacks and for reducing financial and operational losses [3].According to the literature, a network architecture determines the classification of IDSs according to three categories:

•
Intrusion detection systems based on the network [4], which examine the components of unique packets to detect harmful network traffic behavior patterns.

•
Server signature IDS [5], which analyzes the activity system logs of individual hosts and identifies malicious attacks and hybrid identification systems [6].

•
Systems that use anomaly and signature-based intrusion detection systems have higher quality and stronger security practices.
The signature detection method makes use of predetermined patterns and classifiers to better assess malicious assaults.It utilizes existing information to identify harmful threats; hence, it is named a strategy based on knowledge.The approach achieves a low false positive (FP) along with higher accuracy, but it is incapable of detecting new network attacks [7].To discover unknown hostile threats, the anomaly detection approach employs heuristic methods.As a result, the effectiveness of this anomaly detection approach for detecting anomalies is good despite a high false-positive rate.Various businesses have adopted protocol analysis, which uses a combination of anomaly and signature-based systems, to avoid this problem [8].According to the deployment pattern, ID systems are categorized into two main types, which are distributed and non-distributed.A distributed implementation consists of several ID subsystems connected over a vast network, whereas a non-distributed structure, such as an open-source snort, may be deployed in a specific location [9].
In modern days, current methods to detect network intrusions in industries include statistical testing and threshold computation techniques.The ID system based on statistical tests relies on numerous traffic limitations, such as packet length, the timing of packet arrival, and traffic flow volume, depending on the network traffic of the model in a predetermined amount of time.Due to the complexity of today's modern malicious attacks, it is possible that these strategies will not be effective.In replacement of these statistically based techniques, a solution that is most optimized and efficient is needed.Machine learning (ML)-based approaches have been widely utilized to help network administrators deal with a wide range of harmful attacks in preventing these attacks [10].
Ensemble learning (EL) enhances ML outcomes by combining many models into one.These algorithms evaluate the state of the network by classifying the processed data into normal and abnormal categories.These algorithms exercise and simulate attacks with precision to evaluate capabilities with diverse datasets.Nevertheless, most of these datasets are extremely imbalanced.A total of 98% of these datasets are regarded as normal, whereas the remaining 2% are categorized as assaults [11].Folino et al. [12] suggested a novel deeplearning model based on ensemble learning for interpreting non-stationary datasets such as IDS logs.It is desirable to be able to construct a better detection system, especially when utilizing ensemble classifiers.When creating an ensemble, selecting suitable classifiers and deciding on combiners are two important issues.In [13], a study of ensemble learning for IDSs was provided by Tama.However, most traditional ML techniques fall within the area of superficial learning and place a low emphasis on the design and selection of features; they are incapable of addressing the large classification task imposed by attack data in a real-world network application.As the number of datasets continues to grow, the accuracy of multiclassification attack detection will decrease.Consequently, intelligent assessment is inconsistent with ML and the projection requirements of higher-dimensional learning with vast amounts of data [14].
This study aimed to develop a network intrusion detection system that is based on flow-based statistics utilizing the benchmark Canadian Institute for Cybersecurity intrusion detection system (CICIDS) 2018 dataset, which accurately identifies and categorizes every type of attack using a multi-categorization scheme.To identify network traffic, we developed an improved 1D CNN-based deep neural network model.The main contributions of this paper are as follows.

•
Proposal of a deep-layered architecture using the recurrent neural network (RNN) and convolutional neural network (CNN) to detect and classify malicious traffic.

•
Detailed analysis of existing machine learning and deep learning techniques.

•
In-depth analysis of the CICIDS 2018 dataset.
The remaining sections of this work are structured as follows: Section 2 presents the background while Section 3 explains the related work.Section 4 discusses the proposed HDLNIDS model.The dataset explanation is presented in Section 5.The experimental details and discussion are presented in Section 6, followed by a comparison with existing approaches in Section 7, and finally the summary and the conclusion in Section 8.

Background
This section covers the details about current IDSs and existing deep-learning-based methodologies for their detection.

Intrusion Detection Systems
The use of network monitoring in forensics, security, and anomaly detection has become commonplace.However, recent developments have introduced several additional challenges for IDSs.The most relevant concerns are as follows.

Volume
Data are being stored and sent over networks in ever-increasing amounts.The amount of data that was accessible in 2020 was predicted to exceed 44 ZB.By 2025, the amount of data generated each day is expected to reach 463 exabytes globally [4].As a result, modern networks' traffic capacity has increased dramatically to accommodate the observed traffic level.Numerous contemporary backbone connections currently operate at wire speeds of 100 Gbps or more.To put this into context, a 100 Gbps network can handle 148,809,524 packets per second (PPS) [5].To function at wire speeds, an IDS must be capable of analyzing within 6.72 nanoseconds.Providing an IDS at such a rapid rate is challenging, and achieving adequate efficacy, accuracy, and efficiency is similarly a challenge.

Accuracy
Existing methods cannot be depended upon to maintain adequate accuracy.To give a more thorough and accurate viewpoint, higher levels of precision, depth, and contextual knowledge are required [5].Unfortunately, this imposes several financial, computational, and time-related constraints.

Diversity
In recent years, there has been a rise in the number of novel or customized protocols used in contemporary networks [5].This is largely attributable to the amount of network and/or internet-connected gadgets.Consequently, it is becoming more difficult to distinguish between regular and anomalous traffic behavior.

Dynamics
Due to the complexity and adaptability of contemporary networks, their behavior is dynamic and difficult to anticipate [5].In turn, this makes it impossible to develop a dependable behavioral standard.It also raises questions regarding the longevity of learning models.

Low-Frequency Attacks
Based on the complex behavior of contemporary networks, different types of attacks have frequently defeated earlier systems for detecting anomalies, including artificialintelligence-based approaches [6].As a result of imbalances in the training dataset, an IDS delivers less precise detection when confronted with low-frequency attacks.

Adaptability
Inadequate precision, dynamic network traffic behavior, low-frequency network attacks, flexibility to software-defined networks, the enormous volume of stored and sent data, and a variety of network access devices are significant obstacles for modern NIDSs.Modern networks have embraced several new technologies to decrease their dependency on static technology and management techniques [6].Therefore, dynamic technologies such as containerization, virtualization, and Software-Defined Networks (SDNs) are being utilized.IDSs will need to adapt to the use of these technologies and the adverse effects they produce.

Deep-Learning-Based IDS
Deep learning is a subfield of machine learning.Using several layers of representation helps the modeling of intricate relationships and concepts [6].Using the output characteristics of lower levels, supervised and unsupervised learning algorithms are utilized to generate increasingly higher levels of abstraction.
IDSs play an important part in cybersecurity as they defend the network from cyberattacks by monitoring the network.IDSs in cybersecurity have evolved using deep learning (DL) due to their findings in computer vision, image processing, and natural language processing [15].Due to their two key properties, hierarchical feature representations and the acquisition of long-term temporal patterning, this structure of hierarchical and heuristic search is highly effective.DL is popular among researchers.Therefore, considerable thought has been given to DL approaches for enhancing the intelligence of IDSs, despite a lack of research comparing such machine learning methods with openly available datasets.DL's complex structuring architecture facilitates high-quality learning for complex data processing.Rapid progress in parallel processing technology has produced a robust system basis for DL approaches.Common prevalent issues with existing ML-based models are as follows: (1) such models do have a false positive rate (FP) with a wider variety of malicious invasions [16]; (2) such proposed models are not able to generalize, because most available detection systems skip novel attack vectors because of obsolete ID datasets; (3) better solutions are required to sustain today's rapidly expanding rising internet traffic in a heterogeneous network.
The Canadian Institute for Cybersecurity intrusion detection system (CICIDS) 2018 dataset, which is an upgrade to CICIDS 2017, is frequently utilized.One of the primary reasons for its consideration is that it was developed to address the issues observed in its earlier CICIDS 2017 dataset [17].It also comprises various kinds of traffic and real-world network traffic, which is one of the primary reasons for its popularity [18] Although CICIDS 2018 is a suitable dataset, there is a significant issue that must be addressed.The issue's impacts result in a high-class imbalance that directly misleads the classifier [19].

Related Work
Over recent decades, ML and DL approaches have been widely utilized regarding the security of the network due to their capacity to distinguish data [18,[20][21][22].Previously, researchers have employed a variety of ML-and DL-based techniques for ID.Using the KDDCUP ID dataset, Xu et al. [23] used the K-Nearest Neighbor (KNN) for the identification of network anomalies and assessed the effectiveness of the suggested ID system.Bhati et al. [24] used different versions of support vector machine (SVM), including quadratic and linear Gaussians, to evaluate the efficiency of SVM approaches on the NSL-KDD dataset.Sumaiya et al. [25] proposed an integrated ID system employing correlation-based feature selection and the artificial neural network (ANN).Using the datasets of UNSW-NB15 and NSL-KDD ID, the authors conducted an experimental study.Waskle et al. [26] proposed a Random Forest (RF)-based ID system, while Alqahtanet et al. [27] proposed a system for identification based on multiple conventional machine learning classification algorithms.Prior methodologies applied within the scope of ID, however, had inefficient classification effectiveness, with a greater FP and a poor detection rate (DR) in the ID system.Utilizing non-symmetric deep auto-encoder for network intrusion detection problem, Qazi et al. [28] conducted the experiments using the benchmark dataset KDD CUP'99.In another study, a one-dimensional convolutional neural network (1D-CNN) based deep learning system was proposed by the authors [29] for network intrusion detection.The authors used the benchmark CICIDS2017 dataset for conducting the experiments, while Ahmad et al. [30] proposed network intrusion detection and classification system using AdaBoost-based approach.The authors used a UNSW-NB 15 dataset for network anomaly detection.The experimental findings showed that proposed method effectively detects different forms of network intrusions on computer networks.
Deep learning is a subfield of machine learning consisting of concealed layers to determine the features of the deep network.These approaches are more successful than ML [24] owing to their comprehensive structure and capacity to grasp the relevant aspects of the dataset independently and provide an output.Recently, DL has gained popularity and is being applied for ID; studies indicate that DL outperforms traditional methods.Girdler et al. [31] employed the DL technique for the anomaly based on flow identification developed on a deep neural network (DNN), and the results of experiments demonstrated that DL may be utilized for anomaly identification in networks.
Idhammad et al. [21] proposed a decentralized intrusion detection solution for cloud environments.First, the Naive Bayes model was applied to identify anomalies for data preprocessing; subsequently, for multi-classification, RF was used to determine the pattern of each attack.Using the CICDDS-001 dataset, experiments were carried out using variables such as false-positive rate (FPR) and precision.
Anand et al. [32] introduced an IDS composed of multiple vector classifiers for wireless mesh support.Utilizing genetic-algorithm-based feature selection and SVM classification, the authors chose particular traits to boost efficiency.The system was evaluated using a WMN-generated intrusion dataset and was simulated using a standard intrusion dataset in the network simulator-3 (NS3) simulator.The CICIDS 2017 intrusion dataset was used to assess the model.
Ran et al. [33] proposed an ensemble-based approach for network anomaly identification in an intrusion detection system.This method utilizes a combination of learning and forecasting mechanisms to classify anomalies into various classes.Initially, researchers employed the ANOVA F-test considering the strategy of univariate feature selection [34,35] to determine the performance of features and the link between class labels and data characteristics.In addition, they utilized an automated machine learning model for learning and a Kalman filter for prediction.They utilized a Bayesian optimizer as the optimizer for neural network architecture search (NAS), which finds the most accurate architecture from a list of architectures.This ensemble technique employs a voting mechanism that rates the predictions of both algorithms depending on their average accuracy.On publicly accessible CICIDS-2017 and UNSW-NB15 datasets, they tested the performance of the suggested approach and obtained 97.02 and 98.801 percent accuracies, respectively.
An auto-encoder (AE) was used [36], which is a type of ANN used to inexpensively grasp data.By training the network, an AE attempts to discover a representation for a dataset to disregard "noise" signals to limit the number of parameters.The encoder, message, and decoder are the three components that make up an auto-encoder.In cybersecurity, a deep AE can be utilized to develop a viable security model.Therefore, the AE-based feature learning (FL) model surpasses other sophisticated algorithms.Compared to other complex algorithms, the AE-based FL model employs the lowest security measures.The approach is more productive and practical, particularly in small areas such as the Internet of Things, due to the dense and latent representation of security characteristics [37].The authors of [38] demonstrated the efficacy of an AE-based FL prototype for malware categorization and detection.A Deep-AE-based anomaly detection model was proposed by the authors in [39] to develop an efficient ID model using the Restricted Boltzmann Machine (RBM).
On the UNSW-NB15 dataset, Zakir et al. [40] investigated the performance of four common classifiers for binary classification: Support Vector Machine (SVM), Random Forest, Naive Bayes, and Decision Tree.The researchers utilized One-hot encoding to convert categorical data to attribute values and then performed machine learning on the complete feature set.The results of the experiments indicated that the researchers attained accuracies of 79.59%, 66%, 76%, and 78% on SVM, Naive Bayes, Random Forest, and Decision Tree, respectively.
To address the limitations in the development of IDS feature selection, hybrid approaches have emerged.These strategies combine the filtering and wrapping processes to make use of and increase the effectiveness of both techniques, as well as to improve predictions with enhanced computation.Thus, Song et al. [41] presented a model that combines chi-square with RF to build an intrusion detection hybrid feature selection (FS) approach.Furthermore, Wang et al. [42] considered two schemes: one with a Naive Bayes classifier integrated with information acquisition and the other with decision-making.A hybrid strategy integrating the linear correlation analysis approach with the cuttlefish algorithm was recently integrated with a decision tree as a classifier [42].The fundamental disadvantage of this class of techniques is that the wrapping method is dependent on the performance of the filter method, which is combined with the hybrid technique.To put it another way, the wrapper methodology can only operate with the components delivered by the filter function.In this instance, there is a possibility that informative characteristics will be filtered away and will not be included for wrapper evaluation.
Zhang et al. [43] introduced a neural-network-based anomaly detection model on the LeNet 5 convolutional neural network (CNN) and the Long short-term memory (LSTM) feature reduction algorithm.CICIDS 2017 and CTU datasets were used in the experiments, which used binary and multi-classification.The use of CNN, LSTM, and hybrid combinations resulted in increased efficiency on both binary and multi-classification examinations.
Furthermore, Aydin et al. [44] presented an approach where they integrated the two following methodologies: (1) Anomaly Detection (PHAD) in a packet header; (2) Network Anomaly Detection (NETAD) on a network using the IDS Snort on a signature basis.Both approaches are anomaly-based intrusion detection systems.The proposed hybrid IDS was assessed using IDEVAL data, which indicated that the number of attacks detected increased significantly in comparison to signature-based systems.
The literature demonstrates that malicious threats arise and evolve frequently; hence, the network requires a very effective security solution.Due to the complexity of new attacks, the present models are incapable of detecting them.Deep learning has enabled researchers to investigate new fields of inquiry.Deep-Learning-based techniques necessitate little input while exploring every possible set of features.Using this technique in intrusion detection can assist in the detection of such malicious attacks.The main aim of the paper is to accurately detect network intrusions by employing a model that can identify malicious traffic using deep learning and detect different types of intrusion attacks, as a deep-learningbased intelligent approach is proposed in this research.The proposed model overcomes the issues in existing models and provides promising results.

Proposed HDLNIDS Model
Deep learning for the detection of malicious traffic enables the detection of various changes in traffic, which in turn enhances the performance and allows only normal traffic to pass into the system.
Deep learning can assist in the identification of hostile intrusions for the detection of rising attacks.We proposed a model based on deep learning for an intrusion detection system that is illustrated in Figure 1.It comprises two learning phases, which are represented below.We proposed this method to build an IDS utilizing an HDLNIDS-based deep learning technique.Our suggested HDLNIDS is better in terms of computation while employing full-featured datasets and provides enhanced accuracy with a low likelihood of failure.
deep learning technique.Our suggested HDLNIDS is better in terms of computation while employing full-featured datasets and provides enhanced accuracy with a low likelihood of failure.With a massive data processing architecture, HDLNIDS learning concentrates on tackling real-world ID challenges.Solving such a challenge is difficult due to a lack of time and space.Big data now has massive and expanding quantities; however, it requires enormous amounts of energy, resources, and a computing device to help with the training process that can deal with significant data correctly.
Combining the RNN with a CNN-DL model, HDLNIDS reduces the aforementioned issues.Figure 1 depicts the HDLNIDS in further detail.According to the HDLNIDS overview, a CNN comprises two basic components: a feature extractor comes first, followed by a classifier.The feature extractor comprises two layers: a convolution comprising of the twin set of layers and a pooling layer.The obtained output called the feature map is fed into the second component for classification.CNN adopts the local characteristics in this way.The drawback is that it overlooks the time dependency of essential features.As a result, we added recurrent layers following the CNN layers to capture both spatial and temporal data more adequately.These four layers of RNN were added to make it more efficient in terms of accuracy and computation.Using this, we were able to effectively manage the disappearance and inflating gradient issues, which increases the capacity to capture temporal and spatial correlations and efficient learning from varied extent patterns.
The CNN first processes the input in the HDLNIDS network, and the output is subsequently sent to the recurrent layers, which produce sequences at each timestep, allowing for the modeling of both spatial and temporal characteristics.The resultant sequential vector is then input into a layer of SoftMax for the probability distribution over the categories after passing through a set of two fully linked layers.

Data Preprocessing
In the data pre-processing phase, initially, the network traffic was categorized and preprocessed.All necessary conversions for the HDLNIDS and IDS-compatible file formats were performed during pre-processing.The different characteristics, such as timestamps and network Internet Protocol (IP) addresses, have little effect on whether network traffic is malicious in the original CICIDS-2018.As timestamp features are utilized to keep track of when malicious communication occurs and give only minor assistance while training the algorithm, we eliminated them during the pre-processing step.Similar to an anomalous intrusion detection system (AIDS), the majority of traffic should be classed based on its behavior, without bias or competing with the IP address; thus, we With a massive data processing architecture, HDLNIDS learning concentrates on tackling real-world ID challenges.Solving such a challenge is difficult due to a lack of time and space.Big data now has massive and expanding quantities; however, it requires enormous amounts of energy, resources, and a computing device to help with the training process that can deal with significant data correctly.
Combining the RNN with a CNN-DL model, HDLNIDS reduces the aforementioned issues.Figure 1 depicts the HDLNIDS in further detail.According to the HDLNIDS overview, a CNN comprises two basic components: a feature extractor comes first, followed by a classifier.The feature extractor comprises two layers: a convolution comprising of the twin set of layers and a pooling layer.The obtained output called the feature map is fed into the second component for classification.CNN adopts the local characteristics in this way.The drawback is that it overlooks the time dependency of essential features.As a result, we added recurrent layers following the CNN layers to capture both spatial and temporal data more adequately.These four layers of RNN were added to make it more efficient in terms of accuracy and computation.Using this, we were able to effectively manage the disappearance and inflating gradient issues, which increases the capacity to capture temporal and spatial correlations and efficient learning from varied extent patterns.
The CNN first processes the input in the HDLNIDS network, and the output is subsequently sent to the recurrent layers, which produce sequences at each timestep, allowing for the modeling of both spatial and temporal characteristics.The resultant sequential vector is then input into a layer of SoftMax for the probability distribution over the categories after passing through a set of two fully linked layers.

Data Preprocessing
In the data pre-processing phase, initially, the network traffic was categorized and preprocessed.All necessary conversions for the HDLNIDS and IDS-compatible file formats were performed during pre-processing.The different characteristics, such as timestamps and network Internet Protocol (IP) addresses, have little effect on whether network traffic is malicious in the original CICIDS-2018.As timestamp features are utilized to keep track of when malicious communication occurs and give only minor assistance while training the algorithm, we eliminated them during the pre-processing step.Similar to an anomalous intrusion detection system (AIDS), the majority of traffic should be classed based on its behavior, without bias or competing with the IP address; thus, we deleted the IP address features as well.Pandas, NumPy, and Scikit-learn packages within the Python language were used to implement data pre-processing tasks.
The subject of class inequality has received significant attention from the research community.A class imbalance is caused by inadequate data distribution; one class has the majority of the samples, while others have comparably few.Because of unlimited data values and imbalanced classes, the classification issue becomes more challenging as data dimensionality grows.Bedi et al. [45] used a variety of ML methods to address the class imbalance problem.Thabtah et al. [46] investigated numerous methods for the problem of class imbalance.Most algorithms target the majority of data samples while missing the minority of data samples.Hence, minority samples emerge in an irregular but consistent manner.The basic strategies to resolve the data problem include data preprocessing and feature selection, and each approach has advantages and disadvantages.The ID dataset has a problem with high-dimensional imbalance, which includes lacking interesting features, attribute values, and the only availability of cumulative data.The data appear to be unreliable, with inaccuracies and anomalies, as well as unpredictable, with variations in codes or names.To overcome the imbalance problem, we utilized oversampling, which included increasing the frequency of occurrences among the minority group by indiscriminately reproducing them to enhance the representation of the minority group in the dataset.
Although there is a risk of overfitting with this process, no records were lost, and the over-sampling approach outperformed the under-sampling option.Table 1 presents the extracted features during preprocessing phase.

Fw-iat-min
The shortest delay between two packets supplied in an onward route.

Bw-iat-tot
The total lag time between two packets transmitted via a back channel.

Bw-iat-avg
The average time two packets transmitted through a back channel.

Bw-iat-std
The average time between two packets transmitted in the reverse direction.

Bw-iat-max
The highest time between two packets transmitted in the reverse direction.

Bw-iat-min
The shortest time between two packets transmitted in a forward direction.

Bw-iat-min
The lowest time between two packets transmitted in a reverse direction.

Fw-pkt-l-avg
The average quantity of data in a packet in an upward direction Fw-pkt-l-min The lowest volume of the package Tot-bw-pk Overall data packets in a back channel Tot-l-fw-pkt Overall size of the packet in an up channel Tot-fw-pk Forwardly aggregate data packets Fl-iat-max The highest-duration period between two flows Fl-dur Interval of flow

Model Training and Testing
Model training is an important step after data preprocessing to extract meaningful features from the training set.For this purpose, we divided the dataset into 80, 10, and 10 training, testing, and validation sets, respectively.To train an efficient model, we ensured that all of the class samples were present in the training, validation, and testing set.During the training process, the model was tested on the validation set throughout the process to assess the performance of the model, and this process helps to obtain better results when unseen data are provided to the model.

Dataset Detail
Selecting appropriate ID data to analyze the ID system is crucial, so we chose the data preceding the simulation of the suggested methodology.Even though many ID databases are openly available, several of them contain outdated, inaccurate, and unreproducible intrusions.To address these shortcomings and generate current traffic patterns, the Amazon Web services (AWS) platform created the well-known CICDS-2018 [47] dataset.The CICIDS 2018 intrusion dataset depicts real-time network behavior and includes a variety of intrusion modes.Furthermore, it is spread as a full network that encompasses all of the internal networks traced to compute data packet payloads.These qualities of the CICIDS-2018 dataset compelled us to use it in our research for the proposed system.
This dataset provides many intrusion patterns that may be applied to a variety of network protocols and topologies about safety and security.This dataset is an upgrade to the CICIDS-2017 dataset.CICIDS-2018 is a publicly accessible dataset with two patterns and seven intrusion techniques at the moment.Multiple data states were gathered, and raw data were refreshed.CICIDS-2018 contains 80 statistical features measured in forward and reverse modes, such as volume, packet length, and the number of bytes.Finally, the dataset, which had around 5 million entries, was made available to all researchers over the internet.The CICIDS-2018 dataset is available in PCAP and CSV formats.In this research, we considered the use of CSV format, whereas the PCAP format is utilized to extract innovative features [48,49].This CICIDS-2018 dataset includes various categories of attacks:

•
Brute-force DOS attacks; The dataset framework comprises 50 systems, whereas the attacking firms comprise 31 servers and 421 endpoints.CICIDS-2018 data provide AWS-recorded network traffic and a system log containing 80 retrieved parameters using CICFlowMeter-V3.The CICIDS-2018 dataset is approximately 400 GB in size, which is greater than the CICIDS-2017 dataset.Table 1 shows a few retrieved characteristics from the CICIDS-2018 dataset.The sample size of the CICIDS-2018 dataset was compared to that of CICIDS-2017.Table 2 displays the outcomes of both datasets, notably in the Botnet and Infiltration assaults, where it increased by 143 and 4497, respectively.Moreover, the number of Internet Attacks provided in CICIDS-2018 is quite low (928).Table 2 presents the comparison between CICIDS-2018 and CICIDS-2017 datasets.

Performance Analysis
In this subsection, we explain the simulation environment used to test our proposed model along with performance parameters and results.

Simulation Setup
To validate the efficacy of the suggested ID strategy, we implemented the proposed technique in Python using deep learning.The experiment was carried out on a computer (Core i7, 64-bit, 24 GB RAM, 64-core CPU).To improve pipeline speed, the model was trained using an NVIDIA GTX 2080 Ti GPU.To evaluate the proposed models, the dataset was divided into training and testing datasets separately.We utilized the training dataset to train the proposed model to make it effective.Then, the testing set was used to assess the efficiency of the proposed model.We utilized the CICIDS-2018 dataset to demonstrate the effectiveness of the proposed solutions.The network traffic included both malicious and normal information, which the proposed model categorized into malicious and nonmalicious categories, respectively.To obtain high accuracy and a low FAR value, the proposed technique decreases computational complexity by utilizing rich characteristics from the CICIDS-2018 dataset.Even though the CSE-CIC-IDS2018 data were utilized for training and testing, the model was tested with 10-fold cross-validation in each step, and each model was trained on the lot size using first-order gradient-based optimization algorithms such as RMSprop and Ada Max with different learning rates, while various combinations of hyperparameters were used to optimize the actual network packet using a search algorithm and 10-fold cross-validation.We included Gaussian noise layers after convolutional and recurrent layers to improve model flexibility in terms of performance and to prevent overfitting.Table 3 presents the hardware specifications used for this research.

Evaluation Metrics
A confusion matrix (CM) assists in determining the actual and expected classification.The classification result is divided into two categories: normal and abnormal.Four crucial states in the confusion matrix must be measured.

•
True Positive (TP): this implies demonstrating that the model is accurate and representative and that it predicts favorable results.

•
False negative (FN): this is defined as an inaccurate prediction.It accurately classifies malicious situations as normal, whereas it predicts bad outcomes wrongly.

•
False positive (FP): when the number of attacks seen is typical, the model predicts a favorable result.

•
True negative (TN): this refers to events that are appropriately identified as an assault and forecasts unfavorable outcomes.
The following performance matrices were used to evaluate the performance of the proposed system in terms of accuracy, precision, recall, and F1 score as mentioned below in Equations ( 1)-( 4), respectively: Accuracy = TP + TN Total Samples (1) F1 Score = T2 × Precision × Recall Precision + Recall (4)

Results and Evaluation
Experiments were conducted to evaluate the effectiveness of HDLNIDS in terms of accuracy, precision, recall, F1 measure, and the loss graph.We used 10-fold cross-validation to evaluate the ability of the model on new data as shown in Table 4.  Figure 2 displays the results in terms of accuracy, which indicate that the proposed model's accuracy continues increasing with time.Initially, the accuracy of the model is observed as low, but it continues increasing over epochs.This is due to the learning ability of the proposed model that makes it stable and more competent to classify normal and malicious traffic.The results of the proposed model for the loss graph are shown in Figure 3.This indicates that the proposed model loss graph continues decreasing with time.Initially, the model's loss graph is observed to be high, but it steadily decreases over time.This is due to the proposed model's ability to learn, which makes it more stable and competent at classifying normal and malicious traffic.The results of the proposed model for the loss graph are shown in Figure 3.This indicates that the proposed model loss graph continues decreasing with time.Initially, the model's loss graph is observed to be high, but it steadily decreases over time.This is due to the proposed model's ability to learn, which makes it more stable and competent at classifying normal and malicious traffic.

Characteristics of Each Fold
We present the performance of fold-wise computation in Table 4, which describes the results in terms of precision, recall, F-measure, and accuracy achieved against each fold.With each fold, the results continue improving, and the model achieves an average of 98.63% precision, 99.14% recall, 99.03% F-measure, and 98.90% accuracy.

Performance Analysis
In this section, we compare the proposed technique with existing techniques in terms of the dataset, and the methodology they used to detect the network intrusion.We also reviewed the technique to detect the limitations of the literature.A detailed comparison with existing techniques is presented in Table 5.Most of the authors used the NSL-KDD, KDD99, UNSW-NB15, and AWIS dataset, while the approaches used were the Sparse autoencoder, conjugate gradient algorithm (CGA), deep belief networks, DNN, CNN, RNN, and LSTM [48][49][50][51][52][53][54].Most of the research conducted used limited data, while, in [49], the performance could be improved by improving the sampling methodology.There are different types of networks attacks, and no study caters to all of them; most research focuses on the detection of binary classification while some research considers multiclass classification within a limited scope.From Table 5, it is evident that our proposed approach achieves significant results as compared to existing techniques.

Characteristics of Each Fold
We present the performance of fold-wise computation in Table 4, which describes the results in terms of precision, recall, F-measure, and accuracy achieved against each fold.With each fold, the results continue improving, and the model achieves an average of 98.63% precision, 99.14% recall, 99.03% F-measure, and 98.90% accuracy.

Performance Analysis
In this section, we compare the proposed technique with existing techniques in terms of the dataset, and the methodology they used to detect the network intrusion.We also reviewed the technique to detect the limitations of the literature.A detailed comparison with existing techniques is presented in Table 5.Most of the authors used the NSL-KDD, KDD99, UNSW-NB15, and AWIS dataset, while the approaches used were the Sparse autoencoder, conjugate gradient algorithm (CGA), deep belief networks, DNN, CNN, RNN, and LSTM [48][49][50][51][52][53][54].Most of the research conducted used limited data, while, in [49], the performance could be improved by improving the sampling methodology.There are different types of networks attacks, and no study caters to all of them; most research focuses on the detection of binary classification while some research considers multiclass classification within a limited scope.From Table 5, it is evident that our proposed approach achieves significant results as compared to existing techniques.-By performing extensive experiments, we determined the optimal parameters for our proposed model: it took 3.06 s per epoch for training and validation.The proposed model was trained for 60 epochs due to early stopping and it took approximately 3 min in total to train the model.

Conclusions and Future Work
Currently, the most urgent issue facing modern society is network attacks.All networks are susceptible to network risks, regardless of size.A network must have an intrusion detection system (ID) for detecting and mitigating hostile attacks.As malicious threats continually emerge and evolve, the network requires a highly advanced security solution.
Using deep learning to detect malicious traffic enables the detection of various changes in traffic, which in turn enhances the performance and allows only normal traffic to pass into the system.
Due to this, the development of an effective and intelligent ID system is of the utmost importance.In this study, a hybrid ID framework based on Deep Learning is created using a convolutional recurrent neural network (CRNN) that detects hostile network attacks.The model is built by combining an RNN with a CNN in which two convolutional layers are followed up by various RNN layers; the result is then passed into fully connected, flattened, and SoftMax layers that make the model capable of detecting and classifying traffic.To enhance the accuracy and predictability of the ID system, the CNN collects local features via convolution, whereas a deep-layered RNN captures HDLNIDS features.Experiments are conducted using publicly available intrusion detection data, in particular, the modern and realistic CICIDS-2018 data, to determine the efficacy of the proposed HDLNIDS system.The simulation results demonstrate that the proposed HDLNIDS gives promising results in terms of accuracy and data loss.
In the near future, we want to expand our model with more parameters that contribute to enhanced performance and detection, by designing more effective algorithms for other malicious network traffic using multiple deep learning techniques.We will also analyze and expand our model's capacity to handle zero-day assaults as our initial avenue for improvement.Furthermore, we will seek to expand upon our current assessments by employing actual backbone network traffic to illustrate the expanded model's value.

Figure 2 .
Figure 2. Accuracy of proposed deep learning model with respect to epochs.

Figure 2 .
Figure 2. Accuracy of proposed deep learning model with respect to epochs.

Figure 3 .
Figure 3. Loss graph of proposed deep learning model with respect to epochs.

Table 1 .
An overview of the CICIDlS-2018 ID dataset's extracted characteristics.

Table 4 .
Evaluation and Characteristics of Architecture on each Fold.Figure2displays the results in terms of accuracy, which indicate that the proposed model's accuracy continues increasing with time.Initially, the accuracy of the model is observed as low, but it continues increasing over epochs.This is due to the learning ability of the proposed model that makes it stable and more competent to classify normal and malicious traffic.

Table 4 .
Evaluation and Characteristics of Architecture on each Fold.

Table 5 .
Comparison with Existing Techniques.
Figure 3. Loss graph of proposed deep learning model with respect to epochs.

Table 5 .
Comparison with Existing Techniques.