This section describes the experiments conducted to measure the efficiency of the proposed method. The proposed model’s results for intrusion detection (IDS) and encryption are reviewed, and its effectiveness is illustrated by comparison with existing models. The model was implemented using the CloudSim simulator and Python on a computer running Windows 10 with an Intel i7 CPU and 8 GB of RAM.
5.1. Dataset Description
In this work, two datasets are used for testing: KDDCUP99 and UNSW-NB15.
KDDCUP99: This dataset was built for intrusion detection. It covers four attack types: Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L), and Probing. Each record has 41 features, of which 38 are numeric and 3 are symbolic (protocol type, service type, and flag). The features are grouped into basic features (1–10), content features (11–22), and traffic features (23–41), followed by a class label for each record.
UNSW-NB15: This dataset contains a large volume of network traffic, with about two million records of both anomalous and regular connections. It reflects real-world conditions and is high-dimensional, with 49 attributes per connection record and an average data rate of 5–10 MB/s. Although no longer new, UNSW-NB15 remains one of the most widely used benchmark datasets for testing IDS and contains patterns of nine contemporary attack types: Fuzzers, Analysis, Backdoors, Denial of Service (DoS), Exploits, Generic, Reconnaissance, Shellcode, and Worms.
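The symbolic KDDCUP99 attributes have to be converted to numbers before a neural network can use them. The encoding is not specified in this work, so the following minimal sketch assumes one-hot encoding. The three sample records and the `duration` field are made up for illustration.

```python
# Hypothetical miniature records: one numeric field plus the three symbolic
# KDDCUP99 fields. One-hot encoding is an assumption, not the stated method.
records = [
    {"duration": 0, "protocol_type": "tcp", "service": "http", "flag": "SF"},
    {"duration": 2, "protocol_type": "udp", "service": "domain_u", "flag": "SF"},
    {"duration": 0, "protocol_type": "tcp", "service": "http", "flag": "REJ"},
]

SYMBOLIC = ("protocol_type", "service", "flag")


def build_vocab(rows, columns):
    """Collect the sorted set of values observed in each symbolic column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}


def one_hot_encode(row, vocab):
    """Return a flat vector: numeric fields first, then one block per symbolic column."""
    vector = [float(v) for k, v in row.items() if k not in vocab]
    for col, values in vocab.items():
        vector.extend(1.0 if row[col] == v else 0.0 for v in values)
    return vector


vocab = build_vocab(records, SYMBOLIC)
encoded = [one_hot_encode(r, vocab) for r in records]
print(encoded[0])
```

With the full dataset the vocabulary would be built from the training split only, so that the vector length stays fixed at test time.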
The proposed work was evaluated on these two IDS datasets. In addition, real-time data were collected from a network of IoT devices that interact with each other, possibly under various conditions such as different network loads, device configurations, and types of traffic. The data collected from this IoT setup are expected to include information about network traffic patterns, communication between devices, and potential anomalies or security events. The proposed technique follows the same procedure on the real-time data to predict and mitigate attacks. Its ability to differentiate malicious from normal network traffic makes it efficient under real-time conditions.
5.2. Validation Parameters
The model’s results are analyzed using recall, accuracy, specificity, F-measure, false positive rate (FPR), precision, log loss, and false negative rate (FNR). These parameters quantify the efficiency of IDS methods in identifying malicious network traffic. They belong to the wide range of indicators typically used in IDS research to show the effect of the classification process.
Accuracy: A key performance indicator for assessing an IDS, defined as the proportion of all transmitted packets that were correctly classified. It is calculated as in Equation (19):
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$
Recall: Also known as the detection rate, it is the ratio of correctly detected attack instances to all actual positive instances. This critical IDS metric shows how well the model recognizes attacks and is expressed in Equation (20):
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{20}$$
Precision: The confidence in attack detection is measured by the ratio of true positives to all predicted positives, as defined in Equation (21):
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{21}$$
F-measure: The F-measure is the harmonic mean of precision and recall, as given in Equation (22):
$$F\text{-measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{22}$$
False Negative Rate (FNR): The FNR is the proportion of false negatives among all actual positives. It is often referred to as the missed alarm rate in attack detection and is expressed in Equation (23):
$$\mathrm{FNR} = \frac{FN}{FN + TP} \tag{23}$$
FPR: The FPR is the ratio of normal samples falsely classified as positive to all actual negative samples. In attack detection it is sometimes referred to as the false alarm rate and is calculated using Equation (24):
$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{24}$$
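Log-Loss Score: The logarithmic loss measures how closely a predicted probability matches the true label. It is the negative average of the log of the corrected predicted probability for each sample. For $N$ samples with true labels $y_i \in \{0, 1\}$ and predicted attack probabilities $p_i$, the log-loss score is expressed in Equation (25):
$$\mathrm{LogLoss} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right] \tag{25}$$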
where TP, TN, FP, and FN denote the true positives, true negatives, false positives, and false negatives, respectively. Malicious traffic is treated as the positive class and normal traffic as the negative class, since the objective is to detect attacks. Recall, accuracy, FNR, and FPR are widely used in attack detection.
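The metrics referenced by Equations (19)–(25) can be computed directly from confusion-matrix counts. The sketch below uses the standard definitions and assumes binary labels (1 for attack, 0 for normal). The function names and the example counts are illustrative, not results from this study.

```python
import math


def ids_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics; attacks are the positive class."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": recall,
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
        "fnr": fn / (fn + tp),
        "fpr": fp / (fp + tn),
    }


def log_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Made-up counts, for illustration only.
m = ids_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in m.items():
    print(f"{name:10s} {value:.4f}")
print(f"log loss   {log_loss([1, 0, 1], [0.9, 0.2, 0.7]):.4f}")
```

Note that recall and FNR always sum to one, so reporting both mainly serves readability.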
5.3. Performance Evaluation
In this research, the KDDCUP99 and UNSW-NB15 datasets are used for validation. The raw data are pre-processed to remove noise and missing values, and the SSO algorithm is then applied for feature selection. Some attributes have little bearing on determining whether traffic behavior is normal. We employ 90 characteristics instead of the original 89 features, since features such as the timestamp and IP addresses do not help train the network to identify errors and intrusions. The performance values obtained with the selected features for both datasets are compared with those of earlier models in Table 2.
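The cleaning step described above (discarding identifier-like fields and records with missing values) can be sketched as follows. The column names `timestamp`, `src_ip`, and `dst_ip` are hypothetical placeholders, and the sketch does not implement SSO. It only shows the filtering that would precede feature selection.

```python
# Hypothetical column names; only the exclusion of timestamp and IP-address
# fields and the removal of missing values are taken from the text.
NON_INFORMATIVE = {"timestamp", "src_ip", "dst_ip"}


def preprocess(rows):
    """Drop non-informative columns, then drop records with missing values."""
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in NON_INFORMATIVE}
        if any(v is None for v in kept.values()):
            continue  # record has a missing value in a retained column
        cleaned.append(kept)
    return cleaned


rows = [
    {"timestamp": 1, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "dur": 0.5, "sbytes": 100},
    {"timestamp": 2, "src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "dur": None, "sbytes": 20},
]
print(preprocess(rows))
```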
Next, the selected features are passed to the LSTM, which determines whether the transmitted traffic has been attacked, independent of the attack type. There are two classes: normal traffic and malicious traffic, the latter covering attacks such as Denial of Service (DoS), User to Root (U2R), Probe, Remote to Local (R2L), Backdoors, Fuzzers, Analysis, Shellcode, Exploits, Reconnaissance, and Worms. Subsequently, SHA3-256 with HE is applied for secure data transmission. The performance of the proposed method is compared with that of earlier methods, namely MWKF-LSTM with DEDFA [27], SA-IDPS [29], BiLSTM [30], and GA-FCAM-CNN [33], in terms of accuracy, FPR, FNR, encryption time, and decryption time.
In the presented strategy, adaptive learning is used to train the LSTM to classify malicious and normal network traffic. Adaptive learning refers to the capacity of the model to continuously update itself based on new data. It helps the system evolve and improve its accuracy as it encounters new and previously unseen attack patterns, thereby increasing its efficiency in attack prediction. To demonstrate the learning efficiency of the designed method, it is compared with other learning algorithms: reinforcement learning (RL) [36], adaptive reinforcement learning (ARL) [37], unsupervised learning (UL) [38], and semi-supervised learning (SL) [39]. Figure 4 presents the learning rates of the different techniques.
Figure 5a,b compare the training accuracy and training loss for the KDDCUP99 and UNSW-NB15 datasets, plotted as accuracy and log loss against the number of epochs. The proposed system remains stable beyond 25 epochs for training accuracy and 50 epochs for training loss. Compared with the earlier models, the proposed approach attains a higher training accuracy and a very low loss value.
Figure 6a,b provide the comparative analysis of testing accuracy and loss for both datasets, plotted as accuracy and log loss against the number of epochs. The curves remain stable beyond 25 epochs for accuracy and 50 epochs for loss. The proposed approach achieves higher testing accuracy and a very low loss compared with the conventional methods.
The MSE of the proposed method is evaluated for both datasets, as illustrated in Figure 7, which reports the training, validation, and test performance. For the KDDCUP99 dataset, the best validation performance (0.07) is reached at epoch 26. The epoch with the lowest validation error gives the best performance, although the training error often continues to decrease after that point. For the UNSW-NB15 dataset, the best validation performance (0.0138) is reached at epoch 25.
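Selecting the best epoch from a validation MSE curve amounts to locating its minimum. The sketch below illustrates this with a made-up curve. The values are not those plotted in Figure 7, and the helper names are illustrative.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_epoch(validation_mse):
    """Return (1-based epoch, value) of the lowest validation MSE."""
    idx = min(range(len(validation_mse)), key=validation_mse.__getitem__)
    return idx + 1, validation_mse[idx]


# Made-up validation curve: the error falls, then rises as overfitting sets in.
curve = [0.40, 0.22, 0.15, 0.09, 0.11, 0.13]
epoch, value = best_epoch(curve)
print(f"best validation MSE {value} at epoch {epoch}")
```

In practice the model weights saved at that epoch would be the ones kept for testing.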
The evaluation metrics of accuracy, recall, precision, F-measure, FPR, and FNR are computed for the proposed and conventional methods. Their graphical representation for the KDDCUP99 dataset is shown in Figure 8a–f. The results show that the designed method achieves better accuracy, precision, recall, and other metric values than the conventional methods.
Figure 9a–f shows the same measures for the UNSW-NB15 dataset. The designed method outperforms the traditional frameworks, with higher accuracy, precision, recall, and F-measure, and lower FPR and FNR values.
Figure 10a,b show the communication delay incurred by the different techniques for the KDDCUP99 and UNSW-NB15 datasets. The communication delay is the time the system takes to transmit a signal from the sender to the receiver. The comparison shows that the proposed approach achieves a lower communication delay than the other traditional methods.
The detection rate of the designed framework when categorizing traffic at different packet levels is compared with that of the conventional models in Figure 11a,b. The developed method achieves a higher detection rate than the other methods.
Furthermore, the encryption and decryption times of the presented approach, which is based on a SHA3-256 one-way hash function, are compared with those of the earlier methods in Figure 12a,b. The comparison shows that the presented methodology requires far less encryption and decryption time than the traditional methods.
Furthermore, the collision resistance and computational efficiency of the proposed SHA3-256-based one-way hash function are compared with those of SHA-1, SHA-256, SHA-384, and SHA-512 in Figure 13a,b. The results show that the designed approach achieves very high collision resistance and computational efficiency compared with the traditional methods. As a result, the proposed SHA3-256-HE is strengthened and becomes more resistant to unidentified attackers. According to the performance investigation, all SHA3-256-HE variations require more cycles per byte than the SHA-2 variants. This is explained by the internal structure of SHA3-256-HE, which makes it more secure because it provides both a MAC and a hash, whereas SHA-2 does not. SHA3-256-HE therefore remains the best option for providing security and data integrity, despite requiring more cycles per byte. Its adaptable structure enables it to provide anonymity, non-repudiation, and secrecy against all possible attacker configurations. Moreover, the analysis shows that the proposed security algorithm is computationally efficient.
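Readers who wish to check the relative cost of these hash functions on their own hardware can use a small timing harness such as the one below. It is only a sketch built on Python's standard `hashlib`. The 4 KiB payload and the repeat count are arbitrary assumptions, and the timings depend on the machine, so it does not reproduce Figure 13.

```python
import hashlib
import time

ALGORITHMS = ("sha1", "sha256", "sha384", "sha512", "sha3_256")


def time_hash(name, payload, repeats=200):
    """Return (digest size in bytes, mean seconds per hash) for one algorithm."""
    start = time.perf_counter()
    for _ in range(repeats):
        digest = hashlib.new(name, payload).digest()
    elapsed = time.perf_counter() - start
    return len(digest), elapsed / repeats


payload = b"\x00" * 4096  # arbitrary 4 KiB message (assumption)
results = {name: time_hash(name, payload) for name in ALGORITHMS}
for name, (size, seconds) in results.items():
    print(f"{name:9s} digest={size:2d} B  {seconds * 1e6:8.2f} us/hash")
```

The digest size gives a rough indication of generic collision resistance (about half the digest length in bits), while the per-hash time reflects software throughput only, not energy or hardware cost.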
Owing to its low energy consumption, collision resistance, and computational efficiency, SHA3-256 performs well with small key sizes and is suitable for devices with restricted resources. Furthermore, the memory usage and processing time of the developed SHA3-256 method are compared with those of the earlier models when applied to large-scale IoT networks with high data traffic, as shown in Figure 14a,b. Figure 14a compares the memory usage of the different strategies. The designed SHA3-256 one-way hashing method requires the least memory for processing.
Furthermore, the execution time of the proposed technique was analyzed to determine how long it takes to predict and mitigate attacks in the IoT environment. Table 3 gives the overall time consumed by the developed framework for the KDDCUP99 and UNSW-NB15 datasets. The model took 4.3 ms and 2.8 ms for predicting and mitigating attacks, respectively, on the KDDCUP99 dataset, and 3.9 ms and 2.5 ms, respectively, on the UNSW-NB15 dataset.
5.4. Discussion
Table 4 provides detailed comparison results for the KDDCUP99 and UNSW-NB15 datasets. The earlier MWKF-LSTM with DEDFA [26] method used an MWKF model in an LSTM approach with DEDFA feature selection, yet it did not reach the optimal solution because of its complexity and computational burden. In the SA-IDPS [28] research, a smart approach was developed, but only limited attack categories were considered for intrusion detection and prevention. The BiLSTM [29] for intrusion detection takes longer to train, which leaves the data exposed to attackers for longer. In the GA-FCAM-CNN [32] approach, the trust calculation is not updated by any outside sources and is quite low. The results show that the suggested strategy outperforms these strategies on both datasets. The proposed method uses SSO for optimal feature selection and an LSTM for intrusion detection, and the data are secured using the SHA3-256 algorithm.
The LSTM module handles sequential and temporal dependencies effectively. This helps the system take the order and timing of network events into account, enabling it to model network traffic and predict attack patterns accurately. In contrast, traditional ML algorithms, such as SVM, RF, and DNN, often struggle to capture long-range dependencies in sequences, leading to suboptimal performance. In addition, the memory cell and gating mechanism enable the LSTM to remember and forget information over long time intervals, allowing it to identify complex relations and patterns in network traffic over time. These characteristics distinguish the LSTM from other ML approaches and support more accurate classification of network traffic. Moreover, the LSTM mitigates the vanishing gradient problem, leading to more effective training under challenging data distributions. Furthermore, the LSTM architecture enables it to learn both contextual and hierarchical attributes, so it can differentiate normal and malicious activities more precisely than the other ML approaches. Thus, the LSTM is helpful for handling long-term intrusion issues and is highly effective for modelling complicated sequential IDS data.
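The role of the memory cell and the gates can be made concrete with a single step of a standard LSTM cell. The sketch below is a scalar, pure-Python illustration of the textbook equations. It is not the network trained in this work, and the weights are hypothetical values chosen so that the forget gate stays open and the input gate stays closed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate content to write
    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # updated hidden state
    return h, c


# Hypothetical weights: forget gate near 1, input gate near 0.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for x in [0.3, -0.2, 0.8, -0.5, 0.1]:
    h, c = lstm_step(x, h, c, weights)
print(f"cell state after 5 steps: {c:.4f}")
```

With these weights the cell state stays close to its initial value across the whole sequence, which is the mechanism that lets an LSTM carry information over long time intervals.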
The thorough assessment of the results shows that the developed approach achieves better attack identification than the earlier models, with higher recall, accuracy, F-measure, and precision, and lower FPR, FNR, MSE, and execution time.
Previous attempts to design efficient security algorithms have had limited success. Experimental findings show that our suggested smart approach, with its key phases, effectively counters both common and uncommon attacks. In this research, IDS performance is improved by combining deep LSTM networks with hybrid SSO feature selection, which handles the uncertainties in the data. The SSO algorithm ranks the best and worst features in the dataset to obtain optimal results. The best features are thus collected from the processed dataset and passed to the LSTM for effective intrusion detection, which in turn supports effective intrusion prevention. The features in the proposed model are chosen by the SSO algorithm from the payload data and then used as input by the LSTM layer. Although the features chosen by the SSO algorithm carry significant information for traffic classification, the unaltered payload located at the start of the flow is also helpful for categorizing unaltered data. As a result, the approach requires little preparation and training time, which also makes it well suited to high-dimensional and large-scale domains. The execution time is reduced because malicious traffic is detected at the packet level. A thorough examination shows that SHA3-256-based one-way hash functions can reliably identify and stop a variety of intrusion attacks on IoT networks, including spoofing, tampering, and replay attacks. SHA3-256 plays a significant role in mitigating intrusions in the IoT by providing a strong cryptographic mechanism, thereby ensuring data integrity and confidentiality.
In the developed approach, this hash function acts as an effective checksum mechanism, allowing the system to validate the integrity of the network traffic data. It works by calculating a fixed-size hash value for the incoming network traffic data. This hash value acts as a digital fingerprint of the data, and any modification of the data completely changes its hash value. The proposed technique ensures data integrity by comparing the hash value received with the data against the hash value recomputed from it; if the two values match, the data are considered intact. Otherwise, the system flags the data as injected. This helps to detect any unauthorized alteration of the data. Integrating this hash function into the proposed approach enhances security and allows attacks to be mitigated promptly. In addition, the SHA3-256-based one-way hash function offers fast processing, efficient handling of large volumes of data, and one-way security against attackers. Because the hash function is collision-resistant, it is improbable that two different inputs produce the same hash. Overall, the analysis shows that the developed model effectively detects and prevents various types of intrusion attacks in IoT environments, such as Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L), Probing, Backdoors, Fuzzers, Analysis, Shellcode, Reconnaissance, and Worms. The accuracy of the proposed approach is 99.9%, which is higher than that of the other methods for both the KDDCUP99 and UNSW-NB15 datasets. Because our system analyzes traffic at the packet level, it offers several benefits for speeding up detection, such as the ability to skip inspecting a significant number of packets in a transaction.
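The fingerprint comparison described above can be illustrated with Python's standard `hashlib`. This is a minimal sketch of hash-based integrity checking only. The payload strings and function names are made up, and the HE component and the way the reference digest reaches the receiver are outside its scope.

```python
import hashlib
import hmac


def fingerprint(payload: bytes) -> str:
    """SHA3-256 digest of the payload as a 64-character hex string."""
    return hashlib.sha3_256(payload).hexdigest()


def is_unaltered(payload: bytes, expected_hex: str) -> bool:
    """Recompute the digest and compare it with the reference digest."""
    # compare_digest runs in constant time with respect to the mismatch position
    return hmac.compare_digest(fingerprint(payload), expected_hex)


sent = b"sensor=42;temp=21.5"      # hypothetical IoT message
tag = fingerprint(sent)            # digest computed by the sender
tampered = b"sensor=42;temp=99.9"  # same message with one field modified

print(is_unaltered(sent, tag))      # digests match: data intact
print(is_unaltered(tampered, tag))  # digests differ: flagged as altered
```

A bare hash detects modification only if the reference digest itself reaches the receiver over a trusted channel; otherwise an attacker could replace both the data and the digest.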
The experimental findings demonstrate that, compared with earlier work, our strategy is superior in terms of accuracy, recall, precision, FPR, detection rate, FNR, and F1-score. Accuracy is the primary performance metric of the algorithm used to analyze and anticipate intrusions. The main concern of the presented study is to address the security challenges associated with IoT systems. Typically, IoT systems contain a wide range of interconnected devices covering domains such as smart homes, the Industrial IoT (IIoT), healthcare, agriculture, and transportation. Addressing the security issues in these domains is crucial for demonstrating the proficiency of the designed algorithm. By combining the strengths of multiple algorithms, including LSTM, SSO, and SHA3-256, the developed approach offers a versatile, robust, and efficient mechanism for tackling privacy and security problems in different domains. In all the above-mentioned domains, the proposed technique follows a common protocol to mitigate attacks, enabling secure data access and transmission. Moreover, the designed model performs well in resource-constrained IoT environments. Typically, IoT devices are limited in memory, power, and other resources, so an attack detection framework must take these constraints into account. The proposed model was designed to be adaptable, for example by minimizing memory use without losing its ability to capture interconnections within the traffic data. In addition, the integration of SSO reduces the computational complexity, making the feature selection process more sustainable and feasible. Moreover, SHA3-256 can mitigate attacks without placing a significant burden on hardware or power resources. Thus, the presented approach is practicable for resource-constrained environments.