Botnet Attack Detection Using Local Global Best Bat Algorithm for Industrial Internet of Things

The need for timely identification of Distributed Denial-of-Service (DDoS) attacks in the Internet of Things (IoT) has become critical for minimizing security risks, as the number of IoT devices deployed globally grows rapidly and the volume of such attacks rises to unprecedented levels. Instant detection strengthens network security by speeding up warnings and the disconnection of infected IoT devices from the network, thereby preventing the botnet from propagating and stopping additional attacks. Several methods have been developed for detecting botnet attacks, such as Swarm Intelligence (SI) and Evolutionary Computing (EC)-based algorithms. In this study, we propose a Local-Global best Bat Algorithm for Neural Networks (LGBA-NN) to select both feature subsets and hyperparameters for efficient detection of botnet attacks, inferred from 9 commercial IoT devices infected by two botnets: Gafgyt and Mirai. The proposed Bat Algorithm (BA) adopts a local-global best-based inertia weight to update the velocity of each bat in the swarm. To improve the swarm diversity of BA, we use a Gaussian distribution for population initialization. Furthermore, the local search mechanism follows the Gaussian density function and a local-global best function to achieve better exploration during each generation. The enhanced BA is further employed for neural network hyperparameter tuning and weight optimization to classify ten different botnet attacks plus one benign target class. The proposed LGBA-NN algorithm was tested on the N-BaIoT data set, which contains extensive real traffic data with benign and malicious target classes. The performance of LGBA-NN was compared with several recent advanced approaches, such as weight optimization using Particle Swarm Optimization (PSO-NN) and BA-NN. The experimental results revealed the superiority of LGBA-NN, with 90% accuracy, over the other variants, i.e., BA-NN (85.5% accuracy) and PSO-NN (85.2% accuracy), in multi-class botnet attack detection.


Introduction
An increase in cyber-crimes has made the detection of intrusions in the network a vital research area [1]. Traditionally, personal computers (PC) and computer networks have been the subject of cyber-attack, but recently, cyber-physical systems [2], Internet of Things (IoT) [3], internet of medical things (IoMT) [4], Internet of Connected Vehicles [5], smart factories [6], and 5G communication infrastructures [7] have become targets of numerous attacks as well. Several studies have been carried out to suggest remarkable strategies to fight against cyber actions [8,9]. Previous strategies are not able to resolve complex cyberattacks. Common preventive approaches such as authentication, firewalls, and antivirus are not sufficient for complicated cyber-attacks [10]. Algorithms used for classification and hyperparameters for efficient detection of botnet attacks.
LGBA-NN further contains three major improvements. The first modification involves robust population initialization of bats using a Gaussian distribution, which helps the bat solutions produce robust offspring by providing sufficient diversity. Secondly, the proposed BA uses a local-global best-based inertia weight to update the velocity of every bat in the swarm. Finally, a local search mechanism based on the Gaussian density function and the local-global best function is proposed to improve exploration during each phase.
The novelty and contributions of this paper are summarized as follows:
• A novel meta-heuristic local-global best bat algorithm (LGBA) for optimization of the hyper-parameters of a neural network (LGBA-NN) is presented.
• The proposed LGBA-NN is tested on an N-BaIoT data set with extensive real-world traffic data with benign and malicious classes, achieving high performance.

Related Work
Anomaly-based detection and signature-based detection are the two methods of detection. For a known pattern of attack, signature-based detection is used. At the same time, anomaly-based detection is used for unknown and known patterns of attack [31]. Additionally, Network-based Intrusion Detection Systems (NIDS) depend on traffic identification, which means the flow of traffic is used to extract essential features used to classify traffic records into malicious or normal activities by utilizing machine learning algorithms [32]. Host-based and network-based systems are two classifications of intrusion detection systems. All actions are monitored in NIDS for detection of intrusion, which helps in identifying attacks such as Denial of Service (DoS) [33]. In contrast, in the case of Host-based Intrusion Detection Systems (HIDSs), specific approaches are implemented at significant points of the server or every system. A low rate of false alarms and highly accurate detection can be gained by utilizing a vast database in anomaly-based IDSs. For developing a particular database for anomaly-based systems, testing and training phases are implemented. In the training phase, regular patterns are specified, used for comparison during the testing phase.
Traditionally used signature and heuristic methods for detecting malicious software are not able to provide a sufficient level of detection of new and previously unknown variants of botnets. This determines the applicability of the machine learning methods in solving this problem. Advanced machine learning and deep learning methods are being utilized for security purposes while increasing the robustness and accuracy of attack detection without demanding advanced knowledge of security [34,35]. The efficiency of IDSs can be enhanced by utilizing nature-inspired meta-heuristic, grammar-based [36], data mining, machine learning, reinforcement learning, and artificial intelligence-based techniques [37][38][39]. The performance of IDS can be improved by utilizing techniques such as Artificial Bee Colony (ABC) [40], Particle Swarm Optimization (PSO) [41], Grey Wolf optimization [42], and Artificial Fish Swarm Algorithm [43]. Description of recent related works on network intrusion detection using several optimization, deep, and machine learning algorithms are given in Table 1.
One study [44] proposed a group model by utilizing a meta classification strategy facilitated through stacked generalization. UGR'16 and UNSW NB-15 are two varied data sets that were gathered from real and emulated network traffic. The proposed strategy showed an accuracy of 97%, while emulated data sets came up with an accuracy of 94%. Another study [45] proposed an algorithm based on double Particle Swarm Optimization (PSO) to choose hyperparameters and feature subsets in a single process. They utilized deep Belief Networks (DBN), Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN), and Deep Neural Networks (DNN) models for the proposed algorithm.
Tripathi et al. [46] utilized grasshopper optimization algorithm (GOA). Their suggested method was IDS-based to discriminate between malicious and regular traffic. Multilayer perceptron, naïve Bayes, decision tree, and support vector machine determine the attack type. The CIC-IDS 2017 and KDD Cup 99 data sets were used to test their proposed method. Novel utilization of Genetic Algorithm (GA) alongside an immune algorithm was proposed in [47] to improve a computer's ability to detect intrusion. They carried out several simulations to verify the performance of their proposed method. The experiments conducted proved that Immune Genetic Algorithm (IGA) can enhance the system's capabilities to foretell the existence of an intrusion in the network.
The authors [48] used the M-AdaBoost-A algorithm for efficient intrusion detection in the network. They grouped several M-AdaBoost-A-based classifiers through several approaches such as PSO. Their proposed method came up with better performance than existing approaches among various classes in intrusion detection in the traditional enterprise and 802.11 wireless.
Another study [49] suggested a novel method for intrusion detection, namely SRDLM, based on deep learning and semantic re-encoding. Their proposed method enhances the classification abilities by utilizing deep learning while offering highly robust and accurate results. Their proposed method showed an accuracy of 99% when detecting the Web character injection network's attack. Injadat et al. [50] suggested multi-stage optimized Machine Learning-based NIDS that minimizes computations' complexity during intrusion detection. They compared correlation-based and information gain and studied their impact in terms of time and performance. Their proposed method was tested by using the UNSW-NB 2015 and CICIDS 2017 data sets.
A framework proposed in [51] used genetic algorithm (GA), firefly optimization (FFA), grey wolf optimizer (GWO), and particle swarm optimization (PSO). The framework was tested on UNSW-NB15 data set, J48 ML classifiers, and support vector machine (SVM) with promising results.
A study [52] proposed a novel framework based on DL and ML strategies to design NIDS. They studied the pros and cons of existing approaches to the suggested method. They aimed to facilitate researchers with advanced knowledge of AI-based NIDS and spotted possible obstacles for the proposed method.
Another study [53] proposed linear nearest neighbor lasso step (LNNLS-KH) to select features of intrusion detection. They implemented LNNLS-KH on renewed krill herd position to obtain the optimal global solution.
B et al. [54] applied wrapper and filter-based approaches using the firefly algorithm for feature selection. Classifiers of Bayesian Networks (BN) and C4.5 were applied to obtained features with a data set of KDD CUP 99. The preliminary outcomes proved that ten features are adequate for intrusion detection, providing enhanced accuracy.
A novel framework [40] named anomaly network-based IDS (A-NIDS) was proposed by utilizing AdaBoost and the artificial bee colony (ABC) algorithm to achieve low False Positive Rate (FPR) and high Detection Rate (DR). For the selection of features, they utilized ABC, and for classification and evaluation, they used the AdaBoost algorithm. The ISCX-IDS2012 and NSL-KDD data sets were used. Their framework showed promising results.
The authors of [55] proposed a method of intrusion detection for wireless networks using an improved convolutional neural network (ICNN). Firstly, they preprocessed and discriminated data and then model it using ICNN. The experimental results showed that their proposed method offers highly accurate results and valid favorable rates along with lower false-positive rates.
Another study [56] proposed an intrusion detection system based on anomaly to analyze and monitor network traffic flow towards a cloud system. For the classification of network traffic, they utilized an SVM classifier. To tune the SVM parameters, they used SPSO, and to select features of the network, they used binary-based Particle Swarm Optimization (BPSO). They used the NSL-KDD data set for the development and evaluation of their proposed method.
Salo et al. suggested a technique [57] of intrusion detection through principal component analysis (PCA) and information gain (IG). They utilized multi-layer perceptron (MLP), Instance-based learning algorithms (IBK), and support vector machine (SVM). They used Kyoto 2006+, NSL-KDD, and ISCX 2012 to test their model. Their proposed method came up with highly accurate results.
A study [58] investigated network security issues, intrusion detection, fault tolerance, access control, authentication, key management, and crucial security technologies. It considered various intrusion detection approaches and their importance in IoT. They also made a comparison between various techniques of intrusion detection.
A method proposed in [59] used XGBoost to select features through a deep neural network (DNN) in the process of intrusion detection. The proposed model comprises classification, selection of features, and normalization, while for testing, the NSL-KDD data set was used.
Another study [60] analyzed and compared current machine learning-based NIDSs while using the UGR'16 data set to solve issues of intrusion detection. The guidelines provided will be helpful for researchers in analyzing NIDSs. Their proposed model still needs improvements and efforts to enhance understandability.
Krich et al. used stochastic optimization (SPSA) [66] to compute airborne weights that are low-side lobe beam-forming. Their proposed method only depends on digitizing the beam of radar's sum without demanding antenna calibration. The proposed approach comprises low-priced computations and can be scaled conveniently to radars as it contains a considerable quantity of elements of the antenna.
One study [61] suggested a novel method of hybrid classification based on Artificial Fish Swarm (AFS) and Artificial Bee Colony (ABC) algorithms. The authors utilized Correlation-based Feature Selection (CFS) and Fuzzy C-Means Clustering (FCM) to filter unnecessary features and to classify training data sets and evaluated their proposed model on the UNSW-NB15 and NSL-KDD data sets.

Bat Algorithm
BA is a nature-inspired metaheuristic algorithm proposed by Yang [67]. It is based on the echolocation that microbats use when searching for prey and avoiding obstacles in darkness. Microbats emit loud pulses while searching for prey, and these pulses become quieter with increasing distance.
The natural behavior of microbats is formulated as an optimization method to obtain BA. Every bat adjusts its velocity and location relative to the microbats closest to the prey in order to find an accurate trajectory from its current location, as illustrated in Figure 1. To formalize these ideas, Reference [67] proposed the following rules for artificial bats.

1. Every microbat estimates the distance between its surroundings and prey by using echolocation.

2. A frequency within a fixed range is used to determine a microbat's velocity $v_i$ at location $x_i$, with varying loudness $A_0$ and varying wavelength $\lambda$, while searching for prey.

3. The pulse emission rate $r \in [0, 1]$ can be used to adjust the frequency of the pulses while estimating the distance between prey and microbat.

4. Loudness decreases from a large positive value $A_0$ to a smaller value $A_{min}$.
Conventional BA comprises the following six steps:

Step 1: The bat parameters and the population are initialized. The global optimization function is expressed as follows:

The following BA parameters are initialized:

1. $N$ denotes the number of artificial bat locations.

2. $F_{min}$ denotes the minimum frequency, while $F_{max}$ denotes the maximum frequency.

3. $v_j$ represents the velocity vector of each bat.

4. $A_j$ denotes the loudness rate.

5. $r_j$ denotes the pulse rate.

6. $r_j^0$ denotes the initial pulse rate of each bat.

7. $\epsilon$ denotes a bandwidth range.
Step 2: Bat population memory is initialized. BM is used to store the bat location vectors, which are generated randomly as expressed in Equation (2). BM then stores those solutions, with the objective function values arranged in ascending order.

Step 3: Regeneration of the current bat population. Three operators are used to reconstruct every bat's location: selection, diversification, and intensification. When the intensification operator is applied, a new location $x_j$ of the bat is created as follows:

When the diversification operator is applied, a local search strategy is used to create the new location of the bat. Consequently, the latest location $x_j$ is obtained as follows:

When the selection operator is applied, the current location of the bat is replaced with the new location, and $x_{Gbest}$ is updated in the case $f(x_j) < f(x_{Gbest})$.

Step 4: Stopping criteria. The previous step is iterated until the termination criteria are met. A pseudo code for standard BA is presented in Algorithm 1.
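The four steps above can be sketched in Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the update rules follow the commonly published form of the bat algorithm, the sign of the velocity update is chosen so that bats are pulled toward the current best, and the names `sphere` and `bat_algorithm`, the toy objective, and all default parameter values are hypothetical.

```python
import numpy as np


def sphere(x):
    """Toy objective (an assumption for illustration): minimum 0 at the origin."""
    return float(np.sum(x ** 2))


def bat_algorithm(f, dim=5, n_bats=20, n_iter=100, f_min=0.0, f_max=2.0,
                  lb=-5.0, ub=5.0, alpha=0.9, gamma=0.9, seed=0):
    """Minimise f with a basic bat algorithm; returns best x, best f, history."""
    rng = np.random.default_rng(seed)

    # Steps 1-2: initialise locations, velocities, loudness and pulse rates.
    x = rng.uniform(lb, ub, size=(n_bats, dim))
    v = np.zeros((n_bats, dim))
    loudness = np.ones(n_bats)                 # A_j
    r0 = rng.uniform(0.0, 1.0, size=n_bats)    # initial pulse rate r_j^0
    r = np.zeros(n_bats)                       # current pulse rate r_j
    fit = np.array([f(xj) for xj in x])
    g = int(np.argmin(fit))
    x_best, f_best = x[g].copy(), float(fit[g])
    history = [f_best]

    # Steps 3-4: regenerate the population until the iteration budget is spent.
    for t in range(1, n_iter + 1):
        for j in range(n_bats):
            # Intensification: frequency, velocity and location update.
            # Assumption: the difference is taken so bats move toward x_best.
            freq = f_min + (f_max - f_min) * rng.uniform()
            v[j] = v[j] + (x_best - x[j]) * freq
            cand = np.clip(x[j] + v[j], lb, ub)

            # Diversification: random walk around the global best.
            if rng.uniform() > r[j]:
                eps = rng.uniform(-1.0, 1.0, size=dim)
                cand = np.clip(x_best + eps * loudness.mean(), lb, ub)

            # Selection: accept an improving move with probability A_j,
            # then lower the loudness and raise the pulse rate.
            f_cand = f(cand)
            if rng.uniform() < loudness[j] and f_cand < fit[j]:
                x[j], fit[j] = cand, f_cand
                loudness[j] *= alpha
                r[j] = r0[j] * (1.0 - np.exp(-gamma * t))
            if f_cand < f_best:
                x_best, f_best = cand.copy(), f_cand
        history.append(f_best)
    return x_best, f_best, history
```

Calling `bat_algorithm(sphere)` returns the best location found, its objective value, and the best value after each iteration; the history can only decrease or stay flat because the global best is replaced only by a strictly better candidate.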

Algorithm 1
Pseudo code for standard BA.
    Sensor with rich RSSI value
    for j = 1 to N do
        for i = 1 to d do
            ...
    for j = 1 to N do
        ...
        for i = 1 to d do
            ...
        end for
        ...
        if U(0, 1) > r_j then
            for i = 1 to d do
                x_j^i = x_best^i + ε A_j
            end for
        end if
        if ...
            ...
            A_j = α A_j
            ...

An artificial neural network comprises interconnected processing units that operate in parallel and map inputs to the required outputs. The output of an artificial neuron is given by Equation (12). The DNN model consists of the following layers: an input layer, hidden layers, and an output layer.
Every neuron receives a signal from its environment. A weight $w_{ij}$ is linked with each input signal $x_i$, and the output signal $y_i$ is then computed by the neuron. In Equation (12), the node's output is denoted by $y_i$, the node's $i$th input by $x_i$, the weight between the input and the node by $w_{ij}$, the node's bias by $\beta_i$, and the node's activation function by $f_i$. Generally, the activation function is nonlinear, such as a Gaussian, sigmoid, or Heaviside function. The output layer comprises a single neuron for each class.
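Read together, the symbol definitions above describe the usual weighted-sum form of a neuron. A reconstruction under that assumption is given below; the summation index $j$ over the $n$ inputs of node $i$ is our notation and is not taken from the text.

$$y_i = f_i\left(\sum_{j=1}^{n} w_{ij}\, x_j + \beta_i\right)$$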
To train neural networks with a meta-heuristic algorithm, three approaches are available. In the first, the algorithm searches for the combination of biases and weights that yields the lowest mean squared error (MSE). In the second, the algorithm is used to find an appropriate network structure for a given problem. In the third, the meta-heuristic adjusts the parameters of a gradient-based learning algorithm, such as momentum and learning rate. Figure 2 demonstrates the DNN with one hidden layer. For swarm and evolutionary-based algorithms, the adaptation process is expressed through an appropriate representation of the DNN weights, the fitness function, and the termination condition.
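The first approach, in which a swarm member encodes all weights and biases and is scored by the MSE, can be illustrated with a short Python sketch. It assumes one sigmoid hidden layer and a softmax output with one neuron per class; the function names, the flat parameter layout, and the activation choices are hypothetical and are not taken from the paper.

```python
import numpy as np


def n_params(n_in, n_hidden, n_out):
    """Length of the flat vector holding all weights and biases."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def unpack(theta, n_in, n_hidden, n_out):
    """Split a flat parameter vector into layer weights and biases."""
    i = 0
    w1 = theta[i:i + n_in * n_hidden].reshape(n_in, n_hidden)
    i += n_in * n_hidden
    b1 = theta[i:i + n_hidden]
    i += n_hidden
    w2 = theta[i:i + n_hidden * n_out].reshape(n_hidden, n_out)
    i += n_hidden * n_out
    b2 = theta[i:i + n_out]
    return w1, b1, w2, b2


def forward(theta, X, n_in, n_hidden, n_out):
    """Forward pass: sigmoid hidden layer, softmax output layer."""
    w1, b1, w2, b2 = unpack(theta, n_in, n_hidden, n_out)
    h = 1.0 / (1.0 + np.exp(-(X @ w1 + b1)))
    z = h @ w2 + b2
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def mse_fitness(theta, X, Y, n_in, n_hidden, n_out):
    """Fitness of one swarm member: MSE against one-hot targets Y."""
    return float(np.mean((forward(theta, X, n_in, n_hidden, n_out) - Y) ** 2))
```

A metaheuristic then treats `theta` as the location of a bat or particle and minimises `mse_fitness`, so no gradient of the network is required.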

Local-Global Best Bat Algorithm (LGBA-NN)
Instant detection increases network security by speeding up alerts and by disconnecting compromised IoT devices from the network, preventing the botnet from spreading and launching further attacks. In related work, several metaheuristic methods for detecting botnet attacks have been developed. The major problems of the metaheuristic algorithms used for botnet attack detection are premature convergence and low diversity.
A lack of diversity is usually the cause of premature convergence. Diversity measures the variety of distinct solutions in the population, i.e., the distance between alternative solutions. Over the last decade, researchers have proposed new variants with additional features, but these variants share the same advantage, namely the ability to reach the global optimum, and the same disadvantages, namely premature convergence and low diversity. To overcome this issue, we introduce three modifications in the proposed LGBA-NN before using it as an optimizer in the neural network.
The first modification is a robust population initialization of bats through a Gaussian distribution, which gives the bat solutions enough diversity to generate robust offspring. Secondly, the proposed BA adopts the local-global best-based inertia weight to update the velocity of every bat in the swarm. Lastly, a local search mechanism is proposed that follows the Gaussian density function and the local-global best function to achieve better exploration during each generation.

Gaussian Distribution
A Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is as follows:

$$f(x) = \frac{1}{\rho\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\sigma}{\rho}\right)^{2}\right)$$

In the above equation, the mean or expectation of the distribution (and also its mode and median) is denoted by $\sigma$, while the standard deviation is denoted by $\rho$. The variance of the distribution is $\rho^2$. A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate.
A Gaussian function is as follows:

$$f(x) = a \exp\left(-\frac{(x-b)^{2}}{2c^{2}}\right)$$

for arbitrary real constants $a$ and $b$ and nonzero $c$. The graph of a Gaussian has a distinctive symmetric "bell curve" shape. The parameter $a$ controls the height of the curve's peak, the parameter $b$ controls the position of the peak's centre, and the parameter $c$ (the standard deviation, also known as the Gaussian Root Mean Square (RMS) width) controls the width of the "bell".
Composing an exponential function with a concave quadratic function yields a Gaussian function:

$$f(x) = \exp\left(\alpha x^{2} + \beta x + \gamma\right)$$

where $\alpha = -\frac{1}{2c^{2}}$, $\beta = \frac{b}{c^{2}}$, and $\gamma = \ln a - \frac{b^{2}}{2c^{2}}$.

LGBA-NN Steps
The proposed architecture of LGBA-NN is presented in Figure 3. The following are the steps adopted in the proposed LGBA-NN.

1. Initialize the bat population using the Gaussian distribution over the interval [0, 1]. The Gaussian density function ensures diverse locations for the bats in the multidimensional search space. If any two corresponding vectors obtain the same initial solution, the Gaussian density function produces a diversity effect in the second generation. The following equation can be used to initialize the population using the Gaussian distribution.
where $x_i^G$ represents the initial location that each individual obtains using the Gaussian distribution over the jth dimension. $LB_i$ indicates the lower bound, which is set to 0, and $UB_i$ indicates the upper bound, which has a maximum value of 1. Gauss(0, 1) represents the Gaussian density function used to generate random numbers following a Gaussian distribution over the interval $[LB_i, UB_i]$. Normal distribution starts with a random guess if the bats have any prior knowledge about the solution. However, Gaussian distribution disregards previous experience about the solution. This tends to produce rich diversity in the initial solutions of the population.

2. The second modification introduces the local-global best inertia weight to accelerate each bat's velocity during the exploration process. In standard BA, the velocity of each bat is updated without an inertia weight. In general, a large inertia weight is recommended in the initial stages of the search process to improve global exploration (searching for new areas), whereas the inertia weight is decreased in the later stages for local exploration (fine-tuning the current search area). The proposed local-global best inertia weight is obtained from the difference between the local best solution of the current swarm and the global best solution of all swarms in the multidimensional search space. This keeps the convergence process robust in terms of exploration: if the current generation obtains two identical fitness values, the proposed inertia weight uses both solutions to decide the bat's next location towards the global optimum. The local-global best inertia weight is defined using the following equation.
where $x_j^i$ indicates the local best solution of the current swarm and $x_{Gbest}^i$ expresses the global best solution of all the swarms in the population.
LGBA-NN updates each bat's velocity through the following equation.

3. In the third modification, we updated the local search mechanism, which uses the Gaussian density function and the local-global best function to achieve better exploration during each generation. During the local search process, each bat's current fitness is evaluated and compared with its previous fitness; however, in standard BA, the local search mechanism loses the local and global best solutions of the last generation. If any bat in the previous generation has a global fitness greater than the global fitness of the bats in the next generation, then the algorithm needs to retain the last global best fitness. We tackled this issue by introducing the Gaussian density function and the local-global best function, which retains the difference between the previous local and global best.
LGBA checks the difference between the local and global best fitness, checks whether it is smaller than the current fitness, and retains the best solution.
LGBA uses the following equation to update the local search.
where Gauss(0, 1) represents the accelerated sequence generated through the Gaussian distribution, $v_i^{LG}$ indicates the local-global best function retaining the difference between the previous local and global best, and 0.001 is the scaling factor used to balance exploitation during the local search. The pseudo code for the proposed LGBA-NN is given in Algorithm 2.

4. In the last modification, LGBA is further employed in the neural network for hyperparameter tuning and weight optimization to classify ten different botnet attacks plus one benign target class. A weight $w_{ij}$ is linked with each input signal $x_i$ following the optimal solution $x_i^{LG}$. The output signal $y_i$ is then fed to the classifier for the prediction. For the weight optimization of LGBA-NN, we used the following equation.
$w_{ij}^{LG}$ denotes the optimal weights obtained by LGBA, which can further be used in the DNN as follows:

For hyperparameter optimization, LGBA-NN used $x_i^{LG}$ as an optimizer instead of other optimizers such as Adam and SGD.

Algorithm 2
Pseudo code for the proposed LGBA-NN.
    ...
        for j = 1 to N do
            F_j = F_min + (F_max - F_min) * U(0, 1)
            for i = 1 to d do
                Use Equation (18) to update velocity of each bat
            end for
            ...
            if U(0, 1) > r_j then
                for i = 1 to d do
                    Use Equation (19) to obtain local best solution.
                end for
            end if
            if ...
                ...
            end if
        end for
        Update x_Gbest, Gbest ∈ (1, 2, ..., N)
    end while
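The three optimizer modifications described in this section can be illustrated with small Python helpers. Because the update equations themselves are not reproduced in the text above, every formula below is an assumption made for illustration and should not be read as the authors' equations: the mapping of Gauss(0, 1) draws onto [LB, UB], the linearly decreasing inertia weight (a common stand-in that only mirrors the "large early, small late" behaviour, swapped in for the local-global best inertia weight), and the way the 0.001 scaling factor, the Gaussian draw, and the local-global difference are combined. All function names are hypothetical.

```python
import numpy as np


def gaussian_init(n_bats, dim, lb=0.0, ub=1.0, rng=None):
    """Gaussian population initialisation inside [lb, ub] (assumed mapping)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.normal(0.0, 1.0, size=(n_bats, dim))   # Gauss(0, 1) draws
    # Assumption: +/- 3 standard deviations span the unit interval; clip the rest.
    u = np.clip(0.5 + z / 6.0, 0.0, 1.0)
    return lb + (ub - lb) * u


def inertia_weight(t, n_iter, w_max=0.9, w_min=0.4):
    """Stand-in inertia weight: decreases linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * t / n_iter


def update_velocity(v, x, x_gbest, freq, w):
    """Inertia-weighted velocity update (assumed form): w * v + (x_gbest - x) * F."""
    return w * v + (x_gbest - x) * freq


def local_step(x_gbest, x_lbest, rng, scale=0.001):
    """Local search around the global best (assumed form).

    v_lg is the difference between the local and global best; it is
    perturbed by a Gauss(0, 1) draw and damped by the 0.001 scaling factor.
    """
    v_lg = x_lbest - x_gbest
    return x_gbest + scale * rng.normal(0.0, 1.0, size=x_gbest.shape) * v_lg
```

In a full optimizer these helpers would replace the uniform initialisation, the plain velocity update, and the random walk of a standard bat algorithm loop, with each candidate scored by the network's fitness.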

Data Set Description
The proposed LGBA-NN was tested on the N-BaIoT data set, which contains extensive real traffic data with benign and malicious target classes. The N-BaIoT data set addresses the lack of publicly available botnet data sets, specifically for IoT technology. It contains real traffic data from 9 commercial IoT devices that have been infected with Gafgyt and Mirai. Gafgyt is considered one of the most well-known IoT botnets, and its script and activities have therefore been replicated in other IoT malware. The botnet attacks IoT devices running Linux by brute-forcing the devices' default credentials, using open Telnet ports to launch the attack. N-BaIoT is a multivariate sequential data set with a total of 115 real-valued features and is publicly available at [68]. A description of the data set, with the target classes and the number of instances in each class, is given in Table 2.

Evaluation Metrics
We used four performance measures to evaluate the performance of LGBA-NN. These evaluation metrics include precision, recall, f1-score, and support.
Recall is calculated by dividing the number of true positives by the sum of false negatives and true positives. True positives are instances that LGBA-NN classifies as positive and that are actually positive, while false negatives are instances that LGBA-NN classifies as negative but that are actually positive. Recall can be defined as follows:

$$\text{Recall} = \frac{\text{True Positives}}{\text{False Negatives} + \text{True Positives}} \qquad (22)$$

Precision is calculated by dividing the number of true positives by the sum of false positives and true positives. False positives are instances that LGBA-NN classifies as positive but that are actually negative. Precision can be defined as follows:

$$\text{Precision} = \frac{\text{True Positives}}{\text{False Positives} + \text{True Positives}}$$

The F1 score is the harmonic mean of precision and recall:

$$\text{F1} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Support is the number of true instances of each class.
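For a multi-class problem these measures are computed one class at a time, treating the class in question as positive and all others as negative. The following Python sketch shows that calculation; the function name `per_class_report` and the class labels in the usage note are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
import numpy as np


def per_class_report(y_true, y_pred, labels):
    """Precision, recall, F1 and support for each class (one-vs-rest)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    report = {}
    for c in labels:
        tp = int(np.sum((y_pred == c) & (y_true == c)))
        fp = int(np.sum((y_pred == c) & (y_true != c)))
        fn = int(np.sum((y_pred != c) & (y_true == c)))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": tp + fn}
    return report
```

For example, with true labels `["benign", "benign", "mirai", "mirai", "gafgyt"]` and predictions `["benign", "mirai", "mirai", "mirai", "benign"]`, the `"mirai"` class has two true positives, one false positive, and no false negatives, which gives a precision of 2/3, a recall of 1, and a support of 2.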

Results and Discussion
In deep learning applications, detectors are usually trained to classify based on expert labels. LGBA-NN, on the other hand, has been trained to identify unusual behavior. As a result, LGBA-NN can detect previously unknown botnet behaviors, which is critical given the ever-evolving variants that render most detection methods obsolete. The IoT domain is considerably more complicated than traditional computing environments.
LGBA-NN tackles the growing complexity of smart nodes by evaluating each system against other botnet attack target groups. In the enterprise scenario, the traffic data of all linked hosts are supposed to be tracked. Nonetheless, the volume of monitored traffic is too high to store and use for in-depth neural network training. To remove features, LGBA-NN employs systematic statistics.
LGBA-NN can be trained remotely. As a result, learning remains practical and storage is not a concern. Furthermore, since LGBA-NN is network-based, it consumes no computation or memory resources of the typically constrained IoT devices and therefore does not negatively impact their operation.
To the best of our knowledge, optimized BA has not previously been applied to IoT network traffic for detecting ten botnet attacks. Therefore, few other variants are available for a fair comparison with the current state of the art. Furthermore, optimized BA has not been used as a highly autonomous, independent malware detector in the broader domain of network traffic; instead, it has served as an intermediate component for object training or feature extraction, or as a semi-manual anomaly indicator that relies heavily on human labeling for further examination by intelligence analysts.
We divided the experimental configuration into five phases. Firstly, the experiments were conducted with a neural network without hyperparameter optimization; secondly, we added a Gaussian noise layer to the neural network to check the effect of Gaussian noise on the classifier's performance. A performance evaluation of the neural network without hyperparameter optimization using ten attack types and one benign target class is given in Table 3. Similarly, a performance evaluation of the neural network without hyperparameter optimization and with Gaussian noise addition using ten attack types and one benign target class is presented in Table 4.
In the third phase, the Gaussian noise layer was removed and a dropout layer was added to the neural network for further analysis of its behavior on the multiclass problem. Table 5 presents the results obtained using the neural network without hyper-parameter optimization and with layer dropout using ten attack types and one benign target class. After that, we combined both the Gaussian noise and dropout layers without hyper-parameter optimization. Table 6 shows comparative results of the neural network without hyper-parameter optimization and with neural network layer dropout and Gaussian noise addition using ten attack types and one benign target class. Lastly, we evaluated LGBA-NN with hyper-parameter optimization. Since many other NN optimization methods have been proposed in related work, we included other recent NN optimizers, namely BA-NN and PSO-NN, for a more thorough comparison. The performance evaluations of BA-NN and PSO-NN using 10 attack types and 1 benign target class are presented in Tables 7 and 8. A comparative evaluation of the proposed LGBA-NN using ten attack types and one benign target class is given in Table 9. In terms of true positives, false negatives, true negatives, and false positives, LGBA-NN performed best for the most part. The ability of deep architectures to learn variational structure representations and to estimate complex functions is likely to be the reason for this. The confusion matrix obtained through the neural network without hyperparameter optimization and with Gaussian noise addition using ten attack types and one benign target class is presented in Figure 4a.

Table 4. Performance evaluation of the neural network without hyperparameter optimization and with Gaussian noise addition using 10 attack types and 1 benign target class.
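The Gaussian noise and dropout layers used in the configurations above both act only during training and pass their input through unchanged at inference time. The following NumPy sketch illustrates that behaviour; the function names, the noise standard deviation, the dropout rate, and the use of inverted dropout scaling are assumptions made for illustration, since the paper's layer settings are not stated here.

```python
import numpy as np


def gaussian_noise(x, stddev, rng, training=True):
    """Add zero-mean Gaussian noise to activations during training only."""
    if not training:
        return x
    return x + rng.normal(0.0, stddev, size=x.shape)


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return x
    keep = rng.uniform(size=x.shape) >= rate
    # Rescaling keeps the expected activation unchanged between
    # training and inference.
    return x * keep / (1.0 - rate)
```

Combining both regularizers, as in the fourth configuration, amounts to applying `gaussian_noise` to the inputs and `dropout` to the hidden activations within the same forward pass.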
LGBA-NN can minimize the loss caused by initiated attacks if the classification of attack-related abnormalities automatically and directly triggers the exclusion of the compromised IoT system from the network. Variation in loss during the training of the neural network without hyperparameter optimization is visualized in Figure 5a.
Furthermore, LGBA-NN cannot learn a trivial identity mapping because of the restrictions imposed by the feature space in the hidden layers. As a result, LGBA-NN fits common characteristics better than unique ones. This is advantageous for IoT devices because their functionality is usually task-oriented, which translates into a few standard traffic patterns. Variation in loss during the training of the neural network without hyperparameter optimization and with dropout, and during the training of LGBA-NN, is visualized in Figure 5b,c. The confusion matrices obtained through the neural network without hyperparameter optimization and with dropout and Gaussian noise addition, and through LGBA-NN, using ten attack types and one benign target class are illustrated in Table 9 and Figure 4c.

Analysis
The uniformity of each type of botnet attack activity can be related to the performance measures. An integrated platform with a high degree of traffic predictability can highlight any unusual behavior, increasing recall while decreasing precision. For empirical validation, we extracted static and dynamic attributes from the training set and used the neural network (NN) to examine the impact of these attributes on the average recall and precision obtained by the five NN configurations on the test set.
From Table 3, it can be observed that the neural network achieves a high precision rate of 0.9957 and a high recall rate of 0.9724 for the benign class. However, regarding the identification of botnet attacks, the neural network fails to deliver a low false-positive rate. The standard CNN without hyperparameter optimization obtained a low precision rate, which means the classifier identified both types of botnet attacks with ten target classes as benign instances at a higher rate. Similarly, the higher mean recall rate recorded for the neural network without hyperparameter optimization shows the ability of the neural network to classify benign samples as malicious botnet attacks, hence decreasing the false-negative rate.
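For reference, per-class precision and recall of the kind reported in the tables can be derived from a multi-class confusion matrix as sketched below. This is a minimal illustration, not the evaluation code used in the experiments: the function names and the toy three-class matrix are assumptions, and so is the convention of reporting 0.0 when a class is never predicted.

```python
def precision_recall(cm):
    """Return a list of (precision, recall) pairs, one per class.

    cm[i][j] counts samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k predicted as something else
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0  # assumed convention for 0/0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results.append((precision, recall))
    return results

def macro_average(results):
    """Unweighted mean of per-class precision and recall."""
    n = len(results)
    return (sum(p for p, _ in results) / n,
            sum(r for _, r in results) / n)

# Toy matrix (made-up counts): class 2 is always mistaken for class 1.
toy = [
    [90, 10, 0],
    [0, 100, 0],
    [0, 100, 0],
]
per_class = precision_recall(toy)
print(per_class, macro_average(per_class))
```

In the toy matrix, the class that is never predicted receives a precision and recall of 0.0, while the class that absorbs its samples keeps a recall of 1.0 but loses precision. A pattern of this kind is one possible way for a classifier to show near-perfect recall together with low precision on one class and zeros on another.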
Adding a Gaussian noise layer to the neural network without hyperparameter optimization using LGBA slightly improved the average precision rate (see Table 4). However, the average recall rate is the same as that of the neural network without the Gaussian noise layer. This variant obtained a maximum precision rate of 0.9974 and a minimum recall rate of 0.3173 for the "Mirai ack" target class. In contrast, the neural network with a Gaussian noise layer failed to achieve a low false-positive rate for the "Gafgyt udp" target class, with a minimum precision rate of 0.5089 and a maximum recall rate of 0.9990.
Similarly, removing the Gaussian noise layer and adding a dropout layer to the neural network without hyperparameter optimization using LGBA increased the average precision rate only marginally (see Table 5). However, the average recall rate is identical to that of the neural network with the Gaussian noise layer. This modification achieved a maximum precision rate of 1.00 and a maximum recall rate of 0.9990 for the "Mirai syn" target class. This indicates that the false-positive and false-negative rates are approximately 0 for that particular target class. In contrast, adding a dropout layer failed to achieve a low false-positive rate for the "Gafgyt udp" and "Gafgyt combo" target classes, with a minimum precision rate of 0.00 and a minimum recall rate of 0.00.
From Table 6, it can be observed that the neural network without hyperparameter optimization and with both dropout and Gaussian noise addition achieves a maximum precision rate of 0.9994 and a recall rate of 1.00 for the "Mirai scan" class. Nevertheless, for "Gafgyt udp" and "Gafgyt combo," the variant fails to deliver a low false-positive rate.
From Table 7, it can be seen that BA-NN achieves a higher average recall rate of 0.855931 and a lower average precision rate of 0.820524. However, BA-NN failed to obtain a low false-positive rate for the "Gafgyt tcp" target class, with a minimum precision rate of 0.00 and a minimum recall rate of 0.00. Referring to Table 8, PSO-NN obtained a high precision rate for all target classes except "Gafgyt tcp" and "Gafgyt udp," with 0.498 and 0.00, respectively. Nevertheless, both BA-NN and PSO-NN achieved higher accuracy than the non-optimized configurations.
The proposed LGBA-NN (see Table 9) detected each botnet attack class well except the "Gafgyt tcp" class. We can observe that LGBA-NN obtained a low false-negative rate, as it misclassified only 10% of its negative instances. However, it still had a 15% false-positive rate, indicating the complexity of multi-class dimensionality. The proposed LGBA-NN achieved a maximum precision rate of 0.998969 and a maximum recall rate of 0.9989 for the "Mirai udpplain" target class. Compared to all other neural networks, LGBA-NN shows the highest accuracy, 90%, with the lowest misclassification rate for all target classes. The loss curves and confusion matrix presented in Figures 4 and 5c also confirm the superiority of the proposed LGBA-NN over the other variants of the neural network, as LGBA-NN (90% accuracy) outperformed BA-NN (85.5% accuracy) and PSO-NN (85.2% accuracy).
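The false-positive rate, false-negative rate, and accuracy mentioned above can be related to a multi-class confusion matrix by treating one class as positive and all others as negative (one-vs-rest). The sketch below shows this under assumed names and a made-up three-class matrix; it does not reproduce the reported figures.

```python
def one_vs_rest_rates(cm, k):
    """Return (false-positive rate, false-negative rate) for class k.

    cm[i][j] counts samples whose true class is i and whose
    predicted class is j; class k is positive, all others negative.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(cm[i][k] for i in range(n)) - tp
    tn = total - tp - fn - fp
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fpr, fnr

def accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total

# Toy matrix with made-up counts.
toy_cm = [
    [45, 5, 0],
    [10, 40, 0],
    [0, 0, 50],
]
print(one_vs_rest_rates(toy_cm, 0), accuracy(toy_cm))
```

Accuracy is one minus the overall misclassification rate, whereas the false-positive and false-negative rates are defined per class, so a model can reach high accuracy while individual classes still show noticeable error rates.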

Conclusions
To reduce the risk associated with IoT devices, it is essential to identify DDoS attacks early. Early identification of DDoS attacks improves network security by speeding up the disconnection of compromised IoT devices from the network, which prevents the botnet from spreading and stops additional attacks. In this research, LGBA-NN was proposed to select both feature subsets and hyperparameters for efficient botnet detection based on data from 9 commercial IoT devices that were authentically infected by two botnets: Gafgyt and Mirai. The proposed BA uses a local-global best-based inertia weight to update the velocity of the bats in the swarm. To address the swarm diversity of BA, we proposed using a Gaussian distribution in the population initialization. Furthermore, the local search mechanism was enhanced by the Gaussian density function and the local-global best function to achieve better exploration during each generation. The proposed LGBA-NN was tested on the N-BaIoT data set, which includes a large amount of real traffic data for both benign and malicious target classes. We evaluated the proposed LGBA-NN by comparing its performance with several configurations of a non-optimized neural network and with recent variants of optimized neural networks, namely PSO-NN and BA-NN. The experimental results showed that LGBA-NN, with 90% accuracy, outperformed BA-NN (85.5% accuracy) and PSO-NN (85.2% accuracy) in multi-class botnet attack detection.
In future work, we intend to extend the optimization of neural networks using the bat algorithm to other evolutionary models such as the differential evolution algorithm, the genetic algorithm, and particle swarm optimization.