Modeling of Botnet Detection Using Barnacles Mating Optimizer with Machine Learning Model for Internet of Things Environment

Abstract: Owing to the development and expansion of energy-aware sensing devices and autonomous and intelligent systems, the Internet of Things (IoT) has gained remarkable growth and found uses in several day-to-day applications. However, IoT devices are highly prone to botnet attacks. To mitigate this threat, a lightweight and anomaly-based detection mechanism that can create profiles for malicious and normal actions on IoT networks could be developed. Additionally, the massive volume of data generated by IoT gadgets could be analyzed by machine learning (ML) methods. Recently, several deep learning (DL)-related mechanisms have been modeled to detect attacks on the IoT. This article designs a botnet detection model using the barnacles mating optimizer with machine learning (BND-BMOML) for the IoT environment. The presented BND-BMOML model focuses on the identification and recognition of botnets in the IoT environment. To accomplish this, the BND-BMOML model initially follows a data standardization approach. In the presented BND-BMOML model, the BMO algorithm is employed to select a useful set of features. For botnet detection, the BND-BMOML model in this study employs an Elman neural network (ENN) model. Finally, the presented BND-BMOML model uses a chicken swarm optimization (CSO) algorithm for the parameter tuning process, demonstrating the novelty of the work. The BND-BMOML method was experimentally validated using a benchmark dataset and the outcomes indicated significant improvements in performance over existing methods.


Introduction
The Internet of Things (IoT) refers to an interconnected network of software, devices, actuators, sensors, and so on that exchange and store information. The advantages of the IoT include the flow of information, automation, and communication with less effort and time [1]. In the IoT structure, physical gadgets gain capacities for organization and management by being smart gadgets, and such gadgets can become a vital part of human life, ranging from the home to large institutional and industrial fields. The IoT brings great innovation to lives by enabling indirect transmission among gadgets and individuals, which also makes it susceptible to various cyber scams [2]. For the IoT, numerous security solutions have been devised, such as prevention, authentication, and detection. Using machine learning (ML) techniques with the IoT might resolve issues regarding privacy and security. Nowadays, it has become crucial to determine where automated techniques for rapid decision making should be run: in the fog, the cloud, or the thin layer [3]. However, when all ML decisions are performed in the cloud, the IoT decision-making process is delayed. In the other layers, namely the fog or thin layer, it is difficult to apply ML solutions because of inadequate resources, namely energy, bandwidth, and processing [4].
Several research scholars are trying to defend against botnet assaults on the IoT environment. However, several gaps remain in the formulation of an effective detection mechanism. To deal with these attacks, an intrusion detection system (IDS) is one effective method [5]. However, conventional IDSs often cannot be deployed in IoT settings because of such problems. Likewise, complicated cryptographic systems cannot be embedded in many IoT gadgets for the same reasons [6]. There are generally two types of IDSs: misuse and anomaly techniques. The misuse-related techniques, also termed signature-related techniques, depend on the signatures of attacks and are found in many public IDSs.
Existing research notes that deep learning (DL) approaches can detect IoT assaults more efficiently than conventional ML techniques [7]. However, only the cloud layer has the resources for running such techniques. Moreover, such approaches are not always viable in certain situations, such as remote live operation, where the mechanism is supposed to make realistic decisions quickly [8]. Preceding work on IoT assaults has shown that an ML approach, such as the support vector machine (SVM), could offer meaningful outcomes when it is linked with an optimization algorithm or a feature reduction or extraction method [9]. However, this amalgamation of methods fails to address the low-resource requirement. ML approaches, such as K-nearest neighbors (KNN), decision trees (DTs), naïve Bayes (NB), and others, are tremendously useful for applications such as non-interactive or offline predictions on small datasets [10]. Alongside these, DL plays a significant role in the medical sector in maintaining security against numerous kinds of assaults and shedding light on the well-known ransomware attacks.
In this study, we designed a botnet detection model using barnacles mating optimizer with machine learning (BND-BMOML) for the IoT environment. For data normalization, the proposed model uses the Z-score normalization technique. Furthermore, the BMO algorithm is employed to select a useful set of features. Finally, chicken swarm optimization (CSO) with the Elman neural network (ENN) model is used for botnet detection. The experimental validation of the BND-BMOML model was carried out using a benchmark dataset.

Literature Review
In Vinayakumar et al. [11], a new botnet detection technique was developed on the basis of two-stage DL architecture to semantically distinguish botnet and legitimate performances at the application layer of the domain name system (DNS) service. Initially, the similarity measure of DNS queries is evaluated through a Siamese network based on predetermined thresholds for choosing the commonest DNS data across Ethernet connections. Next, a domain generation model related to DL framework is recommended for classifying normal and abnormal domain names. In [12], the presented method aims to recognize IoT botnet attacks initiated from compromised IoT gadgets by using the efficacy of the new grey wolf optimizer (GWO) algorithm to discover the features that better define IoT botnet complexity and simultaneously improve the hyperparameter of the one class support vector machine (OCSVM). Popoola et al. [13] developed a robust DL-oriented botnet attack detection technique that could manage extremely imbalanced network traffic datasets. In particular, the synthetic minority oversampling technique (SMOTE) produces further minority samples to achieve class balance, whereas DRNN learns hierarchical feature representation from the balanced network traffic dataset to implement discriminatory classification.
Sriram et al. [14] developed a DL-oriented botnet framework that functions on network traffic flow. The presented method gathers network traffic flow, transforms it to connection records, and applies a DL algorithm to identify assaults originating from the compromised IoT gadgets. Habib et al. [15] developed a detection system based on multi-objective particle swarm optimization (MOPSO) to recognize malicious behavior in IoT network traffic. The efficiency of MOPSO can be confirmed in contrast to filter-based feature selection methods, conventional ML algorithms, and the multi-objective non-dominating sorting genetic algorithm (NSGA-II). Wu et al. [16] designed a common architecture based on the deep reinforcement learning (DRL) technique that efficiently produces adversarial traffic flows to deceive the detection technique by automatically adding perturbation to the sample. During the entire process, the target detector is considered a black box and to be closer to real-time attacks. An RL agent is armed to upgrade the adversarial instances by merging the feedback from the target models (malicious or benign) and the series of activities and is capable of changing the spatial and temporal features of traffic flows when preserving the executability and original functionality.
The author of [17] addresses the IoT cybersecurity threat in smart cities and develops an anomaly detection-IoT (AD-IoT) technique using a smart anomaly detection-based random forest approach. The presented technique could efficiently identify compromised IoT devices at distributed fog nodes. McDermott et al. [18] developed a solution to the recognition of botnet activity within networks and consumer IoT gadgets. A new application of the DL technique was utilized to develop a detection method related to the bidirectional long short term memory (Bi-LSTM)-based recurrent neural network (RNN). Then, word embeddings were used for recognition of attack packets and text conversion into tokenized integer format. The proposed technique was compared with the LSTM-RNN in identifying four attack vectors utilized by the Mirai botnet and the loss and accuracy were estimated.

The Proposed Model
In this article, a new BND-BMOML algorithm was developed for the identification and recognition of botnets in the IoT environment. To accomplish this, the BND-BMOML model initially follows a data standardization approach. In the presented BND-BMOML model, the BMO algorithm is employed to select a useful set of features. For botnet detection, the BND-BMOML technique employs the CSO with ENN model in this study. Figure 1 demonstrates the block diagram of the BND-BMOML system.

Data Standardization
The data standardization procedure (DSP) is a crucial stage in data preprocessing, used primarily for feature scaling so that all features lie on similar scales and are equally significant. The DSP makes the data easier for the ML algorithm to process. In this study, Z-score normalization was used, where each feature is rescaled so that its mean is 0 and its standard deviation is 1. The Z-score normalization was employed as follows:

$$ z = \frac{x - \mu}{\sigma} $$

where $x$ is the original feature value, $\mu$ is the mean of the feature, and $\sigma$ is its standard deviation. The Z-score normalization is effective for different optimization approaches and, specifically, gradient descent (GD), which is widely applied by ML algorithms. The aim of standardization is to enhance the performance of ML algorithms and avoid or mitigate bias in ML classification.
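The Z-score rescaling described above can be illustrated with a minimal sketch. The function name `zscore_columns` and the sample values are hypothetical, and the population standard deviation (dividing by the number of samples) is an assumption, since the text does not say which estimator was used.

```python
import math

def zscore_columns(rows):
    """Standardize each column of a list of feature rows to mean 0 and std 1."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    # Assumption: population standard deviation (divide by n, not n - 1).
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(n_features)
    ]
    # A constant column has std 0; dividing by 1.0 instead maps it to zeros.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [
        [(r[j] - means[j]) / stds[j] for j in range(n_features)]
        for r in rows
    ]

# Hypothetical traffic features on very different scales.
sample = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
scaled = zscore_columns(sample)
```

After scaling, both columns carry the same values even though the raw features differed by a factor of 200, which is the point of putting features on a common scale before training.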

Feature Selection Using BMO Algorithm
At this stage, the BMO algorithm is employed to select a useful set of features from the preprocessed data, thereby increasing accuracy and reducing computational complexity [19]. Barnacles are often found permanently attached to solid substances, such as ships, rocks, sea turtles, and corals. Barnacles are hermaphroditic organisms that have male and female reproduction systems, and a distinctive feature of barnacles is their penis size, which can stretch to seven or eight times the length of their body. The barnacle mating takes place through sperm-cast and normal copulation. In the mating of isolated barnacles, sperm-cast takes place. This can be performed by discharging the fertilized eggs into water. Such behaviors of barnacles in releasing novel offspring provide insights into the use of BMO for resolving optimization problems. Like other evolutionary approaches, such as the genetic algorithm (GA), BMO selects parents to be mated for producing novel offspring. However, the selection process differs from the GA and does not utilize familiar selection techniques, such as tournament or roulette wheel selection. The selection procedure for barnacles that mate follows these rules:
• Although barnacles are recognized as hermaphroditic organisms, female barnacles can be fertilized by one or more male barnacles, where all the barnacles are mated with each other to prevent complications;
• The value of pl should be initially set by the user, and the selection of barnacle parents is performed randomly. The value of pl is the control variable of the algorithm that can be tuned to attain better optimization outcomes, along with the maximum number of iterations and the number of barnacles;
• The Hardy-Weinberg principle is used when the selected barnacle parents lie within the range of pl. Otherwise, the sperm-cast is applied to produce novel offspring.
The generation of novel offspring is guided by the Hardy-Weinberg principle as follows:

$$ x_i^{N\_new} = p\,x^{N}_{barnacle\_m} + q\,x^{N}_{barnacle\_d}, \quad k \le pl \tag{4} $$

$$ x_i^{N\_new} = rand() \times x^{N}_{barnacle\_m}, \quad k > pl $$

where $k = |barnacle\_m - barnacle\_d|$, $p$ is a uniformly distributed random number in [0, 1], $q = (1 - p)$, and $x^{N}_{barnacle\_m}$ and $x^{N}_{barnacle\_d}$ are the variables of the randomly chosen barnacle parents (mother and father, respectively). Furthermore, $rand()$ is a random number between zero and one. $p$ and $q$ characterize the inheritance percentage from the corresponding barnacle parents. For instance, if $p$ is set to 0.80, the novel offspring inherits 80% of the behavior or features of the mother and 20% (100% − 80%) of the father's features. Equation (4) is used for exploitation, whereas the sperm-cast update ($k > pl$) is used for exploration in the proposed BMO. Further, it is noteworthy that the exploration (sperm-cast) relates only to the mother barnacles because they receive sperm discharged from another barnacle elsewhere. Once the barnacles breed, the population doubles relative to the initial population. To control this expansion, a selection step is required. Like the GA, BMO sorts the doubled population so that the better solutions of a specific iteration are positioned in the top half of the doubled population.
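One BMO generation, as described above, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function `bmo_generation`, the toy `sphere` objective, the population size, and `pl=7` are all hypothetical, fitness is minimized, and a fresh random factor is drawn per variable in the sperm-cast branch.

```python
import random

def bmo_generation(population, fitness, pl, rng):
    """Run one BMO generation: mate, double the population, keep the best half."""
    n = len(population)
    dads = rng.sample(range(n), n)  # random permutation of father indices
    mums = rng.sample(range(n), n)  # random permutation of mother indices
    offspring = []
    for d, m in zip(dads, mums):
        dad, mum = population[d], population[m]
        if abs(m - d) <= pl:
            # Hardy-Weinberg mating (exploitation): blend both parents.
            p = rng.random()
            q = 1.0 - p
            child = [p * xm + q * xd for xm, xd in zip(mum, dad)]
        else:
            # Sperm-cast (exploration): offspring derived from the mother only.
            child = [rng.random() * xm for xm in mum]
        offspring.append(child)
    # Sort the doubled population and retain the top half.
    doubled = population + offspring
    return sorted(doubled, key=fitness)[:n]

def sphere(x):
    """Toy objective (assumption): smaller is better."""
    return sum(v * v for v in x)

rng = random.Random(42)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
initial_best = min(sphere(x) for x in population)
for _ in range(20):
    population = bmo_generation(population, sphere, pl=7, rng=rng)
final_best = min(sphere(x) for x in population)
```

Because the parents stay in the sorted pool, the best solution can never get worse from one generation to the next.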
In the modeled BMO technique, the fitness function (FF) is employed to balance the classifier accuracy (to be maximized) against the number of selected features in every solution (to be minimized). The FF for evaluating a solution is computed as follows:

$$ Fitness = \alpha\,\gamma_R(D) + \beta\,\frac{|R|}{|C|} $$

Here, $\gamma_R(D)$ denotes the classification error rate of a given classifier (the K-nearest neighbor (KNN) classifier), $|R|$ represents the cardinality of the chosen subset, and $|C|$ indicates the total number of features in the data. $\alpha$ and $\beta$ are the weights for classification quality and subset length, respectively, with $\alpha \in [0, 1]$ and $\beta = 1 - \alpha$.
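As a small worked example of this trade-off, the sketch below evaluates the weighted fitness for two candidate feature subsets. The function name, the default `alpha=0.99`, and the example numbers are assumptions for illustration only; the text does not report the value of α that was used.

```python
def feature_selection_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted sum of classifier error rate and selected-feature ratio."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)

# Hypothetical case: 2% error using 10 of 100 features.
fitness_small_subset = feature_selection_fitness(0.02, 10, 100)
# Same error rate but with 50 features: a worse (higher) fitness.
fitness_large_subset = feature_selection_fitness(0.02, 50, 100)
```

With α close to 1, the error rate dominates, and the subset size mainly acts as a tie-breaker between subsets of similar accuracy.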

Botnet Detection Using ENN Model
In this study, the ENN model was exploited for botnet detection. The ENN consists of input, hidden, output, and context layers [20]. The basic structure of this NN is a feed-forward neural network (FFNN); therefore, the input, hidden, and output layers are fully connected as in a multi-layer NN. In addition, the ENN has another layer, called the context layer. Its input comes from the output of the hidden layers (HLs) and it stores the previous values of the HLs. The external input, output, and context weight matrices are denoted by $W_i^h$, $W_o^h$, and $W_c^h$. Considering the ENN form, the dimension of the input and output layers is $n$, viz., $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^T$ and $y(t) = [y_1(t), y_2(t), \ldots, y_n(t)]^T$, and the dimension of the context layer is $m$. In Equation (5), $l$ denotes the iteration index of the input and output layers. Then, the $k$-th HL node is computed as follows:

$$ v_k(l) = \sum_{j=1}^{m} \omega^1_{kj}(l)\,x^c_j(l) + \sum_{i=1}^{n} \omega^2_{ki}(l)\,x_i(l) $$

Here, $x^c_j(l)$ designates the signal transported from the $j$-th context node, and $\omega^1_{kj}(l)$ is the weight from the $j$-th context node to the $k$-th HL node. Likewise, for input node $i$, the weight to HL node $k$ is $\omega^2_{ki}(l)$. The HL output is then normalized using:

$$ \bar{v}_k(l) = \frac{v_k(l)}{\max_k\{v_k(l)\}} \tag{8} $$

which indicates the standardized value of the HL. The next layer is the context layer. Here, the outcome is given by the following expression:

$$ x^c_k(l) = W_k\,x^c_k(l-1) + \bar{v}_k(l-1) \tag{9} $$

In Equation (9), $W_k$ indicates the self-connected feedback gain in [0, 1]. The network output is given as follows:

$$ y_o(l) = \sum_{k} \omega^3_{ok}(l)\,\bar{v}_k(l), \quad o = 1, 2, \ldots, n \tag{10} $$

In Equation (10), $\omega^3_{ok}$ determines the weight connecting the $k$-th HL node to the $o$-th output node. Figure 2 defines the framework of the ENN technique.

To optimally choose the ENN parameters, the CSO algorithm was applied in this work.
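The role of the context layer can be illustrated with a minimal forward-pass sketch. It is a simplified Elman step and not the authors' model: the class `ElmanNetwork`, the weight values, and the feedback gain of 0.5 are hypothetical, and a `tanh` activation is assumed in place of the max-based standardization described in the text.

```python
import math

class ElmanNetwork:
    """Minimal Elman network: hidden state is fed back through a context layer."""

    def __init__(self, w_in, w_ctx, w_out, feedback_gain=0.5):
        self.w_in = w_in      # hidden x input weights
        self.w_ctx = w_ctx    # hidden x context weights
        self.w_out = w_out    # output x hidden weights
        self.feedback_gain = feedback_gain  # self-connection gain in [0, 1]
        self.context = [0.0] * len(w_in)    # context starts empty

    def step(self, x):
        hidden = []
        for k in range(len(self.w_in)):
            # Hidden pre-activation: context contribution plus input contribution.
            v = sum(w * c for w, c in zip(self.w_ctx[k], self.context))
            v += sum(w * xi for w, xi in zip(self.w_in[k], x))
            hidden.append(math.tanh(v))  # assumption: tanh activation
        # Context keeps a decayed copy of itself plus the latest hidden output.
        self.context = [
            self.feedback_gain * c + h for c, h in zip(self.context, hidden)
        ]
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]

net = ElmanNetwork(
    w_in=[[0.5, -0.2], [0.1, 0.3]],
    w_ctx=[[0.4, 0.0], [0.0, 0.4]],
    w_out=[[1.0, -1.0]],
)
first = net.step([1.0, 0.0])
second = net.step([1.0, 0.0])  # same input, different output due to context
```

Feeding the same input twice yields different outputs, because the second step also sees the stored hidden state; this memory is what distinguishes the ENN from a plain feed-forward network.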

Parameter Tuning Using CSO Algorithm
Finally, the weights of the abovementioned ENN model were tuned using the CSO algorithm. The CSO algorithm is inspired by the hierarchy and foraging behavior of chicken flocks [21]. In the CSO, the location of every individual is considered a candidate solution to the optimization problem. The number of individuals in the chicken flock is represented as $N$. Each individual searches for food in a $D$-dimensional space, and the roles of the individuals are updated every $G$ generations. The sequence of serial numbers of the individuals is $\{1, 2, 3, \ldots, N_r, N_r + 1, \ldots, N_h, N_h + 1, \ldots, N_c\}$, where $N_r$, $N_h$, and $N_c$ indicate the maximal serial numbers of the roosters, hens, and chicks in the sub-flock after sorting, respectively. The original and updated locations of each rooster are denoted by $X_{i,j}$ and $X^{new}_{i,j}$, where $i \in \{1, 2, 3, \ldots, N_r\}$ and $j \in \{1, 2, 3, \ldots, D\}$. Roosters with low fitness can forage for food in a wide search area, as shown in the following:

$$ X^{new}_{i,j} = X_{i,j}\,\bigl(1 + randn(0, \sigma^2)\bigr) $$

$$ \sigma^2 = \begin{cases} 1, & f_i \le f_n \\ \exp\!\left(\dfrac{f_n - f_i}{|f_i| + \zeta}\right), & \text{otherwise} \end{cases} $$

Here, $randn(0, \sigma^2)$ indicates a normally distributed random number with a mean of 0 and a variance of $\sigma^2$. $f$ denotes the fitness function. $f_i$ and $f_n$ denote the fitness values of the $i$-th rooster and the $n$-th rooster, respectively, where $i, n \in \{1, 2, 3, \ldots, N_r\}$ and $i \ne n$. $\zeta$ denotes a number close to 0, which is utilized to prevent the denominator $|f_i| + \zeta$ from being 0. The original and updated locations of the hens are denoted by $X_{i,j}$ and $X^{new}_{i,j}$, where $i \in \{N_r + 1, N_r + 2, \ldots, N_h\}$ and $j \in \{1, 2, 3, \ldots, D\}$. $X_{c,j}$ denotes the location of the spouse, and $X_{d,j}$ indicates the location of the individual from which the $i$-th hen wants to steal food, where $c \in \{1, 2, 3, \ldots, N_r\}$ and $d \in \{1, 2, 3, \ldots, N_h\}$. The searching and stealing capabilities of the hens are associated with their fitness values:

$$ X^{new}_{i,j} = X_{i,j} + S_1\,rand\,(X_{c,j} - X_{i,j}) + S_2\,rand\,(X_{d,j} - X_{i,j}) $$

$$ S_1 = \exp\!\left(\frac{f_i - f_c}{|f_i| + \zeta}\right), \qquad S_2 = \exp(f_d - f_i) $$

Here, $rand$ indicates a random number in [0, 1]. The chick follows its mother for food foraging. The small fitness value makes it simple for the chicks to search for food by foraging as follows:

$$ X^{new}_{i,j} = X_{i,j} + FL\,(X_{m,j} - X_{i,j}) \tag{16} $$

In Equation (16), $X_{i,j}$ and $X^{new}_{i,j}$ indicate the original and updated locations of the chicks, respectively. For every chick, $i \in \{N_h + 1, N_h + 2, \ldots, N_c\}$ and $j \in \{1, 2, 3, \ldots, D\}$. $X_{m,j}$ denotes the location of the mother hen of the $i$-th chick, where $m \in \{N_r + 1, N_r + 2, \ldots, N_h\}$. $FL$ refers to the following probability with which a chick follows its mother hen to forage. Considering the variances among the chicks, $FL$ is generated at random in [0, 2], as presented in Algorithm 1.

The CSO algorithm derives a fitness function (FF) to obtain an improved classifier outcome. In this study, the reduced classifier error rate is considered as the FF, as given below in Equation (17):

$$ fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100 \tag{17} $$
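The three position updates and the error-rate fitness can be sketched as below, following the standard CSO formulation. The function names, the value of `ZETA`, and the example inputs are assumptions made for illustration; the sketch updates a single individual and leaves out the flock sorting and role assignment.

```python
import math
import random

ZETA = 1e-12  # small constant that keeps the denominator away from zero

def rooster_update(x, f_i, f_n, rng):
    """Rooster step: multiplicative Gaussian perturbation of the position."""
    if f_i <= f_n:
        sigma_sq = 1.0
    else:
        sigma_sq = math.exp((f_n - f_i) / (abs(f_i) + ZETA))
    sigma = math.sqrt(sigma_sq)  # rng.gauss expects a standard deviation
    return [xj * (1.0 + rng.gauss(0.0, sigma)) for xj in x]

def hen_update(x, f_i, mate, f_c, other, f_d, rng):
    """Hen step: move toward the mate rooster and toward another individual."""
    s1 = math.exp((f_i - f_c) / (abs(f_i) + ZETA))
    s2 = math.exp(f_d - f_i)
    return [
        xj + s1 * rng.random() * (cj - xj) + s2 * rng.random() * (dj - xj)
        for xj, cj, dj in zip(x, mate, other)
    ]

def chick_update(x, mother, fl):
    """Chick step: follow the mother hen with following factor fl in [0, 2]."""
    return [xj + fl * (mj - xj) for xj, mj in zip(x, mother)]

def classifier_error_rate(y_true, y_pred):
    """Fitness to minimize: percentage of misclassified samples."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return 100.0 * wrong / len(y_true)

rng = random.Random(0)
moved_rooster = rooster_update([0.2, -0.4], f_i=3.0, f_n=5.0, rng=rng)
moved_chick = chick_update([0.0, 0.0], mother=[2.0, 4.0], fl=0.5)
error = classifier_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
```

In a tuning loop, each position vector would encode a set of ENN weights, and `classifier_error_rate` would be evaluated on the predictions of the network built from that vector.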

Results and Discussion
The proposed model was simulated using Python 3.6.5. The experiments for the proposed model used a PC i5-8600k, GeForce 1050Ti 4 GB, 16 GB RAM, 250 GB SSD, and 1 TB HDD. The parameter settings were as follows: learning rate: 0.01, dropout: 0.5, batch size: 5, epoch count: 50, and activation: ReLU.
This section inspects the botnet classification results of the BND-BMOML model on the N_BaIoT [22] dataset. The dataset comprises 17,001 samples with three class labels. Table 1 provides a detailed explanation of the dataset.

Table 3 and Figure 5 portray the detailed botnet detection results for the BND-BMOML methodology with 70% of the training (TR) data. The BND-BMOML approach demonstrated enhanced results with all classes. For example, in the benign class, the BND-BMOML model offered an accuracy of 99.12%, precision of 98.45%, recall of 98.53%, F-score of 98.49%, and MCC of 97.87%. Additionally, in the Mirai class, the BND-BMOML algorithm rendered an accuracy of 99.54%, precision of 99.39%, recall of 99.49%, F-score of 99.44%, and MCC of 99.05%. Moreover, in the Gafgyt class, the BND-BMOML method achieved an accuracy of 99.19%, precision of 98.74%, recall of 98.51%, F-score of 98.63%, and MCC of 98.06%.

Figure 5. Average analysis of BND-BMOML algorithm using 70% of the TR data.

Table 4. Result analysis for BND-BMOML algorithm with distinct class labels using 30% of the TS data.

Table 4 and Figure 6 present brief botnet detection results for the BND-BMOML technique using 30% of the testing (TS) data. The BND-BMOML method displayed enhanced results in every class label. For example, in the benign class, the BND-BMOML algorithm rendered an accuracy of 99.31%, precision of 98.76%, recall of 98.95%, F-score of 98.85%, and MCC of 98.36%. Additionally, in the Mirai class, the BND-BMOML technique presented an accuracy of 99.49%, precision of 99.37%, recall of 99.37%, F-score of 99.37%, and MCC of 98.94%. Furthermore, in the Gafgyt class, the BND-BMOML method offered an accuracy of 99.16%, precision of 98.66%, recall of 98.47%, F-score of 98.57%, and MCC of 97.97%.

The training accuracy (TRA) and validation accuracy (VLA) attained by the BND-BMOML method in the test dataset are shown in Figure 7. The experimental outcome shows that the BND-BMOML algorithm gained maximum values for TRA and VLA. The VLA was seemingly greater than the TRA.

The training loss (TRL) and validation loss (VLL) acquired by the BND-BMOML technique in the test dataset are displayed in Figure 8. The experimental outcome shows that the BND-BMOML technique exhibited minimal values for the TRL and VLL. Particularly, the VLL was less than the TRL.

A clear precision-recall analysis of the BND-BMOML algorithm in the test dataset is exemplified in Figure 9. The figure shows that the BND-BMOML algorithm resulted in enhanced precision-recall values in every class label.

Table 5 and Figure 10 offer a detailed comparative study of the BND-BMOML model and existing models [23]. The results indicate that the DBN, LSTM, and CNN-RNN models reported lower classification performance. Next, the LSTM-CNN and DNN models achieved slightly higher classifier results. Though the DNN-LSTM model reached reasonable performance with a classification accuracy of 99.11%, the BND-BMOML model shows the maximum accuracy of 99.32%.

Finally, a brief running time (RUNT) examination of the BND-BMOML model and recent models is provided in Table 6 and Figure 11. The attained results show that the LSTM and DBN models reported higher RUNTs of 3.86 ms and 3.97 ms, respectively. Along with that, the DNN-LSTM and CNN-RNN models attained slightly improved RUNTs of 1.27 ms and 1.15 ms, respectively. The LSTM-CNN and DNN models revealed reasonable RUNTs of 0.35 ms and 0.59 ms, respectively. However, the BND-BMOML model showed superior results, with a minimal RUNT of 0.17 ms. Thus, the BND-BMOML model was found to be better than existing models.

Figure 11. Running time analysis of BND-BMOML approach and recent algorithms.
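The per-class measures reported in this section (accuracy, precision, recall, F-score, and MCC) can be derived from one-vs-rest confusion counts. The sketch below shows one way to compute them; the function name and the six toy labels are hypothetical and are not taken from the paper's data.

```python
import math

def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute per-class metrics by treating `positive` as the target class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    pr_sum = precision + recall
    f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }

# Toy labels (hypothetical) over the three classes used in the study.
y_true = ["benign", "benign", "mirai", "mirai", "gafgyt", "gafgyt"]
y_pred = ["benign", "mirai", "mirai", "mirai", "gafgyt", "benign"]
mirai_metrics = one_vs_rest_metrics(y_true, y_pred, "mirai")
```

Averaging these dictionaries over the three classes would give the kind of average analysis summarized in the figures.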

Conclusions
In this article, a new BND-BMOML system was established for the identification and recognition of botnets in the IoT environment. To accomplish this, the BND-BMOML model initially follows the data standardization approach. In the presented BND-BMOML model, the BMO algorithm is employed to select a useful set of features. For botnet detection, the BND-BMOML technique employs the ENN model. Finally, the presented BND-BMOML model uses the CSO algorithm for the parameter tuning process. Experimental validation of the BND-BMOML approach was performed using a benchmark dataset, and the results showed a significant improvement in performance over existing methods. Thus, the presented BND-BMOML technique can be exploited for the real-time botnet detection process. In the future, the performance of the BND-BMOML technique can be improved using advanced DL models.

Data Availability Statement: Data sharing not applicable to this article as no datasets were generated during the current study.