ELBA-IoT: An Ensemble Learning Model for Botnet Attack Detection in IoT Networks

Abstract: Due to the rapid expansion and development of intelligent systems and autonomous, energy-aware sensing devices, the Internet of Things (IoT) has grown remarkably and permeated nearly all applications in our daily life. However, constraints in the computation, storage, and communication capabilities of IoT devices have led to an increase in IoT-based botnet attacks. To mitigate this threat, there is a need for a lightweight, anomaly-based detection system that can build profiles for normal and malicious activities over IoT networks. In this paper, we propose an ensemble learning model for botnet attack detection in IoT networks, called ELBA-IoT, that profiles the behavioral features of IoT networks and uses ensemble learning to identify anomalous network traffic from compromised IoT devices. In addition, our IoT-based botnet detection approach evaluates three machine learning techniques that belong to the decision tree family (AdaBoosted, RUSBoosted, and bagged decision trees). To evaluate ELBA-IoT, we used the N-BaIoT-2021 dataset, which comprises records of both normal IoT network traffic and botnet attack traffic from infected IoT devices. The experimental results demonstrate that our proposed ELBA-IoT can detect botnet attacks launched from compromised IoT devices with high detection accuracy (99.6%) and low inference overhead (40 µs). We also contrast the ELBA-IoT results with other state-of-the-art results and demonstrate that ELBA-IoT is superior.


Introduction
The Internet of Things (IoT) is one of the fastest-emerging paradigms in the networking realm. It can be defined as the "interconnection of things" having constrained computational power and capabilities. It can be used to send and receive data over the internet without requiring human-to-computer or human-to-human interaction [1]. The word "things" refers to IP-enabled, networked devices (both physical and virtual). Things may include telematics boxes, self-driving cars, printers, surveillance cameras, tablets, and smartphones, communicating over technologies such as ultra-wideband (UWB), infrared data association (IrDA), ZigBee, and NFC, as well as through data centers and cellular and Wi-Fi networks. The IoT, with all its subtechnologies, is considered a network of numerous physical objects (24.15 billion devices in 2019, jumping to a projected 76.45 billion computing devices in 2026) [2]. The financial impact of the IoT may reach $3.9 to $11.1 trillion on the global economy by 2025 [3].
In addition to Internet Protocol (IP), these devices are enabled with several other important technologies, including radio-frequency identification (RFID) technology, sensors, actuators, GPS services, nanotechnologies, near field communication (NFC), and cloud computing.
Apart from the traditional IoT, resource-constrained IoT devices are also worth mentioning. Although these devices power IoT applications, they are small, low-power, battery-operated devices with disparate design trade-offs.
In this paper, we make use of an ensemble learning model to detect botnet attacks in IoT networks. Ensemble learning combines multiple algorithms to obtain better predictive performance than any single one of its constituent algorithms [8]. Moreover, an ensemble learning model can deal with the data heterogeneity and class imbalance problems that are often faced during anomaly detection [9]. This paper is based on the N-BaIoT2021 dataset [10], which supports a network-based anomaly detection approach using ensemble learning: abnormal network behavior is identified to detect anomalous network traffic from compromised IoT devices [10]. The N-BaIoT2021 dataset includes different botnet attacks belonging to the two most common IoT botnet vectors, Bashlite and Mirai. The dataset was prepared for comparative analysis of anomaly detection using five IoT devices: a doorbell, thermostat, baby monitor, security camera, and webcam. In addition to the 611,359 samples of normal traffic, the N-BaIoT2021 dataset includes 7737 samples of botnet (malicious) traffic, distributed as follows: Bashlite (4737 samples) and Mirai (3000 samples) against the five mentioned IoT devices. Specifically, our contributions in this paper can be stated as follows:

• We present a comprehensive, efficient detection/classification model with detailed preprocessing operations that can classify the IoT traffic records of the N-BaIoT2021 dataset using a binary classifier (2-class), a ternary classifier (3-class), and a multiclassifier (10-class).

• We characterize the performance of four machine-learning-based decision tree models: the AdaBoosted decision tree, the RUSBoosted decision tree, the bagged decision tree, and their ensemble learning model.

• We provide an inclusive experimental evaluation to gain more insight into the system efficiency and solution approaches, covering the confusion matrix, model precision, model sensitivity, and other metrics.

• We contrast our best performance results with state-of-the-art works employing several supervised learning algorithms in the same area of study.
The remaining part of this paper is structured as follows: Section 2 provides a literature review with a summary of the most recent surveyed papers. Section 3 presents the system model and architecture with a detailed explanation for each underlying subsystem. Section 4 reports extensive experimental outcomes and performance trajectory as well as a comparison with existing models. Finally, Section 5 concludes the presented work.

Related Work
A significant number of studies have been conducted to understand the nature of IoT cyber-attacks. IoT security professionals have used various detection methods and techniques to detect attacks on IoT networks. In the meantime, attackers have always been sophisticated and fast in their operations, and they have developed new attacking techniques to counter safeguarding mechanisms. Much research has been conducted to accurately detect anomalies in the IoT. Most IoT-based studies reveal botnet attacks that are based on command and control (C&C) servers. In this scenario, zombies or bots (infected computers) act as clients that communicate with a C&C server, which passes instructions to each zombie computer. If control of the central server is gained, the entire botnet can be taken down. This is also known as the client-server model. Another approach used by hackers against the IoT is the peer-to-peer model. Instead of connecting to a central server, zombie computers in this model are connected to each other via the internet. This model emerged to fix the weaknesses of the client-server model, as removing one or more bots will not dismantle a large botnet of independent bots. Several IDSes were proposed and developed during the last couple of years; examples can be found in [11][12][13][14][15].
Previous studies also shed light on the types of botnet attacks that hackers have used against the IoT. For example, dial-up bots compromise dial-up modems to force them to dial phone numbers for malicious purposes. "Click fraud" is the illegal act of clicking on pay-per-click (PPC) advertisements or banner ads to inflate the number of clicks for advertisers or bot herders; automated online bots are designed to commit click fraud. Spyware can be used to secretly gather sensitive data, such as credit card numbers or login details, from the victim's IoT device. Such data can be beneficial to botmasters because they can sell it on the black market. In addition, a DoS attack compromises a target, such as a server, using multiple compromised computers (a botnet) to cause a denial of service for the victim. Botnets are widely used to launch massive DoS and DDoS attacks. The Mirai botnet was utilized in 2016 to attack the domain name service provider Dyn, based in Manchester, New Hampshire, and attack volumes were gauged at over 600 Gbps. Lastly, a spambot is a malicious machine that is used to distribute automated spam emails.
Botnet detection techniques for the IoT are either network-based or host-based [16][17][18][19][20]; however, the host-based approach is less realistic. For instance, in [16,17], the authors developed a comprehensive architecture for IoT intrusion detection and classification at the network layer of the IoT paradigm. Six different supervised ML methods were employed to develop the IoT-IDS: three ensemble learning methods, two neural network methods, and one kernel method. They evaluated their models using two well-known IoT attack datasets, distilled-Kitsune-2018 and NSL-KDD, and their best results improved on any prior art by 1% to 20%. In [18], the authors proposed port scanning attack detection for IoT networks by characterizing the performance of several machine learning techniques; their best empirical results were recorded for the best logistic regression model, which scored 99.4% accuracy and a 99.7% F-score with low detection overhead. In [19][20][21][22], the authors' studies barely used the latest techniques, such as autoencoders, which are part of our proposed ensemble learning technique, known as N-BaIoT. This means that IoT attacks often remained undetected and unnoticed due to a lack of modern security defense mechanisms. However, autoencoders are now being used in this paper to provide more accurate and useful results. Autoencoders are mostly used for anomaly detection in wireless sensor networks (WSNs) that are embedded in IoT devices. The algorithm used for detection purposes involves two components: one is located in the IoT cloud, while the other is placed within the sensors. During the evaluation, it was seen that the autoencoders' unsupervised learning features enabled adaptation to unexpected changes in IoT network environments.
Attempts have also been made to compare many deep learning models for network intrusion detection (NID), including the recurrent neural network (RNN), self-taught learning (STL), and the vanilla deep neural network (DNN). Machine learning (ML)-based autoencoders provided efficient outcomes, as they met even the unique constraints of IoT environments. These constraints include small storage capacity and a shortage of computing power due to the IoT's tiny device sizes. Nevertheless, autoencoders provided useful results in detecting botnet attacks. More importantly, some previous studies also relied on statistical approaches that only worked on predefined values. Unfortunately, cybersecurity threats and attacks on the IoT are accelerating by leaps and bounds, and bot herders are developing ever-newer attacking techniques, against which such statistical approaches could not work. This is the reason the researchers preferred the behavioral-based N-BaIoT, which is dynamic and efficient and provided more accurate results. The proposed method has several advantages. For example, it uses online processing, so it is deployable even on low-memory IoT devices. This ensemble learning approach is scalable and lightweight across various IoT devices. Furthermore, its detection accuracy is very high, and the computational load on devices is very low.
The number of false positives is also negligible. Moreover, the unsupervised learning feature facilitates adaptability to dynamic environments.
L. Yang et al. [23] proposed an adaptive IoT streaming data analytics framework, namely "LightGBM and Drift Adaptation", to detect anomalies in IoT data streams without human involvement. R. Qaddoura et al. [24] recommended an approach based on three stages: the reduction stage, the support vector machine and synthetic minority oversampling technique (SVM-SMOTE) stage, and the single hidden layer feed-forward neural network (SLFN) stage. The outcomes of this research demonstrated that the SLFN technique and SVM-SMOTE with a ratio of 0.9 and a k value of 3 for the k-means++ clustering method gave better results than other classification techniques and other parameter values. Another study, conducted by Wan-Chen Shi et al. [25], inspected the behaviors of network traffic from network packets through DeepBot, a deep learning method, to detect botnets and classify them into disparate categories. Huy-Trung Nguyen et al. [26] also conducted research on Linux IoT botnet detection based on PSI graphs and DGCNN classifiers. In this experiment, the researchers used 10,033 ELF files, comprising 6031 benign files and 4002 IoT botnet samples. The results showed that the PSI graph and DGCNN classifiers achieved an F-measure of 94% and an accuracy of 92%. In addition to the aforementioned studies, Christopher D. McDermott et al. [27] used the BLSTM-RNN detection model to detect botnets within consumer-based IoT devices and networks. The researchers used the word embedding technique to recognize text and convert attack packets into a tokenized integer format. They detected four attack vectors used by the Mirai botnet malware and evaluated them for loss and accuracy. According to the researchers of this study, the bidirectional technique added overhead to each epoch and increased processing time; however, it proved to be a better progressive model over time.
Finally, the authors in [28] proposed a detection system for Ping flood attacks in an Internet of Things (IoT) network using the k-means algorithm. Their proposed system exhibits highly accurate results, scoring 99.94% accuracy with very low false negatives (0.00%) and false positives (1.38%). To sum up, Table 1 summarizes some facts and figures about recent research in this regard.

ELBA-IoT Model Development
ELBA-IoT is an anomaly-based intrusion detection system developed to autonomously build profiles for normal and malicious (botnet) behaviors and then provide detection/classification for the traffic communicated through the IoT network. In this research, the proposed system is decomposed into three distinct modules, where each module is implemented via a number of operations. These modules are responsible for processing the input (the N-BaIoT 2021 dataset) through a series of consecutive operations to provide the final output (anomaly-based detection). Specifically, the proposed ELBA-IoT system is composed of a data preparation (DP) module, a learning process (LP) module, and an evaluation process (EP) module, as illustrated in Figure 2.

Implementation of the Data Preparation (DP) Module
The data preparation (DP) module concerns the preprocessing operations for the raw IoT traffic data of the N-BaIoT 2021 dataset and the transformation of preprocessed data into a table of labeled features that can be fed into and trained by the machine learning part of the LP module. The implementation phases of this module include the following consecutive operations:

Data Hosting Process (DHP)
The data hosting process (DHP) is the process of keeping the data on a stable and accessible platform that remains persistent and highly reliable. In this research, we use the MATLAB computing platform as the system to host, train, and evaluate the data and the model. Therefore, this stage is responsible for accepting the collected data records in CSV format (comma-separated values) and importing the data into MATLAB tables that can be used for further preprocessing operations. With this hosting, every IoT traffic record is formatted as a row in the table, while the data features are represented as columns.
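The hosting step above (traffic records as rows, features as columns) can be sketched in Python; note that the paper itself uses MATLAB tables, and the feature names and values below are illustrative assumptions, not taken from the N-BaIoT 2021 schema:

```python
import csv
import io

# Hypothetical sketch of the DHP step: load CSV traffic records into a
# row/column structure. The column names below merely imitate N-BaIoT-style
# feature naming and are assumptions for illustration.
RAW_CSV = """MI_dir_L5_weight,MI_dir_L5_mean,label
1.0,0.52,benign
3.0,0.48,mirai
"""

def load_records(text):
    """Return the traffic records as a list of dicts: one dict per row,
    keyed by the column (feature) names from the CSV header."""
    return list(csv.DictReader(io.StringIO(text)))

records = load_records(RAW_CSV)
```

Each element of `records` then plays the role of one table row, ready for the preprocessing operations described next.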

Data Cleansing Process (DCP)
The data cleansing process (DCP) is the process of exploring the dataset to obtain a deeper understanding of the underlying data and provide corrections for misinterpreted data. The DCP deals with detecting and removing errors and inconsistencies from data in order to improve data quality [29]. In this research, we conducted several DCP operations over the imported data, including missing value checks (searching for null-valued cells and replacing them with numerical zero), corrupted value checks (searching for misinterpreted data and removing them), fixing the attribute names for the main features (data imported from CSV usually have no names for their attributes), maintaining an atomic representation of the data (making sure all attributes are simple and filled with a single value per cell), duplicate data checks (making sure all data samples appear only once, with no redundancy of specific data records), and label encoding (the target feature is encoded using integer encoding). The output classes are given the values 0 and 1 for the binary classifier and values from 0 to 9 for the multiclassifier. Table 2 below summarizes the encoding process for the target labels.

Feature Selection Process (FSP)
The feature selection process (FSP) is the process of selecting all features that positively contribute to the performance of the machine learning model while avoiding features that may negatively impact model performance; that is, we aim to select the features that are highly dependent on the response. In this study, we employed the correlation coefficient score (CCS) approach to define the most prominent features to be used for ELBA-IoT. The CCS method considers all possible feature combinations and then calculates the linear relationship between each feature and the target. The logic behind using correlation for feature selection is that good features are highly correlated with the target.
Accordingly, the CCS method finds the best features that can be used later in developing the ML model. Consequently, the computational burden can be reduced by running the algorithm with fewer features when detecting botnet attacks. The Pearson correlation coefficient score [30] used in this research is given by the following formula:

ρ(X, Y) = cov(X, Y) / (σ_X σ_Y)

where Pearson's correlation coefficient ρ(X, Y) is the covariance of the two variables, cov(X, Y), divided by the product of their standard deviations, σ_X and σ_Y. It takes values between −1 and 1, where negative values indicate an inverse relationship between the variables and positive values a direct one.
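As a minimal sketch, the correlation-based selection described above can be written as follows; the threshold value and the column layout are assumptions for illustration, not the paper's actual configuration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation: rho = cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

def select_features(columns, target, threshold=0.5):
    """Keep features whose |rho| with the target exceeds the threshold.
    The 0.5 cut-off is a hypothetical choice for this sketch."""
    return [name for name, vals in columns.items()
            if abs(pearson(vals, target)) > threshold]
```

A feature perfectly tracking the target scores ρ = 1 and is kept, while an uncorrelated one falls below the cut-off and is discarded.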

Data Standardization Process (DSP)
The data standardization process (DSP) is a fundamental step in the data preprocessing stage that is mainly used to provide feature scaling, making sure features are on almost the same scale so that each feature is equally important. The DSP makes data easier to process by most ML algorithms [31]. In this research, we applied standardization (Z-score normalization), where all features are rescaled so that each has a mean of 0 and a standard deviation of 1. The Z-score normalization applied in this research is given by the following formula:

z = (x − µ) / σ

where x is the original feature value, µ is the mean of the feature, and σ is its standard deviation. Z-score normalization is useful for several optimization algorithms, such as gradient descent (GD), which is extensively employed by machine learning algorithms. Standardization aims to increase the accuracy of ML models and mitigate/avoid bias in ML classifiers.
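The Z-score rescaling above can be sketched as a short function (using the population standard deviation, as is conventional for this normalization):

```python
import math

def zscore(values):
    """Z-score standardization: z = (x - mu) / sigma, so the rescaled
    feature has mean 0 and standard deviation 1."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]
```

For example, the column [2, 4, 4, 4, 5, 5, 7, 9] has mean 5 and standard deviation 2, so it standardizes to [−1.5, −0.5, −0.5, −0.5, 0, 0, 1, 2].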

Data Shuffling Process (DSH)
The data shuffling process (DSH) is a fundamental step in the data preprocessing stage that is mainly used for masking confidential numerical data of the target dataset prior to its being trained/validated through machine learning models. Shuffling basically redistributes the data points/records across several execution processes (or even across several computing machines). In this research, prior to train/test splitting, we performed a uniform shuffle at every epoch, which guarantees that every item has the same chance of occurring at any position [32].
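A uniform shuffle of this kind is commonly implemented with the Fisher-Yates algorithm; the sketch below is an illustrative stand-in, not the paper's exact procedure:

```python
import random

def uniform_shuffle(records, seed=None):
    """Fisher-Yates shuffle: every permutation (and hence every position
    for every item) is equally likely. A seed makes the shuffle
    reproducible across runs."""
    rng = random.Random(seed)
    shuffled = list(records)          # leave the input untouched
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)         # uniform pick in [0, i]
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled
```

Python's built-in `random.shuffle` implements the same algorithm; the explicit loop is shown here only to make the uniformity argument visible.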

Data Distribution Process (DDP)
The data distribution process (DDP) is an essential operation in every machine-learning-based project. The DDP is basically responsible for dividing the utilized dataset into different subsets to be used for the machine learning model training and testing processes. In this research, to ensure optimum data splitting, we employed the random division approach (dividerand) to divide the targets into training, validation, and testing datasets using random indices. Specifically, the dividerand algorithm works as illustrated in Figure 3 below. In addition, to ensure an optimum validation/testing process, we also applied k-fold cross-validation [33] during the learning process. In this approach, the dataset is randomly split into k folds, with each fold used as a test/validation dataset once while the other folds are consolidated for use as a training dataset for machine learning model generation. For every fold, the performance measures on the test dataset are computed and saved. Once all folds are processed, the overall performance measures are calculated from the measures stored for all k folds. Figure 4 shows a schematic overview of k-fold cross-validation.
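The random division and the k-fold index generation described above might be sketched as follows; the 70/15/15 split ratios are an assumption for illustration (MATLAB's dividerand defaults are not reproduced here), and the interleaved fold assignment is a simplification:

```python
import random

def dividerand(n, ratios=(0.7, 0.15, 0.15), seed=0):
    """Randomly divide sample indices 0..n-1 into train/validation/test
    subsets using random indices (a sketch inspired by MATLAB's
    dividerand; the ratio values are hypothetical)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def kfold(n, k):
    """Yield (test_fold, training_folds) index lists for k-fold
    cross-validation; folds are interleaved for simplicity."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield folds[i], train
```

Each fold serves as the test set exactly once, and the per-fold measures can then be averaged into the overall performance figures.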

Implementation of the Learning Process (LP) Module
Hitherto, the implementation phases of the DP module have been investigated and analyzed. The DP module has distributed datasets that are ready for the learning process (learning includes training and testing), and thus, the next step is to process the preprocessed distributed datasets using an LP-module-based machine learning system. The main objective of this module is to train/test the developed IDS-based machine learning models along with their ensemble model, aiming to obtain the optimal performance trajectory recording maximum detection accuracy with the least detection error. In this research, the proposed solution approach evaluates the performance of three supervised machine learning methods (AdaBoosted decision tree [34], RUSBoosted decision tree [35], and bagged decision tree [36]) in order to provide more inclusive experiments and gain more insight into the solution approach. In addition, the ensemble learning model, making use of the three supervised machine learning methods, is developed to profile the behavioral features of IoT network traffic and identify the anomalous network traffic launched through the compromised IoT devices. We summarize specifications for the implemented machine learning models in Table 3.
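The paper builds its ensemble in MATLAB; as an illustrative sketch (not the paper's exact combiner), the per-sample predictions of the three base learners (AdaBoosted, RUSBoosted, and bagged decision trees) could be fused by majority vote:

```python
from collections import Counter

def ensemble_predict(predictions_per_model):
    """Combine per-sample class predictions from several base learners
    by majority vote. The vote rule here is a hypothetical stand-in for
    the ensemble combination used in the paper."""
    combined = []
    for votes in zip(*predictions_per_model):  # one tuple per sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined
```

With three voters and binary labels, a tie is impossible, so the majority label is always well defined.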

Implementation of the Evaluation Process (EP) Module
The evaluation process (EP) is an essential activity to measure and regulate the quality metrics that check the compliance of the system with its requirements and objectives. To validate the efficiency of the ELBA-IoT system, we used the standard evaluation metrics [37] over the k-fold datasets to measure the performance of the ELBA-IoT system using the four aforementioned variants of machine learning models: the AdaBoosted decision tree (ABDT) model, the RUSBoosted decision tree (RBDT) model, the bagged decision tree (BGDT) model, and the ensemble learning model (ELBA-IoT). Figure 5 recapitulates the overall system of measurements employed in this research to validate the system performance and compare the quality of the several models [17]. According to Figure 5, the performance of the developed models was evaluated in terms of confusion matrix analysis to discuss the number of positive and negative samples (predicted as true or false), the number of correctly classified samples (NCS#), the number of misclassified samples (NMS#), the detection accuracy proportion (DAC%), the detection precision proportion (DPR%), the detection sensitivity proportion (DSN%), the detection harmonic score (DHS%), the detection inaccuracy error (IAE%), the detection imprecision error (IME%), and the detection insensitivity error (INE%). In addition, for IoT applications, the functional time is a crucial factor for such energy-aware devices; therefore, we evaluated the proposed system in terms of inference overhead, represented by two factors: (1) the prediction speed (PRS), which measures the number of samples predicted within a time unit (1 time unit = 1 s), and, conversely, (2) the prediction time (overhead, PRT), which measures the amount of time needed to provide a detection for single-sample traffic.
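The core EP metrics (DAC, DPR, DSN, and the harmonic score DHS) and the per-sample prediction time can be sketched directly from confusion-matrix counts; the function names are ours, not the paper's:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts:
    accuracy (DAC), precision (DPR), sensitivity/recall (DSN), and
    the harmonic F1 score (DHS)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, f1

def prediction_time(samples, seconds):
    """PRT: per-sample inference overhead, the inverse of the
    prediction speed PRS (samples per second)."""
    return seconds / samples
```

For instance, a model classifying 25,000 samples per second has a per-sample prediction time of 40 µs, the scale reported later for ELBA-IoT.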

Results and Discussion
ELBA-IoT is a defense system that can be used for botnet detection and classification. In this section, we provide the results obtained for the performance evaluation of ELBA-IoT and the other ML models at three levels of defense (using three classifiers): the binary classifier, which is used to identify the IoT traffic as either normal or anomaly (botnet); the ternary classifier, which is used to classify the IoT traffic into normal, Mirai botnet, or Bashlite (Gafgyt) botnet; and the multiclass classifier, which is used to classify the IoT traffic into normal, Mirai_Doorbell, Mirai_Thermostat, Mirai_Baby_Monitor, Mirai_Security_Camera, Mirai_Webcam, Gafgyt_Doorbell, Gafgyt_Thermostat, Gafgyt_Baby_Monitor, Gafgyt_Security_Camera, or Gafgyt_Webcam.
To begin, Table 4 presents the results of the performance evaluation metrics obtained for the binary classifier (normal vs. anomaly) using the four aforementioned botnet detection models: Model 1 (ADA-IoT), Model 2 (RUS-IoT), Model 3 (BAG-IoT), and Model 4 (ELBA-IoT). According to the table, all binary-class models exhibited outstanding detectability of botnets, scoring accuracy rates of 99.5-100%, with explicit superiority for ELBA-IoT, which was able to detect all IoT botnet traffic with a 0.0% inaccuracy error rate.

Moreover, Table 5 presents the results of the performance evaluation metrics obtained for the ternary classifier (normal, Mirai, or Bashlite) using the same four botnet detection models. According to the table, all ternary-class models likewise exhibited outstanding detectability of botnets, scoring accuracy rates of 99.5-100%, again with explicit superiority for ELBA-IoT, which was able to detect all IoT botnet traffic with a 0.0% inaccuracy error rate.

Furthermore, Table 6 presents the results of the performance evaluation metrics obtained for the multiclassifier (normal, Mirai_Doorbell, Mirai_Thermostat, Mirai_Baby_Monitor, Mirai_Security_Camera, Mirai_Webcam, Gafgyt_Doorbell, Gafgyt_Thermostat, Gafgyt_Baby_Monitor, or Gafgyt_Security_Camera) using the same four botnet detection models. According to the table, all multiclass models exhibited outstanding detectability of botnets, scoring accuracy rates of 96.2-99.6%, with explicit superiority for ELBA-IoT, which was able to detect all IoT botnet traffic with a 0.4% inaccuracy error rate. In addition, due to the importance of ELBA-IoT for the multiclassifier, we traced its performance trajectory in terms of minimum classification error (MCE) vs. training iterations.
Therefore, Figure 6, below, illustrates the performance progression of ELBA-IoT with the multiclassifier by tracing the MCE during 30 iterations. To optimize the performance of the ensemble learning model, a Bayesian optimizer was used with an acquisition function based on expected improvement per second plus. Moreover, the ensemble learner employed three powerful ensemble methods (RUSBoosted, AdaBoosted, and bagged), utilizing 10 learners with a maximum of 15 splits, 1-116 predictors to sample, and an optimized learning rate of 0.192. Accordingly, the best-point minimum error was recorded after 11 iterations; the classifier recorded a minimum classification error of 3.8 × 10⁻³ (~0.38%) before becoming fixed and stable over the remaining iterations. Furthermore, to gain more insight into the inference overhead, Table 7 provides the time complexity results for the developed multiclass inference systems, characterized by two aspects: the prediction speed (PRS) and the prediction time (overhead, PRT) for single-sample traffic. According to the table, ELBA-IoT recorded the maximum prediction speed and, thus, has the least inference overhead, recording a prediction time of 40 µs. Such a small amount of time (on the microsecond scale) permits the ELBA-IoT model to be efficiently applied in IoT systems (e.g., at gateway devices) to provide botnet classification for IoT traffic at the earliest time and with low overhead. Finally, Table 8 presents a comparative analysis of our ELBA-IoT system against other present state-of-the-art IoT-based botnet detection schemes utilizing machine/deep learning approaches in the same area of study. The table compares our best empirical results recorded for ELBA-IoT with the respective factors reported in the existing studies.
The reported comparison metrics include the detection model (learning model), the attack categories involved in the detection system, the number of output classes of each detection model, and the validation accuracy proportion for the detection models. Consequently, the comparison table (Table 8) contemplates 11 different intrusion/botnet detection models for the IoT environment, spanning from 2017 to 2022, in addition to our proposed ELBA-IoT, which is based on ensemble learning of decision trees (EL-DTs). The considered models include HAEs-2017 (a hybrid autoencoders model, comprising autoencoders and a denoising autoencoder) [38], BLSTM-RNN-2018 (a cascade model, comprising bidirectional long short-term memory and recurrent neural networks) [27], DG-CNN-2018 (a deep graph convolutional neural network model) [26], kNN-2019 (a k-nearest-neighbor-based IDS model) [39], SVM-2019 (a support-vector-machine-based IDS model) [40], Hybrid-ML-2019 (a hybrid machine learning model composed of the decision tree, random forest, kNN, and deep neural networks) [41], LSTM-RNN-2020 (a long short-term memory and recurrent-neural-network-based IDS model) [25], S-CNN-2020 (a shallow-convolutional-neural-network-based IDS model) [42], D-CNN-2020 (a deep-convolutional-neural-network-based IDS model) [43], SL-BMM-CE-2021 (a statistical-learning-based beta mixture model (BMM) and correntropy-based IDS model) [44], and ADA-DT-2022 (an AdaBoosted-decision-tree-based IDS model) [45]. Based on the comparisons provided in the table, we can conclude that our ELBA-IoT model is conspicuously superior, as it provides the top detection accuracy for a multiclassifier with 10 labeled classes at the output layer, with a low inference overhead (40 µs) that can be adapted effectively to the time-aware and low-power devices of an IoT system.
In addition, our ELBA-IoT improved the accuracy of identification of attacks by 1% over the 10-class classifier, 1.4-18.6% over the 5-class classifiers, and 0.3-7.6% over other lower-class classifiers (2-, 3-, and 4-class) for the identified IoT-IDS models in Table 8.

Conclusions and Future Directions
An autonomous, lightweight, intelligent, and accurate intrusion detection system for IoT networks, called ELBA-IoT, was proposed, developed, and evaluated in this paper. ELBA-IoT makes use of supervised ensemble learning methods to profile the behavioral features of IoT network traffic and identify the anomalous network traffic launched through compromised IoT devices. In addition, the proposed solution approach evaluated the performance of three supervised machine learning models (AdaBoosted, RUSBoosted, and bagged decision trees) in order to provide more inclusive experiments and gain more insight into the solution approach. Moreover, the N-BaIoT-2021 dataset, a comprehensive and contemporary dataset comprising real-world IoT network traffic, was used to evaluate the performance of ELBA-IoT. Eventually, the empirical outcomes demonstrated the superiority of ELBA-IoT over other existing solutions, scoring a high detection accuracy (99.6%) with low inference overhead (40 µs). As for future work, one recommended extension of this work is to deploy ELBA-IoT on physical IoT gateway devices to provide real-time botnet detection facilities for IoT networks (this may include deploying IoT-enabled devices such as the Raspberry Pi, ARM Cortex boards, or Arduino). Such a deployment can be accompanied by further investigation of the concerns of different low-power IoT nodes, such as energy consumption, memory usage, and communication complexity. We may also aim to produce a new dataset for IoT botnets using this investigation at several peak times and for several IoT applications of interest.