1. Introduction
The ever-decreasing price of electronic devices, coupled with the need to automatically transfer huge amounts of data between remote locations, has resulted in a paradigm known as the Internet of Things (IoT) [1]. The IoT is a system in which “things” (e.g., electronics and machines) communicate among themselves without human intervention to fulfill a specified task (e.g., controlling the temperature of an operating room). The different parts of an IoT system can be dispersed over a large area or placed in an environment (e.g., human stomach, hospital laundry room) where conditions such as acidity, humidity, and temperature do not allow the use of wired communications [2,3]. To this end, Wireless Sensor Network (WSN) technologies are used in IoT applications where wired communications are impossible to implement (e.g., global positioning system) or inadequate (e.g., wearable medical devices, ingestible sensors) [3,4,5,6,7,8]. Furthermore, the implementation of an IoT needs to take into account the number of sensors present in the network and security threats such as Denial of Service (DoS) attacks [9]. This underscores the need for adequate management of the network. To this end, the last decade has seen the development of a new paradigm referred to as the Software-Defined Network (SDN) [10,11]. The SDN model is drastically transforming traditional processes by providing centralized control of the whole network, making it easier to implement network-wide management protocols and applications such as data aggregation or cryptographic schemes [12,13,14,15,16]. Merging the SDN model with the WSN model results in the Software-Defined Wireless Sensor Network (SDWSN) model.
Cryptographic schemes (i.e., symmetric cryptography, asymmetric cryptography, and hybrid encryption) used in SDWSN-based IoTs are aimed at protecting them against security threats such as sybil attacks (i.e., an attacker steals the identity of legitimate sensor nodes) and unauthorized access [10,17,18,19,20,21]. Unfortunately, these schemes are usually not sufficient to ensure the integrity of communications in SDWSN-based IoTs [22,23,24,25,26]. To this end, the cryptographic schemes can be supplemented with an Intrusion Detection System (IDS) that monitors SDWSN-based IoT traffic and detects whether an attack is being carried out by unauthorized entities [27,28,29]. The IDS is usually made up of three building blocks, namely, the flow collector, the anomaly detector, and the anomaly mitigator. In SDWSN-based IoTs, the IDS is programmatically deployed as software on the controller in order to optimize network performance and monitoring.
Figure 1 depicts the overall architecture of an IDS deployed on the SDWSN-based IoT controller.
The function of the flow collector in the IDS is to gather all flow features (e.g., source node name, number of failed logins, and connection time) and forward them to the anomaly detector [23,27,30]. The anomaly detector plays a central role in the IDS by using the features obtained from the flow collector to assign a class to the flow (e.g., sybil attack, normal traffic). The function of the anomaly mitigator is to take an action (e.g., pass on or do not pass on the flow) given the class assigned to the flow by the anomaly detector [31]. The work in this paper revolves around the anomaly detector, because this component constitutes the brain of the IDS: it is where the decision to assign a class to a flow is made. It is noteworthy that the terms “anomaly detector” and “classifier” are sometimes used interchangeably in the literature to simplify the text. In the same vein, the terms “SDWSN” and “SDWSN-based IoT” are used interchangeably in the literature.
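The interplay of the three building blocks can be illustrated with a minimal Python sketch. The feature names, the threshold rule in `toy_detector`, and all function names are hypothetical and chosen only for illustration; in practice, the anomaly detector would be a trained classifier rather than a hand-written rule.

```python
from typing import Any, Callable, Dict

# Hypothetical flow representation: feature name -> numeric value.
Features = Dict[str, float]


def collect_features(raw_flow: Dict[str, Any]) -> Features:
    """Flow collector: extract the numeric features the detector expects."""
    return {
        "failed_logins": float(raw_flow.get("failed_logins", 0)),
        "connection_time": float(raw_flow.get("connection_time", 0.0)),
    }


def toy_detector(features: Features) -> str:
    """Anomaly detector stand-in: a hand-written rule, not a trained model."""
    if features["failed_logins"] >= 3:
        return "attack"
    return "normal"


def mitigate(label: str) -> bool:
    """Anomaly mitigator: True means pass the flow on, False means drop it."""
    return label == "normal"


def handle_flow(raw_flow: Dict[str, Any],
                detector: Callable[[Features], str]) -> bool:
    """Run one flow through collector, detector, and mitigator in turn."""
    features = collect_features(raw_flow)
    label = detector(features)
    return mitigate(label)


# Illustrative flows (values are made up).
benign = {"source": "node-7", "failed_logins": 0, "connection_time": 1.2}
suspicious = {"source": "node-9", "failed_logins": 5, "connection_time": 0.1}

print(handle_flow(benign, toy_detector))      # the flow is passed on
print(handle_flow(suspicious, toy_detector))  # the flow is dropped
```

Because the detector is passed in as a callable, an NB-, DT-, or ANN-based classifier could be substituted without changing the collector or the mitigator.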
Various approaches have been put forward in the literature as IDSs in SDWSNs [27,32,33,34,35,36]. Among these approaches, IDSs that use a Decision Tree (DT), a Naïve Bayes (NB) classifier, or an Artificial Neural Network (ANN) as the anomaly detector are widely used because they are relatively easy to implement while performing well on classification tasks [32,33,34,35,36,37,38]. It is worth highlighting that these published works used disparate datasets to train the aforementioned anomaly detectors; for this reason, the performance achieved by an anomaly detector on one dataset could degrade drastically on a different one. Furthermore, in the case of safety- or mission-critical networks (e.g., heart rate monitoring, automated insulin delivery) [39,40], security constraints can prevent the network from using a cloud-based controller, miniaturization constraints can limit the physical size and the memory capacity of the controller, and performance specifications can require low latency. For these reasons, there is a need to judiciously choose an anomaly detector with the fastest execution time and the lowest memory size and energy consumption to guarantee the best trade-off between security and performance for safety- or mission-critical SDWSNs [31,41,42,43,44]. An additional observation is that, because the SDWSN is a new paradigm, there is not yet a substantial body of literature on intrusion detection in SDWSNs. The Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset [45] is used in this paper to train an NB-based anomaly detector, a DT-based anomaly detector, and a deep ANN-based anomaly detector. It is worth pointing out that the state-of-the-art performance metrics established on the NSL-KDD dataset were obtained using a Least Square Support Vector Machine-based (LSSVM) IDS on which a Filter-based Mutual Information Feature Selection (FMIFS) scheme was implemented [36]. The LSSVM − IDS + FMIFS framework yielded the best accuracy (in binary classification) and the best F-scores (in multinomial classification) when 18 features were selected. One of the goals of the present paper is to establish state-of-the-art performance metrics by using all 41 features found in the NSL-KDD dataset.
5. Summary and Discussion
To support the discussion, the major results gathered in the previous section are reorganized and summarized in this section in Figure 10, Figure 11, and Table 12.
Figure 10 summarizes the memory sizes of the anomaly detector models in both the binary classification and the multinomial classification cases.
Figure 11 gives the prediction time of the anomaly detector models in both the binary classification and the multinomial classification cases.
Table 12 summarizes the metrics recorded during the training of the anomaly detectors in the binary classification case (cf. Table 3 and Table 5).
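For readers wishing to reproduce comparisons of this kind, the sketch below shows one possible way to estimate the memory size and the prediction time of a model in Python. It is an assumption and not necessarily the procedure used to produce Figure 10 and Figure 11: the serialized size of the model parameters serves as a proxy for the memory size, and the parameter dictionary and the `predict` function are hypothetical stand-ins for a trained anomaly detector.

```python
import pickle
import time
from typing import Dict, List


def predict(params: Dict[str, float], rows: List[List[float]]) -> List[int]:
    """Hypothetical detector: flag a row when its first feature exceeds a threshold."""
    return [1 if row[0] > params["threshold"] else 0 for row in rows]


def serialized_size_bytes(params: Dict[str, float]) -> int:
    """Size of the pickled parameters, used here as a proxy for memory size."""
    return len(pickle.dumps(params))


def mean_prediction_time(params: Dict[str, float],
                         rows: List[List[float]],
                         repeats: int = 5) -> float:
    """Average wall-clock seconds per predict() call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        predict(params, rows)
    return (time.perf_counter() - start) / repeats


# Made-up parameters and input rows for illustration.
params = {"threshold": 0.5}
rows = [[i / 1000.0] for i in range(1000)]

print(serialized_size_bytes(params), "bytes")
print(mean_prediction_time(params, rows), "seconds per batch")
```

Averaging over several calls reduces the influence of timer resolution and scheduling noise; on a real controller the measurement would ideally be taken on the target hardware.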
In the case of the binary classification, taking Table 12, Figure 10, and Figure 11 into consideration, it can be inferred that the NB-based anomaly detector should be preferred in SDWSNs where the memory size of the controller is limited (e.g., small-scale or low-power SDWSNs in an African hospital) [3,88]. It should be emphasized that the larger the memory size of an anomaly detector, the more energy-intensive the controller; the NB-based anomaly detector is therefore the best choice when energy consumption is the main concern in the SDWSN under consideration [11,13,16,89]. Conversely, if the memory size of the controller is not a concern, the choice lies between the DT-based anomaly detector and the deep ANN-based anomaly detector. Of the three anomaly detectors considered in this paper, the DT-based anomaly detector has the lowest prediction time. For this reason, the DT-based anomaly detector would be preferred in SDWSNs requiring low latency (e.g., continuous heart monitoring, fall detection in older adults) [3,90,91,92].
Table 13 summarizes the aforementioned considerations. It is noteworthy that, for the binary classification, the deep ANN-based anomaly detector achieved the same accuracy (i.e., 0.999433) as the LSSVM − IDS + FMIFS framework, which was the state-of-the-art IDS found in the literature. More importantly, the DT-based anomaly detector pushed the state-of-the-art accuracy for the binary classification to 0.999777.
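The guidance for the binary classification case can be condensed into a small decision rule. The Python sketch below is only an illustration of the reasoning in this section; the function name, its two boolean parameters, and the returned strings are hypothetical.

```python
def choose_binary_detector(memory_constrained: bool,
                           low_latency_required: bool) -> str:
    """Encode the binary-classification guidance of this section as a rule."""
    if memory_constrained:
        # Limited controller memory (and thus energy budget): NB is preferred.
        return "NB"
    if low_latency_required:
        # DT has the lowest prediction time of the three detectors.
        return "DT"
    # Memory is not a concern: the choice lies between DT and deep ANN.
    return "DT or deep ANN"


print(choose_binary_detector(memory_constrained=True, low_latency_required=False))
print(choose_binary_detector(memory_constrained=False, low_latency_required=True))
```

The memory constraint is checked first because, in this discussion, it determines whether the larger DT and deep ANN models can be deployed on the controller at all.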
It is noteworthy that the NSL-KDD dataset is inherently imbalanced (e.g., 45927 DoS samples, 52 U2R samples, and 995 R2L samples in the training set); for this reason, the most suitable traditional performance metric for evaluating each anomaly detector’s capability in the multinomial classification is the F-score [93,94]. As in the binary classification case, the memory size and the prediction time are also considered when choosing the anomaly detector best suited to the SDWSN under consideration.
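The reason why the F-score is more informative than the accuracy on imbalanced data can be seen in a small worked example. The Python sketch below assumes the F1 variant of the F-score (equal weight on precision and recall), computed per class as 2TP / (2TP + FP + FN); the toy labels are made up and are not taken from the NSL-KDD dataset.

```python
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Per-class F1 = 2*TP / (2*TP + FP + FN), with 0.0 when undefined."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


# Hypothetical imbalanced toy set: 95 majority-class and 5 minority-class samples.
y_true = ["DoS"] * 95 + ["U2R"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["DoS"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = per_class_f1(y_true, y_pred)

print(accuracy)   # high, despite every minority sample being missed
print(f1["U2R"])  # the minority-class F1 exposes the failure
```

In this toy case the accuracy remains high even though no minority-class sample is detected, whereas the per-class F1 for the minority class drops to zero.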
Figure 12 gives the F-scores (for each of the five classes) of the three anomaly detector models developed in the present paper as well as those of the LSSVM − IDS + FMIFS framework. From this figure, it can be seen that the DT-based anomaly detector set new state-of-the-art F-scores.
In the case of the multinomial classification, taking Figure 10, Figure 11, and Figure 12 into consideration, it can be concluded that the number of training samples plays a crucial role in the performance of a classifier. The most striking example is the NB-based anomaly detector, which has F-scores of 0.07, 0.3, and 0.01 for the DoS, U2R, and R2L attacks, respectively. This means that this anomaly detector cannot be relied upon to detect these three attacks in SDWSN-based IoTs, even though it can be trusted for the classification of probing attacks and normal traffic (F-scores of 0.84 and 0.94, respectively). Furthermore, it can be concluded that the DT-based anomaly detector presents the highest F-scores, a reasonable memory size, and the lowest prediction time, whereas the deep ANN-based anomaly detector presents the largest memory size. For these reasons, the DT-based anomaly detector should be the default choice when dealing with multinomial anomaly classification in SDWSN-based IoTs. Additionally, given that the performance of deep learning algorithms in general, and of deep ANNs in particular, increases with the size of the training set, it should be noted that the deep ANN-based anomaly detector could outperform the DT-based one if more U2R and R2L attack samples were added to the training set [87,95,96,97]. Finally, given the miniaturization of controllers, the ever-increasing memory size of miniaturized controllers, and the possibility that the deep ANN-based anomaly detector outperforms the DT-based one once more U2R and R2L attack samples are available, the deep ANN classifier should be expected to become the default anomaly detector in SDWSNs in the near future.
Table 14 summarizes the considerations drawn from the multinomial classification case.
Table 15 gives some examples of IoT applications in healthcare.
Table 15 may be used in combination with Table 13 or Table 14 to guide the choice of an adequate anomaly detector.
6. Conclusions
In this paper, the NSL-KDD dataset was used to train three classifiers for intrusion detection in IoTs in general and SDWSN-based IoTs in particular. New state-of-the-art accuracy and F-scores were established by a DT classifier trained on 118 features derived empirically from the 41 features of the NSL-KDD dataset. It was also found that, in the case of the binary classification, the DT-based anomaly detector presented the best performance metrics aside from the memory size, and for this reason it should be used as the default anomaly detector in SDWSNs. In small-scale or low-power SDWSNs, where the memory size of the controller is intrinsically required to be low, the NB-based anomaly detector should be used instead of the DT-based one, with the strong caveat that it provides less security. For this reason, the memory size of the controller should be chosen accordingly when designing SDWSN-based IoTs to avoid compromising data in sensitive environments and healthcare application scenarios. In the case of the multinomial classification, it was also found that the DT-based anomaly detector presented the best performance metrics, and for this reason it should be used as the default anomaly detector in SDWSNs. Additionally, it was found that the NB-based anomaly detector could not be used for the multinomial classification given its poor performance metrics. Finally, given the performance metrics of the deep ANN-based anomaly detector, its memory sizes for both the binary and the multinomial classification, the ever-increasing amount of data collected, the miniaturization of controllers, and the fact that the performance metrics of a deep ANN classifier improve as the dataset grows, the deep ANN-based anomaly detector should be expected to become the next default anomaly detector in SDWSNs.