Detection of DoS Attacks in an IoT Environment with MQTT Protocol Based on Intelligent Binary Classifiers

Abstract: The present work deals with the problem of detecting Denial of Service attacks in an IoT environment. To achieve this goal, a dataset recorded in an MQTT protocol network is used, applying dimension reduction techniques combined with classification algorithms. The final classifiers present successful results.


Introduction
Use of the IoT (Internet of Things) paradigm has increased in recent years; this technology has become an essential pillar for a wide variety of processes in industrial, home, and telecommunications applications, among others. It encourages connectivity between physical devices, such as controllers, sensors, and actuators, in pursuit of greater flexibility and process optimisation [1].
However, the significant increase in the flow of communications has resulted in a rise in vulnerability, caused by different attacks that put the system's integrity at risk. Depending on the nature of each attack, different consequences are possible, such as the appearance of malware that may harm the equipment, unauthorised access to network information, or DoS (Denial of Service) attacks [2].
In this context, algorithms capable of detecting these attacks play a significant role in ensuring the integrity of an IoT environment. Accordingly, this work proposes the use of different intelligent techniques to detect DoS attacks in an MQTT network. This document is structured as follows. After the present section, the dataset is described. Then, the techniques used are detailed, followed by the experimental setup and the results. Finally, the conclusions are presented in the last section.

Dataset Description
The MQTT (Message Queuing Telemetry Transport) protocol is an application-layer protocol that runs over TCP (Transmission Control Protocol). It is one of the most widely used protocols in IoT systems [3]. It is based on a star architecture, which pivots on a central broker that manages the network messages. Messaging follows a publish/subscribe approach, where messages are characterised by a string (the topic) with a nested structure.
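To illustrate the nested string structure used in publish/subscribe messaging, the following minimal sketch checks whether a topic name matches a subscription filter, using the standard MQTT wildcards `+` (one level) and `#` (all remaining levels). The function name and the example topics are hypothetical, and the sketch ignores special cases such as topics beginning with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter;
            # it matches the parent level and everything nested below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter is deeper than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # Neither a single-level wildcard nor a literal match.
            return False
    # Without '#', filter and topic must have the same depth.
    return len(filter_levels) == len(topic_levels)


# Hypothetical topics, chosen only for illustration.
print(topic_matches("home/+/temperature", "home/kitchen/temperature"))  # True
print(topic_matches("home/#", "home/kitchen/temperature"))              # True
print(topic_matches("home/+", "home/kitchen/temperature"))              # False
```

A broker applies this kind of matching to decide which subscribers receive each published message.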
To generate the dataset, a server running the Aedes library acted as the broker. An ESP8266 device was in charge of establishing the connection with several sensors and actuators. The broker was vulnerable to DoS attacks through MQTT port 1883, and an MQTTmalaria program was used to perform these attacks.
The traffic recorded during the experiments contained a total of 94624 samples, with 65 variables containing network information and a label indicating whether the instance is "normal" or "attack". After an initial analysis of the original dataset, the repeated samples were removed and the constant variables deleted. Furthermore, the categorical variables were transformed following a natural coding criterion. Finally, the data comprised 39 variables, 49910 normal instances, and 9429 attacks.
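The three cleaning steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the column names are hypothetical, and "natural coding" is assumed here to mean mapping each category to a natural number in order of first appearance.

```python
def preprocess(rows):
    """Clean a list of samples, each a dict mapping variable name -> value."""
    # 1. Remove repeated samples, keeping the first occurrence of each.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    if not unique:
        return []

    # 2. Delete constant variables (a single distinct value across all samples).
    columns = list(unique[0].keys())
    kept = [c for c in columns if len({r[c] for r in unique}) > 1]

    # 3. Code categorical (string) values as 0, 1, 2, ... per variable.
    #    Assumption: categories are numbered in order of first appearance.
    codes = {c: {} for c in kept}
    cleaned = []
    for r in unique:
        new_row = {}
        for c in kept:
            value = r[c]
            if isinstance(value, str):
                value = codes[c].setdefault(value, len(codes[c]))
            new_row[c] = value
        cleaned.append(new_row)
    return cleaned


# Hypothetical toy records, not taken from the real dataset.
rows = [
    {"length": 10, "msg_type": "publish", "protocol": "tcp", "label": "normal"},
    {"length": 10, "msg_type": "publish", "protocol": "tcp", "label": "normal"},
    {"length": 40, "msg_type": "connect", "protocol": "tcp", "label": "attack"},
]
print(preprocess(rows))
```

In this toy example the duplicated first record is dropped, the constant `protocol` column is deleted, and the two string-valued columns are replaced by integer codes.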

Principal Component Analysis
This dimension reduction technique aims to find the directions of highest variability in a dataset, known as principal components [4]. This is performed through the calculation of the eigenvalues and eigenvectors of the correlation matrix. Then, using the eigenvectors associated with the largest eigenvalues, the initial set can be linearly transformed into a lower-dimensional space.
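The eigen-decomposition described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the data are synthetic, and every variable is assumed to be non-constant (otherwise standardisation would divide by zero).

```python
import numpy as np


def pca_correlation(X, n_components):
    """Project X onto the leading eigenvectors of its correlation matrix."""
    X = np.asarray(X, dtype=float)
    # Standardise each variable so that its covariance equals the correlation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)

    # eigh is suited to symmetric matrices; it returns eigenvalues in
    # ascending order, so they are re-sorted from largest to smallest.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Fraction of the total variance captured by each retained component.
    explained = eigvals[:n_components] / eigvals.sum()
    scores = Z @ eigvecs[:, :n_components]
    return scores, explained


# Synthetic data with one strongly correlated pair of variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)

scores, explained = pca_correlation(X, n_components=2)
print(scores.shape, explained)
```

The explained-variance fractions are the kind of quantity typically inspected when deciding how many components to keep.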

Classification Techniques

Logistic Regression
The Logistic Regression (LR) classification technique makes use of a sigmoid function to calculate the class membership probability, whose parameters are fitted following a gradient descent criterion [5].
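As a minimal sketch of this idea, the following code fits a logistic regression by batch gradient descent on the cross-entropy loss. The learning rate, iteration count, function names, and toy data are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, n_iter=2000):
    """Fit weights (bias first) by batch gradient descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (p - y) / len(y)  # gradient of the mean cross-entropy
        w -= lr * grad
    return w


def predict_proba(w, X):
    """Probability of the positive class for each row of X."""
    X = np.asarray(X, dtype=float)
    return sigmoid(w[0] + X @ w[1:])


# Toy one-dimensional problem: the class changes between x = 1 and x = 2.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w = fit_logistic_regression(X, y)
print(predict_proba(w, X))
```

Thresholding the returned probabilities at 0.5 yields the binary "normal"/"attack" style decision.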

K Nearest Neighbours
This classification method uses the data density to label a new instance. To estimate the class membership, it finds the K Nearest Neighbours (KNN) of the instance and counts how many of them belong to each class [5].
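The neighbour-counting rule can be sketched in a few lines. Euclidean distance, K = 3, and the toy points are assumptions for illustration; the source does not state which distance or value of K was used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Pair every training sample's distance to the query with its label.
    distances = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    # Count the classes among the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional samples (e.g. two PCA components).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(knn_predict(train_X, train_y, (0.5, 0.5)))  # normal
print(knn_predict(train_X, train_y, (5.5, 5.5)))  # attack
```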

Decision Trees
A Decision Tree (DT) algorithm is implemented by repeatedly splitting the dataset using a criterion that maximises the sample separation. Each split is chosen to maximise the resulting decrease in entropy [5].
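The entropy-based choice of a split can be illustrated for a single numeric variable as follows. This is a sketch of one split only, not a full tree; the function names and toy values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return the (threshold, gain) that maximises the entropy decrease."""
    parent = entropy(labels)
    best_gain, best_thr = 0.0, None
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= thr]
        right = [y for v, y in zip(values, labels) if v > thr]
        # Entropy after the split, weighted by the size of each branch.
        child = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = parent - child
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    return best_thr, best_gain


values = [1, 2, 3, 10, 11, 12]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(best_threshold(values, labels))
```

A full tree repeats this search over every variable and then recurses into each branch until a stopping rule is met.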

Deep Neural Networks
Deep Neural Networks (DNN) are based on an architecture made of multiple layers, whose neurons are connected with the neurons of adjacent layers. The weight of each connection and the parameters of the activation functions are tuned during the training process following an error-minimisation criterion [5].
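The training principle can be sketched with a very small feed-forward network. For brevity the sketch uses a single hidden layer, whereas a deeper network repeats the same forward and backward pattern per layer; the layer size, tanh activation, learning rate, and toy data are all assumptions, not the configuration used in this work.

```python
import numpy as np


def init_params(n_in, n_hidden, rng):
    """Small random weights and zero biases for a one-hidden-layer network."""
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Return hidden activations and the predicted positive-class probability."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    logits = H @ params["W2"] + params["b2"]
    return H, (1.0 / (1.0 + np.exp(-logits))).ravel()


def loss(params, X, y):
    """Mean binary cross-entropy, the error being minimised."""
    _, p = forward(params, X)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


def train_step(params, X, y, lr=0.1):
    """One gradient-descent update of all weights via backpropagation."""
    H, p = forward(params, X)
    d_out = (p - y).reshape(-1, 1) / len(y)             # gradient at the output logit
    d_hidden = (d_out @ params["W2"].T) * (1 - H ** 2)  # back through tanh
    params["W2"] -= lr * (H.T @ d_out)
    params["b2"] -= lr * d_out.sum(axis=0)
    params["W1"] -= lr * (X.T @ d_hidden)
    params["b1"] -= lr * d_hidden.sum(axis=0)


# Hypothetical toy data: two well-separated groups of points.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [1.0, 1.0], [0.9, 0.8], [0.8, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
params = init_params(2, 4, np.random.default_rng(0))
print("loss before:", loss(params, X, y))
for _ in range(200):
    train_step(params, X, y)
print("loss after: ", loss(params, X, y))
```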

Experimental Setup
Different experiments were carried out to obtain the best classifier. First, with the aim of minimising computation times and improving classifier performance, a dimension reduction was carried out using PCA. Two reductions were considered: two components and five components. A 10-fold cross-validation was performed, measuring the accuracy, F1 score, precision, recall, specificity, and the Area Under the Receiver Operating Characteristic Curve (AUC) [6]. This last measure is the one selected to determine the best classifier, because it is insensitive to class distribution.
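The evaluation measures and the fold construction can be sketched in plain Python as follows. This is an illustration under assumptions: the "attack" class is coded as 1, the folds are contiguous and unshuffled, and the AUC is computed through its pairwise-ranking interpretation, which is simple but quadratic in the number of samples.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, specificity and F1 from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1]))
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Because the AUC depends only on how positives rank against negatives, it does not change when one class is much larger than the other, which is the property the selection criterion relies on.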

Results
First, an initial analysis of the PCA result was conducted. Based on the results shown in Figure 1, two and five components were selected. With these configurations, the four classification techniques were tested, leading to the final results shown in Figure 2.

Conclusions
The present paper deals with the detection of DoS attacks by means of intelligent classifiers. LR classifiers do not perform as well as the rest of the techniques. Furthermore, using two or five components does not significantly affect the classifiers' performance. The implementation of this approach could entail significant benefits for IoT environments with MQTT protocols.