Open Access Article

Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools

by Shachar Siboni 1,* and Asaf Cohen 2,*
1 Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
2 School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
* Authors to whom correspondence should be addressed.
This paper is an extended version of our paper published in IEEE International Workshop on Information Forensics and Security (WIFS), Atlanta, GA, USA, 3–5 December 2014. This version includes additional theoretical and experimental results, as well as overviews, explanations and discussion.
Entropy 2020, 22(6), 649; https://doi.org/10.3390/e22060649
Received: 25 April 2020 / Revised: 3 June 2020 / Accepted: 8 June 2020 / Published: 12 June 2020
(This article belongs to the Special Issue Information Theoretic Security and Privacy of Information Systems)
Anomaly detection refers to the problem of identifying abnormal behaviour within a set of measurements. In many cases, one has some statistical model for normal data, and wishes to identify whether new data fit the model or not. In other cases, however, while there are normal data to learn from, there is no statistical model for these data, and there is no structured parameter set to estimate. Thus, one is forced to assume an individual sequences setup, where there is no given model or any guarantee that such a model exists. In this work, we propose a universal anomaly detection algorithm for one-dimensional time series that is able to learn the normal behaviour of systems and alert for abnormalities, without assuming anything about the normal data or about the anomalies. The suggested method utilizes new information measures derived from the Lempel–Ziv (LZ) compression algorithm in order to optimally and efficiently learn the normal behaviour (during learning), and then estimate the likelihood of new data (during operation) and classify it accordingly. We apply the algorithm to key problems in computer security, as well as to a benchmark anomaly detection data set, all using simple, single-feature time-indexed data. The first is detecting botnet Command and Control (C&C) channels without deep inspection. We then apply it to the problems of malicious tool detection via system call monitoring and data leakage identification. We conclude with the New York City (NYC) taxi data. Finally, using information-theoretic tools, we show that an attacker's attempt to maliciously fool the detection system by trying to generate normal data is bound to fail, either due to a high probability of error or because of the need for huge amounts of resources.
Keywords: anomaly detection; individual sequences; one-dimensional time series; universal compression; probability assignment; statistical model; learning; computer security; botnets; command and control channels; NYC taxi data
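
The learn-then-score idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their information measures. It is a toy LZ78-style phrase tree with add-one smoothing, written only to show how a compression-based probability assignment can separate familiar from unfamiliar sequences. The class name `LZ78Model`, the method `avg_code_length`, the example sequences, and the threshold of 1.0 bit per symbol are all assumptions made for illustration.

```python
import math


class LZ78Model:
    """Toy LZ78-style phrase tree with traversal counts (illustrative only)."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.children = [{}]  # children[node] maps symbol -> child node id
        self.counts = [0]     # counts[node] = times the node was reached in training

    def fit(self, sequence):
        """Parse the training sequence into LZ78 phrases and grow the tree."""
        node = 0
        for sym in sequence:
            child = self.children[node].get(sym)
            if child is None:
                # End of a new phrase: add a leaf and restart from the root.
                self.children[node][sym] = len(self.children)
                self.children.append({})
                self.counts.append(1)
                node = 0
            else:
                self.counts[child] += 1
                node = child
        return self

    def avg_code_length(self, sequence):
        """Average -log2 probability per symbol under the learned tree."""
        node, total, k = 0, 0.0, len(self.alphabet)
        for sym in sequence:
            kids = self.children[node]
            # Add-one smoothing over the alphabet (an assumption of this sketch).
            denom = sum(self.counts[c] for c in kids.values()) + k
            child = kids.get(sym)
            num = (self.counts[child] if child is not None else 0) + 1
            total += -math.log2(num / denom)
            # Return to the root after an unseen symbol or on reaching a leaf.
            node = child if child is not None and self.children[child] else 0
        return total / len(sequence)


# Hypothetical data: a periodic "normal" pattern and a sequence that breaks it.
model = LZ78Model(alphabet="ab").fit("ab" * 200)
normal_score = model.avg_code_length("ab" * 20)
anomalous_score = model.avg_code_length("aabb" * 10)

THRESHOLD_BITS = 1.0  # assumed threshold; in practice it would be tuned on data
for name, score in [("normal", normal_score), ("anomalous", anomalous_score)]:
    label = "ANOMALY" if score > THRESHOLD_BITS else "ok"
    print(f"{name}: {score:.2f} bits/symbol -> {label}")
```

A sequence that follows the learned pattern receives a short average code length, while one that departs from it forces low-probability transitions and a longer code length, which is the signal a threshold rule acts on.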

MDPI and ACS Style

Siboni, S.; Cohen, A. Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools. Entropy 2020, 22, 649.
