Article

A Compression-Based Method for Detecting Anomalies in Textual Data

Gonzalo de la Torre-Abaitua, Luis F. Lago-Fernández and David Arroyo

1 Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, 28049 Madrid, Spain
2 Institute of Physical and Information Technologies (ITEFI), Spanish National Research Council (CSIC), 28006 Madrid, Spain
* Author to whom correspondence should be addressed.
Academic Editor: Sotiris Kotsiantis
Entropy 2021, 23(5), 618; https://doi.org/10.3390/e23050618
Received: 31 March 2021 / Revised: 12 May 2021 / Accepted: 12 May 2021 / Published: 16 May 2021
(This article belongs to the Special Issue Information Theoretic Security and Privacy of Information Systems)
Information and communications technology systems are fundamental assets of our social and economic model, and they should therefore be properly protected against the malicious activity of cybercriminals. Defence mechanisms are generally built around tools that trace and store information in several ways, the simplest being the generation of plain text files known as security logs. Security analysts usually inspect such log files, in a semi-automatic way, to detect events that may affect system integrity, confidentiality and availability. On this basis, we propose a parameter-free method to detect security incidents from structured text regardless of its nature. We use the Normalized Compression Distance to obtain a set of features that can be used by a Support Vector Machine to classify events from a heterogeneous cybersecurity environment. In particular, we explore and validate the application of our method in four different cybersecurity domains: HTTP anomaly identification, spam detection, Domain Generation Algorithms tracking and sentiment analysis. The results obtained show the validity and flexibility of our approach in different security scenarios with a low configuration burden.
Keywords: intrusion detection systems; anomaly detection; normalized compression distance; text mining; data-driven security
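
To make the feature-extraction idea in the abstract concrete, the following is a minimal sketch of the Normalized Compression Distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. It is not the authors' implementation: the choice of zlib as the compressor, the helper names `compressed_size`, `ncd` and `ncd_features`, and the reference strings are all assumptions made for illustration.

```python
import zlib
from typing import List, Sequence


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two strings.

    Assumption: zlib stands in for the compressor; the source does not
    state which compressor is used.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_features(sample: str, references: Sequence[str]) -> List[float]:
    """Map a text sample to a vector of NCD values, one per reference string."""
    return [ncd(sample, ref) for ref in references]


# Hypothetical reference strings, invented for this example only.
references = [
    "GET /index.html HTTP/1.1",
    "GET /search?q=' OR 1=1 -- HTTP/1.1",
]

# One feature vector per log line; such vectors could then be passed to a
# classifier such as a Support Vector Machine (not shown here).
features = ncd_features("GET /about.html HTTP/1.1", references)
```

Each sample becomes a fixed-length numeric vector regardless of the text's structure, which is what allows a standard classifier to be trained on it. Note that on very short strings the compressor's fixed overhead dominates, so NCD values are noisy and may slightly exceed 1.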

MDPI and ACS Style

de la Torre-Abaitua, G.; Lago-Fernández, L.F.; Arroyo, D. A Compression-Based Method for Detecting Anomalies in Textual Data. Entropy 2021, 23, 618. https://doi.org/10.3390/e23050618

AMA Style

de la Torre-Abaitua G, Lago-Fernández LF, Arroyo D. A Compression-Based Method for Detecting Anomalies in Textual Data. Entropy. 2021; 23(5):618. https://doi.org/10.3390/e23050618

Chicago/Turabian Style

de la Torre-Abaitua, Gonzalo; Lago-Fernández, Luis F.; Arroyo, David. 2021. "A Compression-Based Method for Detecting Anomalies in Textual Data" Entropy 23, no. 5: 618. https://doi.org/10.3390/e23050618

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.