Comparative Study between Big Data Analysis Techniques in Intrusion Detection
Abstract: Cybersecurity Ventures expects that cyber-attack damage costs will rise to $11.5 billion in 2019 and that a business will fall victim to a cyber-attack every 14 seconds. Notice that the time frame for such an event is measured in seconds. With petabytes of data generated each day, detecting attacks at this pace is a challenging task for traditional intrusion detection systems (IDSs). Protecting sensitive information is a major concern for both businesses and governments. Therefore, a real-time, large-scale and effective IDS is a must. In this work, we present a cloud-based, fault-tolerant, scalable and distributed IDS that uses Apache Spark Structured Streaming and its Machine Learning library (MLlib) to detect intrusions in real time. To demonstrate the effectiveness of this system, we implement it within Microsoft Azure Cloud, as it provides both processing power and storage capabilities. A decision tree algorithm is used to predict the nature of incoming data. For this task, we use the MAWILab dataset as a data source to gain better insight into the system's capabilities against cyber-attacks. The experimental results showed 99.95% accuracy, and the proposed system processed more than 55,175 events per second on a small cluster.
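To put the reported figures in operational terms, the short sketch below converts the throughput and accuracy quoted in the abstract into events per day and expected misclassifications per second. It is only back-of-the-envelope arithmetic: the function names are hypothetical, and it assumes that the reported accuracy would carry over unchanged to live traffic at the full reported rate, which the abstract does not claim.

```python
# Figures quoted in the abstract.
EVENTS_PER_SECOND = 55_175   # reported lower bound on throughput
ACCURACY = 0.9995            # reported accuracy (99.95%)
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(rate: int = EVENTS_PER_SECOND) -> int:
    """Events processed in one day at a constant rate (events/second)."""
    return rate * SECONDS_PER_DAY


def expected_errors_per_second(rate: int = EVENTS_PER_SECOND,
                               accuracy: float = ACCURACY) -> float:
    """Expected misclassified events per second.

    Assumption (not from the source): the offline accuracy applies
    uniformly to every event in the live stream.
    """
    return rate * (1.0 - accuracy)


if __name__ == "__main__":
    print(f"events per day: {events_per_day():,}")
    print(f"expected errors per second: {expected_errors_per_second():.2f}")
```

Under these assumptions, even a very high accuracy still leaves a steady trickle of misclassified events at this volume, which is why throughput and accuracy are best read together.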
Cite This Article
Hafsa, M.; Jemili, F. Comparative Study between Big Data Analysis Techniques in Intrusion Detection. Big Data Cogn. Comput. 2019, 3, 1.
Hafsa M, Jemili F. Comparative Study between Big Data Analysis Techniques in Intrusion Detection. Big Data and Cognitive Computing. 2019; 3(1):1.
Chicago/Turabian Style
Hafsa, Mounir; Jemili, Farah. 2019. "Comparative Study between Big Data Analysis Techniques in Intrusion Detection." Big Data Cogn. Comput. 3, no. 1: 1.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.