Machine Learning for Attack and Defense in Cybersecurity

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 December 2021) | Viewed by 7172

Special Issue Editors


Prof. Dr. Seiichi Ozawa
Guest Editor
Center for Mathematical and Data Sciences, Kobe University, Kobe 657-8501, Japan
Interests: neural networks; machine learning; big data analysis; cybersecurity; privacy-preserving machine learning; text mining; pattern recognition

Dr. Tao Ban
Guest Editor
National Institute of Information and Communications Technology, Tokyo 184-8795, Japan
Interests: cybersecurity; network security; malware analysis; data mining; machine learning; neural networks

Special Issue Information

Dear Colleagues,

Thanks to breakthroughs in AI technology in the big data era, AI is now used throughout daily life, from consumer electronics, vehicles, and smart systems in offices and factories to cloud services on the Internet. AI and machine learning are becoming key technologies that support our lives and society. As a result, our living infrastructure, property, and privacy are increasingly exposed to cyberattacks. More seriously, attackers are also believed to understand the characteristics of machine learning and to exploit them to evade detection and to carry out cyberattacks effectively and at scale. To face this situation, we should discuss machine learning for cybersecurity not only from the defender side (blue team) but also from the attacker side (red team). This Special Issue welcomes high-quality papers on machine learning approaches to all types of attacks and defenses in cybersecurity. Case studies on actual cyberattack and cyberdefense systems based on machine learning are also welcome.

Prof. Dr. Seiichi Ozawa
Dr. Tao Ban
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning for cyberattack
  • machine learning for cyberdefense
  • machine learning for data analysis in cybersecurity
  • machine learning for threat analysis
  • machine learning for visualization of cyberattacks
  • generative model for cyberattacks
  • evasion attacks to/with machine learning models
  • machine learning for/against adversarial examples

Published Papers (3 papers)

Research

16 pages, 2427 KiB  
Article
Investigating the Influence of Feature Sources for Malicious Website Detection
by Ahmad Chaiban, Dušan Sovilj, Hazem Soliman, Geoff Salmon and Xiaodong Lin
Appl. Sci. 2022, 12(6), 2806; https://doi.org/10.3390/app12062806 - 9 Mar 2022
Cited by 7 | Viewed by 2094
Abstract
Malicious websites in general, and phishing websites in particular, attempt to mimic legitimate websites in order to trick users into trusting them. These websites, often a primary method for credential collection, pose a severe threat to large enterprises. Credential collection enables malicious actors to infiltrate enterprise systems without triggering the usual alarms. Therefore, there is a vital need to gain deep insights into the statistical features of these websites that enable Machine Learning (ML) models to distinguish them from their benign counterparts. Our objective in this paper is to provide this investigation. More specifically, our first contribution is to observe and evaluate combinations of feature sources that have not been studied in the existing literature, primarily involving embeddings extracted with Transformer-type neural networks. The second contribution is a new dataset for this problem, GAWAIN, constructed in a way that offers other researchers access not only to the data but also to our whole data acquisition and processing pipeline. The experiments on our new GAWAIN dataset show that the classification problem is much harder than reported in other studies: we obtain a test accuracy of around 84%. Among individual features, the most relevant ones come from URL embeddings, indicating that this additional step in the processing pipeline is needed in order to improve predictions. A surprising outcome of the investigation is the absence of content-related features (HTML, JavaScript) from the top-10 list. When comparing models trained on features commonly used in the literature with models trained on embedding-related features, the gain with embeddings is slightly above 1% in test accuracy. However, we argue that even this somewhat small increase can play a significant role in detecting malicious websites, and thus these types of feature categories are worth investigating further.
(This article belongs to the Special Issue Machine Learning for Attack and Defense in Cybersecurity)

20 pages, 1071 KiB  
Article
Detecting Web-Based Attacks with SHAP and Tree Ensemble Machine Learning Methods
by Samuel Ndichu, Sangwook Kim, Seiichi Ozawa, Tao Ban, Takeshi Takahashi and Daisuke Inoue
Appl. Sci. 2022, 12(1), 60; https://doi.org/10.3390/app12010060 - 22 Dec 2021
Cited by 4 | Viewed by 2505
Abstract
Attacks using Uniform Resource Locators (URLs) and their JavaScript (JS) code content to perpetrate malicious activities on the Internet are rampant and continuously evolving. Methods such as blocklisting, client honeypots, domain reputation inspection, and heuristic and signature-based systems are used to detect these malicious activities. Recently, machine learning approaches have been proposed; however, challenges still exist. First, blocklist systems are easily evaded by new URLs and JS code content, obfuscation, fast-flux, cloaking, and URL shortening. Second, heuristic and signature-based systems do not generalize well to zero-day attacks. Third, the Domain Name System allows cybercriminals to migrate their malicious servers easily and hide their Internet Protocol (IP) addresses behind domain names. Finally, crafting fully representative features is challenging, even for domain experts. This study proposes a feature selection and classification approach for malicious JS code content using Shapley additive explanations (SHAP) and tree ensemble methods. The JS code features are obtained from the Abstract Syntax Tree form of the JS code, sample JS attack code, and association rule mining. The malicious and benign JS code datasets obtained from Hynek Petrak and the Majestic Million Service were used for performance evaluation. We compared the performance of the proposed method with that of other feature selection methods in the task of malicious JS code content detection. Our experimental results show that, with a recall of 0.9989, the proposed approach is the better prediction model.
(This article belongs to the Special Issue Machine Learning for Attack and Defense in Cybersecurity)
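As a rough illustration of the idea behind Shapley additive explanations mentioned in the abstract above, the following minimal sketch computes exact Shapley values by brute force and ranks features by their absolute contribution. It is not the authors' pipeline: the scoring function `toy_score`, the three feature names, and the all-zero baseline are hypothetical, and the enumeration is exponential in the number of features, so it only suits toy inputs.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(features):
    # Hypothetical scorer over three made-up JS features (assumption, not from the paper).
    eval_calls, string_entropy, comment_ratio = features
    return 0.5 * eval_calls + 0.3 * string_entropy + 0.2 * eval_calls * string_entropy


x = [2.0, 1.0, 0.4]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_score, x, baseline)

# Rank feature indices by absolute contribution, largest first.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
```

The contributions sum to the difference between the score of `x` and the score of the baseline, and the third feature, which `toy_score` ignores, receives a value of zero. A feature selection step could keep only the top-ranked features before training a classifier.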

24 pages, 1201 KiB  
Article
Two-Phase Deep Learning-Based EDoS Detection System
by Chien-Nguyen Nhu and Minho Park
Appl. Sci. 2021, 11(21), 10249; https://doi.org/10.3390/app112110249 - 1 Nov 2021
Cited by 1 | Viewed by 1612
Abstract
Cloud computing is currently considered the most cost-effective platform for offering business and consumer IT services over the Internet. However, it is prone to new vulnerabilities. A new type of attack called an economic denial of sustainability (EDoS) attack exploits the pay-per-use model to scale up resource usage over time to the extent that the cloud user has to pay unexpected usage charges. To prevent EDoS attacks, a few solutions have been proposed, including hard-threshold and machine learning-based solutions. Among them, long short-term memory (LSTM)-based solutions achieve much better accuracy and false-alarm rates than hard-threshold and other machine learning-based solutions. However, LSTM requires a long input sequence, which degrades performance by increasing the computation, the detection time, and the computing resources consumed by the defense system. We therefore propose a two-phase deep learning-based EDoS detection scheme that uses an LSTM model to detect each abnormal flow in network traffic while requiring only a short input sequence of length five. Thus, the proposed scheme can take advantage of the efficiency of the LSTM algorithm in detecting each abnormal flow in network traffic, while reducing the required sequence length of the input data. A comprehensive performance evaluation shows that our proposed scheme outperforms the existing solutions in terms of accuracy and resource consumption.
(This article belongs to the Special Issue Machine Learning for Attack and Defense in Cybersecurity)
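The abstract above states only that the LSTM model needs an input sequence of length five; it does not say how the sequences are built. One plausible reading, sketched below under that assumption, is a sliding window over consecutive per-flow measurements. The `sliding_windows` helper and the sample request counts are hypothetical.

```python
from collections import deque

SEQ_LEN = 5  # sequence length stated in the abstract


def sliding_windows(records, seq_len=SEQ_LEN):
    """Yield every run of `seq_len` consecutive records as a list."""
    window = deque(maxlen=seq_len)  # the oldest record drops out automatically
    for record in records:
        window.append(record)
        if len(window) == seq_len:
            yield list(window)


# Hypothetical per-interval request counts for a single flow.
request_counts = [3, 4, 3, 5, 4, 40, 42]
windows = list(sliding_windows(request_counts))
```

Each window could then be passed to a sequence model as one input. A shorter window means fewer time steps per prediction, which is consistent with the reduced computation and detection time that the abstract reports.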