Machine Learning Methodologies and Applications in Cybersecurity Data Analysis

Special Issue Editors


Guest Editor
School of Computer, National University of Defense Technology, Changsha 410073, China
Interests: AI for networks; multipath transmission; cybersecurity

Guest Editor
Department of Electrical and Electronic Systems Engineering, College of Engineering, Ibaraki University, Hitachi city, Japan
Interests: wireless communication; wireless sensing; AI; security

Guest Editor
Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
Interests: machine learning; data analysis; security; bioinformatics

Guest Editor
Department of Information Technology, Hunan Police Academy, Changsha 410000, China
Interests: cybersecurity; deep learning; artificial intelligence for IT operations

Special Issue Information

Dear Colleagues,

Machine learning (ML) is a pivotal technology for current and future information systems, and many domains already leverage its capabilities. However, ML deployment in cybersecurity is still at an early stage, revealing a significant gap between research and practice. ML can quickly analyze large volumes of historical and dynamic data, enabling applications to operationalize data from various sources in near-real time. Recently, we have witnessed rapid development of ML methodologies and applications for cybersecurity data analysis in threat detection, raw data analysis, and alert management, among others. Yet, in this specific domain, realizing the full benefits of ML in practice depends on reconciling the underlying conflict between the intrinsic characteristics of the cybersecurity domain and the fundamental assumptions of ML.

This Special Issue aims to collect recent advances in machine learning methodologies and applications that tackle cybersecurity data challenges. We particularly value interdisciplinary research that contributes new challenges, research questions, approaches, and datasets related to this topic.

We invite new research contributions on machine learning methodologies and applications specifically tailored to cybersecurity data analysis challenges. The scope includes, but is not limited to, the following topics:

  • ML methods and applications for capturing/handling/evaluating cybersecurity datasets;
  • ML methods and applications for data-driven cybersecurity decision making;
  • ML methods and applications for security policy rule generation;
  • ML methods and applications for protecting valuable security data;
  • ML methods and applications for context-aware cybersecurity data analysis;
  • ML methods and applications for feature engineering in cybersecurity;
  • ML methods and applications for PHY/MAC/L3-L7 security protocol design and evaluation;
  • ML methods and applications for PHY/MAC/L3-L7 security protocol optimization;
  • ML methods and applications for data-driven network protocol fuzzing;
  • ML methods and applications for data-driven anomaly/intrusion detection;
  • ML methods and applications for data-driven network traffic analysis;
  • ML methods and applications for data-driven endpoint detection and response;
  • ML methods and applications for data-driven cybersecurity defense frameworks;
  • Cybersecurity datasets/benchmarks for data analysis in ML methods and applications;
  • Cybersecurity prototypes/testbeds for data analysis in ML methods and applications.

We look forward to receiving your contributions.  

Prof. Dr. Biao Han
Dr. Xiaoyan Wang
Prof. Dr. Xiucai Ye
Dr. Na Zhao
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • cybersecurity
  • data science
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

22 pages, 23322 KiB  
Article
MS-PreTE: A Multi-Scale Pre-Training Encoder for Mobile Encrypted Traffic Classification
by Ziqi Wang, Yufan Qiu, Yaping Liu, Shuo Zhang and Xinyi Liu
Big Data Cogn. Comput. 2025, 9(8), 216; https://doi.org/10.3390/bdcc9080216 - 21 Aug 2025
Abstract
Mobile traffic classification serves as a fundamental component in network security systems. In recent years, pre-training methods have significantly advanced this field. However, as mobile traffic is typically mixed with third-party services, the deep integration of such shared services results in highly similar TCP flow characteristics across different applications. This makes it challenging for existing traffic classification methods to effectively identify mobile traffic. To address the challenge, we propose MS-PreTE, a two-phase pre-training framework for mobile traffic classification. MS-PreTE introduces a novel multi-level representation model to preserve traffic information from diverse perspectives and hierarchical levels. Furthermore, MS-PreTE incorporates a focal-attention mechanism to enhance the model’s capability in discerning subtle differences among similar traffic flows. Evaluations demonstrate that MS-PreTE achieves state-of-the-art performance on three mobile application datasets, boosting the F1 score for Cross-platform (iOS) to 99.34% (up by 2.1%), Cross-platform (Android) to 98.61% (up by 1.6%), and NUDT-Mobile-Traffic to 87.70% (up by 2.47%). Moreover, MS-PreTE exhibits strong generalization capabilities across four real-world traffic datasets.

18 pages, 5825 KiB  
Article
Detection and Localization of Hidden IoT Devices in Unknown Environments Based on Channel Fingerprints
by Xiangyu Ju, Yitang Chen, Zhiqiang Li and Biao Han
Big Data Cogn. Comput. 2025, 9(8), 214; https://doi.org/10.3390/bdcc9080214 - 20 Aug 2025
Abstract
In recent years, hidden IoT monitoring devices installed indoors have raised significant concerns about privacy breaches and other security threats. To address the challenges of detecting such devices, low positioning accuracy, and lengthy detection times, this paper proposes a hidden device detection and localization system that operates on the Android platform. This technology utilizes the Received Signal Strength Indication (RSSI) signals received by the detection terminal device to achieve the detection, classification, and localization of hidden IoT devices in unfamiliar environments. This technology integrates three key designs: (1) actively capturing the RSSI sequence of hidden devices by sending RTS frames and receiving CTS frames, which is used to generate device channel fingerprints and estimate the distance between hidden devices and detection terminals; (2) training an RSSI-based ranging model using the XGBoost algorithm, followed by multi-point localization for accurate positioning; (3) implementing augmented reality-based visual localization to support handheld detection terminals. This prototype system successfully achieves active data sniffing based on RTS/CTS and terminal localization based on the RSSI-based ranging model, effectively reducing signal acquisition time and improving localization accuracy. Real-world experiments show that the system can detect and locate hidden devices in unfamiliar environments, achieving an accuracy of 98.1% in classifying device types. The time required for detection and localization is approximately one-sixth of existing methods, with system runtime maintained within 5 min. The localization error is 0.77 m, a 48.7% improvement over existing methods with an average error of 1.5 m.

19 pages, 10741 KiB  
Article
Electroencephalography-Based Motor Imagery Classification Using Multi-Scale Feature Fusion and Adaptive Lasso
by Shimiao Chen, Nan Li, Xiangzeng Kong, Dong Huang and Tingting Zhang
Big Data Cogn. Comput. 2024, 8(12), 169; https://doi.org/10.3390/bdcc8120169 - 25 Nov 2024
Viewed by 1504
Abstract
Brain–computer interfaces, where motor imagery electroencephalography (EEG) signals are transformed into control commands, offer a promising solution for enhancing the standard of living for disabled individuals. However, the performance of EEG classification has been limited in most studies due to a lack of attention to the complementary information inherent at different temporal scales. Additionally, significant inter-subject variability in sensitivity to biological motion poses another critical challenge in achieving accurate EEG classification in a subject-dependent manner. To address these challenges, we propose a novel machine learning framework combining multi-scale feature fusion, which captures global and local spatial information from different-sized EEG segmentations, and adaptive Lasso-based feature selection, a mechanism for adaptively retaining informative subject-dependent features and discarding irrelevant ones. Experimental results on multiple public benchmark datasets revealed substantial improvements in EEG classification, achieving rates of 81.36%, 75.90%, and 68.30% for the BCIC-IV-2a, SMR-BCI, and OpenBMI datasets, respectively. These results not only surpassed existing methodologies but also underscored the effectiveness of our approach in overcoming specific challenges in EEG classification. Ablation studies further confirmed the efficacy of both the multi-scale feature analysis and adaptive selection mechanisms. This framework marks a significant advancement in the decoding of motor imagery EEG signals, positioning it for practical applications in real-world BCIs.

15 pages, 2140 KiB  
Article
Adaptive Management of Multi-Scenario Projects in Cybersecurity: Models and Algorithms for Decision-Making
by Vadim Tynchenko, Alexander Lomazov, Vadim Lomazov, Dmitry Evsyukov, Vladimir Nelyub, Aleksei Borodulin, Andrei Gantimurov and Ivan Malashin
Big Data Cogn. Comput. 2024, 8(11), 150; https://doi.org/10.3390/bdcc8110150 - 4 Nov 2024
Cited by 2 | Viewed by 1908
Abstract
In recent years, cybersecurity management has increasingly required advanced methodologies capable of handling complex, evolving threat landscapes. Scenario network-based approaches have emerged as effective strategies for managing uncertainty and adaptability in cybersecurity projects. This article introduces a scenario network-based approach for managing cybersecurity projects, utilizing fuzzy linguistic models and a Takagi–Sugeno–Kang fuzzy neural network. Drawing upon L. Zadeh’s theory of linguistic variables, the methodology integrates expert analysis, linguistic variables, and a continuous genetic algorithm to predict membership function parameters. Fuzzy production rules are employed for decision-making, while the Mamdani fuzzy inference algorithm enhances interpretability. This approach enables multi-scenario planning and adaptability across multi-stage cybersecurity projects. Preliminary results from a research prototype of an intelligent expert system—designed to analyze project stages and adaptively construct project trajectories—suggest the proposed approach is effective. In computational experiments, the use of fuzzy procedures resulted in an over 25% reduction in errors compared to traditional methods, particularly in adjusting project scenarios from pessimistic to baseline projections. While promising, this approach requires further testing across diverse cybersecurity contexts. Future studies will aim to refine scenario adaptation and optimize system response in high-risk project environments.
