This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Semi-Supervised Learning for Intrusion Detection in Large Computer Networks
by Brandon Williams and Lijun Qian *
CREDIT Center, Department of Electrical and Computer Engineering, Prairie View A&M University, Prairie View, TX 77446, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(11), 5930; https://doi.org/10.3390/app15115930 (registering DOI)
Submission received: 20 October 2024 / Revised: 17 May 2025 / Accepted: 21 May 2025 / Published: 24 May 2025
Abstract
In an increasingly interconnected world, securing large networks against cyber-threats has become paramount as cyberattacks grow more frequent and more difficult and expensive to remedy. This research explores data-driven security by applying semi-supervised machine learning techniques to intrusion detection in large-scale network environments. Novel methods (decision tree with entropy-based uncertainty sampling, logistic regression with self-training, and co-training with random forest) are proposed to perform intrusion detection with limited labeled data. These methods leverage both the available labeled data and the abundant unlabeled data. Extensive experiments on the CIC-DDoS2019 dataset show promising results: both the decision tree with entropy-based uncertainty sampling and the co-training with random forest models achieve 99% accuracy. Furthermore, the UNSW-NB15 dataset is used for a comparative analysis between base models (random forest, decision tree, and logistic regression) trained on labeled data only and the proposed models trained on partially labeled data. The proposed methods yield superior results when 1%, 10%, and 50% of the data are labeled, highlighting their effectiveness and potential for improving intrusion detection systems in scenarios with limited labeled data.
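As a rough illustration of the entropy-based uncertainty sampling named in the abstract, the sketch below ranks unlabeled samples by the Shannon entropy of a classifier's predicted class probabilities and picks the most uncertain ones as candidates for labeling. The abstract does not describe the authors' exact procedure, so this is a generic sketch under stated assumptions: the function names `shannon_entropy` and `most_uncertain`, the query size `k`, and the probability values are all hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in bits) of one predicted class distribution."""
    # Zero-probability classes contribute nothing and are skipped.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def most_uncertain(prob_rows: Sequence[Sequence[float]], k: int) -> List[int]:
    """Indices of the k rows with the highest predictive entropy."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: shannon_entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical (benign, attack) probabilities for four unlabeled flows,
# standing in for the output of a trained decision tree.
unlabeled_probs = [
    [0.99, 0.01],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]

# Candidate flows to label next (assumed query size of 2).
query_indices = most_uncertain(unlabeled_probs, k=2)
```

Here the two flows whose predictions are closest to an even split are selected, since those are the ones the model is least sure about. In a full loop, the selected samples would be labeled, added to the training set, and the model retrained.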