Applied Sciences

Research

17 pages, 2335 KB

Open Access Article

QUIC Network Traffic Classification Using Ensemble Machine Learning Techniques

by Sultan Almuhammadi, Abdullatif Alnajim and Mohammed Ayub

Appl. Sci. 2023, 13(8), 4725; https://doi.org/10.3390/app13084725 - 9 Apr 2023

Cited by 15 | Viewed by 6208

The Quick UDP Internet Connections (QUIC) protocol provides advantages over traditional TCP, but its encryption functionality reduces operators' visibility into network traffic. Many studies apply machine learning and deep learning algorithms to QUIC traffic classification. However, standalone machine learning models are subject to overfitting and poor predictability in complex network traffic environments. Deep learning, on the other hand, requires a huge dataset and intensive parameter fine-tuning. By contrast, ensemble techniques provide reliability, better prediction, and robustness of the trained model, thereby reducing the chance of overfitting. In this paper, we approach the QUIC network traffic classification problem by utilizing five different ensemble machine learning techniques, namely: Random Forest, Extra Trees, Gradient Boosting Tree, Extreme Gradient Boosting Tree, and Light Gradient Boosting Model. We used a publicly available dataset covering five services: Google Drive, YouTube, Google Docs, Google Search, and Google Music. The models were trained with different numbers of features in different scenarios and evaluated using several performance metrics. The results show that Extreme Gradient Boosting Tree and Light Gradient Boosting Model outperform the other models and achieve some of the highest results among the state-of-the-art models found in the literature, with a simpler model and feature set.
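The ensemble idea in the abstract above can be illustrated with a minimal bagging sketch. This is not one of the five models used in the paper. It assumes decision stumps as base learners, bootstrap resampling, and a majority vote, which is the aggregation principle behind Random Forest. The two flow features and the service labels in the usage example are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-split decision stump with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority label (left is never empty).
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda r: l_lab if r[f] <= t else r_lab


def bagged_ensemble(rows, labels, n_models=25, seed=0):
    """Train stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(rows)) for _ in rows]
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))

    def predict(row):
        return Counter(m(row) for m in models).most_common(1)[0][0]

    return predict


# Hypothetical flows: [mean packet size in bytes, flow duration in seconds].
rows = [[100, 0.1], [120, 0.2], [110, 0.15], [1400, 5.0], [1350, 4.0], [1500, 6.0]]
labels = ["search", "search", "search", "video", "video", "video"]
predict = bagged_ensemble(rows, labels)
```

Because each stump sees a different bootstrap sample, a single badly placed split is outvoted by the rest, which is the variance-reduction argument usually made for ensembles.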

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

24 pages, 409 KB

Open Access Article

Detection of Algorithmically Generated Malicious Domain Names with Feature Fusion of Meaningful Word Segmentation and N-Gram Sequences

by Shaojie Chen, Bo Lang, Yikai Chen and Chong Xie

Appl. Sci. 2023, 13(7), 4406; https://doi.org/10.3390/app13074406 - 30 Mar 2023

Cited by 7 | Viewed by 4648

Abstract

Domain generation algorithms (DGAs) play an important role in network attacks and can be mainly divided into two types: dictionary-based and character-based. Dictionary-based algorithmically generated domains (AGDs) are similar in composition to normal domains and are harder to detect. Although methods based on meaningful word segmentation and n-gram sequence features exhibit good detection performance for AGDs, they are inadequate for mining meaningful word features of domain names, and the performance of hybrid detection of character-based and dictionary-based AGDs needs to be further improved. Therefore, in this paper, we first describe the composition of dictionary-based AGDs using meaningful word segmentation, introduce the standard deviation to better measure the word distribution features, and construct additional 11-dimensional statistical features for word segmentation results as a supplement. Then, by combining 3-gram and 1-gram sequence features, we improve the detection performance for both character-based and dictionary-based AGDs. Finally, we perform feature fusion of the above four kinds of features to achieve an end-to-end detection method for both kinds of AGDs. Experimental results showed that our method achieved an accuracy of 97.24% on the full dataset and better accuracy and F1 values than existing methods on both dictionary-based and character-based AGD datasets.
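The 1-gram and 3-gram sequence features and the standard-deviation statistic mentioned in the abstract above can be sketched as follows. This is a rough illustration, not the authors' feature pipeline. The function names are hypothetical, only the leftmost label of the domain is used, and the word list is assumed to come from a separate word-segmentation step that is not shown.

```python
from collections import Counter
from statistics import pstdev


def char_ngrams(domain, n):
    """Return the character n-grams of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def ngram_counts(domain, n):
    """Count how often each n-gram occurs in the domain's leftmost label."""
    return Counter(char_ngrams(domain, n))


def word_length_std(words):
    """Population standard deviation of word lengths (0.0 for no words)."""
    return pstdev([len(w) for w in words]) if words else 0.0


trigrams = char_ngrams("example.com", 3)
unigram_counts = ngram_counts("banana.net", 1)
spread = word_length_std(["blue", "sky"])  # words from an assumed segmenter
```

A fused feature vector would concatenate such counts and statistics before classification; how the paper encodes and fuses them is not specified in the abstract.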

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

21 pages, 4879 KB

Open Access Article

Anomaly Detection Method for Unknown Protocols in a Power Plant ICS Network with Decision Tree

by Kyoung-Mun Lee, Min-Yang Cho, Jung-Gu Kim and Kyung-Ho Lee

Appl. Sci. 2023, 13(7), 4203; https://doi.org/10.3390/app13074203 - 26 Mar 2023

Cited by 1 | Viewed by 3341

Abstract

This study aimed to enhance the stability and security of power plant control network systems by developing detection models using artificial intelligence and machine learning techniques. Due to the closed system operation policy of facility manufacturers, it is challenging to detect and respond to security threats using standard security systems. With the increasing digitization of control systems, the risk of external malware penetration is also on the rise. To address this, machine learning techniques were applied to extract patterns from network traffic data produced at an average of 6.5 TB per month, and fingerprinting was used to detect unregistered terminals accessing the control network. By setting a threshold between transmission amounts and attempts using one month of data, an anomaly judgment model was learned to define patterns of data communication between the origin and destination. The hypothesis was tested using machine learning techniques for cases in which a new pattern occurred and in which no traffic occurred. The study confirmed that this method can be applied not only to plant control systems but also to closed-structured control networks, where availability is critical, and to other industries that use large amounts of traffic data. Experimental results showed that the proposed model outperformed existing models in terms of detection efficiency and processing time.
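The idea in the abstract above of learning origin-destination communication patterns from a baseline period and flagging deviations can be sketched as below. The abstract does not give the actual threshold rule, so the `margin` multiplier, the record layout `(src, dst, bytes_sent)`, and the host names are all assumptions made for illustration.

```python
from collections import defaultdict


def learn_baseline(flows):
    """Summarise baseline traffic per (src, dst) pair.

    flows: iterable of (src, dst, bytes_sent) tuples from the baseline period.
    """
    baseline = defaultdict(lambda: {"max_bytes": 0, "attempts": 0})
    for src, dst, nbytes in flows:
        entry = baseline[(src, dst)]
        entry["max_bytes"] = max(entry["max_bytes"], nbytes)
        entry["attempts"] += 1
    return dict(baseline)


def judge(flow, baseline, margin=1.5):
    """Classify one flow against the baseline (margin is an assumed tolerance)."""
    src, dst, nbytes = flow
    entry = baseline.get((src, dst))
    if entry is None:
        return "new-pattern"  # pair never seen during the baseline period
    if nbytes > margin * entry["max_bytes"]:
        return "volume-anomaly"  # far more data than this pair ever sent
    return "normal"


# Hypothetical one-month baseline, reduced to three records.
baseline = learn_baseline([
    ("plc1", "hmi", 1000),
    ("plc1", "hmi", 1200),
    ("historian", "hmi", 500),
])
```

In a closed control network the set of legitimate pairs is small and stable, which is why an unseen pair is treated as a finding in its own right in this sketch.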

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

14 pages, 3243 KB

Open Access Article

Photovoltaic Power-Stealing Identification Method Based on Similar-Day Clustering and QRLSTM Interval Prediction

by Shurong Peng, Lijuan Guo, Bin Li, Shuang Lu, Huixia Chen and Sheng Su

Appl. Sci. 2023, 13(6), 3506; https://doi.org/10.3390/app13063506 - 9 Mar 2023

Cited by 2 | Viewed by 2080

Abstract

To fraudulently obtain state subsidies, some unscrupulous users use improper means to steal photovoltaic (PV) power. This behavior brings potential safety hazards to photovoltaic grid-connected operations. In this paper, a photovoltaic power-stealing identification method based on similar-day clustering and interval prediction of the quantile regression model for long short-term memory neural network (QRLSTM) is proposed. First, photovoltaic data are clustered into three similar-day types by similar-day clustering according to weather conditions. Second, the prediction performance of the QRLSTM method is illustrated by comparison with the quantile regression neural network (QRNN) prediction method. Third, using the prediction intervals with different confidence levels on the three similar-day types, according to the time scale (short-term, medium-term and long-term) combined with different electricity-stealing judgment indicators, a three-layer photovoltaic power-stealing screening framework is constructed, and the degree of user power stealing is qualitatively analyzed. Last, the power generation data of eight photovoltaic users in a certain region of northwest China and the data of four groups of artificially constructed power-stealing users are used as an example for simulation. The simulation results demonstrate the feasibility of the proposed method.
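Quantile regression models such as the QRLSTM named in the abstract above are typically trained with the pinball (quantile) loss, and the resulting lower and upper quantile predictions form a prediction interval. The sketch below shows that loss and a simple interval check. The function names and the use of "fraction of readings outside the interval" as a screening indicator are assumptions, not the paper's judgment indicators.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss of quantile predictions y_pred at quantile level tau."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)


def fraction_outside(actual, lower, upper):
    """Share of readings that fall outside their [lower, upper] interval."""
    outside = sum(1 for a, lo, hi in zip(actual, lower, upper) if a < lo or a > hi)
    return outside / len(actual)


# Hypothetical PV readings (kW) checked against a constant predicted interval.
share = fraction_outside([5.0, 9.0, 2.0], [4.0, 4.0, 4.0], [8.0, 8.0, 8.0])
```

In LaTeX, the loss for a single observation is $\rho_\tau(y, q) = \max\{\tau (y - q),\ (\tau - 1)(y - q)\}$, which the loop above evaluates case by case.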

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

15 pages, 4096 KB

Open Access Communication

Deep Learning-Based Network Intrusion Detection Using Multiple Image Transformers

by Taehoon Kim and Wooguil Pak

Appl. Sci. 2023, 13(5), 2754; https://doi.org/10.3390/app13052754 - 21 Feb 2023

Cited by 27 | Viewed by 5550

Abstract

The development of computer vision-based deep learning models for accurate two-dimensional (2D) image classification has enabled us to surpass existing machine learning-based classifiers and human classification capabilities. Recently, steady efforts have been made to apply these sophisticated vision-based deep learning models to network intrusion detection domains, and various experimental results have confirmed their applicability and limitations. In this paper, we present an optimized method for processing network intrusion detection system (NIDS) datasets using vision-based deep learning models by further expanding existing studies to overcome these limitations. Unlike the existing method, the proposed method converts the NIDS dataset into 2D images through various image transformers and then integrates them into three-channel RGB color images, further enhancing the performance of existing deep-learning-based intrusion detection. Various performance evaluations confirm that the proposed method can significantly improve intrusion detection performance over the recent method using grayscale images and over existing NIDSs that do not use images. As network intrusion is increasingly evolving in complexity and variety, we anticipate that the intrusion detection algorithm outlined in this study will facilitate network security.

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

17 pages, 1166 KB

Open Access Article

SAViP: Semantic-Aware Vulnerability Prediction for Binary Programs with Neural Networks

by Xu Zhou, Bingjie Duan, Xugang Wu and Pengfei Wang

Appl. Sci. 2023, 13(4), 2271; https://doi.org/10.3390/app13042271 - 10 Feb 2023

Cited by 1 | Viewed by 3245

Abstract

Vulnerability prediction, in which static analysis is leveraged to predict the vulnerabilities of binary programs, has become a popular research topic. Traditional vulnerability prediction methods depend on vulnerability patterns, which must be predefined by security experts in a time-consuming manner. The development of Artificial Intelligence (AI) has yielded new options for vulnerability prediction. Neural networks allow vulnerability patterns to be learned automatically. However, current works extract only one or two types of features and use traditional models such as word2vec, which results in the loss of much instruction-level information. In this paper, we propose a model named SAViP to predict vulnerabilities in binary programs. To fully extract binary information, we integrate three kinds of features: semantic, statistical, and structural features. For semantic features, we apply the Masked Language Model (MLM) pre-training task of the RoBERTa model to the assembly code to build our language model. Using this model, we combine the beginning token and the operation-code token to create the instruction embedding. For the statistical features, we design a 56-dimensional feature vector that contains 43 kinds of instructions. For the structural features, we improve the ability of the structure2vec network to capture the characteristics of the network by emphasizing node self-attention. Through these optimizations, we significantly increase the accuracy of vulnerability prediction over existing methods. Our experiments show that SAViP achieves a recall of 77.85% and Top 100∼600 accuracies all above 95%. The results are 10% and 13% higher than those of the state-of-the-art V-Fuzz, respectively.

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

19 pages, 5717 KB

Open Access Article

An Intrusion Detection and Classification System for IoT Traffic with Improved Data Engineering

by Abdulaziz A. Alsulami, Qasem Abu Al-Haija, Ahmad Tayeb and Ali Alqahtani

Appl. Sci. 2022, 12(23), 12336; https://doi.org/10.3390/app122312336 - 2 Dec 2022

Cited by 62 | Viewed by 8734

Abstract

Internet of Things (IoT) devices and applications have rapidly expanded worldwide due to their benefits in improving the business environment, industrial environment, and people’s daily lives. However, IoT devices are not immune to malicious network traffic, which causes potential negative consequences and sabotages IoT operating devices. Therefore, developing a method for screening network traffic is necessary to detect and classify malicious activity and mitigate its negative impacts. This research proposes a predictive machine learning model to detect and classify network activity in an IoT system. Specifically, our model distinguishes between normal and anomalous network activity. Furthermore, it classifies network traffic into five categories: normal, Mirai attack, denial of service (DoS) attack, Scan attack, and man-in-the-middle (MITM) attack. Five supervised learning models were implemented to characterize their performance in detecting and classifying network activities for IoT systems: shallow neural networks (SNN), decision trees (DT), bagging trees (BT), k-nearest neighbor (kNN), and support vector machine (SVM). The learning models were evaluated on a new and broad dataset for IoT attacks, the IoTID20 dataset. In addition, a deep feature engineering process was used to improve the learning models’ accuracy. Our experimental evaluation recorded an accuracy of 100% for detection with all implemented models and an accuracy of 99.4–99.9% for classification.

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)

20 pages, 2769 KB

Open Access Article

FCNN-SE: An Intrusion Detection Model Based on a Fusion CNN and Stacked Ensemble

by Chen Chen, Yafei Song, Shaohua Yue, Xiaodong Xu, Lihua Zhou, Qibin Lv and Lintao Yang

Appl. Sci. 2022, 12(17), 8601; https://doi.org/10.3390/app12178601 - 27 Aug 2022

Cited by 29 | Viewed by 3353

Abstract

As a security defense technique to protect networks from attacks, a network intrusion detection model plays a crucial role in the security of computer systems and networks. To address the complex feature extraction process and insufficient information extraction of existing intrusion detection models, this paper proposes an intrusion detection model named FCNN-SE, which uses a fusion convolutional neural network (FCNN) for feature extraction and a stacked ensemble (SE) for classification. The proposed model mainly includes two parts, feature extraction and feature classification. Multi-dimensional features of traffic data are first extracted using convolutional neural networks of different dimensions and then fused into a network traffic dataset. The heterogeneous base learners are combined and used as a classifier, and the obtained network traffic dataset is fed to the classifier for final classification. The comprehensive performance of the proposed model is verified through experiments, and experimental results are evaluated using a comprehensive performance evaluation method based on the radar chart method. The comparison results on the NSL-KDD dataset show that the proposed FCNN-SE has the highest overall performance among all compared models, and a more balanced performance than the other models.

(This article belongs to the Special Issue Applications of Artificial Intelligence and Machine Learning in Cyber Security)
