A Data Enhancement Algorithm for DDoS Attacks Using IoT
Abstract
1. Introduction
2. Related Works
Work | Year | Algorithm | Dataset | Results |
---|---|---|---|---|
[10] | 2022 | GMM-SMOTE | UCI | AUC improved by 6.09% on average. |
[11] | 2017 | K-means SMOTE | UCI | Partially mitigates noise and alleviates within-class imbalance. |
[12] | 2023 | RUCSMOTE | KEEL | AUC and GM generally increased by 2 to 7 percentage points. |
[13] | 2021 | GSMOTEBoost | KEEL | AUC and GM generally improved by 1 to 3 percentage points. |
[14] | 2022 | HDP-SMOTE | NSL-KDD, UNSW-NB15 | F1-score and GM generally improved by 1 to 6 percentage points. |
[15] | 2019 | KDE | [15] | F1-score and GM generally improved by 0.6 to 7 percentage points. |
[16] | 2011 | NDO | UCI | Computational complexity was reduced. |
[17] | 2021 | GK-Means | [17] | F1-score and accuracy generally improved by 1 to 6 percentage points. |
[18] | 2019 | PDE-SMOTE | UCI, KEEL | F1-score and GM generally improved by 1 to 3 percentage points. |
[19] | 2020 | SGM | UNSW-NB15 | Detection rate reached 99.74% in binary classification and 96.54% in multi-class classification. |
3. Background
3.1. SMOTE
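As a rough illustration of the interpolation idea behind SMOTE (Chawla et al., listed in the References), the following is a minimal sketch and not the authors' implementation. The function name `smote_sketch`, its parameters, and the toy minority samples are all assumptions made for this example. Each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric feature vectors with at least two entries.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # new point = base + gap * (neighbour - base)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (illustrative values, not taken from the paper)
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies inside the bounding box of the minority class.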
3.2. K-Means SMOTE
3.3. Gaussian Probability Distribution
4. KG-SMOTE
5. Experiments
5.1. Dataset
5.2. Experimental Environment and Evaluation Index
5.3. Experimental Results and Analysis
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- White Paper on IoT Operating System Security; China Communications Standards Association: Beijing, China, 2022.
- CNCERT Internet Security Threat Report; National Internet Emergency Response Center: Beijing, China, 2022.
- Zhang, Y.; Zhang, T.; Chen, J.; Wang, Y.; Zhou, Q. Network Intrusion Detection Based on SMOTE and Machine Learning. J. Beijing Inst. Technol. 2019, 39, 1258–1262.
- Li, L.; Yu, Y.; Bai, S.; Hou, Y.; Hao, Y. Research on Intrusion Detection Method Based on Second Training Techniques. J. Beijing Inst. Technol. 2017, 37, 1246–1252.
- Li, Y.; Chai, Y.; Hu, Y.; Yin, H. A Review of Unbalanced Data Classification Methods. Control. Decis. Mak. 2019, 34, 673–688.
- Yu, Y.L.; Jiang, K.Z.; Wang, K.; Sheng, J. An Improved Undersampling Algorithm for Unbalanced Data in K-means Clustering. Softw. Guide 2020, 19, 205–209.
- Xue, W.; Zhang, J. Dealing with Imbalanced Dataset: A Resampling Method Based on the Improved SMOTE Algorithm. Commun. Stat.-Simul. Comput. 2016, 45, 1160–1172.
- He, H.; Garcia, E.A. Learning from Imbalanced Data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Yehui, T.; Shouwei, Z. An Improved SMOTE Algorithm Based on Gaussian Hybrid Clustering for Unbalanced Data. Softw. Guide 2022, 21, 110–114.
- Last, F.; Douzas, G.; Bacao, F. Oversampling for Imbalanced Learning Based on K-Means and SMOTE. arXiv 2017, arXiv:1711.00837.
- Shen, Z.; Hua, X.; Jinhai, C. Resampling Algorithm for Unbalanced Data. Small Microcomput. Syst. 2023; online ahead of print.
- Zhang, Z.L.; Chen, Y.Y.; Tang, J.Y.; Luo, X. A Gaussian Oversampling-based Ensemble Learning Algorithm. Syst. Eng. Theory Pract. 2021, 41, 513–523.
- Jiang, Z.; Qian, Y.; Zhang, S. An Oversampling Method for Intrusion Detection Data Based on Highest Density Points. Comput. Simul. 2022, 39, 391–398.
- Kamalov, F. Kernel Density Estimation Based Sampling for Imbalanced Class Distribution. Inf. Sci. 2020, 512, 1192–1201.
- Zhang, H.; Wang, Z. A Normal Distribution-Based Over-Sampling Approach to Imbalanced Data Classification. In Advanced Data Mining and Applications, Proceedings of the 7th International Conference, ADMA 2011, Beijing, China, 17–19 December 2011; Part I; Springer Berlin Heidelberg: Berlin, Germany, 2011; pp. 83–96.
- Hassan, M.M.; Eesa, A.S.; Mohammed, A.J.; Arabo, W.K. Oversampling method based on Gaussian distribution and K-Means clustering. Comput. Mater. Contin. 2021, 69, 451–469.
- Li, T.; Zheng, S.; Zou, H.; Qin, L.; Yang, J. Research on Improved SMOTE Algorithm Based on Probability Density Estimation. J. Nanjing Normal Univ. (Nat. Sci. Ed.) 2019, 42, 65–72.
- Zhang, H.; Huang, L.; Wu, C.Q.; Li, Z. An Effective Convolutional Neural Network Based on SMOTE and Gaussian Mixture Model for Intrusion Detection in Imbalanced Dataset. Comput. Netw. 2020, 177, 107315.
- Hassan, M.M. Bayesian Sensitivity Analysis to Quantifying Uncertainty in a Dendroclimatology Model. In Proceedings of the 2018 International Conference on Advanced Science and Engineering (ICOASE), Duhok, Iraq, 9–11 October 2018; IEEE: Toulouse, France, 2018; pp. 363–368.
- Zhong, Z.F.; Li, M.H.; Zhang, Y. Improvement of K-Means Algorithm for Adaptive K Value in Machine Learning. Comput. Eng. Des. 2021, 42, 136–141.
- Yi, Q.; Xiangyu, L.; Yanhui, D. MBB-IoT: Construction and Evaluation of IoT DDoS Traffic Dataset from a New Perspective. Comput. Mater. Contin. 2023; in press.
- Zou, Q.; Xie, S.; Lin, Z.; Wu, M.; Ju, Y. Finding the Best Classification Threshold in Imbalanced Classification. Big Data Res. 2016, 5, 2–8.
Attack Type | Sample Size | IR |
---|---|---|
Benign | 537,052 | |
BASHLITE TCP | 3268 | 1:164 |
BASHLITE UDP | 1410 | 1:380 |
BASHLITE RandHex | 964 | 1:557 |
BASHLITE UDPHex | 699 | 1:768 |
mirai greip | 1332 | 1:403 |
mirai UDP | 1258 | 1:426 |
mirai syn | 2155 | 1:249 |
mirai http | 1786 | 1:300 |
mirai greeth | 1493 | 1:359 |
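The IR column in the table above appears consistent with dividing the benign sample count by each attack class's sample count and rounding down. That rounding rule is an inference from the numbers, not something the table states. A minimal sketch that reproduces the column under this assumption (the variable names are illustrative):

```python
# Sample sizes copied from the dataset table above
benign = 537_052
attack_counts = {
    "BASHLITE TCP": 3268,
    "BASHLITE UDP": 1410,
    "BASHLITE RandHex": 964,
    "BASHLITE UDPHex": 699,
    "mirai greip": 1332,
    "mirai UDP": 1258,
    "mirai syn": 2155,
    "mirai http": 1786,
    "mirai greeth": 1493,
}

# Assumed rule: IR = 1 : floor(benign / attack count)
ir = {name: f"1:{benign // count}" for name, count in attack_counts.items()}

for name, ratio in ir.items():
    print(f"{name:18s} {ratio}")
```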
High Categorical Features | Meaningless Label |
---|---|
bidirectional_ece_packets, src2dst_cwr_packets, bidirectional_cwr_packets, src2dst_ece_packets, if_same_vlan_or_not, dst2src_urg_packets, dst2src_ece_packets, dst2src_cwr_packets, vlan_id | id |
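One plausible reading of the table above is that these columns are dropped before training. The sketch below shows that step on records held as plain dictionaries. It is an assumption about the preprocessing, not the authors' code; the helper `drop_features` and the sample record (including the `example_feature` and `label` keys) are hypothetical.

```python
# Column names copied from the table above
HIGH_CATEGORICAL_FEATURES = [
    "bidirectional_ece_packets", "src2dst_cwr_packets",
    "bidirectional_cwr_packets", "src2dst_ece_packets",
    "if_same_vlan_or_not", "dst2src_urg_packets",
    "dst2src_ece_packets", "dst2src_cwr_packets", "vlan_id",
]
MEANINGLESS_LABELS = ["id"]


def drop_features(records, to_drop):
    """Return copies of the records without the listed keys (hypothetical helper)."""
    drop = set(to_drop)
    return [{k: v for k, v in rec.items() if k not in drop} for rec in records]


# Hypothetical flow record, for illustration only
records = [{"id": 7, "vlan_id": 0, "example_feature": 12, "label": "Benign"}]
cleaned = drop_features(records, HIGH_CATEGORICAL_FEATURES + MEANINGLESS_LABELS)
print(cleaned)
```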
Experimental Platforms | Environmental Configuration |
---|---|
Operating System | Fedora Linux 35 (Workstation Edition) |
CPU | Intel(R) Xeon(R) Gold 6346 CPU @ 3.10 GHz |
GPU | NVIDIA-SMI 520.61.05 |
RAM | 32 GB |
Programming Language | Python 3.9.12 |
torch | 2.0.1 + cu118 |
scikit-learn | 1.2.2 |
Index | Category | NR | SMOTE | Borderline SMOTE | K-Means SMOTE | KG-SMOTE |
---|---|---|---|---|---|---|
Precision | Benign | 0.9983 | 0.9998 | 0.9998 | 0.9985 | 0.9987 |
Precision | BASHLITE TCP | 0.9853 | 0.9990 | 0.9990 | 1.0000 | 0.9983 |
Precision | BASHLITE UDP | 0.9169 | 0.8157 | 0.7939 | 0.9299 | 0.9324 |
Precision | BASHLITE RandHex | 0.9647 | 0.7247 | 0.7403 | 0.9680 | 0.9704 |
Precision | BASHLITE UDPHex | 0.7633 | 0.7371 | 0.7262 | 0.8699 | 0.9727 |
Precision | mirai greip | 0.9730 | 0.6599 | 0.6618 | 0.9821 | 0.9750 |
Precision | mirai UDP | 0.9964 | 0.6401 | 0.6533 | 0.9626 | 0.9991 |
Precision | mirai syn | 0.9193 | 0.6653 | 0.6487 | 0.9429 | 0.9629 |
Precision | mirai http | 0.8964 | 0.7005 | 0.7368 | 0.9556 | 0.9700 |
Precision | mirai greeth | 0.9262 | 0.6537 | 0.6362 | 0.9444 | 0.9811 |
Index | Category | NR | SMOTE | Borderline SMOTE | K-Means SMOTE | KG-SMOTE |
---|---|---|---|---|---|---|
Recall | Benign | 0.9985 | 0.9988 | 0.9987 | 0.9991 | 0.9991 |
Recall | BASHLITE TCP | 0.9904 | 0.9990 | 0.9969 | 0.9946 | 0.9922 |
Recall | BASHLITE UDP | 0.8203 | 0.8389 | 0.8673 | 0.9009 | 0.9469 |
Recall | BASHLITE RandHex | 0.9679 | 0.8007 | 0.7972 | 0.9556 | 0.9568 |
Recall | BASHLITE UDPHex | 0.8208 | 0.6615 | 0.6256 | 0.9556 | 0.9619 |
Recall | mirai UDP | 0.9295 | 0.7473 | 0.7367 | 0.9419 | 0.9812 |
Recall | mirai greip | 0.9654 | 0.6223 | 0.6461 | 0.9552 | 0.9718 |
Recall | mirai syn | 0.8652 | 0.7661 | 0.7370 | 0.9276 | 0.9188 |
Recall | mirai http | 0.9565 | 0.6832 | 0.7186 | 0.9374 | 0.9712 |
Recall | mirai greeth | 0.9498 | 0.6867 | 0.6867 | 0.8921 | 0.9723 |
Index | Category | NR | SMOTE | Borderline SMOTE | K-Means SMOTE | KG-SMOTE |
---|---|---|---|---|---|---|
F1-score | Benign | 0.9985 | 0.9993 | 0.9992 | 0.9988 | 0.9989 |
F1-score | BASHLITE TCP | 0.8878 | 0.9990 | 0.9979 | 0.9973 | 0.9952 |
F1-score | BASHLITE UDP | 0.7659 | 0.8271 | 0.8290 | 0.9151 | 0.9396 |
F1-score | BASHLITE RandHex | 0.8663 | 0.7608 | 0.7677 | 0.9618 | 0.9635 |
F1-score | BASHLITE UDPHex | 0.7910 | 0.6973 | 0.6722 | 0.9107 | 0.9673 |
F1-score | mirai UDP | 0.8618 | 0.6896 | 0.6925 | 0.9521 | 0.9901 |
F1-score | mirai greip | 0.8692 | 0.6406 | 0.6538 | 0.9685 | 0.9734 |
F1-score | mirai syn | 0.8914 | 0.7122 | 0.6901 | 0.9352 | 0.9403 |
F1-score | mirai http | 0.8254 | 0.6918 | 0.7276 | 0.9464 | 0.9706 |
F1-score | mirai greeth | 0.8378 | 0.6698 | 0.6605 | 0.9175 | 0.9767 |
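The F1-score is conventionally the harmonic mean of precision and recall. The sketch below applies that standard formula to two KG-SMOTE entries from the precision and recall tables. The helper `f1_from_pr` is a name chosen for this example, and the document does not show how the authors computed their scores (scikit-learn is listed in the environment, so a library routine is likely).

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) for KG-SMOTE, copied from the tables above
kg_smote = {
    "Benign": (0.9987, 0.9991),
    "BASHLITE UDP": (0.9324, 0.9469),
}

f1 = {name: f1_from_pr(p, r) for name, (p, r) in kg_smote.items()}
for name, value in f1.items():
    print(f"{name:14s} F1 = {value:.4f}")
```

For these two rows, the computed values agree with the reported F1-scores (0.9989 and 0.9396) to four decimal places.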
Index | Category | NR | SMOTE | Borderline SMOTE | K-Means SMOTE | KG-SMOTE |
---|---|---|---|---|---|---|
AUC | Benign | 0.9983 | 0.9975 | 0.9967 | 0.9808 | 0.9830 |
AUC | BASHLITE TCP | 0.9251 | 0.9994 | 0.9984 | 0.9972 | 0.9960 |
AUC | BASHLITE UDP | 0.9100 | 0.9190 | 0.9332 | 0.9503 | 0.9733 |
AUC | BASHLITE RandHex | 0.9838 | 0.8999 | 0.8982 | 0.9777 | 0.9783 |
AUC | BASHLITE UDPHex | 0.9101 | 0.8305 | 0.8126 | 0.9776 | 0.9809 |
AUC | mirai UDP | 0.9647 | 0.8729 | 0.8677 | 0.9708 | 0.9906 |
AUC | mirai greip | 0.9826 | 0.8105 | 0.8224 | 0.9775 | 0.9858 |
AUC | mirai syn | 0.9323 | 0.8819 | 0.8673 | 0.9636 | 0.9592 |
AUC | mirai http | 0.9779 | 0.8408 | 0.8586 | 0.9686 | 0.9855 |
AUC | mirai greeth | 0.9747 | 0.8427 | 0.8426 | 0.9459 | 0.9860 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lv, H.; Du, Y.; Zhou, X.; Ni, W.; Ma, X. A Data Enhancement Algorithm for DDoS Attacks Using IoT. Sensors 2023, 23, 7496. https://doi.org/10.3390/s23177496