Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
Abstract
1. Introduction
2. Related Work
3. Improvement of K-Nearest Neighbor (KNN) Algorithm
3.1. Weighted-Feature KNN
Algorithm 1: WKNN
3.2. Feature Selection and Feature Weight Self-Adaptive Algorithm for Weighted Feature KNN (WKNN)
Algorithm 2: WKNN-Selfada
- Step 1: obtain the rank of each feature based on the feature distances, sorted from small to large. The ith element of the rank vector is the rank of the ith feature; for example, with three features, ranks of (3,1,2) mean that the second feature has the smallest feature distance and the first feature has the largest.
- Step 2: obtain the update ratio of the weights. It is set to the ratio of the smallest weighted feature-based point distance in this loop, min(kDistances), to the weighted feature-based point distance between this decision sample and the target sample, kDistances[i]; that is, min(kDistances)/kDistances[i].
- Step 3: calculate the denominator of the new weights. The denominator increment is determined by the update ratio and by m, the dimension of the feature vector; the new denominator is the previous denominator plus this increment.
- Step 4: calculate the numerators of the new weights. A numerator increment is computed for each feature index i; the new numerator of the ith feature is its previous weight numerator plus that increment.
- Step 5: calculate the new feature weights as the ratio of each new numerator to the new denominator. The sum of all weights equals 1 in every case.
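
The five steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' Algorithm 2: the exact increment formulas are not recoverable from the text here, so the code assumes that the numerator increment of the ith feature is the update ratio times its rank and that the denominator increment is the update ratio times m(m+1)/2 (the sum of the ranks 1..m). That choice is one way to keep the weights summing to 1; the direction of the rank weighting could equally be reversed. The function names `feature_ranks` and `update_weights`, and the handling of a zero distance, are hypothetical.

```python
def feature_ranks(feature_distances):
    """Step 1: rank features by feature distance, 1 = smallest distance."""
    order = sorted(range(len(feature_distances)),
                   key=lambda idx: feature_distances[idx])
    ranks = [0] * len(feature_distances)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def update_weights(numerators, denominator, feature_distances, k_distances, i):
    """One weight update for decision sample i (Steps 1 to 5)."""
    m = len(numerators)
    ranks = feature_ranks(feature_distances)              # Step 1

    # Step 2: update ratio min(kDistances) / kDistances[i].
    # ASSUMPTION: a zero distance is treated as ratio 1 to avoid 0/0.
    ratio = 1.0 if k_distances[i] == 0 else min(k_distances) / k_distances[i]

    # Step 3 (ASSUMED formula): denominator grows by ratio * m(m+1)/2.
    new_denominator = denominator + ratio * m * (m + 1) / 2

    # Step 4 (ASSUMED formula): numerator i grows by ratio * rank_i.
    new_numerators = [n + ratio * r for n, r in zip(numerators, ranks)]

    # Step 5: weight = numerator / denominator.
    weights = [n / new_denominator for n in new_numerators]
    return new_numerators, new_denominator, weights


print(feature_ranks([0.5, 0.1, 0.3]))  # -> [3, 1, 2]
```

Because the assumed denominator increment equals the sum of the assumed numerator increments, the weights keep summing to 1 as long as the initial denominator equals the sum of the initial numerators (for example, numerators of 1 for each of m features and a denominator of m).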
4. Fine-Grained Classification of Encrypted Network Flows
4.1. Description of Classification Framework
4.2. Design of Candidate Feature Set
4.3. Fine-Grained Classification Method
Algorithm 3: FCE-KNN
5. Experiments and Evaluation
5.1. Setup
5.2. Metric
5.3. Experiments and Results
5.3.1. Identification of Encryption Status of Network Flows
5.3.2. Identification of Application Type of Encrypted Network Flows
5.3.3. Identification of Content Type of Encrypted Network Flows
5.4. Analysis of Time Complexity and Consumption
5.5. Discussion
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Dhote, Y.; Agrawal, S.; Deen, A.J. A Survey on Feature Selection Techniques for Internet Traffic Classification. In Proceedings of the International Conference on Computational Intelligence & Communication Networks (CICN), Madhya Pradesh, India, 12–14 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1375–1380.
2. Rathore, M.M.; Ahmad, A.; Paul, A.; Rho, S. Exploiting encrypted and tunneled multimedia calls in high-speed big data environment. Multimed. Tools Appl. 2018, 77, 4959–4984.
3. Velan, P.; Čermák, M.; Čeleda, P.; Drašar, M. A survey of methods for encrypted traffic classification and analysis. Int. J. Netw. Manag. 2015, 25, 355–374.
4. Hirvonen, M.; Sailio, M. Two-Phased Method for Identifying SSH Encrypted Application Flows. In Proceedings of the 7th International Wireless Communications and Mobile Computing Conference (IWCMC), Istanbul, Turkey, 4–8 July 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 1033–1038.
5. Wang, Z. The applications of deep learning on traffic identification. BlackHat 2015, 24, 1–10.
6. Tan, X.; Xie, Y.; Ma, H.; Yu, S.; Hu, J. Recognizing the content types of network traffic based on a hybrid DNN-HMM model. J. Netw. Comput. Appl. 2019, 142, 51–62.
7. Cao, Z.; Xiong, G.; Zhao, Y.; Li, Z.; Guo, L. A Survey on Encrypted Traffic Classification. In Proceedings of the 2014 International Conference on Applications and Techniques in Information Security (ATIS), Melbourne, VIC, Australia, 26–28 November 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 73–81.
8. Alshammari, R.; Zincir-Heywood, A.N. Machine Learning Based Encrypted Traffic Classification: Identifying SSH and Skype. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), Ottawa, ON, Canada, 8–10 July 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 1–8.
9. Usama, M.; Qadir, J.; Raza, A.; Arif, H.; Yau, K.A.; Elkhatib, Y.; Hussain, A.; Al-Fuqaha, A. Unsupervised machine learning for networking: Techniques, applications and research challenges. IEEE Access 2019, 7, 65579–65615.
10. Limthong, K.; Fukuda, K.; Ji, Y.; Yamada, S. Unsupervised learning model for real-time anomaly detection in computer networks. IEICE Trans. Inf. Syst. 2014, 97, 2084–2094.
11. Suthaharan, S. Big data classification: Problems and challenges in network intrusion prediction with machine learning. ACM Sigmetrics Perform. Eval. Rev. 2014, 41, 70–73.
12. Krawczyk, B.; Minku, L.L.; Gama, J.; Stefanowski, J.; Wozniak, M. Ensemble learning for data stream analysis: A survey. Inf. Fusion 2017, 37, 132–156.
13. Thanh Noi, P.; Kappas, M. Comparison of random forest, k-nearest neighbor, and support vector machine classifiers for land cover classification using Sentinel-2 imagery. Sensors 2018, 18, 18.
14. Oehmcke, S.; Zielinski, O.; Kramer, O. KNN Ensembles with Penalized DTW for Multivariate Time Series Imputation. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 2774–2781.
15. Liu, Z.; Wang, R.; Japkowicz, N.; Cai, Y.; Tang, D.; Cai, X. Mobile app traffic flow feature extraction and selection for improving classification robustness. J. Netw. Comput. Appl. 2019, 125, 190–208.
16. Draper-Gil, G.; Lashkari, A.H.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of Encrypted and VPN Traffic Using Time-Related Features. In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP), Rome, Italy, 19–21 February 2016; pp. 407–414.
17. Yin, C.; Wang, H.; Wang, J. Network Data Stream Classification by Deep Packet Inspection and Machine Learning. In Advanced Multimedia and Ubiquitous Engineering, Proceedings of the FutureTech 2018, Salerno, Italy, 23–25 April 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 245–251.
18. Sherry, J.; Lan, C.; Popa, R.A.; Ratnasamy, S. Blindbox: Deep packet inspection over encrypted traffic. ACM SIGCOMM Comput. Commun. Rev. 2015, 45, 213–226.
19. Meiners, C.; Norige, E.; Liu, A.X.; Torng, E. Flowsifter: A counting Automata Approach to Layer 7 Field Extraction for Deep Flow Inspection. In Proceedings of the IEEE INFOCOM, Orlando, FL, USA, 25–30 March 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1746–1754.
20. Zeng, X.; Chen, X.; Shao, G.; He, T.; Han, Z.; Wen, Y.; Wang, Q. Flow Context and Host Behavior Based Shadowsocks’s Traffic Identification. IEEE Access 2019, 7, 41017–41032.
21. Zhu, H.; Zhu, L. Online and automatic identification of encryption network behaviors in big data environment. Concurr. Comput. Pract. Exp. 2019, 31, e4849.
22. Zygmunt, M.; Konieczny, M.; Zielinski, S. Accuracy of Statistical Machine Learning Methods in Identifying Client Behavior Patterns at Network Edge. In Proceedings of the 42nd International Conference on Telecommunications and Signal Processing (TSP), Budapest, Hungary, 3–5 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 575–579.
23. An, H.M.; Lee, S.K.; Ham, J.H.; Kin, M. Traffic Identification Based on Applications using Statistical Signature Free from Abnormal TCP Behavior. J. Inf. Sci. Eng. 2015, 31, 1669–1692.
24. Wang, H.; Qian, C.; Yu, Y.; Yang, H.; Lam, S.S. Practical network-wide packet behavior identification by AP classifier. IEEE/ACM Trans. Netw. 2017, 25, 2886–2899.
25. Zhu, A. A P2P Network Traffic Classification Method Based on C4.5 Decision Tree Algorithm. In Proceedings of the 9th International Symposium on Linear Drives for Industry Applications (LDIA), Hangzhou, China, 7–10 January 2013; Springer: Berlin/Heidelberg, Germany, 2014; pp. 373–379.
26. Linping, S.; Hongtao, M.; Yunlang, M.; Rui, C. The Research of Classified Method of the Network Traffic in Security Access Platform Based on Decision Tree. In Proceedings of the 7th IEEE International Conference on Software Engineering & Service Science (ISESS), Beijing, China, 26–28 August 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 475–480.
27. Fries, T.P. Classification of Network Traffic Using Fuzzy Clustering for Network Security. In Proceedings of the Industrial Conference on Data Mining (ICDM), New York, NY, USA, 12–13 July 2017; Volume 1, pp. 278–285.
28. Kim, J.; Sim, A. A New Approach to Multivariate Network Traffic Analysis. J. Comput. Sci. Technol. 2019, 34, 388–402.
29. Cha, S.; Kim, H. Detecting Encrypted Traffic: A Machine Learning Approach. In Proceedings of the International Workshop on Information Security Application (WISA), Jeju Island, Korea, 25–27 August 2016; pp. 54–65.
30. Vargas-Muñoz, M.J.; Martínez-Peláez, R.; Velarde-Alvarado, P.; Moreno-García, E.; Torres-Roman, D.L.; Ceballos-Mejía, J.J. Classification of Network Anomalies in Flow Level Network Traffic Using Bayesian Networks. In Proceedings of the 2018 International Conference on Electronics, Communications and Computers, Cholula Puebla, Mexico, 21–23 February 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 238–243.
31. Sun, G.; Chen, T.; Su, Y.; Li, C. Internet traffic classification based on incremental support vector machines. Mob. Netw. Appl. 2018, 23, 789–796.
32. Gómez, S.E.; Martínez, B.C.; Sánchez-Esguevillas, A.J.; Callejo, L.H. Ensemble network traffic classification: Algorithm comparison and novel ensemble scheme proposal. Comput. Netw. 2017, 127, 68–80.
33. De Souza, E.N.; Matwin, S.; Fernandes, S. Network Traffic Classification Using AdaBoost Dynamic. In Proceedings of the IEEE International Conference on Communications Workshops (ICC), Budapest, Hungary, 9–13 June 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1319–1324.
34. Yang, Y.; Kang, C.; Gou, G.; Li, Z.; Xiong, G. TLS/SSL Encrypted Traffic Classification with Autoencoder and Convolutional Neural Network. In Proceedings of the IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Mercure Exeter, UK, 28–30 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 362–369.
35. Zou, Z.; Ge, J.; Zheng, H.; Han, C.; Yao, Z. Encrypted Traffic Classification with a Convolutional Long Short-Term Memory Neural Network. In Proceedings of the IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Mercure Exeter, UK, 28–30 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 329–334.
36. Aceto, G.; Ciuonzo, D.; Montieri, A.; Pescape, A. MIMETIC: Mobile encrypted traffic classification using multimodal deep learning. Comput. Netw. 2019, 165, 106944.
37. Aceto, G.; Ciuonzo, D.; Montieri, A.; Pescape, A. Mobile encrypted traffic classification using deep learning: Experimental evaluation, lessons learned, and challenges. IEEE Trans. Netw. Serv. Manag. 2019, 16, 445–458.
38. Aceto, G.; Ciuonzo, D.; Montieri, A.; Pescape, A. Multi-classification approaches for classifying mobile app traffic. J. Netw. Comput. Appl. 2018, 103, 131–145.
39. Aceto, G.; Ciuonzo, D.; Montieri, A.; Pescape, A. Mobile encrypted traffic classification using deep learning. In Proceedings of the Network Traffic Measurement and Analysis Conference (TMA), Vienna, Austria, 26–29 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
40. Lotfollahi, M.; Siavoshani, M.J.; Zade, R.S.H.; Saberian, M. Deep packet: A novel approach for encrypted traffic classification using deep learning. Soft Comput. 2020, 24, 1999–2012.
41. Zeng, Y.; Gu, H.; Wei, W.; Guo, Y. Deep-Full-Range: A Deep Learning Based Network Encrypted Traffic Classification and Intrusion Detection Framework. IEEE Access 2019, 7, 45182–45190.
42. Song, M.; Ran, J.; Li, S. Encrypted Traffic Classification Based on Text Convolution Neural Networks. In Proceedings of the IEEE 7th International Conference on Computer Science and Network Technology (ICCSNT), Dalian, China, 19–20 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 432–436.
43. Rezaei, S.; Liu, X. Multitask Learning for Network Traffic Classification. Available online: https://arxiv.org/abs/1906.05248 (accessed on 10 February 2020).
44. Sun, G.; Liang, L.; Chen, T.; Xiao, F.; Lang, F. Network traffic classification based on transfer learning. Comput. Electr. Eng. 2018, 69, 920–927.
45. Sun, G.; Li, S.; Chen, T.; Li, X.; Zhu, S. Active Learning Method for Chinese Spam Filtering. Int. J. Perform. Eng. 2017, 13, 511–518.
46. Zhu, H.; Zhu, L.; Shen, M.; Khan, S. Online and automatic identification and mining of encryption network behavior in big data environment. J. Intell. Fuzzy Syst. 2018, 34, 1111–1119.
47. Wang, W.; Zhu, M.; Wang, J.; Zeng, X.; Yang, Z. End-to-End Encrypted Traffic Classification with One-Dimensional Convolution Neural Networks. In Proceedings of the IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, 22–24 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 43–48.
48. Wang, W.; Zhu, M.; Zeng, X.; Ye, X.; Sheng, Y. Malware Traffic Classification Using Convolutional Neural Network for Representation Learning. In Proceedings of the International Conference on Information Networking (ICOIN), Da Nang, Vietnam, 11–13 January 2017; pp. 712–717.
49. Xia, J.; Shen, J.; Wu, Y. A Four-Stage Hybrid Feature Subset Selection Approach for Network Traffic Classification Based on Full Coverage. In Proceedings of the International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage (SpaCCS), Guangzhou, China, 12–15 December 2017; Springer: Berlin/Heidelberg, Germany; pp. 178–191.
50. Dorfinger, P.; Panholzer, G.; John, W. Entropy Estimation for Real-Time Encrypted Traffic Identification (Short Paper). In Proceedings of the International Workshop on Traffic Monitoring and Analysis (TMA), Vienna, Austria, 27–30 April 2011; Springer: Berlin/Heidelberg, Germany; pp. 164–171.
51. Shen, M.; Zhang, J.; Zhu, L.; Xu, K.; Du, X.; Liu, Y. Encrypted Traffic Classification of Decentralized Applications on Ethereum Using Feature Fusion. In Proceedings of the IEEE/ACM International Symposium on Quality of Service (IWQoS), Phoenix, AZ, USA, 24–28 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 180–190.
52. Wu, D.; Chen, X.; Chen, C.; Zhang, J.; Xiang, Y.; Zhou, W. On Addressing the Imbalance Problem: A Correlated KNN Approach for Network Traffic Classification. In Proceedings of the International Conference on Network and System Security (NSS), New York, NY, USA, 3–5 November 2015; pp. 138–151.
53. Zhu, H.; Zhu, L. Encrypted network behaviors identification based on dynamic time warping and k-nearest neighbor. Clust. Comput. 2017, 20, 1–10.
54. Carela-Español, V.; Barlet-Ros, P.; Solé-Simó, M.; Dainotti, A.; Donato, W.D.; Pescape, A. K-Dimensional Trees for Continuous Traffic Classification. In Proceedings of the Second International Workshop on Traffic Monitoring and Analysis, Zurich, Switzerland, 7 April 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 1–14.
55. Bar-Yanai, R.; Langberg, M.; Peleg, D.; Roditty, L. Realtime Classification for Encrypted Traffic. In Proceedings of the 9th International Symposium on Experimental Algorithms, Ischia Island, Naples, Italy, 20–22 May 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 373–385.
56. McGaughey, D.; Semeniuk, T.; Smith, R.; Knight, S. A systematic approach of feature selection for encrypted network traffic classification. In Proceedings of the IEEE International Systems Conference (SysCon), Vancouver, BC, Canada, 23–26 April 2018; pp. 1–8.
57. Dong, Y.; Cao, R.; Zhang, M. A Multi-Objective Evolutionary Algorithm for Multimedia Traffic Classification. In Proceedings of the 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Zhangjiajie, China, 10–12 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 2804–2810.
58. Saber, A.; Belkacem, F.; Moncef, A. Encrypted Network Traffic Identification: LDA-KNN Approach. In Proceedings of the 9 ème édition du colloque Tendances dans les Applications Mathématiques en Tunisie Algérie et Maroc, Tlemcen, Algeria, 23–27 February 2019; pp. 1–3.
59. Manju, N.; Harish, B.S.; Prajwal, V. Ensemble Feature Selection and Classification of Internet Traffic using XGBoost Classifier. Int. J. Comput. Netw. Inf. Secur. 2019, 11, 37.
60. Jamil, H.A. Feature Selection and Machine Learning Classification for Live P2P Traffic. In Proceedings of the International Conference on Industrial Engineering and Operations Management (IEOM), Bangkok, Thailand, 5–7 March 2019; pp. 1–9.
61. Bugata, P.; Drotár, P. Weighted nearest neighbors feature selection. Knowl. Based Syst. 2019, 163, 749–761.
62. Sun, B.; Cheng, W.; Goswami, P.; Bai, G. Short-term traffic forecasting using self-adjusting k-nearest neighbors. IET Intell. Transp. Syst. 2017, 12, 41–48.
63. Saleh, A.I.; Talaat, F.M.; Labib, L.M. A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers. Artif. Intell. Rev. 2019, 51, 403–443.
64. Su, M. Using clustering to improve the KNN-based classifiers for online anomaly network traffic identification. J. Netw. Comput. Appl. 2011, 34, 722–730.
65. Ma, Y.; Xie, Q.; Liu, Y.; Xiong, S. A weighted KNN-based automatic image annotation method. Neural Comput. Appl. 2019, 1–12.
66. Dong, Y.; Zhao, J.; Jin, J. Novel feature selection and classification of Internet video traffic based on a hierarchical scheme. Comput. Netw. 2017, 119, 102–111.
67. Alshammari, R.; Zincir-Heywood, A.N. Can encrypted traffic be identified without port numbers, IP addresses and payload inspection? Comput. Netw. 2011, 55, 1326–1350.

No. | Feature | Calculation Formula |
---|---|---|
1 | Minimum of packet inter-arrival time | iat_min = min(iat[]) |
2 | Maximum of packet inter-arrival time | iat_max = max(iat[]) |
3 | Mean of packet inter-arrival time | iat_mean = mean(iat[]) |
4 | Standard deviation of packet inter-arrival time | iat_std = std(iat[]) |
5 | Minimum of IP packet bytes | byte_min = min(byte[]) |
6 | Maximum of IP packet bytes | byte_max = max(byte[]) |
7 | Mean of IP packet bytes | byte_mean = mean(byte[]) |
8 | Standard deviation of IP packet bytes | byte_std = std(byte[]) |
9 | Number of IP packet bytes per second | byte_psec = |
10 | Number of packets per second | pac_psec = |
11 | Minimum of forward packet inter-arrival time | fiat_min = min(fiat[]) |
12 | Maximum of forward packet inter-arrival time | fiat_max = max(fiat[]) |
13 | Mean of forward packet inter-arrival time | fiat_mean = mean(fiat[]) |
14 | Standard deviation of forward packet inter-arrival time | fiat_std = std(fiat[]) |
15 | Minimum of IP packet bytes of forward packets | fbyte_min = min(fbyte[]) |
16 | Maximum of IP packet bytes of forward packets | fbyte_max = max(fbyte[]) |
17 | Mean of IP packet bytes of forward packets | fbyte_mean = mean(fbyte[]) |
18 | Standard deviation of IP packet bytes of forward packets | fbyte_std = std(fbyte[]) |
19 | Number of IP packet bytes of forward packets per second | fbyte_psec = |
20 | Number of forward packets per second | fpac_psec = |
21 | Minimum of backward packet inter-arrival time | biat_min = min(biat[]) |
22 | Maximum of backward packet inter-arrival time | biat_max = max(biat[]) |
23 | Mean of backward packet inter-arrival time | biat_mean = mean(biat[]) |
24 | Standard deviation of backward packet inter-arrival time | biat_std = std(biat[]) |
25 | Minimum of IP packet bytes of backward packets | bbyte_min = min(bbyte[]) |
26 | Maximum of IP packet bytes of backward packets | bbyte_max = max(bbyte[]) |
27 | Mean of IP packet bytes of backward packets | bbyte_mean = mean(bbyte[]) |
28 | Standard deviation of IP packet bytes of backward packets | bbyte_std = std(bbyte[]) |
29 | Number of IP packet bytes of backward packets per second | bbyte_psec = |
30 | Number of backward packets per second | bpac_psec = |
31 | Number of forward packets for the first 10 packets | |
32 | Number of backward packets for the first 10 packets | |
33 | Number of forward packets for the first 60 packets | |
34 | Number of backward packets for the first 60 packets | |
35 | Ratio of the number of backward packets to forward packets for the first 10 packets | |
36 | Ratio of the number of backward packets to forward packets for the first 60 packets |
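
To make the candidate feature set concrete, the following Python sketch computes the 36 features for one flow. It is a hedged illustration, not the authors' extraction code. Several points are assumptions: the input format (a time-ordered list of `(timestamp_s, ip_bytes, direction)` tuples), the use of the population standard deviation, returning zeros for empty lists, defining the per-second features as totals divided by the flow duration (the table's formulas for these are not recoverable), and the names chosen for features 31 to 36 (`fpac_first10`, `bf_ratio_first10`, and so on), which the table does not name.

```python
from statistics import mean, pstdev


def stats(values, prefix):
    """min/max/mean/std of a list; zeros for an empty list (an assumption)."""
    if not values:
        return {f"{prefix}_{s}": 0.0 for s in ("min", "max", "mean", "std")}
    return {
        f"{prefix}_min": min(values),
        f"{prefix}_max": max(values),
        f"{prefix}_mean": mean(values),
        f"{prefix}_std": pstdev(values),  # population std: an assumption
    }


def inter_arrival(times):
    """Differences between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def flow_features(packets):
    """packets: time-ordered list of (timestamp_s, ip_bytes, direction),
    with direction "f" (forward) or "b" (backward)."""
    duration = packets[-1][0] - packets[0][0]
    feats = {}
    # "" = all packets, "f" = forward only, "b" = backward only.
    for tag, keep in (("", ("f", "b")), ("f", ("f",)), ("b", ("b",))):
        sel = [p for p in packets if p[2] in keep]
        sizes = [p[1] for p in sel]
        feats.update(stats(inter_arrival([p[0] for p in sel]), f"{tag}iat"))
        feats.update(stats(sizes, f"{tag}byte"))
        # ASSUMED per-second rates: totals divided by the flow duration.
        feats[f"{tag}byte_psec"] = sum(sizes) / duration if duration > 0 else 0.0
        feats[f"{tag}pac_psec"] = len(sel) / duration if duration > 0 else 0.0
    # Features 31 to 36: direction counts in the first 10 and 60 packets.
    for n in (10, 60):
        head = [p[2] for p in packets[:n]]
        fwd, bwd = head.count("f"), head.count("b")
        feats[f"fpac_first{n}"] = fwd
        feats[f"bpac_first{n}"] = bwd
        feats[f"bf_ratio_first{n}"] = bwd / fwd if fwd else 0.0
    return feats


demo = [(0.0, 100, "f"), (0.5, 200, "b"), (1.0, 300, "f"), (2.0, 400, "b")]
features = flow_features(demo)
print(len(features))  # -> 36
```

Looping over the three packet subsets keeps the overall, forward, and backward feature groups structurally identical, which mirrors how the table repeats the same ten statistics per direction.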

Label Type | Label Name (Number of Flows) | |
---|---|---|
Encryption Status | Encrypted (18468) | Non-Encrypted (262072) |
Application type (Encrypted flows) | AIM (32) | Email (298) |
 | ICQ (31) | FTPS (125) |
 | VoipBuster (1618) | Facebook (2494) |
 | Spotify (137) | BitTorrent (477) |
 | Hangouts (10871) | Netflix (173) |
 | YouTube (213) | Skype (1835) |
 | SFTP (28) | Vimeo (136) |
Content type (Encrypted flows) | Chat (4327) | File (1497) |
 | VoIP (11985) | Streaming (659) |

Method | Known-application Acc (%) | Known-application Pre (%) | Known-application Rec (%) | Known-application F1 | Unknown-application Acc (%) | Unknown-application Pre (%) | Unknown-application Rec (%) | Unknown-application F1 |
---|---|---|---|---|---|---|---|---|
FCE-KNN | 99.34 | 98.56 | 91.31 | 0.94 | 99.30 | 99.37 | 94.69 | 0.96 |
DTW-KNN [53] | 98.99 | 96.71 | 87.68 | 0.91 | 98.20 | 94.46 | 90.12 | 0.92 |
C4.5 [16] | 98.99 | 98.12 | 86.27 | 0.91 | 98.96 | 99.67 | 91.51 | 0.95 |
ADA [32] | 99.11 | 97.87 | 88.41 | 0.92 | 99.06 | 99.99 | 92.08 | 0.95 |
AISVM [31] | 95.01 | 59.17 | 78.01 | 0.67 | 88.10 | 45.94 | 0.31 | 0 |

Method (Precision, %) | AIM | BitTorrent | Email | Facebook | FTPS | Hangouts | ICQ |
---|---|---|---|---|---|---|---|
FCE-KNN | 66.66 | 93.54 | 82.35 | 89.13 | 80.76 | 96.40 | 33.33 |
DTW-KNN [53] | 0 | 86.95 | 78.18 | 86.91 | 45.71 | 95.78 | 0 |
C4.5 [16] | 0 | 77.77 | 0 | 96.06 | 0 | 95.04 | 0 |
ADA [32] | 0 | 85 | 87.5 | 82.79 | 84.61 | 95.39 | 0 |
AISVM [31] | 0 | 0 | 0 | 0 | 0 | 72.02 | 0 |

Method (Precision, %) | Netflix | SFTP | Skype | Spotify | Vimeo | VoipBuster | YouTube |
---|---|---|---|---|---|---|---|
FCE-KNN | 52.5 | 0 | 83.84 | 45.45 | 50 | 98.15 | 83.33 |
DTW-KNN [53] | 29.41 | 0 | 76.22 | 35.29 | 60 | 98.70 | 70.73 |
C4.5 [16] | 62.5 | 0 | 47.39 | 41.66 | 0 | 95.93 | 0 |
ADA [32] | 59.25 | 0 | 72.41 | 57.14 | 32.60 | 97.22 | 80 |
AISVM [31] | 0 | 0 | 28.69 | 0 | 0 | 84.59 | 0 |

Method (Recall, %) | AIM | BitTorrent | Email | Facebook | FTPS | Hangouts | ICQ |
---|---|---|---|---|---|---|---|
FCE-KNN | 33.33 | 91.57 | 93.33 | 87.17 | 84 | 97.33 | 16.66 |
DTW-KNN [53] | 0 | 84.21 | 71.66 | 85.17 | 64 | 97.19 | 0 |
C4.5 [16] | 0 | 66.31 | 0 | 73.34 | 0 | 96.27 | 0 |
ADA [32] | 0 | 71.57 | 70 | 87.77 | 44 | 97.24 | 0 |
AISVM [31] | 0 | 0 | 0 | 0 | 0 | 96.18 | 0 |

Method (Recall, %) | Netflix | SFTP | Skype | Spotify | Vimeo | VoipBuster | YouTube |
---|---|---|---|---|---|---|---|
FCE-KNN | 60 | 0 | 82.01 | 37.03 | 59.25 | 98.45 | 69.76 |
DTW-KNN [53] | 42.85 | 0 | 76.02 | 44.44 | 33.33 | 94.44 | 67.44 |
C4.5 [16] | 14.28 | 0 | 89.10 | 18.51 | 0 | 94.75 | 0 |
ADA [32] | 45.71 | 0 | 74.38 | 29.62 | 55.55 | 97.22 | 37.20 |
AISVM [31] | 0 | 0 | 34.87 | 0 | 0 | 86.41 | 0 |

Method (F1-Score) | AIM | BitTorrent | Email | Facebook | FTPS | Hangouts | ICQ |
---|---|---|---|---|---|---|---|
FCE-KNN | 0.44 | 0.92 | 0.87 | 0.88 | 0.82 | 0.96 | 0.22 |
DTW-KNN [53] | 0 | 0.85 | 0.74 | 0.86 | 0.53 | 0.96 | 0 |
C4.5 [16] | 0 | 0.71 | 0 | 0.83 | 0 | 0.95 | 0 |
ADA [32] | 0 | 0.77 | 0.77 | 0.85 | 0.57 | 0.96 | 0 |
AISVM [31] | 0 | 0 | 0 | 0 | 0 | 0.83 | 0 |

Method (F1-Score) | Netflix | SFTP | Skype | Spotify | Vimeo | VoipBuster | YouTube |
---|---|---|---|---|---|---|---|
FCE-KNN | 0.56 | 0 | 0.82 | 0.40 | 0.54 | 0.98 | 0.75 |
DTW-KNN [53] | 0.34 | 0 | 0.76 | 0.39 | 0.42 | 0.96 | 0.69 |
C4.5 [16] | 0.23 | 0 | 0.61 | 0.25 | 0 | 0.95 | 0 |
ADA [32] | 0.51 | 0 | 0.73 | 0.39 | 0.41 | 0.97 | 0.50 |
AISVM [31] | 0 | 0 | 0.31 | 0 | 0 | 0.85 | 0 |

Method | Chat P (%) | Chat R (%) | Chat F1 | File P (%) | File R (%) | File F1 | Streaming P (%) | Streaming R (%) | Streaming F1 | VoIP P (%) | VoIP R (%) | VoIP F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
FCE-KNN | 95.78 | 96.99 | 0.96 | 88.85 | 90.63 | 0.89 | 91.72 | 92.42 | 0.92 | 98.82 | 98.08 | 0.98 |
DTW-KNN [53] | 93.68 | 94.22 | 0.93 | 82.43 | 81.60 | 0.82 | 80.62 | 78.78 | 0.79 | 97.03 | 97.07 | 0.97 |
C4.5 [16] | 97.64 | 91.10 | 0.94 | 64.10 | 83.61 | 0.72 | 82.14 | 52.27 | 0.63 | 96.01 | 96.62 | 0.96 |
ADA [32] | 97.04 | 94.80 | 0.95 | 80.93 | 80.93 | 0.80 | 83.33 | 75.75 | 0.79 | 97.15 | 98.45 | 0.97 |
AISVM [31] | 0 | 0 | 0 | 50.30 | 27.42 | 0.35 | 0 | 0 | 0 | 66.74 | 98.28 | 0.79 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Ma, C.; Du, X.; Cao, L. Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow. Electronics 2020, 9, 324. https://doi.org/10.3390/electronics9020324