Towards a Near-Real-Time Protocol Tunneling Detector Based on Machine Learning Techniques †
Abstract
1. Introduction
- we implement a protocol tunneling detector prototype which analyzes, in near real-time, a byte sequence of the packets flowing in the monitored network.
- the proposed prototype combines:
  - an artificial neural network (ANN), based on [22], that accurately classifies clear-text protocols and identifies possible anomalies in network connections;
  - a support vector machine that is able to detect compressed/encrypted traffic within unencrypted connections.
- we design and implement an input sanitization module, which automatically removes inconsistent data from models’ training sets to significantly increase the models’ performance.
2. Related Work
3. Background
3.1. DNS Tunneling
3.2. Support Vector Machines
3.3. Artificial Neural Networks
4. Protocol Tunneling Detector
4.1. General Approach
- binary representation of collected bytes;
- bit-stream entropy and p-values obtained from statistical tests for random and pseudorandom number generators for cryptographic applications [34];
- statistical properties of the bit-stream hexadecimal representation.
We also keep the protocol label associated with the bit stream itself. While the binary representation of the N bytes is meant to label the protocol of each packet under analysis, the sequential features make it possible to determine whether the packet content is compressed or encrypted.
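As a rough illustration of the first two feature families, the sketch below derives the binary representation of a payload, its Shannon entropy, and the p-value of the frequency (monobit) test, which is the first test of the NIST suite [34]. The function names are hypothetical, and computing the entropy per byte value (rather than per bit) is an assumption; the text does not say which variant the prototype uses, nor which tests of the suite it runs.

```python
import math
from collections import Counter


def to_bits(payload: bytes) -> list[int]:
    """Binary representation of the payload, most significant bit first."""
    return [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]


def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant input, at most 8.0."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)  # occurrences of each byte value
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def monobit_p_value(payload: bytes) -> float:
    """p-value of the NIST SP 800-22 frequency (monobit) test."""
    bits = to_bits(payload)
    if not bits:
        raise ValueError("payload must not be empty")
    # Map each bit to -1/+1 and sum; a balanced stream sums close to zero.
    s_n = sum(2 * b - 1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(len(bits))
    return math.erfc(s_obs / math.sqrt(2))
```

Compressed or encrypted payloads typically show entropy close to 8 bits per byte and do not fail the monobit test, whereas clear-text protocol data usually scores lower on entropy; any decision threshold would have to be learned from data rather than read off this sketch.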
4.2. Feature Extraction
- number of different alphanumeric characters in h normalized over h length;
- number of different letters in h normalized over h length;
- longest consecutive sequence of the same character in h normalized over h length.
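The three statistics above can be sketched as follows, assuming that h is the lowercase hexadecimal string of the collected bytes. The function name `hex_features` and the lowercase encoding are illustrative assumptions, not details taken from the prototype.

```python
from itertools import groupby


def hex_features(payload: bytes) -> tuple[float, float, float]:
    """Return the three statistics of the hex string h, each divided by len(h)."""
    h = payload.hex()  # two lowercase hex characters per byte
    if not h:
        raise ValueError("payload must not be empty")
    n = len(h)
    # Number of different alphanumeric characters in h.
    distinct_alnum = len({ch for ch in h if ch.isalnum()})
    # Number of different letters in h (a-f for a hex string).
    distinct_letters = len({ch for ch in h if ch.isalpha()})
    # Longest consecutive sequence of the same character.
    longest_run = max(len(list(group)) for _, group in groupby(h))
    return distinct_alnum / n, distinct_letters / n, longest_run / n
```

For the payload `b"\xaa\xaa\x01"`, h is `aaaa01`, so the three values are 3/6, 1/6 and 4/6.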
4.3. Input Sanitization
4.4. Anomaly Detection
5. Experimental Evaluation
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- ENISA Threat Landscape 2022. Available online: https://www.enisa.europa.eu/publications/enisa-threat-landscape-2022 (accessed on 6 February 2023).
- Cost of a Data Breach. A Million-Dollar Race to Detect and Respond. 2022. Available online: https://www.ibm.com/reports/data-breach (accessed on 6 February 2023).
- The SolarWinds Cyber-Attack: What You Need to Know. Available online: https://www.cisecurity.org/solarwinds (accessed on 6 February 2023).
- 7 Top Trends in Cybersecurity for 2022. Available online: https://www.gartner.com/en/articles/7-top-trends-in-cybersecurity-for-2022 (accessed on 6 February 2023).
- Ucci, D.; Aniello, L.; Baldoni, R. Survey of machine learning techniques for malware analysis. Comput. Secur. 2019, 81, 123–147.
- Protocol Tunneling. Available online: https://attack.mitre.org/techniques/T1572/ (accessed on 6 February 2023).
- Encrypted Traffic Analysis. Available online: https://www.enisa.europa.eu/publications/encrypted-traffic-analysis (accessed on 6 February 2023).
- Bisio, F.; Saeli, S.; Lombardo, P.; Bernardi, D.; Perotti, A.; Massa, D. Real-time behavioral DGA detection through machine learning. In Proceedings of the International Carnahan Conference on Security Technology (ICCST), Madrid, Spain, 23–26 October 2017; pp. 1–6.
- Lombardo, P.; Saeli, S.; Bisio, F.; Bernardi, D.; Massa, D. Fast Flux Service Network Detection via Data Mining on Passive DNS Traffic. In Proceedings of the International Conference on Information Security, Guildford, UK, 9–12 September 2018; pp. 463–480.
- Saeli, S.; Bisio, F.; Lombardo, P.; Massa, D. DNS Covert Channel Detection via Behavioral Analysis: A Machine Learning Approach. In Proceedings of the International Conference on Malicious and Unwanted Software (MALWARE), Nantucket, MA, USA, 22–24 October 2019; pp. 46–55.
- Ucci, D.; Sobrero, F.; Bisio, F.; Zorzino, M. Near-real-time Anomaly Detection in Encrypted Traffic using Machine Learning Techniques. In Proceedings of the IEEE Symposium Series on Computational Intelligence, SSCI 2021, Orlando, FL, USA, 5–7 December 2021; pp. 1–8.
- Felt, A.P.; Barnes, R.; King, A.; Palmer, C.; Bentzel, C.; Tabriz, P. Measuring HTTPS Adoption on the Web. In Proceedings of the 26th USENIX Conference on Security Symposium, Vancouver, BC, Canada, 16–18 August 2017; pp. 1323–1338.
- The Relevance of Network Security in an Encrypted World. Available online: https://blogs.vmware.com/networkvirtualization/2020/09/network-security-encrypted.html/ (accessed on 6 February 2023).
- Encryption, Privacy in the Internet Trends Report. Available online: https://duo.com/decipher/encryption-privacy-in-the-internet-trends-report (accessed on 6 February 2023).
- Keeping Up with the Performance Demands of Encrypted Web Traffic. Available online: https://www.fortinet.com/blog/industry-trends/keeping-up-with-performance-demands-of-encrypted-web-traffic (accessed on 6 February 2023).
- Google Transparency Report: HTTPS Encryption on the Web. Available online: https://transparencyreport.google.com/https/overview?hl=en (accessed on 6 February 2023).
- Cisco Encrypted Traffic Analytics. Available online: https://www.cisco.com/c/en/us/solutions/collateral/enterprise-networks/enterprise-network-security/nb-09-encrytd-traf-anlytcs-wp-cte-en.pdf (accessed on 6 February 2023).
- ENISA Threat Landscape—Malware. Available online: https://www.enisa.europa.eu/publications/malware/at_download/fullReport (accessed on 6 February 2023).
- Taylor, R.W.; Fritsch, E.J.; Liederbach, J. Digital Crime and Digital Terrorism; Prentice Hall Press: Hoboken, NJ, USA, 2014.
- Cyber Security Review. Available online: https://www.treasuryandrisk.com/2012/02/01/cyber-security-review/ (accessed on 6 February 2023).
- Yadav, T.; Mallari, R.A. Technical aspects of cyber kill chain. arXiv 2016, arXiv:1606.03184.
- Applying Machine Learning to Network Anomalies. Available online: https://www.youtube.com/watch?v=qOfgNd-qijI (accessed on 6 February 2023).
- Wang, Y.; Zhou, A.; Liao, S.; Zheng, R.; Hu, R.; Zhang, L. A comprehensive survey on DNS tunnel detection. Comput. Netw. 2021, 197, 108322.
- Do, V.T.; Engelstad, P.; Feng, B.; van Do, T. Detection of DNS Tunneling in Mobile Networks Using Machine Learning. In Proceedings of the Information Science and Applications, Macau, China, 20–23 March 2017; Kim, K., Joukov, N., Eds.; Springer: Singapore, 2017; pp. 221–230.
- Buczak, A.L.; Hanke, P.A.; Cancro, G.J.; Toma, M.K.; Watkins, L.A.; Chavis, J.S. Detection of Tunnels in PCAP Data by Random Forests. In Proceedings of the CISRC’16 11th Annual Cyber and Information Security Research Conference, Oak Ridge, TN, USA, 5–7 April 2016.
- Lambion, D.; Josten, M.; Olumofin, F.; De Cock, M. Malicious DNS Tunneling Detection in Real-Traffic DNS Data. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 5736–5738.
- Palau, F.; Catania, C.; Guerra, J.; Garcia, S.; Rigaki, M. DNS tunneling: A deep learning based lexicographical detection approach. arXiv 2020, arXiv:2006.06122.
- Zhang, J.; Yang, L.; Yu, S.; Ma, J. A DNS tunneling detection method based on deep learning models to prevent data exfiltration. In Proceedings of the Network and System Security: 13th International Conference, NSS 2019, Sapporo, Japan, 15–18 December 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 520–535.
- Ahmed, J.; Gharakheili, H.H.; Raza, Q.; Russell, C.; Sivaraman, V. Real-time detection of DNS exfiltration and tunneling from enterprise networks. In Proceedings of the 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Arlington, VA, USA, 8–12 April 2019; pp. 649–653.
- Sanjay; Rajendran, B.; Pushparaj Shetty, D. DNS amplification & DNS tunneling attacks simulation, detection and mitigation approaches. In Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 26–28 February 2020; pp. 230–236.
- Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
- Swersky, L.; Marques, H.O.; Sander, J.; Campello, R.J.; Zimek, A. On the evaluation of outlier detection and one-class classification methods. In Proceedings of the 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Montreal, QC, Canada, 17–19 October 2016; pp. 1–10.
- Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938.
- A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. Available online: https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-22r1a.pdf (accessed on 6 February 2023).
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority over-Sampling Technique. J. Artif. Int. Res. 2002, 16, 321–357.
- Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
- Berg, A.; Forsberg, D. Identifying DNS-tunneled traffic with predictive models. arXiv 2019, arXiv:1906.11246.
- Mahdavifar, S.; Hanafy Salem, A.; Victor, P.; Razavi, A.H.; Garzon, M.; Hellberg, N.; Lashkari, A.H. Lightweight Hybrid Detection of Data Exfiltration Using DNS Based on Machine Learning. In Proceedings of the ICCNS 2021: The 11th International Conference on Communication and Network Security, Weihai, China, 3–5 December 2021; pp. 80–86.
- Iodine DNS Tunnel. Available online: https://github.com/elastic/examples/blob/master/Security%20Analytics/dns_tunnel_detection/dns-tunnel-iodine.pcap (accessed on 6 February 2023).
- Ali, S.; Rehman, S.U.; Imran, A.; Adeem, G.; Iqbal, Z.; Kim, K.I. Comparative Evaluation of AI-Based Techniques for Zero-Day Attacks Detection. Electronics 2022, 11, 3934.
Statistics | Count [(%)] |
---|---|
DNS packets | 30,669 (1.10%) |
SMB packets | 65,944 (2.35%) |
HTTP packets | 262 (0.01%) |
NTP packets | 46 (0.002%) |
DHCP packets | 20 (0.001%) |
KRB packets | 741 (0.03%) |
SFTP packets | 69,158 (2.46%) |
Not labeled packets | 61,552 (2.20%) |
SSL packets | 2,571,608 (91.84%) |
Distinct connections | 51,459 |
Distinct source machines | 758 |
Distinct dest. machines | 1566 |
Model | Kernel | t | C | ||
---|---|---|---|---|---|
DHCP one-class SVM | RBF | 0.77 | − | ||
DNS one-class SVM | RBF | 0.77 | − | ||
NTP one-class SVM | RBF | 0.92 | − | ||
HTTP one-class SVM | RBF | 0.91 | − | ||
SMB one-class SVM | RBF | 0.77 | − | ||
KRB one-class SVM | RBF | 0.97 | − | ||
SFTP one-class SVM | RBF | 0.97 | − | ||
SSH one-class SVM | RBF | 0.97 | − | ||
SSL one-class SVM | RBF | 0.97 | − | ||
Compression/encryption detector | RBF | − | − | 100 |
Tunnel Type | No. of PCAP Packets | No. of Processed PCAP Packets | No. of Connections | (%) |
---|---|---|---|---|
Telnet over DNS tunnel [37] | M | M | 457 | 457 |
SFTP over DNS tunnel [37] | 2 M | 1 M | 209 | 209 |
SSH over DNS tunnel [37] | M | M | 545 | 545 |
Light file exfiltration [38] | 187,500 | 102,000 | 7617 | 7361 |
Heavy file exfiltration [38] | M | 765,000 | 43,964 | 42,441 |
Data exfiltration over Iodine DNS tunnel [39] | 438 | 247 | 1 | 1 |
Dataset | No. of PCAP Packets | No. of Processed PCAP Packets | No. of Connections | (%) |
---|---|---|---|---|
Legitimate traffic | M | M | 51,459 | 2966 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sobrero, F.; Clavarezza, B.; Ucci, D.; Bisio, F. Towards a Near-Real-Time Protocol Tunneling Detector Based on Machine Learning Techniques. J. Cybersecur. Priv. 2023, 3, 794–807. https://doi.org/10.3390/jcp3040035