Threat Hunting Architecture Using a Machine Learning Approach for Critical Infrastructures Protection
Abstract
1. Introduction
2. Motivation and Previous Work
3. Outline of the System
4. System Architecture
4.1. Layer 1: Data Collectors
- Logs, such as Syslogs from the Operating System (OS), logs from network hardware devices, etc.
- PCAPs (Packet Captures, files with information about network traffic) [69].
- Threat Management Platforms (TMP), such as MISP [70].
- Advanced Persistent Threat (APT) [73] management tools.
- OSINT (Open Source Intelligence [74]) sources, which have specific normalization needs because of the wide variety of data types they provide.
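As an illustration of what a log collector has to do, the following minimal sketch parses one raw log line and maps it onto the flat ECS-style field names listed in the data model table of this article. The log line format, the regular expression, and the function name `normalize_log_line` are assumptions made for this example; they are not taken from the prototype.

```python
import re
from datetime import datetime, timezone

# Hypothetical log format, assumed for illustration only:
# "<timestamp> <reporting host> <action> src=<ip>:<port> dst=<ip>:<port>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \S+ (?P<action>\S+) "
    r"src=(?P<src_ip>[\d.]+):(?P<src_port>\d+) "
    r"dst=(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)$"
)


def normalize_log_line(line, dataset="syslog"):
    """Map one raw log line onto ECS-style fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "event.dataset": dataset,
        "event.created": match["ts"],
        # Time at which this collector handed the event to the data store
        "event.ingested": datetime.now(timezone.utc).isoformat(),
        "event.action": match["action"],
        "event.original": line,
        "source.ip": match["src_ip"],
        "source.port": int(match["src_port"]),
        "destination.ip": match["dst_ip"],
        "destination.port": int(match["dst_port"]),
    }


raw = "2023-03-03T10:15:00Z fw01 connection-denied src=10.0.0.5:51234 dst=192.168.1.10:22"
event = normalize_log_line(raw)
```

Keeping `event.original` alongside the parsed fields lets later components re-parse the raw message if the normalization rules change. Lines that do not match are returned as `None` here; a real collector would more likely route them to a fallback parser.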
4.2. Layer 2: Database
Proposed Database and Data Model
4.3. Layer 2: Data Preprocessing Components
- Number Normalization: These components transform a dataset of numbers into a new dataset, for example by standardizing it to mean 0 and standard deviation 1, by multiplying all values by a specific factor, or by setting all minimum values to a specific threshold.
- Text Normalization: These components modify texts, for example by removing all forbidden characters or by adapting sentences to a predefined structure.
- One-Hot Encoders: These components convert a categorical variable into a numerical representation by creating one binary indicator for each possible value, set to 1 for the value that is present and to 0 otherwise [82].
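The three kinds of preprocessing component can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the function names, the allowed character set in `normalize_text`, and the sample inputs are hypothetical, and one-hot encoding is shown in its usual sense of one binary indicator per category.

```python
import re
import statistics


def standardize(values):
    """Rescale numbers to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no spread; map every value to 0
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def normalize_text(text, forbidden=r"[^a-z0-9 ._:/-]"):
    """Lowercase, collapse whitespace, and drop characters outside an allowed set."""
    collapsed = " ".join(text.lower().split())
    cleaned = re.sub(forbidden, "", collapsed)
    # Removing characters can leave double spaces, so collapse once more
    return " ".join(cleaned.split())


def one_hot(values):
    """Return the sorted categories and one binary indicator vector per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        vectors.append(row)
    return categories, vectors


scaled = standardize([2.0, 4.0, 6.0])
text = normalize_text("  Failed LOGIN!!  from\t10.0.0.5 ")
categories, vectors = one_hot(["tcp", "udp", "tcp"])
```

In this sketch `text` becomes `"failed login from 10.0.0.5"` and `vectors` becomes `[[1, 0], [0, 1], [1, 0]]` for the categories `["tcp", "udp"]`. In practice a library such as scikit-learn offers equivalent transformers; the plain functions are used here only to keep the example self-contained.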
4.4. Layer 2: ML Components
4.5. Layer 3: Big Data, Exchangers, and Generators
4.5.1. Big Data Statistics
- Which types of attack occur most often?
- Which types of attack have the greatest impact?
- Which devices are usually attacked?
- Which devices are not usually attacked but were attacked recently?
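The questions above can be answered with simple aggregations over stored events. The sketch below is one possible reading; the record fields (`attack_type`, `device`, `impact`, `time`), the sample data, and the seven-day window that defines "recently" are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical event records; field names and values are illustrative only
events = [
    {"attack_type": "phishing", "device": "mail-gw", "impact": 2, "time": datetime(2023, 1, 10)},
    {"attack_type": "phishing", "device": "mail-gw", "impact": 3, "time": datetime(2023, 2, 1)},
    {"attack_type": "port-scan", "device": "fw01", "impact": 1, "time": datetime(2023, 2, 15)},
    {"attack_type": "phishing", "device": "mail-gw", "impact": 2, "time": datetime(2023, 2, 27)},
    {"attack_type": "ransomware", "device": "scada-01", "impact": 9, "time": datetime(2023, 3, 1)},
]


def most_frequent_attack_types(events, n=3):
    """Attack types ranked by number of occurrences."""
    return Counter(e["attack_type"] for e in events).most_common(n)


def total_impact_by_type(events):
    """Attack types ranked by summed impact score."""
    totals = Counter()
    for e in events:
        totals[e["attack_type"]] += e["impact"]
    return totals.most_common()


def most_attacked_devices(events, n=3):
    """Devices ranked by number of attacks."""
    return Counter(e["device"] for e in events).most_common(n)


def newly_attacked_devices(events, now, recent=timedelta(days=7), max_history=0):
    """Devices attacked inside the recent window with at most max_history earlier attacks."""
    cutoff = now - recent
    earlier = Counter(e["device"] for e in events if e["time"] < cutoff)
    recent_devices = {e["device"] for e in events if e["time"] >= cutoff}
    return sorted(d for d in recent_devices if earlier[d] <= max_history)


now = datetime(2023, 3, 3)
top_types = most_frequent_attack_types(events, n=1)
impact_ranking = total_impact_by_type(events)
new_targets = newly_attacked_devices(events, now)
```

With this sample data, phishing is the most frequent attack type, ransomware has the highest summed impact, and `scada-01` is the only device attacked in the last week without any earlier history. At the data volumes implied by the architecture, the same aggregations would more plausibly be pushed down to the database than computed in memory.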
4.5.2. Data Exchangers
4.5.3. Hypothesis Generators
- Simple filters: Basic filtering rules (e.g., if/else rules).
- Complex filters: These rules add context by selecting additional data related to the event under analysis (e.g., finding how many times the same pattern has been repeated).
- ML filters: These apply ML techniques from ML components to generate hypotheses.
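The three filter types can be sketched as small functions that each return a hypothesis string or nothing. This is an illustrative sketch, not the prototype's implementation: the rule contents, the repetition threshold, and the `ml_score` callable that stands in for an ML component are assumptions.

```python
def simple_filter(event):
    """Basic if/else rule: flag denied connections to the SSH port."""
    if event["event.action"] == "connection-denied" and event["destination.port"] == 22:
        return "possible SSH probing"
    return None


def complex_filter(event, history, threshold=3):
    """Add context by counting how often the same source/action pattern was seen."""
    key = (event["source.ip"], event["event.action"])
    repeats = sum(1 for e in history if (e["source.ip"], e["event.action"]) == key)
    if repeats >= threshold:
        return f"pattern repeated {repeats} times from {event['source.ip']}"
    return None


def generate_hypotheses(event, history, ml_score=None, ml_threshold=0.8):
    """Run all filters on one event and collect the hypotheses they raise."""
    hypotheses = []
    for result in (simple_filter(event), complex_filter(event, history)):
        if result is not None:
            hypotheses.append(result)
    # ml_score is a hypothetical callable wrapping an ML component's prediction
    if ml_score is not None and ml_score(event) >= ml_threshold:
        hypotheses.append("flagged by ML component")
    return hypotheses


event = {
    "event.action": "connection-denied",
    "source.ip": "10.0.0.5",
    "destination.port": 22,
}
history = [event, event, event]
hypotheses = generate_hypotheses(event, history, ml_score=lambda e: 0.9)
```

Here all three filters fire, so `hypotheses` contains three entries. Treating the ML filter as an injected callable keeps the generator independent of any particular model, which matches the idea of reusing ML components across hypothesis generators.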
4.5.4. ML Sequences Presets
4.6. Layer 4: Interaction Components
4.6.1. HMI
4.6.2. External Access Gateway
4.7. Common Layer: Communications
4.8. Common Layer: Authentication Management
5. System Prototype
5.1. Components
- Sigma Converters.
- Number Normalization.
- Text Normalization.
- One-Hot Encoders.
- APT Clustering components.
- Anomaly detectors.
- NLP.
- Decision trees.
- Neural networks.
5.2. Validation
5.2.1. HMI: Raw Data Visualizations
5.2.2. HMI: Chart Data Visualizations
5.3. Verification
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
API | Application Programming Interface |
APT | Advanced Persistent Threat |
CI | Critical Infrastructures |
CSA | Cyber Situational Awareness |
ECS | Elastic Common Schema |
ES | Elasticsearch |
HA | High Availability |
HMI | Human-Machine Interface |
IDS | Intrusion Detection System |
IoC | Indicator of Compromise |
IoT | Internet of Things |
IP | Internet Protocol |
IPS | Intrusion Prevention System |
IT | Information Technology |
ML | Machine Learning |
OS | Operating System |
OSINT | Open Source Intelligence |
OTP | One-Time Password |
SDN | Software-Defined Networks |
SIEM | Security Information and Event Management |
SME | Small and Medium Enterprise |
SSLA | Security Service Level Agreements |
TMP | Threat Management Platforms |
VPN | Virtual Private Network |
VR | Virtual Reality |
References
- PRAETORIAN. D3.1 Transitioning Risk Management, 2021. PRAETORIAN H2020 Project Deliverables, Not yet published.
- Li, J.H. Cyber security meets artificial intelligence: A survey. Front. Inf. Technol. Electron. Eng. 2018, 19, 1462–1474. [Google Scholar] [CrossRef]
- Falandays, J.B.; Nguyen, B.; Spivey, M.J. Is prediction nothing more than multi-scale pattern completion of the future? Brain Res. 2021, 1768, 147578. [Google Scholar] [CrossRef]
- Federmeier, K.D. Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology 2007, 44, 491–505. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Riegler, A. The role of anticipation in cognition. In Proceedings of the AIP Conference Proceedings. Am. Inst. Phys. 2001, 573, 534–541. [Google Scholar]
- Slattery, T.J.; Yates, M. Word skipping: Effects of word length, predictability, spelling and reading skill. Q. J. Exp. Psychol. 2018, 71, 250–259. [Google Scholar] [CrossRef] [PubMed]
- Lehner, P.; Seyed-Solorforough, M.M.; O’Connor, M.F.; Sak, S.; Mullin, T. Cognitive biases and time stress in team decision making. IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 1997, 27, 698–703. [Google Scholar] [CrossRef]
- Bilge, L.; Dumitraş, T. Before we knew it: An empirical study of zero-day attacks in the real world. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, Raleigh, NC, USA, 16–18 October 2012; pp. 833–844. [Google Scholar]
- Markowsky, G.; Markowsky, L. Visualizing cybersecurity events. In Proceedings of the International Conference on Security and Management (SAM), Las Vegas, NV, USA, 22–25 July 2013; p. 1. [Google Scholar]
- Young, C.S. Representing Cybersecurity Risk. In Cybercomplexity; Springer: Berlin/Heidelberg, Germany, 2022; pp. 19–24. [Google Scholar]
- Endsley, M.R. Measurement of situation awareness in dynamic systems. Hum. Factors 1995, 37, 65–84. [Google Scholar] [CrossRef]
- Franke, U.; Brynielsson, J. Cyber situational awareness–a systematic review of the literature. Comput. Secur. 2014, 46, 18–31. [Google Scholar] [CrossRef]
- Chen, S.; Guo, C.; Yuan, X.; Merkle, F.; Schaefer, H.; Ertl, T. Oceans: Online collaborative explorative analysis on network security. In Proceedings of the Eleventh Workshop on Visualization for Cyber Security, Paris, France, 10 November 2014; pp. 1–8. [Google Scholar]
- Choi, H.; Lee, H. PCAV: Internet attack visualization on parallel coordinates. In Proceedings of the International Conference on Information and Communications Security, Beijing, China, 10–13 December 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 454–466. [Google Scholar]
- Jahromi, A.N.; Hashemi, S.; Dehghantanha, A.; Parizi, R.M.; Choo, K.K.R. An enhanced stacked LSTM method with no random initialization for malware threat hunting in safety and time-critical systems. IEEE Trans. Emerg. Top. Comput. Intell. 2020, 4, 630–640. [Google Scholar] [CrossRef]
- Schmitt, S.; Kandah, F.I.; Brownell, D. Intelligent threat hunting in software-defined networking. In Proceedings of the 2019 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 11–13 January 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–5. [Google Scholar]
- Schmitt, S. Advanced Threat Hunting over Software-Defined Networks in Smart Cities; University of Tennessee at Chattanooga: Chattanooga, Tennessee, USA, 2018. [Google Scholar]
- HaddadPajouh, H.; Dehghantanha, A.; Khayami, R.; Choo, K.K.R. A deep recurrent neural network based approach for internet of things malware threat hunting. Future Gener. Comput. Syst. 2018, 85, 88–96. [Google Scholar] [CrossRef]
- Raju, A.D.; Abualhaol, I.Y.; Giagone, R.S.; Zhou, Y.; Huang, S. A survey on cross-architectural IoT malware threat hunting. IEEE Access 2021, 9, 91686–91709. [Google Scholar] [CrossRef]
- Homayoun, S.; Dehghantanha, A.; Ahmadzadeh, M.; Hashemi, S.; Khayami, R. Know abnormal, find evil: Frequent pattern mining for ransomware threat hunting and intelligence. IEEE Trans. Emerg. Top. Comput. 2017, 8, 341–351. [Google Scholar] [CrossRef]
- Neto, A.J.H.; dos Santos, A.F.P. Cyber threat hunting through automated hypothesis and multi-criteria decision making. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1823–1830. [Google Scholar]
- Gonzalez-Granadillo, G.; Faiella, M.; Medeiros, I.; Azevedo, R.; Gonzalez-Zarzosa, S. ETIP: An Enriched Threat Intelligence Platform for improving OSINT correlation, analysis, visualization and sharing capabilities. J. Inf. Secur. Appl. 2021, 58, 102715. [Google Scholar] [CrossRef]
- Azevedo, R.; Medeiros, I.; Bessani, A. PURE: Generating quality threat intelligence by clustering and correlating OSINT. In Proceedings of the 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications (TrustCom), Rotorua, New Zealand, 5–8 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 483–490. [Google Scholar]
- Alves, F.; Ferreira, P.M.; Bessani, A. OSINT-based Data-driven Cybersecurity Discovery. In Proceedings of the 12th Eurosys Doctoral Conference, Porto, Portugal, 23 April 2018; pp. 1–5. [Google Scholar]
- Kott, A.; Wang, C.; Erbacher, R.F. Cyber Defense and Situational Awareness; Springer: Berlin/Heidelberg, Germany, 2015; Volume 62. [Google Scholar]
- Greitzer, F.L.; Noonan, C.F.; Franklin, L. Cognitive Foundations for Visual Analytics; Technical Report; Pacific Northwest National Lab.(PNNL): Richland, WA, USA, 2011. [Google Scholar]
- Eslami, M.; Zheng, G.; Eramian, H.; Levchuk, G. Deriving cyber use cases from graph projections of cyber data represented as bipartite graphs. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 4658–4663. [Google Scholar]
- Kotenko, I.; Novikova, E. Visualization of security metrics for cyber situation awareness. In Proceedings of the 2014 Ninth International Conference on Availability, Reliability and Security, Fribourg, Switzerland, 8–12 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 506–513. [Google Scholar]
- Beaver, J.M.; Steed, C.A.; Patton, R.M.; Cui, X.; Schultz, M. Visualization techniques for computer network defense. In Proceedings of the Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security and Homeland Defense X. SPIE, Orlando, FL, USA, 25–28 April 2011; Volume 8019, pp. 18–26. [Google Scholar]
- Goodall, J.R.; Ragan, E.D.; Steed, C.A.; Reed, J.W.; Richardson, G.D.; Huffer, K.M.; Bridges, R.A.; Laska, J.A. Situ: Identifying and explaining suspicious behavior in networks. IEEE Trans. Vis. Comput. Graph. 2018, 25, 204–214. [Google Scholar] [CrossRef] [PubMed]
- Zhuo, Y.; Zhang, Q.; Gong, Z. Cyberspace situation representation based on niche theory. In Proceedings of the 2008 International Conference on Information and Automation, Zhangjiajie, China, 20–23 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1400–1405. [Google Scholar]
- Pike, W.A.; Scherrer, C.; Zabriskie, S. Putting security in context: Visual correlation of network activity with real-world information. In VizSEC 2007; Springer: Berlin/Heidelberg, Germany, 2008; pp. 203–220. [Google Scholar]
- Abraham, S.; Nair, S. Comparative analysis and patch optimization using the cyber security analytics framework. J. Def. Model. Simul. 2018, 15, 161–180. [Google Scholar] [CrossRef]
- Graf, R.; Gordea, S.; Ryan, H.M.; Houzanme, T. An Expert System for Facilitating an Institutional Risk Profile Definition for Cyber Situational Awareness. In Proceedings of the ICISSP, Rome, Italy, 19–21 February 2016; pp. 347–354. [Google Scholar]
- Lohmann, S.; Heimerl, F.; Bopp, F.; Burch, M.; Ertl, T. Concentri cloud: Word cloud visualization for multiple text documents. In Proceedings of the 2015 19th International Conference on Information Visualisation, Barcelona, Spain, 22–24 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 114–120. [Google Scholar]
- Xu, J.; Tao, Y.; Lin, H. Semantic word cloud generation based on word embeddings. In Proceedings of the 2016 IEEE Pacific Visualization Symposium (PacificVis), Taipei, Taiwan, 19–22 April 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 239–243. [Google Scholar]
- De Ville, B. Decision trees. Wiley Interdiscip. Rev. Comput. Stat. 2013, 5, 448–455. [Google Scholar]
- Tak, S.; Cockburn, A. Enhanced spatial stability with hilbert and moore treemaps. IEEE Trans. Vis. Comput. Graph. 2012, 19, 141–148. [Google Scholar] [CrossRef] [Green Version]
- Angelini, M.; Bonomi, S.; Lenti, S.; Santucci, G.; Taggi, S. MAD: A visual analytics solution for Multi-step cyber Attacks Detection. J. Comput. Lang. 2019, 52, 10–24. [Google Scholar]
- Zhong, C.; Alnusair, A.; Sayger, B.; Troxell, A.; Yao, J. AOH-map: A mind mapping system for supporting collaborative cyber security analysis. In Proceedings of the 2019 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), Las Vegas, NV, USA, 8–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 74–80. [Google Scholar]
- Cho, S.; Han, I.; Jeong, H.; Kim, J.; Koo, S.; Oh, H.; Park, M. Cyber kill chain based threat taxonomy and its application on cyber common operational picture. In Proceedings of the 2018 International Conference On Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA), Glasgow, Scotland, UK, 11–12 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8. [Google Scholar]
- Kabil, A.; Duval, T.; Cuppens, N.; Comte, G.L.; Halgand, Y.; Ponchel, C. From cyber security activities to collaborative virtual environments practices through the 3D cybercop platform. In Proceedings of the International Conference on Information Systems Security, Funchal, Madeira, Portugal, 22–24 January 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 272–287. [Google Scholar]
- Kopylec, J.; D’Amico, A.; Goodall, J. Visualizing cascading failures in critical cyber infrastructures. In Proceedings of the International Conference on Critical Infrastructure Protection, Hanover, NH, USA, 18–21 March 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 351–364. [Google Scholar]
- Llopis, S.; Hingant, J.; Pérez, I.; Esteve, M.; Carvajal, F.; Mees, W.; Debatty, T. A comparative analysis of visualisation techniques to achieve cyber situational awareness in the military. In Proceedings of the 2018 International Conference on Military Communications and Information Systems (ICMCIS), Warsaw, Poland, 22–23 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–7. [Google Scholar]
- Carvalho, V.S.; Polidoro, M.J.; Magalhaes, J.P. Owlsight: Platform for real-time detection and visualization of cyber threats. In Proceedings of the 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), New York, NY, USA, 8–10 April 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 61–66. [Google Scholar]
- Pietrowicz, S.; Falchuk, B.; Kolarov, A.; Naidu, A. Web-Based Smart Grid Network Analytics Framework. In Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration, San Francisco, CA, USA, 13–15 August 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 496–501. [Google Scholar]
- Matuszak, W.J.; DiPippo, L.; Sun, Y.L. Cybersave: Situational awareness visualization for cyber security of smart grid systems. In Proceedings of the Tenth Workshop on Visualization for Cyber Security, Atlanta, GA, USA, 14 October 2013; pp. 25–32. [Google Scholar]
- Kabil, A.; Duval, T.; Cuppens, N. Alert characterization by non-expert users in a cybersecurity virtual environment: A usability study. In Proceedings of the International Conference on Augmented Reality, Virtual Reality and Computer Graphics, Lecce, Italy, 7–10 September 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 82–101. [Google Scholar]
- Kullman, K.; Cowley, J.; Ben-Asher, N. Enhancing cyber defense situational awareness using 3D visualizations. In Proceedings of the 13th International Conference on Cyber Warfare and Security ICCWS 2018, National Defense University, Washington, DC, USA, 8–9 March 2018; pp. 369–378. [Google Scholar]
- Kullman, K.; Asher, N.B.; Sample, C. Operator impressions of 3D visualizations for cybersecurity analysts. In Proceedings of the ECCWS 2019 18th European Conference on Cyber Warfare and Security, Coimbra, Portugal, 4–5 July 2019; Academic Conferences and publishing limited: Red Hook, NY, USA, 2019; p. 257. [Google Scholar]
- Reed, J. Threat Hunting with ML: Another Reason to SMLE. 17 February 2021. Available online: https://www.splunk.com/en_us/blog/platform/threat-research-at-splunk-using-smle.html (accessed on 28 March 2023).
- Liang, J.; Kim, Y. Evolution of Firewalls: Toward Securer Network Using Next Generation Firewall. In Proceedings of the 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), Virtual, 26–29 January 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 752–759. [Google Scholar]
- IBM X-Force Exchange. Available online: https://exchange.xforce.ibmcloud.com/ (accessed on 3 March 2023).
- The Security Immune System: An Integrated Approach to Protecting Your Organization. Available online: https://www.midlandinfosys.com/pdf/qradar-siem-cybersecurity-ai-products.pdf (accessed on 3 March 2023).
- Anomali ThreatStream: Automated Threat Intelligence Management at Scale. Available online: https://www.anomali.com/products/threatstream (accessed on 3 March 2023).
- Wang, B.; Najjar, L.; Xiong, N.N.; Chen, R.C. Stochastic optimization: Theory and applications. J. Appl. Math. 2013, 2013, 949131. [Google Scholar] [CrossRef]
- McCall, J. Genetic algorithms for modelling and optimisation. J. Comput. Appl. Math. 2005, 184, 205–222. [Google Scholar] [CrossRef]
- Jangla, K. Docker Compose. In Accelerating Development Velocity Using Docker; Springer: Berlin/Heidelberg, Germany, 2018; pp. 77–98. [Google Scholar]
- Li, Y.; Li, W.; Jiang, C. A survey of virtual machine system: Current technology and future trends. In Proceedings of the 2010 Third International Symposium on Electronic Commerce and Security, Guangzhou, China, 29–31 July 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 332–336. [Google Scholar]
- Medel, V.; Rana, O.; Bañares, J.Á.; Arronategui, U. Modelling performance & resource management in kubernetes. In Proceedings of the 9th International Conference on Utility and Cloud Computing, Shanghai, China, 6–9 December 2016; pp. 257–262. [Google Scholar]
- Kotas, C.; Naughton, T.; Imam, N. A comparison of Amazon Web Services and Microsoft Azure cloud platforms for high performance computing. In Proceedings of the 2018 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 12–14 January 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–4. [Google Scholar]
- Gray, J.; Siewiorek, D.P. High-availability computer systems. Computer 1991, 24, 39–48. [Google Scholar] [CrossRef] [Green Version]
- Wilson, K.S. Conflicts among the pillars of information assurance. IT Prof. 2012, 15, 44–49. [Google Scholar] [CrossRef]
- Rinaldi, S.M.; Peerenboom, J.P.; Kelly, T.K. Identifying, understanding, and analyzing critical infrastructure interdependencies. IEEE Control Syst. Mag. 2001, 21, 11–25. [Google Scholar]
- Fleissner, S.; Baniassad, E. A commensalistic software system. In Proceedings of the Companion to the 21st ACM SIGPLAN Symposium on Object-Oriented Programming Systems, Languages, and Applications, Portland, OR, USA, 22–26 October 2006; pp. 560–573. [Google Scholar]
- Torchiano, M.; Jaccheri, L.; Sørensen, C.F.; Wang, A.I. COTS products characterization. In Proceedings of the 14th International Conference on Software Engineering and Knowledge Engineering, Ischia, Italy, 15–19 July 2002; pp. 335–338. [Google Scholar]
- Coppolino, L.; D’Antonio, S.; Formicola, V.; Romano, L. Integration of a System for Critical Infrastructure Protection with the OSSIM SIEM Platform: A dam case study. In Proceedings of the International Conference on Computer Safety, Reliability, and Security, Naples, Italy, 19–22 September 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 199–212. [Google Scholar]
- Cerullo, G.; Formicola, V.; Iamiglio, P.; Sgaglione, L. Critical Infrastructure Protection: Having SIEM technology cope with network heterogeneity. arXiv 2014, arXiv:1404.7563. [Google Scholar]
- Veselý, V. Extended Comparison Study on Merging PCAP Files. ElectroScope 2012, 2012, 1–6. [Google Scholar]
- Wagner, C.; Dulaunoy, A.; Wagener, G.; Iklody, A. Misp: The design and implementation of a collaborative threat intelligence sharing platform. In Proceedings of the 2016 ACM on Workshop on Information Sharing and Collaborative Security, Vienna, Austria, 24 October 2016; pp. 49–56. [Google Scholar]
- Groenewegen, A.; Janssen, J.S. TheHive Project: The Maturity of an Open-Source Security Incident Response Platform; SNE/OS3; University of Amsterdam: Amsterdam, The Netherlands, 2021. [Google Scholar]
- Gonashvili, M. Knowledge Management for Incident Response Teams; Masaryk University: Brno, Czech Republic, 2019. [Google Scholar]
- Cole, E. Advanced Persistent Threat: Understanding the Danger and How to Protect Your Organization; Syngress: Oxford, UK, 2012. [Google Scholar]
- Tabatabaei, F.; Wells, D. OSINT in the Context of Cyber-Security. Open Source Intell. Investig. 2016, 1, 213–231. [Google Scholar]
- Verhoef, R. Sigma Rules! The Generic Signature Format for SIEM Systems. 19 June 2020. Available online: https://isc.sans.edu/diary/rss/26258 (accessed on 7 February 2023).
- Ömer. What Is Sigma? Threat Hunting in Siem Products with Sigma Rules–Example Sigma Rules. 21 March 2021. Available online: https://www.systemconf.com/2021/03/21/what-is-sigma-threat-hunting-in-siem-products-with-sigma-rules-example-sigma-rules/ (accessed on 7 February 2023).
- Naik, N.; Jenkins, P.; Savage, N.; Yang, L.; Boongoen, T.; Iam-On, N.; Naik, K.; Song, J. Embedded YARA rules: Strengthening YARA rules utilising fuzzy hashing and fuzzy rules for malware analysis. Complex Intell. Syst. 2021, 7, 687–702. [Google Scholar] [CrossRef]
- Naik, N.; Jenkins, P.; Savage, N.; Yang, L. Cyberthreat Hunting-Part 1: Triaging ransomware using fuzzy hashing, import hashing and YARA rules. In Proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), New Orleans, LA, USA, 23–26 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6. [Google Scholar]
- Knuth, D.E. The Art of Computer Programming, 2nd ed.; Sorting and Searching; Addison Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1998; Volume 3. [Google Scholar]
- Gianvecchio, S.; Burkhalter, C.; Lan, H.; Sillers, A.; Smith, K. Closing the Gap with APTs Through Semantic Clusters and Automated Cybergames. In Proceedings of the Security and Privacy in Communication Networks, Orlando, FL, USA, 23–25 October 2019; Chen, S., Choo, K.K.R., Fu, X., Lou, W., Mohaisen, A., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 235–254. [Google Scholar]
- Divya, M.S.; Goyal, S.K. ElasticSearch: An advanced and quick search technique to handle voluminous data. Compusoft 2013, 2, 171. [Google Scholar]
- Hancock, J.T.; Khoshgoftaar, T.M. Survey on categorical data for neural networks. J. Big Data 2020, 7, 28. [Google Scholar] [CrossRef] [Green Version]
- Schetinin, V.; Schult, J. A neural-network technique to learn concepts from electroencephalograms. Theory Biosci. 2005, 124, 41–53. [Google Scholar] [CrossRef] [Green Version]
- Gallant, S.I. Neural Network Learning and Expert Systems; MIT Press: Cambridge, MA, USA, 1993. [Google Scholar]
- Murthy, S.K.; Kasif, S.; Salzberg, S. A system for induction of oblique decision trees. J. Artif. Intell. Res. 1994, 2, 1–32. [Google Scholar] [CrossRef]
- Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef] [Green Version]
- Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: A new data clustering algorithm and its applications. Data Min. Knowl. Discov. 1997, 1, 141–182. [Google Scholar] [CrossRef]
- Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: An efficient data clustering method for very large databases. ACM Sigmod Rec. 1996, 25, 103–114. [Google Scholar] [CrossRef]
- Khan, K.; Rehman, S.U.; Aziz, K.; Fong, S.; Sarasvady, S. DBSCAN: Past, present and future. In Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Chennai, India, 17–19 February 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 232–238. [Google Scholar]
- Çelik, M.; Dadaşer-Çelik, F.; Dokuz, A.Ş. Anomaly detection in temperature data using DBSCAN algorithm. In Proceedings of the 2011 International Symposium on Innovations in Intelligent Systems and Applications, Istanbul, Turkey, 15–18 June 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 91–95. [Google Scholar]
- Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 413–422. [Google Scholar]
- Ding, Z.; Fei, M. An anomaly detection approach based on isolation forest algorithm for streaming data using sliding window. IFAC Proc. Vol. 2013, 46, 12–17. [Google Scholar] [CrossRef]
- Amer, M.; Goldstein, M.; Abdennadher, S. Enhancing one-class support vector machines for unsupervised anomaly detection. In Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description, Chicago, IL, USA, 11 August 2013; pp. 8–15. [Google Scholar]
- Hejazi, M.; Singh, Y.P. One-class support vector machines approach to anomaly detection. Appl. Artif. Intell. 2013, 27, 351–366. [Google Scholar] [CrossRef]
- Ukwen, D.O.; Karabatak, M. Review of NLP-based Systems in Digital Forensics and Cybersecurity. In Proceedings of the 2021 9th International Symposium on Digital Forensics and Security (ISDFS), Elazig, Turkey, 28–29 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–9. [Google Scholar]
- Georgescu, T.M. Natural language processing model for automatic analysis of cybersecurity-related documents. Symmetry 2020, 12, 354. [Google Scholar] [CrossRef]
- Mathews, S.M. Explainable artificial intelligence applications in NLP, biomedical, and malware classification: A literature review. In Proceedings of the Intelligent Computing-Proceedings of the Computing Conference, London, UK, 16–17 July 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1269–1292. [Google Scholar]
- Al-Omari, M.; Rawashdeh, M.; Qutaishat, F.; Alshira’H, M.; Ababneh, N. An intelligent tree-based intrusion detection model for cyber security. J. Netw. Syst. Manag. 2021, 29, 20. [Google Scholar] [CrossRef]
- Sarker, I.H. Deep cybersecurity: A comprehensive overview from neural network and deep learning perspective. SN Comput. Sci. 2021, 2, 154. [Google Scholar]
- Fang, H. Managing data lakes in big data era: What’s a data lake and why has it became popular in data management ecosystem. In Proceedings of the 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), Shenyang, China, 8–12 June 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 820–824. [Google Scholar]
- Goyal, G.; Singh, K.; Ramkumar, K. A detailed analysis of data consistency concepts in data exchange formats (JSON & XML). In Proceedings of the 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 5–6 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 72–77. [Google Scholar]
- Barnum, S. Standardizing cyber threat intelligence information with the structured threat information expression (stix). Mitre Corp. 2012, 11, 1–22. [Google Scholar]
- Riesco, R.; Villagrá, V.A. Leveraging cyber threat intelligence for a dynamic risk framework. Int. J. Inf. Secur. 2019, 18, 715–739. [Google Scholar] [CrossRef]
- Na, S.; Kim, T.; Kim, H. A study on the classification of common vulnerabilities and exposures using naïve bayes. In Proceedings of the International Conference on Broadband and Wireless Computing, Communication and Applications, Asan, Republic of Korea, 5–7 November 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 657–662. [Google Scholar]
- Radack, S.; Kuhn, R. Managing security: The security content automation protocol. IT Prof. 2011, 13, 9–11. [Google Scholar] [CrossRef]
- VirusTotal: Analyse Suspicious Files, Domains, IPs and URLs to Detect Malware and Other Breaches, Automatically Share Them with the Security Community. Available online: https://www.virustotal.com (accessed on 3 March 2023).
- URLhaus: Malware URL Exchange. Available online: https://urlhaus.abuse.ch/ (accessed on 3 March 2023).
- Masse, M. REST API Design Rulebook: Designing Consistent RESTful Web Service Interfaces; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2011. [Google Scholar]
- Naik, N. Choice of effective messaging protocols for IoT systems: MQTT, CoAP, AMQP and HTTP. In Proceedings of the 2017 IEEE International Systems Engineering Symposium (ISSE), Vienna, Austria, 11–13 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–7. [Google Scholar]
- Sandhu, R.S.; Coyne, E.J.; Feinstein, H.L.; Youman, C.E. Role-based access control models. Computer 1996, 29, 38–47. [Google Scholar] [CrossRef] [Green Version]
- Tomasek, M.; Cerny, T. On web services ui in user interface generation in standalone applications. In Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems, Prague, Czech Republic, 9–12 October 2015; pp. 363–368. [Google Scholar]
- Montesi, F.; Weber, J. Circuit breakers, discovery, and API gateways in microservices. arXiv 2016, arXiv:1609.05830. [Google Scholar]
- Xu, R.; Jin, W.; Kim, D. Microservice security agent based on API gateway in edge computing. Sensors 2019, 19, 4905. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Jeong, J.; Chung, M.Y.; Choo, H. Integrated OTP-based user authentication scheme using smart cards in home networks. In Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008), Big Island, HI, USA, 7–10 January 2008; IEEE: Piscataway, NJ, USA, 2008; p. 294. [Google Scholar]
- Zhao, S.; Hu, W. Improvement on OTP authentication and a possession-based authentication framework. Int. J. Multimed. Intell. Secur. 2018, 3, 187–203. [Google Scholar] [CrossRef]
- Bihis, C. Mastering OAuth 2.0; Packt Publishing Ltd.: Birmingham, UK, 2015. [Google Scholar]
- Hardt, D. The OAuth 2.0 Authorization Framework. RFC 6749, RFC Editor, 2012. Available online: http://www.rfc-editor.org/rfc/rfc6749.txt (accessed on 28 March 2023).
- Haag, S.; Anderl, R. Digital twin–Proof of concept. Manuf. Lett. 2018, 15, 64–66. [Google Scholar] [CrossRef]
- Srinath, K. Python–the fastest growing programming language. Int. Res. J. Eng. Technol. 2017, 4, 354–357. [Google Scholar]
- Nelli, F. Python Data Analytics: Data Analysis and Science Using PANDAs, Matplotlib and the Python Programming Language; Apress: Sebastopol, CA, USA, 2015. [Google Scholar]
- Hao, J.; Ho, T.K. Machine learning made easy: A review of scikit-learn package in python programming language. J. Educ. Behav. Stat. 2019, 44, 348–361. [Google Scholar] [CrossRef]
- Al-Shaer, R.; Spring, J.M.; Christou, E. Learning the associations of MITRE ATT&CK adversarial techniques. In Proceedings of the 2020 IEEE Conference on Communications and Network Security (CNS), Virtual, 28–30 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–9. [Google Scholar]
- Alexander, O.; Belisle, M.; Steele, J. MITRE ATT&CK for Industrial Control Systems: Design and Philosophy; The MITRE Corporation: Bedford, MA, USA, 2020. [Google Scholar]
- Ahmed, M.; Panda, S.; Xenakis, C.; Panaousis, E. MITRE ATT&CK-driven cyber risk assessment. In Proceedings of the 17th International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; pp. 1–10. [Google Scholar]
- Roy, G.M. RabbitMQ in Depth; Simon and Schuster: New York, NY, USA, 2017. [Google Scholar]
- Ionescu, V.M. The analysis of the performance of RabbitMQ and ActiveMQ. In Proceedings of the 2015 14th RoEduNet International Conference-Networking in Education and Research (RoEduNet NER), Craiova, Romania, 24–26 September 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 132–137. [Google Scholar]
- Rostanski, M.; Grochla, K.; Seman, A. Evaluation of highly available and fault-tolerant middleware clustered architectures using RabbitMQ. In Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland, 7–10 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 879–884. [Google Scholar]
ECS Field | Description |
---|---|
event.dataset | Name of the dataset |
event.id | Unique ID to describe the event |
event.ingested | Timestamp when the event arrived at the central data store |
event.created | The date/time when the event was first read by an agent |
event.start | The date when the event started |
event.end | The date when the event ended |
event.action | The action captured by the event |
event.original | Raw text message of entire event |
source.ip | IP address of the source (IPv4 or IPv6) |
source.mac | MAC address of the source |
source.port | Port of the source |
source.hostname | Hostname of the source |
destination.ip | IP address of the destination (IPv4 or IPv6) |
destination.mac | MAC address of the destination |
destination.port | Port of the destination |
destination.hostname | Hostname of the destination |
Question |
---|
Does the prototype give fast access to the information considered as relevant? |
Does the prototype receive updated information from external sources? |
Does the prototype send information to external sources? |
Does the prototype provide tools to easily create/edit/delete preprocessing components? |
Does the prototype provide tools to easily create/edit/delete ML components? |
Does the prototype help in the decision-making process? |
Is the prototype easy to use? |
Metric | After 1 Month | After 6 Months |
---|---|---|
Percentage of benign events marked correctly by the prototype | 31.56% | 83.49% |
Percentage of malicious events marked correctly by the platform | 23.16% | 73.08% |
Ratio of likeliness of the hypothesis | 24.62% | 89.24% |
Percentage of attacks detected by the platform | 26.74% | 86.31% |
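The table does not state how its percentages were computed. One plausible reading of the first two rows, offered here only as an assumption, is the share of events of a given true class that the system labeled with that class. The helper `percent_correct` and the toy labels below are hypothetical.

```python
def percent_correct(labels, predictions, target):
    """Percentage of events whose true label is `target` that were predicted as `target`."""
    relevant = [(y, p) for y, p in zip(labels, predictions) if y == target]
    if not relevant:
        raise ValueError(f"no events with true label {target!r}")
    correct = sum(1 for y, p in relevant if p == y)
    return 100.0 * correct / len(relevant)


# Toy ground truth and system output, for illustration only
labels = ["benign", "benign", "benign", "benign", "malicious", "malicious"]
predictions = ["benign", "benign", "benign", "malicious", "malicious", "benign"]

benign_rate = percent_correct(labels, predictions, "benign")
malicious_rate = percent_correct(labels, predictions, "malicious")
```

On the toy data this gives 75.0 for benign events and 50.0 for malicious events. Under this reading the two rows correspond to the true negative rate and the true positive rate, which is why they can move independently of each other.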
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Aragonés Lozano, M.; Pérez Llopis, I.; Esteve Domingo, M. Threat Hunting Architecture Using a Machine Learning Approach for Critical Infrastructures Protection. Big Data Cogn. Comput. 2023, 7, 65. https://doi.org/10.3390/bdcc7020065