HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)
Abstract
1. Introduction
2. Background
2.1. Network Anomaly Detection
2.2. Explainable AI
2.3. Chatbots for Security & ChatGPT
3. Architecture
- Analytics engine: This powerhouse layer is responsible for performing the network packet analysis, examining network data, and detecting and managing anomalies and inconsistencies in the network flows.
- Data Storage: We leverage Elasticsearch as our primary document storage, prized for its near real-time search capabilities, scalability, and reliability. It houses all detected network anomalies and the corresponding original flow data. For storing plots and images, Amazon S3 buckets are our go-to, guaranteeing security and accessibility.
- User Interface (UI): The dashboard UI, constructed with Gradio, is the interactive front-end of our system, presenting the analytic engine’s outcomes to human analysts in a user-friendly manner. It integrates with OpenAI’s Language Model API, enabling seamless interactions between analysts and the system for ongoing discussions and analysis.
3.1. Component Diagram
3.1.1. Anomaly Detection Application Server
- ML Model Loader: loading the pre-trained machine learning model to assess incoming data points and then process the anomalies through explainability frameworks (SHAP and LIME) to provide interpretable explanations for predictions.
- Elasticsearch Connector: Enabling seamless communication with Elasticsearch, the connector module handles authentication and index management, ensuring secure access to the Elasticsearch cluster. This component establishes a secure connection, creates and verifies indices, and ultimately enables the efficient storage of information about both detected and original packets.
- Prediction: The prediction component analyzes individual network flows to determine the presence of anomalies.
- Explainer: This component generates prediction factors, plots, and JSON documents. The explainer indexes the generated data into Elasticsearch, constructing a structured repository that facilitates efficient querying and exploration of detected and original network flows. To augment the interpretability of our findings, the explainer component uploads plot images to an AWS S3 bucket. These plots enhance our understanding of the model dynamics.
- Elasticsearch: Elasticsearch plays a pivotal role in storing and organizing information. The system leverages Elasticsearch to manage both the “Detected Packets Index” and the “Original Packets Index,” optimizing data accessibility and analysis.
- AWS S3 Bucket: Serving as a centralized repository for our visual resources, housing the uploaded plots.
3.1.2. Intrusion Detection System Dashboard
- OpenAI Connector: The OpenAI Connector is mainly used for authentication with the OpenAI API and also initiates the prerecorded message history. It also keeps track of the user conversations.
- Anomaly Packet Data Fetching: It looks through all the documents in the Elasticsearch index and extracts all the important information we need from that document.
- OpenAI API Unit: Integrates the detected packet flow data with the curated fine-tuning prompts and feeds them to the OpenAI API.
- AI Assistant Analysis: The AI Assistant receives all the data preserved in the document, accompanied by the refined prompts from the unit, and it generates a comprehensive analysis for the human agent. This analysis not only reveals the details but also enables direct interactive communication with the human agent, enabling a seamless exchange of information and insights.
3.2. Anomaly Detection Model & Dataset
3.3. Elasticsearch Data Schemas
3.4. User Experience & Use Cases
4. Evaluation and Results
4.1. Prototype Functionality
4.2. Response Quality Analysis
4.2.1. Technical Knowledge in Cybersecurity
- CISM Certified Information Security Manager Practice Exams: The updated self-study guide, written by Peter H. Gregory, featuring hundreds of practice exam questions that match those on the live test [14].
- ISACA official CISM practice Quiz. A free practice quiz includes questions from ISACA’s test prep solutions that are the same difficulty level as ISACA’s official CISM exam [45].
- ISACA official cybersecurity fundamentals practice quiz: A practice quiz including questions from ISACA’s test prep solutions that are the same level of difficulty as ISACA’s official Cybersecurity Fundamentals exam [46].
4.2.2. Response Evaluation
5. Limitations, Threats to Validity, and Security Considerations
5.1. Dataset and Detection-Engine Scope
5.2. Evaluation Alignment
5.3. Semantic Fidelity and Information Loss
5.4. Adversarial Robustness and the LLM Attack Surface
5.5. Context-Window Management
5.6. Latency
5.7. Scale and Breadth of the LLM Evaluation
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Morgan, S. 2023 Cybersecurity Almanac: 100 Facts, Figures, Predictions, and Statistics. 2023. Available online: https://cybersecurityventures.com/cybersecurity-almanac-2023/ (accessed on 23 September 2023).
- Cybersecurity, C.I. Framework for Improving Critical Infrastructure Cybersecurity; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2018; Volume 4162018. Available online: https://nvlpubs.nist.gov/nistpubs/cswp/nist.cswp.04162018.pdf (accessed on 1 September 2023).
- Shu, X.; Araujo, F.; Schales, D.L.; Stoecklin, M.P.; Jang, J.; Huang, H.; Rao, J.R. Threat intelligence computing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 1883–1898. [Google Scholar]
- Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Network Anomaly Detection: Methods, Systems and Tools. IEEE Commun. Surv. Tutor. 2014, 16, 303–336. [Google Scholar] [CrossRef]
- Mavroeidis, V.; Jøsang, A. Data-Driven Threat Hunting Using Sysmon. In Proceedings of the 2nd International Conference on Cryptography, Security and Privacy, ICCSP 2018, New York, NY, USA, 16–18 March 2018; pp. 82–88. [Google Scholar] [CrossRef]
- Subramanian, K.; Meng, W. Threat Hunting Using Elastic Stack: An Evaluation. In Proceedings of the 2021 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), Singapore, 11–12 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
- Aldauiji, F.; Batarfi, O.; Bayousef, M. Utilizing Cyber Threat Hunting Techniques to Find Ransomware Attacks: A Survey of the State of the Art. IEEE Access 2022, 10, 61695–61706. [Google Scholar] [CrossRef]
- Thottan, M.; Ji, C. Anomaly detection in IP networks. IEEE Trans. Signal Process. 2003, 51, 2191–2204. [Google Scholar] [CrossRef]
- Cavallaro, L.; Kinder, J.; Pendlebury, F.; Pierazzi, F. Are Machine Learning Models for Malware Detection Ready for Prime Time? IEEE Secur. Priv. 2023, 21, 53–56. [Google Scholar] [CrossRef]
- Charmet, F.; Tanuwidjaja, H.C.; Ayoubi, S.; Gimenez, P.F.; Han, Y.; Jmila, H.; Blanc, G.; Takahashi, T.; Zhang, Z. Explainable artificial intelligence for cybersecurity: A literature survey. Ann. Telecommun. 2022, 77, 789–812. [Google Scholar] [CrossRef]
- Linkov, I.; Galaitsi, S.; Trump, B.D.; Keisler, J.M.; Kott, A. Cybertrust: From Explainable to Actionable and Interpretable Artificial Intelligence. Computer 2020, 53, 91–96. [Google Scholar] [CrossRef]
- Nyre-Yu, M.; Morris, E.; Moss, B.C.; Smutz, C.; Smith, M. Explainable AI in Cybersecurity Operations: Lessons Learned from xAI Tool Deployment. In Proceedings of the Usable Security and Privacy (USEC) Symposium, San Diego, CA, USA, 28 April 2022; Volume 28. [Google Scholar]
- Stolfo, S.; Fan, W.; Lee, W.; Prodromidis, A.; Chan, P. KDD Cup 1999 Data; UCI Machine Learning Repository: Irvine, CA, USA, 1999. [Google Scholar] [CrossRef]
- Gregory, P. CISM Certified Information Security Manager Practice Exams, 2nd ed.; McGraw Hill LLC: New York, NY, USA, 2023. [Google Scholar]
- Heikkilä, M.; Rättyä, A.; Pieskä, S.; Jämsä, J. Security challenges in small- and medium-sized manufacturing enterprises. In Proceedings of the 2016 International Symposium on Small-Scale Intelligent Manufacturing Systems (SIMS), Narvik, Norway, 21–24 June 2016; pp. 25–30. [Google Scholar] [CrossRef]
- Yihunie, F.; Abdelfattah, E.; Regmi, A. Applying Machine Learning to Anomaly-Based Intrusion Detection Systems. In Proceedings of the 2019 IEEE Long Island Systems, Applications and Technology Conference (LISAT), Farmingdale, NY, USA, 3 May 2019; pp. 1–5. [Google Scholar] [CrossRef]
- Eltanbouly, S.; Bashendy, M.; AlNaimi, N.; Chkirbene, Z.; Erbad, A. Machine Learning Techniques for Network Anomaly Detection: A Survey. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), Doha, Qatar, 2–5 February 2020; pp. 156–162. [Google Scholar] [CrossRef]
- Zhao, Y.; Chen, J.; Wu, D.; Teng, J.; Yu, S. Multi-Task Network Anomaly Detection Using Federated Learning. In Proceedings of the 10th International Symposium on Information and Communication Technology, Hanoi, Vietnam, 4–6 December 2019; pp. 273–279. [Google Scholar] [CrossRef]
- Preuveneers, D.; Rimmer, V.; Tsingenopoulos, I.; Spooren, J.; Joosen, W.; Ilie-Zudor, E. Chained Anomaly Detection Models for Federated Learning: An Intrusion Detection Case Study. Appl. Sci. 2018, 8, 2663. [Google Scholar] [CrossRef]
- Mothukuri, V.; Khare, P.; Parizi, R.M.; Pouriyeh, S.; Dehghantanha, A.; Srivastava, G. Federated-Learning-Based Anomaly Detection for IoT Security Attacks. IEEE Internet Things J. 2022, 9, 2545–2554. [Google Scholar] [CrossRef]
- Nguyen, T.D.; Marchal, S.; Miettinen, M.; Fereidooni, H.; Asokan, N.; Sadeghi, A.R. DÏoT: A Federated Self-learning Anomaly Detection System for IoT. In Proceedings of the 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), Dallas, TX, USA, 7–10 July 2019; pp. 756–767. [Google Scholar] [CrossRef]
- Sheikhi, S.; Kostakos, P. Cyber threat hunting using unsupervised federated learning and adversary emulation. In Proceedings of the 2023 IEEE International Conference on Cyber Security and Resilience (CSR), Venice, Italy, 31 July–2 August 2023; pp. 315–320. [Google Scholar] [CrossRef]
- Sheikhi, S.; Kostakos, P. DDoS attack detection using unsupervised federated learning for 5G networks and beyond. In Proceedings of the 2023 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Gothenburg, Sweden, 6–9 June 2023; pp. 442–447. [Google Scholar] [CrossRef]
- Bhattacharyya, D.K.; Kalita, J. Network Anomaly Detection: A Machine Learning Perspective; CRC Press: Boca Raton, FL, USA, 2013. [Google Scholar] [CrossRef]
- Sheikhi, S. Painted Wolf Optimization: A Novel Nature-Inspired Metaheuristic Algorithm for Real-World Optimization Problems. Comput. Mater. Contin. 2026, 87, 7. [Google Scholar] [CrossRef]
- Sheikhi, S. SLMFORGE: Small Language Models for Federated Feature Selection via Union Aggregation in Cybersecurity. In Proceedings of the 2025 IEEE International Conference on Big Data (BigData), Macau, China, 8–11 December 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 4784–4792. [Google Scholar]
- Porambage, P.; Gür, G.; Osorio, D.P.M.; Liyanage, M.; Gurtov, A.; Ylianttila, M. The roadmap to 6G security and privacy. IEEE Open J. Commun. Soc. 2021, 2, 1094–1122. [Google Scholar] [CrossRef]
- Sheikhi, S.; Kostakos, P. Advancing security in 5g core networks through unsupervised federated time series modeling. In Proceedings of the 2024 IEEE International Conference on Cyber Security and Resilience (CSR), London, UK, 2–4 September 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 353–356. [Google Scholar]
- Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019, 4, eaay7120. [Google Scholar] [CrossRef] [PubMed]
- Nguyen, Q.P.; Lim, K.W.; Divakaran, D.M.; Low, K.H.; Chan, M.C. Gee: A gradient-based explainable variational autoencoder for network anomaly detection. In Proceedings of the 2019 IEEE Conference on Communications and Network Security (CNS), Washington, DC, USA, 10–12 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 91–99. [Google Scholar]
- Han, D.; Wang, Z.; Chen, W.; Zhong, Y.; Wang, S.; Zhang, H.; Yang, J.; Shi, X.; Yin, X. Deepaid: Interpreting and improving deep learning-based anomaly detection in security applications. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 3197–3217. [Google Scholar]
- Sheikhi, S.; Kostakos, P.; Loven, L. ExAI5G: A Logic-Based Explainable AI Framework for Intrusion Detection in 5G Networks. arXiv 2026, arXiv:2604.18052. [Google Scholar]
- Nadeem, A.; Vos, D.; Cao, C.; Pajola, L.; Dieck, S.; Baumgartner, R.; Verwer, S. Sok: Explainable machine learning for computer security applications. In Proceedings of the 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), Delft, The Netherlands, 3–7 July 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 221–240. [Google Scholar]
- Ben Peretz, D. A Siri for Network Security: How Chatbots Can Enhance Business Agility. 2020. Available online: https://www.infosecurity-magazine.com/opinions/network-chatbots-agility/ (accessed on 22 July 2023).
- Franco, M.F.; Rodrigues, B.; Scheid, E.J.; Jacobs, A.; Killer, C.; Granville, L.Z.; Stiller, B. SecBot: A Business-Driven Conversational Agent for Cybersecurity Planning and Management. In Proceedings of the 2020 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey, 2–6 November 2020; pp. 1–7. [Google Scholar] [CrossRef]
- McIntosh, T.; Liu, T.; Susnjak, T.; Alavizadeh, H.; Ng, A.; Nowrozy, R.; Watters, P. Harnessing GPT-4 for Generation of Cybersecurity GRC Policies: A Focus on Ransomware Attack Mitigation. Comput. Secur. 2023, 134, 103424. [Google Scholar] [CrossRef]
- Setianto, F.; Tsani, E.; Sadiq, F.; Domalis, G.; Tsakalidis, D.; Kostakos, P. GPT-2C: A Parser for Honeypot Logs Using Large Pre-Trained Language Models. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM ’21, Virtual, 8–11 November 2022; pp. 649–653. [Google Scholar] [CrossRef]
- Nour, B.; Pourzandi, M.; Debbabi, M. A Survey on Threat Hunting in Enterprise Networks. IEEE Commun. Surv. Tutor. 2023, 25, 2299–2324. [Google Scholar] [CrossRef]
- Sebastian, G. Do ChatGPT and other AI chatbots pose a cybersecurity risk?: An exploratory study. Int. J. Secur. Priv. Pervasive Comput. (IJSPPC) 2023, 15, 1–11. [Google Scholar]
- Yue, T.; Au, D.; Au, C.C.; Iu, K.Y. Democratizing financial knowledge with ChatGPT by OpenAI: Unleashing the Power of Technology. SSRN 2023. [Google Scholar] [CrossRef]
- Si, C.; Gan, Z.; Yang, Z.; Wang, S.; Wang, J.; Boyd-Graber, J.; Wang, L. Prompting gpt-3 to be reliable. arXiv 2022, arXiv:2210.09150. [Google Scholar]
- Gilson, A.; Safranek, C.W.; Huang, T.; Socrates, V.; Chi, L.; Taylor, R.A.; Chartash, D. How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment. JMIR Med. Educ. 2023, 9, e45312. [Google Scholar] [CrossRef] [PubMed]
- Chu, M.N. Assessing the Benefits of ChatGPT for Business: An Empirical Study on Organizational Performance. IEEE Access 2023, 11, 76427–76436. [Google Scholar] [CrossRef]
- Olusola, A.A.; Oladele, A.S.; Abosede, D.O. Analysis of KDD’99 intrusion detection dataset for selection of relevance features. In Proceedings of the World Congress on Engineering and Computer Science. WCECS, San Francisco, CA, USA, 20–22 October 2010; Volume 1, pp. 20–22. [Google Scholar]
- ISACA. CISM Practice Quiz. 2023. Available online: https://www.isaca.org/credentialing/cism/cism-practice-quiz (accessed on 1 October 2023).
- ISACA. Cybersecurity Fundamentals Practice Quiz. Available online: https://www.isaca.org/-/media/info/csxf-practice-quiz/index.html (accessed on 1 October 2023).
- py-Readability-Metrics: A Python Library for Computing Readability Metrics. Available online: https://pypi.org/project/py-readability-metrics/ (accessed on 1 October 2023).
- Sheikhi, S.; Loven, L.; Kostakos, P. Beyond the Leaderboard: A Survey of the Science of Evaluation, Benchmarking, and Methodologies for Large Language Models. IEEE Access 2026, 14, 66493–66515. [Google Scholar] [CrossRef]





| Study | Approach | Dataset/Setting | Key Contribution |
|---|---|---|---|
| Yihunie et al. [16] | Comparative study of SGD, Random Forest, Logistic Regression, SVM, and a sequential model | NSL-KDD | Identified Random Forest as the best-performing classifier among the five evaluated baselines. |
| Eltanbouly et al. [17] | Hybrid Random Forest and weighted K-means method | Public IDS benchmarks | Proposed a two-phase scheme combining online misuse detection using Random Forest with offline categorization of random attacks using weighted K-means. |
| Zhao et al. [18] | Multi-task deep neural network under federated learning, MT-DNN-FL | Multiple public traffic datasets | Performed joint anomaly detection, VPN/Tor traffic recognition, and traffic classification while preserving data confidentiality. |
| Preuveneers et al. [19] | Chained anomaly detection with blockchain-audited federated learning | Network intrusion benchmarks | Enabled auditable and transparent model updates with only 5–15% performance overhead, while supporting richer neural architectures. |
| Mothukuri et al. [20] | Federated learning for IoT anomaly detection | IoT traffic | Developed a privacy-preserving anomaly detection approach tailored to IoT security attacks. |
| Nguyen et al. [21] | DÏoT federated self-learning system | IoT device telemetry | Enabled self-learning anomaly detection across distributed IoT deployments without centralized data collection. |
| Sheikhi and Kostakos [22] | Unsupervised federated learning combined with adversary emulation | Threat-hunting environments | Linked unsupervised federated learning with adversary emulation to improve cyber threat-hunting coverage. |
| Sheikhi and Kostakos [23] | Unsupervised federated learning for DDoS detection | 5G core network, GTP protocol | Demonstrated DDoS detection in 5G core networks and showed the applicability of federated approaches to next-generation infrastructures. |
| Label | Description |
|---|---|
| _id | Represents a unique identifier for each detected packet entry. |
| Prediction | Stores the prediction outcome associated with the detected packet, whether normal or malicious, and also the specific genre of the attack. |
| Factors | Captures the contributing factors that influenced the prediction outcome and the key attributes that led to the anomaly’s detection. We store textual descriptions of these factors for later translation into visual plots and for feeding into the chatbot to provide contextual answers. |
| Exp-img | Stores references to the plots used in presenting the prediction process, enhancing the comprehensibility of the anomaly detection process. Exp-img represents the local explanation plot provided by the LIME tabular explainer. |
| Shap-img | Shap-img represent the top features contributing to the predicted anomaly class using the SHAP framework. |
| Original-data | Denotes the number of the detected packets, which can be used as a foreign key to access the complete information regarding the detected packets. |
| Exam | No. of Questions | GPT-3.5 Turbo Success Rate |
|---|---|---|
| CISM Certified Information Security Manager Practice Exams [14] | 40 | 82.5% |
| ISACA official CISM practice Quiz [45] | 10 | 80% |
| ISACA official cybersecurity fundamentals practice quiz [46] | 25 | 72% |
| Readability Formula | Generated Anomaly Explanation | Chatbot Answers | ||
|---|---|---|---|---|
| Score | Grade Level | Score | Grade Level | |
| Flesch-Kincaid Grade Level | 15.7 | 16 | 14.9 | 15 |
| Flesch Reading Ease | 22.7 | graduate | 23.9 | graduate |
| Dale Chall | 12.6 | graduate | 12.11 | graduate |
| Automated Readability Index | 16.3 | graduate | 15.9 | graduate |
| Coleman Liau Index | 15.3 | 15 | 16.3 | 16 |
| Linsear Write | 17.4 | 17 | 15.8 | 16 |
| Query | Chatbot Response |
|---|---|
| How can I prevent such an attack | To prevent DoS attacks, you can implement measures like traffic monitoring, firewalls, load balancers, and rate limiting to detect and mitigate abnormal traffic patterns. Ensuring network redundancy and having a robust incident response plan can also help minimize the impact. |
| What is a firewall, and how do I implement it (with examples)? | A firewall is a network security device that filters incoming and outgoing network traffic according to predefined security rules. Examples include Cisco ASA, Palo Alto Networks NGFW, Fortinet FortiGate, and Check Point. To implement a firewall, configure rule sets to allow or block specific traffic and define security policies to secure your network from unauthorized access. |
| Are there any free ones? | Yes. Examples include ZoneAlarm (free version for personal use) and Windows Firewall (built-in on Windows). Free solutions vary in features and protection level. |
| How can I install ZoneAlarm? | To install ZoneAlarm: visit the official website, navigate to Products/Downloads, download the free version, run the installer, and follow the on-screen prompts (license agreement, installation options, etc.). [proceeds to give detailed instructions on downloading ZoneAlarm] |
| What would be a good configuration for a small home network? | Install and update ZoneAlarm; use medium/high security mode; enable automatic program control; customize program permissions; enable Stealth Mode; set up Network Zones; monitor traffic; keep it updated. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Ali, T.; Kostakos, P.; Sheikhi, S. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs). Telecom 2026, 7, 73. https://doi.org/10.3390/telecom7030073
Ali T, Kostakos P, Sheikhi S. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs). Telecom. 2026; 7(3):73. https://doi.org/10.3390/telecom7030073
Chicago/Turabian StyleAli, Tarek, Panos Kostakos, and Saeid Sheikhi. 2026. "HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)" Telecom 7, no. 3: 73. https://doi.org/10.3390/telecom7030073
APA StyleAli, T., Kostakos, P., & Sheikhi, S. (2026). HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs). Telecom, 7(3), 73. https://doi.org/10.3390/telecom7030073

