Evaluating LLMs for the Automated Generation of Operational Detection Rules in Enterprise EDR Environments
Abstract
1. Introduction
- RQ1: To what extent can an LLM accurately interpret and contextualize graph-based threat intelligence retrieved from OpenCTI?
- RQ2: How effectively can LLM-driven automation utilize structured ATT&CK-aligned threat intelligence to generate operational detection logic?
- RQ3: Is it feasible for an LLM to generate syntactically valid and operationally precise KQL detection rules for MDE without human semantic intervention?
- RQ4: Can such a system materially reduce analyst workload while improving the forensic visibility of silent EDR blocks?
- We conduct a systematic empirical evaluation of LLM-generated detection logic across a representative subset of the MITRE ATT&CK matrix, specifically targeting under-detected reconnaissance and lateral movement techniques.
- Unlike related work focusing on simple IoC extraction (hashes/IPs), our approach emphasizes the operationalization of behavioral TTPs to detect early-stage reconnaissance and lateral movement.
- We provide a rigorous before-and-after analysis of detection coverage, quantifying the detection delta provided by LLM-generated rules compared to native MDE heuristics.
- We introduce a custom normalization layer designed to mitigate LLM hallucinations regarding proprietary EDR schemas, addressing the technical reliability of AI-generated security code.
2. Related Work
3. Methodology
3.1. Threat Intelligence Tool Evaluation and Selection
3.2. Selection of MITRE ATT&CK Techniques and Attack Scenarios
3.3. LLM-Assisted Detection Rule Generation Process & Prompt Engineering
3.4. Post-Processing and Validation of Generated KQL
3.5. Experimental Evaluation Methodology
3.6. Reproducibility Considerations
4. System Architecture
4.1. System Architecture & Environment
- Windows Server 2025 (OS build 26100.4946, 12 August 2025 update): Configured as an Active Directory Domain Services (AD DS) controller hosting the domain and authenticating enterprise users. A Global Administrator account is logged in to facilitate management operations.
- Windows 11 Enterprise (OS build 26100.4946, 12 August 2025 update): Domain-joined and used as the primary employee workstation. The domain user “Bob Worker” is logged in during experiments.
- Kali Linux (Rolling release 13 June 2025, kernel 6.12.0): Used exclusively as an attacker machine. This VM performs all offensive activities during the evaluation phase.
4.2. Deployment of MDE
4.3. OpenCTI Threat-Intelligence Platform
- MITRE ATT&CK: Provides the entire matrix of tactics, techniques, sub-techniques, and associated metadata.
- MISP-feed: Delivers indicators and event collections from publicly available MISP instances.
- AlienVault OTX: Enriches OpenCTI with community-generated indicators, threat actors, campaigns, and observable relationships.
4.4. Python-Based LLM Assistant Implementation
4.5. System-Level Data Flow
- Analyst Input → MITRE technique ID or name is submitted through the Python script.
- OpenCTI Retrieval → The script queries OpenCTI, retrieves MITRE metadata and extracts technique context.
- LLM Processing → The extracted metadata is transformed into a structured prompt; the LLM outputs tuned KQL; post-processing ensures schema correctness.
- MDE Integration → The analyst tests the query in Advanced Hunting, validates detections generated during simulated attacks, and converts it into a custom detection rule.
4.6. Challenges
5. Evaluation
5.1. Environment
5.2. Intelligence Integration and LLM Pipeline
5.3. Execution of Attacks Prior to Custom Rule Deployment
- Phishing (T1566): Implemented via HiddenEye to harvest credentials. As this relied on user interaction rather than malicious process execution, MDE failed to generate alerts, highlighting a significant gap in endpoint-only visibility.
- Account Discovery (T1087): Executed using the netexec framework. Despite the generation of authentication logs, MDE yielded zero alerts, confirming that reconnaissance often remains below the detection threshold of native heuristics.
- Kerberoasting (T1558.003): Executed via Atomic Red Team. This was reliably detected and blocked by MDE’s native engine, generating high-severity incidents.
- SMB Reconnaissance (T1135/T1021.002): While MDE flagged the subsequent reverse-shell delivery, the initial share enumeration was inconsistently detected, indicating a lack of early-stage visibility.
- Credential Dumping (T1003): While secretsdump.py triggered alerts, newer tools like regsecrets.py were silently blocked without generating an incident, demonstrating a prevention-detection mismatch where the analyst is left unaware of the thwarted attempt.
5.4. Evaluation of LLM-Generated Rule Behavior
| Algorithm 1: Interactive LLM Generation of MITRE-Aware MDE Detections |
![]() |
6. Discussion
6.1. Native MDE Detection Performance: The Prevention-Visibility Paradox
6.2. Operational Efficacy of LLM-Generated KQL
6.3. Comparative Assessment and the Intelligence-to-Logic Latency
6.4. Limitations and Operational Impact
7. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Representative KQL Query Generation and Correction
Pass-the-Hash (T1550.002)
- Invalid or unsupported table (DeviceAuthenticationEvents);
- Incorrect field naming (AuthenticationMethod);
- Missing required fields (DeviceId, ReportId);
- Incomplete alignment with MDE schema.
- Replaced invalid table with DeviceLogonEvents;
- Corrected AuthenticationMethod → AuthenticationPackageName;
- Ensured compatibility with NTLM logon telemetry;
- Added required fields DeviceId and ReportId.
References
- Kampourakis, K.E.; Gkioulos, V.; Kavallieratos, G.; Lin, J.C. Digital Twin-Enabled Incident Detection and Response: A Systematic Review of Critical Infrastructures Applications. Int. J. Inf. Secur. 2025, 24, 194. [Google Scholar] [CrossRef]
- Arfeen, A.; Ahmed, S.; Khan, M.A.; Jafri, S.F.A. Endpoint Detection & Response: A Malware Identification Solution. In Proceedings of the 2021 International Conference on Cyber Warfare and Security (ICCWS); IEEE: New York, NY, USA, 2021; pp. 1–8. [Google Scholar] [CrossRef]
- Alturkistani, H.; Chuprat, S. Artificial intelligence and large language models in advancing cyber threat intelligence: A systematic literature review. Res. Sq. 2024. [Google Scholar] [CrossRef]
- Kampourakis, K.E.; Kampourakis, V.; Chatzoglou, E.; Kambourakis, G.; Gritzalis, S. Knowledge-to-Data: LLM-Driven Synthesis of Structured Network Traffic for Testbed-Free IDS Evaluation. arXiv 2026, arXiv:2601.05022. [Google Scholar] [CrossRef]
- Smiliotopoulos, C.; Kambourakis, G.; Kolias, C. Detecting Lateral Movement: A Systematic Survey. Heliyon 2024, 10, e26317. [Google Scholar] [CrossRef] [PubMed]
- Ho, G.; Dhiman, M.; Akhawe, D.; Paxson, V.; Savage, S.; Voelker, G.M.; Wagner, D. Hopper: Modeling and detecting lateral movement. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21); IEEE: New York, NY, USA, 2021; pp. 3093–3110. [Google Scholar]
- Hassan, W.U.; Bates, A.; Marino, D. Tactical provenance analysis for endpoint detection and response systems. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP); IEEE: New York, NY, USA, 2020; pp. 1172–1189. [Google Scholar] [CrossRef]
- Kaur, H.; Sanjaiy SL, D.; Paul, T.; Kumar Thakur, R.; Kumar Reddy, K.V.; Mahato, J.; Naveen, K. Evolution of Endpoint Detection and Response (EDR) in Cyber Security: A Comprehensive Review. E3S Web Conf. 2024, 556, 01006. [Google Scholar] [CrossRef]
- Yusof, Z.B. Effectiveness of endpoint detection and response solutions in combating modern cyber threats. J. Adv. Cybersecur. Sci. Threat Intell. Countermeas. 2024, 8, 1–9. [Google Scholar]
- Al-Sada, B.; Sadighian, A.; Oligeri, G. MITRE ATT&CK: State of the art and way forward. ACM Comput. Surv. 2024, 57, 1–37. [Google Scholar] [CrossRef]
- Rajesh, P.; Alam, M.; Tahernezhadi, M.; Monika, A.; Chanakya, G. Analysis of cyber threat detection and emulation using mitre attack framework. In Proceedings of the 2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA); IEEE: New York, NY, USA, 2022; pp. 4–12. [Google Scholar] [CrossRef]
- Xu, H.; Wang, S.; Li, N.; Wang, K.; Zhao, Y.; Chen, K.; Yu, T.; Liu, Y.; Wang, H. Large language models for cyber security: A systematic literature review. ACM Trans. Softw. Eng. Methodol. 2024. [Google Scholar] [CrossRef]
- Zhang, J.; Bu, H.; Wen, H.; Liu, Y.; Fei, H.; Xi, R.; Li, L.; Yang, Y.; Zhu, H.; Meng, D. When llms meet cybersecurity: A systematic literature review. Cybersecurity 2025, 8, 55. [Google Scholar] [CrossRef]
- Saura, P.F.; Jayaram, K.R.; Isahagian, V.; Bernabé, J.B.; Skarmeta, A. On Automating Security Policies with Contemporary LLMs. arXiv 2025. [Google Scholar] [CrossRef]
- Wang, H.; Xu, M.; Guo, Y.; Han, W.; Lim, H.W.; Dong, J.S. RulePilot: An LLM-Powered Agent for Security Rule Generation. arXiv 2025. [Google Scholar] [CrossRef]
- Anvilogic; SANS Institute. State of Detection Engineering Report; Technical Report; Anvilogic: Palo Alto, CA, USA, 2025. [Google Scholar]
- OpenCTI Platform. OpenCTI Official Documentation. 2024. Available online: https://www.opencti.io/en/ (accessed on 15 January 2025).
- OpenCTI Platform. OpenCTI Platform GitHub Repository. 2024. Available online: https://github.com/OpenCTI-Platform/opencti (accessed on 15 January 2025).
- MISP Project. MISP Threat Sharing Platform. 2024. Available online: https://github.com/MISP/MISP (accessed on 15 January 2025).
- Yeti Platform. Yeti Threat Intelligence Platform. 2023. Available online: https://github.com/yeti-platform/yeti (accessed on 15 January 2025).
- Prelude Security. 2026. Available online: https://www.preludesecurity.com/ (accessed on 23 January 2026).
- Silas Cutler. MalPipe: Malware/IOC Ingestion and Processing Engine. 2024. Available online: https://github.com/silascutler/MalPipe (accessed on 23 January 2026).
- Tian, S.; Zhang, T.; Liu, J.; Wang, J.; Wu, X.; Zhu, X.; Zhang, R.; Zhang, W.; Yuan, Z.; Mao, S.; et al. Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey. arXiv 2025. [Google Scholar] [CrossRef]
- NIST. NIST AI Risk Management Framework. 2023. Available online: https://www.nist.gov/itl/ai-risk-management-framework (accessed on 15 January 2025).
- Johnnykonst. LLM-Assisted MITRE-Aware KQL Generation for Microsoft Defender for Endpoint. 2026. Available online: https://github.com/Johnnykonst/LLM-Assistant-for-Microsoft-Defender-for-Endpoint (accessed on 12 February 2026).
- Microsoft. Proactively Hunt for Threats with Advanced Hunting in Microsoft Defender. 2026. Available online: https://learn.microsoft.com/en-us/defender-xdr/advanced-hunting-overview (accessed on 23 January 2026).
- Smiliotopoulos, C.; Barmpatsalou, K.; Kambourakis, G. Revisiting the Detection of Lateral Movement through Sysmon. Appl. Sci. 2022, 12, 7746. [Google Scholar] [CrossRef]







| Attack Category | MITRE Technique | Tool Used | Target System |
|---|---|---|---|
| Credential Access (Phishing) | Phishing (T1566) | HiddenEye | Windows 11 Enterprise (user workstation) |
| Discovery (Account Discovery) | Account Discovery (T1087) | Native tools/scripted enumeration | Windows Server 2025/Active Directory |
| Credential Access (Kerberos) | Kerberoasting (T1558.003) | Atomic Red Team/Kerberoast tooling | Windows Server 2025/Active Directory |
| Discovery/Lateral Movement (SMB) | Network Share Discovery (T1135)/Remote Services (T1021.002) | SMB enumeration/reverse shell delivery | Windows Server 2025, Windows 11 Enterprise |
| Credential Access (Dumping) | OS Credential Dumping (T1003) | impacket-secretsdump.py, regsecrets.py, netexec (LSASSY) | Windows Server 2025 (Domain Controller) |
| Credential Access/Lateral Movement | Pass the Hash (T1550.002) | impacket/netexec | Windows Server 2025, Windows 11 Enterprise |
| Component | OS/Version | CPU | RAM | Role |
|---|---|---|---|---|
| Windows Server VM | Windows Server 2025 (OS build 26100.4946, 12 August 2025 update) | 4 | 6 GB | Active Directory Domain Controller |
| Windows 11 VM | Windows 11 Enterprise (OS build 26100.4946, 12 August 2025 update) | 4 | 6 GB | Domain-joined workstation |
| Kali VM | Kali Linux (Rolling, 13 June 2025; kernel 6.12.0) | 4 | 4 GB | Attacker’s machine (not onboarded to MDE) |
| Software/Platform | Version/Notes |
|---|---|
| Windows Server | Windows Server 2025 (OS build 26100.4946, 12 August 2025 update) |
| Windows 11 | Windows 11 Enterprise (OS build 26100.4946, 12 August 2025 update) |
| Kali Linux | Rolling release (13 June 2025), kernel 6.12.0 |
| OpenCTI | 6.7.5 |
| Docker Desktop | 4.54.0 |
| Python | 3.13.5 |
| Oracle VirtualBox | 7.1.10 |
| Microsoft Defender for Endpoint | P2 license |
| Attack Technique | Native MDE Detection | Native MDE Blocking | Custom Rule Detection | Rule Execution Mode (NRT/Scheduled) |
|---|---|---|---|---|
| Phishing (HiddenEye) | No | No | Yes (after tuning) | Scheduled |
| Account Discovery/User Enumeration | No | No | Yes | NRT |
| Kerberoasting | Yes | Yes | No (baseline) | Scheduled |
| SMB share enumeration/SMB-based activity | Mixed | Mixed | Yes (enumeration) | Scheduled |
| Credential dumping (secretsdump.py) | Yes | Yes | Yes | Scheduled |
| Credential dumping (regsecrets.py/LSASSY) | No (silent) | Yes (silent) | Yes | Scheduled |
| Pass-the-hash | Mixed | Mixed | Mixed (schema fixes needed) | Scheduled |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Konstantaras, I.; Chatzoglou, E.; Kampourakis, K.E.; Kambourakis, G. Evaluating LLMs for the Automated Generation of Operational Detection Rules in Enterprise EDR Environments. Electronics 2026, 15, 2088. https://doi.org/10.3390/electronics15102088
Konstantaras I, Chatzoglou E, Kampourakis KE, Kambourakis G. Evaluating LLMs for the Automated Generation of Operational Detection Rules in Enterprise EDR Environments. Electronics. 2026; 15(10):2088. https://doi.org/10.3390/electronics15102088
Chicago/Turabian StyleKonstantaras, Ioannis, Efstratios Chatzoglou, Konstantinos E. Kampourakis, and Georgios Kambourakis. 2026. "Evaluating LLMs for the Automated Generation of Operational Detection Rules in Enterprise EDR Environments" Electronics 15, no. 10: 2088. https://doi.org/10.3390/electronics15102088
APA StyleKonstantaras, I., Chatzoglou, E., Kampourakis, K. E., & Kambourakis, G. (2026). Evaluating LLMs for the Automated Generation of Operational Detection Rules in Enterprise EDR Environments. Electronics, 15(10), 2088. https://doi.org/10.3390/electronics15102088


