1. Introduction
The dynamic and evolving nature of cyberattacks, which employ sophisticated vectors and zero-day vulnerabilities, renders traditional cybersecurity defenses inadequate [1,2]. The escalating frequency and impact of breaches necessitate advanced prediction methods that go beyond reactive detection [1]. Modern breaches often bypass the defensive layers of the NIST framework [2] by exploiting complex attack vectors; zero-day vulnerabilities, for example, are exploited for an average of 312 days before detection [3]. This urgency underscores the transformative role of artificial intelligence (AI) in cybersecurity [4]. Unlike traditional rule-based or signature-based systems, AI adapts to new threats by analyzing vast datasets to detect patterns and predict attacker behaviors [5,6]. Our proposed solution is a hybrid approach combining Graph Neural Networks (GNNs) and Long Short-Term Memory (LSTM) networks. GNNs excel at analyzing the graph-structured data inherent in cybersecurity (e.g., the MITRE ATT&CK framework [7]), uncovering hidden patterns in attack relationships. LSTM networks are highly effective at modeling temporal sequences, which is crucial for predicting the progression of cyberattacks over time [8]. The complementary strengths of GNNs (structural analysis) and LSTM networks (temporal modeling) provide a holistic understanding of threat behaviors, enabling more effective defenses [9,10,11,12].
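As a schematic illustration of how these two components can be combined (toy sizes and random weights, not the architecture evaluated in this paper), consider a single graph-convolution step whose node embeddings are then consumed by a simple recurrent cell:

```python
# Illustrative sketch only: one graph-convolution step followed by a simple
# recurrent update, mimicking how structural (GNN) and temporal (LSTM-like)
# signals can be combined. Shapes and weights are toy values.
import numpy as np

rng = np.random.default_rng(0)

# Toy attack graph: 4 nodes (e.g. observed techniques), adjacency with self-loops.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt           # symmetric normalization

X = rng.normal(size=(4, 8))                   # node features
W_gnn = rng.normal(size=(8, 8))

# One GCN-style propagation step: aggregate neighbor features.
H = np.tanh(A_hat @ X @ W_gnn)                # structural embeddings, shape (4, 8)

# Feed the embeddings through a minimal recurrent cell (a stand-in for the
# LSTM component), treating node order as a toy time axis.
W_h, W_x = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
h = np.zeros(8)
for t in range(H.shape[0]):
    h = np.tanh(W_h @ h + W_x @ H[t])         # recurrent state update

print(h.shape)                                # (8,)
```

A production model would use trained weights, gated LSTM cells, and per-timestep graph snapshots; the sketch only shows the data flow.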
The contemporary cybersecurity landscape is characterized by an alarming escalation in both the frequency and sophistication of security breaches across all sectors. Recent comprehensive analysis reveals that data breaches have reached unprecedented levels, with organizations experiencing an average cost of $4.88 million per incident in 2024, representing a significant increase from previous years [13]. These breaches exploit various attack vectors, from sophisticated social engineering campaigns to zero-day vulnerabilities in critical infrastructure, often combining multiple techniques in coordinated campaigns that bypass traditional defensive measures.
Industrial control systems represent a particularly critical domain where conventional security measures prove inadequate against sophisticated adversarial attacks. These systems, which control critical infrastructure including power grids, water treatment facilities, and manufacturing processes, face unique vulnerabilities that require specialized detection mechanisms. Recent advances in adversarial attack detection for industrial control systems have demonstrated the effectiveness of LSTM-based intrusion detection systems combined with black-box defense strategies, achieving significant improvements in detecting anomalous behavior patterns within operational technology environments [14]. However, these temporally focused approaches often miss the crucial structural relationships between attack components and system dependencies that characterize modern multi-stage attacks.
The evolution of contemporary attack methodologies reveals fundamental shifts in adversarial tactics that challenge existing defense paradigms. Modern adversaries deploy coordinated campaigns that simultaneously exploit network vulnerabilities, social engineering, and system misconfigurations, creating attack chains that are difficult to detect through single-method approaches. State-sponsored and criminal organizations employ long-term infiltration strategies that remain dormant for extended periods, requiring detection models capable of identifying subtle patterns across extended timeframes and correlating seemingly unrelated events. The increasing prevalence of unknown vulnerabilities demands predictive systems that can anticipate attack vectors based on structural relationships and behavioral patterns rather than relying solely on known signatures or historical data. Critical infrastructure faces unique risks where cyberattacks can cause catastrophic real-world damage, necessitating detection frameworks that understand both digital attack progression and physical system relationships. These emerging threat patterns highlight the fundamental limitations of reactive security measures and emphasize the critical importance of proactive, predictive approaches that can anticipate and mitigate attacks before they achieve their strategic objectives.
Through this approach, the paper seeks to contribute a robust solution to the rapidly evolving cybersecurity landscape, ensuring that defenses remain adaptive, resilient, and effective in mitigating complex threats. The main contributions of this research are summarized as follows:
Novel hybrid architecture: we propose a unified GNN-LSTM framework that uniquely integrates structural analysis of attack patterns through GNNs with temporal sequence modeling via LSTM, enabling comprehensive attack vector reconstruction;
MITRE ATT&CK integration: our model leverages the MITRE ATT&CK framework’s structural relationships, transforming them into graph representations that enhance the understanding of attack dependencies and progression paths;
Superior performance: experimental evaluation demonstrates exceptional performance with 99% AUC on the CICIDS2017 dataset, 85% F1-score for technique prediction, and 0.05 MSE for risk assessment, outperforming traditional approaches;
Explainability features: the integration of SHAP-based explainability mechanisms provides transparent insights into model decisions, crucial for security analysts in operational environments;
Scalability solutions: we address computational challenges through optimized architectures that handle missing data and large-scale deployments, making the approach practical for real-world cybersecurity applications.
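The MITRE ATT&CK integration above rests on treating techniques as graph nodes and their progression relationships as edges. A minimal sketch of such a representation follows; the edges are illustrative, hand-picked examples, not an authoritative ATT&CK dependency list:

```python
# Hedged sketch: a handful of MITRE ATT&CK techniques and plausible
# "often followed by" progression edges encoded as a directed graph.
import numpy as np

techniques = ["T1566", "T1059", "T1003", "T1078", "T1041"]
# T1566 Phishing, T1059 Command and Scripting Interpreter,
# T1003 OS Credential Dumping, T1078 Valid Accounts,
# T1041 Exfiltration Over C2 Channel
idx = {t: i for i, t in enumerate(techniques)}

# Directed progression edges (illustrative only).
edges = [("T1566", "T1059"), ("T1059", "T1003"),
         ("T1003", "T1078"), ("T1078", "T1041")]

A = np.zeros((len(techniques), len(techniques)))
for src, dst in edges:
    A[idx[src], idx[dst]] = 1.0               # adjacency matrix for the GNN

def successors(tech):
    """Techniques reachable in one step from `tech`."""
    return [techniques[j] for j in np.flatnonzero(A[idx[tech]])]

print(successors("T1566"))   # ['T1059']
```

An adjacency matrix like `A`, with per-node feature vectors, is the typical input format for graph-neural-network layers.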
The remainder of this paper is organized as follows. Section 2 presents a comprehensive literature review examining current approaches to attack vector prediction, the challenges of dynamic cyber threats, and the complexities of modern cybersecurity data. Section 3 details our proposed hybrid GNN-LSTM methodology, including the model architecture, the data preprocessing pipeline, and integration with the MITRE ATT&CK framework. Section 4 describes the experimental setup, datasets, and evaluation metrics, and presents our experimental results together with a comparative analysis against existing approaches. Section 5 discusses the implications of our findings, limitations, and potential applications in real-world security operations. Finally, Section 6 concludes the paper with a summary of key findings and directions for future research.
2. Literature Review
Prediction of attack vectors is a proactive cybersecurity approach, aiming to anticipate system exploitation rather than merely reacting to known threats. The increasing sophistication of modern attacks, including zero-day vulnerabilities that remain undetected for an average of 312 days (Bilge and Dumitras [3]) and adversaries’ use of polymorphic techniques and multi-stage strategies such as Advanced Persistent Threats (APTs) (Deldar et al. [15], Che Mat et al. [16]), underscores the critical need for dynamic, adaptive predictive systems.
However, predicting attack vectors is challenging due to the complexity and sheer volume of data generated by modern systems. Predictive models must process vast amounts of telemetry data—system logs, network traffic, and event traces—to identify potential threats, a task that even advanced statistical techniques struggle with in real time (Enoch et al. [17]). The inherent unpredictability of adversarial behavior further complicates this, as adversaries constantly adapt, creating a moving target. As Anjum et al. [18] highlighted, understanding dependencies across the various stages of an attack remains a significant challenge, demanding models that can not only identify known patterns but also generalize to novel and evolving attack methods.
Researchers in this domain are investigating several critical questions to advance attack vector prediction:
How can predictive systems generalize to unseen attack techniques?
What methodologies can handle the scale and complexity of modern cybersecurity data?
How can predictive models integrate real-time data streams?
Traditional machine learning approaches, such as Babenko et al.’s [10] use of Learning Vector Quantization (LVQ) for DDoS detection, provided initial insights but struggled with the temporal and structural complexities of modern multi-stage attacks, prompting the exploration of more sophisticated architectures. Recent advancements in AI and ML, particularly Graph Neural Networks (GNNs) and Long Short-Term Memory (LSTM) networks, offer promising solutions. GNNs excel at analyzing interconnected data, which is crucial for understanding complex attack paths within frameworks such as MITRE ATT&CK [17]. LSTM networks are effective for modeling temporal sequences, enabling predictions of attack progression over time (Laghrissi et al. [19], Ergen et al. [20]). However, both GNNs and LSTM networks have limitations in scalability and interpretability.
The dynamic nature of cyber threats is a significant challenge. Modern attacks evolve rapidly, exploiting zero-day vulnerabilities that can remain undetected for extended periods (Bilge and Dumitras [3]). Advanced Persistent Threats (APTs), characterized by their prolonged, adaptive, and multi-stage nature (Che Mat et al. [16]), and polymorphic malware (Deldar et al. [15], Enoch et al. [17]) demand predictive models that can adapt and understand behavioral patterns (Anjum et al. [18]). Integrating real-time threat intelligence from diverse sources such as IDS and SIEM systems is crucial (Mohammadi et al. [21]), though standardization of data formats remains a challenge (Schlette et al. [22]). A multi-faceted approach incorporating behavioral analysis and real-time intelligence is therefore essential for building adaptive predictive systems.
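As a small illustration of the standardization problem just noted, the sketch below maps two hypothetical alert formats (an IDS event and a SIEM record) onto one shared schema; all field names are invented for the example, and real deployments would map STIX or vendor schemas instead:

```python
# Hedged sketch: normalizing heterogeneous alert records into a minimal
# common (ts, src_ip, technique) schema. Field names are hypothetical.
def normalize(alert: dict) -> dict:
    """Map source-specific fields onto a shared minimal schema."""
    if alert.get("source") == "ids":
        return {"ts": alert["timestamp"],
                "src_ip": alert["src"],
                "technique": alert.get("mitre_id")}
    if alert.get("source") == "siem":
        return {"ts": alert["event_time"],
                "src_ip": alert["attacker_address"],
                "technique": alert.get("attack_technique")}
    raise ValueError(f"unknown alert source: {alert.get('source')}")

ids_alert = {"source": "ids", "timestamp": "2024-01-01T00:00:00Z",
             "src": "10.0.0.5", "mitre_id": "T1059"}
siem_alert = {"source": "siem", "event_time": "2024-01-01T00:00:05Z",
              "attacker_address": "10.0.0.5", "attack_technique": "T1003"}

stream = [normalize(a) for a in (ids_alert, siem_alert)]
print([e["technique"] for e in stream])   # ['T1059', 'T1003']
```

Once events share a schema, they can be ordered by timestamp into the per-host sequences a temporal model consumes.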
The complexity of modern cybersecurity data presents a formidable challenge. Enterprise networks generate immense volumes of high-dimensional, heterogeneous data from various sources (Enoch et al. [17], Anjum et al. [18]), complicating analysis and correlation (Schlette et al. [22]). Real-time attack prediction is computationally demanding (Mohammadi et al. [21]), requiring innovations such as attention mechanisms (Vaswani et al. [23]) to reduce overhead. Furthermore, the “black box” nature of deep learning models hinders interpretability (Ribeiro et al. [11]), necessitating explainable AI (XAI) features in hybrid models (Sun et al. [24]). Addressing these challenges requires advances in data processing, real-time analysis, and standardization, along with the integration of XAI techniques for actionable insights.
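As a concrete (and deliberately simple) illustration of model-agnostic explanation, the sketch below uses permutation importance, a lighter-weight alternative to SHAP that captures the same intuition: estimate a feature's contribution by measuring how much shuffling it degrades performance. The "model" here is a toy stand-in, not a trained detector:

```python
# Permutation-importance sketch (a simple stand-in for SHAP-style attribution).
import numpy as np

rng = np.random.default_rng(42)

# Toy setup: the label depends only on feature 0.
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

def predict(M):
    """Stand-in 'trained model': thresholds feature 0."""
    return (M[:, 0] > 0).astype(int)

base_acc = (predict(X) == y).mean()             # 1.0 on this toy setup
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])        # break feature j's signal
    importances.append(base_acc - (predict(Xp) == y).mean())

print(int(np.argmax(importances)))              # 0: feature 0 drives decisions
```

SHAP additionally decomposes individual predictions rather than global accuracy, but the underlying question, "which inputs does the model actually rely on?", is the same.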
Predicting attack vectors requires analyzing both structural relationships and temporal progressions, making a hybrid GNN-LSTM approach well suited to the task. This model leverages GNNs for understanding graph-structured attack paths (Enoch et al. [17]) and LSTM networks for modeling sequential attack events (Laghrissi et al. [19], Ergen et al. [20]). Sun et al. [24] and Staudemeyer [25] confirmed the superiority of such hybrid models over standalone approaches. While GNNs and LSTM networks face inherent scalability and computational efficiency challenges with large datasets (Hamilton et al. [26]) and missing features (Taguchi et al. [27]), our proposed method includes optimizations to address these issues, and the integration of explainability is paramount in such complex hybrid systems (Guidotti et al. [28]).
To enhance interpretability in cybersecurity-related AI models, the integration of explainable artificial intelligence (XAI) has become a significant research focus. Mendes and Rios [29] conducted a systematic literature review of XAI techniques in cybersecurity, emphasizing the growing need for models that not only perform well but also provide transparent and trustworthy outputs. Among practical tools, GNNExplainer, introduced by Ying et al. [30], provides interpretable insights into Graph Neural Network predictions by identifying critical subgraphs and node features that influence model decisions. This capacity to generate localized explanations is essential for auditing decisions in sensitive domains such as intrusion detection and threat attribution.
Modern distributed information systems face increasing cyberthreats, which highlights the critical importance of effective approaches to detecting and reconstructing attack vectors. Research in this area emphasizes the need to model cybersecurity risks in such systems (Ermukhambetova et al. [31], Olekh et al. [32], Palko et al. [33,34]) as a fundamental step towards increasing resilience. Given the dynamic nature of cyberattacks, the ability to accurately reproduce the sequence of an attacker’s actions is key to proactive defense and rapid incident response.
Against the backdrop of constantly evolving threats, traditional detection methods often prove insufficient, which stimulates the development of more advanced approaches based on machine learning. In particular, neural networks have demonstrated significant potential in detecting a variety of cyberattacks, including threats such as SQL injections (Hubskyi et al. [35]). Building on previous work on prioritizing cybersecurity measures (Hnatiienko et al. [36]), this paper proposes a hybrid approach that combines Graph Neural Networks (GNNs) and Long Short-Term Memory (LSTM) networks to improve the reconstruction of complex attack vectors, allowing for a deeper understanding of how threats spread in distributed information environments.
Based on the conducted literature analysis, it is evident that predicting attack vectors is a complex and multifaceted challenge that requires integrating advanced methods for structural and temporal data analysis. Modern cyber threats, characterized by their dynamic and adaptive nature, surpass the capabilities of traditional reactive defense systems. Hybrid approaches, particularly those combining Graph Neural Networks (GNNs) and Long Short-Term Memory (LSTM) networks, have emerged as promising solutions to address these challenges. These methods leverage the strengths of GNNs for analyzing structural dependencies and LSTM networks for capturing temporal sequences, offering a comprehensive framework for understanding and predicting attack progression. However, unresolved challenges, such as explainability, scalability, and handling noisy or incomplete data, underscore the need for further research.
The primary objective of this study is to synthesize and evaluate the effectiveness of a hybrid GNN-LSTM model for cybersecurity applications, specifically in predicting and mitigating attack vectors. The research aims to achieve the following:
Develop a unified hybrid model integrating Graph Neural Networks (GNNs) and Long Short-Term Memory (LSTM) networks to analyze the structural and temporal aspects of cyberattacks;
Investigate the feasibility and performance of the proposed approach in real-world cybersecurity scenarios, emphasizing its scalability and computational efficiency;
Validate the hybrid model using diverse datasets to evaluate its adaptability and generalizability to the continuously evolving landscape of cyber threats.
Based on the comprehensive literature analysis presented above, Table 1 summarizes the key studies examined, highlighting their methodological approaches, strengths, limitations, and reported performance metrics. This comparative analysis reveals several critical gaps in the existing literature that motivate our hybrid GNN-LSTM approach.
The analysis presented in Table 1 reveals that while individual components of our proposed approach have been validated in isolation, no existing work comprehensively integrates Graph Neural Networks for structural analysis with LSTM networks for temporal modeling specifically for attack vector reconstruction.
5. Discussion and Future Prospects
The hybrid GNN-LSTM model demonstrated strong performance in reconstructing and predicting attack vectors. It achieved an AUC of 0.99 on both balanced and imbalanced datasets, showing its ability to handle class imbalance prevalent in cybersecurity. A high F1-score of 0.85 for technique prediction highlights its proficiency in identifying attack step sequences, crucial for proactive mitigation. Furthermore, a Mean Squared Error (MSE) of 0.05 for risk assessment suggests reliable quantification of threat severity, aiding timely decision-making.
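For reference, the three headline metrics reported here (AUC, F1-score, and MSE) can be computed with scikit-learn as shown below; the toy predictions are illustrative only and unrelated to the paper's actual results:

```python
# Hedged sketch: computing AUC, F1, and MSE on synthetic toy predictions.
from sklearn.metrics import roc_auc_score, f1_score, mean_squared_error

y_true  = [0, 0, 1, 1, 1, 0, 1, 0]                    # benign = 0, attack = 1
y_score = [0.1, 0.3, 0.8, 0.9, 0.7, 0.2, 0.6, 0.4]    # predicted attack probability
y_pred  = [int(s >= 0.5) for s in y_score]            # thresholded labels

y_risk      = [0.2, 0.8, 0.5]                         # predicted risk scores
y_risk_true = [0.1, 0.9, 0.5]                         # reference severities

print(roc_auc_score(y_true, y_score))                 # 1.0 on this toy data
print(f1_score(y_true, y_pred))                       # 1.0 on this toy data
print(round(mean_squared_error(y_risk_true, y_risk), 2))  # 0.01
```

AUC is threshold-free (it scores the ranking of attack over benign instances), F1 depends on the chosen decision threshold, and MSE applies to the continuous risk-regression head.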
Visualizations (Figure 6 and Figure 7) confirm that the model captures structural relationships, with clear separation of benign and attack instances due to the GNN’s effectiveness in modeling MITRE ATT&CK techniques. The LSTM component accurately predicted technique sequences, in line with Sahu et al. [65] on LSTM’s strength in detecting cyber threat patterns. The high F1-score for attack vector reconstruction validates the model’s integrated structural and temporal analysis. Despite these strengths, the model has limitations. The calibration plot (Figure 12) shows overconfidence in some mid-range predictions, which could lead to misinterpretations; Platt scaling or isotonic regression could mitigate this. A slight overlap in risk predictions (Figure 9) indicates difficulty distinguishing subtle attack behaviors, suggesting the need for advanced feature engineering or additional data sources such as behavioral telemetry.
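A minimal sketch of the isotonic-regression remedy on synthetic scores follows; the assumed miscalibration shape is invented for illustration, not taken from the model's actual calibration curve:

```python
# Post-hoc calibration sketch: fit a monotone map from raw scores to
# empirical probabilities using isotonic regression (synthetic data).
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(7)
raw = rng.uniform(size=1000)                        # uncalibrated model scores
# Assumed true P(attack) = score**2, so mid-range raw scores are overconfident.
y = (rng.uniform(size=1000) < raw**2).astype(int)

iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(raw, y)                                     # learn score -> probability map
calibrated = iso.predict([0.5])                     # pulled well below the raw 0.5

print(round(float(calibrated[0]), 2))
```

Platt scaling fits a parametric sigmoid instead of a free-form monotone curve; both are standard post-hoc options that leave the underlying classifier untouched.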
Performance on imbalanced datasets showed a slight drop in recall for attack classes (0.94 vs. 0.98 on balanced sets), implying that some attacks might be missed in real-world scenarios. This aligns with the challenges noted by Enoch et al. [17] regarding complex, high-dimensional cybersecurity data; future work could explore resampling or cost-sensitive learning. The hybrid model’s computational complexity, due to the integrated GNN and LSTM components, increases training and inference times. Techniques such as the model pruning and lightweight architectures proposed by Zhang et al. [59] could improve efficiency for real-time deployment. Future research avenues include incorporating attention mechanisms, as explored by Sahu et al. [65], to focus on critical features; integrating real-time threat intelligence [59] for adaptability to emerging threats; and applying the model to other datasets to validate its generalizability. The models studied in this article are planned for use in applied information systems in the future [67,68,69,70,71].
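Cost-sensitive learning, one of the options mentioned above, can be as simple as inverse-frequency class weights that make rare attack instances count more in the training loss; a sketch with synthetic label counts:

```python
# Illustrative inverse-frequency class weighting (synthetic 95/5 imbalance),
# following the "balanced" weighting convention n / (k * count[class]).
from collections import Counter

labels = ["benign"] * 950 + ["attack"] * 50
counts = Counter(labels)
n, k = len(labels), len(counts)

weights = {c: n / (k * counts[c]) for c in counts}
print(weights)   # benign ≈ 0.53, attack = 10.0
```

These weights would then be passed to the loss function (or a `class_weight` parameter, where supported) so that each missed attack costs roughly twenty times a missed benign flow.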
Our hybrid GNN-LSTM model effectively reconstructed attack vectors, achieving a 0.99 AUC on the CICIDS2017 dataset. However, this research also highlighted several key areas for future development:
We will validate the model using anonymized corporate SIEM logs from Kazakhstani telecommunications operators (2023–2025) to test its ability to detect attacks not present in public datasets. We also plan to redesign the architecture into a cascaded system: a simplified GNN will handle initial traffic filtering on edge routers, sending only suspicious sequences to a central server for full model analysis. This is projected to reduce computational load by 70% while maintaining accuracy.
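The cascaded filtering idea can be sketched as follows; the scoring heuristic, feature names, and threshold are placeholders for illustration, not the planned production design:

```python
# Sketch of edge-side pre-filtering: a cheap score gates which flow records
# are forwarded to the full central GNN-LSTM stage. Logic is hypothetical.
def edge_score(flow: dict) -> float:
    """Cheap heuristic computable on an edge router (placeholder logic)."""
    score = 0.0
    if flow["bytes"] > 1_000_000:
        score += 0.5                          # unusually large transfer
    if flow["dst_port"] in {22, 3389, 445}:   # commonly probed services
        score += 0.4
    return score

def filter_for_central_analysis(flows, threshold=0.4):
    """Forward only suspicious flows to the central full-model stage."""
    return [f for f in flows if edge_score(f) >= threshold]

flows = [
    {"bytes": 500, "dst_port": 443},          # ordinary HTTPS, dropped at edge
    {"bytes": 2_000_000, "dst_port": 445},    # large SMB transfer, forwarded
]
print(len(filter_for_central_analysis(flows)))   # 1
```

The design trade-off is the usual one for cascades: the edge stage must be tuned for high recall, since anything it drops never reaches the accurate central model.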
We will develop a report generation module to convert model outputs into analyst-friendly descriptions (e.g., “Detected action sequence characteristic of credential theft…”). This requires integration with the MITRE ATT&CK framework and a specialized language module. A pilot project at the JSC “Institute of Digital Engineering and Technology” will integrate the model into existing security infrastructure for real-world validation and optimization. Results will inform a commercial solution for the Kazakhstan cybersecurity market, considering local regulations.
Our long-term goal is an adaptive system that evolves with threats. We will research continual learning and episodic memory to prevent catastrophic forgetting when retraining on new attack types, preserving knowledge of rare but critical threats.
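One standard building block for such episodic memory is a reservoir-sampled replay buffer: keep a bounded sample of past attack examples and mix them into each new training batch so that rare, older threat types are not forgotten. A minimal sketch (the eventual system may differ):

```python
# Episodic-memory sketch for continual learning: reservoir sampling keeps a
# bounded, uniformly representative sample of everything seen so far.
import random

class EpisodicMemory:
    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling: each example ever seen has equal keep-probability."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def replay_batch(self, k: int):
        """Sample old examples to mix into a new training batch."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))

mem = EpisodicMemory(capacity=100)
for i in range(10_000):
    mem.add(("attack_event", i))
print(len(mem.buffer))   # 100
```

Replaying such a buffer alongside new data is one simple defense against catastrophic forgetting; more elaborate schemes weight the reservoir toward rare attack classes.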
6. Conclusions
The developed hybrid GNN-LSTM model effectively reconstructs and predicts attack vectors, demonstrating robust performance. The model achieved an AUC of 0.99 on both balanced and imbalanced test sets of the CICIDS2017 dataset, proving its high accuracy in distinguishing between benign and attack instances, even with significant class imbalance. It accurately predicted MITRE ATT&CK techniques with an F1-score of 0.85, showing its capability to reconstruct attack sequences.
The model’s risk prediction performance, with an MSE of 0.05, highlights its reliability in assessing potential threat severity for operational deployment. Visualizations (t-SNE and PCA) confirmed the model’s ability to capture structural patterns, while the LSTM component effectively modeled temporal dependencies, leading to accurate attack progression predictions. The F1-score of 0.97 for the attack class on the imbalanced test set underscores the model’s practical usefulness in real-world scenarios where attack instances are rare.
The main challenge lies in the fact that CICIDS2017 represents a limited set of attack scenarios. Future research priorities include comprehensive multi-dataset validation across NSL-KDD, UNSW-NB15, and ToN-IoT datasets; real-world deployment in enterprise environments; dataset-adaptive graph structure optimization; systematic benchmarking against existing approaches; and specialized applications for critical infrastructure protection. The next phase will involve validating the model on corporate data collected directly from the SIEM systems of large organizations.
Its performance on the CICIDS2017 dataset demonstrates its scalability and accuracy and suggests strong potential for generalization, positioning it as a promising tool for enhancing proactive cybersecurity defenses. However, addressing the identified limitations through improved calibration, feature engineering, and computational optimization will be crucial for successful operational deployment.