Article

A Cyber Attack Path Prediction Approach Based on a Text-Enhanced Graph Attention Mechanism

1
China Nuclear Power Operation Technology Corporation, Ltd. (CNPO), Wuhan 430074, China
2
School of Computer Science and Artificial Intelligence, Hubei University of Technology, Wuhan 430068, China
3
Hubei Provincial Key Laboratory of Green Intelligent Computing Power Network, Wuhan 430068, China
4
Hubei Provincial Engineering Research Center for Digital & Intelligent Manufacturing Technologies and Applications, Wuhan 430068, China
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(3), 552; https://doi.org/10.3390/electronics15030552
Submission received: 31 December 2025 / Revised: 24 January 2026 / Accepted: 25 January 2026 / Published: 27 January 2026
(This article belongs to the Special Issue Cryptography in Internet of Things)

Abstract

Traditional methods struggle to discover hidden attack trajectories. To address this problem, we propose a cyber attack path prediction approach based on a text-enhanced graph attention mechanism. Specifically, we design an ontology that captures multi-dimensional links between vulnerabilities, weaknesses, attack patterns, and tactics by integrating CVE, CWE, CAPEC, and ATT&CK into Neo4j. We then inject natural language descriptions into the attention mechanism to develop a text-enhanced GAT that alleviates data sparsity. Experiments show that, compared with existing baselines, our approach improves MRR and Hits@5 by 12.3% and 13.2%, respectively. The proposed approach can therefore accurately predict attack paths and support active cyber defense.

1. Introduction

With the rapid advancement of Internet technology, billions of smart devices have been deployed in critical fields such as industrial production, smart homes, and smart cities. However, IoT devices are often characterized by limited resources, weak security mechanisms, and persistent connectivity, making them ideal targets for cyber attacks. Conventional security defense measures rely heavily on signature-based detection and reactive blocking and can no longer cope with constantly evolving attack chains that exploit inter-device trust relationships and cross-domain vulnerabilities.
In recent years, knowledge graphs and graph neural networks (GNNs) have been increasingly used for attack modeling and prediction [1,2]. Existing work has demonstrated the practicality of integrating security repositories such as common vulnerabilities and exposures (CVE), common weakness enumeration (CWE), and common attack pattern enumeration and classification (CAPEC) into a unified graph. However, existing schemes have the following limitations: (1) They typically treat security entities as purely structural nodes and ignore the rich textual semantics in vulnerability descriptions and threat reports. (2) They lack clear mechanisms to handle the inherent data sparsity in network deployments. Therefore, knowledge graph-based methods still face challenges in producing cybersecurity predictions that account for the distinctive features and scale of such data. To achieve accurate attack path prediction in complex networks, it is essential not only to model structural relations but also to incorporate text-level semantics and address sparsity through dedicated mechanisms. In this paper, we propose a cyber attack path prediction approach based on a text-enhanced graph attention mechanism. This approach transforms the attack prediction task into a link prediction problem and identifies and supplements potential entity relationships that are not directly observed in the graph, thereby predicting attack paths more effectively. The main contributions of this article are outlined as follows:
  • To tackle the multi-source heterogeneity and the large-scale nature of data in attack prediction, we propose an innovative text-enhanced graph attention mechanism knowledge graph reasoning model. This model effectively utilizes textual node information to enhance the reasoning capability of cybersecurity knowledge graphs and applies it to attack path prediction.
  • To address insufficient attack dimension information when constructing cybersecurity knowledge graphs, we introduce a novel method that extracts hidden attack patterns from cyber threat intelligence and integrates CAPEC to supplement attack dimension information. By mining potential attack patterns from threat reports and applying them to knowledge graph construction, a correlation with ATT&CK attack techniques is established, providing a solid knowledge foundation for attack prediction and attack path formation, thereby achieving more complete and practical attack path construction.
  • Comparative and ablation experiments show that the proposed approach significantly reduces the search complexity of attack prediction while lowering time and space costs, demonstrating its effectiveness in attack path prediction.
The following sections of this article are arranged as follows. Section 2 reviews related work on the proposed approach. Section 3 details the construction of the security knowledge graph. Section 4 presents model development and experimental evaluation. Section 5 concludes the study.

2. Related Work

2.1. Representation, Construction, and Graph Database Storage of Knowledge Graphs

A knowledge graph constitutes a structured repository of semantic information, aimed at capturing entities, abstract concepts, and the relations among them through graph-based structures [3]. In terms of representation methods, property graph and resource description framework (RDF) models are often considered. The former allows for the attachment of key-value property pairs to nodes and edges and is widely adopted in industry. Due to their flexibility and efficient adjacency traversal capability, property graphs are particularly suitable for real-time querying of large-scale cybersecurity knowledge [4]. In contrast, RDF, as a W3C standard, emphasizes semantic and reasoning capabilities and is often used in conjunction with ontology languages like RDFS and OWL to define concept hierarchies and complex constraints. However, its triple-based storage and reasoning face performance bottlenecks on large-scale graphs [5]. In terms of storage, native graph databases have become the preferred choice for storing and managing knowledge graphs due to their performance advantages over relational databases in handling complex associative queries [6]. Nevertheless, existing storage schemes in cyber attack and defense scenarios still need to address issues such as concurrent write conflicts and incremental update efficiency, which is particularly critical for dynamically evolving threat intelligence graphs.

2.2. Network Attack Modeling and Path Prediction Methods

Attack path prediction is the core of proactive defense. Some studies primarily rely on attack graph-based methods, which formally model network assets, vulnerabilities, and connectivity to generate all possible attack paths [7,8]. While these methods can systematically reveal network vulnerabilities, they often face challenges such as state-space explosion and high levels of computational complexity. In recent years, knowledge graph-based methods have provided a new paradigm for attack path prediction. Knowledge graphs can integrate multi-source heterogeneous threat intelligence, such as vulnerabilities, weaknesses, attack patterns, and adversarial tactics and techniques to construct a semantically rich network of associations. Transforming the attack prediction task into a link prediction or path reasoning problem on the knowledge graph has become a research hotspot [9]. For example, some studies integrate authoritative data sources like ATT&CK and CAPEC to construct knowledge graphs for advanced persistent threat (APT) attack analysis, supporting threat hunting and attack chain reconstruction [2]. However, these schemes ignore the textual semantics in vulnerability descriptions and threat reports, resulting in a significant decrease in link prediction accuracy in data-sparse scenarios. In addition, most studies rely on static graphs for inference and lack the ability to model dynamic attacks [10].

2.3. Construction Practices of Cybersecurity Knowledge Graphs

Building a high-quality cybersecurity knowledge graph (CKG) is a systematic engineering task, with its key lying in ontology design and multi-source data fusion. A well-designed domain ontology, such as the unified cybersecurity ontology, provides a unified semantic framework for entities and relationships [11], serving as the cornerstone of the knowledge graph. On the data level, knowledge typically needs to be extracted from structured vulnerability databases, semi-structured threat intelligence reports, and unstructured security analysis texts, involving natural language processing techniques such as named entity recognition and relation extraction [12]. However, these methods still face challenges in processing long texts and semantic disambiguation, requiring the integration of domain dictionaries and knowledge constraints to improve extraction accuracy. Successfully constructed CKGs have demonstrated their value in multiple scenarios. For example, MITRE’s CyGraph system is used for network attack situational understanding and attack surface identification, as well as knowledge graph-based APT organization profiling and attribution analysis [13]. These practices provide important references for this study in building an integrated, prediction-oriented CKG, but they also expose issues such as the lack of link-weight learning mechanisms tailored for prediction tasks and dynamic update mechanisms. This is precisely the direction that the text-enhanced graph attention model proposed in this paper aims to address.

2.4. Attack Prediction Models Based on Graph Neural Networks

GNNs play a crucial role in mining deep threat clues from constructed knowledge graphs [14]. GNNs are capable of processing graph-structured inputs, fusing neighboring node features via iterative message passing to derive compact embeddings for both nodes and edges, which makes them well suited to applications such as community detection, anomaly detection, and link prediction. In attack prediction, GNNs are used to model the evolution of attacker behavior. Classical GNN models like GraphSAGE have been applied to intrusion detection. More recent work focuses on enhancing model reasoning and interpretability. A graph attention network (GAT) distinguishes the importance of different neighbors in the graph through an attention mechanism, thereby allowing a finer focus on key attack evidence [15]. Existing schemes explore combining textual semantic information, such as vulnerability descriptions encoded via pre-trained language models [16,17], with graph structural information to enrich node feature representations, thereby improving the accuracy of link prediction (i.e., attack path prediction). Furthermore, some schemes combine reinforcement learning with knowledge graph embeddings [18,19], modeling path prediction as a sequential decision-making problem to find optimal attack paths. However, these schemes still converge slowly in defense scenarios that require high real-time performance. The text-enhanced graph attention mechanism proposed in this article is designed to address the shortcomings of existing methods in terms of semantic text fusion and adaptability to sparse data.

2.5. Summary

Although existing schemes have made significant progress, many CKG constructions focus on single data sources or specific applications, with insufficient discussion of the deep integration of multi-dimensional intelligence sources like CVE, CWE, CAPEC, and ATT&CK to build a unified graph that is directly usable for end-to-end prediction. Furthermore, the deep integration of textual semantics and graph structural attention for attack path prediction tasks remains a current research frontier. Therefore, this study systematically constructs a cybersecurity knowledge graph that integrates multi-source threat intelligence and innovatively proposes a prediction model based on a text-enhanced graph attention mechanism. It provides a complete framework, from multi-source data fusion and graph construction to intelligent prediction, promoting the evolution of cybersecurity defense from static analysis to a dynamic, predictive mode.

3. Cybersecurity Knowledge Graph Construction

3.1. Knowledge Graph Data Analysis

A knowledge graph is a structured semantic knowledge database symbolically representing various concepts and their inter-relationships in the real world [20]. From the viewpoint of knowledge acquisition, constructing a knowledge graph generally involves three stages: information extraction, knowledge integration, and knowledge refinement [21]. Within information extraction, two prevailing strategies are recognized: top-down and bottom-up. The top-down approach derives ontological and relational structures from structured, semi-structured, or otherwise high-quality datasets. Conversely, the bottom-up approach starts from semi-structured or unstructured sources, applying techniques such as entity detection, relation identification, and attribute discovery to extract candidate facts, then selecting the most reliable ones for inclusion in the knowledge repository [22]. The mapping data considered in this paper includes CVE, CWE, CAPEC, and attack patterns in ATT&CK ID. The data is described as follows:
CVE is a well-known record of security vulnerabilities that assigns a unique identification number to each publicly disclosed vulnerability or exposure. Each record also provides a short summary of the affected products and versions, which helps analysts understand the vulnerability and evaluate related services, tools, and databases, as shown in Figure 1. In this paper, we crawl vulnerability report web pages and collect and summarize all information related to each CVE. For each vulnerability, the proposed approach retrieves the CVE-ID from the given vulnerability report web page and associates the CVE with a CWE through a “BelongOf” relationship; for example, CVE-2019-14816 is BelongOf CWE-120.
CWE is a community-developed list that provides a standardized dictionary of software security weaknesses, such as buffer overflows, injection attacks, and inappropriate assignment of privileges. Each weakness has a unique CWE ID and a detailed description covering the nature of the weakness, its possible consequences, the associated attack patterns, and how the weakness can be avoided or mitigated. CWE entries are also linked to one another; e.g., CWE-59 falls under the ChildOf category, which enables security analysts to uncover latent associations among diverse CWEs. Our study extracted relevant records from CWE-706891, where each entry possesses a distinct CWE-ID, a natural language description, links to other CWEs, and an associated attack pattern. Table 1 defines eight types of CWE relationships: BelongOf captures the linkage between CWE and CVE; AttackOf represents connections to CAPEC attack patterns; ChildOf, ParentOf, CanPrecede, CanFollow, and PeerOf describe pairwise relations among CWEs; and Semantics encodes hyperlinks from a CWE entry to external web pages in its vulnerability report.
CAPEC documents the tactics and methods attackers use to exploit software or hardware vulnerabilities. This database helps security analysts and developers understand attacker behavior and enhance cyber defense capabilities [23]. CAPEC entries detail the typical characteristics of attackers and their attack techniques and also explore the challenges and potential risks associated with exploiting known vulnerabilities. For example, CAPEC-86 stands for “Directory Traversal”, CAPEC-116 stands for “LDAP Injection”, and CAPEC-134 stands for “Cross-Site Scripting (XSS) Attacks”. In our experiments, we consider the attack pattern descriptions, correlations, and weaknesses of each CAPEC entry as key analysis features. In addition, we model the relationship between a CAPEC entry and its associated weakness (CWE) as TargetOf.
ATT&CK is a knowledge base of adversarial behaviors based on real-world observations that details the tactics, techniques, and procedures that an attacker is likely to employ. Its tactic distribution diagram is shown in Figure 2. The ATT&CK framework not only helps security analysts understand the behavioral patterns of an attacker but also provides a framework for evaluating an organization’s security defense capabilities. During the collection of cyber threat intelligence, we extract the ATT&CK IDs associated with attack patterns. These IDs provide unique identifiers for each observed tactic and technique. The corresponding relationship types are ExploitedBy and CorrespondsTo, each of which records a pre-existing relationship. For example, consider a known CVE, such as CVE-2020-1472, a vulnerability affecting the Microsoft remote desktop protocol (RDP) that allows an unauthenticated adversary to execute arbitrary code by sending specially crafted RDP packets, which, in the ATT&CK framework, can be mapped to “T1210: Exploitation for Client Execution”. From this mapping, it is possible to establish the relationships CVE-2020-1472 ExploitedBy T1210 and CWE-20 CorrespondsTo T1210, which indicate that CVE-2020-1472 is a way for an attacker to carry out the “T1210: Exploitation for Client Execution” technique, while CWE-20 corresponds to this ATT&CK technique.
By synthesizing CVE, CWE, CAPEC, and ATT&CK, we construct a multi-dimensional, semantically rich knowledge graph that provides comprehensive data support for cybersecurity analysis and defense.
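The relationships quoted in this section can be pictured as plain (head, relation, tail) triples. The following is a minimal sketch, not the authors' storage format: the three triples are the ones named above, while `build_adjacency` is a hypothetical helper introduced only for illustration.

```python
from collections import defaultdict

# Triples taken from the examples in this section.
TRIPLES = [
    ("CVE-2019-14816", "BelongOf", "CWE-120"),
    ("CVE-2020-1472", "ExploitedBy", "T1210"),
    ("CWE-20", "CorrespondsTo", "T1210"),
]


def build_adjacency(triples):
    """Index triples by head entity: head -> list of (relation, tail)."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return dict(adjacency)


graph = build_adjacency(TRIPLES)
print(graph["CVE-2019-14816"])
```

Indexing by head entity makes it cheap to list the outgoing links of a node, which is the lookup that link prediction later tries to extend with unobserved edges.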

3.2. Ontology Modeling

The knowledge graph ontology is the architectural core of the knowledge graph, defining the basic elements of the data model that make up the graph, such as entities, concepts, attributes, relationships, and the hierarchical and logical relationships between them [24]. It not only ensures the consistency, accuracy, and usability of the graph but also supports efficient knowledge reasoning and querying through explicit semantic descriptions, thereby enhancing the scalability and maintainability of the graph. The ontology design focuses on the precise representation of concepts in the domain and the logical rules of complex relationships. It enables knowledge graphs to automatically process complex queries and analyses and supports the automatic discovery and reasoning of knowledge, thereby improving the value and efficiency of knowledge graphs in practical applications. The ontology model usually includes the following aspects:
  • Entity definition determines which major entities and concepts will be included in the knowledge graph. For instance, in the cybersecurity domain, entities may include vulnerabilities, attack patterns, attackers, defenses, etc. The correspondence between data dimensions and entity types is presented in Table 2.
  • Attribute definition defines, for each entity and concept, a set of attributes that characterize the entity. For example, a vulnerability entity may have attributes such as its unique identification number, affected products, release date, etc.
  • Relationship definition defines the types of relationships that may exist between entities. For example, a vulnerability may be associated with a particular attack pattern, or an attack pattern may exploit a particular weakness.
  • Identification of entity types: Four main entity types are defined: CVE (vulnerability instance), CWE (vulnerability type), CAPEC (attack pattern), and ATT&CK ID (attack technique and tactics).
  • Determination of entity attributes: CVE entities contain attributes such as vulnerability name, vulnerability description, vulnerability level, and exploitability of the vulnerability. CWE entities contain attributes such as vulnerability name, vulnerability description, and related attack pattern. CAPEC entities contain attributes such as attack pattern name, attack pattern description, and typical target. ATT&CK entities contain attributes such as tactic name, technique name, and process name. The correspondence between entity types and their attributes is shown in Table 3.
  • Identification of relationships between entities: The BelongOf relationship between CVE and CWE indicates that a specific CVE belongs to a certain CWE category. AttackOf represents the attack relationship between CWE and CAPEC, which reveals the types of vulnerabilities that attackers may exploit. This is crucial for identifying and defending against potential attack paths. The ExploitedBy relationship indicates that a particular CVE may be exploited by a particular ATT&CK technique or tactic. The CorrespondsTo relationship between CWE and ATT&CK ID further refines the direct link between the vulnerability type and the attack technique. The enumerated relationships are listed in Table 4. With these types of relationships, a knowledge graph can provide a comprehensive view of cybersecurity threats. The ontology design model is shown in Figure 3.
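One way to make such an ontology enforceable is to record, for each relationship, which entity types it may connect. The sketch below is illustrative only. It assumes the relationship names CorrespondsTo (from Section 3.1) and the edge directions shown in the comments, and `SCHEMA` and `is_valid` are hypothetical names.

```python
# Allowed (head type, tail type) per relationship.
# Assumption: edges point from the first type named in the text to the second.
SCHEMA = {
    "BelongOf": ("CVE", "CWE"),
    "AttackOf": ("CWE", "CAPEC"),
    "ExploitedBy": ("CVE", "ATT&CK"),
    "CorrespondsTo": ("CWE", "ATT&CK"),
}


def is_valid(head_type, relation, tail_type):
    """Return True when the ontology allows this relation between the types."""
    return SCHEMA.get(relation) == (head_type, tail_type)


print(is_valid("CVE", "BelongOf", "CWE"))   # allowed by the schema
print(is_valid("CWE", "BelongOf", "CVE"))   # wrong direction
```

A check of this kind can be run before import so that malformed triples never reach the graph database.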

3.3. Building a Knowledge Graph Based on Neo4j

In this study, we employ BeautifulSoup to crawl and parse web pages containing CVE, CWE, and CAPEC data, as well as data extracted from cyber threat intelligence that include ATT&CK IDs. We collect the textual description and corresponding CWE-ID from each CVE web page. Since attack patterns play a critical role in CAPEC, we utilize the textual descriptions of each CAPEC entry, along with their inter-relationships. The collection process can be divided into three steps: (1) Define the strategy and scope of data collection. (2) Identify the types of assets to be collected and specify the security elements (e.g., vulnerabilities and configurations). (3) Select the Nmap tool for data acquisition, conduct comprehensive scans of network assets, and store the resulting data in a database.
After the ontology structure is completed, the knowledge graph is constructed on Neo4j. Neo4j is a graph database that uses a graph structure to represent data and relationships [25,26]. This structure makes connections between data more intuitive and easier to understand. Neo4j provides a rich set of APIs and tools that enable developers to easily build, query, and manipulate graph databases, enabling rapid processing and analysis of large-scale datasets and supporting complex queries and operations.
In this paper, we choose Neo4j Desktop 1.4.15 as the software platform for knowledge graph construction and use the Python 3.9 programming language, combined with the py2neo library, to interact with the Neo4j graph database. We first perform the initial setup of the database, including the configuration of accounts and passwords on the local server, so that the structure and dynamic changes of the knowledge graph can be observed visually through the browser interface. Then, we import the various types of entity nodes and their inter-relationships in bulk and assign attributes to these entity nodes to better map real-world concepts and relationships. The knowledge graph is constructed from the assets of the cyber range and the data collected above; its entities can be accessed through the desktop view within the database, where clicking on a node displays its attributes. Some of the nodes are shown in Figure 4.
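The bulk import step can be sketched as generating parameterized Cypher `MERGE` statements. This is a minimal illustration rather than the import code used in the paper: the `id` property and the helper names `merge_node`, `merge_relation`, and `_check_name` are assumptions. Each returned query and parameter dictionary could then be handed to a driver call such as py2neo's `Graph.run`, which requires a running database and is therefore left out here.

```python
def _check_name(name):
    """Labels and relation types cannot be query parameters, so validate them."""
    if not name.isidentifier():
        raise ValueError(f"unsafe Cypher name: {name!r}")
    return name


def merge_node(label, identifier):
    """Build a query that creates the node only if it does not exist yet."""
    query = f"MERGE (n:{_check_name(label)} {{id: $id}})"
    return query, {"id": identifier}


def merge_relation(head, relation, tail):
    """Build a query linking two existing nodes; head/tail are (label, id)."""
    (head_label, head_id), (tail_label, tail_id) = head, tail
    query = (
        f"MATCH (h:{_check_name(head_label)} {{id: $head_id}}), "
        f"(t:{_check_name(tail_label)} {{id: $tail_id}}) "
        f"MERGE (h)-[:{_check_name(relation)}]->(t)"
    )
    return query, {"head_id": head_id, "tail_id": tail_id}


node_query, node_params = merge_node("CVE", "CVE-2019-14816")
rel_query, rel_params = merge_relation(
    ("CVE", "CVE-2019-14816"), "BelongOf", ("CWE", "CWE-120")
)
print(node_query)
print(rel_query)
```

Using `MERGE` instead of `CREATE` keeps repeated imports idempotent, and passing identifiers as parameters avoids building them into the query string.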

4. Experimental Results and Analysis

4.1. Modeling Framework

The attack path prediction model in our approach builds on a link prediction framework with a graph attention mechanism: it maps out attack paths by predicting potential relationships between entities in the knowledge graph. Because the entities come from different data sources and have complex inter-relationships, the model uses the textual information attached to each entity to improve the accuracy and depth of relationship prediction between nodes. Specifically, the model first adopts structural embedding techniques and textual description generation methods to construct vectorized representations of nodes and their associated texts. Subsequently, an attention mechanism layer allows the model to selectively attend to salient features and produce relational embeddings that capture inter-entity links, thereby achieving relationship prediction. Combining structural and textual data not only improves the identification of entity relationships but also strengthens the model’s ability to understand potential threats in complex cybersecurity environments. In this way, the model can reveal and predict the strategies and attack sequences that attackers may employ, providing a more efficient means of network security protection. The attack path prediction model comprises data preprocessing, vectorized representation generation, application of the attention mechanism, and final relationship prediction.
  • We train the structural representation of the knowledge graph using the TransE algorithm, which efficiently captures the connectivity patterns of entities and relations such as CVE, CWE, CAPEC, and ATT&CK IDs extracted from cyber threat intelligence. The textual descriptions of these entities are vectorized through the Word2Vec model and transformed into numerical information to capture the entities’ semantic features.
  • Subsequently, these textual feature vectors are combined with the structural feature vectors generated by TransE to form a rich entity representation that integrates structural and textual information.
Based on this, the model introduces an attention mechanism layer that is capable of assigning weights to potential connections between entities, thereby focusing on the most critical information during the prediction process. In this way, the text-enhanced GAT model can make effective predictions of possible entity relationships in the knowledge graph, providing strong technical support to reveal links that are not directly observed [5]. The architecture of the text-enhanced graph attention mechanism is shown in Figure 5.
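To make the text branch more concrete, the following sketch shows how word vectors might be turned into a fixed-size feature and joined with a structural vector. It follows the settings reported in Section 4.2 (sentence length 375, 100-dimensional word vectors, full-width kernels, ReLU, k-max pooling with k = 20) but is a simplification under stated assumptions: it uses a single filter with random weights, a kernel height of three words, and random stand-ins for the Word2Vec and TransE outputs. All function names are hypothetical.

```python
import random

MAX_LEN, DIM = 375, 100  # sentence length and word-vector size from Section 4.2


def pad(vectors, max_len=MAX_LEN, dim=DIM):
    """Truncate or zero-pad a list of word vectors to exactly max_len rows."""
    padding = [[0.0] * dim for _ in range(max_len - len(vectors))]
    return vectors[:max_len] + padding


def conv_relu(rows, kernel):
    """Slide a full-width kernel down the rows and apply ReLU to each output."""
    height = len(kernel)
    out = []
    for start in range(len(rows) - height + 1):
        total = sum(
            w * x
            for k_row, row in zip(kernel, rows[start:start + height])
            for w, x in zip(k_row, row)
        )
        out.append(max(total, 0.0))
    return out


def k_max_pool(values, k=20):
    """Keep the k largest activations while preserving their original order."""
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


def concat(structural, textual):
    """Join the structural and textual feature vectors end to end."""
    return structural + textual


rng = random.Random(0)
# Stand-in for the Word2Vec vectors of a 12-word description.
words = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(12)]
# One random filter; kernel height 3 is an assumption, width equals DIM.
kernel = [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(3)]

feature_map = conv_relu(pad(words), kernel)
pooled = k_max_pool(feature_map)
entity_vector = concat([0.0] * 100, [0.0] * 100)  # placeholder 100 + 100 dims
print(len(feature_map), len(pooled), len(entity_vector))
```

Because k-max pooling keeps positions in order, the pooled output still reflects where in the description the strongest features occurred, which a plain max pooling step would discard.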

4.2. Attack Path Prediction for Graph Attention Mechanisms Based on Text Enhancement

In this section, we design a graph attention network architecture augmented with textual features (termed text-enhanced GAT), which takes into account the knowledge contained in the triples of an entity's neighbors when analyzing each entity. The model proceeds as follows.
  • Structure embedding and description embedding generation: Entity embeddings are built from two sources. On the one hand, the triples are trained with the TransE model to obtain the initial structural embedding, with the dimension of the structural embedding vector set to 100. On the other hand, each text description is processed to extract as many semantic knowledge features as possible. Specifically, stemming is performed on the CVE and CWE text using NLTK; then, the tokens (words) are sent to the widely used word2vec model [27,28], which converts each word into a 100-dimensional vector, and the word vectors of each entity description are concatenated. The sentence length is fixed at 375 (the maximum sentence length). If a sentence is shorter than this value, it is padded with filler vectors that are initialized to 0 and have a dimension of 100. The resulting sentence vectors (375 × 100) are fed into a two-layer convolutional neural network (CNN), with the kernel width set to 100 to scan the convolutional feature space, followed by a ReLU activation. The first CNN layer is followed by a k-max (k = 20) pooling layer to better represent the frequent features and to retain positional information in the feature space; the convolutional features are then sent to the second CNN layer, which is activated by ReLU, and an average pooling layer compresses the features into their global average. Finally, the structural feature vector (100 dimensions, generated by TransE) is concatenated with the CNN-generated text feature vector (100 dimensions) to form a 200-dimensional feature vector.
    For example, the “EternalBlue” entity node in the graph is not only associated with multiple CVEs but is also related to different ATT&CK techniques. Here, CVE represents different instances of vulnerabilities, while ATT&CK describes the different attack strategies that attackers may use to exploit them. The descriptive message reads as follows: “EternalBlue exploits vulnerabilities in Microsoft’s implementation of the server message block protocol. It allows attackers to execute arbitrary code on the target system”. The TransE model is trained on the above structural information to generate embedding vectors for all entities and relationships related to the “EternalBlue” vulnerability, each with a dimension of 100. The description text is processed by removing stop words and stemming with the NLTK toolkit. For example, “exploits” is changed to “exploit”, and “vulnerabilities” is changed to “vulnerability”. Each processed word is then converted into a 100-dimensional vector using the Word2Vec model. These text embeddings are processed by the CNN, and after two layers of convolution and pooling operations, a description embedding is obtained. This description embedding is concatenated with the structural embedding to obtain a 200-dimensional vector that captures both the structural and semantic properties of the “EternalBlue” vulnerability.
  • Knowledge graph attention layer: We devise a multi-head attention neural network layer for extracting relation features of entities to generate a connection vector $\rho_{(h,r,t)}$. Inspired by the attention mechanism, we use a single-layer feed-forward neural network to assign coefficients to each neighboring node within a triple and apply the activation function, i.e.,
    $$\alpha_{(h,r,t)} = \mathrm{LeakyReLU}\left(W \rho_{(h,r,t)}\right),$$
    for nonlinear mapping prior to normalization, where $W$ is a linear transformation matrix. A softmax layer is introduced to normalize the attention scores, yielding the following standardized attention parameters:
    $$\beta_{(h,r,t)} = \frac{\exp\left(\alpha_{(h,r,t)}\right)}{\sum_{n \in N_h} \sum_{r \in R_{hn}} \exp\left(\alpha_{(h,r,n)}\right)},$$
    where $N_h$ represents the set of neighboring nodes of entity $h$ and $R_{hn}$ corresponds to the set of relations linking $h$ with its neighboring nodes. After the normalization step described in the preceding equation, the embedding of the entity is updated by aggregating the attention weights from all the neighboring nodes. This updated embedding is then subjected to a linear transformation. To prevent the loss of the entity’s original semantic information during training, the initial embedding is combined with another linearly transformed embedding, thereby enabling the multi-head attention network to capture deeper neural representations of both entities and relations.
  • Model training: The model is trained by minimizing the following margin-based loss function:
    $$L = \sum_{(h, r, t) \in S} \sum_{(h', r, t') \in S'} \max\left(f(h, r, t) - f(h', r, t') + \gamma,\ 0\right),$$
    where $(h, r, t)$ denotes a valid instance from the set $S$ of legitimate triples, $(h', r, t')$ denotes an invalid sample constructed by randomly substituting the head entity or the tail entity of a valid triple, and $S' = \{(h', r, t) \mid h' \in E \setminus \{h\}\} \cup \{(h, r, t') \mid t' \in E \setminus \{t\}\}$. The function $f(h, r, t) = \lVert h + r - t \rVert_{1/2}^{2}$ is the L2-norm dissimilarity measure to be minimized, and $\gamma$ is a margin hyperparameter (set greater than 0) that determines the boundary between correct and incorrect triples.
    After the 200-dimensional combined vector obtained above passes through the attention layer, “EternalBlue” is found to be associated with CVE-2017-0144, and CVE-2017-0144 is further associated with CAPEC-31, with the attention weight indicating the importance of CAPEC-31 in predicting the attack path of the “EternalBlue” vulnerability. By using the softmax function to calculate the normalized attention values, the embedding of “EternalBlue” is updated to a new embedding that integrates structural information from multiple neighboring nodes. This embedding reflects the importance and contextual information of “EternalBlue” in the knowledge graph.
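The attention normalization and the margin-based training loss described in the list above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the raw scores stand in for $W \rho(h, r, n)$, the LeakyReLU slope of 0.2 and the margin of 1.0 are assumed values, the relation name `related_to` and the second neighbor are hypothetical, and the squared L2 distance is chosen because the text names the L2 norm.

```python
import math
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x


def attention_weights(scores: Dict[Triple, float]) -> Dict[Triple, float]:
    """Map raw scores for the triples around one head entity to softmax weights.
    Each weight is exp(alpha) divided by the sum of exp(alpha) over all
    neighbors and relations of that head."""
    alpha = {k: leaky_relu(v) for k, v in scores.items()}
    m = max(alpha.values())  # subtract the maximum for numerical stability
    exp = {k: math.exp(a - m) for k, a in alpha.items()}
    z = sum(exp.values())
    return {k: e / z for k, e in exp.items()}


def distance(h: List[float], r: List[float], t: List[float]) -> float:
    """Squared L2 dissimilarity of h + r and t (assumed form of f)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_loss(pos: List[float], neg: List[float], gamma: float = 1.0) -> float:
    """Sum of max(f(valid) - f(corrupted) + gamma, 0) over all pairs."""
    return sum(max(fp - fn + gamma, 0.0) for fp in pos for fn in neg)


# Hypothetical raw scores for two triples that share the head "EternalBlue".
raw = {
    ("EternalBlue", "related_to", "CVE-2017-0144"): 1.2,
    ("EternalBlue", "related_to", "other_neighbor"): -0.5,
}
beta = attention_weights(raw)
print(beta)
print(margin_loss([0.5], [2.0]), margin_loss([2.0], [0.5]))
```

A valid triple that already scores lower than its corrupted counterpart by more than the margin contributes zero loss, which is why the first call to `margin_loss` returns 0 while the second does not.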

4.3. Results

The text-enhanced graph attention network model is configured as follows. For the GAT module, the weight decay coefficient is set to 5 × 10^−6, while that for the convolutional layers is fixed at 1 × 10^−5. Parameter optimization is performed using an iterative update approach with a learning rate of 1 × 10^−3. To mitigate overfitting, a dropout rate of 0.3 is uniformly applied to each convolutional layer. Throughout training, a batch size of 8923 is employed for all samples. The Adam algorithm is chosen as the optimizer, along with its recommended hyperparameters.
Baseline Comparison: The TransH model is an extension of the TransE model that provides a hyperplane ($W_r$) for each relationship ($r$). The TransH model compensates for the poor performance of the TransE model on one-to-many, many-to-one, and many-to-many relationships. Therefore, the TransH model, combined with a text model, is selected for comparison with the text-enhanced GAT model proposed in this paper.
For the link prediction task, we adopt a unified negative sampling strategy that randomly corrupts the head or tail entity to generate K negative examples for each positive instance while maintaining validity constraints. In addition, we divide the dataset into training, validation, and testing sets, with proportions of 70%, 15%, and 15%, respectively. Stratified sampling is applied during splitting to maintain the distribution of entity types and relationship types, minimizing potential biases.
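A minimal sketch of this sampling and splitting procedure is given below. It is an illustration under stated assumptions, not the authors' pipeline: the validity constraint is assumed to mean "never emit a triple that is already known to be true", the split is a plain random split (stratification by entity and relation type is omitted for brevity), and the entity names and function names are hypothetical.

```python
import random
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def corrupt(triple: Triple, entities: List[str], known: Set[Triple],
            k: int, rng: random.Random, max_tries: int = 1000) -> List[Triple]:
    """Generate up to k negatives by replacing the head or the tail at random.
    Candidates that are already known positives are rejected (assumed filter)."""
    h, r, t = triple
    negatives: List[Triple] = []
    tries = 0
    while len(negatives) < k and tries < max_tries:
        tries += 1
        e = rng.choice(entities)
        cand = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if cand != triple and cand not in known and cand not in negatives:
            negatives.append(cand)
    return negatives


def split(triples: List[Triple], rng: random.Random):
    """Shuffle and split into 70% / 15% / 15% using integer arithmetic."""
    items = triples[:]
    rng.shuffle(items)
    n_train = len(items) * 70 // 100
    n_val = len(items) * 15 // 100
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


rng = random.Random(0)
entities = ["cve_a", "cwe_a", "capec_a", "technique_a"]  # toy entity names
positive = ("cve_a", "BelongsTo", "cwe_a")
known = {positive, ("cwe_a", "AttackOf", "capec_a")}

negs = corrupt(positive, entities, known, k=3, rng=rng)
toy = [("e%d" % i, "r", "e%d" % (i + 1)) for i in range(20)]
train, val, test_set = split(toy, rng)
print(negs)
print(len(train), len(val), len(test_set))
```

The relation of a corrupted triple is never changed, so each negative differs from its positive in exactly one entity.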
Evaluation Metrics: For each triple, the missing element must be predicted from the other two elements, and a ranked candidate list containing the missing element is produced. Link prediction performance on knowledge graphs is commonly evaluated using mean rank (MR), Hits@N, and mean reciprocal rank (MRR). MR measures the model’s performance in predicting missing relationships between entities with the following formula:
$$\mathrm{MR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \mathrm{rank}_i,$$
where $S$ represents the set of all triples, $|S|$ denotes the total number of triples in this set, and $\mathrm{rank}_i$ is the predicted rank of the correct link for the $i$-th triple.
MRR considers all the test triples and takes the reciprocal of the rank of the correct answer in the ranked candidate list of each triple. The average of these reciprocal ranks gives the final assessment metric:
$$\mathrm{MRR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \frac{1}{\mathrm{rank}_i}.$$
Hits@N indicates the proportion of correct entities among the top N candidates predicted by the model. A higher value of this metric signifies superior retrieval performance. Its formula is expressed as follows:
$$\mathrm{Hits@}N = \frac{1}{|S|} \sum_{i=1}^{|S|} I\left(\mathrm{rank}_i \le N\right),$$
where $I(\cdot)$ is the indicator function, which equals 1 if the condition holds and 0 otherwise.
Hits@N measures the hit rate of correct entities in the model’s top-N prediction list. MR evaluates the average position of correct predictions for the test dataset. MRR evaluates model performance by averaging the reciprocal ranks of the correct entities across individual prediction tasks. The corresponding experimental outcomes are summarized in Table 5.
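The three metrics can be computed directly from the list of ranks assigned to the correct entities. The sketch below is illustrative only; the function names and the example ranks are hypothetical and are not taken from the experiments.

```python
from typing import List


def mean_rank(ranks: List[int]) -> float:
    """MR: average rank of the correct entity (lower is better)."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """MRR: average of 1 / rank (higher is better)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_n(ranks: List[int], n: int) -> float:
    """Hits@N: fraction of test triples whose correct entity is ranked in the top N."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Hypothetical ranks of the correct entity for four test triples.
ranks = [1, 2, 4, 10]
print(mean_rank(ranks), mean_reciprocal_rank(ranks), hits_at_n(ranks, 5))
```

For these four ranks, MR is 17/4, MRR is (1 + 1/2 + 1/4 + 1/10)/4, and Hits@5 is 3/4, which shows how a single poorly ranked triple affects MR much more strongly than MRR.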
From Table 5, it can be seen that the text-enhanced GAT model proposed in this paper ranks the correct entity higher than the baseline model on average: MRR is improved by 0.131, while Hits@5 is improved by 0.179. This indicates that the text-enhanced GAT model can better represent the textual description of entities, thereby improving prediction accuracy. In predicting the missing head entity, the MRR is increased by 0.134, while Hits@5 is increased by 0.195. This improvement indicates that the text-enhanced GAT can more effectively alleviate uncertainty in head entity prediction by utilizing descriptive semantics. In predicting the missing tail entity, the MRR is increased by 0.136, while Hits@5 is increased by 0.161. This result indicates that when entities correspond to specific vulnerabilities, attack patterns, or strategies described in natural language reports, textual information helps improve the ranking of tail entities. Therefore, the model proposed in this paper improves the results of both the head and tail prediction tasks.
Ablation Experiments: Systematically removing, or “ablating”, individual components of a model reveals how much each one affects performance. In this way, the component that contributes the most to model performance can be identified, which also gives an intuitive view of the balance between model complexity and performance. In this section, we perform ablation studies by developing three modified models (M-1, M-2, and M-3) that omit the CNN layer, the TransE initialization, and the attention component, respectively. The link prediction performance is tested using MR, MRR, and Hits@5. The results are shown in Table 6.
Table 6 compares the proposed full model with the three ablated variants. Among M-1, M-2, and M-3, the degradation is smallest for M-1: when the CNN layer is removed, MR increases from 91 to 149, MRR decreases to 0.626, and Hits@5 declines to 0.665. M-2, which is obtained by removing the TransE module, exhibits the second greatest performance degradation. Compared with the full model, the MR of M-2 increases to 235, its MRR decreases to 0.522, and its Hits@5 decreases to 0.638. Further inspection reveals that the attention layer constitutes a pivotal component of the text-enhanced GAT architecture. When this layer is excluded (model M-3), MR rises to 259, the MRR falls to 0.476, and Hits@5 declines to 0.531. These results underscore the decisive role of the attention mechanism in shaping the prediction capability of the text-enhanced GAT model.
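The relative contribution of each component can be read off Table 6 with a short calculation. The snippet below only re-derives differences from the values reported in Table 6; the dictionary layout and variable names are assumptions made for illustration.

```python
# (MR, MRR, Hits@5) copied from Table 6.
table6 = {
    "Full": (91, 0.691, 0.740),
    "M-1": (149, 0.626, 0.665),   # CNN layers removed
    "M-2": (235, 0.522, 0.638),   # TransE removed
    "M-3": (259, 0.476, 0.531),   # attention layers removed
}

full_mr, full_mrr, full_hits = table6["Full"]

# Absolute MRR drop of each ablated variant relative to the full model.
mrr_drop = {name: full_mrr - mrr
            for name, (_, mrr, _) in table6.items() if name != "Full"}

# Variants ordered from the largest to the smallest MRR drop.
ranking = sorted(mrr_drop, key=mrr_drop.get, reverse=True)

for name in ranking:
    print(name, round(mrr_drop[name], 3))
```

Ordering the variants by MRR drop places M-3 first, M-2 second, and M-1 last, which matches the discussion of the attention layer as the most influential component.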
Inference Time Analysis: In this study, we evaluate the inference performance of the knowledge graph-based model, focusing on the inference time of the graph neural network when executing link prediction tasks. The inference process involves loading the entire graph structure (approximately 1.26 million nodes and 1.48 million edges) from a graph database, retrieving text-enhanced features, and executing forward propagation through the graph attention mechanism to generate node/edge representations and calculate prediction results. We conduct 1000 inference runs for each of the three target categories (CVE, CWE, and CAPEC) and calculate the average execution time. The results are presented in Table 7.
As shown in Table 7, under the full-graph condition, the average inference times for CVE, CWE, and CAPEC are 42.6 ms, 38.4 ms, and 46.8 ms, respectively. These times are all within the millisecond response range, thereby meeting the requirements for real-time analysis in cybersecurity applications.
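Averages and standard deviations of this kind can be collected with a small timing harness. The sketch below is an assumption about how such a measurement might be set up, not the authors' benchmarking code: `dummy_model` is a placeholder workload standing in for the trained model's forward pass, and the function name `time_inference` is hypothetical.

```python
import statistics
import time
from typing import Callable, Tuple


def time_inference(run: Callable[[], object], repeats: int = 1000) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the wall-clock time in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_model() -> int:
    # Placeholder workload; a real measurement would call the model here.
    return sum(i * i for i in range(1000))


mean_ms, std_ms = time_inference(dummy_model, repeats=100)
print(f"{mean_ms:.3f} ms +/- {std_ms:.3f} ms")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock; for GPU workloads the device would also need to be synchronized before each timestamp, which this sketch does not cover.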
Scalability Analysis: To evaluate the scalability of the proposed text-enhanced graph attention mechanism, we extend the inference-time analysis by evaluating model performance on subgraphs of varying sizes extracted from the full knowledge graph (1,257,355 nodes and 1,478,742 edges). Subgraph scales are selected to represent incremental growth in nodes and edges while preserving the original structural characteristics of the cybersecurity domain. For each scale, we measure the average inference time for link prediction on CVE, CWE, and CAPEC-related subgraphs, as well as the corresponding F1 score and GPU memory usage. A comparison of inference time and performance at different subgraph scales is shown in Table 8.
The results presented in Table 8 demonstrate that inference time increases gradually with graph size, remaining in the millisecond range, even at full scale. This indicates that the model has good scalability, as the average inference times for CVE, CWE, and CAPEC tasks only moderately increase when expanding from a small subgraph (10,000 nodes) to the full knowledge graph (1.26 million nodes) (e.g., CVE from 28 ms to 42.6 ms). The F1 score decreases slightly (≤5.0%) as scale increases, reflecting minor performance degradation due to increased graph complexity. GPU memory usage scales accordingly but remains manageable on a single high-end GPU. These findings confirm that the proposed method sustains real-time inference capability across a wide range of graph sizes, a key advantage for practical deployment in cybersecurity scenarios.
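The scaling trend can be checked with simple arithmetic on the subgraph rows of Table 8. The snippet below uses only the edge counts, CVE inference times, and F1 scores reported in that table; the variable names are chosen for illustration.

```python
# (edges, AIT-CVE in ms, F1 in %) for the four subgraph scales in Table 8.
rows = [
    (50_000, 28, 92.5),
    (250_000, 35, 90.8),
    (500_000, 39, 89.6),
    (1_000_000, 41, 88.2),
]

edge_growth = rows[-1][0] / rows[0][0]       # growth factor of the edge count
time_growth = rows[-1][1] / rows[0][1]       # growth factor of the CVE inference time
f1_drop_points = rows[0][2] - rows[-1][2]    # F1 decrease in percentage points
f1_drop_relative = 100 * f1_drop_points / rows[0][2]  # F1 decrease relative to the smallest scale

print(edge_growth, round(time_growth, 2), round(f1_drop_points, 1), round(f1_drop_relative, 2))
```

Across these rows the edge count grows twentyfold while the CVE inference time grows by less than a factor of 1.5, and the relative F1 decrease stays below the 5.0% bound stated in the text.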

4.4. Experimental Case Study

In this section, we present a real-world cybersecurity case study to illustrate how the GAT method can be applied to predict potential cyber attack paths. To visualize the process, we consider a specific cybersecurity scenario and use the Cypher query “MATCH (n:Software) WHERE n.name = 'Apache' AND n.version = '2.4.29' RETURN n” to retrieve information about a particular software node from the knowledge graph. Starting from the server software node, i.e., Apache 2.4.29, the information associated with this node is further expanded to include CVE, CWE, and CAPEC entries, as well as ATT&CK IDs extracted from threat intelligence reports. For example, this server may be associated with CVE7-9798 (a vulnerability affecting Apache HTTP servers), which allows remote attackers to execute arbitrary code through crafted requests. This vulnerability corresponds to CWE-20 (improper input validation), which can be exploited by attackers to launch CAPEC-31 (command injection) attacks, executing unexpected commands or accessing unauthorized data. By correlating relevant threat intelligence, this vulnerability can also be associated with ATT&CK technique T1190, which describes the exploitation of vulnerabilities in publicly accessible applications.
Based on these associations, we construct a branching path starting from Apache 2.4.29, passing through CVE7-9798, CWE-20, and CAPEC-31 and finally linking to ATT&CK T1190. This path not only identifies potential attack entry points but also points out specific techniques that adversaries may use. This analysis process helps security analysts better understand the nature of threats in order to propose appropriate countermeasures. Therefore, this method enables previously unconnected relationships in the knowledge graph to be discovered and predicted, providing solid support for network security defense.
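The path construction in this case study can be illustrated with a small in-memory traversal. This is a sketch under stated assumptions rather than the system described in the paper: the graph is a hand-built dictionary instead of the Neo4j database, the relation name `HasVulnerability` is hypothetical, the remaining relation names follow Table 4, and the node identifiers are copied verbatim from the text above.

```python
from typing import Dict, List, Tuple

# Adjacency list: node -> list of (relation, neighbor) pairs.
Graph = Dict[str, List[Tuple[str, str]]]

graph: Graph = {
    "Apache 2.4.29": [("HasVulnerability", "CVE7-9798")],  # hypothetical relation name
    "CVE7-9798": [("BelongsTo", "CWE-20")],
    "CWE-20": [("AttackOf", "CAPEC-31")],
    "CAPEC-31": [("CorrespondsTo", "T1190")],
}


def find_paths(g: Graph, start: str, goal: str) -> List[List[str]]:
    """Depth-first enumeration of all simple paths from start to goal."""
    stack = [[start]]
    paths: List[List[str]] = []
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            paths.append(path)
            continue
        for _relation, neighbor in g.get(node, []):
            if neighbor not in path:  # skip nodes already on the path to avoid cycles
                stack.append(path + [neighbor])
    return paths


paths = find_paths(graph, "Apache 2.4.29", "T1190")
for p in paths:
    print(" -> ".join(p))
```

In a deployed setting the same traversal would more naturally be expressed as a path query against the graph database, with the predicted links added to the graph before the search.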

5. Conclusions

In this article, we systematically study the construction of a cybersecurity knowledge graph and its application in attack path prediction. By integrating multi-source data and designing a reasonable ontology model, a structured and semantic cybersecurity knowledge graph is successfully constructed. The model based on the text-enhanced graph attention mechanism performs well in the link prediction task, effectively improving the accuracy of attack path prediction.

Author Contributions

Conceptualization and supervision, H.G.; investigation, data collection, and formal analysis, H.T.; writing—original draft preparation, B.Y.; writing—review and editing, G.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Hanjun Gao and Hang Tong were employed by the China Nuclear Power Operation Technology Corporation, Ltd. (CNPO). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Yu, P.S. A survey on knowledge graphs: Representation, acquisition and applications. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 494–514. [Google Scholar] [CrossRef] [PubMed]
  2. Alsaheel, S.; Alrubayyi, H.; Alrubayyi, T.; Alosefer, A. ATT&CK-KG: A knowledge graph of adversary techniques and tactics. In Proceedings of the 2021 IEEE International Conference on Big Data; IEEE: New York, NY, USA, 2021; pp. 3059–3068. [Google Scholar]
  3. Zhou, K.; Xie, Y.; Liu, X. A multi-hop reasoning framework for cyber threat intelligence knowledge graph. In Proceedings of the 23rd International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2024, Sanya, China, 17–21 December 2024; pp. 1037–1044. [Google Scholar]
  4. Shao, M.; Ding, Y.; Cao, J.; Li, Y. GraphFVD: Property graph-based fine-grained vulnerability detection. Comput. Secur. 2025, 151, 104350. [Google Scholar] [CrossRef]
  5. Yao, Y.; Ye, D.; Li, P.; Han, X.; Lin, Y.; Liu, Z. DocRED: A large-scale document-level relation extraction dataset. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 764–777. [Google Scholar]
  6. Ghaderyan, D.; Aybat, N.S.; Aguiar, A.P.; Pereira, F.L. A fast row-stochastic decentralized method for distributed optimization over directed graphs. IEEE Trans. Autom. Control. 2024, 69, 275–289. [Google Scholar] [CrossRef]
  7. Marin, F.; Arduin, P.E.; Merad, M. Physics-informed graph neural networks for attack path prediction. J. Cybersecur. Priv. 2025, 5, 15. [Google Scholar] [CrossRef]
  8. Wang, X.; Bryan, C.; Li, Y.; Pan, R.; Liu, Y.; Chen, W.; Ma, K.L. Umbra: A visual analysis approach for defense construction against inference attacks on sensitive information. IEEE Trans. Vis. Comput. Graph. 2022, 28, 2776–2790. [Google Scholar] [CrossRef] [PubMed]
  9. Jiang, Z.; Liu, H.; Li, J.; Li, X.; Ma, J.; Yu, P.S. Target link protection against link-prediction-based attacks via artificial bee colony algorithm based on random walk. Int. J. Mach. Learn. Cybern. 2024, 15, 4959–4971. [Google Scholar] [CrossRef]
  10. Suo, Y.; Chai, R.; Chai, S.; Garcia, M.; Xia, Y. Active response mechanism against dynamic propagation attacks in distributed sensor networks. Inf. Fusion 2026, 126, 103582. [Google Scholar] [CrossRef]
  11. Preuvneers, D.; Joosen, W. An ontology-based cybersecurity framework for AI-enabled systems and applications. Future Internet 2024, 16, 69. [Google Scholar] [CrossRef]
  12. Yan, H.; Gui, T.; Dai, J.; Guo, Q.; Qiu, X. A unified generative framework for various NER subtasks. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 5808–5822. [Google Scholar]
  13. Xiao, Y.; Zhang, S.; Zhou, H.; Li, M.; Yang, H.; Zhang, R. FuseLinker: Leveraging LLM’s pre-trained text embeddings and domain knowledge to enhance GNN-based link prediction on biomedical knowledge graphs. J. Biomed. Inform. 2024, 158, 104730. [Google Scholar] [CrossRef] [PubMed]
  14. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4–24. [Google Scholar] [CrossRef] [PubMed]
  15. Liao, N.; Wang, J.; Guan, J.; Fan, H. A multi-step attack identification and correlation method based on multi-information fusion. Comput. Electr. Eng. 2024, 117, 109249. [Google Scholar] [CrossRef]
  16. La, Z.; Qian, Y.; Leng, H.; Gu, T.; Gong, W.; Chen, J. MC-GAT: Multi-channel graph attention networks for capturing diverse information in complex graphs. Cogn. Comput. 2024, 16, 595–607. [Google Scholar] [CrossRef]
  17. Jin, B.; Zhang, Y.; Li, S.; Han, J. Bridging text data and graph data: Towards semantics and structure-aware knowledge discovery. Proceeding of the 17th International Conference on Web Search and Data Mining, Merida, Mexico, 4–8 March 2024. [Google Scholar]
  18. Cui, H.; Peng, T.; Xiao, F.; Han, J.; Han, R.; Liu, L. Incorporating anticipation embedding into reinforcement learning framework for multi-hop knowledege graph question answering. Inf. Sci. 2024, 619, 745–761. [Google Scholar] [CrossRef]
  19. Cheng, Y.; Bajaber, O.; Tsegai, S.A.; Song, D.; Gao, P. CTINexus: Automatic cyber threat intelligence knowledge graph construction using large language models. In Proceedings of the 2025 IEEE 10th European Symposium on Security and Privacy; IEEE: New York, NY, USA, 2025; pp. 923–938. [Google Scholar]
  20. Liang, K.; Liu, Y.; Zhou, S. Knowledge graph contrastive learning based on relation-symmetrical structure. IEEE Trans. Knowl. Data Eng. 2024, 36, 226–238. [Google Scholar] [CrossRef]
  21. Choi, S.; Jung, Y. Knowledge graph construction: Extraction, learning, and evaluation. Appl. Sci. 2025, 15, 3727. [Google Scholar] [CrossRef]
  22. Zhu, L.; Li, W.; Mao, R. PAED: Zero-shot persona attribute extraction in dialogues. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Stroudsburg, PA, USA, 2023; pp. 9771–9787. [Google Scholar]
  23. Zhou, D.; Zhou, B.; Zheng, Z.; Soylu, A.; Savkovic, O.; Kostylev, E.V. ScheRe: Schema Reshaping for Enhancing Knowledge Graph Construction. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management; ACM: New York, NY, USA, 2022; pp. 5074–5078. [Google Scholar]
  24. Xu, B.; Tong, R.J.; Li, Y.; Chen, P.; Li, H.; Liang, J.; Fan, X.; Tong, J. An architectural framework for educational knowledge graphs (IEEE P2807.6): Ontology design, llm integration, and adaptive learning applications. In 2025 IEEE Conference on Artificial Intelligence (CAI); IEEE: New York, NY, USA, 2025; pp. 1610–1616. [Google Scholar]
  25. Estak, M.; Heriko, M.; Druovec, T.W.; Turkanovi, M. Applying k-vertex cardinality constraints on a Neo4j graph database. Future Gener. Comput. Syst. 2021, 115, 459–474. [Google Scholar] [CrossRef]
  26. Xie, Y.; Jia, L.; Dai, J. Construction of a traditional Chinese medicine dao yin science knowledge graph based on Neo4j*. In Proceeding of the 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); IEEE: New York, NY, USA, 2023; pp. 4662–4666. [Google Scholar]
  27. Kumari, A.; Lobiyal, D.K. Efficient estimation of Hindi WSD with distributed word representation in vector space. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 6092–6103. [Google Scholar] [CrossRef]
  28. Gadusu, S.R.; McGinty, H. Semantic similarity for drug slang identification: A comparative analysis of Word2Vec and BERT. In Proceedings of the 13th Knowledge Capture Conference 2025 (K-CAP ’25); Association for Computing Machinery: New York, NY, USA, 2025; pp. 190–193. [Google Scholar]
Figure 1. CVE web page introduction.
Figure 2. ATT&CK tactical map.
Figure 3. Diagram of the ontology design model.
Figure 4. Diagram of Neo4j database storage.
Figure 5. Architecture of the text-enhanced graph attention mechanism.
Table 1. Distribution statistics of CWE relationships.

| Type | Number |
|---|---|
| BelongOf | 4003 |
| AttackOf | 1067 |
| ChildOf | 1084 |
| ParentOf | 1084 |
| CanPrecede | 129 |
| CanFollow | 129 |
| PeerOf | 113 |
| Semantic | 719 |
Table 2. Dimensional design.

| Data Dimension | Entity Type |
|---|---|
| Vulnerability dimension | Vulnerability |
| Weakness dimension | Weakness |
| Attack dimension | Attack tactics and attack techniques |
Table 3. Entity and attribute table.

| Entity Type | Entity Attributes |
|---|---|
| Vulnerability | Vulnerability name, vulnerability level, and vulnerability exploitability |
| Weakness | Weakness name, weakness description, and associated attack pattern |
| CAPEC | Attack pattern name, attack pattern description, and typical targets |
| ATT&CK | Tactic name, technique name, and procedure name |
Table 4. Relationship listing.

| Head Entity | Relationship | Tail Entity |
|---|---|---|
| CVE | BelongsTo | CWE |
| CVE | ExploitedBy | CAPEC |
| CVE | ExploitedBy | ATT&CK ID |
| CWE | AttackOf | CAPEC |
| CAPEC | Targets | CWE |
| CAPEC | CorrespondsTo | ATT&CK ID |
| ATT&CK ID | Exploits | CVE |
| ATT&CK ID | RelatesTo | CWE |
| ATT&CK ID | RelatesTo | CAPEC |
Table 5. Entity prediction results.

| Prediction | Model | MR | MRR | Hits@5 |
|---|---|---|---|---|
| Head entity <?, r, t> | TransH combined text model | 197 | 0.527 | 0.523 |
| | Text-enhanced GAT | 158 | 0.661 | 0.718 |
| Tail entity <h, r, ?> | TransH combined text model | 40 | 0.571 | 0.585 |
| | Text-enhanced GAT | 33 | 0.707 | 0.746 |
| Average result | TransH combined text model | 119 | 0.556 | 0.559 |
| | Text-enhanced GAT | 96 | 0.687 | 0.738 |
Table 6. Results of ablation experiments.

| Model | Explanation | MR | MRR | Hits@5 |
|---|---|---|---|---|
| M-1 | Remove CNN layers | 149 | 0.626 | 0.665 |
| M-2 | Remove TransE | 235 | 0.522 | 0.638 |
| M-3 | Remove attention layers | 259 | 0.476 | 0.531 |
| Our approach | Text-enhanced GAT | 91 | 0.691 | 0.740 |
Table 7. Inference times for knowledge graph model under full-graph condition.

| Retrieval Target | Average Retrieval Time (ms) | Standard Deviation (ms) | Retrieval Success Rate (%) |
|---|---|---|---|
| CVE | 42.6 | 3.2 | 100 |
| CWE | 38.4 | 2.9 | 100 |
| CAPEC | 46.8 | 4.1 | 100 |
Table 8. Inference time and performance at different subgraph scales.

| N | E | AIT-CVE (ms) | AIT-CWE (ms) | AIT-CAPEC (ms) | F1 Score (%) | CMU (GB) |
|---|---|---|---|---|---|---|
| 4000 | 50,000 | 28 | 25 | 31 | 92.5 | 1.8 |
| 200,000 | 250,000 | 35 | 32 | 38 | 90.8 | 4.2 |
| 400,000 | 500,000 | 39 | 36 | 43 | 89.6 | 6.5 |
| 800,000 | 1,000,000 | 41 | 37 | 45 | 88.2 | 9.1 |
Note: N represents the number of nodes; E represents the number of edges; AIT-CVE, AIT-CWE, and AIT-CAPEC represent the average inference time for link prediction on CVE, CWE, and CAPEC, respectively; and CMU represents CPU memory usage.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Gao, H.; Tong, H.; Yong, B.; Shen, G. A Cyber Attack Path Prediction Approach Based on a Text-Enhanced Graph Attention Mechanism. Electronics 2026, 15, 552. https://doi.org/10.3390/electronics15030552