Review Reports
- Hanjun Gao1,
- Hang Tong1 and
- Gang Shen2,3,4,*
- et al.
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Zoltán Nyikes
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper addresses a critical and timely challenge in the Internet of Things (IoT) ecosystem: transitioning from reactive security to proactive attack path prediction. While the fusion of the four major databases is a significant undertaking and the application of text-augmented GATs is a sound approach for addressing the data sparsity inherent in security graphs, the manuscript suffers from technical inconsistencies, poor mathematical typesetting, and a lack of specific IoT-contextualized evaluation:
1) Section 4.2 explicitly states that the TransE model is used for structural embedding. However, Table 5 and the subsequent discussion compare the proposed model against a "TransH combined text model." It is unclear if TransE or TransH was used in the final "Our scheme," and using different base models for the comparison without explanation undermines the experimental rigor.
2) Several formulas (Pages 10 and 11) have significant rendering issues.
3) Despite the title, the paper primarily discusses generic cybersecurity databases (CVE/CWE). The "IoT" aspect is only mentioned in the introduction and conclusion. The experimental evaluation is conducted on a "shooting range" dataset, but there is no discussion of IoT-specific protocols (MQTT, CoAP), device constraints, or network topologies that would justify the "IoT" focus.
4) Figure 2 is a low-resolution screenshot of the MITRE ATT&CK website that is barely legible. Figure 4, while showing the Neo4j interface, provides little insight into the actual schema or a specific "attack path" example.
5) The metrics (MR, MRR, Hits@5) are standard and show improvement. However, the "Average result" in Table 5 is simply the mean of two tasks (Head and Tail prediction), which doesn't add much value beyond the individual task results.
6) The writing style is procedural rather than analytical. To me, it feels more like a technical report than an article. A scientific paper is an argument for a specific hypothesis. Currently, the paper reads as: We built a graph, we ran a model, and it worked. It needs more why: Why does text-enhancement solve the specific problem of IoT data sparsity? Why did the GAT underperform on certain tail entities?
7) The Results section ends abruptly. Add a subsection on security implications: how can a network admin actually use this predicted path?
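The ranking metrics named in comment 5 can be illustrated with a minimal Python sketch. It computes MR, MRR, and Hits@k from a list of ranks and then forms the kind of head/tail average the comment refers to. The rank lists and the helper names (`rank_of`, `summarize`) are made up for illustration and are not taken from the manuscript or its Table 5.

```python
def rank_of(scores, true_id):
    """1-based rank of true_id when candidates are ordered by ascending distance."""
    true_score = scores[true_id]
    return 1 + sum(1 for s in scores.values() if s < true_score)


def summarize(ranks, k=5):
    """Mean rank, mean reciprocal rank, and Hits@k over a list of 1-based ranks."""
    n = len(ranks)
    return {
        "MR": sum(ranks) / n,
        "MRR": sum(1.0 / r for r in ranks) / n,
        f"Hits@{k}": sum(1 for r in ranks if r <= k) / n,
    }


# Illustrative ranks only (assumption): one rank per test triple.
head_ranks = [1, 3, 12, 2]
tail_ranks = [2, 7, 1, 40]

head = summarize(head_ranks)
tail = summarize(tail_ranks)

# The "Average result" row: the plain mean of the two task rows.
average = {name: (head[name] + tail[name]) / 2 for name in head}

# Pooling all queries gives the same numbers when both tasks have equally many queries.
pooled = summarize(head_ranks + tail_ranks)
```

When the head and tail tasks contain the same number of test queries, the averaged row coincides with the pooled metric, so it carries no information beyond the two task rows.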
Author Response
Response to Reviewer 1:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: Section 4.2 explicitly states that the TransE model is used for structural embedding. However, Table 5 and the subsequent discussion compare the proposed model against a "TransH combined text model." It is unclear if TransE or TransH was used in the final "Our scheme," and using different base models for the comparison without explanation undermines the experimental rigor.
Response 1: Thank you for pointing this out. TransH is an extension of TransE and a commonly used knowledge graph link prediction model, so a TransH model combined with text was selected as the baseline for comparison with the Text-enhanced GAT proposed in this paper. We have provided additional explanations in Section 4.3. Please see the blue text at the end of the first paragraph in Section 4.3. “Baseline Comparison: The TransH model is an extension of TransE model, which……with the Text-enhanced GAT proposed in this paper.”
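For readers comparing the two base models named in this exchange, the following minimal Python sketch contrasts the standard TransE and TransH scoring functions. It uses toy three-dimensional vectors chosen here for illustration; the function names are hypothetical and this is not the authors' implementation.

```python
import math


def l2(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def transe_score(h, r, t):
    """TransE distance ||h + r - t||; a lower value means a more plausible triple."""
    return l2([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def project(v, w):
    """Project v onto the hyperplane whose unit normal is w."""
    dot = sum(vi * wi for vi, wi in zip(v, w))
    return [vi - dot * wi for vi, wi in zip(v, w)]


def transh_score(h, d_r, t, w_r):
    """TransH distance: project h and t onto the relation hyperplane, then translate by d_r."""
    norm = l2(w_r)
    w = [x / norm for x in w_r]  # normalise the hyperplane normal
    return transe_score(project(h, w), d_r, project(t, w))


# Toy vectors (assumption): the tail differs from h + r only along the z axis.
h = [1.0, 0.0, 0.0]
r = [0.0, 1.0, 0.0]
t = [1.0, 1.0, 5.0]

print(transe_score(h, r, t))                    # distance counts the z offset
print(transh_score(h, r, t, [0.0, 0.0, 2.0]))   # z offset is projected away
```

The example shows the structural difference between the two models: TransH scores entities after projecting them onto a relation-specific hyperplane, so components along the normal do not contribute to the distance, whereas TransE uses the raw embeddings.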
Comments 2: Several formulas (Pages 10 and 11) have significant rendering issues.
Response 2: Thank you for pointing this out. We have revised some formulas in Pages 10 and 11. Please see the blue text.
Comments 3: Despite the title, the paper primarily discusses generic cybersecurity databases (CVE/CWE). The "IoT" aspect is only mentioned in the introduction and conclusion. The experimental evaluation is conducted on a "shooting range" dataset, but there is no discussion of IoT-specific protocols (MQTT, CoAP), device constraints, or network topologies that would justify the "IoT" focus.
Response 3: Thank you for pointing this out. We have revised the title, abstract and introduction. Please see the blue text in Title, Abstract and Introduction.
Comments 4: Figure 2 is a low-resolution screenshot of the MITRE ATT&CK website that is barely legible. Figure 4, while showing the Neo4j interface, provides little insight into the actual schema or a specific "attack path" example.
Response 4: Thank you for pointing this out. We have updated Figures 1, 2, and 4 and provided a detailed explanation of the case study. Please see the blue text in Section 4.4. “In this section, we present a real-world cybersecurity case study to illustrate how the GAT method can be applied to predict potential cyberattack paths……providing solid support for network security defense.”
Comments 5: The metrics (MR, MRR, Hits@5) are standard and show improvement. However, the "Average result" in Table 5 is simply the mean of two tasks (Head and Tail prediction), which doesn't add much value beyond the individual task results.
Response 5: Thank you for pointing this out. For individual tasks, we describe the performance of both the head entity prediction and tail entity prediction tasks, and Table 5 also reports the numerical values for each individual task. Please see Section 4.3 and Table 5. “From Table 5, it can be seen that the text-enhanced GAT model proposed in this paper……Therefore, the model of this paper has improved the prediction results for both predicting the head and predicting the tail tasks.”
Comments 6: The writing style is procedural rather than analytical. To me, it feels more like a technical report than an article. A scientific paper is an argument for a specific hypothesis. Currently, the paper reads as: We built a graph, we ran a model, and it worked. It needs more why: Why does text-enhancement solve the specific problem of IoT data sparsity? Why did the GAT underperform on certain tail entities?
Response 6: Thank you for pointing this out. The primary objective of this article is to combine a multi-source network security knowledge base with a text enhanced graph attention network and demonstrate its effectiveness on a benchmark dataset. In order to make the methods and experimental procedures transparent and reproducible, we adopted a step-by-step programmatic description, which may make the article look more like a technical report. In the manuscript, the reviewer's "why" is mainly reflected in the design principles and ablation experiments, but we believe it can be further clarified. We will conduct a more comprehensive argument in the subsequent work.
Comments 7: The Results section ends abruptly. Add a subsection on security implications: how can a network admin actually use this predicted path?
Response 7: Thank you for pointing this out. We have added a detailed explanation of the case study. Please see the blue text in Section 4.4. “In this section, we present a real-world cybersecurity case study to illustrate how the GAT method can be applied to predict potential cyberattack paths……providing solid support for network security defense.”
Special thanks to you for your good comments.
Reviewer 2 Report
Comments and Suggestions for Authors
- The title of the paper is too generic and mainly does not acknowledge the methodological depth and/or main contribution of the study.
- The abstract is weakly written and misses the core components of the abstract such as the key quantitative results, the main contribution, even the motivation of the study is not clear in the abstract. The provided abstract can be considered as a paper outline, which is not supposed to be this way.
- The introduction should strengthen the research gap and the high-level view of the proposed approach focusing on the novelty and main contributions.
- The introduction misses the critical connection between the recent related work and the proposed work in the study and the exact positioning of the study among the recent related work.
- The literature review provided in the related work section is very shallow, lacks technical depth, and is mostly descriptive. Additionally, the critical analysis of the related work should be enhanced, and it should rely on strong journal papers. Moreover, there should be strong connection between the related work and the proposed approach to emphasize the contribution of the related approach and position it among the related work. However, there are many references in this section that are either not related or out of the context of the paper.
- The methodology is generally clear but it is incomplete in fundamental aspects. It just lists what the provided approach does but does not specify why or under what circumstances, for example it misses or under-specifies:
- The justification of the different decisions used in the design of the proposed approach such as why Word2Vec and CNN are used (even though it is outdated and there are many superior alternatives), why TransE is used not TransH or ComplEx, the reason behind the adopted fusion method … etc.
- The IoT environment is actually ignored in the methodology, for example, the imposed constraints in IoT are never used in the methodology, the IoT topology, IoT attack scenarios, even though the paper claims to be used for IoT.
- The details of the dataset construction process such as number of nodes, negative sampling strategy and the Train/Testing/Validation ratios should be provided.
- The detailed architecture of the proposed approach, for example, the number of attention heads, GAT layers, how the relations are treated … etc.
- There is a major concern for the comparative analysis in the results section:
- The limited evaluation scenarios used in the results limit the validity of the provided approach.
- No complexity or run-time analysis provided.
- No scalability analysis provided.
- The provided comparative analysis is very weak: there is no justification for why TransH is used even though there are many stronger models that are worth comparing against, such as ComplEx, RotatE, or even recent text-aware KG models. The result section really needs stronger baselines to position the paper among related models and to validate the provided approach.
- The results section is mainly descriptive, it just reports numbers but misses deep analysis of these numbers or even describes why these results occur or the ramifications of these results.
- There are numerous limitations for this work, however only very few of them are acknowledged very briefly. The limitations of this work should be further discussed such as:
- The use of a single dataset.
- IoT details since the paper claims that this work is for IoT, however, no details about the IoT devices, resource limitations, network topology, or even attack scenarios.
- The limited theoretical justification of the different design choices.
- Scalability analysis
- Complexity or efficiency analysis.
- …
- The reproducibility of both the methodology and the results in this study is on the low side, mainly because much important information is missing, such as the details of dataset construction, the detailed model architecture (as indicated in the comments above), the details of the negative sampling strategy, no pseudo-code provided, no dataset release, no code repository … etc.
- Some important claims lack any citations such as:
- Line 16: “IoT devices are often characterized by limited … ”
- Line 18: “Traditional security measures, which primarily rely on detecting and blocking …”
- Line 21: “The fundamental contradiction in IoT security lies in the evolving nature …”
- …
- Some references are not peer-reviewed such as [6] and [18]. Also, the ones published in workshops did not pass through the normal peer-review process such as [7] and [9].
- Some references are irrelevant and/or out of the context to the study such as [9], [11], [12], [16], [29], and [30].
- iThenticate report shows 20% similarity index, which is fairly high, please kindly consider reducing it to the minimum possible.
- Some phrases located in the manuscript seem to be informal for academic writing such as in Lines 142, 205: “and so on”
- There are grammatical/spelling mistakes in the manuscript such as:
- Line 136: “we collects”
- Line 213: “which indicates that a specific that may be by …” the sentence is grammatically incorrect.
- Line 213: “and the of attack pattern mapping …” the sentence is grammatically incorrect.
- Line 170: ““indicating with a pre-existing …” should be ““indicating a pre-existing …”
- Line 174: “to establish Exploitation" technique” the quotation is not closed.
- Line 287: “apply the with activation …” the sentence is grammatically incorrect.
- Line 295: “For example, one entity node in the graph, is associated” comma use is incorrect.
- Line 110: “th deep integration” spelling mistake.
- Line 168, 169: “is set to ExploitedBy vs. CorrespondsTo,”, please use versus instead of vs. for better readability.
- … etc.
- There are many sentences in the manuscript that are too long and should be factorized into multiple sentences for better readability such as the sentences in lines 187-190.
- Better use passive tone instead of active tones (we, our … etc.) in academic writing.
- The paper really needs fully rigorous proofreading.
Author Response
Response to Reviewer 2:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: The title of the paper is too generic and mainly does not acknowledge the methodological depth and/or main contribution of the study.
Response 1: Thank you for pointing this out. We have revised the title of this paper. Please see the blue text in title. “A Cyber Attack Path Prediction Scheme Based on Text-Enhanced Graph Attention Mechanism”.
Comments 2: The abstract is weakly written and misses the core components of the abstract such as the key quantitative results, the main contribution, even the motivation of the study is not clear in the abstract. The provided abstract can be considered as a paper outline, which is not supposed to be this way.
Response 2: Thank you for pointing this out. We have revised the abstract. Please see the blue text in Abstract. “In order to solve the problem of traditional methods not being able to discover hidden attack trajectories, we propose a cyber attack path prediction scheme based on Text-enhanced graph attention mechanism in this paper. Specifically, we design an ontology that captures multi-dimensional links between vulnerabilities, weaknesses, attack patterns and tactics by integrating CVE, CWE, CAPEC and ATT&CK into Neo4j. Then, we inject natural language descriptions into the attention mechanism to develop a text-enhanced GAT that can alleviate data sparsity. The experiment shows that compared with existing baseline, our scheme has improved MRR and Hits@5 by 12.3% and 13.2%, respectively. Therefore, the proposed scheme can accurately predict attack paths and support active cyber defense.”
Comments 3: The introduction should strengthen the research gap and the high-level view of the proposed approach focusing on the novelty and main contributions.
Response 3: Thank you for pointing this out. We have revised the Introduction. Please see the blue text in Introduction. “With the rapid advancement of Internet technology……that are not directly observed in the entire graph, thereby more effectively predicting attack paths.”
Comments 4: The introduction misses the critical connection between the recent related work and the proposed work in the study and the exact positioning of the study among the recent related work.
Response 4: Thank you for pointing this out. We have revised the introduction section based on the problem, strengthened the discussion on the key connection between recent related work and this study, and clarified the positioning of this study in related work. Please see the blue text in Introduction. “With the rapid advancement of Internet technology……that are not directly observed in the entire graph, thereby more effectively predicting attack paths.”
Comments 5: The literature review provided in the related work section is very shallow, lacks technical depth, and is mostly descriptive. Additionally, the critical analysis of the related work should be enhanced, and it should rely on strong journal papers. Moreover, there should be strong connection between the related work and the proposed approach to emphasize the contribution of the related approach and position it among the related work. However, there are many references in this section that are either not related or out of the context of the paper.
Response 5: Thank you for pointing this out. We have revised the related work and added some high-level references for citation. Please see the blue text in Related work and References.
Comments 6: The methodology is generally clear but it is incomplete in fundamental aspects. It just lists what the provided approach does but does not specify why or under what circumstances, for example it misses or under-specifies:
1) The justification of the different decisions used in the design of the proposed approach such as why Word2Vec and CNN are used (even though it is outdated and there are many superior alternatives), why TransE is used not TransH or ComplEx, the reason behind the adopted fusion method … etc.
Response 6 1): Thank you for pointing this out. We agree that explicitly articulating the rationale behind each design decision would strengthen the methodological foundation and help readers assess the trade-offs involved. Below we clarify our choices from the perspectives of task requirements, data characteristics, stability, reproducibility, and engineering feasibility:
(1) Word2Vec and CNN for text encoding
Although transformer-based encoders (e.g., BERT, RoBERTa) can generate superior semantic representations, our scheme is designed for resource-constrained environments, where only CPU inference is used and memory usage must be low. Because Word2Vec provides lightweight word embeddings that can be precomputed and loaded, no high-performance GPU is needed. In addition, adding convolutional neural network (CNN) layers on top of fixed embeddings can efficiently extract local n-gram patterns. We chose this combination to balance accuracy and deployment simplicity. When hardware conditions permit, we will consider extending it.
(2) TransE for knowledge graph embedding
TransE has features such as simplicity, ease of training, and strong interpretability, which are helpful for debugging and parameter tuning in practical application environments. Although TransH and ComplEx can capture richer relational patterns, they require higher computational costs and more complex training processes. Considering our focus on achieving stable and reproducible link prediction on a medium-sized network security knowledge graph (tens of thousands of entities), we have adopted TransE.
(3) Fusion method selection
We adopted a feature connection and attention weighting strategy to combine the text semantics of Word2Vec+CNN with the graph structure embedding of TransE. This method not only retains the advantages of both modes, but also allows the model to adapt through attention learning, which is crucial for prediction in sparse data states.
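The fusion strategy described in item (3) can be sketched as follows, as one possible reading rather than the authors' actual model. The sketch assumes that "feature connection" means concatenating the text vector with the structural vector, and that "attention weighting" follows the usual GAT pattern of a LeakyReLU score followed by a softmax over neighbours. The learned linear transform of a real GAT layer is omitted for brevity, and all vectors, node names, and function names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, struct_vec):
    """Feature concatenation: [text || structure] (assumed reading of the response)."""
    return list(text_vec) + list(struct_vec)


def attend(center, neighbors, a):
    """GAT-style aggregation; a is the attention vector of length 2 * len(center)."""
    logits = [
        leaky_relu(sum(w * x for w, x in zip(a, center + nb)))
        for nb in neighbors
    ]
    alphas = softmax(logits)
    dim = len(center)
    aggregated = [
        sum(alpha * nb[d] for alpha, nb in zip(alphas, neighbors))
        for d in range(dim)
    ]
    return alphas, aggregated


# Toy 2-d text and 2-d structural embeddings (assumption, not real data).
cve = fuse([0.2, 0.1], [1.0, 0.0])
cwe = fuse([0.3, 0.0], [0.9, 0.1])
capec = fuse([0.0, 0.5], [0.0, 1.0])

a = [0.1] * 8  # stand-in for a learned attention vector
alphas, aggregated = attend(cve, [cwe, capec], a)
```

In this form, a node with few structural links still receives a usable representation from its description text, and the attention weights decide how much each neighbour contributes.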
2) The IoT environment is actually ignored in the methodology, for example, the imposed constraints in IoT are never used in the methodology, the IoT topology, IoT attack scenarios, even though the paper claims to be used for IoT.
Response 6 2): Thank you for pointing this out. We have revised the title, abstract and introduction to focus the proposed scheme on predicting cyber attacks. Please see the blue text in Title, Abstract and Introduction.
3) The details of the dataset construction process such as number of nodes, negative sampling strategy and the Train/Testing/Validation ratios should be provided.
Response 6 3): Thank you for pointing this out. Because the process of building the model in the article was described in Section 4.2, and the attributes of the knowledge graph were mentioned in Tables 2, 3, and 4 in Section 3.2, the specific number of nodes and edges was not mentioned. In addition, we added content on negative sampling strategy and training/testing/validation ratio in Section 4.3. Please see the blue text in Section 4.3. “Baseline Comparison : The TransH model is an extension of TransE model……minimizing potential biases.”
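Since the response does not restate the split ratios or the sampling procedure, the following Python sketch only illustrates what such a specification usually pins down. The 80/10/10 ratio, the uniform head-or-tail corruption, the seed values, and the toy triples are all assumptions made here and are not the values used in the manuscript.

```python
import random


def split_triples(triples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_valid = int(len(shuffled) * ratios[1])
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],
    )


def corrupt(triple, entities, known, rng, max_tries=100):
    """Replace the head or the tail with a random entity, skipping known positives."""
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate not in known:
            return candidate
    raise RuntimeError("no negative sample found")


# Toy graph (assumption): ten hypothetical CVE nodes mapped to three CWE nodes.
triples = [(f"CVE-{i}", "CorrespondsTo", f"CWE-{i % 3}") for i in range(10)]
entities = sorted({e for h, _, t in triples for e in (h, t)})
known = set(triples)

train, valid, test = split_triples(triples)
negative = corrupt(train[0], entities, known, random.Random(1))
```

Uniform corruption of this kind does not enforce entity types, so a type-constrained sampler would be a natural refinement for a graph with distinct CVE, CWE, CAPEC, and ATT&CK node classes.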
4) The detailed architecture of the proposed approach, for example, the number of attention heads, GAT layers, how the relations are treated … etc.
Response 6 4): Thank you for pointing this out. The process of building the model in the article was described in Section 4.2.
Comments 7: There is a major concern for the comparative analysis in the results section:
1) The limited evaluation scenarios used in the results limit the validity of the provided approach.
Response 7 1): Thank you for pointing this out. We have added the Baseline Comparison in Section 4.3 and Section 4.4 Experimental Case Study to strengthen the evaluation of proposed scheme. Please see the blue text in Section 4.3 and Section 4.4. “Baseline Comparison: The TransH model is an extension of TransE model……minimizing potential biases.”, “In this section, we present a real-world cybersecurity……providing solid support for network security defense.”
2) No complexity or run-time analysis provided.
Response 7 2): Thank you for pointing this out. We will strengthen the complexity or run-time analysis in our future work.
3) No scalability analysis provided.
Response 7 3): Thank you for pointing this out. We will strengthen the scalability analysis in our future work.
4) The provided comparative analysis is very weak: there is no justification for why TransH is used even though there are many stronger models that are worth comparing against, such as ComplEx, RotatE, or even recent text-aware KG models. The result section really needs stronger baselines to position the paper among related models and to validate the provided approach.
Response 7 4): Thank you for pointing this out. TransH is an extension of TransE and a commonly used knowledge graph link prediction model, so a TransH model combined with text was selected as the baseline for comparison with the Text-enhanced GAT proposed in this paper. We have provided additional explanations in Section 4.3. Please see the blue text at the end of the first paragraph in Section 4.3. “Baseline Comparison: The TransH model is an extension of TransE model, which……with the Text-enhanced GAT proposed in this paper.”
5) The results section is mainly descriptive, it just reports numbers but misses deep analysis of these numbers or even describes why these results occur or the ramifications of these results.
Response 7 5): Thank you for pointing this out. We have added the analysis of results in Section 4.3 to enhance the feasibility of our scheme. Please see the blue text in Section 4.3. “This improvement indicates that text-enhanced GAT can more effectively……textual information helps improve the ranking of tail entities.”
Comments 8: There are numerous limitations for this work, however only very few of them are acknowledged very briefly. The limitations of this work should be further discussed such as:
1) The use of a single dataset.
Response 8 1): Thank you for pointing this out. We will conduct evaluations on multiple datasets in our future work.
2) IoT details since the paper claims that this work is for IoT, however, no details about the IoT devices, resource limitations, network topology, or even attack scenarios.
Response 8 2): Thank you for pointing this out. We have revised the Title, Introduction and Related work to fit the article. Please see the blue text in Title, Introduction and Related work.
3) The limited theoretical justification of the different design choices.
Response 8 3): Thank you for pointing this out. We will strengthen the theoretical justification in our future work.
4) Scalability analysis
Response 8 4): Thank you for pointing this out. We will strengthen the scalability analysis in our future work.
5) Complexity or efficiency analysis.
Response 8 5): Thank you for pointing this out. We will strengthen the complexity and efficiency analysis in our future work.
Comments 9: The reproducibility of both the methodology and the results in this study is on the low side, mainly because much important information is missing, such as the details of dataset construction, the detailed model architecture (as indicated in the comments above), the details of the negative sampling strategy, no pseudo-code provided, no dataset release, no code repository … etc.
Response 9: Thank you for pointing this out. The process of building the model in the article was described in Section 4.2, and the attributes of the knowledge graph were mentioned in Tables 2, 3, and 4 in Section 3.2. In addition, we added content on negative sampling strategy and training/testing/validation ratio in Section 4.3. Please see the blue text in Section 4.3. “Baseline Comparison : The TransH model is an extension of TransE model……minimizing potential biases.”
In addition, we added the Figure of architecture of text-enhanced graph attention mechanism. Please see Figure 5.
Comments 10: Some important claims lack any citations such as:
1) Line 16: “IoT devices are often characterized by limited … ”
2) Line 18: “Traditional security measures, which primarily rely on detecting and blocking …”
3) Line 21: “The fundamental contradiction in IoT security lies in the evolving nature …”
Response 10: Thank you for pointing this out. We have modified the Introduction and added some new References.
Comments 11: Some references are not peer-reviewed such as [6] and [18]. Also, the ones published in workshops did not pass through the normal peer-review process such as [7] and [9].
Response 11: Thank you for pointing this out. We have revised these references. Please see the blue text in References.
Comments 12: Some references are irrelevant and/or out of the context to the study such as [9], [11], [12], [16], [29], and [30].
Response 12: Thank you for pointing this out. We have revised these references. Please see the blue text in References.
Comments 13: iThenticate report shows 20% similarity index, which is fairly high, please kindly consider reducing it to the minimum possible.
Response 13: Thank you for pointing this out. We have revised the duplicate parts of the paper.
Comments 14: Comments on the Quality of English Language
1) Some phrases located in the manuscript seem to be informal for academic writing such as in Lines 142, 205: “and so on”
Response 14 1): Thank you for pointing this out. We have made the modifications mentioned above and highlighted them in blue in the text.
2) There are grammatical/spelling mistakes in the manuscript such as:
2-1) Line 136: “we collects”
2-2) Line 213: “which indicates that a specific that may be by …” the sentence is grammatically incorrect.
2-3) Line 213: “and the of attack pattern mapping …” the sentence is grammatically incorrect.
2-4) Line 170: ““indicating with a pre-existing …” should be ““indicating a pre-existing …”
2-5) Line 174: “to establish Exploitation" technique” the quotation is not closed.
2-6) Line 287: “apply the with activation …” the sentence is grammatically incorrect.
2-7) Line 295: “For example, one entity node in the graph, is associated” comma use is incorrect.
2-8) Line 110: “th deep integration” spelling mistake.
2-9) Line 168, 169: “is set to ExploitedBy vs. CorrespondsTo,”, please use versus instead of vs. for better readability.
Response 14 2): Thank you for pointing this out. We have made the modifications mentioned above and highlighted them in blue in the text.
3) There are many sentences in the manuscript that are too long and should be factorized into multiple sentences for better readability such as the sentences in lines 187-190.
Response 14 3): Thank you for pointing this out. We have addressed similar issues in the text.
4) Better use passive tone instead of active tones (we, our … etc.) in academic writing.
Response 14 4): Thank you for pointing this out. We have addressed similar issues in the text.
5) The paper really needs fully rigorous proofreading.
Response 14 5): Thank you for pointing this out. We have carefully proofread the entire text for grammar.
Special thanks to you for your good comments.
Reviewer 3 Report
Comments and Suggestions for Authors
Peer Review
While reading the manuscript, it became clear that the authors are addressing a problem that is both timely and practically relevant for IoT security research. Modelling attack paths in environments where information is fragmented across different sources is not a trivial task, and the decision to rely on a knowledge graph as the backbone of the analysis feels well justified. The paper is generally easy to follow, and I did not find myself struggling to understand how the different parts connect.
One aspect I found particularly useful is the way established security taxonomies are brought together. CVE, CWE, CAPEC and ATT&CK are often used in isolation, and combining them into a single graph is a sensible step if the goal is to reason about multi-stage attacks. I initially expected the work to remain close to earlier graph-based approaches, but the use of textual information within the attention mechanism gives the model a more distinctive flavour and makes the approach feel less generic.
The technical sections are detailed and reflect a good understanding of both ontology design and graph neural networks. At times, however, the level of detail becomes quite dense. A short, informal explanation of the main design choices—before diving into formal definitions—would help readers who are less familiar with this type of model. This is not a major issue, but it would improve accessibility.
The experimental evaluation is carefully carried out. The reported improvements over baseline methods are supported by the results, and the inclusion of ablation experiments is particularly helpful in understanding which components actually matter. Rather than pushing strong claims, the discussion remains cautious, which I appreciated. The paper would benefit from a brief comment on how the approach might scale as the size of the knowledge graph grows, especially in real-world deployments.
From a presentation point of view, the manuscript is in good shape. Most figures and tables serve their purpose well, although a few technical passages could be streamlined to improve flow. Minor language edits would help in places, but there is no need for structural changes.
Overall, this is a solid piece of work that addresses a relevant problem with a technically sound solution. If the authors clarify a few design decisions and slightly improve readability in the denser sections, I see no major obstacles to publication.
Recommendation: Minor revision.
Author Response
Response to Reviewer 3:
Thanks for your comments on our paper. We have revised our paper according to your comments. Please see the Responses to Reviewer 1 and Reviewer 2.
Special thanks to you for your good comments.
Response to Reviewer 1:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: Section 4.2 explicitly states that the TransE model is used for structural embedding. However, Table 5 and the subsequent discussion compare the proposed model against a "TransH combined text model." It is unclear if TransE or TransH was used in the final "Our scheme," and using different base models for the comparison without explanation undermines the experimental rigor.
Response 1: Thank you for pointing this out. Because TransH is an extension of TransE and a commonly used knowledge graph link prediction model, the TransH model combined with text was selected for comparison with the Text-enhanced GAT proposed in this paper. We have provided additional explanations in Section 4.3. Please see the blue text at the end of the first paragraph in Section 4.3. “Baseline Comparison: The TransH model is an extension of TransE model, which……with the Text-enhanced GAT proposed in this paper.”
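For readers unfamiliar with the two models named in this response, the following minimal sketch shows how TransH extends TransE: TransE scores a triple by the distance between h + r and t, while TransH first projects h and t onto a relation-specific hyperplane. The vectors are toy values chosen for illustration and are not taken from the manuscript; the function names are hypothetical.

```python
import math


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2 (lower means more plausible)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def project_to_hyperplane(v, w):
    """Project v onto the hyperplane whose normal vector is w."""
    norm = math.sqrt(sum(x * x for x in w))
    unit = [x / norm for x in w]  # TransH constrains the normal to unit length
    dot = sum(vi * ui for vi, ui in zip(v, unit))
    return [vi - dot * ui for vi, ui in zip(v, unit)]


def transh_score(h, d_r, t, w_r):
    """TransH distance: a TransE score computed on the projected entities."""
    return transe_score(project_to_hyperplane(h, w_r), d_r,
                        project_to_hyperplane(t, w_r))


# Toy example (illustrative values only): h and t differ along the third
# axis, which the relation's hyperplane ignores.
h = [1.0, 0.0, 5.0]
t = [1.0, 1.0, -3.0]
d_r = [0.0, 1.0, 0.0]   # translation vector lying in the hyperplane
w_r = [0.0, 0.0, 1.0]   # hyperplane normal

print("TransE:", transe_score(h, d_r, t))
print("TransH:", transh_score(h, d_r, t, w_r))
```

The projection is what lets TransH give one entity different representations under different relations, at the cost of one extra vector per relation.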
Comments 2: Several formulas (Pages 10 and 11) have significant rendering issues.
Response 2: Thank you for pointing this out. We have revised some formulas in Pages 10 and 11. Please see the blue text.
Comments 3: Despite the title, the paper primarily discusses generic cybersecurity databases (CVE/CWE). The "IoT" aspect is only mentioned in the introduction and conclusion. The experimental evaluation is conducted on a "shooting range" dataset, but there is no discussion of IoT-specific protocols (MQTT, CoAP), device constraints, or network topologies that would justify the "IoT" focus.
Response 3: Thank you for pointing this out. We have revised the title, abstract and introduction. Please see the blue text in Title, Abstract and Introduction.
Comments 4: Figure 2 is a low-resolution screenshot of the MITRE ATT&CK website that is barely legible. Figure 4, while showing the Neo4j interface, provides little insight into the actual schema or a specific "attack path" example.
Response 4: Thank you for pointing this out. We have updated Figures 1, 2, and 4 and provided a detailed explanation of the case study. Please see the blue text in Section 4.4. “In this section, we present a real-world cybersecurity case study to illustrate how the GAT method can be applied to predict potential cyberattack paths……providing solid support for network security defense.”
Comments 5: The metrics (MR, MMR, Hits@5) are standard and show improvement. However, the "Average result" in Table 5 is simply the mean of two tasks (Head and Tail prediction), which doesn't add much value beyond the individual task results.
Response 5: Thank you for pointing this out. For the individual tasks, we describe the performance on both head entity prediction and tail entity prediction, and report the numerical values for each task in Table 5. Please see Section 4.3 and Table 5. “From Table 5, it can be seen that the text-enhanced GAT model proposed in this paper……Therefore, the model of this paper has improved the prediction results for both predicting the head and predicting the tail tasks.”
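As a point of reference for the metrics discussed in this comment, the sketch below computes MR, MRR, and Hits@5 from lists of ranks for the head and tail prediction tasks, and then the per-metric mean that an "Average result" row would contain. The rank lists are invented toy values, not results from the manuscript.

```python
def evaluate(ranks, k=5):
    """Compute standard link prediction metrics from 1-based ranks."""
    n = len(ranks)
    return {
        "MR": sum(ranks) / n,                               # mean rank
        "MRR": sum(1.0 / r for r in ranks) / n,             # mean reciprocal rank
        f"Hits@{k}": sum(1 for r in ranks if r <= k) / n,   # share ranked in top k
    }


def average_results(head, tail):
    """Per-metric mean of the head and tail prediction results."""
    return {name: (head[name] + tail[name]) / 2 for name in head}


# Toy ranks of the correct entity for four test triples per task
# (illustrative values only).
head_ranks = [1, 3, 12, 2]
tail_ranks = [2, 7, 1, 40]

head = evaluate(head_ranks)
tail = evaluate(tail_ranks)
avg = average_results(head, tail)

for label, result in (("head", head), ("tail", tail), ("average", avg)):
    print(label, result)
```

When both tasks contain the same number of test triples, this average equals the metric computed over the pooled ranks, which is why it carries no information beyond the two individual rows.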
Comments 6: The writing style is procedural rather than analytical. To me, it feels more like a technical report than an article. A scientific paper is an argument for a specific hypothesis. Currently, the paper reads as: We built a graph, we ran a model, and it worked. It needs more why: Why does text-enhancement solve the specific problem of IoT data sparsity? Why did the GAT underperform on certain tail entities?
Response 6: Thank you for pointing this out. The primary objective of this article is to combine a multi-source network security knowledge base with a text-enhanced graph attention network and demonstrate its effectiveness on a benchmark dataset. In order to make the methods and experimental procedures transparent and reproducible, we adopted a step-by-step procedural description, which may make the article look more like a technical report. In the manuscript, the reviewer's "why" is mainly reflected in the design principles and ablation experiments, but we believe it can be further clarified. We will present a more comprehensive argument in subsequent work.
Comments 7: The Results section ends abruptly. Add a subsection on security implications: how can a network admin actually use this predicted path?
Response 7: Thank you for pointing this out. We have added a detailed explanation of the case study. Please see the blue text in Section 4.4. “In this section, we present a real-world cybersecurity case study to illustrate how the GAT method can be applied to predict potential cyberattack paths……providing solid support for network security defense.”
Special thanks to you for your good comments.
Response to Reviewer 2:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: The title of the paper is too generic and mainly does not acknowledge the methodological depth and/or main contribution of the study.
Response 1: Thank you for pointing this out. We have revised the title of this paper. Please see the blue text in title. “A Cyber Attack Path Prediction Scheme Based on Text-Enhanced Graph Attention Mechanism”.
Comments 2: The abstract is weakly written and misses the core components of the abstract such as the key quantitative results, the main contribution, even the motivation of the study is not clear in the abstract. The provided abstract can be considered as a paper outline, which is not supposed to be this way.
Response 2: Thank you for pointing this out. We have revised the abstract. Please see the blue text in Abstract. “In order to solve the problem of traditional methods not being able to discover hidden attack trajectories, we propose a cyber attack path prediction scheme based on Text-enhanced graph attention mechanism in this paper. Specifically, we design an ontology that captures multi-dimensional links between vulnerabilities, weaknesses, attack patterns and tactics by integrating CVE, CWE, CAPEC and ATT&CK into Neo4j. Then, we inject natural language descriptions into the attention mechanism to develop a text-enhanced GAT that can alleviate data sparsity. The experiment shows that compared with existing baseline, our scheme has improved MRR and Hits@5 by 12.3% and 13.2%, respectively. Therefore, the proposed scheme can accurately predict attack paths and support active cyber defense.”
Comments 3: The introduction should strengthen the research gap and the high-level view of the proposed approach focusing on the novelty and main contributions.
Response 3: Thank you for pointing this out. We have revised the Introduction. Please see the blue text in Introduction. “With the rapid advancement of Internet technology……that are not directly observed in the entire graph, thereby more effectively predicting attack paths.”
Comments 4: The introduction misses the critical connection between the recent related work and the proposed work in the study and the exact positioning of the study among the recent related work.
Response 4: Thank you for pointing this out. We have revised the introduction section based on the problem, strengthened the discussion on the key connection between recent related work and this study, and clarified the positioning of this study in related work. Please see the blue text in Introduction. “With the rapid advancement of Internet technology……that are not directly observed in the entire graph, thereby more effectively predicting attack paths.”
Comments 5: The literature review provided in the related work section is very shallow, lacks technical depth, and is mostly descriptive. Additionally, the critical analysis of the related work should be enhanced, and it should rely on strong journal papers. Moreover, there should be a strong connection between the related work and the proposed approach to emphasize the contribution of the related approach and position it among the related work. However, there are many references in this section that are either not related or out of the context of the paper.
Response 5: Thank you for pointing this out. We have revised the related work and added some high-level references for citation. Please see the blue text in Related work and References.
Comments 6: The methodology is generally clear but it is incomplete in fundamental aspects. It just lists what the provided approach does but does not specify why or under what circumstances. For example, it misses or under-specifies:
1) The justification of the different decisions used in the design of the proposed approach such as why Word2Vec and CNN are used (even though it is outdated and there are many superior alternatives), why TransE is used not TransH or ComplEx, the reason behind the adopted fusion method … etc.
Response 6 1): Thank you for pointing this out. We agree that explicitly articulating the rationale behind each design decision would strengthen the methodological foundation and help readers assess the trade-offs involved. Below we clarify our choices from the perspectives of task requirements, data characteristics, stability, reproducibility, and engineering feasibility:
(1) Word2Vec and CNN for text encoding
Although transformer-based encoders (e.g., BERT, RoBERTa) can generate superior semantic representations, our scheme is designed for resource‑constrained environments, where only CPU inference is used and memory usage must be low. Because Word2Vec provides lightweight word embeddings that can be precomputed and loaded, no high-performance GPU is needed. In addition, convolutional neural network (CNN) layers on top of fixed embeddings can efficiently extract local n-gram patterns. We chose this combination to balance accuracy and deployment simplicity. When hardware conditions permit, we will consider extending it.
(2) TransE for knowledge graph embedding
TransE has features such as simplicity, ease of training, and strong interpretability, which are helpful for debugging and parameter tuning in practical application environments. Although TransH and ComplEx can capture richer relational patterns, they require higher computational costs and more complex training processes. Considering our focus on achieving stable and reproducible link prediction on a medium-sized network security knowledge graph (tens of thousands of entities), we have adopted TransE.
(3) Fusion method selection
We adopted a feature connection and attention weighting strategy to combine the text semantics of Word2Vec+CNN with the graph structure embedding of TransE. This method not only retains the advantages of both modalities, but also allows the model to adapt their relative contributions through attention learning, which is crucial for prediction when data are sparse.
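To make the three design choices above more concrete, here is a minimal sketch of a text encoder that convolves over fixed word vectors with max-over-time pooling, followed by an attention-weighted concatenation with a structural embedding. It is written under several assumptions that the response does not confirm: that "feature connection" means concatenation, that the attention weights come from a softmax over one score per modality, and that the query vectors (`query_text`, `query_struct`) are learned parameters. All names and numbers are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conv1d_max_pool(token_vecs, filters, window=2):
    """Apply each filter to every window of consecutive token vectors,
    pass the response through ReLU, and keep the maximum per filter."""
    pooled = []
    for f in filters:  # each filter is a flat list of length window * dim
        best = 0.0     # ReLU floor
        for i in range(len(token_vecs) - window + 1):
            patch = [x for vec in token_vecs[i:i + window] for x in vec]
            best = max(best, dot(patch, f))
        pooled.append(best)
    return pooled


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, struct_vec, query_text, query_struct):
    """Weight each modality by a softmax attention score, then concatenate.
    (Assumed fusion scheme; the manuscript's exact formula is not given here.)"""
    a_text, a_struct = softmax([dot(text_vec, query_text),
                                dot(struct_vec, query_struct)])
    return [a_text * x for x in text_vec] + [a_struct * x for x in struct_vec]


# Toy inputs: three tokens with 2-d precomputed word vectors, one bigram
# filter, and a 2-d structural embedding (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [[1.0, 0.0, 0.0, 1.0]]
text_vec = conv1d_max_pool(tokens, filters)
struct_vec = [1.0, 1.0]

fused = fuse(text_vec, struct_vec, query_text=[0.0], query_struct=[0.0, 0.0])
print(text_vec, fused)
```

Because the word vectors are fixed and the convolution is a handful of dot products, this kind of encoder runs comfortably on a CPU, which is the deployment constraint the response cites.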
2) The IoT environment is actually ignored in the methodology, for example, the imposed constraints in IoT are never used in the methodology, the IoT topology, IoT attack scenarios, even though the paper claims to be used for IoT.
Response 6 2): Thank you for pointing this out. We have revised the title, abstract and introduction to focus the proposed scheme on predicting cyber attacks. Please see the blue text in Title, Abstract and Introduction.
3) The details of the dataset construction process such as number of nodes, negative sampling strategy and the Train/Testing/Validation ratios should be provided.
Response 6 3): Thank you for pointing this out. The process of building the model is described in Section 4.2, and the attributes of the knowledge graph are given in Tables 2, 3, and 4 in Section 3.2; for this reason, the specific number of nodes and edges was not reported. In addition, we added content on the negative sampling strategy and the training/testing/validation ratio in Section 4.3. Please see the blue text in Section 4.3. “Baseline Comparison : The TransH model is an extension of TransE model……minimizing potential biases.”
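The response does not state the split ratio or the sampling rule, so the following is only a sketch of one common setup for link prediction: a shuffled 8:1:1 split and filtered negative sampling that replaces the head or the tail with a random entity and rejects corruptions that are known true triples. The ratio, the entity identifiers, and the function names are assumptions for illustration.

```python
import random


def split_triples(triples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and split triples into train/validation/test (assumed 8:1:1)."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_valid = int(len(shuffled) * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


def corrupt(triple, entities, known, rng, max_tries=100):
    """Return a negative triple by replacing the head or the tail with a
    random entity, skipping candidates that are known true triples."""
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate not in known:
            return candidate
    return None  # no valid negative found within max_tries


# Toy graph with placeholder identifiers (not real CVE/CAPEC entries).
triples = [(f"CVE-{i}", "ExploitedBy", f"CAPEC-{i % 3}") for i in range(10)]
entities = sorted({e for h, _, t in triples for e in (h, t)})
known = set(triples)

train, valid, test = split_triples(triples)
rng = random.Random(1)
negatives = [corrupt(tr, entities, known, rng) for tr in train]
print(len(train), len(valid), len(test))
```

Filtering against the set of known triples keeps true facts from being presented to the model as negatives, which is one way to reduce the kind of bias the quoted passage refers to.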
4) The detailed architecture of the proposed approach, for example, the number of attention heads, GAT layers, how the relations are treated … etc.
Response 6 4): Thank you for pointing this out. The process of building the model in the article was described in Section 4.2.
Comments 7: There is a major concern for the comparative analysis in the results section:
1) The limited evaluation scenarios used in the results limit the validity of the provided approach.
Response 7 1): Thank you for pointing this out. We have added the Baseline Comparison in Section 4.3 and Section 4.4 Experimental Case Study to strengthen the evaluation of the proposed scheme. Please see the blue text in Section 4.3 and Section 4.4. “Baseline Comparison: The TransH model is an extension of TransE model……minimizing potential biases.”, “In this section, we present a real-world cybersecurity……providing solid support for network security defense.”
2) No complexity or run-time analysis provided.
Response 7 2): Thank you for pointing this out. We will strengthen the complexity or run-time analysis in our future work.
3) No scalability analysis provided.
Response 7 3): Thank you for pointing this out. We will strengthen the scalability analysis in our future work.
4) The provided comparative analysis is very weak; there is no justification why TransH is used even though there are many stronger models that are worth comparing against, such as ComplEx, RotatE or even recent text-aware KG models. The result section really needs stronger baselines to position the paper among related models and to validate the provided approach.
Response 7 4): Thank you for pointing this out. Because TransH is an extension of TransE and a commonly used knowledge graph link prediction model, the TransH model combined with text was selected for comparison with the Text-enhanced GAT proposed in this paper. We have provided additional explanations in Section 4.3. Please see the blue text at the end of the first paragraph in Section 4.3. “Baseline Comparison: The TransH model is an extension of TransE model, which……with the Text-enhanced GAT proposed in this paper.”
5) The results section is mainly descriptive, it just reports numbers but misses deep analysis of these numbers or even describes why these results occur or the ramifications of these results.
Response 7 5): Thank you for pointing this out. We have added the analysis of results in Section 4.3 to enhance the feasibility of our scheme. Please see the blue text in Section 4.3. “This improvement indicates that text-enhanced GAT can more effectively……textual information helps improve the ranking of tail entities.”
Comments 8: There are numerous limitations for this work, however only very few of them are acknowledged very briefly. The limitations of this work should be further discussed such as:
1) The use of a single dataset.
Response 8 1): Thank you for pointing this out. We will conduct evaluations on multiple datasets in our future work.
2) IoT details, since the paper claims that this work is for IoT; however, there are no details about the IoT devices, resource limitations, network topology, or even attack scenarios.
Response 8 2): Thank you for pointing this out. We have revised the Title, Introduction and Related work to fit the article. Please see the blue text in Title, Introduction and Related work.
3) The limited theoretical justification of the different design choices.
Response 8 3): Thank you for pointing this out. We will strengthen the theoretical justification in our future work.
4) Scalability analysis
Response 8 4): Thank you for pointing this out. We will strengthen the scalability analysis in our future work.
5) Complexity or efficiency analysis.
Response 8 5): Thank you for pointing this out. We will strengthen the complexity and efficiency analysis in our future work.
Comments 9: The reproducibility of both the methodology and the results in this study is on the low side, mainly because much important information is missing, such as the details of dataset construction, the detailed model architecture (as indicated in the comments above), the details of the negative sampling strategy, no pseudo-code provided, no dataset release, no code repository … etc.
Response 9: Thank you for pointing this out. The process of building the model in the article was described in Section 4.2, and the attributes of the knowledge graph were mentioned in Tables 2, 3, and 4 in Section 3.2. In addition, we added content on negative sampling strategy and training/testing/validation ratio in Section 4.3. Please see the blue text in Section 4.3. “Baseline Comparison : The TransH model is an extension of TransE model……minimizing potential biases.”
In addition, we added a figure showing the architecture of the text-enhanced graph attention mechanism. Please see Figure 5.
Comments 10: Some important claims lack any citations such as:
1) Line 16: “IoT devices are often characterized by limited … ”
2) Line 18: “Traditional security measures, which primarily rely on detecting and blocking …”
3) Line 21: “The fundamental contradiction in IoT security lies in the evolving nature …”
Response 10: Thank you for pointing this out. We have modified the Introduction and added some new References.
Comments 11: Some references are not peer-reviewed such as [6] and [18]. Also, the ones published in workshops did not pass through the normal peer-review process such as [7] and [9].
Response 11: Thank you for pointing this out. We have revised these references. Please see the blue text in References.
Comments 12: Some references are irrelevant and/or out of the context to the study such as [9], [11], [12], [16], [29], and [30].
Response 12: Thank you for pointing this out. We have revised these references. Please see the blue text in References.
Comments 13: iThenticate report shows 20% similarity index, which is fairly high, please kindly consider reducing it to the minimum possible.
Response 13: Thank you for pointing this out. We have revised the duplicate parts of the paper.
Comments 14: Comments on the Quality of English Language
1) Some phrases in the manuscript seem too informal for academic writing, such as “and so on” in Lines 142 and 205.
Response 14 1): Thank you for pointing this out. We have made the modifications mentioned above and highlighted them in blue in the text.
2) There are grammatical/spelling mistakes in the manuscript such as:
2-1) Line 136: “we collects”
2-2) Line 213: “which indicates that a specific that may be by …” the sentence is grammatically incorrect.
2-3) Line 213: “and the of attack pattern mapping …” the sentence is grammatically incorrect.
2-4) Line 170: “indicating with a pre-existing …” should be “indicating a pre-existing …”
2-5) Line 174: “to establish Exploitation" technique” the quotation is not closed.
2-6) Line 287: “apply the with activation …” the sentence is grammatically incorrect.
2-7) Line 295: “For example, one entity node in the graph, is associated” comma use is incorrect.
2-8) Line 110: “th deep integration” spelling mistake.
2-9) Line 168, 169: “is set to ExploitedBy vs. CorrespondsTo,”, please use versus instead of vs. for better readability.
Response 14 2): Thank you for pointing this out. We have made the modifications mentioned above and highlighted them in blue in the text.
3) There are many sentences in the manuscript that are too long and should be factorized into multiple sentences for better readability such as the sentences in lines 187-190.
Response 14 3): Thank you for pointing this out. We have addressed similar issues in the text.
4) Better use passive tone instead of active tones (we, our … etc.) in academic writing.
Response 14 4): Thank you for pointing this out. We have addressed similar issues in the text.
5) The paper really needs fully rigorous proofreading.
Response 14 5): Thank you for pointing this out. We have carefully proofread the entire text for grammar.
Special thanks to you for your good comments.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have addressed all major concerns. The paper is now a good contribution to the field of cybersecurity graph learning. The remaining minor revisions are simply to polish the paper:
1) Throughout Section 3.3 and 4, remove references to "Neo4j Desktop 1.4.15" and "local server passwords." These are procedural report details that distract from the scientific contribution.
2) Ensure the abstract is fully updated to match the new title and the specific metrics mentioned in the results.
3) Ensure Figure 2 is high-resolution in the final print version to ensure the text in the MITRE ATT&CK table is legible.
Author Response
Response to Reviewer 1:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: Throughout Section 3.3 and 4, remove references to "Neo4j Desktop 1.4.15" and "local server passwords." These are procedural report details that distract from the scientific contribution.
Response 1: Thank you for pointing this out. We have removed the Reference to [26].
Comments 2: Ensure the abstract is fully updated to match the new title and the specific metrics mentioned in the results.
Response 2: Thank you for pointing this out. We have ensured that the abstract is fully updated and matches the new title, as well as the specific indicators mentioned in the results.
Comments 3: Ensure Figure 2 is high-resolution in the final print version to ensure the text in the MITRE ATT&CK table is legible.
Response 3: Thank you for pointing this out. We have confirmed that the clarity of Figure 2 meets the publishing requirements.
Special thanks to you for your good comments.
Reviewer 2 Report
Comments and Suggestions for Authors
The updated manuscript has been improved; however, several of the fundamental comments in the previous review report are not adequately addressed, such as:
- The title of the paper has been modified; however, it is better to replace the word “scheme” with “approach”, since the former seems very informal.
- Still the comparative analysis in the results section should be elaborated as indicated in the previous review report, especially in the following points:
- The limited evaluation scenarios used in the results limit the validity of the provided approach, more scenarios should be provided to capture the different aspects of the problem under different circumstances.
- No complexity or run-time analysis provided, it cannot be left for future work as indicated in authors’ response since it is a part of this work.
- No scalability analysis provided, it cannot be left for future work as indicated in authors’ response since it is a part of this work.
- The provided comparative analysis should be elaborated as explained in the previous review report.
- The limitations of this work should be discussed explicitly as indicated in the previous review report.
- The reproducibility of both the methodology and the results in this study is still on the low side, as indicated in the previous review report.
- Reference [6] is a workshop paper, which did not pass through the normal peer-review process, so it is better to be replaced with a stronger peer-reviewed alternative.
Author Response
Response to Reviewer 2:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: The title of the paper has been modified; however, it is better to replace the word “scheme” with “approach”, since the former seems very informal.
Response 1: Thank you for pointing this out. We have replaced the word “scheme” with “approach” in the title and throughout the paper.
Comments 2: Still the comparative analysis in the results section should be elaborated as indicated in the previous review report, especially in the following points:
1) The limited evaluation scenarios used in the results limit the validity of the provided approach, more scenarios should be provided to capture the different aspects of the problem under different circumstances.
Response 2 1): Thank you for pointing this out. At present, this research has a strong correlation with our actual work, mainly analyzing the business scenarios of our applications. In the future, we will strengthen scenario evaluation and improve the universality of the approach.
2) No complexity or run-time analysis provided, it cannot be left for future work as indicated in authors’ response since it is a part of this work.
Response 2 2): Thank you for pointing this out. We have added run-time analysis and scalability analysis in Section 4.3. Please see the blue text in Section 4.3. “Inference Time Analysis: In this study, we evaluate the inference performance of a knowledge graph-based model,……These findings confirm that the proposed method sustains real-time inference capability across a wide range of graph sizes, a key advantage for practical deployment in cybersecurity scenarios.” Please see Table 7 and 8.
3) No scalability analysis provided, it cannot be left for future work as indicated in authors’ response since it is a part of this work.
Response 2 3): Thank you for pointing this out. We have added the scalability analysis in Section 4.3. Please see the blue text in Section 4.3. “Scalability Analysis: To evaluate the scalability of the proposed text-enhanced graph attention mechanism, we extend the inference time analysis by evaluating model performance on subgraphs of varying sizes extracted from the full knowledge graph……These findings confirm that the proposed method sustains real-time inference capability across a wide range of graph sizes, a key advantage for practical deployment in cybersecurity scenarios.” Please see Table 8.
4) The provided comparative analysis should be elaborated as explained in the previous review report.
Response 2 4): Thank you for pointing this out. We have added run-time analysis and scalability analysis in Section 4.3. Please see the blue text in Section 4.3. “Inference Time Analysis: In this study, we evaluate the inference performance of a knowledge graph-based model,……These findings confirm that the proposed method sustains real-time inference capability across a wide range of graph sizes, a key advantage for practical deployment in cybersecurity scenarios.” Please see Tables 7 and 8.
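The kind of measurement described in the inference time and scalability responses above can be sketched as follows: sample subgraphs of increasing size from the full set of triples and record the per-triple inference time on each. The scoring function here is a trivial stand-in rather than the text-enhanced GAT, and the graph, sizes, and function names are hypothetical.

```python
import random
import time


def sample_subgraph(triples, size, seed=0):
    """Draw a random subset of triples to stand in for a smaller graph."""
    return random.Random(seed).sample(triples, size)


def time_inference(score_fn, triples, repeats=3):
    """Return the best-of-`repeats` average time per triple, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for triple in triples:
            score_fn(triple)
        best = min(best, time.perf_counter() - start)
    return best / len(triples)


def placeholder_score(triple):
    """Stand-in for model inference; a real run would call the trained model."""
    h, r, t = triple
    return abs(h + r - t)


# Synthetic integer-coded triples (illustrative only).
n = 5000
full_graph = [(i, i % 7, (i * 31) % n) for i in range(n)]

for size in (100, 1000, 5000):
    subgraph = sample_subgraph(full_graph, size)
    per_triple = time_inference(placeholder_score, subgraph)
    print(f"{size:>5} triples: {per_triple * 1e6:.3f} microseconds per triple")
```

Taking the best of several repeats reduces noise from other processes; reporting time per triple alongside the graph size makes it easy to see whether cost grows faster than linearly.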
Comments 3: The limitations of this work should be discussed explicitly as indicated in the previous review report.
Response 3: Thank you for pointing this out. We have added explanations of the relevant limitations in our work and improved its content. Please see the response to Comments 2.
Comments 4: The reproducibility of both the methodology and the results in this study is still on the low side, as indicated in the previous review report.
Response 4: Thank you for pointing this out. We have added descriptions of data source collection, graph construction, model construction, and result analysis. However, as the graph is already embedded in commercial systems for use, its code cannot be made public. Please see the blue text in Section 3.3. “In this study, we employ BeautifulSoup to crawl and parse web pages containing CVE, CWE, and CAPEC data, as well as……3) Select the Nmap tool for data acquisition, conduct comprehensive scans of network assets, and store the resulting data in a database.”
Comments 5: Reference [6] is a workshop paper, which did not pass through the normal peer-review process, so it is better to be replaced with a stronger peer-reviewed alternative.
Response 5: Thank you for pointing this out. We have changed the Reference [6]. Please see the blue text in Reference [6]. “Ghaderyan. D.; Aybat. N. S.; Aguiar. A. P.; Pereira. F. L. A fast row-stochastic decentralized method for distributed optimization over directed graphs. IEEE Trans. Autom. Control. 2024, 69, 275–289.”
Special thanks to you for your good comments.
Round 3
Reviewer 2 Report
Comments and Suggestions for Authors
All comments in the previous review report have been adequately addressed in the updated manuscript and/or the authors' response. Just a minor issue I have found while reviewing the latest version of the manuscript, which is:
In Lines 536, 537: "In Proceedings of the 2024 ACM Conference on ZZZ". I cannot find ZZZ conference, I think it is a typo or a placeholder. Would you please kindly double check and update it with the correct conference information.
Author Response
Response to Reviewer 2:
Thanks for your comments on our paper. We have revised our paper according to your comments:
Comments 1: In Lines 536, 537: "In Proceedings of the 2024 ACM Conference on ZZZ". I cannot find ZZZ conference, I think it is a typo or a placeholder. Would you please kindly double check and update it with the correct conference information.
Response 1: Thank you for pointing this out. “ZZZ” was a placeholder. To make the paper clearer, we have replaced it with another similar reference. Please see Reference [25]. “Šestak. M.; Heričko. M.; Družovec. T. W.; Turkanović. M. Applying k-vertex cardinality constraints on a Neo4j graph database. Future. Gener. Comp. SY. 2021, 115, 459–474.”
Special thanks to you for your good comments.