Electronics · Article · Open Access · 13 November 2022

Feature-Enhanced Document-Level Relation Extraction in Threat Intelligence with Knowledge Distillation

School of Cryptographic Engineering, PLA Information Engineering University, Zhengzhou 450001, China
* Author to whom correspondence should be addressed.
This article belongs to the Topic Machine and Deep Learning

Abstract

Relation extraction in the threat intelligence domain plays an important role in mining the internal associations between crucial threat elements and constructing a knowledge graph (KG). This study designed a novel document-level relation extraction model, FEDRE-KD, which integrates additional features to take full advantage of the information in documents. The study also introduced a teacher–student model to realize knowledge distillation and further improve performance. Additionally, a threat intelligence ontology was constructed to standardize the entities and their relationships. To address the lack of publicly available threat intelligence datasets, documents collected from social blogs, vendor bulletins, and hacking forums were manually annotated. After training the model, we constructed a threat intelligence knowledge graph in Neo4j. Experimental results indicate the effectiveness of the additional features and knowledge distillation. Compared to the mainstream models SSAN, GAIN, and ATLOP, FEDRE-KD improved the F1-score by 22.07, 20.06, and 22.38, respectively.

1. Introduction

Today, the Internet of Things (IoT) impacts almost every aspect of societal needs [1]. With the rapid development of network and information technology, new cyber threats (e.g., session hijacking, masquerade attacks, and interruption) [2] show a gradually rising trend. The increasing complexity of attack strategies and the ever-changing attack scenarios make such threats hard for traditional network defenses, such as firewalls, to resist. In 2019, more than 10,000 new types of cybercrime were committed in Russia [3]. In February 2022, Ukrainian government agencies and banking websites were targeted by large-scale distributed denial-of-service (DDoS) attacks, which knocked at least 10 websites offline [4]. To achieve better command of threat situations and coordinate the response to unknown threats, security experts proposed cyber threat intelligence (CTI) for network defense. Gartner [5] first put forward that CTI is knowledge of existing or emerging threats against assets, including scenarios, mechanisms, indicators, and actionable recommendations, which can provide the subject with countermeasures.
Knowledge of threat intelligence originates from security analysis reports, blogs, social media, and similar sources, which provide powerful data support for situational awareness and active network defense [6]. However, threat intelligence is mainly expressed in natural language and contains a large amount of unstructured data, making it difficult to visualize the internal relations among crucial elements. To help researchers quickly understand the semantic associations of these elements, it is necessary to design algorithms that mine entities, and the relations between them, from large-scale threat intelligence documents to construct a knowledge graph.
Relation extraction aims to identify relations between entities in a given text [7]. As shown in Figure 1, the head entity Attacker “Mealybug” and the tail entity Trojan “Trojan.Emotet” express the relation “Use”. Although relation extraction in the general domain has achieved satisfactory results, mainstream models face the following limitations in the cybersecurity domain: (1) there is a lack of open-source threat intelligence datasets; (2) threat intelligence contains many domain terms, such as vulnerability numbers, malware names, and advanced persistent threat (APT) group names, leading to a serious out-of-vocabulary (OOV) problem; (3) threat intelligence documents are complex in structure, and the frequency of entities in a sentence is extremely low, leading to a serious imbalance in the distribution of data labels. In addition, current work mainly focuses on text mining at the sentence level. In practical scenarios, however, an entity may have multiple mentions, and the relations between entities usually require at least two sentences to infer [8].
Figure 1. A threat intelligence text containing entities and relations.
To this end, this paper proposes a novel feature-enhanced document-level relation extraction model (FEDRE), which integrates new features to improve in-domain performance on threat intelligence. We then introduce a teacher–student model to achieve knowledge distillation (FEDRE-KD). In summary, we present a practical model to convert threat intelligence documents into structured data and construct a knowledge graph, which can be further utilized in threat hunting and decision making. Our contributions can be summarized as follows:
(1) We captured the part of speech (POS) of entities, the width of mentions, the distance between entity pairs, and the type of entities as new features in document-level threat intelligence relation extraction. The pre-trained bidirectional encoder representations from transformers (BERT) model was applied as the encoder to alleviate the OOV problem.
(2) We introduced a teacher–student model, gathering effective information from texts by soft labels, which retains the association between classes and eliminates some invalid redundant information. We achieved knowledge distillation and further improved performance.
(3) We collected 227 threat intelligence documents and manually annotated them based on an ontology we defined. We systematically compared the performance of our model with the mainstream neural network models on the document-level relation extraction task. Experimental results demonstrate the effectiveness of our model. The extraction results were integrated to construct a threat intelligence knowledge graph, realizing the visualization of correlation of key elements.

3. Framework Architecture

This paper proposes a novel document-level relation extraction model, FEDRE, integrating global and local information. It captures the part of speech of entities, the width of mentions, the distance between entity pairs, and the type of entities as new features. The framework of FEDRE is shown in Figure 2.
Figure 2. The framework of the proposed model FEDRE. The input will be preprocessed to obtain tokenization and part-of-speech. FEDRE converts them into vectors as mention-level representation. To solve the problem of multiple mentions, the model adopts log-sum-exp pooling to acquire entity-level embedding, combined with type embedding and width embedding. Then FEDRE obtains local contextual embedding using the multi-head attention mechanism. Distance embedding is presented as an additional feature for entity pair. Representations for a specific entity pair are encoded with the embeddings above. Finally, the model feeds them into a classifier and infers the relations in the original input.
As shown in Figure 3, we introduce a teacher–student model, gathering effective information from texts using soft labels.
Figure 3. The teacher–student model. FEDRE was trained on the annotated data as the teacher model and generated soft labels. Then the student model was trained on both the annotated data and the predicted soft labels to eliminate invalid redundant information. The total loss was computed by combining the mean squared error loss with the adaptive focal loss.

3.1. Encode Layer

Given a document $D = [x_t]_{t=1}^{l}$, where $x_t$ represents the word at position $t$, we marked entity mentions by inserting a special symbol “*” at the start and the end of each mention. We used the pre-trained model BERT as our encoder, obtaining the contextual embedding $H$:

$$H = \mathrm{BERT}([x_1, \ldots, x_l]) = [h_1, \ldots, h_l]$$

where $H \in \mathbb{R}^{l \times d_1}$, and $d_1$ is the dimension of the hidden layer in the pre-trained model.
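As a concrete illustration, the following is a minimal sketch of this step, assuming the Hugging Face transformers library; the example sentence and checkpoint name are placeholders rather than the authors' exact pipeline:

```python
# Minimal sketch of the encoding step, assuming Hugging Face `transformers`.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
encoder = BertModel.from_pretrained("bert-base-cased")

# Entity mentions are wrapped with the special symbol "*" before encoding.
words = ["*", "Mealybug", "*", "has", "used", "*", "Trojan.Emotet", "*", "."]
inputs = tokenizer(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    H = encoder(**inputs).last_hidden_state  # contextual embedding, (1, l, d1)
```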

3.2. Representation Layer

We used NLTK, a Python library, to generate POS tags for the input sentence. Then we created a POS embedding matrix $P$:

$$P = \mathrm{Pos}([x_1, \ldots, x_l]) = [p_1, \ldots, p_l]$$

where $P \in \mathbb{R}^{l \times d_2}$, and $d_2$ is the dimension of the POS embedding.

For each token, we concatenated the contextual embedding with its POS embedding to generate the POS-enhanced token representation:

$$C = [h_1 \| p_1, \ldots, h_l \| p_l] = [c_1, \ldots, c_l]$$

where $C \in \mathbb{R}^{l \times (d_1 + d_2)}$, and $\|$ denotes the concatenation operation.
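A hedged sketch of this POS-enhancement step follows; the toy tag vocabulary is an illustrative assumption, and we assume word-level contextual vectors have already been pooled from BERT's subword outputs (shown here as a random stand-in):

```python
# Sketch of POS-enhanced token representations using NLTK tags.
import nltk  # requires: nltk.download('averaged_perceptron_tagger')
import torch
import torch.nn as nn

words = ["*", "Mealybug", "*", "has", "used", "*", "Trojan.Emotet", "*", "."]
pos_tags = [tag for _, tag in nltk.pos_tag(words)]   # e.g., 'NNP', 'VBD'
tagset = sorted(set(pos_tags))                       # toy tag vocabulary
pos_ids = torch.tensor([tagset.index(t) for t in pos_tags])

d2 = 25                                              # POS size, as in Section 4.2
pos_embedding = nn.Embedding(len(tagset), d2)
P = pos_embedding(pos_ids)                           # (l, d2)

H_words = torch.randn(len(words), 768)               # stand-in for aligned BERT vectors
C = torch.cat([H_words, P], dim=-1)                  # (l, d1 + d2), c_t = h_t || p_t
```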
Span width was an important feature for the entity, so we trained a width embedding matrix:

$$W = \mathrm{Width}(l_i) = [w_1, \ldots, w_{n_1}]$$

where $w_i \in \mathbb{R}^{d_3}$, and $d_3$ is the dimension of the width embedding.
We took the embedding of the “*” at the start of a mention and concatenated it with the width embedding to obtain the width-enhanced mention embedding:

$$\mathrm{mention}_{m_j} = c_{m_j} \| w_{m_j}$$
For an entity $e_i$ with $N_{e_i}$ mentions $\{m_j^i\}_{j=1}^{N_{e_i}}$, log-sum-exp pooling was applied to produce the entity embedding:

$$h_{e_i} = \log \sum_{j=1}^{N_{e_i}} \exp\left(\mathrm{mention}_{m_j}\right)$$
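In code, this pooling is a one-liner; the sketch below assumes the width-enhanced mention vectors are stacked row-wise:

```python
import torch

def entity_embedding(mention_vecs: torch.Tensor) -> torch.Tensor:
    """Pool (num_mentions, d) mention embeddings into one (d,) entity
    embedding via log-sum-exp, a smooth approximation of max pooling."""
    return torch.logsumexp(mention_vecs, dim=0)
```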
Experimental results showed that entity types contained information required for relation extraction. Therefore, an entity-type embedding matrix was generated to merge type information:

$$T = \mathrm{Type}(t_i) = [t_1, \ldots, t_{n_3}]$$

where $t_i \in \mathbb{R}^{d_5}$, and $d_5$ is the dimension of the type embedding.
Given a pre-trained multi-head attention matrix $A \in \mathbb{R}^{HD \times l \times l}$, where $A_{ijk}$ denotes the attention score from token $j$ to token $k$ in the $i$-th attention head, we first took the attention from the “*” symbol as the mention-level attention, then averaged the attention over mentions of the same entity to obtain the entity-level attention $A_i^E \in \mathbb{R}^{HD \times l}$, denoting the attention scores from the $i$-th entity to all tokens. We then located the important context for a given entity pair $(e_s, e_o)$ via the attention matrix, calculating local contextual embeddings:

$$A^{(s,o)} = A_s^E \cdot A_o^E$$
$$q^{(s,o)} = \sum_{i=1}^{HD} A_i^{(s,o)}$$
$$a^{(s,o)} = q^{(s,o)} / \left(\mathbf{1}^{\top} q^{(s,o)}\right)$$
$$c^{(s,o)} = H^{\top} a^{(s,o)}$$
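The following sketch implements these four equations for a single entity pair; the shapes follow the notation above ($HD$ heads, sequence length $l$):

```python
import torch

def local_context(A_s_E: torch.Tensor, A_o_E: torch.Tensor,
                  H: torch.Tensor) -> torch.Tensor:
    """A_s_E, A_o_E: (HD, l) entity-level attentions; H: (l, d1).
    Returns the local contextual embedding c^(s,o) of shape (d1,)."""
    A_so = A_s_E * A_o_E      # A^(s,o): elementwise product, (HD, l)
    q = A_so.sum(dim=0)       # q^(s,o): sum over attention heads, (l,)
    a = q / q.sum()           # a^(s,o): normalize to a distribution over tokens
    return H.T @ a            # c^(s,o) = H^T a^(s,o), (d1,)
```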
Next, the representation for a specific entity pair $(e_s, e_o)$ was encoded as:

$$z_s^{(s,o)} = \tanh\left(W_S h_{e_s} + W_{C_1} c^{(s,o)} + W_{D_1} D(d_{so}) + W_{T_1} T(e_s)\right)$$
$$z_o^{(s,o)} = \tanh\left(W_O h_{e_o} + W_{C_2} c^{(s,o)} + W_{D_2} D(d_{os}) + W_{T_2} T(e_o)\right)$$

where $d_{so}$ is the distance between the first mentions of entity $s$ and entity $o$, and $D$ and $T$ denote the distance and type embedding lookups.
To reduce the number of parameters, we exploited group bilinear, which effectively lowered the computational overhead. Specifically, we divided the entity representations into $k$ equal-sized groups and fused the features to obtain the representation of the given entity pair:

$$[z_s^1; \ldots; z_s^k] = z_s$$
$$[z_o^1; \ldots; z_o^k] = z_o$$
$$g^{(s,o)} = \sum_{i=1}^{k} z_s^{i\top} W_r^i z_o^i + b_r$$
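A sketch of the group bilinear layer under these equations follows; the dimensions are illustrative assumptions consistent with Section 4.2:

```python
import torch
import torch.nn as nn

class GroupBilinear(nn.Module):
    """Split each entity vector into k groups and apply a small bilinear
    map per group, summing the results into one logit per relation."""
    def __init__(self, dim: int, num_relations: int, k: int = 64):
        super().__init__()
        assert dim % k == 0
        self.k, self.block = k, dim // k
        self.W = nn.Parameter(torch.empty(num_relations, k, self.block, self.block))
        self.b = nn.Parameter(torch.zeros(num_relations))
        nn.init.xavier_uniform_(self.W)

    def forward(self, z_s: torch.Tensor, z_o: torch.Tensor) -> torch.Tensor:
        """z_s, z_o: (batch, dim) -> logits g^(s,o): (batch, num_relations)."""
        zs = z_s.view(-1, self.k, self.block)
        zo = z_o.view(-1, self.k, self.block)
        # g_r = sum_i zs_i^T W_r^i zo_i + b_r
        return torch.einsum("bki,rkij,bkj->br", zs, self.W, zo) + self.b
```

For example, with $d = 768$ and $k = 64$, each relation needs $k \cdot (d/k)^2 = 9216$ bilinear parameters instead of $d^2 = 589{,}824$, which is the point of the grouping.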

3.3. Relation Classification

We calculated the probability of relation $r$ for the given entity pair by applying a non-linear activation to the logit:

$$P(r \mid e_s, e_o) = \sigma\left(g^{(s,o)}\right) = \sigma\left(\sum_{i=1}^{k} z_s^{i\top} W_r^i z_o^i + b_r\right)$$

Relation extraction can be regarded as a multi-label classification task. Traditional baselines usually tackle it with the standard binary cross-entropy loss, which specifies a global threshold as the criterion for whether a relation label exists. However, models have different threshold confidences for different entity pairs. In addition, the distribution of entities and relations in this task is extremely unbalanced. Therefore, we adopted the adaptive focal loss (AFL) as our loss function, setting a learnable dynamic threshold class $r_{TH}$ combined with focal loss:

$$\mathcal{L}_{AFL} = -\sum_{r_i \in \mathcal{P}_T} \left(1 - P(r_i)\right)^{\gamma} \log\left(P(r_i)\right) - \log\left(P(r_{TH})\right)$$

where $\mathcal{P}_T$ is the set of positive relation classes for the entity pair and $\gamma$ is the focusing parameter.
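The snippet below is a hedged sketch of this loss with a dedicated threshold class at logit index 0; it follows the reconstructed equation and the adaptive-thresholding idea rather than the authors' exact implementation:

```python
import torch
import torch.nn.functional as F

def adaptive_focal_loss(logits: torch.Tensor, labels: torch.Tensor,
                        gamma: float = 2.0) -> torch.Tensor:
    """Sketch of AFL. logits: (B, R) with column 0 reserved for the TH
    class; labels: (B, R) multi-hot with column 0 always zero."""
    th = logits[:, :1]                                       # threshold logit per pair
    # P(r_i): softmax over {r_i, TH}, focal-weighted over positive classes
    pair = torch.stack([logits[:, 1:], th.expand_as(logits[:, 1:])], dim=-1)
    p_pos = F.softmax(pair, dim=-1)[..., 0]
    focal = -((1.0 - p_pos) ** gamma) * torch.log(p_pos + 1e-12)
    loss_pos = (focal * labels[:, 1:].float()).sum(dim=1)
    # P(r_TH): the TH class should outrank every negative class
    neg_logits = logits.masked_fill(labels.bool(), float("-inf"))
    loss_neg = -F.log_softmax(neg_logits, dim=1)[:, 0]
    return (loss_pos + loss_neg).mean()
```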

3.4. Knowledge Distillation

In this module, a teacher–student model was introduced to realize knowledge distillation so that model performance could be further improved. Specifically, we first obtained the teacher model by training it with the procedure described above. Then we used the mean squared error (MSE) loss to calculate the difference between the logits generated by the student model and the soft labels generated by the teacher model. Finally, we combined it with the AFL from Section 3.3 as the overall loss function of the student model:

$$\mathcal{L}_{KD} = \mathrm{MSE}(\mathrm{Teacher}, \mathrm{Student})$$
$$\mathcal{L}_{RE} = \alpha_1 \mathcal{L}_{AFL} + \alpha_2 \mathcal{L}_{KD}$$

where $\alpha_1$ and $\alpha_2$ are hyperparameters used to balance the two loss functions.
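Putting the pieces together, the student's objective might look like the following sketch (reusing the adaptive_focal_loss function above; $\alpha_1 = \alpha_2 = 1$ as in Section 4.2):

```python
import torch
import torch.nn.functional as F

def student_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
                 labels: torch.Tensor, alpha1: float = 1.0, alpha2: float = 1.0):
    """Overall loss L_RE = alpha1 * L_AFL + alpha2 * L_KD."""
    l_afl = adaptive_focal_loss(student_logits, labels)
    # Soft labels from the (frozen) teacher; detach blocks teacher gradients.
    l_kd = F.mse_loss(student_logits, teacher_logits.detach())
    return alpha1 * l_afl + alpha2 * l_kd
```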

4. Experiment

4.1. Dataset

We annotated 227 threat intelligence documents manually, 151 of which were selected as the training set and the remaining 76 as the test set. The training set contained 1610 entities and 949 relations. Definitions of entities and relations between them are shown in Table 1 and Table 2.
Table 1. Distribution of entities.
Table 2. Distribution of relations.
Threat intelligence ontology was constructed, as shown in Figure 4.
Figure 4. The threat intelligence ontology. Crucial elements and relationships between them are displayed above. For instance, entity “Attacker” is connected with entity “OS” by relation “Target”.

4.2. Experiment Setup

Our model was trained on an Nvidia GeForce RTX 3090 GPU with PyTorch 1.7.1. We used cased BERT-base as the pre-trained encoder for threat intelligence. We trained the model for 100 epochs with batch size 8, using the AdamW optimizer with warmup and an early-stopping strategy (training stopped if performance did not improve for 20 consecutive epochs). The learning rate was set to $5 \times 10^{-5}$ for BERT and $1 \times 10^{-4}$ for the other layers. The loss weights of the teacher model and student model were set to 1:1, i.e., $\alpha_1 = \alpha_2 = 1$. We chose 25 as the dimension of the POS embedding ($p$), type embedding ($t$), width embedding ($w$), and distance embedding ($d$).
To tackle the imbalance of the dataset, we adopted random oversampling to replicate minority classes before training our model. Specifically, tokens were replaced by their synonyms to create new samples.
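A hedged sketch of such synonym-replacement augmentation using WordNet follows; the replacement probability and the use of WordNet are our illustrative assumptions, not the authors' exact procedure:

```python
import random
from nltk.corpus import wordnet  # requires: nltk.download('wordnet')

def synonym_augment(tokens, replace_prob=0.1):
    """Create a new sample by randomly swapping tokens for WordNet synonyms.
    Domain terms (e.g., malware names) have no synsets and are kept as-is."""
    out = []
    for tok in tokens:
        lemmas = {l.name().replace("_", " ")
                  for s in wordnet.synsets(tok) for l in s.lemmas()} - {tok}
        if lemmas and random.random() < replace_prob:
            out.append(random.choice(sorted(lemmas)))
        else:
            out.append(tok)
    return out
```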
Following prior studies, we used the metrics commonly applied to the relation extraction task to evaluate our model, i.e., precision (P), recall (R), and F1-score (F1), with F1 as the main evaluation metric. We also report time overhead as an additional index, and we calculated the performance for each kind of relation to analyse the model at a more granular level.
We compared our model with three strong prior works: SSAN [15], GAIN [31], and ATLOP [8]. For fair comparison, we used cased BERT-base as the base encoder for all methods.

4.3. Result and Analysis

4.3.1. Model Comparison

Table 3 presents the relation extraction results of our model and the baseline models on our dataset. First, compared to ATLOP, FEDRE improved performance significantly, by 21.01/22.61/22.38 in P/R/F1 on the test set, demonstrating the usefulness of the additional features during inference. In addition, FEDRE-KD outperformed FEDRE by 4.51 in F1, showing that knowledge distillation yields a further effective improvement. The experimental results also show that FEDRE-KD performed better than all the baseline models: the F1 of our model was 21.07 higher than that of SSAN and 20.06 higher than that of GAIN.
Table 3. Performance comparison of different models.

4.3.2. Ablation Study

We conducted ablation studies to further analyze the utility of each module in FEDRE. The results are shown in Table 4.
Table 4. Ablation study of FEDRE.
We first removed the POS embeddings and the width embeddings, denoted as NoPOS and NoWidth, respectively. Performance dropped when either feature was removed, indicating that POS and width information is important for relation prediction. Specifically, we found that verbs and nouns were more likely to be associated with other tokens in threat intelligence documents. Meanwhile, integrating the width embedding could enrich the representation at the mention level.
Then, we removed the entity-type embeddings, denoted as NoType. The performance dropped sharply, by 11.49. Different kinds of entities hold different relations; for instance, “Patched” can only appear when the head entity belongs to “Vulnerability” and the tail entity belongs to “Time”. Therefore, integrating type embeddings can enrich the representation at the entity level.
Finally, we removed the distance embeddings, which is denoted as NoDistance. The performance dropped by 11.47. This further demonstrates that the distance of two entities could enrich representation at the entity-pair level.

4.3.3. Fine-Grained Performance Comparison

To further observe the ability of introducing additional features and knowledge distillation to fit different types of data, Table 5 shows the fine-grained performance in detail.
Table 5. Fine-grained performance comparison.
Combined with the distribution of relations in Table 2, it can be intuitively seen that introducing additional features significantly improved the classification ability for most relation types, such as “Target”, “Perform”, and “Use”. Meanwhile, introducing knowledge distillation brought further improvement, with a maximum gain of 21.16.

4.3.4. Choice of Sampling Technique

To alleviate the imbalance of the dataset, oversampling and undersampling were introduced. The results in Table 6 show that the oversampling algorithm could significantly improve performance, whereas the undersampling algorithm risked unreasonably removing instances and thus losing important information.
Table 6. Performance comparison of different sample techniques.

4.4. Threat Intelligence Knowledge Graph Construction

We inputted threat intelligence documents with annotated entities into the trained FEDRE-KD model. The model predicted relations, selected from the predefined relation set, for all entity pairs. Then we inserted the resulting entity-relation set into the knowledge graph using the neo4j-admin command, as illustrated below. The results are shown in Figure 5.
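For illustration, a bulk import along these lines might use Neo4j's CSV import tool as follows; the file names and property layout are hypothetical, following Neo4j's documented import format:

```
# entities.csv header:  entityId:ID,name,:LABEL     (e.g., e1,Dragonfly,Attacker)
# relations.csv header: :START_ID,:END_ID,:TYPE     (e.g., e1,e2,Use)
neo4j-admin import --database=neo4j \
    --nodes=entities.csv \
    --relationships=relations.csv
```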
Figure 5. Part of the threat intelligence knowledge graph. Documents were fed into the trained FEDRE-KD model to obtain structured data and construct the KG. This figure displays an instance of the KG centered on Dragonfly.

5. Conclusions

In this paper, we propose a novel document-level relation extraction model that introduces additional features and knowledge distillation. Experimental results show that integrating the features enhances the fitting ability for most relation types, and that the teacher–student model further improves performance. Additionally, we constructed a threat intelligence knowledge graph displaying the internal associations between vital elements in documents. In summary, the proposed model FEDRE-KD can provide significant support for transforming network defense from passive to active, and it can be utilized in tracing attackers and making auxiliary decisions. In future work, we plan to extend our dataset to mitigate imbalance, consider solving the overlapping entity problem, and focus on knowledge reasoning to obtain new information.

Author Contributions

Conceptualization, Y.L. (Yongfei Li); formal analysis, Y.L. (Yongfei Li); methodology, Y.L. (Yongfei Li); software, Y.L. (Yongfei Li); supervision, Y.G., C.F., Y.H., Y.L. (Yingze Liu), and Q.C.; writing—original draft, Y.L. (Yongfei Li); writing—review and editing, Y.G., C.F., Y.H., Y.L. (Yingze Liu), and Q.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 61501515 and 61601515).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dohare, I.; Singh, K.; Ahmadian, A.; Mohan, S. Certificateless aggregated signcryption scheme for cloud-fog centric industry 4.0. IEEE Trans. Ind. Inform. 2022, 18, 6349–6357.
  2. Thirumalai, C.; Mohan, S.; Srivastava, G. An efficient public key secure scheme for cloud and IoT security. Comput. Commun. 2020, 150, 634–643.
  3. Simonov, N.; Klenkina, O.; Shikhanova, E. Leading Issues in Cybercrime: A Comparison of Russia and Japan. In Proceedings of the 6th International Conference on Social, Economic, and Academic Leadership (ICSEAL-6-2019), Prague, Czech Republic, 13–14 December 2019; pp. 504–510.
  4. Maschmeyer, L.; Dunn Cavelty, M. Goodbye Cyberwar: Ukraine as Reality Check. CSS Policy Perspect. 2022, 10.
  5. McMillan, R. Definition: Threat Intelligence. Gartner, March 2013. Available online: https://www.gartner.com/en/documents/2487216 (accessed on 30 September 2022).
  6. Liu, C.; Wang, J.; Chen, X. Threat intelligence ATT&CK extraction based on the attention transformer hierarchical recurrent neural network. Appl. Soft Comput. 2022, 122, 108826.
  7. Nguyen, T.H.; Grishman, R. Relation Extraction: Perspective from Convolutional Neural Networks. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, Denver, CO, USA, 5 June 2015; pp. 39–48.
  8. Zhou, W.; Huang, K.; Ma, T.; Huang, J. Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 2–9 February 2021; pp. 14612–14620.
  9. Peng, H.; Gao, T.; Han, X.; Lin, Y.; Li, P.; Liu, Z.; Sun, M.; Zhou, J. Learning from Context or Names? An Empirical Study on Neural Relation Extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 16–20 November 2020; pp. 3661–3672.
  10. Soares, L.B.; FitzGerald, N.; Ling, J.; Kwiatkowski, T. Matching the Blanks: Distributional Similarity for Relation Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 2895–2905.
  11. Guo, Z.; Zhang, Y.; Lu, W. Attention Guided Graph Convolutional Networks for Relation Extraction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 241–251.
  12. Zhang, Z.; Han, X.; Liu, Z.; Jiang, X.; Sun, M.; Liu, Q. ERNIE: Enhanced Language Representation with Informative Entities. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1441–1451.
  13. Wang, D.; Hu, W.; Cao, E.; Sun, W. Global-to-Local Neural Networks for Document-Level Relation Extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 16–20 November 2020; pp. 3711–3721.
  14. Zhang, L.; Cheng, Y. A Densely Connected Criss-Cross Attention Network for Document-level Relation Extraction. arXiv 2022, arXiv:2203.13953.
  15. Xu, B.; Wang, Q.; Lyu, Y.; Zhu, Y.; Mao, Z. Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 2–9 February 2021; pp. 14149–14157.
  16. Yuan, C.; Huang, H.; Feng, C.; Shi, G.; Wei, X. Document-level relation extraction with entity-selection attention. Inf. Sci. 2021, 568, 163–174.
  17. Xie, Y.; Shen, J.; Li, S.; Mao, Y.; Han, J. Eider: Evidence-enhanced Document-level Relation Extraction. arXiv 2021, arXiv:2106.08657.
  18. Long, Z.; Tan, L.; Zhou, S.; He, C.; Liu, X. Collecting indicators of compromise from unstructured text of cybersecurity articles using neural-based sequence labelling. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
  19. Gasmi, H.; Laval, J.; Bouras, A. Information extraction of cybersecurity concepts: An LSTM approach. Appl. Sci. 2019, 9, 3945.
  20. Wang, W.; Ning, K.; Song, H.; Lu, M.; Wang, J. An Indicator of Compromise Extraction Method Based on Deep Learning. J. Comput. 2021, 44, 15.
  21. Satyapanich, T.; Ferraro, F.; Finin, T. CASIE: Extracting Cybersecurity Event Information from Text. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 8749–8757.
  22. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531.
  23. Romero, A.; Ballas, N.; Kahou, S.E.; Chassang, A.; Gatta, C.; Bengio, Y. FitNets: Hints for thin deep nets. arXiv 2014, arXiv:1412.6550.
  24. Zhang, Z.; Shu, X.; Yu, B.; Liu, T.; Zhao, J.; Li, Q.; Guo, L. Distilling knowledge from well-informed soft labels for neural relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 9620–9627.
  25. Liu, Q.; Li, Y.; Duan, H.; Liu, Y.; Qin, Z. Knowledge Graph Construction Techniques. J. Comput. Res. Dev. 2016, 53, 582–600.
  26. Lv, X.; Han, X.; Hou, L.; Li, J.; Liu, Z.; Zhang, W.; Zhang, Y.; Kong, H.; Wu, S. Dynamic Anticipation and Completion for Multi-Hop Reasoning over Sparse Knowledge Graph. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 16–20 November 2020; pp. 5694–5703.
  27. Zhou, K.; Zhao, W.X.; Bian, S.; Zhou, Y.; Wen, J.-R.; Yu, J. Improving conversational recommender systems via knowledge graph based semantic fusion. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, San Diego, CA, USA, 6–10 July 2020; pp. 1006–1014.
  28. Gao, P.; Liu, X.; Choi, E.; Soman, B.; Mishra, C.; Farris, K.; Song, D. A System for Automated Open-Source Threat Intelligence Gathering and Management. In Proceedings of the 2021 International Conference on Management of Data, Xi'an, China, 20–25 June 2021; pp. 2716–2720.
  29. Piplai, A.; Mittal, S.; Abdelsalam, M.; Gupta, M.; Joshi, A.; Finin, T. Knowledge enrichment by fusing representations for malware threat intelligence and behavior. In Proceedings of the 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), Arlington, VA, USA, 9–10 November 2020; pp. 1–6.
  30. Mittal, S.; Joshi, A.; Finin, T. Cyber-All-Intel: An AI for security related threat intelligence. arXiv 2019, arXiv:1905.02895.
  31. Zeng, S.; Xu, R.; Chang, B.; Li, L. Double Graph Based Reasoning for Document-level Relation Extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 16–20 November 2020; pp. 1630–1640.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
