Article

Event Detection Using a Self-Constructed Dependency and Graph Convolution Network

1 CNONIX National Standard Application and Promotion Laboratory, North China University of Technology, Beijing 100144, China
2 School of Information Science and Technology, North China University of Technology, Beijing 100144, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(6), 3919; https://doi.org/10.3390/app13063919
Submission received: 16 February 2023 / Revised: 11 March 2023 / Accepted: 16 March 2023 / Published: 19 March 2023
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)

Abstract

Extant event detection models that rely on dependency parsing have exhibited commendable efficacy. However, for long sentences with many words, the results of dependency parsing become complex, because each word corresponds to a directed edge with a dependency parsing label. Not all of these edges provide guidance for the event detection model, and the accuracy of dependency parsing tools decreases as sentence length increases, resulting in error propagation. To solve these problems, we developed an event detection model that uses a self-constructed dependency and graph convolution network. First, we statistically analyzed the ACE2005 corpus to prune the dependency parsing tree and combined the named entity features in the sentence to generate an undirected graph. Second, we implemented an enhanced graph convolution network using the multi-head attention mechanism to learn the representation of nodes in the graph. Finally, a gating mechanism combined the semantic and structural dependency information of the sentence, enabling us to accomplish the event detection task. A series of experiments conducted on the ACE2005 corpus demonstrates that the proposed method enhances the performance of the event detection model.

1. Introduction

Event detection is important because it helps with quickly obtaining structured event information from lengthy text [1]. Event extraction involves two subtasks: event detection and event argument extraction. The event detection task first identifies the trigger words in a sentence and then classifies them into predefined event categories. The event argument extraction task extracts the arguments of the sentence according to the event type and then classifies their roles.
Event detection differs from other information extraction tasks. To obtain the best results, the model must understand not only the semantics of a sentence but also its structural information [2]. For this reason, fusing dependency syntactic information into event detection has become one of the current mainstream approaches. Dependency parsing [3] outputs a directed graph G = (N, E), where the nodes N correspond to the words in the sentence and the edges E are the directed dependency arcs between them. Figure 1 shows the dependency parsing result for the sentence “During a meeting, the team owner told Jordan that his services were no longer necessary tomorrow”. The verb is selected as the core word of dependency parsing; in the above sentence, “told” is the core word, so it is directly connected to the “root” node. Dependency parsing results reflect the collocation relationships between words in a sentence and can provide key clues for event detection. Dependency information includes syntactic structure information and relation label information. The former indicates the semantic relevance of words in a dependency relationship, whereas the latter describes the specific type of dependency relationship between two words, reflecting differences in the syntactic characteristics of different words.
A Graph Convolutional Network (GCN) [4] is a deep-learning method for processing and analyzing graph data, and it can learn from the results of dependency parsing. By applying a convolution operation to graph data, a new representation of each node is produced by aggregating the features of the node and its neighbors. To date, researchers have shown a keen interest in GCNs, and methods grounded in GCNs have become some of the leading approaches in event detection tasks.
However, Liu et al. [5] and Song et al. [6] found that using the large number of edge types derived from dependency parsing substantially increases the number of model parameters and requires many training calculations. Cui et al. [7] reported that “nsubj” (nominal subject), “dobj” (direct object) and “nmod” (nominal modifier) constituted 32.2% of the dependency labels related to trigger words in the ACE2005 dataset (each relationship accounted for 2.5% of an average total of 40 dependencies). Thus, although dependency parsing can provide useful structural information for event detection tasks, it also introduces large amounts of redundant information that reduce the effectiveness of event detection.
To solve the above problems, we developed an event detection model using a self-constructed dependency and graph convolution network. First, we statistically analyzed the dependency parsing results of the ACE2005 dataset; based on the results, we pruned the dependency parsing tree of each sentence and combined the named entity features to build an undirected graph. Second, we used an improved graph convolution network based on the multi-head attention mechanism to learn the representations of the graph nodes. The node representations were dynamically updated during the whole training process, and the importance of each dependency was distinguished through the multi-head attention mechanism. Finally, we used a gating mechanism to dynamically combine the semantic information of the sentence with the structural dependency information of the event.
The main contributions of this study are as follows:
  • We designed a novel graph construction method by pruning the dependency parsing tree and combining the named entity features.
  • We used the multi-head attention mechanism to improve the GCN model, and we dynamically fused the semantic representation of the sentence and structural dependency information through a gating mechanism.
  • The experiments conducted on the ACE2005 benchmark verified that our proposed method improves the event detection performance of the model and that its design choices are reasonable.
This paper is organized as follows: First, we introduce the definition of the event detection task and some existing problems. Second, we review recent achievements in event detection. Third, given the shortcomings of existing methods, we introduce the proposed event detection model. Fourth, we validate the model through comparative experiments, verify the necessity of each part of the model through ablation experiments, and show that the model achieves the best performance. Last, we summarize the study and describe directions for future improvements to the model.

2. Related Work

Modeling event extraction as a classification task based on machine learning is the current mainstream research direction; deep learning and neural networks have become the main means of event extraction. Fully connected neural networks and recurrent neural networks have been successfully applied to event extraction tasks. The attention mechanism and self-attention mechanism have also been introduced to the event extraction task because of their ability to capture links in the text context. In research on event extraction from English text, Chen et al. [8] proposed a classic model: they used a dynamic multi-pooling convolutional neural network and the location information provided by trigger words to capture the context information of an entity. Nguyen et al. [9] abandoned the pipeline extraction method of Chen et al., introduced a joint extraction method into event extraction, and extracted the semantic features in sentences using a bidirectional LSTM [10], which they fused with the sentence structure features. Liu et al. [11] and Yang et al. [12] introduced many embedding methods, including word embedding, position embedding, type embedding and dictionary embedding, and used sentence semantics. To further integrate syntactic information into the model, Sha et al. [13] introduced a syntactic dependency tree into the model and performed feature aggregation over syntactic dependencies with an RNN [14]; the final extraction result was obtained by aggregating the relationship pairs between entities.
After the Transformer architecture [15], which uses the self-attention mechanism to extract context for word vectors, was proposed in 2017, the BERT architecture was built on it to achieve unsupervised pre-training on massive texts. Yang et al. [16] then introduced the BERT architecture into event extraction, used it to generate the word vectors for subsequent feature aggregation, and proposed a dataset generation technique based on BERT’s mask mechanism. Ramponi et al. [17] introduced a BERT-based event extraction mechanism into key event extraction for biomedicine and optimized it for the relevant fields. To address the challenges posed by complex syntax, many graph-based models have been introduced into event extraction. Liu et al. [5] constructed isomorphic graph structures by combining the results of syntactic dependency analysis with words; they used graph attention networks for feature aggregation and also simplified and optimized the complexity of the graphs. In addition to sentence-level event extraction, corresponding studies have been conducted on document-level event extraction. Zheng et al. [18] proposed an end-to-end document-level event extraction system for the Chinese financial domain. Yang et al. [19] proposed filtering out key sentences from the document and then performing sentence-level event extraction to achieve document-level event extraction. Yan et al. [20] proposed the idea of multi-hop dependency parsing, using the graph attention network (GAT) [21] to transform the dependency syntax tree into a multi-level graph structure for modeling, thus enhancing the semantic association between trigger words and related entities. Cui et al. [7] introduced dependency label information into the GCN and proposed an edge-enhanced graph neural network model, which achieved better performance than the current benchmark models. However, because the focus of that model is integrating syntactic structure and typed dependency labels, it only uses simple concatenation to fuse the semantic information and structural dependency information of sentences, without distinguishing the importance of dependency relationships. Effectively using the results of sentence dependency parsing and integrating this structured information into the model without adding redundant information remains challenging.

3. Methods

The event detection model proposed in this paper is shown in Figure 2. Initially, we acquire the contextual representation of the input sentence and subsequently construct a dependency graph to input into the improved GCN layer. Then, through the gating mechanism, we fuse the semantic representation and the dependency structure information, and a classifier is employed for event detection. In this section, we describe the proposed method in five parts: the embedding layer, the dependency graph construction method, the multi-GCN, gating mechanism fusion and event detection.

3.1. Embedding Layer

Let S = {w_1, w_2, …, w_n} denote an n-word sentence. We first use BERT to obtain the contextual representation of the input sentence. BERT is an encoder built on the self-attention mechanism, so it can fully understand sentence context, better handle polysemy (for example, “apple” can denote a kind of fruit or a company that produces consumer electronics; the model needs to infer the true meaning of a word from the sentence context), and alleviate the impact of long-distance dependencies on the detection performance of the model. Two arguments that belong to the same event may be located far apart, so the model should avoid forgetting earlier information during learning. For the event detection task, the BERT model inserts a [CLS] token [22] in front of the sentence: this is a special token used to represent the entire sequence for classification tasks. By using the vector representation of the [CLS] token, the model can capture the overall meaning and context of the input sequence.
X = [X_1, X_2, …, X_n] = BERT(w_1, w_2, …, w_n)
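To make the embedding step concrete, the following is a minimal sketch of obtaining contextual token representations with BERT. It assumes the HuggingFace transformers package and the bert-base-uncased checkpoint; the paper does not specify the exact BERT variant or how word-piece tokens are aligned to words.

```python
# Minimal sketch of the embedding layer: contextual token vectors from BERT.
# Assumes the HuggingFace `transformers` package and `bert-base-uncased`;
# the exact BERT variant used in the paper is not stated.
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

sentence = "During a meeting, the team owner told Jordan that his services were no longer necessary tomorrow"
encoding = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**encoding)

# X: one contextual vector per word-piece token; the [CLS] vector summarizes the sentence.
X = outputs.last_hidden_state   # shape: (1, num_tokens, 768)
cls_vector = X[:, 0, :]         # representation of the [CLS] token
```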

3.2. Dependency Graph Construction Method

We used the Stanford CoreNLP tool [3] to perform dependency parsing on sentences, and we statistically analyzed the parsing results of all sentences in the ACE2005 corpus. According to Table 1, although “det” (determiner) and “case” (case marking) appear most frequently among all dependency labels in the corpus, the probability of either appearing as a trigger-related label is lower than 10%. We considered that such edges can be classified as redundant information. Therefore, based on the above statistics, we prune the dependency parsing tree: for each branch, if its parsing label is not among the top 10 labels ranked by the probability of being associated with trigger words, the branch is removed. The pruning result is shown in Figure 3.
Because the input of the GCN is an undirected graph, the dependency graph is built based on the following rules:
(1) Reverse edges (E(w_j, w_i)): To strengthen the flow of dependency information in the opposite direction along the edges of the graph and better capture the relationships between entities, we add the corresponding reverse edge E(w_j, w_i) for each edge E(w_i, w_j) generated by dependency parsing.
(2) Sequence edges (E(w_i, w_{i+1})): To restore the integrity of the sentence and capture the adjacency relationships in temporal order, we add an adjacency edge E(w_i, w_{i+1}) between each pair of adjacent words.
(3) Entity edges (E(w_{e1}, w_{e2})): To enhance the model’s understanding of the entity features in the sentence, named entity recognition is used to extract the entities in the sentence, which are then connected with each other.
(4) Self-connected edges (E(w_i, w_i)): We add a self-connected edge to each word to prevent nodes from ignoring their own features.
To extract the entities in the sentence, we use a sequence labeling method based on the Begin, Inside, Outside (BIO) pattern and a conditional random field (CRF) to identify the entities. In the training phase, we minimized the following loss:
L_ner = −∑_{s∈D} log P(y_s | s)
where y_s is the gold label of s, and D is the training set. For inference, we used the Viterbi algorithm to decode the label sequence with the maximum probability.
The final result of the self-constructed dependency graph is shown in Figure 4 (self-connected edge omitted); the nodes with the darkened background in the figure indicate the entities that we extracted.
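The graph construction above can be summarized in a short sketch. The following Python fragment is illustrative only: it assumes the dependency parse is available as (head, dependent, label) triples (e.g., exported from Stanford CoreNLP) and that entity token indices have already been produced by the BIO/CRF labeler; the KEPT_LABELS set is a hypothetical stand-in for the top-10 trigger-related labels from Table 1.

```python
# Sketch of the self-constructed dependency graph (Section 3.2).
# KEPT_LABELS is illustrative; the paper keeps the 10 labels most often linked to triggers.
import numpy as np

KEPT_LABELS = {"nsubj", "obj", "obl", "advcl", "nsubj:pass", "aux:pass",
               "nmod", "amod", "conj", "advmod"}   # hypothetical top-10 set

def build_graph(n_words, dep_edges, entity_indices):
    """Return a symmetric 0/1 adjacency matrix with the four edge types."""
    A = np.zeros((n_words, n_words), dtype=np.float32)
    # (1) pruned dependency edges plus their reverse edges
    for head, dep, label in dep_edges:
        if label in KEPT_LABELS:
            A[head, dep] = 1.0
            A[dep, head] = 1.0
    # (2) sequence edges between adjacent words
    for i in range(n_words - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    # (3) entity edges connecting recognized entities with each other
    for i in entity_indices:
        for j in entity_indices:
            if i != j:
                A[i, j] = 1.0
    # (4) self-connected edges
    np.fill_diagonal(A, 1.0)
    return A

# Example: a toy 5-word sentence with one kept dependency edge and two entity tokens.
adj = build_graph(5, dep_edges=[(2, 0, "nsubj"), (2, 4, "det")], entity_indices=[0, 4])
```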

3.3. Multi-GCN

To fully use the dependency relationships in the GCN to explore the potential relationships between words, we built an L-layer GCN and defined its propagation rules as follows:
E^l, N^l = GCN(N^{l−1}, E^{l−1})
where N^l denotes the vector representations of the nodes at layer l, and E^l denotes the vector representations of the edges at layer l. At the initial stage (l − 1 = 0), E^0 is the initial adjacency tensor of the graph and N^0 consists of the input node vectors. In the model training stage, the GCN updates the vector representation of each node by aggregating the neighbor information around the current node, and the adjacency tensor is aggregated channel by channel. The calculation formula is as follows:
N^l = σ([N_1^l ‖ N_2^l ‖ … ‖ N_Q^l]),  N_i^l = E^{l−1}_{:,:,i} N^{l−1} W_v,  i ∈ [1, Q]
where σ is the ReLU activation function, N_i^l is the node representation aggregated over the i-th channel at layer l, E^{l−1}_{:,:,i} is the i-th channel slice of the adjacency tensor E^{l−1}, Q is the number of channels, and W_v is a learnable weight parameter.
Then, we update the edge representation through the context of the two nodes connected to the edge; the calculation formula is as follows:
E^l_{i,j,:} = [E^{l−1}_{i,j,:} ‖ N_i^l ‖ N_j^l] W_e,  i, j ∈ [1, n]
where ‖ denotes the concatenation operation, and W_e is a learnable weight parameter.
After L layers of the GCN, the final node representation, denoted N′, contains the dependency information of the sentence. However, it still contains some less important information. Therefore, to distinguish the importance of the dependencies, we apply multi-head attention with K heads and obtain the node representation M by weighting the dependencies. The calculation formula is as follows:
M = MultiHeadAttention(N′, N′, N′)
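The propagation rules above can be sketched in PyTorch as follows. The hidden sizes, the initialization of the adjacency tensor E and the projection used to keep the node dimension constant across layers are our assumptions; the paper specifies only the channel-wise aggregation, the edge update and the final multi-head attention.

```python
# Illustrative Multi-GCN layer (Section 3.3); dimensions are assumptions.
import torch
import torch.nn as nn

class MultiGCNLayer(nn.Module):
    def __init__(self, node_dim: int, edge_channels: int):
        super().__init__()
        self.Q = edge_channels
        self.W_v = nn.Linear(node_dim, node_dim, bias=False)
        # assumption: project the concatenated channel aggregations back to node_dim
        self.proj = nn.Linear(edge_channels * node_dim, node_dim, bias=False)
        # edge update: [old edge vector || N_i || N_j] -> new edge vector
        self.W_e = nn.Linear(edge_channels + 2 * node_dim, edge_channels, bias=False)

    def forward(self, N, E):
        # N: (n, node_dim) node vectors; E: (n, n, Q) adjacency tensor
        n = N.size(0)
        # channel-wise aggregation: N_i = E[:, :, i] (N W_v)
        per_channel = [E[:, :, i] @ self.W_v(N) for i in range(self.Q)]
        N_new = torch.relu(self.proj(torch.cat(per_channel, dim=-1)))
        # update every edge vector from its two endpoint representations
        Ni = N_new.unsqueeze(1).expand(n, n, -1)
        Nj = N_new.unsqueeze(0).expand(n, n, -1)
        E_new = self.W_e(torch.cat([E, Ni, Nj], dim=-1))
        return N_new, E_new

# After L layers, multi-head attention re-weights the learned dependency information.
node_dim, Q, K, n = 768, 3, 3, 12
layers = nn.ModuleList(MultiGCNLayer(node_dim, Q) for _ in range(3))
attention = nn.MultiheadAttention(embed_dim=node_dim, num_heads=K, batch_first=True)

N = torch.randn(n, node_dim)   # e.g., BERT token vectors
E = torch.randn(n, n, Q)       # initial adjacency tensor built from the constructed graph
for layer in layers:
    N, E = layer(N, E)
M, _ = attention(N.unsqueeze(0), N.unsqueeze(0), N.unsqueeze(0))   # M: (1, n, node_dim)
```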

3.4. Gating Mechanism Fusion

After obtaining the semantic representation X and dependency representation M of the sentence, we use the gating mechanism to dynamically integrate the semantic representation and dependency representation. Its formula is as follows:
G = W_g [X ‖ M] + b_g,  G̃ = σ(G) ⊙ G
where ‖ denotes concatenation, W_g is a weight parameter, b_g is a bias term, σ is the sigmoid activation function, and ⊙ denotes element-wise multiplication.
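A minimal sketch of this gating fusion, assuming the semantic representation X and the dependency representation M share the same dimensionality:

```python
# Gating fusion (Section 3.4): gate value and gated output computed from [X || M].
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.W_g = nn.Linear(2 * dim, dim)    # carries both W_g and b_g

    def forward(self, X, M):
        G = self.W_g(torch.cat([X, M], dim=-1))   # G = W_g [X || M] + b_g
        return torch.sigmoid(G) * G               # G~ = sigmoid(G) * G (element-wise)

fusion = GatedFusion(768)
X, M = torch.randn(12, 768), torch.randn(12, 768)
G_tilde = fusion(X, M)   # (12, 768), fed to the event classifier
```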

3.5. Event Detection

Finally, we use a fully connected layer to process the representation G̃ output by the gating mechanism, compute the event type score of each word in the sentence, normalize it through the SoftMax function, and output the conditional probability distribution Y. The calculation formula is as follows:
Y = SoftMax(W_y G̃ + b_y)
where W_y is a weight parameter, and b_y is a bias term. We take the trigger word label with the largest value in the probability distribution Y as the final event detection result.
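The classification head can be sketched as follows; the label set size (33 event subtypes plus a "None" label) is an assumption about the labeling scheme, not something the paper states explicitly.

```python
# Event detection head (Section 3.5): linear layer + softmax + argmax over trigger labels.
import torch
import torch.nn as nn

num_labels = 34                                    # assumed: 33 subtypes + "None"
classifier = nn.Linear(768, num_labels)

G_tilde = torch.randn(12, 768)                     # fused representation from the gate
Y = torch.softmax(classifier(G_tilde), dim=-1)     # per-word probability over event types
predicted_triggers = Y.argmax(dim=-1)              # label with the largest probability
```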

4. Experiments

4.1. Datasets

The ACE2005 English dataset was used for the experiments; it contains 599 English documents annotated with 8 event types, 33 subtypes and 35 argument roles. In this study, the training, validation and test sets were divided according to the ratio 8:1:1. As the ACE2005 documents fall into six categories, the 8:1:1 split was also applied within each category to ensure a relative balance of the training samples. Finally, 479 training documents, 60 validation documents and 60 test documents were obtained. We used the standard evaluation metrics for the event detection task, namely precision (P), recall (R) and F1 score.

4.2. Experiment Setting

In the experiments, the number of GCN layers L was 3, the number of attention heads K was 3, the learning rate was 1 × 10⁻³, the training batch size was 42, and the number of training epochs was 50. We used the Adam optimizer to accelerate convergence, and the cross-entropy loss function was used so that the convergence speed of the model was not affected by sudden drops in the learning rate. L2 regularization and the dropout mechanism were used to prevent the model from over-fitting.
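The reported settings translate into a training setup along the following lines; the weight-decay and dropout values shown are placeholders, since the paper does not report them, and the toy model stands in for the full network.

```python
# Illustrative training setup matching the reported hyper-parameters (Section 4.2):
# Adam, learning rate 1e-3, L2 regularization via weight decay, dropout, cross-entropy loss.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(768, 34))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss()

# One illustrative step on a batch of 42 word representations with random labels.
inputs, labels = torch.randn(42, 768), torch.randint(0, 34, (42,))
optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
```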

4.3. Comparison Experiment

To verify the effectiveness of our model, we compared it with several baseline event detection models. Three categories of models were selected for comparison: feature-based, sequence-based and GCN-based:
(1) Feature-based models: Event detection is performed with manually designed features. CrossEntity [23] uses cross-entity information to detect events, and MaxEnt [24] uses lexical and syntactic features to detect events.
(2) Sequence-based models: Event detection is performed by processing and analyzing the word sequence of a sentence. DMCNN [8] builds a dynamic multi-pooling CNN to learn sentence features, JRNN [9] uses a bidirectional RNN to learn dependencies in sequences, and DBRNN [13] uses a bidirectional RNN based on a syntactic dependency tree to capture the dependencies between words.
(3) GCN-based models: A graph is built according to the dependencies of a sentence, syntactic information is extracted through a GCN, and events are detected. JMEE [5] combines an attention mechanism with a GCN to improve performance. RGCN [25] models the data with relation-specific adjacency matrices and convolution filters. MOGANED [20] updates the node representations by aggregating the attention of each GCN layer with an attention mechanism. EE-GCN [7] uses a GCN to learn semantic information and adds dependency information to dynamically update the representations of nodes and edges.
Table 2 shows that our model can effectively improve the performance of the event detection task. Of all considered models, the precision and F1 score of our method are the highest, at 79.7% and 78.4%, respectively. We attribute the performance gain to two aspects: (1) Compared with the other dependency-parsing-based event detection models, we do not use the full set of parsing labels, which reduces both the training parameters of the model and the introduction of redundant information, and avoids error propagation from the dependency parsing tool. (2) We use the multi-head attention mechanism to improve the GCN model and select important dependencies; finally, the semantic representation and the dependencies are dynamically fused through a gating mechanism. Although the EE-GCN model also integrates this information, it does not distinguish the importance of dependencies and does not use a dynamic fusion mechanism, relying instead on simple concatenation.

4.4. Ablation Experiments

4.4.1. Analysis of Impact of Different Graph Structures on the Model

The above experiments show that the proposed model can effectively improve event detection performance compared with existing methods. To further verify the impact of different graph structures on the model, an ablation was conducted over the edge types introduced in the graph. Table 3 shows the event detection performance when building the graph with the full dependency parsing tree, the pruned dependency parsing tree, entity feature edges only, and the combination of entity feature edges and the pruned dependency parsing tree. Among these, introducing only the entity feature edges produced the worst result. A possible reason is that although this method helps the model recognize the entity features in the event mention, it also increases the complexity of the edges in the graph, leading to the decline in experimental results. Our proposed method of combining dependency parsing tree pruning with entity feature edge construction produced the best result, because it not only reduces unnecessary parameters in the model but also helps the model identify important entity feature information in the event detection task.

4.4.2. Analysis of Influence of GCN Layers L and Attention Heads K on Model Effect

In this study, we built an L-layer GCN model and used the multi-head attention mechanism to improve the GCN. To achieve the best performance, we explored the impact of the number of GCN layers L and the number of attention heads K on the performance of the model. With the other parameters held constant, we increased the number of GCN layers and attention heads from one to five; the event detection results are shown in Table 4. Here, we focus only on the F1 score. From the results in the table, we can see that the performance of the model first increased and then declined as the values of the two parameters increased. The best result was obtained when both values were three, so we finally set the numbers of GCN layers and attention heads to three.

5. Conclusions and Future Work

We developed an event detection model using a self-constructed dependency and graph convolution network as a novel method to identify important words and dependencies in sentences; our model effectively increases the performance of event detection. In the event detection model, the dependency parsing tree is pruned to reduce the redundant information introduced by dependency parsing; an improved GCN model is used to learn the dependency representation of sentences, and the multi-head attention mechanism is used to extract important dependency features. Finally, a gating mechanism is used to dynamically fuse the semantic and dependency features of the context for event detection. We conducted a series of comparative experiments on a public event detection dataset, compared the performance of our method with that of several benchmark models of current event detection, and analyzed the impact of various experimental parameters on the model’s efficacy. The final experimental results fully verified the effectiveness of the proposed event detection model. In the future, we will continue to optimize the graph structure or further improve the graph convolution network layer. More effective features can be integrated to improve the model.

Author Contributions

Conceptualization, L.H., Q.Z., J.D. and H.W.; writing—original draft preparation, L.H. and Q.M.; writing—review and editing, L.H.; data curation, Q.M.; validation, Q.Z., J.D. and H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (61972003), the CNONIX National Standard Application and Promotion Laboratory and the R&D Program of the Beijing Municipal Education Commission (KM202210009002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank the anonymous reviewers and referees for their helpful comments, which helped improve this paper considerably.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Xiang, W.; Wang, B. A Survey of Event Extraction From Text. IEEE Access 2019, 7, 173111–173137.
  2. Nguyen, T.; Grishman, R. Graph convolutional networks with argument-aware pooling for event detection. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
  3. Manning, C.D.; Surdeanu, M.; Bauer, J.; Finkel, J.R.; Bethard, S.; McClosky, D. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, MD, USA, 22–27 June 2014; pp. 55–60.
  4. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
  5. Liu, X.; Luo, Z.; Huang, H. Jointly multiple events extraction via attention-based graph information aggregation. arXiv 2018, arXiv:1809.09078.
  6. Song, L.; Zhang, Y.; Wang, Z.; Gildea, D. N-ary relation extraction using graph state LSTM. arXiv 2018, arXiv:1808.09101.
  7. Cui, S.; Yu, B.; Liu, T.; Zhang, Z.; Wang, X.; Shi, J. Edge-enhanced graph convolution networks for event detection with syntactic relation. arXiv 2020, arXiv:2002.10757.
  8. Chen, Y.; Xu, L.; Liu, K.; Zeng, D.; Zhao, J. Event extraction via dynamic multi-pooling convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 26–31 July 2015; pp. 167–176.
  9. Nguyen, T.H.; Cho, K.; Grishman, R. Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 300–309.
  10. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  11. Liu, S.; Liu, K.; He, S.; Zhao, J. A probabilistic soft logic based approach to exploiting latent and global information in event classification. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016.
  12. Yang, B.; Mitchell, T. Joint extraction of events and entities within a document context. arXiv 2016, arXiv:1609.03632.
  13. Sha, L.; Qian, F.; Chang, B.; Sui, Z. Jointly extracting event triggers and arguments by dependency-bridge RNN and tensor-based argument interaction. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
  14. Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329.
  15. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
  16. Yang, S.; Feng, D.; Qiao, L.; Kan, Z.; Li, D. Exploring pre-trained language models for event extraction and generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 5284–5294.
  17. Ramponi, A.; van der Goot, R.; Lombardo, R.; Plank, B. Biomedical event extraction as sequence labeling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 19–20 November 2020; pp. 5357–5367.
  18. Zheng, S.; Cao, W.; Xu, W.; Bian, J. Doc2EDAG: An end-to-end document-level framework for Chinese financial event extraction. arXiv 2019, arXiv:1904.07535.
  19. Yang, H.; Chen, Y.; Liu, K.; Xiao, Y.; Zhao, J. DCFEE: A document-level Chinese financial event extraction system based on automatically labeled training data. In Proceedings of the ACL 2018, System Demonstrations, Melbourne, Australia, 15–20 July 2018; pp. 50–55.
  20. Yan, H.; Jin, X.; Meng, X.; Guo, J.; Cheng, X. Event detection with multi-order graph convolution and aggregated attention. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 5766–5770.
  21. Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
  22. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
  23. Hong, Y.; Zhang, J.; Ma, B.; Yao, J.; Zhou, G.; Zhu, Q. Using cross-entity inference to improve event extraction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; pp. 1127–1136.
  24. Li, Q.; Ji, H.; Huang, L. Joint event extraction via structured prediction with global features. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Sofia, Bulgaria, 4–9 August 2013; pp. 73–82.
  25. Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; Van Den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. In Proceedings of the Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, 3–7 June 2018; pp. 593–607.
Figure 1. Dependency parsing result of the sentence “During a meeting, the team owner told Jordan that his services were no longer necessary tomorrow”.
Figure 2. Model graph based on proposed self-constructed dependency graph and graph convolution neural network.
Figure 3. The result of pruning the dependency parsing tree.
Figure 4. Self-constructed dependency graph.
Table 1. Statistics of dependency parsing tags related to trigger words on ACE2005 dataset.
Dependency Parsing Label | Percentage of Label in the Corpus (%) | Probability of Label Associated with Trigger Words (%) | Ranking
nsubj:pass (nominal subject) | 1.01 | 31.66 | 1
aux:pass (auxiliary) | 1.18 | 27.96 | 2
obl (oblique nominal) | 5.38 | 27.78 | 3
advcl (adverbial clause modifier) | 1.34 | 27.76 | 4
det (determiner) | 8.81 | 8.98 | 26
case (case marking) | 11.11 | 7.82 | 31
Table 2. Performance comparison of different event detection methods on the ACE2005 dataset.
Methods | Precision (%) | Recall (%) | F1 (%)
MaxEnt | 74.5 | 59.1 | 65.9
CrossEntity | 68.7 | 68.9 | 68.8
DMCNN | 75.6 | 63.6 | 69.1
JRNN | 66.0 | 73.0 | 69.3
DBRNN | 74.1 | 69.8 | 71.9
RGCN | 68.4 | 79.3 | 73.4
JMEE | 76.3 | 71.3 | 73.7
MOGANED | 79.5 | 72.3 | 75.7
EE-GCN | 76.7 | 78.6 | 77.6
Ours | 79.7 | 77.2 | 78.4
Table 3. Influence of different graph structures on the model.
Methods | Precision (%) | Recall (%) | F1 (%)
– (full dependency parsing tree) | 78.3 | 76.8 | 77.5
Pruning | 78.9 | 76.5 | 77.7
Entity Edges | 78.1 | 76.2 | 77.1
Pruning + Entity Edges | 79.7 | 77.2 | 78.4
Table 4. Analysis of the influence of the number of GCN layers L and attention heads K on model performance.
GCN Layers (L) | F1 Score (%) | Attention Heads (K) | F1 Score (%)
1 | 75.8 | 1 | 76.1
2 | 77.4 | 2 | 77.6
3 | 78.4 | 3 | 78.4
4 | 77.8 | 4 | 78.0
5 | 76.7 | 5 | 76.9
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
