Hierarchical Gated Recurrent Unit with Semantic Attention for Event Prediction
Abstract
1. Introduction
- We construct a novel HS-GRU model that predicts the next event from the temporal information of past events together with prior knowledge of the current event link.
- We propose a semantic selective attention mechanism to combine temporal information with prior knowledge: it calculates the influence score of each current event on the next event under the prior knowledge and fuses the event representations according to these scores (a hedged sketch of one plausible formulation follows this list).
- Experimental results on two Chinese news datasets demonstrate that our model outperforms several state-of-the-art methods on event prediction tasks.
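Although the precise scoring function is specified in Section 3.3, the idea behind the semantic selective attention can be sketched in a few lines of PyTorch. In this sketch the module name, tensor shapes, and the bilinear scoring form are illustrative assumptions of ours, not necessarily the paper's exact formulation: each encoded event is scored against a prior-knowledge vector (e.g., derived from the Association Link Network), and the event states are fused by the resulting influence weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticSelectiveAttention(nn.Module):
    """Illustrative sketch: score each encoded event against a prior-knowledge
    vector and fuse the event states by the resulting influence scores.
    The bilinear score below is an assumption, not the paper's exact form."""

    def __init__(self, hidden_dim: int, prior_dim: int):
        super().__init__()
        self.score = nn.Bilinear(hidden_dim, prior_dim, 1)

    def forward(self, event_states: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
        # event_states: (batch, num_events, hidden_dim), from the event-level GRU
        # prior:        (batch, prior_dim), prior knowledge of the current event link
        batch, num_events, _ = event_states.shape
        prior_rep = prior.unsqueeze(1).expand(batch, num_events, prior.size(-1)).contiguous()
        scores = self.score(event_states, prior_rep).squeeze(-1)   # (batch, num_events)
        weights = F.softmax(scores, dim=-1)                        # influence scores
        fused = torch.bmm(weights.unsqueeze(1), event_states)      # score-weighted fusion
        return fused.squeeze(1)                                    # (batch, hidden_dim)
```

The fused vector can then be passed to the prediction layer in place of (or alongside) the last hidden state, so that events the prior knowledge deems influential contribute more to the next-event prediction.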
2. Related Work
2.1. Traditional-Machine-Learning-Based Methods
2.2. Deep-Learning-Based Methods
3. Our Proposal
3.1. Gated Recurrent Unit
3.2. Association Link Network
3.3. Hierarchical Semantic Attention
3.3.1. Encoding Layer
3.3.2. Attention Layer
3.3.3. Prediction Layer
3.4. Learning
4. Experiments
4.1. Dataset
4.2. Model Training
4.3. Evaluation
- N-Gram is a language model that assumes the occurrence of the Nth word depends only on the preceding N−1 words; it uses probabilistic methods to reveal the statistical regularities inherent in language units. Here, we used two n-gram models with different smoothing techniques, Backoff N-Gram and Modified Kneser-Ney [30], to compare with our model (a toy backoff sketch follows this list).
- GRU [10] is a variant of the RNN with a simpler gating structure that mitigates the long-term dependency problem of plain RNNs; it performs well on sequence problems (a minimal sketch of its cell equations is also given after this list).
- HRED [27] is a hierarchical encoder-decoder query suggestion model that encodes context information at the query level and the session level, respectively, and provides context-aware query suggestions for users; its hierarchical setup is the closest to our task.
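To make the backoff idea behind the first baseline concrete, here is a tiny, self-contained Python sketch: if a bigram was observed in training, use its maximum-likelihood estimate; otherwise fall back to a discounted unigram estimate. This is "stupid backoff" with factor alpha, a deliberate simplification for illustration; the reported baselines use proper Backoff and modified Kneser-Ney discounting as described in [30], and the toy event tokens below are hypothetical.

```python
from collections import Counter

def train_bigram_counts(corpus):
    """Collect unigram and bigram counts from tokenized event sequences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def backoff_prob(prev, word, unigrams, bigrams, alpha=0.4):
    """Stupid-backoff score: ML bigram estimate if observed, otherwise a
    discounted unigram estimate (not a normalized probability)."""
    if bigrams[(prev, word)] > 0:
        return bigrams[(prev, word)] / unigrams[prev]
    total = sum(unigrams.values())
    return alpha * unigrams[word] / total

# Toy usage with hypothetical event tokens:
corpus = [["earthquake", "rescue", "aid"], ["earthquake", "rescue", "rebuild"]]
uni, bi = train_bigram_counts(corpus)
print(backoff_prob("rescue", "aid", uni, bi))      # seen bigram  -> 0.5
print(backoff_prob("aid", "earthquake", uni, bi))  # unseen -> backed-off unigram
```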
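And since both the GRU baseline and the encoders in our model are built from the same cell, the GRU equations of [10] translate almost line for line into code. The sketch below follows the original formulation (reset gate applied to the previous state before the candidate transform); the class and variable names are ours.

```python
import torch
import torch.nn as nn

class GRUCellSketch(nn.Module):
    """Minimal GRU cell following the equations of Cho et al. [10]."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.Wz = nn.Linear(input_dim, hidden_dim)   # update gate, input part
        self.Uz = nn.Linear(hidden_dim, hidden_dim)  # update gate, recurrent part
        self.Wr = nn.Linear(input_dim, hidden_dim)   # reset gate, input part
        self.Ur = nn.Linear(hidden_dim, hidden_dim)  # reset gate, recurrent part
        self.Wh = nn.Linear(input_dim, hidden_dim)   # candidate, input part
        self.Uh = nn.Linear(hidden_dim, hidden_dim)  # candidate, recurrent part

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        z = torch.sigmoid(self.Wz(x) + self.Uz(h))          # update gate
        r = torch.sigmoid(self.Wr(x) + self.Ur(h))          # reset gate
        h_tilde = torch.tanh(self.Wh(x) + self.Uh(r * h))   # candidate state
        return z * h + (1.0 - z) * h_tilde                  # interpolate old/new state
```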
4.4. Event Selection
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
1. Geng, Y.; Su, L.; Jia, Y.; Han, C. Seismic Events Prediction Using Deep Temporal Convolution Networks. J. Electr. Comput. Eng. 2019, 2019, 7343784.
2. Soni, S.; Ramirez, S.L.; Eisenstein, J. Discriminative Modeling of Social Influence for Prediction and Explanation in Event Cascades. arXiv 2018, arXiv:1802.06138.
3. Cortez, B.; Carrera, B.; Kim, Y.J.; Jung, J.Y. An architecture for emergency event prediction using LSTM recurrent neural networks. Expert Syst. Appl. 2018, 97, 315–324.
4. Yang, Y.; Wei, Z.; Chen, Q.; Wu, L. Using External Knowledge for Financial Event Prediction Based on Graph Neural Networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2161–2164.
5. Wang, Y.; Li, Q.; Huang, Z.; Li, J. EAN: Event Attention Network for Stock Price Trend Prediction based on Sentimental Embedding. In Proceedings of the 10th ACM Conference on Web Science, Boston, MA, USA, 30 June–3 July 2019; pp. 311–320.
6. Yang, C.C.; Shi, X.; Wei, C.P. Discovering event evolution graphs from news corpora. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2009, 39, 850–863.
7. Manshadi, M.; Swanson, R.; Gordon, A.S. Learning a Probabilistic Model of Event Sequences from Internet Weblog Stories. In Proceedings of the FLAIRS Conference, Coconut Grove, FL, USA, 15–17 May 2008; pp. 159–164.
8. Pichotta, K.; Mooney, R.J. Learning statistical scripts with LSTM recurrent neural networks. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016.
9. Wang, Z.; Zhang, Y.; Chang, C.Y. Integrating order information and event relation for script event prediction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 57–67.
10. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
11. Luo, X.; Xu, Z.; Yu, J.; Chen, X. Building association link network for semantic link on web resources. IEEE Trans. Autom. Sci. Eng. 2011, 8, 482–494.
12. Weiler, A.; Grossniklaus, M.; Scholl, M.H. Event identification and tracking in social media streaming data. In Proceedings of the EDBT/ICDT, Athens, Greece, 28 March 2014; pp. 282–287.
13. Lu, Z.; Yu, W.; Zhang, R.; Li, J.; Wei, H. Discovering event evolution chain in microblog. In Proceedings of the 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, New York, NY, USA, 24–26 August 2015; pp. 635–640.
14. Zhou, P.; Wu, B.; Cao, Z. EMMBTT: A Novel Event Evolution Model Based on TFxIEF and TDC in Tracking News Streams. In Proceedings of the 2017 IEEE Second International Conference on Data Science in Cyberspace (DSC), Shenzhen, China, 26–29 June 2017; pp. 102–107.
15. Liu, Y.; Peng, H.; Guo, J.; He, T.; Li, X.; Song, Y.; Li, J. Event detection and evolution based on knowledge base. In Proceedings of the KBCOM 2018, Los Angeles, CA, USA, 9 February 2018; pp. 1–7.
16. Jans, B.; Bethard, S.; Vulić, I.; Moens, M.F. Skip n-grams and ranking functions for predicting script events. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, 23–27 April 2012; pp. 336–344.
17. Radinsky, K.; Davidovich, S.; Markovitch, S. Learning causality for news events prediction. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; pp. 909–918.
18. Li, Z.; Zhao, S.; Ding, X.; Liu, T. EEG: Knowledge Base for Event Evolutionary Principles and Patterns. In Proceedings of the Chinese National Conference on Social Media Processing, Beijing, China, 14–17 September 2017; pp. 40–52.
19. Granroth-Wilding, M.; Clark, S. What happens next? Event prediction using a compositional neural network model. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016.
20. Hu, L.; Li, J.; Nie, L.; Li, X.L.; Shao, C. What happens next? Future subevent prediction using contextual hierarchical LSTM. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
21. Li, Z.; Ding, X.; Liu, T. Constructing narrative event evolutionary graph for script event prediction. arXiv 2018, arXiv:1805.05081.
22. Mikolov, T.; Karafiát, M.; Burget, L.; Černocký, J.; Khudanpur, S. Recurrent neural network based language model. In Proceedings of the Eleventh Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, 26–30 September 2010.
23. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166.
24. Lin, Z.; Feng, M.; Santos, C.N.d.; Yu, M.; Xiang, B.; Zhou, B.; Bengio, Y. A structured self-attentive sentence embedding. arXiv 2017, arXiv:1703.03130.
25. Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2008, 20, 61–80.
26. Li, Y.; Tarlow, D.; Brockschmidt, M.; Zemel, R. Gated graph sequence neural networks. arXiv 2015, arXiv:1511.05493.
27. Sordoni, A.; Bengio, Y.; Vahabi, H.; Lioma, C.; Grue Simonsen, J.; Nie, J.Y. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015; pp. 553–562.
28. Pascanu, R.; Gulcehre, C.; Cho, K.; Bengio, Y. How to construct deep recurrent neural networks. arXiv 2013, arXiv:1312.6026.
29. Serban, I.V.; Sordoni, A.; Bengio, Y.; Courville, A.; Pineau, J. Hierarchical neural network generative models for movie dialogues. arXiv 2015, arXiv:1507.04808.
30. Goodman, J.T. A bit of progress in language modeling. Comput. Speech Lang. 2001, 15, 403–434.
31. Ghosh, S.; Vinyals, O.; Strope, B.; Roy, S.; Dean, T.; Heck, L. Contextual LSTM (CLSTM) models for large scale NLP tasks. arXiv 2016, arXiv:1602.06291.
| | International News | China News |
|---|---|---|
| Events | 120,879 | 259,327 |
| Event links | 114,785 | 237,480 |
| Words | 2521 | 3538 |
| Training (event links) | 91,824 | 189,984 |
| Validation (event links) | 11,480 | 23,748 |
| Test (event links) | 11,481 | 23,748 |
| | Batches | Training Time |
|---|---|---|
| International News | 837,800 | 71 h 43 min |
| China News | 894,300 | 75 h 23 min |
| Method | Error Rate (International News) | Error Rate (China News) |
|---|---|---|
| Backoff N-Gram [30] | 96.71% | 95.94% |
| Modified Kneser-Ney [30] | 93.52% | 94.75% |
| GRU [10] | 54.44% | 66.47% |
| HRED [27] | 51.30% | 64.57% |
| HS-GRU (ours) | 48.48% | 62.33% |
| Method | International News: Long Event Link | International News: Short Event Link | China News: Long Event Link | China News: Short Event Link |
|---|---|---|---|---|
| HRED | 45.63% | 68.32% | 59.51% | 73.23% |
| HS-GRU | 42.50% | 66.41% | 57.43% | 71.42% |
| Error-rate reduction | 3.13% | 1.91% | 2.48% | 1.80% |
| Method | International News: Top 1 | International News: Top 5 | International News: Top 10 | China News: Top 1 | China News: Top 5 | China News: Top 10 |
|---|---|---|---|---|---|---|
| Random | 2.00% | 10.00% | 20.00% | 2.00% | 10.00% | 20.00% |
| GRU | 56.00% | 74.36% | 78.00% | 49.27% | 69.27% | 71.64% |
| HRED | 59.64% | 74.73% | 79.64% | 53.82% | 71.45% | 73.45% |
| HS-GRU | 69.45% | 82.73% | 86.36% | 56.36% | 75.82% | 78.91% |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).