Nested Named Entity Recognition Based on Dual Stream Feature Complementation
Abstract
1. Introduction
- A novel dual-stream feature-complementation method for nested named entity recognition is proposed. Unlike existing methods, which directly fuse character and word embedding vectors at the embedding stage, this paper feeds the two embedding vectors into separate Bi-LSTMs [19] to obtain contextual features for each stream, and then applies a bidirectional feature-complementation mechanism to extract the maximum information gain between the two feature vectors (a minimal encoding sketch is given after this list).
- To implement this mechanism, a low-level bidirectional complementation module and a high-level bidirectional complementation module are designed, realizing bidirectional feature-information complementation in two dimensions.
- Comparative experiments on the GENIA dataset show that the proposed method outperforms the classical methods compared in this paper.
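As a concrete illustration of the dual-stream encoding, the following minimal PyTorch sketch (our illustration, not the authors' released code; names and dimensions are assumptions based on the parameters reported in Section 4.2) encodes the two embedding streams with separate Bi-LSTMs:

```python
import torch.nn as nn

class DualStreamEncoder(nn.Module):
    """Encode character- and word-level embeddings in two parallel Bi-LSTM streams."""

    def __init__(self, char_dim=200, word_dim=200, hidden=200):
        super().__init__()
        # One Bi-LSTM per stream; the streams are kept separate so that the
        # complementation mechanism can later exchange information between them.
        self.char_lstm = nn.LSTM(char_dim, hidden, batch_first=True, bidirectional=True)
        self.word_lstm = nn.LSTM(word_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, char_emb, word_emb):
        # char_emb, word_emb: (batch, seq_len, dim)
        char_feat, _ = self.char_lstm(char_emb)  # (batch, seq_len, 2 * hidden)
        word_feat, _ = self.word_lstm(word_emb)  # (batch, seq_len, 2 * hidden)
        return char_feat, word_feat
```

The two feature sequences returned here are the inputs to the low-level complementation modules of Section 3.3.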
2. Related Work
2.1. Named Entities
2.2. Nested Named Entities
2.3. Discussion
3. Methodology
3.1. Overview Network Architecture
3.2. Character Embedding and Feature Extraction
3.3. Low-Level Feature Complementarity
3.4. High-Level Feature Complementarity
3.4.1. Multi-Head Self-Attention Mechanism
3.4.2. HBCL
3.5. Entity Classification
4. Experiments
4.1. DataSet and Annotation Method
4.2. Experimental Parameters and Environment
4.3. Evaluation Criterion
4.4. Comparison of Experimental Results
- Lu and Roth [46] were the first to jointly model entity boundaries, entity types, and entity heads, using a hypergraph-based method;
- Xu et al. [47] proposed a local-detection method that outperforms traditional sequence-labeling methods without any external resources or feature engineering;
- Sohrab and Miwa [36] enumerated all possible spans as potential entity segments and classified them with a deep neural network;
- Ju et al. [32] proposed a neural model that identifies nested entities by dynamically stacking flat NER layers, with cascaded CRF layers extracting the information encoded by inner entities in an inside-out manner to identify outer entities;
- Lin et al. [48] combined the hypergraph model with a neural network, using the hypergraph to identify overlapping elements and the neural network's learned features to recognize nested entities.
4.5. Ablation Studies
- Bi-LSTM + MHSM: the two embedding vectors are passed through Bi-LSTMs to extract contextual features, and the multi-head self-attention mechanism then captures local feature information.
- Bi-LSTM + CCM + MHSM + HBCL: after Bi-LSTM extracts the contextual features, the effective information of the character-level feature vector is added to the word-level feature vector; the result is passed to the multi-head self-attention mechanism, followed by high-level feature complementation.
- Bi-LSTM + WEM + MHSM + HBCL: after the two feature vectors are obtained from Bi-LSTM, the effective information of the word-level feature vector is added to the character-level feature vector; local feature information is then extracted, followed by high-level feature complementation.
- Bi-LSTM + MHSM + HBCL: Bi-LSTM extracts the contextual features, local feature information is extracted, and high-level feature complementation is applied directly to obtain the sentence feature vector.
- Bi-LSTM + CCM + WEM + MHSM: after the two feature vectors are obtained from Bi-LSTM, low-level feature complementation is performed in both directions, and the multi-head self-attention mechanism then extracts the sentence's local feature information (a sketch of how these variants compose is given after this list).
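To keep the variant names above unambiguous, the following illustrative snippet (our own, not from the paper; the flag names are hypothetical) shows how each ablation row composes the modules:

```python
def build_variant(use_ccm: bool = True, use_wem: bool = True, use_hbcl: bool = True) -> str:
    """Name an ablation variant; the Bi-LSTM encoder and MHSM are always present."""
    stages = ["Bi-LSTM"]
    if use_ccm:
        stages.append("CCM")    # inject char-stream information into the word stream
    if use_wem:
        stages.append("WEM")    # inject word-stream information into the char stream
    stages.append("MHSM")       # multi-head self-attention over the features
    if use_hbcl:
        stages.append("HBCL")   # high-level feature complementation
    return " + ".join(stages)

assert build_variant(use_ccm=False) == "Bi-LSTM + WEM + MHSM + HBCL"
assert build_variant(use_hbcl=False) == "Bi-LSTM + CCM + WEM + MHSM"
assert build_variant() == "Bi-LSTM + CCM + WEM + MHSM + HBCL"  # the full model
```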
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- He, J.; Wang, H. Chinese named entity recognition and word segmentation based on character. In Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing, Hyderabad, India, 11–12 January 2008.
- Sasano, R.; Kurohashi, S. Japanese named entity recognition using structural natural language processing. In Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II, Hyderabad, India, 7–12 January 2008.
- Xue, N.; Shen, L. Chinese Word Segmentation as LMR Tagging. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan, 11–12 July 2003; pp. 176–179.
- Gupta, P.; Schütze, H.; Andrassy, B. Table filling multi-task recurrent neural network for joint entity and relation extraction. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 2537–2547.
- Mintz, M.; Bills, S.; Snow, R.; Jurafsky, D. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, 2–7 August 2009; pp. 1003–1011.
- Bai, T.; Ge, Y.; Guo, S.; Zhang, Z.; Gong, L. Enhanced natural language interface for web-based information retrieval. IEEE Access 2020, 9, 4233–4241.
- Selya, A.; Anshutz, D.; Griese, E.; Weber, T.L.; Hsu, B.; Ward, C. Predicting unplanned medical visits among patients with diabetes: Translation from machine learning to clinical implementation. BMC Med. Inform. Decis. Mak. 2021, 21, 1–11.
- Fei, H.; Zhang, Y.; Ren, Y.; Ji, D. Latent emotion memory for multi-label emotion classification. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 7692–7699.
- Wu, S.; Fei, H.; Ren, Y.; Li, B.; Li, F.; Ji, D. High-order pair-wise aspect and opinion terms extraction with edge-enhanced syntactic graph convolution. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 2396–2406.
- Hanifah, A.F.; Kusumaningrum, R. Non-Factoid Answer Selection in Indonesian Science Question Answering System using Long Short-Term Memory (LSTM). Procedia Comput. Sci. 2021, 179, 736–746.
- Mollá, D.; Van Zaanen, M.; Cassidy, S. Named Entity Recognition in Question Answering of Speech Data. In Proceedings of the Australasian Language Technology Workshop 2007, Melbourne, Australia, 10–11 December 2007.
- Shibuya, T.; Hovy, E. Nested named entity recognition via second-best sequence learning and decoding. Trans. Assoc. Comput. Linguist. 2020, 8, 605–620.
- Lafferty, J.; McCallum, A.; Pereira, F.C. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2001.
- Li, X.; Feng, J.; Meng, Y.; Han, Q.; Wu, F.; Li, J. A unified MRC framework for named entity recognition. arXiv 2019, arXiv:1910.11476.
- Muis, A.O.; Lu, W. Labeling gaps between words: Recognizing overlapping mentions with mention separators. arXiv 2018, arXiv:1810.09073.
- Katiyar, A.; Cardie, C. Nested named entity recognition revisited. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; Volume 1.
- Hammerton, J. Named entity recognition with long short-term memory. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, Edmonton, AB, Canada, 31 May–1 June 2003; pp. 172–175.
- Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF models for sequence tagging. arXiv 2015, arXiv:1508.01991.
- Lample, G.; Ballesteros, M.; Subramanian, S.; Kawakami, K.; Dyer, C. Neural architectures for named entity recognition. arXiv 2016, arXiv:1603.01360.
- Chiu, J.P.C.; Nichols, E. Named entity recognition with bidirectional LSTM-CNNs. Trans. Assoc. Comput. Linguist. 2016, 4, 357–370.
- Collobert, R.; Weston, J.; Bottou, L.; Karlen, M.; Kavukcuoglu, K.; Kuksa, P. Natural language processing (almost) from scratch. J. Mach. Learn. Res. 2011, 12, 2493–2537.
- Ilić, S.; Marrese-Taylor, E.; Balazs, J.A.; Matsuo, Y. Deep contextualized word representations for detecting sarcasm and irony. arXiv 2018, arXiv:1809.09795.
- Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 2013, 26.
- Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of tricks for efficient text classification. arXiv 2016, arXiv:1607.01759.
- Sun, C.; Yang, Z.; Wang, L.; Zhang, Y.; Lin, H.; Wang, J. Biomedical named entity recognition using BERT in the machine reading comprehension framework. J. Biomed. Inform. 2021, 118, 103799.
- Guo, S.; Yang, W.; Han, L.; Song, X.; Wang, G. A multi-layer soft lattice based model for Chinese clinical named entity recognition. BMC Med. Inform. Decis. Mak. 2022, 22, 1–12.
- Li, Y.; Nair, P.; Pelrine, K.; Rabbany, R. Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 2854–2868.
- Alsaaran, N.; Alrabiah, M. Classical Arabic named entity recognition using variant deep neural network architectures and BERT. IEEE Access 2021, 9, 91537–91547.
- Wang, B.; Lu, W.; Wang, Y.; Jin, H. A neural transition-based model for nested mention recognition. arXiv 2018, arXiv:1810.01808.
- Lin, H.; Lu, Y.; Han, X.; Sun, L. Sequence-to-nuggets: Nested entity mention detection via anchor-region networks. arXiv 2019, arXiv:1906.03783.
- Ju, M.; Miwa, M.; Ananiadou, S. A neural layered model for nested named entity recognition. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, 1–6 June 2018; pp. 1446–1459.
- Straková, J.; Straka, M.; Hajič, J. Neural architectures for nested NER through linearization. arXiv 2019, arXiv:1908.06926.
- Luan, Y.; Wadden, D.; He, L.; Shah, A.; Ostendorf, M.; Hajishirzi, H. A general framework for information extraction using dynamic span graphs. arXiv 2019, arXiv:1904.03296.
- Zheng, C.; Cai, Y.; Xu, J.; Leung, H.; Xu, G. A boundary-aware neural model for nested named entity recognition. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019.
- Sohrab, M.G.; Miwa, M. Deep exhaustive model for nested named entity recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 2843–2849.
- Huang, P.; Zhao, X.; Hu, M.; Fang, Y.; Li, X.; Xiao, W. Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 85–96.
- Shen, Y.; Ma, X.; Tan, Z.; Zhang, S.; Wang, W.; Lu, W. Locate and label: A two-stage identifier for nested named entity recognition. arXiv 2021, arXiv:2105.06804.
- Yuan, Z.; Tan, C.; Huang, S.; Huang, F. Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition. arXiv 2021, arXiv:2110.07480.
- Wan, J.; Ru, D.; Zhang, W.; Yu, Y. Nested Named Entity Recognition with Span-level Graphs. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 892–903.
- Joulin, A.; Cissé, M.; Grangier, D.; Jégou, H. Efficient softmax approximation for GPUs. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1302–1310.
- Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An all-MLP architecture for vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
- De Boer, P.T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A tutorial on the cross-entropy method. Ann. Oper. Res. 2005, 134, 19–67.
- Kim, J.D.; Ohta, T.; Tateisi, Y.; Tsujii, J. GENIA corpus—A semantically annotated corpus for bio-textmining. Bioinformatics 2003, 19, i180–i182.
- Lu, W.; Roth, D. Joint mention extraction and classification with mention hypergraphs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 857–867.
- Xu, M.; Jiang, H.; Watcharawittayakul, S. A local detection approach for named entity recognition and mention detection. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, BC, Canada, 30 July–4 August 2017; pp. 1237–1247.
- Lin, J.C.W.; Shao, Y.; Fournier-Viger, P.; Fujita, H. BILU-NEMH: A BILU neural-encoded mention hypergraph for mention extraction. Inf. Sci. 2019, 496, 53–64.
Abbreviation | Term | Function |
---|---|---|
WEM | word-induced enhance module | Obtains the word-level embedded feature vector after low-level feature complementation.
CCM | char-steered complementary module | Obtains the character-level embedded feature vector after low-level feature complementation.
MHSM | multi-head self-attention mechanism | Makes the model attend more to important features and less to unimportant ones, optimizing resource allocation.
HBCL | high-level bi-directional interactive connect layer | Performs high-level feature complementation between the two representation vectors to obtain the final sentence representation vector.
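The table describes the low-level complementation of CCM and WEM only abstractly. As a minimal sketch, one plausible reading is a gated addition of one stream's information into the other; the gating below is our assumption, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class ComplementaryModule(nn.Module):
    """Inject the useful part of a source stream into a target stream.

    A hypothetical gated-addition reading of CCM/WEM; the paper's exact
    formulation may differ.
    """

    def __init__(self, dim=400):  # 400 = 2 * LSTM hidden size from Section 4.2
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, target, source):
        # Decide, position by position, how much source information to add.
        g = torch.sigmoid(self.gate(torch.cat([target, source], dim=-1)))
        return target + g * source

# e.g. ccm = ComplementaryModule(); enhanced_word = ccm(word_feat, char_feat)
```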
Item | Train | Dev. | Test | Overall | Nested |
---|---|---|---|---|---|
DNA | 7650 | 1026 | 1257 | 9933 | 1774 |
RNA | 692 | 132 | 109 | 933 | 407 |
Protein | 28,728 | 2303 | 3066 | 34,097 | 1902 |
Cell line | 3027 | 325 | 438 | 3790 | 347 |
Cell type | 5832 | 551 | 604 | 6987 | 389 |
Overall | 45,929 | 4337 | 5474 | 55,740 | 4789 |
Element | First-Level Annotation | Second-Level Annotation | Third-Level Annotation | Fourth-Level Annotation |
---|---|---|---|---|
IL | B-protein | B-RNA | O | O |
- | I-protein | I-RNA | O | O |
2R | I-protein | I-RNA | O | O |
alpha | I-protein | I-RNA | O | O |
mRNA | O | I-RNA | O | O |
that | O | O | O | O |
is | O | O | O | O |
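The layered annotation above can be produced mechanically from nested entity spans. The helper below (hypothetical, for illustration only) assigns each entity's BIO tags to its own layer and reproduces the IL-2R alpha example:

```python
def layered_bio(tokens, entities, num_layers=4):
    """entities: (start, end, label, layer) tuples; end is exclusive, layer is 0-based."""
    tags = [["O"] * len(tokens) for _ in range(num_layers)]
    for start, end, label, layer in entities:
        tags[layer][start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[layer][i] = f"I-{label}"
    return tags

tokens = ["IL", "-", "2R", "alpha", "mRNA", "that", "is"]
# Inner protein "IL-2R alpha" on the first layer, outer RNA "IL-2R alpha mRNA" on the second.
first, second, third, fourth = layered_bio(tokens, [(0, 4, "protein", 0), (0, 5, "RNA", 1)])
```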
Parameter Type | Parameter Value |
---|---|
dropout rate | 0.5
batch size | 50
word-level embedding dimension | 200
character-level embedding dimension | 200
LSTM hidden size | 200
LSTM layers | 1
attention heads | 8
learning rate | 0.0005
epochs | 60
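For reference, the same settings expressed as a plain configuration dictionary (the key names are our choice, not the authors'):

```python
# Experimental parameters from the table above.
config = {
    "dropout": 0.5,
    "batch_size": 50,
    "word_embed_dim": 200,
    "char_embed_dim": 200,
    "lstm_hidden_size": 200,
    "lstm_layers": 1,
    "attention_heads": 8,
    "learning_rate": 5e-4,
    "epochs": 60,
}
```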
Object | Environment |
---|---|
operating system | Windows 10
GPU | NVIDIA RTX 2080 Ti
hard disk | 200 GB
memory | 64 GB
Python version | 3.7
PyTorch version | 1.3.1
Model | P | R | F1 |
---|---|---|---|
Lu & Roth | 72.5 | 65.2 | 68.7 |
Xu & Jiang | 71.2 | 64.3 | 67.6 |
Sohrab & Miwa | 73.3 | 68.3 | 70.7 |
Ju & Miwa | 76.1 | 66.8 | 71.7 |
Lin & Shao | 70.3 | 68.9 | 69.6 |
Ours | 68.7 | 77.2 | 72.7 |
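The F1 column is the harmonic mean of precision and recall; a quick check for the proposed model:

```python
def f1(p, r):
    """Harmonic mean of precision and recall (values in percent)."""
    return 2 * p * r / (p + r)

print(round(f1(68.7, 77.2), 1))  # 72.7, matching the "Ours" row
```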
Model | P | R | F1 |
---|---|---|---|
Bi-LSTM + MHSM | 69.1 | 68.1 | 68.6 |
Bi-LSTM + CCM + MHSM + HBCL | 67.5 | 75.9 | 71.5 |
Bi-LSTM + WEM + MHSM + HBCL | 66.0 | 77.7 | 71.4 |
Bi-LSTM + MHSM + HBCL | 66.1 | 72.6 | 69.2 |
Bi-LSTM + CCM + WEM + MHSM | 70.7 | 71.6 | 71.1 |
Ours | 68.7 | 77.2 | 72.7 |