An ERNIE-Based Joint Model for Chinese Named Entity Recognition

by Yu Wang 1,2, Yining Sun 1,2,*, Zuchang Ma 1, Lisheng Gao 1 and Yang Xu 1

1 Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
2 Science Island Branch of Graduate School, University of Science and Technology of China, Hefei 230026, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(16), 5711; https://doi.org/10.3390/app10165711
Received: 29 July 2020 / Revised: 13 August 2020 / Accepted: 14 August 2020 / Published: 18 August 2020
(This article belongs to the Special Issue Machine Learning and Natural Language Processing)
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) and the initial step in building a Knowledge Graph (KG). Recently, BERT (Bidirectional Encoder Representations from Transformers), a pre-training model, has achieved state-of-the-art (SOTA) results in various NLP tasks, including NER. However, Chinese NER remains challenging for BERT: there are no physical separations between Chinese words, so BERT can only obtain representations of individual Chinese characters. Character-level representations alone handle Chinese NER poorly, because the meaning of a Chinese word can differ considerably from that of the characters that make it up. ERNIE (Enhanced Representation through kNowledge IntEgration), an improved pre-training model building on BERT, is better suited to Chinese NER because it is designed to learn language representations enhanced by a knowledge masking strategy. Even so, the potential of ERNIE has not been fully explored: when performing NER, it utilizes only token-level features and ignores the sentence-level feature. In this paper, we propose ERNIE-Joint, a joint model based on ERNIE. ERNIE-Joint utilizes both sentence-level and token-level features by jointly training the NER and text classification tasks. To use the raw NER datasets for joint training and avoid additional annotation, we perform the text classification task according to the number of entities in each sentence. Experiments are conducted on two datasets, MSRA-NER and Weibo, which contain Chinese news data and Chinese social media data, respectively. The results demonstrate that ERNIE-Joint not only outperforms BERT and ERNIE but also achieves SOTA results on both datasets.
Keywords: joint training; named entity recognition; pre-training models; ERNIE; BERT
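The joint-training idea in the abstract (a shared ERNIE encoder feeding a token-level NER head plus a sentence-level head that classifies each sentence by its entity count) can be sketched as follows. This is a minimal PyTorch/HuggingFace illustration, not the authors' implementation: the checkpoint identifier, label sizes, and the unweighted sum of the two losses are all assumed placeholders.

```python
# Minimal sketch of a joint NER + sentence-classification model.
# Assumptions: the HuggingFace ERNIE checkpoint name, the label counts,
# and the simple summed loss are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn
from transformers import AutoModel

class ErnieJointSketch(nn.Module):
    def __init__(self, encoder_name="nghuyong/ernie-1.0-base-zh",  # assumed checkpoint
                 num_ner_labels=7, num_count_classes=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Token-level head: one tag (e.g., BIO) per Chinese character.
        self.ner_head = nn.Linear(hidden, num_ner_labels)
        # Sentence-level head: classify the sentence by its entity count
        # (e.g., 0, 1, 2, or 3+ entities), derived from the raw NER labels.
        self.cls_head = nn.Linear(hidden, num_count_classes)

    def forward(self, input_ids, attention_mask, ner_labels=None, count_labels=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        token_states = out.last_hidden_state          # (batch, seq, hidden)
        ner_logits = self.ner_head(token_states)      # token-level features
        cls_logits = self.cls_head(token_states[:, 0])  # [CLS] as sentence feature
        if ner_labels is None:
            return ner_logits, cls_logits
        ce = nn.CrossEntropyLoss()
        # Joint objective: the two task losses are summed here; a weighted
        # combination would be an equally plausible choice.
        return (ce(ner_logits.flatten(0, 1), ner_labels.flatten())
                + ce(cls_logits, count_labels))
```

Note that the sentence-level labels require no extra annotation: they can be derived directly from the gold NER tags (for example, by counting B- tags per sentence), which matches the abstract's point about reusing the raw NER datasets for joint training.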