Open Access Article
Information 2019, 10(4), 139; https://doi.org/10.3390/info10040139

Learning Subword Embedding to Improve Uyghur Named-Entity Recognition

1 College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
2 Multilingual Information Technology Laboratory of Xinjiang University, Urumqi 830046, China
3 Iflytek Voice and Language Joint Laboratory, Xinjiang University, Urumqi 830046, China
* Author to whom correspondence should be addressed.
Received: 27 March 2019 / Revised: 9 April 2019 / Accepted: 11 April 2019 / Published: 15 April 2019
(This article belongs to the Section Artificial Intelligence)

Abstract

Uyghur is a morphologically rich, typically agglutinative language, and morphological segmentation affects the performance of Uyghur named-entity recognition (NER). Common Uyghur NER systems take the word sequence as input and rely heavily on feature engineering. However, when only the word sequence is considered, semantic information cannot be fully learned, and the model easily suffers from data sparsity arising from morphological processes. To address this problem, we propose a neural network architecture that combines subword embedding with character embedding, built on a bidirectional long short-term memory network with a conditional random field layer. Our experiments show that subword embedding effectively improves Uyghur NER performance, and the proposed method outperforms the word-sequence-based model.
Keywords: subword embedding; Uyghur; named-entity recognition; morphological processing; word sequence; natural language processing; deep learning; word-based neural model
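
The data-sparsity argument in the abstract can be illustrated with a small sketch. It is not the authors' method: the romanized word forms, the hand-written segmentations, and the helper names `segment` and `oov_rate` are all assumptions made for illustration. The sketch compares how many test units are unseen in training when the units are whole words versus subwords.

```python
# Toy comparison of out-of-vocabulary (OOV) rates at word and subword level.
# ASSUMPTION: the word forms and segmentations are hand-written examples,
# not the output of the paper's morphological segmenter.

SEGMENTATIONS = {
    "mektep": ["mektep"],
    "mektepler": ["mektep", "ler"],
    "mekteplerde": ["mektep", "ler", "de"],
    "sheher": ["sheher"],
    "sheherde": ["sheher", "de"],
    "sheherler": ["sheher", "ler"],
}


def segment(word):
    """Return the assumed subword units of a word (the word itself if unknown)."""
    return SEGMENTATIONS.get(word, [word])


def oov_rate(train_units, test_units):
    """Fraction of test units that never occur among the training units."""
    vocab = set(train_units)
    test_units = list(test_units)
    unseen = sum(1 for unit in test_units if unit not in vocab)
    return unseen / len(test_units)


train_words = ["mektep", "mektepler", "sheher", "sheherde"]
test_words = ["mekteplerde", "sheherler"]

# Word level: each inflected form counts as its own vocabulary entry.
word_oov = oov_rate(train_words, test_words)

# Subword level: stems and suffixes are shared across inflected forms.
train_subwords = [s for w in train_words for s in segment(w)]
test_subwords = [s for w in test_words for s in segment(w)]
subword_oov = oov_rate(train_subwords, test_subwords)

print(f"word-level OOV rate:    {word_oov:.2f}")
print(f"subword-level OOV rate: {subword_oov:.2f}")
```

In this toy setting every test word is unseen as a whole word, while all of its subword units already occur in the training data, which is the kind of sharing a subword embedding can exploit.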

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Saimaiti, A.; Wang, L.; Yibulayin, T. Learning Subword Embedding to Improve Uyghur Named-Entity Recognition. Information 2019, 10, 139.

Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland