Open Access Article
Information 2017, 8(4), 157; doi:10.3390/info8040157

Bidirectional Long Short-Term Memory Network with a Conditional Random Field Layer for Uyghur Part-Of-Speech Tagging

1 School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
2 Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi 830046, China
* Author to whom correspondence should be addressed.
Received: 30 October 2017 / Revised: 23 November 2017 / Accepted: 27 November 2017 / Published: 30 November 2017
(This article belongs to the Section Artificial Intelligence)

Abstract

Uyghur is an agglutinative and morphologically rich language, which makes natural language processing tasks in Uyghur challenging. Word morphology is important in Uyghur part-of-speech (POS) tagging, but POS tagging performance suffers from errors propagated by morphological analyzers. To address this problem, we propose several models for POS tagging: conditional random fields (CRF), long short-term memory (LSTM) networks, bidirectional LSTM (BI-LSTM) networks, LSTM networks with a CRF layer, and BI-LSTM networks with a CRF layer. These models do not depend on stemming or word disambiguation for Uyghur, and they combine hand-crafted features with neural network models. The proposed approach achieves state-of-the-art performance on Uyghur POS tagging on the test data sets: 98.41% accuracy on 15 labels and 95.74% accuracy on 64 labels, which are 2.71% and 4% improvements, respectively, over the CRF model results. Using engineered features, our model achieves further improvements of 0.2% (15 labels) and 0.48% (64 labels). The results indicate that the proposed method could be an effective approach for POS tagging in other morphologically rich languages.
Keywords: Uyghur; part-of-speech tagging; conditional random field; long short-term memory; bidirectional long short-term memory
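
As a rough illustration of what a CRF layer on top of a BI-LSTM contributes, the sketch below decodes the best tag sequence from per-token scores plus tag-to-tag transition scores using the standard Viterbi algorithm. It is not the authors' implementation: the function name `viterbi_decode`, the plain-list inputs, and the idea that `emissions` would come from a BI-LSTM are assumptions made for illustration, and start/end transition scores are omitted for brevity.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   -- score of tag j at token t (hypothetically, the
                         per-token output of a BI-LSTM).
    transitions[i][j] -- score of moving from tag i to tag j (the
                         parameters a CRF layer would learn).
    """
    if not emissions:
        return []
    num_tags = len(transitions)

    # best[j] holds the best score of any path ending in tag j so far.
    best = list(emissions[0])
    backpointers = []

    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Choose the previous tag that maximizes the score into tag j.
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)

    # Start from the best final tag and follow the backpointers.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path
```

Because the transition scores are considered jointly with the per-token scores, the decoded sequence can differ from simply taking the best-scoring tag at each token independently, which is the motivation for placing a CRF layer on top of the recurrent network.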

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

MDPI and ACS Style

Maimaiti, M.; Wumaier, A.; Abiderexiti, K.; Yibulayin, T. Bidirectional Long Short-Term Memory Network with a Conditional Random Field Layer for Uyghur Part-Of-Speech Tagging. Information 2017, 8, 157.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Information, EISSN 2078-2489. Published by MDPI AG, Basel, Switzerland.