Open Access Article

Enhancing the Performance of Telugu Named Entity Recognition Using Gazetteer Features

Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Hyderabad Campus, Telangana 500078, India
* Author to whom correspondence should be addressed.
Information 2020, 11(2), 82; https://doi.org/10.3390/info11020082
Received: 11 November 2019 / Revised: 20 January 2020 / Accepted: 20 January 2020 / Published: 2 February 2020
(This article belongs to the Special Issue Computational Linguistics for Low-Resource Languages)
Named entity recognition (NER) is a fundamental step for many natural language processing tasks, so improvements in NER performance benefit the tasks that depend on it. With limited resources available, NER for South-East Asian languages like Telugu is a challenging problem. This paper attempts to improve NER performance for Telugu using gazetteer-related features, which are automatically generated from Wikipedia pages. We use these gazetteer features along with other well-known features, such as contextual, word-level, and corpus features, to build NER models. The models are developed using three well-known classifiers: conditional random field (CRF), support vector machine (SVM), and margin infused relaxed algorithm (MIRA). The gazetteer features are shown to improve performance, and the MIRA-based NER model fared better than its SVM and CRF counterparts.
Keywords: information extraction; named entity recognition; Telugu language; gazetteer; support vector machine; conditional random field; margin infused relaxed algorithm
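
The abstract describes combining gazetteer features with contextual and word-level features before training a classifier. A minimal sketch of what such a per-token feature map might look like is given below. It is not the authors' implementation: the function name `token_features`, the feature names, the toy `GAZETTEERS` lists, and the romanized example tokens are all hypothetical, and the paper's gazetteers are generated from Wikipedia pages rather than written by hand.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteers keyed by entity type (illustrative only;
# the paper derives its lists automatically from Wikipedia pages).
GAZETTEERS: Dict[str, Set[str]] = {
    "PERSON": {"ramudu"},
    "LOCATION": {"hyderabad"},
}


def token_features(
    tokens: List[str], i: int, gazetteers: Dict[str, Set[str]]
) -> Dict[str, object]:
    """Build a feature dict for tokens[i] (assumed feature set, not the paper's)."""
    word = tokens[i]
    feats: Dict[str, object] = {
        # Word-level features
        "word": word,
        "prefix2": word[:2],
        "suffix2": word[-2:],
        "length": len(word),
        # Contextual features: neighbouring tokens, with sentence-boundary markers
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }
    # Gazetteer features: one boolean per entity list
    for label, entries in gazetteers.items():
        feats[f"in_gaz_{label}"] = word in entries
    return feats


if __name__ == "__main__":
    sentence = ["ramudu", "hyderabad", "velladu"]  # romanized placeholder tokens
    for idx in range(len(sentence)):
        print(token_features(sentence, idx, GAZETTEERS))
```

Feature dicts of this shape could be passed to a sequence labeller such as a CRF or to an SVM after vectorization; which features and classifier settings the paper actually uses is not stated in the abstract.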
MDPI and ACS Style

Gorla, S.; Neti, L.B.M.; Malapati, A. Enhancing the Performance of Telugu Named Entity Recognition Using Gazetteer Features. Information 2020, 11, 82.

