Open Access Article
Information 2019, 10(3), 97; https://doi.org/10.3390/info10030097

Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction

1 Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria
2 Laboratory of Computer Graphics and Geographical Information Systems, Technical University of Sofia, 2173 Sofia, Bulgaria
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper presented at the 18th International Conference AIMSA 2018, Varna, Bulgaria, 12–14 September 2018.
Received: 27 January 2019 / Revised: 25 February 2019 / Accepted: 27 February 2019 / Published: 5 March 2019
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)

Abstract

The paper presents a flexible system for extracting features and creating training and test examples for the all-words word sense disambiguation (WSD) task. The system can integrate word and sense embeddings into the description of an example. Two features distinguish it from similar WSD systems: it can construct a special compressed representation of word embeddings, and it can construct training and test sets at different levels of data granularity. The first feature yields data sets of low dimensionality that can be used to train highly accurate classifiers of different types. The second feature yields sets of examples for training classifiers specialized in disambiguating a single word, words that share a part-of-speech (POS) category, or all open-class words. Extensive experiments show that classifiers trained on examples created by the system outperform the standard baselines used to evaluate all-words WSD classifiers.
Keywords: word sense disambiguation; word embedding; classification; neural networks; random forest; deep forest; JRip
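The notion of data granularity in the abstract can be pictured as a partitioning step over labelled examples. The following is a minimal sketch under stated assumptions, not the system's actual implementation: the `Example` record, the `group_by_granularity` function, the granularity names ("word", "pos", "all"), the feature values and the sense labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Example(NamedTuple):
    """Hypothetical record for one labelled occurrence of a target word."""
    lemma: str       # target word (lemma)
    pos: str         # part-of-speech category of the target word
    features: tuple  # feature vector, e.g. derived from embeddings (illustrative)
    sense: str       # gold sense label (illustrative identifiers)


def group_by_granularity(examples, granularity):
    """Partition examples into data sets, one per classifier to be trained.

    "word": one data set per (lemma, POS) pair
    "pos":  one data set per POS category
    "all":  a single data set containing every example
    """
    key_funcs = {
        "word": lambda ex: (ex.lemma, ex.pos),
        "pos": lambda ex: ex.pos,
        "all": lambda ex: "all",
    }
    if granularity not in key_funcs:
        raise ValueError(f"unknown granularity: {granularity!r}")
    key = key_funcs[granularity]
    datasets = defaultdict(list)
    for ex in examples:
        datasets[key(ex)].append(ex)
    return dict(datasets)


# Made-up examples; the numbers and sense labels carry no real meaning.
EXAMPLES = [
    Example("bank", "NOUN", (0.1, 0.3), "bank.n.01"),
    Example("bank", "NOUN", (0.7, 0.2), "bank.n.02"),
    Example("river", "NOUN", (0.4, 0.4), "river.n.01"),
    Example("run", "VERB", (0.9, 0.5), "run.v.01"),
]

if __name__ == "__main__":
    for level in ("word", "pos", "all"):
        groups = group_by_granularity(EXAMPLES, level)
        print(level, {k: len(v) for k, v in groups.items()})
```

Each resulting group would feed one classifier: a per-word group trains a classifier for a single word, a per-POS group trains one for a whole category, and the single "all" group trains one classifier for every open-class word.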
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
MDPI and ACS Style

Agre, G.; Petrov, D.; Keskinova, S. Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction. Information 2019, 10, 97.

Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland.