Open Access Article
Information 2018, 9(2), 35; https://doi.org/10.3390/info9020035

Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity

Centre for Informatics and Systems of the University of Coimbra (CISUC), Department of Informatics Engineering, University of Coimbra, 3030-290 Coimbra, Portugal
This paper is an extended version of our paper "Unsupervised Approaches for Computing Word Similarity in Portuguese", published in Progress in Artificial Intelligence, Proceedings of the 18th EPIA Conference on Artificial Intelligence, Porto, Portugal, 5–8 September 2017; Volume 10423 of LNCS, Springer, pp. 828–840.
Received: 18 December 2017 / Revised: 23 January 2018 / Accepted: 4 February 2018 / Published: 8 February 2018

Abstract

Identifying similar and related words is key to natural language understanding and is also a suitable task for assessing the quality of computational resources that organise the words and meanings of a language, however they were compiled. This paper, which aims to be a reference for those interested in computing word similarity in Portuguese, presents several approaches to this task. It is motivated by the recent availability of state-of-the-art distributional models of Portuguese words, which add to the several lexical knowledge bases (LKBs) that have been available for this language for longer. These resources were exploited to answer word similarity tests, which have also recently become available for Portuguese. We conclude that there are several valid approaches to this task, but none outperforms all the others in every test. Distributional models seem to capture relatedness better, while LKBs are better suited to computing genuine similarity; in general, however, better results are obtained when knowledge from different sources is combined.
Keywords: semantic similarity; word similarity; lexical knowledge bases; lexical semantics; word embeddings; distributional semantics
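The two families of approaches named in the abstract, and the idea of combining them, can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the toy vectors, the neighbour sets, the choice of cosine and Jaccard as measures, and the equal weighting in `combined` are all assumptions made for illustration only.

```python
import math

# Hypothetical toy data: real distributional models have hundreds of
# dimensions, and real LKBs have far richer relation networks.
VECTORS = {
    "carro": [0.9, 0.1, 0.0],
    "automóvel": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
NEIGHBOURS = {
    "carro": {"veículo", "automóvel", "viatura"},
    "automóvel": {"veículo", "carro", "viatura"},
    "banana": {"fruta", "bananeira"},
}


def cosine(u, v):
    """Distributional score: cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(a, b):
    """Knowledge-based score: overlap between the sets of words that an
    LKB relates to each word."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined(w1, w2, vectors=VECTORS, neighbours=NEIGHBOURS):
    """Assumed combination: unweighted mean of the two scores."""
    dist = cosine(vectors[w1], vectors[w2])
    know = jaccard(neighbours[w1], neighbours[w2])
    return 0.5 * dist + 0.5 * know


if __name__ == "__main__":
    for pair in [("carro", "automóvel"), ("carro", "banana")]:
        print(pair, round(combined(*pair), 3))
```

In a word similarity test, scores such as these would be computed for every word pair in the test and compared with the human-assigned scores, typically through a correlation coefficient.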
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Gonçalo Oliveira, H. Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity. Information 2018, 9, 35.

Information, EISSN 2078-2489. Published by MDPI AG, Basel, Switzerland.