
Fully-Unsupervised Embeddings-Based Hypernym Discovery

by Maurizio Atzori 1,*,† and Simone Balloccu 2,*,†
DMI, University of Cagliari, 09124 Cagliari, Italy
Computing Science Department, University of Aberdeen, Aberdeen AB24 3FX, UK
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Information 2020, 11(5), 268;
Received: 15 April 2020 / Revised: 8 May 2020 / Accepted: 11 May 2020 / Published: 18 May 2020
(This article belongs to the Special Issue Advances in Computational Linguistics)
The hypernymy relation is the one occurring between an instance term and its general term (e.g., “lion” and “animal”, “Italy” and “country”). In this paper we address Hypernym Discovery, the NLP task that aims at finding valid hypernyms for words in a given text, proposing HyperRank, an unsupervised approach that, unlike most approaches in the literature, does not require manually labeled training sets. The proposed algorithm exploits the cosine distance of points in the vector space of word embeddings, as in previous state-of-the-art approaches, but the ranking is then corrected by also weighting word frequencies and the absolute level of similarity, which is expected to be similar when measuring co-hyponyms and their common hypernym. This brings two major advantages over other approaches: (1) we correct the inadequacy of semantic similarity, which is known to cause a significant performance drop, and (2) we take into account multiple words if provided, making it possible to find common hypernyms for a set of co-hyponyms, a task ignored by other systems but very useful when coupled with set expansion (which finds co-hyponyms automatically). We then evaluate HyperRank on the SemEval 2018 Hypernym Discovery task and show that, regardless of language or domain, our algorithm significantly outperforms all existing unsupervised algorithms and some supervised ones as well. We also evaluate the algorithm on a new dataset to measure the improvements when finding hypernyms for sets of words instead of singletons.
Keywords: natural language processing; natural language understanding; unsupervised learning; hypernym discovery; word embeddings; word2vec
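The core idea described in the abstract, ranking candidate hypernyms by cosine similarity in a word-embedding space and averaging over a set of co-hyponyms so their common hypernym scores consistently high, can be sketched as follows. This is a minimal illustrative sketch, not the authors' HyperRank implementation: the frequency-based ranking correction is omitted, and the toy vectors below stand in for real word2vec embeddings.

```python
import numpy as np

# Toy word vectors for illustration only; in practice these would
# come from a pretrained word2vec model.
embeddings = {
    "lion":    np.array([0.9, 0.8, 0.1]),
    "tiger":   np.array([0.85, 0.82, 0.15]),
    "animal":  np.array([0.7, 0.7, 0.2]),
    "country": np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_hypernyms(query_terms, candidates, embeddings):
    """Rank candidate hypernyms by their average cosine similarity
    to the query term(s).  Averaging over a set of co-hyponyms
    favors candidates that are similarly close to all of them,
    i.e., a plausible common hypernym."""
    scores = {}
    for c in candidates:
        sims = [cosine(embeddings[q], embeddings[c]) for q in query_terms]
        scores[c] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)

# Common hypernym for a set of co-hyponyms:
print(rank_hypernyms(["lion", "tiger"], ["animal", "country"], embeddings))
# → ['animal', 'country']
```

Passing a single-element query list recovers the singleton case the paper also evaluates; the set-based call is the extension the abstract highlights as useful alongside set expansion.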
MDPI and ACS Style

Atzori, M.; Balloccu, S. Fully-Unsupervised Embeddings-Based Hypernym Discovery. Information 2020, 11, 268.

