Article

FONDUE: A Framework for Node Disambiguation and Deduplication Using Network Embeddings †

AIDA, IDLab-ELIS, Ghent University, 9052 Ghent, Belgium
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published at IEEE DSAA 2020, the 7th IEEE International Conference on Data Science and Advanced Analytics.
Academic Editors: Paola Velardi and Stefano Faralli
Appl. Sci. 2021, 11(21), 9884; https://doi.org/10.3390/app11219884
Received: 2 August 2021 / Revised: 13 October 2021 / Accepted: 18 October 2021 / Published: 22 October 2021
(This article belongs to the Special Issue Social Network Analysis)
Data often have a relational nature that is most easily expressed as a network, whose main components are nodes that represent real objects and links that signify the relations between these objects. Modeling networks is useful for many purposes, but the efficacy of downstream tasks is often hampered by data quality issues introduced when the network is constructed. In many constructed networks, ambiguity arises when a single node corresponds to multiple concepts. Conversely, a single entity can be mistakenly represented by several different nodes. In this paper, we formalize the node disambiguation (NDA) and node deduplication (NDD) tasks to resolve these data quality issues. We then introduce FONDUE, a framework that uses network embedding methods for data-driven disambiguation and deduplication of nodes. Given an undirected and unweighted network, and using only the network topology, FONDUE-NDA identifies nodes that appear to correspond to multiple entities and suggests how to split them (node disambiguation), whereas FONDUE-NDD identifies nodes that appear to correspond to the same entity so that they can be merged (node deduplication). In controlled experiments on benchmark networks, we find that, compared to state-of-the-art alternatives, FONDUE-NDA is substantially and consistently more accurate at identifying ambiguous nodes while having a lower computational cost, and that FONDUE-NDD is a competitive alternative for node deduplication.
Keywords: node disambiguation; node deduplication; node linking; entity linking; network embeddings; representation learning
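The two graph operations that the abstract refers to, splitting an ambiguous node and merging duplicate nodes, can be illustrated on a small undirected, unweighted network. The following is a minimal sketch only: it shows the operations themselves, not how FONDUE decides which nodes to split or merge (that decision relies on network embeddings and is not reproduced here). The function names `build_graph`, `split_node`, and `merge_nodes`, the `#i` naming of split nodes, and the toy edge list are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected, unweighted graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def split_node(graph, node, partition):
    """Node disambiguation step (illustrative): replace `node` by one new
    node per group of neighbors in `partition`.

    `partition` is a list of disjoint sets that together cover all
    neighbors of `node`. Self-loops are assumed to be absent.
    """
    neighbors = graph[node]
    covered = set().union(*partition)
    if covered != neighbors or sum(len(g) for g in partition) != len(neighbors):
        raise ValueError("partition must split the neighbors into disjoint groups")

    # Copy the graph without the ambiguous node.
    new_graph = {u: set(nbrs) for u, nbrs in graph.items() if u != node}
    for u in neighbors:
        new_graph[u].discard(node)

    # Create one replacement node per neighbor group (hypothetical naming).
    for i, group in enumerate(partition):
        new_node = f"{node}#{i}"
        new_graph[new_node] = set(group)
        for u in group:
            new_graph[u].add(new_node)
    return new_graph


def merge_nodes(graph, keep, drop):
    """Node deduplication step (illustrative): merge `drop` into `keep`,
    giving `keep` the union of both neighborhoods."""
    new_graph = {u: set(nbrs) for u, nbrs in graph.items() if u != drop}
    for u in graph[drop]:
        new_graph[u].discard(drop)
        if u != keep:  # avoid creating a self-loop on the merged node
            new_graph[u].add(keep)
            new_graph[keep].add(u)
    return new_graph


# Toy network: node "x" links two otherwise separate pairs of nodes.
edges = [("a", "x"), ("b", "x"), ("c", "x"), ("d", "x"), ("a", "b"), ("c", "d")]
graph = build_graph(edges)

# Treat "x" as ambiguous and split it along its two neighbor groups.
split = split_node(graph, "x", [{"a", "b"}, {"c", "d"}])

# Treat the two resulting nodes as duplicates and merge them again.
merged = merge_nodes(split, "x#0", "x#1")
```

In this toy case, merging the two split nodes restores the original neighborhood of the ambiguous node, which reflects how the two tasks act as inverse operations on the network.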

MDPI and ACS Style

Mel, A.; Kang, B.; Lijffijt, J.; De Bie, T. FONDUE: A Framework for Node Disambiguation and Deduplication Using Network Embeddings. Appl. Sci. 2021, 11, 9884. https://doi.org/10.3390/app11219884

AMA Style

Mel A, Kang B, Lijffijt J, De Bie T. FONDUE: A Framework for Node Disambiguation and Deduplication Using Network Embeddings. Applied Sciences. 2021; 11(21):9884. https://doi.org/10.3390/app11219884

Chicago/Turabian Style

Mel, Ahmad, Bo Kang, Jefrey Lijffijt, and Tijl De Bie. 2021. "FONDUE: A Framework for Node Disambiguation and Deduplication Using Network Embeddings" Applied Sciences 11, no. 21: 9884. https://doi.org/10.3390/app11219884
