Open Access | Feature Paper | Article
Information 2018, 9(4), 75; https://doi.org/10.3390/info9040075

Language-Agnostic Relation Extraction from Abstracts in Wikis

Data and Web Science Group, University of Mannheim, Mannheim 68131, Germany
* Author to whom correspondence should be addressed.
Received: 5 February 2018 / Revised: 16 March 2018 / Accepted: 28 March 2018 / Published: 29 March 2018
(This article belongs to the Special Issue Towards the Multilingual Web of Data)

Abstract

Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6 M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted. In a second series of experiments, we show how the approach can be transferred to DBkWik, a knowledge graph extracted from thousands of Wikis. We discuss the challenges and first results of extracting relations from a larger set of Wikis, using a less formalized knowledge graph.
Keywords: relation extraction; knowledge graphs; Wikipedia; DBpedia; DBkWik; Wiki farms
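
The distant supervision idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_examples`, the two features (position of the linked entity in the abstract, entity types from the graph), and the toy data are all assumptions chosen for illustration. The sketch also assumes a local closed world labeling rule: if the graph already holds at least one object for a subject and relation, other linked entities count as negatives, and if it holds none, the candidates stay unlabeled and are left for a trained classifier to predict.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def build_examples(
    subject: str,
    relation: str,
    linked_entities: List[str],
    entity_types: Dict[str, Set[str]],
    kg_facts: Set[Triple],
) -> List[Tuple[str, Dict[str, object], Optional[bool]]]:
    """Label entities linked in a subject's abstract using facts from the graph.

    Hypothetical helper: features use no words from the text, only the
    entity's position and its types in the graph (language-independent).
    """
    # Objects the graph already asserts for this subject and relation.
    known_objects = {o for (s, r, o) in kg_facts if s == subject and r == relation}
    examples = []
    for position, entity in enumerate(linked_entities):
        features = {
            "position": position,  # order of the link in the abstract
            "types": sorted(entity_types.get(entity, set())),
        }
        if not known_objects:
            label = None  # nothing known: candidate for prediction
        else:
            label = entity in known_objects  # assumed local closed world
        examples.append((entity, features, label))
    return examples


# Toy data, invented for illustration only.
kg = {("Mannheim", "country", "Germany")}
types = {"Germany": {"Country"}, "Rhine": {"River"}}
examples = build_examples("Mannheim", "country", ["Germany", "Rhine"], types, kg)
labels = [(entity, label) for entity, _, label in examples]
unlabeled = build_examples("Heidelberg", "country", ["Germany"], types, kg)
```

Labeled examples of this kind could then be fed to a classifier such as a random forest, while the unlabeled candidates are the ones for which new relations would be predicted.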

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Heist, N.; Hertling, S.; Paulheim, H. Language-Agnostic Relation Extraction from Abstracts in Wikis. Information 2018, 9, 75.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland.