Towards the Multilingual Web of Data

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (15 September 2018) | Viewed by 37334

Special Issue Editors

Dr. John P. McCrae

Guest Editor

Data Science Institute/Insight Centre for Data Analytics, National University of Ireland Galway, H91 CF50 Galway, Ireland
Interests: natural language processing; semantic web

Dr. Jorge Gracia

Guest Editor

Ontology Engineering Group, Artificial Intelligence Department, Universidad Politécnica de Madrid, Madrid, Spain
Interests: semantic web; linguistic linked data; multilingualism; query interpretation; ontology matching

Special Issue Information

Dear Colleagues,

The MDPI Information Journal invites submissions to a special issue on “Towards the Multilingual Web of Data”.

The Web of Data has increasingly become a space where concepts are described not only with logic and ontologies but also with linguistic information in the form of multilingual lexicons, terminologies, and thesauri. In particular, this has led to the creation of a growing cloud of linguistically-linked open data, which bridges the world of ontologies with dictionaries, corpora and other linguistic resources. This raises several challenges such as ontology localisation, cross-lingual question answering, cross-lingual ontology and data matching, representation of lexical information on the Web of Data, etc.

Furthermore, NLP and machine learning for linked data can benefit from exploiting multilingual language resources, such as annotated corpora, wordnets, and bilingual dictionaries, if they are themselves formally represented and linked by following the linked data principles. A critical mass of language resources as linked data on the Web is leading to a new generation of linked-data-aware NLP techniques and tools, which, in turn, will serve as the basis for a richer, multilingual Web.

This Special Issue is concerned with groundbreaking topics at the interface of the Semantic Web, language resources and NLP, with particular emphasis on multilingual aspects.

Topics of the call

Linguistic Linked Open Data
NLP on the Semantic Web
Multilinguality and semantics
Best practices for multilingual linked data
Validation, quality and legal issues for multilingual linked data
Language resources published as linked data
Ontologies, terminologies and models for multilingual linked data
Publishing language resources as linked data using language description models such as OntoLex-lemon
Cross-lingual information access, search and retrieval
NLP and machine learning approaches for the Semantic Web and Linked Data

Papers should be 9–15 pages long and formatted according to the MDPI template. Complete instructions for authors can be found at:

https://www.mdpi.com/journal/information/instructions

Important Dates
Submission: 15 December 2017
Notification: January 2018

Dr. John P. McCrae
Dr. Jorge Gracia
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Linked Data
Semantic Web
Linguistic Linked Data
Multilinguality
Language Resources
Lexicography
Ontology-Lexicon Interface
Natural Language Processing
Ontologies
Terminologies

Published Papers (7 papers)

Editorial

2 pages, 167 KiB

Open Access Editorial

Foreword to the Special Issue: “Towards the Multilingual Web of Data”

by John P. McCrae and Jorge Gracia

Information 2019, 10(2), 56; https://doi.org/10.3390/info10020056 - 09 Feb 2019

Cited by 1 | Viewed by 2498

Abstract

We are pleased to introduce this special issue on the topic of “Towards the Multilingual Web of Data”, which we feel is a timely and valuable topic in our increasingly multilingual and interconnected world [...]

(This article belongs to the Special Issue Towards the Multilingual Web of Data)

Research

17 pages, 1880 KiB

Open Access Article

Towards the Representation of Etymological Data on the Semantic Web

by Anas Fahad Khan

Information 2018, 9(12), 304; https://doi.org/10.3390/info9120304 - 30 Nov 2018

Cited by 10 | Viewed by 4718

Abstract

In this article, we look at the potential for a wide-coverage modelling of etymological information as linked data using the Resource Description Framework (RDF) data model. We begin with a discussion of some of the most typical features of etymological data and the challenges that these might pose to an RDF-based modelling. We then propose a new vocabulary for representing etymological data, the OntoLex-lemon Etymological Extension (lemonETY), based on the OntoLex-lemon model. Each of the main elements of our new model is motivated with reference to the preceding discussion.

24 pages, 2248 KiB

Open Access Article

Semantic Modelling and Publishing of Traditional Data Collection Questionnaires and Answers

by Yalemisew Abgaz, Amelie Dorn, Barbara Piringer, Eveline Wandl-Vogt and Andy Way

Information 2018, 9(12), 297; https://doi.org/10.3390/info9120297 - 24 Nov 2018

Cited by 14 | Viewed by 5783

Abstract

Extensive collections of data of linguistic, historical and socio-cultural importance are stored in libraries, museums and national archives with enormous potential to support research. However, a sizable portion of the data remains underutilised because of a lack of the required knowledge to model the data semantically and convert it into a format suitable for the semantic web. Although many institutions have produced digital versions of their collections, semantic enrichment, interlinking and exploration are still missing from the digitised versions. In this paper, we present a model that provides structure and semantics to a non-standard linguistic and historical data collection, using the example of the Bavarian dialects in Austria at the Austrian Academy of Sciences. We followed a semantic modelling approach that utilises the knowledge of domain experts and the corresponding schema produced during the data collection process. The model is used to enrich, interlink and publish the collection semantically. The dataset includes questionnaires and answers as well as supplementary information about the circumstances of the data collection (person, location, time, etc.). The semantic uplift is demonstrated by converting a subset of the collection to a Linked Open Data (LOD) format, where domain experts evaluated the model and the resulting dataset for its support of user queries.

16 pages, 591 KiB

Open Access Feature Paper Article

Annotating a Low-Resource Language with LLOD Technology: Sumerian Morphology and Syntax

by Christian Chiarcos, Ilya Khait, Émilie Pagé-Perron, Niko Schenk, Jayanth, Christian Fäth, Julius Steuer, William Mcgrath and Jinyan Wang

Information 2018, 9(11), 290; https://doi.org/10.3390/info9110290 - 19 Nov 2018

Cited by 13 | Viewed by 6155

Abstract

This paper describes work on the morphological and syntactic annotation of Sumerian cuneiform as a model for low-resource languages in general. Cuneiform texts are invaluable sources for the study of history, languages, economy, and cultures of Ancient Mesopotamia and its surrounding regions. Assyriology, the discipline dedicated to their study, has vast research potential, but lacks the modern means for computational processing and analysis. Our project, Machine Translation and Automated Analysis of Cuneiform Languages, aims to fill this gap by bringing together corpus data, lexical data, linguistic annotations and object metadata. The project’s main goal is to build a pipeline for machine translation and annotation of Sumerian Ur III administrative texts. The rich and structured data is then to be made accessible in the form of (Linguistic) Linked Open Data (LLOD), which should open it to a larger research community. Our contribution is two-fold: in terms of language technology, our work represents the first attempt to develop an integrative infrastructure for the annotation of morphology and syntax on the basis of RDF technologies and LLOD resources. With respect to Assyriology, we work towards producing the first syntactically annotated corpus of Sumerian.

30 pages, 2555 KiB

Open Access Article

Conversion of the English-Xhosa Dictionary for Nurses to a Linguistic Linked Data Framework

by Frances Gillis-Webber

Information 2018, 9(11), 274; https://doi.org/10.3390/info9110274 - 06 Nov 2018

Cited by 3 | Viewed by 4110

Abstract

The English-Xhosa Dictionary for Nurses (EXDN) is a bilingual, unidirectional printed dictionary in the public domain, with English and isiXhosa as the language pair. By extending the digitisation efforts of EXDN from a human-readable digital object to a machine-readable state, using Resource Description Framework (RDF) as the data model, semantically interoperable structured data can be created, thus enabling EXDN’s data to be reused, aggregated and integrated with other language resources, where it can serve as a potential aid in the development of future language resources for isiXhosa, an under-resourced language in South Africa. The methodological guidelines for the construction of a Linguistic Linked Data framework (LLDF) for a lexicographic resource, as applied to EXDN, are described, where an LLDF can be defined as a framework: (1) which describes data in RDF, (2) using a model designed for the representation of linguistic information, (3) which adheres to Linked Data principles, and (4) which supports versioning, allowing for change. The result is a bidirectional lexicographic resource, previously bounded and static, now unbounded and evolving, with the ability to extend to multilingualism.

22 pages, 1304 KiB

Open Access Feature Paper Article

Language-Agnostic Relation Extraction from Abstracts in Wikis

by Nicolas Heist, Sven Hertling and Heiko Paulheim

Information 2018, 9(4), 75; https://doi.org/10.3390/info9040075 - 29 Mar 2018

Cited by 15 | Viewed by 7556

Abstract

Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6 M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted. In a second series of experiments, we show how the approach can be transferred to DBkWik, a knowledge graph extracted from thousands of Wikis. We discuss the challenges and first results of extracting relations from a larger set of Wikis, using a less formalized knowledge graph.

13 pages, 1278 KiB

Open Access Article

Multilingual and Multiword Phenomena in a lemon Old Occitan Medico-Botanical Lexicon

by Andrea Bellandi, Emiliano Giovannetti and Anja Weingart

Information 2018, 9(3), 52; https://doi.org/10.3390/info9030052 - 28 Feb 2018

Cited by 12 | Viewed by 4966

Abstract

This article illustrates the progress made in representing a multilingual and multi-alphabetical Old Occitan medico-botanical lexicon in the context of the project Dictionnaire de Termes Médico-botaniques de l’Ancien Occitan (DiTMAO). The chosen lexical model of reference is lemon, which has been extended according to some specific linguistic and lexical features of the lexicon. In particular, issues and solutions concerning the modeling of multilingual and multiword phenomena are discussed, as well as the way they are managed through LexO, a web editor developed in the context of the project.
