Special Issue on Smart Data and Semantics in a Sensor World


Introduction
Since its inception in 2001, the application of the Semantic Web [1,2] has made extensive use of ontologies [3][4][5], reasoning, and semantics in diverse fields, such as Information Integration, Software Engineering, Bioinformatics, eGovernment, eHealth, and social networks. This widespread use of ontologies has led to substantial advances in the development of techniques to manipulate, share, reuse, and integrate information across heterogeneous data sources.
In recent years, the growth of the Internet of Things (IoT) has made it necessary to face the challenges of "Big Data" [6][7][8][9][10]. The cost of sensors is decreasing, while their use is expanding. Moreover, the use of multiple personal smart devices is an emerging trend, and all of them can embed sensors to monitor the surrounding environment. Therefore, the number of available sensors is exploding. On the one hand, the flows of sensor data are massive and continuous, and the data can be obtained in real time or with a delay of just a few seconds. As a result, the volume of sensor data increases every day. On the other hand, the variety of the data being generated is also increasing, due to the many different devices and the different measures to record. There are many kinds of structured and unstructured sensor data in diverse formats. Moreover, data veracity, which is the degree of accuracy or truthfulness of a data set, is an important aspect to consider. In the context of sensor data, it represents the trustworthiness of the data source and of the data processing. The need for more accurate and reliable data has always been acknowledged, but it has often been overlooked for the sake of larger and cheaper datasets. Sensor data are inherently uncertain and imprecise; therefore, enough resources must be allocated to clean them properly and thereby increase their quality.
Research areas related to the Semantic Web, such as ontology matching (also called ontology alignment) [11,12], are providing efficient methodologies, based on the RDF and OWL languages, that offer standard ways to convert such datasets into Linked Data sources [13,14]. In several fields, such as social networks, smart cities, or context-aware mobile applications, publishing the available data as Linked Data is highly relevant. Therefore, techniques that enable users to publish, visualize, and easily manipulate data are in high demand. In addition, the standardization of this linked format has changed the way data are represented and analyzed, which requires new graph-based machine learning and data mining techniques to explore such representations.
On the other hand, sensing technologies have become an important field for computer scientists. Sensors are sparsely distributed across the globe, leading to an overwhelming amount of data about our environment. Sensors can range from stationary environmental sensors to drones or autonomous vehicles collecting data, or even to humans acting as sensors using smartphones, and can be used to detect a multitude of observations, from simple phenomena to complex events. Moreover, the Sensor Web [15][16][17] has realized the idea of a standardized, interoperable platform for everyone to easily share, find, and access sensor data.
However, the various characteristics of sensor data and their corresponding processing requirements, such as their multisource, heterogeneous, real-time, voluminous, streaming, and spatiotemporal features, have exposed the limitations of many traditional data processing and integration approaches. Moreover, the lack of integration and communication between sensor networks often isolates important data streams and intensifies the existing problem of having too much data and not enough knowledge.
In this area, Semantic Web technologies [18] have provided specific means to address these limitations. Specifically, the Semantic Sensor Web (SSW) [15] proposes that sensor data be annotated with semantic metadata, which will increase interoperability and provide contextual information essential for situational knowledge. Social applications as well as ubiquitous and pervasive computing are examples of areas that make use of semantically enriched measured or processed data (e.g., use of semantic techniques for location-based services [19]). Semantization, context awareness, community management, and data visualization are core issues related to this area.
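As an illustration of what annotating sensor data with semantic metadata can look like in practice, the following minimal Python sketch serializes a single sensor reading as RDF triples in N-Triples syntax. It is only a sketch under stated assumptions: the choice of the W3C SOSA vocabulary is ours and is not prescribed by the works cited above, and the `http://example.org/` namespace, the `annotate` function, and the identifiers used in the example are hypothetical.

```python
from datetime import datetime, timezone

# Vocabulary namespaces. Using SOSA here is an assumption made for illustration.
SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/"  # hypothetical namespace for our own resources


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str, datatype: str) -> str:
    """Build a typed literal (the value must not contain quotes or backslashes)."""
    return f'"{value}"^^<{datatype}>'


def annotate(obs_id: str, sensor_id: str, property_id: str,
             value: float, when: datetime) -> list[str]:
    """Return N-Triples lines describing one observation made by one sensor."""
    obs = iri(EX + "observation/" + obs_id)
    triples = [
        (obs, iri(RDF_TYPE), iri(SOSA + "Observation")),
        (obs, iri(SOSA + "madeBySensor"), iri(EX + "sensor/" + sensor_id)),
        (obs, iri(SOSA + "observedProperty"), iri(EX + "property/" + property_id)),
        (obs, iri(SOSA + "hasSimpleResult"),
         literal(repr(float(value)), XSD + "double")),
        (obs, iri(SOSA + "resultTime"),
         literal(when.isoformat(), XSD + "dateTime")),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    # Hypothetical reading: 21.5 degrees reported by sensor "s1".
    timestamp = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
    for line in annotate("1", "s1", "airTemperature", 21.5, timestamp):
        print(line)
```

Because every reading carries explicit links to its sensor, observed property, and timestamp, data from heterogeneous sources that share the same vocabulary can, in principle, be merged and queried together.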

Topics of Interest for the Special Issue
The goal of this Special Issue is to provide a venue to show the practical progress made in the area of data management for sensors, particularly regarding the use of semantic techniques to obtain and exploit smart data from the raw sensor data. The terms "semantic web" and "sensor web" were first used more than 15 years ago. Although they are now mature topics, the research community is still very active in them, and significant research challenges remain to be addressed.
This Special Issue was intended to provide insights into recent advances in these topics by soliciting original scientific contributions in the form of theoretical foundations, models, experimental research, and case studies for developing Semantic Web-based applications. We aimed to bring together research related to several disciplines, such as Data Management, Knowledge Representation and Engineering, Web of Data, and Sensor Networks, among others. Original research contributions were invited on all aspects of the Semantic Web and Sensor Web, as well as their applications. In particular, the following themes were within the scope of this Special Issue:

1. Semantics and Sensor Data: real-time sensor data streams, data management for sensor data, obtaining smart data from sensors, analytics of sensor data streams, semantic modelling and annotation of sensor data, scaling semantic sensor systems, sensor data representation, acquisition, and cleaning, semantic integration of heterogeneous data sources, semantic data management technologies for sensor data, challenges with managing and integrating real-time and historical sensor data, provenance, and access control and privacy-preserving issues in semantic data and sensor data.

2. Linked Data: Linked Data applications and case studies, visualizations, and user interfaces for ontologies, sensor data and linked data, machine learning and data mining for the Web of Data.

3. Sensor-Based Applications: semantic modeling of data for smart cities, the mobile web, sensors and semantic streams, smart cities, urban and geospatial data, semantics and sensor data for smart cities, semantics and eGovernment, managing sensor data in transportation applications, collaborative sensing, and spatial crowdsourcing.

We encouraged theoretical, methodological, empirical, and application papers. The submitted papers were expected to describe original work, present significant results, and provide a rigorous, principled, and repeatable evaluation. In addition, the submission of papers incorporating links to data sets and other materials used for evaluation, as well as to live demos and software source code, was appreciated.

The Papers
A total of 18 papers were submitted to the Special Issue on Smart Data and Semantics in a Sensor World. After peer review, four of them were accepted and published. Afterwards, an additional paper co-authored by the guest editors was submitted and handled independently by another editor of Applied Sciences; after the review process, it was accepted and also included in the Special Issue.
The paper "Using Adverse Weather Data in Social Media to Assist with City-Level Traffic Situation Awareness and Alerting", authored by Hao Lu, Yifan Zhu, Kaize Shi, Yisheng Lv, Pengfei Shi, and Zhendong Niu, presents a forecasting and alerting approach for traffic incidents in a city. It exploits temporal, spatial, and meteorological features extracted from social media to estimate the city's traffic status and suggest the appropriate warning levels. The proposal is evaluated with a traffic incidents dataset (tweets extracted from Sina Weibo, a microblogging service for Chinese-speaking people).
The paper "Ontological Representation of Smart City Data: From Devices to Cities", co-authored by Paola Espinoza-Arias, María Poveda-Villalón, Raúl García-Castro, and Óscar Corcho, presents a systematic literature review of available smart city ontologies and proposes a set of ontology design patterns. These patterns, which provide core concepts to guide the development of ontologies for smart cities, are defined by considering the ontological requirements for smart city data and the findings of the survey presented.
The paper "MEBN-RM: A Mapping between Multi-Entity Bayesian Network and Relational Model", co-authored by Cheol Young Park and Kathryn Blackmond Laskey, presents MEBN-RM, a set of mapping rules between key elements of the knowledge representation formalism Multi-Entity Bayesian Network (MEBN), which is the basis of PR-OWL (a probabilistic extension to OWL), and the Relational Model (RM). An algorithm that transforms a relational schema into a MEBN model has been released as an open software tool. The proposal is illustrated with two example use cases for situation awareness (a critical infrastructure defense system and a smart manufacturing system), where sensor data are stored in relational databases and converted to MEBN models, and the performance is evaluated in terms of the mapping speed and the accuracy of the mapping.
The paper "Smart Environmental Data Infrastructures: Bridging the Gap between Earth Sciences and Citizens", co-authored by José R. R. Viqueira, Sebastián Villarroya, David Mera, and José A. Taboada, focuses on the problem of monitoring and forecasting environmental conditions and reviews earth science data representation and access standards and technologies. It identifies nine key challenges that need to be tackled to achieve effective and efficient smart geospatial searching and browsing.
Finally, the paper "Semantic Traffic Sensor Data: The TRAFAIR Experience", co-authored by Federico Desimoni, Sergio Ilarri, Laura Po, Federica Rollo, and Raquel Trillo-Lado, tackles the problem of modeling and semantic annotation of traffic data. It presents the tools and techniques used in the TRAFAIR (Understanding Traffic Flows to Improve Air Quality) project for data modeling and its semantic enrichment in the cities of Modena (Italy) and Zaragoza (Spain). An experimental evaluation shows the performance of the approach proposed to publish Linked Data.
Author Contributions: Paper writing, special issue coordination and management of submitted papers, S.I.; Paper writing and management of submitted papers, L.P.; Paper writing and management of submitted papers, R.T.-L. All authors have read and agreed to the published version of the manuscript.

Funding: We acknowledge the support of the projects that fund our work on the topics of this Special Issue: the TRAFAIR project 2017-EU-IA-0167, co-financed by the Connecting Europe Facility of the European Union; the project TIN2016-78011-C4-3-R (AEI/FEDER, UE); and the Government of Aragon (Group Reference T64_20R, COSMOS research group).