Distributed and Parallel Architectures for Spatial Data

A special issue of ISPRS International Journal of Geo-Information (ISSN 2220-9964).

Deadline for manuscript submissions: closed (30 June 2019) | Viewed by 34044

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Guest Editor
Department of Computer Science, University of Verona, 37129 Verona, Italy
Interests: spatial big data systems; spatio-temporal data analysis; spatial query processing; conceptual design of spatial databases; spatial constraints; spatial data validation

Guest Editor
Department of Computer Science, University of Verona, 37134 Verona, Italy
Interests: data management; spatiotemporal information systems; big data and analytics; collaborative and distributed architectures; blockchain technology

Guest Editor
Department of Computer Science, University of Verona, Verona, Italy
Interests: big data systems: design, analysis and evaluation of large scale data processing systems; distributed systems: analysis of the Content Delivery Networks (CDNs), with a focus on cache management policies

Guest Editor
Department of Industrial and Information Engineering and Economics, University of L'Aquila, 67100 L'Aquila, Italy
Interests: spatial databases; spatial query languages; mathematical modeling of spatial information; computational geometry; spatio-temporal reasoning; qualitative modeling of geographical information; indoor and outdoor navigation; volunteered geographic information

Special Issue Information

Dear Colleagues,

In recent years, an increasing amount of spatial data has been collected by different types of devices, such as mobile phones, sensors, satellites, space telescopes, and medical analysis tools, or generated by social networks, for example as geotagged tweets. Processing this huge amount of information, whose spatial properties are frequently represented in heterogeneous ways, is a challenging task that has driven research in the big data area to investigate the problem and propose new solutions for dealing with its peculiarities.

Many approaches to the problem have been proposed in the literature, addressing different goals and different types of users. However, most of them are obtained by customizing existing approaches that were originally developed for processing alphanumeric big data, without any specific support for spatial or spatio-temporal properties. Thus, the proposed solutions can exploit the parallelism provided by these kinds of systems, but without taking proper account of the space and time dimensions that intrinsically characterize the analyzed datasets. As described in the literature, current solutions include: (i) the on-top approach, where an underlying system for traditional big datasets is used as a black box while spatial processing is added through user-defined functions specified on top of the underlying system; (ii) the from-scratch approach, where a completely new system is implemented for a specific application context; and (iii) the built-in approach, where an existing solution is extended by injecting spatial data functions into its core.
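
As a rough illustration of the on-top approach (i), the sketch below treats a generic map-and-reduce engine as a black box and adds spatial logic only through a user-defined function. The engine (`run_job`), the record layout, and the rectangle predicate are all hypothetical names chosen for this example; they do not correspond to any specific system from the literature.

```python
from collections import defaultdict

# Hypothetical record layout for this sketch: (id, x, y)


def run_job(records, mapper, reducer):
    """Stand-in for a general-purpose engine: it knows nothing about geometry."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def make_range_udf(xmin, ymin, xmax, ymax):
    """Build a user-defined mapper that keeps only points inside a rectangle."""
    def mapper(record):
        ident, x, y = record
        if xmin <= x <= xmax and ymin <= y <= ymax:
            yield ("inside", ident)
    return mapper


points = [("a", 1.0, 1.0), ("b", 5.0, 5.0), ("c", 2.0, 3.0)]
# The spatial filter lives entirely in the user-defined function.
result = run_job(points, make_range_udf(0.0, 0.0, 3.0, 3.0), sorted)
```

In this sketch the engine has no spatial index, so every record is visited even though only a small rectangle is queried; the spatial knowledge stays outside the engine, which is the defining trait of the on-top approach.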

This Special Issue aims at promoting new and innovative studies, proposing new architectures or innovative evolutions of existing ones, or illustrating experiments on current technologies in order to improve the efficiency and effectiveness of distributed and cluster systems when they deal with spatio-temporal data. We invite submissions of either original technical papers or high-quality survey papers that shed new light on a particular perspective on spatial big data systems.

Assoc. Prof. Alberto Belussi
Dr Sara Migliorini
Dr Damiano Carra
Assoc. Prof. Eliseo Clementini
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. ISPRS International Journal of Geo-Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Big spatial (or spatio-temporal) data processing
  • Optimized MapReduce implementation of spatial analysis tools
  • Novel indexing methods for massive spatial (or spatio-temporal) data
  • Performance studies for spatial (or spatio-temporal) analytics
  • Processing of geo-crowdsourced datasets
  • Visualization of massive geo-spatial datasets
  • Smart City analytics

Published Papers (8 papers)


Research

23 pages, 7573 KiB  
Article
Towards the Development of Agenda 2063 Geo-Portal to Support Sustainable Development in Africa
by Paidamwoyo Mhangara, Asanda Lamba, Willard Mapurisa and Naledzani Mudau
ISPRS Int. J. Geo-Inf. 2019, 8(9), 399; https://doi.org/10.3390/ijgi8090399 - 06 Sep 2019
Cited by 8 | Viewed by 4833
Abstract
The successful implementation of the African Union’s Agenda 2063 strategic development blueprint is critical for the attainment of economic development, social prosperity, political stability, protection, and regional integration in Africa. Agenda 2063 is a strategic and endogenous development plan that seeks to strategically and competitively reposition the African continent to ensure poverty eradication and equitable people-centric socio-economic and technological transformation. Its impact areas include wealth creation, shared prosperity, sustainable environment, and transformative capacities. Monitoring and evaluation systems play a critical role in collecting, recording, storing, integrating, and evaluating and tracking performance information in the implementation of longer-term strategic plans. The usage of the geographic information system (GIS) as a monitoring and evaluation tool has gained traction in the last few decades due to its ability to support the collection, integration, storage, analysis, output, and distribution of location-based data. The advent of web-based GIS provides a powerful online platform to collect, integrate, discover, use and share geospatial data, information, and services related to sustainable development. In this paper, we aim to describe the implementation, architectural structural design, and the functionality of the pilot Agenda 2063 geoportal. The live prototype internet-based geoportal is intended to facilitate data collection, management, integration, analysis, and visualization of Agenda 2063 development indicators. This geoportal is meant to support the planning, implementation, and monitoring of the Agenda 2063 goals at the continental, regional, and national levels. As our results show, we successfully demonstrated that a web-geoportal is a powerful interactive platform to upload, access, explore, visualize, analyse, and disseminate geospatial data related to the sustainable development of the African continent. 
Although in the pilot phase, the geoportal demonstrates the primary functionality of geoportals in terms of its capability to discover, analyse, share, and download geospatial datasets.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)

17 pages, 2993 KiB  
Article
Parallelizing Multiple Flow Accumulation Algorithm using CUDA and OpenACC
by Natalija Stojanovic and Dragan Stojanovic
ISPRS Int. J. Geo-Inf. 2019, 8(9), 386; https://doi.org/10.3390/ijgi8090386 - 03 Sep 2019
Cited by 2 | Viewed by 3539
Abstract
Watershed analysis, as a fundamental component of digital terrain analysis, is based on the Digital Elevation Model (DEM), which is a grid (raster) model of the Earth surface and topography. Watershed analysis consists of computationally intensive and data-intensive algorithms that need to be implemented by leveraging parallel and high-performance computing methods and techniques. In this paper, the Multiple Flow Direction (MFD) algorithm for watershed analysis is implemented and evaluated on multi-core Central Processing Units (CPU) and many-core Graphics Processing Units (GPU), which provides significant improvements in performance and energy usage. The implementation is based on NVIDIA CUDA (Compute Unified Device Architecture) for GPU, as well as on OpenACC (Open ACCelerators), a parallel programming model and a standard for parallel computing. Both phases of the MFD algorithm, (i) iterative DEM preprocessing and (ii) the iterative MFD algorithm, are parallelized and run over multi-core CPU and GPU. The evaluation of the proposed solutions is performed with respect to the execution time, energy consumption, and programming effort for algorithm parallelization for different sizes of input data. An experimental evaluation has shown not only the advantage of using OpenACC programming over CUDA programming in implementing the watershed analysis on a GPU in terms of performance, energy consumption, and programming effort, but also significant benefits in implementing it on the multi-core CPU.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)

19 pages, 540 KiB  
Article
Distributed Processing of Location-Based Aggregate Queries Using MapReduce
by Yuan-Ko Huang
ISPRS Int. J. Geo-Inf. 2019, 8(9), 370; https://doi.org/10.3390/ijgi8090370 - 23 Aug 2019
Viewed by 2057
Abstract
Location-based aggregate queries, consisting of the shortest average distance query (SAvgDQ), the shortest minimal distance query (SMinDQ), the shortest maximal distance query (SMaxDQ), and the shortest sum distance query (SSumDQ), are new types of location-based queries. Such queries can be used to provide the user with useful object information by considering both the spatial closeness of objects to the query object and the neighboring relationship between objects. Because a large number of location-based aggregate queries need to be evaluated concurrently, a centralized processing system would suffer a heavy query load, eventually leading to poor performance. In this paper, we therefore focus on developing a distributed processing technique to answer multiple location-based aggregate queries, based on the MapReduce platform. We first design a grid structure to manage information of objects by taking into account the storage balance, and then develop a distributed processing algorithm, namely the MapReduce-based aggregate query algorithm (MRAggQ algorithm), to efficiently process the location-based aggregate queries in a distributed manner. Extensive experiments using synthetic and real datasets are conducted to demonstrate the scalability and the efficiency of the proposed processing algorithm.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)
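
The abstract above mentions a grid structure that organizes objects before queries are processed with MapReduce. The following is only a minimal sketch of that general idea, not the MRAggQ algorithm itself: a map phase emits each object under the key of the grid cell that contains it, and a reduce phase groups objects per cell. The cell size, the record layout, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from math import floor

CELL_SIZE = 10.0  # hypothetical cell width, in the same units as the coordinates


def cell_of(x, y, cell_size=CELL_SIZE):
    """Return the (column, row) index of the grid cell containing (x, y)."""
    return (floor(x / cell_size), floor(y / cell_size))


def map_phase(objects):
    """Emit (cell, object) pairs, one per input object."""
    for ident, x, y in objects:
        yield cell_of(x, y), (ident, x, y)


def reduce_phase(pairs):
    """Group the emitted objects by their cell key."""
    grid = defaultdict(list)
    for cell, obj in pairs:
        grid[cell].append(obj)
    return dict(grid)


objects = [("o1", 3.0, 4.0), ("o2", 12.5, 4.0), ("o3", 9.9, 0.1), ("o4", -0.5, 25.0)]
grid = reduce_phase(map_phase(objects))
```

With a uniform cell size, dense regions produce heavier cells than sparse ones; a storage-balanced design such as the one the abstract describes would need to account for that, which this sketch does not attempt.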

22 pages, 452 KiB  
Article
Mobility Data Warehouses
by Alejandro Vaisman and Esteban Zimányi
ISPRS Int. J. Geo-Inf. 2019, 8(4), 170; https://doi.org/10.3390/ijgi8040170 - 02 Apr 2019
Cited by 22 | Viewed by 4592
Abstract
The interest in mobility data analysis has grown dramatically with the wide availability of devices that track the position of moving objects. Mobility analysis can be applied, for example, to analyze traffic flows. To support mobility analysis, trajectory data warehousing techniques can be used. Trajectory data warehouses typically include, as measures, segments of trajectories, linked to spatial and non-spatial contextual dimensions. This paper goes beyond this concept, by including, as measures, the trajectories of moving objects at any point in time. In this way, online analytical processing (OLAP) queries, typically including aggregation, can be combined with moving object queries, to express queries like “List the total number of trucks running at less than 2 km from each other more than 50% of its route in the province of Antwerp” in a concise and elegant way. Existing proposals for trajectory data warehouses do not support queries like this, since they are based on either the segmentation of the trajectories, or a pre-aggregation of measures. The solution presented here is implemented using MobilityDB, a moving object database that extends the PostgreSQL database with temporal data types, allowing seamless integration with relational spatial and non-spatial data. This integration leads to the concept of mobility data warehouses. This paper discusses modeling and querying mobility data warehouses, providing a comprehensive collection of queries implemented using PostgreSQL and PostGIS as database backend, extended with the libraries provided by MobilityDB.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)

21 pages, 9062 KiB  
Article
Mr4Soil: A MapReduce-Based Framework Integrated with GIS for Soil Erosion Modelling
by Zhigang Han, Fen Qin, Caihui Cui, Yannan Liu, Lingling Wang and Pinde Fu
ISPRS Int. J. Geo-Inf. 2019, 8(3), 103; https://doi.org/10.3390/ijgi8030103 - 27 Feb 2019
Cited by 4 | Viewed by 3031
Abstract
A soil erosion model is used to evaluate the conditions of soil erosion and guide agricultural production. Recently, high spatial resolution data have been collected in new ways, such as three-dimensional laser scanning, providing the foundation for refined soil erosion modelling. However, serial computing cannot fully meet the computational requirements of massive data sets. Therefore, it is necessary to perform soil erosion modelling under a parallel computing framework. This paper focuses on a parallel computing framework for soil erosion modelling based on the Hadoop platform. The framework includes three layers: the methodology, algorithm, and application layers. In the methodology layer, two types of parallel strategies for data splitting are defined as row-oriented and sub-basin-oriented methods. The algorithms for six parallel calculation operators for local, focal and zonal computing tasks are designed in detail. These operators can be called to calculate the model factors and perform model calculations. We defined the key-value data structure of GeoCSV format for vector, row-based and cell-based rasters as the inputs for the algorithms. A geoprocessing toolbox is developed and integrated with the geographic information system (GIS) platform in the application layer. The performance of the framework is examined by taking the Gushanchuan basin as an example. The results show that the framework can perform calculations involving large data sets with high computational efficiency and GIS integration. This approach is easy to extend and use and provides essential support for applying high-precision data to refine soil erosion modelling.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)
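
The abstract above distinguishes local, focal and zonal computing tasks. As a hedged illustration of what these three classes of raster operation mean in general map algebra, and not of the six operators designed in the paper, the sketch below implements one small example of each on a raster stored as a list of rows. All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def local_op(raster, fn):
    """Local: each output cell depends only on the same input cell."""
    return [[fn(v) for v in row] for row in raster]


def focal_mean(raster):
    """Focal: each output cell is the mean of its 3x3 neighbourhood,
    with the window clipped at the raster edges."""
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [raster[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_sum(raster, zones):
    """Zonal: aggregate cell values per zone label from a second raster."""
    totals = defaultdict(float)
    for value_row, zone_row in zip(raster, zones):
        for v, z in zip(value_row, zone_row):
            totals[z] += v
    return dict(totals)


raster = [[1.0, 2.0], [3.0, 4.0]]
zones = [["a", "a"], ["b", "a"]]
doubled = local_op(raster, lambda v: v * 2)
smoothed = focal_mean(raster)
per_zone = zonal_sum(raster, zones)
```

A local operator touches one cell at a time, so rows can be processed independently of each other. Focal operators need neighbouring rows and zonal operators need every cell of a zone, so they constrain how a raster can be split across workers.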

18 pages, 3904 KiB  
Article
HiBuffer: Buffer Analysis of 10-Million-Scale Spatial Data in Real Time
by Mengyu Ma, Ye Wu, Wenze Luo, Luo Chen, Jun Li and Ning Jing
ISPRS Int. J. Geo-Inf. 2018, 7(12), 467; https://doi.org/10.3390/ijgi7120467 - 30 Nov 2018
Cited by 8 | Viewed by 4549
Abstract
Buffer analysis, a fundamental function in a geographic information system (GIS), identifies areas by the surrounding geographic features within a given distance. Real-time buffer analysis for large-scale spatial data remains a challenging problem since the computational scales of conventional data-oriented methods expand rapidly with increasing data volume. In this paper, we introduce HiBuffer, a visualization-oriented model for real-time buffer analysis. An efficient buffer generation method is proposed which introduces spatial indexes and a corresponding query strategy. Buffer results are organized into a tile-pyramid structure to enable stepless zooming. Moreover, a fully optimized hybrid parallel processing architecture is proposed for the real-time buffer analysis of large-scale spatial data. Experiments using real-world datasets show that our approach can reduce computation time by up to several orders of magnitude while preserving superior visualization effects. Additional experiments were conducted to analyze the influence of spatial data density, buffer radius, and request rate on HiBuffer performance, and the results demonstrate the adaptability and stability of HiBuffer. The parallel scalability of HiBuffer was also tested, showing that HiBuffer achieves high performance of parallel acceleration. Experimental results verify that HiBuffer is capable of handling 10-million-scale data.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)

18 pages, 3294 KiB  
Article
High-Performance Geospatial Big Data Processing System Based on MapReduce
by Junghee Jo and Kang-Woo Lee
ISPRS Int. J. Geo-Inf. 2018, 7(10), 399; https://doi.org/10.3390/ijgi7100399 - 06 Oct 2018
Cited by 24 | Viewed by 4762
Abstract
With the rapid development of Internet of Things (IoT) technologies, the increasing volume and diversity of sources of geospatial big data have created challenges in storing, managing, and processing data. In addition to the general characteristics of big data, the unique properties of spatial data make the handling of geospatial big data even more complicated. To facilitate users implementing geospatial big data applications in a MapReduce framework, several big data processing systems have extended the original Hadoop to support spatial properties. Most of those platforms, however, have included spatial functionalities by embedding them as a form of plug-in. Although offering a convenient way to add new features to an existing system, the plug-in has several limitations. In particular, while executing spatial and nonspatial operations by alternating between the existing system and the plug-in, additional read and write overheads have to be added to the workflow, significantly reducing performance efficiency. To address this issue, we have developed Marmot, a high-performance, geospatial big data processing system based on MapReduce. Marmot extends Hadoop at a low level to support seamless integration between spatial and nonspatial operations of a solid framework, allowing improved performance of geoprocessing workflow. This paper explains the overall architecture and data model of Marmot as well as the main algorithm for automatic construction of MapReduce jobs from a given spatial analysis task. To illustrate how Marmot transforms a sequence of operators for spatial analysis to map and reduce functions in a way to achieve better performance, this paper presents an example of spatial analysis retrieving the number of subway stations per city in Korea. 
This paper also experimentally demonstrates that Marmot generally outperforms SpatialHadoop, one of the top plug-in based spatial big data frameworks, particularly in dealing with complex and time-intensive queries involving spatial index.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)

15 pages, 8401 KiB  
Article
LandQv2: A MapReduce-Based System for Processing Arable Land Quality Big Data
by Xiaochuang Yao, Mohamed F. Mokbel, Sijing Ye, Guoqing Li, Louai Alarabi, Ahmed Eldawy, Zuliang Zhao, Long Zhao and Dehai Zhu
ISPRS Int. J. Geo-Inf. 2018, 7(7), 271; https://doi.org/10.3390/ijgi7070271 - 10 Jul 2018
Cited by 18 | Viewed by 5612
Abstract
Arable land quality (ALQ) data are a foundational resource for national food security. With the rapid development of spatial information technologies, the annual acquisition and update of ALQ data covering the country have become more accurate and faster. ALQ data are mainly vector-based spatial big data in the ESRI (Environmental Systems Research Institute) shapefile format. Although the shapefile is the most common GIS vector data format, unfortunately, the usage of ALQ data is very constrained due to its massive size and the limited capabilities of traditional applications. To tackle the above issues, this paper introduces LandQv2, which is a MapReduce-based parallel processing system for ALQ big data. The core content of LandQv2 is composed of four key technologies including data preprocessing, the distributed R-tree index, the spatial range query, and the map tile pyramid model-based visualization. According to the functions in LandQv2, firstly, ALQ big data are transformed by a MapReduce-based parallel algorithm from the ESRI Shapefile format to the GeoCSV file format in HDFS (Hadoop Distributed File System), and then, the spatial coding-based partition and R-tree index are executed for the spatial range query operation. In addition, the visualization of ALQ big data with a GIS (Geographic Information System) web API (Application Programming Interface) uses the MapReduce program to generate a single image or pyramid tiles for big data display. Finally, a set of experiments running on a live system deployed on a cluster of machines shows the efficiency and scalability of the proposed system. All of these functions supported by LandQv2 are integrated into SpatialHadoop, and it is also able to efficiently support any other distributed spatial big data systems.
(This article belongs to the Special Issue Distributed and Parallel Architectures for Spatial Data)
