Open Access Article
ISPRS Int. J. Geo-Inf. 2017, 6(9), 285; doi:10.3390/ijgi6090285

GeoSpark SQL: An Effective Framework Enabling Spatial Queries on Spark

Huang, Z. 1,2, Chen, Y. 1, Wan, L. 3 and Peng, X. 4,5,*
1 Institute of Remote Sensing & GIS, Peking University, Beijing 100871, China
2 Beijing Advanced Innovation Center for Future Internet Technology, Beijing 100124, China
3 Faculty of Information Engineering, China University of Geosciences, Wuhan 430074, China
4 Collaborative Innovation Center of eTourism, Institute of Tourism, Beijing Union University, Beijing 100101, China
5 State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
* Author to whom correspondence should be addressed.
Received: 24 July 2017 / Revised: 1 September 2017 / Accepted: 6 September 2017 / Published: 8 September 2017

Abstract

In the era of big data, Internet-based geospatial information services such as location-based service (LBS) apps are deployed everywhere, and they generate a growing number of queries against massive spatial datasets. As a result, traditional relational spatial databases (e.g., PostgreSQL with PostGIS, and Oracle Spatial) cannot adapt well to the needs of large-scale spatial query processing. Spark is an emerging distributed computing framework in the Hadoop ecosystem. To address the growing demand for large-scale spatial query processing, this paper proposes GeoSpark SQL, an effective framework that enables spatial queries on Spark. GeoSpark SQL provides a convenient SQL interface, and it achieves both efficient storage management and high-performance parallel computing by integrating Hive and Spark. The following key issues are discussed and addressed: (1) storage management methods under the GeoSpark SQL framework, (2) the implementation of spatial operators in the Spark environment, and (3) spatial query optimization methods under Spark. An experimental evaluation shows that GeoSpark SQL is able to achieve real-time query processing. Spark is not a panacea, however: the traditional spatial database PostGIS/PostgreSQL performs better than GeoSpark SQL in some query scenarios, especially for spatial queries with high selectivity, such as the point query and the window query. In general, GeoSpark SQL performs better on compute-intensive spatial queries such as the kNN query and the spatial join query.
Keywords: big data; GeoSpark SQL; Spark; spatial query processing; spatial database
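
For readers unfamiliar with the four query types named in the abstract, the following is a minimal in-memory sketch in plain Python. It is not GeoSpark SQL's interface or implementation: the function names, the toy data, and the use of a distance predicate for the join are all assumptions made for illustration, and each linear scan stands in for whatever index or distributed operator a real system would use.

```python
import heapq
import math

# Hypothetical toy data: (id, x, y) points. Not taken from the paper.
POINTS = [(1, 0.0, 0.0), (2, 1.0, 1.0), (3, 2.0, 2.0), (4, 5.0, 5.0)]


def point_query(points, x, y):
    """Return ids of points located exactly at (x, y)."""
    return [pid for pid, px, py in points if px == x and py == y]


def window_query(points, xmin, ymin, xmax, ymax):
    """Return ids of points inside the axis-aligned window (bounds inclusive)."""
    return [pid for pid, px, py in points
            if xmin <= px <= xmax and ymin <= py <= ymax]


def knn_query(points, x, y, k):
    """Return ids of the k points nearest to (x, y), closest first."""
    nearest = heapq.nsmallest(
        k, points, key=lambda p: math.hypot(p[1] - x, p[2] - y))
    return [pid for pid, _, _ in nearest]


def distance_join(left, right, max_dist):
    """Return (left_id, right_id) pairs at most max_dist apart.

    Naive nested loop: every left point is compared with every right point.
    A distance predicate is an assumed stand-in for the join condition.
    """
    return [(lid, rid)
            for lid, lx, ly in left
            for rid, rx, ry in right
            if math.hypot(lx - rx, ly - ry) <= max_dist]
```

The point and window queries return only a small part of the data, which is the kind of workload an indexed relational database handles well. The kNN query must rank candidates by distance and the join compares pairs of records, so their cost grows much faster with data size, which is consistent with the abstract's observation that such compute-intensive queries benefit more from parallel execution.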
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Huang, Z.; Chen, Y.; Wan, L.; Peng, X. GeoSpark SQL: An Effective Framework Enabling Spatial Queries on Spark. ISPRS Int. J. Geo-Inf. 2017, 6, 285.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

ISPRS Int. J. Geo-Inf., EISSN 2220-9964, published by MDPI AG, Basel, Switzerland.