Open Access Article

A New Design of High-Performance Large-Scale GIS Computing at a Finer Spatial Granularity: A Case Study of Spatial Join with Spark for Sustainability

by Feng Zhang 1,2, Jingwei Zhou 2, Renyi Liu 1, Zhenhong Du 1,2,* and Xinyue Ye 3,*
1 Zhejiang Provincial Key Laboratory of Geographic Information Science, Department of Earth Sciences, Zhejiang University, 148 Tianmushan Road, Hangzhou 310028, China
2 School of the Earth Sciences, Zhejiang University, 38 Zheda Road, Hangzhou 310027, China
3 Department of Geography, Kent State University, Kent, OH 44240, USA
* Authors to whom correspondence should be addressed.
Academic Editor: Richard Henry Moore
Sustainability 2016, 8(9), 926; https://doi.org/10.3390/su8090926
Received: 20 June 2016 / Revised: 28 August 2016 / Accepted: 6 September 2016 / Published: 10 September 2016
Sustainability research faces many challenges because environmental, urban and regional contexts are changing rapidly at an unprecedented level of spatial granularity. This change produces massive, growing datasets and demands faster detection of spatial relationships. Spatial join is a fundamental method for making data more informative with respect to spatial relations, and the dramatic growth of data volumes has increased the focus on high-performance, large-scale spatial join. In this paper, we present Spatial Join with Spark (SJS), a high-performance algorithm that uses a simple but efficient uniform spatial grid to partition datasets and joins the partitions with Spark's built-in join transformation. SJS exploits the distributed in-memory iterative computation of Spark and introduces a calculation-evaluating model and an in-memory spatial repartition technique, which optimize the initial partition by evaluating the calculation amount of local join algorithms without any disk access. We compare four in-memory spatial join algorithms in SJS for further performance improvement. Based on extensive experiments with real-world data, we conclude that SJS outperforms the Spark and MapReduce implementations of earlier spatial join approaches. This study demonstrates that leveraging high-performance computing for large-scale spatial join analysis is promising. The availability of large geo-referenced datasets, together with high-performance computing technology, creates opportunities for sustainability research to examine whether and how these new trends in data and technology can help detect trends and patterns in human-environment dynamics.
Keywords: spatial join; parallel computing; Spark; performance
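
The partition-then-join idea summarized in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' SJS implementation: it assumes geometries are reduced to axis-aligned bounding boxes, uses a hypothetical `cell_size` parameter for the uniform grid, lets plain Python dictionaries keyed by grid cell stand in for Spark's keyed join, and uses a nested-loop test as the local join. All function names are illustrative.

```python
from collections import defaultdict
from itertools import product


def cells(box, cell_size):
    """Return every grid cell (ix, iy) overlapped by box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    x0, x1 = int(xmin // cell_size), int(xmax // cell_size)
    y0, y1 = int(ymin // cell_size), int(ymax // cell_size)
    return product(range(x0, x1 + 1), range(y0, y1 + 1))


def intersects(a, b):
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(dataset, cell_size):
    """Assign each object id to every uniform-grid cell its box overlaps."""
    parts = defaultdict(list)
    for obj_id, box in dataset.items():
        for cell in cells(box, cell_size):
            parts[cell].append(obj_id)
    return parts


def grid_spatial_join(left, right, cell_size):
    """Join two {id: box} datasets; returns the set of intersecting (left_id, right_id) pairs."""
    parts_l = partition(left, cell_size)
    parts_r = partition(right, cell_size)
    results = set()  # a set removes duplicates from pairs that share several cells
    # Only cells present on both sides can produce matches (the keyed-join step).
    for cell in parts_l.keys() & parts_r.keys():
        # Local join inside one cell: a simple nested loop (an assumption of this sketch).
        for i in parts_l[cell]:
            for j in parts_r[cell]:
                if intersects(left[i], right[j]):
                    results.add((i, j))
    return results


left = {"a": (0, 0, 2, 2), "b": (5, 5, 6, 6)}
right = {"x": (1, 1, 3, 3), "y": (10, 10, 11, 11), "z": (5.5, 5.5, 12, 12)}
pairs = grid_spatial_join(left, right, cell_size=4)
```

In a distributed setting, the per-cell lists would correspond to keyed partitions and the intersection of cell keys to a join on the cell identifier. The choice of `cell_size` controls how evenly work is spread, which is the kind of imbalance a repartition step would aim to correct.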
MDPI and ACS Style

Zhang, F.; Zhou, J.; Liu, R.; Du, Z.; Ye, X. A New Design of High-Performance Large-Scale GIS Computing at a Finer Spatial Granularity: A Case Study of Spatial Join with Spark for Sustainability. Sustainability 2016, 8, 926. https://doi.org/10.3390/su8090926
