Open Access Article

Active Collection of Land Cover Sample Data from Geo-Tagged Web Texts

by Dongyang Hou 1,2, Jun Chen 2,*, Hao Wu 2, Songnian Li 1,3, Fei Chen 2,4 and Weiwei Zhang 2
1 School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China
2 National Geomatics Center of China, 28 Lianhuachi West Road, Beijing 100830, China
3 Department of Civil Engineering, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada
4 School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
* Author to whom correspondence should be addressed.
Academic Editors: Yoshio Inoue, Chandra Giri and Prasad S. Thenkabail
Remote Sens. 2015, 7(5), 5805-5827; https://doi.org/10.3390/rs70505805
Received: 5 January 2015 / Revised: 3 April 2015 / Accepted: 29 April 2015 / Published: 7 May 2015
Abstract: Sample data play an important role in land cover (LC) map validation. Traditionally, they are collected through field surveys or image interpretation, both of which are costly, labor-intensive and time-consuming. In recent years, massive volumes of geo-tagged text have emerged on the web, and they contain valuable information for LC map validation. However, this kind of textual data has seldom been analyzed or used to support LC map validation. This paper examines the potential of geo-tagged web texts as a new cost-free sample data source to assist LC map validation and proposes an active data collection approach. The proposed approach uses a customized deep web crawler to search for geo-tagged web texts based on land cover-related keywords and string-based rule matching. A data transformation based on buffer analysis is then performed to convert the collected web texts into LC sample data. Using three provinces and three municipalities directly under the Central Government in China as study areas, geo-tagged web texts were collected to validate the artificial surface class of China’s 30-meter global land cover dataset (GlobeLand30-2010). A total of 6283 geo-tagged web texts were collected at a speed of 0.58 texts per second. The collected texts about built-up areas were transformed into sample data. A user’s accuracy of 82.2% was achieved, which is close to that derived from formal expert validation. The preliminary results show that geo-tagged web texts are valuable ancillary data for LC map validation and that the proposed approach can improve the efficiency of sample data collection.
Keywords: sample data; land cover; validation; deep web crawler; geo-tagged web
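The abstract reports a user's accuracy for the artificial surface class. As a minimal illustration of the standard definition (of the samples that the map labels as a class, the fraction whose reference label agrees), the Python sketch below may help. The function name, the label strings and the toy sample pairs are all hypothetical and are not taken from the paper, which may compute the figure differently in detail.

```python
from typing import Iterable, Tuple


def users_accuracy(pairs: Iterable[Tuple[str, str]], target: str) -> float:
    """Return user's accuracy for `target`.

    Each pair is (map_label, reference_label). User's accuracy is the share
    of samples mapped as `target` whose reference label is also `target`.
    """
    # Keep only the samples that the map assigns to the target class.
    mapped = [(m, r) for m, r in pairs if m == target]
    if not mapped:
        raise ValueError(f"no samples are mapped as {target!r}")
    # Count how many of those agree with the reference label.
    correct = sum(1 for _, r in mapped if r == target)
    return correct / len(mapped)


# Hypothetical toy data: (map label, reference label derived from a sample).
samples = [
    ("artificial", "artificial"),
    ("artificial", "artificial"),
    ("artificial", "cropland"),
    ("artificial", "artificial"),
    ("cropland", "artificial"),
]

ua = users_accuracy(samples, "artificial")
print(f"user's accuracy: {ua:.1%}")
```

In this toy case four samples are mapped as artificial surface and three of them agree with the reference, so the function returns 0.75. The last pair does not affect the result because the map does not label it as the target class.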

MDPI and ACS Style

Hou, D.; Chen, J.; Wu, H.; Li, S.; Chen, F.; Zhang, W. Active Collection of Land Cover Sample Data from Geo-Tagged Web Texts. Remote Sens. 2015, 7, 5805-5827.
