Open Access Article
Sustainability 2014, 6(10), 6529-6552; doi:10.3390/su6106529

A Focused Crawler for Borderlands Situation Information with Geographical Properties of Place Names

D. Hou 1,2; H. Wu 2,*; J. Chen 1,2; R. Li 2
1 School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China
2 National Geomatics Center of China, 28 Lianhuachi West Road, Beijing 100830, China
* Author to whom correspondence should be addressed.
Received: 3 June 2014 / Revised: 2 September 2014 / Accepted: 5 September 2014 / Published: 29 September 2014
(This article belongs to the Special Issue Borderland Studies and Sustainability)

Abstract

Place names are an important component of borderlands situation information and play a significant role in collecting such information from the Internet with focused crawlers. However, current focused crawlers treat a place name like any other common keyword and ignore its geographical properties, which may reduce their effectiveness. To address this problem, this paper first discusses the importance of place names in focused crawlers in terms of location and spatial relations, and then proposes a two-tuple-based topic representation method that expresses place names and common keywords separately. Spatial relations between place names are then introduced into the calculation of the relevance between given topics and webpages, which can make the calculation more accurate. On this basis, a focused crawler prototype for collecting borderlands situation information is designed and implemented. Crawling speed and F-Score are adopted to evaluate its efficiency and effectiveness. Experimental results indicate that the efficiency of the proposed focused crawler is consistent with the polite access interval and that it can meet the daily demand of borderlands situation information collection. Additionally, the F-Score of the proposed focused crawler increases by around 7%, which indicates that it is more effective than the traditional best-first focused crawler.
Keywords: focused crawler; place name; web information collection; borderlands situation; relevance calculation; spatial relations
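
The abstract names F-Score as the effectiveness measure but does not state the formula. As a minimal sketch, assuming the standard definition (the weighted harmonic mean of precision and recall, with beta = 1 giving the usual F1), the measure could be computed as follows. The function name, its parameters, and the page counts in the usage note are hypothetical illustrations, not values or code from the paper.

```python
def f_score(relevant_retrieved: int, retrieved: int, relevant_total: int,
            beta: float = 1.0) -> float:
    """Standard F-Score from raw page counts (assumed definition, not the paper's code).

    relevant_retrieved: crawled pages that are actually on topic
    retrieved:          all pages the crawler downloaded
    relevant_total:     all on-topic pages in the evaluation set
    beta:               weight of recall relative to precision (1.0 gives F1)
    """
    if retrieved == 0 or relevant_total == 0:
        return 0.0
    precision = relevant_retrieved / retrieved
    recall = relevant_retrieved / relevant_total
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only: 60 relevant pages among
# 100 crawled pages, with 80 relevant pages in the evaluation set.
example = f_score(60, 100, 80)
```

With these made-up counts, precision is 0.6 and recall is 0.75. Comparing two crawlers on the same evaluation set then amounts to comparing their `f_score` values.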
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


MDPI and ACS Style

Hou, D.; Wu, H.; Chen, J.; Li, R. A Focused Crawler for Borderlands Situation Information with Geographical Properties of Place Names. Sustainability 2014, 6, 6529-6552.

Sustainability, EISSN 2071-1050, published by MDPI AG, Basel, Switzerland