Open Access Article

Using Web Crawler Technology for Geo-Events Analysis: A Case Study of the Huangyan Island Incident

by Hao Hu 1, Yuejing Ge 1,* and Dongyang Hou 2
1 School of Geography, Beijing Normal University, 100875 Beijing, China
2 School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China
* Author to whom correspondence should be addressed.
Sustainability 2014, 6(4), 1896-1912
Received: 21 February 2014 / Revised: 26 March 2014 / Accepted: 26 March 2014 / Published: 9 April 2014
(This article belongs to the Special Issue Borderland Studies and Sustainability)
Social networking and network socialization bring abundant text information and social relationships into our daily lives. Making full use of these data in the big data era is of great significance for better understanding the changing world and the information-based society. Although politics has been integrally involved in hyperlinked world issues since the 1990s, the text analysis and data visualization of geo-events have faced the bottleneck of traditional manual analysis. Although the automatic assembly of different geospatial web and distributed geospatial information systems using service chaining has recently been explored and built, data mining and information collection are not comprehensive enough because of the sensitivity, complexity, relativity, timeliness, and unexpected nature of political events. Based on the Heritrix framework and the analysis of web-based text, the word frequency, sentiment tendency, and dissemination path of the Huangyan Island incident were studied using web crawler technology and text analysis. The results indicate that the tag cloud, frequency map, attitudes pie, individual mention ratios, and dissemination flow graph, built from the crawled information and data processing, not only highlight the characteristics of the geo-event itself but also point to many interesting phenomena and deep-seated problems behind it, such as related topics, theme vocabularies, subject contents, hot countries, event bodies, opinion leaders, high-frequency vocabularies, information sources, semantic structure, propagation paths, the distribution of different attitudes, and regional differences in net citizens' responses to the Huangyan Island incident. Furthermore, text analysis of network information with the help of a focused web crawler can express the time-space relationship of the crawled information and the information characteristics of the semantic network of geo-events. Therefore, it is a useful tool for collecting information to understand the formation and diffusion of web-based public opinion in political events.
Keywords: web crawler technology; text information; sentiment analysis; Huangyan Island Incident
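
The word-frequency step mentioned in the abstract, which would feed a tag cloud or frequency map, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `word_frequencies`, the stop-word list, and the sample texts are all hypothetical, and the tokenizer is a naive one that only handles Latin-alphabet words. Text in other scripts would need a proper word segmenter.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real study would use a fuller,
# language-specific list.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "on", "over"}


def word_frequencies(documents, top_n=5):
    """Count non-stop-word tokens across a collection of page texts."""
    counts = Counter()
    for text in documents:
        # Naive tokenizer (assumption): lowercase runs of ASCII letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up sample texts standing in for crawled pages (not data from the study).
sample_pages = [
    "The island dispute is in the news",
    "News reports on the island and the dispute over the island",
]
top_terms = word_frequencies(sample_pages, top_n=3)
```

The resulting list of (term, count) pairs is the kind of input a tag cloud renderer would typically take, with counts mapped to font sizes.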

MDPI and ACS Style

Hu, H.; Ge, Y.; Hou, D. Using Web Crawler Technology for Geo-Events Analysis: A Case Study of the Huangyan Island Incident. Sustainability 2014, 6, 1896-1912.
