Article

Comparing Methods to Collect and Geolocate Tweets in Great Britain
Stephan Schlosser, Daniele Toninelli and Michela Cameletti

1 Center of Methods in Social Sciences, University of Göttingen, 37073 Göttingen, Germany
2 Department of Economics, University of Bergamo, 24127 Bergamo, Italy
* Author to whom correspondence should be addressed.
J. Open Innov. Technol. Mark. Complex. 2021, 7(1), 44; https://doi.org/10.3390/joitmc7010044
Received: 30 November 2020 / Revised: 20 January 2021 / Accepted: 20 January 2021 / Published: 25 January 2021
(This article belongs to the Special Issue Big Data Research for Open Innovation)
In the era of Big Data, the Internet has become one of the main data sources: Data can be collected at relatively low cost and used for a wide range of purposes. To support sound decisions in a timely manner in any field, it is essential to increase the efficiency of data production as well as data accuracy and reliability. In this framework, our paper aims to identify an optimized and flexible method to collect and, at the same time, geolocate social media information over a whole country. In particular, we compare three alternative methods for collecting data from the social media platform Twitter. The comparison rests on four main criteria: collection time, dataset size, pre-processing load, and geographic distribution. Our findings for Great Britain identify one of these methods as the best option, since it collects both the highest number of tweets per hour and the highest percentage of unique tweets per hour. Furthermore, this method reduces the computational effort needed to pre-process the collected tweets (e.g., it shows the lowest collection times and the lowest number of duplicates within the geographical areas) and improves territorial coverage (when compared to the population distribution). At the same time, the effort required to set up this method is manageable, and the setup is less prone to arbitrary decisions by the researcher.
Keywords: Twitter; geographical coverage; social media; big data; geolocation; spatial data collection
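Two of the quantities named in the abstract, tweets collected per hour and the share of unique tweets, can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the paper, so the formulas below are assumptions, and the function name `collection_metrics`, the `CollectionMetrics` record, and the sample IDs are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CollectionMetrics:
    total: int               # all collected tweets, duplicates included
    unique: int              # distinct tweet IDs
    tweets_per_hour: float   # total tweets divided by collection time
    unique_share_pct: float  # unique tweets as a percentage of the total


def collection_metrics(tweet_ids: Iterable[str], hours: float) -> CollectionMetrics:
    """Summarize one collection run (assumed definitions, not the paper's)."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    ids = list(tweet_ids)
    total = len(ids)
    unique = len(set(ids))  # the same tweet can be returned more than once
    share = 100.0 * unique / total if total else 0.0
    return CollectionMetrics(total, unique, total / hours, share)


# Made-up IDs for illustration only (not data from the paper).
sample_ids = ["t1", "t2", "t2", "t3", "t3", "t3"]
metrics = collection_metrics(sample_ids, hours=2.0)
print(metrics)
```

Computing the same summary for each collection method would allow a side-by-side comparison on these two criteria; the other criteria (pre-processing load and geographic distribution) would need additional inputs that the abstract does not describe.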
MDPI and ACS Style

Schlosser, S.; Toninelli, D.; Cameletti, M. Comparing Methods to Collect and Geolocate Tweets in Great Britain. J. Open Innov. Technol. Mark. Complex. 2021, 7, 44. https://doi.org/10.3390/joitmc7010044

AMA Style

Schlosser S, Toninelli D, Cameletti M. Comparing Methods to Collect and Geolocate Tweets in Great Britain. Journal of Open Innovation: Technology, Market, and Complexity. 2021; 7(1):44. https://doi.org/10.3390/joitmc7010044

Chicago/Turabian Style

Schlosser, Stephan, Daniele Toninelli, and Michela Cameletti. 2021. "Comparing Methods to Collect and Geolocate Tweets in Great Britain" Journal of Open Innovation: Technology, Market, and Complexity 7, no. 1: 44. https://doi.org/10.3390/joitmc7010044

