Article

Evaluating the Representativeness of Socio-Demographic Variables over Time for Geo-Social Media Data

Department of Geoinformatics—Z_GIS, University of Salzburg, 5020 Salzburg, Austria
* Author to whom correspondence should be addressed.
Academic Editors: Jean-Claude Thill and Wolfgang Kainz
ISPRS Int. J. Geo-Inf. 2021, 10(5), 323; https://doi.org/10.3390/ijgi10050323
Received: 18 April 2021 / Revised: 28 April 2021 / Accepted: 2 May 2021 / Published: 10 May 2021
(This article belongs to the Special Issue Applications and Implications in Geosocial Media Monitoring)
Geo-social media data are widely used as a data source to model populations and processes in a variety of contexts. However, if the data do not adequately represent the population they are drawn from, analysis results will be biased. Left unaddressed, these biases may lead to false interpretations and conclusions. In this paper, we propose a generic methodology for investigating the representativeness of geo-social media data for population groups of similar statistical predictive power based on reference data. The groups are designed to be spatially coherent regions with similar prediction errors. Based on these units, we investigate the influence of different socio-demographic covariates on representativeness. We perform experiments based on over 1.6 billion tweets and 90 socio-demographic covariates. We demonstrate that the representativeness of Twitter data varies strongly over time and space. Our results show that densely populated areas tend to be consistently underrepresented in non-spatial models. Over time, some covariates, such as the number of people aged 20 years, exhibit highly different effects on the prediction models, whereas others are much more stable. The spatial effects can most frequently be explained using spatial error models, which points to spatially correlated errors and suggests that additional covariates are needed. Finally, we provide guidance on interpreting the results for researchers who apply the concepts presented in this paper.
Keywords: geo-social media; Twitter; representativeness; spatial analysis; statistical correlations; temporal snapshots
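
The abstract's remark that spatial error models point to missing covariates can be illustrated with a small, purely synthetic sketch. It is not the authors' pipeline: the grid, the data, and the helper names (`grid_rook_weights`, `ols_residuals`, `morans_i`) are assumptions made for illustration. The idea is that when a non-spatial regression omits a spatially structured covariate, its residuals tend to be spatially autocorrelated, which a statistic such as global Moran's I can reveal.

```python
import numpy as np

def grid_rook_weights(rows, cols):
    """Binary rook-contiguity weights for a rows x cols grid (hypothetical helper)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r + 1 < rows:            # neighbour below
                j = (r + 1) * cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < cols:            # neighbour to the right
                j = r * cols + (c + 1)
                w[i, j] = w[j, i] = 1.0
    return w

def ols_residuals(x, y):
    """Residuals of an ordinary least squares fit of y on x with an intercept."""
    design = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta

def morans_i(values, weights):
    """Global Moran's I of `values` under the spatial weights matrix `weights`."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(weights, dtype=float)
    return (z.size / w.sum()) * (z @ w @ z) / (z @ z)

# Synthetic example: all numbers below are made up for illustration.
rng = np.random.default_rng(0)
rows, cols = 10, 10
w = grid_rook_weights(rows, cols)

x = rng.normal(size=rows * cols)                        # observed covariate
gradient = np.repeat(np.arange(rows, dtype=float), cols)  # omitted spatial covariate
y = 2.0 * x + gradient + rng.normal(scale=0.5, size=rows * cols)

# The non-spatial model leaves out `gradient`, so its residuals inherit the pattern.
resid = ols_residuals(x, y)
moran_resid = morans_i(resid, w)
print(f"Moran's I of OLS residuals: {moran_resid:.2f}")
```

A clearly positive Moran's I of the residuals is the kind of signal that would motivate either a spatial error specification or a search for additional covariates. A full analysis would also test the statistic for significance, which this sketch omits.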

MDPI and ACS Style

Petutschnig, A.; Resch, B.; Lang, S.; Havas, C. Evaluating the Representativeness of Socio-Demographic Variables over Time for Geo-Social Media Data. ISPRS Int. J. Geo-Inf. 2021, 10, 323. https://doi.org/10.3390/ijgi10050323
