Web-based crowdsourcing has become an important method of environmental data processing. Two alternatives are widely used today by researchers in various fields: paid data processing mediated by for-profit businesses such as Amazon’s Mechanical Turk, and volunteer data processing conducted by amateur citizen scientists. While the first option delivers results much faster, it is unclear how it compares with volunteer processing in terms of quality. This study compares volunteer and paid processing of social media data originating from climate change discussions on Twitter. The same sample of Twitter messages discussing climate change was offered for processing to volunteer workers through the Climate Tweet project and to paid workers through the Amazon MTurk platform. We found that paid crowdsourcing required a high-redundancy data processing design to reach quality comparable with that of volunteer processing. Among the methods applied to improve data processing accuracy, limiting the geographical locations of the paid workers proved the most productive. Conversely, we found no significant geographical differences in the accuracy of data processed by volunteer workers. We suggest that the main driver of this pattern is the differences in how familiar the paid workers are with the research topic.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.