Social media data is heavily used to analyze and evaluate situations in times of disasters, and derive decisions for action from it. In these critical situations, it is not surprising that privacy is often considered a secondary problem. In order to prevent subsequent abuse, theft or public exposure of collected datasets, however, protecting the privacy of social media users is crucial. Avoiding unnecessary data retention is an important question that is currently largely unsolved. There are a number of technical approaches available, but their deployment in disaster management is either impractical or requires special adaption, limiting its utility. In this case study, we explore the deployment of a cardinality estimation algorithm called HyperLogLog into disaster management processes. It is particularly suited for this field, because it allows to stream data in a format that cannot be used for purposes other than the originally intended. We develop and conduct a focus group discussion with teams of social media analysts. We identify challenges and opportunities of working with such a privacy-enhanced social media data format and compare the process with conventional techniques. Our findings show that, with the exception of training scenarios, deploying HyperLogLog in the data acquisition process will not distract the data analysis process. Instead, several benefits, such as improved working with huge datasets, may contribute to a more widespread use and adoption of the presented technique, which provides a basis for a better integration of privacy considerations in disaster management.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited