Geographical masks are a group of location protection methods for the dissemination and publication of confidential and sensitive information, such as health- and crime-related geo-referenced data. The use of such masks ensures that privacy is protected for the individuals involved in the datasets. Nevertheless, the protection process introduces spatial error to the masked dataset. This study quantifies the spatial error of masked datasets using two approaches. First, a perceptual survey was employed where participants ranked the similarity of a diverse sample of masked and original maps. Second, a spatial statistical analysis was performed that provided quantitative results for the same pairs of maps. Spatial statistical similarity is calculated with three divergence indices that employ different spatial clustering methods. All indices are significantly correlated with the perceptual similarity. Finally, the results of the spatial analysis are used as the explanatory variable to estimate the perceptual similarity. Three prediction models are created that indicate upper boundaries for the spatial statistical results upon which the masked data are perceived differently from the original data. The results of the study aim to help potential “maskers” to quantify and evaluate the error of confidential masked visualizations.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.