A Voronoi-Based Semantically Balanced Dummy Generation Framework for Location Privacy
Abstract
1. Introduction
1.1. Background
1.2. Temporal Constraint Attack
1.3. Problem Statement
1.4. Major Contributions
- We introduce a new type of location privacy attack, the temporal constraint attack, in which an adversary exploits location semantics along the temporal dimension to eliminate dummies and identify the true location. We provide evidence of how a true location of the residential semantic type can be compromised by such an attack.
- We propose a novel Voronoi-based semantically balanced dummy generation (VSBDG) approach that generates dummy locations capable of withstanding a temporal constraint attack. The VSBDG algorithm can protect location privacy regardless of whether the true location is residential or non-residential, because the location set it generates is semantically balanced.
- One of the major drawbacks of existing dummy location studies is that they do not consider the spatial context of the location, which is not possible unless the technique is built upon real-world geospatial datasets. At best, the current approaches are tested on simple real-world location datasets that contain a collection of point locations. The VSBDG algorithm is built and tested on real-world geospatial datasets such as land parcels and point of interest (POI) locations. The VSBDG algorithm leverages spatial relationships and operations to identify spatially similar dummy locations for a given true location.
- Voronoi polygons are applied to model and delineate POI influence. We establish an approach that uses a cosine similarity search to find geographical areas within the city with similar POI influence, and then performs a parcel-based similarity search to identify the residential dummy location within each similar Voronoi polygon. This allows us to identify spatially similar residential and POI dummy locations and to build semantically balanced location sets that resist not only temporal constraint attacks but also location homogeneity attacks, location distribution attacks, and map-matching attacks.
2. Related Work
3. Proposed Methodology
3.1. Relationship between Geographic Location, Address, and Land Parcel
3.2. Modeling POI Influence Using Voronoi Polygons
- m is the number of Voronoi polygons used in generating a dummy location set; each Voronoi polygon contains exactly one POI location.
- With at least two locations selected from each Voronoi polygon, a location set contains a minimum of 2m + 1 dummy locations plus one legitimate location.
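The two points above can be illustrated with a minimal sketch. It assumes planar (projected) coordinates, in which case the Voronoi polygon containing a location is, by definition of the Voronoi diagram, the one whose POI is nearest. The function names are hypothetical and are not part of the authors' implementation.

```python
import math


def voronoi_cell(location, pois):
    """Index of the POI whose Voronoi polygon contains `location`.

    Assumes planar coordinates: a point lies in the cell of its nearest site.
    """
    return min(range(len(pois)), key=lambda i: math.dist(location, pois[i]))


def location_set_size(m):
    """Set size for m similar polygons: 2m + 1 dummies plus one true location."""
    dummies = 2 * m + 1
    return dummies + 1


def similar_polygon_count(k):
    """Inverse relation used in Algorithm 1: m = (k - 2) / 2."""
    if k < 4 or k % 2:
        raise ValueError("k must be an even integer >= 4")
    return (k - 2) // 2


# Toy POIs in arbitrary planar units (illustrative values only).
pois = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
cell = voronoi_cell((1.0, 1.0), pois)
sizes = [location_set_size(m) for m in (1, 2, 10)]
```

Under these assumptions, one similar polygon yields a set of four locations and ten similar polygons yield twenty-two, which matches the range of k used in the experiments.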
3.3. Cosine Similarity between Voronoi Polygons
3.4. Parcel-based Similarity Search
4. Voronoi-Based Semantically Balanced Dummy Generation (VSBDG)
Algorithm 1: VSBDG. Identifies semantically balanced dummy locations for a given residential true location

Input: True location lt (Longitude, Latitude), location set size k
Datasets: Land parcels P, POI-based Voronoi polygons dataset V
Output: Location set D of size k

1. Determine land parcel ptrue outlining lt using spatial join between lt and P
2. Determine Voronoi polygon vtrue outlining lt using spatial join between lt and V
3. m = (k − 2)/2 // m is the total number of similar Voronoi polygons to be identified
4. Vsimilar = Cosine Similarity Search(target = vtrue, candidate set = V, output length = m) // Perform cosine similarity search to identify the top m Voronoi polygons from V that are similar to vtrue
5. For each Voronoi polygon vi in Vsimilar
   6. Set candidate_parcels_seti = parcels within Voronoi polygon vi
   7. prclSimi = Euclidean Similarity Search(target = ptrue, candidate set = candidate_parcels_seti, output length = 1) // Perform Euclidean similarity search to identify the top 1 land parcel from candidate_parcels_seti that is most similar to ptrue
   8. Calculate dummyresidential using parcel centroid of prclSimi
   9. dummypoi = POI location associated with vi
   10. Add dummyresidential, dummypoi to D
11. dummyvt = POI location associated with vtrue
12. Add dummyvt to D
13. Add lt to D
14. Return D
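The steps of Algorithm 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the spatial joins of steps 1 and 2 are assumed to be done beforehand, so `p_true` and `v_true` are passed in; the dictionary keys (`id`, `poi`, `features`, `voronoi_id`, `centroid`) and the feature vectors are hypothetical; and `v_true` is excluded from the similarity candidates, which the algorithm listing does not state explicitly.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vsbdg(l_true, p_true, v_true, k, parcels, voronoi):
    """Return a location set of size k following the steps of Algorithm 1."""
    if k < 4 or k % 2:
        raise ValueError("k must be an even integer >= 4")
    m = (k - 2) // 2  # step 3

    # Step 4: top-m polygons by cosine similarity to v_true.
    # Assumption: v_true itself is not a candidate.
    candidates = [v for v in voronoi if v["id"] != v_true["id"]]
    if len(candidates) < m:
        raise ValueError("not enough Voronoi polygons for the requested k")
    candidates.sort(
        key=lambda v: cosine_similarity(v["features"], v_true["features"]),
        reverse=True,
    )

    location_set = []
    for v in candidates[:m]:  # step 5
        in_cell = [p for p in parcels if p["voronoi_id"] == v["id"]]  # step 6
        # Step 7: parcel nearest to p_true in the hypothetical feature space
        # (min raises ValueError if the polygon holds no parcels).
        best = min(
            in_cell,
            key=lambda p: math.dist(p["features"], p_true["features"]),
        )
        location_set.append(best["centroid"])  # steps 8 and 10
        location_set.append(v["poi"])          # steps 9 and 10
    location_set.append(v_true["poi"])  # steps 11 and 12
    location_set.append(l_true)         # step 13
    return location_set                 # step 14


# Toy data in arbitrary planar units; all values are illustrative only.
voronoi = [
    {"id": 0, "poi": (0.0, 0.0), "features": [3.0, 1.0, 0.0]},
    {"id": 1, "poi": (5.0, 5.0), "features": [3.0, 1.0, 0.1]},
    {"id": 2, "poi": (9.0, 1.0), "features": [0.0, 1.0, 4.0]},
]
parcels = [
    {"voronoi_id": 0, "centroid": (0.2, 0.1), "features": [500.0, 2.0]},
    {"voronoi_id": 1, "centroid": (5.1, 5.2), "features": [510.0, 2.0]},
    {"voronoi_id": 1, "centroid": (4.8, 5.3), "features": [900.0, 6.0]},
    {"voronoi_id": 2, "centroid": (9.2, 1.1), "features": [480.0, 2.0]},
]
D = vsbdg((0.2, 0.1), parcels[0], voronoi[0], 4, parcels, voronoi)
```

With k = 4 the sketch selects one similar polygon and returns one residential dummy, its POI, the POI of the true location's polygon, and the true location itself. A production version would shuffle the output so that the true location does not always appear last.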
5. Experimental Analysis and Results
5.1. Data Collection and Preprocessing
5.2. Electing Dummy Locations Using VSBDG
5.3. Results
6. Discussions
6.1. Evaluating VSBDG
6.2. Comparison with the Existing Dummy Approaches
7. Conclusions
- We proposed a novel VSBDG algorithm that generates dummies capable of withstanding temporal constraint attacks.
- The VSBDG algorithm is capable of handling both location homogeneity attacks and map-matching attacks for two reasons: (1) POI influence in a spatial area is modeled using Voronoi polygons, and a cosine similarity search finds areas within a city that have similar POI influence; (2) the parcel-based similarity search [21] is adopted to construct dummy locations within each Voronoi polygon from parcels that are spatially similar to the legitimate location's parcel.
- Our findings show a high average minimum dispersion distance (MDD) of 5861.894 m and 6258.046 m for residential and POI locations, respectively, indicating that the locations are widely dispersed, which favors location privacy.
- The results show an average physical dispersion cosine similarity (PDCS) of 0.988 between the MDD values of residential and POI locations in location sets with sizes ranging from 4 to 22. This demonstrates a strong and scalable semantic balance within an output location set of the VSBDG algorithm and suggests good location privacy protection against a temporal constraint attack.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sun, G.; Chang, V.; Ramachandran, M.; Sun, Z.; Li, G.; Yu, H.; Liao, D. Efficient location privacy algorithm for Internet of Things (IoT) services and applications. J. Netw. Comput. Appl. 2017, 89, 3–13.
- Liu, B.; Zhou, W.; Zhu, T.; Gao, L.; Xiang, Y. Location privacy and its applications: A systematic study. IEEE Access 2018, 6, 17606–17624.
- Jiang, H.; Li, J.; Zhao, P.; Zeng, F.; Xiao, Z.; Iyengar, A. Location Privacy-preserving Mechanisms in Location-based Services: A Comprehensive Survey. ACM Comput. Surv. 2021, 54, 1–36.
- Xu, X.; Chen, H.; Xie, L. A Location Privacy Preservation Method Based on Dummy Locations in Internet of Vehicles. Appl. Sci. 2021, 11, 4594.
- Schirmer, P.M.; van Eggermond, M.A.; Axhausen, K.W. The Role of Location in Residential Location Choice Models: A Review of Literature. J. Transp. Land Use 2014, 7, 3–21. Available online: http://www.jstor.org/stable/26202678 (accessed on 1 January 2023).
- Kounadi, O.; Lampoltshammer, T.J.; Leitner, M.; Heistracher, T. Accuracy and privacy aspects in free online reverse geocoding services. Cartogr. Geogr. Inf. Sci. 2013, 40, 140–153.
- Chen, S.H.; Shen, H. Semantic-Aware Dummy Selection for Location Privacy Preservation. In Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016; pp. 752–759.
- Zhang, S.; Li, M.; Liang, W.; Sandor, V.K.A.; Li, X. A Survey of Dummy-Based Location Privacy Protection Techniques for Location-Based Services. Sensors 2022, 22, 6141.
- Kido, H.; Yanagisawa, Y.; Satoh, T. Protection of Location Privacy Using Dummies for Location-Based Services. In Proceedings of the International Conference on Data Engineering Workshops, Tokyo, Japan, 3–4 April 2005.
- Lu, H.; Jensen, C.S.; Yiu, M.L. PAD: Privacy-Area Aware, Dummy-Based Location Privacy in Mobile Services. In Proceedings of the Seventh ACM International Workshop on Data Engineering for Wireless and Mobile Access, Vancouver, BC, Canada, 13 June 2008.
- Niu, B.; Zhang, Z.; Li, X.; Li, H. Privacy-Area Aware Dummy Generation Algorithms for Location-Based Services. In Proceedings of the IEEE International Conference on Communications (ICC), Sydney, Australia, 10–14 June 2014; pp. 957–962.
- Niu, B.; Li, Q.; Zhu, X.; Cao, G.; Li, H. Achieving k-Anonymity in Privacy-Aware Location-Based Services. In Proceedings of the IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, Toronto, ON, Canada, 27 April–2 May 2014; pp. 754–762.
- Nisha, N.; Natgunanathan, I.; Xiang, Y. An enhanced location scattering based privacy protection scheme. IEEE Access 2022, 10, 21250–21263.
- Zhang, Y.; Zhang, Q.; Li, Z.; Yan, Y.; Zhang, M. A k-anonymous Location Privacy Protection Method of Dummy Based on Geographical Semantics. Int. J. Netw. Secur. 2019, 21, 937–946.
- Anamala, B.M.; Subramanian, S. Dispersed dummy selection approach for location-based services to preempt user-profiling. Concurr. Comput. Pract. Exp. 2021, 33, e6361.
- Shi, X.; Zhang, J.; Gong, Y. A Dummy Location Generation Algorithm Based on the Semantic Quantification of Location. In Proceedings of the IEEE International Conference Artificial Intelligence and Computer Applications (ICAICA), Dalian, China, 28–30 June 2021; pp. 172–176.
- Zhang, A.; Li, X. Research on privacy protection of dummy location interference for Location-Based Service location. Int. J. Distrib. Sens. Netw. 2022, 18, 15501329221125111.
- Wernke, M.; Skvortsov, P.; Dürr, F.; Rothermel, K. A classification of location privacy attacks and approaches. Pers. Ubiquit. Comput. 2014, 18, 163–175.
- Alyousef, A.; Srinivasan, K.; Alrahhal, M.S.; Alshammari, M.; AL-Akhras, M. Preserving Location Privacy in the IoT against Advanced Attacks using Deep Learning. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 416–427.
- Jagarlapudi, H.N.S.S.; Lim, S.; Chae, J.; Choi, G.S.; Pu, C. Drone Helps Privacy: Sky Caching Assisted k-Anonymity in Spatial Querying. IEEE Syst. J. 2022, 16, 6360–6370.
- Tadakaluru, A. Context Optimized and Spatial Aware Dummy Locations Generation Framework for Location Privacy. J. Geovis. Spat. Anal. 2022, 6, 27.
- Parmar, D.; Rao, U.P. Dummy Generation-Based Privacy Preservation for Location-Based Services. In Proceedings of the 21st International Conference on Distributed Computing and Networking (ICDCN 2020), New York, NY, USA, 4–7 January 2020.
- Kalnis, P.; Ghinita, G.; Mouratidis, K.; Papadias, D. Preventing location-based identity inference in anonymous spatial queries. IEEE Trans. Knowl. Data Eng. 2007, 19, 1719–1733.
- Zandbergen, P.A. A comparison of address point, parcel and street geocoding techniques. Comput. Environ. Urban Syst. 2008, 32, 214–232.
- Evans, D.G.; Jones, S.M. Detecting Voronoi (area-of-influence) polygons. Math. Geol. 1987, 19, 523–537.
- Van Dongen, S.; Enright, A.J. Metric distances derived from cosine similarity and Pearson and Spearman correlations. arXiv 2012, arXiv:1208.3145.
- Zhang, C.; Liang, H.; Wang, K.; Sun, J. Personalized Trip Recommendation with POI Availability and Uncertain Traveling Time. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 18–23 October 2014; pp. 911–920.
- NYC OpenData. Department of Finance Digital Tax Map. 2022. Available online: https://data.cityofnewyork.us/Housing-Development/Department-of-Finance-Digital-Tax-Map/smk3-tmxj (accessed on 1 September 2022).
- NYC OpenData. Points of Interest. 2022. Available online: https://data.cityofnewyork.us/City-Government/Points-Of-Interest/rxuy-2muj (accessed on 16 November 2022).
- Esri Inc. ArcGIS Pro, version 2.8.2; Esri Inc.: Redlands, CA, USA, 2021.
- GIS.NY.GOV. NYS Civil Boundaries. 2022. Available online: https://gis.ny.gov/gisdata/inventories/details.cfm?DSID=927 (accessed on 1 September 2022).
- Esri Inc. World Imagery. Available online: https://www.arcgis.com/home/item.html?id=10df2279f9684e4a9f6a7f08febac2a9 (accessed on 1 September 2022).
- RStudio Team. RStudio: Integrated Development Environment for R; RStudio: Boston, MA, USA, 2021; Available online: http://www.rstudio.com/ (accessed on 18 December 2022).
- Shokri, R.; Theodorakopoulos, G.; Le Boudec, J.Y.; Hubaux, J.P. Quantifying Location Privacy. In Proceedings of the 2011 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 22–25 May 2011; pp. 247–262.

Location semantic type of each location (l1 to l5) in a semantically unbalanced location set, by query time

Time | l1 | l2 | l3 | l4 | l5 |
---|---|---|---|---|---|
2:20 a.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
4:01 a.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
6:10 a.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
9:15 a.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
11:50 a.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
3:00 p.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
8:00 p.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |
11:20 p.m. | Restaurant | Residential (true) | Supermarket | Shopping mall | Gas station |

Location semantic type of each location (l1 to l6) in a semantically balanced location set, by query time

Time | l1 | l2 | l3 | l4 | l5 | l6 |
---|---|---|---|---|---|---|
2:20 a.m. | Residential | Residential | Residential (true) | POI | POI | POI |
4:01 a.m. | Residential | Residential | Residential (true) | POI | POI | POI |
6:10 a.m. | Residential | Residential | Residential (true) | POI | POI | POI |
9:15 a.m. | Residential | Residential | Residential (true) | POI | POI | POI |
11:50 a.m. | Residential | Residential | Residential (true) | POI | POI | POI |
3:00 p.m. | Residential | Residential | Residential (true) | POI | POI | POI |
8:00 p.m. | Residential | Residential | Residential (true) | POI | POI | POI |
11:20 p.m. | Residential | Residential | Residential (true) | POI | POI | POI |

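
The contrast between the two location sets tabulated above can be sketched as a small simulation of a temporal constraint attack. The plausibility rule is an assumption made purely for illustration (non-residential places are treated as implausible outside 06:00 to 23:00); the source does not specify opening hours or a concrete adversary model, and real POIs such as gas stations may operate around the clock.

```python
from datetime import time


def is_plausible(semantic_type, t):
    """Hypothetical rule: residential is always plausible, anything else
    only between 06:00 and 23:00 (assumed opening hours)."""
    if semantic_type == "Residential":
        return True
    return time(6, 0) <= t < time(23, 0)


def temporal_constraint_attack(location_set, query_times):
    """Keep only the locations that stay plausible at every observed query time."""
    survivors = set(location_set)
    for t in query_times:
        survivors = {
            name for name in survivors if is_plausible(location_set[name], t)
        }
    return sorted(survivors)


# Query times taken from the tables above.
QUERY_TIMES = [
    time(2, 20), time(4, 1), time(6, 10), time(9, 15),
    time(11, 50), time(15, 0), time(20, 0), time(23, 20),
]

# Semantic types per location; the sets are resubmitted unchanged at each time.
unbalanced = {
    "l1": "Restaurant", "l2": "Residential", "l3": "Supermarket",
    "l4": "Shopping mall", "l5": "Gas station",
}
balanced = {
    "l1": "Residential", "l2": "Residential", "l3": "Residential",
    "l4": "POI", "l5": "POI", "l6": "POI",
}

exposed = temporal_constraint_attack(unbalanced, QUERY_TIMES)
hidden = temporal_constraint_attack(balanced, QUERY_TIMES)
```

Under this assumed rule the unbalanced set collapses to the single residential location, while the balanced set still leaves three residential candidates, so the adversary cannot single out the true one from night-time queries alone.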

Minimum dispersion distance (MDD) of residential locations, in meters

Location Set Size (k) | True Location-1 (T-RES-1) | True Location-2 (T-RES-2) | True Location-3 (T-RES-3) |
---|---|---|---|
4 | 9291.308514 | 13685.24076 | 26978.01745 |
6 | 9291.308514 | 10923.37298 | 11920.92343 |
8 | 9291.308514 | 2073.75585 | 11834.15819 |
10 | 9291.308514 | 2073.75585 | 6944.580188 |
12 | 4076.586856 | 2073.75585 | 6944.580188 |
14 | 4076.586856 | 2073.75585 | 3461.144276 |
16 | 3194.775997 | 2073.75585 | 3461.144276 |
18 | 3101.138163 | 2073.75585 | 2069.387356 |
20 | 3101.138163 | 1618.185695 | 2069.387356 |
22 | 3101.138163 | 1618.185695 | 2069.387356 |

Minimum dispersion distance (MDD) of POI locations, in meters

Location Set Size (k) | True Location-1 (T-POI-1) | True Location-2 (T-POI-2) | True Location-3 (T-POI-3) |
---|---|---|---|
4 | 10368.38844 | 14322.61609 | 25800.26266 |
6 | 10368.38844 | 8369.51998 | 12034.79693 |
8 | 10368.38844 | 3641.198031 | 11580.28315 |
10 | 10368.38844 | 3641.198031 | 7445.09195 |
12 | 3427.856197 | 3641.198031 | 7248.846177 |
14 | 3427.856197 | 3641.198031 | 3037.160013 |
16 | 3427.856197 | 3641.198031 | 3037.160013 |
18 | 2799.307792 | 3310.462408 | 2191.07635 |
20 | 2799.307792 | 3310.462408 | 2191.07635 |
22 | 2799.307792 | 3310.462408 | 2191.07635 |

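
The average MDD figures quoted in the conclusions can be reproduced from the two MDD tables above by taking the mean over all three true locations and all ten set sizes. The sketch below assumes this is how the averages were computed; the variable names are illustrative.

```python
from statistics import mean

# MDD values in meters for k = 4, 6, ..., 22, copied from the tables above.
MDD_RES = {
    "T-RES-1": [9291.308514, 9291.308514, 9291.308514, 9291.308514, 4076.586856,
                4076.586856, 3194.775997, 3101.138163, 3101.138163, 3101.138163],
    "T-RES-2": [13685.24076, 10923.37298, 2073.75585, 2073.75585, 2073.75585,
                2073.75585, 2073.75585, 2073.75585, 1618.185695, 1618.185695],
    "T-RES-3": [26978.01745, 11920.92343, 11834.15819, 6944.580188, 6944.580188,
                3461.144276, 3461.144276, 2069.387356, 2069.387356, 2069.387356],
}
MDD_POI = {
    "T-POI-1": [10368.38844, 10368.38844, 10368.38844, 10368.38844, 3427.856197,
                3427.856197, 3427.856197, 2799.307792, 2799.307792, 2799.307792],
    "T-POI-2": [14322.61609, 8369.51998, 3641.198031, 3641.198031, 3641.198031,
                3641.198031, 3641.198031, 3310.462408, 3310.462408, 3310.462408],
    "T-POI-3": [25800.26266, 12034.79693, 11580.28315, 7445.09195, 7248.846177,
                3037.160013, 3037.160013, 2191.07635, 2191.07635, 2191.07635],
}


def average_mdd(table):
    """Mean MDD over every true location and every location set size."""
    return mean(value for column in table.values() for value in column)


avg_res = average_mdd(MDD_RES)
avg_poi = average_mdd(MDD_POI)
print(f"residential: {avg_res:.3f} m, POI: {avg_poi:.3f} m")
```

The tables also show that MDD never increases as k grows for a given true location, which is expected when each added dummy can only shrink the smallest distance in the set.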
Input True Location | Vectors Measured | Physical Dispersion Cosine Similarity (PDCS) |
---|---|---|
1 | T-RES-1 and T-POI-1 | 0.9966669 |
2 | T-RES-2 and T-POI-2 | 0.9641056 |
3 | T-RES-3 and T-POI-3 | 0.9993439 |
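
The PDCS values above appear to be the cosine similarity between the residential and POI MDD vectors of the same true location, taken over k = 4 to 22. The following sketch recomputes them under that assumption, using the MDD values reported in this paper; the helper and variable names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# (residential MDD vector, POI MDD vector) per input true location, in meters.
PAIRS = {
    1: ([9291.308514, 9291.308514, 9291.308514, 9291.308514, 4076.586856,
         4076.586856, 3194.775997, 3101.138163, 3101.138163, 3101.138163],
        [10368.38844, 10368.38844, 10368.38844, 10368.38844, 3427.856197,
         3427.856197, 3427.856197, 2799.307792, 2799.307792, 2799.307792]),
    2: ([13685.24076, 10923.37298, 2073.75585, 2073.75585, 2073.75585,
         2073.75585, 2073.75585, 2073.75585, 1618.185695, 1618.185695],
        [14322.61609, 8369.51998, 3641.198031, 3641.198031, 3641.198031,
         3641.198031, 3641.198031, 3310.462408, 3310.462408, 3310.462408]),
    3: ([26978.01745, 11920.92343, 11834.15819, 6944.580188, 6944.580188,
         3461.144276, 3461.144276, 2069.387356, 2069.387356, 2069.387356],
        [25800.26266, 12034.79693, 11580.28315, 7445.09195, 7248.846177,
         3037.160013, 3037.160013, 2191.07635, 2191.07635, 2191.07635]),
}

pdcs = {loc: cosine_similarity(res, poi) for loc, (res, poi) in PAIRS.items()}
for loc, value in pdcs.items():
    print(f"true location {loc}: PDCS = {value:.7f}")
```

A value close to 1 means that the residential and POI dispersion curves shrink in nearly the same proportion as k grows, which is the sense in which the location set is described as semantically balanced.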
Vulnerability | VSBDG | COSA [21] | k-LPP [14] | VLBS [16] | DLSS [13] | V-Cir/V-grid [11] | DLIP [17] |
---|---|---|---|---|---|---|---|
Location homogeneity attack | ✓ | ✓ | ✓ | ✓p | X | X | X |
Map-matching attack | ✓p | ✓p | X | X | X | X | X |
Temporal constraint attack | ✓ | X | X | X | X | X | X |
Key Benefits | VSBDG | COSA [21] | k-LPP [14] | VLBS [16] | DLSS [13] | V-Cir/V-grid [11] | Random [9] |
---|---|---|---|---|---|---|---|
Physical dispersion semantic similarity for larger k values | ✓ | X | X | X | X | X | X |
Do not use location query probability | ✓ | ✓ | ✓ | ✓ | X | X | ✓ |
Use spatial context in dummy identification process | ✓ | ✓ | X | X | X | X | X |
Do not submit proxy instead of true location to LBS server | ✓ | ✓ | ✓ | ✓ | X | ✓ | ✓ |
Built on real-world geospatial dataset(s) | ✓ | ✓ | X | X | X | X | X |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tadakaluru, A.; Qin, X. A Voronoi-Based Semantically Balanced Dummy Generation Framework for Location Privacy. Analytics 2023, 2, 246-264. https://doi.org/10.3390/analytics2010013