A Privacy Preserving Framework for Worker’s Location in Spatial Crowdsourcing Based on Local Differential Privacy

Abstract: With the development of the mobile Internet, location-based services play an important role in everyday life. As a new location-based service, Spatial Crowdsourcing (SC) involves collecting and analyzing environmental, social, and other spatiotemporal information of individuals, increasing convenience for users. In SC, users (called requesters) publish tasks, and other users (called workers) are required to physically travel to specified locations to perform the tasks. However, with SC services, workers have to disclose their locations to untrusted third parties, such as the Spatial Crowdsourcing Server (SC-server), which could pose a considerable threat to their privacy. In this paper, we propose a new location privacy protection framework for spatial crowdsourcing based on local differential privacy. Noise is added to workers’ locations locally, so the framework does not require the participation of a trusted third party. The noisy locations of workers, rather than the real locations, are submitted to the SC-server, which protects the workers’ locations. Experiments showed that this framework not only preserves the privacy of workers in SC but also incurs modest performance overhead.


Introduction
With the popularity of location-aware mobile devices, such as global positioning system (GPS) navigators and smartphones, Spatial Crowdsourcing (SC) [1] services have made daily life more convenient. SC is a new type of service that combines location and crowdsourcing, and it is used in applications such as environmental sensing, urban planning, and convenient travel. In SC, users who are called requesters release their tasks to the spatial crowdsourcing server (SC-server), and other users who are called workers upload their locations to the SC-server. The SC-server then assigns the tasks to the workers based on the locations of both the tasks and the workers.
To assign the requesters' tasks to the right workers, the SC-server must know the locations of the workers, because tasks are time-sensitive and workers should not have to travel long distances to complete a task. However, locations are private information: they may reveal a home address or workplace, and attackers may use them to infer personal health, political views, or religious beliefs. Because the SC-server may be untrusted, it may leak the locations of workers to other parties or use the locations for commercial purposes. Therefore, protecting the locations of workers is an urgent issue in SC.
In this paper, we propose a framework for protecting the privacy of workers' locations in spatial crowdsourcing based on local differential privacy. Workers' locations are obfuscated locally with local differential privacy and then sent to the SC-server for task assignment. The SC-server can only access workers' noisy locations rather than the real ones. The proposed framework does not require the participation of trusted third parties, and workers can customize their privacy requirements.
Our contributions are as follows:
1. We propose a new framework that protects the location privacy of workers in SC. In contrast to other solutions, workers' locations are obfuscated locally with noise, so our framework does not require the participation of any trusted third party for data collection and privacy processing, and workers can customize their privacy requirements.
2. We design a task assignment algorithm that assigns workers to tasks using obfuscated locations.
3. We conduct experiments on real-world datasets, which show that the proposed framework provides privacy guarantees with modest performance overhead.
The rest of the paper is organized as follows. Section 2 presents related work. Section 3 introduces the threat model and privacy protection framework. Section 4 discusses the local differential privacy generation algorithm. Experimental set-up and results are presented in Section 5. Finally, Section 6 concludes the paper.

Related Work
Spatial crowdsourcing is becoming increasingly popular, while its location privacy is causing concern. K-anonymity [2][3][4] has been widely used for privacy protection in location-based systems. The main idea of k-anonymity is that the location of a user is hidden among those of k − 1 other users [5][6][7]. Another approach to k-anonymity involves generating k − 1 properly selected dummy points [8,9] and performing k queries to the service provider, using the real and dummy locations. However, the security of k-anonymity is poor when the attacker has background knowledge about the users.
Differential privacy [10] provides a strong privacy guarantee even when attackers have background knowledge. To et al. [11] proposed a framework for protecting the privacy of worker locations in SC based on differential privacy. A trusted third party, the cell service provider (CSP), is needed to sanitize the worker locations and add noise to them before answering the SC-server's queries. However, a trusted third party such as a CSP is not always available, and users' privacy is not guaranteed even if the CSP claims to keep sensitive information safe. In addition, the privacy requirements of some workers cannot be satisfied because the CSP assigns the same privacy budget to all workers.
In this paper, we propose a framework based on local differential privacy [12,13] to protect workers' locations. The framework can defend against attackers with background knowledge because it achieves differentially private protection guarantees. Furthermore, the framework does not require a trusted third party such as the CSP used in [11], and workers can customize their own privacy requirements because they add noise to their locations locally.

Threat Model
Figure 1 shows the basic model of the SC system. There are three participants: requesters, workers, and the SC-server. First, each worker submits their location to the SC-server, which collects and updates a dataset of worker locations. Then, requesters publish their tasks on the SC-server. Next, the SC-server queries the worker location dataset to assign tasks as they are received from requesters. Finally, workers travel to the designated locations to execute the tasks. In SC, whether a worker is willing to accept a task may depend on the distance between the worker and the task. Therefore, the SC-server must obtain the locations of workers when assigning a task. However, locations are very important private information for the workers. From the workers' locations, attackers may infer the workers' homes or workplaces, political views, religious inclinations, etc. A worker may not be willing to send their location to the SC-server without a guarantee of location privacy.
Intuitively, the SC-server cannot be fully trusted. Workers may suffer unpredictable losses if the SC-server leaks their locations due to system security vulnerabilities. Additionally, the SC-server may use these locations illegitimately because location data has commercial value.

Privacy Protection Framework
Our goal in this paper is to protect the location privacy of workers in SC. We designed a framework (Figure 2) in which workers report noisy locations to the SC-server based on local differential privacy. In contrast to the SC model shown in Figure 1, we add an essential step (step 0) in which workers add noise to their real locations themselves, using a local differential privacy algorithm. Therefore, the SC-server obtains only the noisy locations of workers, rather than the real locations, during task assignment.

Noisy Location with Local Differential Privacy
The first step in the proposed framework is to add noise to the real worker location for task assignment at the SC-server. In this section, we address the specific requirements of the SC framework based on the previously proposed geo-indistinguishability method [14]. In our approach, we consider the level of privacy within a radius r: the worker enjoys l-privacy within r, where l represents the worker's level of privacy for that radius and l = ϵr. The definition of local differential privacy based on location (LDPL) is as follows.

Definition Local Differential Privacy on Location (LDPL)
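A mechanism $K$ satisfies LDPL if, following the geo-indistinguishability condition of [14], for all locations $l$, $l'$ with $d(l, l') \le r$ and every set $Z$ of reported locations:

$$K(l)(Z) \le e^{\epsilon\, d(l, l')}\, K(l')(Z) \tag{1}$$

where $d(l, l')$ is the Euclidean distance between $l$ and $l'$, $r$ is the radius of the zone of privacy, $K(l)(Z)$ is the probability that the mechanism reports a point in $Z$ when the real location is $l$, and $\epsilon$ denotes the differential privacy parameter.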
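To illustrate what the definition demands, the following Python sketch checks the inequality pointwise for the planar Laplacian density used in the section "Generating Noise Point". The function names, the sampling ranges, and the log-space comparison are assumptions made for this illustration and are not part of the original framework. The check works because of the triangle inequality for the distance d.

```python
import math
import random


def log_density(l0, z, eps):
    """Log of the planar Laplacian density of reporting z when the true location is l0."""
    return math.log(eps ** 2 / (2.0 * math.pi)) - eps * math.dist(l0, z)


def ldpl_holds(l_a, l_b, z, eps, r):
    """Check the LDPL bound at output point z for the true locations l_a and l_b."""
    d = math.dist(l_a, l_b)
    if d > r:
        return True  # the definition only constrains pairs within radius r
    # density(l_a -> z) <= exp(eps * d) * density(l_b -> z), compared in log space
    return log_density(l_a, z, eps) - log_density(l_b, z, eps) <= eps * d + 1e-9


def random_point(rng, extent=10.0):
    """Hypothetical helper: a uniform point in a square of side 2 * extent."""
    return (rng.uniform(-extent, extent), rng.uniform(-extent, extent))


rng = random.Random(0)
checks = [
    ldpl_holds(random_point(rng), random_point(rng), random_point(rng), eps=0.5, r=5.0)
    for _ in range(10_000)
]
print(all(checks))
```

A numerical check of this kind is only a sanity test on sampled points; it does not replace the analytical argument.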

Achieve Local Differential Privacy on Location
In this section, we describe a method that satisfies LDPL and protects the privacy of workers' locations. First, we use the multivariate Laplacian method to draw a noisy point. Then, we remap that point to the worker's location.

Generating Noise Point
In this paper's framework, instead of uploading the worker's real location $l_0$ to the SC-server, we use a noise function to generate a noisy location $l$ to send to the SC-server. Because the location is two-dimensional (2D), we cannot use the standard Laplacian method. Instead, we use the method previously described [15,16], which is obtained from the standard Laplacian by replacing $|x - u|$ with $d(x, u)$. Given the parameter $\epsilon \in \mathbb{R}^+$, the actual location $l_0 \in \mathbb{R}^2$, and any other location $l \in \mathbb{R}^2$, the probability density function (pdf) of the noise mechanism is:

$$D_\epsilon(l_0)(l) = \frac{\epsilon^2}{2\pi}\, e^{-\epsilon\, d(l_0, l)} \tag{2}$$

where $\epsilon^2 / 2\pi$ is a normalization factor. To draw the point $l$, we switch the pdf defined in Equation (2) to a system of polar coordinates. Following the standard transformation formula, the pdf of the polar Laplacian is:

$$D_\epsilon(r, \theta) = \frac{\epsilon^2}{2\pi}\, r\, e^{-\epsilon r} \tag{3}$$

where $r$ is the distance of $l$ from $l_0$, and $\theta$ is the angle that the line $l\,l_0$ forms with respect to the horizontal axis of the Cartesian system. The polar pdf can be expressed as the product of two marginals. We denote the two random variables by $r$ (radius) and $\theta$ (angle). The two marginals are:

$$D_{\epsilon,R}(r) = \int_0^{2\pi} D_\epsilon(r, \theta)\, d\theta = \epsilon^2\, r\, e^{-\epsilon r} \tag{4}$$

and

$$D_{\epsilon,\Theta}(\theta) = \int_0^{\infty} D_\epsilon(r, \theta)\, dr = \frac{1}{2\pi} \tag{5}$$

Hence, we generate $\theta$ as a random number in the interval $[0, 2\pi)$ with uniform distribution, based on $D_{\epsilon,\Theta}(\theta)$ defined in Equation (5).
In addition, we can obtain the cumulative distribution function (cdf) of the variable $r$ from $D_{\epsilon,R}(r)$ defined in Equation (4). The cdf is:

$$C_\epsilon(r) = \int_0^{r} D_{\epsilon,R}(\rho)\, d\rho = 1 - (1 + \epsilon r)\, e^{-\epsilon r} \tag{6}$$

Finally, we can generate $r$ by inverting $C_\epsilon(r)$ defined in Equation (6): we draw $p$ uniformly from $[0, 1)$ and set

$$r = C_\epsilon^{-1}(p) = -\frac{1}{\epsilon}\left(W_{-1}\!\left(\frac{p - 1}{e}\right) + 1\right) \tag{7}$$

where $W_{-1}$ is the Lambert W function (the −1 branch). Therefore, we can generate the noisy point $(r, \theta)$ by following Algorithm 1.
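The step from the cdf to its inverse is compact, so the following short derivation spells it out. It is a sketch that starts from the radial marginal $\epsilon^2 r e^{-\epsilon r}$ and uses only the defining property of the Lambert W function, $W(x)\,e^{W(x)} = x$. The auxiliary variable $u$ is introduced here for illustration.

```latex
\begin{align*}
C_\epsilon(r) &= \int_0^r \epsilon^2 \rho\, e^{-\epsilon \rho}\, d\rho
               = 1 - (1 + \epsilon r)\, e^{-\epsilon r}
               && \text{(integration by parts)} \\
p = C_\epsilon(r) \;&\Longleftrightarrow\; (1 + \epsilon r)\, e^{-\epsilon r} = 1 - p \\
\text{Let } u = -(1 + \epsilon r): \quad
u\, e^{u} &= -(1 + \epsilon r)\, e^{-\epsilon r}\, e^{-1} = \frac{p - 1}{e} \\
u &= W_{-1}\!\left(\frac{p - 1}{e}\right)
               && \text{(branch $-1$, since } u \le -1\text{)} \\
r &= -\frac{1}{\epsilon}\left(W_{-1}\!\left(\frac{p - 1}{e}\right) + 1\right)
\end{align*}
```

The −1 branch is the relevant one because $r \ge 0$ forces $u \le -1$, while the principal branch only returns values greater than or equal to −1.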

Algorithm 1 Noisy Point Generate Algorithm
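Input: privacy budget $\epsilon$
Output: noisy point $(r, \theta)$
1. Draw $\theta$ uniformly at random from $[0, 2\pi)$.
2. Draw $p$ uniformly at random from $[0, 1)$.
3. Set $r = C_\epsilon^{-1}(p)$ using Equation (7).
4. Return $(r, \theta)$.

In Algorithm 1, we first generate $\theta$ as a random number in the interval $[0, 2\pi)$ with uniform distribution, and then we generate a random number $p$ with uniform probability in the interval $[0, 1)$. Finally, we generate $r$ using Equation (7).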
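A minimal Python sketch of this sampling procedure is given below. To stay within the standard library, it inverts the radial cdf $1 - (1 + \epsilon r)e^{-\epsilon r}$ numerically by bisection instead of evaluating the Lambert W function. That substitution, the function names, and the tolerance are assumptions of this sketch and are not taken from the original algorithm.

```python
import math
import random


def radius_cdf(r, eps):
    """Cdf of the radius of the polar Laplacian: 1 - (1 + eps*r) * exp(-eps*r)."""
    return 1.0 - (1.0 + eps * r) * math.exp(-eps * r)


def inverse_radius_cdf(p, eps, tol=1e-12):
    """Invert radius_cdf by bisection (a stand-in for the Lambert W formula)."""
    lo, hi = 0.0, 1.0 / eps
    while radius_cdf(hi, eps) < p:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if radius_cdf(mid, eps) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol:
            break
    return (lo + hi) / 2.0


def noisy_point(eps, rng=random):
    """Draw a noisy point (r, theta) for privacy budget eps."""
    theta = 2.0 * math.pi * rng.random()  # uniform angle in [0, 2*pi)
    p = rng.random()                      # uniform probability in [0, 1)
    r = inverse_radius_cdf(p, eps)        # radius with cdf value p
    return r, theta


r, theta = noisy_point(0.5, random.Random(42))
print(r, theta)
```

The radius follows a gamma distribution with shape 2 and scale $1/\epsilon$, so its mean is $2/\epsilon$; a smaller privacy budget therefore pushes the reported point farther from the real location on average.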

Remapping Noisy Point to Worker's Location
To generate the worker's noisy location, we then remap the noisy point generated by Algorithm 1 to the location of the worker. The worker's noisy location is generated by following Algorithm 2.
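One plausible reading of this remapping step is a polar-to-Cartesian offset from the real location, sketched below in Python. The function name and the assumption that locations are planar coordinates in a common distance unit (for example, metres in a local projection) are ours; latitude and longitude would first need to be projected.

```python
import math


def remap_to_location(l0, r, theta):
    """Offset the real location l0 = (x0, y0) by the polar noise (r, theta)."""
    x0, y0 = l0
    return (x0 + r * math.cos(theta), y0 + r * math.sin(theta))


# Example: a worker at (10, 20) with noise of radius 5 along the positive x axis.
print(remap_to_location((10.0, 20.0), 5.0, 0.0))
```

The reported point always lies at distance r from the real location, so the privacy behaviour is determined entirely by how (r, θ) was drawn.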

Design Goals and Performance Metrics
Task assignment with noisy data on the SC-server may reduce the effectiveness and efficiency of worker-task matching. Because of the noise introduced by local differential privacy, it is possible that no worker is notified of a task request. Alternatively, the real distance between the worker and the task may be very large. Therefore, we considered the following performance metrics. (1) Assignment Success Rate (ASR): Because workers' locations are noisy, the SC-server may assign workers to tasks that are too far away, and the workers will not accept them. ASR measures the ratio of tasks accepted by a worker to the total number of task requests. (2) Worker Travel Distance (WTD): The SC-server uses noisy data to assign tasks, which may force workers to travel long distances to tasks. WTD measures the distance that the nearest worker must travel to the task.
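The two metrics can be computed as in the following Python sketch. The function names and input shapes (a list of per-task acceptance flags, and a list of real worker coordinates) are assumptions made for illustration.

```python
import math


def assignment_success_rate(task_accepted):
    """ASR: fraction of task requests that were accepted by a worker."""
    if not task_accepted:
        raise ValueError("at least one task request is required")
    return sum(1 for accepted in task_accepted if accepted) / len(task_accepted)


def worker_travel_distance(task_location, accepting_worker_locations):
    """WTD: real distance from the task to the nearest worker that accepts it."""
    if not accepting_worker_locations:
        return None  # no worker accepted, so no travel distance is defined
    return min(math.dist(task_location, w) for w in accepting_worker_locations)


print(assignment_success_rate([True, False, True, True]))
print(worker_travel_distance((0.0, 0.0), [(3.0, 4.0), (6.0, 8.0)]))
```

Note that WTD is measured on real locations, even though the assignment itself only sees noisy ones.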

Experimental Data Set
In this paper, we validate our method using real-world location data from Gowalla, a dataset that contains the check-in history of users. For our experiments, we only used the check-in data of 6100 users in the area of San Francisco, California. We assumed that the users were the workers in the SC system and took their most recent check-in points as their locations. We also modelled each check-in point as a task that was accepted by a worker. The characteristics of the dataset are shown in Table 1.

Task Assignment Algorithm
Given a task t, we need to build a matching region (MR) for the task, within which workers are notified of the task. The algorithm must balance two conflicting requirements: the region must contain enough workers that task t is likely to be accepted, yet the MR should be as small as possible. To build a matching region, we first set the expected utility (EU), which represents the expected success rate of a task. We then set a maximum travel distance (MTD) for task matching, which is the maximum distance a worker must travel to perform a task. The task assignment algorithm is shown in Algorithm 3. Algorithm 3 initially selects the acceptance area of a task, centered on task t, and determines the utility of the task being accepted. For every additional worker, the utility of the task being accepted is recalculated. The algorithm stops when the utility exceeds the threshold EU or the radius of the MR exceeds the MTD.
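The stopping rule described above can be sketched in Python as follows. This is an illustrative reading, not a reproduction of Algorithm 3: the function name and return values are hypothetical, workers are visited in order of their reported (noisy) distance to the task, and the utility uses the "at least one of w independent workers accepts" model with a common acceptance probability.

```python
import math


def build_matching_region(task, noisy_workers, p_accept, expected_utility, max_travel_distance):
    """Grow a region around the task until EU is reached or the MTD is exceeded."""
    # Visit workers from nearest to farthest, based on their noisy locations.
    by_distance = sorted((math.dist(task, w), i) for i, w in enumerate(noisy_workers))
    notified, radius, utility = [], 0.0, 0.0
    for dist, idx in by_distance:
        if dist > max_travel_distance:
            break  # enlarging the region further would exceed the MTD
        notified.append(idx)
        radius = dist
        # Probability that at least one notified worker accepts the task.
        utility = 1.0 - (1.0 - p_accept) ** len(notified)
        if utility >= expected_utility:
            break  # the region is now large enough
    return notified, radius, utility


workers = [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (10.0, 0.0)]
print(build_matching_region((0.0, 0.0), workers, 0.5, 0.7, 5.0))
```

Growing the region one worker at a time keeps it as small as the utility target allows, which is the balance between acceptance probability and region size that the text describes.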

Evaluation Methodology
Given a task t, we first used Algorithm 1, proposed in Section 4.2.2, to add noise to the workers' locations, and then used Algorithm 3 to perform task assignment. We compared our proposed solution with a non-private algorithm that has access to the exact locations of workers so that we could evaluate the overhead of privacy. We considered the privacy budget ϵ ∈ {0.1, . . . , 0.5, . . . , 1}, ranging from strict to loose privacy requirements. We set the expected utility EU ∈ {0.3, 0.5, 0.7, 0.9}. We randomly generated 1000 tasks and measured the performance in terms of ASR and WTD.

Assignment Success Rate
We first conducted experiments on the assignment success rate (ASR). Each worker decides whether or not to accept a received task request based on the distance to the task. We denote by acceptance rate (AR) the probability $p_a$ ($0 \le p_a \le 1$) that a worker accepts a task for which they have received a request. We assumed that all workers were identical and independent of each other in deciding to perform tasks. A task is accepted if at least one worker agrees to perform it. Therefore, the utility of the MR in Algorithm 3 is:

$$U = 1 - (1 - p_a)^{w} \tag{8}$$

where $w$ represents the number of workers in the MR. We compared the proposed solution described in Section 4.2 with the non-private framework. Figure 3 compares the ASR of the non-privacy framework and our privacy framework. The ASR decreases with the privacy budget ϵ. However, only small differences are evident between the ASR of the non-privacy framework and that of our privacy framework.
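As a worked example of this utility model, the Python sketch below computes the smallest number of workers a matching region needs to reach each expected utility used in the evaluation. The acceptance rate of 0.5 is a hypothetical value chosen for illustration, as are the function names. The utility is the probability that at least one of w independent workers accepts.

```python
def region_utility(p_accept, workers):
    """Probability that at least one of `workers` independent workers accepts."""
    return 1.0 - (1.0 - p_accept) ** workers


def min_workers_for(expected_utility, p_accept):
    """Smallest worker count whose utility reaches the expected utility."""
    if not 0.0 < p_accept <= 1.0:
        raise ValueError("p_accept must be in (0, 1]")
    if not 0.0 <= expected_utility < 1.0:
        raise ValueError("expected_utility must be in [0, 1)")
    w = 0
    while region_utility(p_accept, w) < expected_utility:
        w += 1
    return w


for eu in (0.3, 0.5, 0.7, 0.9):
    print(eu, min_workers_for(eu, 0.5))
```

With an acceptance rate of 0.5, one worker suffices for EU = 0.3 and EU = 0.5, two workers give a utility of 0.75 for EU = 0.7, and four workers give 0.9375 for EU = 0.9.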
Future Internet 2018, 10, x FOR PEER REVIEW

Worker Travel Distance
We then examined the worker travel distance under different privacy budgets ϵ. The value of WTD was determined as the distance from the task to the nearest worker that accepts the task.
We also compared the WTD between the non-privacy framework and our privacy framework. The results are shown in Figure 4. The WTD decreases with the privacy budget ϵ. We observed that privacy does not significantly increase WTD compared with the non-privacy case. Therefore, a suitable balance between privacy budget and worker travel distance can readily be found in practical applications.

Conclusions
In this paper, we introduced a novel privacy-aware framework based on local differential privacy for spatial crowdsourcing. We added noise to a worker's location based on multivariate Laplacians. Then, we used the noisy locations to assign tasks. Our experimental results using real data showed that the proposed techniques are effective and that the cost of privacy is practical.
In the future, we will extend our framework to situations where the privacy of both workers and tasks must be protected. We will also focus on protecting the trajectories of workers in SC.

Figure 1. The model of the spatial crowdsourcing (SC) system.

Figure 3. Assignment success rate (ASR) of non-privacy framework and our privacy framework.

Figure 4. Worker travel distance (WTD) of non-privacy framework and our privacy framework.