TripSense: A Trust-Based Vehicular Platoon Crowdsensing Scheme with Privacy Preservation in VANETs

In this paper, we propose a trust-based vehicular platoon crowdsensing scheme for VANETs, named TripSense. TripSense introduces a trust system to evaluate vehicles' sensing abilities and selects the more capable vehicles in order to improve the accuracy of the sensing results. In addition, the sensing tasks are carried out by platoon member (PM) vehicles, and the data are preprocessed by platoon head (PH) vehicles before being uploaded to the server. This is less time-consuming and more efficient than having each platoon member vehicle submit its data individually, and is therefore better suited to ephemeral networks such as VANETs. Moreover, TripSense integrates unlinkable pseudo-ID techniques to achieve PM vehicle identity privacy, and employs a privacy-preserving sensing vehicle selection scheme that does not reveal a PM vehicle's trust score, which protects its location privacy. Detailed security analysis shows that TripSense not only achieves the desired privacy requirements but also resists attacks launched by adversaries. In addition, extensive simulations are conducted to show the correctness and effectiveness of the proposed scheme.


Introduction
Envisioned as one of the most promising applications for implementing intelligent transportation systems (ITS), vehicular platooning [1,2] has the potential to enhance road safety, improve traffic efficiency and reduce energy consumption through air drag reduction [3]. At the same time, with the increasing popularity of mobile devices and sensing technologies, a new sensing paradigm, mobile crowdsensing, has attracted attention from both academia and industry [4]. Unlike traditional sensor networks, this new sensing paradigm leverages the power of crowds for large-scale sensing tasks and fuels the evolution of the Internet of Things (IoT) [5]. Many factories are built in remote areas where sensor resources are limited. If the authority needs to inspect those factories, it can hardly collect information with the existing traditional sensor networks. However, many highways pass through these remote areas, so one solution is to invite vehicles passing by to take part in crowdsensing tasks and to use their sensed data (e.g., temperature, humidity, noise level, and air pollution level).
However, due to the inherent openness of this platform, it is easy for vehicles to contribute corrupted data [6]. As a result, several research efforts have been made to ensure the trustworthiness of the sensed data [7,8]. One possible solution is to establish a reputation system for evaluating the trustworthiness of volunteer contributions in participatory sensing applications [6]. Furthermore, this new way of aggregating data may also raise privacy concerns. For example, the sensed data could reveal the capacity of a vehicle's sensor and hence personal information about the vehicle. Another long-standing concern is location privacy, since the locations of vehicles are closely related to their drivers [9]. To achieve location privacy, one approach is to use unlinkable pseudonyms that are periodically changed when broadcasting messages [10][11][12]. However, pseudonyms do not always ensure privacy. In the example shown in Figure 1, a platoon head vehicle asks its platoon member vehicle $V_i$ to participate in a sensing task. When $V_i$ responds by sending its own reputation score $ts_i$ on Day-1 and again on Day-2, the platoon head vehicle can still link $V_i$ across the two days by matching its trust scores, even though its pseudonym has changed. The unchanged trust score of a vehicle thus compromises its location privacy, and the platoon head vehicle can even derive the driving pattern of the platoon member vehicle. Therefore, it is compelling to build a trust system that vehicles can benefit from without sacrificing their privacy, together with a data aggregation mechanism that ensures data privacy. Based on the observations above, we propose a trust-based vehicular platoon crowdsensing scheme, called TripSense, to improve sensing accuracy while achieving location and data privacy. This scheme uses the vehicular platooning technique to collect and aggregate data.
At the same time, by establishing a trust model to measure the accuracy of a vehicle's sensed data, the service provider (SP) efficiently detects and then excludes the malicious or selfish vehicles that submit corrupted sensed data. Meanwhile, the proposed scheme is characterized by its ability to preserve the location privacy and data privacy of sensing vehicles. With the assistance of platoon head vehicles, the communication overhead and computational cost can be greatly reduced. Specifically, our work features the following:
• First, we establish a trust system based on the Dirichlet distribution to evaluate the sensing accuracy of all sensing vehicles in the proposed TripSense scheme. The historical sensed data are evaluated and finally form a reputation score. Therefore, the sensing accuracy improves greatly when the data are always collected from high-reputation sensing vehicles.
• Second, we propose the TripSense scheme by taking advantage of the unique features of vehicular platooning. In this scheme, platoon head vehicles first authenticate all sensing vehicles inside the platoon and then select some of them according to their trust values. Later, the sensed data from the sensing vehicles are collected and aggregated by platoon head vehicles before they are finally uploaded to the server. Compared with previous works, the proposed scheme reduces the communication overhead and hence is more suitable for the dynamic and ephemeral vehicular ad hoc network.
• Third, we design a privacy-preserving sensing vehicle selection scheme based on our trust system and a privacy-preserving data aggregation scheme based on the efficient commitment scheme in [13], such that platoon head vehicles can collect the data without leaking sensing vehicles' privacy.
The remainder of this paper is organized as follows. In Section 2, we formalize the system model and security model considered in our work, and identify our design goals. In Section 3, we briefly recall bilinear pairings and the Beta and Dirichlet distributions that are applied in the trust and reputation system. In Section 4, the TripSense scheme is presented in detail, together with the rationale on how it can help platoon head vehicles select highly trusted sensing vehicles without learning their exact trust scores. Security analysis is then presented in Section 5, and the performance analysis is given in Section 6. Finally, we present the related work in Section 7 and draw conclusions in Section 8.

Problem Statement
In this section, we define the problem by formalizing the system model, security model and design goal.

System Model
In our model, the service provider (SP) wants to inspect an area of interest (AoI) located near a highway where many platoons pass by. As illustrated in Figure 2, our system model consists of four roles: the service provider (SP), a cloud server (CS), the immobile roadside units (RSUs) along the highway, and mobile vehicles traveling on the highway, which are equipped with onboard units (OBUs) and powerful sensors.
Service Provider (SP): The SP is fully trusted because it is normally controlled by the authority that wants to inspect an area of interest by collecting data about this AoI. The collected data are a vector of readings such as air pollution level, noise level, temperature, and humidity.
The duty of SP is to initialize the whole system, and distribute key materials to RSUs and vehicles. It is also responsible for storing and updating trust values for all vehicles.
Roadside Units (RSUs): The RSUs are subordinate to the SP and are connected to the CS and SP via reliable communication channels. Equipped with wireless devices, RSUs are able to exchange data with passing vehicles. However, due to the high cost of RSU installation and maintenance, especially in the early stage of VANET deployment, RSUs are sparsely deployed along the highway. The RSUs will never disclose any internal information without permission. However, we do not rule out the possibility that a portion of the roadside RSUs are compromised, or that attackers even deploy bogus RSUs. Nevertheless, the SP can inspect all RSUs at a high level: once RSUs are compromised, they will soon be recovered or revoked by SP.
Cloud Server (CS): A CS collects data from RSUs, then aggregates them in a privacy-preserving way. In addition, a CS also computes the sensed data evaluations for vehicles and returns the results to SP. A CS is assumed to be honest but curious about the sensed data of vehicles, which means that it follows the proposed scheme faithfully but tends to be curious and disclose vehicles' privacy.
Vehicles: The vehicles are regarded as a group of highly mobile nodes equipped with OBUs, which allow them to communicate with other vehicles or with RSUs. On the highway, vehicles follow platoon head vehicles to form a platoon. With this driving pattern, the vehicles can be further divided into two categories, platoon head (PH) vehicles and platoon member (PM) vehicles:
• Platoon Head (PH) Vehicles $P_0 = \{ph_1, ph_2, \cdots\}$: PH vehicles take full control of the whole platoon when driving on the highway, and they are responsible for the safety and user experience of all platoon member vehicles. Apart from that, they also claim a sensing task and submit sensed data to RSUs through V2I communication. PH vehicles are also honest but curious about the privacy of platoon member vehicles. In fact, PH vehicles could be malicious and provide untruthful aggregated data to the server in order to subvert the system, or they may even collude with a group of PM vehicles with the objective of victimizing other PM vehicles. However, we do not consider this issue since it is not the main focus of this work. Both PH vehicles and PM vehicles are paid by the SP for leading a platoon or contributing their sensed data.

Security Model
In our security model, we assume that all roles, except the SP, RSUs and malicious PM vehicles, are honest-but-curious, i.e., they will faithfully follow the protocol, but could also snoop into another role's privacy on account of some sensitive information available to them. Specifically, we first consider the privacy requirements of PM vehicles.
Privacy requirements of platoon member vehicles: The privacy requirements of a PM vehicle include its data privacy, location privacy and identity privacy. Since the sensed data are private assets of a PM vehicle, which may reflect sensitive information such as sensor accuracy or sensing ability, the PM vehicles will not disclose them to others. The location privacy requirement means that a PM vehicle will not let its PH vehicle learn its past driving pattern, and the identity privacy requirement means that the PM vehicle tries to keep its real identity secret. Meanwhile, each vehicle is also privacy-curious, i.e., it tends to infer the private information of other vehicles from the information available to it.
We assume that there are two kinds of adversaries, distinguished by their attacking abilities: the first kind tries to impersonate an authorized PM vehicle, and the second kind is able to control a small portion of the vehicles. Specifically, we list the potential attacks as follows:
• Impersonation attack: The first kind of adversary may try to impersonate a PM vehicle to ask for a sensing task, even though the impersonating vehicle may not be qualified in the system. Once chosen to fulfill the task, such unqualified vehicles may submit inaccurate sensed data.
• Malicious sensing attack: The second, more capable kind of adversary is able to control a small fraction of the vehicles in the system, which deliberately submit inaccurate sensed data to subvert the system. Another possible case is that a PM vehicle is selfish and reports arbitrary data without using its sensors in order to save power.
• Trust score association attack: PH vehicles are honest but curious. If the trust score of a PM vehicle is given directly to its PH vehicle, its driving pattern will be disclosed. As described in Section 1, the reason is that the PH vehicle can link a PM vehicle across platoons by matching the same trust score collected in different trips, even though the pseudo-ID has been changed.
• Data analysis attack: Because both PH vehicles and PM vehicles are curious, they may eavesdrop on the transmission of sensed data and try to analyze the data. The cloud server (CS) is also curious about the sensed data. If the data are not encrypted, these attackers can easily analyze the data in transmission.
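To make the trust score association attack concrete, the following minimal Python sketch shows how a curious PH vehicle could link pseudonyms across two trips when exact trust scores are disclosed, and how reporting only a coarse trust level enlarges the candidate set. The pseudonym strings, the scores, and the helper names `link_by_value` and `trust_level` are hypothetical and chosen only for illustration; the ten-level bucketing mirrors the example intervals used in the trust registration step of the scheme.

```python
def link_by_value(trip1, trip2):
    """For each pseudonym seen on trip 1, list the trip-2 pseudonyms that
    reported the same value (the linking attempt of a curious PH vehicle)."""
    return {pid1: sorted(pid2 for pid2, v2 in trip2.items() if v2 == v1)
            for pid1, v1 in trip1.items()}


def trust_level(score_x100):
    """Map a trust score scaled by 100 to a level 1..10, where level x covers
    ((x-1)/10, x/10]. Placing a score of 0 in level 1 is an assumption."""
    return max(1, (score_x100 + 9) // 10)


# Hypothetical observations: pseudonym -> trust score scaled by 100.
trip1 = {"pid_a": 87, "pid_b": 82, "pid_c": 85}
trip2 = {"pid_x": 85, "pid_y": 87, "pid_z": 82}  # same vehicles, new pseudonyms

# Exact scores: every trip-1 pseudonym matches exactly one trip-2 pseudonym.
exact_links = link_by_value(trip1, trip2)

# Trust levels only: all three vehicles fall into the same level.
levels1 = {pid: trust_level(s) for pid, s in trip1.items()}
levels2 = {pid: trust_level(s) for pid, s in trip2.items()}
level_links = link_by_value(levels1, levels2)

print(exact_links)
print(level_links)
```

With exact scores each pseudonym has a single match, whereas with levels every trip-2 pseudonym remains a candidate, which is the intuition behind disclosing only $TL_x$ to the PH vehicle.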

Design Goal
Our design goal is to develop a trust-based privacy-preserving scheme that not only improves the sensing accuracy, but also preserves the privacy of sensing vehicles while resisting the attacks launched by adversaries. Specifically, the following objectives need to be achieved:
• Ensuring the reliability and accuracy of the sensed data. According to our adversary model, the existence of selfish and malicious vehicles that submit corrupted sensed data makes the final results inaccurate and unreliable. Hence, the proposed scheme should improve the accuracy of the sensed data by excluding the data of selfish and malicious vehicles.
• Achieving privacy-preserving sensing vehicle selection, sensed data aggregation and evaluation. The proposed scheme should meet the privacy requirements of PM vehicles. In particular, (i) the real identity of PM vehicles is never disclosed; (ii) when a PM vehicle replies to the PH vehicle, the PH vehicle cannot learn the exact trust score of this PM vehicle; (iii) when a PH vehicle collects and aggregates the sensed data, it cannot learn what the data are; (iv) the CS cannot learn the aggregated sensed data or the evaluations of that data.
• Resisting attacks launched by adversaries. The proposed scheme should also be secure and reliable in VANETs. Once an outside adversary launches an attack, e.g., an impersonation attack or a data analysis attack, the proposed scheme should be able to detect it.

Bilinear Pairing
Let $G$ and $G_T$ be two multiplicative cyclic groups of the same composite order $n$. Then, a bilinear pairing $e : G \times G \to G_T$ satisfies the following properties: (i) Bilinear: for $g, h \in G$ and $a, b \in \mathbb{Z}_n^*$, $e(g^a, h^b) = e(g, h)^{ab}$; (ii) Non-degenerate: if $g \in G$ is a generator of $G$, then $e(g, g) \neq 1_{G_T}$; and (iii) Computable: for $g, h \in G$, $e(g, h)$ can be efficiently computed.

Definition 1 (Bilinear Parameter Generator). A bilinear parameter generator $Gen$ is a probabilistic algorithm that takes a security parameter $\kappa$ as its input, and outputs a six-tuple $(p, q, g, G, G_T, e)$, where $p, q$ are $\kappa$-bit prime numbers, $n = p \cdot q$, $(G, G_T)$ are two multiplicative groups of the same order $n$, $g \in G$ is a generator, and $e : G \times G \to G_T$ is a non-degenerate and efficiently computable bilinear map.
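As an illustration of the pairing properties, the following Python sketch builds a toy bilinear map and checks bilinearity and non-degeneracy numerically. It is only a model under stated assumptions: an element $g^a$ of $G$ is represented directly by its exponent $a \bmod n$, $G_T$ is the order-$n$ subgroup of $\mathbb{Z}_P^*$ for a prime $P$ with $n \mid P - 1$, and the primes are tiny. Such a representation offers no security (discrete logarithms in it are trivial) and is not how a real pairing library works; the names `toy_gen` and `pair` are hypothetical.

```python
def is_prime(m):
    """Trial-division primality test, adequate for toy-sized numbers."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def toy_gen(p, q):
    """Return (n, P, gT): n = p*q, a prime P with n | P-1, and an element
    gT of order exactly n in the multiplicative group modulo P."""
    n = p * q
    k = 1
    while not is_prime(k * n + 1):
        k += 1
    P = k * n + 1
    for x in range(2, P):
        gT = pow(x, (P - 1) // n, P)
        # The order of gT divides n = p*q; ruling out 1, p and q leaves n.
        if pow(gT, p, P) != 1 and pow(gT, q, P) != 1:
            return n, P, gT
    raise ValueError("no element of order n found")


def pair(a, b, n, P, gT):
    """Toy pairing e(g^a, g^b) = gT^(a*b); inputs are exponents of g."""
    return pow(gT, (a * b) % n, P)


n, P, gT = toy_gen(11, 13)
x, y, a, b = 5, 7, 3, 4                        # g^x, g^y in G; scalars a, b
lhs = pair(a * x % n, b * y % n, n, P, gT)     # e((g^x)^a, (g^y)^b)
rhs = pow(pair(x, y, n, P, gT), a * b, P)      # e(g^x, g^y)^(ab)
print(lhs == rhs, pair(1, 1, n, P, gT) != 1)
```

The first printed value checks property (i) for one choice of inputs, and the second checks property (ii) for the generator represented by exponent 1.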

Beta Distribution
Defined on the interval [0,1], the beta distribution is a family of continuous probability distributions indexed by two parameters $\alpha$ and $\beta$. A random variable $X$ that is beta-distributed with parameters $\alpha$ and $\beta$ is denoted by $X \sim \mathrm{Beta}(\alpha, \beta)$. The gamma function is an extension of the factorial function, with $\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$. The probability density function (PDF) $f(x \mid \alpha, \beta)$ can be expressed using the gamma function $\Gamma$ as:
$$f(x \mid \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha-1} (1-x)^{\beta-1}, \quad 0 \le x \le 1,\ \alpha, \beta > 0.$$
The probability expectation value of the beta distribution is given by $E(x) = \frac{\alpha}{\alpha+\beta}$. Figure 3 shows the PDF of the beta distribution for different parameters $\alpha$ and $\beta$. It expresses the uncertain probability that a process will produce positive outcomes in the future. For example, when $\alpha = 8$ and $\beta = 2$, the expectation equation gives $E(x) = 0.8$, which can be interpreted as follows: the relative frequency of positive outcomes is somewhat uncertain, and its expected value is 0.8.
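The example with $\alpha = 8$, $\beta = 2$ can be checked numerically. The short Python sketch below evaluates the standard beta density $\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}$ and the expectation $\alpha/(\alpha+\beta)$; the function names and the midpoint-rule integration are illustrative choices, not part of the scheme.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, written with the gamma function."""
    coef = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return coef * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_mean(a, b):
    """Expectation of Beta(a, b)."""
    return a / (a + b)


# Midpoint-rule check that the density integrates to about 1 on [0, 1].
steps = 10000
area = sum(beta_pdf((i + 0.5) / steps, 8, 2) for i in range(steps)) / steps

print(beta_mean(8, 2), round(area, 6))
```

For these parameters the normalizing coefficient is $\Gamma(10)/(\Gamma(8)\Gamma(2)) = 72$, so the density is $72\,x^7(1-x)$.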

Dirichlet Distribution
The Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by a prior parameter vector $\vec{\alpha}$. It is the conjugate prior distribution for the parameters of the multinomial distribution. In the case of a binary state space, it reduces to the Beta distribution [14]. Generally, we can use the Dirichlet distribution to describe the probability distribution over a $k$-component random variable $\vec{p} = (p_1, p_2, \ldots, p_k)$, where $p_i$ is the probability that $X$ takes its $i$th possible outcome $x_i$. The Dirichlet distribution captures a sequence of observations of $k$ possible outcomes, and those observations serve as the prior parameters $\vec{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ that denote the cumulative observations and initial beliefs of $X$. $\vec{p}$ is a $k$-dimensional random variable and $\vec{\alpha}$ is a $k$-dimensional random observation variable. The probability density function is given by:
$$f(\vec{p} \mid \vec{\alpha}) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} p_i^{\alpha_i - 1},$$
where $0 \le p_1, p_2, \cdots, p_k \le 1$; $\sum_{i=1}^{k} p_i = 1$; $\alpha_1, \alpha_2, \ldots, \alpha_k > 0$. The expected value of the probability that $X$ equals $x_i$, given the observation vector $\vec{\alpha}$, is:
$$E(p_i \mid \vec{\alpha}) = \frac{\alpha_i}{\sum_{j=1}^{k} \alpha_j}.$$

Proposed TripSense Scheme
In this section, we propose our TripSense scheme, which consists of six parts: system initialization, trust-based privacy-preserving sensing vehicle selection, privacy-preserving sensed data aggregation, aggregated sensed data retrieval, privacy-preserving sensed data accuracy evaluation, and Dirichlet-based trust management.

System Initialization
We assume that a service provider (SP) bootstraps the whole system. Specifically, given a security parameter $\kappa$, SP first generates the bilinear parameters $(p, q, g, G, G_T, e)$ by running $Gen(\kappa)$ and then computes $h = g^q \in G$. Next, SP chooses a secure symmetric encryption algorithm $Enc()$, e.g., AES, and a collision-resistant cryptographic hash function $H : \{0, 1\}^* \to \mathbb{Z}_n^*$. In addition, SP chooses a random number $s \in \mathbb{Z}_n^*$ as the master key and computes $P_{pub} = g^s$ and $n = pq$. Finally, SP keeps $p, q$ secret and publishes $\{n, g, h, P_{pub}, G, G_T, e, H, Enc\}$.
RSU REGISTRATION: For each RSU, SP first generates an identity, denoted by $RID$, and then calculates its private key and public key as $(s_r, S_r)$, where $s_r$ is randomly chosen in $\mathbb{Z}_n^*$ and $S_r = g^{s_r}$.
PH VEHICLE REGISTRATION: Any platoon head (PH) vehicle $ph_i \in P_0$ that wants to participate in the sensing task has to register itself with SP and obtain a real identity $RID_i$. Then, SP assigns the private key and public key $(s_i, S_i)$ to $ph_i$, where $s_i$ is randomly chosen in $\mathbb{Z}_n^*$ and $S_i = g^{s_i}$.
PM VEHICLE REGISTRATION: Each platoon member (PM) vehicle $v_j \in V_0$ that wants to take part in the sensing task and contribute its sensed data first registers itself in the system. The following steps between SP and $v_j$ show the registration process.
• SP first chooses a random number $k_0 \in \mathbb{Z}_n^*$ and uses $Enc()$ to compute a set of pseudo-IDs $\mathcal{PID}_j = \{PID_{j1}, PID_{j2}, \cdots\}$, where each pseudo-ID $PID_{jk} \in \mathcal{PID}_j$ is computed as $PID_{jk} = Enc_{k_0}(ID_j \| r_{jk})$ with a fresh random number $r_{jk} \in \mathbb{Z}_n^*$. Then, for each $PID_{jk}$, SP calculates its corresponding private key $s_{jk} = H(PID_{jk})^s$ and public key $S_{jk} = g^{s_{jk}}$. Finally, SP sends $\mathcal{PID}_j$ and the corresponding public and private keys back to $v_j$ via a secure channel. $= e(s_{jk}, g)$.
TRUST REGISTRATION: Each registered PM vehicle $v_j$ is given a trust score $ts_j$ by SP before it is able to take part in the sensing task, where $ts_j \in [0, 1]$ with a precision of two decimal places. Initially, $ts_j = ts_0$. In addition, SP defines $L$ trust levels $\{TL_1, TL_2, \cdots, TL_L\}$ covering all trust scores from 0 to 1. For instance, with $L = 10$, $TL_1$ covers $(0, 0.1]$, $TL_2$ covers $(0.1, 0.2]$, $\cdots$, and $TL_{10}$ covers $(0.9, 1]$. Then, SP selects $L$ random elements $y_1, y_2, \cdots, y_L \in \mathbb{Z}_n^*$ as master keys, and publishes the corresponding public keys. For a registered PM vehicle $v_j$ with trust score $ts_j \in TL_x$, where $x \in [1, L]$, SP signs each of its pseudonyms $PID_{jk} \in \mathcal{PID}_j$ as $A_{jk} = g^{\frac{1}{y_x + ts_j + s_{jk} + H(T_j)}}$, where $T_j$ is the timestamp of the latest update of $v_j$'s trust score.
TASK REGISTRATION: Before a task is broadcast to PH vehicles, it should be registered by SP. First, the sensed data categories need to be decided, such as air pollution level, noise level, temperature, and humidity. Second, the data format is defined as a vector $\vec{d} = (d^A, d^B, \cdots, d^Z)$, where each element denotes one category. SP also specifies that each piece of data has a precision of two decimal places, and the location of the AoI is included in the task. Finally, SP decides a sensing vehicle trust level threshold $TL_{TH}$ according to the accuracy requirements of the task, to make sure that the sensed data only come from the more trusted vehicles.
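The mapping from a two-decimal trust score to a trust level, and the comparison against the task threshold $TL_{TH}$, can be sketched in Python as follows. This is a plain (non-private) illustration of the bucketing only; in the scheme itself the PH vehicle never sees the score. The function names, the equal-width levels, and the choice to place a score of exactly 0 in level 1 are assumptions made for this sketch.

```python
def trust_level(ts, levels=10):
    """Return x such that ts lies in ((x-1)/levels, x/levels].

    Assumes equal-width levels with `levels` dividing 100, and (as an
    assumption of this sketch) places a score of exactly 0 in level 1.
    """
    ts_x100 = round(ts * 100)          # two-decimal precision -> exact integer
    if not 0 <= ts_x100 <= 100:
        raise ValueError("trust score must be in [0, 1]")
    width = 100 // levels
    return max(1, -(-ts_x100 // width))  # ceiling division on integers


def eligible(ts, threshold_level, levels=10):
    """True if the vehicle's trust level meets the task threshold TL_TH."""
    return trust_level(ts, levels) >= threshold_level


print(trust_level(0.10), trust_level(0.11), trust_level(0.91))
print(eligible(0.85, 9), eligible(0.80, 9))
```

Working on the score scaled by 100 avoids floating-point boundary effects, which matters because the interval endpoints such as 0.1 belong to the lower level.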

Trust-Based Privacy-Preserving Sensing Vehicle Selection
We assume that there are $m$ PH vehicles in the system that would like to participate in the sensing task. They form a new set $P = \{ph_1, ph_2, \cdots, ph_m\}$. A specific PH vehicle $ph_i \in P$ needs to collect sensed data from all registered PM vehicles in its platoon and then select those that meet the trust level requirement. Therefore, a trust-based privacy-preserving sensing vehicle selection scheme is proposed as follows:
Step 1: When the platoon approaches the AoI, $ph_i$ broadcasts its sensing request $\{PID_i \| H(T)^{s_i} \| T\}$ to all platoon members, where $T$ is the current timestamp.
Step 2: In $ph_i$'s platoon, each registered PM vehicle $v_j$, after receiving $ph_i$'s request, first verifies whether $ph_i$ is a registered PH vehicle by checking $e(H(T)^{s_i}, g) \stackrel{?}{=} e(S_i, H(T))$. If it holds, $v_j$ accepts the request; otherwise, $v_j$ rejects it. Then $v_j$ responds to $ph_i$ with $\{PID_{jk} \| H(T')^{s_{jk}} \| TL_x \| \Pi \| T' \| T_j\}$, calculated as follows, where $T'$ is the current timestamp and $T_j$ is the latest timestamp at which $v_j$'s trust score was updated.
• Since the trust scores of PM vehicles have two decimal places, they are multiplied by 100 before they are encrypted. If $v_j$'s trust score is within trust level $TL_x$, where $x \in [1, L]$, $v_j$ encrypts the expanded trust score $ts_j$ as $C = g^{ts_j} h^{r}$, with a fresh random number $r \in \mathbb{Z}_n^*$.
• Similarly, the sensed data are all collected and multiplied by 100. For one category in $\vec{d}$, $v_j$ encrypts the expanded sensed data $d_j$ as $D = g^{d_j} h^{r'}$, where $r'$ is a random number in $\mathbb{Z}_n^*$.
• $v_j$ chooses a pseudonym $PID_{jk}$ and a random element $v \in \mathbb{Z}_n^*$ to calculate the proof elements $B$ and $E$.
• $v_j$ randomly chooses $\hat{ts}_j, \hat{r}, \hat{v} \in \mathbb{Z}_n^*$ and computes $\hat{C} = g^{\hat{ts}_j} h^{\hat{r}}$ and $\hat{E} = B^{-\hat{ts}_j} g^{\hat{v}}$.
• $v_j$ calculates the proof $\Pi = \{C, B, E, D, z_1, z_2, z_3, \varphi\}$.
Step 3: After receiving the response from $v_j$, $ph_i$ first checks whether $v_j$ is a registered PM vehicle by checking $e(g, H(T')^{s_{jk}}) \stackrel{?}{=} e(S_{jk}, H(T'))$. Then, $ph_i$ checks whether the timestamp $T'$ is recent. Next, $ph_i$ checks whether $v_j$'s trust score $ts_j$ is within $TL_x$ by verifying the proof $\Pi$, which involves $e(E, g)$ and the hash $H(C, B, E, D, \hat{C}, \hat{E}, H(T)^{s_i})$. If it holds, $ph_i$ finally checks whether $TL_x \ge TL_{TH}$, and accepts $v_j$'s sensed data once its trust level satisfies the task requirement.

Privacy-Preserving Sensed Data Aggregation
For each PH vehicle $ph_i \in P$, where $i \in [1, m]$, we assume that $n_i$ PM vehicles meet the trust level threshold requirement, and that those eligible PM vehicles form a set $V_i = \{v_{i1}, v_{i2}, \cdots, v_{in_i}\}$. After a PH vehicle $ph_i$ receives the sensed data from its PM vehicles, it selects the eligible data and aggregates them locally before submitting them to CS for global aggregation. Therefore, a privacy-preserving data aggregation scheme is proposed.
Local Aggregation: Take one data category, $d^A$, in $\vec{d}$ as an example. For simplicity, we omit the superscript and write $D_{in_i}$ instead of $D^A_{in_i}$. As described in Section 4.2, $ph_i$ collects encrypted sensed data from $n_i$ PM vehicles as $D_{i1} = g^{d_{i1}} \cdot h^{r'_{i1}}, D_{i2} = g^{d_{i2}} \cdot h^{r'_{i2}}, \cdots, D_{in_i} = g^{d_{in_i}} \cdot h^{r'_{in_i}}$ together with their corresponding encrypted trust scores $C_{i1} = g^{ts_{i1}} \cdot h^{r_{i1}}, C_{i2} = g^{ts_{i2}} \cdot h^{r_{i2}}, \cdots, C_{in_i} = g^{ts_{in_i}} \cdot h^{r_{in_i}}$, where $r_{ij}$ and $r'_{ij}$ are the random numbers used for the trust score and the sensed data, respectively. Then, $ph_i$ combines the encrypted data $D_{ij}$ and trust score $C_{ij}$ of each PM vehicle $v_{ij} \in V_i$, where $j \in [1, n_i]$, using the pairing:
$$e(D_{ij}, C_{ij}) = e(g^{d_{ij}} \cdot h^{r'_{ij}},\ g^{ts_{ij}} \cdot h^{r_{ij}}) = e(g, g)^{d_{ij} ts_{ij}} \cdot e(g, h)^{d_{ij} r_{ij} + ts_{ij} r'_{ij}} \cdot e(h, h)^{r_{ij} r'_{ij}}.$$
Then $ph_i$ aggregates the paired values of all PM vehicles in $V_i$ as:
$$\varphi_i = \prod_{j=1}^{n_i} e(D_{ij}, C_{ij}).$$
Finally, when $ph_i$ drives within the transmission range of an RSU, it submits $\varphi_i$ together with all pseudo-IDs in $V_i$ to CS via the RSU.
Global Aggregation: Upon receiving reports from all PH vehicles in $P$, CS aggregates those data as follows, and passes the final result $\Phi$ and all pseudo-IDs of PM vehicles to SP:
$$\Phi = \prod_{i=1}^{m} \varphi_i.$$

Aggregated Sensed Data Retrieval
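Once SP receives the aggregated sensed data $\Phi$ from CS, it retrieves the result using its secret key $p$. Since $h = g^q$ and $n = pq$, we have $h^p = g^{n} = 1$, so every factor involving $h$ vanishes when $\Phi$ is raised to the power $p$:
$$\Phi^p = \left(e(g, g)^p\right)^{\sum_{i=1}^{m} \sum_{j=1}^{n_i} d_{ij} \cdot ts_{ij}}.$$
Since the aggregated data $\sum_{i=1}^{m} \sum_{j=1}^{n_i} d_{ij} \cdot ts_{ij}$ lies in a small space, SP can retrieve it by exhaustive search over the discrete logarithm of $\Phi^p$ to the base $e(g, g)^p$. From the pseudo-IDs in $V_i$, $i \in [1, m]$, SP is able to find the real identities and the corresponding trust scores $ts_{ij}$, $i \in [1, m]$, $j \in [1, n_i]$. Then, SP computes the sensed result $d_0$ using a weighted majority method:
$$d_0 = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} d_{ij} \cdot ts_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n_i} ts_{ij}}.$$
For each category, there is one sensed result; therefore the sensed result vector is $\vec{d}_0 = (d^A_0, d^B_0, \cdots, d^Z_0)$. After dividing by 100 to undo the earlier scaling, SP obtains the final sensed result.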
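The aggregation and retrieval steps can be traced end to end with a toy model. The Python sketch below is an illustration under strong assumptions, not the scheme's implementation: group elements of $G$ are represented by their exponents modulo $n$ (so $g^m h^r$ with $h = g^q$ becomes $m + qr \bmod n$), $G_T$ is the order-$n$ subgroup of $\mathbb{Z}_P^*$ for a prime $P$ with $n \mid P-1$, the primes are tiny, and the readings and trust scores are invented. This model has no security at all and only shows why raising $\Phi$ to the secret $p$ removes the blinding terms. The weighted result is computed as the trust-weighted average, which is this sketch's reading of the weighted majority method; all function names are hypothetical.

```python
import random


def is_prime(m):
    """Trial-division primality test, adequate for toy-sized numbers."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def next_prime(m):
    """Smallest prime strictly greater than m."""
    m += 1
    while not is_prime(m):
        m += 1
    return m


def toy_gen(p, q):
    """Return (n, P, gT) with n = p*q, prime P = k*n + 1, gT of order n mod P."""
    n = p * q
    k = 1
    while not is_prime(k * n + 1):
        k += 1
    P = k * n + 1
    for x in range(2, P):
        gT = pow(x, (P - 1) // n, P)
        if pow(gT, p, P) != 1 and pow(gT, q, P) != 1:
            return n, P, gT
    raise ValueError("no element of order n found")


def encrypt(m, q, n, rng):
    """Exponent of g^m * h^r with h = g^q, i.e. (m + q*r) mod n."""
    return (m + q * rng.randrange(1, n)) % n


def aggregate(pairs, n, P, gT):
    """Product over vehicles of the toy pairing e(D, C) = gT^(D*C)."""
    phi = 1
    for D, C in pairs:
        phi = phi * pow(gT, (D * C) % n, P) % P
    return phi


def retrieve(Phi, p, q, P, gT):
    """SP side: raise to p, then search the small message space exhaustively."""
    target, base, acc = pow(Phi, p, P), pow(gT, p, P), 1
    for M in range(q):
        if acc == target:
            return M
        acc = acc * base % P
    raise ValueError("sum outside the searchable range")


rng = random.Random(0)
p, q = next_prime(1000), next_prime(10 ** 6)   # toy primes; p is SP's secret
n, P, gT = toy_gen(p, q)

data = [2150, 2300, 2200]    # hypothetical readings scaled by 100
trust = [90, 80, 70]         # hypothetical trust scores scaled by 100

platoon1 = [(encrypt(d, q, n, rng), encrypt(t, q, n, rng))
            for d, t in zip(data[:2], trust[:2])]
platoon2 = [(encrypt(data[2], q, n, rng), encrypt(trust[2], q, n, rng))]

phi1 = aggregate(platoon1, n, P, gT)   # local aggregation by PH vehicle 1
phi2 = aggregate(platoon2, n, P, gT)   # local aggregation by PH vehicle 2
Phi = phi1 * phi2 % P                  # global aggregation by CS

M = retrieve(Phi, p, q, P, gT)         # recovers the sum of d * ts
d0 = M / sum(trust) / 100              # assumed weighted average, unscaled
print(M, d0)
```

The recovered sum is unique only while it stays below $q$ in this toy model, which mirrors the requirement that the aggregated value lie in a small space for the exhaustive search to be practical.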

Privacy-Preserving Sensed Data Accuracy Evaluation
We assume that all PM vehicles in each PH vehicle $ph_i$'s platoon contribute their sensed data to the task, where $ph_i \in P$, $i \in [1, m]$. These PM vehicles form a set, denoted as $V = \{v_1, v_2, \cdots, v_n\}$, where $n$ is the total number of these PM vehicles. After the sensed result is computed, SP would like to evaluate the sensed data accuracy in this task for each PM vehicle that contributed data. Specifically, for $v_k \in V$, from Section 4.3, we learn that CS stores $v_k$'s pseudo-ID $PID_k$ and its encrypted sensed data $D^A_k$ for sensing category $A$. The evaluation score $f_k \in [0, 1]$ for $v_k$ in this task can be calculated by following the steps below:
Step 1: Given that there are many sensing categories in the sensed data vector, SP first defines a tolerance value for each sensing category in the sensed result vector $\vec{d}$. These tolerance values form another vector. The tolerance value can be explained in this way: if the difference between the sensed data and the sensed result is larger than the tolerance value, the sensed data accuracy is unacceptable and $f_k = 0$. In addition, SP also defines the weights for the different sensing categories as $\omega_A, \omega_B, \cdots, \omega_Z$, which satisfy $\omega_A + \omega_B + \cdots + \omega_Z = 1$.
Step 2: We take sensing category $A$ as an example. SP needs to calculate the difference between the sensed result and the sensed data, $\Delta d^A_k = |d^A_0 - d^A_k|$. When there are many vehicles in the VANET, the computation cost is too large for SP, so this should be done by CS in a privacy-preserving way as follows. Note that, in case $d^A_0$ is not an integer, SP rounds it to the nearest integer.
• SP encrypts the sensed result $d^A_0$ and passes it to CS. CS pairs it with the stored encrypted data of $v_k$, with a separate pairing for the case $d_0 < d_k$, and returns the result to SP. Upon receiving CS's message, SP first finds the real identity of $v_k$ according to its pseudo-ID $PID_k$, then retrieves $\Delta d^A_k = |d^A_0 - d^A_k|$ using the same exhaustive search method as in Section 4.4.
Step 3: After calculating $\Delta d^A_k$, SP calculates the differences for the other sensing categories in a similar way: $\Delta d^B_k, \Delta d^C_k, \cdots, \Delta d^Z_k$. Finally, SP calculates the evaluation score $f_k$ for $v_k$ in this task from these differences, the tolerance values, and the category weights.

Dirichlet-Based Trust Management
For a specific PM vehicle $v_k \in V$, the SP would like to evaluate its trustworthiness from its evaluation scores. Since the trustworthiness of $v_k$ reflects its performance over a long period, SP first collects $v_k$'s evaluation scores over many tasks, denoted by a continuous random variable $X$ ($0 \le X \le 1$). From these historical records, SP can estimate the future distribution of $X$ by using the Dirichlet distribution. Since the Dirichlet distribution is based on an initial belief about an unknown event according to a prior distribution, it provides a solid mathematical foundation for measuring the uncertainty of feedback based on historical data. Compared to the Beta distribution, which is more appropriate for binary satisfaction levels [15], the Dirichlet distribution is more appropriate for multi-valued satisfaction levels [16]. In our case, the trustworthiness of vehicles is described by continuous trust scores. Therefore, SP uses the Dirichlet distribution to estimate the future performance of candidate vehicles and builds the trust model accordingly.
As described in Section 4.5, once $v_k$ finishes many sensing tasks, the SP is able to collect $v_k$'s historical evaluation scores. Then, we let $\vec{\gamma} = \{\gamma_1, \gamma_2, \cdots, \gamma_l\}$ denote the vector of cumulative evaluation scores and initial beliefs of $X$. With a posterior Dirichlet distribution, $\vec{p}$ can be modeled as:
$$f(\vec{p} \mid \xi) = \mathrm{Dir}(\vec{p} \mid \vec{\gamma}) = \frac{\Gamma\left(\sum_{i=1}^{l} \gamma_i\right)}{\prod_{i=1}^{l} \Gamma(\gamma_i)} \prod_{i=1}^{l} p_i^{\gamma_i - 1},$$
where $\xi$ denotes the background information represented by $\vec{\gamma}$. Let $\gamma_0 = \sum_{i=1}^{l} \gamma_i$. The expected value of the probability that $X \in (\theta_{i-1}, \theta_i]$, given the historical distribution of evaluation scores, is:
$$E(p_i \mid \vec{\gamma}) = \frac{\gamma_i}{\gamma_0}.$$
To account for the age of the historical evaluation scores, we introduce a forgetting factor $\eta$ that gives greater weight to more recent evaluation scores when $\vec{\gamma}$ is accumulated. Here, $n$ is the total number of historical evaluation scores, and $\vec{S}^{(0)}$ is the initial belief vector when $n = 0$. Since no prior information is available, all elements of $\vec{S}^{(0)}$ have equal probability, which makes $\vec{S}^{(0)} = (\frac{1}{l}, \frac{1}{l}, \cdots, \frac{1}{l})$. The parameter $c_0 > 0$ is a weight on the initial beliefs. In the $i$th sensing task of $v_k$ ($i \in [1, n]$), $\vec{S}^{(i)}$ denotes the satisfaction level of its evaluation score; it contains one element set to 1, corresponding to the selected satisfaction level, and all the other $l - 1$ elements set to 0. $t_i$ stands for the timestamp at which the $i$th task took place, and $t$ is the moment of running the algorithm. The forgetting factor is $\eta \in [0, 1]$; a smaller $\eta$ means that the system forgets historical records more easily, and vice versa. In order to defend against the on-off attack [17], we choose an adaptive value $\eta = c_1 \cdot (1 - ts_k)$, where $c_1$ is a parameter that controls the forgetting factor: a larger $c_1$ yields a larger $\eta$, so the system retains historical behaviors longer, and vice versa. From this equation we can see that when $v_k$ has a high trust score, its forgetting factor is small, which means that its good performances are easily forgotten. On the contrary, once $v_k$ provides low-accuracy sensed data, its trust score drops and the forgetting factor becomes larger. This means that those poor sensing performances are remembered, and it takes longer for $v_k$ to build up a high trust score again.
To calculate $v_k$'s trust score as a sensing vehicle, we first assign a weight $\omega_i$ to each satisfaction level $\theta_i$ ($i \in [1, l]$). Let $p_i$ denote the probability that $v_k$'s evaluation score falls into the satisfaction level $\theta_i$, so that $\vec{p} = (p_1, p_2, \cdots, p_l)$ with $\sum_{i=1}^{l} p_i = 1$. We model $\vec{p}$ using Equations (11)-(13). Let $Y$ be the random variable denoting the weighted average of the probabilities in $\vec{p}$; the trust score $ts_k$ of $v_k$ is then represented as:
$$ts_k = E[Y] = \sum_{i=1}^{l} \omega_i E[p_i] = \frac{1}{\gamma_0} \sum_{i=1}^{l} \omega_i \gamma_i,$$
where $\gamma_i$ is the accumulated evidence that $v_k$'s evaluation score is at the satisfaction level $\theta_i$.
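The trust computation can be sketched in a few lines of Python. The sketch makes two assumptions that should be read as illustrative rather than as the scheme's exact equations: each historical record contributes a weight of $\eta^{(t - t_i)}$ to its satisfaction level, and the initial belief contributes $c_0 \cdot \vec{S}^{(0)}$ without decay. The trust score is then the weighted expectation $\frac{1}{\gamma_0}\sum_i \omega_i \gamma_i$. The function names, the three satisfaction levels, and their weights are hypothetical.

```python
def accumulate_evidence(history, levels, eta, c0, now):
    """Build the evidence vector gamma.

    history: list of (level_index, timestamp) pairs, one per finished task.
    Assumption of this sketch: a record of age (now - t_i) is weighted by
    eta ** (now - t_i), and the uniform prior is weighted by c0.
    """
    gamma = [c0 / levels] * levels            # c0 * S(0), uniform initial belief
    for level, t_i in history:
        gamma[level] += eta ** (now - t_i)    # older records weigh less
    return gamma


def trust_score(gamma, weights):
    """Weighted expectation (1 / gamma_0) * sum_i w_i * gamma_i."""
    gamma0 = sum(gamma)
    return sum(w * g for w, g in zip(weights, gamma)) / gamma0


def adaptive_eta(c1, ts):
    """Adaptive forgetting factor eta = c1 * (1 - ts)."""
    return c1 * (1 - ts)


weights = [0.0, 0.5, 1.0]   # hypothetical weights for low / medium / high
prior = trust_score(accumulate_evidence([], 3, 0.9, 3.0, now=10), weights)
fresh = trust_score(accumulate_evidence([(2, 10)], 3, 0.9, 3.0, now=10), weights)
stale = trust_score(accumulate_evidence([(2, 0)], 3, 0.9, 3.0, now=10), weights)
print(prior, fresh, stale)
```

With no history the score equals the prior mean of the weights; a recent high-satisfaction record raises it more than an old one, which is the effect the forgetting factor is meant to produce.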

Security Analysis
In this section, we discuss the security and privacy properties of the proposed TripSense scheme. In particular, following the design goals discussed earlier, we examine whether the proposed TripSense scheme achieves the desired security and privacy requirements.

The Proposed TripSense Scheme Is Privacy-Preserving for PM Vehicles
• PM vehicle's identity privacy and location privacy are preserved in the proposed TripSense scheme: In our proposed TripSense scheme, each PM vehicle $v_j \in V_0$ uses a pseudo-ID $PID_{jk}$ instead of its real identity in the network. Hence, identity privacy is achieved. In addition, to preserve the location privacy of the PM vehicle, $v_j$ changes its unlinkable pseudo-IDs across different trips and locations so that its past and future trip and location information cannot be linked through pseudo-IDs.
However, as analyzed in Section 2.2, $v_j$ still suffers from the trust score link attack. Thus, in our proposed scheme, when a PH vehicle $ph_i$ checks whether its PM vehicle $v_j$'s trust score satisfies the task requirement, it uses discrete trust levels $TL_x$ in place of exact trust scores. In other words, $v_j$ can prove to $ph_i$ that it is a highly trusted vehicle without revealing its exact trust score. In addition, $v_j$'s trust score $ts_j$ is encrypted as $C = g^{ts_j} h^{r}$ and its trust level is embedded in the SP's signature $A_{jk} = g^{\frac{1}{y_x + ts_j + s_{jk} + H(T_j)}}$, which makes it impossible for the other PM vehicles to obtain either $v_j$'s trust score or its trust level.

• The sensed data privacy preservation is achieved: Once the sensed data are collected by a PM vehicle $v_j$, they are encrypted as $D = g^{d_j} \cdot h^{r}$. Throughout local aggregation and global aggregation, the data are aggregated without decryption until they reach the SP, which is able to recover them with its private key $p$. Therefore, unless the other vehicles know the private key $p$, the sensed data will never be disclosed.
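The following toy Python sketch illustrates how ciphertexts of the form $g^{d} h^{r}$ can be multiplied together so that only the holder of $p$ recovers the sum. It assumes a composite-order construction in which $g$ has order $n = pq$ and $h$ has order $p$, so that raising a ciphertext to the power $p$ removes the blinding term. The ordinary modular group, the tiny primes, and all function names are assumptions made for illustration; they stand in for the scheme's actual group and are far too small to be secure.

```python
import random


def is_prime(n):
    """Trial-division primality test; adequate for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def keygen(p=101, q=103, seed=1):
    """Build a toy group of order n = p*q inside Z_P^*; return (public, private)."""
    rng = random.Random(seed)
    n = p * q
    k = 2
    while not is_prime(k * n + 1):          # find a prime modulus P = k*n + 1
        k += 2
    P = k * n + 1
    while True:
        g = pow(rng.randrange(2, P - 1), k, P)   # order of g divides n
        if pow(g, p, P) != 1 and pow(g, q, P) != 1:
            break                                 # order of g is exactly n
    h = pow(g, q * rng.randrange(1, p), P)        # element of order p
    return {"P": P, "n": n, "g": g, "h": h}, p


def encrypt(m, pub, rng):
    """Ciphertext g^m * h^r with fresh randomness r."""
    r = rng.randrange(1, pub["n"])
    return (pow(pub["g"], m, pub["P"]) * pow(pub["h"], r, pub["P"])) % pub["P"]


def aggregate(ciphertexts, pub):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    acc = 1
    for c in ciphertexts:
        acc = (acc * c) % pub["P"]
    return acc


def decrypt_sum(c, pub, p, max_sum):
    """Raise to p to strip h^r, then search the small plaintext range.

    max_sum must stay below q = n // p for the result to be unique.
    """
    base = pow(pub["g"], p, pub["P"])
    target = pow(c, p, pub["P"])
    acc = 1
    for m in range(max_sum + 1):
        if acc == target:
            return m
        acc = (acc * base) % pub["P"]
    raise ValueError("sum outside the searched range")


pub, priv = keygen()
rng = random.Random(7)
readings = [21, 19, 23]                      # hypothetical sensed values
cts = [encrypt(d, pub, rng) for d in readings]
print(decrypt_sum(aggregate(cts, pub), pub, priv, max_sum=100))  # 63
```

The intermediate aggregator only multiplies ciphertexts and never needs $p$, which mirrors the statement that data are aggregated without decryption until they reach the SP.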

The Proposed TripSense Scheme Achieves Robustness Against Attacks Launched by Adversaries
• Resilience to malicious sensing attack: According to our proposed scheme, selfish or malicious vehicles that submit arbitrary sensed data will get low evaluation scores in the trust system. Those low evaluation scores accumulate and eventually lead to low trust scores if the vehicles keep behaving in that way. When their trust scores fall below the threshold $TL_{TH}$, their sensed data are either excluded from data aggregation or given a low weight in it. Either way, the attack is mitigated in our proposed scheme.

• Resilience to Trust Score Spoofing Attack: We assume that the majority of PM vehicles follow the scheme honestly, but we do not rule out the possibility that a small fraction of PM vehicles cheat PH vehicles by using fake trust scores. There are two possible cases. In the first case, the PM vehicle $v_j$ spoofs a higher trust score $ts_j$ in the hope of participating in a sensing task. However, when $v_j$ is using pseudo-ID $PID_{jk}$, $v_j$'s trust score $ts_j$ is signed by the SP as $A_{jk} = g^{\frac{1}{y_x + ts_j + s_{jk} + H(T_j)}}$, where $y_x$, $s_{jk}$ and $H(T_j)$ indicate the trust level, private key and updating timestamp, respectively. Therefore, without knowing the master key $y_x$ of the spoofed trust score $ts_j$, $v_j$ is unable to launch the attack. In the second case, $v_j$ provides a fake trust score $ts_j$ after encryption as $C = g^{ts_j} h^{r}$, $E = B^{ts_j} g^{v}$. To deal with this type of attack, the PH vehicle needs to check [...]

• Resilience to Impersonation Attack: Both PH vehicles and PM vehicles could be impersonated by unqualified vehicles that want to take part in the sensing tasks. Specifically, an impersonated PM vehicle may submit false data and escape punishment, while an impersonated PH vehicle may collect sensed data without submitting them to the CS, rendering the sensed data results incomplete. However, this attack can be thwarted by our proposed scheme. In the initialization phase in Section 4.1, both PH and PM vehicles are given a pair of private and public keys once they are registered. In Section 4. [...] $= e(S_{jk}, H(T))$ before accepting the sensed data from $v_j$. As a result, our proposed TripSense scheme is resistant to impersonation attacks.
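The two mitigation options described for the malicious sensing attack, excluding low-trust reports or down-weighting them, can be sketched in a few lines of Python. For simplicity the sketch works on plaintext values and raw trust scores, whereas the scheme itself operates on encrypted data and discrete trust levels; the function name, the threshold value, and the sample reports are hypothetical.

```python
def trust_weighted_aggregate(reports, threshold=None):
    """Aggregate sensed values, weighting each by the reporter's trust score.

    reports:   list of (value, trust_score) pairs
    threshold: if given, reports with trust_score below it are excluded
    """
    kept = [(v, ts) for v, ts in reports if threshold is None or ts >= threshold]
    total_weight = sum(ts for _, ts in kept)
    if total_weight == 0:
        raise ValueError("no report with positive trust weight")
    return sum(v * ts for v, ts in kept) / total_weight


# Two honest temperature reports and one arbitrary report from a low-trust vehicle.
reports = [(20.0, 0.9), (22.0, 0.8), (80.0, 0.1)]
plain_mean = sum(v for v, _ in reports) / len(reports)
print(plain_mean)
print(trust_weighted_aggregate(reports))                 # down-weighting only
print(trust_weighted_aggregate(reports, threshold=0.5))  # exclusion below threshold
```

Down-weighting alone already pulls the result toward the honest readings, and applying the threshold removes the outlier's influence entirely.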

Performance Evaluation
In this section, we evaluate the performance of our proposed TripSense scheme. The numerical data are generated in MATLAB. The performance metrics used in the evaluation are: (i) the variation of trust scores for different PM vehicles as the number of tasks increases; and (ii) the detection ratio, defined as the ratio of the number of detected malicious vehicles to the total number of malicious vehicles, as the number of tasks increases.

Simulation Settings
We design a simulation to evaluate our proposed TripSense scheme in which only a set of key factors is considered and specified in order to validate the PM vehicles' sensing accuracy. It is worth noting that the selected factors are related to the movement of vehicles and to packet collision problems. We simulate the proposed scheme in MATLAB with a total of $n$ registered PM vehicles. To ensure fairness, each PM vehicle provides one sensing report in each of $m$ independent tasks, that is, $m$ reports in total. The detailed simulation parameter settings are given in Table 1.

Table 1. Simulation parameter settings.
forgetting factor parameter: 1
$T_0$, initial trust score: 0.5

Modeling the Sensing Behaviors of PM Vehicles
Due to the lack of real data, we need to model the behaviors of the PM vehicles that take part in the sensing tasks, especially those of malicious vehicles, in order to test the performance of our proposed scheme.
Sensing accuracy level (SAL) of PM vehicles: We define the sensing accuracy level (SAL) $l_v \in [0, 1]$ to describe the capability of a PM vehicle to provide high-accuracy sensed data. A PM vehicle with a higher $l_v$ tends to submit more accurate sensing reports. Specifically, given a PM vehicle with SAL $l_v$, we use the beta distribution to describe the performance quality variable $X$ of that PM vehicle. The probability density function of the beta distribution can be expressed as:

$$f(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1},$$

where $f(x \mid \alpha, \beta)$ is the probability density with which a PM vehicle with SAL $l_v$ provides a service with quality value $x \in [0, 1]$. Higher values of $l_v$ imply that the PM vehicle provides a higher-quality service. To achieve this goal, we define $\alpha$ and $\beta$ as follows: [...] where $c_2$ is the parameter that controls the variance of the distribution. When $c_2$ is given a larger value, the performance quality values have a larger variance, and vice versa. For a PM vehicle with SAL $l_v$, the above model generates a service quality score that follows a beta distribution with expectation $E(X) = l_v$. We define malicious vehicles as vehicles with SAL $l_v \le 0.2$.
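A minimal Python sketch of such a behavior model is given below. Because the exact definitions of $\alpha$ and $\beta$ are not reproduced here, the sketch uses a hypothetical parameterization, $\alpha = l_v / c_2$ and $\beta = (1 - l_v) / c_2$, chosen only because it satisfies the two stated properties: the expectation equals $l_v$ and the variance grows with $c_2$. It should not be read as the parameterization used in the simulations.

```python
import random


def beta_params(l_v, c2):
    """Hypothetical parameterization with mean l_v and variance increasing in c2."""
    if not 0.0 < l_v < 1.0:
        raise ValueError("l_v must lie strictly between 0 and 1")
    if c2 <= 0.0:
        raise ValueError("c2 must be positive")
    return l_v / c2, (1.0 - l_v) / c2


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))


def sample_quality(l_v, c2, rng):
    """Draw one performance quality value x in [0, 1] for a vehicle with SAL l_v."""
    alpha, beta = beta_params(l_v, c2)
    return rng.betavariate(alpha, beta)


rng = random.Random(0)
samples = [sample_quality(0.7, 0.1, rng) for _ in range(20000)]
print(sum(samples) / len(samples))   # close to the SAL of 0.7
```

Under this assumed parameterization the variance equals $l_v (1 - l_v)\, c_2 / (1 + c_2)$, so a single parameter controls how noisy a vehicle's reports are around its SAL.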

Correctness
In this experiment, we compare the trust scores of malicious and honest sensing PM vehicles with different sensing accuracy levels (SALs). For a better comparison, we choose two honest PM vehicles with SALs of $l_v = 0.7$ and $l_v = 0.95$, respectively. In addition, malicious PM vehicles that provide corrupt sensed data are also put into the system. After 50 tasks, we plot their trust scores in Figure 4. We notice that the trust scores of all PM vehicles converge after 30 tasks. The honest PM vehicles with $l_v = 0.7$ and $l_v = 0.95$ clearly obtain the highest trust scores after the experiments. On the contrary, both attackers obtain low trust scores. We also notice that a PM vehicle with a larger SAL achieves a higher trust score, which shows that our trust model correctly distinguishes PM vehicles according to their actual SALs.

Effectiveness
To demonstrate the effectiveness of our proposed scheme in detecting malicious PM vehicles, we label the proportion $\rho = 20\%$ of PM vehicles with the lowest $l_v$ as "malicious PM vehicles". After the $m = 50$ tasks, all PM vehicles are re-ranked, and the detection ratio is defined as the fraction of "malicious PM vehicles" that remain in the lowest 20% of the new ranking list. Figure 5 compares the detection ratio of our proposed trust-based sensing system with that of a sensing system without trust. From the figure, we can see that the detection ratio of our proposed system increases quickly as the number of tasks increases and, after around 5 tasks, converges to 92%. On the contrary, for a sensing system without a trust system, the selection of sensing PM vehicles is random and the detection ratio remains as low as 20%. Therefore, the effectiveness of our proposed scheme has been demonstrated.
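The detection ratio defined above can be computed with a short Python sketch. It assumes that the re-ranking is done by trust score and that ties are broken by vehicle index; the function name and the sample values are hypothetical and do not reproduce the reported simulation results.

```python
def detection_ratio(true_sal, trust_scores, rho=0.2):
    """Fraction of the lowest-SAL vehicles that also rank lowest by trust score.

    true_sal:     ground-truth sensing accuracy level of each vehicle
    trust_scores: trust score of each vehicle after the tasks
    rho:          proportion of vehicles labeled as malicious
    """
    n = len(true_sal)
    k = max(1, int(round(rho * n)))
    malicious = set(sorted(range(n), key=lambda i: true_sal[i])[:k])
    flagged = set(sorted(range(n), key=lambda i: trust_scores[i])[:k])
    return len(malicious & flagged) / k


# Ten hypothetical vehicles; vehicles 0 and 1 have the lowest SAL.
sal = [0.10, 0.15, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.40, 0.30]
scores = [0.12, 0.48, 0.45, 0.60, 0.70, 0.80, 0.88, 0.90, 0.50, 0.40]
print(detection_ratio(sal, scores))   # 0.5
```

If the ranking were random, each malicious vehicle would land in the bottom group with probability $\rho$, so the expected detection ratio would equal $\rho$, which is consistent with the 20% baseline of the system without trust.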

Related Work
For trust and reputation management, Zhang et al. survey effective trust management in VANET [18]. Specifically, the survey discusses challenging issues for trust management caused by the important characteristics of VANET environments, and points out that robustness should receive particular attention. Patwardhan et al. present a distributed reputation management scheme for VANET, which enables vehicles to quickly adapt to changing local conditions and provides a bootstrapping method for establishing trust relationships [19]. However, their scheme is not very scalable or robust. Different from the traditional entity-based trust model, Raya et al. suggest a data-oriented trust establishment framework [20]. By combining the trust values of individual pieces of data, their framework deals well with ephemerality and functions well in sparse areas. However, in dense urban areas, their framework is less efficient because of the large amounts of data.
There has also been extensive work on data aggregation schemes in VANET [26,27]. These works share the assumption that vehicles or servers are trusted and that communications are secure, which, however, is not the case in real scenarios. In reality, data can be eavesdropped on and disclosed. Therefore, a lot of work has been done on privacy-preserving data aggregation [22-24]. Xing et al. have proposed M-PERM, a mutual privacy-preserving regression modeling approach that keeps both participants and user data private while still utilizing them for analysis [22]. In their approach, data are aggregated at each node and each cluster, and finally at the user, with maximum privacy protection. He et al. present two privacy-preserving data aggregation schemes for additive aggregation functions, which bridge the gap between collaborative data collection and data privacy [23]. Bilogrevic et al. have proposed a privacy preservation framework that preserves data utility and simultaneously provides user privacy [24]. Users in this framework contribute only encrypted and aggregated models of their files to the aggregator in order to tackle trust and incentive challenges.
Burke is the first to introduce the concept of participatory sensing, and describes an initial architecture to enhance data credibility, quality, privacy, and 'shareability' [25]. Ganti gives an overview of crowdsensing by introducing existing mobile crowdsensing applications and explaining their unique characteristics, illustrating various research challenges, and discussing possible solutions [5].
Combining the above privacy-preserving data aggregation techniques and trust models, our proposed TripSense scheme focuses on evaluating the platoon member vehicles' sensing ability based on the accuracy of their historical sensed data. Specifically, several aspects make our proposed scheme different. First, we establish a trust system as a long-term evaluation metric. Second, we make use of platoon head vehicles for authentication of local data aggregation, which greatly reduces the communication overhead between vehicles and infrastructure and hence makes the scheme well suited to VANET. Third, our proposed scheme preserves the privacy of platoon member vehicles' identities, locations and data.

Conclusions
In this paper, we have proposed a trust-based privacy-preserving scheme for vehicular platoon crowdsensing called TripSense. The proposed scheme mainly focuses on establishing a trust model to improve the sensed data reliability and accuracy of the whole system, while preserving the location and data privacy of sensing vehicles during sensing vehicle selection, sensed data aggregation and evaluation. Detailed security analysis shows that the proposed TripSense scheme not only achieves vehicles' identity privacy, location privacy and data privacy, but is also resistant to attacks launched by adversaries, such as malicious sensing reports. Moreover, through extensive performance evaluation, we have demonstrated that our proposed scheme can achieve better sensing accuracy. In our future work, we will consider more crowdsensing scenarios beyond data aggregation. In addition, we may also consider collusion among PM and PH vehicles that launch attacks in order to victimize other vehicles.