Secure and Privacy-Preserving Body Sensor Data Collection and Query Scheme

With the development of body sensor networks and the pervasiveness of smart phones, different types of personal data can be collected in real time by body sensors, and the potential value of massive personal data has attracted considerable interest recently. However, the privacy issues of sensitive personal data are still challenging today. Aiming at these challenges, in this paper, we focus on the threats from telemetry interface and present a secure and privacy-preserving body sensor data collection and query scheme, named SPCQ, for outsourced computing. In the proposed SPCQ scheme, users’ personal information is collected by body sensors in different types and converted into multi-dimension data, and each dimension is converted into the form of a number and uploaded to the cloud server, which provides a secure, efficient and accurate data query service, while the privacy of sensitive personal information and users’ query data is guaranteed. Specifically, based on an improved homomorphic encryption technology over composite order group, we propose a special weighted Euclidean distance contrast algorithm (WEDC) for multi-dimension vectors over encrypted data. With the SPCQ scheme, the confidentiality of sensitive personal data, the privacy of data users’ queries and accurate query service can be achieved in the cloud server. Detailed analysis shows that SPCQ can resist various security threats from telemetry interface. In addition, we also implement SPCQ on an embedded device, smart phone and laptop with a real medical database, and extensive simulation results demonstrate that our proposed SPCQ scheme is highly efficient in terms of computation and communication costs.


Introduction
In recent years, with the popularization of wearable sensors and telemedicine, body sensor networks (BSN), which comprise multiple sensor nodes and a coordinator worn on a human body, can collect the personal information of the human body (such as heart rate, blood glucose and electrocardiogram) by sensor nodes [1][2][3][4][5][6]. The collected information first is delivered to the coordinator, then is forwarded to a remote server through a network interface for further processing [7,8]. As shown in Figure 1, vast quantities of the sensor users' personal data are collected by body sensors and recorded by a data center per second. Since large-scale aggregate analysis of personal data can yield valuable results and insights, which can address public health challenges and provide new avenues for scientific discovery [9], data center trends toward providing on-demand data query service for users. However, this requires huge storage space and enormous computing resources, which are tremendous burdens on data centers. As the survey [10] shows that roughly 55 percent of respondents plan to use cloud services for analysis queries, cloud computing is a promising way to integrate personal data resources and to provide a uniform query service to researchers [11][12][13][14][15]. Since personal data are regarded as sensitive and private assets of sensor users, how to provide accurate data query services without revealing confidential personal data has attracted considerable interest recently. To address these security and privacy issues, differential privacy [16], homomorphic encryption [17] and searchable encryption [18] are widely used. However, differential privacy cannot provide accurate query results for users; traditional homomorphic encryption schemes are time consuming and resource consuming; searchable encryption cannot provide query services over encryption data. Therefore, the above methods are not suitable for multi-dimension personal data query services.
Different from the methods discussed above, in this paper, we focus on the threats from telemetry interface [19,20] and propose a new secure and privacy-preserving body sensor data collection and query scheme, called SPCQ, for outsourced computing. In the proposed scheme, users' personal information is collected by body sensors of different types and converted into multi-dimension data, and each dimension is converted into the form of a number and uploaded to the cloud server in ciphertext. After that, the cloud server provides query services to data users by the privacy-preserving weighted Euclidean distance contrast (WEDC) algorithm for multi-dimension vectors. Meanwhile, SPCQ can protect the confidentiality of sensor users' personal data and the privacy of data users' query, with low overheads in computation and communication.
The remainder of this paper is organized as follows. In Section 2, we review the related work. In Section 3, we define the system model and security model and identify our design goal. Additionally, in Section 4, we recall the bilinear pairing of the composite order, Euclidean distance and the 2DNF cryptosystem as the preliminaries. Then, we present our SPCQ scheme in Section 5, followed by the security analysis and performance evaluation in Sections 6 and 7, respectively. Finally, we draw our conclusions in Section 8.

Related Work
In recent years, how to achieve operations over encrypted data has attracted considerable interest, and most of the proposed schemes are based on differential privacy, homomorphic encryption and searchable encryption.
The differential privacy notion was first formulated by Dwork [16], which can provide information about the database while simultaneously ensuring very high levels of privacy. Barthe et al. [21] presented CertiPriv, a machine-checked framework for reasoning about differential privacy built on top of the Coq proof assistant. The scheme provided a framework for fine-grained reasoning about an expressive class of confidentiality policies. Additionally, Tschantz et al. [22] presented the first results towards automated verification of source code for differentially-private interactive systems, which developed a formal probabilistic automaton model of differential privacy for systems by adapting prior work on differential privacy for functions. To achieve automated verification of distributed differential privacy, Eigner et al. [23] presented the framework by comprising a symbolic definition of differential privacy for distributed databases that takes into account Dolev-Yao intruders, and the scheme can overhear, intercept and synthesize the cryptographic messages exchanged on the network. However, the above differential privacy scheme cannot provide accurate query services because of added randomized noise.
Homomorphic encryption is a usual method to achieve data operations over encrypted data without decrypting it. Rivest et al. [17] first introduced homomorphism and presented four solutions to achieve homomorphic encryption. Then, Goldwasser et al. [24] proposed the first semantically-secure homomorphic encryption scheme, and many other additively homomorphic encryption schemes with proofs of semantic security [25][26][27] were presented. To achieve both additive and multiplicative homomorphisms, Gentry [28] designed a full homomorphic encryption scheme based on the mathematical object ideal lattices and uses the bootstrapping technique. It is semantically secure, and the security of the scheme is based on the split-key distinguishing problem. Then, other different full homomorphic encryption schemes were based on the elementary theory of algebraic number fields [29] and non-circuit [30]. However, most of these existing homomorphic encryption schemes have high time complexities, which is not suitable for practical use.
Keyword searchable encryption schemes usually build an encrypted searchable index, such that its content is hidden from the server unless it is given appropriate trapdoors generated via secret keys [31]. Song et al. [18] firstly studied searchable encryption in the symmetric key setting, and Boneh et al. [32] presented the first searchable encryption construction, where anyone with a public key can write to the data stored on the server, but only authorized users with a private key can search. However, public key solutions are usually very computationally expensive, and the keyword privacy could not be protected in the public key setting. To solve the multi-keyword ranked search over encrypted data problem, Cao et al. [33] proposed a basic idea of MRSE using secure inner product computation, and two improved MRSE schemes were given to achieve various stringent privacy requirements in two different threat models. However, searchable encryption can only provide keyword query rather than accurate computation query, which is not suitable for multi-dimension personal data. Different from the above works, our proposed SPCQ scheme aims at the efficiency, accurate and privacy issues, and based on an improved homomorphic encryption technology over composite order group, we develop an efficient and privacy-preserving body sensor data collection and query scheme for outsourced computing. In particular, the proposed SPCQ can be easily implemented on different terminals, and the processing of the query is just needed in the cloud server. The computational costs in both the terminal and cloud server are acceptable.

Models and Design Goals
In this section, we define the system model, security requirements and identify our design goal.

System Model
In our system model, we mainly focus on how to offer secure personal data collection and efficient query service over confidential personal data in the outsourced cloud server. Specifically, the system consists of four parts: register center (RC), sensor user (SU), data user (DU) and cloud server(CS), as shown in Figure 2. • RC is a trusted third party, which bootstraps the system initialization by generating system parameters and providing a registration system for SU, DU and CS. • SU collects real-time personal data by body sensors, and all SUs' personal data will be uploaded to CS. To guarantee the data confidentiality, SU will perform some encryption operations before uploading data to CS. • DU (e.g., the researcher), who is registered in RC, can send query request to CS for analysis of the accurate personal data items stored in it. To guarantee the privacy of DU's query information, DU will perform some encryption operations during the process of the query. Meanwhile, SUs' personal data should be kept secret from unauthorized users. • CS is composed of many data storage nodes and computing nodes, stores more than a billion encrypted personal data items from SUs and provides accurate query services to DUs over encrypted personal data. CS mainly performs two functions: authentication and computing over encrypted data. The authentication component is used to check the identity of SUs and DUs, while the computing in encryption component is used to search and compute encrypted data items with DUs' encrypted query request.

Security Requirements
The confidentiality of personal data from SUs and the privacy of DU's query information are crucial for the success of a secure and privacy-preserving body sensor data collection and query scheme. In our security model, we consider CS is honest-but-curious. Specifically, CS faithfully executes the operations to search DUs' demanded information over the encrypted personal data from SUs, but it also tries to analyze the query information and encrypted data to obtain users' sensitive information. Therefore, in this paper, we focus on the threats from telemetry interface, and the following security requirements should be satisfied in a secure and privacy-preserving body sensor data collection and query scheme. Note that, in our current model, we do not consider that any two parties collude to disclose the third party's privacy, i.e., the collusion attack on privacy is beyond the scope of this work and will be discussed in future research.

•
Confidentiality. SUs' sensitive personal data should be kept secret from CS, i.e., even if CS stores all personal data from SUs, it cannot identify any data item. In this circumstance, the confidentiality of the personal data can be guaranteed.

•
Privacy. DU's query information should be secrete from CS, i.e., even if CS obtains all DUs' queries and corresponding responses, it cannot identify DUs' query information accurately. Additionally, other users (e.g., SUs and other DUs) cannot get any information of DU. In this circumstance, the privacy-preserving requirements of DU's query information can be guaranteed. In addition, the privacy requirement also includes CS's responses, i.e., only legal DU can decrypt the corresponding response.

•
Authentication. Authenticating an encrypted query that is really sent by a legal DU and has not been altered during the transmission, i.e., if an illegal DU forges a query, this malicious operation should be detected, and only correct queries can be received by CS. The responses from CS should also be authenticated so that DUs can receive authentic and reliable query results. Moreover, the encrypted personal data from SUs can be authenticated by CS.

Design Goals
Under the aforementioned system model and security requirements, our design goal is to develop a secure and privacy-preserving body sensor data collection and query scheme for outsourced computing, which will provide secure personal data collection and storage for SUs and privacy-preserving accurate Euclidean distance query service for DUs. Specifically, the following three objects should be achieved.

•
The Security Requirements Should be Guaranteed. If the personal data collection and query scheme does not consider the security, SUs' data assets and DU's actual query information could be disclosed. Then, the data collection and query service cannot jump in popularity. Therefore, the proposed scheme should achieve the confidentiality, privacy and authentication simultaneously.

•
A Personal Data query Service with High Accuracy Should be Guaranteed. The user experience is one of the most critical aspects of data query service, and it is important that the precision of Euclidean distance query service cannot be lowered when protecting DU's privacy. Therefore, the proposed scheme should also provide highly precise and reliable query service.

•
The Effectiveness in Computation and Communication Should be Achieved for Various Terminal Devices.
The personal data may be collected by different terminal devices, such as smart phone, embedded device, etc. Although the performance of terminal devices is continuously improved today, the battery is still limited. The proposed scheme should also consider the effectiveness in terms of computation and communication to reduce the power consumption of different terminals. Moreover, data users can access the data query service by mobile terminals, in order to lower the energy cost, the efficiency of the query service is very important. Furthermore, although CS is featured with high performance in storage and computation, since thousands of DUs will query the data at the same time, the efficiencies of computation and communication are still challenging.

Preliminaries
In this section, we recall the bilinear pairing technique, Euclidean distance and 2DNF cryptosystem, which serve as the basis of our proposed SPCQ scheme.

Bilinear Pairing of Composite Order
Let G and G t be two multiplicative cyclic groups of the same composite order N = p 1 · p 2 (where p 1 and p 2 are big primes), and g is a generator of G. We suppose e : G × G → G t denotes the bilinear map (also referred to as a paring), which has the following properties.

Euclidean Distance
Euclidean distance is a common definition of distance, it corresponds to the true distance of two points in n-dimensional space. The Euclidean distance between two points P and Q is the length of the line segment connecting them in Euclidean space. In Cartesian coordinates, if P = (p 1 , p 2 , ..., p n ) and Q = (q 1 , q 2 , ..., q n ) are two different n-dimensional vectors, then the Euclidean distance from P to Q or from Q to P is given by the Pythagorean formula: To satisfy the practical circumstance and provide DUs with an accurate query service, we set different weight numbers (w 1 , w 2 , ..., w n ) for each dimension to form a weighted Euclidean distance:

2DNF Cryptosystem
The 2DNF [34] cryptosystem is a public-key system that can achieve the homomorphic properties, which resembles the Paillier [27] and Okamoto-Uchiyama [26] encryption schemes. Specially, the 2DNF cryptosystem consists of three sections: key generation, encryption and decryption. • Key generation Gen(µ). Given a security parameter µ ∈ Z + , two µ−bit prime numbers p 1 and p 2 are first chosen, and N = p 1 · p 2 is computed. Two groups of the same order N are generated, g and u are two generators of G. Then, h = u p 2 is computed as a random generator of G's subgroup with order p 1 . Finally, the public key PK = (n, G, G t , e, g, h) and private key SK = p 1 are generated. • Encryption. We assume the message space consists of integers in the set {0, 1, · · ·, T} with T < p 2 . Then, to encrypt a message m with public key PK, a random number r is selected from {0, 1, · · ·, N − 1}, and the ciphertext C = g m h r ∈ G is computed. • Decryption.
To decrypt ciphertext C with private key SK = p 1 , notice that C p 1 = (g m h r ) p 1 = (g p 1 ) m ; letĝ = g p 1 . To recover the corresponding message m, we need to compute the discrete log of C p 1 =ĝ m , whereĝ = g p 1 . Since 0 ≤ m ≤ T, it only takes expected timeÔ( √ T) using Pollard's lambda method [35] to get the message m.
Note that the decryption time in the system would be the polynomial time in the size of the message space T s . Hence, it is obvious that the cryptosystem is efficiently suitable for short messages.

Proposed SPCQ
In this section, we present a secure and privacy-preserving body sensor data collection and query scheme for outsourced computing, which mainly consists of three phases: system initialization, secure data collection and privacy-preserving query service. For an easier expression, the definition of notations to be used in the proposed SPCQ scheme are shown in Table 1. Table 1. Definition of notations in the proposed secure and privacy-preserving body sensor data collection and query (SPCQ) scheme.

Notation Definition
µ the system security parameter p 1 , p 2 two big prime numbers N = p 1 · p 2 the product of p 1 and p 2 G, G t the bilinear groups with order N e, g ,h the parameters of bilinear groups the feature parameters of a data item W = (w 1 , w 2 , ..., w n ) the weighted number of different dimensions d the weighted Euclidean search range of DU's query the encrypted search index of a data item q 1 , q 1 , ..., q n , q n DU's encrypted query parameters

System Initialization
We consider RC as the trusted third party, which bootstraps the system. In the system initialization phase, RC first selects a security parameter µ, generates system parameters (G, G t , p 1 , p 2 , e, g, h, N = p 1 · p 2 ) by executing Gen(µ) and calculates two secret bases B 1 = g p 1 and B 2 = e(g, g) p 1 . Next, RC decides a multi-dimension weight vector W = (w 1 , w 2 , ..., w n ) that each number denotes the weight value of the corresponding dimension. Then, RC picks a random number r RC ∈ Z * N as its private key SK RC and computes the corresponding public key PK RC = g SK RC . In addition, RC determines an asymmetric cryptographic algorithm E(), i.e., ECC, and a secure cryptographic hash function H(), where H : {0, 1} * → Z * N and Z * N is a nonzero group of integer modulo N. Finally, RC publishes the system parameters as N, G, G T , e, g, h, PK RC , E(), H() and keeps p 1 , SK RC secretly.
When an SU or DU registers itself to RC, it picks a random number r ∈ Z * N as the private key SK and computes and submits the corresponding public key PK = g SK to RC for the signature. Then, RC sends B 1 , B 2 , W to the registered SU and DU through a secure channel. Similarly, when CS registers itself, it generates the private and public key pair as SK CS ∈ Z * N , PK CS = g SK CS , and submits PK CS to RC for the signature. After that, RC calculates HP j = H(B 2 j 2 ), where 0 ≤ j ≤ η and η is a big integer whose length is much less than 256 bits, and structures the set of data values HPS = HP 0 , HP 1 , ..., HP η . Then, RC ranks the dataset from the smallest to the largest and sends the ordered dataset HPS to CS. It is noteworthy that B 1 , B 2 is not given to CS. After providing the registration function for SU, DU and CS, RC goes offline or suffers slowdowns against the single point of attack, since it has many secret parameters.

Secure Data Collection
SUs collect their real-time personal data through body sensors, and the data can be described by n-dimensional vectors (x i1 , x i2 , ..., x in ). Before uploading to CS, each data item in SU should be processed as follows.
, where B 1 is only known by registered SUs and DUs; this operation can resist the exhaustive attack. • SU chooses n random numbers r 1 , r 2 , ..., r n ∈ Z * N and computes the encrypted search index .., f in , f in ), which can be implicitly formed as follows.
SU makes a signature Sig = H(F i ID TS 1 ) SK using the private key SK, where TS 1 is the current timestamp to resist potential replay attack, and ID is the identify number of SU. Then, SU sends the signed data item F i ID TS 1 Sig to CS.

•
After receiving the signed data item from SU, CS first checks the timestamp TS 1 and verifies the signature Sig by computing whether e(g, Sig) = e(PK, H(F i ID TS 1 )). If it does hold, the signature is accepted, since e(g, Sig) = e(g, H(F i ID TS 1 )) SK =e(PK, H(F i ID TS 1 )). Then, CS stores the data item F i .

User Query Generation
Registered DU U j is able to send a query request to CS without revealing his or her query information by the following steps.
• U j first decides a data item with n feature parameters {y 1 , y 2 , ..., y n } that he or she is willing to query and computes y 1 = y 1 + H(B 1 ), y 2 = y 2 + H(B 1 ),..., y n = y n + H(B 1 ) to increase the sample space. • U j determines the weighted Euclidean distance search range d from the data item that he or she wants to query and computes encrypted query (q 1 , q 1 , q 2 , q 2 , ..., q n , q n ) as follows. • U j uses the public key of CS PK CS to compute Q = E PK CS (q 1 ||q 1 ||q 2 ||q 2 ||...||q n ||q n ).
• U j makes a signature Sig j = (H(Q U j TS 2 )) SK U j using his or her private key SK U j , where TS 2 is the current timestamp to resist potential replay attack. Then, U j sends the encrypted data query request Q U j TS 2 Sig j to CS.

Search and Response
After receiving encrypted data query request Q U j TS 2 Sig j from U j , CS executes the following procedures to provide personal data query service.
• CS first checks the timestamp TS 2 and verifies the signature Sig j by computing whether e(g, Sig j ) = e(PK DU j , H(Q U j TS 2 )). If it does hold, the signature is accepted, since e(g, Sig j ) = e(g, H(Q U j TS 2 )) SK U j = e(PK U j , H(Q U j TS 2 )).
• CS uses its secret key SK CS to decrypt Q and obtain q 1 , q 1 , q 2 , q 2 , ..., q n , q n . Then, CS executes the proposed WEDC algorithm as follows.
-For each data item F i stored in it, CS computes the search criteria D i as follows.
·...· B w n y 2 n 2 = e(g x i1 h r 1 , g p 1 ·2w 1 ·y 1 ) ·...· e(g x in h r n , g p 1 ·2w n ·y n ) e(g, g) p 1 2w 1 ·x i1 y 1 · ... · e(g, g) p 1 ·2w n ·x in y n B w 1 x i1 CS computes HD i = H(D i ) and searches HD i within the evaluation dataset HPS by binary search algorithm to confirm whether HD i belongs to it. If HD i belongs to HPS, it means that data item F i satisfies DU's query condition; add one to M num , where M num denotes the number of data items that meets the query condition; otherwise, data item F i does not meet DU's query condition.

-
After traversing through all data items, CS gets the number of data items that satisfy DU's query condition M num and the number of all data items N num , which can help DU to achieve the statistical query of the personal data. Then, CS encrypts N num and M num with the asymmetric encryption algorithm E() and the public key of U j PK U j and uses its private key to make a signature Sig CS = H(E PK U j (N num M num ) TS 3 ) SK CS .
-Finally, CS sends E PK U j (N num M num ) TS 3 Sig CS to U j .
Correctness of WEDC Algorithm. Here, we prove that CS can provide the correct statistical query service for DUs by executing the WEDC algorithm. Specifically, taking a look at the exponential of search criteria D i = B 2 d 2 −(w 1 (x i1 −y 1 ) 2 +...+w n (x in −y n ) 2 ) , we know B 2 is the generator of a cyclic group with order p 2 , which is selected larger than 512 bits, and w 1 (x i1 − y 1 ) 2 + ... + w n (x in − y n ) 2 is the square of the weighted Euclidean distance between DU's query data and data item F i . In addition, since search range d is usually less than 10,000, we can define η= 10, 000 and η 2 = 100, 000, 000. Therefore, if F i meets DU's query condition, i.e., 0 ≤ d 2 − (w 1 (x i1 − y 1 ) 2 + ... + w n (x in − y n ) 2 ) ≤ d 2 ≤ 100, 000, 000, the corresponding H(D i ) must be in HPS, and F i will be counted as eligible data item; otherwise, 100, 000, 000; meanwhile, F i will not be counted as an eligible data item. Through the method we have stated, CS can correctly judge whether F i satisfies DUs' query condition by utilizing WEDC algorithm.

Query Result Reading
After receiving E PK U j (N num M num ) TS 3 Sig CS from CS, U j checks TS 3 and the signature Sig CS by verifying whether e(g, Sig CS ) = e(PK CS , H(E PK U j (N num M num ) TS 3 ) SK CS ) holds. Then, U j decrypts E PK U j (N num M num ) with SK U j to obtain the query result.

Security Analysis
In this section, we analyze the security properties of the proposed SPCQ scheme. Specifically, following the security requirements discussed earlier, our analysis will focus on how the proposed SPCQ scheme can achieve personal data confidentiality, DU's query information privacy and source authentication of the personal data, query request and response.

•
The Proposed SPCQ Can Achieve Confidential Personal Data. In our proposed SPCQ, personal data are secret from CS and DUs, although CS stores all encrypted data items and receives all query requests. First, since feature parameters (x i1 , x i2 , ..., x in ) collected by body sensors usually cover a smaller scope, to avoid the exhaustive attack against (x i1 , x i2 , ..., x in ) by Pollard's lambda method, feature parameters are disturbed by calculating . In this way, the sample space is increased to more than 512 bits, which can prevent exhaustive attack efficiently. Before uploading to CS, feature parameters are encrypted to corresponding search index where r 1 is a random number to guarantee that for the same feature parameter, different SUs can obtain different search indexes. The above operations can achieve data perturbation and substitution and prevent CS from directly accessing SUs' personal data. Moreover, to avoid the guessing attacks for B 2 in the evaluation dataset HPS, the relationship between B 2 and HP j is hidden by a secure hash function H(), where HP j = H(B j 2 2 ). Therefore, from the above three aspects, CS cannot obtain the feature parameters of personal data according to uploaded data items. In addition, since DUs only can get the query statistic result from CS, SUs' personal data are secret from DUs. • DU's Query Information is Privacy-Preserving in the Proposed SPCQ. In our proposed SPCQ, similarly, DU's query condition is encrypted before being sent to CS. Specifically, DU's query information y 1 , y 2 , ..., y n is disturbed by calculating y 1 = y 1 + H(B 1 ), y 2 = y 2 + H(B 1 ), ..., y n = y n + H(B 1 ), which can resist the exhaustive attack by Pollard's lambda method. Then, the query condition is encrypted by calculating q 1 = B w 1 ·y 2 1 −d 2 2 , q 1 = B 1 2w 1 ·y 1 , q 2 = B w 2 ·y 2 2 2 , q 2 = B 1 2w 2 ·y 2 , ..., q n = B w n ·y 2 n 2 , q n = B 1 2w n ·y n , which can prevent CS from directly accessing the query data item and search range d. Since B 1 and B 2 are only known by SUs and DUs, and the collusion attack is not considered in the current security model, CS cannot obtain query information from the query request during the query process. Specifically, encrypted request and encrypted data are computed in CS to obtain the result, which will be sent back to DU, and CS also cannot obtain any useful information of DUs' queries, even in the continuous search queries environment. Meanwhile, CS still can provide accurate query service to DUs by the proposed WEDC algorithm. Concretely, CS traverses all stored data items to compute the search criteria D i and find out all data items that satisfy the query condition, then it achieves the query statistic result and sends it to DU. It is notable that the result does not have particular meaning without any other useful information of DUs' queries. Moreover, SUs are not involved in the query process, and DU's query request is encrypted by CS's public key PK CS before being sent to CS, so SUs cannot get DU's query information even if they steal the request by eavesdropping. In addition, the response is encrypted by DU's public key before being sent by CS, and thus, SUs and other registered DUs cannot decrypt the response. Therefore, from the above four aspects, DU's query information is privacy-preserving in the proposed SPCQ.

•
The Authentication of the Personal Data, Query Request and Response are Achieved in the Proposed SPCQ. In the proposed SPCQ, personal data from SUs, registered DU's query request and the response of CS are signed by the BLS [36] short signature. Since the BLS short signature is provably secure under the CDH problem in the random oracle model, the source authentication can be guaranteed. Specifically, personal data from SU is signed by computing Sig = H(F i ID TS 1 ) SK , where TS 1 is the current timestamp to resist potential replay attack and SK is SU's private key to make sure only itself can make the signature. After receiving the signed data item, CS computes whether e(g, Sig) = e(PK, H(F i ID TS 1 )) to verify the source of the signature. Similarly, the registered DU's query request and the response of CS are signed by the above operations. Moreover, since the unregistered user (such as SU and DU) does not have secret keys B 1 and B 2 , he or she cannot upload personal data item or submit valid query request to CS. Therefore, personal data and the query request from the unregistered user and the response from the mendacious CS can be detected in the proposed SPCQ.
From the above analyses, we can conclude that SPCQ is secure and privacy-preserving and can achieve our security design goal.

Performance Evaluation
In this section, we evaluate the performance of our proposed SPCQ scheme in terms of the computation complexity of SU, DU and CS. In order to measure the integrated performance of SPCQ in a real environment, we also implement SPCQ on an embedded device, smart phone and laptop with a real medical database in a wireless network, by using a custom simulator built in JAVA. Specifically, an embedded device with a 650-MHz dual-core processor, a smart phone with a 1.4-GHz quad-core processor, 2 GB RAM, Android 4.0, and a laptop with a 2.0-GHz 4-core processor, 8 GB RAM, are chosen to simulate SU, DU and CS. Based on our proposed SPCQ scheme, a personal data gathering application is installed on the embedded device to simulate SU; a personal data query application built by JAVA, named SPCQ.apk, is installed on the smart phone to simulate DU; and simulators of CS are deployed in a laptop. In order to evaluate SPCQ in a real environment, a diabetes database [37], which has 100,000 items with 55 attributes, is selected as the data source, and the corresponding storage space is 692.37 MB. Meanwhile, an evaluation dataset HPS with 10,000 preprocessed SHA-256 values is constructed, which just needs a 312.5-KB storage space. In addition, we define p 1 and p 2 as 512-bit prime numbers, and η 2 as 100,000,000.

Computation and Communication Costs
The proposed SPCQ scheme can achieve effective personal data query service for CS and DUs. Specifically, we assume the dimension of each personal data item is n, and SU needs n multiplication operations and 3n exponentiation operations for each personal data item. When a DU U j generates an encrypted query (q 1 , q 1 , q 2 , q 2 , ..., q n , q n ), it requires 2n exponentiation operations in Z p 2 . After receiving the query from U j , CS firstly computes the search criteria D i for each data item F i stored in it, which takes n * N pairing operations and 2n * N multiplication operations for checking N resource items. After receiving the response from CS, U j decrypts the query statistic result with asymmetrical encryption algorithm, which is considered negligible compared to exponentiation and pairing operations. Denote the computational costs of an exponentiation operation in Z p 2 /Z N 2 , a multiplication operation in G/G t /Z N 2 and a pairing operation by C e , C m and C p , respectively. Then, for SU, DU and CS, the computational costs are 3n * C e + n * C m , 2n * C e and nN * C p + 2nN * C m in the proposed SPCQ.
Different from other time-consuming encryption techniques, the proposed SPCQ uses improved homomorphic encryption technology over a composite order group, which can provide accurate personal data query service and largely reduce the encryption time for the smart phone. In the following, for the comparison with SPCQ, we selected a privacy-preserving range query scheme (PPRQ) [38], which can only provide a one-dimensional range query service to DUs. Let m be the bit length of the attribute values, and the computational costs of SU, DU and CS are 2n * C e + n * C m , 4m * C e + 2m * C m and 23mnN * C e + 23mnN * C m , respectively.
Due to the factor of our proposed SPCQ, we take an n-dimension query into consideration. Then, we present the computation complexity comparison of the proposed SPCQ and PPRQ in Table 2, and it is obvious that our proposed SPCQ can achieve a privacy-preserving personal data query service with low computation complexity in DU and CS.

Phase of Scheme SPCQ PPRQ
SU 3n * C e + n * C m 2n * C e + n * C m DU 2n * C e 4m * C e + 2m * C m CS nN * C p +2nN * C m 23mnN * C e +23mnN * C m For better comparison, we have implemented SPCQ and PPRQ in JAVA. In Figure 3a,b, we have plotted the computational overheads of SPCQ and PPRQ varying with different search ranges in DU and CS. From the two figures, we can see that in both DU and CS, the computational overheads of SPCQ and PPRQ vary slightly by increasing the search range, while the overheads of PPRQ are much higher than those of our proposed SPCQ scheme. It can be obviously shown that the SPCQ scheme largely reduces the computational complexity in DU and CS. In addition, we have made the comparison of communication costs between SPCQ and PPRQ, as shown in Table 3. In SPCQ, the communication length in DU is 164 * n bytes, which is much less than that of PPRQ, whose communication length in DU is 512 * n bytes; the communication length in CS is 256 bytes, while that of PPRQ in CS is 1024 * N bytes; in addition, two times of communications between DU and CS are needed in both SPCQ and PPRQ. As we mentioned above, SPCQ is more efficient than PPRQ in terms of communication costs.

Simulation and Evaluation
To have a better evaluation of our proposed SPCQ, we analyze the factors that affect the computational costs of SU, DU and CS in detail. In addition, we evaluate the integrated performance of SPCQ.

SU
In our proposed SPCQ scheme, SUs collect their real-time personal data from body sensors and upload these data to CS per certain period. Before being sent to CS, the gathered personal data should be operated to obtain ( f i1 , f i1 , f i2 , f i2 , ..., f in , f in ). Therefore, we have chosen different dimensions of personal data to illustrate the SU's computational cost on the embedded device. As shown in Figure 4a, the dimensions of collected personal data are chosen from 5 to 40, and the average computational cost increases linearly with the increase of the dimension. Meanwhile, the computation on the embedded device is less than 150 milliseconds, which is acceptable for SU.

DU
The query response time of DU (i.e., smart phone) is an important aspect for our proposed SPCQ scheme, and the computational operations in the smart phone are query generation and result reading. Since the result reading only requires DU to decrypt the query result, which is negligible, therefore we have chosen different dimensions of the query request and different search ranges to illustrate DU's computational cost. To observe the computational cost of the smart phone, the dimensions of each query are chosen from 5 to 40, and the search ranges are chosen from 100 to 800. Each condition is executed 100 times, and we have calculated the average time for different dimensions of the query request and search range. As shown in Figure 4b, the average computational cost increases linearly with the increase of the dimension, and it is nearly the same with different search ranges. The reason is that, when the smart phone generates a query request, it computes encrypted query parameters with the query condition. For the query condition with a high dimension, it takes more time to get the request, while different search ranges do not affect the computational cost.

CS
In our proposed SPCQ scheme, after receiving a query request from DU, CS will compute the search criteria D i for each data item it stored, by using bilinear pairing over the composite order group, which is the main computation overhead of CS, i.e., the efficiency of CS is impacted by the number of encrypted data resources, the dimension of each data item and the search range d. It is obvious that the computational cost in CS is increased with the number of encrypted data resources. Therefore, we have chosen different dimensions of data resources and different search ranges to illustrate the computational cost. As shown in Figure 4c, the dimensions of data resources are selected from 5 to 40, and eight search ranges of DU's requests are selected from 100 to 800. We can learn from the figure that the computational cost of CS is nearly the same with different search ranges; meanwhile, the computational cost increases linearly with the increase of the data resource's dimension.

Integrated Performance in a Real Environment
In order to evaluate the integrated performance of our proposed scheme, SPCQ is deployed in a real environment with the real medical database mentioned above. Specifically, we have chosen 1000 items from the diabetes database, and the information of resources and corresponding encryption information is stored in CS, respectively. In addition, the smart phone and CS are connected through an 802.11g WLAN, and when DUs input the query data item and search range by SPCQ.apk, the smart phone will send a query request to CS and get the response through WLAN. We have run 100 times to evaluate the performance of SPCQ with eight search ranges (from 100 to 800), as shown in Figure 4d, the runtime is about five seconds, which is acceptable in a real environment.

Conclusions
In this paper, we have proposed a secure and privacy-preserving body sensor data collection and query scheme, called SPCQ, for outsourced computing. Based on an improved homomorphic encryption technology over composite order group, the proposed SPCQ scheme can achieve the confidentiality of SUs' personal data and privacy-preserving of DU's query information. Specifically, SUs' personal data are collected in the form of multi-dimension vectors and uploaded to CS in ciphertext, and the data query request from the registered DU can be directly performed over ciphertext in CS, then the query result only can be decrypted by the registered DU. Therefore, DU can get an accurate query result without divulging his or her query information. Detailed security analysis shows its security strength and privacy-preserving ability, and extensive experiments are conducted to demonstrate its efficiency.