CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks

In recent years, we have seen many applications of secure query in two-tiered wireless sensor networks. Storage nodes are responsible for storing data from nearby sensor nodes and answering queries from Sink. It is critical to protect data security from a compromised storage node. In this paper, the Communication-efficient Secure Range Query (CSRQ)—a privacy and integrity preserving range query protocol—is proposed to prevent attackers from gaining information of both data collected by sensor nodes and queries issued by Sink. To preserve privacy and integrity, in addition to employing the encoding mechanisms, a novel data structure called encrypted constraint chain is proposed, which embeds the information of integrity verification. Sink can use this encrypted constraint chain to verify the query result. The performance evaluation shows that CSRQ has lower communication cost than the current range query protocols.


Introduction
Wireless sensor networks (WSNs) provide effective and convenient solutions for various applications, such as environment sensing, military target tracking, and intelligent transportation systems. The two-tiered wireless sensor network is a practical kind of WSN, and its architecture is illustrated in Figure 1 [1,2]. The lower tier of a two-tiered WSN is composed of a large number of sensor nodes with limited storage and energy that are responsible for collecting data items, while the upper tier is composed of fewer resource-rich storage nodes, whose main tasks are storing data submitted by the nearby sensor nodes and answering queries issued by Sink. Compared to traditional wireless sensor networks, two-tiered WSNs gain some significant advantages by interposing storage nodes as an intermediate tier. Firstly, the network topology of two-tiered WSNs is simpler. Secondly, two-tiered WSNs process queries more efficiently because Sink only communicates with storage nodes for queries.
However, interposing storage nodes as an intermediate tier between the sensor nodes and Sink brings some security challenges. Storage nodes not only receive information from the nearby sensor nodes, but also answer queries issued by Sink. Thus, the storage nodes are more attractive to attackers in two-tiered WSNs. Once a storage node is compromised, the sensitive information stored on it can be obtained or guessed by the attackers, and the compromised storage node can make the query result incorrect or incomplete by maliciously inserting, deleting or tampering with
The range query is an important type of query in sensor networks. In this paper, we focus on secure range query processing. The security goals of secure range query include: (1) preserving the privacy of sensory data items and of the range of interest issued by Sink; and (2) preserving the integrity of the query result. Two challenges need to be addressed to achieve these goals: one is how to process queries without knowing the exact values of data items in sensor networks, and the other is how to verify whether or not the query result is correct and complete.
Therefore, we propose a communication-efficient secure range query processing method, denoted as CSRQ, which is a novel protocol for preserving the privacy and integrity of range queries. For networks deployed in a hostile environment, sensor nodes use a new data structure named Encrypted Constraint Chain to submit sensory data to the storage nodes. It ensures that the storage nodes cannot disclose the data stored on them. In addition, the information embedded in this chain makes tampered and/or incomplete data items in query results detectable.
The main contributions in this paper are as follows: (1) A novel encrypted constraint chain model is proposed. Data items are submitted with a complete encrypted chain, which can preserve the privacy of sensitive data in sensor networks. Furthermore, adjacent relations of factors are embedded in chains, which can be used to verify the integrity of query result; (2) Based on the encrypted constraint chain model, a new scheme named CSRQ is proposed. CSRQ can protect sensitive data in two-tiered WSNs from the compromised storage nodes and allow Sink to verify whether the query result is complete and correct; (3) We evaluate our solutions by comprehensive simulation based on real datasets, and the results show that our scheme has a better performance on communication cost.
The rest of this paper is organized as follows. Section 2 gives a brief review of related works. Section 3 describes the system model and attack model. Section 4 proposes the scheme of encrypted constraint chain. Section 5 introduces the process of our protocol in detail. Section 6 analyzes the performance of our approach. We evaluate our approach using thorough experiments in Section 7 and conclude this paper in Section 8.
Secure range query processing has been investigated in [14][15][16][17][18][19][20][21][22][23][24]. In [14], Sheng and Li, whose scheme is referred to as S&L below, employed the bucket partition idea to preserve privacy and an authentication encoding mechanism to verify the integrity of query results in two-tiered WSNs. The basic idea is to divide the domain of data values into multiple buckets and distribute data items into these buckets. In each time slot, the sensor nodes encrypt the data items in each bucket together and send them along with the bucket ID to the nearby storage node. Then, Sink finds the minimal set of bucket IDs that covers the query range, and sends the set as the query to the storage node. The storage node finds the encrypted data items in these buckets and sends them to Sink. Finally, Sink decrypts the encrypted buckets. However, the communication cost and memory cost of this scheme increase exponentially with the number of buckets. In [15,16], Shi et al. proposed an optimized version of S&L's scheme to reduce the communication cost between the sensor nodes and storage nodes. Their main contribution is a new spatiotemporal crosscheck approach for verifying the integrity of query results, which reduces the communication costs.
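The bucket idea attributed to S&L above can be illustrated with a minimal sketch. It assumes equal-width buckets over an integer domain and leaves out encryption entirely; the function names and the bucket width are hypothetical choices for illustration, not details taken from [14].

```python
def bucket_id(value, domain_lo, width):
    """Index of the equal-width bucket holding value (assumed layout)."""
    return (value - domain_lo) // width


def group_by_bucket(items, domain_lo, width):
    """Sensor side: group data items by bucket ID (encryption omitted)."""
    buckets = {}
    for v in items:
        buckets.setdefault(bucket_id(v, domain_lo, width), []).append(v)
    return buckets


def query_buckets(low, high, domain_lo, width):
    """Sink side: smallest set of bucket IDs whose union covers [low, high]."""
    first = bucket_id(low, domain_lo, width)
    last = bucket_id(high, domain_lo, width)
    return list(range(first, last + 1))


def answer(buckets, ids):
    """Storage side: return every stored item from the requested buckets."""
    return [v for b in ids for v in buckets.get(b, [])]


stored = group_by_bucket([3, 6, 11, 23, 38, 42], domain_lo=0, width=8)
ids = query_buckets(10, 20, domain_lo=0, width=8)
candidates = answer(stored, ids)
# Sink filters after decryption, because buckets over-cover the range.
result = [v for v in candidates if 10 <= v <= 20]
```

In this toy run the storage node returns more items than the range strictly needs, and the bucket IDs themselves reveal a coarse estimate of the values, which is the weakness discussed next.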
However, the schemes of both S&L and Shi et al. have two main drawbacks, which are inherited from the bucket partitioning technique. (1) The compromised storage nodes could obtain a reasonable estimation of the actual values of both data items and queries [17]; (2) The communication cost increases exponentially with the number of dimensions of collected data. To address these problems, Chen and Liu proposed SafeQ in [17,18], which better preserves the privacy and integrity of range queries. The basic idea of SafeQ is that a prefix-encoding scheme encodes both data items and queries such that they cannot be estimated by compromised storage nodes, and a new data structure called neighborhood chain generates integrity verification information, which Sink uses to verify the integrity of the query result.
Although the prefix-encoding scheme and neighborhood chains structure proposed in SafeQ can solve the problems in S&L's scheme, they will increase the memory and communication costs for both sensor nodes and storage nodes. The main reason is that each data item needs to be stored twice in the structure of neighborhood chains. Therefore, Yi et al. proposed a new link watermarking scheme named QuerySec in [19], a protocol based on two new techniques: (1) a scheme based on order preserving function for preservation of data privacy; and (2) a new link watermarking scheme for verification of query results. In [21], Nguyen et al. proposed a novel model based on a d-disjunct matrix, an order-preserving function and a permutation function to preserve the privacy of sensitive information for the range queries, while it fails to consider the verification of query result.
In [22], an efficient secure range query protocol named ESRQ is proposed to realize a more efficient and correct range query process. In [23], Dong and Zhang provided an extended version of [22], and they were the first to focus on collusion attacks for range queries in two-tiered WSNs. The basic idea is that different sensor nodes use different hash functions to encode data items for the protection of data privacy, and the correlation among data is used for verification of the result. In [24], Dong and Chen et al. proposed SecRQ, which not only protects the privacy of data, but also considers collusion attacks and probability attacks in two-tiered WSNs. It adopts generalized inverse matrices and a distance-based range query mechanism for the security of data. Besides, a mutual verification scheme is proposed in that paper to verify the integrity of query results, and it does so with a lower false positive rate and lower communication cost than the schemes mentioned above.

Network Model
The adopted architecture of two-tiered WSNs is shown in Figure 1, which is similar to [17]. A two-tiered WSN is a special kind of wireless sensor network, in which the storage node M is the intermediate tier of the network, and the lower tier is composed of sensor nodes. The whole network is divided into a number of cells. Each cell consists of M and a number of sensor nodes S = {s 1 , s 2 , . . . , s n }, which can be denoted as cell = {M, {s 1 , s 2 , . . . , s n }}. Sensor nodes are inexpensive sensing devices with limited storage and energy resources and are in charge of collecting data items from a cell and submitting information of data items to the nearby M. M is a resourceful device with relatively high storage and energy resources and is responsible for receiving and storing information submitted by nearby

Query Model
The range query refers to accessing all the sensory data items included in a specified range, which can be denoted as a three-tuple: Q t = (ψ, t, [low, high]), where ψ denotes the set of IDs of the queried sensor nodes, t is the queried time slot, and low and high refer to the lower and upper bounds of the query range, respectively. For example, Q t = ({s 1 , s 2 , . . . , s 8 }, t, [10,20]) means that Sink asks for all data items in [10,20] collected by sensor nodes s 1 -s 8 during time slot t.
For the sake of brevity, we only discuss the range query that Sink queries in a cell = {M, {s 1 , s 2 , . . . , s n }} during time slot t in this paper, which can be denoted as Q t = (ψ, t, [low, high]), where ψ = {s 1 , s 2 , . . . , s n }. To get the results of queries in multiple time slots and/or multiple cells, we can simply get the final result by resolving and merging the single results.

Threat Model
Following the threat model in [17][18][19][20][21][22], we assume that Sink and the sensor nodes are trusted in two-tiered WSNs, but M is not. In fact, the sensor nodes can also be compromised in a hostile environment, and attackers may obtain sensitive data items from compromised sensor nodes. However, a sensor node only holds a small fraction of the data items collected by all sensor nodes, while a great deal of sensitive data items are stored in M, so attackers can steal less information from a compromised sensor node than from M. Therefore, we are mainly concerned with the scenario of a compromised M in this paper.
If M is compromised in two-tiered WSNs, we consider that attackers can attack networks in the following two ways: (1) The attackers obtain sensitive information stored in M directly or indirectly, which violates the privacy of data. (2) The attackers forge or exclude the legitimate data items stored in the compromised M, which makes the query result incorrect or incomplete.

Problems Statement
The goal of secure range query is not only to preserve the privacy of data items collected by sensor nodes and queries issued by Sink, but also to ensure that the integrity of the query result can be verified by Sink. The details are as follows.
The privacy issues: M must not be able to obtain the actual values of any data items collected by sensor nodes, nor the values of the lower and upper bounds of the query range in Q t .
The integrity issues: If Q t = (ψ, t, [low, high]), the query result, which can be denoted as QR, should contain all data items satisfying [low, high] in ψ. All data items in QR are collected by the sensor nodes in ψ.
The keys to achieving the security goals above are as follows. First, M must be able to decide whether a data item collected by a sensor node should be included in the query result by comparing it with low and high without knowing their actual values. Second, Sink must be able to detect whether all data items satisfying [low, high] are included in QR, whether all data items included in QR satisfy [low, high], and whether all of them are collected by the sensor nodes in ψ. Thus, we propose CSRQ, a query protocol with better performance in terms of both security and communication cost in preserving the privacy of data items and the integrity of query results.
Moreover, as an index used to evaluate the performance of security range query protocol, the communication costs include two aspects: one is the communication cost for sensor node, which plays a decisive role in the lifecycles of sensor networks; the other is the communication cost between M and Sink, which directly affects the operating costs of networks. We conduct a detailed analysis for the index of performance in Sections 6 and 7.

Encrypted Constraint Chain Model
In this section, we introduce the encrypted constraint chain in detail, which is proposed to protect the privacy and integrity of two-tiered WSNs.
Definition 1. Encrypted constraint chain: Given n numbers sorted in ascending order D = {d 1 , d 2 , . . . , d n }, where d 1 < . . . < d n , we partition these numbers into several parts with parameter τ, and encrypt every part. These encrypted parts are concatenated to form the encrypted constraint chain C τ :
C τ = F 1 'F 2 '...'F δ (1)
Here "'" denotes concatenation, and δ is the number of items in C τ . We call F i the constraint factor of C τ , and F i .ds represents the dataset in F i . C τ satisfies the following conditions: (1) F i has τ sensory data items, where 1 ≤ i ≤ δ − 1, while F δ has no more than τ sensory data items.
(2) The upper and lower bounds of F i are denoted as UB(F i ) and LB(F i ), respectively. Hence, for any two adjacent constraint factors F i and F i+1 , we can determine UB(F i ) = LB(F i+1 ). The value of δ is computed by Equation (2). The form of F i in C τ is as shown in Equation (3), in which k represents an encryption key, and "||" denotes the concatenation of data items.
Definition 2. Let C τ = F 1 'F 2 '...'F δ be an encrypted constraint chain, where δ is called the length of C τ and it is denoted as |C τ | = δ. For F i ∈ C τ , F i−1 and F i+1 are called the left and right neighbor constraint factors of F i , respectively. The head factor of C τ is denoted as head(C τ ) = F 1 and the tail factor is denoted as tail(C τ ) = F δ .

Definition 3.
Given two encrypted chains C τ and C τ ', if each factor of C τ ' is included in C τ , then C τ ' is called a sub-chain of C τ . It can be denoted as C τ ' ⊆ C τ . Thus, we have C τ ' ⊆ C τ if and only if F ∈ C τ for every F ∈ C τ '. According to Definition 3, we can easily deduce the following property.

Property 1.
The sub-chain relation ⊆ is transitive, which means that if C τ '' ⊆ C τ ' and C τ ' ⊆ C τ , then C τ '' ⊆ C τ .
Definition 4. Given an encrypted constraint chain C τ , let C τ ' be a sub-chain of C τ , which means C τ ' ⊆ C τ . For a query range [low, high], if C τ ' simultaneously satisfies the following conditions, then C τ ' is called a Maximum Encrypted Constraint Sub-chain (MECS) of C τ .
Given a MECS of C τ that satisfies [low, high], according to Definition 4, we know that low is between the lower and upper bounds of the head constraint factor in the MECS, and high is between the lower and upper bounds of the tail constraint factor. Except for the head and tail, the data items in each factor of the MECS are between low and high, and the data items included in C τ but outside of the MECS are not included in [low, high].
Now we illustrate the above definitions and properties with an example. Let D = {3, 6, 11, 23, 38, 42}. Given a parameter τ = 3, the encrypted constraint chain is C τ = F 1 'F 2 , where F 1 .ds = {3, 6, 11} and F 2 .ds = {23, 38, 42}. Given a range [7,23], the MECS of C τ which satisfies [7,23] is F 1 'F 2 , because 7 falls within the bounds of F 1 and 23 falls within the bounds of F 2 . As demonstrated by the previous definitions and properties, if F i and F i+1 are two adjacent factors in the encrypted constraint chain, the upper bound of F i equals the lower bound of F i+1 . This provides a theoretical basis for the integrity verification of query results.
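The partitioning in the example above can be reproduced with a small plaintext sketch. Several points are assumptions made only for illustration and are not confirmed by the text: the chain length is taken as ⌈n/τ⌉, the shared bound between two adjacent factors is set to the largest item of the left factor, the outer bounds are d min = 0 and d max = 50, and encryption is left out so the bounds can be inspected. `Factor`, `build_chain` and `select_factors` are hypothetical names.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Factor:
    lb: int      # lower bound LB(F)
    ds: list     # data items F.ds
    ub: int      # upper bound UB(F)


def build_chain(data, tau, d_min, d_max):
    """Split sorted data into factors of tau items with shared bounds."""
    items = sorted(data)
    delta = max(1, ceil(len(items) / tau))   # assumed chain length
    chain, lb = [], d_min
    for j in range(delta):
        ds = items[j * tau:(j + 1) * tau]
        # Assumption: adjacent factors share the left factor's largest item.
        ub = d_max if j == delta - 1 else ds[-1]
        chain.append(Factor(lb, ds, ub))
        lb = ub                               # UB(F_j) = LB(F_{j+1})
    return chain


def select_factors(chain, low, high):
    """Factors whose [lb, ub] interval intersects [low, high].

    Simplification: closed intervals are used, so a range that ends
    exactly on a shared bound also pulls in the neighbouring factor.
    """
    return [f for f in chain if f.lb <= high and f.ub >= low]


chain = build_chain([3, 6, 11, 23, 38, 42], tau=3, d_min=0, d_max=50)
sub = select_factors(chain, 7, 23)
```

With these assumptions the range [7,23] selects both factors, while a range such as [12,20] selects only the second one even though no stored item falls inside it, so a queried node always contributes at least one factor.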

Secure Range Query Protocols
In this paper, we employ the 0-1 encoding mechanism [25] to preserve privacy. The basic idea of the 0-1 encoding mechanism is to convert the verification of whether a data item is within a range to the verification of whether there are intersections between two sets. Given a number x whose binary form is x n x n−1 . . . x 1 , its 0-coding E 0 (x) and 1-coding E 1 (x) are defined as in [25]: E 0 (x) = {x n x n−1 . . . x i+1 1 | x i = 0, 1 ≤ i ≤ n} and E 1 (x) = {x n x n−1 . . . x i | x i = 1, 1 ≤ i ≤ n}. According to the above definitions, given two numbers x and y, they can be compared with each other using their 0-1 codings. Note that x and y can be compared only if they are of different encoding types, which means that one is given as a 0-coding and the other as a 1-coding.
In this paper, we convert each 0-1 coding to a corresponding unique number using the numerical function N (*), similarly to [18], and we encode the 0-1 coded data items using the keyed-Hash Message Authentication Code (HMAC) to ensure that it is infeasible for M to steal sensitive data items. We denote the HMAC function as HMAC g (*), where g is an HMAC key known only to the sensor node and Sink. A data item processed by 0-1 encoding and HMAC is called a comparator. HNE 0 (x) = HMAC g (N (E 0 (x))) is a comparator of 0-coding and HNE 1 (x) = HMAC g (N (E 1 (x))) is a comparator of 1-coding. Thus, x > y if and only if HNE 1 (x) ∩ HNE 0 (y) ≠ Ø.
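A minimal sketch of the comparator idea follows, using the 0-1 encoding of [25] as it is usually stated (for an n-bit number, the 1-coding collects the prefixes that end at a 1 bit, and the 0-coding collects the prefixes before each 0 bit with a 1 appended). The numerical function here is only a stand-in for N(*): it prepends a sentinel bit so that codes of different lengths map to distinct integers, which is an assumption and not the function used in [18]. The key value, bit width and function names are likewise hypothetical.

```python
import hashlib
import hmac


def encode1(x, n):
    """1-coding: prefixes of the n-bit form of x that end at a 1 bit."""
    bits = format(x, f"0{n}b")
    return {bits[:i + 1] for i in range(n) if bits[i] == "1"}


def encode0(x, n):
    """0-coding: for each 0 bit, the prefix before it followed by a 1."""
    bits = format(x, f"0{n}b")
    return {bits[:i] + "1" for i in range(n) if bits[i] == "0"}


def numericalize(code):
    """Stand-in for N(*): a sentinel bit keeps '01' and '1' distinct."""
    return int("1" + code, 2)


def comparator(codes, key):
    """HMAC every numericalized code with the key shared with Sink."""
    return {
        hmac.new(key, str(numericalize(c)).encode(), hashlib.sha256).digest()
        for c in codes
    }


def greater_than(hne1_x, hne0_y):
    """x > y exactly when the two comparator sets intersect."""
    return bool(hne1_x & hne0_y)


g = b"demo-key-shared-by-sensor-and-sink"   # hypothetical key
n = 5                                        # assumed bit width
hne1_x = comparator(encode1(23, n), g)
hne0_y = comparator(encode0(11, n), g)
is_greater = greater_than(hne1_x, hne0_y)    # 23 > 11
```

The party doing the comparison only intersects digests, so it learns the outcome of x > y without seeing either value, provided it does not hold the key g.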

Submission Protocol
The submission protocol concerns how a sensor node submits its data to the nearby M. First, the sensor node encrypts all the data items collected during a time slot t, then builds the corresponding encrypted constraint chain and computes the comparators of encrypted data items. After that, the sensor node sends the encrypted constraint chain to the nearby storage node.
We denote d min and d max as the lower and upper bounds of a sensory data item, respectively. Let τ be the partition parameter of the encrypted constraint chain. Each sensor node s i in a network shares a secret key k i,t with Sink. The submission protocol is illustrated in the following Protocol 1.
Then s i performs the following steps.
(1) Compute the 0-1 code of each data item.
(2) Build the encrypted constraint chain C τ ,i of D i with τ and k i,t . Assume C τ ,i = F i,1 'F i,2 '...'F i,δ .
(3) Compute the comparator of UB(F i,j ), where 1 ≤ j ≤ δ − 1. It means computing HNE 0 (UB(F i,j )) and HNE 1 (UB(F i,j )). Then, add the comparator set which contains the fewest elements into Ω i . Thus we have Ω i = {min(HNE 0 (UB(F i,j )), HNE 1 (UB(F i,j ))) | 1 ≤ j ≤ δ − 1}, where min(X, Y) denotes whichever of the sets X and Y contains fewer elements.
(4) Send the following message to M, where id(s i ) denotes the ID of s i in networks.
In order to build the encrypted constraint chain easily, we denote an encrypted group as a constraint factor in this paper. Thus, the more sensory data are contained in an encrypted group, the less encrypted data and HMAC data are contributed by a sensor node. It contains at most

Query Protocol
The query protocol concerns how Sink and M process the queries correctly. We first give the basic idea of our query protocol, which ensures that both the privacy of data and the integrity of the query result are preserved. First, Sink computes the comparators of low and high in Q t = (ψ, t, [low, high]), and then replaces low and high in Q t with their comparators, respectively. After that, Sink sends the modified Q t to M. Second, after receiving a query, M decides whether the value behind each comparator in the encrypted constraint chain contributed by a sensor node is included in [low, high] using the 0-1 coding mechanism, and computes the minimal set which contains all data items satisfying the query. Then, M sends this set to Sink. Third, after receiving this minimal set, Sink decrypts each encrypted item in the dataset, and computes QR. Finally, Sink verifies the authenticity and completeness of QR. The detailed procedure is as follows:

Phase 1: Sink sends query to M
Sink first computes {HNE 0 (low), HNE 1 (low), HNE 0 (high), HNE 1 (high)}, the comparators of low and high in Q t = (ψ, t, [low, high]), and replaces low and high in Q t with their corresponding comparators, respectively. Then, Sink sends Q t = (ψ, t, {HNE 0 (low), HNE 1 (low), HNE 0 (high), HNE 1 (high)}) to M.

Phase 2: M processes the query
Upon receiving Q t from Sink, M performs the following two steps. We denote CS as the minimal encrypted dataset received by Sink, which contains the query results. The initial CS is empty, which is denoted as CS = Ø.
(1) Let C τ ,i = F i,1 'F i,2 '...'F i,δ be an encrypted constraint chain contributed by s i during a time slot, and min{HNE 0 (UB(F i,j )), HNE 1 (UB(F i,j ))} is the comparator of UB(F i,j ), which is an element of Ω i . According to Definition 1, LB(F i,j ) = UB(F i,j−1 ). Here, the 0-1 encoding technique is employed to compare UB(F i,j ) and LB(F i,j ) with low and high, where F i,j is a constraint factor in C τ ,i . If one of the following three conditions is satisfied, add F i,j into i , where i is a set of constraint factors. We set i = Ø initially.
After all constraint factors in C τ ,i are processed through the above steps, the constraint factor set i is added into another set CS.
(2) After all sensor nodes are processed through step (1), M sends the following message to Sink.
M → Sink : ⟨t, {id(s i ), i | s i ∈ ψ ∧ i ⊆ CS}⟩
Protocol 2 shows that the CS received by Sink is as follows:
CS = ∪ s i ∈ψ { i } (9)
Property 2. CS includes at least one item in i contributed by each s i ∈ ψ, so we have i ≠ Ø.
Proof: Let C τ ,i = F i,1 'F i,2 '...'F i,δ be a MECS received by M from s i . According to Definition 1, the upper bound of any constraint factor equals the lower bound of the next factor, which means UB(F i,j ) = LB(F i,j+1 ). [LB(F i,j ), UB(F i,j )] is the interval composed of the lower and upper bounds of F i,j . Thus, the composition process of C τ ,i in Protocol 1 shows that the following Equation (11) is true.
Because d min and d max are the lower and upper bounds of sensory data, the query range [low, high] must be included in [d min , d max ]. According to this, we know that there is at least one constraint factor F i,j that satisfies [LB(F i,j ), UB(F i,j )] ∩ [low, high] ≠ Ø. There exist the following four possible cases.
Proof: Because every i contributed by s i ∈ ψ is a MECS of C τ ,i that satisfies [low, high], according to the definition of MECS, we know that any data items in the constraint factors of i are included in [low, high], and any data items outside of i are not included in [low, high]. Thus, i is the minimal encrypted dataset which includes all data items satisfying [low, high] in C τ ,i . As shown in Equation (9), CS = ∪ s i ∈ψ { i }; therefore, CS is the minimal encrypted dataset which satisfies [low, high] in ψ.

The Computation of Query Result and the Algorithm of Integrity Verification
After receiving CS from M, Sink will decrypt the encrypted data in CS using the keys shared with sensor nodes and compute the query result QR, and then, it will verify the integrity of QR. Algorithm 1 shows the details of integrity verification.

Algorithm 1: The algorithm of integrity verification
Let CS = ∪ s i ∈ψ { i } be an encrypted dataset that Sink receives from M. Sink verifies the integrity of QR as follows.
Sink performs the following three steps to verify each i contributed by s i ∈ ψ.
(1) If i = Ø, it could not satisfy Property 2. Thus the integrity of QR is violated. Quit the algorithm.
(2) If i ≠ Ø, Sink decrypts all factors in i using k i,t , which is only shared with s i , and checks whether both of the following two conditions are satisfied. If so, add all the data within [low, high] into QR, and then turn to Step (3). Otherwise, the integrity of QR is violated, so quit the algorithm. 1) Each factor F i,v in i satisfies the following condition: 2) F i,v and F i,v+1 satisfy UB(F i,v ) = LB(F i,v+1 ), where F i,v and F i,v+1 are two adjacent factors in i .
(3) If all factors contributed by s i ∈ ψ are processed through Steps (1) and (2), and all of them satisfy the query, then QR satisfies the query, so return QR. Otherwise, continue to process the next i contributed by an unprocessed sensor node, and turn to Step (1).
Algorithm 1 shows that the key to verifying the integrity of QR is to check whether all of the following three conditions are satisfied. First, each sensor node s i to be queried contributes a non-empty set i . Second, all factors in QR satisfy the query range [low, high]. Third, the upper bound of each factor equals the lower bound of the next one. QR passes integrity verification if and only if all three conditions are satisfied.
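The three checks summarized above can be sketched for a single sensor node as follows. The sketch works on already decrypted factors and assumes each factor exposes its lower bound, data items and upper bound; that layout, the closed-interval tests on the head and tail, and the names `Factor` and `verify_node` are illustrative assumptions, not the exact conditions of Algorithm 1.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    lb: int      # LB(F), recovered after decryption
    ds: list     # data items F.ds
    ub: int      # UB(F)


def verify_node(factors, low, high):
    """Return the node's query result, or None if integrity is violated."""
    # Check 1: every queried node must contribute at least one factor.
    if not factors:
        return None
    # Check 3: adjacent factors must share a bound, UB(F_v) = LB(F_v+1).
    for left, right in zip(factors, factors[1:]):
        if left.ub != right.lb:
            return None
    # Head must cover low and tail must cover high (assumed closed bounds).
    head, tail = factors[0], factors[-1]
    if not (head.lb <= low <= head.ub and tail.lb <= high <= tail.ub):
        return None
    # Items must lie inside their own factor's bounds.
    for f in factors:
        if any(not (f.lb <= d <= f.ub) for d in f.ds):
            return None
    # Check 2: interior factors may only hold items inside the range.
    for f in factors[1:-1]:
        if any(not (low <= d <= high) for d in f.ds):
            return None
    return [d for f in factors for d in f.ds if low <= d <= high]


f1 = Factor(0, [3, 6], 6)
f2 = Factor(6, [11, 23], 23)
f3 = Factor(23, [38, 42], 50)
qr = verify_node([f1, f2, f3], 5, 40)
```

Returning `None` rather than an empty list keeps a detected violation distinct from a legitimate empty result, which can occur when the range falls between two stored items.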

Protocol Analyses
In two-tiered WSNs, we mainly evaluate the performance of a security query protocol from following two aspects: one is security, and the other is the communication cost. In this section, we will analyze the performance of CSRQ from these two aspects.

Privacy Analysis
(1) The privacy of sensory data. The key to preserve the privacy of data items in two-tiered WSNs is to ensure that M cannot steal the actual values of encrypted data items without knowing the secret keys. If a storage node is compromised, CSRQ can effectively preserve the privacy of sensitive data. Because s i encrypts the collected data items using its private keys only shared with Sink before sending data items to M in CSRQ, it is very difficult for attackers to obtain the actual values of sensory data. Furthermore, the HMAC mechanism is employed in CSRQ to ensure that it is computationally infeasible to compute the actual values of sensory data without knowing both the Hash key and its secret key. Therefore, CSRQ has a better performance in protecting the privacy of sensory data.
(2) The privacy of query result. Similar to the privacy protection of sensory data, the key to protecting the privacy of the query result is to ensure that M cannot get the actual values of the results. In CSRQ, the sensor nodes send data items to M in the form of an encrypted constraint chain together with the corresponding comparators, and M compares the query range with the data items without knowing their actual values. All data items included in CS are encrypted. Upon receiving CS, only Sink can decrypt the data items in CS and compute the query result. Therefore, without knowing the key used in the encryption, it is very difficult to steal the values in the query result.
(3) The privacy of query range. CSRQ does not allow attackers to obtain the actual values of the query range either. Sink sends the query to M after replacing the bounds of the query range with their corresponding comparators, which makes it very difficult for M to leak information about the query range.
Thus, the CSRQ proposed in this paper can ensure that the privacy of sensory data, query result and query range can be protected.

Integrity Analysis
In CSRQ, we propose a novel encrypted constraint chain to ensure that the integrity of the query result can be verified by Sink. The main idea is that the data items collected by each sensor node during a time slot t are sent to the nearby M in the form of a complete encrypted constraint chain, which allows Sink to verify the integrity of the query result by checking the relationship of adjacent factors in the chain. The integrity verification is two-fold: one part is verifying whether a data item satisfying the query is forged, and the other is verifying whether data items satisfying the query are deleted by attackers.
[low, high]) received by Sink from s i ∈ ψ during time slot t. We assume that M is compromised and that it attempts to attack i . Next, we analyze the integrity of CSRQ in the following cases.
(1) If a data item in QR satisfying the query is forged by the compromised M: 1) If F i,u ' is a tampered data item which replaces the original F i,u in i , Sink will find that UB(F i,u−1 ) ≠ LB(F i,u ') or UB(F i,u ') ≠ LB(F i,u+1 ) after decrypting i . This violates the definition of the encrypted constraint chain (Definition 1). Therefore, i can be determined as incomplete. Otherwise, Sink will detect that d α ∈ F i,u '.ds, where d α ∉ [low, high], which contradicts the definition of MECS (Definition 4). Thus QR can also be determined as incomplete.
2) Similar to 1) above, if F i,u ' is inserted as a tampered data item between F i,u and F i,u+1 , Sink will detect that UB(F i,u ) ≠ LB(F i,u ') or UB(F i,u ') ≠ LB(F i,u+1 ) after decrypting i . This does not satisfy Definition 1 either, which means that QR is incomplete.
(2) If a data item that satisfies the query range is deleted by M: 1) If all data items in i are deleted by attackers, Sink will detect that i = Ø, which is contrary to Property 2. Thus, Sink can judge that i has been attacked.
2) If the data items between F i,a and F i,b are deleted by M, where j ≤ a < b ≤ v, Sink will detect that UB(F i,a ) ≠ LB(F i,b ), which violates Definition 1. Thus, i will be determined as incomplete.
3) If the head constraint factor F i,j of i is deleted by attackers, F i,j+1 will be the new head( i ) of i . Then, Sink will detect that all data items in F i,j+1 are included in [low, high] after decrypting i , which does not satisfy Condition (1) of Definition 4. Therefore, the incomplete i can be detected by Sink. Similarly, if the deleted constraint factor F i,j is tail( i ), it can also be detected by Sink.
In conclusion, Sink can verify the correctness and completeness of query result effectively in CSRQ.

Communication Cost of Sensor Node
In two-tiered WSNs, the communication cost of sensor nodes is mainly incurred by transferring data items from the sensor nodes to M. Here, let E C be the communication cost of data submission for the sensor nodes.
Let n be the number of sensor nodes in the queried two-tiered WSN, and l t be the bit length of a time slot. We assume that each sensor node collects N data items during a time slot. According to the definition of encrypted constraint chain in our scheme, the N data items will be divided into δ constraint factors by parameter τ, where δ can be calculated from Equation (2). Let l e , l h and l id be the average length of an encrypted constraint factor, a HMAC encoding and an ID of encrypted node, respectively, and L be the average number of hops from each sensor node to M. By analyzing Protocol 1, the communication cost is obtained as follows.
$$E_C = \sum_{i=1}^{n} \left( l_{id} + l_t + (\delta_i - 1) \cdot l_h + \delta_i \cdot l_e \right) \cdot L$$

Communication Cost of Query
The query Protocol 2 shows that the query is a collaborative process between M and Sink, thus the communication cost of a query contains two parts: one is the cost of sending the query from Sink to M, and the other is the cost of sending the message from M to Sink. We assume that the minimal encrypted dataset contributed by s i includes ρ i constraint factors, and all other parameters have the same meaning as given above. Similarly, the communication cost E Q for the query is as follows.
$$E_Q = 4 \cdot l_h + (n + 1) \cdot (l_{id} + l_t) + l_e \cdot \sum_{i=1}^{n} \rho_i$$
In conclusion, we can finally obtain E total , which is the total communication cost for CSRQ, simply by adding E C to E Q . Then, we have
$$E_{total} = E_C + E_Q = \sum_{i=1}^{n} \left( l_{id} + l_t + (\delta_i - 1) \cdot l_h + \delta_i \cdot l_e \right) \cdot L + 4 \cdot l_h + (n + 1) \cdot (l_{id} + l_t) + l_e \cdot \sum_{i=1}^{n} \rho_i$$
In the next section, we will provide some further details about the performance analysis.
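The cost model can be evaluated with a short sketch. It reads the total cost as a submission part, the sum over nodes of (l id + l t + (δ i − 1)·l h + δ i·l e)·L, plus a query part, 4·l h + (n + 1)·(l id + l t) + l e·Σρ i; the final sum over ρ i is inferred from the definition of ρ i in the text. The function names and the bit lengths in the example are hypothetical values chosen only to make the arithmetic concrete, not the defaults of Table 1.

```python
def submission_cost(deltas, l_id, l_t, l_h, l_e, hops):
    """E_C: each node sends its ID, the time slot, (delta - 1) comparators
    and delta encrypted factors over `hops` hops to the storage node."""
    return sum((l_id + l_t + (d - 1) * l_h + d * l_e) * hops for d in deltas)


def query_cost(rhos, l_id, l_t, l_h, l_e):
    """E_Q: four comparators for the range, (n + 1) ID/time-slot fields,
    and rho_i encrypted factors returned for each of the n nodes."""
    n = len(rhos)
    return 4 * l_h + (n + 1) * (l_id + l_t) + l_e * sum(rhos)


def total_cost(deltas, rhos, l_id, l_t, l_h, l_e, hops):
    """E_total = E_C + E_Q."""
    return (submission_cost(deltas, l_id, l_t, l_h, l_e, hops)
            + query_cost(rhos, l_id, l_t, l_h, l_e))


# Hypothetical example: two nodes, lengths in bits.
params = dict(l_id=16, l_t=32, l_h=128, l_e=256)
e_c = submission_cost([2, 3], hops=2, **params)
e_q = query_cost([1, 2], **params)
e_total = total_cost([2, 3], [1, 2], hops=2, **params)
```

Because δ i shrinks as τ grows, a larger partition parameter lowers the number of comparators and factors a node sends, at the price of larger factors (a larger l e) being returned to Sink.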

Experimental Results
For a further analysis of the performance of communication cost in CSRQ, we compare CSRQ with SafeQBloom, QuerySec, ESRQ and SecRQ by implementing these five schemes on a large real dataset from Intel Lab [26], and present the results of detailed performance evaluation obtained using MATLAB.

Contrastive Experiment for Communication Cost
Since the sensor nodes in wireless sensor networks are mainly powered by batteries, their energy is limited [27]. It is important to reduce the communication cost of sensor nodes in order to prolong the lifetime of the network. Thus, we analyze the communication costs of sensor nodes by comparing the performance of CSRQ with that of SafeQBloom, QuerySec, ESRQ and SecRQ on six aspects: the network topology, the number of sensor nodes in a cell, the number of data items collected by a sensor node during a time slot, the length of an encoded data item, the length of an encrypted constraint factor, and the number of constraint factors. The default parameter values are shown in Table 1. In our experiment, we assume that there are 20 groups of networks with various, randomly distributed network topologies. The network ID of each group is unique. The communication cost of a query is then determined by computing the average cost over the 20 groups of networks. The details of the experimental results and analysis are as follows:

Table 1. Default values of parameters.
(1) Impact of network ID. Figure 2 shows how the communication cost of sensor nodes varies with the network ID, with all other parameters set to their default values.
According to Figure 2, different network topologies cause little change in any of the five schemes, and their sensor-node communication costs fluctuate within a small range. The average communication cost of sensor nodes in SafeQBloom is relatively high, while the costs of CSRQ and SecRQ are lower. CSRQ has the lowest communication cost, which is 68.5% lower than that of SafeQBloom and 9.4% lower than that of SecRQ. The reasons are as follows. In CSRQ, each constraint factor except the first and last ones contains τ − 1 data items, so fewer messages are submitted to M than in SafeQBloom and SecRQ. Moreover, during each time slot, a sensor node only needs to send the minimal set of each factor's upper boundary along with the encrypted constraint chain to M, which further reduces its communication cost.
(2) Impact of n and N. We conducted experiments with different values of n and N, respectively, while keeping the other parameters at their default values. Figures 3 and 4 show the communication cost of sensor nodes under the impact of n and N, respectively, where n is the number of sensor nodes in a cell, and N is the number of data items collected by a sensor node during a time slot. The communication cost of sensor nodes increases with n and N in all five schemes. CSRQ uses the 0-1 encoding scheme for comparison, which requires fewer messages to be transferred and significantly reduces the communication cost of a sensor node. Thus, compared to the other four schemes, CSRQ has the lowest communication cost of sensor nodes. In conclusion, the experimental data demonstrate that CSRQ achieves secure range queries with lower communication cost than the existing secure range query schemes.
(3) Impact of w and $l_e$. Figures 5 and 6 show how the communication costs of sensor nodes change as w and $l_e$ increase, where w denotes the bit length of data collected by a sensor node, and $l_e$ denotes the average length of an encrypted constraint factor. In all five schemes, longer sensory data and longer encrypted constraint factors both lead to greater communication cost of sensor nodes. CSRQ has lower communication costs than the others. The reasons are similar to those given for Figures 2 and 3.
(4) Impact of δ. Figure 7 shows how the communication costs of sensor nodes change with δ, where δ denotes the number of encrypted constraint factors. The sensor nodes' communication costs in the five schemes increase with δ, and CSRQ has a lower communication cost than the others. The reason is as follows. A larger δ means that the sensor nodes submit more messages to M. In CSRQ, each sensor node only needs to submit δ − 1 minimal sets of boundary messages along with the encrypted constraint chain to M, so fewer messages need to be transferred in CSRQ than in the other schemes.

Contrastive Experiment for False Positive Rate
We define the false positive rate as the ratio of the number of unsatisfactory data items received by Sink to the number of data items satisfying the query range. Therefore, the lower the false positive rate is, the higher the accuracy is. Figure 8 shows how the false positive rate varies with the network size, where the network size refers to the number of data items collected by the sensor nodes and transmitted to M. We can see that QuerySec, ESRQ and SecRQ have no false positives, and the average false positive rates of SafeQBloom and CSRQ are 0.47% and 0.51%, respectively. In both CSRQ and SafeQBloom, the query results received by Sink may contain constraint factors in which only part of the data items satisfy the query range. Moreover, in CSRQ, the result received by Sink contains at most 2(τ − 1) unsatisfied data items, and in general τ > 2, which is slightly more than in SafeQBloom. Therefore, the false positive rate of CSRQ is slightly higher than that of SafeQBloom. Nevertheless, the false positive rate of CSRQ is very low, and it decreases as the network size increases, gradually approaching 0. Thus, it has little effect on the performance of range queries.
The experimental results show that CSRQ achieves lower communication cost than current query protocols such as SafeQBloom, QuerySec, ESRQ and SecRQ.
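The false positive rate defined above can be computed as in the following sketch. It assumes that Sink has already decrypted the returned data items into plain numeric values; the function names and sample values are hypothetical and are not drawn from the Intel Lab dataset.

```python
def false_positive_rate(received, low, high):
    """Ratio of received items outside [low, high] to received items inside it."""
    satisfying = sum(1 for value in received if low <= value <= high)
    unsatisfying = len(received) - satisfying
    if satisfying == 0:
        raise ValueError("no received item satisfies the query range")
    return unsatisfying / satisfying


def max_unsatisfied_items(tau):
    """Upper bound 2(tau - 1) on the unsatisfied items in a CSRQ result."""
    return 2 * (tau - 1)


# Hypothetical decrypted result for the query range [20, 30].
rate = false_positive_rate([18, 21, 22, 25, 31], low=20, high=30)
bound = max_unsatisfied_items(tau=3)
print(rate, bound)
```

Because the number of unsatisfied items is bounded by 2(τ − 1) while the number of satisfying items grows with the amount of data, the ratio shrinks as the network size increases, which is consistent with the trend described for Figure 8.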

Conclusions
In this paper, we propose CSRQ, a novel efficient protocol for processing range queries in two-tiered WSNs that preserves both privacy and integrity. To preserve the privacy of data items in the network, we encrypt the data items collected by the sensor nodes through the encoding mechanisms. To preserve the integrity of the query range and result, we present a novel encrypted constraint chain scheme that links the data items collected by a sensor node to each other, which allows Sink to verify integrity by checking the adjacent relations embedded in the encrypted constraint chains. The results of our experiments show that CSRQ is more efficient than current query protocols.