A Key-Policy Searchable Attribute-Based Encryption Scheme for Efficient Keyword Search and Fine-Grained Access Control over Encrypted Data

Abstract: Attribute-based encryption is a promising technique that achieves flexible and fine-grained access control over encrypted data, which makes it well suited to secure data sharing environments such as cloud computing. However, traditional attribute-based encryption does not provide efficient keyword-based search over encrypted data, which weakens its usefulness, since search is usually the most important way to quickly obtain data of interest from a large-scale dataset. To address this problem, attribute-based encryption with keyword search (ABKS) combines attribute-based encryption with searchable encryption to achieve fine-grained data access control and keyword-based search simultaneously. Recently, several ABKS schemes have been constructed for data access control and keyword search in secure cloud storage systems. Nonetheless, each of these schemes has some defects, such as impractical computation overhead or insufficiently expressive access policies. To overcome these limitations, in this paper, we design a Key-Policy Searchable Attribute-Based Encryption Scheme (KPSABES) based on the full-blown key-policy attribute-based encryption proposed by Vipul Goyal et al. Our scheme not only inherits all advantages of that scheme but also achieves efficient and secure keyword search over encrypted data. We provide detailed performance analyses and security proofs for our scheme. Extensive experiments demonstrate that our proposed scheme is superior in many aspects to similar work.


Motivation
The emerging cloud computing paradigm motivates organizations and individuals to outsource their private data to the cloud server to save local IT construction and maintenance costs. Despite the numerous advantages of cloud computing, such as multi-tenancy, rapid deployment, reliability, and flexibility, data security and privacy remain a knotty problem that hinders its wide adoption, since the data are out of the control of data owners once outsourced to the remote cloud center [1]. An intuitive way to protect data is encryption [2,3]. However, traditional encryption techniques block fundamental uses of data such as search and computation. Of course, a data user can download all ciphertexts from the cloud server and decrypt them to obtain the data of interest. However, this straightforward method incurs impractical communication and computation overhead in the cloud computing environment, where the data are big.
We were first motivated by the full-blown key-policy attribute-based encryption (KP-ABE) scheme and its application example proposed by Vipul Goyal et al. [4]. Essentially, the scheme in [4] further develops the first attribute-based encryption scheme by allowing for a wide variety of access structures rather than only a fixed threshold gate. Therefore, more flexible and fine-grained access control over encrypted data can be achieved in a secure data sharing setting. In brief, in their construction, each data user's key is associated with a specified tree-access structure supporting AND, OR, and threshold gates, and a message is encrypted under a set of attributes. A user's key can recover the message from the corresponding ciphertext if and only if the attributes in the ciphertext satisfy the access structure in the key. In addition, the authors demonstrated an important application of their KP-ABE scheme in a secure forensic analysis system, where all audit logs are encrypted with their KP-ABE scheme. More specifically, in that system, a forensic analyst is issued a secret key associated with a specified access structure, and the audit log records are encrypted under a set of attributes. As a result, even if the forensic analyst downloads all ciphertexts, he/she can decrypt only those encrypted audit log records whose attributes satisfy the access structure in his/her key. Thus, access control over encrypted data can be achieved in a fine-grained manner. Nevertheless, in practice, the forensic analyst may only want to obtain some audit logs of interest. If the forensic analyst can first retrieve the encrypted audit logs that are both within his/her access privileges and of interest and then decrypt them, the communication cost and decryption time could be greatly reduced. Motivated by this practical requirement, in this paper, we design a key-policy searchable attribute-based encryption scheme, named KPSABES, based on Goyal et al.'s key-policy attribute-based encryption scheme [4]. According to our design, KPSABES achieves efficient keyword-based search and fine-grained access control over encrypted data simultaneously. Next, to further illustrate our motivation, we review the related work on the following three topics: searchable encryption, attribute-based encryption, and searchable attribute-based encryption.

Searchable Encryption
Searchable encryption (SE) allows a server to perform search over encrypted data according to a search token submitted by a data user. Song et al. [5] proposed the first practical SE scheme, with linear search complexity. Curtmola et al. [6] used an inverted index structure to construct a more efficient SE scheme that achieves sub-linear search complexity. Kamara et al. [7] introduced the dynamic SE technique, which supports secure and efficient file updating. Recently, dynamic SE schemes were further investigated in [8–10] to achieve better efficiency and security. To allow verification of the correctness and completeness of search results, verifiable SE was introduced by Kurosawa and Ohtaki [11] against an unreliable server [12], which may return incorrect or incomplete search results to the data user. Using the emerging blockchain technique, decentralized SE schemes were designed in [13,14], which ensure that the data user always receives correct and complete search results. Cloud computing promotes the development of data outsourcing services, and SE schemes over encrypted outsourced cloud data are extensively studied in [15–21]. These works mainly focus on multi-keyword ranked search, fuzzy search, personalized search, etc. Since the above schemes are implemented in the symmetric cryptosystem, they are referred to as symmetric searchable encryption (SSE). The first public-key based SE scheme was proposed by Boneh et al. [22]. Later, keyword-based conjunctive search and range search were also explored in the public-key setting in [23–26]. However, these traditional searchable encryption schemes do not consider the search permissions of data users and cannot achieve flexible access control on encrypted data.

Attribute Based Encryption
Attribute-based encryption (ABE) is a cryptographic primitive that enforces fine-grained data access control via cryptographic means. Sahai and Waters designed the first attribute-based encryption scheme, called Fuzzy Identity-Based Encryption (FIBE) [27]. In FIBE, a message is encrypted with a set of attributes $\omega$, and a private key is associated with a set of attributes $\omega'$. The private key can decrypt the message if and only if the two attribute sets satisfy $|\omega \cap \omega'| \geq d$, where $d$ is a preset fixed threshold. To make the access control structure in ABE more expressive and flexible, ABE was further developed in [4,28–34]. These schemes can be generally classified into two categories: key-policy attribute-based encryption (KP-ABE) and ciphertext-policy attribute-based encryption (CP-ABE). In KP-ABE [4,32–34], a message is encrypted with a set of attributes and the user's private key is associated with an access policy. If the attributes in the ciphertext satisfy the access policy in the private key, the key can recover the original message from the ciphertext. In contrast, in CP-ABE schemes [28–31], a message is encrypted with an access policy and the user's key is associated with his/her attributes. Decryption succeeds if and only if the user's attributes satisfy the access policy in the ciphertext. Recently, some anonymous attribute-based encryption schemes were designed in [35–38], where the attribute information is hidden to avoid revealing private information. Although the above schemes provide flexible and fine-grained access control on encrypted data, they do not support efficient search over it.

Searchable Attribute-Based Encryption
To achieve effective keyword-based search and data access control over encrypted data, several searchable attribute-based encryption schemes have been proposed [39–48]. These schemes combine searchable encryption and attribute-based encryption techniques. Nonetheless, each of these schemes has some limitations. Specifically, the schemes in [39,40] use an access structure that only allows AND policies, and the schemes in [41–44,48] consider AND and OR gates of attributes but do not support flexible threshold gates. The authors of [45] proposed an efficient ciphertext-policy searchable attribute-based encryption scheme that supports AND, OR, and threshold gates. However, the query keyword in that scheme is vulnerable to the chosen-plaintext attack, since the search token generation algorithm is a deterministic encryption. In addition, the schemes in [46,47] use composite-order groups to set up the pairing environment, which incurs impractical search complexity [43].
In this paper, we design a key-policy searchable attribute-based encryption scheme based on Goyal et al.'s scheme [4], which works efficiently in a prime-order group and flexibly supports AND, OR, and threshold gates. Our proposed scheme is applicable to real applications in the encrypted data sharing setting where both fine-grained data access control and efficient keyword search are required, such as the audit log application mentioned at the beginning of this section.
It should be emphasized that Zheng et al. [49] also constructed a similar key-policy attribute-based keyword search scheme (KP-ABKS) based on the scheme in [4]. However, KP-ABKS has to introduce extra ciphertext components and time-consuming operations on top of the scheme in [4] to achieve data searching. By careful design, our scheme does not need any additional ciphertext components or expensive operations to support data searching. The performance analyses and experimental results further show that our proposed scheme is superior in many aspects to theirs, including the search complexity, which is usually regarded as the core evaluation index for a search system.

Contributions
In this paper, we make three key contributions: (1) We propose KPSABES, a key-policy searchable attribute-based encryption scheme. KPSABES enables efficient keyword-based search and flexible, fine-grained access control over encrypted data by leveraging searchable encryption and key-policy attribute-based encryption. (2) We analyze the complexity of the algorithms in KPSABES and in the similar scheme KP-ABKS [49]. In addition, we provide thorough security proofs for our proposed scheme. (3) We implemented our proposed KPSABES and the similar work KP-ABKS [49], and made thorough performance comparisons between these two schemes through extensive experiments. The results demonstrate that KPSABES is superior in many aspects to KP-ABKS.
The rest of this paper is organized as follows. We introduce the background knowledge used to construct KPSABES in Section 2. Section 3 presents the system model and the security model for our proposed problem. We give the formal algorithm definition and implementation details in Section 4. The performance analysis and security proof of our scheme are provided in Sections 5 and 6, respectively. We present the experimental evaluation of our scheme in Section 7. Section 8 concludes this paper.

Background Knowledge
In this section, we introduce the various background knowledge that is used to construct our proposed KPSABES scheme.

Bilinear Map
Let $G_1$ and $G_2$ be two multiplicative cyclic groups of prime order $p$, and let $g$ be a generator of $G_1$. A bilinear map $e : G_1 \times G_1 \to G_2$ has the following properties: (1) Computability: for all $u, v \in G_1$, there exists an efficient polynomial-time algorithm to compute $e(u, v) \in G_2$. (2) Bilinearity: for all $u, v \in G_1$ and $x, y \in \mathbb{Z}_p^*$, $e(u^x, v^y) = e(u, v)^{xy}$ holds. (3) Non-degeneracy: $e(g, g) \neq 1$, where $g$ is a generator of $G_1$.
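The three properties can be checked numerically on a deliberately insecure toy map. The following sketch is an illustration only and is not a cryptographic pairing: it assumes that an element $g^u$ of $G_1$ is stored as its exponent $u$ modulo a small prime $p = 1019$, and that $G_2$ is the order-$p$ subgroup of $\mathbb{Z}_q^*$ with $q = 2p + 1 = 2039$. The names `P`, `Q`, `GT`, `g1_exp`, and `e` are hypothetical.

```python
# Toy bilinear map for illustration only: discrete logs in G1 are trivial here.
P, Q = 1019, 2039        # assumed small primes with Q = 2 * P + 1
GT = 4                   # 4 = 2^2 has order P in Z_Q^*, so it generates G2

def g1_exp(u, x):
    """Exponentiation in G1: (g^u)^x is stored as the exponent u * x mod P."""
    return u * x % P

def e(u, v):
    """Toy pairing e(g^u, g^v) = GT^(u * v)."""
    return pow(GT, u * v % P, Q)

g = 1                    # the generator g = g^1
u, v, x, y = 17, 23, 5, 7

# Bilinearity: e(u^x, v^y) = e(u, v)^(x * y)
assert e(g1_exp(u, x), g1_exp(v, y)) == pow(e(u, v), x * y, Q)
# Non-degeneracy: e(g, g) is not the identity element of G2
assert e(g, g) != 1
```

A real deployment would use a pairing-friendly elliptic curve; the toy map only makes the algebra of the definitions concrete.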

Access Tree
The access tree is an expressive access structure that has been widely used in attribute-based encryption. In an access tree denoted by $\mathcal{T}$, each non-leaf node can represent an AND gate, an OR gate, or a threshold gate according to its threshold value. Generally, for any node $x \in \mathcal{T}$, we use $num_x$ and $k_x$ to denote the number of children and the threshold value of $x$, respectively. For any non-leaf node $x$, if $k_x = 1$, $x$ represents an OR gate; if $k_x = num_x$, it represents an AND gate; and if $1 < k_x < num_x$, $x$ is a threshold gate. For any leaf node $x$, we define $k_x = 1$ and $num_x = 0$.
For ease of describing an access tree, we define the following notations.
(1) parent(x): The parent node of node x.
(2) index(x): The ordering number of node x with respect to its parent node parent(x). For example, if x is the first child (from left to right) of node parent(x), then index(x) = 1, and so forth.
(3) attr(x): The attribute associated with the leaf node x in an attribute-based encryption system.
To help understand the access tree and related notations, we give a concrete example, as shown in Figure 1, where the number in each node denotes the threshold value of the node. According to our definition, we have, for example, index(g) = 3. In the access tree, nodes $R$ and $N_1$ are AND gates, $N_2$ is an OR gate, and $N_3$ is a threshold gate.
Satisfying an access tree. Given a set of attributes $A$ and an access tree $\mathcal{T}$ with root node $r$, let $\mathcal{T}_x$ denote the subtree of $\mathcal{T}$ rooted at node $x$. The value $\mathcal{T}_x(A)$ is computed recursively as follows. If $x$ is a non-leaf node, evaluate $\mathcal{T}_{x'}(A)$ for all children $x'$ of node $x$; $\mathcal{T}_x(A)$ returns 1 if and only if at least $k_x$ children return 1. If $x$ is a leaf node, $\mathcal{T}_x(A)$ returns 1 if and only if $attr(x) \in A$. Thus, according to the above recursive computation, if $A$ satisfies $\mathcal{T}$, then $\mathcal{T}_r(A) = 1$; otherwise $\mathcal{T}_r(A) = 0$. For instance, the attribute sets {aa, ab, ac, ae, ag} and {aa, ab, ad, af, ag} satisfy the above access tree, but none of the attribute sets {aa, ac, ae}, {aa, ab, ae, af}, and {aa, ab, ac, ae} satisfies it, where attr(a) = aa, attr(b) = ab, attr(c) = ac, attr(d) = ad, attr(e) = ae, attr(f) = af, attr(g) = ag.
Figure 1. An example of the access tree.
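The recursive satisfaction test can be sketched in a few lines of Python. The tree below is a hypothetical example chosen for illustration and is not the tree of Figure 1; the class `Node` and the helpers `leaf`, `gate`, `AND`, `OR`, and `satisfies` are assumed names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Node:
    threshold: int = 1                              # k_x (leaves use k_x = 1)
    children: List["Node"] = field(default_factory=list)
    attr: Optional[str] = None                      # attr(x), set only for leaves

def leaf(attr: str) -> Node:
    return Node(attr=attr)

def gate(k: int, *children: Node) -> Node:
    return Node(threshold=k, children=list(children))

def AND(*children: Node) -> Node:                   # k_x = num_x
    return gate(len(children), *children)

def OR(*children: Node) -> Node:                    # k_x = 1
    return gate(1, *children)

def satisfies(x: Node, attrs: Set[str]) -> bool:
    """Return True iff the subtree rooted at x is satisfied by attrs."""
    if not x.children:                              # leaf: attr(x) must be in A
        return x.attr in attrs
    satisfied = sum(satisfies(c, attrs) for c in x.children)
    return satisfied >= x.threshold                 # at least k_x children return 1

# Hypothetical policy: (aa AND ab) AND 2-of-3((ac OR ad), ae, ag)
tree = AND(AND(leaf("aa"), leaf("ab")),
           gate(2, OR(leaf("ac"), leaf("ad")), leaf("ae"), leaf("ag")))

assert satisfies(tree, {"aa", "ab", "ac", "ag"})
assert not satisfies(tree, {"aa", "ac", "ae"})      # ab is missing
assert not satisfies(tree, {"aa", "ab", "ae"})      # only 1 of 3 under the threshold gate
```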

Complexity Assumption
(1) Decisional Diffie-Hellman (DDH) assumption. Let $g$ be a generator of a group $G$ of order $p$, and choose three elements $a, b, c \in \mathbb{Z}_p^*$ uniformly at random. The DDH assumption states that no probabilistic polynomial-time adversary $\mathcal{A}$ can distinguish the tuple $(g, g^a, g^b, g^c)$ from the tuple $(g, g^a, g^b, g^{ab})$ with non-negligible advantage.
(2) Decisional Bilinear Diffie-Hellman (DBDH) assumption. Let $G_1$ and $G_2$ be two multiplicative groups of order $p$, where $g$ is a generator of $G_1$ and $e$ denotes a bilinear map, and choose four elements $a, b, c, z \in \mathbb{Z}_p^*$ uniformly at random. The DBDH assumption states that no probabilistic polynomial-time adversary $\mathcal{A}$ can distinguish the tuple $(g, g^a, g^b, g^c, e(g, g)^z)$ from the tuple $(g, g^a, g^b, g^c, e(g, g)^{abc})$ with non-negligible advantage.

System Model and Security Model
In this section, we present the system model and security model for our KPSABES.

System Model
The system model of KPSABES consists of three entities: the data owner, the data users, and the server (Figure 2). To ensure confidentiality, the data owner encrypts the data files using any semantically secure symmetric encryption scheme such as AES. Meanwhile, the data owner extracts index keywords from the data files and defines a set of attributes for each index keyword according to its access permission. Then, each index keyword is encrypted under the corresponding attribute set. Finally, the encrypted data files along with the encrypted index keywords are outsourced to the server (e.g., the cloud server). When a data user wishes to join the system, the data owner first defines an access tree for the data user according to his/her search permission in the system. Then, the data owner uses the access tree to construct a private key for the data user. Finally, the generated private key, along with the symmetric key used to encrypt the data files, is issued to the data user via secure communication channels. The data user can then use the authorized private key to encrypt a query keyword of interest to generate a search token, which is submitted to the server. Upon receiving the search token, the server performs search over the encrypted index keywords and returns the search results to the data user, who uses the symmetric key to decrypt the search results locally. Throughout the process, the server learns nothing about the index keywords or the query keyword. In addition, all files and index keywords are organized in an inverted index structure to achieve sub-linear search complexity.
In our system, we introduce attribute-based encryption to achieve authorized keyword search in a fine-grained manner. In essence, the data user has the search permission for an index keyword if and only if the attribute set of the index keyword satisfies the access tree in the data user's key. For example, in Figure 2, given two index keywords $w_i$ and $w_j$, which are encrypted under the attribute sets {Computer, Professor} and {Math, Doctor}, respectively, the data user has access to the index keyword $w_i$ but not $w_j$, where $[w_i]$ and $[w_j]$ denote the ciphertexts of $w_i$ and $w_j$.

Security Model
It is well recognized that the security goal of searchable encryption is to prevent the adversary from obtaining the underlying keyword information from the encrypted index keyword and the search token. Usually, the server is considered an "honest but curious" adversary, and both the data owner and the data users are trusted [50].
We use the selective security against chosen-keyword attack game and the adaptively chosen-plaintext attack game to formalize the security properties of the index keyword and the search token, respectively. We provide the formal security definitions for our scheme through the following games between a challenger $\mathcal{C}$ and a probabilistic polynomial-time adversary $\mathcal{A}$.
Selectively Chosen-Keyword Attack Game:
Setup. The challenger $\mathcal{C}$ generates a public parameter and a master key and sends the public parameter to the adversary $\mathcal{A}$. $\mathcal{A}$ specifies a set of attributes $A^*$ that he wishes to be challenged upon and submits it to $\mathcal{C}$.
Phase 1. The adversary $\mathcal{A}$ can issue private key queries many times corresponding to access trees $\mathcal{T}_1, \ldots, \mathcal{T}_q$. The only restriction is that none of these access trees is satisfied by the challenge attribute set $A^*$. $\mathcal{A}$ can use the private keys returned by $\mathcal{C}$ to encrypt query keywords and generate legal search tokens.
Challenge. The adversary $\mathcal{A}$ outputs two keywords $w_0$ and $w_1$ and sends them to $\mathcal{C}$. $\mathcal{C}$ chooses a fair bit $b \in \{0, 1\}$ and encrypts $w_b$ with the challenge attribute set $A^*$. The ciphertext is sent to the adversary.
Phase 2. $\mathcal{A}$ continues to ask the challenger $\mathcal{C}$ for private keys corresponding to access trees $\mathcal{T}_{q+1}, \ldots, \mathcal{T}_{q+m}$. The same restriction as in Phase 1 has to be satisfied in this phase.
Guess. The adversary $\mathcal{A}$ outputs a guess $b'$ of $b$.
We define the advantage of the probabilistic polynomial-time adversary $\mathcal{A}$ in winning the above game as $|\Pr[b' = b] - \frac{1}{2}|$.
Definition 1. A searchable attribute-based encryption scheme is selectively secure against chosen-keyword attack if the above advantage is negligible. In other words, the index keyword encryption construction does not reveal any plaintext information of the keyword under the selectively chosen-keyword attack model.
Adaptively Chosen-Plaintext Attack Game:
Setup. The challenger $\mathcal{C}$ generates a public parameter and a master key and sends the public parameter to the adversary $\mathcal{A}$.
Phase 1. The adversary $\mathcal{A}$ is allowed to adaptively query for the ciphertext of any search keyword. For any queried search keyword $\tilde{w}$, $\mathcal{C}$ returns the ciphertext $ST_{\mathcal{C}}(\tilde{w})$ to $\mathcal{A}$.
Challenge. The adversary $\mathcal{A}$ outputs two query keywords $w_0$ and $w_1$ and sends them to $\mathcal{C}$. $\mathcal{C}$ chooses a fair bit $b \in \{0, 1\}$ and encrypts $w_b$ with the challenge attribute set $A^*$. The ciphertext is sent to the adversary.
Phase 2. $\mathcal{A}$ continues to ask the challenger $\mathcal{C}$ for private keys and encrypts any query keyword $\tilde{w}$ (note that $\tilde{w}$ can be $w_0$ or $w_1$) to generate a legal search token.
Guess. The adversary $\mathcal{A}$ outputs a guess $b'$ of $b$.
We define the advantage of the probabilistic polynomial-time adversary $\mathcal{A}$ in winning the above game as $|\Pr[b' = b] - \frac{1}{2}|$.
Definition 2. A searchable attribute-based encryption scheme is semantically secure against the adaptively chosen-plaintext attack if the above advantage is negligible. In other words, the search token does not reveal any plaintext information of the query keyword under the adaptively chosen-plaintext attack model.

Definition and Construction of KPSABES
In this section, we give the definition and describe the construction for our proposed KPSABES.

Definition
Our proposed KPSABES is composed of the following five polynomial-time algorithms.
• Setup($1^{\lambda}$) → (PK, MK). The data owner runs the Setup algorithm to output the system public parameter PK and the master private key MK, where the security parameter $\lambda$ is the input of this algorithm.
• Keygen(PK, MK, $\mathcal{T}_u$) → $K_u$. The data owner runs the Keygen algorithm to generate the private key $K_u$ for a data user $u$, where PK and MK are the system public parameter and the master private key, respectively, and $\mathcal{T}_u$ is an access tree that defines $u$'s search permission.
• Enc(PK, $w$, $A$) → $I_w$. The data owner runs the Enc algorithm to encrypt an index keyword $w$ and outputs $w$'s ciphertext $I_w$, where PK is the system public parameter, $w$ is an index keyword, and $A$ is a set of attributes that defines $w$'s access permission.
• Searchtoken(PK, $K_u$, $\tilde{w}$) → $ST_u(\tilde{w})$. The data user $u$ runs the Searchtoken algorithm to encrypt a query keyword $\tilde{w}$ and outputs $\tilde{w}$'s search token $ST_u(\tilde{w})$, where PK is the public parameter, $K_u$ is the data user $u$'s private key, and $\tilde{w}$ denotes any query keyword.
• Search(PK, $I_w$, $ST_u(\tilde{w})$) → 1/0. The server runs the Search algorithm to perform matching between the encrypted index keyword $I_w$ and the search token $ST_u(\tilde{w})$ with the help of the system public parameter PK. The algorithm outputs 1 if the data user $u$ has access to the index keyword $w$ and $\tilde{w}$ is identical to $w$; otherwise, it outputs 0.

Construction
In this subsection, we formally design our KPSABES and provide the concrete implementation of each algorithm of KPSABES.

Setup
Given a sufficiently large security parameter, the Setup algorithm establishes the running environment of KPSABES and generates the system public parameter and the master key. More specifically, the algorithm first generates two multiplicative cyclic groups $G_1$ and $G_2$ of prime order $p$, and creates a bilinear map $e : G_1 \times G_1 \to G_2$ and a hash function $H : \{0, 1\}^* \to \mathbb{Z}_p^*$. Let $g$ denote a generator of $G_1$. Then, the algorithm chooses a random value $\alpha \in \mathbb{Z}_p^*$, sets $g_1 = g^{\alpha}$, and further computes $e(g_1, g)$. Define the Lagrange coefficient
$$\Delta_{i,S}(x) = \prod_{j \in S,\, j \neq i} \frac{x - j}{i - j},$$
where $S$ denotes a set of elements in $\mathbb{Z}_p^*$ and $i, j \in \mathbb{Z}_p^*$. In addition, let $n$ be the maximum size of the attribute set in the system and $N$ be the set $\{1, 2, \ldots, n + 1\}$. The algorithm further randomly chooses $t_1, t_2, \ldots, t_{n+1}$ from group $G_1$ and defines a function $T$ as:
$$T(X) = g^{X^n} \prod_{i=1}^{n+1} t_i^{\Delta_{i,N}(X)}.$$
As in [4], the function $T$ can be viewed as the function $g^{X^n} g^{h(X)}$ for some polynomial $h$ of degree $n$. Finally, the outputs of the Setup algorithm are denoted as:
$$PK = \left(G_1, G_2, e, g_1, g, e(g_1, g), (t_1, \ldots, t_{n+1}), H, N, T, \Delta_{i,S}(x)\right), \quad MK = \alpha.$$
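To make the role of the Lagrange coefficient concrete, the following sketch works with discrete logarithms instead of group elements. It assumes that $T$ has the form used in [4] with $g$ as the base, i.e., $T(X) = g^{X^n} \prod_{i=1}^{n+1} t_i^{\Delta_{i,N}(X)}$, that $t_i = g^{\tau_i}$, and that exponents live modulo a toy prime $P = 2^{61} - 1$. The names `lagrange`, `T_exponent`, and `taus` are hypothetical.

```python
P = 2**61 - 1   # assumed toy prime group order (not a recommended parameter)

def lagrange(i, S, x):
    """Delta_{i,S}(x) = prod over j in S, j != i, of (x - j) / (i - j), mod P."""
    num = den = 1
    for j in S:
        if j != i:
            num = num * (x - j) % P
            den = den * (i - j) % P
    return num * pow(den, -1, P) % P      # modular inverse (Python 3.8+)

def T_exponent(X, taus, n):
    """Discrete log (base g) of T(X), where t_i = g^taus[i] for i = 1..n+1."""
    N = range(1, n + 2)
    h_X = sum(taus[i] * lagrange(i, N, X) for i in N) % P
    return (pow(X, n, P) + h_X) % P

# Pick a degree-n polynomial h (illustrative coefficients) and set t_i = g^{h(i)}.
n = 3
h = lambda X: (5 + 7 * X + 11 * X**2 + 13 * X**3) % P
taus = {i: h(i) for i in range(1, n + 2)}

# Interpolating in the exponent reproduces h, so T(X) = g^{X^n} * g^{h(X)}.
for X in (1, 4, 10, 12345):
    assert T_exponent(X, taus, n) == (pow(X, n, P) + h(X)) % P
```

Because $n + 1$ points determine a polynomial of degree $n$, anyone holding $t_1, \ldots, t_{n+1}$ can evaluate $T$ at any attribute without knowing $h$.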

Keygen
The Keygen algorithm outputs a private key for a data user $u$, who can use the private key to generate a legal search token for a query keyword of interest. Before invoking the Keygen algorithm, the data owner defines an access tree $\mathcal{T}_u$ with root node $r$ for $u$ according to $u$'s system role. Next, on input the public parameter PK, the master key MK, and $\mathcal{T}_u$, the algorithm proceeds as follows. For each non-leaf node $x$ in $\mathcal{T}_u$ (in a top-down manner), the algorithm first chooses a polynomial $q_x$ and sets its degree $d_x$ to $d_x = k_x - 1$, where $k_x$ is the threshold value of node $x$ in $\mathcal{T}_u$. For the root node $r$, set $q_r(0) = MK = y$ (the secret exponent chosen in Setup, so that $g_1 = g^y$) and choose $d_r$ random values for other points of $q_r$ to define it completely. For any other non-leaf node $x$, set $q_x(0) = q_{parent(x)}(index(x))$ and choose $d_x$ random values for other points of $q_x$ to define it completely. For a leaf node $x$, the value $q_x(0)$ is defined in the same way, i.e., $q_x(0) = q_{parent(x)}(index(x))$. For each leaf node $x$ in $\mathcal{T}_u$, choose a random value $r_x$ and compute the key components $G_1 = g^{r_x}$, $G_2 = g^{q_x(0)}$, and $T = T(attr(x))^{r_x}$ (these symbols denote per-leaf key components and should not be confused with the groups $G_1$, $G_2$ or the function $T(\cdot)$). Let $L$ be the set of leaf nodes of the access tree $\mathcal{T}_u$. Finally, the output of the Keygen algorithm is denoted as:
$$K_u = \left\{\left(G_1 = g^{r_x},\ G_2 = g^{q_x(0)},\ T = T(attr(x))^{r_x}\right)\right\}_{x \in L},$$
which is transmitted to the data user $u$ via secure channels.
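The top-down polynomial assignment is a nested form of threshold secret sharing. The following sketch shows it on a hypothetical two-level tree (an AND root whose first child is a 2-of-3 threshold gate), working directly with the shares $q_x(\cdot)$ rather than with group elements. The toy modulus `P` and the helpers `random_poly`, `evaluate`, `lagrange_at_zero`, and `recover` are assumed names, and random coefficients are drawn instead of random points, which yields the same distribution of polynomials.

```python
import secrets

P = 2**61 - 1   # assumed toy prime group order

def random_poly(constant, degree):
    """Coefficients of a random polynomial q with q(0) = constant."""
    return [constant] + [secrets.randbelow(P) for _ in range(degree)]

def evaluate(coeffs, z):
    return sum(c * pow(z, d, P) for d, c in enumerate(coeffs)) % P

def lagrange_at_zero(i, S):
    """Delta_{i,S}(0) mod P."""
    num = den = 1
    for j in S:
        if j != i:
            num = num * (0 - j) % P
            den = den * (i - j) % P
    return num * pow(den, -1, P) % P

def recover(shares):
    """shares: {index: q(index)} with exactly k_x entries; returns q(0)."""
    S = list(shares)
    return sum(v * lagrange_at_zero(i, S) for i, v in shares.items()) % P

y = secrets.randbelow(P - 1) + 1          # master secret, q_r(0) = y
q_r = random_poly(y, degree=1)            # root: AND gate, k_r = 2, d_r = 1
q_1, q_2 = evaluate(q_r, 1), evaluate(q_r, 2)

# Child 1 is a 2-of-3 threshold gate x with q_x(0) = q_r(index(x)) = q_r(1).
q_x = random_poly(q_1, degree=1)
leaf_shares = {i: evaluate(q_x, i) for i in (1, 2, 3)}

# Any k_x = 2 leaf shares recover q_x(0); both root shares recover y.
assert recover({1: leaf_shares[1], 3: leaf_shares[3]}) == q_1
assert recover({1: q_1, 2: q_2}) == y
```

In the scheme itself the leaf shares only ever appear in the exponent, as $g^{q_x(0)}$, so the same interpolation is carried out on pairing values during Search.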

Enc
The Enc algorithm outputs the ciphertext of an index keyword $w$. Before running the algorithm, the data owner specifies a set of attributes $A$ for the index keyword $w$ according to $w$'s access permission in the system. On input the public parameter PK, the index keyword $w$, and the corresponding attribute set $A$, the algorithm proceeds as follows. It first chooses a random value $s \in \mathbb{Z}_p^*$ and computes $I = e(g_1, g)^{sH(w)}$ and $I' = g^s$. Then, for each attribute $a \in A$, the algorithm further computes $I_a = T(a)^s$. Finally, the output of the Enc algorithm is denoted as:
$$I_w = \left(A,\ I = e(g_1, g)^{sH(w)},\ I' = g^s,\ \{I_a = T(a)^s\}_{a \in A}\right).$$

Searchtoken
The Searchtoken algorithm outputs the search token $ST_u(\tilde{w})$ for a query keyword $\tilde{w}$ that the data user $u$ is interested in. On input the public parameter PK, the data user $u$'s key $K_u$, and the search keyword $\tilde{w}$, the algorithm proceeds as follows. For each key component in $K_u$, i.e., for each leaf node $x \in L$, it chooses a random value $\lambda_x \in \mathbb{Z}_p^*$ and encrypts the search keyword $\tilde{w}$ as $G_1^{H(\tilde{w})\lambda_x}$ and $G_2^{H(\tilde{w})} \cdot T^{H(\tilde{w})\lambda_x}$. Finally, the output of the Searchtoken algorithm is denoted as:
$$ST_u(\tilde{w}) = \left\{\left(G_1^{H(\tilde{w})\lambda_x},\ G_2^{H(\tilde{w})} \cdot T^{H(\tilde{w})\lambda_x}\right)\right\}_{x \in L}.$$

Search
Given an encrypted index keyword $I_w$ and a search token $ST_u(\tilde{w})$ submitted by the data user $u$, the server invokes the Search algorithm to perform matching between $I_w$ and $ST_u(\tilde{w})$. The algorithm outputs 1 if and only if the following two conditions hold simultaneously: (1) the data user $u$ has access to the index keyword $w$; and (2) the search keyword $\tilde{w}$ is identical to $w$. During the whole search process, the server learns nothing about the index keyword or the search keyword. We now give the concrete implementation of the Search algorithm. We first define a recursive algorithm SearchNode($I_w$, $ST_u(\tilde{w})$, $x$), where $x$ denotes a node in the access tree (in our scheme, the access tree is embedded in $u$'s private key, which is used to encrypt the search keyword $\tilde{w}$, and is therefore carried over into the corresponding search token). It outputs an element of $G_2$ or $\perp$ as follows. For each leaf node $x$ in the access tree, let $a = attr(x)$. If $a \in A$, then:
$$\begin{aligned}
\text{SearchNode}(I_w, ST_u(\tilde{w}), x) &= \frac{e\left(G_2^{H(\tilde{w})} \cdot T^{H(\tilde{w})\lambda_x},\ g^s\right)}{e\left(G_1^{H(\tilde{w})\lambda_x},\ T(a)^s\right)} \\
&= \frac{e\left(g^{q_x(0)H(\tilde{w})} \cdot T(a)^{r_x H(\tilde{w})\lambda_x},\ g^s\right)}{e\left(g^{r_x H(\tilde{w})\lambda_x},\ T(a)^s\right)} \\
&= \frac{e\left(g^{q_x(0)H(\tilde{w})},\ g^s\right) \cdot e(T(a), g)^{r_x H(\tilde{w})\lambda_x s}}{e(T(a), g)^{r_x H(\tilde{w})\lambda_x s}} \\
&= e(g, g)^{s q_x(0) H(\tilde{w})}.
\end{aligned}$$
If $a = attr(x) \notin A$, we define SearchNode($I_w$, $ST_u(\tilde{w})$, $x$) $= \perp$. On the other hand, for each non-leaf node $x$, the algorithm SearchNode($I_w$, $ST_u(\tilde{w})$, $x$) proceeds as follows: for all nodes $z$ that are children of $x$, it runs SearchNode($I_w$, $ST_u(\tilde{w})$, $z$) and stores the outputs as $F_z$. Let $S_x$ be an arbitrary $k_x$-sized set of child nodes $z$ such that $F_z \neq \perp$. If no such set exists, then the node is not satisfied, the function returns $\perp$, and the Search algorithm returns 0, which means that the data user $u$ does not have the search permission for the index keyword $w$.
Otherwise, with $i = index(z)$ and $S'_x = \{index(z) : z \in S_x\}$, we compute:
$$\begin{aligned}
F_x &= \prod_{z \in S_x} F_z^{\Delta_{i,S'_x}(0)} \\
&= \prod_{z \in S_x} \left(e(g, g)^{s\, q_z(0) H(\tilde{w})}\right)^{\Delta_{i,S'_x}(0)} \\
&= \prod_{z \in S_x} e(g, g)^{s\, q_x(i) H(\tilde{w})\, \Delta_{i,S'_x}(0)} \\
&= e(g, g)^{s q_x(0) H(\tilde{w})} \quad \text{(using polynomial interpolation)} \qquad (10)
\end{aligned}$$
Therefore, through the above recursive computation, we can observe that, if the attribute set $A$ in $I_w$ satisfies the access tree embedded in the search token, then, for the root node $r$, SearchNode($I_w$, $ST_u(\tilde{w})$, $r$) $= e(g, g)^{s q_r(0) H(\tilde{w})}$, which means that the data user $u$ is allowed to access the index keyword $w$. Finally, the matching between the index keyword $w$ and the search keyword $\tilde{w}$ is performed through the following comparison:
$$\text{SearchNode}(I_w, ST_u(\tilde{w}), r) \overset{?}{=} I \qquad (11)$$
If Equation (11) holds, then $\tilde{w} = w$ and the Search algorithm returns 1; otherwise, $\tilde{w} \neq w$ and the Search algorithm returns 0. We can verify the matching correctness for $\tilde{w} = w$ by the following simple derivation:
$$\text{SearchNode}(I_w, ST_u(\tilde{w}), r) = e(g, g)^{s q_r(0) H(\tilde{w})} = e(g, g)^{s y H(\tilde{w})} = e(g^y, g)^{s H(\tilde{w})} = e(g_1, g)^{s H(w)} = I \qquad (12)$$

Performance Analysis
In this section, we analyze the performance of our scheme from two aspects: storage cost and computation cost. We also provide a performance comparison between the similar work KP-ABKS proposed in [49] and ours. For ease of description, we first define some notations, as shown in Table 1, where we are only interested in relatively time-consuming operations such as pairing and exponentiation operations, while ignoring extremely efficient operations such as multiplication operations [51].

Table 1. Notations.
A: Attribute set used to encrypt an index keyword.
S: Least interior node set satisfying an access structure (including the root).
U: Universe set of attributes.
The time cost of each algorithm in our scheme and in KP-ABKS [49] is shown in Table 2. From Table 2, we can observe that KPSABES and KP-ABKS have the same computation complexity for the Keygen algorithm. More importantly, our scheme outperforms KP-ABKS on the Enc and Search algorithms because it requires fewer pairing and exponentiation operations.
Table 2. The time cost of each algorithm in KPSABES and KP-ABKS [49].
We use the output size of each algorithm to evaluate the storage cost of KPSABES and KP-ABKS, as shown in Table 3. We can see that, apart from the Keygen algorithm, the algorithms in KPSABES need less space than those in KP-ABKS to store their outputs. Here, we do not consider the storage cost of search results; thus, the output size of the Search algorithm is ignored.
Table 3. The storage cost of the output of each algorithm in KPSABES and KP-ABKS [49].

Security Proof
In this section, we provide a formal security proof for our proposed KPSABES based on the selectively chosen-keyword attack model and the adaptively chosen-plaintext attack model.
Theorem 1. If the DBDH problem is hard, the index keyword encryption algorithm Enc in KPSABES is selectively secure against chosen-keyword attack.
Proof. If there is a polynomial-time adversary $\mathcal{A}$ that can break KPSABES with a non-negligible advantage $\epsilon$ to gain the index keyword from the corresponding ciphertext, then a simulator $\mathcal{B}$ can be constructed to solve the DBDH problem with advantage $\epsilon/2$.
The challenger $\mathcal{C}$ first runs the Setup algorithm to set up the system public parameter PK and the master key MK. Then, $\mathcal{C}$ flips a fair binary coin $\nu \in \{0, 1\}$ and transmits a tuple $t_{\nu}$ to the simulator $\mathcal{B}$. If $\nu = 0$, the tuple $t_0$ is set to $(g, A = g^a, B = g^b, C = g^c, Z = e(g, g)^{abc})$; otherwise, the tuple $t_1$ is set to $(g, A = g^a, B = g^b, C = g^c, Z = e(g, g)^z)$, where $a, b, c, z$ are randomly chosen from $\mathbb{Z}_p^*$. Next, the challenger delegates the simulator $\mathcal{B}$ to play the Selectively Chosen-Keyword Attack Game with $\mathcal{A}$ as follows.
Setup. The adversary $\mathcal{A}$ specifies a set of attributes $A^*$ with $n$ elements that he wishes to be challenged upon and sends it to the simulator $\mathcal{B}$. $\mathcal{B}$ randomly chooses a polynomial $f(X)$ of degree $n$ and constructs another polynomial $u(X)$ of degree $n$ as follows: for all $X \in A^*$, set $u(X) = -X^n$, and otherwise set $u(X) \neq -X^n$, which ensures that $u(X) = -X^n$ holds if and only if $X \in A^*$. Next, $\mathcal{B}$ sets $t_i = g^{u(i)} g^{f(i)}$ for all $i \in [1, n + 1]$. Since $f(X)$ is randomly chosen, the value $t_i$ is also random. Finally, we have $T(i) = g^{i^n + u(i)} g^{f(i)}$.
Phase 1. The adversary A issues multiple private key queries corresponding to access trees $T_1, \ldots, T_q$ such that none of them is satisfied by the challenge attribute set $A^*$. For any requested access tree $T^* \in \{T_1, \ldots, T_q\}$, since the challenge attribute set $A^*$ does not satisfy $T^*$, i.e., $T^*(A^*) = 0$, B has to assign a polynomial $Q_x$ of degree $d_x$ to every node $x$ in the access tree $T^*$ in order to generate the corresponding private key. To do that, we first define two procedures, PolySat and PolyUnsat, as follows.

• $\mathrm{PolySat}(T^*_x, A^*, \lambda_x)$. PolySat takes as input an access tree $T^*_x$ rooted at $x$ such that $T^*_x(A^*) = 1$, the attribute set $A^*$, and a secret value $\lambda_x \in \mathbb{Z}_p^*$, and determines the polynomials for each node of $T^*_x$. Specifically, it first generates a polynomial $q_x$ of degree $d_x$ for the root node $x$ such that $q_x(0) = \lambda_x$ and randomly sets $d_x$ other points to completely define $q_x$. Then, for each child node $x'$ of $x$, it calls $\mathrm{PolySat}(T^*_{x'}, A^*, q_x(\mathrm{index}(x')))$, so that $q_{x'}(0) = q_x(\mathrm{index}(x'))$.

• $\mathrm{PolyUnsat}(T^*_x, A^*, g^{\lambda_x})$. PolyUnsat takes as input an access tree $T^*_x$ rooted at $x$ such that $T^*_x(A^*) = 0$, the attribute set $A^*$, and a group element $g^{\lambda_x} \in G_1$ with $\lambda_x \in \mathbb{Z}_p^*$, and determines the polynomials for each node of $T^*_x$. Specifically, it first generates a polynomial $q_x$ of degree $d_x$ for the root node $x$ such that $g^{q_x(0)} = g^{\lambda_x}$. $T^*_x(A^*) = 0$ means that no more than $d_x$ children of node $x$ are satisfied. Let $SC$ denote the set of child nodes of $x$ that are satisfied by $A^*$; then $|SC| \le d_x$. For each node $x' \in SC$, it randomly chooses a value $\lambda_{x'} \in \mathbb{Z}_p^*$ and sets $q_x(\mathrm{index}(x')) = \lambda_{x'}$, and it randomly chooses $d_x - |SC|$ other points to completely define $q_x$. Then, for each child node $x'$ of $x$, the procedure runs $\mathrm{PolySat}(T^*_{x'}, A^*, q_x(\mathrm{index}(x')))$ if $x' \in SC$, where $q_x(\mathrm{index}(x'))$ is known to the simulator; otherwise, it runs $\mathrm{PolyUnsat}(T^*_{x'}, A^*, g^{q_x(\mathrm{index}(x'))})$, where $g^{q_x(\mathrm{index}(x'))}$ is known to the simulator.

Now, the simulator B runs $\mathrm{PolyUnsat}(T^*, A^*, A)$, where $A = g^{a}$. After the procedure call finishes, B can determine two kinds of results about the polynomials: (1) for the root node $r$ of $T^*$, $q_r(0) = a$; and (2) for each leaf node $x$ of $T^*$, if $\mathrm{attr}(x) \in A^*$, B gets the complete polynomial $q_x$ and can further compute $q_x(0)$; if $\mathrm{attr}(x) \notin A^*$, B gets $g^{q_x(0)}$. Finally, B generates the private key corresponding to the access tree $T^*$ based on the polynomial of each leaf node $x$ as follows. Let $\mathrm{attr}(x) = i$.

• If $i \in A^*$, B chooses a random value $r_x \in \mathbb{Z}_p^*$ and sets $G_1 = g^{r_x}$, $G_2 = g^{q_x(0)}$, and $T = T(\mathrm{attr}(x))^{r_x} = T(i)^{r_x}$.

• If $i \notin A^*$, let $g_2 = g^{q_x(0)}$. B chooses a random value $r_x \in \mathbb{Z}_p^*$ and sets the key components corresponding to the node $x$ as follows: [...]

[...] $i^{n} + u(i)$, the above keys are legitimate, because [...]. Therefore, the simulator can generate the correct private key for the requested access tree $T^*$, which is sent to the adversary A. The adversary can use the private key to encrypt any query keyword $w$ to generate a valid search token.
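The top-down share assignment performed by PolySat in Phase 1 can be sketched as below. This is a minimal sketch under stated assumptions: the `Node` class, the toy prime `P`, and the helper names are hypothetical; the polynomial is sampled by random coefficients (equivalent in distribution to fixing $d_x$ random points); and neither the satisfiability check $T^*_x(A^*) = 1$ nor the group-element variant PolyUnsat is modelled. The helper `lagrange_at_zero` is included only to check that any threshold-sized set of child shares recovers the parent secret.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional

P = 2_147_483_647  # toy prime for illustration, not a real group order


@dataclass
class Node:
    threshold: int = 1                  # k_x; polynomial degree d_x = k_x - 1
    attr: Optional[int] = None          # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


def eval_poly(coeffs, x, p=P):
    """Evaluate a polynomial given by ascending coefficients (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def poly_sat(node: Node, secret: int, shares: Dict[int, int], p=P, rng=random):
    """Assign q_x with q_x(0) = secret, then recurse with q_x(index(child))."""
    degree = node.threshold - 1
    coeffs = [secret % p] + [rng.randrange(p) for _ in range(degree)]
    if node.attr is not None:
        shares[node.attr] = coeffs[0]   # leaf: record q_x(0)
        return
    for index, child in enumerate(node.children, start=1):
        poly_sat(child, eval_poly(coeffs, index, p), shares, p, rng)


def lagrange_at_zero(points, p=P):
    """Interpolate the value at 0 from (index, share) pairs modulo p."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total
```

For a 2-of-3 root whose leaves carry attributes 1, 2, and 3 (matching their child indices), any two leaf shares interpolate back to the root secret.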
Challenge. The adversary A outputs two keywords $w_0$ and $w_1$ and sends them to the simulator. B randomly chooses a bit $b \in \{0, 1\}$ and encrypts $w_b$ with $A^*$ as [...], where $s$ is chosen at random from $\mathbb{Z}_p^*$. If $\nu = 0$, the tuple $t_0 = (g, A = g^{a}, B = g^{b}, C = g^{c}, Z = e(g, g)^{abc})$ is sent to B, and the ciphertext of $w_b$ can be denoted as $I(w_b) = (A^*, I' = e(g, g)^{abcH(w_b)}, I'' = g^{s}, \forall a \in A^*: I_a = T(a)^{s})$. Since $a, b, c, s$ are all random from the adversary A's view, we let $bc = s$. Thus, the ciphertext of $w_b$ is $I^*(w_b) = (A^*, I' = e(g, g)^{asH(w_b)}, I'' = g^{s}, \forall a \in A^*: I_a = T(a)^{s})$, which can further be denoted as $I^*(w_b) = (A^*, I' = e(g_1, g)^{sH(w_b)}, I'' = g^{s}, \forall a \in A^*: I_a = T(a)^{s})$. This is a valid ciphertext for $w_b$ under $A^*$. On the other hand, if $\nu = 1$, the tuple $t_1 = (g, A = g^{a}, B = g^{b}, C = g^{c}, Z = e(g, g)^{z})$ is sent to B. The ciphertext of $w_b$ can be denoted as $I^*(w_b) = (A^*, I' = e(g, g)^{zH(w_b)}, I'' = g^{s}, \forall a \in A^*: I_a = T(a)^{s})$. Obviously, $I'$ is a random element of $G_2$ and contains no information about $w_b$.
Phase 2. The adversary A continues to issue private key queries $m$ times, corresponding to access trees $T_{q+1}, \ldots, T_{q+m}$, as in Phase 1.
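Guess. The adversary A outputs a guess $b'$ of $b$. If $b' = b$, B outputs $\nu' = 0$, indicating that B received the valid DBDH tuple from C. Since A has advantage $\varepsilon$ to break KPSABES and recover the index keyword information from the ciphertext, the probability that A outputs $b' = b$ in this case is $\frac{1}{2} + \varepsilon$. If $b' \neq b$, B outputs $\nu' = 1$, indicating that the challenger C sent the random tuple to B. In this case the ciphertext contains no information about $w_b$, so the probability that A outputs $b' = b$ is $\frac{1}{2}$. The overall advantage of B in solving the DBDH problem is as follows:
$$\frac{1}{2}\Pr[\nu' = \nu \mid \nu = 0] + \frac{1}{2}\Pr[\nu' = \nu \mid \nu = 1] - \frac{1}{2} = \frac{1}{2}\left(\frac{1}{2} + \varepsilon\right) + \frac{1}{2}\cdot\frac{1}{2} - \frac{1}{2} = \frac{\varepsilon}{2}.$$
Therefore, B can solve the DBDH problem with the non-negligible advantage $\varepsilon/2$, which contradicts the recognized hardness assumption of the DBDH problem.
Theorem 2. If the DDH problem is hard, the search keyword encryption algorithm Searchtoken in KPSABES is semantically secure against adaptively chosen-plaintext attack.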
Proof. If there is a polynomial-time adversary A that can break KPSABES with a non-negligible advantage $\varepsilon$ to recover the search keyword information from the corresponding ciphertext, a simulator B can be constructed to solve the DDH problem with advantage $\varepsilon/2$.
C flips a fair binary coin $\nu \in \{0, 1\}$ and transmits a tuple $t_\nu$ to the simulator B. If $\nu = 0$, the tuple $t_0$ is set to $(g, A = g^{a}, B = g^{b}, C = g^{ab})$; otherwise, the tuple $t_1$ is set to $(g, A = g^{a}, B = g^{b}, C = g^{c})$, where $a, b, c$ are random elements of $\mathbb{Z}_p^*$. Next, the challenger delegates the simulator B to play the adaptively chosen-plaintext attack game with the adversary A as follows.
Setup. The simulator B generates a public parameter PK and a master key MK. Then, B sends PK to A.
Phase 1. The adversary A adaptively queries for the ciphertext of any search keyword.
Challenge. The adversary A outputs two keywords $w_0$ and $w_1$ and sends them to B. B randomly chooses a bit $b \in \{0, 1\}$ and encrypts $w_b$ under an access tree A with leaf node set $L$ as [...], which is sent to A. If the tuple $t_0 = (g, A = g^{a}, B = g^{b}, C = g^{ab})$ is sent to B, the ciphertext of $w_b$ can be denoted as $ST^*_B(w_b) = (ST_1 = g^{abH(w_b)}, ST_2 = g^{aH(w_b)} \cdot g^{abH(w_b)})$. (Note that since the ciphertext structures of all leaf nodes are identical, we uniformly use the generator $g$ of $G_1$ to represent the function value $T(\mathrm{attr}(x)) \in G_1$ for any leaf node $x \in L$.) For example, given a leaf node $x \in L$, if we let $a = r_x = q_0(x)$ and $b = \lambda_x$, then this is a valid ciphertext component. On the other hand, if the tuple $t_1 = (g, A = g^{a}, B = g^{b}, C = g^{c})$ is sent to B, the ciphertext of $w_b$ can be denoted as $ST^*_B(w_b) = (ST_1 = g^{cH(w_b)}, ST_2 = g^{aH(w_b)} \cdot g^{cH(w_b)})$. In this case, $ST_1 = g^{cH(w_b)}$ and $ST_2 = g^{aH(w_b)} \cdot g^{cH(w_b)}$ are two random elements of $G_1$ from the adversary A's view and contain no information about $w_b$.
Phase 2. The adversary A continues to ask the simulator B for the ciphertext of any search keyword, including $w_0$ and $w_1$.
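Guess. The adversary A outputs a guess $b'$ of $b$. If $b' = b$, B outputs $\nu' = 0$, indicating that B received the valid DDH tuple from C. Since A has advantage $\varepsilon$ to recover the search keyword information from the search token, the probability that A outputs $b' = b$ in this case is $\frac{1}{2} + \varepsilon$. If $b' \neq b$, B outputs $\nu' = 1$, indicating that the challenger C sent the random tuple to B. In this case the search token contains no information about $w_b$, so the probability that A outputs $b' = b$ is $\frac{1}{2}$. The overall advantage of B in solving the DDH problem is as follows:
$$\frac{1}{2}\Pr[\nu' = \nu \mid \nu = 0] + \frac{1}{2}\Pr[\nu' = \nu \mid \nu = 1] - \frac{1}{2} = \frac{1}{2}\left(\frac{1}{2} + \varepsilon\right) + \frac{1}{2}\cdot\frac{1}{2} - \frac{1}{2} = \frac{\varepsilon}{2}.$$
Therefore, B can solve the DDH problem with a non-negligible advantage $\varepsilon/2$, which contradicts the recognized hardness assumption of the DDH problem.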
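The advantage bookkeeping in the Guess step can be checked with exact rational arithmetic. The sketch below is only an illustration of that calculation; the function name is hypothetical, and it assumes, as in the proof, that the adversary guesses correctly with probability $\frac{1}{2} + \varepsilon$ on a valid tuple and $\frac{1}{2}$ on a random one.

```python
from fractions import Fraction


def simulator_advantage(epsilon):
    """Advantage of B in guessing the challenger's coin nu.

    B outputs nu' = 0 when the adversary's guess is correct and nu' = 1
    otherwise, and the coin nu is fair.
    """
    half = Fraction(1, 2)
    p_correct_given_valid = half + epsilon  # nu = 0: B is right iff b' = b
    p_correct_given_random = half           # nu = 1: B is right iff b' != b
    return half * p_correct_given_valid + half * p_correct_given_random - half
```

For instance, `simulator_advantage(Fraction(1, 10))` evaluates to `Fraction(1, 20)`, i.e., half of the adversary's advantage.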

Experimental Evaluation
We evaluated the performance of our KPSABES and the similar work KP-ABKS proposed in [49] through extensive experiments on a real dataset, the Request For Comments Database (RFC) [52]. We chose 2000 data files from RFC and extracted 320 index keywords from these data files using the Hermetic Word Frequency Counter [53] tool. All programs were run on the Java platform with the help of the JPBC library [51]. Specifically, the Setup, Keygen, Enc, and Searchtoken algorithms were evaluated in a client environment: a Windows 7 desktop system with a 2.3-GHz Intel Core (TM) i5-6200U CPU and 4 GB of RAM. The Search algorithm was evaluated in a server environment: an Ubuntu 16.04 system with a 3.60-GHz Intel Core (TM) i7-7700 CPU and 8 GB of RAM.

Evaluation of Setup Algorithm
Figure 3 shows the time cost of the Setup algorithm in KPSABES and KP-ABKS. Randomly generating a group element of $G_1$ from the bilinear elliptic curve group was a relatively time-consuming operation, needing about 200 milliseconds in our test environment. The time cost of both schemes increased linearly with the number of system attributes. However, our scheme had a slight advantage in setting up the system running environment. An important reason was that the Setup algorithm in KP-ABKS needed more exponentiation operations than our KPSABES.

Evaluation of Keygen Algorithm
Figure 4 shows the time cost of the Keygen algorithm in KPSABES and KP-ABKS. The most expensive operation in this algorithm was the hash operation that maps a leaf node to an element of $G_1$, which needed about 106 milliseconds on average. As shown in Figure 4, the two schemes had approximately identical time cost for the Keygen algorithm. Further, the cost was linear in the number of leaf nodes in the access tree. The experimental results are also in accordance with our algorithm complexity analysis in Section 5.

Evaluation of Enc Algorithm
Figure 5 shows the time cost of the Enc algorithm in KPSABES and KP-ABKS. To thoroughly evaluate the performance of index keyword encryption, we ran three groups of tests on our experimental dataset. Figure 5a shows the time cost of encrypting one index keyword when varying the number of attributes. The time cost of both schemes increased linearly with the number of attributes, which is reasonable since both are derived from the original KP-ABE scheme in [4]. Figure 5b shows the time cost of encrypting different numbers of index keywords when fixing the number of attributes and data files. Encrypting all 320 index keywords extracted from 2000 data files consumed about 290 s in KPSABES and 408 s in KP-ABKS when the number of attributes was set to 6. Figure 5c shows that the number of data files had no influence on index keyword encryption in the two schemes when fixing the number of index keywords and the number of attributes. As shown in Figure 5, our KPSABES was more efficient at index keyword encryption than KP-ABKS. Moreover, the greater the number of index keywords, the more obvious the advantage. This result is reasonable because, according to the algorithm complexity analysis, encrypting one index keyword in KP-ABKS needed two more exponentiation operations than in our scheme. In addition, as shown in Figure 5b,c, encrypting the index keywords of the whole dataset was very time-consuming, but it is a one-time operation.
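As a rough worked check of the figures reported above, the totals for 320 index keywords with 6 attributes translate into the following average per-keyword costs. These averages are derived here from the reported totals and are not separately measured values.

```latex
\[
\frac{290\ \text{s}}{320} \approx 0.91\ \text{s per keyword (KPSABES)},
\qquad
\frac{408\ \text{s}}{320} \approx 1.28\ \text{s per keyword (KP-ABKS)},
\qquad
\frac{408}{290} \approx 1.41 .
\]
```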

Evaluation of Searchtoken Algorithm
Figure 6 shows the time cost of the Searchtoken algorithm in KPSABES and KP-ABKS. In both schemes, the time spent on search keyword encryption was closely related to the number of leaf nodes in the access tree embedded in the data user's private key. In addition, when the number of leaf nodes in the access tree was more than 3, our scheme needed more time to generate a search token. This was because KP-ABKS and KPSABES needed $2l + 3$ and $3l$ exponentiation operations to encrypt a search keyword, respectively, where $l$ denotes the number of leaf nodes in the access tree. However, both schemes remained efficient at search keyword encryption. For example, when the number of leaf nodes in the access tree was 10, the time cost of search keyword encryption was about 1.356 s in KPSABES and 0.968 s in KP-ABKS, which is acceptable in practical applications.
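The crossover at three leaf nodes follows directly from the two operation counts, since $3l > 2l + 3$ exactly when $l > 3$. The short sketch below tabulates the counts stated above; the function name and dictionary keys are illustrative only.

```python
def searchtoken_exponentiations(leaf_count):
    """Exponentiation counts for one search token, per the stated formulas."""
    return {
        "KP-ABKS": 2 * leaf_count + 3,
        "KPSABES": 3 * leaf_count,
    }


if __name__ == "__main__":
    for l in (1, 3, 4, 10):
        print(l, searchtoken_exponentiations(l))
```

At $l = 10$ the counts are 30 and 23, a ratio of about 1.30, which is of the same order as the measured time ratio $1.356 / 0.968 \approx 1.40$.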

Evaluation of Search Algorithm
Figure 7 shows the time cost of the Search algorithm in KPSABES and KP-ABKS. Since search performance is the key metric of a search system, we ran six groups of tests with different experimental parameters to evaluate the search performance of KPSABES and KP-ABKS. Specifically, Figure 7a-c shows the average time cost for different numbers of index keywords with a fixed number of data files ($n = 2000$) when setting $|S| = 2$, $|S| = 4$, and $|S| = 6$, respectively, where $S$ denotes the least interior node set satisfying the access tree (including the root). Figure 7d-f shows the average time cost for different numbers of data files with a fixed number of index keywords ($i = 200$) when setting $|S| = 2$, $|S| = 4$, and $|S| = 6$, respectively. We observed that the time cost of both schemes was determined by both the number of index keywords and the number of least interior nodes satisfying the access tree, and was insensitive to the number of data files, which confirmed our complexity analysis in Section 5. Moreover, as shown in Figure 7, our scheme clearly outperformed KP-ABKS on search performance because it requires fewer pairing computations.

Conclusions
In this paper, based on the original KP-ABE scheme [4], we design a key-policy searchable attribute-based encryption scheme (KPSABES) to support efficient keyword search and fine-grained access control over encrypted data. KPSABES is well suited to cryptography-based data sharing and storage systems that need data access control and keyword-based data searching. Unlike the similar KP-ABKS scheme proposed in [49], our scheme builds on the scheme in [4] without introducing any extra ciphertext components or expensive operations to support data searching. Therefore, KPSABES has some obvious advantages in terms of storage and computation cost compared with KP-ABKS. In addition, extensive experiments on a real dataset demonstrated that KPSABES is superior in many aspects to KP-ABKS, especially in search performance. In future work, we will consider the problem of efficient multi-keyword ranked search with fine-grained access control over encrypted data.

Figure 3. The time cost of the Setup algorithm in KPSABES and KP-ABKS.

Figure 4. The time cost of the Keygen algorithm in KPSABES and KP-ABKS.

Figure 5. Time cost of the Enc algorithm in KPSABES and KP-ABKS: (a) time cost of encrypting one index keyword for different numbers of attributes; (b) time cost of encrypting index keywords for different numbers of index keywords with a fixed number of attributes, $|A| = 6$, and a fixed number of data files, $n = 2000$; and (c) time cost of encrypting index keywords for different numbers of data files with a fixed number of attributes, $|A| = 6$, and a fixed number of index keywords, $i = 200$.

Figure 6. The time cost of the Searchtoken algorithm in KPSABES and KP-ABKS.

Figure 7. Time cost of the Search algorithm in KPSABES and KP-ABKS.

Table 1. Notations used in complexity analysis.