An Improved ID-Based Data Storage Scheme for Fog-Enabled IoT Environments

In a fog-enabled IoT environment, a fog node acts as a proxy between end users and cloud servers to reduce the latency of data transmission and thereby meet the requirements of real-time applications. A data storage scheme built on a fog computing architecture allows a user to share cloud data with other users with the assistance of fog nodes. In particular, a fog node that obtains a re-encryption key from the data owner is able to convert a cloud ciphertext into one that is decryptable by another designated user. In such a scheme, a proxy should not learn any information about the plaintext during the transmission and re-encryption processes. In 2020, an ID-based data storage scheme utilizing anonymous key generation in fog computing was proposed. Although that protocol is provably secure in the random oracle model, we point out that it has several inherent security flaws. On the basis of that work, we further present an improved variant, which not only eliminates these security weaknesses, but also preserves the functionalities of anonymous key generation and user revocation. Additionally, under the Decisional Bilinear Diffie–Hellman (DBDH) assumption, we demonstrate that our enhanced construction is provably secure in the security notion of IND-PrID-CPA.


Introduction
According to the concept of cloud computing addressed by computer scientist John McCarthy [1] in 1992, the computing capability of computers will someday become a kind of public utility like telephone systems. Essentially, cloud computing is an innovative computing concept rather than a brand-new technique. It utilizes the network to provide the services of data computing, transmission and sharing. Moreover, it allows many computers to share the same computing task simultaneously, which not only improves efficiency, but also overcomes the insufficient hardware resources of a single computer. From the perspective of end users, they only need to focus on the required resources and service types. Generally speaking, there are three types of service models described as follows: (i) Software as a Service (SaaS): Users can utilize the browser of information devices such as computers, cell phones and tablets to access the resources and services of cloud providers and execute the required software and applications in highly malleable cloud infrastructures. (ii) Platform as a Service (PaaS): The cloud service provider offers the platform for application development and the supported programming language with development tools so that users can deploy or purchase the required application services by themselves. (iii) Infrastructure as a Service (IaaS): In the cloud infrastructure, cloud service providers offer all kinds of resources, including network, storage, analysis and computing, so that users can process tasks as if they were on the local machine without maintaining and managing the backend hardware structure.
Though the notion of cloud computing has greatly changed the traditional way of information utilization, sharing and storage, the high latency caused by Internet transmission and the centralized processing burden of cloud systems are still major challenges in current IoT-enabled cloud applications. For this reason, the model of fog computing has emerged. It can be regarded as a technique extended from cloud computing and is a computing mode that sits closer to the end users. Therefore, we could say that a fog is a kind of cloud approaching the ground. Compared with clouds, fogs have less computing power, but they can reduce response time, save more energy and decrease bandwidth utilization. So far, fog computing approaches have been used in harsh operational environments such as shipping [2] and aviation [3].
In a fog-enabled IoT environment, fog nodes are core components that could be either physical devices or virtual equipment, and are tightly coupled with intelligent terminals or the access network to provide computing resources. These fog nodes can forward received data to clouds and help with the downloading of user data. Like cloud databases, the fog layer has its own data storage and maintains a local database. Although the architecture of fog computing extended from that of cloud computing can increase mobility and reduce transmission latency, it is still vulnerable to many security threats summarized by Patwary et al. [4] in Figure 1. When it comes to sharing confidential data in fog computing, the proxy re-encryption (abbreviated to PRE) scheme addressed by Blaze et al. [5] is a relevant alternative, since it can maintain data confidentiality during the transformation of ciphertexts. Specifically, a data owner can first encrypt the data and then upload the ciphertext to clouds for future access. As the data are transmitted and stored in encrypted form, no one, including the honest-but-curious cloud server, is able to decrypt them without knowing the corresponding private key. When a data owner attempts to grant another user the access right to his cloud ciphertext, he could authorize a semi-trusted fog node (viewed as the proxy) to perform the procedure of ciphertext re-encryption. In this way, an original cloud ciphertext is converted into one that is decryptable by the designated data user. A major advantage of PRE is that the ciphertext remains undecrypted during the conversion process. Consequently, the proxy will learn nothing about the underlying plaintext. To date, PRE has been used in many applications [5][6][7][8], such as data sharing, data outsourcing, data storage in clouds and e-mail forwarding.
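The conversion step described above can be illustrated with a toy sketch in the style of the ElGamal-based bidirectional PRE of Blaze et al. [5]. It is not the scheme studied in this paper. The tiny parameters (modulus 23, subgroup order 11, generator 2) and all function names are assumptions chosen only so that the arithmetic can be followed by hand; they offer no security.

```python
# Toy ElGamal-style proxy re-encryption (illustration only, insecure sizes).
# Requires Python 3.8+ for pow(x, -1, n) modular inverses.
Q = 23   # toy modulus; the group is the order-11 subgroup of Z_23^*
P = 11   # prime order of the subgroup
G = 2    # generator of the subgroup (2^11 = 1 mod 23)

def public_key(sk: int) -> int:
    return pow(G, sk, Q)

def encrypt(m: int, pk: int, r: int) -> tuple:
    # Ciphertext under secret key a: (m * g^r, g^(a*r))
    return (m * pow(G, r, Q) % Q, pow(pk, r, Q))

def rekey(sk_from: int, sk_to: int) -> int:
    # Re-encryption key b / a mod p, handed to the proxy
    return sk_to * pow(sk_from, -1, P) % P

def reencrypt(ct: tuple, rk: int) -> tuple:
    # The proxy only exponentiates the second component: g^(a*r) -> g^(b*r)
    c1, c2 = ct
    return (c1, pow(c2, rk, Q))

def decrypt(ct: tuple, sk: int) -> int:
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, P), Q)      # strip the secret key from the exponent
    return c1 * pow(g_r, -1, Q) % Q

owner_sk, user_sk = 3, 7
ct = encrypt(13, public_key(owner_sk), r=5)
ct_user = reencrypt(ct, rekey(owner_sk, user_sk))
```

The proxy touches only the second ciphertext component and never handles the message itself, which is the property the paragraph above relies on.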

Related Works
Under the Decisional Bilinear Diffie-Hellman (DBDH) assumption, in 2005, Ateniese et al. [6,7] presented an improved PRE scheme following the work of Dodis and Ivan [9]. They demonstrated that PRE schemes are useful in the access control of secure file systems and could be realized efficiently in practice. In 2007, Canetti and Hohenberger [10] proposed the definition of chosen-ciphertext attacks (CCA) for PRE schemes and gave a concrete construction that satisfies the definition in the standard model. The definitions that they introduced include both game-based and simulation-based ones. The security of their scheme also rests on the DBDH assumption.
Seeing that previous PRE schemes mainly convert the ciphertext from one public key to another, Green and Ateniese [11] introduced identity-based PRE (abbreviated to IB-PRE) schemes to deal with the problem of transforming the ciphertext from one identity to another. In addition, their schemes are non-interactive and provably secure in the random oracle models. One of their works also exhibits the property of multi-hop, i.e., a ciphertext could be converted from one identity to another more than once, which gives the practical applications more feasibility. Using standard model proofs, Chu and Tzeng [12] presented two identity-based PRE mechanisms. They showed that their first scheme has better efficiency in computational costs and the ciphertext length while the other one achieves CCA security. The two mechanisms are unidirectional and non-interactive. Nevertheless, in 2009, Shao and Cao [13] pointed out that the Chu-Tzeng scheme is not truly CCA-secure, as its re-encrypted ciphertext could be further converted into a well-formed ciphertext. Using the Decisional Diffie-Hellman (DDH) assumption and the integer factorization assumption, they also addressed a new PRE scheme that could withstand both the chosen-ciphertext attack and the collusion attack in the random oracle models.
In 2012, Fang et al. [14] presented the so-called conditional PRE with keyword search (C-PRES), which is a combination of conditional PRE (C-PRE) and public key encryption with keyword search (PEKS). They defined the security of chosen-ciphertext attacks for C-PRES and proved that their construction fulfills this definition. Wang et al. [15] also devoted their attention to the research of PRE with keyword search and hence introduced a new primitive named constrained single-hop unidirectional PRE supporting conjunctive keywords search (CPRE-CKS). Based on Park et al.'s work [16], Wang et al.'s scheme only allows the ciphertext containing the corresponding keywords to be re-encrypted by a proxy. Under the decisional q-parallel bilinear Diffie-Hellman exponent assumption, in 2013, Liang et al. [17] extended the traditional PRE into the CP-ABPRE, i.e., ciphertext-policy attribute-based PRE. In such a protocol, a proxy has the ability to re-encrypt the ciphertext from one access policy to another. Their protocol can be applied to any monotonic access structure and is proved secure in the random oracle models.
Akhil et al. [18] employed the technique of PRE to enhance the security of QR codes in 2016. A QR code is a machine-readable format that could be tampered with maliciously when being transmitted. Applying the technique of PREs to QR codes makes it easy and flexible to be shared and stored among different hosts. In 2018, Zeng and Choo [19] proposed a new kind of conditional PRE (C-PRE) scheme called sender-specified PRE (SS-PRE) in which a proxy can only convert the ciphertext from a specified sender to his/her delegatee. They also demonstrated that their SS-PRE scheme outperforms the conventional C-PRE ones.
Considering the communication security between the fog and the cloud, in 2018, Vohra and Dave [20] proposed an attribute-based access control and re-encryption system composed of two phases. In the first phase, the clouds will communicate with the proxy server and transmit encrypted data, which are then decrypted by the latter according to its attribute set and access policy. In the second phase, the proxy server will broadcast re-encrypted ciphertext to all fog nodes. Only the fog node that has the correct attributes can decrypt the ciphertext.
In 2020, Lian et al. [21] introduced a PRE model along with a concrete scheme that is suitable for complex access control factor description in hybrid clouds. A hybrid cloud has not only the advantage of more powerful computing resources in public clouds, but also that of easy management in private clouds. They showed that their construction is secure under the DBDH assumption and could be reliably deployed in hybrid clouds.
In 2021, Xiong et al. [22] proposed an adaptively secure puncturable identity-based PRE scheme for securing group messages. In their work, a message server is responsible for converting the ciphertext for each user, and thus the heavy computation load could be shifted from the user to the message server. They also prove the security of their system under the DBDH assumption. However, the message server would easily become the performance bottleneck, and the centralized architecture is not suitable for distributed application environments.
Considering the data sharing in clouds, Ge et al. [23] presented a verifiable and fair attribute-based PRE scheme in which the user is able to verify the correctness of the ciphertext converted by the cloud server. Moreover, the latter is also capable of claiming its honesty when being maliciously accused by the former. They conduct experiments to demonstrate the feasibility and the efficiency of their system in the realistic environments. Nevertheless, their mechanism does not deal with the revocation issue of either attributes or the user identity.
Recently, Zhang et al. [24] applied the technique of PRE to propose an ID-based data storage (DS) system utilizing anonymous key generation for the fog computing environment. A DS scheme for fog-enabled environments is a kind of data sharing technique utilizing fog nodes as the proxy to reduce response time and communication overheads. Specifically, the fog nodes can process the data gathered by the IoT sensors and forward them to the cloud. The fog node serving as a proxy between the cloud and the endpoints is able to transform the cloud ciphertext into another one, which is decryptable by the requesting user, so as to achieve the purpose of data sharing in clouds. Using anonymous key generation, a malicious private key generation (PKG) center is unable to learn the genuine private key of users. It has been formally proved that their protocol is indistinguishable against adaptively chosen identity and chosen plaintext attacks (abbreviated to IND-PrID-CPA) in random oracle models and secure against the PKG and the collusion attacks. However, in this work, we will show that their system has several security flaws. So far, many PRE-related cryptographic protocols [25][26][27][28][29][30][31][32][33][34][35][36] have been proposed.

Contributions
Since a secure DS scheme in fog-enabled IoT environments is the key to ensure data confidentiality and user privacy, we devote ourselves to the enhancement of current DS schemes.
In 2020, Zhang et al. [24] proposed a novel DS scheme supporting anonymous key generation, which does not need to rely on a secure channel. Unfortunately, their system is vulnerable to several attacks. Motivated by Zhang et al.'s work [24], we present an improved DS scheme for fog-enabled IoT environments. The novelty of our work is that we view fog nodes as semi-trusted entities rather than fully trusted ones. Moreover, we introduce an additional random value in generating the proxy re-encryption keys, so as to prevent unauthorized decryption. The partial private key information will not be compromised during communication, which offers better protection of confidentiality in practical environments. Our work can strengthen application security in fog-enabled environments such as telemedicine, which has received much attention in recent years and in which data confidentiality and user privacy are the primary concerns. Improving existing schemes to withstand possible attacks is therefore of utmost importance. The major contributions of this research are itemized below:
(i) We demonstrate some security vulnerabilities in relation to the illegal access privilege of Zhang et al.'s scheme [24], including the proxy attack, the outsider attack and the revoked user attack.
(ii) An enhanced DS variant on the basis of Zhang et al.'s system [24] is further proposed. In particular, we introduce an additional random value in the re-encryption key generation and modify the decryption algorithm.
(iii) We formally prove that the proposed construction is indistinguishable against adaptively chosen identity and chosen plaintext attacks (IND-PrID-CPA) in random oracle models.
(iv) The superior characteristics of anonymous key generation and user revocation are preserved in the proposed improvement.
(v) The confidentiality of partial private key information is well-protected during communication since it does not need to be transmitted with the communication data.
The organization of this paper is described as follows. In Section 2, the computational background and cryptographic assumptions are introduced. We review and analyze Zhang et al.'s protocol [24] in Section 3. A corresponding improvement is also presented. In Section 4, using random oracle models, we define the security notion of IND-PrID-CPA and formally prove the security of our improved variant. Finally, a conclusion is summarized in Section 5.

Preliminaries
In this section, we describe the properties of bilinear pairings and the related computational assumption on which the proposed scheme is based.

Definition of Bilinear Pairing
Let $G_1$ and $G_2$ be multiplicative groups of the same prime order $p$. A symmetric bilinear pairing is defined as $e: G_1 \times G_1 \to G_2$. Some characteristics of bilinear pairings are stated below:
(i) Bilinearity: Letting $P$ be an element of $G_1$ and $x, y$ be arbitrary integers of $Z_p$, the equality $e(P^x, P^y) = e(P, P)^{xy}$ will hold.
(ii) Non-degeneracy: There exists an element $P \in G_1$ such that $e(P, P) \neq 1$.
(iii) Computability: There exists an algorithm that could efficiently compute $e(P, W)$ for any $P, W \in G_1$.
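The bilinearity property can be checked on a deliberately insecure toy model. In the sketch below, every element of $G_1$ is represented by its discrete logarithm with respect to $g$, and $G_2$ is the order-11 subgroup of $Z_{23}^*$. These parameters and the helper names `g1_pow`, `pairing` and `is_bilinear` are assumptions made for illustration and say nothing about how a real pairing is implemented.

```python
# Toy symmetric pairing: G_1 elements are stored as exponents of g (insecure!).
P_ORDER = 11   # prime order p shared by G_1 and G_2
MODULUS = 23   # G_2 is the order-11 subgroup of Z_23^*
GT = 2         # plays the role of e(g, g); 2 has order 11 modulo 23

def g1_pow(elem_log: int, k: int) -> int:
    """Raise the G_1 element g^elem_log to the power k (result as an exponent)."""
    return (elem_log * k) % P_ORDER

def pairing(u_log: int, v_log: int) -> int:
    """e(g^u, g^v) = e(g, g)^(u*v)."""
    return pow(GT, (u_log * v_log) % P_ORDER, MODULUS)

def is_bilinear(p_log: int, x: int, y: int) -> bool:
    """Check e(P^x, P^y) == e(P, P)^(x*y) for P = g^p_log."""
    lhs = pairing(g1_pow(p_log, x), g1_pow(p_log, y))
    rhs = pow(pairing(p_log, p_log), x * y, MODULUS)
    return lhs == rhs

all_ok = all(is_bilinear(p, x, y)
             for p in range(1, P_ORDER)
             for x in range(P_ORDER)
             for y in range(P_ORDER))
```

Because the exponents are visible, discrete logarithms are trivial in this model; it only serves to make the algebra of the later correctness arguments easy to replay.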

Decisional Bilinear Diffie-Hellman (DBDH) Problem
Given elements $(g, g^f, g^s, g^k, e(g, g)^{fsk}, \delta)$, where $g, g^f, g^s, g^k \in G_1$ and $e(g, g)^{fsk}, \delta \in G_2$, the DBDH problem is to determine whether $e(g, g)^{fsk}$ equals $\delta$ or not.

Decisional Bilinear Diffie-Hellman (DBDH) Assumption
The DBDH assumption holds provided that the advantage of any probabilistic polynomial-time adversary in solving the DBDH problem is negligible.
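One common way to make the word "advantage" precise is shown below. The distinguisher symbol $\mathcal{D}$ and the notation $\mathrm{Adv}^{\mathrm{DBDH}}_{\mathcal{D}}$ are illustrative choices that do not appear in the original text; $f, s, k$ are taken to be uniformly random in $Z_p^*$ and $\delta$ uniformly random in $G_2$.

```latex
\mathrm{Adv}^{\mathrm{DBDH}}_{\mathcal{D}}
  = \left|
      \Pr\!\left[\mathcal{D}\big(g, g^{f}, g^{s}, g^{k}, e(g,g)^{fsk}\big) = 1\right]
      - \Pr\!\left[\mathcal{D}\big(g, g^{f}, g^{s}, g^{k}, \delta\big) = 1\right]
    \right|
```

Under this reading, the assumption states that this quantity is negligible in the security value $l$ for every probabilistic polynomial-time $\mathcal{D}$.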

Proposed ID-Based Data Storage Scheme
We first formalize the algorithms of ID-based data storage (abbreviated as IB-DS) schemes and then review Zhang et al.'s work [24]. Several security weaknesses of their scheme will be demonstrated, and a corresponding enhanced variant shall be introduced later.

System Architecture
The system architecture of the IB-DS scheme is illustrated in Figure 2, which could be divided into three layers, i.e., the cloud, the fog and the user layers. There is also a trusted authority called the private key generation center (PKG), which is responsible for generating the private key of all involved entities. The cloud server of the cloud layer will store encrypted data gathered from the user layer. A data requester of the user layer can request the data access of the cloud ciphertext by the assistance of the fog layer. The fog nodes comprising the fog layer are viewed as a proxy between the cloud layer and the user layer. Whenever a data owner authorizes the access privilege of a cloud ciphertext to another data user, the proxy (fog) would be granted a re-encryption key, which is able to transform the target cloud ciphertext into one decryptable by the desired data user.

Algorithms
An IB-DS scheme consists of seven algorithms: Setup, Keygen, Encrypt, Query, Permission, Re-encrypt and Decrypt. The definitions of these algorithms are described below:
-Setup($1^l$): It accepts a security value $l$ and then generates system public parameters $PP$ and a master secret key $Msk$.
-Keygen($PP$, $Msk$, $ID$): It takes the input of system public parameters $PP$, the master secret key $Msk$ and an identity $ID$, and then outputs the corresponding private key $SK_{ID}$.
-Encrypt: It takes the input of system public parameters $PP$ and a message $m$ of the data owner, and then outputs a ciphertext $C$.
-Query($PP$, $ID_u$, $SK_{ID_u}$, $M_{cate}$): It takes the input of system public parameters $PP$, a data user identity $ID_u$, a private key $SK_{ID_u}$ and a data category name $M_{cate}$, and then outputs a corresponding query token $TK$.
-Permission($PP$, $ID_u$, $SK_{ID_o}$, $TK$): It takes the input of system public parameters $PP$, a data user identity $ID_u$, the private key $SK_{ID_o}$ of the data owner and a query token $TK$, and then outputs either an invalid symbol ⊥ or a re-encryption key $RK$.
-Re-encrypt($PP$, $ID_u$, $C$, $RK$): It takes the input of system public parameters $PP$, a data user identity $ID_u$, a ciphertext $C$ and a re-encryption key $RK$, and then outputs a corresponding re-encrypted ciphertext $C'$.
-Decrypt($PP$, $SK_{ID}$, $C$ or $C'$): It takes the input of system public parameters $PP$, a private key $SK_{ID}$ and a ciphertext $C$ (or $C'$), and then outputs a decrypted message $m$.
We summarize the input and the output parameters of each algorithm in Table 1.
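The seven algorithms can also be read as a programming interface. The following is a minimal sketch of such an interface, assuming opaque Python objects for $PP$, keys, tokens and ciphertexts. The class name `IBDataStorage`, the method names, and in particular the argument list of `encrypt` are hypothetical and are not taken from the scheme's specification.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Tuple


class IBDataStorage(ABC):
    """Hypothetical interface mirroring the seven IB-DS algorithms."""

    @abstractmethod
    def setup(self, security_value: int) -> Tuple[Any, Any]:
        """Return (PP, Msk)."""

    @abstractmethod
    def keygen(self, pp: Any, msk: Any, identity: str) -> Any:
        """Return the private key SK_ID of the given identity."""

    @abstractmethod
    def encrypt(self, pp: Any, owner_id: str, message: bytes) -> Any:
        """Return a ciphertext C (argument list is an assumption)."""

    @abstractmethod
    def query(self, pp: Any, user_id: str, sk_user: Any, category: str) -> Any:
        """Return a query token TK for the data category M_cate."""

    @abstractmethod
    def permission(self, pp: Any, user_id: str, sk_owner: Any, token: Any) -> Optional[Any]:
        """Return a re-encryption key RK, or None for the invalid symbol."""

    @abstractmethod
    def re_encrypt(self, pp: Any, user_id: str, ciphertext: Any, rk: Any) -> Any:
        """Return the re-encrypted ciphertext C'."""

    @abstractmethod
    def decrypt(self, pp: Any, sk: Any, ciphertext: Any) -> bytes:
        """Return the message m recovered from C or C'."""


ALGORITHMS = sorted(IBDataStorage.__abstractmethods__)
```

Modelling the invalid symbol ⊥ as `None` is one possible design choice; a dedicated exception would work equally well.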

Review and Security Analysis
This subsection reviews an IB-DS scheme proposed by Zhang et al. [24] in 2020. Although their protocol is provably secure in the random oracle models, there are still some security drawbacks, which will be pointed out later. The construction of their scheme is described below:
-Setup: Using a security value $l$, the PKG first decides two multiplicative groups $G_1$ and $G_2$. Let $p$ be the prime order of both groups and $g$ a generator of $G_1$. In the two groups, there is a symmetric pairing function $e$ expressed as $e: G_1 \times G_1 \to G_2$. The PKG then chooses integers $a, b \in Z_p^*$ as the $Msk$ and computes the $Mpk = (P = g^a, Q = g^b)$. To maintain the membership of system users, the PKG also keeps a revocation list $L$. The system public parameter $PP$ is composed of $\{G_1, G_2, e, g, p, Mpk, E(\cdot), D(\cdot), h_1(\cdot), h_2(\cdot)\}$, where $E(\cdot)/D(\cdot)$ is a symmetric encryption/decryption function and $(h_1, h_2)$ are two secure one-way hash functions that accept a variable-length input and generate a corresponding output in $G_1$.
-Keygen: A user associated with the identity $ID_i$ first chooses $t_i, z_i \in Z_p^*$ to compute $T_i$ and transmits $(ID_i, T_i)$ to the PKG, who then chooses $d_i \in Z_p^*$ to compute and deliver $(SK_{i,1}, SK_{i,2})$ to $ID_i$. In this way, $ID_i$ could further set its private key. Here, the full private key of $ID_i$ is $SK_i = (SK_{i,1}, SK_{i,2})$, and its correctness could be verified by an equality check.
-Encrypt: To encrypt the message $m = (m_1, m_2, \ldots, m_n)$, a data owner $ID_o$ first selects $r \in_R Z_p^*$ and a symmetric key $Y \in G_2$ to compute the components $(\alpha, \beta, \theta, \tau)$. Then, the ciphertext $C = (\alpha, \beta, \theta, \tau)$ along with $(ID_o, M_{cate})$, where $M_{cate}$ represents the category name of data, are transmitted to the nearby fog (proxy), which will keep $(ID_o, M_{cate}, \alpha, \beta, \theta)$ in the local database of the fog layer and forward $(ID_o, M_{cate}, \tau)$ to the cloud server.
-Query: To request the data access of $M_{cate}$, a data user $ID_u$ first chooses $w \in_R Z_p^*$ to compute $W = (SK_{u,1})^w$, and sends $(ID_u, M_{cate}, W, SK_{u,2})$ to the nearby proxy. Afterwards, the proxy utilizes $M_{cate}$ to search for the matched $(ID_o, M_{cate}, \alpha, \beta, \theta)$ in the local database and delivers the query token $TK = (ID_u, W, SK_{u,2}, \beta)$ to the corresponding data owner $ID_o$.
-Permission: When receiving the query token $TK = (ID_u, W, SK_{u,2}, \beta)$, the data owner sends $(ID_u, SK_{u,2})$ to the PKG, which will inspect whether $ID_u$ is a revoked user or not according to its revocation list $L$ and then return True/False to indicate that the membership of $ID_u$ is valid/invalid. If False, the data owner submits an invalid symbol ⊥ to the proxy. Otherwise, $ID_o$ picks a random number $x \in Z_p^*$ to compute $(RK_1, RK_2, RK_3)$. Then, the re-encryption key $RK = (RK_1, RK_2, RK_3)$ is transmitted to the proxy.
-Re-encrypt: Given the re-encryption key $RK = (RK_1, RK_2, RK_3)$, the proxy first uses the identity $ID_o$ to retrieve $\tau$ from the cloud server and then computes $(\alpha', \eta, \rho)$. Finally, the resulting ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho)$ would be returned to the requesting data user $ID_u$.
-Decrypt: Given an original ciphertext $C = (\alpha, \beta, \theta, \tau)$, the data owner $ID_o$ first computes the symmetric key $Y$ with Equation (19) and then recovers the message $m$ with Equation (20). When given a ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho)$ of the re-encrypted form, $ID_u$ computes the symmetric key $Y$ with Equation (21) and then recovers the message $m$ with Equation (20). The correctness of Equation (21) could be verified as follows. The right-hand side of the equality reduces to
$$Y \cdot e(g^{ab}, g^r)\, e(g^{rx}, g) \cdot \frac{1}{e(g^{ab} g^x, g^r)} = Y.$$
Note that to revoke the membership of a user $ID_i$, the PKG will update its revocation list $L$ as $L'$ by adding the new entry $(ID_i, SK_{i,2})$, i.e., $L' = L \cup \{(ID_i, SK_{i,2})\}$.
Unfortunately, we find that Zhang et al.'s scheme [24] has several security weaknesses stated as follows:
Weakness 1: A dishonest fog (proxy) is able to decrypt the ciphertext queried by a data user $ID_u$ without having the knowledge of the corresponding private key. According to Equation (12), the private key information $SK_{u,1}$ is further combined with a random integer $w$ chosen by $ID_u$ for computing the symmetric key $Y$. Although the dishonest proxy knows neither the private key $SK_{u,1}$ nor the secret $w$, it has obtained the combined value $W = (SK_{u,1})^w$ in the Query phase. Therefore, it can also successfully derive the symmetric key $Y$ and decrypt the ciphertext queried by $ID_u$.
Weakness 2: An adversary is able to gain access to any cloud ciphertext without having the corresponding private key. More specifically, an adversary first randomly chooses $SK_{u,1}$ and $w \in Z_p^*$ to compute $W' = (SK_{u,1})^w$ with respect to any $M_{cate}$ he attempts to access. Since the adversary is not a revoked user in the revocation list $L$, he would receive a re-encrypted ciphertext. Then, based on the decryption equalities, i.e., Equations (21) and (20), he could employ the value $W'$ to recover the symmetric key $Y$ and decrypt the received ciphertext, respectively.
Weakness 3: A revoked user can impersonate any legitimate user to gain access to any cloud ciphertext without having the corresponding private key. Assume that $ID_u$ is a revoked user in the system. This means that the entry $(ID_u, SK_{u,2})$ has been stored in the revocation list $L$ of the PKG. In order to request any ciphertext in the cloud server, $ID_u$ could impersonate any non-revoked user, say $ID_v$, to issue a query. The procedure is similar to that mentioned in Weakness 2. That is, he first randomly chooses $SK_{v,1}$ and $w \in Z_p^*$ to compute $W'' = (SK_{v,1})^w$ in relation to any desired $M_{cate}$. As the impersonated identity $ID_v$ is still a legitimate user, the attacker would receive a corresponding re-encrypted ciphertext, which is decryptable by his forged value $W''$.
Weakness 4: The partial information of the user's private key is compromised during communication. According to the procedures and data flows stated in the Query and the Permission phases, the partial private key $SK_{u,2}$ of the data user $ID_u$ has to be transmitted via an open channel. This inevitably leaks the partial private key information.

Construction of an Improved IB-DS Scheme
According to our cryptanalysis of Zhang et al.'s system [24], we find that the private key of the requesting data user is not properly hidden in the Query algorithm, which allows any malicious entity to nullify the secret parameter in the decryption process. Moreover, the decryption equation, i.e., Equation (21), does not involve the second private key of the data user, which is another major cause of the previous attacks. To eliminate the security weaknesses of Zhang et al.'s scheme [24], we propose an improved variant without modifying the system architecture and involved parties. In the Setup algorithm of our system, we additionally introduce a new hash function, i.e., $h_3: G_2 \to G_1$. Since the Setup, Keygen and Encrypt algorithms are defined the same as those of Zhang et al.'s scheme [24], we formalize them as Algorithms 1-3. The remaining algorithms are described below:
-Query: To request the data access of $M_{cate}$, a data user $ID_u$ first chooses $w \in_R Z_p^*$ to compute $W = g^w$. The partial private key $SK_{u,2}$ is no longer transmitted, so the query token delivered to the data owner is $TK = (ID_u, W, \beta)$.
-Permission: When receiving the query token $TK = (ID_u, W, \beta)$, the data owner sends $ID_u$ to the PKG, which will inspect whether $ID_u$ is a revoked user or not according to its revocation list $L$ and then return True/False to indicate that the membership of $ID_u$ is valid/invalid. If False, the data owner submits an invalid symbol ⊥ to the proxy. Otherwise, $ID_o$ picks two random numbers $x, \pi \in Z_p^*$ to compute $(RK_2, RK_3)$ as Equations (14) and (15), and $(RK_1, RK_4)$ with the additional random value $\pi$. Then, $RK = (RK_1, RK_2, RK_3, RK_4)$ is the generated re-encryption key, which will be transmitted to the proxy. The permission processes are presented in Algorithm 5.
-Re-encrypt: The proxy finally returns the re-encrypted ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho, \Phi)$ to the data user $ID_u$. We illustrate the flow chart of the query, permission and re-encryption algorithms in Figure 3. The re-encryption processes are presented in Algorithm 6.

Algorithm 6. Re-Encrypt.
Input: $PP$, $ID_u$, $C$, $RK = (RK_1, RK_2, RK_3, RK_4)$
Output: A re-encrypted ciphertext $C'$
1: $\alpha' = \alpha \cdot e(RK_2, g)$

-Decrypt: When given a re-encrypted ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho, \Phi)$, the data user $ID_u$ first computes $I$ and the symmetric key $Y$ separately and then recovers the message $m$ with Equation (20). The correctness of Equation (27) could be verified as follows. From the right-hand side of the equality, we find
$$\frac{\alpha' \cdot e(\theta, \rho)}{e(SK_{o,1}\, g^x, g^r)} = Y \cdot e(g^{ab}, g^r)\, e(g^{rx}, g) \cdot \frac{1}{e(g^{ab} g^x, g^r)} = Y.$$
Note that to revoke the membership of a user $ID_i$, the PKG will update its revocation list $L$ as $L'$ by adding the new entry $ID_i$, i.e., $L' = L \cup \{ID_i\}$. The decryption processes are presented in Algorithm 7.
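The final cancellation in the correctness argument, $Y \cdot e(g^{ab}, g^r)\, e(g^{rx}, g) / e(g^{ab} g^x, g^r) = Y$, can be replayed numerically. The sketch below uses an insecure toy pairing in which $G_1$ elements are written as exponents of $g$ and $G_2$ is the order-11 subgroup of $Z_{23}^*$. The parameters and the names `pairing` and `recover_key` are assumptions for illustration only, and the sketch checks just this algebraic step, not the full decryption algorithm.

```python
# Toy check of the last step of the correctness argument (insecure sizes).
# Requires Python 3.8+ for pow(x, -1, n).
P_ORDER = 11   # toy prime order p
MODULUS = 23   # G_2 is the order-11 subgroup of Z_23^*
GT = 2         # stands for e(g, g)

def pairing(u_log: int, v_log: int) -> int:
    """e(g^u, g^v) = e(g, g)^(u*v), with G_1 elements given by their exponents."""
    return pow(GT, (u_log * v_log) % P_ORDER, MODULUS)

def recover_key(Y: int, a: int, b: int, r: int, x: int) -> int:
    # Numerator: Y * e(g^(ab), g^r) * e(g^(rx), g)
    numerator = Y * pairing(a * b, r) * pairing(r * x, 1) % MODULUS
    # Denominator: e(g^(ab) * g^x, g^r); multiplying in G_1 adds exponents
    denominator = pairing(a * b + x, r)
    return numerator * pow(denominator, -1, MODULUS) % MODULUS

recovered = recover_key(Y=pow(GT, 7, MODULUS), a=3, b=4, r=5, x=6)
```

The blinding value $x$ contributes $e(g, g)^{rx}$ once in the numerator and once in the denominator, which is why it drops out.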

Security Model and Proof
To formally prove the security of our improved IB-DS scheme, we first present its security model and then give a complete security proof. Since the core building block of the IB-DS scheme is actually the IB-PRE scheme, the security model for the former is derived from that for the latter. Specifically, we will prove that our improved IB-DS construction is indistinguishable against adaptively chosen identity and chosen plaintext attacks (IND-PrID-CPA). The security model of IND-PrID-CPA for the IB-DS scheme is defined as follows.

Definition 1. (IND-PrID-CPA). An IB-DS scheme achieves indistinguishability against adaptively chosen identity and chosen-plaintext attacks if, in the following game, there is no probabilistic polynomial-time adversary $A$ who is able to defeat a challenger $B$ with non-negligible advantage:
Setup: In the beginning, the challenger $B$ performs the Setup($1^l$) algorithm to initialize the system public parameters $PP$ and a master secret key $Msk$. Then, the parameters $PP$ are sent to $A$.
Phase 1: The adversary $A$ will make the following queries adaptively:
-Private-key Queries: In this query, the adversary $A$ will provide an identity $ID$ for the challenger $B$, who then calls the Keygen($PP$, $Msk$, $ID$) algorithm to get the corresponding private key $SK_{ID}$ and returns it.
-Permission Queries: In this query, the adversary $A$ will provide two identities $(ID_o, ID_u)$ of non-revoked users and a data category name $M_{cate}$ for the challenger $B$, who first calls the Keygen($PP$, $Msk$, $ID$) algorithm to gain the private keys $SK_{ID_o}$ and $SK_{ID_u}$. Next, $B$ performs the Query($PP$, $ID_u$, $SK_{ID_u}$, $M_{cate}$) and the Permission($PP$, $ID_u$, $SK_{ID_o}$, $TK$) algorithms to obtain the re-encryption key $RK$ and returns it.

Challenge:
The adversary $A$ determines a target identity $ID^*$, a message $m^* = (m_1^*, m_2^*, \ldots, m_n^*)$ and two symmetric keys $(Y_0, Y_1)$ of the same length. Next, the challenger $B$ takes the input of $(PP, ID^*, m^*, Y_\lambda)$, where $\lambda \in_R \{0, 1\}$, to produce a ciphertext $C^* = (\alpha^*, \beta^*, \theta^*, \tau^*)$ as the challenge for $A$.
Phase 2: After receiving the challenge, the adversary $A$ can further make queries defined as those in phase 1, except for the following restrictions:
-A private-key query for the target identity $ID^*$ is not allowed.
-Any permission query in relation to the identities of the form $(ID^*, ID_u)$ or $(ID_o, ID^*)$ is not allowed.
-The numbers of private-key and permission queries are bounded by $q_{pk}$ and $q_{pr}$, respectively.
Guess: When phase 2 terminates, the adversary $A$ outputs a bit $\lambda'$. If $\lambda' = \lambda$, $A$ is the winner of the game. Consequently, the advantage of $A$ is defined as $Adv(A) = |\Pr[\lambda' = \lambda] - 1/2|$.
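The advantage formula in the Guess step can be made concrete with a minimal sketch that enumerates the challenge bit exactly. The two toy "challenge" functions below are hypothetical stand-ins (one hides the bit entirely, one leaks it) and have nothing to do with the actual IB-DS ciphertext; they only show which values the formula takes at the two extremes.

```python
from fractions import Fraction

def advantage(challenge, guess) -> Fraction:
    """Exact |Pr[lambda' = lambda] - 1/2| for a deterministic adversary,
    with the challenge bit lambda uniform over {0, 1}."""
    wins = sum(1 for lam in (0, 1) if guess(challenge(lam)) == lam)
    return abs(Fraction(wins, 2) - Fraction(1, 2))

# Hypothetical challenge generators: one reveals nothing, one leaks the bit.
hiding_challenge = lambda lam: "ciphertext"
leaking_challenge = lambda lam: ("ciphertext", lam)

adv_blind = advantage(hiding_challenge, lambda c: 0)       # constant guess
adv_leak = advantage(leaking_challenge, lambda c: c[1])    # reads the leaked bit
```

An adversary that always answers the same bit wins half of the time and has advantage 0, while one that can read the bit has the maximal advantage 1/2. The security notion requires every efficient adversary to stay negligibly close to the first case.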
On the basis of the previously defined security model, we formally prove below that our improved IB-DS construction is IND-PrID-CPA-secure in the proof model of random oracles.
Theorem 1. (IND-PrID-CPA). Let $h_i$ (for $i$ = 1 and 2) be random oracles. The proposed IB-DS system is indistinguishable against adaptively chosen identity and chosen-plaintext attacks (IND-PrID-CPA) under the DBDH assumption. In particular, if a probabilistic polynomial-time adversary $A$ making at most $q_{pk}$ and $q_{pr}$ queries breaks the IND-PrID-CPA security of our IB-DS scheme with the non-negligible advantage $\varepsilon$, an algorithm $B$ solving the DBDH problem can be constructed with the non-negligible advantage $\varepsilon'$, where $\varepsilon' \geq \frac{\varepsilon}{e\,(q_{pk} + q_{pr} + 1)}$ and $e$ in the denominator is the base of the natural logarithm.
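The factor $1/(e\,(q_{pk} + q_{pr} + 1))$ in Theorem 1 can be sanity-checked numerically. The sketch below assumes that the simulator survives each private-key or permission query with probability $1 - (1 - \psi)^2$ and the challenge phase with probability $(1 - \psi)^2$, which is one reading of the abort conditions used in the proof. The function names are hypothetical.

```python
import math

def survive_probability(psi: float, q_pk: int, q_pr: int) -> float:
    """Assumed lower bound on Pr[Perfect] for a given bias psi = Pr[bt = 0]."""
    u = (1 - psi) ** 2                      # probability that both bits equal 1
    return (1 - u) ** (q_pk + q_pr) * u

def best_psi(q_pk: int, q_pr: int) -> float:
    """The choice psi = 1 - 1/sqrt(q_pk + q_pr + 1) used in the analysis."""
    return 1 - 1 / math.sqrt(q_pk + q_pr + 1)

def theorem_factor(q_pk: int, q_pr: int) -> float:
    return 1 / (math.e * (q_pk + q_pr + 1))

checks = [(q_pk, q_pr,
           survive_probability(best_psi(q_pk, q_pr), q_pk, q_pr),
           theorem_factor(q_pk, q_pr))
          for q_pk, q_pr in [(1, 0), (5, 5), (50, 50), (500, 500)]]
```

For each pair in `checks`, the third entry should be no smaller than the fourth, reflecting $(1 - \frac{1}{q+1})^{q} \geq 1/e$ with $q = q_{pk} + q_{pr}$.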

Proof.
We depict the proof structure in Figure 4. Let $(g, g^f, g^s, g^k, e(g, g)^{fsk}, \delta)$ be a problem instance of DBDH for $B$, whose purpose is to decide whether $e(g, g)^{fsk}$ equals $\delta$ or not by utilizing the advantage of $A$. In addition, the algorithm $B$ also serves as a challenger responding to the queries that $A$ makes in the following simulation game.

Setup:
In the beginning, $B$ performs the Setup($1^l$) function to initialize public parameters $PP = \{G_1, G_2, e, g, p, Mpk, E(\cdot), D(\cdot), h_3(\cdot)\}$, where $Mpk = (P = g^f, Q = g^s)$. Note that the $Msk$ of the PKG is implicitly defined as $(f, s)$, which $B$ does not know. Moreover, $B$ chooses a random integer $rn \in Z_p^*$. Then, the parameters $PP$ are sent to $A$.

Phase 1:
The adversary A makes the following queries adaptively:

- $h_1$ Hash Oracle Queries: For any $h_1(ID_i \| t_i)$ query, B uses $(ID_i, t_i)$ as the index to search for a matching entry in the $h_1$-table named HT1. If such an entry exists, B returns the stored value $HO_1$. Otherwise, B first chooses a bit $bt_1$ with $\Pr[bt_1 = 0] = \psi$, where $\psi$ will be determined later. When $bt_1 = 0$, B computes $HO_1 = P^{rn} g^{s_1}$ where $s_1 \in \mathbb{Z}_p^*$; else, B computes $HO_1 = g^{s_1}$. Then, B updates HT1 as HT1 $\cup\ \{(ID_i, t_i, bt_1, s_1, HO_1)\}$ and returns the value $HO_1$ to A.

- $h_2$ Hash Oracle Queries: Queries of the form $h_2(ID_i \| ID_{PKG})$ are handled analogously with a second table HT2, which records the corresponding bit $bt_2$ and random value $s_2$.

- Private-Key Queries: For any private-key query on $ID_i$, B looks up the entries of $ID_i$ in HT1 and HT2. (If necessary, B first makes the corresponding hash queries on behalf of A.) When both $bt_1$ and $bt_2$ equal 1, B aborts. In the case where both $bt_1$ and $bt_2$ equal 0, B selects $d_i \in \mathbb{Z}_p^*$ and computes the private keys $(SK_{i,1}, SK_{i,2})$. In the remaining two cases, where the values of $bt_1$ and $bt_2$ differ, B also chooses $d_i \in \mathbb{Z}_p^*$ and computes $(SK_{i,1}, SK_{i,2})$. In either of these forms, the private keys $SK_{i,1}$ and $SK_{i,2}$ are well-formed. To simplify the derivation, we let $vx = s_1 + s_2$ and let $rx$ be either $2rn$ or $rn$, respectively. Then, the computed private keys $(SK_{i,1}, SK_{i,2})$ are returned to A. The returned private keys have the same distribution as those in the real scheme.

- Permission Queries: For any permission query of $(ID_o, ID_u, M_{cate})$ where $ID_o \neq ID_u$ and $ID_u$ is not revoked, B first obtains the private keys $SK_{ID_o}$ and $SK_{ID_u}$ by private-key queries and then retrieves the corresponding information stored in HT1 and HT2. If either $ID_o$ or $ID_u$ satisfies the condition that both $bt_1$ and $bt_2$ equal 1, B aborts. Otherwise, B selects $w, x, \pi \in \mathbb{Z}_p^*$ to compute $W = g^w$, $RK_1 = SK_{o,1} \cdot g^{x h_3(e(PW^{\pi}, Q))}$, $RK_2 = \beta^x$, $RK_3 = SK_{o,2}$ and $RK_4 = e(g, Q^{\pi})$, where $\beta$ is the partial ciphertext with respect to $M_{cate}$. Here, $RK = (RK_1, RK_2, RK_3, RK_4)$ is the derived re-encryption key. Then, B returns $RK$ to A.
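As an illustration only, the lazy table lookup and biased coin used for the $h_1$ oracle in Phase 1 can be sketched in Python. The sketch replaces the bilinear group by a toy subgroup of order 1019 in $\mathbb{Z}_{2039}^*$ and models no pairing; the class name `H1Simulator`, the toy parameters and the seed are assumptions introduced for this example and are not part of the scheme.

```python
import random

# Toy parameters (assumptions for illustration, far too small for real use).
MOD = 2039      # safe prime: MOD = 2 * ORDER + 1
ORDER = 1019    # prime order of the subgroup generated by G
G = 4           # generator of that subgroup


class H1Simulator:
    """Lazy simulation of the h1 random oracle with a biased coin bt1."""

    def __init__(self, P, psi, seed=0):
        self.rng = random.Random(seed)
        self.P = P                                # plays the role of P = g^f
        self.psi = psi                            # Pr[bt1 = 0]
        self.rn = self.rng.randrange(1, ORDER)    # B's fixed random integer rn
        self.table = {}                           # HT1: (ID, t) -> (bt1, s1, HO1)

    def query(self, identity, t):
        key = (identity, t)
        if key in self.table:                     # repeated query: same answer
            return self.table[key][2]
        bt1 = 0 if self.rng.random() < self.psi else 1
        s1 = self.rng.randrange(1, ORDER)
        ho1 = pow(G, s1, MOD)                     # g^{s1}
        if bt1 == 0:
            ho1 = (pow(self.P, self.rn, MOD) * ho1) % MOD   # P^{rn} * g^{s1}
        self.table[key] = (bt1, s1, ho1)
        return ho1


sim = H1Simulator(P=pow(G, 77, MOD), psi=0.5, seed=1)
first = sim.query("alice", 5)
second = sim.query("alice", 5)   # answered from HT1, identical to `first`
```

Storing $bt_1$ and $s_1$ next to each answer is what later lets B decide whether it can answer a private-key query or must abort.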

Challenge:
The adversary A determines a target identity $ID^*$, a message $m^* = (m_1^*, m_2^*, \ldots, m_n^*)$ and two symmetric keys $(Y_0, Y_1)$ of the same length. Next, the challenger B takes the input $(PP, ID^*, m^*, Y_\lambda)$ where $\lambda \in_R \{0, 1\}$ to produce a ciphertext $C^* = (\alpha^*, \beta^*, \theta^*, \tau^*)$ for A by the following steps:

Step 1: Without loss of generality, we assume that the corresponding hash queries for $ID^*$ have already been made by A. If either $bt_1^*$ or $bt_2^*$ equals 0, B aborts. Otherwise, B computes the components of $C^*$, where $(s_1^*, s_2^*)$ are the random values related to $ID^*$ and stored in HT1 and HT2, respectively.

At last, the ciphertext $C^* = (\alpha^*, \beta^*, \theta^*, \tau^*)$ is returned to A as the challenge ciphertext.

Phase 2:
After receiving the challenge $C^*$, the adversary A can make further queries as in Phase 1, subject to the restrictions stated in Definition 1.

Guess:
When Phase 2 terminates, the adversary A outputs a bit $\lambda'$. If $\lambda' = \lambda$, B outputs 1; else, B outputs 0. The former indicates that $e(g, g)^{fsk}$ equals $\delta$, while the latter indicates that it does not.

Analysis:
According to the steps in the challenge phase, if $e(g, g)^{fsk} = \delta$, the simulated $C^*$ is a valid ciphertext, and hence the advantage of A in breaking our construction is non-negligible, i.e., $Adv(A) = |\Pr[\lambda' = \lambda] - 1/2| \geq \varepsilon$. On the contrary, if $e(g, g)^{fsk} \neq \delta$, the adversary A has no advantage in guessing $\lambda$, meaning that $\Pr[\lambda' = \lambda] = 1/2$. Let $\Pr[Perfect]$ be the probability that the entire simulation game runs perfectly without accidental termination. Consequently, the advantage of B in solving the DBDH problem can be expressed as

$\varepsilon' = |\Pr[B(g, g^f, g^s, g^k, e(g, g)^{fsk}) = 1] - \Pr[B(g, g^f, g^s, g^k, \delta) = 1]| \geq \varepsilon \cdot \Pr[Perfect]$.

To estimate $\Pr[Perfect]$, we first consider the probability that B does not abort in any of the queries simulated above. In a private-key query, B aborts when both $bt_1$ and $bt_2$ equal 1, which happens with probability $(1 - \psi)^2$. Hence, all $q_{pk}$ private-key queries are answered without abort with probability at least $(1 - (1 - \psi)^2)^{q_{pk}}$. Similarly, in a permission query, B aborts when both $bt_1$ and $bt_2$ related to $ID_o$ or $ID_u$ equal 1. Letting $PrQ$ denote the event that B aborts in a permission query, we obtain $\Pr[\neg PrQ] \geq (1 - (1 - \psi)^2)^{q_{pr}}$. Finally, in the challenge phase, B aborts if either $bt_1^*$ or $bt_2^*$ corresponding to $ID^*$ equals 0. That is, B does not abort in the challenge phase with probability $(1 - \psi)^2$. Combining all of these probability events, we have

$\Pr[Perfect] \geq (1 - (1 - \psi)^2)^{q_{pk} + q_{pr}} (1 - \psi)^2$.

When $\psi = 1 - \dfrac{1}{\sqrt{q_{pk} + q_{pr} + 1}}$, this bound achieves its maximum value, which is at least $\dfrac{1}{e(q_{pk} + q_{pr} + 1)}$, where $e$ is the base of the natural logarithm. Accordingly, the advantage for B to solve the DBDH problem is calculated as $\varepsilon' \geq \dfrac{\varepsilon}{e(q_{pk} + q_{pr} + 1)}$.

Theorem 2. The proposed construction is secure against a dishonest fog node (proxy) that attempts to learn the plaintext from the ciphertext requested by a data user.
Proof. In the permission phase of our scheme, a dishonest proxy can obtain the re-encryption key $RK$ composed of four subkeys, in which $RK_1 = SK_{o,1} \cdot g^{x h_3(e(PW^{\pi}, Q))}$. If the proxy tries to derive the private key $SK_{o,1}$ for decrypting the original ciphertext $C = (\alpha, \beta, \theta, \tau)$, he has to know the two random numbers $(x, \pi)$. Since both are chosen by the data owner and never disclosed to the proxy, he cannot derive $SK_{o,1}$ from the re-encryption key $RK_1$. Furthermore, if he attempts to learn the plaintext from the re-encrypted ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho, \Phi)$, he will be unable to compute the parameter $I$ owing to the lack of the data user's private keys.
Theorem 3. The proposed construction is secure against a malicious or compromised PKG that attempts to gain access to any cloud ciphertext without having the corresponding private key.
Proof. In the keygen algorithm of the proposed system, we also adopt the technique of anonymous key generation to issue each user's private keys. Although the private key $SK_{i,2}$ is controlled by the PKG, the PKG cannot derive the private key $SK_{i,1} = g^{ab} (h_1(ID_i \| t_i) \cdot h_2(ID_i \| ID_{PKG}))^{d_i}$ without knowing the secret value $t_i$ chosen by the user. Nor can the PKG directly compute $h_1(ID_i \| t_i)$ from the received parameter $T_i$, since it does not know $Z_i = g^{z_i}$. Without the full private keys of a user, a malicious or compromised PKG cannot decrypt either an original or a re-encrypted ciphertext with Equations (19) and (26).

Theorem 4.
The proposed construction is secure against any revoked user who attempts to impersonate a legitimate user to gain access to any cloud ciphertext without having the corresponding private key.
Proof. According to the query algorithm in our scheme, a revoked user $ID_u$ impersonating a non-revoked user $ID_v$ first chooses a random number $w \in_R \mathbb{Z}_p^*$ to compute $W = g^w$, and sends $(ID_v, M_{cate}, W)$ to the nearby proxy. Since the transmitted identity is $ID_v$, the request will not be rejected by the PKG in the permission phase. Finally, $ID_u$ will receive a re-encrypted ciphertext $C' = (\alpha', \beta, \theta, \tau, \eta, \rho, \Phi)$. However, to decrypt the ciphertext, he needs to know the correct private keys of $ID_v$ in addition to his chosen random number $w$. Without the former, he cannot recover the original message.
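As a numerical sanity check on the analysis of Theorem 1, the choice of $\psi$ can be verified with a short Python sketch. It assumes the abort conditions stated in that proof (a private-key or permission query aborts when both coin bits equal 1, and the challenge phase requires both bits to equal 1), so that the no-abort probability is bounded below by $(1 - (1 - \psi)^2)^{q_{pk} + q_{pr}} (1 - \psi)^2$. The function names and the sample query counts are illustrative assumptions.

```python
import math


def perfect_lower_bound(psi, q_pk, q_pr):
    """Lower bound on Pr[Perfect] for a given coin bias psi = Pr[bt = 0]."""
    u = (1.0 - psi) ** 2                 # probability that both bits equal 1
    return (1.0 - u) ** (q_pk + q_pr) * u


def optimal_psi(q_pk, q_pr):
    """Closed-form maximizer: (1 - psi)^2 = 1 / (q_pk + q_pr + 1)."""
    return 1.0 - 1.0 / math.sqrt(q_pk + q_pr + 1)


q_pk, q_pr = 40, 20                      # sample query counts (assumption)
psi_star = optimal_psi(q_pk, q_pr)
best = perfect_lower_bound(psi_star, q_pk, q_pr)

# A coarse grid search over psi should never beat the closed-form choice.
grid_best = max(perfect_lower_bound(i / 10000, q_pk, q_pr)
                for i in range(1, 10000))

# Bound used in Theorem 1.
theorem_bound = 1.0 / (math.e * (q_pk + q_pr + 1))
```

Under these assumptions, `best` is at least `theorem_bound` and `grid_best` does not exceed `best`, which matches the reduction loss of $e(q_{pk} + q_{pr} + 1)$ stated in Theorem 1.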
We compare the functionality and security of our improved construction with some previous ones including Han [24] in Table 2. The table shows that only the ZBW scheme and ours support anonymous key generation and user revocation. The schemes of THJ and Mat cannot resist the collusion attack mounted by the proxy and the data user. Except for the proposed construction, all the other compared schemes are subject to the malicious (compromised) PKG attack. To sum up, our improved DS variant offers the best functionality and security among all compared mechanisms.

The communication overheads are evaluated in terms of the length of the query token, the ciphertext and the re-encrypted ciphertext. For simplicity, the length of the identity and that of the data category name are ignored in the comparison. Assume that the output length of symmetric encryption is $|SE|$. The detailed results are shown in Table 4. The query token of our scheme is shorter than that of ZBW by $|G_1|$. Yet, when transmitting the re-encrypted ciphertext, our scheme has to send an extra element of $G_2$. We claim that the extra element is crucial for protecting the ciphertext from unauthorized decryption. The numerical comparisons are also illustrated in Figure 6, using an elliptic curve of embedding degree 2 for which the order $p$ is 160 bits and a field element is 512 bits.
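The two differences in communication overhead relative to ZBW can be turned into bit counts with a small Python sketch. The field size and embedding degree are taken from the text; the sizes assigned to $|G_1|$ and $|G_2|$ are assumptions made here (a $G_1$ element stored as one compressed field element, and a $G_2$ element living in the degree-2 extension field), and the function name is hypothetical.

```python
FIELD_BITS = 512          # size of a field element, from the text
EMBEDDING_DEGREE = 2      # embedding degree of the curve, from the text

# Assumptions: compressed G1 point; G2 element in the degree-2 extension field.
G1_BITS = FIELD_BITS
G2_BITS = EMBEDDING_DEGREE * FIELD_BITS


def overhead_delta_bits():
    """Difference (ours minus ZBW) for each transmitted item, in bits."""
    return {
        "query_token": -G1_BITS,              # shorter by |G1|
        "reencrypted_ciphertext": +G2_BITS,   # one extra G2 element
    }


delta = overhead_delta_bits()
net = sum(delta.values())   # net change per query and re-encryption round
```

With these assumed encodings, the saving of one $G_1$ element in the query token offsets half of the extra $G_2$ element in the re-encrypted ciphertext; a different point encoding would change the figures accordingly.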

Conclusions
To enhance the security of increasingly popular data applications in fog-enabled IoT environments, this paper proposed an improved data storage scheme following Zhang et al.'s work [24]. We pointed out several security vulnerabilities in their scheme. Concretely, their scheme fails to satisfy the basic access policy that only a user owning the correct private key can decrypt the corresponding cloud ciphertext. Hence, an adversary such as a dishonest proxy, a malicious PKG or a revoked user can arbitrarily request a cloud ciphertext and decrypt it without knowledge of the corresponding private key. To eliminate these security flaws, we modified some of the algorithms in our improved system. Moreover, we formally proved that our construction is IND-PrID-CPA-secure in the random oracle model. The advantages of anonymous key generation and user revocation are preserved in the proposed variant, which therefore provides better security protection for applications in fog-enabled IoT environments. Although the computational complexity is increased, we believe that this is a worthwhile trade-off for a higher level of security. The limitation of our mechanism is that each user has to maintain an extra secret value chosen at the keygen phase, which is used in the decryption process. Future work should aim to incorporate attribute-based mechanisms to support more fine-grained access control policies.