Certificateless Provable Group Shared Data Possession with Comprehensive Privacy Preservation for Cloud Storage

Provable Data Possession (PDP) protocols make it possible for cloud users to check whether cloud servers possess their original data without downloading all of it. However, most existing PDP schemes are based on either public key infrastructure (PKI) or identity-based cryptography, and therefore suffer from expensive certificate management or key escrow. In this paper, we propose a new construction of a certificateless provable group shared data possession (CL-PGSDP) protocol that uses certificateless cryptography to eliminate both issues. Moreover, by taking advantage of a zero-knowledge protocol and a randomization method, the proposed CL-PGSDP protocol leaks no information about the stored data or the group users' identities to the verifier during the verification process, a property we call comprehensive privacy preservation. In addition, our protocol supports efficient user revocation from the group. Security analysis and experimental evaluation indicate that our CL-PGSDP protocol provides strong security with desirable efficiency.


Introduction
In recent years, cloud computing [1] has received considerable attention from research communities in academia as well as industry. As an important part of cloud computing, cloud storage has become a popular choice for data storage and brings a number of benefits. Users are relieved of the burden of managing large amounts of data. Furthermore, universal data access independent of geographical location is highly convenient.
However, a number of vulnerabilities that have led to various attacks have left many potential users worried [2]. Thus, many researchers focus on creating trusted cloud services that provide the necessary security guarantees. Santos et al. [3] presented a trusted cloud-computing platform (TCCP), which offers a closed-box execution environment for IaaS services. TCCP guarantees confidential execution of guest virtual machines. It also enables customers to attest to the IaaS provider and to determine whether the service is secure before their VMs are launched into the cloud. In 2016, Paladi et al. [4] presented a security framework for cloud infrastructure. This framework includes a trusted VM launch protocol and a domain-based storage protection protocol. The trusted VM launch protocol is used before deploying guest VMs, and trust is established by remotely attesting the host platform configuration. Meanwhile, the domain-based storage protection protocol ensures data confidentiality in remote storage by using cryptographic methods.
Furthermore, unlike traditional storage, the cloud stored data is outside of the control of the users, which entails security risks in terms of confidentiality, integrity, and availability of data and service [5].
One of the major concerns of cloud users is the integrity of their outsourced data. Moreover, the cloud server cannot be fully trusted to report incidents of data loss, because it may conceal them to protect its reputation [6-8]. As a result, it is necessary for cloud users to periodically check whether their outsourced data are stored properly. A Provable Data Possession (PDP) scheme is a primitive that can be used to convince cloud users that their data are kept intact.
As is well known, it is impractical for cloud users to frequently download all the cloud data due to the high cost of bandwidth. Additionally, traditional methods for data integrity checking, such as hash functions and Message Authentication Codes (MACs), cannot be applied directly, because the cloud server may store only the hash code of the original data in order to reduce storage costs. In 2007, Ateniese et al. [9] proposed a provable data possession scheme to check the integrity of data, which employed the RSA algorithm to construct homomorphic verifiable authenticators of the data blocks, so that the cloud servers were able to prove data integrity with low communication overhead and computational cost. After that, many researchers proposed corresponding system models and security models based on the first PDP scheme. Later, the proof of retrievability (POR) model was introduced by Juels et al. [10], who used an error-correcting code to establish a sentinel-based POR scheme. Shacham et al. [11] developed a proof of retrievability scheme based on the BLS signature [12], which not only eliminates the constraint on the number of checks, but also shortens the size of the authenticator. In order to support dynamic operations on cloud data, Ateniese et al. [13] presented a scalable and efficient provable data possession scheme based on hash functions and symmetric key encryption. The limitation is that this scheme cannot support data block insertion. Erway et al. [14] proposed a fully dynamic PDP scheme by utilizing an authenticated skip list. Similarly, Cash et al. [15] proposed a dynamic POR scheme that relies on oblivious RAM protocols. Wang et al. [16] made further improvements to the previous dynamic PDP schemes by using a Merkle hash tree (MHT). Liu et al. [17] introduced a top-down, levelled, multi-replica, MHT-based data auditing scheme for dynamic big data storage in the cloud, which supports fully dynamic data updates and authentication of block indices.
Public verifiability is one of the most important properties of PDP schemes; it means that external verifiers are able to check the integrity of cloud data. With this practical property, cloud users can delegate the right to check data possession to a third-party auditor (TPA), who performs the periodic checks. However, data privacy may be leaked to the TPA during the checking process, which may cause financial loss for users who have stored confidential or sensitive data on cloud servers. Recently, a number of schemes [16,18-22] have been developed that allow a TPA to check the integrity of the stored data. Wang et al. [20] proposed a privacy-preserving public cloud data auditing system by combining a public key-based homomorphic authenticator with random masking. This scheme defined data privacy against the TPA. Wang et al. [21] performed a further study on data privacy preservation and proposed the notion of 'zero-knowledge public auditing' to resist off-line guessing attacks. Yu et al. [22] enhanced the privacy of the remote data integrity-checking protocol for secure cloud storage. The aforementioned schemes [16,18-22] all work only in public key infrastructure (PKI) [23]-based systems, which may suffer from heavy public key management. Yu et al. [24] further presented an identity-based PDP scheme to eliminate heavy public key management. This scheme leaks no information about the stored data to the TPA, but cannot protect the privacy of the user's identity. If the data is shared by a group of users, such as a company, the TPA will learn all the details of the group users' identities.
In order to reduce the complexity of the PDP scheme, many researchers [24-27] have focused on studying identity-based PDP schemes. By utilizing identity-based cryptography (IBC) [28], there is no need for a PKI to perform complex certificate management such as distribution, storage, revocation, and verification. Unfortunately, IBC has an inherent drawback of key escrow. Wang et al. [29] first proposed a certificateless public auditing mechanism for verifying data integrity in the cloud in order to eliminate the problem of key escrow. In this scheme, the Key Generation Center generates only the partial key, so that it can never compromise the user's private key. Li et al. [30] introduced a certificateless PDP scheme for shared group data, but it does not preserve the privacy of the cloud data or the user's identity against the TPA.
Motivation: In this paper, we mainly focus on preserving the privacy of group shared data against the third-party auditor (TPA) during the integrity checking process. Suppose there is a group of users from one company who share company data on a given cloud server. Each user of the group can upload and share their data on the cloud server. The manager of the group requires a TPA to periodically check the integrity of the outsourced data, but does not allow the TPA to extract any information related to the data, not even the identity of the group users. Users of the group may leave the company, so the problem of user revocation from the group needs to be considered. Thus, we need a primitive that meets these requirements and guarantees the integrity of the outsourced data on the cloud server.
Our contribution: The contributions of this paper are summarized as follows:

• First, we propose a new PDP protocol (CL-PGSDP) for group shared data by utilizing certificateless cryptography [31], which eliminates the problems of certificate management and key escrow.

• Second, by making use of the idea of a zero-knowledge proof protocol, the equality of discrete logarithms [32-34], and a randomization method, we construct a privacy-preserving CL-PGSDP protocol. On the one hand, our protocol leaks no information about the group shared data to the TPA. On the other hand, all the data blocks are signed by group users to obtain the corresponding authentication tags, and the TPA cannot learn any identity information from the challenged data blocks during the auditing process.

• Third, based on the CDH and DL assumptions, we provide detailed security proofs of our new protocol. Additionally, our protocol supports efficient group user revocation. We perform experiments that show the practicality of our protocol.

Organization: The rest of the paper is organized as follows: In Section 2, we review some preliminaries used in the CL-PGSDP construction. In Section 3, we formalize the system model and security model of the CL-PGSDP protocol. We describe the concrete construction of the CL-PGSDP protocol in Section 4. We formally prove the correctness, soundness, and comprehensive privacy preservation of our protocol in Section 5. We report the performance and implementation results in Section 6. Section 7 concludes our paper.

Preliminaries
In this section, we review the preliminaries used in this paper, including bilinear pairings, certificateless cryptography, zero-knowledge proofs, and complexity assumptions.

Bilinear Pairing
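Let $G_1$ and $G_2$ be two multiplicative cyclic groups of prime order $q$, and let $g$ be a generator of $G_1$.
A function $e: G_1 \times G_1 \to G_2$ is called a bilinear pairing [12] if it has the following properties:

• Bilinearity: $e(u^a, v^b) = e(u, v)^{ab}$ for all $u, v \in G_1$ and $a, b \in Z_q^*$.

• Non-degeneracy: $e(g, g) \neq 1$.

• Efficient Computation: $e(u, v)$ can be computed efficiently for all $u, v \in G_1$.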
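The pairing properties can be checked numerically on a toy model. The sketch below is an illustration only, under assumptions that are not part of the paper: an element $g^a$ of $G_1$ is stored as its exponent $a$ modulo a small prime $q$, and the pairing is modelled as $e(g^a, g^b) = g_T^{ab}$ in the order-$q$ subgroup of $Z_p^*$ with $p = 2q + 1$. The names `pairing`, `g1_pow`, and `bilinearity_holds` are hypothetical. This model exposes all discrete logarithms and is therefore insecure; it only reproduces the algebra.

```python
import secrets

# Toy parameters (assumption, NOT secure): p = 2q + 1 with p and q prime.
P, Q = 2039, 1019
GT_GEN = 4  # generator of the order-q subgroup of Z_p^*; plays the role of e(g, g)


def pairing(u: int, v: int) -> int:
    """Toy pairing. A G1 element g^a is stored as its exponent a (mod q)."""
    return pow(GT_GEN, (u * v) % Q, P)


def g1_pow(u: int, k: int) -> int:
    """Exponentiation (g^u)^k in the exponent representation of G1."""
    return (u * k) % Q


def bilinearity_holds(trials: int = 50) -> bool:
    """Check e(u^a, v^b) == e(u, v)^(ab) on random non-zero inputs."""
    for _ in range(trials):
        u, v, a, b = (secrets.randbelow(Q - 1) + 1 for _ in range(4))
        lhs = pairing(g1_pow(u, a), g1_pow(v, b))
        rhs = pow(pairing(u, v), (a * b) % Q, P)
        if lhs != rhs:
            return False
    return True


print(bilinearity_holds())   # bilinearity on random samples
print(pairing(1, 1) != 1)    # non-degeneracy: e(g, g) is not the identity
```

A real deployment would use a pairing-friendly elliptic curve from a cryptographic library instead of this exponent model.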

Certificateless Cryptography
1. Setup: This algorithm takes a security parameter $k$ as input and returns the system parameters params and the master-key.
2. Partial-Private-Key-Extract: This algorithm takes params, master-key, and an entity's identity ID as inputs and returns a partial private key $sk_{ID}$ for the entity.
Decrypt: This algorithm takes params, $\sigma$, and $S_{ID}$ as inputs and returns the message $m$.
By making use of certificateless cryptography, we will construct a new authenticator for data blocks.

Zero-Knowledge Proof
Zero-knowledge proofs [33] are proofs that convey no knowledge beyond the correctness of the proposition in question. Here, we introduce one zero-knowledge protocol: the equality of discrete logarithms [34]. Let $G$ be a finite cyclic group of prime order $q$, and let $g_1, g_2$ be generators of $G$. The protocol lets a prover (P) convince a verifier (V) that two public values have the same discrete logarithm with respect to $g_1$ and $g_2$, without leaking the secret key to V.

3. P computes $y = \rho - vx \pmod{q}$, in which $x$ is the secret key of P, and returns $y$ to V.
4. V accepts the proof if and only if the two verification equations, one with respect to $g_1$ and one with respect to $g_2$, both hold.
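A minimal sketch of a discrete-logarithm equality proof in the style of Chaum and Pedersen may help to fix ideas. Only the response $y = \rho - vx \bmod q$ is taken from the text above; the rest is an assumption made for illustration: the public values are named `h1` $= g_1^x$ and `h2` $= g_2^x$, the commitments are `a1` $= g_1^\rho$ and `a2` $= g_2^\rho$, the challenge `v` is a random non-zero scalar, and the group is the order-$q$ subgroup of $Z_p^*$ for a toy safe prime. The function name `prove_and_verify` is hypothetical.

```python
import secrets

P, Q = 2039, 1019          # toy safe-prime pair (assumption, NOT secure)
G1_GEN, G2_GEN = 4, 9      # two generators of the order-q subgroup of Z_p^*


def prove_and_verify(x: int, h1: int, h2: int) -> bool:
    """Run one round between an honest P and V; return V's decision."""
    # Step 1 (P): pick a fresh nonce rho and send commitments a1, a2.
    rho = secrets.randbelow(Q - 1) + 1
    a1, a2 = pow(G1_GEN, rho, P), pow(G2_GEN, rho, P)
    # Step 2 (V): send a random non-zero challenge v.
    v = secrets.randbelow(Q - 1) + 1
    # Step 3 (P): respond with y = rho - v*x mod q.
    y = (rho - v * x) % Q
    # Step 4 (V): accept iff a1 = g1^y * h1^v and a2 = g2^y * h2^v.
    ok1 = a1 == (pow(G1_GEN, y, P) * pow(h1, v, P)) % P
    ok2 = a2 == (pow(G2_GEN, y, P) * pow(h2, v, P)) % P
    return ok1 and ok2


x = 123                                   # prover's secret key (example value)
h1, h2 = pow(G1_GEN, x, P), pow(G2_GEN, x, P)
print(prove_and_verify(x, h1, h2))        # equal logarithms
print(prove_and_verify(x, h1, pow(G2_GEN, x + 1, P)))  # unequal logarithms
```

The verifier sees only `a1`, `a2`, and `y`; because `rho` is fresh and uniform, `y` reveals nothing about `x` in the honest-verifier setting.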

Security Assumption
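Discrete Logarithm (DL) problem [35]: Let $G_1$ be a multiplicative cyclic group of prime order $q$ with generator $g$. Given $(g, g^a) \in G_1$, compute $a$.

Definition 1 (DL Assumption). For any probabilistic polynomial time (PPT) algorithm $A$, the advantage of $A$ in solving the DL problem in $G_1$ is negligible, which can be defined as $Adv^{DL}_{A} = \Pr[A(g, g^a) = a : a \in Z_q^*] \le \varepsilon$.

Computational Diffie-Hellman (CDH) problem: Given $(g, g^a, g^b) \in G_1$ for unknown $a, b \in Z_q^*$, compute $g^{ab}$.

Definition 2 (CDH Assumption). For any PPT algorithm $A$, the advantage of $A$ in solving the CDH problem in $G_1$ is negligible, which can be defined as $Adv^{CDH}_{A} = \Pr[A(g, g^a, g^b) = g^{ab} : a, b \in Z_q^*] \le \varepsilon$.
Here, $\varepsilon$ denotes a negligible value in the above definitions.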

System Model and Security Model
In this section, we introduce the system model and security model of CL-PGSDP protocol.

CL-PGSDP System
The system model of our scheme is composed of three different entities: the user group, the cloud service provider (CSP), and the third-party auditor (TPA). Figure 1 illustrates the relationships and interactions among the three entities of the system. The user group includes a number of users who have a large amount of data to be stored on the cloud without keeping a local copy (each user is able to upload, access, and update the outsourced group shared data). We suppose one of the users is the group manager, who sets up the system and generates the system parameters. The CSP has significant storage space and computation resources and provides data storage services for cloud users. The CSP is semi-trusted and might even hide data corruption incidents from cloud users to maintain its reputation. The TPA has the expertise and capabilities to check data possession on behalf of the cloud users, but the TPA is also curious in the sense that it tries to learn information during the data integrity checking procedure.

System Components
Eight algorithms are involved in the CL-PGSDP system.

1. Setup is a probabilistic algorithm run by the group manager. It takes a security parameter $\lambda$ as input and outputs the system parameters params and the master key msk.

2. Partial-Private-Key-Gen is a probabilistic algorithm run by the group manager. It takes the master key msk, a random value $\delta$, and the identity $ID_i \in \{0, 1\}^*$ of the user $u_i$ as inputs, and outputs $u_i$'s partial private key $psk_{ID_i}$.

3. Secret-Value-Gen is a probabilistic algorithm run by the group user, who randomly selects $y_{ID_i}$ as the secret value. Thus, the private key of the group user contains two parts: the secret value $y_{ID_i}$ and the partial private key $psk_{ID_i}$.

4. Public-Key-Gen is a probabilistic algorithm performed by the group user $u_i$ to compute the public key. It takes $u_i$'s secret value $y_{ID_i}$ as input and outputs $u_i$'s public key $pk_{ID_i}$.

5. Tag-Gen is a probabilistic algorithm executed by the group user $u_i$ to generate authentication tags for data blocks. It takes $u_i$'s partial private key $psk_{ID_i}$, the secret value $y_{ID_i}$, and the data block $m_j$ as inputs, and outputs the tag $\sigma_j$ of $m_j$.

6. Challenge is a randomized algorithm run by the TPA. It takes the system parameters params, a unique file name, and the count $c$ of the challenged data blocks as inputs, and outputs the challenge information chal.

7. Proof-Gen is a probabilistic algorithm run by the cloud server to obtain a data possession proof $P$ of the challenged blocks. The inputs include chal, the challenged data blocks, and the tags of the challenged data blocks.

8. Proof-Check is a deterministic algorithm run by the TPA. It takes the proof $P$, the challenge information chal, and the user's public key as inputs. If $P$ is correct, this algorithm outputs 1; otherwise, it outputs 0.

System Security
We consider three security properties, namely, completeness, soundness, and comprehensive privacy preservation against the TPA in our CL-PGSDP protocol.

• Completeness means that the cloud server can pass the possession checking procedure as long as it properly stores the group shared data.

• Comprehensive privacy preservation means that the TPA obtains no information about the data blocks or the users' identities during the integrity checking procedure.

• Soundness states that whenever the cloud server convinces the TPA to accept its proof, the cloud server must actually store the challenged data blocks.

According to certificateless cryptography [30,31], we consider three types of probabilistic polynomial-time (PPT) adversaries, namely, $A_1$, $A_2$, and $A_3$, and a challenger $C$ in our security model, and define the security of our protocol by three games. The details are as follows:

Game 1: This game is played by challenger $C$ and adversary $A_1$, who is able to substitute the user's public key with any other value but cannot access the master key of the system.
Setup: Challenger $C$ runs the Setup algorithm to obtain the system parameters params and the master secret key msk, and forwards params to the adversary $A_1$ while keeping msk confidential.
Queries: $A_1$ can adaptively issue the following queries to $C$. $C$ maintains the corresponding query lists, which are initially empty, and responds to the queries of $A_1$ as follows.
(1) Hash Query.
Forge: Finally, $A_1$ outputs a forged tag $\sigma$ for the data block $m$ with the identity ID and the public key $pk_{ID}$.
If the forged tag $\sigma$ is valid after the above queries, then $A_1$ wins the game.

Game 2: This game is played by challenger $C$ and adversary $A_2$, who is able to get the master key but cannot substitute the group user's public key.
Setup: Challenger $C$ runs the Setup algorithm to obtain the system parameters params and the master secret key msk, and forwards params and msk to the adversary $A_2$.
Queries: $A_2$ can adaptively make a number of queries to $C$. $C$ maintains the corresponding query lists, which are initially empty, and responds to the queries of $A_2$ as follows.
(1) Hash Query. $A_2$ makes hash function queries to $C$ for any identity ID, and $C$ returns the hash values to $A_2$.
(2) Secret Value Query. $A_2$ adaptively chooses an identity ID and submits it to $C$ to query the secret value of ID. $C$ runs the Secret-Value-Gen algorithm to generate the secret value for ID and sends it to $A_2$.
(3) Public Key Query. $A_2$ adaptively chooses an identity ID and submits it to $C$ to query the public key of ID. $C$ runs the Public-Key-Gen algorithm to compute the public key for ID and sends it to $A_2$.
(4) Tag Query. $A_2$ adaptively chooses a tuple (ID, $m$) and submits it to $C$ to query the tag of the data block $m$ generated by ID. $C$ runs the Tag-Gen algorithm to generate the tag of data block $m$ and sends it to $A_2$.
Forge: Finally, $A_2$ outputs a forged tag $\sigma$ for the data block $m$ with the identity ID.
If the forged tag $\sigma$ is valid after the above queries, then $A_2$ wins the game.
Definition 3. A CL-PGSDP scheme is secure against adaptive impersonation and tag forgery attacks if any PPT adversary $A$ ($A_1$ or $A_2$) who plays the above games with the challenger $C$ has only a negligible probability $\varepsilon$ of winning the games, that is, $\Pr(A_{win}) \le \varepsilon$, in which the probability is taken over all coin tosses made by $A$ and $C$.

Game 3: This game is played by challenger $C$ and adversary $A_3$, who aims to forge the data integrity proof to cheat the TPA. $A_3$ is regarded as the untrusted CSP. From Definition 3, we know that it is hard to forge the tag of a single data block. Thus, we focus on whether $A_3$ can forge an integrity proof that passes the challenge without holding the correct data.
Setup: Challenger $C$ runs the Setup algorithm to obtain the system parameters params, the master secret key msk, and the partial private keys for all users, and forwards only params to the adversary $A_3$.
Tag Queries: $A_3$ adaptively chooses a tuple (ID, $m$) and sends it to $C$ to query the tag of data block $m$. $C$ runs the Tag-Gen algorithm to generate the tag of $m$ and returns it to $A_3$.
Challenge: $C$ sends a random challenge chal to $A_3$ and requests $A_3$ to provide the corresponding data possession proof for chal.
Forge: For the challenge chal, $A_3$ generates a proof and sends it to $C$. If the proof passes the integrity verification while $A_3$ does not possess the correct data, $A_3$ wins the game.
Definition 4. A CL-PGSDP scheme is secure against adaptive impersonation and proof forgery attacks if any PPT adversary $A$ who plays Game 3 with the challenger $C$ has only a negligible probability $\varepsilon$ of winning the game, that is, $\Pr(A_{win}) \le \varepsilon$, in which the probability is taken over all coin tosses made by $A$ and $C$.

Our Construction
In this section, we provide a concrete construction of a certificateless provable group shared data possession protocol supporting comprehensive privacy preservation for cloud storage. We suppose the number of users in the group is $z$, and $ID_i$ represents the unique identity of the user $u_i$, in which $1 \le i \le z$. Without loss of generality, we set $u_1$ as the group manager, who sets up the system and generates the system parameters and the partial private keys for the other users. The group shared data $M$ is split into $n$ blocks, denoted as $M = \{m_j\}_{1 \le j \le n}$, $m_j \in Z_q^*$. In the Partial-Private-Key-Gen algorithm, we employ a short signature [12] and a randomization method to produce the partial private key for each user of the group. In the Tag-Gen algorithm, we take advantage of the idea of certificateless cryptography [31] to construct the tags of the data blocks using the partial private key and the secret value. In the challenge phase, the TPA randomly chooses some indexes of the data blocks and corresponding random values as a challenge to the CSP. In the proof generation phase, the CSP computes a response to the TPA. We utilize the idea of zero-knowledge proofs [33,34] to design the details of the interaction between the TPA and the CSP. The details of the proposed protocol are as follows:

• Setup. This algorithm is run by $u_1$. On input of the security parameter $\lambda$, $u_1$ chooses two cyclic multiplicative groups, $G_1$ and $G_2$, of prime order $q$, with $\log_2 q \le \lambda$. $g$ is a generator of $G_1$. There exists a bilinear map $e: G_1 \times G_1 \to G_2$. $u_1$ selects three secure hash functions $H_1, H_2: \{0, 1\}^* \to G_1$ and $H_3: G_2 \to \{0, 1\}^l$, a pseudo-random permutation (PRP) $\pi: Z_q^* \times \{1, \ldots, n\} \to \{1, \ldots, n\}$, and a pseudo-random function (PRF) $\phi: Z_q^* \times Z_q^* \to Z_q^*$. $u_1$ initializes a public log file LF, which is used to record the indexes of the data blocks and the information of the corresponding tag generators. $u_1$ randomly chooses $x \in Z_q^*$ as the master secret key msk and $\delta \in Z_q^*$ as a secret value, and computes $P_{pub} = g^x$. $u_1$ keeps the master secret key msk and $\delta$ private, and publishes the system parameters params $= \{G_1, G_2, P_{pub}, H_1, H_2, H_3, e, g, q, LF, \pi, \phi\}$.

• Partial-Private-Key-Gen. This algorithm is run by $u_1$. When receiving the identity $ID_i \in \{0, 1\}^*$ of the user $u_i$, $u_1$ computes $u_i$'s partial private key $psk_{ID_i} = H_1(ID_i + \delta)^x$ and sends $psk_{ID_i}$ and $H_1(ID_i + \delta)$ to $u_i$.

• Secret-Value-Gen. This algorithm is run by a group user. $u_i$ randomly selects $y_{ID_i} \in Z_q^*$ as the secret value and keeps it private.

• Public-Key-Gen. This algorithm is run by a group user. $u_i$ uses the secret value to compute the public key $Y_{ID_i} = g^{y_{ID_i}}$.

• Tag-Gen. Each user in the group can generate tags of data blocks using the partial private key and the secret value. Suppose user $u_i$ ($0 < i \le z$) generates an authentication tag for data block $m_j$ ($0 < j \le n$). The algorithm takes $u_i$'s partial private key $psk_{ID_i}$, the secret value $y_{ID_i}$, and the data block $m_j$ as inputs and outputs the tag $\sigma_j$ of $m_j$. The tag is computed as $\sigma_j = psk_{ID_i}^{m_j} \cdot H_2(\omega_j)^{y_{ID_i}}$, in which $\omega_j = j \| fname$, $j$ is the index of data block $m_j$, and $fname$ denotes the unique identity of data block $m_j$. Each time $u_i$ generates a tag for a data block $m_j$, $u_i$ updates the public log file LF with the index $j$ of $m_j$, $Y_{ID_i}$, and $H_1(ID_i + \delta)$. LF is a table in which each line records one tuple $(j, Y_{ID_i}, H_1(ID_i + \delta))$.
The user $u_i$ uploads the data blocks and their tags to the CSP. The CSP can check the validity of each tag using the following equation, which follows from the definitions of $\sigma_j$, $P_{pub}$, and $Y_{ID_i}$: $e(\sigma_j, g) = e(H_1(ID_i + \delta)^{m_j}, P_{pub}) \cdot e(H_2(\omega_j), Y_{ID_i})$.

• Challenge. This algorithm is run by the TPA, who picks a $c$-element subset $J$ of the set $[1, n]$ by the pseudo-random permutation (PRP) $\pi$; each element in $J$ denotes the index of a challenged data block. The TPA chooses a random element $\upsilon_j \in Z_q^*$ for each element in $J$ by the pseudo-random function (PRF) $\phi$. Let $Q$ be the set $\{(j, \upsilon_j)\}_{j \in J}$. To generate a challenge, the TPA searches the log file LF according to the set $J$ to get the information $\{(j, H_1(ID_i + \delta))\}$. The TPA picks a random value $t \in Z_q^*$ as a secret value and computes $T_1 = g^t$ and $T_{2j} = e(H_1(ID_i + \delta), P_{pub})^t$. Let $T_2 = \{T_{2j}\}_{j \in J}$. The TPA sends chal $= \{Q, T_1, T_2\}$ to the server.

• Proof-Check. Upon receiving the proof from the CSP, the TPA first searches the public log file LF to get the information $(j, Y_{ID_i})$ and checks the verification equation, in which $j \to i$ means that the information of $u_i$ can be found in the public log file LF by the index $j$ of data block $m_j$. If the equation holds, the TPA accepts the proof; otherwise, the proof is invalid. The processes of Challenge, Proof-Gen, and Proof-Check are summarized in Figure 2.
• Revocation-Tag-Gen. If user $u_m$ ($2 \le m \le z$) leaves the group, user $u_n$ will be the successor of $u_m$. The following procedure efficiently updates the tags generated by $u_m$. It requires $u_m$, $u_n$, and the CSP to be online simultaneously.
(1) The CSP randomly selects $\alpha \in Z_q^*$ and sends it to $u_n$.
(2) Upon receiving $\alpha$, $u_n$ computes its response; the exchange among $u_n$, $u_m$, and the CSP yields a value based on $y_{ID_n}$ and $y_{ID_m}$.
(4) The CSP updates the tag of each data block $m_i$ signed by $u_m$, in which $\sigma$ denotes the tag generated by $u_m$.
The proof of the correctness of algorithm Revocation-Tag-Gen is as follows:
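The key generation and tagging steps of the construction can be exercised on a toy model. The sketch below is not the authors' implementation and rests on several assumptions: $G_1$ elements are stored as exponents modulo a small prime (so the pairing is simulated and offers no security), `hash_to_g1` stands in for both $H_1$ and $H_2$, and the notations $ID_i + \delta$ and $j \| fname$ are modelled as string concatenation with a separator. The tag check $e(\sigma_j, g) = e(H_1(ID_i + \delta)^{m_j}, P_{pub}) \cdot e(H_2(\omega_j), Y_{ID_i})$ used here is derived from the tag definition $\sigma_j = psk_{ID_i}^{m_j} \cdot H_2(\omega_j)^{y_{ID_i}}$. All function names are hypothetical.

```python
import hashlib
import secrets

P, Q, GT_GEN = 2039, 1019, 4   # toy parameters (assumption, NOT secure)


def pairing(u: int, v: int) -> int:
    """Toy pairing on exponent-encoded G1 elements: e(g^u, g^v) = gT^(uv)."""
    return pow(GT_GEN, (u * v) % Q, P)


def hash_to_g1(label: str) -> int:
    """Toy stand-in for H1/H2: map a string to a non-zero exponent."""
    digest = hashlib.sha256(label.encode()).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def rand_scalar() -> int:
    return secrets.randbelow(Q - 1) + 1


def setup():
    """Return (x, delta, P_pub); P_pub = g^x is stored as the exponent x."""
    x, delta = rand_scalar(), rand_scalar()
    return x, delta, x


def partial_key(x: int, delta: int, identity: str):
    """psk = H1(ID + delta)^x, returned together with H1(ID + delta)."""
    h1 = hash_to_g1(f"{identity}|{delta}")
    return (h1 * x) % Q, h1


def tag_gen(psk: int, y: int, m: int, j: int, fname: str) -> int:
    """sigma_j = psk^m * H2(j || fname)^y in exponent representation."""
    h2 = hash_to_g1(f"{j}|{fname}")
    return (psk * m + h2 * y) % Q


def tag_valid(sigma, m, j, fname, h1, p_pub, big_y) -> bool:
    """Check e(sigma, g) == e(H1^m, P_pub) * e(H2(w_j), Y)."""
    h2 = hash_to_g1(f"{j}|{fname}")
    lhs = pairing(sigma, 1)                      # g is encoded as exponent 1
    rhs = (pairing((h1 * m) % Q, p_pub) * pairing(h2, big_y)) % P
    return lhs == rhs


x, delta, p_pub = setup()
psk, h1 = partial_key(x, delta, "alice")
y = rand_scalar()                                # secret value; Y = g^y encoded as y
sigma = tag_gen(psk, y, m=42, j=1, fname="report.dat")
print(tag_valid(sigma, 42, 1, "report.dat", h1, p_pub, y))
```

In this encoding the public key and the secret value coincide, which is exactly why the model is only suitable for checking the algebra.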

Security Analysis of the New Protocol
In this section, we show that our protocol is secure with the properties of completeness, soundness, and comprehensive privacy preservation.

Completeness
If the CSP properly stores data, it can always pass the verification.The completeness of the protocol can be demonstrated as follows:
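The completeness claim can be checked numerically under explicit assumptions. The exact Proof-Gen and Proof-Check formulas are not reproduced here, so the sketch below assumes one form that is consistent with the challenge values $T_1 = g^t$ and $T_{2j} = e(H_1(ID_i + \delta), P_{pub})^t$: the CSP aggregates $\sigma = \prod_j \sigma_j^{\upsilon_j}$ and $\mu = \prod_j T_{2j}^{\upsilon_j m_j}$ and returns $H_3(e(\sigma, T_1) \cdot \mu^{-1})$, and the TPA compares it with $H_3(\prod_j e(H_2(\omega_j)^{\upsilon_j}, Y_{ID_i}^t))$. The pairing is the insecure exponent-encoded toy model, the challenge is sampled directly rather than through a PRP and PRF, and all function names are hypothetical.

```python
import hashlib
import secrets

P, Q, GT_GEN = 2039, 1019, 4   # toy parameters (assumption, NOT secure)


def pairing(u, v):
    """Toy pairing on exponent-encoded G1 elements."""
    return pow(GT_GEN, (u * v) % Q, P)


def h_g1(label):
    """Toy stand-in for H1/H2: string -> non-zero exponent."""
    d = hashlib.sha256(label.encode()).digest()
    return int.from_bytes(d, "big") % (Q - 1) + 1


def h3(gt_elem):
    """Stand-in for H3: G2 element -> digest."""
    return hashlib.sha256(str(gt_elem).encode()).hexdigest()


def rnd():
    return secrets.randbelow(Q - 1) + 1


def outsource(blocks, ids, fname):
    """Setup, key generation and Tag-Gen; returns (P_pub, stored, log)."""
    x, delta = rnd(), rnd()
    users = []
    for ident in ids:
        h1 = h_g1(f"{ident}|{delta}")
        users.append({"h1": h1, "psk": h1 * x % Q, "y": rnd()})
    stored, log = {}, {}
    for j, m in enumerate(blocks, start=1):
        u = users[j % len(users)]                 # some group user signs block j
        sigma = (u["psk"] * m + h_g1(f"{j}|{fname}") * u["y"]) % Q
        stored[j] = (m, sigma)
        log[j] = (u["y"], u["h1"])                # (Y_ID, H1(ID + delta)), encoded
    return x, stored, log


def challenge(log, indices, p_pub):
    """TPA: returns its secret t and chal = (Q, T1, T2)."""
    t = rnd()
    q_set = {j: rnd() for j in indices}
    t2 = {j: pow(pairing(log[j][1], p_pub), t, P) for j in indices}
    return t, (q_set, t, t2)                      # T1 = g^t is encoded as t


def proof_gen(stored, chal):
    """CSP (assumed form): H3(e(sigma, T1) / mu)."""
    q_set, t1, t2 = chal
    sigma = sum(v * stored[j][1] for j, v in q_set.items()) % Q
    mu = 1
    for j, v in q_set.items():
        mu = mu * pow(t2[j], (v * stored[j][0]) % Q, P) % P
    return h3(pairing(sigma, t1) * pow(mu, -1, P) % P)


def proof_check(proof, t, chal, log, fname):
    """TPA (assumed form): compare with H3(prod_j e(H2(w_j)^v_j, Y^t))."""
    q_set, _, _ = chal
    acc = 1
    for j, v in q_set.items():
        acc = acc * pairing(h_g1(f"{j}|{fname}") * v % Q, log[j][0] * t % Q) % P
    return proof == h3(acc)


p_pub, stored, log = outsource([11, 22, 33, 44], ["alice", "bob"], "f.dat")
t, chal = challenge(log, [1, 3, 4], p_pub)
print(proof_check(proof_gen(stored, chal), t, chal, log, "f.dat"))
```

Under these assumptions, $e(\sigma, T_1)$ factors into $\mu$ times the product that the TPA recomputes, so an honest CSP always passes, while changing any challenged block changes $\mu$ and therefore the hashed value.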

Soundness
Theorem 1. In the random oracle model, if a PPT adversary $A_1$ wins Game 1 defined in Section 3 with non-negligible probability $\varepsilon$, then there is an algorithm $B$ that can solve the CDH problem.
Proof of Theorem 1. Algorithm $B$ is given $(g, g^a, g^b) \in G_1$; its goal is to output $g^{ab} \in G_1$. Algorithm $B$ simulates the challenger and interacts with $A_1$ as follows.
Setup: $B$ produces the system parameters params and the secret key $\delta \in Z_q^*$, and sets $P_{pub} = g^a$, while $a$ remains unknown to $B$.
$H_1$-Query: At any time, $A_1$ can query the random oracle $H_1$. To respond to these queries, algorithm $B$ maintains a list of tuples $(ID, k_1, K)$ as $Tab_1$. When $A_1$ queries the oracle $H_1$ at the identity ID, algorithm $B$ responds as follows:
(1) If $ID \in Tab_1$, then algorithm $B$ retrieves the tuple $(ID, k_1, K)$ and responds with $K$ to $A_1$.
(2) Otherwise, $B$ picks a random $k_1 \in Z_q^*$ and computes $K = g^{b k_1}$. Then, it adds the tuple $(ID, k_1, K)$ to $Tab_1$ and responds with $K$ to $A_1$.
PartialKey-Query: At any time, $A_1$ can query the partial key for any identity ID. If $ID \notin Tab_1$, $B$ first makes the $H_1$-Query. $B$ maintains a list of tuples $(ID, psk)$ as $Tab_2$. When $A_1$ queries the PartialKey oracle at the identity ID, algorithm $B$ responds as follows:
(1) If $ID \in Tab_2$, then algorithm $B$ retrieves the tuple $(ID, psk)$ and responds with $psk$ to $A_1$.
(2) Otherwise, $B$ computes $psk = K^a$. Then, it adds the tuple $(ID, psk)$ to $Tab_2$ and responds with $psk$ to $A_1$.

according to the verification Equation (1). On the other hand, $B$ can retrieve
Finally, we can derive the value $g^{ab}$, which means that $B$ can solve the CDH problem with non-negligible probability $\varepsilon$. However, according to the CDH assumption, the advantage of $B$ in solving the CDH problem in $G_1$ is negligible. Thus, $A_2$ cannot win Game 2. This completes the proof.
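Theorem 3. If the DL assumption holds, the adversary $A_3$ wins Game 3 with only negligible probability.
Proof of Theorem 3. Let the challenge information be chal $= \{Q, T_1, T_2\}$. If $A_3$ outputs $proof'$ and wins Game 3 with non-negligible probability, then $proof'$ satisfies the verification equation, in which $\sigma'$ is the aggregated tag corresponding to the forged data blocks $m'_j$ and $\mu'$ is produced by $A_3$. Assume the real proof is $proof$ and the corresponding information is $(\sigma, \mu)$; it satisfies the same verification equation. Thus, we can derive from the two verification equations that $\frac{e(\sigma, T_1)}{e(\sigma', T_1)} = \frac{\mu}{\mu'}$. Because $A_3$ wins Game 3 and, by Definition 3, cannot forge tags, we have $\sigma = \sigma'$ and at least one data block $m_j \neq m'_j$.
Suppose $m_j - m'_j = \Delta m_j$. Then, we get $\frac{\mu}{\mu'} = 1$, which is $\prod_{j \in I} T_{2j}^{\upsilon_j \Delta m_j} = 1$. Based on this conclusion, the DL problem can be solved as follows: Given two elements $g, y \in G_1$ in which $y = g^a$, we will compute $a \in Z_q^*$. We randomly select $\alpha_j, \beta_j \in Z_q^*$ and let $T_{2j}^{\upsilon_j} = X_j = g^{\alpha_j} y^{\beta_j}$. We get the following equation: $\prod_{j \in I} X_j^{\Delta m_j} = g^{\sum_{j \in I} \alpha_j \Delta m_j} \cdot y^{\sum_{j \in I} \beta_j \Delta m_j} = 1$. Then, we can derive $y = g^{-\frac{\sum_{j \in I} \alpha_j \Delta m_j}{\sum_{j \in I} \beta_j \Delta m_j}}$, that is, $a = -\frac{\sum_{j \in I} \alpha_j \Delta m_j}{\sum_{j \in I} \beta_j \Delta m_j} \bmod q$. Since $\Delta m_j \neq 0$ for at least one $j$ and each $\beta_j$ is a random value from $Z_q^*$, the probability of $\sum_{j \in I} \beta_j \Delta m_j = 0$ is only $\frac{1}{q}$. Therefore, we can output the right value of $a$ with non-negligible probability $1 - \frac{1}{q}$. This completes the proof.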

Data Privacy Preservation
Upon receiving the challenge from the TPA, the cloud server responds with a proof in which the information of $\sigma$ and $\mu$ is hidden by the hash function $H_3$.
Furthermore, the TPA only needs to check the verification equation, without learning any information about the data blocks $\{m_i\}$ or their corresponding tags $\{\sigma_i\}$.

User Identity Privacy Preservation
In the CL-PGSDP protocol, we design a log file that records the index of each data block and the information of its tag generator, including the hash value $H_1(ID_i + \delta)$ and the public key $Y_{ID_i}$ of user $u_i$. During the auditing process, the TPA gets the hash value $H_1(ID_i + \delta)$ for the challenged data block $m_j$ from the log file. Because the user identity is randomized by the secret value $\delta \in Z_q^*$ in Partial-Private-Key-Gen, the TPA cannot recover the user's real identity from this hash value.

Performance and Implementation
In this section, we give the performance analysis and experimental results of our protocol.

Performance Analysis
We summarize the computational and communication costs of our protocol as follows.

Computational cost: For simplicity, we denote by $\mathrm{Exp}_{G_1}$ and $\mathrm{Exp}_{G_2}$ the exponentiations in $G_1$ and $G_2$, by $\mathrm{Mult}_{G_1}$ and $\mathrm{Mult}_{G_2}$ the multiplications in $G_1$ and $G_2$, by $P$ the pairing computation, and by $H$ the map-to-point hash function. Ordinary hash functions, PRF and PRP operations, and addition and multiplication in $Z_q^*$ are omitted from our evaluation because their computational cost is negligible. Suppose the data is split into $n$ blocks. The main computation of the group manager is generating the system parameters and a partial private key for each group user; thus, its main computational cost is $2\mathrm{Exp}_{G_1} + H$. The primary computation of the group users is generating tags for data blocks, which is the most expensive operation in our protocol, although part of it can be done offline. The cost for the group users is $(2n + 1)\mathrm{Exp}_{G_1} + nH$. The dominant computation of the TPA is generating a challenge and checking the validity of a proof. We suppose that all the group users have generated tags and that the challenge involves their corresponding tags. Thus, the cost for the TPA is at most $zP + z\mathrm{Exp}_{G_2} + \mathrm{Exp}_{G_1}$ for one challenge. When checking a proof, the cost for the TPA is $2c\,\mathrm{Exp}_{G_1} + cP_{G_1} + cH + (c - 1)\mathrm{Exp}_{G_2}$. The main computational cost for the CSP is generating a proof.

In the second part of the experiments, we increase the number of challenged data blocks from 50 to 1000 in increments of 50 and measure the time cost of the Challenge, Proof-Gen, and Proof-Check steps. We suppose that all the users in the group are involved in generating the tags of the challenged data blocks. Figure 4 shows that the time cost of all three steps grows with the number of challenged data blocks. This is consistent with our computational analysis: as the number of challenged data blocks rises, more random values in $Q$ need to be produced, more values $T_2 = \{T_{2j}\}$ need to be computed, and the CSP's computation of $\mu$ and $\sigma$ increases. According to [9], if the CSP has polluted 1% of the data blocks, the TPA can detect the CSP's misbehavior with probability at least 99% by challenging only 460 data blocks. When 460 data blocks are challenged, the TPA takes only about 5.75 s to verify a response, and the CSP takes about 1.1 s to generate one.

We compare the Proof-Check performance of Li's protocol and our protocol in Table 2. Our mechanism requires more checking time than Li's protocol. The computation cost analysis in Table 1 shows that our protocol performs more pairing operations ($P$) during Proof-Check, which is consistent with the experimental results because one pairing operation takes more time than the other cryptographic operations. However, our mechanism provides comprehensive privacy preservation, whereas Li's does not.
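The 460-block figure quoted from [9] can be checked with a short calculation. The sketch below is only an illustration: it assumes that each challenged block is corrupted independently with probability equal to the corrupted fraction (a sampling-with-replacement approximation), and the function names are hypothetical rather than taken from the paper or from [9].

```python
import math


def detection_probability(corrupted_fraction: float, challenged_blocks: int) -> float:
    """Probability that at least one challenged block is corrupted.

    Assumption: blocks are sampled independently, so the chance of
    missing every corrupted block is (1 - f) ** c.
    """
    return 1.0 - (1.0 - corrupted_fraction) ** challenged_blocks


def min_challenged_blocks(corrupted_fraction: float, target_probability: float) -> int:
    """Smallest c with 1 - (1 - f) ** c >= target_probability."""
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - corrupted_fraction)
    )


if __name__ == "__main__":
    # 1% corrupted blocks, 460 challenged blocks, as in the text.
    print(detection_probability(0.01, 460))
    print(min_challenged_blocks(0.01, 0.99))
```

Under this approximation, challenging 460 blocks is slightly more than the minimum needed to reach the 99% target, which agrees with the choice of 460 in the experiments.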

Conclusions
In this paper, we propose a new PDP protocol for group shared data in untrusted cloud storage, which aims to solve the problems of privacy preservation, including data privacy and group user identity privacy. By utilizing certificateless cryptography, we eliminate the issues of expensive certificate management and key escrow. We prove that our protocol is secure, and further illustrate its efficiency through practical experiments. The results show that the proposed protocol is efficient and practical.

Figure 1. System model of our protocol.

3. Secret-Value-Gen is a probabilistic algorithm run by the group user $u_i$, who randomly selects $y_{ID_i}$ as the secret value. Thus, the private key of the group user contains two parts: the secret value $y_{ID_i}$ and the partial private key $psk_{ID_i}$.
4. Public-Key-Gen is a probabilistic algorithm performed by the group user $u_i$ to compute the public key. It takes $u_i$'s secret value $y_{ID_i}$ as input.
5. Tag-Gen is a probabilistic algorithm executed by the group user $u_i$ to generate authentication tags for data blocks. It takes $u_i$'s partial private key $psk_{ID_i}$, the secret value $y_{ID_i}$, and the data block $m_j$ as inputs, and outputs the tag $\sigma_j$ of $m_j$.

(1) Hash Query. $A_1$ makes hash function queries to $C$ for any identity ID, and $C$ responds with the hash values to $A_1$.
(2) Partial Private Key Query. $A_1$ adaptively chooses different IDs and submits them to $C$ to query the partial private key of each ID. $C$ executes the Partial-Private-Key-Gen algorithm to obtain the partial private key for the ID and sends it to $A_1$.
(3) Secret Value Query. $A_1$ adaptively chooses different IDs and submits them to $C$ to query the secret value of each ID. $C$ runs the Secret-Value-Gen algorithm to generate the secret value for the ID and sends it to $A_1$.
(4) Public Key Query. $A_1$ adaptively chooses different IDs and submits them to $C$ to query the public key of each ID. $C$ performs the Public-Key-Gen algorithm to compute the public key for the ID and sends it to $A_1$.
(5) Public Key Replacement. $A_1$ can repeatedly select a value to replace the public key of any ID.
(6) Tag Query. $A_1$ adaptively chooses the tuple (ID, $m$) and submits it to $C$ to query the tag of the data block $m$. $C$ runs the Tag-Gen algorithm to generate the tag of data block $m$ and sends it to $A_1$.
Here, $j \to i$ means that the information of $u_i$ can be found in the public log file LF by the index $j$ of data block $m_j$. If the equation holds, the TPA accepts the proof; otherwise, the proof is invalid. The Challenge, Proof-Gen, and Proof-Check processes are summarized in Figure 2.

Figure 2. CL-PGSDP protocol.

• Revocation-Tag-Gen. If user $u_m$, $2 \le m \le z$, leaves the group, user $u_n$ becomes the successor of $u_m$. The following procedure efficiently updates the tags generated by $u_m$. It requires $u_m$, $u_n$, and the CSP to be online simultaneously.

Figure 3. Tag generation time for increased number of data blocks.

Figure 4. Increasing number of challenges for fixed size of data.

SecretValue-Query: At any time, $A_1$ can query the secret value for any identity $ID$. If $ID \notin Tab_1$ or $ID \notin Tab_2$, $B$ first makes the $H_1$-Query or PartialKey-Query for the identity $ID$. Then, $B$ randomly chooses a value $x \in Z_q^*$ as the response to $A_1$.

$B$ computes $Y = g^{ay} = g^{al_2}$, in which $y$ is the secret value from SecretValue-Query, and updates $Tab_2$. $B$ responds with $Y$ to $A_2$.

$H_2$-Query: At any time, $A_2$ can query the random oracle $H_2$ for $\omega$. Algorithm $B$ also maintains a list of tuples $(\omega, W)$ as $Tab_3$. If $\omega \in Tab_3$, then $B$ retrieves the tuple $(\omega, W)$ and responds with $W$ to $A_2$. Otherwise, $B$ randomly selects $l_3 \in Z_q^*$ and computes $W = g^{bl_3}$. Then, it adds the tuple $(\omega, W)$ to $Tab_3$ and responds with $W$ to $A_2$.

Tag-Query: At any time, $A_2$ can query a tag with $(m, ID, \omega)$. $B$ first checks whether $ID \in Tab_1$, $ID \in Tab_2$, and $\omega \in Tab_3$. If not, $B$ computes the corresponding tuples and updates $Tab_1$, $Tab_2$, and $Tab_3$. After that, $B$ retrieves the corresponding information from $Tab_1$, $Tab_2$, and $Tab_3$, computes the tag $T$ for $(\omega, m, ID)$ using the Tag-Gen algorithm, and returns it to $A_2$.

Forge: Eventually, $A_2$ outputs $(T, \omega, m, ID)$, where $T$ is the forged tag of the data block $m$ under the identity $ID$.

Analysis: If $A_2$ wins Game 2, on the one hand, $B$ can get $e(T, g) = e(H_1(ID + \delta)^m, P) \cdot e(H_2(\omega), Y)$
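The table-based answering of $H_2$-Query described above is an instance of a lazily sampled random oracle: a fresh answer is drawn only on the first query for $\omega$ and replayed afterwards. The following is a minimal sketch of that bookkeeping, not the paper's implementation. It assumes a toy multiplicative group modulo a small safe prime in place of the pairing group, and the class name, the constants, and the choice to store $l_3$ next to $W$ are all hypothetical.

```python
import secrets

# Toy parameters (assumptions for illustration only, far too small for real use):
P = 2039  # safe prime, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup of squares modulo P
G = 4     # generator of the order-Q subgroup


class LazyOracleH2:
    """Answers H2 queries as B does: look up Tab3, or sample and record."""

    def __init__(self, g_b: int) -> None:
        # B is given g^b from its problem instance; it never learns b itself.
        self.g_b = g_b
        # Tab3 maps omega -> (l3, W). Keeping l3 is an assumption of this sketch.
        self.tab3: dict[bytes, tuple[int, int]] = {}

    def query(self, omega: bytes) -> int:
        if omega not in self.tab3:
            l3 = secrets.randbelow(Q - 1) + 1   # uniform in {1, ..., Q - 1}
            w = pow(self.g_b, l3, P)            # W = (g^b)^{l3} = g^{b * l3}
            self.tab3[omega] = (l3, w)
        return self.tab3[omega][1]


if __name__ == "__main__":
    b = 7                                   # known here only to build the demo input
    oracle = LazyOracleH2(pow(G, b, P))
    first = oracle.query(b"omega-1")
    assert first == oracle.query(b"omega-1")  # repeated queries get the same W
```

Replaying stored answers keeps the simulated oracle consistent across queries, which is what lets the tag queries and the final forgery refer to the same value $W$ for a given $\omega$.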