Privacy-Preserving Multi-Receiver Certificateless Broadcast Encryption Scheme with De-Duplication

Nowadays, the widely deployed, high-performance Internet of Things (IoT) facilitates communication between its terminal nodes. To enhance data sharing among terminal devices and protect the recipients' privacy, a few anonymous multi-recipient broadcast encryption (AMBE) proposals have recently been given. Nevertheless, the majority of these AMBE proposals are only proven to be secure against adaptively chosen plaintext attack (CPA) or selectively chosen ciphertext attack (CCA). Furthermore, all AMBE proposals are subject to the key escrow issue due to the inherent characteristics of ID-based public key cryptography (ID-PKC), and cannot furnish secure de-duplication detection. However, for cloud storage, it is very important to remove duplicate copies of an identical message, since de-duplication saves network bandwidth and storage space. To address the above problems, in this work we present a privacy-preserving multi-receiver certificateless broadcast encryption scheme with de-duplication (PMCBED) in the cloud-computing setting, based on certificateless cryptography and anonymous broadcast encryption. In comparison with the prior AMBE proposals, our scheme has the following three characteristics. First, it fulfills the semantic security notions of data confidentiality and receiver identity anonymity, whereas the existing proposals only accomplish them under weaker security models. Second, it achieves duplication detection of ciphertexts for an identical message encrypted with our broadcast encryption. Finally, it also avoids the key escrow problem of the AMBE schemes.


Introduction
With the development of various Internet of Things (IoT) applications, communication among smart IoT devices has become more and more frequent and convenient. As an important one-to-many communication model, broadcast encryption (BE, for short), which was first formally proposed by Amos Fiat and Moni Naor [1], allows the broadcaster to deliver encrypted data to an authorized subset S of the receivers that are monitoring the broadcast channel. Only the receivers that belong to the subset S can recover the message with their private keys, while the receivers outside of S obtain no information about the delivered data. In general, broadcast encryption incurs lower computational and communication overhead than traditional encryption in the peer-to-peer model. Therefore, it has very important applications in the communications field [2,3] and IoT [4], etc.
However, IoT devices have some non-negligible vulnerabilities during data sharing and anonymity protection [5][6][7]. At the same time, anonymity is also an important security property in broadcast encryption. The first anonymous multi-receiver identity-based broadcast encryption (AMIBE) scheme [12] was introduced by Fan et al. Nevertheless, their scheme was shown to be insecure by Wang et al. [34] and Chien [35], since it cannot achieve anonymity protection of the receiver's identity; Wang et al. also presented a modified proposal to fulfill the anonymity of the receiver's identity in [34]. Unfortunately, Wang et al.'s modified proposal was shown to be insecure by Zhang et al. in [36]. In 2018, Tseng et al. presented an improved version of Fan et al.'s AMIBE by revising the security definition of receiver anonymity in [37], and their scheme was shown to be secure in the random oracle model. At Asia-CCS 2016, based on multilinear maps, Xu et al. gave an AMIBE scheme which is secure against anonymity attacks and chosen-plaintext attacks in the standard model [38,39]. However, all multilinear map candidates are broken [40]; thus, their proposal is infeasible in practice. Recently, He et al. proposed an ID-based anonymous BE scheme that can concurrently achieve data indistinguishability and anonymity of the receiver identities under adaptively chosen ciphertext attacks [41].
ID-based cryptographic protocols cut out the complex maintenance of certificates; however, they suffer from an inherent problem called "key escrow". Because the PKG knows all users' private keys, it is able to execute any cryptographic operation in the name of a user. Thus, the problem might result in potential security threats for an ID-based cryptosystem. To avoid the key escrow problem, Al-Riyami and Paterson gave a variant of ID-based PKC, certificateless cryptography, in [42]. It retains the advantages of ID-based cryptography while preventing the key escrow problem of ID-based PKC. In 2004, Yum et al. presented a general construction of certificateless encryption (CLE) [43]. Unfortunately, Yum et al.'s scheme was shown to be insecure by Libert et al. in [44], since it does not satisfy CCA security of CLE. In the same work, Libert et al. put forward a novel construction of CLE achieving CCA security.
Recently, Islam et al. put forward a pairing-free anonymous multi-receiver certificateless encryption scheme (AMCLE, for short) by combining AMIBE with CLE in [45]. Their scheme achieves receivers' anonymity, and the ciphertext length is linear in the number of authorized receivers. When more than one person sends the same data, it places a heavy data-storage burden on the receivers. Thus, de-duplication is a wise choice to address the growing demand for storage.
To support de-duplication, Douceur et al. presented a method called convergent encryption (CE) [46], which is a deterministic symmetric encryption with secret key $H(m)$. If two users Alice and Bob encrypt the same plaintext $m$, they obtain the same ciphertext $C = E_{H(m)}(m)$. This attractive property has led to its use in some commercial systems. However, it lacks a detailed security analysis, and it is not explicit what its basic security goal precisely is. To solve de-duplication of an identical message encrypted under different secret keys, Bellare et al. put forth a novel notion, Message-Locked Encryption (MLE) [47]. However, MLE is only capable of providing security for unpredictable data. Recently, Bellare et al. proposed interactive message-locked encryption and secure de-duplication [48], which can solve the security problem of correlated messages. Until now, numerous secure de-duplication schemes have been presented for data de-duplication in the cloud [49][50][51].
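The convergent-encryption idea above can be illustrated with a minimal sketch. It assumes SHA-256 for $H$ and, in place of a real block cipher, a hash-based XOR keystream; the function names `ce_encrypt` and `ce_decrypt` are hypothetical and the construction is for illustration only, not a secure cipher.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Deterministic keystream from SHA-256(key || counter); a toy stand-in for E."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def ce_encrypt(message: bytes) -> tuple:
    """Return (key, ciphertext) where key = H(m) and ciphertext = E_{H(m)}(m)."""
    key = hashlib.sha256(message).digest()
    stream = _keystream(key, len(message))
    ciphertext = bytes(a ^ b for a, b in zip(message, stream))
    return key, ciphertext


def ce_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream inverts the toy encryption."""
    stream = _keystream(key, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


# Alice and Bob encrypt the same plaintext independently.
key_a, ct_a = ce_encrypt(b"quarterly report")
key_b, ct_b = ce_encrypt(b"quarterly report")
duplicate_detected = ct_a == ct_b
```

Because the key depends only on the message, a storage server can detect the duplicate by comparing ciphertexts without learning the key.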

Bilinear Groups
Throughout the paper, we only consider a Type 2 pairing, since our scheme is based on such a construction. In the following, we review some concepts of such a bilinear group pair.
1. $G_1$ and $G_2$ denote two additive groups of the same prime order $p$; $G_T$ denotes a multiplicative group. In addition, the discrete logarithm problem is deemed to be hard in group $G_i$, $i \in \{1, 2, T\}$.
2. $P_i$ denotes the generator of group $G_i$, for $i \in \{1, 2\}$.
3. Let $\phi : G_2 \to G_1$ be a computable isomorphism which satisfies $\phi(P_2) = P_1$.
4. Let $\hat{e} : G_1 \times G_2 \to G_T$ denote a computable bilinear map, which meets the following criteria:
• Bilinearity: For arbitrary $a, b \in Z_p$ and all $Q \in G_1$, $F \in G_2$, we have $\hat{e}(aQ, bF) = \hat{e}(Q, F)^{ab}$;
• Non-degeneracy: $\hat{e}(P_1, P_2) \neq 1$.
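The two pairing properties can be checked numerically in a toy model. The sketch below is an assumption-laden illustration, not a real pairing: group elements $xP_1$ and $yP_2$ are represented directly by their scalars $x, y$ (so discrete logarithms are trivially known), and $G_T$ is the order-$p$ subgroup of $Z_q^*$ with the hypothetical small parameters $p = 1019$, $q = 2039$.

```python
P_ORDER = 1019             # toy prime order p of G_1, G_2 and G_T
Q_MOD = 2 * P_ORDER + 1    # 2039 is prime, so Z_q^* contains a subgroup of order p
G_T_GEN = 4                # a nontrivial square mod q, hence of order exactly p


def pairing(x: int, y: int) -> int:
    """Toy map e(x*P_1, y*P_2) = g^(x*y); inputs are the scalars of the points."""
    return pow(G_T_GEN, (x * y) % P_ORDER, Q_MOD)


# Bilinearity: e(aQ, bF) = e(Q, F)^(ab)
a, b, Q, F = 5, 7, 13, 29
lhs = pairing(a * Q % P_ORDER, b * F % P_ORDER)
rhs = pow(pairing(Q, F), a * b, Q_MOD)

# Non-degeneracy: e(P_1, P_2) != 1 (the generators have scalar 1)
non_degenerate = pairing(1, 1) != 1
```

A real deployment would use an elliptic-curve pairing library; this model only makes the algebra of the two criteria concrete.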

Security Assumptions
In this subsection, we give several security assumptions [33,52] which are the security foundation of the proposed scheme.
ε-BDH-2 problem [33] in $(G_1, G_2)$. Given group elements $a_1P_2, b_1P_2 \in G_2$ and $c_1P_1 \in G_1$, where $P_2 \in G_2$, $P_1 \in G_1$, and $a_1, b_1, c_1 \in Z_p^*$, a PPT algorithm A takes $(P_1, P_2, a_1P_2, b_1P_2, c_1P_1)$ as input and outputs the Type 2 pairing value $X = \hat{e}(P_1, P_2)^{a_1b_1c_1}$. We say that the ε-bilinear Diffie-Hellman problem in $G_2$ and $G_1$ holds against A if the algorithm A is not capable of obtaining $\hat{e}(P_1, P_2)^{a_1b_1c_1}$ with a non-negligible probability greater than ε.
ε-BDDH-2 problem in $(G_1, G_2)$ [33]. It is hard to distinguish the distributions $D_1 = (P_1, P_2, a_1P_1, b_1P_1, c_1P_2, \hat{e}(P_1, P_2)^{a_1b_1c_1})$ and $D_2 = (P_1, P_2, a_1P_1, b_1P_1, c_1P_2, Z)$, where $Z \in G_T$ and $a_1, b_1, c_1 \in_R Z_p$. In general, $D_1$ is called the BDDH tuple, and $D_2$ is called the "random tuple". For a PPT algorithm B, B's advantage of breaking the BDDH-2 problem in $(G_2, G_1)$ is its advantage in distinguishing $D_1$ from $D_2$. We say that the ε-decisional bilinear Diffie-Hellman problem in $(G_2, G_1)$ holds against B if the algorithm B is not capable of distinguishing the above two distributions with a non-negligible advantage ε.
The Computational Diffie-Hellman problem (CDH) in $G_1$. Let $(P_1, a_1P_1, b_1P_1) \in G_1^3$ be a random 3-tuple where $a_1, b_1 \in Z_p$; there does not exist an efficient algorithm A that can calculate $a_1b_1P_1$. A's advantage of breaking the Computational Diffie-Hellman problem in $G_1$ is the probability that it outputs $a_1b_1P_1$. We say that the CDH problem holds against A if the algorithm A is not capable of outputting $a_1b_1P_1$ with a non-negligible probability ε.
The Decisional Diffie-Hellman problem (DDH) in $G_1$. Given a 4-tuple $(P_1, a_1P_1, b_1P_1, W) \in G_1^4$ where $a_1, b_1 \in Z_p$ and $W \in G_1$, there does not exist an efficient algorithm A that determines whether $a_1b_1P_1 = W$. A's advantage of breaking the Decisional Diffie-Hellman problem in $G_1$ is its advantage in distinguishing $a_1b_1P_1$ from $W$. We say that the DDH problem holds against A if the algorithm A is not capable of distinguishing $a_1b_1P_1$ from $W$ with a non-negligible advantage ε.
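For reference, the advantages used in the assumptions above are usually written in the following standard form. This is a sketch under the common textbook conventions (the superscript labels are our own notation, and the probabilities are taken over the random choices of the exponents and of the algorithm); the exact expressions intended by the original definitions may differ.

```latex
\begin{align*}
\mathrm{Adv}^{\mathrm{BDDH\text{-}2}}_{B} &= \bigl|\Pr[B(D_1) = 1] - \Pr[B(D_2) = 1]\bigr|, \\
\mathrm{Adv}^{\mathrm{CDH}}_{A} &= \Pr\bigl[A(P_1, a_1P_1, b_1P_1) = a_1b_1P_1\bigr], \\
\mathrm{Adv}^{\mathrm{DDH}}_{A} &= \bigl|\Pr[A(P_1, a_1P_1, b_1P_1, a_1b_1P_1) = 1] - \Pr[A(P_1, a_1P_1, b_1P_1, W) = 1]\bigr|.
\end{align*}
```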

System Model
According to the definitions of certificateless encryption and broadcast encryption, we give the basic system model of privacy-preserving multi-receiver certificateless broadcast encryption with de-duplication (PMCBED) schemes. The PMCBED scheme mainly borrows the ideas in [12,37,38] to achieve privacy protection of receiver identities in the certificateless broadcast encryption scheme and to offer ciphertext de-duplication. Its framework is shown in Figure 1. It includes four entities: the key generation center (KGC), the receivers, the broadcaster and the de-duplicator. Their detailed roles are as follows:
1. KGC: It is a trustworthy entity that is responsible for producing the partial private key of a receiver.
2. The broadcaster: It is the sender of the message. It first selects a subset of the receivers and calculates the ciphertext of the transmitted message. Afterwards, it sends the ciphertext to the de-duplicator.
3. The de-duplicator: It is an honest-but-curious entity whose role can be played by the cloud server. Its goal is to check whether a replica of the received ciphertext already exists in the cloud.
4. The receiver: It is the recipient of the ciphertext, and its goal is to decrypt the ciphertext. It can decrypt the ciphertext if and only if it is an authenticated receiver.
A PMCBED scheme has eight algorithms: System-setup, Extract partial-private key, Set secret-value, Set public-key, Set private-key, Encryption, Decryption and Equality-test. Each algorithm is defined as follows:
• System-setup ($1^\lambda$). λ is a security parameter. This algorithm is run by the Key Generation Center (KGC); it takes λ as input and returns the public parameters PP and the master secret key msk of the KGC. The public parameters PP are published.
• Extract partial-private key (msk, ID). In general, this algorithm is run by the KGC. It takes as inputs the public parameters PP, the master key msk and a receiver's identity ID, and outputs the partial-private key $d_{ID}$ of the receiver.
• Set secret-value (ID). The algorithm is run by the receiver. It takes as inputs the public parameters PP and the identity ID of the receiver, and returns $x_{ID}$ as the receiver's secret value.
• Set private-key ($x_{ID}$, $d_{ID}$). This algorithm is run by the receiver. It takes as inputs the partial-private key $d_{ID}$ and the secret value $x_{ID}$ of the receiver, and outputs the private key $SK_{ID} = (d_{ID}, x_{ID})$ of the receiver.
• Set public-key (ID). The algorithm is used to produce the public key of the receiver. It takes as inputs the secret value $x_{ID}$ of the receiver and the public parameters PP, and outputs the corresponding public key $Y_{ID}$.
• Encrypt ($m$, $(ID_1, Y_{ID_1}), \dots, (ID_t, Y_{ID_t})$). The broadcaster runs this algorithm on a plaintext $m$, the public parameters PP and a set $S = \{(ID_1, Y_{ID_1}), \dots, (ID_t, Y_{ID_t})\}$ of receivers' identities/public keys, and outputs a ciphertext $C = \mathrm{Encrypt}(m, PP, S)$.
• Decrypt ($C$). The algorithm is run by the receiver. It takes as inputs a ciphertext $C$, the public parameters PP and the private key $SK_{ID}$ of the receiver, and returns a recovered message $m$ or a symbol ⊥ that indicates a decryption error.
• Equality-test ($sk_{TTP}$, $CT$, $CT'$). It is a deterministic algorithm run by the de-duplicator, which is an honest-but-curious entity. It takes the public parameters PP, the de-duplicator's secret key $sk_{TTP}$ and two ciphertexts $CT$ and $CT'$ as inputs, and returns 1 if $CT$ and $CT'$ are from the identical plaintext; otherwise, it returns 0.
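The eight algorithms listed above can be summarized as a programming interface. The following is only a sketch of the system model's shape: the class name `PMCBEDScheme`, the method names and the parameter names are hypothetical, and no concrete cryptography is implemented.

```python
from abc import ABC, abstractmethod


class PMCBEDScheme(ABC):
    """Interface sketch mirroring the eight PMCBED algorithms (names are illustrative)."""

    @abstractmethod
    def system_setup(self, security_parameter):
        """Run by the KGC; returns (PP, msk)."""

    @abstractmethod
    def extract_partial_private_key(self, pp, msk, identity):
        """Run by the KGC; returns the partial-private key d_ID."""

    @abstractmethod
    def set_secret_value(self, pp, identity):
        """Run by the receiver; returns the secret value x_ID."""

    @abstractmethod
    def set_private_key(self, secret_value, partial_private_key):
        """Run by the receiver; returns SK_ID = (d_ID, x_ID)."""

    @abstractmethod
    def set_public_key(self, pp, secret_value):
        """Run by the receiver; returns the public key Y_ID."""

    @abstractmethod
    def encrypt(self, pp, message, receivers):
        """Run by the broadcaster; receivers is a list of (ID, Y_ID) pairs."""

    @abstractmethod
    def decrypt(self, pp, ciphertext, private_key):
        """Run by a receiver; returns the message, or None to signal the error symbol."""

    @abstractmethod
    def equality_test(self, pp, sk_ttp, ciphertext_a, ciphertext_b):
        """Run by the de-duplicator; returns 1 for identical plaintexts, else 0."""


ALGORITHM_NAMES = sorted(PMCBEDScheme.__abstractmethods__)
```

Keeping the de-duplicator's key as an explicit argument of `equality_test` reflects that only the holder of $sk_{TTP}$ is meant to run the test.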

Security Models
A secure public key encryption scheme should ensure the confidentiality of the encrypted message. This property is referred to as ciphertext indistinguishability, which can be defined in the two security models of chosen-plaintext attack (CPA) and chosen-ciphertext attack (CCA) [53]. However, neither IND-CPA nor IND-CCA indistinguishability holds for a public key encryption scheme with publicly runnable de-duplication, because it is easily breached by an IND-CPA or IND-CCA adversary in the game [53]. In the Challenge phase of the IND-CPA/CCA security game, the adversary is allowed to select two plaintexts $mt_0$ and $mt_1$, and then a challenge $C^*$ for a plaintext $mt_b$ with $b \in \{0, 1\}$ is returned. By invoking the Equality-test algorithm, the adversary is able to output the corresponding $b$ by computing a ciphertext $\hat{C}$ for a plaintext of its choice, say $mt_0$, and checking whether $\hat{C}$ matches the challenge ciphertext $C^*$. The cause of this problem is that, given two ciphertexts, anyone can run the Equality-test algorithm to check whether they match.
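The attack described above can be simulated in a few lines. The sketch below assumes a mock scheme in which every ciphertext carries a public tag $H(m)$ and the equality test simply compares tags; the ciphertext body is an opaque random placeholder, and all names (`encrypt`, `equality_test`, `adversary`, `ind_game`) are hypothetical.

```python
import hashlib
import secrets

PUBLIC_KEY = b"toy-public-key"  # placeholder; any party may encrypt


def encrypt(public_key: bytes, message: bytes) -> tuple:
    """Mock randomized encryption that also exposes a deterministic de-duplication tag."""
    randomness = secrets.token_bytes(16)
    body = hashlib.sha256(public_key + randomness).digest()  # opaque placeholder body
    tag = hashlib.sha256(message).digest()                   # public tag H(m)
    return body, tag


def equality_test(ct_a: tuple, ct_b: tuple) -> bool:
    """Publicly runnable test: equal tags mean equal plaintexts."""
    return ct_a[1] == ct_b[1]


def adversary(m0: bytes, m1: bytes, challenge: tuple) -> int:
    """Encrypt m0 afresh and compare it with the challenge ciphertext."""
    probe = encrypt(PUBLIC_KEY, m0)
    return 0 if equality_test(probe, challenge) else 1


def ind_game() -> bool:
    """One run of the indistinguishability game; True if the adversary guesses b."""
    m0, m1 = b"attack at dawn", b"attack at dusk"
    b = secrets.randbelow(2)
    challenge = encrypt(PUBLIC_KEY, (m0, m1)[b])
    return adversary(m0, m1, challenge) == b


wins = sum(ind_game() for _ in range(100))
```

The adversary wins every run even though the ciphertext bodies are randomized, which is why the test must be restricted to a party the adversary cannot query.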
To provide IND-CCA security in the public key encryption with de-duplication, a trusted-third party (TTP) is introduced to execute an Equality-test algorithm by inputting its private key. Meanwhile, the adversary is not allowed to have access to TTP in the security game. Thus, the Equality-test query is not involved in the following security games. In the context of the rest of this paper, we let the de-duplicator act as the TTP.
Inspired by the security models of certificateless encryption (CLE) and anonymous BE, the security model of our PMCBED scheme defines two security notions: "confidentiality" and "anonymity of the receivers' identities". Confidentiality indicates that an adversary is not capable of obtaining any information about the encrypted message from the ciphertext. Anonymity of the receivers' identities indicates that an adversary is not capable of obtaining any identity information of the other receivers from the ciphertext.
In the following, we first define the IND-CCA security game for PMCBED. Let $Adv_I$ and $Adv_{II}$ be Type I and Type II probabilistic polynomial time (PPT) adversaries, respectively. In the following, $Adv_I$/$Adv_{II}$ plays an interactive game with the challenger C.

Definition 1. A PMCBED scheme is defined to be secure against adaptive chosen-ciphertext attack ("IND-CCA security") if no Type I/II adversary has a non-negligible advantage in the following game:
• Setup: Let λ be a security parameter and C be the challenger. C invokes the Setup ($1^\lambda$) algorithm to obtain the public parameters PP and the master secret key msk; afterwards, C transmits PP to Adv. If Adv is the Type II adversary $Adv_{II}$, then msk is also sent to Adv. Otherwise, msk is kept secret by the challenger.
• Phase 1: In this phase, Adv can adaptively make a series of queries:
- Public key query oracle: Upon receiving a public key query for the receiver ID, if it is the first query for this receiver, then C invokes the Set public-key algorithm to produce the public key $PK_{ID}$ and returns $PK_{ID}$ to Adv. Otherwise, it returns the matching public key in the list.
- Extract partial-private key oracle: On receiving a partial private key query for the receiver ID, C inputs msk to invoke the Extract partial-private key algorithm and returns $d_{ID}$ if Adv is the Type I adversary $Adv_I$; the oracle is not required if Adv is of Type II.
- Extract secret-key oracle: Upon receiving a secret key query for the receiver ID from the adversary Adv, C invokes the Set secret-value algorithm to produce the secret value $x_{ID}$ and returns it to Adv.
- Decrypt oracle: On receiving a decryption query $(CT, ID)$ from Adv, C invokes the Set secret-value algorithm and the Extract partial-private key algorithm to obtain the private key $SK_{ID}$ of the receiver ID; then, it runs the Decryption($CT$, $SK_{ID}$) algorithm to recover the corresponding plaintext.
Note that when Adv is the Type I adversary $Adv_I$, it may also query the Public-key-replace oracle, in which the receiver's public key $Y_{ID}$ is replaced with a new public key $Y'_{ID}$ on input of a receiver's identity ID and the new public key $Y'_{ID}$.
• Challenge: The adversary Adv submits two distinct equal-length messages $m_0$ and $m_1$ as well as a set of receivers' identities/public keys $S^* = (ID_1/Y_1, \dots, ID_k/Y_k)$. It is required that Adv has not queried the Extract partial-private key oracle with any identity $ID_i \in S^*$. The challenger C randomly samples a bit $b \in \{0, 1\}$, computes the challenge ciphertext $C^* = \mathrm{Encrypt}(m_b, PP, S^*)$ and returns it to the adversary Adv.
• Phase 2: The adversary Adv can continue to adaptively issue a new sequence of queries as in Phase 1. Meanwhile, in a Type I attack, Adv is not allowed to issue an Extract partial-private-key query or a Public-key-replace query on an identity $ID^*$, where $ID^* \in S^*$.
• Guess: At last, a guess bit $b' \in \{0, 1\}$ is returned by the adversary Adv. Adv wins this game if $b' = b$.

Definition 2. A PMCBED scheme is defined to be ANO-CCA secure if no Type I or Type II adversary Adv has a non-negligible advantage in the following game:
• Setup and Phase 1: These two phases are the same as those in the above IND-CCA game.
• Challenge: In this phase, Adv produces two challenge sets $\hat{S}_0$ and $\hat{S}_1$, where $|\hat{S}_1| = |\hat{S}_0|$. It then submits a message $m^*$ and $(\hat{S}_0, \hat{S}_1)$ to C. The constraint conditions are as follows: (1) Adv is not permitted to issue Extract partial-private-key queries on $ID^*$ when Adv is the Type I adversary $Adv_I$; (2) Adv is not permitted to issue Extract secret-key queries on $ID^*$ when Adv is the Type II adversary $Adv_{II}$, where $ID^* \in \hat{S}_1 \oplus \hat{S}_0$ and $\hat{S}_1 \oplus \hat{S}_0 = (\hat{S}_1 \cup \hat{S}_0) - (\hat{S}_0 \cap \hat{S}_1)$. Then, C uniformly samples a bit $\alpha \in \{0, 1\}$, calculates the ciphertext $C^* = \mathrm{Encrypt}(PP, \hat{S}_\alpha, m^*)$ and returns it to Adv.
• Phase 2: In this phase, Adv adaptively issues a new series of queries as in Phase 1 with the following constraint conditions: (1) Adv is not permitted to issue Extract partial-private-key queries on $ID^*$; (2) Public-key-replace queries on $ID^*$ are not allowed when Adv is the Type I adversary $Adv_I$; (3) Extract secret-key queries on $ID^*$ are not allowed when Adv is the Type II adversary $Adv_{II}$; and (4) Adv is not allowed to issue a Decryption query on $(ID^*/Y^*; C^*)$, where $ID^* \in \hat{S}_1 \oplus \hat{S}_0$.
• Guess: At last, a guess bit $\alpha' \in \{0, 1\}$ is output by Adv. Adv wins this game if $\alpha' = \alpha$.

Our Scheme
Setup: Let λ be a security parameter. The Setup (λ) algorithm takes λ as input and outputs a bilinear map $e : G_1 \times G_2 \to G_T$, where $G_1 = \langle P_1 \rangle$ and $G_2 = \langle P_2 \rangle$ are two groups of the same order $p$. Note that $P_1 = \phi(P_2)$ and $\phi : G_2 \to G_1$ is an isomorphism. Let $H : \{0, 1\}^* \to G_1$, $H_1 : G_T \to G_1$ and $H_3 : G_T \to Z_p$ be three cryptographic hash functions. $H_2()$ and $f()$ are two one-way functions. $H_0$ is a random generator of group $G_2$. The KGC picks a number $s \in Z_p$ at random to calculate its public key $PK_{pub} = sP_2$. Let $TPK = x_T P_1$ denote the public key of the de-duplicator, with $x_T \in Z_p$ being its private key. $(E(\cdot), D(\cdot))$ denotes the encryption/decryption algorithm of AES. Finally, the public parameters are $Param = (P_1, P_2, G_1, G_2, G_T, p, \phi, TPK, PK_{pub}, e, H, H_1, H_2, H_3, f, H_0, (E, D))$. $msk = s$ acts as the master secret key and is kept secret.
Extract partial-private key: First, a receiver submits its identity ID to the KGC; then, the KGC utilizes its master secret key msk to produce the partial-private key $d_{ID}$ of the receiver, where $d_{ID} = sH(ID)$.

Set secret value: A receiver with identifier $ID_i$ uniformly samples a number $xk_i \in Z_p$ and returns $xk_i$ as its secret value.
Set private-key: For a receiver with identifier $ID_i$, let $d_{ID_i}$ be its partial-private key and $xk_i$ be its secret value. Its private key is set to be $SK_{ID_i} = (xk_i, d_{ID_i})$.
Set public-key: In this algorithm, a receiver with identifier $ID_i$ takes its secret value $xk_i$ as input and outputs its public key $Y_i = xk_i P_1$.
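The key-generation steps above can be traced in a toy model. This sketch makes strong simplifying assumptions: points $xP_1$, $yP_2$ are represented by their scalars (so $PK_{pub}$ literally reveals $s$, which a real group would hide), the pairing is the toy map $g^{xy}$ with hypothetical parameters $p = 1019$, $q = 2039$, and SHA-256 reduced mod $p$ stands in for $H$. The final consistency check $e(d_{ID}, P_2) = e(H(ID), PK_{pub})$ is not stated in the text; it follows from bilinearity.

```python
import hashlib
import secrets

P_ORDER, Q_MOD, G_T_GEN = 1019, 2039, 4  # toy parameters; 4 has order 1019 mod 2039


def pairing(x: int, y: int) -> int:
    """Toy pairing on scalar representatives: e(x*P_1, y*P_2) = g^(x*y)."""
    return pow(G_T_GEN, (x * y) % P_ORDER, Q_MOD)


def hash_to_g1(identity: str) -> int:
    """Stand-in for H: {0,1}* -> G_1, returning the scalar of the hashed point."""
    return int.from_bytes(hashlib.sha256(identity.encode()).digest(), "big") % P_ORDER


def setup() -> tuple:
    """KGC picks msk = s and publishes PK_pub = s*P_2."""
    s = 1 + secrets.randbelow(P_ORDER - 1)
    return s, s % P_ORDER  # (msk, scalar representative of PK_pub)


def extract_partial_private_key(msk: int, identity: str) -> int:
    """d_ID = s * H(ID)."""
    return (msk * hash_to_g1(identity)) % P_ORDER


def set_secret_and_public_key() -> tuple:
    """Receiver samples xk_i and derives Y_i = xk_i * P_1."""
    xk = 1 + secrets.randbelow(P_ORDER - 1)
    return xk, xk % P_ORDER


msk, pk_pub = setup()
d_alice = extract_partial_private_key(msk, "alice")
xk_alice, y_alice = set_secret_and_public_key()
sk_alice = (xk_alice, d_alice)  # SK_ID = (xk_i, d_ID_i)
key_is_consistent = pairing(d_alice, 1) == pairing(hash_to_g1("alice"), pk_pub)
```

The private key combines a KGC-issued part and a self-chosen part, so the KGC alone never holds the full key; that is the property that removes key escrow.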

Discussion
From the above construction, if a receiver's identity ID is included in the set of designated receivers, then this receiver can decrypt the corresponding ciphertext CT. Indeed, when the receiver's identifier satisfies $ID_i \in S$, where $S = \{ID_1/PK_1, \dots, ID_n/PK_n\}$, let $x_i = H_2(ID_i)$; we have $C_j(x_i) = 0$ for $j \neq i$. Thus, the receiver with identifier $ID_i$ is capable of obtaining $r_1P_1$ by utilizing its partial-private key $d_{ID_i}$. This means that the receiver with identifier $ID_i$ is able to decrypt the message with the key $K = H_1(C_2 / e(r_1P_1, C_1))$.
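The vanishing property $C_j(x_i) = 0$ for $j \neq i$ is what lets each designated receiver isolate its own component. A minimal sketch follows, assuming the polynomials are Lagrange basis polynomials over $Z_p$ and that the broadcast values are mixed as $Q_j = \sum_i a_{i,j} R_i$ (the relation that appears in the security proofs, here with 0-based $j$). Group elements $R_i$ are replaced by scalars mod a small hypothetical prime, and all function names are illustrative.

```python
P = 1019  # small prime standing in for the group order p


def lagrange_basis_coeffs(xs, i, p=P):
    """Coefficients (lowest degree first) of f_i with f_i(x_i) = 1 and f_i(x_j) = 0, j != i."""
    coeffs, denom = [1], 1
    for j, xj in enumerate(xs):
        if j == i:
            continue
        # Multiply the running polynomial by (x - xj).
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] = (nxt[k] - xj * c) % p
            nxt[k + 1] = (nxt[k + 1] + c) % p
        coeffs = nxt
        denom = (denom * (xs[i] - xj)) % p
    inv = pow(denom, -1, p)  # modular inverse (Python 3.8+)
    return [(c * inv) % p for c in coeffs]


def mix(xs, rs, p=P):
    """Broadcaster side: Q_j = sum_i a_{i,j} * R_i for j = 0..n-1."""
    n = len(xs)
    basis = [lagrange_basis_coeffs(xs, i, p) for i in range(n)]
    return [sum(basis[i][j] * rs[i] for i in range(n)) % p for j in range(n)]


def recover(x_i, qs, p=P):
    """Receiver side: sum_j x_i^j * Q_j = sum_k f_k(x_i) * R_k = R_i."""
    return sum(pow(x_i, j, p) * q for j, q in enumerate(qs)) % p


xs = [3, 7, 11, 20]          # x_i = H_2(ID_i), assumed distinct mod p
rs = [100, 200, 300, 400]    # per-receiver values R_i
qs = mix(xs, rs)
recovered = [recover(x, qs) for x in xs]
```

Each receiver only needs its own $x_i$ and the public $Q_j$ values, so the ciphertext does not have to list which identity owns which component.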

Security Analysis
In the following theorems, we show that our aforementioned construction achieves two security properties: anonymity of the receiver's identity and confidentiality.
Theorem 1. Let $H$, $H_1$ and $H_2$ denote random oracles. If the BDH-2 problem and the DDH problem in $(G_1, G_2)$ are hard, then our proposed construction can be proven to be secure against the IND-PMCBED-CCA attack of the Type I adversary.
Proof. Suppose there exists a Type I adversary $A_I$ in the IND-PMCBED-CCA game. If it can break our construction with a non-negligible probability ε, then we are capable of building an algorithm B which solves the BDH-2 problem and the DDH problem in $(G_1, G_2)$. Let $(P_2, aP_2, bP_2, cP_1)$ be a random instance of the BDH-2 problem, where $a$, $b$ and $c$ are unknown random numbers from $Z_p$; the target is to compute $e(P_1, P_2)^{abc}$. In addition, let $(P_1, \beta_1P_1, \beta_2P_1, V)$ be a random instance of the DDH problem; its target is to determine whether $V = \beta_1\beta_2P_1$. B simulates the following security game with the adversary $A_I$.
Setup. Let $PP = \{P_1, P_2, G_1, G_2, e, p, H, H_1, H_2, H_3, (E, D)\}$ be the system parameters built by B. In addition, B sets $PK = aP_1 = \phi(aP_2)$ and $TPK = \beta_1P_1$. Then, B sends the public parameters PP to the adversary $A_I$. In the following proof, $H_2$ acts as a one-way function, and $H$, $H_1$ and $H_3$ are random oracles.
Phase 1. In this phase, $A_I$ is capable of adaptively issuing a series of queries.
• H-Hash Query: When receiving the H-hash query on $ID_i$ from $A_I$, B answers as below. If a record for $ID_i$ has appeared in a tuple $(ID_i, Q_i, \eta_i, q_i)$ in the H-list, which is originally empty, it sends back $Q_i$; otherwise, it generates $\eta_i \in \{0, 1\}$ and randomly chooses $q_i \in Z_p$. If $\eta_i = 0$, it sets $Q_i = q_iP_1$; else it sets $Q_i = q_i bP_1 = q_i \cdot \phi(bP_2)$. It adds $(ID_i, Q_i, \eta_i, q_i)$ to the H-list and returns $Q_i$ to $A_I$.
• Public-key Query: When $A_I$ makes a public key query with $ID_i$, if the 3-tuple $(ID_i, Y_i, xk_i)$ appears in the PK-list, which is initially empty, $Y_i$ is returned to $A_I$; otherwise, B picks $xk_i \in Z_p$ to set $Y_i = xk_iP_1$ and adds $(ID_i, Y_i, xk_i)$ to the PK-list. Finally, it returns $Y_i$ to $A_I$.
• Extract partial-private key Query: Upon receiving a partial-private key query for the identity $ID_i$, if the record $(ID_i, Q_i, \eta_i, q_i)$ has appeared in the H-list and $\eta_i = 0$, then B computes $d_{ID_i} = q_i \cdot aP_1 = q_i \cdot \phi(aP_2)$. Otherwise, it aborts and outputs ⊥.
• Extract secret-value Query: When $A_I$ issues a query on an identity $ID_i$, if the 3-tuple $(ID_i, Y_i, xk_i)$ exists in the PK-list, then $xk_i$ is returned to $A_I$; otherwise, B randomly selects $xk_i \in Z_p$ to compute $Y_i = xk_iP_1$ and adds $(ID_i, Y_i, xk_i)$ to the PK-list.
• Public-key-replace Query: When $A_I$ makes a public key replace query with …
• Decryption Queries: On input of a ciphertext CT and an identity $ID_i$, where $CT = (C_1, C_2, C_3, Q_1, \dots, Q_n)$, B first issues an H-query with $ID_i$ to obtain the tuple $(ID_i, Q_i, \eta_i, q_i)$. If $\eta_i = 0$, it sets $d_{ID_i} = q_i \cdot aP_1$ and makes an Extract secret-value query with $ID_i$; if $xk_i \neq \bot$ is returned, B can make use of $(d_{ID_i}, xk_i)$ to decrypt CT and respond to the Decryption query. Otherwise, B does the following steps (we assume that the $H_3$-query had been made before the adversary issues the decryption query with CT):
1. For $j = 1$ to $q_{H_3}$: { it retrieves $k_j$ from the $H_3$-list and decrypts CT with $k_j$ to recover $M \| \tau = D(k_j, CT)$, which it parses into $M$ and $\tau$, from which $\tau TPK$ can be recovered; if $C_0 = f(M) \cdot TPK + f(\tau)TPK$, break; }
2. If $j \le q_{H_3}$, B sends back $M$ to $A_I$. Otherwise, it aborts.
Challenge. In this phase [13], $A_I$ submits two equal-length plaintexts $M_0$ and $M_1$, as well as a challenge set of identities/public keys $S^* = (ID_1/Y_1, ID_2/Y_2, \dots, ID_l/Y_l)$, with the restrictions that $A_I$ has not issued a partial-private-key query with any $ID_i \in S^*$ in Phase 1 and that each $\eta_i = 1$ in the tuple $(ID_i, Q_i, \eta_i, q_i)$ of the H-list, where $Y_i$ is the public key which corresponds to the identity $ID_i$.

Phase 2. $A_I$ can adaptively make a new series of queries as in Phase 1 with the constraints: 1. $CT^*$ cannot be submitted to Decryption queries. 2. No Extract partial-private-key query may be issued for any $ID_i \in S^*$.
Guess. Eventually, $A_I$ outputs its guess $\beta' \in \{0, 1\}$. When $V = \beta_1\beta_2P_1$, the challenge ciphertext $CT^*$ is a valid one, and from the perspective of $A_I$ the challenger's simulation is indistinguishable from the real game. When $V$ is a random element of $G_1$, the challenge ciphertext has the same distribution as the real ciphertext. Furthermore, we assume that $A_I$ must have previously issued an $H_1$ query with $X_i = e(H(ID_i), PK)^c$. Because $C_1^* = cP_1$, $H(ID_i) = q_i bP_1$ and $PK = aP_1$, B can compute $e(P_1, P_2)^{abc} = (X_i)^{q_i^{-1}}$. Therefore, no IND-PMCBED-CCA adversary $A_I$ can break our PMCBED scheme.

Theorem 2. Under the DDH assumption in $G_1$, our proposed PMCBED scheme is provably secure against the IND-PMCBED-CCA attack of the Type II adversary $A_{II}$.
Proof. Assume that there is a Type II adversary $A_{II}$ in the IND-PMCBED-CCA game. If it breaks our construction, then we are capable of constructing an algorithm B to solve the DDH problem. Let $(P_1, aP_1, bP_1, Z)$ be an instance of the DDH problem in group $G_1$, where $a, b \in Z_p$ are unknown; its goal is to determine whether $Z = abP_1$.
Setup. Algorithm B randomly chooses $\alpha \in Z_p$ to compute $PK = \alpha P_1$ and lets $TPK = aP_1$. Let PP be the public parameters, where $PP = (P_1, PK, TPK, e, G_1, G_2, P_2, H, H_1, H_2, H_3, E, D, f)$. Then, it delivers PP and α to the adversary $A_{II}$. $H$, $H_1$, $H_3$ are three random oracles which are controlled by B.
Phase 1. $A_{II}$ can adaptively issue a series of queries.
H-Hash Queries. Upon receiving a receiver's identifier $ID_j$, B first checks whether $(ID_j, Q_j)$ has appeared in the H-list, which is initially empty; if it has, then $Q_j$ is returned. Otherwise, B picks $q_j \in Z_p$ at random to calculate $Q_j = q_jP_1$ and adds $(ID_j, Q_j, q_j)$ to the H-list. Finally, $Q_j$ is returned.
$H_1$-Hash Queries. These are the same as in Theorem 1.
$H_3$-Hash Queries. These are the same as in Theorem 1.
Public-Key Queries. Upon receiving an identity $ID_i$, if the tuple for $ID_i$ already exists in the PK-list, which was originally empty, then $Y_i$ is returned. Otherwise, it produces $\eta_i \in \{0, 1\}$ and randomly chooses $a_i \in Z_p$. If $\eta_i = 0$, it sets $Y_i = a_iP_1$; else it sets $Y_i = a_i bP_1$. It adds $(ID_i, Y_i, \eta_i, a_i)$ to the PK-list and returns $Y_i$ to $A_{II}$.
Decryption Query. Upon receiving $(CT, ID_i)$, if $ID_i$ exists in the PK-list and the corresponding $\eta_i = 0$ holds, then B decrypts the ciphertext CT with $(\alpha \cdot H(ID_i), a_i)$ and returns the decrypted message $M$ to the adversary $A_{II}$. Otherwise, B does the following steps: …
Challenge Phase. Let $S^* = (ID_1/Y_1, ID_2/Y_2, \dots, ID_l/Y_l)$. In this phase, the adversary $A_{II}$ outputs two equal-length messages $M_0$ and $M_1$ and a set of identities/public keys $S^*$, with the restriction that each $\eta_i$ in the tuple $(ID_i, Y_i, \eta_i, a_i)$ with $ID_i \in S^*$ satisfies $\eta_i = 1$.
Then, B computes as below:
1. It uniformly samples $k \in Z_p$ to compute $C_1^* = kP_2$ and $C_0^* = f(M_\beta)TPK + Z$ as well as $C_{-1}^* = e(bP_1, P_2)$. Note that we have the relation ($D_0 = aP_1$, …), which is the instance of the CDH problem if $Z = abP_1$.
2. For $j = 1$ to $l$, it calculates $x_j^* = H_2(ID_j)$.
3. Then, for $j = 1$ to $l$, it builds the polynomial …
4. For $j = 1$ to $l$, B computes … Note that $r_1$ in the original encryption is set as $r_1 = a$ but is unknown.
5. For $i \in \{1, 2, \dots, l\}$, it calculates …

Phase 2. $A_{II}$ may issue a new series of queries as in Phase 1, with the restriction that $CT^*$ is not submitted to the Decryption query.
Guess. Finally, $A_{II}$ gives its guess $\beta'$. If $\beta' = \beta$, $A_{II}$ wins this game with non-negligible advantage ε. When $Z = abP_1$, the ciphertext $CT^* = (C_0^*, C_1^*, C_2^*, C_3^*, Q_1, \dots, Q_l)$ is a valid one since … This means that $r_1 = a$ and $\tau = b$ in the encryption. Thus, if $A_{II}$ breaks our scheme, then B is able to solve the DDH problem.
Challenge. After terminating Phase 1, $A_I$ submits a challenge message $M$ and two distinct sets of identities/public keys $S_0^* = (ID_0^*/Y_0^*, ID_2/Y_2, \dots, ID_l/Y_l)$ and $S_1^* = (ID_1^*/Y_1^*, ID_2/Y_2, \dots, ID_l/Y_l)$, with the constraint that $A_I$ cannot issue Extract partial-private-key queries with $ID_i$ for $ID_i \in \{S_0^*, S_1^*\}$. B randomly selects $\beta \in \{0, 1\}$ to compute as follows: …, and for $j = 2$ to $l$, it computes $x_j^* = H_2(ID_j)$.
4. Next, for $j \in \{2, 3, \dots, l\}$, it constructs the polynomial … $\sum_i a_{ji}x^i$.
5. B randomly chooses $r_1 \in Z_p$ and, for $j \in \{2, 3, \dots, l\}$, it randomly chooses $T_j \in G_1$ to compute $R_j = T_j + r_1Y_j$; it then computes $R_\beta = T_\beta + r_1Y_\beta^*$.
6. For $j \in \{\beta, 2, 3, \dots, l\}$, B computes $Q_j = \sum_{i=0}^{l} a_{i,j-1}R_i$.
7. B randomly chooses $Q \in G_T$ and $\tau \in Z_p$ to compute $C_2^* = e(P_1, C_1^*)^{r_1}$ and $C^*$ …
Finally, $A_I$ outputs its guess $\beta'$. B outputs 1 when $\beta' = \beta$, which means that $Z = e(P_1, P_2)^{abc}$; if $\beta' \neq \beta$, it outputs 0, which means $Z \neq e(P_1, P_2)^{abc}$.
Analysis: In the above game, the simulation is indistinguishable from the scheme. If $Z = e(P_1, P_2)^{abc}$, then we let $k^* = c$; in this case, $CT^*$ has the same distribution as the ciphertext in the real game. If $Z$ is a random element in $G_T$, then the ciphertext is uniformly distributed in the ciphertext space, since $C_3^* = E(K, x_\tau \| M_\beta)$, where $K = H_3(Q)$ is a random element. Thus, in the adversary $A_I$'s view, $M_\beta$ is independent, and it provides no information to $A_I$.
Theorem 4. Let the hash functions $H$, $H_1$ and $H_3$ be random oracles. If the DDH assumption in groups $(G_1, G_2)$ holds, then our construction is proven to be secure against the Type II adversary $A_{II}$ in the ANON-ID-CCA attack game.
Proof. Let $A_{II}$ be an adversary. If it breaks our construction, then we are capable of constructing an algorithm B which solves the DDH problem. Let $(P_1, aP_1, bP_1, Z)$ be a random instance of the DDH problem in groups $(G_1, G_2)$, where $a, b \in Z_p$ are unknown; its goal is to determine whether $Z = abP_1$.
Setup. Algorithm B randomly chooses $\alpha \in Z_p$ to set $PK = \alpha P_1$. Let $PP = (P_1, P_2, e, p, PK, f, G_1, G_2, H, H_1, H_2, H_3, (E, D))$ denote the public parameters built by B. Then, it delivers PP and α to the adversary $A_{II}$. Here $H$, $H_1$, $H_3$ are three random oracles that are controlled by B.
Phase 1. $A_{II}$ is capable of issuing the same series of queries as in Theorem 2.

Phase 2. $A_{II}$ can still adaptively issue queries with the following constraints.
1. $A_{II}$ is not allowed to issue a Public-key query with ID, where $ID \in \{ID_0^*, ID_1^*\}$.
2. $A_{II}$ is not allowed to issue a Decryption query with $(CT^*, ID)$, where $ID \in \{ID_0^*, ID_1^*\}$.
Guess. Finally, $A_{II}$ returns its guess bit $\beta'$. B outputs 1 if $\beta' = \beta$, meaning that $Z = abP_1$; otherwise, it outputs 0, meaning $Z \neq abP_1$.
Analysis: In the above game, the simulation is indistinguishable from the scheme. When $Z = abP_1$, assume $r_1 = a$; the challenge ciphertext then has the same distribution as that in the real game. When $Z$ is a random element of $G_1$, $C_2^*$ and $C_3^*$ in the ciphertext have the form $C_2^* = e(aP_1, P_2)^k \cdot Q$ and $C_3^* = E(K, x_\tau \| M_\beta)$, where $K = H_3(Q)$ and $Q$ are uniform and random. Thus, from the adversary $A_{II}$'s view, $M_\beta$ is independent; it provides no information to $A_{II}$.

Performance Analysis
To evaluate the efficiency of the proposed scheme, we compare the computational cost of its main algorithm with that of the Hung et al. scheme [37] and the Islam et al. scheme [45]. For convenience, we define the following notation. Let $T_p$, $T_m$, $T_e$ and $T_h$ denote the time of executing a pairing operation, a scalar multiplication, an exponentiation and a map-to-point hash function, respectively. The computation costs of the main algorithms for the three schemes are shown in Table 1.
Table 1. Computational cost of encryption for $n$ receivers.
$(2n + 1)T_p + (n^2 + n)T_m$
$nT_p + nT_e + (n + 1)T_m + nT_h$
$(n + 1)T_p + (n + 2)T_m + 2T_e + nT_h$
From Table 1, we find that our proposed scheme has a higher computational cost than the other two schemes. However, our proposed scheme has better security and functionality.
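To see how the three cost expressions of Table 1 scale with the number of receivers, they can be evaluated for concrete unit timings. The sketch below takes the timings as parameters because no measured values are given here; the function name `encryption_costs` is hypothetical, and the three results are returned in the order in which the expressions appear in the table, without assigning them to particular schemes.

```python
def encryption_costs(n: int, t_p: float, t_m: float, t_e: float, t_h: float) -> tuple:
    """Evaluate the three Table 1 encryption-cost expressions for n receivers."""
    first = (2 * n + 1) * t_p + (n * n + n) * t_m
    second = n * t_p + n * t_e + (n + 1) * t_m + n * t_h
    third = (n + 1) * t_p + (n + 2) * t_m + 2 * t_e + n * t_h
    return first, second, third


# With every operation counted as one unit, the expressions reduce to operation counts.
counts_n1 = encryption_costs(1, 1, 1, 1, 1)
counts_n10 = encryption_costs(10, 1, 1, 1, 1)
```

The first expression grows quadratically in $n$ through its $(n^2 + n)T_m$ term, while the other two are linear in $n$.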

Conclusions
Users are increasingly concerned about anonymity. To protect the identity anonymity of the receiver, we construct a privacy-preserving multi-receiver certificateless broadcast encryption scheme with de-duplication in this work. It not only achieves confidentiality and the receiver's identity anonymity simultaneously, but also achieves duplicate detection to determine whether two different ciphertexts are from the identical message. Thus, our proposal can efficiently reduce the cloud server's storage burden, which is very significant for cloud storage. Nevertheless, the ciphertext size is linear in the number of receivers. A very important challenge will be how to construct a PMCBED scheme with constant-size ciphertext.
Author Contributions: All the authors contributed to the research and wrote the article. J.Z. proposed the idea, designed the scheme, carried out the security analysis and performed the evaluation. P.O. suggested directions for the detailed designs and evaluation, and coordinated the research.