A Blockchain-Empowered Arbitrable Multimedia Data Auditing Scheme in IoT Cloud Computing

As a growing number of clients outsource the massive multimedia data generated by Internet of Things (IoT) devices to the cloud, data auditing is becoming crucial, as it enables clients to verify the integrity of their outsourced data. However, most existing data auditing schemes cannot guarantee 100% data integrity and cannot meet the security requirements of practical multimedia services. Moreover, the lack of fair arbitration means that clients do not receive compensation in a timely manner when the outsourced data are corrupted by the cloud service provider (CSP). In this work, we propose an arbitrable data auditing scheme based on the blockchain. In our scheme, clients usually only need to conduct private audits, and public auditing by a smart contract is triggered only when verification fails in private auditing. This hybrid auditing design enables clients to save audit fees and receive compensation automatically and in a timely manner when the outsourced data are corrupted by the CSP. In addition, by applying the deterministic checking technique based on a bilinear map accumulator, our scheme can guarantee 100% data integrity. Furthermore, our scheme can prevent fraudulent claims when clients apply for compensation from the CSP. We analyze the security of the scheme and implement a prototype. The experimental results show that our blockchain-based data auditing scheme is secure, efficient, and practical.


Introduction
The rapid development of the Internet of Things (IoT) and intelligent multimedia has led to the explosive growth of massive amounts of data, which has put tremendous pressure on the entire Internet. To cope with this challenge, storing IoT and intelligent multimedia data with a cloud service provider (CSP) is a common solution [1][2][3][4]. Many schemes have been implemented and proved to ensure the security of outsourced data during transmission [5]. However, such a wide attack surface and many recent data breaches have raised concerns about data integrity and availability [6][7][8][9][10]. When sensitive IoT and intelligent multimedia data are outsourced to a CSP, the clients lose control of the data, and the data may be changed or deleted without their permission. To solve this problem, clients need to regularly check the integrity of the outsourced data, and remote data integrity checking is becoming an important issue in cloud computing.
Most existing data integrity checking techniques are probabilistic [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. In this approach, the verifier randomly selects partial data blocks and then performs integrity verification on those chosen data blocks instead of checking the whole dataset, and hence a 100% guarantee for the integrity of the data cannot be provided. However, for massive IoT and intelligent multimedia data, especially sensitive data related to finance, energy, transportation, etc., a probabilistic approach is not enough, since they have strict requirements for data integrity and correctness. Another type of data integrity checking technique is deterministic. In this approach, the verifier examines all data blocks instead of only checking chosen partial data blocks, thus providing 100% assurance of data integrity [28]. However, the deterministic method means higher verification and computational overhead, and hence efficiency is a challenge and must be considered in this approach.
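The gap between the two approaches can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the verifier samples c of n blocks uniformly without replacement and that t blocks are corrupted (the names n, t, and c and the function name are introduced here, not taken from the schemes cited above). Under that assumption, the chance of hitting at least one corrupted block follows the standard hypergeometric formula 1 - C(n - t, c) / C(n, c).

```python
from math import comb


def detection_probability(n: int, t: int, c: int) -> float:
    """Probability that a uniform sample of c out of n blocks (without
    replacement) contains at least one of the t corrupted blocks."""
    if not (0 <= t <= n and 0 <= c <= n):
        raise ValueError("require 0 <= t <= n and 0 <= c <= n")
    # comb(n - t, c) counts the samples that miss every corrupted block.
    return 1.0 - comb(n - t, c) / comb(n, c)
```

Only when c = n (every block is examined, as in the deterministic approach) does the probability reach 1 for any t >= 1; for smaller samples a single corrupted block can go unnoticed.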
In order to check remote outsourced data integrity, numerous data auditing schemes have been proposed. According to the different roles of verifiers, the existing auditing schemes can be divided into private auditing and public auditing [24]. In private auditing schemes, the role of the verifier is assumed by the client himself, and some important key information used in verification is usually stored by the client instead of the CSP. Therefore, there will be disputes when the verification fails, because if the key information used in the verification is broken by a malicious client, the response of the CSP cannot pass verification. In other words, we cannot determine whether the CSP has damaged the data when verification fails. At this point, fair arbitration is required because the CSP needs to compensate the client for data corruption if verification is not passed. In public auditing schemes, the client usually resorts to a third-party auditor (TPA) to check the integrity of outsourced data. Thus, the audit results completely depend on the TPA. However, this is unrealistic since a fully credible TPA may not always exist. In addition, it should be noted that in these two existing types of auditing schemes, if the auditing results show that the outsourced data are corrupted by the CSP, it is usually difficult for clients to obtain the compensation from the CSP in a timely manner.
In this work, we propose an efficient blockchain-based hybrid auditing scheme with fair arbitration. In our scheme, we use bilinear map accumulators to realize deterministic checking, in which the verifier can check all data blocks while the computational overhead remains acceptable. Specifically, for outsourced data $B = \{b_1, \cdots, b_n\}$, the basic idea of an audit is that the verifier uses a random index $j$ to challenge the CSP. Upon receiving the challenge, the CSP needs to compute the corresponding witness $wit_{b_j}$ for the target data block $b_j$, and all data blocks except $b_j$ are used in the calculation of the witness. The CSP returns both the target data block $b_j$ and the calculated witness as a response, namely $(wit_{b_j}, b_j)$. Thus, even a small change in the outsourced data changes the generated response, which then cannot pass verification. In other words, the CSP cannot generate a valid response if the data are not actually stored or have been corrupted at the CSP. Therefore, every data block is covered by the check rather than only a sample, which provides 100% assurance of data integrity.
Moreover, the client not only holds the data file digest $acc_B$, which is the key information used in verification, but also saves a copy of $acc_B$ to the blockchain during the data upload phase. During the audit, the role of the verifier can be assumed by either the client or a blockchain smart contract. Clients usually only need to conduct private audits, and public auditing by a smart contract is triggered only when verification fails in private auditing. A dispute may arise at this point: if the data file digest has been broken by the client, the CSP's response cannot pass verification even if the data have not been corrupted by the CSP. In this case, the blockchain smart contract is triggered to conduct public auditing for fair judgement, since the digest $acc_B$ saved in the blockchain cannot be broken by anyone due to the non-tamperability property of the blockchain. Therefore, this blockchain-based hybrid auditing scheme with fair arbitration can solve the problem of distrust between the CSP and the client.
Mathematics 2022, 10, 1005

It is worth noting that we replace the TPA with a blockchain smart contract in our public auditing phase. Because a smart contract is code that executes automatically on the blockchain, a smart contract designed with fair arbitration ensures that the client is compensated automatically and in a timely manner when the outsourced data are corrupted by the CSP. Furthermore, since a dishonest client may falsely claim compensation from the CSP, we resort to digital signature technology to prevent such dishonest behavior by the client. The contributions of our work are summarized as follows:

• We present an efficient hybrid data auditing scheme for the IoT and intelligent multimedia by using the blockchain. By applying deterministic cryptographic techniques and the blockchain, our proposed design can fairly solve the problem of distrust between the CSP and the client. It also makes the auditing scheme more reliable, because the deterministic methods provide 100% data possession and integrity guarantees.
• We enforce a healthy ecosystem to punish dishonest CSPs automatically and provide timely compensation to the client for data corruption by the CSP in the proposed scheme.
• Our scheme can protect honest CSPs and prevent fraudulent claims by dishonest clients at the same time.
• We use the hybrid auditing design in the proposed scheme. It can also save audit fees and communication costs for the client, because the public auditing phase is triggered only when verification fails in the private auditing phase.
• We not only theoretically prove the correctness and soundness of our scheme but also experimentally verify the feasibility and efficiency of the scheme by the prototype's implementation.
The rest of the paper is structured as follows. Section 2 overviews some related works, and Section 3 introduces the preliminaries used in our scheme. After that, we give the system model of our proposed design, including the architecture overview, threat model, and security goals, in Section 4. Section 5 presents a detailed description of our proposed scheme. In Section 6, we present security analysis and some characteristics of our proposed scheme. We show the performance evaluation in Section 7 and conclude the paper in Section 8.

Data Auditing Schemes
With the increase in demand for outsourced data integrity checking, many data auditing schemes have been proposed [29,30]. Ateniese et al. [11] introduced the notion of provable data possession (PDP), which was the first public audit scheme to verify the authenticity of data, in 2007. However, data privacy protection and the full data dynamic operation cannot be supported in this scheme [12]. Then, Erway et al. [13] proposed a PDP scheme supporting full dynamic data updating. Since then, to achieve more functions and improve the efficiency of data auditing for remote data, a lot of research has been conducted in this area. Wang et al. [14] proposed a public data auditing scheme supporting data dynamic operations. In [15], the follow-up work supports privacy-preserving multiple-task auditing. Yuan et al. [16] proposed a public audit scheme for dynamic data sharing with the help of doubly linked information tables. In [17], the authors used the data structure of a Merkle hash tree to devise a public auditing scheme in which the communication overhead and verification efficiency are greatly taken into account. In addition to public auditing, private auditing is necessary in some cases [21][22][23][24]. Furthermore, various PDP models have been proposed [25][26][27]. The PDP method above allows a verifier to verify the remote data integrity without retrieving or downloading all of the data, only randomly selecting a few data blocks and then performing integrity verification on those chosen data blocks instead of checking the whole dataset. Thus, this is a probabilistic method and cannot provide a 100% guarantee for the data's integrity.
Due to the limitations of hardware, few deterministic auditing schemes have been proposed. In [31], the authors proposed the first deterministic public auditing mechanism, but it did not support dynamic data operations. In [32], Deswarte et al. proposed an auditing scheme based on the Diffie-Hellman cryptographic protocol. However, it incurred a high computational overhead, because the CSP must compute an exponentiation over the entire file for each audit. Filho et al. [33] proposed a simple deterministic data integrity checking protocol based on a homomorphic RSA-based hash function, but the computation cost remained high and dynamic data were not supported. In [34], Sebé et al. proposed a scheme that reduces the computational overhead but supports neither public auditing nor dynamic data. Barsoum [35] proposed a multi-copy provable data possession scheme supporting the public verifiability of multiple replicas of the data. In [36], Hao et al. proposed a privacy-preserving data integrity auditing scheme that supports public auditing and data dynamics.
With the development of the IoT and intelligent multimedia, several data auditing schemes for the IoT and intelligent multimedia services have been proposed. In [28], the authors devised a data audit mechanism by using a bilinear mapping accumulator for sensor data. The proposed design can check all data blocks, thereby eliminating the possibility of any server-side operation. However, existing data auditing solutions cannot solve the trust issue between the data owner and the CSP (or TPA). For private auditing schemes, since the key information used in verification is stored only on the client's side locally, this cannot solve the problem of client fraud. For public auditing schemes, the data owner usually resorts to a TPA to check the outsourced data integrity, but this is unrealistic as a fully trusted TPA may not always exist. As mentioned in [18], the involvement of a TPA may lead to data loss or abuse of authority.

Blockchain and Smart Contracts
As the core technology of the emerging cryptocurrencies, the blockchain is essentially a distributed database where the transactions are batched into an ordered growing list of blocks which are linked using cryptography. As is well known, the blockchain has the characteristics of decentralization, immutability, and distributed storage. Smart contracts [37] are executable, pre-agreed programs running automatically on the blockchain. Based on the properties and functions of the blockchain, many typical applications such as decentralized storage [38][39][40][41], crowdsourcing systems [42][43][44][45], medical data management [46,47], and distributed ledger technologies [48,49] have been built.
Recently, Wang et al. [19] leveraged smart contracts to design a blockchain-based fair payment scheme to replace TPAs for public cloud auditing. Yuan et al. [20] proposed a blockchain-based public auditing and secure deduplication scheme which supports automatic compensation of users for data corruption and automatic punishment of malicious CSPs by using a smart contract, but users need to pay an audit fee to the miners of the blockchain in each verification. Wang et al. [24] proposed a blockchain-based private provable data possession scheme which not only saves storage space but also greatly improves efficiency. However, it has no mechanism for automatic punishment and compensation when the outsourced data are corrupted by a CSP. Moreover, these blockchain-based auditing schemes all follow the probabilistic approach. In contrast to the previous work, our work uses a different auditing design which combines private auditing and public auditing in a deterministic approach. We aim to ensure reliable data integrity verification and financial fairness in the data auditing scheme so that both the clients and CSPs are incentivized to behave in a trustworthy manner while the clients save on auditing fees. Table 1 shows the comparison between our proposed design and some related existing auditing schemes.

Bilinear Mapping
Let $G_1$, $G_2$, and $G_T$ be three cyclic groups of prime order $p$. We use $g_1$ and $g_2$ to denote generators of $G_1$ and $G_2$, respectively. A bilinear mapping (pairing) is a mapping $e : G_1 \times G_2 \rightarrow G_T$ with the following properties:
• Bilinearity: For all $x, y \in \mathbb{Z}_p$, $e(g_1^x, g_2^y) = e(g_1, g_2)^{xy}$.
• Non-degeneracy: $e(g_1, g_2) \neq 1$.
• Computability: For all $x, y \in \mathbb{Z}_p$, there exists an efficient algorithm to compute $e(g_1^x, g_2^y)$.
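The algebra of a pairing can be explored with a toy model. The sketch below is not a real pairing and offers no security: it assumes a symmetric setting and stores each group element as its exponent modulo a prime P, so that multiplying elements adds exponents and the "pairing" multiplies them. The class and function names (SourceElem, TargetElem, pair) and the choice of P are hypothetical and serve only to check identities such as bilinearity.

```python
from dataclasses import dataclass

P = 2**61 - 1  # toy prime group order (a Mersenne prime), chosen for illustration


@dataclass(frozen=True)
class SourceElem:
    """Element g^x of the source group, stored as its exponent x (toy model)."""
    x: int

    def __mul__(self, other: "SourceElem") -> "SourceElem":
        return SourceElem((self.x + other.x) % P)   # g^a * g^b = g^(a+b)

    def __pow__(self, k: int) -> "SourceElem":
        return SourceElem((self.x * k) % P)         # (g^a)^k = g^(a*k)


@dataclass(frozen=True)
class TargetElem:
    """Element e(g, g)^x of the target group, stored as its exponent x."""
    x: int

    def __mul__(self, other: "TargetElem") -> "TargetElem":
        return TargetElem((self.x + other.x) % P)

    def __pow__(self, k: int) -> "TargetElem":
        return TargetElem((self.x * k) % P)


def pair(a: SourceElem, b: SourceElem) -> TargetElem:
    """Toy pairing: e(g^a, g^b) = e(g, g)^(a*b)."""
    return TargetElem((a.x * b.x) % P)


g = SourceElem(1)  # the generator g = g^1
```

Because exponents are visible in this model, discrete logarithms are trivial; a real deployment would rely on a pairing-friendly elliptic curve library instead.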

q-SDH Assumption
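Here we assume that $G_1 = G_2 = G$. Let $G$ be a finite cyclic group of order $p$, where $p$ is a prime number whose length is $\kappa$ bits. Then, for a randomly chosen element $\alpha \in \mathbb{Z}_p^*$, a random generator $g$ of $G$, and any PPT algorithm $\mathcal{A}$, the following holds:

$$\Pr\left[\mathcal{A}\left(g, g^{\alpha}, \cdots, g^{\alpha^{q}}\right) = \left(c, g^{1/(\alpha + c)}\right)\right] \le \epsilon(\kappa),$$

where $c \in \mathbb{Z}_p$ and $\epsilon(\kappa)$ is negligible in $\kappa$.

The bilinear map accumulator to be used in our data integrity auditing scheme is based on the properties of bilinear mapping and the q-SDH assumption described above. See [28] for details.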

Smart Contract
The concept of a smart contract was first proposed by Nick Szabo [37] in 1995. It is an executable, pre-agreed program that runs automatically on the blockchain according to its content. Developers can build distributed applications such as voting, financial transactions, and signing agreements based on smart contracts on Ethereum. When deploying a smart contract, it is necessary to preset the trigger condition and the corresponding response rule. After the smart contract is deployed, once an event triggers the terms of the contract, the code is executed automatically without central authorization. The relevant details can be found in [37].

System Model
We present the system model, threat model, and security goals of our proposed hybrid auditing scheme with fair arbitration for data integrity verification in this section.

Architecture Overview
The traditional cloud data integrity auditing scheme consists of three roles (the client, the verifier, and the CSP), as shown in Figure 1. The client is the data owner who wants to outsource their personal data to a CSP, the CSP provides outsourced data storage and management services for the client, and the verifier is in charge of auditing the outsourced data's integrity. The role of the verifier can be performed by a client or a third-party auditor (TPA), which correspond to private auditing and public auditing, respectively. Our proposed blockchain-based hybrid auditing scheme extends the work performed in [20,28]. The integrity verification scheme in [28] supports both private and public auditing, but both of these types of audits have some defects: (1) In private auditing, when verification fails, we cannot determine whether the CSP has damaged the data because the file digest $acc_B$ used in verification is stored on the client's side locally. If the file digest is broken by the client, the CSP's response cannot pass verification. At this point, fair arbitration is required, because the CSP needs to compensate the client for data corruption in this case. (2) In public auditing, the client resorts to a TPA to audit the data's integrity. However, it is unrealistic that the correctness of the auditing results depends entirely on the TPA.
Based on the defects in the above two types of auditing in [28], we propose an arbitrable hybrid auditing scheme based on the blockchain. In this scheme, the client first conducts a private audit, and in case verification fails, it will trigger the blockchain to perform a public audit for fair judgement. It is worth noting that the TPA is replaced by the blockchain in public audits in our scheme, and we designed a smart contract with fair arbitration for the client. Using this smart contract, the client can be compensated automatically when data are broken by the CSP. Our scheme includes three different roles: the client, CSP, and blockchain. As shown in Figure 2, the interactions among them are described as follows:

1. The client has a large amount of data and needs to outsource this data to the CSP for maintenance and computation but does not save the copy locally. It also stores a file digest copy on the blockchain. Then, the client challenges the CSP and verifies the response coming from the CSP. If the response from the CSP passes verification, the outsourced data are considered complete; otherwise, they are considered incomplete. This case will trigger the blockchain to perform public auditing for fair judgement.

2. The CSP has huge storage space and computation resources to provide outsourced data storage and management services for the client. Upon receiving the client's (or blockchain's) challenge, the CSP sends the generated response to the client (or blockchain).

3. The blockchain stores copies of the file digests for the client. When the blockchain is triggered to perform public auditing for fair judgement, the blockchain will challenge the CSP and then verify the response coming from the CSP. If the response can pass verification, the remote data are complete; otherwise, they are determined to be incomplete, since the data file digest $acc_B$ saved in the blockchain will not be broken by anyone due to the non-tamperability property of the blockchain. Therefore, the failure of verification must be due to damage to the outsourced data by the CSP, and the client obtains compensation from the CSP automatically through a smart contract.
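The interaction among the three roles amounts to a two-stage decision, which can be sketched as follows. This is a minimal illustration under stated assumptions: the two checks are passed in as callables standing for the client-side and on-chain verifications, and the function name and the returned labels are hypothetical, not identifiers from the scheme.

```python
from typing import Callable


def hybrid_audit(private_check: Callable[[], bool],
                 public_check: Callable[[], bool]) -> str:
    """Run the private audit first; escalate to the on-chain audit only on failure."""
    if private_check():
        # The client's own verification passed: no public audit, no audit fee.
        return "complete"
    # Dispute: the blockchain re-audits against the digest copy it stores.
    if public_check():
        # The on-chain digest accepts the CSP's response, so the data are complete.
        return "complete_on_chain"
    # The tamper-proof digest rejects the response: the CSP is held responsible.
    return "csp_compensates"
```

The public check is evaluated lazily, which mirrors the point that audit fees are only incurred when the private audit fails.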

Threat Model and Design Goals
Both the CSP and the clients can be dishonest in our scheme. It is assumed that the CSP has no motivation to disclose the managed data to others or to drop them. However, the data stored on the CSP may be damaged due to software or hardware bugs or hacker attacks, and the CSP may conceal data corruption to avoid paying compensation. Clients, in turn, may modify the file digests used in private auditing so that verification fails, in order to obtain compensation from the CSP. Moreover, an attacker may impersonate a real client (i.e., the real data owner) to obtain compensation from the CSP.
In this scheme, we aim to achieve the following security goals:
• Correctness: If the outsourced data have not been broken by the CSP, and if the client and CSP execute the proposed scheme honestly, then the response from the CSP can pass verification;
• Soundness: The response from the CSP can pass verification only when the outsourced data are complete;
• Privacy preserving: The auditing process does not disclose any private information about the data;
• Dynamic operations support: The client can insert, delete, and update the data outsourced to the CSP, and the auditing scheme remains applicable after such operations;
• Timely compensation: The client can obtain compensation from the CSP in a timely manner when the outsourced data are damaged by the CSP.
In order to better understand our audit scheme's construction, the major notations and their meanings in this paper are listed in Table 2.

| Notation | Meaning |
| --- | --- |
| $e$ | A bilinear pairing |
| $G, G_1, G_2, G_T$ | Cyclic groups with order $p$ |
| $g$ | Generator of group $G$ |
| $F$ | The original data file to be divided into $n$ segments $f_1, \cdots, f_n$ |
| $c_i$ | The ciphertext of segment $f_i$ |
| $\tau_i$ | The tag of segment $c_i$, i.e., $\tau_i = H(c_i)$ |
| $B = \{b_1, \cdots, b_n\}$ | The processed data file |
| $acc_B$ | The accumulated value of data file $B$ by the bilinear pair accumulator |
| $wit_{b_j}$ | The witness calculated by the CSP |
| $\sigma$ | The signature of file $B$ |
| $ssk, spk$ | The private key $ssk$ and public key $spk$ for a digital signature algorithm |
| $sk_{acc}, pk_{acc}$ | The secret key $sk_{acc}$ and public key $pk_{acc}$ for a bilinear pair accumulator |

Scheme Construction
We first give the main idea of our design and then show the auditing scheme in detail.

Main Idea
Our proposed blockchain-based hybrid auditing scheme extends the work in [20] and [28], but the difference is that the role of the verifier can be the client or blockchain smart contract, which enables our scheme to audit with fair arbitration and timely compensation, saving audit fees and communication costs for the client.
In our proposed blockchain-based hybrid auditing scheme, to save audit fees and communication costs for the client, the audit phase is executed by the client first (i.e., the client assumes the role of the verifier for the audit). If verification is passed, then the outsourced data are complete; otherwise, the outsourced data are possibly incomplete. At this point, the blockchain smart contract is triggered for fair arbitration; that is, the blockchain smart contract assumes the role of the verifier and audits again. Meanwhile, both the client and the CSP send deposits to the smart contract, and the client signs the smart contract with the CSP. First, the smart contract submits the client's deposit to the miners as an audit fee. Second, if verification is passed, the outsourced data are complete and the CSP's deposit is returned; otherwise, the smart contract sends the CSP's deposit to the client as compensation. If verification fails, then the outsourced data are incomplete, and the failure must have been caused by the CSP, since the file digest $acc_B$ used for verification is stored on the blockchain, which has the property of non-tamperability.
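The deposit handling described above can be summarized in a few lines. The sketch below is a plain Python model of the payout rule, not contract code: the names Settlement and settle and the integer amounts are assumptions made for illustration, and the ownership check through the signature is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settlement:
    to_miners: int   # audit fee paid out of the client's deposit
    to_csp: int      # amount returned to the CSP
    to_client: int   # compensation paid to the client


def settle(client_deposit: int, csp_deposit: int,
           verification_passed: bool) -> Settlement:
    """Distribute both deposits after the on-chain (second-phase) audit."""
    if verification_passed:
        # Data are complete: the CSP gets its deposit back.
        return Settlement(to_miners=client_deposit, to_csp=csp_deposit, to_client=0)
    # Data are incomplete: the CSP's deposit compensates the client.
    return Settlement(to_miners=client_deposit, to_csp=0, to_client=csp_deposit)
```

In both branches the client's deposit goes to the miners, so a client who triggers arbitration without cause still pays the audit fee.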

A Concrete Scheme
The outsourced data are assumed to be static in our scheme. Our hybrid auditing scheme consists of three phases (the set-up phase, the data upload phase, and the audit phase) and extends Ren et al.'s construction [28] as follows:
• Set-up phase: The bilinear pairing instance generator is used to generate cyclic groups $G_1$, $G_2$, and $G_T$ of prime order $p$ and a bilinear map $e : G_1 \times G_2 \rightarrow G_T$, and a secret $s \xleftarrow{R} \mathbb{Z}_p^*$ is sampled. For simplicity, we assume $G_1 = G_2 = G$, but this is not essential. The generator of $G$ is denoted by $g$. Let $sk_{acc} = s$ be the secret key and $pk_{acc} = (g, g^{s}, \cdots, g^{s^n})$ be the public key, where $n$ is the upper bound on the number of elements to accumulate. Each client generates the private key $ssk$ and public key $spk$ for a digital signature algorithm. Then $sk_{acc}$ and $ssk$ are the secret parameters, and the public parameters of our scheme are $pk_{acc}$, $spk$, $e$, and $g$. Let $H$ be a cryptographic hash function.
• Data upload phase: The client divides the file $F$ into $n$ segments of $l_1$ bits each (i.e., $F = (f_1, \cdots, f_n)$) and then performs the following procedure:
(1) Each segment is encrypted separately using asymmetric encryption techniques such that $c_i = E(f_i)$ for $i = 1, \cdots, n$.
(2) An $l_2$-bit tag $\tau_i$ is generated for each segment $c_i$ such that $\tau_i = H(c_i)$ for $i = 1, \cdots, n$. The tags are saved in the tag index table (TIT).
(3) Each tag is appended to its corresponding segment to form a data block $b_i = c_i \| \tau_i$, such that $B = \{b_1, \cdots, b_n\}$.
(4) The accumulated value of the processed file set $B$ is calculated with the bilinear pair accumulator (i.e., $acc_B = g^{\prod_{i=1}^{n}(b_i + s)}$), and the signature $\sigma = Sig_{ssk}(name)$ is computed, where $name \in \mathbb{Z}_p$ is the identifier of file $B$, which is uniformly and randomly chosen by the client.
(5) The client stores the copies of the TIT, the signature $\sigma$, and the auxiliary value $aux = (acc_B, e, g, pk_{acc})$ on the blockchain.
(6) The processed data file $B$, the signature $\sigma$, and $pk_{acc} = (g, g^{s}, \cdots, g^{s^n})$ are uploaded to the CSP.
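Steps (1) to (3) of the data upload phase can be sketched as below. This is an illustration under several assumptions that the scheme does not fix: SHA-256 stands in for the hash function H (so the tag length is 32 bytes), segments are measured in bytes rather than bits, and the encryption algorithm E is supplied by the caller as a callable. The function names prepare_upload and split_block are hypothetical.

```python
import hashlib
from typing import Callable, List, Tuple


def prepare_upload(data: bytes, seg_len: int,
                   encrypt: Callable[[bytes], bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Return (blocks, tit): the data blocks b_i = c_i || tau_i and the tag index table."""
    # Divide the file F into segments f_1, ..., f_n of seg_len bytes (last may be shorter).
    segments = [data[i:i + seg_len] for i in range(0, len(data), seg_len)]
    # (1) Encrypt each segment separately: c_i = E(f_i).
    ciphertexts = [encrypt(f) for f in segments]
    # (2) Tag each ciphertext: tau_i = H(c_i); the tags form the TIT.
    tit = [hashlib.sha256(c).digest() for c in ciphertexts]
    # (3) Append each tag to its segment: b_i = c_i || tau_i.
    blocks = [c + t for c, t in zip(ciphertexts, tit)]
    return blocks, tit


def split_block(block: bytes, tag_len: int = 32) -> Tuple[bytes, bytes]:
    """Recover (c_j, tau_j) from a returned block, as the verifier does during an audit."""
    return block[:-tag_len], block[-tag_len:]
```

The accumulator value and the signature of steps (4) to (6) are omitted here because they depend on the pairing group and the signature algorithm chosen for a concrete deployment.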
• First-phase audit: The client interacts with the CSP as follows:
(1) The client uses a random index $j$ to challenge the CSP.
(2) Upon receiving the challenge, the CSP needs to calculate the witness $wit_{b_j} = acc_B^{(b_j + s)^{-1}}$ of element $b_j$, but the witness cannot be calculated directly since $s$ is unknown. However, the CSP can express the witness as $wit_{b_j} = \prod_{i=0}^{n-1} (g^{s^i})^{a_i}$ using $pk_{acc} = (g, g^{s}, \cdots, g^{s^n})$, where $\{a_0, \cdots, a_{n-1}\}$ are the coefficients of the polynomial $f(s) = \prod_{b \in B \setminus \{b_j\}} (b + s) = \sum_{i=0}^{n-1} a_i s^i$. Note that the CSP uses all elements in $B$ except $b_j$ to compute $wit_{b_j}$.
(3) The CSP returns $(wit_{b_j}, b_j)$ as a response to the client.
(4) After receiving $(wit_{b_j}, b_j)$, the client checks whether $e(acc_B, g) = e(wit_{b_j}, g^{b_j} g^{s})$ holds. Meanwhile, the client extracts the corresponding segment $c_j^*$ and its tag $\tau_j^*$ from the block $b_j$ returned by the CSP and compares the extracted tag $\tau_j^*$ with the original tag $\tau_j$ stored in the TIT. If they are equal, the client computes $\tau_j' = H(c_j^*)$ from the extracted data segment $c_j^*$ and checks whether the computed tag $\tau_j'$ equals the original tag $\tau_j$.
(5) If verification is passed, the client outputs "1", meaning the outsourced data are complete; otherwise, it outputs "0", which triggers the blockchain's smart contract to perform a second-phase audit for fair arbitration.
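How the CSP obtains the witness from $pk_{acc}$ without knowing $s$ can be illustrated with a small numeric model. The sketch below makes strong simplifying assumptions: it works in the order-1019 subgroup of the integers modulo the safe prime 2039 (a toy group with no pairing and no security), blocks are small integers, and the final check uses the secret $s$ directly as a stand-in for the pairing equation, which in a prime-order group is equivalent to $wit_{b_j}^{\,b_j + s} = acc_B$. All function names and constants are hypothetical.

```python
from typing import List

M = 2039   # toy safe prime modulus, M = 2 * Q + 1
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of the order-Q subgroup (a quadratic residue mod M)


def keygen(s: int, n: int) -> List[int]:
    """pk_acc = (g, g^s, ..., g^(s^n)), computed by the party that knows s."""
    return [pow(G, pow(s, i, Q), M) for i in range(n + 1)]


def accumulate(blocks: List[int], s: int) -> int:
    """acc_B = g^(prod_i (b_i + s)), computed by the client."""
    exp = 1
    for b in blocks:
        exp = exp * (b + s) % Q
    return pow(G, exp, M)


def poly_coeffs(values: List[int]) -> List[int]:
    """Coefficients a_0, a_1, ... (lowest degree first) of prod_b (b + x) mod Q."""
    coeffs = [1]
    for b in values:
        nxt = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            nxt[i] = (nxt[i] + a * b) % Q        # contribution of a * b * x^i
            nxt[i + 1] = (nxt[i + 1] + a) % Q    # contribution of a * x^(i+1)
        coeffs = nxt
    return coeffs


def witness(blocks: List[int], j: int, pk: List[int]) -> int:
    """CSP side: wit_{b_j} = prod_i (g^(s^i))^(a_i), using every block except b_j."""
    others = blocks[:j] + blocks[j + 1:]
    wit = 1
    for a_i, g_si in zip(poly_coeffs(others), pk):
        wit = wit * pow(g_si, a_i, M) % M
    return wit


def verify_with_secret(acc: int, wit: int, b_j: int, s: int) -> bool:
    """Stand-in for e(acc_B, g) = e(wit, g^(b_j) g^s): checks wit^(b_j + s) == acc_B."""
    return pow(wit, (b_j + s) % Q, M) == acc
```

Since every block other than $b_j$ enters the polynomial, changing any one of them changes the witness, which is what makes the check deterministic rather than sampled.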

• Second-phase audit: The blockchain smart contract interacts with the CSP as follows:
(1) The blockchain smart contract uses a random index $j$ to challenge the CSP.
(2) Upon receiving the challenge, the CSP calculates the witness $wit_{b_j}$ of the element $b_j$ using $pk_{acc} = (g, g^{s}, \cdots, g^{s^n})$.
(3) The CSP sends $(wit_{b_j}, b_j)$ as a response to the blockchain smart contract.
(4) After receiving $(wit_{b_j}, b_j)$, the blockchain smart contract checks whether $e(acc_B, g) = e(wit_{b_j}, g^{b_j} g^{s})$ holds. Meanwhile, it extracts the corresponding segment $c_j^*$ and its tag $\tau_j^*$ from the block $b_j$ returned by the CSP and compares the extracted tag $\tau_j^*$ with the original tag $\tau_j$ stored in the TIT. If they are equal, it computes $\tau_j' = H(c_j^*)$ from the extracted data segment $c_j^*$ and checks whether the computed tag $\tau_j'$ equals the original tag $\tau_j$.
(5) If verification is passed, the output is "1", meaning the outsourced data are complete, and the smart contract automatically returns the CSP's deposit and submits the deposit of the client to the miners as an audit fee; otherwise, the output is "0", meaning the outsourced data are not complete. In this case, the CSP verifies the signature $\sigma$ with the public parameter $spk$, and the smart contract submits the CSP's deposit to the client as compensation only if the outsourced data belong to the client. The smart contract then submits the deposit of the client to the miners as an audit fee.
In order to better understand the caculations of digest acc B and the witness in the above audit scheme construction, an illustrative example is given below in Figure 3. If verification is passed, output "1" is reached, meaning the outsourced data are complete, and the smart contract automatically returns the CSP's deposit and submits the deposit of the client to the miner as an audit fee; otherwise, output "0" is reached, meaning the outsourced data are not complete. In this case, the CSP verifies the signature σ by public parameter spk, the smart contract submits the CSP's deposit to the client as compensation only if the outsourced data belong to the client, and then the smart contract submits the deposit of the client to the miners as an audit fee.
To better understand the calculation of the digest $acc_B$ and the witness in the above audit scheme construction, an illustrative example is given in Figure 3. Suppose the data processed by the client are $B = \{b_1, b_2, b_3\}$. The client first calculates the digest $acc_B = g^{\prod_{i=1}^{3}(b_i+s)}$. In the audit phase, upon receiving the challenge index $j$ (suppose $j = 2$) from the verifier (which can be the client or the blockchain smart contract), the CSP needs to calculate the witness $wit_{b_2} = g^{\prod_{i=1, i \neq 2}^{3}(b_i+s)}$, but the witness cannot be calculated directly since $s$ is unknown. Let $f(s) = \prod_{i=1, i \neq 2}^{3}(b_i+s) = b_1 b_3 + (b_1+b_3)s + s^{2}$. The CSP can express the witness as $wit_{b_2} = \prod_{i=0}^{2}(g^{s^{i}})^{a_i} = (g)^{b_1 b_3} \cdot (g^{s})^{b_1+b_3} \cdot (g^{s^{2}})^{1}$ using $pk_{acc} = (g, g^{s}, \cdots, g^{s^{n}})$, where $a_i$ is the coefficient of $s^{i}$ in the polynomial $f(s)$. The CSP sends $(wit_{b_2}, b_2)$ as a response to the verifier. We can see that the CSP needs to use all elements in $B$ except $b_j$ to compute the witness $wit_{b_j}$.

Analysis of Our Design
In this section, we analyze the security and characteristics of our scheme. It is assumed that the underlying cryptographic tools, such as the bilinear pairing instance generator algorithm, bilinear map accumulator, one-way hash function, asymmetric encryption algorithm, and digital signature algorithm, are secure.

Security Analysis
Our scheme should satisfy the correctness and soundness requirements. Correctness means that the response provided by the CSP can pass verification if the outsourced data on the CSP are not corrupted. Soundness means that verification can be passed only if the outsourced data on the CSP are not corrupted. We give the following theorems to prove that these requirements are satisfied in our proposed scheme:
Theorem 1. For correctness, suppose that both the CSP and client execute the proposed scheme honestly. If the outsourced data on the CSP are not broken, then the CSP's response can pass verification.
Proof. Suppose that both the CSP and client execute the proposed scheme honestly. As described in our scheme, the CSP's response $(wit_{b_j}, b_j)$ can pass verification only when the following two conditions are met: (1) $e(acc_B, g) = e(wit_{b_j}, g^{b_j} g^{s})$.
(2) When the data segment $c_j^{*}$ and its corresponding tag $\tau_j^{*}$ are extracted from the target block $b_j$ returned by the CSP, the extracted tag $\tau_j^{*}$ equals the original tag $\tau_j$ stored in the TIT, and the tag $H(c_j^{*})$ calculated from the extracted data segment $c_j^{*}$ also equals the original tag $\tau_j$.
If the outsourced data on the CSP are not broken, then the first condition is met since the following equations hold due to the properties of bilinear mapping:
$$e(acc_B, g) = e\left(g^{\prod_{b_i \in B}(b_i+s)}, g\right) = e\left(g^{\prod_{b_i \in B \setminus \{b_j\}}(b_i+s)}, g^{(b_j+s)}\right) = e\left(wit_{b_j}, g^{b_j} g^{s}\right),$$
where the data file digest is $acc_B = g^{\prod_{i=1}^{n}(b_i+s)} = g^{\prod_{b_i \in B}(b_i+s)}$ and the witness returned by the CSP is $wit_{b_j} = g^{\prod_{b_i \in B \setminus \{b_j\}}(b_i+s)}$. Meanwhile, the client extracts the corresponding segment $c_j^{*}$ and its tag $\tau_j^{*}$ from the target block $b_j$ returned by the CSP. If the outsourced data are complete, then the extracted tag $\tau_j^{*}$ and the original tag $\tau_j$ stored in the TIT must be equal, and the tag $H(c_j^{*})$ calculated from the extracted data segment $c_j^{*}$ must also equal the original tag $\tau_j$ stored in the TIT, so the second condition is met. Therefore, if the outsourced data on the CSP are not broken, then the CSP's response can pass verification.

Theorem 2. Regarding soundness, verification can be passed only if the outsourced data on the CSP are not corrupted. In other words, if the outsourced data are not complete, then verification cannot be passed.
Proof. In our proposed scheme, verification can be passed in either the first-phase auditing or the second-phase auditing. In both cases, after receiving $(wit_{b_j}, b_j)$ from the CSP, verification can be passed only when the following two conditions are met: (1) $e(acc_B, g) = e(wit_{b_j}, g^{b_j} g^{s})$.
(2) When the data segment $c_j^{*}$ and its corresponding tag $\tau_j^{*}$ are extracted from the target block $b_j$ returned by the CSP, the extracted tag $\tau_j^{*}$ equals the original tag $\tau_j$ stored in the TIT, and the tag $H(c_j^{*})$ calculated from the extracted data segment $c_j^{*}$ also equals the original tag $\tau_j$.
Upon receiving the challenge, the CSP needs to compute the corresponding witness $wit_{b_j}$ for the target data block $b_j$, and all data blocks except $b_j$ must be used in the calculation of the witness. Thus, even a small change in the outsourced data causes the generated witness to change. Moreover, the CSP returns both the target data block and the calculated witness as a response. If the target data block $b_j$ is corrupted, then either the extracted tag $\tau_j^{*}$ is not equal to the original tag $\tau_j$ stored in the TIT or the tag $H(c_j^{*})$ calculated from the extracted data segment $c_j^{*}$ is not equal to the original tag $\tau_j$ stored in the TIT, because the security of the hash algorithm makes it almost impossible to generate the same tag from a different data block. Thus, a valid response cannot be generated by the CSP if the data are not actually saved or the data are corrupted on the CSP. Therefore, if the outsourced data are not complete, then verification cannot be passed.
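The two tag comparisons in condition (2) can be sketched as follows. This is a simplified model under stated assumptions: the returned block is represented as an already-extracted pair (segment, tag) rather than a concrete encoding, SHA-256 from Python's `hashlib` stands in for the unspecified hash function $H$, and the names `tag` and `check_block` are hypothetical.

```python
import hashlib


def tag(segment: bytes) -> bytes:
    """Stand-in for the tag function H; SHA-256 is an assumption."""
    return hashlib.sha256(segment).digest()


def check_block(segment: bytes, carried_tag: bytes, tit_tag: bytes) -> bool:
    """Return True only if both tag comparisons of condition (2) succeed."""
    # First comparison: the tag carried in the returned block must equal
    # the original tag stored in the tag index table (TIT).
    if carried_tag != tit_tag:
        return False
    # Second comparison: the tag recomputed from the returned segment
    # must also equal the original tag in the TIT.
    return tag(segment) == tit_tag


stored_segment = b"encrypted segment c_j"
tit_entry = tag(stored_segment)          # recorded by the client at upload

intact = check_block(stored_segment, tit_entry, tit_entry)
tampered = check_block(b"corrupted segment", tit_entry, tit_entry)
```

Here `intact` evaluates to true and `tampered` to false: a modified segment fails the second comparison even when the CSP returns the original tag unchanged.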

Other Characteristics
Our scheme has the following properties:

• Privacy preservation: Before uploading the data, the client divides the data file $F$ into $n$ segments of $l_1$ bits each; that is, $F = (f_1, \cdots, f_n)$, and then each segment is encrypted separately using asymmetric encryption techniques such that $c_i = E(f_i)$ for $i = 1, \cdots, n$. Note that the client does not disclose the key for the encrypted data to others during the whole auditing process, so no one except the client can access the outsourced data;
• Dynamic operations support: Our proposed scheme also supports dynamic data operations such as inserting, deleting, and updating by using the tag index table (TIT), similar to the method in [28], ensuring that the auditing scheme remains applicable after dynamic operations;
• Timely compensation: As described in our scheme, the blockchain smart contract must be triggered for fair arbitration if the outsourced data are corrupted by the CSP. The client signs a smart contract with the CSP, and both the client and CSP send deposits to the smart contract. First, the smart contract submits the deposit of the client to the miners as an audit fee. Second, if verification is passed, it returns the CSP's deposit; otherwise, the deposit of the CSP is sent to the client as compensation via the smart contract. Thus, if the outsourced data are corrupted by the CSP, then verification cannot be passed, so the smart contract submits the deposit to the client as compensation automatically after the CSP verifies the signature σ; that is, the client can obtain compensation from the CSP in a timely manner.

Performance Evaluation
We implemented a prototype of our proposed auditing scheme in Python. We used Solidity 0.8.11 to build an Ethereum smart contract and Go Ethereum (Geth) 1.10.16 as the Ethereum client. The smart contract was deployed to the Ethereum test network.
The overhead of the smart contract comes from posting the parameter and from on-chain verification. In this scheme, only one parameter, $acc_B$, is used for on-chain verification. Figure 4 shows the gas cost for files of different sizes used in our auditing scheme. The cost of our contract implemented on Ethereum was constant: the gas cost was fixed at $4.2216 \times 10^{4}$. The total cost in ether can be calculated by the Ethereum gas rule: gasCost × gasPrice. The average gas price was about 45 Gwei, and 1 Gwei is $10^{-9}$ ether. Taking the exchange rate as 1 ether = USD 2500, our cost for deploying the auditing scheme model was about USD 4.7493, as shown in Figure 4. The results confirm that this was not a huge cost for the client.
In order to execute the polynomial operations in a bilinear map accumulator, we introduced the PBC library in the implementation. The PBC library is an open-source C library built on the GMP library that performs the mathematical operations underlying pairing-based cryptosystems. In the data upload phase, the client encrypts the data blocks to be outsourced with RSA, an asymmetric encryption algorithm. Then, the client generates conflict-free 20-bit tags using a hash algorithm. To evaluate the performance of our model, we used the following set-up. The server proxy of the client was collocated with the CSP server on 8 cores of a machine with 2.60 GHz Intel i7-6700 HQ processors and 12 GB of RAM running Ubuntu Linux.
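The USD figure reported for the on-chain cost follows directly from the gas rule (gasCost × gasPrice) and can be reproduced with a short calculation. The sketch below uses only the numbers quoted in the text; the function name `audit_cost_usd` is hypothetical, and `Decimal` is used simply to avoid floating-point rounding noise.

```python
from decimal import Decimal

GWEI_PER_ETHER = Decimal(10) ** 9   # 1 Gwei = 10^-9 ether


def audit_cost_usd(gas_used: int, gas_price_gwei: int,
                   usd_per_ether: int) -> Decimal:
    """Convert a gas amount to USD using gasCost x gasPrice."""
    cost_gwei = Decimal(gas_used) * Decimal(gas_price_gwei)
    cost_ether = cost_gwei / GWEI_PER_ETHER
    return cost_ether * Decimal(usd_per_ether)


# Values quoted in the text: 4.2216e4 gas, 45 Gwei, USD 2500 per ether.
cost = audit_cost_usd(42216, 45, 2500)
```

With these inputs the intermediate cost is 0.00189972 ether, which gives the USD 4.7493 stated above. The result scales linearly with both the gas price and the exchange rate, so the reported figure reflects the prices assumed in the text.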
The upload time of the client, the witness computation time of the CSP, and the verification time of the verifier should be considered. First, we fixed the size of the data file at 2 MB. The data file was generated randomly from the characters 0-9, a-z, and A-Z. Both the base size of the data blocks and the size of each increment were 256 bytes, and the experiment stopped when the block size reached 2 KB. For each round of experiments, we ran 100 tests with the same data and took the average as the result to reduce the error of a single run. Note that because the size of the data file was fixed, a larger block size meant that the file was divided into fewer segments and fewer tags were generated. The final experimental results are shown in Figure 5a,b. The upload time and the generation time of the witness were inversely proportional to the block size, and the time required for the verifier to conduct verification was fixed at 0.002 s, which conformed to our expectations. As the block size increased, the time required to calculate $acc_B$ and $wit$ decreased correspondingly, as shown in Figure 5c. Additionally, the order of magnitude of these parameters was about 0.01, well below the upload time. For the client, most of the time was spent on encrypting the data file. Moreover, the calculation overhead of the parameters $b$ (i.e., data segment) and $\tau$ (i.e., tag) was independent of the block size.
In addition, with the block size fixed at 256 bytes, we increased the data file size from 256 KB to 2 MB in increments of 256 KB. Figure 6a,b shows that most of the computing overhead was spent on the data upload phase, and the vast majority of this time was used to encrypt the data, as mentioned above. As shown in Figure 6c, both the time required for generating $acc_B$ by the client and the computational overhead for calculating the witness by the CSP were proportional to the size of the outsourced data file. The results show that the computational overhead of the verifier was a constant value regardless of the data file size and the data block size. Therefore, the experimental results show that our proposed auditing scheme was very effective.
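To make the two parameter sweeps concrete, the following sketch lists how many blocks each configuration produces. It is a back-of-the-envelope helper rather than part of the prototype: it assumes binary units (1 KB = 1024 bytes), assumes that a final partial block counts as a full block, and uses the hypothetical name `num_blocks`.

```python
KB = 1024        # assumption: binary units
MB = 1024 * KB


def num_blocks(file_size: int, block_size: int) -> int:
    """Number of blocks needed for the file, rounding up a partial block."""
    return -(-file_size // block_size)   # integer ceiling division


# Sweep 1: 2 MB file, block size from 256 bytes to 2 KB in 256-byte steps.
block_sweep = {bs: num_blocks(2 * MB, bs)
               for bs in range(256, 2 * KB + 1, 256)}

# Sweep 2: 256-byte blocks, file size from 256 KB to 2 MB in 256 KB steps.
file_sweep = {fs: num_blocks(fs, 256)
              for fs in range(256 * KB, 2 * MB + 1, 256 * KB)}

for bs, n in block_sweep.items():
    print(f"block size {bs:5d} B -> {n:5d} blocks")
```

Under these assumptions the first sweep drops from 8192 blocks at 256 bytes to 1024 blocks at 2 KB, while the second grows from 1024 to 8192 blocks. This is in line with the reported trends: per-block work such as witness computation shrinks as blocks get larger and grows with the file size.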
To measure the distribution of the computing overhead for the different parameters, we fixed the size of the data file at 2 MB and the block size at 256 bytes. Then, we repeated the experiment 100 times and recorded the computing overhead of the parameters for each experiment. The distribution of the computing overhead is shown in Figure 7. The computing overhead of all parameters followed a Gaussian distribution.

Conclusions and Future Work
In this paper, we designed a novel and efficient two-phase arbitrable hybrid auditing scheme based on the blockchain. By using a bilinear map accumulator and a blockchain smart contract, our scheme not only realizes deterministic checking, which provides 100% data possession and integrity guarantees, but also enables a healthy ecosystem between the client and the CSP. In addition, when the outsourced data are lost or corrupted by the CSP, our scheme can compensate the client in a timely manner and punish the dishonest CSP automatically with a smart contract. Meanwhile, our scheme also protects the honest CSP and prevents dishonest behavior by the client. Furthermore, we designed hybrid auditing in our scheme instead of conducting all auditing publicly through the blockchain. The hybrid auditing design not only provides fair judgment but also saves audit fees for the client, because the public auditing phase by the blockchain is triggered only when verification fails in the private auditing phase. Theoretical and experimental analysis verified that our design is feasible and efficient and that it achieves the desired security goals. Our scheme still has some limitations. For example, it can only check whether the outsourced data are corrupted; it cannot determine which data blocks are corrupted or how to repair them. In future work, we will extend our auditing scheme with further functions, such as locating and repairing corrupted data blocks.