A Blockchain-Based Privacy-Preserving and Fair Data Transaction Model in IoT

Abstract: The rapid development of the Internet of Things (IoT) has produced vast amounts of widely distributed data. Sharing these data can spur innovation and improve service quality. However, conventional data-sharing methods often rely on third-party intermediaries, which introduces risks of single points of failure and privacy leaks. Moreover, these methods lack a secure transaction model to compensate data sharing, which makes it difficult to ensure fair payment between data consumers and providers. Blockchain, as a decentralized, secure, and trustworthy distributed ledger, offers a novel solution for data sharing. Nevertheless, since all nodes on the blockchain can access on-chain data, data privacy is inadequately protected, and traditional privacy-preserving methods such as anonymization and generalization are ineffective against attackers with background knowledge. To address these issues, this paper proposes a decentralized, privacy-preserving, and fair data transaction model based on blockchain technology. We design an adaptive local differential privacy algorithm, MDLDP, to protect the privacy of transaction data. Verifiable encrypted signatures are employed to ensure fair payment during the data transaction process. The model replaces the individual arbitrator commonly used in traditional verifiable encrypted signature schemes with an arbitration committee, thereby reducing potential collusion between dishonest traders and the arbitrator. The arbitration committee uses threshold signature techniques to manage the arbitration private key: the full key can only be reconstructed collaboratively by any t members, which protects the key's security. Theoretical analyses and experimental results show that, compared with existing approaches, our model delivers enhanced transactional security. Moreover, MDLDP affords stronger privacy protection while preserving data availability.


Introduction
With the widespread development and application of Internet of Things (IoT) technology, vast amounts of data from IoT sensors are collected, stored, and utilized [1]. Data from a single sensor often fails to meet the requirements of users. The true value of IoT lies in the sharing and comprehensive use of diverse sensing data. Data sharing promotes the distribution of data resources, improves work efficiency and quality, and spurs innovative applications. For instance, in healthcare, data sharing can provide valuable records of patient treatments and physical examinations, helping medical professionals devise more targeted treatment plans [2]. In data markets, novel data transaction models have emerged that allow data owners to sell their information to consumers. Big data has evolved into a valuable asset [3].
However, the current IoT system architecture is based on client-server communication. IoT devices are connected to a central cloud server, which mediates communication between devices and handles and stores their data. This centralized architecture may create a single point of failure and increase security and privacy risks. Therefore, it is necessary to adopt a new solution based on a decentralized architecture [4]. Additionally, during data transactions, because of the inherent mistrust between data providers and consumers, consumers might delay or refuse payment after obtaining the data. Similarly, providers could withhold datasets after receiving compensation, which makes fair payment between the parties hard to ensure. Given these challenges in security, privacy preservation, and fair payment, many data owners are reluctant to provide their data to third-party trading platforms, and the big data industry grapples with the problem of "data islands". Hence, there is a pressing need for a decentralized data-sharing framework and a secure, efficient data transaction model that ensures fair payment and safeguards data providers' privacy during transactions.
Blockchain, as a distributed ledger, is characterized by decentralization, immutability, and auditability, and is frequently employed to address security issues tied to traditional IoT systems [5]. Compared to conventional centralized IoT systems, a decentralized IoT system based on blockchain has several advantages. First, it achieves end-to-end communication without involving centralized servers, reducing single-point failure risks and improving fault tolerance. Second, nodes on the blockchain can verify the integrity and identity information of data uploaded by other nodes, which prevents malicious data tampering and ensures the security and consistency of the blockchain network. Finally, blockchains store data and event logs in an immutable manner, giving blockchain-based IoT systems traceability and accountability.
Nevertheless, each node in the blockchain maintains a local backup of the entire blockchain to uphold the network's integrity. Because all nodes have access to blockchain data, this backup mechanism has raised growing privacy concerns: sensitive information may be exposed. Traditional privacy-preserving techniques, such as anonymization and generalization, have proven inadequate against attackers possessing background knowledge [6]. This inadequacy is evident from significant privacy leaks in datasets such as AOL [7] and Netflix [8], which call into question the effectiveness of these methods in protecting user privacy. Differential privacy, a notable method for countering attackers with background knowledge, adds random "noise" to datasets to ensure data privacy [9,10]. However, the added "noise" reduces data availability. Determining the right amount of "noise" to balance dataset availability and privacy remains a research challenge.
To ensure fair payment, verifiable encrypted signatures are commonly used to guarantee the fairness of online transactions. The concept of verifiable encrypted signatures was first introduced by Asokan [11] and involves three parties: the signer, the verifier, and a trusted third party (i.e., an arbitrator). The fundamental principle is that the signer encrypts a conventional digital signature using the arbitrator's public key and proves that the ciphertext genuinely contains a standard signature. Any verifier can use the arbitrator's public key to check its validity. Nevertheless, without the help of the signer or the arbitrator, it is impossible to extract a valid signature; apart from the signer, only the arbitrator can recover the ordinary signature from the encrypted signature. However, the impartiality and security of a single arbitrator cannot always be guaranteed. Dishonest traders might collude with the arbitrator to the detriment of the other party. Furthermore, if the arbitration node experiences a single-point failure, the potential loss of the arbitration private key could render the arbitration process infeasible.
To address these challenges, this paper proposes a decentralized, privacy-preserving, and fair IoT data transaction model based on blockchain. We develop an adaptive local differential privacy algorithm, termed MDLDP (Multiple Disturbance of Local Differential Privacy), which perturbs data before it is recorded on the blockchain, protecting local data privacy. We also use verifiable encrypted signatures to ensure fairness during transactions. The model replaces the single arbitrator commonly found in conventional verifiable encrypted signature schemes with an arbitration committee. The arbitration committee employs threshold signature techniques to manage the arbitration private key: the complete key can be reconstructed only through the collaboration of any t members. Assuming that any t members act securely and honestly, the impartiality and security of the arbitration committee can be guaranteed. This setup guards against both the loss of the arbitration private key and collusion between dishonest traders and arbitrators. Theoretical analysis and experimental results indicate that our model offers stronger transaction security than existing approaches. Additionally, the MDLDP algorithm provides stronger privacy protection while preserving data availability.

Blockchain and Differential Privacy
With the widespread adoption of blockchain technology, a secure and distributed ecosystem has been established for the Internet of Things (IoT). This technology is increasingly being used across diverse IoT sectors to build robust data-sharing solutions [12]. In the domain of Industrial IoT, the complexity of manufacturing processes increases due to the varied nature of the industries involved. As final products frequently originate from multiple departments that span various sectors, concerns about privacy and security arise during cross-domain interactions. Singh et al. [13] presented a centralized cloud cross-domain data-sharing platform employing multiple security gateways. These gateways use blockchain technology to upload information to a centralized cloud. When applications report malicious activities, the centralized cloud uses blockchain verification to validate the reports and subsequently impose penalties on the responsible parties. However, this approach cannot guarantee the impartiality and security of the centralized cloud. Lu et al. [14] designed a blockchain-empowered secure data-sharing architecture for distributed multiple parties. They formulate the data-sharing problem as a machine learning problem by incorporating privacy-preserving federated learning. Data privacy is maintained by sharing the data model instead of revealing the actual data.
In the field of Medical IoT, Sabu et al. [15] proposed a model that combines blockchain with the InterPlanetary File System (IPFS) to address privacy and security concerns associated with data sharing. This model provides restrictions and protective measures for users' personal data. The data in IPFS is distributed among nodes, and health records stored in IPFS are tamper-resistant. Nevertheless, this mechanism lacks effective protection for the raw data. Thantharate et al. [16] presented ZeroTrustBlock, a comprehensive blockchain-based framework for secure and private health data management that addresses limitations in mainstream health IT systems. The proposed architecture provides a decentralized medical record repository using a permissioned blockchain. Smart contracts enact fine-grained access policies tailored to patient consent. A hybrid on-chain and off-chain storage model balances transparency with confidentiality. Integration gateways enable interoperability with existing systems such as EHRs and insurance platforms.
In the field of smart transportation, Cui et al. [17] proposed a secure and efficient data-sharing scheme among vehicles in the IoV that does not require the assistance of an RSU. They exploit consortium blockchain technology to achieve traceability and immutability of data-sharing records. The scheme prevents unauthorized data sharing and improves the security and privacy of the data-sharing process. To meet the requirements on vehicle speed, latency, and communication overhead in real vehicular networks, Du et al. [18] modified the original PBFT consensus structure and designed an extensible double-layer PBFT consensus algorithm. In addition, they proposed a multi-weight subjective logic model, CRMWSL, to calculate RSU node reputation values accurately. Suitable nodes are elected into the committee to participate in consensus according to their reputation values, which further reduces communication overhead and improves blockchain scalability.
Miao et al. [2] advocated data sharing through model sharing and introduced a secure mechanism called BP2P-FL, which uses peer-to-peer federated learning. By introducing blockchain into data sharing and recording every training process, the mechanism encourages data providers to supply high-quality data. To protect privacy, BP2P-FL uses differential privacy techniques to perturb the global data-sharing model, but this mechanism cannot guarantee privacy during the federated learning process. Fotiou et al. [10] proposed a data transaction model employing Local Differential Privacy (LDP) to safeguard data provider privacy and devised a blockchain-based solution to ensure fair exchange and immutable data logs. However, traditional LDP mechanisms do not fit well with blockchain because they require a fixed input range, a large data volume, and the same privacy budget for all users, which are difficult to satisfy in a decentralized environment. To address this, Zhang et al. [19] presented a novel local differential privacy mechanism that partitions and perturbs data and does not require vast data volumes or fixed input ranges. It uses an iterative approach to adaptively allocate the privacy budget across perturbation procedures, minimizing the total deviation of the perturbed data and increasing data availability.

Fair Payment
In recent years, the big data transaction market has garnered significant attention from researchers. In data transactions, fair payment refers to the timely receipt of the agreed-upon dataset or compensation when both parties comply with the transaction agreement in good faith [20]. Zhou et al. [21] proposed a distributed data vending framework based on blockchain by combining data embedding and similarity learning. They achieved a trade-off between data retrieval and leakage risk by indexing the data. Niu et al. [22] proposed TPDM, which integrates trust and privacy preservation in data markets by using homomorphic encryption and identity-based signatures. However, these mechanisms do not adequately address the fair payment issue between trading parties. Djuric et al. [23] proposed the Fair Exchange Internet Payment Protocol (FEIPS) for the payment of physical goods. Although FEIPS places a strong emphasis on fair exchange, it still guarantees strong security properties, including confidentiality, data integrity, authentication, and non-repudiation. Goldfeder et al. [24] studied the fair payment problem when purchasing physical goods with cryptocurrencies and proposed a series of protocols. These protocols offer security and privacy and are compatible with blockchain-based cryptocurrencies such as Bitcoin. Chen et al. [25] proposed a fair exchange protocol for autonomous data sharing and described a concrete implementation framework based on BTC, designed using BVM smart contract scripts. Nevertheless, these protocols all rely on trusted third parties. Kurtulmus et al. [26] established a protocol using blockchain technology in which participants do not need mutual trust. Users can employ their datasets to train machine learning models and obtain rewards. However, this protocol lacks effective protection for local data.
Wang et al. [27] proposed an auditable fair payment and physical asset delivery protocol based on smart contracts. To counter the switching of goods, a "pre-verification" step is added. In addition, the scheme designs a complete return process for the first time, providing a better service experience and higher efficiency for consumers. Zhao et al. [28] proposed a blockchain-based fair data trading protocol for the big data market to enhance the privacy, availability, and fairness of data trading. The blockchain infrastructure removes the single point of failure of the big data market. They enhance the anonymity of data providers and extend DAPS to data trading for fairness. At the same time, they use similarity learning to enhance the availability of trading data.

Blockchain
Blockchain is a peer-to-peer network composed of multiple participating nodes. It can be regarded as a distributed ledger characterized by decentralization, tamper resistance, unforgeability, and traceability. Transaction information is recorded in block structures that include timestamps, and each block contains a pointer to its predecessor. The blockchain is maintained collaboratively by all participating nodes, with its consistency ensured via consensus algorithms. Depending on the access rules, blockchains can be categorized into public blockchains and consortium blockchains. In public blockchains, the number of participating nodes is not fixed, and nodes are free to join or exit at will. In consortium blockchains, only users who have undergone identity verification and received authorization are permitted to join.
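The block structure described above (a timestamp plus a pointer to the predecessor) can be illustrated with a minimal hash-chain sketch. The names `Block`, `append_block`, and `is_valid` are hypothetical and chosen only for this illustration; consensus, signatures, and networking are deliberately omitted, so this is not the ledger used by the model.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass
class Block:
    index: int
    timestamp: float
    transactions: list
    prev_hash: str  # pointer to the predecessor block

    def hash(self) -> str:
        # Hash a canonical JSON encoding of the block contents.
        payload = json.dumps(
            {
                "index": self.index,
                "timestamp": self.timestamp,
                "transactions": self.transactions,
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: list, transactions: list) -> Block:
    # The first block points at an all-zero placeholder hash.
    prev_hash = chain[-1].hash() if chain else "0" * 64
    block = Block(len(chain), time.time(), transactions, prev_hash)
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    # Every block must reference the current hash of its predecessor.
    return all(
        chain[i].prev_hash == chain[i - 1].hash() for i in range(1, len(chain))
    )
```

Changing the contents of any earlier block changes its hash, so the successor's stored pointer no longer matches and `is_valid` returns `False`. This is the property the text calls tamper resistance.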

Local Differential Privacy
Traditional data privacy protection techniques, such as k-anonymity [29] and generalization [30], lack universal applicability because they have no strong mathematical foundation for defining data privacy and data loss. Moreover, they are ineffective against attackers armed with background knowledge. The emergence of differential privacy [31][32][33] has effectively addressed this issue. Differential privacy is a robust privacy protection technique based on mathematical theory. It is independent of an attacker's background knowledge: even if the attacker possesses information on all records except one, the privacy of that single record remains uncompromised. Local differential privacy is a distributed variant of differential privacy that allows each user to locally perturb the raw data to protect privacy before uploading it. It is defined as follows [34]:

Definition 1 (ε-Local Differential Privacy):
If a randomized algorithm M, for any two distinct tuples $v_i$ and $v_i'$ in dataset D and any possible output $y \in Y$ (Y being the output domain of M), satisfies

$$\Pr[M(v_i) = y] \le e^{\varepsilon} \cdot \Pr[M(v_i') = y],$$

then the randomized algorithm M is said to satisfy ε-local differential privacy. Here, Pr[·] denotes the probability of the output result, and ε is referred to as the privacy budget, which represents the level of data privacy protection. The smaller its value, the closer the output probabilities for the two inputs, and the higher the level of data privacy protection.
A commonly employed technique to realize local differential privacy is the randomized response mechanism [35]. The principle behind this mechanism is that when users respond to sensitive Boolean questions, they answer truthfully with probability P and give the opposite answer with probability 1 − P. Local differential privacy is built on a rigorous mathematical foundation that ensures data privacy protection even when the attacker has maximum background knowledge.
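As an illustration of the randomized response mechanism, the following sketch perturbs Boolean answers and then debiases the aggregate. It assumes the standard choice P = e^ε / (1 + e^ε), for which the ratio P / (1 − P) equals e^ε, so the mechanism meets the ε-local differential privacy bound. The function names are hypothetical, and this is plain randomized response, not the MDLDP algorithm proposed in this paper.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    # Assumed standard setting: P = e^eps / (1 + e^eps), so P / (1 - P) = e^eps.
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    # Report the true bit with probability P, the flipped bit otherwise.
    p = truth_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_proportion(reports: list, epsilon: float) -> float:
    # Unbiased estimate of the true fraction of 1s from the noisy reports:
    # E[observed] = pi * P + (1 - pi) * (1 - P), solved for pi.
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

A smaller ε pushes P toward 1/2, which hides each individual answer better but increases the variance of the estimate. This is the privacy and availability trade-off discussed in the text.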

Verifiable Encrypted Signature
The basic principle of verifiable encrypted signatures is as follows [36]: The data consumer, acting as the signer, encrypts its own digital signature using the arbitrator's public key and proves that the ciphertext truly contains a valid signature. Anyone can check the validity of the encrypted signature using the arbitrator's public key. In the case of a dispute, the arbitrator can use its private key to decrypt the encrypted signature, which prevents traders from maliciously withholding digital signatures. Verifiable encrypted signatures effectively ensure fairness in online transactions and protect both parties from potential losses. The verifiable encrypted signature protocol comprises the following eight algorithms:
(1) $\mathrm{Setup}(1^{\lambda}) \to pp$: Given the security parameter λ, generates the public parameters pp.
(2) $\mathrm{AdjKeyGen}(pp) \to (APK, ASK)$: Given the public parameters pp, generates the arbitrator's key pair (APK, ASK).
(3) $\mathrm{KeyGen}(pp, APK) \to (pk_i, sk_i)$: Given the public parameters pp and the arbitrator's public key APK, generates the key pair $(pk_i, sk_i)$ of signer i.
(4) $\mathrm{Sign}(pp, sk_i, m) \to \sigma_i$: Given the public parameters pp, the private key $sk_i$, and the message m, generates the digital signature $\sigma_i$ of signer i.
(5) $\mathrm{Verify}(pp, pk_i, m, \sigma_i) \to 0/1$: Given the public parameters pp, the public key $pk_i$, and the message m, verifies the validity of the digital signature $\sigma_i$. An output of 1 means that the signature is valid, and an output of 0 means that it is invalid.
(6) $\mathrm{VESSign}(sk_i, m, APK) \to \sigma_i^{VES}$: Generates a verifiable encrypted signature $\sigma_i^{VES}$ from the private key $sk_i$, the message m, and the arbitrator's public key APK.
(7) $\mathrm{VESVerify}(pk_i, m, APK, \sigma_i^{VES}) \to 0/1$: Given the public key $pk_i$, the message m, and the arbitrator's public key APK, verifies the validity of the verifiable encrypted signature $\sigma_i^{VES}$. An output of 1 means that the encrypted signature is valid, and an output of 0 means that it is invalid.
(8) $\mathrm{Adj}(pk_i, APK, ASK, m, \sigma_i^{VES}) \to \sigma_i$: Given the public key $pk_i$, the arbitrator's key pair (APK, ASK), the message m, and a verifiable encrypted signature $\sigma_i^{VES}$, recovers the digital signature $\sigma_i$.

Threshold Signature
In 1987, Yvo Desmedt first introduced the concept of threshold signatures [37]. The threshold signature mechanism allows any t signatories out of r to sign a message. However, if the number of signatories is less than t, a valid signature cannot be generated. The scheme is described as follows [38]: Let $Z_q$ be a finite field, where q is a large prime number, and let $c_i$ (i = 1, 2, ..., r) denote the r participants. A polynomial of degree t − 1 is randomly selected as

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1} \bmod q,$$

where $a_0$ is the secret to be shared and $a_i \in Z_q$ (for i = 1, 2, ..., t − 1). Compute $s_i = f(x_i)$ for i = 1, 2, ..., r, where the $x_i$ are distinct nonzero public values, and send $s_i$ as a secret share to participant $c_i$. Using any t shares (sub-keys), the secret can be reconstructed by Lagrange interpolation such that

$$a_0 = f(0) = \sum_{i=1}^{t} s_i \prod_{j=1,\, j \ne i}^{t} \frac{x_j}{x_j - x_i} \bmod q.$$
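Assuming the scheme follows the standard Shamir-style construction, reconstruction amounts to Lagrange interpolation of the degree t − 1 polynomial at x = 0. The sketch below shows share generation and reconstruction under that assumption. The prime `Q`, the function names, and the evaluation points x = 1, ..., r are illustrative choices rather than parameters from the paper. A seeded `random.Random` keeps the example reproducible; a real deployment would need a cryptographically secure source such as the `secrets` module.

```python
import random

# Illustrative prime modulus (the Mersenne prime 2^127 - 1); not from the paper.
Q = 2**127 - 1


def make_shares(secret: int, t: int, r: int, rng: random.Random) -> list:
    # f(x) = secret + a_1 x + ... + a_{t-1} x^{t-1} mod Q with random
    # nonzero coefficients; participant i receives the share (i, f(i)).
    coeffs = [secret % Q] + [rng.randrange(1, Q) for _ in range(t - 1)]
    shares = []
    for x in range(1, r + 1):
        y = 0
        for a in reversed(coeffs):  # Horner evaluation of f(x)
            y = (y * x + a) % Q
        shares.append((x, y))
    return shares


def reconstruct(shares: list) -> int:
    # Lagrange interpolation at x = 0:
    # f(0) = sum_i y_i * prod_{j != i} x_j / (x_j - x_i) mod Q.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * xj) % Q
                den = (den * (xj - xi)) % Q
        # pow(den, -1, Q) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, Q)) % Q
    return secret
```

For example, with t = 3 and r = 5, any three of the five shares passed to `reconstruct` return the original secret, while two shares interpolate a different value.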

System Model
This section introduces our proposed blockchain-based secure, privacy-preserving, and fair data transaction model.

System Overview
In the model, data is perturbed using local differential privacy before the transaction to protect data privacy, and verifiable encrypted signatures are used to ensure the fairness of the transaction. An arbitration committee is established to replace the traditional single arbitrator, with the intention of preventing potential collusion between dishonest parties and the arbitrator during the transaction. The system assumes that IoT devices are programmable and can implement local differential privacy. The model treats IoT devices as nodes in the blockchain, which effectively prevents single-point failures. It also securely records information about data perturbations and transactions in an immutable manner to ensure the traceability of transactions.
The system overview is illustrated in Figure 1. As depicted in Figure 1, the system primarily consists of four components: data consumers, data providers, the arbitration committee, and the blockchain. Specifically, data consumers issue data transaction requests. Data nodes that meet the requirements and are interested can apply to become data providers for a given transaction. After the application is approved, each data provider weighs its privacy and reward needs to determine its privacy budget and perturbs its local data in preparation for the transaction. The transaction agreement is signed using verifiable encrypted signatures, with the arbitration public key synthesized by a random subset of t members from the arbitration committee. Ultimately, the blockchain records transactional information, including data perturbations, to facilitate traceability. Further details on each component of the model are explained below.
Appl. Sci. 2023, 13, x FOR PEER REVIEW
(1) Data Consumers: Based on their specific needs, data consumers issue a data trading request. They may need to guarantee the size of the transaction data for statistical and other business purposes. Meanwhile, the smaller the privacy budget of differential privacy, the lower the availability of the transaction data. Therefore, the minimum data volume and minimum privacy budget are specified in the data transaction request. Subsequently, they make a payment that serves as a deposit to prevent malicious behavior on their part. Data nodes that meet the criteria of the consumer's request and are interested in it can apply to participate in the transaction. Ultimately, a transaction agreement is signed using a verifiable encrypted signature, after which the consumers obtain the noise-perturbed dataset and pay the compensation.
(2) Data Providers: These are the nodes that meet the demands of the data consumers and successfully take part in the data transaction. Providers aim to earn compensation from the current transaction. However, they might be unable to meet the data volume requirements of the data consumer individually because of limited storage or data resources. When data providers apply to join a transaction, they must declare their own data size. If the cumulative data size of all providers does not meet the consumer's needs, the transaction is canceled. Similarly, to prevent malicious behavior, data providers must also pay a deposit when applying to participate in the transaction.
(3) Arbitration Committee: This committee is selected through mutual consultation between both trading parties and is valid only for the current transaction. Committee members are also required to pay a deposit to prevent malicious behavior, such as delaying or refusing arbitration. To motivate committee members, they receive rewards after completing the arbitration on time, paid by the transaction parties that required arbitration. The number of members, r, in the committee adheres to the rule t < r ≤ 2t − 1. After the committee's formation, each member receives a secret share $s_i$ of the arbitration private key, distributed by the key management entity. If any abnormality occurs during the transaction process, such as one party behaving dishonestly and refusing to use its private key to decrypt a verifiable encrypted signature, any t members of the committee can jointly rebuild the arbitration private key, decrypt the encrypted signature, and extract the digital signature. As long as any t members within the committee are honest and secure, the fairness of the arbitration committee and the security of the private key are ensured.
(4) Blockchain: Consortium blockchains, as a specific type of blockchain, possess access controls, support the review of participating users, and maintain a relatively stable set of participating nodes. The system employs a consortium blockchain as the distributed environment required for data trading between data consumers and providers. Throughout the data trading process, all participating entities collaboratively maintain the blockchain ledger. Failures or departures of individual nodes do not disrupt the data transaction process, which enhances the robustness of the data trading market. Furthermore, owing to the blockchain's inherent transparency, immutability, and traceability, transaction details and the reputation evaluations of participants are logged on the blockchain. Nodes that behave maliciously are penalized, enhancing the model's security and fairness.
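One way to read the committee size rule is through the following arithmetic consequence. This is our interpretation, stated as an assumption, and is not a derivation given in the source.

```latex
t < r \le 2t - 1
\;\Longrightarrow\;
1 \le r - t \le t - 1 < t
```

In words: if t of the r members are honest, the remaining r − t members number at most t − 1. That is below the threshold t, so they cannot rebuild the arbitration private key on their own. For instance, t = 3 admits committees of size r = 4 or r = 5.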

Threats and Security Goals
The system assumes that the data provided by the data providers is genuine. These nodes have undergone rigorous review before joining and aim to achieve a successful transaction with a high-quality noisy dataset. However, they could face challenges related to privacy, security, or fair reward. The following discusses the threats faced by the system and the security goals achieved in response to these threats.
Threat 1: Privacy leakage of data providers. The information offered by data providers could have severe adverse consequences if leaked, such as the exposure of sensitive details (names, addresses, phone numbers, social accounts, etc.). Malicious entities may use this information to threaten users' personal safety and assets. In addition to personal information, other data provided can be analyzed to reveal preferences, interests, behaviors, etc. This may have adverse effects, such as dynamic pricing based on big data.
Security Goal 1: Privacy protection of local data. Throughout the data transaction process, the privacy of a data provider's sensitive data is not compromised unless the provider voluntarily discloses its local data. No data provider can directly or indirectly access the real data of other data providers during the cooperative transaction process. Data providers use local differential privacy to perturb sensitive data before uploading it, rather than sending plaintext, which ensures that no one can infer the provider's local data from the perturbed dataset.
Threat 2: Erroneous behavior of participants. This mainly includes three types of incorrect behavior: (1) data providers deceitfully report a larger data size to obtain more compensation while providing a smaller actual size; (2) after receiving payment, the data provider intentionally fails to send the perturbed dataset, which results in direct losses for the data consumer; (3) after obtaining the perturbed dataset, data consumers intentionally withhold payment, which results in direct losses for the data provider.
Security Goal 2: Fair protection of reward for data providers. The system weighs a data provider's reward based on three key aspects: (1) Local data size, which remains unchanged after perturbation. (2) Privacy budget: a smaller privacy budget means higher privacy protection and lower data availability, and therefore less reward, and vice versa. (3) Credibility value: providers can raise their credibility score only by diligently participating in data perturbation and trading more data, while those who misbehave see it reduced. High credibility ensures more rewards. The system also implements punitive measures: providers who misbehave are fined, and honest providers are compensated.
Security Goal 3: Secure transactions. The transaction process uses verifiable encrypted signatures. Both parties first agree on the transaction terms. The data consumer then signs the agreement using a verifiable encrypted signature, whose validity anyone can verify. Once the signature's validity is confirmed, the data provider furnishes the perturbed dataset. After receiving this dataset, the data consumer uses their private key to extract their signature, completing the transaction. If the data consumer cheats and refuses to use their key, the data provider can appeal to the arbitration committee to decrypt the signature and penalize the deceitful party.

Workflow
Table 1 enumerates and explains the relevant symbols. Based on Processes 1 to 7 in Figure 1, the workflow of the system comprises four stages: initialization, local data perturbation, data transaction, and arbitration.
Initialization stage: Data consumers issue a data transaction request $T(N, \varepsilon)$ to the blockchain and deposit a fee as collateral (as in Process 1). Data nodes can apply based on their computational capacity and data resources. Nodes approved for the task become data providers for this transaction and also pay a fee as a deposit. During the application process, data node $i$ states its local data size $d_i$. If the final total data size $\sum_{i=1}^{n} d_i$ is less than $N$, the task is aborted. The specific approval method, for example offline negotiation, is beyond the primary focus of this paper. Nodes not involved in the transaction can propose to form an arbitration committee with the data consumers and providers. Members of the arbitration committee must be approved by both transaction parties.
Local data perturbation stage: First, the data provider $P_i$ specifies the dataset $D_i$ to be used in the transaction and the privacy dataset $L_i \subseteq D_i$ that it most wants to protect. Then, weighing privacy protection against the desired transaction reward, each provider determines its privacy budget $\varepsilon_i$, which cannot be less than the minimum privacy budget $\varepsilon$ required by the data consumer $P_c$. By the nature of differential privacy, a larger privacy budget yields higher data availability and a greater reward but weaker privacy protection; conversely, a smaller privacy budget yields better privacy protection but lower data availability and a smaller reward. Next, $P_i$ uses the MDLDP algorithm to locally perturb $D_i$ and obtain the perturbed dataset $G_i$ for the transaction (as in Process 2). To achieve better privacy protection for $L_i$, i.e., a greater degree of privacy protection $\alpha_i$, MDLDP performs multiple random perturbations, computes the degree of privacy protection $\alpha_{ij}$ of each perturbation to evaluate its effectiveness, and returns the result with the largest $\alpha_{ij}$ as $G_i$ and $\alpha_i$ (Algorithm 1). Because the privacy budget $\varepsilon_i$ is unchanged, the overall availability of $G_i$ does not decrease. Finally, $P_i$ uploads the perturbed dataset $G_i$, the privacy budget $\varepsilon_i$, and other information to the blockchain for the transaction (as in Process 3).
Data transaction stage: Initially, the data provider $P_s$ aggregates all data from the providers (as in Process 4) and negotiates with the data consumer $P_c$ regarding fees and data sharing, primarily the reward details. Following this, $P_c$ and $P_s$ generate their respective key pairs $(pk_c, sk_c)$ and $(pk_s, sk_s)$ using the public key APK. $P_c$ generates its true signature $\sigma_i$ using $\mathrm{Sign}(pp, sk_c, m)$. $P_c$ then produces a verifiable encrypted signature $\sigma^{VES}_{c}$ using $\mathrm{VESSign}(sk_c, m, APK)$ to sign the transaction agreement. Anyone can check the validity of the signature using the arbitrator's public key APK, but without the help of $P_c$ or the arbitration committee it is impossible to extract the true signature $\sigma_i$. Once the agreement is signed and validated, $P_s$ delivers the perturbed dataset $G = \sum_{i=1}^{n} G_i$ to $P_c$. After confirming the dataset $G$, $P_c$ decrypts the encrypted signature $\sigma^{VES}_{c}$ using its private key $sk_c$ and extracts the true signature $\sigma_i$ to validate the agreement and complete the transaction (as in Process 5).
Arbitration stage: If $P_c$ acts deceitfully during the transaction and refuses to decrypt the encrypted signature $\sigma^{VES}_{c}$ to extract the true signature $\sigma_i$ and thereby complete the agreement (i.e., pay the compensation), the data provider $P_s$ can seek assistance from the arbitration committee. Any $t$ members of the arbitration committee can use Equation (3) to reconstruct the arbitration private key ASK (as in Process 6). By the properties of verifiable encrypted signatures, ASK can directly extract the true signature $\sigma_i$ of the data consumer $P_c$ from the encrypted signature $\sigma^{VES}_{c}$ (as in Process 7), forcing $P_c$ to execute the agreement and pay the compensation. Dishonest trading behavior results in confiscation of the deposit as compensation for honest traders and arbitrators.

Verifiable Encrypted Signature
To facilitate fair transactions and safeguard the legitimate rights of both transaction parties, this paper introduces a verifiable encrypted signature protocol. Traditional verifiable encrypted signature protocols assume a single, neutral arbitrator [39], but such an arbitrator may become a single point of failure or may cheat. To address this issue, we merge the verifiable encrypted signature with the threshold signature, replacing the traditional arbitrator with a committee. Every committee member holds a secret share $s_i$, and any $t$ members can reconstruct the arbitration private key. The verifiable encrypted signature protocol is detailed below.
Setup: Taking as input the security parameter $\lambda$, choose two large primes of length $\lambda/2$ and compute their product $N$. Choose a random element $e \in \mathbb{Z}^{*}_{\Phi(N)}$ and compute the element $s$ such that $se \equiv 1 \pmod{\Phi(N)}$. The system public key is $(N, e)$, and the corresponding private key is $(N, s)$. According to Equation (2), $s$ is divided into $r$ parts, held by the $r$ members of the arbitration committee. When arbitration is needed, any $t$ members can use Equation (3) to reconstruct $s$ and decrypt the corresponding verifiable encrypted signature.
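The key-splitting step in Setup can be illustrated with a minimal sketch. It assumes that Equations (2) and (3) follow Shamir's polynomial secret sharing with Lagrange interpolation, which the text shown here does not confirm. The field modulus `PRIME` and the function names `split_secret` and `reconstruct_secret` are hypothetical, and a real deployment would need a sharing scheme suited to arithmetic modulo $\Phi(N)$ rather than a public prime field.

```python
import secrets

# Assumption: a public prime field larger than any secret being shared.
PRIME = 2**127 - 1  # a Mersenne prime, chosen only for illustration


def split_secret(secret, t, r, prime=PRIME):
    """Split `secret` into r shares so that any t of them recover it."""
    if not (0 < t <= r):
        raise ValueError("need 0 < t <= r")
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret % prime] + [secrets.randbelow(prime) for _ in range(t - 1)]
    shares = []
    for x in range(1, r + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct_secret(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

With `t = 3` and `r = 5`, any three of the five shares passed to `reconstruct_secret` return the original value, while fewer than three shares do not determine it.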
KeyGen: The data provider $P_s$ randomly selects $g_1 \in \mathbb{Z}^{*}_{\Phi(N)}$ and $x_s \in \{0, 1\}^{\lambda}$, and calculates $g = g_1^{2} \bmod N$ and $y_s = g^{x_s} \bmod N$. The data consumer $P_c$ selects an element $x_c \in \{0, 1\}^{\lambda}$ and computes $y_c = g^{x_c} \bmod N$. The public key of $P_s$ is $y_s$, with private key $sk_s = x_s$. The public key of $P_c$ is $y_c$, with private key $sk_c = x_c$.
Sign: Taking as input the transaction agreement $Tx$, $P_c$ computes a signature $\sigma_c$.
Verify: Taking as input the transaction agreement $Tx$, the signature $\sigma_c$, and the system public key $(N, e)$, verify whether $\sigma_c^{e} = H(Tx)^{s} \bmod N$. If the equation holds, output 1, indicating that the signature $\sigma_c$ is valid; otherwise output 0, indicating that the signature is invalid.
PreSign: $P_c$ selects an element $r \in \{0, 1\}^{\lambda}$ and calculates $y_r = g^{r} \bmod N$, the secret factor $u = y_s^{x_c} \bmod N$, and $m = u^{e} H(Tx) \bmod N$. $P_s$ calculates the secret factor $u = y_c^{x_s} \bmod N$ and verifies whether $m = u^{e} H(Tx) \bmod N$.
VESSign: $P_c$ selects an element $t \in \{0, 1\}^{\lambda}$ and calculates $\sigma = m^{2s} y_c^{r} \bmod N$, $y_e = y_s^{e} \bmod N$, $y_{er} = y_s^{r} \bmod N$, $y_t = g^{t} \bmod N$, $y_{et} = y_e^{t} \bmod N$, and $c = H(m, y_{er}, y_r, y_e, g, y_{et}, y_t)$. $P_c$ then computes $z = t - rc$ and outputs the verifiable encrypted signature $\sigma^{VES}_{c} = (\sigma, c, z)$.
VESVer: Given the system public key $(N, e)$, the public key $pk_c$ of $P_c$, the public key $pk_s$ of $P_s$, the parameters $g$, $m$, $y_r$, and the verifiable encrypted signature $\sigma^{VES}_{c}$, compute $w_{er} = \sigma^{e} m^{-2} \bmod N$, $y_e = y_s^{e} \bmod N$, $w_{et} = y_e^{z} w_{er}^{c} \bmod N$, $w_t = g^{z} y_r^{c} \bmod N$, and $c' = H(m, w_{er}, y_r, y_e, g, w_{et}, w_t)$. If $c' = c$, output 1, indicating that the verifiable encrypted signature $\sigma^{VES}_{c}$ is valid. Otherwise, output 0, indicating that the signature is invalid.
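The challenge-response part of VESSign and VESVer can be sketched as follows. This is only an illustration of the $z = t - rc$ check, not the full protocol: it assumes commitments of the form $y_r = g^{r}$ and $y_{er} = y_e^{r}$, uses SHA-256 as a stand-in for $H$, and works with a toy modulus. The names `challenge`, `prove`, and `verify` are hypothetical.

```python
import hashlib
import secrets

# Toy parameters, far too small for real use (assumption for illustration).
P, Q = 1019, 1031          # small primes; the protocol uses primes of length lambda/2
N = P * Q
g = pow(2, 2, N)           # g = g1^2 mod N with g1 = 2


def challenge(*values):
    """Hash a tuple of integers into a challenge c (SHA-256 stands in for H)."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def prove(m, r, y_e, bits=64):
    """Prover: show that y_r = g^r and y_er = y_e^r use the same exponent r."""
    t = secrets.randbits(bits)
    y_r, y_er = pow(g, r, N), pow(y_e, r, N)
    y_t, y_et = pow(g, t, N), pow(y_e, t, N)
    c = challenge(m, y_er, y_r, y_e, g, y_et, y_t)
    z = t - r * c          # computed over the integers; may be negative
    return y_r, y_er, c, z


def verify(m, y_r, y_er, y_e, c, z):
    """Verifier: rebuild both commitments and recompute the challenge."""
    # Negative exponents with a modulus need Python 3.8+ and invertible bases.
    w_t = (pow(g, z, N) * pow(y_r, c, N)) % N       # equals g^t for an honest prover
    w_et = (pow(y_e, z, N) * pow(y_er, c, N)) % N   # equals y_e^t for an honest prover
    return challenge(m, y_er, y_r, y_e, g, w_et, w_t) == c
```

The check succeeds because $g^{z} y_r^{c} = g^{t - rc} g^{rc} = g^{t}$, and likewise for the base $y_e$, so the verifier reproduces exactly the values that were hashed by the prover.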

Privacy Protection
We use a lemma to prove that the model can protect the privacy of data providers P i .
Lemma 1: If the data provider $P_i$ does not leak data locally, the proposed model ensures the privacy and security of the data provider.
Proof. Before the transaction, the data provider $P_i$ determines the dataset $D_i$ to be used and sets the privacy budget $\varepsilon_i$ according to its privacy protection and compensation needs. The MDLDP algorithm is then used to perturb $D_i$ and obtain $G_i$. Finally, the privacy budget $\varepsilon_i$ and the perturbed dataset $G_i$ are uploaded to the blockchain for the transaction. The perturbation is carried out entirely locally and only perturbed data leave the provider, so the model ensures the privacy and security of the data provider $P_i$.

Fairness Assurance
The system's transaction mechanism and financial incentive structure guarantee fairness for data providers. This mechanism deters malicious behavior by data providers and encourages them to actively provide perturbed datasets. Data providers must pay a fee as a deposit when registering for a transaction. The size of the provided data and the privacy budget serve as the bases for the reward, and both correlate positively with it. In addition to transaction rewards, data providers can earn bonuses tied to their credibility score. Only by actively and honestly providing data can they enhance their credibility, and high-credibility data providers receive more rewards. The system also enforces penalties: dishonest data providers are fined, with the fine deducted from their initial deposit and used to compensate honest data providers.

Transactional Security
Lemma 2: If any t members of the arbitration committee are honest, then the verifiable encrypted signature protocol proposed in this paper achieves fair payment.
Proof. There are only four situations in the transaction process: (1) both parties are honest; (2) the data provider $P_s$ is honest, while the data consumer $P_c$ is dishonest; (3) the data consumer $P_c$ is honest, while the data provider $P_s$ is dishonest; (4) both parties are dishonest.
Situation (1). The data provider $P_s$ honestly provides the data $G = \sum_{i=1}^{n} G_i$, and the data consumer $P_c$ honestly pays the compensation, which is fair to both parties.
Situation (2). The data provider $P_s$ and the data consumer $P_c$ have negotiated an agreement regarding payment of compensation. $P_s$ provides the data $G$ after verifying the verifiable encrypted signature $\sigma^{VES}_{c}$ of $P_c$, but $P_c$ refuses to decrypt $\sigma^{VES}_{c}$ after obtaining the dataset $G$. Any $t$ members of the arbitration committee then collaborate to generate the arbitration private key ASK, which can directly decrypt $\sigma^{VES}_{c}$ and make the payment agreement effective. Enforcement of the agreement is then resolved by law, which ensures fairness.
Situation (3). The data provider $P_s$ and the data consumer $P_c$ negotiate a payment agreement, and $P_c$ produces a verifiable encrypted signature $\sigma^{VES}_{c}$ on it, but $P_s$ refuses to provide the dataset $G$. The probability that $P_s$ cracks the encrypted signature without a private key to defraud a reward is negligible. Meanwhile, because the committee size satisfies $t < r \le 2t - 1$ and $t$ members are honest, at most $r - t \le t - 1$ members can be dishonest, which is fewer than the $t$ members needed to reconstruct ASK. Dishonest members therefore cannot collude with $P_s$ to extract the signature, and a dishonest $P_s$ cannot receive any reward, which ensures fairness.
Situation (4). The data provider $P_s$ does not provide the data honestly, and the data consumer $P_c$ does not make the payment honestly. Neither party suffers a transaction loss, but both forfeit their deposits because of their malicious behavior, which is fair to both parties.
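The committee-size argument used in Situation (3) can be checked numerically. The sketch below assumes exactly the condition stated in the proof, $t < r \le 2t - 1$ with $t$ honest members; the function names `max_dishonest` and `collusion_possible` are hypothetical.

```python
def max_dishonest(t, r):
    """Largest number of dishonest members when t of the r members are honest."""
    if not (t < r <= 2 * t - 1):
        raise ValueError("committee size must satisfy t < r <= 2t - 1")
    return r - t


def collusion_possible(t, r):
    """True if the dishonest members alone could reach the threshold t."""
    return max_dishonest(t, r) >= t
```

For example, with `t = 3` and `r = 5` at most two members can be dishonest, which is below the threshold of three needed to reconstruct the arbitration private key.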

Experiments
This section describes the experimental environment for the algorithm MDLDP and evaluates it from the perspectives of privacy protection, data availability, and time cost.

Experimental Setup
To assess the MDLDP algorithm, experiments were conducted on the real-world dataset "Air Quality" [40]. The dataset consists of 9358 instances recorded by an array of five metal oxide chemical sensors embedded in an air quality chemical multisensor device. The device was located at ground level in a heavily polluted area of Italy, and the recordings span March 2004 to February 2005. The data, provided by certified reference analyzers from the same region, represent hourly average concentrations of ground-level pollutants such as carbon monoxide, non-methane hydrocarbons, benzene, total nitrogen oxides, and nitrogen dioxide. For comparison, 500 records were randomly selected from the non-methane hydrocarbon readings, excluding missing values. All simulation programs were written in the MATLAB language on the MATLAB R2018b platform. The hardware environment was an Intel(R) Core(TM) i3-7100U CPU @ 2.40 GHz with 8 GB RAM, running the Windows 10 operating system. The equipment was sourced from Qingdao, China, and manufactured by Lenovo.
This article uses the privacy protection degree $\alpha$ to evaluate the privacy protection effect of the MDLDP algorithm. $G_j$ denotes the perturbed dataset obtained by the data provider after the $j$-th perturbation of the original dataset $D$. $L$ denotes the privacy dataset of the data provider, which is a part of the original dataset $D$. $\Delta(L, G_j)$ denotes the number of perturbed data items in the privacy dataset $L$ after the $j$-th perturbation, and $l$ denotes the number of data items in $L$. The ratio $\Delta(L, G_j)/l$ is the proportion of perturbed data in the privacy dataset $L$, which is the degree of protection for the privacy dataset. According to the MDLDP algorithm, the data provider performs $h$ random perturbations on the dataset and selects the result with the maximum degree of protection. In addition, because differential privacy is randomized, $w$ privacy protection degrees are generated in the experiment, and their average is taken as the final result.
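The evaluation procedure just described can be sketched in a few lines. The sketch assumes that a private item counts as perturbed when its value differs from the original and that the private dataset is identified by index positions; the function names are hypothetical and this is not the authors' MATLAB code.

```python
def protection_degree(original, perturbed, private_idx):
    """Fraction of private entries whose value changed after perturbation."""
    if not private_idx:
        raise ValueError("private index set must be non-empty")
    changed = sum(1 for k in private_idx if original[k] != perturbed[k])
    return changed / len(private_idx)


def best_of_h(original, candidates, private_idx):
    """Pick the candidate G_j with the largest protection degree."""
    scored = [(protection_degree(original, g, private_idx), g) for g in candidates]
    alpha, best = max(scored, key=lambda pair: pair[0])
    return best, alpha


def average_degree(alphas):
    """Average the w protection degrees obtained from repeated runs."""
    return sum(alphas) / len(alphas)
```

Here `best_of_h` corresponds to selecting the best of the $h$ perturbations, and `average_degree` to averaging the $w$ repeated measurements.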
In this paper, we use the commonly used mean square error (MSE) to evaluate the data availability of the algorithm, as in [41]. In its standard form, $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (a_i - a'_i)^2$, where $(a_1, a_2, \cdots, a_n)$ denotes the real data in the original dataset $D$ and $(a'_1, a'_2, \cdots, a'_n)$ denotes the data in the perturbed dataset $G$.
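As a small illustration of the availability metric, the following sketch computes the mean square error between an original and a perturbed sequence, assuming the standard definition (the average of squared element-wise differences). The function name `mean_squared_error` is chosen here for illustration.

```python
def mean_squared_error(real, perturbed):
    """Average of squared differences between real and perturbed values."""
    if len(real) != len(perturbed) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(real, perturbed)) / len(real)
```

A lower value indicates that the perturbed dataset stays closer to the original, i.e., higher data availability.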

Experimental Analysis
Most work that uses local differential privacy to protect user privacy relies on other mechanisms, such as the Laplace mechanism, while random perturbation mechanisms are comparatively rare. Therefore, this article chose the traditional LDP algorithm [42] and DDLDP [43] for comparison, both of which use random perturbation mechanisms. To ensure a fair comparison, five data groups identical in size to those in [43] were chosen, with sizes of 100, 200, 300, 400, and 500. Three privacy budgets were selected: ε = 0.5, ε = 1.0, and ε = 2.0. Figure 2 shows the privacy protection degree under different data sizes for the three algorithms. To reduce the effect of randomness, we set w = 10 and averaged over the 10 runs. Moreover, the experiment randomly selected 10% of the original dataset D as the private data L.
According to Equations (4) and (5), we obtained the privacy protection degrees of the three algorithms under different privacy budgets and data sizes. As shown in Figure 2, under the three privacy budgets, the privacy protection of the MDLDP method is generally higher than that of the other two methods. This indicates that the MDLDP method protects the privacy dataset L better and more accurately. As the amount of data increases, the degree of privacy protection α decreases slightly and eventually stabilizes.
According to Equation (6), we obtained the mean square error (MSE) of the three algorithms under different privacy budgets and data sizes. Figure 3 shows that, across privacy budgets and data sizes, the traditional LDP method exhibits a slightly larger mean square error, indicating relatively lower data availability. The mean square error of the MDLDP method is close to that of the DDLDP method, suggesting that both offer better data availability. Together with Figure 2, this indicates that the MDLDP method offers superior privacy protection while maintaining good data availability. This is because, under the same privacy budget, the MDLDP method favors protection of the private dataset and relatively reduces the protection of non-private data.
Figure 4 depicts the perturbation time of the three algorithms at different data sizes. As the data size increases, the perturbation time of all three algorithms fluctuates within a limited range, and these fluctuations are related to the randomness of differential privacy. When the data size is constant, the perturbation time of each algorithm under different privacy budgets is roughly the same, which is consistent with the characteristics of differential privacy. As Figure 4 shows, the time consumed by the MDLDP algorithm is approximately on par with that of the DDLDP algorithm, and both are slightly slower than the LDP algorithm. Given the better privacy protection, this limited additional time is acceptable. Because time consumption does not change significantly with the privacy budget or the data volume, the MDLDP method has good practicality.

Conclusions and Future Works
This paper introduces a decentralized, privacy-preserving, and fair data transaction model based on blockchain technology. The model safeguards the privacy of local data with an adaptive local differential privacy algorithm, MDLDP, and ensures fair data transactions using verifiable encrypted signatures. Instead of the single arbitrator found in conventional verifiable encrypted signatures, the proposed model introduces a committee. Through threshold signature technology, the arbitration private key is divided among and managed by the committee members. In the event of a transaction dispute, any t members of the committee can collaborate to reconstruct the arbitration private key for arbitration. Theoretical analysis shows that our method can effectively protect the privacy of data providers, ensure a fair transaction market, and safeguard the arbitration private key against theft or loss due to single-point failures. In addition, the method prevents collusion between dishonest traders and arbitrators and achieves fair payment in transactions. We evaluated the MDLDP algorithm on a real-world dataset. Compared with existing methods, the MDLDP algorithm takes slightly more time than the fastest method, but it offers better privacy protection and protects users' privacy datasets more accurately, without reducing data availability. Because time consumption does not change significantly with the privacy budget or the data volume, the limited additional time of the MDLDP method is acceptable, and the method has good practicality.
However, although we conducted simulation experiments using MATLAB, specific implementation details on real blockchain platforms, such as node registration and approval, consensus algorithms, transactions, and data storage, are still lacking. We plan to further validate the model on specific blockchain platforms such as Hyperledger in the future. The model proposed in this article assumes that honest data providers supply effective, high-quality data. How to verify data quality and ensure data validity is a problem that remains to be solved. In addition, IoT systems may require high availability and low-latency access, which is also a valuable direction for future study.

Figure 2. Privacy-preserving degrees of private dataset with different data sizes.

Figure 3. Mean Square Error with different data sizes.

Figure 4. Time consumption for perturbing different data sizes.

Table 1. Description of symbols.

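Algorithm 1. The MDLDP Algorithm
Input: $D_i$, $\varepsilon_i$, $L_i$
Output: $G_i$, $\alpha_i$
1: Create array indice[$L_i$] for storing the index of dataset $L_i$;
2: Create array indice[$G_{ij}$] for storing the index of the perturbed data in dataset $G_{ij}$;
3: $p_i = \exp\{\varepsilon_i\} / (\exp\{\varepsilon_i\} + d_i - 1)$;
4: for $j = 1$ to $h$ by 1 // $h$ independent perturbations of the original dataset $D_i$
5:   for $z = 1$ to $d_i$ by 1 // randomly perturb each element in dataset $D_i$ with probability $1 - p_i$
6:     if rand() < $p_i$
7:       $g_{ijz} = value_{ijz}$; // $value_{ijz}$ is the true value in $D_i$, $g_{ijz}$ is an element of $G_{ij}$
8:     else
9:       $g_{ijz} = \mathrm{rand}(D_i) \setminus value_{ijz}$;
10:   end
11:   Obtain the perturbed dataset $G_{ij}$;
12:   Compute the degree of privacy protection $\alpha_{ij}$ to evaluate the effectiveness of this perturbation;
13: end
14: Take the result of the perturbation with the largest $\alpha_{ij}$ as $G_i$ and $\alpha_i$;
15: return $G_i$, $\alpha_i$.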
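The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes that the private dataset is given as index positions, that a perturbed element is replaced by a different value drawn uniformly from the same dataset, and that $d_i$ is the dataset length. The function name `mdldp` and its parameters `h` and `rng` are hypothetical.

```python
import math
import random


def mdldp(data, epsilon, private_idx, h=10, rng=None):
    """Perturb `data` h times and keep the run that best hides `private_idx`."""
    if not private_idx:
        raise ValueError("private index set must be non-empty")
    rng = rng or random.Random()
    d = len(data)
    # Probability of keeping the true value (line 3 of Algorithm 1).
    p = math.exp(epsilon) / (math.exp(epsilon) + d - 1)
    best, best_alpha = None, -1.0
    for _ in range(h):
        perturbed = []
        for value in data:
            if rng.random() < p:
                perturbed.append(value)  # report the true value
            else:
                # Assumption: replace with a different value drawn from the data.
                others = [v for v in data if v != value]
                perturbed.append(rng.choice(others) if others else value)
        # Degree of protection: share of private entries that were changed.
        changed = sum(1 for k in private_idx if perturbed[k] != data[k])
        alpha = changed / len(private_idx)
        if alpha > best_alpha:
            best, best_alpha = perturbed, alpha
    return best, best_alpha
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing several privacy budgets on the same data.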