Decentralized Policy-Hidden Fine-Grained Redaction in Blockchain-Based IoT Systems

Currently, decentralized redactable blockchains have been widely applied in IoT systems for secure and controllable data management. Unfortunately, existing works ignore policy privacy (i.e., the content of users’ redaction policies), causing severe privacy leakage threats to users since users’ policies usually contain large amounts of private information (e.g., health conditions and geographical locations) and limiting the applications in IoT systems. To bridge this research gap, we propose PFRB, a policy-hidden fine-grained redactable blockchain in decentralized blockchain-based IoT systems. PFRB follows the decentralized settings and fine-grained chameleon hash-based redaction in existing redactable blockchains. In addition, PFRB hides users’ policies during policy matching such that apart from successful policy matching, users’ policy contents cannot be inferred and valid redactions cannot be executed. Some main technical challenges include determining how to hide policy contents and support policy matching. Inspired by Newton’s interpolation formula-based secret sharing, PFRB converts policy contents into polynomial parameters and utilizes multi-authority attribute-based encryption to further hide these parameters. Theoretical analysis proves the correctness and security against the chosen-plaintext attack. Extensive experiments on the FISCO blockchain platform and IoT devices show that PFRB achieves competitive efficiency over current redactable blockchains.


Introduction
The Internet of Things (IoT) connects physical objects to the Internet through information-sensing devices such as radio frequency identification, enabling intelligent recognition and management. The concept of IoT entails integrating sensors into various objects, including power grids and railways, enabling data collection and communication for seamless connectivity and interaction with the physical world [1][2][3]. However, due to the inherent decentralization of IoT device deployment [4,5], achieving secure data management is challenging.
To address these challenges, blockchain has been widely applied in IoT systems as a decentralized data management platform, providing traceability and integrity to secure IoT systems [6][7][8]. According to a report by Mordor Intelligence (https://www.mordorintelligence.com/industry-reports/blockchain-iot, accessed on 31 July 2023), the market size of blockchain-based IoT systems is expected to grow from USD 568.51 million in 2023 to USD 3436.54 million by 2028. Despite these benefits, immutable blockchains present two limitations in current blockchain-based IoT systems. Firstly, the immutability of blockchain violates the right to be forgotten in the General Data Protection Regulation (GDPR) (https://gdpr-info.eu/art-17-gdpr/, accessed on 31 July 2023), severely limiting the development and implementation of blockchain-based IoT systems. GDPR mandates that users should be able to erase their personal data (e.g., health and transportation data) from data management systems. However, with immutable blockchains, once data are appended to the blockchain, no modifications can be made [9]. This conflict between immutability and GDPR could result in high fines for blockchain-based IoT systems. Secondly, IoT devices are vulnerable to network attacks and can be used to spread improper information (e.g., indecent surveillance videos) in blockchain-based IoT systems. If this improper information cannot be redacted, it may negatively impact the blockchain-based IoT application ecosystem [10][11][12].
To address these limitations, some redactable blockchains have been proposed, adopting chameleon hashes [13][14][15][16][17][18]. Specifically, in 2017, Ateniese et al. [13] introduced the first redactable blockchain by using the chameleon hash technique. Their work extended researchers' insight into immutable blockchains and made blockchains compliant with data regulation (e.g., GDPR). Unfortunately, their work only supports all-or-nothing redaction privileges, i.e., a user either can redact all data or cannot redact any data. Clearly, this is infeasible in practice since real-world blockchain applications contain a large number of devices that own different attributes. To achieve fine-grained redaction, Derler et al. [16], Ma et al. [18], and Xu et al. [19] combined attribute-based fine-grained access control with chameleon hashes to introduce the policy-based chameleon hash technique (PCH), subsequently using PCH to achieve fine-grained redactable blockchains. Following PCH, Ma et al. [18] further introduced a decentralized policy-based chameleon hash technique (DPCH) and used it to design a decentralized fine-grained redactable blockchain. However, in existing fine-grained redactable blockchains, users' policies are public to all users, making them unsuitable for some policy-sensitive IoT systems, such as IoT-based smart healthcare and smart transportation [20]. In these IoT systems, users' policies contain sensitive private information, e.g., users' health conditions and geographical locations [21]. For instance, in an IoT-based smart healthcare application, users use their IoT devices (e.g., smartwatches) to record their health data and utilize blockchain-based IoT systems to manage the data. In this application, users' policies usually contain information about users' health conditions, such as sensitive health details [22]. If this private information is leaked, users may face discrimination and even potential attacks [23]. Therefore, there is a great need to design decentralized policy-hidden redactable blockchains in blockchain-based IoT systems. We compare our proposed solution with existing works in Table 1. We address the issue of policy disclosure in a decentralized, fine-grained setting.

The main contributions of this paper are as follows:

• We propose a policy-hidden fine-grained redactable blockchain scheme (named PFRB) for blockchain-based IoT systems. With decentralized settings, PFRB enables users to achieve fine-grained data redaction without compromising policy privacy.

• PFRB leverages multi-authority attribute-based encryption and Newton's interpolation formula-based secret sharing to construct a decentralized secret sharing for policy hiding. Then, based on the constructed secret sharing, PFRB further enriches chameleon hashes to achieve decentralized policy-hidden fine-grained redactable blockchains.

• Security analysis proves the security of PFRB under the chosen-plaintext attack in the random oracle model. Experimental results show that PFRB has competitive efficiency over recent fine-grained redactable blockchain schemes.
Organization. The remainder of this paper is structured as follows. In the next section, we review existing works on redactable blockchains to introduce the research gap. In Section 3, we introduce the preliminaries, followed by an overview of the PFRB system in Section 4. Next, in Section 5, we present the detailed construction of PFRB. Section 6 presents a formal correctness analysis and security analysis. The performance evaluation of PFRB is shown in Section 7. Finally, concluding remarks are given in Section 8.

Related Work
This section provides a systematic review of the current literature on redactable blockchain techniques and introduces the research gap.
In 2017, Ateniese et al. [13] introduced the first redactable blockchain by using the chameleon hash technique. Their work extended researchers' insight into immutable blockchains and made blockchains compliant with data regulations (e.g., GDPR and CCPA). Unfortunately, their work only supports all-or-nothing redaction privileges, i.e., a user either can redact all data or cannot redact any data. Clearly, this is infeasible in practice since real-world blockchain applications contain a large number of devices that own different attributes.
To address the above limitation, Derler et al. [16] proposed a policy-based chameleon hash (PCH) and used PCH to implement fine-grained redactable blockchains. PCH combines chameleon hashes, ephemeral trapdoors, and linear secret sharing matrix-based attribute-based encryption to associate transactions with access policies, enabling rewriting only when editors' attributes satisfy the policy. Following Derler et al.'s work, Tian et al. [24] and Xu et al. [19] each extended redactable blockchains with accountability by introducing digital signatures. Because they rely on centralized attribute-based encryption schemes, the above schemes can only be applied in consortium blockchains and cannot support general decentralized settings. However, in practical blockchain applications, especially distributed IoT systems, decentralized systems are more common. To achieve decentralized redactable blockchains, Ma et al. [18] introduced the first decentralized policy-based chameleon hash (DPCH) using linear secret sharing matrix-based multi-authority attribute-based encryption and used DPCH to achieve decentralized fine-grained blockchain redaction.
However, redaction policies in the above redactable blockchains are open to all participants, which is infeasible in practical blockchain-based IoT systems since policies in IoT systems usually contain users' private information. For instance, in IoT-based smart healthcare applications, patients use their smartwatches to collect their health conditions (such as heart rate and body temperature) and specify policies to allow their private doctors to edit. These policies usually contain sensitive information regarding patients' health conditions and geographical locations. Once this private information is leaked, adversaries can target these patients, for example through robbery or discrimination.
Thus, the protection of policy privacy in decentralized redactable blockchains represents an urgent matter.

Preliminary
In this section, we introduce the building blocks of PFRB, i.e., multi-authority attribute-based encryption, Newton's interpolation formula-based secret sharing, and chameleon hash.

Multi-Authority Attribute-Based Encryption
A multi-authority attribute-based encryption (MA-ABE) [28] system consists of an arbitrary number of attribute authorities and users. A set of global public parameters is defined in the system. Users can select an attribute authority and obtain their corresponding decryption key. The attribute authority performs the appropriate attribute key generation algorithm and returns the result to the user. The encryption process uses the global public parameters and a set of attributes to generate the ciphertext. The decryption process uses the decryption key for the attribute set to perform decryption.
Definition 1 (MA-ABE): A multi-authority attribute-based encryption scheme ABE MC involves three types of entities: authorities, data owners, and data users. It includes five algorithms:

• Global Setup ($\lambda$) → (GP): This algorithm accepts a security parameter $\lambda$ as input and produces a public global parameter GP as output.

• Authority Setup (GP) → (PK, SK): The algorithm takes the public global parameter GP as input and generates a public key PK and a secret key SK as output. The secret key SK must be kept confidential, while the public key PK is published.

• Encryption (M, (A, ρ), GP, {PK}) → (CT): The algorithm accepts a message M, an $n \times \ell$ access matrix A with ρ mapping its rows to attributes, the global parameter GP, and the public keys {PK} of the relevant authorities. It produces a ciphertext CT as output.

• KeyGen (ID, i, SK, GP) → ($K_{i,ID}$): The algorithm generates a key $K_{i,ID}$ for attribute i, which belongs to some authority, using a global identifier ID, the attribute i, that authority's secret key SK, and the public global parameter GP.

• Decryption (CT, {$K_{i,ID}$}, GP) → (M): The algorithm decrypts the ciphertext CT using the set of keys {$K_{i,ID}$} held by the identity ID and the global parameter GP. The result of the decryption process is the message M.

Newton's Interpolation Formula-Based Secret Sharing
In this paper, Newton's interpolation formula is primarily used for key recovery. In addition, because it relies on polynomial-oriented secret sharing, PFRB achieves higher efficiency than traditional linear secret sharing matrix-based works. The transaction issuer hides the key within the zeroth term of a polynomial, ensuring that only users who meet the policy requirements can reconstruct the polynomial and access the hidden key.

• Secret Generation: Assume that there are (n + 1) points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$. Here, $x_i$ is called the interpolation point, and $y_i$ is called the interpolation value. Given an interpolation polynomial $f(x)$, for each $i = 0, 1, 2, \ldots, n$, $y_i$ is represented as $y_i = f(x_i)$. The Newton basis $n_i(x)$ is defined as follows:

$$n_0(x) = 1, \qquad n_i(x) = \prod_{j=0}^{i-1} (x - x_j), \quad i = 1, \ldots, n.$$

Based on $n_i(x)$, the Newton interpolation polynomial $Q_n(x)$ can be defined as follows:

$$Q_n(x) = \sum_{i=0}^{n} K_i \, n_i(x).$$

Specifically, at $x_0$, $Q_n(x)$ evaluates to

$$Q_n(x_0) = K_0 = y_0 = f[x_0],$$

where $f[x_0]$ represents the zeroth-order divided difference. Similarly, at $x_1$, the Newton interpolation polynomial $Q_n(x)$ evaluates to

$$Q_n(x_1) = K_0 + K_1 (x_1 - x_0) = y_1.$$

Thus, $K_1$ can be computed as follows:

$$K_1 = \frac{y_1 - y_0}{x_1 - x_0} = f[x_0, x_1],$$

where $f[x_0, x_1]$ represents the first-order divided difference. Without loss of generality, $K_i$ can be defined as follows:

$$K_i = f[x_0, x_1, \ldots, x_i] = \frac{f[x_1, \ldots, x_i] - f[x_0, \ldots, x_{i-1}]}{x_i - x_0},$$

where $f[x_0, x_1, \ldots, x_i]$ denotes the i-th order divided difference. For more details, the reader can refer to previous literature.

• Secret Construction: We can reconstruct the secret with Newton's parameters as follows:
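The two steps above can be illustrated with a minimal sketch. It assumes that the secret is the constant term of the polynomial (i.e., the value $Q_n(0)$), that arithmetic is done in a prime field, and that a plain threshold sharing is used rather than PFRB's policy encoding. The prime `P` and the function names (`make_shares`, `newton_coefficients`, `newton_eval`) are hypothetical and are not taken from the paper.

```python
import secrets

# Assumption: a Mersenne prime chosen only for illustration.
P = 2**61 - 1


def eval_poly(coeffs, x):
    """Evaluate a polynomial (coeffs[0] is the constant term) at x mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def make_shares(secret, threshold, xs):
    """Hide `secret` in the constant term of a random polynomial of
    degree threshold - 1 and return the points (x, f(x)) for x in xs."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(threshold - 1)]
    return [(x, eval_poly(coeffs, x)) for x in xs]


def newton_coefficients(points):
    """Return K_i = f[x_0, ..., x_i] computed by divided differences mod P."""
    xs = [x for x, _ in points]
    col = [y % P for _, y in points]  # zeroth-order differences f[x_i]
    coeffs = [col[0]]
    for order in range(1, len(points)):
        col = [
            (col[i + 1] - col[i]) * pow((xs[i + order] - xs[i]) % P, -1, P) % P
            for i in range(len(col) - 1)
        ]
        coeffs.append(col[0])
    return coeffs


def newton_eval(points, coeffs, x):
    """Evaluate Q_n(x) = sum_i K_i * prod_{j<i} (x - x_j) in nested form."""
    xs = [px for px, _ in points]
    acc = 0
    for k in range(len(coeffs) - 1, -1, -1):
        acc = (acc * (x - xs[k]) + coeffs[k]) % P
    return acc


def reconstruct_secret(points):
    """Recover the constant term Q_n(0) from enough points."""
    return newton_eval(points, newton_coefficients(points), 0)
```

As a small worked case, the points (1, 2), (2, 3), (3, 6) give the coefficients K_0 = 2, K_1 = 1, K_2 = 1, so Q_2(x) = 2 + (x - 1) + (x - 1)(x - 2) and Q_2(0) = 3. The modular inverse via `pow(a, -1, P)` requires Python 3.8 or later.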

Chameleon Hash
A chameleon hash CH typically encompasses the following five algorithms:

• Setup($1^\lambda$) → pp: The probabilistic setup algorithm takes a security parameter $\lambda$ as input and generates a public parameter pp as output. The public parameter pp is used in the subsequent algorithms.

• KeyGen(pp) → (pk, sk): The probabilistic key generation algorithm takes the public parameter pp as input and generates a public-secret key pair (pk, sk) as output. The public key pk is used for hashing and verification, while the secret key sk (the trapdoor) is kept confidential and used for adaptation.

• Hash(pk, m) → (h, r): The probabilistic hash algorithm takes the public key pk and a message $m \in \mathcal{M}$ as input. It outputs a hash value h and a randomness value r.

• Verify(pk, m, h, r) → {0, 1}: The deterministic verification algorithm takes the public key pk, message m, hash value h, and randomness value r as input. It outputs 1 if the tuple (h, r) is valid for m, and 0 otherwise.

• Adapt(sk, m, m', h, r) → r': The deterministic adaptation algorithm takes the secret key sk, the original message $m \in \mathcal{M}$, a new message $m' \in \mathcal{M}$, the hash value h, and the randomness value r as input. It outputs an adapted randomness value r' such that (h, r') is valid for m'.
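To make the five algorithms concrete, the following is a toy sketch of a discrete-logarithm chameleon hash of the form h = g^m · pk^r mod p. This is a swapped-in textbook construction chosen for brevity, not the RSA-based chameleon hash used in the experiments of this paper, and the tiny parameters `P`, `Q`, `G` are assumptions that offer no security. Setup is represented by the fixed constants.

```python
import secrets

# Toy public parameters (assumption, insecure): P = 2Q + 1 with P, Q prime,
# and G = 4 generates the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def keygen():
    """Return (pk, sk) where sk is the trapdoor and pk = G^sk mod P."""
    sk = secrets.randbelow(Q - 1) + 1  # sk in [1, Q-1], hence invertible mod Q
    return pow(G, sk, P), sk


def ch_hash(pk, m):
    """Hash message m (an integer) with fresh randomness r; return (h, r)."""
    r = secrets.randbelow(Q)
    h = (pow(G, m % Q, P) * pow(pk, r, P)) % P
    return h, r


def verify(pk, m, h, r):
    """Return True if (h, r) is a valid hash/randomness pair for m."""
    return h == (pow(G, m % Q, P) * pow(pk, r % Q, P)) % P


def adapt(sk, m, m_new, h, r):
    """Use the trapdoor to find r' with the same hash h for m_new.

    Solves m + sk * r = m_new + sk * r' (mod Q) for r'. The value h is
    accepted only to mirror the Adapt signature; it is not needed here.
    """
    return (r + (m - m_new) * pow(sk, -1, Q)) % Q
```

A redaction then amounts to replacing (m, r) by (m', r') while the stored hash h, and therefore the block linkage, stays unchanged. Without sk, finding such an r' requires solving a discrete logarithm in the group.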

System Overview
In this section, we present the system model, brief definition, and security model of PFRB.

System Model
As shown in Figure 1, the PFRB (policy-hidden fine-grained redactable blockchain) system involves four types of entities:

• Authorities: The authorities are all trusted. One of them initializes the system, and they can all generate attribute-value pairs.

• Transaction Owner: The transaction owner is also trusted and wishes to record a transaction or some data on the blockchain. The owner hashes the data and attempts to add the transaction to the blockchain.

Definition of PFRB
We next provide a brief definition of PFRB and summarize the notations in Table 2.
• Setup ($1^\lambda$) → ($pp_s$, $pk_s$, $sk_s$): Given a security parameter, the Setup algorithm outputs a public parameter $pp_s$, public key $pk_s$, and secret key $sk_s$. Then, the authority publishes $pp_s$ and $pk_s$ to all users.

• RKGen (msk, ρ) → ($dk_\rho$, {$\Delta_{i,R_j}$}): The RKGen algorithm takes msk and ρ as input, where ρ is the attribute set of the modifier. The algorithm outputs the decryption key $dk_\rho$ and the Lagrange coefficients of ρ, {$\Delta_{i,R_j}$}.

• ModSetup ($sk_s$, id) → ($sk_{id}$, $\sigma_{id}$): The ModSetup algorithm takes the secret key $sk_s$ and the global identifier id as input. It outputs the modifier's secret key $sk_{id}$ and the modifier's signature $\sigma_{id}$.

• AuthSetup (θ) → ($pk_\theta$, $sk_\theta$): The AuthSetup algorithm takes the authority θ as input and outputs the authority's public key $pk_\theta$ and secret key $sk_\theta$.

• ModKeyGen ($pk_s$, id, $\sigma_{id}$, $sk_\theta$, A) → $sk_{id,A}$ / ⊥: The ModKeyGen algorithm takes the public key $pk_s$, the modifier's global identifier id, the modifier's signature $\sigma_{id}$, the authority's secret key $sk_\theta$, and an attribute A as input. It generates the secret key $sk_{id,A}$ for the modifier's attribute A if the request is legal; otherwise, it outputs ⊥.

• Hash ($pk_s$, {$pk_\theta$}, m, R) → ($pk_{etd}$, h, r, c): The Hash algorithm takes the public key $pk_s$, a group of authorities' public keys {$pk_\theta$}, the message m, and the policy R of the target receiver as input. It generates four outputs: a public key $pk_{etd}$ (the public component of the ephemeral trapdoor), a hash value h, a randomness value r, and a ciphertext c. The ciphertext c seals the secret component $sk_{etd}$ of the ephemeral trapdoor to keep it confidential.

• Verify ($pk_s$, $pk_{etd}$, m, h, r) → {0, 1}: The Verify algorithm can be executed by any entity within the system. It accepts the public key $pk_s$, the public component $pk_{etd}$ of the ephemeral trapdoor, the message $m \in \mathcal{M}$, the hash value h, and the randomness value r as input. It outputs 1 if the tuple (h, r) is valid, and 0 otherwise.

Security Model
In our scheme, we assume that all authorities and the transaction owner are trusted entities, and communications between them are secure. The transaction owner generates mutable transactions honestly, and the authorities preserve the secret key honestly. However, other entities, such as chain participants, can act as adversaries and collude to launch the chosen-plaintext attack. The security of PFRB is defined as indistinguishability and collision resistance under the chosen-plaintext attack in the random oracle model as follows.

• Setup: The challenger runs the Setup algorithm and shares the public parameters PK with the adversary.

• Phase 1: The challenger allows the adversary to request private keys from the encryption oracle $O_E$ for attribute sets $S_1, \ldots, S_{q_1}$.

• Challenge: The adversary selects and submits two messages, $M_0$ and $M_1$, of equal length. The adversary also presents a challenge access structure, denoted as $\mathbb{A}$, which none of the previously queried attribute sets satisfies. The challenger flips a random coin $b \in \{0, 1\}$, encrypts $M_b$ under the challenge access structure $\mathbb{A}$, and provides the resulting ciphertext $CT^*$ to the adversary.

• Phase 2: Phase 1 is repeated, with the additional constraint that none of the attribute sets $S_{q_1+1}, S_{q_1+2}, \ldots, S_q$ satisfies the challenge access structure.

• Guess: Based on the above experiment, the adversary outputs a guess $b'$ of $b$.

We say that the adversary $\mathcal{A}$ wins the above game if the guess $b'$ equals $b$. Specifically, PFRB is secure against the chosen-plaintext attack (CPA) if any probabilistic polynomial-time adversary $\mathcal{A}$ has only a negligible advantage in winning the game, where the advantage is defined as follows:

$$\mathrm{Adv}_{\mathcal{A}}^{\mathrm{CPA}}(\lambda) = \left| \Pr[b' = b] - \frac{1}{2} \right|.$$
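The structure of this game can be sketched as a small test harness. This is a simplified illustration under stated assumptions: the key-query phases and the access structure are omitted, the "schemes" are byte-level toys rather than PFRB, and all names (`cpa_game`, `estimate_advantage`, `leaky_encrypt`, `otp_encrypt`) are hypothetical.

```python
import secrets


def cpa_game(encrypt, choose, guess):
    """Play one round of the challenge/guess steps; return True if b' == b."""
    m0, m1 = choose()
    assert len(m0) == len(m1), "challenge messages must have equal length"
    b = secrets.randbelow(2)          # challenger's coin flip
    ct = encrypt(m1 if b else m0)     # challenge ciphertext CT*
    return guess(ct) == b


def estimate_advantage(encrypt, choose, guess, trials=2000):
    """Empirical estimate of |Pr[b' = b] - 1/2| over repeated games."""
    wins = sum(cpa_game(encrypt, choose, guess) for _ in range(trials))
    return abs(wins / trials - 0.5)


def leaky_encrypt(m):
    """Deliberately insecure toy: the 'ciphertext' is the plaintext."""
    return m


def otp_encrypt(m):
    """One-time pad with a fresh random key for every call."""
    key = secrets.token_bytes(len(m))
    return bytes(x ^ k for x, k in zip(m, key))


def choose():
    """Adversary's two equal-length challenge messages."""
    return b"aaaa", b"bbbb"


def guess(ct):
    """Adversary's strategy: output 1 if the ciphertext looks like M_1."""
    return 1 if ct == b"bbbb" else 0
```

Against `leaky_encrypt` this adversary always wins, so the estimated advantage is exactly 0.5, whereas against `otp_encrypt` the estimate stays close to 0, which is what a negligible advantage looks like empirically.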

Proposed Scheme
In this section, we present the detailed construction of PFRB. The workflow of PFRB is shown in Figure 2. There are eight algorithms: Setup, RKGen, ModSetup, AuthSetup, ModKeyGen, Hash, Verify, and Adapt.
Choose n, d and 2n + 2 random values $t_1, \ldots, t_{n+1}, d_1, \ldots, d_{n+1} \in \mathbb{Z}_p$ and set $T_i = g^{t_i}$, $D_i = g^{d_i}$ for each i from 1 to n + 1.
- Return the public key $pk_{etd}$, random value r, hash value h, and ciphertext c as follows.

Theoretical Analysis
In this section, we theoretically analyze the correctness and security of PFRB. Then, we discuss some promising applications of PFRB.

Correctness Analysis
In this section, we prove the correctness of the proposed scheme. The scheme builds upon Ma et al.'s scheme (i.e., RBDS22) [18] while incorporating additional improvements. Consequently, if RBDS22 is correct and the calculations of $K_{1,r}$ and $K_{2,r}$ in this scheme are correct, then the proposed scheme is also correct. We now present the proof of the correctness of $K_{1,r}$ and $K_{2,r}$ as follows: In the Hash part, we have β. $K_{1,r}$ can be transformed as follows: Similarly, $K_{2,r}$ can be transformed as follows: In the Adapt part, we have:

$$g_{T,1} = \frac{K_{1,r}}{\prod_{i,R_j} e(P_{i,r}, dk_i)^{\Delta_{i,R_j}(0)}}, \qquad g_{T,2} = \frac{K_{2,r}}{e\left(\prod_{i,s} E_{i,r},\, W_r\right)}.$$

Application of PFRB in Smart Healthcare
With policy privacy, PFRB can be widely applied in smart healthcare applications to collect and manage medical data. For instance, PFRB ensures that nobody can infer users' private information from users' redaction policies. This benefit can encourage users to adopt portable devices (e.g., mobile phones and smartwatches) to collect data associated with their health condition, such as heart rate and temperature [29]. These data hold high value for medical data analytics applications.

Application of PFRB in Smart Industry
In industrial settings, due to the harsh environment and heavy data collection tasks, IoT devices have been deployed to replace human labor and securely transmit their data through blockchains [30]. In this case, policies from IoT devices usually contain large amounts of commercially sensitive data, such as data types and factory addresses [31]. Protecting this information is crucial to broaden the deployment of blockchain-based IoT systems in the smart industry. Matching these practical requirements, PFRB is a promising solution for blockchain-based IoT systems in the smart industry.

Application of PFRB in AI
Recently, several cases have demonstrated the practical feasibility of combining the advantages of blockchains, AI, and IoT systems, such as swarm learning and machine/deep-learning-based IoT systems [32]. Specifically, IoT systems are responsible for collecting data for AI models, and blockchains are responsible for managing data and training AI models. The trained models can then provide rich services for IoT devices, such as fault detection and inference [33]. Similarly, in these promising applications, redaction policies from IoT devices contain much private information about IoT owners; for example, policies for body temperature data in medical analytics reveal users' health conditions. PFRB addresses the policy leakage problem and can further promote the combination of blockchains, IoT systems, and AI in real-world applications.

Performance Evaluation
In this section, we conduct experiments on the FISCO blockchain platform to evaluate the practical efficiency of PFRB.

Experimental Settings
Configuration: The experiment was conducted on a personal computer running Windows 10 (x64) with an Intel i7 8550U processor clocked at 1.80 GHz and 8 GB of memory. The implementation was done using the JPBC library in Java 8, and the MNT224 curve, known for its type-III properties and offering a 96-bit security level, was selected for pairing operations. Additionally, a 2048-bit RSA group was used for the chameleon hash, providing a security level of 112 bits. Our scheme was deployed on the FISCO BCOS platform, which is an open-source platform customized for the financial industry. It is built upon the BCOS platform and incorporates module upgrades and functionality customization. The "Arbitration Chain" in FISCO leverages blockchain decentralization, tamper resistance, and trustworthiness. Real-time preserved data are securely stored on the blockchain using distributed data storage and encryption algorithms, ensuring the authenticity, legality, and relevance of the evidence.
Parameter design: The data owner randomly selects a news headline and uploads it to the service node as a transaction. In our experiment, we assume that each user can choose up to 100 attributes, such as gender, age, region, education level, and occupation, for access permissions. The access control policies for transactions are determined based on information such as the time and location of the news, and up to 100 policies can be set. It is important to note that the traditional scheme lacks policy protection and is vulnerable to security issues such as privacy leaks. This paper presents the experimental performance of key generation, hashing, and adaptation and chooses recent related works [16,18] as comparisons.
Dataset: The implementation of our proposed solution utilizes the MNH9 dataset, which is derived from a real-world open dataset provided by the Australian Broadcasting Corporation.The MNH9 dataset consists of millions of news headlines.

Experimental Results
In this section, we compare PFRB with recent related works [16,18] in terms of key generation, hashing, verification, and adaptation.
Figure 3 illustrates a significant improvement in the key generation aspect. For all three approaches, the required time increases linearly with the number of attributes, and both our proposed scheme and RBDS22 outperform the traditional approach by requiring considerably less time. This noteworthy enhancement is primarily attributed to the higher efficiency of our scheme, which leverages traditional linear secret sharing matrices at the underlying level, resulting in a more stable and efficient process. Moving on to Figure 4, the most noticeable efficiency improvement of the proposed solution is in the hashing part when compared to the traditional scheme. The traditional approach's required time increases exponentially with the number of policies, while the proposed solution and RBDS22 demonstrate linear growth. This advantage can be attributed to the use of polynomial functions generated from point values, which have proven to be significantly more efficient than traditional linear secret sharing matrix schemes.
Figure 5 reveals that the adaptation algorithm part, which employs Newton's interpolation to reconstruct polynomials, presents challenges due to the enumeration of attribute-value pairs owned by the user. While the required time for both the traditional scheme and the proposed solution increases exponentially with the number of policies, the proposed scheme still outperforms the traditional approach significantly. Moreover, RBDS22 shows linear growth in the required time, and the proposed solution performs better than RBDS22, especially when the number of policies is low. Figure 6 demonstrates that in the Verify algorithm part, the response time of all three schemes increases linearly with the number of requests, with no significant difference in the time used. This similarity arises mainly because all three schemes perform similar calculation operations in the Verify algorithm. Figures 7 and 8 show that when the number of policies is fixed, the response time of all three schemes in the hashing and adaptation algorithm parts increases linearly with the number of requests. However, the proposed scheme requires significantly less time than the traditional scheme and slightly less time than RBDS22 when the number of policies is relatively small. This advantage can be attributed to the proposed scheme's utilization of MA-ABE and Newton's interpolation, which enables it to achieve higher efficiency compared to traditional ABE and linear secret sharing matrices.

Conclusions
In this paper, we propose a policy-hidden fine-grained redactable blockchain (named PFRB) in decentralized blockchain-based IoT systems. Like existing redactable blockchains, PFRB supports decentralized settings and fine-grained chameleon hash-based redaction. In addition, PFRB ensures that, apart from successful policy matching, no one can infer users' policy contents or execute any valid redaction. PFRB draws inspiration from Newton's interpolation formula-based secret sharing to convert policy contents into polynomial parameters. PFRB then utilizes multi-authority attribute-based encryption to hide these parameters further. Theoretical analysis proves that PFRB is secure against the chosen-plaintext attack. Extensive experiments on the FISCO blockchain platform and IoT devices show that PFRB achieves competitive efficiency over current redactable blockchains. For future work, we will focus on providing more comprehensive privacy protection (e.g., data privacy) and richer functionalities (such as accountability and revocation) in decentralized IoT systems. In addition, we will also focus on achieving richer data analytics mechanisms by combining various machine learning schemes.

Figure 4. Hash performance with different policy sizes.

Figure 5. Adaptation performance with different policy sizes.

Figure 6. Verify performance with different request numbers.

Figure 7. Hash performance with different user numbers.

Figure 8. Adaptation performance with different user numbers.

Table 2. Notations used in PFRB.

• Adapt ($sk_{id}$, {$sk_{id,A}$}, c, m, m', h, r) → r': The Adapt algorithm is executed by the transaction modifier. It takes the modifier's secret key $sk_{id}$, a set of secret keys {$sk_{id,A}$}, the ciphertext c, messages m and m', the hash value h, and the randomness value r as input. The output of the algorithm is a new randomness value r'.