1. Introduction
Cloud computing has revolutionized the way data are managed and stored, offering a cost-effective solution that is flexible and scalable. However, as the volume of data stored and exchanged on the cloud continues to grow rapidly, several security and privacy issues must be addressed to ensure the safety of user data. One of the most significant challenges facing cloud computing is securely exchanging data. This introduction first reviews the existing landscape in cloud computing, including its impact on data management and storage, the security and privacy challenges it faces, and the limitations of traditional encryption methods in addressing them.
Traditional encryption methods are poorly suited to securing cloud data exchange because they produce multiple encrypted versions of the same data using different keys. Attribute-based encryption has been proposed as a possible solution, but it still relies on trusted third parties to protect users’ privacy [1]. Another issue with cloud computing is the need for data files to be encrypted before they are stored on a cloud server, because most cloud servers are not entirely trustworthy and reliable. Encrypted data files may then need to be downloaded locally before they can be decrypted, wasting network bandwidth and time. Searchable encryption (SE) has been put forth to address these issues. SE keeps data encrypted while enabling users to search for and retrieve specific pieces of data without downloading and decrypting the entire file, which saves time and resources. Despite these advances, cloud computing still has security and privacy concerns, particularly around identity authentication and storage security. Users must be vigilant about securing their data, and cloud service providers must implement robust security measures to protect user data from unauthorized access or attack. In short, security and privacy concerns must be addressed to ensure the safety of user data, and new approaches, such as searchable encryption, must be adapted to meet the evolving needs of cloud computing. This paper considers attribute-based encryption as a potential solution, together with encrypted data files and searchable encryption, to enhance security and efficiency.
The Internet of Things (IoT) has revolutionized our daily lives with the massive amount of data it generates. However, storing and managing these data is a significant challenge due to the limited resources of IoT devices. Cloud storage is an effective solution but raises security and privacy concerns, such as unauthorized data access and manipulation. Symmetric encryption is a standard technique that provides data confidentiality. However, it does not enable accurate data sharing or searching. Keyword-based searchable attribute-based encryption (KSABE) is a more comprehensive solution that offers data protection and granular access control. The ability to conduct keyword-based searches is especially critical for data users. However, the decryption process is computationally intensive, and managing large user keys is challenging when applying attribute-based encryption techniques to IoT. Blockchain-aided searchable attribute-based encryption (BC-SABE) is a promising solution that addresses these challenges effectively. This system employs a decentralized blockchain system to manage threshold parameter construction, key management, and user revocation. It enables efficient revocation and decryption without updating keys or re-encrypting ciphertext. This is because the blockchain holds all revocation procedures. Furthermore, the coalition blockchain allows users to create partial tokens, enhancing their privacy. BC-SABE is an innovative approach to IoT data management that addresses security and privacy concerns while enabling accurate data sharing and searching. It holds great potential for future IoT applications, especially when data protection, access control, and efficient revocation are critical.
Bitcoin is a digital currency and a pioneer of blockchain technology. It works as a distributed, open database, maintained over a peer-to-peer network, that stores transactions in blocks. Each block is connected to the previous block by a hash function. Within a block, the transactions are organized in a Merkle tree. If a rogue user tries to modify a transaction, the Merkle tree’s root hash changes, and with it the block’s hash. The network’s consensus mechanism makes it difficult for a rogue user to carry out such an attack, because it ensures that all nodes in the network agree on the validity of a transaction: if a rogue user tries to alter a transaction, other nodes in the network will reject it, and the blockchain will not acknowledge it. To override this agreement, the rogue user would need to control a majority of the network’s computational power (the so-called 51% attack). The consensus mechanism also prevents double-spending, where a user spends the same bitcoin twice.
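The effect of tampering on the Merkle root can be illustrated with a minimal sketch. It is a simplification under stated assumptions: it uses a single SHA-256 per node (Bitcoin itself applies SHA-256 twice), the transaction strings are invented for illustration, and the function name `merkle_root` is hypothetical.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Return the Merkle root of a non-empty list of serialized transactions."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    # Leaves are the hashes of the individual transactions.
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        # With an odd number of nodes, the last one is paired with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Each parent is the hash of its two concatenated children.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Illustrative transactions (made up for this example).
txs = [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3"]
original_root = merkle_root(txs)

# Changing a single transaction changes the root, so the change is detectable.
tampered_root = merkle_root([b"alice->bob:100"] + txs[1:])
print(original_root != tampered_root)  # True
```

Because the block hash covers the Merkle root, a node that recomputes the root from the received transactions can detect a modified transaction without comparing the transactions one by one.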
Blockchain technology is an open, decentralized, and independent system that does not require third-party management of transactions. Transactions are automatically distributed across the entire network and secured through proof-of-work strategies and encryption technology, making it more secure than conventional data storage methods. Blockchain-based distributed cloud storage infrastructures have been proposed on this basis; the Block-Secure technology, for example, uses a digital signature algorithm (DSA) to encrypt and sign the blocks of users’ files. To reduce the load on the peer-to-peer (P2P) network and quickly deliver users’ files from the cloud, a random file replication placement strategy is employed. In addition, the Merkle hash tree is used as a validation tool to provide file integrity verification. The Block-Secure system uses a genetic algorithm to address the file block replica placement problem across multiple users and different data centers in a distributed cloud storage environment. Integrating blockchain technology in the cloud storage infrastructure provides a more secure and decentralized system that ensures data privacy and integrity. Using genetic algorithms and validation tools such as the Merkle hash tree enhances the system’s efficiency and reliability in managing distributed data storage.
SE [2] enables users to search encrypted data files using keywords, significantly reducing users’ communication and computation costs. The majority of existing SE schemes are based on public-key SE. The authors of [3] proposed a general public-key SE application scheme for the mobile medical system and applied public-key SE to medical information management. The authors of [4] suggested a comparatively secure public-key SE scheme using the random oracle model to address offline keyword guessing attacks. However, most SE schemes using public keys support only one-to-one encryption and decryption. Because each encryption requires the recipient’s identity information to be known, and the search authority of the searching user is not considered, the data owner cannot enforce effective access control over the outsourced encrypted data, which limits convenience and practicability in real application environments. With any of the SE schemes mentioned above, data users can perform unlimited searches, using any keyword to request encrypted content containing that keyword from the server. Since data owners cannot impose adequate access controls on outsourced data, attribute-based encryption technology is needed to develop an SE scheme with keyword search authorization. The idea of attribute-based encryption was first put forth in [5], which implements fine-grained access control of data through fuzzy identification and introduced a new cryptographic primitive. The authors of [6] then published an attribute-based encryption system that incorporates attributes into keys to accomplish fine-grained data access control. To balance user experience and security in data outsourcing, the authors of [7] then proposed a fine-grained SE method and discussed its potential applications in a secure mobile cloud environment. In the attribute-based encryption approach, the key that secures communication is embedded with attributes [8].
In contrast to traditional encryption techniques, attribute-based encryption provides collusion resistance: users cannot combine their keys to decrypt data that none of them could decrypt alone. As a result, an adversary needs a valid user key to access encrypted files, and using attribute-based encryption for login security can stop collusion attacks. Only authorized users with matching attributes may access or decrypt the encrypted files. Distributed attribute-based encryption, multi-authority attribute-based encryption, attribute-based broadcast encryption, and ciphertext-policy attribute-based encryption are just a few of the variants derived from the ABE technique. Data analysts classify encryption techniques based on how crucial they are for data security. Attribute-based encryption derives the user key from attribute sets. As a result, the client is permitted to distribute, manage, and maintain the PHR using identity sets. Basic ABE, however, does not support user revocation. Public-key encryption relies on a pair of keys, one public and one private, and a message encrypted with one key of the pair is decoded with the associated key. In this way, public-key encryption lets two users share access to a protected file.
Consequently, the user must have both public and private keys. The public key is used to encrypt private data stored on a cloud server, while the private key is used to decrypt the encrypted communication. Because the message is sent to the server as ciphertext, key handling is the most crucial component of the public-key technique. The user may also use the private key to protect and authenticate the integrity of the data. The public-key technique, however, imposes restrictions on the encryption procedure, and the user must run several algorithms to send and receive encrypted messages. AES, or Advanced Encryption Standard, is a symmetric encryption technique that encrypts data in blocks of 128 bits. The keys used to encrypt these data blocks have lengths of 128, 192, or 256 bits. A 128-bit key requires 10 rounds of encryption, a 192-bit key requires 12 rounds, and a 256-bit key requires 14 rounds. Each round consists of several operations, such as substitution, transposition, and mixing of the data. AES can be used for file encryption, secure sockets layer/transport layer security, mobile app encryption, and Wi-Fi security.
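The relationship between AES key length and round count can be written down compactly. The sketch below relies on the rule from the AES specification that the number of rounds equals the key length in 32-bit words plus six; the function name `aes_rounds` is a hypothetical helper introduced only for illustration.

```python
def aes_rounds(key_bits: int) -> int:
    """Return the number of AES rounds for a given key length in bits."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits long")
    key_words = key_bits // 32  # key length in 32-bit words (4, 6, or 8)
    return key_words + 6        # the block size is 128 bits in every case


for bits in (128, 192, 256):
    print(bits, aes_rounds(bits))
# 128 10
# 192 12
# 256 14
```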
Cloud data security focuses on data sharing and storage on the platforms of cloud service providers (CSPs). The data owner (DO) is allowed to upload encrypted data, and others must obtain their own decryption key with the DO’s permission to decrypt the data. In other words, the data are encrypted only once, and clients who meet the requirements can decrypt them using their own private keys. The uploaded data are entirely within the DO’s control, and the DO is accountable for its posted data. All user actions are recorded in full and cannot be altered, so users cannot repudiate their behavior. This supports fine-grained access control, in which the DO encrypts the data and, after meeting the attribute requirements, other users can collaborate with the DO to derive a unique decryption key. The administration of keys and data storage is independent of outside parties, so a third party cannot affect the DO’s key generation and data encryption. The integrity of the key, ciphertext, and plaintext is required; the user cannot obtain the correct decrypted data if the integrity of any one of them has been lost. Since the cloud platform is public, anybody may view the uploaded encrypted data. Users must, however, negotiate the key with the DO if they wish to be able to decrypt the data. It is important to note that ABE techniques have their own strengths and limitations, and the choice of which scheme to use depends on the application’s specific requirements. For example, KP-ABE is more suitable for scenarios with a large number of users and a limited number of resources. In contrast, CP-ABE is better suited to scenarios with complex access policies and more dynamic attribute updates. In addition, recent research has focused on enhancing ABE schemes to address various security and efficiency challenges. For instance, researchers have proposed hybrid ABE schemes that combine the advantages of different ABE schemes to achieve better performance and security.
Other approaches include using multi-authority ABE to address scalability issues, and incorporating techniques such as proxy re-encryption and homomorphic encryption to improve the efficiency of ABE-based systems. Overall, ABE techniques have proven to be effective in ensuring fine-grained access control in cloud-based storage systems. As cloud computing and storage continue to evolve, ABE will likely continue to play a crucial role in ensuring the security and privacy of sensitive data in the cloud.
In the blockchain-based ciphertext-policy attribute-based encryption method (BCAS), hashes ensure data integrity by making unauthorized changes to the ciphertext, key, and original data detectable. To ensure that the data are valid and have not been tampered with, the DO publishes the hashes of the ciphertext and the original data to the blockchain together with the encrypted data. The system validates the data and uploads them only if the validation is successful. The DO and DR must provide the key and submit its hash to the blockchain for verification. Once the key is generated, the DR decrypts the data and provides the hash of the result for comparison with the original data’s hash to confirm its integrity. By incorporating data hashes into the blockchain, the BCAS method provides a secure and efficient way to manage data integrity in cloud-based storage systems.
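The hash-comparison step described above can be sketched as follows. This is an illustration only, not the BCAS implementation: a Python dictionary stands in for the blockchain, the byte strings are placeholders rather than real ciphertexts, and the names `ledger`, `publish`, and `verify` are hypothetical.

```python
import hashlib
import hmac
from typing import Dict

# Stand-in for the blockchain: record id -> published hashes (assumption).
ledger: Dict[str, Dict[str, str]] = {}


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def publish(record_id: str, ciphertext: bytes, original: bytes) -> None:
    """Owner side: publish the hashes of the ciphertext and the original data."""
    ledger[record_id] = {
        "ciphertext": digest(ciphertext),
        "original": digest(original),
    }


def verify(record_id: str, kind: str, data: bytes) -> bool:
    """Receiver side: compare the hash of the received data with the published one."""
    expected = ledger[record_id][kind]
    return hmac.compare_digest(expected, digest(data))


# Placeholder values; a real system would publish hashes of actual ciphertexts.
publish("record-1", ciphertext=b"\x8f\x02\x11placeholder", original=b"patient report")

print(verify("record-1", "original", b"patient report"))   # True
print(verify("record-1", "original", b"patient rep0rt"))   # False
```

Only the hashes are published, so the comparison reveals whether the decrypted result matches the original data without placing the data themselves on the ledger.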
It is important to note that while blockchain technology can address specific data security and privacy concerns, it is not a silver bullet. Blockchain-based systems can have their own security vulnerabilities and must be carefully designed and implemented. Additionally, there may be trade-offs between security and efficiency, as blockchain-based systems can be computationally intensive and require significant resources. Nonetheless, using blockchain technology in conjunction with other cryptographic techniques, such as attribute-based encryption, can offer enhanced data security and privacy protection in specific applications. The attribute-based encryption method proposed by the authors of [9] is designed for semi-honest cloud storage environments and provides a more versatile and comprehensive access control technique. The method is based on attribute-based encryption (ABE), allowing for fine-grained access control using user attributes instead of fixed identities. This approach enables data owners to specify access policies based on the attributes of the users, such as their job titles, age, or location.
Another paper [10] suggests an attribute-based SE system in which the cloud server handles the heavy computing tasks to lessen the user’s computational load and increase flexibility when the access policy is altered. Since most attribute-based SE schemes employ cloud storage, data security and privacy protection issues are becoming increasingly prevalent. The cloud server offers users convenient and ample data storage services. However, the complexity of its security situation significantly undermines customers’ confidence in it, as unauthenticated individuals may be able to access cloud servers, and data protection cannot be guaranteed. Blockchain technology [11], with its ability to enable the access and sharing of data in a free and safe manner, has opened up new avenues to address these issues. The authors of [12] were the first to highlight the importance of storing data in the public chain and proposed a new data deletion scheme based on blockchain technology. Regardless of how poorly the cloud server behaves, the data owner can still verify the deletion result, increasing the transparency of the deletion operation. The authors of [13] then suggested a blockchain-based SE scheme that combines SE and blockchain technology to guarantee fairness and minimize users’ computation when searching encrypted data files. The authors of [14] presented a trusted SE strategy based on cloud storage that accounts for malicious users and cloud service providers. Data sharing relies heavily on attribute-based encryption, particularly variants that incorporate attributes into the ciphertext. Blockchain technology can ensure the integrity and immutability of policy-related information. However, access control systems in distributed networks typically leak sensitive information. The authors of [15] suggested a traceable, efficient, and privacy-preserving attribute-based searchable encryption technique [16] in the blockchain to address the efficiency of attribute encryption, privacy leakage, and key abuse.
The system uses blockchain technology to guarantee the immutability and integrity of data. Searchable symmetric encryption (SSE) is also utilized, which enables selective querying of encrypted data without the risk of data leakage. To ensure the security of records on the blockchain, each network participant has a private key for signing transactions with a digital signature. If a document is altered, the signature becomes invalid, and the network is immediately alerted. To determine whether a file is encrypted, the file’s entropy can be calculated using tools like binwalk; if the entropy is consistently high, close to the maximum of 8 bits per byte, the file is likely encrypted.
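The entropy check mentioned above can be reproduced in a few lines. The sketch below computes the Shannon entropy of a byte string in bits per byte; it is an independent illustration of the measure, not a description of how binwalk is implemented, and random bytes from `os.urandom` are used here as a stand-in for ciphertext.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


plain = b"The quick brown fox jumps over the lazy dog. " * 200
random_like = os.urandom(len(plain))  # stand-in for encrypted content

# Natural-language text scores well below 8 bits per byte,
# while random-looking data scores close to 8.
print(round(shannon_entropy(plain), 2))
print(round(shannon_entropy(random_like), 2))
```

High entropy is also typical of compressed files, so in practice such a measurement indicates that a file may be encrypted rather than proving it.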
In the system, the hospital server stores the ciphertext of a patient’s electronic medical record, while the alliance chain and medical cloud server record the keyword ciphertext and the patient’s pseudo-identity as the specific index. The alliance chain contains the term ciphertext and receives the trapdoor when it is generated. The nodes on the alliance chain search for keywords when a patient needs a database for the incentive system. The node on the alliance chain retrieves the security index and matches the patient’s pseudo-random identification when searching for the ciphertext of the related patient. The node on the alliance chain then locates the doctor’s identification on the medical cloud server to determine the hash value of the keyword ciphertext, and the patient can decrypt the electronic medical record to reveal its plaintext. Traditional data encryption methods protect data privacy but have limitations regarding easy sharing. Ciphertext processing methods are applied to enable third-party users to perform mathematical calculations on encrypted data while protecting user privacy. The ciphertext processing methods also allow statistical or machine learning tools for privacy-preserving data analysis.
Cloud computing has significantly transformed data management and storage, providing a cost-effective, flexible, and scalable solution. However, the rapid growth of data stored and exchanged in the cloud presents security and privacy challenges. Secure data exchange is a pressing concern, with traditional encryption methods becoming inadequate. Attribute-based encryption has been proposed but still relies on trusted third parties. Additionally, data must often be locally downloaded before decryption on untrustworthy cloud servers, leading to inefficiencies. Searchable encryption (SE) offers a solution by enabling data encryption while allowing specific data retrieval without full file decryption. Nonetheless, security concerns persist, particularly regarding authentication, identity verification, and storage security.
The existing landscape highlights several gaps in addressing cloud computing’s security and privacy challenges. These include the need for improved data exchange security, efficient and trustworthy cloud storage, and enhanced authentication processes. Existing solutions, such as attribute-based encryption, require further development to minimize reliance on third parties. Limitations include the reliance on trusted third parties in attribute-based encryption, the need for local data downloads before decryption, and security concerns around authentication and identity verification.
The proposed approach, which combines blockchain technology with searchable attribute-based encryption (BC-SABE), is significant for addressing these limitations. BC-SABE offers an innovative solution to securely manage IoT data by employing a decentralized blockchain system. It streamlines parameter construction, key management, and user revocation while maintaining data privacy. This approach enhances data protection, access control, and efficient revocation, making it suitable for future IoT applications.
2. Materials and Methods
This section introduces the scheme’s system model, formal definition, and security model.
2.1. System Model
The system implements fine-grained access control for encrypted data, enabling different data users to access data according to their authorized attributes. The cloud server stores data files and encrypted keywords, and the blockchain stores the encrypted keywords’ storage addresses on the cloud server. The system model shown in Figure 1 consists of several components: data owners, various data users, a cloud server, a trusted attribute authorization center, and the blockchain. The system uses cloud-based blockchain technology to ensure the immutability and integrity of data.
Attribute Authorization Center: The center is fully trusted by the data owners and users who interact with it in the system and is responsible for setting system parameters and registering users. It generates the key and corresponding parameters and returns them to the user.
Data owner: According to the established guidelines, the data owner extracts the keyword set from the data file, encrypts the keywords under the chosen access policy, and then uploads the data file’s ciphertext and the keyword ciphertext to the cloud server. The cloud server stores the ciphertexts and gives the data owner the storage address. The data owner then creates an inverted index that links the storage address of the keyword ciphertext to that of the data file ciphertext on the cloud server. To complete and publish a new block, the data owner builds a transaction containing the keyword ciphertext and its storage address and uploads it to the blockchain. Other participants on the blockchain are responsible for verifying the new block.
Cloud server: The cloud server provides data storage services. It stores the data file ciphertext and keyword ciphertext provided by the data owner and returns the storage address to the data owner. When a keyword search is successful, the data owner uses the address supplied by the blockchain to look up locally the index relationship between the data file’s ciphertext and the keyword’s ciphertext. On receiving a request, the cloud server looks up the ciphertext of the requested data file and responds with the result.
Blockchain: Blockchain nodes offer data search functions. The data owner creates a transaction containing the keyword ciphertext and its storage address and uploads it to the blockchain. Once more blockchain participants have received the broadcast block, the block is considered verified. When a user uploads a trapdoor as a transaction, the search algorithm is executed by a blockchain node that wishes to receive the reward. If the search is successful, the blockchain node gives the storage address of the keyword ciphertext to the data owner; otherwise, it returns failure.
Data users: Users create search trapdoors using their private keys and the desired keywords and upload the trapdoors to the blockchain as transactions; the blockchain’s nodes then carry out searches using these transactions. If the search is successful, the blockchain node provides the data owner with the storage address of the keyword ciphertext. The data owner then uses the index relationship to inform the cloud server of the address of the data file ciphertext. The cloud server then finds the encrypted data file and gives the user access to its ciphertext.
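The interaction between these components can be sketched as a toy message flow. The example is a sketch under strong simplifying assumptions: an HMAC tag replaces both the attribute-based keyword ciphertext and the trapdoor, so there is no access policy and the user shares a key with the owner; in-memory dictionaries replace the cloud server and the blockchain; and each keyword maps to a single file. All class and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Optional


def keyword_tag(key: bytes, keyword: str) -> str:
    """Toy stand-in for a keyword ciphertext or trapdoor (NOT the scheme's ABE)."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).hexdigest()


class CloudServer:
    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        address = hashlib.sha256(blob).hexdigest()[:16]
        self._store[address] = blob
        return address  # storage address returned to the data owner

    def get(self, address: str) -> bytes:
        return self._store[address]


class Blockchain:
    def __init__(self) -> None:
        self._ledger: Dict[str, str] = {}  # keyword tag -> keyword ciphertext address

    def record(self, tag: str, keyword_address: str) -> None:
        self._ledger[tag] = keyword_address

    def search(self, trapdoor: str) -> Optional[str]:
        return self._ledger.get(trapdoor)  # None models a failed search


class DataOwner:
    def __init__(self, cloud: CloudServer, chain: Blockchain, key: bytes) -> None:
        self.cloud, self.chain, self.key = cloud, chain, key
        self._index: Dict[str, str] = {}  # keyword address -> file address

    def upload(self, file_ciphertext: bytes, keyword: str) -> None:
        tag = keyword_tag(self.key, keyword)
        file_address = self.cloud.put(file_ciphertext)
        keyword_address = self.cloud.put(tag.encode())
        self._index[keyword_address] = file_address
        self.chain.record(tag, keyword_address)

    def resolve(self, keyword_address: str) -> str:
        return self._index[keyword_address]


def user_search(keyword: str, key: bytes, chain: Blockchain,
                owner: DataOwner, cloud: CloudServer) -> Optional[bytes]:
    trapdoor = keyword_tag(key, keyword)          # user builds the trapdoor
    keyword_address = chain.search(trapdoor)      # a blockchain node runs the search
    if keyword_address is None:
        return None
    file_address = owner.resolve(keyword_address) # owner consults the index
    return cloud.get(file_address)                # cloud returns the file ciphertext


cloud, chain, key = CloudServer(), Blockchain(), b"demo-key"
owner = DataOwner(cloud, chain, key)
owner.upload(b"<encrypted record 1>", "diabetes")
print(user_search("diabetes", key, chain, owner, cloud))  # b'<encrypted record 1>'
print(user_search("asthma", key, chain, owner, cloud))    # None
```

The sketch keeps the three lookups separate on purpose: the blockchain only ever sees keyword tags and addresses, and the cloud server only ever sees ciphertexts and addresses.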
2.2. Security Model
Keyword ciphertext indistinguishability security and trapdoor indistinguishability security of the scheme under chosen-plaintext attack are defined via probabilistic polynomial-time games between an attacker $\mathcal{A}$ and a challenger $\mathcal{B}$.
- i.
Game 1: Keyword ciphertext indistinguishability.
Initial phase: $\mathcal{B}$ runs the system establishment algorithm to output the public parameters; $\mathcal{A}$ defines a challenge access tree.
Stage 1: At this stage, $\mathcal{A}$ adaptively performs the following queries, a polynomially bounded number of times.
Key extraction query: $\mathcal{A}$ adaptively asks for the private keys corresponding to attribute sets of its choice.
Trapdoor inquiry: In the keyword ciphertext query, $\mathcal{A}$ adaptively asks for the ciphertext corresponding to a keyword of its choice. During this process, none of the private keys that are asked for satisfies the challenge access tree.
Challenge: $\mathcal{A}$ submits two challenge keywords, $w_0$ and $w_1$, to $\mathcal{B}$.
$\mathcal{B}$ randomly selects $b \in \{0, 1\}$, encrypts $w_b$ to obtain the keyword ciphertext, and returns it to $\mathcal{A}$.
Stage 2: $\mathcal{A}$ continues to issue a series of queries corresponding to attribute sets as in stage 1, with the requirement that none of the private keys obtained satisfies the challenge access tree. Finally, $\mathcal{A}$ outputs a guess $b'$; if $b' = b$, then $\mathcal{A}$ wins game 1.
$\mathcal{A}$’s advantage in winning this game is defined in Equation (1):
$\mathrm{Adv}^{\mathrm{Game\,1}}_{\mathcal{A}}(\lambda) = \left| \Pr[b' = b] - \tfrac{1}{2} \right|$ (1)
If $\mathrm{Adv}^{\mathrm{Game\,1}}_{\mathcal{A}}(\lambda)$ is negligible for every probabilistic polynomial-time attacker $\mathcal{A}$, the scheme is said to satisfy keyword ciphertext indistinguishability security.
- ii.
Game 2: Trapdoor indistinguishability.
Assume that $\mathcal{A}$ is a polynomial-time attacker attempting to break trapdoor indistinguishability. The challenger $\mathcal{B}$ then constructs an algorithm to solve the DDH problem, for which $\mathcal{B}$ is given a DDH instance.
Initial phase: $\mathcal{B}$ runs the system establishment algorithm to output the public parameters.
Stage 1: At this stage, $\mathcal{A}$ adaptively performs the following queries, a polynomially bounded number of times.
Key extraction query: $\mathcal{B}$ runs the key generation algorithm to calculate the private key and returns the key to $\mathcal{A}$.
Trapdoor inquiry: Given a keyword, the corresponding trapdoor is computed and returned to $\mathcal{A}$.
Challenge: $\mathcal{A}$ submits two challenge keywords, $w_0$ and $w_1$, to $\mathcal{B}$. $\mathcal{B}$ randomly selects $b \in \{0, 1\}$, uses $w_b$ to generate the trapdoor, and returns it to $\mathcal{A}$.
Stage 2: $\mathcal{A}$ continues to issue a series of queries as in stage 1, but cannot ask for information about the challenge keywords. Finally, $\mathcal{A}$ outputs a guess $b'$; if $b' = b$, then $\mathcal{A}$ wins game 2.
$\mathcal{A}$’s advantage in winning this game is defined using Equation (2):
$\mathrm{Adv}^{\mathrm{Game\,2}}_{\mathcal{A}}(\lambda) = \left| \Pr[b' = b] - \tfrac{1}{2} \right|$ (2)
If $\mathrm{Adv}^{\mathrm{Game\,2}}_{\mathcal{A}}(\lambda)$ is negligible for every probabilistic polynomial-time attacker $\mathcal{A}$, the scheme is said to satisfy trapdoor indistinguishability security.
3. Proposed Work
The blockchain’s cloud-assisted attribute-based searchable encryption scheme is divided into three stages: system establishment, data encryption, and data search.
3.1. System Establishment
This stage is divided into two steps: system initialization and key generation.
- i.
System initialization (
).
In this process, the attribute authorization center executes the algorithm to initialize the system. Input the security parameter , output the system’s public parameter and the data owner’s key .
Generate a bilinear map, e. , where and is cyclic multiplicative
Hash functions .
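Define the Lagrange coefficient using Equation (3):
$\Delta_{i,S}(x) = \prod_{j \in S,\, j \neq i} \frac{x - j}{i - j}$ (3)
where $S$ represents a set of field elements and $i \in S$.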
Randomly select , and calculate .
Return
- ii.
Key generation ().
During this procedure, the attribute authorization center runs the algorithm to produce the user’s private key for its attribute set .
Randomly select
, and calculate
using Equation (4):
For
, randomly select
and calculate
using Equation (5):
Finally, the user’s key is obtained, and is returned to the user.
3.2. Encryption
At this stage, the data owner invokes this algorithm to encrypt all keywords, each corresponding to the access tree defining the keyword search authority.
Randomly select
as the secret value, and calculate using Equation (6):
First, execute the secret sharing algorithm for each node $x$ in the access tree (including the leaf nodes), starting from the root node $R$. For each node $x$, choose a polynomial $q_x$. The specific steps are:
For each node $x$ in the access tree, set the degree $d_x$ of the polynomial $q_x$ to one less than the node’s threshold value $k_x$, that is, $d_x = k_x - 1$.
Starting from the root node $R$, define $q_R(0)$ as the secret value, and then randomly select $d_R$ other points of the polynomial $q_R$ to complete its definition. For every other node $x$, define $q_x(0) = q_{\mathrm{parent}(x)}(\mathrm{index}(x))$, and randomly select $d_x$ other points to complete the definition of $q_x$.
Let be the set of leaf nodes in . For the node in the set , calculate
Finally, the encrypted keyword is assembled from the values computed above. The data owner delivers the encrypted data file and the encrypted keyword to the cloud server, which then returns the storage address. The data owner embeds the encrypted keyword and its storage address in a transaction that identifies the data owner, signs it, and broadcasts it to the blockchain system; miners record the confirmed transaction on the blockchain.
The structure of a block in the blockchain consists of two main components: a block header and transactions. The block header has the following information: block identifier, block size, the hash of the preceding block, and a timestamp. Each transaction includes the following information: the block producer’s identity, the block producer’s signature, the keyword ciphertext, and its address.
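A minimal sketch of such a block layout is given below. The field names are assumptions chosen to mirror the description (the block size field is omitted for brevity), the signature is a placeholder string rather than a real digital signature, and hashing the JSON serialization is an illustrative choice, not the encoding of any particular blockchain.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Transaction:
    producer_id: str         # identity of the block producer
    signature: str           # placeholder; a real system would carry a signature
    keyword_ciphertext: str  # encrypted keyword (opaque string here)
    address: str             # storage address on the cloud server


@dataclass
class Block:
    block_id: int
    prev_hash: str           # hash of the preceding block
    timestamp: int
    transactions: List[Transaction]

    def hash(self) -> str:
        # Hash a canonical serialization of the header fields and transactions.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


def chain_is_valid(chain: List[Block]) -> bool:
    """Every block must reference the hash of the block before it."""
    return all(cur.prev_hash == prev.hash() for prev, cur in zip(chain, chain[1:]))


genesis = Block(0, "0" * 64, 1700000000, [])
tx = Transaction("DO-1", "sig-placeholder", "ct(keyword)", "addr-42")
block1 = Block(1, genesis.hash(), 1700000600, [tx])
block2 = Block(2, block1.hash(), 1700001200, [])
chain = [genesis, block1, block2]

print(chain_is_valid(chain))  # True
block1.transactions[0].address = "addr-99"  # tamper with a recorded address
print(chain_is_valid(chain))  # False
```

Because each header stores the hash of the preceding block, altering a recorded storage address invalidates every later block unless all of them are recomputed.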
3.3. Data Search
This stage includes trapdoor generation (Trapdoor) and keyword search (Search).
- i.
Trapdoor generation
In this process, the user uses their private key $SK_U$ and the keyword $\omega$ to be queried to generate the trapdoor $U_\omega$.
Randomly select
and calculate
using Equation (7):
For , calculate . Therefore, the trapdoor generated by the keyword to be queried is . Embed the trapdoor into the transaction, , sign it, and broadcast it to the entire blockchain system in the form of the transaction . The miners record the verified transaction on the blockchain.
- ii.
Keyword search
In the keyword search stage, a node on the blockchain (also called the searcher P) executes the search algorithm on the trapdoor information submitted by the user to find the matching keyword ciphertext. During the whole search process, no useful information about the data files or the keywords being searched is leaked to the blockchain or the cloud server. The user constructs a transaction that contains the trapdoor information. The nodes on the blockchain run the search on the contents of this transaction, embed the search result into a new transaction, sign it, and broadcast it to the whole blockchain network, receiving the reward attached to the user’s transaction at the same time. If the result transaction does not appear on the blockchain, the user can construct a new transaction to recover the reward committed in the previous transaction.
The nodes on the blockchain verify whether the search equation holds. If it holds, the search is successful, indicating that the user’s attribute set satisfies the access tree embedded in the keyword ciphertext and that the keywords w and ω are consistent; in this case, the storage address of the keyword ciphertext recorded on the blockchain is returned to the data owner. If the equation does not hold, the search fails. There are two situations in which the search fails: either the user’s attribute set does not satisfy the access tree embedded in the keyword ciphertext, that is, the user does not have the search authority for the keyword, and the algorithm terminates; or the user has the search authority for the keyword, but the search finds that w and ω are not the same. The algorithm visits the nodes of the access tree recursively and runs as follows:
If node is a leaf node, let , that is, represents the attribute associated with the leaf node .
If node is a non-leaf node, for all child nodes of node , the result after executing the algorithm is denoted as , and all values of are reserved in the set .
If , it means that the attribute set of the child node of node does not meet the threshold value of this node; then terminate and output .
If , it means that the attribute set of the child node of node satisfies the threshold value of this node; then randomly select values of from the set , and calculate the value in combination with the Lagrange coefficient according to Equation (9):
where
,
represent the Lagrange coefficient.
- 2.
If the user’s attribute set satisfies the access tree , the execution result of the algorithm is expressed as =.
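The secret sharing performed during encryption and the recursive node algorithm above can be illustrated together. The sketch makes several assumptions that should be read as such: it works directly in a prime field instead of in the exponent of a pairing group, it uses the common convention that a node’s polynomial has degree one less than its threshold, it assumes every attribute appears on at most one leaf, and the modulus, the policy, and all names are chosen only for illustration.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

P = 2**127 - 1  # prime modulus of the toy field (assumption)


@dataclass
class Node:
    threshold: int = 1
    children: List["Node"] = field(default_factory=list)
    attribute: Optional[str] = None  # set on leaf nodes only


def share(node: Node, value: int, out: Dict[str, int]) -> None:
    """Push `value` down the tree: q_x(0) = value, deg(q_x) = threshold - 1."""
    if node.attribute is not None:
        out[node.attribute] = value
        return
    coeffs = [value] + [secrets.randbelow(P) for _ in range(node.threshold - 1)]
    for index, child in enumerate(node.children, start=1):
        # The child's secret is the parent's polynomial evaluated at its index.
        q_at_index = sum(c * pow(index, k, P) for k, c in enumerate(coeffs)) % P
        share(child, q_at_index, out)


def lagrange_at_zero(i: int, indices: List[int]) -> int:
    """Lagrange coefficient for index i over `indices`, evaluated at x = 0."""
    num, den = 1, 1
    for j in indices:
        if j != i:
            num = num * (-j) % P
            den = den * (i - j) % P
    return num * pow(den, -1, P) % P


def recover(node: Node, shares: Dict[str, int], attrs: Set[str]) -> Optional[int]:
    """Return the node's secret, or None if `attrs` does not satisfy the node."""
    if node.attribute is not None:
        return shares[node.attribute] if node.attribute in attrs else None
    results: Dict[int, int] = {}
    for index, child in enumerate(node.children, start=1):
        value = recover(child, shares, attrs)
        if value is not None:
            results[index] = value
    if len(results) < node.threshold:
        return None  # fewer satisfied children than the threshold
    chosen = list(results)[: node.threshold]
    return sum(results[i] * lagrange_at_zero(i, chosen) for i in chosen) % P


# Illustrative policy: (doctor AND cardiology) OR admin.
policy = Node(threshold=1, children=[
    Node(threshold=2, children=[Node(attribute="doctor"),
                                Node(attribute="cardiology")]),
    Node(attribute="admin"),
])
secret = secrets.randbelow(P)
leaf_shares: Dict[str, int] = {}
share(policy, secret, leaf_shares)

print(recover(policy, leaf_shares, {"doctor", "cardiology"}) == secret)  # True
print(recover(policy, leaf_shares, {"doctor"}))                          # None
```

In the scheme itself the same interpolation is carried out on group elements, so the searcher learns only whether the final equation holds and never the shared secret.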
- iii.
Proof of Correctness
Calculate
using Equation (10) and
using Equation (11).
After obtaining the storage address of the keyword ciphertext, the data owner uses the index relationship between the address of the keyword ciphertext and the address of the data file ciphertext to find the address of the corresponding encrypted data file and returns it to the cloud server. The cloud server then uses this address to find the corresponding encrypted data file and returns the encrypted data file to the user.