Symmetry
  • Technical Note
  • Open Access

19 March 2015

Study on User Authority Management for Safe Data Protection in Cloud Computing Environments

Su-Hyun Kim and Im-Yeong Lee
Department of Computer Software Engineering, Soonchunhyang University, 646, Eupnae-ri, Sinchang-myeon, Asan-si, Chungcheongnam-do 336-745, Korea
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Applied Cryptography and Security Concerns based on Symmetry for the Future Cyber World

Abstract

In cloud computing environments, user data are encrypted and stored across numerous distributed servers. Global Internet service companies, such as Google and Yahoo, recognized the importance of Internet service platforms and carried out in-house research and development to build large cluster-based cloud computing platforms from low-priced commercial nodes. As diverse data services become possible in distributed computing environments, the distributed management of high-capacity data is emerging as a major issue. At the same time, because high-capacity data are used in diverse ways, security vulnerabilities and privacy invasion by malicious attackers or internal users can occur. In particular, when sensitive data are stored on and used from cloud servers, the data may be leaked through external attacks or poor management by internal users. Managing data through encryption can prevent such problems. However, existing simple encryption methods make it difficult to manage access to data stored in cloud environments. Therefore, this paper proposes a technique for managing data access by user authority, based on Attribute-Based Encryption (ABE) and secret distribution techniques.

1. Introduction

Recently, as interest in data has increased at home and abroad, many related studies have been conducted. With the growth of IT, many firms are interested in data that can be extended to diverse areas and in the efficient use of computing power. Numerous Internet service companies have recognized the importance of Internet service platforms and have conducted in-house research and development to build large cluster-based cloud computing technologies from low-priced commercial nodes [1]. In such cloud computing environments, user data are stored and maintained across numerous distributed servers, and the distributed management of users' high-capacity data is emerging as a major issue. However, although high-capacity data are used in diverse ways, such as storage and saving, there are no appropriate countermeasures against security breaches and data loss caused by malicious attackers or internal users. Because of issues such as management costs, users and firms increasingly keep important data in external locations that they cannot control. In this case, the cloud service provider itself becomes a source of risks such as information exposure, information manipulation, and information loss.

As the most basic method of solving such problems, stored data can be managed through encryption. However, existing simple encryption methods make it difficult to manage access to data stored in cloud environments. That is, users want several users to be able to access data stored in cloud servers, or require functions such as access control by user class. Existing public-key or symmetric-key techniques cannot solve the resulting key management problems or satisfy requirements such as access control.

To solve problems in existing encryption techniques, encryption methods suitable for distributed storage servers have recently been studied actively. In particular, Sahai and Waters proposed the concept of Attribute-Based Encryption (ABE) as an extension of ID-based encryption (IBE) [2]. IBE, which is the basis of ABE, was first proposed by Shamir in 1984 as a way to solve the certificate problem in public-key cryptography [3]. ABE builds public keys from user attributes instead of user IDs, and a user can hold multiple attributes. For instance, data can be encrypted so that a user can decrypt them only when the user's department is the department of computers and the user's title is professor.
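The decryption condition in the department/title example can be sketched as a plain attribute check. This is only an illustration of the access rule and involves no cryptography; the function name `satisfies` and the dictionary layout are assumptions made for this sketch, not part of any ABE scheme.

```python
from typing import Dict


def satisfies(policy: Dict[str, str], user_attrs: Dict[str, str]) -> bool:
    """Return True when every attribute required by the policy matches the user's."""
    return all(user_attrs.get(name) == value for name, value in policy.items())


# Hypothetical policy: department must be "computer" AND title must be "professor".
policy = {"department": "computer", "title": "professor"}
alice = {"department": "computer", "title": "professor"}
bob = {"department": "computer", "title": "student"}

print(satisfies(policy, alice))  # True
print(satisfies(policy, bob))    # False
```

In an actual ABE scheme this condition is enforced by the mathematics of the keys and ciphertexts rather than by a check that a server could skip.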

In this paper, the existing ABE technique is combined with the secret distribution technique to propose a new data access management technique. Although the existing ABE technique lets multiple users access data through diverse attributes, the attribute-based technique alone cannot provide the detailed access management required in cloud computing environments where users who hold the same attributes must be classified further. Therefore, this paper proposes a technique in which, among all users who satisfy the attributes, only those whose authorization is not lower than a given threshold can finally access the data through the secret distribution technique.

This paper is organized as follows. In Section 2, existing studies are introduced to help understand the technique proposed in this paper; in Section 3, the basic security requirements that cloud computing environments should satisfy are examined; and in Section 4, the proposed method is explained. In Section 5, the safety of the proposed method is analyzed, and finally, Section 6 concludes the paper and gives future study directions.

3. Security Requirements

One of the core issues in handling data is the efficient management of high-capacity data. Because the diverse ways in which high-capacity data are used can lead to security vulnerabilities and privacy invasion by malicious attackers or internal users, the following security requirements are necessary [12].

  • Confidentiality: data communicated between data storage servers and client terminals should be readable only by legitimate entities. That is, unauthorized users should not be capable of obtaining such data.

  • Authentication: data storage servers should be capable of verifying whether users are legitimate entities, and allow only legitimate users to access data.

  • Availability: authentication and confidentiality should remain guaranteed even while high-capacity data are being transmitted.

  • Calculation efficiency: to reduce the overhead on client terminals and cloud servers during frequent data transmissions, only the minimum necessary calculations should be performed.

  • Collusion resistance: even if many users collude, they should be able to obtain only the data permitted to each individual user.

  • Forward secrecy: users for whom certain attributes have been withdrawn should not be capable of obtaining the data required by the relevant attributes.

  • Backward secrecy: users who have newly obtained attributes should not be able to use their current authority to obtain data that were available before they held those attributes.

4. Proposed Method

4.1. System Model and Assumption

In general, cloud computing environments are designed on the basis of Apache's Hadoop Distributed File System (HDFS). The basic building block of our proposed method is ciphertext-policy ABE (CP-ABE). A user in a group who satisfies certain attributes obtains the authority to request the decryption key. The user can then decrypt data according to his/her authority class, subject to the agreement of the manager group. Through class management by subdivided authority, the proposed method can therefore guarantee resistance to collusion, which is a problem of the existing ABE, in addition to data confidentiality (Figure 2).

Figure 2. Attribute-Based Encryption (ABE) and secret sharing.
Notation:

  • $n$: number of participants

  • $P$: set of participants $P_i$ $(1 \le i \le n)$ in a secret distribution

  • $q$: prime number

  • $k \in Z_q$: secret information

  • $K$: set of secret information $k$

  • $s_{P_i} \in Z_q$: pieces of secret

  • $S_{P_i}$: set of secret pieces $s_{P_i}$ possessed by individual participants $P_i$

  • $i$: attribute value

  • $L$: attribute set

  • $a_i, \hat{a}_i, a_i^* \in Z_p^*$: values that correspond to attribute $i$

  • $W = [W_1, W_2, \ldots, W_n]$: access structure

  • $M$: plain text

  • $r \in Z_p^*$: random value

4.2. User Authority Management Scheme

4.2.1. Setup

Enter security parameter k to output the public key PK and master key MK that correspond to the value of the parameter.

  • $\mathbb{G} = [p, G, G_T, g \in G, e]$.

  • Create a random value $w \in Z_p^*$.

  • Select random values $a_i, \hat{a}_i, a_i^* \in Z_p^*$ that correspond to attribute $i$ $(1 \le i \le n)$.

  • Calculate $Y = e(g,g)^w$ and $A_i = g^{a_i}$, $\hat{A}_i = g^{\hat{a}_i}$, $A_i^* = g^{a_i^*}$.

  • PK is $\langle Y, p, G, G_T, g, r, (A_i, \hat{A}_i, A_i^*)_{1 \le i \le n} \rangle$ and MK is $\langle w, (a_i, \hat{a}_i, a_i^*)_{1 \le i \le n} \rangle$.

4.2.2. KeyGen

This is an algorithm that takes master key MK and attribute set $L$ as input and outputs the secret key $SK_L$ that corresponds to the access structure.

  • Enter attribute set $L = [L_1, L_2, \ldots, L_n]$ to create a secret key.

  • Select $s_i \in Z_p^*$ randomly and calculate $s = \sum_{i=1}^{n} s_i$, $D_0 = g^{w-s}$.

  • If $L_i = 1$, calculate $[D_i, D_i^*] = [g^{s_i/\hat{a}_i}, g^{s_i/a_i^*}]$.

  • The secret key is $SK_L = \langle D_0, (D_i, D_i^*)_{1 \le i \le n} \rangle$.

4.2.3. Secret Sharing

The created secret key $SK_L$ is distributed to the individual members of the manager group using the secret distribution technique. Shamir's secret distribution method is applied in the proposed method [13]. In this scenario, the number of members of the manager group is assumed to be three.

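  • Choose a prime $q$ $(q \ge n + 1)$.

  • Choose $a_1, a_2, \ldots, a_{t-1}$ $(a_i \in Z_q,\ 1 \le i \le t-1)$.

  • With the secret key as $k = a_0$, form the $(t-1)$-degree polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1} \pmod{q}$.

  • Calculate the secret pieces $s_{P_j} = f(x_j)$ $(1 \le j \le n)$.

  • Distribute the pieces to the manager group: $KMG_1(x_1, s_{P_1})$, $KMG_2(x_2, s_{P_2})$, $KMG_3(x_3, s_{P_3})$.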
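The share generation step can be sketched as follows. This is a minimal illustration under stated assumptions: the secret is modeled as a single integer in $Z_q$, the evaluation points are fixed to $x_j = j$, and the function name `make_shares`, the modulus `Q`, and the example secret are all chosen only for this sketch.

```python
import secrets
from typing import List, Tuple


def make_shares(secret: int, t: int, n: int, q: int) -> List[Tuple[int, int]]:
    """Split `secret` into n shares; any t of them determine f(0) = secret."""
    if not (0 <= secret < q and 1 <= t <= n < q):
        raise ValueError("need 0 <= secret < q and 1 <= t <= n < q")
    # a_0 is the secret; a_1 .. a_{t-1} are uniformly random in Z_q.
    coeffs = [secret] + [secrets.randbelow(q) for _ in range(t - 1)]

    def f(x: int) -> int:
        # Horner evaluation of a_0 + a_1 x + ... + a_{t-1} x^{t-1} (mod q).
        acc = 0
        for a in reversed(coeffs):
            acc = (acc * x + a) % q
        return acc

    # Participant j receives the point (x_j, s_{P_j}) with x_j = j (an assumption).
    return [(j, f(j)) for j in range(1, n + 1)]


Q = 2**127 - 1  # a Mersenne prime, used here only as an illustrative modulus
shares = make_shares(secret=123456789, t=2, n=3, q=Q)
for name, share in zip(("KMG1", "KMG2", "KMG3"), shares):
    print(name, share)
```

With $t = 2$ and $n = 3$ this corresponds to a (2,3) threshold among the three manager group members assumed in the scenario.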

4.2.4. Encrypt

This is an algorithm to enter public key PK, access structure W, and plain text M in order to output ciphertext CT that corresponds to the plain text.

  • Encrypt plain text $M$ under access structure $W = [W_1, W_2, \ldots, W_n]$.

  • Select a random value $r \in Z_p^*$ and calculate $\tilde{C} = M \cdot Y^r$, $C_0 = g^r$.

  • Calculate $C_i$ as follows: if $W_i = 1$, $C_i = A_i^r$; if $W_i = 0$, $C_i = \hat{A}_i^r$; if $W_i = *$, $C_i = (A_i^*)^r$.

  • The ciphertext is $CT = \langle \tilde{C}, C_0, (C_i)_{1 \le i \le n} \rangle$.

4.2.5. (k,n) Threshold

This is the stage in which a user with access authority requests the manager group for the secret key. If the requester has a legitimate access class to the data, he/she can receive n secret pieces from the manager group to restore the secret key.

$$f(0) = \sum_{i=1}^{t} s_{P_{A_i}} \prod_{j=1,\, j \ne i}^{t} \frac{x_{A_j}}{x_{A_j} - x_{A_i}} \pmod{q}$$

$$SK_L = \langle D_0, (D_i, D_i^*)_{1 \le i \le n} \rangle$$
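The reconstruction of $f(0)$ from $t$ secret pieces is standard Lagrange interpolation over $Z_q$. The sketch below uses a small hand-checkable example; the function name `recover_secret`, the field $Z_{17}$, and the polynomial $f(x) = 5 + 3x$ are assumptions chosen for illustration.

```python
from typing import Iterable, Tuple


def recover_secret(shares: Iterable[Tuple[int, int]], q: int) -> int:
    """Evaluate the interpolating polynomial at x = 0 (mod prime q)."""
    pts = list(shares)
    if len({x for x, _ in pts}) != len(pts):
        raise ValueError("share x-coordinates must be distinct")
    total = 0
    for i, (x_i, s_i) in enumerate(pts):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(pts):
            if j != i:
                num = (num * x_j) % q
                den = (den * (x_j - x_i)) % q
        # pow(den, -1, q) is the modular inverse (Python 3.8+).
        total = (total + s_i * num * pow(den, -1, q)) % q
    return total


# Worked example over Z_17 with f(x) = 5 + 3x, i.e. secret k = 5 and t = 2.
q = 17
shares = [(1, 8), (2, 11), (3, 14)]
print(recover_secret(shares[:2], q))              # 5
print(recover_secret([shares[0], shares[2]], q))  # 5
```

Any two of the three pieces give the same value, which is what allows the requester to restore the key from whichever manager group members respond.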

4.2.6. Decrypt

This is an algorithm to enter the restored secret key SKL and ciphertext CT in order to output the plain text that corresponds to the ciphertext (Figure 3).

Figure 3. Data decryption.

Decrypt ciphertext $CT = \langle \tilde{C}, C_0, (C_i)_{1 \le i \le n} \rangle$ using secret key $SK_L = \langle D_0, (D_i, D_i^*)_{1 \le i \le n} \rangle$.

For $1 \le i \le n$:

$$D_i' = \begin{cases} D_i & \text{if } W_i \ne * \\ D_i^* & \text{if } W_i = * \end{cases}$$

$$\frac{\tilde{C}}{e(C_0, D_0) \prod_{i=1}^{n} e(C_i, D_i')} = \frac{M \cdot (e(g,g)^w)^r}{e(g^r, g^{w-s}) \prod_{i=1}^{n} e((g^{a_i})^r, g^{s_i/a_i})} = \frac{M \cdot e(g,g)^{wr}}{e(g^r, g^{w-s}) \cdot e(g,g)^{sr}} = \frac{M \cdot e(g,g)^{wr}}{e(g,g)^{wr}} = M$$

4.3. Data Block Management Scheme

4.3.1. Distribution Process of the Block Access Token

When clients ask NameNode to store data, NameNode secures a DataNode for the storage and then creates a block access token for user authentication. The block access token is divided into three shares using Shamir (2, 3) secret sharing, and the shares are stored by the client, NameNode, and DataNode, respectively (Figure 4).

Figure 4. Secret distribution of block access token.

4.3.2. User Authentication Process

When data are recovered, clients receive information necessary for the block access token from NameNode (Figure 5). The details are as follows:

Figure 5. User authentication using block access token.
  • Step 1. When clients ask NameNode for data, NameNode sends the DataNode address where the data are stored, the data block ID, and its own single share to the clients.

  • Step 2. Upon receiving NameNode’s single share, the clients create a block access token using their own single share.

  • Step 3. The clients send the created access token and NameNode’s single share to the DataNode address received from NameNode.

  • Step 4. Using its own single share and NameNode’s single share received from the clients, DataNode creates a block access token and compares it with the block access token received from the clients to verify whether the clients are legitimate users.

  • Step 5. In the created block access token structure, Datablock Checksum is added when TokenAuthenticator is created, and accordingly, it is used for verifying the integrity of the stored data block (Figure 6).

    Figure 6. Improved structure of block access token.
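Steps 1 to 4 can be sketched end to end as follows. This is a minimal illustration, assuming the token is an integer in $Z_q$, that the client, NameNode, and DataNode hold the shares at $x = 1, 2, 3$, and that the client sends a SHA-256 digest of the token rather than the token itself. The helper names, the modulus `Q`, and the message layout are hypothetical, and the Datablock Checksum of Step 5 is omitted.

```python
import hashlib
import hmac
import secrets

Q = 2**127 - 1  # prime modulus (illustrative choice)


def split_2_of_3(token):
    """Shamir (2, 3): f(x) = token + a1*x mod Q, with shares at x = 1, 2, 3."""
    a1 = secrets.randbelow(Q)
    return [(x, (token + a1 * x) % Q) for x in (1, 2, 3)]


def combine(share_a, share_b):
    """Recover f(0) from two distinct shares by Lagrange interpolation."""
    (xa, ya), (xb, yb) = share_a, share_b
    inv = pow((xb - xa) % Q, -1, Q)
    return (ya * xb - yb * xa) * inv % Q


def digest(token):
    """Hash the token so that it never travels in the clear."""
    return hashlib.sha256(token.to_bytes(16, "big")).digest()


def datanode_verify(message, own_share):
    """Step 4: rebuild the token from two shares and compare the digests."""
    received_digest, namenode_share = message
    expected = combine(own_share, namenode_share)
    return hmac.compare_digest(received_digest, digest(expected))


# Distribution: NameNode creates the token and hands out one share each.
token = secrets.randbelow(Q)
client_share, namenode_share, datanode_share = split_2_of_3(token)

# Steps 1-2: the client receives NameNode's share and rebuilds the token.
client_token = combine(client_share, namenode_share)
# Step 3: the client sends the hashed token together with NameNode's share.
message = (digest(client_token), namenode_share)

print(datanode_verify(message, datanode_share))  # True

# A party without a valid share cannot produce the matching digest.
forged = (digest((token + 1) % Q), namenode_share)
print(datanode_verify(forged, datanode_share))   # False
```

Because any two of the three shares determine the token, the client and DataNode arrive at the same value without ever exchanging a long-lived secret key.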

5. Analysis of Proposed Method

5.1. User Authority Management Scheme

5.1.1. Resistance to Collusion

The existing ABE has the limitation that two or more users can pool their attributes to access data for which none of them is individually authorized. For instance, user A with attributes (A,B) and user B with attributes (B,C) can together access a ciphertext whose access structure is (A,C). However, the proposed method can fundamentally block illegal data access through collusion because the manager group directly manages user classes through secret distribution. That is, by splitting the secret key into secret pieces such as $KMG_1(x_1, s_{P_1})$, $KMG_2(x_2, s_{P_2})$, and $KMG_3(x_3, s_{P_3})$, the user classes can be set to a (1,3) threshold, a (2,3) threshold, or another threshold according to system policy.
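The effect of the threshold can be illustrated with a small check: under a (2,3) threshold, a single secret piece is consistent with every possible secret, so a party below the threshold learns nothing about the key. The sketch assumes the toy field $Z_{17}$ and the polynomial $f(x) = 5 + 3x$; the name `slope_for` is hypothetical.

```python
q = 17
share = (1, 8)  # one piece of f(x) = 5 + 3x over Z_17, as seen by one party


def slope_for(candidate, share, q):
    """Coefficient a_1 for which f(0) = candidate and f(x) = y at the known share."""
    x, y = share
    return (y - candidate) * pow(x, -1, q) % q


# For each candidate secret k, a line through (0, k) and the known share exists.
consistent = [k for k in range(q)
              if (k + slope_for(k, share, q) * share[0]) % q == share[1]]
print(len(consistent))  # 17
```

All 17 candidates remain possible, so the piece alone does not narrow down the secret; only a group that reaches the policy's threshold can restore it.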

5.1.2. Data Availability

All of the compared methods provide availability for data stored in distributed servers. HDFS stores at least three replicas of each data block to cope with single points of failure. In addition, the system increases data availability by frequently checking the data blocks stored in data nodes for errors or loss.

5.1.3. Forward Secrecy

Forward secrecy ensures that when certain attributes of a user have been withdrawn, that user can no longer obtain the data that require those attributes. The proposed method provides forward secrecy because the individual pieces $x$ in the expression $f(0) = \sum_{i=1}^{t} s_{P_{A_i}} \prod_{j=1,\, j \ne i}^{t} \frac{x_{A_j}}{x_{A_j} - x_{A_i}} \pmod{q}$, which is calculated from the secret pieces gathered through secret distribution, are created in consideration of the access structures of attributes. That is, when the attributes of encrypted data change, illegal data access is prevented unless the user attributes also change.

5.1.4. Reliability by Secret Distribution Technique

The method proposed in this study is based on conventional CP-ABE combined with the secret distribution technique. Using the secret distribution technique made it possible to overcome the shortcoming that the computational burden grows as the size of the attribute set $L = [L_1, L_2, \ldots, L_n]$ increases. The secret key must, however, be split and shared each time data are encrypted. This distribution process is based on a polynomial of the form $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1} \pmod{q}$, and its cost is much lower than the computational burden generated by an increase in attributes.

5.2. Data Block Management Scheme

5.2.1. Exposure of the Block Access Token Information

In the typical Hadoop structure, the secret key shared between NameNode and DataNode is frequently renewed, and upon request from users, the block access token is encrypted with the appropriate secret key and transferred. The hash chain technique adds a one-time token to the block access token. Because the data are sent without a separate encryption process, blockID, userID, and keyID are exposed and may be used by attackers. The other typical methods still use the typical Hadoop environment, and accordingly, the problems of Hadoop remain.

In this proposal, the block access token created while single shares are exchanged is transferred only after a hash function has been applied to it. Only a legitimate user or DataNode can create the block access token, apply the hash function, and thereby prove legitimacy.

5.2.2. Man-in-the-Middle (MITM) Attack

In the typical methods, when the secret key between NameNode and DataNode is renewed, attackers can intercept the key and temporarily create a block access token. Both the hash chain technique and this proposal are designed to prevent such Man-in-the-Middle (MITM) threats. As the other typical methods use the typical Hadoop environment, the Hadoop problems remain.

5.2.3. Operation Amount

In the typical Hadoop system, the secret key is frequently renewed, and each time a block access token is created, an operation based on a symmetric-key algorithm is performed. Various symmetric-key algorithms, such as the Advanced Encryption Standard (AES) and the Data Encryption Standard (DES), can be used. In the hash chain technique, each data block requires n hash operations on random numbers. In this proposal, however, once the initial single shares have been created and distributed, the block access token is hashed and transferred together with the single share only once. Therefore, this method is efficient.

5.2.4. Prevention of False Attacks

In the typical methods [10], user access control is conducted using an Access Control List (ACL), but the Access Control Module is not sufficient to prevent data access by false users. In this proposal, the block access token needed for data access can be created only by the legitimate parties, and as such, false attacks can be prevented.

5.2.5. Data Integrity Test

In the typical Hadoop system and the method in [14], the data block integrity and the data storage server’s errors are frequently tested through the Heartbeat message. The Heartbeat message is frequently transferred by DataNode to NameNode. In the method of [10], the data block integrity is also frequently confirmed through checksum, but the operation amount increases accordingly. In the method in [11], the effectiveness is confirmed using the index information created in the object metatable. In this proposal, verification can be conducted through single share, which is necessary for the creation of the block access token, and no frequent confirmation through the Heartbeat message is necessary. Therefore, this method is efficient.

5.2.6. Data Availability

All the methods have the availability of the data stored in the distributed server. In the Hadoop distributed storage system, three or more data blocks are copied and stored to cope with the single-site errors. In addition, error and loss tests on the data block stored in DataNode are frequently conducted to enhance data availability.

6. Conclusions

In this paper, to solve the problem that sensitive data stored in cloud servers can be leaked because of external attackers or poor management by internal users, a technique for data access management by user authority was proposed based on ABE and the secret distribution technique. To solve the problem that user data access authorities cannot be managed with existing simple encryption methods, the ABE method, which encrypts data based on diverse user attributes, was applied, and to control access by user class among users with legitimate attributes, access control by class through secret distribution was proposed. The problems of managing the secret key shared by DataNode and NameNode, of NameNode management, and of MITM risks were also addressed. The method is more efficient than the typical methods and can reduce the operation overhead concentrated on NameNode.

Sensitive information stored in cloud servers can be managed safely by allowing fine-grained user access control in cloud environments. The proposed protocol has a structure that can safely and efficiently control access to diverse high-capacity data, including highly confidential personal user information, and it is expected to be used efficiently in cloud computing environments. However, because it provides more functions, the proposed method adds calculations on polynomial expressions compared with the existing ABE method. In the future, more efficient and safer methods should be studied based on the proposed method.

Acknowledgments

This research was supported by Ministry of Science, ICT & Future Planning (MSIP), Korea, under the Information Technology Research Center (ITRC) support program (NIPA-2014-H0301-14-1010) supervised by National IT Industry Promotion Agency (NIPA).

This work was supported by the Soonchunhyang University Research Fund.

Author Contributions

Su-Hyun Kim and Im-Yeong Lee both designed research and wrote the paper. Both authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

References

  1. O’Malley, O.; Zhang, K.; Radia, S.; Marti, R.; Harrell, C. Hadoop Security Design; Yahoo Inc.: Santa Clara, CA, USA, 2009.
  2. Sahai, A.; Waters, B. Fuzzy Identity-based Encryption. In Advances in Cryptology – EUROCRYPT 2005; Springer Berlin Heidelberg: Heidelberg, Germany, 2005; pp. 457–473.
  3. Shamir, A. Identity-based Cryptosystems and Signature Schemes. In Advances in Cryptology; Springer Berlin Heidelberg: Heidelberg, Germany, 1985; pp. 47–53.
  4. Ghemawat, S.; Gobioff, H.; Leung, S.-T. The Google File System. Proceedings of the Nineteenth ACM symposium on Operating Systems Principles, Bolton Landing, NY, USA, 19–22 October 2003; pp. 29–43.
  5. Hadoop Wiki. Available online: http://wiki.apache.org/hadoop/PoweredBy (accessed on 10 March 2015).
  6. Apache Hadoop. Available online: http://hadoop.apache.org/ (accessed on 10 March 2015).
  7. Kar, J.; Majhi, B. A Novel Deniable Authentication Protocol based on Diffie-Hellman Algorithm Using Pairing Technique. Proceedings of the 2011 International Conference on Communication, Odisha, India, 12–14 February 2011; pp. 493–498.
  8. Yu, S.; Wang, C.; Ren, K.; Lou, W. Achieving Secure, Scalable, and Fine-Grained Data Access Control in Cloud Computing. Proceedings of 2010 IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–9.
  9. Zhu, S.; Yang, X.; Wu, X. Secure Cloud File System with Attribute Based Encryption. Proceedings of 2013 5th International Conference on Intelligent Networking and Collaborative Systems (INCoS), Xi’an, China, 9–11 September 2013; pp. 99–102.
  10. Jin, S.; Yang, S.; Zhu, X.; Yin, H. Design of a Trusted File System Based on Hadoop. In Trustworthy Computing and Services; Springer Berlin Heidelberg: Heidelberg, Germany, 2013; pp. 673–680.
  11. Zhang, D.-W.; Sun, F.-Q.; Cheng, X.; Liu, C. Research on Hadoop-based Enterprise File Cloud Storage System. Proceedings of 2011 3rd International Conference on Awareness Science and Technology (iCAST), Dalian, China, 27–30 September 2011; pp. 434–437.
  12. Hubbard, D.; Sutton, M. Top Threats to Cloud Computing V1.0. Available online: https://cloudsecurityalliance.org/topthreats/csathreats.v1.0.pdf (accessed on 19 March 2015).
  13. Shamir, A. How to Share a Secret. Commun. ACM 1979, 22, 612–613.
  14. Park, S.-J.; Kim, H. Improving Hadoop Security Through Hash-chain. J. Korean Inst. Inf. Technol. 2012, 6, 56–73.
  15. Cheung, L.; Newport, C. Provably Secure Ciphertext Policy ABE. Proceedings of 14th ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 29 October–2 November 2007; pp. 456–465.
  16. Goyal, V.; Pandey, O.; Sahai, A.; Waters, B. Attribute-based Encryption for Fine-Grained Access Control of Encrypted Data. Proceedings of the 13th ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 30 October–3 November 2006; pp. 89–98.
  17. Bethencourt, J.; Sahai, A.; Waters, B. Ciphertext-Policy Attribute-based Encryption. Proceedings of IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 20–23 May 2007; pp. 321–334.
