Electronics
  • Article
  • Open Access

3 January 2022

Big Data Handling Approach for Unauthorized Cloud Computing Access

1 Department of Cybersecurity, International Information Technology University, Almaty 050000, Kazakhstan
2 Sensor Networks and Cellular Systems Research Center, University of Tabuk, Tabuk 71491, Saudi Arabia
3 Department of Information Technology, University of Tabuk, Tabuk 71491, Saudi Arabia
4 Department of Computer Science, Shaqra University, Shaqra 11961, Saudi Arabia
This article belongs to the Special Issue Big Data Privacy-Preservation

Abstract

Nowadays, cloud computing is one of the important and rapidly growing services; its capabilities and applications have been extended to various areas of life. Cloud computing systems face many security issues, such as scalability, integrity, confidentiality, unauthorized access, etc. An illegitimate intruder may gain access to a sensitive cloud computing system and use the data for inappropriate purposes, which may lead to losses in business or system damage. This paper proposes a hybrid unauthorized data handling (HUDH) scheme for big data in cloud computing. The HUDH scheme aims to restrict illegitimate users from accessing the cloud and to provide data security provisions. The proposed HUDH consists of three steps: data encryption, data access, and intrusion detection. The HUDH scheme involves three algorithms: advanced encryption standards (AES) for encryption, attribute-based access control (ABAC) for data access control, and hybrid intrusion detection (HID) for unauthorized access detection. The proposed scheme is implemented using the Python and Java languages. The testing results demonstrated that the HUDH scheme can delegate computation overhead to powerful cloud servers. User confidentiality, access privilege, and user secret key accountability can be attained with more than 97% accuracy.

1. Introduction

Cloud computing is a service that is currently popular [1]. Cloud computing companies provide almost all possible methods to process data: storing, changing, sharing with others, and eventually deleting [2,3]. However, the most attractive feature that distinguishes cloud computing from other traditional storage systems is its convenience; data can be obtained anywhere and anytime, instantly, with a connection to the internet [4]. People are looking for convenient, fast, and inexpensive systems to facilitate their tasks, which has become a factor in creation of the cloud systems, which respond to these needs but also face serious problems related to the security of data provided by users [5].
Depending on the services needed by organizations and individuals, cloud computing can be characterized by three existing models: software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) [6,7,8]. SaaS includes software applications that are provided to customers for use in the cloud without any installation on their desktops, laptops, and so on. Today, various SaaS platforms around the world can be used [9]. Well-known examples of SaaS platforms are Amazon’s EC2 [10], Amazon’s S3 [11], IBM’s Blue Cloud [12], Google App Engine [13], Yahoo Pig, Google Apps [14], Dropbox [15], and Salesforce’s Customer Relation Management (CRM) system [16]. PaaS provides a platform for developing software applications provided by cloud computing, e.g., Amazon Web Services [10] and Windows Azure [17]. IaaS comprises infrastructure, such as servers, operating systems, networks, and so on, that is provided to users through virtualization. Virtualization is the principal enabling core of cloud computing; it uses software to split one computer device into multiple independent computing devices, each of which can be used to perform computing tasks. This helps to efficiently allocate and use the usually idle computing resources, reduces cost, and reliably increases infrastructure use. Among IaaS systems are DigitalOcean, Linode, Rackspace, Microsoft Azure, GCE, and so forth [18,19,20]. These systems significantly increase work efficiency in organizations at a relatively low price.
These organizations provide services in a pay-as-you-use manner at a relatively low price. For companies, it is easier to use cloud systems, because users can save the investments that would have been used for building their infrastructure [21].
In addition to these benefits, cloud computing has other advantages, such as increased efficiency, portability, scalability, and flexibility [22].
Cloud computing provides computing infrastructure resources over a network for organizations to use [23]. Despite its many benefits, cloud computing faces several challenges related to the security of access control and privilege management [24]. Potential threats can be addressed using machine learning algorithms [25].
Scalability, integrity, and data access are examples of the security and privacy challenges that face every cloud computing user. Data access is a significant aspect of the service of the cloud, without which the platform would not be able to operate or be so popular among users. Therefore, unauthorized access in a cloud system is one of the important problems that must be solved or prevented so that every user is able to trust the provider with their sensitive data [26].
Thus, the security and privacy of cloud data are important, as people and organizations are concerned that their data might fall into the hands of third parties who can use them for their own purposes [27]. For example, cloud data need to be secured within a trusted domain, e.g., data owners, industry and large organization data, and federal agents. Additionally, cloud data need to be saved in a fully trusted cloud database. Unauthorized data access is one of the existing security issues experienced by cloud computing systems [28] that must be solved and continuously reviewed.
Several mechanisms have been proposed to secure data access in cloud computing. These works have considered various problems in cloud computing and tried to achieve fine-grained, scalable, and secure data access control. For example, Yu et al. [18] combined three cryptographic techniques: KP-ABE, PRE, and lazy re-encryption. Each of them solves specific issues, and together they form a complete fine-grained data access system. However, the system cannot properly handle multiple levels of attribute authorities. To solve this problem, Wan et al. [19] introduced a new approach that extended Yu et al.’s [18] solution. To achieve flexible and scalable data access control in cloud computing, Wan et al. [19] implemented a hierarchical attribute-set-based encryption (HASBE) algorithm. Wang et al. [20] combined two existing algorithms, hierarchical identity-based encryption (HIBE) and cipher-text policy attribute-based encryption (CP-ABE). After this, Wang et al. [20] investigated performance trade-offs and then used proxy and lazy re-encryption algorithms on the given output. Other researchers [29,30,31,32,33] proposed other solutions for secure access control, which are described in detail in Section 3. However, the proposed methods have their own disadvantages and shortcomings. For example, these methods are susceptible to several types of cyber threats. Some of them use public key encryption, which is slow compared to symmetric encryption and has many potential certification issues since it depends on a third party. Unlike the existing methods, we constructed a method called HUDH. The HUDH scheme integrates three state-of-the-art algorithms for data encryption, data access, and intrusion detection. Together, these algorithms restrict unauthorized access in cloud computing systems. The proposed HUDH scheme is practically applicable because the algorithms in the approach are already standardized and are used separately in business and market-oriented applications.

2. Problem Statement

Security is an essential issue for any computing environment [34], as the data, hardware, and software should be protected from unauthorized access. A cloud service provider in cloud computing offers computational services and virtualization over the Internet. It provides critical services such as restricting unauthorized access, maintaining data integrity, and ensuring data availability. To ensure cloud security, security challenges must be addressed to take full advantage of this computing paradigm [35]. To deeply analyze the origins of unauthorized access in cloud computing systems regarding causation, we followed a proper research methodology. The existing schemes were studied in accordance with research strategies to address the problem of unauthorized access in cloud computing. Thus, we found that an efficient model of access control including encryption, data access, and intrusion detection algorithms is required to secure data in the cloud, as well as to implement measures against intrusion [36].
Cloud computing involves an enormous number of devices, applications, and parties that make designing a secure data sharing framework a difficult task to accomplish. Moreover, data on the cloud are susceptible to various threats such as losses, accidental alteration by the cloud provider, and attacks [37]. Thus, developing a complete security method for cloud storage is necessary.
To achieve our aim, we reviewed the existing methods used in cloud computing systems and investigated the use of advanced security mechanisms, such as advanced data encryption, secure data access, and accurate intrusion detection, to build a secure cloud computing model. We then used a popular dataset to test and analyze the ability of the proposed model to resist many types of attacks.

2.1. Motivation

Building security mechanisms for cloud storage is an essential task. Legitimate participants who want to share their data on the cloud want secure control and access mechanisms, as well as fast and safe sharing of data on demand. The currently available methods have several shortcomings. Thus, we need a robust cloud computing system that can provide advanced data encryption, secure data access, and accurate intrusion detection.

2.2. Paper Contributions and Organization

In this paper, a mechanism for handling unauthorized data access in cloud computing is proposed that consists of three steps: data encryption, data access, and intrusion detection [38,39,40]. To implement these steps, we used existing efficient algorithms: the Advanced Encryption Standard (AES) for encryption, the attribute-based access control (ABAC) algorithm for data access control, and the hybrid intrusion detection (HID) algorithm for unauthorized access detection. The main contributions are as follows:
  • The HUDH scheme integrates three state-of-the-art algorithms (ABAC, AES, and HIDS) to ensure data security, user authentication, and prevention of potential threats in cloud computing.
  • The HIDS algorithm combines the features of two known algorithms (random forest and neural network) for important feature selection and training the data. A higher accuracy was achieved with the integration of these two algorithms.
  • The proposed HUDH scheme was tested on the known UNSW-NB15 dataset using Class 4 and Class 6 to confirm its accuracy. The use of different classes provides a new direction for ensuring security, as higher accuracy was achieved when employing a higher class.
This paper is organized as follows: Section 3 reviews the existing schemes used to solve the unauthorized data access problem in cloud computing environments. Section 4 presents a detailed description of the proposed scheme (HUDH) involving three algorithms: AES for encryption, ABAC for data access control, and HID for unauthorized access detection. Section 5 shows the implementation of the method and presents the results in detail. Section 6 discusses the results. Section 7 concludes the paper.

4. Proposed Hybrid Unauthorized Data Handling

In this study, we designed a mechanism that considers data access control from a different point of view. The first algorithm encrypts the data, which protects them even if an intruder gains access. The second algorithm handles the access issue and safely delegates rights to a user. The final algorithm provides fast detection of intrusions into the system, which enables quick reactions.
When the data are loaded to the cloud, the system automatically applies the AES algorithm to encrypt and store the data in the database. Each data owner has a secret key and the consumer of that data has the public key. Therefore, if the data are in the possession of the third party, it is not possible to use the data, because there is no key to decipher them. For each user, data and actions have their own attributes, which will be used by the ABAC algorithm to find relationships among them and provide access if the policy allows it. The intrusion detection system works when the session of the user starts and continues monitoring until the session ends. If the system identifies anomalous behaviors of the user, it instantly sends a notification message to the administrator.
In Algorithm 1, Step 1 shows the initialization process of the used variables. The input and output processes are shown at the beginning of the algorithm. Steps 2–7 apply the AES cryptographic algorithm to obtain encrypted data by substituting bytes, shifting rows, and mixing columns. The time complexity of the encryption algorithm is $O(m + n)$, where $n$ is the text length and $m$ is the pattern length.
Algorithm 1: Encryption algorithm.
Input: $D$, $I_k$
Output: $D_e$
 1: Initialization: {$D$: data, $D_b$: database, $D_e$: encrypted data, $A_e$: AES algorithm, $N$: number of iterations, $I_k$: initial key, $R_k$: round key, $B_r$: byte rows, $B_c$: byte columns, $B$: bytes}
 2: Do
 3:     Add $R_k$
 4:     Apply $A_e$ on $D$ by shifting $B$
 5:     Shift $B_r$
 6:     Mix $B_c$
 7: While $N \le 9$
 8: Get $D_e$
 9: Store $D_e \rightarrow D_b$
Step 8 shows the created encrypted data. Step 9 is the process of storing the encrypted data in the database.
In Algorithm 2, Step 1 shows the initialization process of the used variables. The input and output processes are shown at the beginning of the algorithm. Step 2 defines the policies in a JSON file. Step 3 sends a request to the JSON file to obtain access to the data. In Steps 4–10, if the user has the right to the data, access is granted and the data are obtained from the database and deciphered; otherwise, an error message is returned to the requester, as defined in (1):
$$a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} \tag{1}$$
Algorithm 2: Data access algorithm.
Input: $R_d$
Output: $D_d$, $M_e$
  1: Initialization: {$D_b$: database, $D_e$: encrypted data, $D_d$: deciphered data, $M_e$: message, $P_r$: policy, $J$: JSON file, $R_d$: request to data access, $K$: key for deciphering, $A_e$: AES algorithm}
  2: Set $P_r \rightarrow J$
  3: $R_d \rightarrow J$
  4: if $R_d$ is granted access by $J$ then
  5:     Get $D_e$ from $D_b$
  6:     Use $K$ and $A_e$ on $D_e$
  7:     return $D_d$
  8: else
  9:     return $M_e$
  10: end if
The encryption step uses the AES algorithm to obtain the ciphertext. The most important part of this algorithm is the key expansion. The encryption consists of multiple rounds, the number of which depends on the size of the key, and each round has a new key. The routine creates $4 \times (N_r + 1)$ words, where $N_r$ is the number of rounds. The key words of each round can be calculated as defined in (2):
$$K[n]{:}s[i] = k[n-1]{:}s[i] \oplus k[n]{:}s[i-1] \tag{2}$$
where $k$ is the key, which consists of 16 bytes, and $s$ represents each four-byte word of that key.
For $s_0$, a different equation is needed. Equation (2) gives every word of a round key other than $s_0$, whereas $s_0$ is derived from the previous round key as follows, where $Rcon[i]$ is the round constant for round $i$ of the key expansion:
$$K[n]{:}s_0 = k[n-1]{:}s_0 \oplus \mathrm{SubByte}(k[n-1]{:}s_3 \ggg 8) \oplus Rcon[i] \tag{3}$$
The value $I$ is determined by the following nonlinear equation, where $n$ is a constant equal to 0x63, $A_{rr}$ is the initial bytes, and $x_r$ is the number of rounds required to find the inverse:
$$I = A_{rr} + x_r + n \tag{4}$$
To calculate the number of rounds $n_r$, we use the following equation, as defined in (5):
$$n_r = \frac{S_k}{S_b} + 6 \tag{5}$$
where $S_k$ is the key size and $S_b$ is the block size.
To implement byte substitution, Equation (6) is used:
$$b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i \tag{6}$$
where $b_i$ is the $i$th bit of the given byte and $c_i$ is the $i$th bit of a constant byte $c$ with a specific value. The time complexity of the data access algorithm is $O(\log n)$ in the best case and $O(n)$ in the worst case.
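The byte substitution and the key expansion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a 128-bit key, follows the standard AES-128 key schedule, takes the constant byte to be 0x63, and all function names (`gf_mul`, `gf_inverse`, `affine_transform`, `sub_byte`, `expand_key`) are hypothetical.

```python
AFFINE_CONSTANT = 0x63  # assumed value of the constant byte c in Equation (6)


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a):
    """Multiplicative inverse in GF(2^8); 0 maps to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):  # a^254 equals a^-1 because a^255 == 1
        result = gf_mul(result, a)
    return result


def affine_transform(byte):
    """Apply Equation (6) bit by bit to one byte."""
    out = 0
    for i in range(8):
        bit = (byte >> i) & 1
        for offset in (4, 5, 6, 7):
            bit ^= (byte >> ((i + offset) % 8)) & 1
        bit ^= (AFFINE_CONSTANT >> i) & 1
        out |= bit << i
    return out


def sub_byte(byte):
    """Byte substitution: field inverse followed by the affine transform."""
    return affine_transform(gf_inverse(byte))


def expand_key(key):
    """Expand a 16-byte key into 4 * (N_r + 1) = 44 four-byte words."""
    if len(key) != 16:
        raise ValueError("this sketch only handles 128-bit keys")
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    round_constant = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:                            # first word of a round key
            temp = temp[1:] + temp[:1]            # rotate the word by 8 bits
            temp = [sub_byte(b) for b in temp]    # substitute each byte
            temp[0] ^= round_constant             # XOR with the round constant
            round_constant = gf_mul(round_constant, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words
```

Calling `expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))` returns 44 words, that is, one four-word round key for the initial key addition and one for each of the 10 rounds.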
In Algorithm 3, Step 1 involves the initialization process of the used variables. The input and output processes are shown at the beginning of the algorithm. Step 2 involves the process of training our models using a dataset and previous log files.
Algorithm 3: Intrusion detection algorithm.
Input: $D_s$, $L_o$
Output: $M_e$
  1: Initialization: {$D_s$: dataset, $D$: data, $A$: access, $M_e$: message, $L_o$: data in the log, $U$: user, $P_a$: patterns, $S$: active status, $M_{sd}$: supervised model, $M_{ud}$: unsupervised model, $N_l$: new log}
  2: Train $M_{sd}$ and $M_{ud}$ using $D_s$ and $L_o$
  3: Do
  4:     Apply $M_{sd}$ on $N_l$
  5:     if $N_l$ matches $P_a$ then
  6:         return $M_e$
  7:     else
  8:         Apply $M_{ud}$ on $N_l$
  9:     end if
  10:     if $N_l$ deviates from $L_o$ then
  11:         return $M_e$
  12:     end if
  13: While $U$ is in $S$
  14: End while
In Steps 3–13, the current log of the user is checked against known patterns and previous logs using the trained models. If the behavior of that user is found to be malicious, a warning message about the intrusion is returned to the administrator of the system; otherwise, monitoring continues while the user is active. The space complexity of the intrusion detection algorithm could be slightly higher because it iteratively applies a depth-first search, which depends on the depth or length of the optimal path. Thus, the space complexity is $O(bm)$, where $m$ is the maximum depth and $b$ is the branching factor.
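The control flow of Algorithm 3 can be sketched as follows. This is only an illustration of the monitoring loop under stated assumptions: the two trained models are represented by hypothetical callables (`matches_known_pattern` for the supervised model and `deviates_from_history` for the unsupervised model), and `notify` stands in for whatever channel delivers the message to the administrator.

```python
def monitor_session(new_logs, matches_known_pattern, deviates_from_history, notify):
    """Run the per-session loop of Algorithm 3 over an iterable of log entries.

    Every argument other than new_logs is a hypothetical callable standing in
    for a trained model or an alerting channel.
    """
    alerts = 0
    for entry in new_logs:                      # loop while the user is active
        if matches_known_pattern(entry):        # supervised check (Steps 4-6)
            notify("known attack pattern", entry)
            alerts += 1
        elif deviates_from_history(entry):      # unsupervised check (Steps 8-11)
            notify("anomalous behavior", entry)
            alerts += 1
    return alerts


# Toy usage with stand-in rules instead of trained models.
messages = []
count = monitor_session(
    ["GET /projects", "DROP TABLE users", "GET /admin x 500"],
    matches_known_pattern=lambda entry: "DROP TABLE" in entry,
    deviates_from_history=lambda entry: "x 500" in entry,
    notify=lambda reason, entry: messages.append((reason, entry)),
)
```

The unsupervised check runs only when the supervised check finds no known pattern, which mirrors the if/else structure of the algorithm.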
To shift the rows, which is the next step in the AES algorithm, we use Equation (7):
$$S'_{r,c} = S_{r,(c + shift(r, N_b)) \bmod N_b}, \quad \text{for } 0 < r < 4 \text{ and } 0 \le c < N_b \tag{7}$$
where the function $shift(r, N_b)$ gives the number of positions by which a byte is moved to a lower position and depends on the row value, $S$ is the state array of bytes, $r$ is the row, $c$ is the column, and $N_b$ is the number of columns in the state.
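Equation (7) translates almost directly into code. The sketch below assumes the state is stored as four rows of $N_b = 4$ bytes and that the shift offset equals the row index, as in standard AES; the names `shift` and `shift_rows` are illustrative.

```python
def shift(row, n_b=4):
    """Shift offset for a row; for N_b = 4 it equals the row index."""
    return row % n_b


def shift_rows(state, n_b=4):
    """Return a new state in which row r is rotated left by shift(r, N_b)."""
    return [
        [state[r][(c + shift(r, n_b)) % n_b] for c in range(n_b)]
        for r in range(4)
    ]


state = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
shifted = shift_rows(state)
```

Row 0 has an offset of zero and is left unchanged, which is why Equation (7) is stated only for $0 < r < 4$.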
For mixing the columns, a fixed polynomial is used, which is given by:
$$a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} \tag{8}$$
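Multiplying a state column by this polynomial is equivalent to a fixed matrix product over GF(2^8). The following sketch shows that product for a single column; it assumes the standard AES reduction polynomial $x^8 + x^4 + x^3 + x + 1$, and the helper names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def mix_column(column):
    """Multiply one four-byte column by a(x) = {03}x^3 + {01}x^2 + {01}x + {02}."""
    a0, a1, a2, a3 = column
    return [
        gf_mul(a0, 2) ^ gf_mul(a1, 3) ^ a2 ^ a3,
        a0 ^ gf_mul(a1, 2) ^ gf_mul(a2, 3) ^ a3,
        a0 ^ a1 ^ gf_mul(a2, 2) ^ gf_mul(a3, 3),
        gf_mul(a0, 3) ^ a1 ^ a2 ^ gf_mul(a3, 2),
    ]


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

Each output byte combines all four input bytes of the column, so a change in one byte spreads to the whole column after a single round.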
Figure 1 shows our data encryption algorithms. As can be observed, the initial file is ciphered using the key. The key changes in each round of the AES process.
Figure 1. Data encryption with the AES algorithm.
Then, the byte substitution is implemented and the rows are shifted.
After this, the column will be mixed and the process starts again. We used a dataset to display the intrusion detection ability of the method. This dataset contains categorical and numerical features with various scales, which is why we needed to transform it into metric space, i.e., numerical values, and normalize the values. Below, we demonstrate this method.
Each categorical feature with $n$ possible categorical values is transformed using a function $f$ that maps the $m$th value of the feature to the $m$th component of an $n$-dimensional vector:
$$f(x_i) = (0, \ldots, 1, \ldots, 0) \tag{9}$$
where $x_i$ takes the $m$th value and the 1 is located at position $m$ of the $n$ components. Then, we scale these new features and the numerical features using their corresponding mean $\mu$ and standard deviation $\sigma$ values:
$$f(x_i) = \frac{x_i - \mu}{\sigma} \tag{10}$$
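The two preprocessing steps can be illustrated with a short sketch. It assumes the population standard deviation is used for scaling (the text does not say which variant was applied), and the category list and function names are made up for the example.

```python
from statistics import mean, pstdev


def one_hot(value, categories):
    """Map the m-th category to a vector with a 1 at position m and 0 elsewhere."""
    vector = [0] * len(categories)
    vector[categories.index(value)] = 1
    return vector


def standardize(values):
    """Scale values to zero mean and unit variance: (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(x - mu) / sigma for x in values]


protocols = ["tcp", "udp", "icmp"]   # illustrative categorical values
encoded = one_hot("udp", protocols)
scaled = standardize([2.0, 4.0, 6.0])
```

After this transformation, every feature is numerical and on a comparable scale, which is what the distance-based and gradient-based models that follow require.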
To create the neural network, we used the following formula to calculate one neuron of the hidden layer:
$$h(x) = \sigma\left(w_j + \sum_{i=1}^{n} w_{ij} x_i\right) \tag{11}$$
where $\sigma$ is a nonlinear activation function, $x_i$ is the initial $i$th node, and $w$ is the weight for that particular node. To calculate the output, we use the following equation:
$$o = \sum_{j=1}^{n} h_j w_j \tag{12}$$
where $h$ is a hidden layer and $w$ is the weight for that particular hidden layer.
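A forward pass through one hidden layer, following the two formulas above, might look like the sketch below. The sigmoid is only an assumed choice for the nonlinear activation, since the text does not name one, and the weights are arbitrary illustrative values.

```python
import math


def sigmoid(z):
    """Assumed nonlinear activation; the source does not specify which one is used."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(x, weights, biases):
    """Compute h_j = sigma(w_j + sum_i w_ij * x_i) for every hidden neuron j.

    weights[i][j] connects input i to hidden neuron j; biases[j] is w_j.
    """
    return [
        sigmoid(biases[j] + sum(weights[i][j] * x[i] for i in range(len(x))))
        for j in range(len(biases))
    ]


def output(hidden, out_weights):
    """Compute o = sum_j h_j * w_j."""
    return sum(h * w for h, w in zip(hidden, out_weights))


x = [1.0, 2.0]
weights = [[0.0, 1.0], [0.0, -0.5]]   # illustrative values
biases = [0.0, 0.0]
h = hidden_layer(x, weights, biases)
o = output(h, [2.0, -1.0])
```

With these values both hidden neurons receive a pre-activation of zero, so each outputs 0.5 under the sigmoid.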
The attributes of each entity define the identity and characteristics of that entity.
  • $A_{req} = \{ReqAttr_i \mid i \in [1, I]\}$;
  • $A_{serv} = \{ServAttr_j \mid j \in [1, J]\}$;
  • $A_{res} = \{ResAttr_k \mid k \in [1, K]\}$;
  • $A_{env} = \{EnvAttr_l \mid l \in [1, L]\}$, where $I$, $J$, $K$, and $L$ represent the maximum number of attributes per entity.
Security policies are defined in the system of the cloud. Each user may have their own defined policies. The policy $P$ supported by ABAC as a superset of these policies is defined in (13):
$$P = \{P_m \mid m \in [1, M],\ P_m \text{ is a policy}\} \tag{13}$$
The access decision is made based on policy evaluation by the decision function $f$. $P_n^f$ is the evaluation function of policy $p_n$ and is defined as follows:
$$P_n^f(A_{req}, A_{serv}, A_{res}, A_{env}) = \text{permit or deny} \tag{14}$$
Policies are evaluated by providing the attributes of the entities to the decision function $f$, defined as follows:
$$Decision_{ABAC} = f(Requestor, Service, Resource, Environment) = P_1^f(Requestor) \mathbin{\&} P_2^f(Service) \mathbin{\&} P_3^f(Resource) \mathbin{\&} P_4^f(Environment) \tag{15}$$
The encrypted data are sent to the database. When the user requests the data, the ABAC algorithm is called. The algorithm uses the JSON file where all the policies and rules are defined.
The attributes of the user, action, and data are compared. When correspondence is found, access is granted to the user. The AES algorithm is applied again to decipher the encrypted data.
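The attribute comparison can be sketched as follows. The JSON layout, attribute names, and function names are all hypothetical, since the actual policy file format is not given; the sketch only shows the general idea of permitting a request when every entity's attributes satisfy at least one policy.

```python
import json

# Hypothetical policy file content; the attribute names are illustrative only.
POLICY_JSON = """
{
  "policies": [
    {
      "requestor": {"role": "administrator"},
      "service": {"action": "delete"},
      "resource": {"type": "users_table"},
      "environment": {"network": "internal"}
    },
    {
      "requestor": {"role": "project_manager"},
      "service": {"action": "view"},
      "resource": {"type": "projects_table"},
      "environment": {"network": "internal"}
    }
  ]
}
"""

ENTITIES = ("requestor", "service", "resource", "environment")


def attributes_match(required, provided):
    """True if every attribute a policy requires has the same provided value."""
    return all(provided.get(name) == value for name, value in required.items())


def evaluate_policy(policy, request):
    """Evaluate one policy: all four entity checks must hold (logical AND)."""
    return all(
        attributes_match(policy.get(entity, {}), request.get(entity, {}))
        for entity in ENTITIES
    )


def decide(policies, request):
    """Return "permit" if any policy matches the request, otherwise "deny"."""
    if any(evaluate_policy(policy, request) for policy in policies):
        return "permit"
    return "deny"


POLICIES = json.loads(POLICY_JSON)["policies"]

admin_request = {
    "requestor": {"role": "administrator"},
    "service": {"action": "delete"},
    "resource": {"type": "users_table"},
    "environment": {"network": "internal"},
}
manager_request = dict(admin_request, requestor={"role": "project_manager"})
```

Denying by default when no policy matches is a design choice of this sketch; it means a missing or incomplete rule can never grant access by accident.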
Figure 2 illustrates the data encryption and data access algorithms. The initial key is transformed to another form using key expansion. This key is then used to encrypt data using logical XOR.
Figure 2. Data encryption and data access algorithms.

5. Implementation and Experimental Results

In this section, we describe the implementation of the mechanism. This mechanism was built on the macOS Mojave operating system, using Python 3.4.6 in the PyCharm IDE and Java in the IntelliJ IDEA IDE. Additionally, we used JSON files to write policies and rules to implement the ABAC data access algorithm. All the data and user information were stored in a MySQL database. The existing UNSW-NB15 dataset [52] was used for IDS implementation. Normal and abnormal network traffic were generated in a real-world test bed. The abnormal traffic data were generated using hacking tools and consist of nine types of attacks: denial-of-service, worms, shellcode, exploits, generic, fuzzers, backdoors, reconnaissance, and analysis. A ground truth table consisting of the nine types of simulated attacks was used to label the dataset. The dataset input was a set of instances. Each instance contained various data types such as float, integer, binary, and nominal data. The instances that belong to one of the nine types of simulated attacks were labeled as abnormal (i.e., one), and the normal instances were labeled as zero. The number of instances (i.e., the rows in two-dimensional space) in the dataset is 2,540,044, with 2,218,761 belonging to the normal class and the remaining 321,283 belonging to the abnormal class. The number of features (i.e., the columns in two-dimensional space) is 49, representing the network packet fields such as source IP address, destination IP address, source port number, destination port number, and so on. Table 1 shows the materials used for implementation.
Table 1. Materials Used.
To justify the effectiveness of the proposed HUDH, several scenarios were generated that fell within two classes:
  • Data access process by legitimate/malicious insider user;
  • Data access process by outside user (malicious outsider).

5.1. Legitimate/Malicious Insider

Figure 3 shows the data access process. When the user tries to access data, the ABAC algorithm is activated. The algorithm involves the user, data, and action attributes from the database, as well as the policies from the JSON file to check the privileges and rights. IntelliJ IDEA is used to implement the ABAC algorithm while Apache Tomcat and Maven are used to run the process. The JSON file is used for data access.
Figure 3. Data access mechanism.
When the user sends a request to the system, it is sent to the permission-evaluator component, which delegates it to the policy-enforcement component, where all the decisions are made based on the rules. The policy-definition component is used to load all the policy rules. After the rules are loaded, the policy-enforcement component compares the request against them, and if the comparison returns true, access is granted. If a user attempts to access data for which the user does not have the rights, access is denied and the user is considered a malicious insider. Furthermore, the IDS algorithm constantly monitors the requests of the user; if it identifies anomalous activity, then the system administrator is notified.

5.2. Malicious Outsider

Figure 4 illustrates the case of an attack on the cloud system. The attacker uses the command-and-control method to gain control of cloud providers and then generates the attacks. In this case, 5000 attacks were generated. The cloud computing server has the support of the designed IDS algorithm, which continuously monitors for malicious activity and denies unusual requests.
Figure 4. Attacks from outside the cloud.

5.3. Results

Based on the testing results, the following parameters were determined:
  • Response time;
  • Predicted/normal behavior;
  • Accuracy.

5.3.1. Response Time

Figure 5a plots the result, which shows the time performance of each request in the administrator account. The scenario involves different kinds of requests of the system, as shown in Table 2. As can be seen, the performance for each request is different. We performed operations such as add, delete, view, list of users, and project tables. The operations for users’ tables required more time. The reason for this is the complexity of the table; it has many attributes and relationships.
Figure 5. (a) Time required for administrator and project manager accounts. (b) Confusion matrix of the classification model.
Table 2. Different types of tested requests.
We observed different response times for the two accounts because the administrator has more privileges. When the administrator makes a request, it involves the database and the JSON file, whereas the project manager only receives a denial, which involves only the JSON file. The results demonstrated that the administrator account requires 9.67 ms to complete the process, whereas the project manager account requires 25.48 ms.

5.3.2. Intrusion/Normal Behavior

Figure 5b shows the confusion matrix for the proposed HUDH. Based on the test results, normal and intrusion behaviors were determined. As shown by the confusion matrix, the proposed HUDH method has high prediction accuracy. It obtained 0.9816 correct answers and 0.0017 wrong answers for normal behavior and 0.9983 correct and 0.0184 wrong answers for intrusion behavior.

5.3.3. Accuracy

Figure 6a shows the feature importance of the random forest algorithm and Figure 6b demonstrates the feature importance for the neural network algorithm. The feature importance is used only for determining the attributes required to train the model. The chosen features are those that most strongly affect whether a request is identified as an intrusion into the system. We used the scikit-learn library to determine the feature importance in random forest through the RandomForestClassifier and RandomForestRegressor classes. The make_classification() function (https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html, accessed on 4 December 2021) was used to generate the test dataset. The dataset contained 1000 examples with input features. Of the input features, 50% were informative and 50% were redundant. The feature_importances_ property was retrieved to obtain the importance scores for each input feature. The system uses random forest to classify data as normal or malicious data.
Figure 6. Feature importance for (a) random forest and (b) neural network.
The resulting information was then further used to train the neural network to classify the attack data based on the different attack categories. As such, we increased the accuracy of the scheme. We confirmed that the HIDS scheme works correctly by testing it. All the access decisions are based on the rules defined within the default policy (JSON file). All users are defined in the Memory_User_Details_Service file.
Figure 7a displays the receiver operating characteristic (ROC) curve of the classification model. It shows the ability of the classification model to diagnose the given request, based on the confusion matrix results. In summary, it shows the accuracy of the trained model. The ROC is equal to 0.9962, which is a good result, indicating the model can be used to predict whether a user is authorized by integrating different environments.
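A ROC curve is usually summarized by the area under it (AUC), and a value such as 0.9962 is presumably that area, although the text does not say so explicitly. As a hedged illustration, the sketch below computes the AUC from labels and scores using its pairwise-ranking interpretation; the function name and the toy data are hypothetical and unrelated to the reported experiment.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    The AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one (ties count half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = intrusion, 0 = normal; scores are model outputs.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The quadratic pairwise loop is chosen for clarity; for large datasets a sort-based computation would be preferable.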
Figure 7. (a) ROC curve of the classification model. (b) Neural network model history.
Figure 7b demonstrates the history of the neural network model. The loss decreased and the model accuracy increased as the number of epochs increased. The accuracy increased by 0.36% and the loss was reduced by approximately 0.47%. Based on this result, we determined that the proposed model produces high-accuracy performance.
Figure 8a illustrates the ROC curve of our neural network model for Class 6, which shows a 0.9975 accuracy. Classes 4, 2, and 0 are also considered important in the feature importance step. We only show Classes 6 and 4 for comparison purposes.
Figure 8. ROC curve for (a) Class 6 and (b) Class 4.
Figure 8b illustrates the ROC curve of our neural network model for Class 4, which shows a 0.9710 accuracy for this class, which had a high importance in the previous step. Based on the results, we can see that Figure 8a,b use a different class. Thus, they have different accuracy ratios. If a class is higher, then higher accuracy can be obtained.
The average accuracy of the HUDH scheme was determined with a maximum 5000 generated threats and compared to state-of-the-art algorithms: SA-DECC [42], SE-AC [43], BRNN-L, and IDTRE [46,49]. Based on the results (Figure 9), we found that the proposed HUDH method produced an average accuracy of 99.48%; the average accuracies of BRNN-L, IDTRE, SE-AC, and SA-DECC were 98.51%, 97.82%, 98.14%, and 97.79%, respectively, showing that the proposed scheme has higher average accuracy.
Figure 9. Average accuracy of the proposed HUDH and other state-of-the-art methods: IDTRE, BRNN-L, SE-AC, and SA-DECC, with a maximum of 5000 threats.

6. Discussion

The proposed HUDH scheme applies access control to reduce the possibility of unauthorized data access, enhances the accuracy of the intrusion detection system, and provides more protection by encrypting the data in the cloud. Table 3 lists the security capabilities of our scheme compared to those of the others. Table 3 shows that the proposed HUDH provides higher accuracy by integrating three algorithms. In comparison, the other methods possess either one or two features that lower their accuracy. The HUDH scheme is more accurate, at 99.48%, than the other methods. Compared to the other methods that try to provide a partial solution, our proposed HUDH scheme is a complete solution that prevents unauthorized access of cloud computing.
Table 3. Comparisons with existing methods.
We also chose to adopt a light symmetric encryption algorithm (i.e., AES) to avoid the expensive computations required for other public cryptography-based related methods. Asymmetric algorithms are slower than symmetric algorithms, and some solutions suggested using cloud servers for decryption, which is impractical in some situations such as IoT environments.
Moreover, unlike some related methods that use rule-based approaches (i.e., requiring an expert to design hand-crafted IDS rules), the proposed IDS is automated (i.e., based on machine learning) and achieved high accuracy, as indicated in the experimental results section.

7. Conclusions

In this paper, we introduced the HUDH scheme, which combines three state-of-the-art algorithms (AES, ABAC, and HIDS) for improving data access control in cloud computing. The ABAC and AES algorithms are implemented using JSON files. The proposed HUDH scheme significantly improves data security and user authentication compared to other state-of-the-art methods. The intrusion detection accuracy is improved using the features of the neural network algorithm. One of the advantages of the proposed algorithm is that it reduces the possibility of unauthorized data access. We implemented the ABAC and AES algorithms in Java and HIDS in Python. The HIDS scheme combines two algorithms: first, the random forest algorithm is used to select the important features; second, the neural network model is trained on the data.
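To illustrate the idea of attribute-based access control driven by a JSON file, the following minimal sketch evaluates a request against a JSON policy with a deny-by-default rule. It is written in Python for brevity, whereas the ABAC component of the scheme was implemented in Java. The policy layout, attribute names, and values are hypothetical and do not reproduce the policy format used in the scheme.

```python
import json

# Hypothetical policy document; every attribute name and value is illustrative.
POLICY_JSON = """
{
  "rules": [
    {
      "effect": "permit",
      "action": "read",
      "subject": {"role": "doctor", "department": "cardiology"},
      "resource": {"type": "health_record", "department": "cardiology"}
    },
    {
      "effect": "permit",
      "action": "write",
      "subject": {"role": "admin"},
      "resource": {"type": "health_record"}
    }
  ]
}
"""


def matches(required, actual):
    """True when every required attribute is present with the same value."""
    return all(actual.get(key) == value for key, value in required.items())


def is_permitted(policy, subject, resource, action):
    """Deny by default; permit only if one rule matches action and all attributes."""
    for rule in policy["rules"]:
        if (
            rule["effect"] == "permit"
            and rule["action"] == action
            and matches(rule["subject"], subject)
            and matches(rule["resource"], resource)
        ):
            return True
    return False


policy = json.loads(POLICY_JSON)
doctor = {"role": "doctor", "department": "cardiology"}
guest = {"role": "guest"}
record = {"type": "health_record", "department": "cardiology"}

print(is_permitted(policy, doctor, record, "read"))   # matching attributes
print(is_permitted(policy, guest, record, "read"))    # no rule matches
print(is_permitted(policy, doctor, record, "write"))  # action not granted
```

Because no rule grants access unless all listed subject and resource attributes match, a request from a user without the required attributes is rejected without any further checks.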
The HIDS scheme also detects intrusions, and the accuracies for Classes 4 and 6 were determined to be 0.9710 and 0.9975, respectively. Moreover, the proposed HUDH scheme enables the data owner to delegate most of the computation overhead to powerful cloud servers. Confidentiality of user access privilege and user secret key accountability are achieved. The formal security proofs showed that the proposed scheme is secure under standard cryptographic models. In the future, we will compare the HUDH scheme with similar state-of-the-art mechanisms using different quality of service parameters.

Author Contributions

A.R. and N.S., conceptualization, writing, idea proposal, methodology, and results; B.A. and M.A., conceptualization, draft preparation, editing, and visualization; A.M., writing and reviewing; A.A., draft preparation, editing, and reviewing. All authors have read and agreed to this version of the manuscript.

Funding

This work was partially supported by the Sensor Networks and Cellular System (SNCS) Research Center under grant 1442-002.

Acknowledgments

Taif University Researchers Supporting Project number (TURSP-2020/302), Taif University, Taif, Saudi Arabia. The authors gratefully acknowledge the support of the SNCS Research Center at the University of Tabuk, Saudi Arabia. In addition, the authors would like to thank the dean of scientific research at Shaqra University for supporting this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shabbir, M.; Shabbir, A.; Iwendi, C.; Javed, A.R.; Rizwan, M.; Herencsar, N.; Lin, J.C.W. Enhancing security of health information using modular encryption standard in mobile cloud computing. IEEE Access 2021, 9, 8820–8834.
  2. Borylo, P.; Tornatore, M.; Jaglarz, P.; Shahriar, N.; Chołda, P.; Boutaba, R. Latency and energy-aware provisioning of network slices in cloud networks. Comput. Commun. 2020, 157, 1–19.
  3. Razaque, A.; Frej, M.B.H.; Alotaibi, B.; Alotaibi, M. Privacy Preservation Models for Third-Party Auditor over Cloud Computing: A Survey. Electronics 2021, 10, 2721.
  4. Kassab, W.A.; Darabkh, K.A. A–Z survey of Internet of Things: Architectures, protocols, applications, recent advances, future directions and recommendations. J. Netw. Comput. Appl. 2020, 163, 102663.
  5. Sun, P. Security and privacy protection in cloud computing: Discussions and challenges. J. Netw. Comput. Appl. 2020, 160, 102642.
  6. Fernandes, D.A.; Soares, L.F.; Gomes, J.V.; Freire, M.M.; Inácio, P.R. Security issues in cloud environments: A survey. Int. J. Inf. Secur. 2014, 13, 113–170.
  7. Guan, S.; Niu, S. Stability-Based Controller Design of Cloud Control System With Uncertainties. IEEE Access 2021, 9, 29056–29070.
  8. Namasudra, S. Cloud computing: A new era. J. Fundam. Appl. Sci. 2018. Available online: http://jfas.info/psjfas/index.php/jfas/article/view/3986 (accessed on 4 December 2021).
  9. Amani, M.; Ghorbanian, A.; Ahmadi, S.A.; Kakooei, M.; Moghimi, A.; Mirmazloumi, S.M.; Moghaddam, S.H.A.; Mahdavi, S.; Ghahremanloo, M.; Parsian, S.; et al. Google earth engine cloud computing platform for remote sensing big data applications: A comprehensive review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5326–5350.
  10. Amazon, E.C. Amazon Web Services. 2015. Available online: http://aws.amazon.com/es/ec2/ (accessed on 9 November 2015).
  11. Cloud, A.E.C. Amazon Web Services. 2011. Available online: https://aws.amazon.com/about-aws/whats-new/2011/ (accessed on 9 November 2011).
  12. Martin, R. IBM Brings Cloud Computing to Earth with Massive New Data Centers. informationweek; August 2008. Available online: https://www.informationweek.com/cloud/ibm-brings-cloud-computing-to-earth-with-massive-new-data-centers (accessed on 4 December 2021).
  13. I. Google, “Google App Engine”. 2014. Available online: https://searchaws.techtarget.com/definition/Google-App-Engine (accessed on 22 July 2014).
  14. Kulkarni, G. Cloud computing-software as service. Int. J. Cloud Comput. Serv. Sci. 2012, 1, 11.
  15. Rai, R.; Sahoo, G.; Mehfuz, S. Securing software as a service model of cloud computing: Issues and solutions. arXiv 2013, arXiv:1309.2426.
  16. Neubert, B.C.M. Valuation of a Saas Company: A Case Study of Salesforce. Com. Innov. Manag. Entrep. Sustain. 2018, 166–178.
  17. Azure, E. “Azure Web Services”. Available online: https://azure.microsoft.com/en-us/ (accessed on 1 November 2021).
  18. Yu, S.; Wang, C.; Ren, K.; Lou, W. Achieving secure, scalable, and fine-grained data access control in cloud computing. In Proceedings of the IEEE International Conference on Computer Communications, San Diego, CA, USA, 14–18 March 2010; pp. 1–9.
  19. Wan, Z.; Deng, R.H. HASBE: A hierarchical attribute-based solution for flexible and scalable access control in cloud computing. IEEE Trans. Inf. Forensics Secur. 2011, 7, 743–754.
  20. Wang, G.; Liu, Q.; Wu, J. Hierarchical attribute-based encryption for fine-grained access control in cloud storage services. In Proceedings of the 17th ACM Conference on Computer and Communications Security, New York, NY, USA, 4–8 October 2010; pp. 735–737.
  21. Razaque, A.; Jararweh, Y.; Alotaibi, B.; Alotaibi, M.; Hariri, S.; Almiani, M. Energy-efficient and secure mobile fog-based cloud for the Internet of Things. Future Gener. Comput. Syst. 2022, 127, 1–13.
  22. Singh, J.; Dhiman, G. A Survey on Cloud Computing Approaches. Mater. Today Proc. 2021. Available online: https://www.semanticscholar.org/paper/A-survey-on-cloud-computing-approaches-Singh-Dhiman/c3f66cff012e74ab8328fc4972a216b493b60109 (accessed on 4 December 2021).
  23. Ngabo, D.; Wang, D.; Iwendi, C.; Anajemba, J.H.; Ajao, L.A.; Biamba, C. Blockchain-based security mechanism for the medical data at fog computing architecture of internet of things. Electronics 2021, 10, 2110.
  24. Almusaylim, Z.A.; Jhanjhi, N.Z. Comprehensive review: Privacy protection of user in location-aware services of mobile cloud computing. Wirel. Pers. Commun. 2020, 111, 541–564.
  25. Razaque, A.; Frej, M.B.H.; Sabyrov, D.; Shaikhyn, A.; Amsaad, F.; Oun, A. Detection of Phishing Websites using Machine Learning. In Proceedings of the 2020 IEEE Cloud Summit, Harrisburg, PA, USA, 21–22 October 2020; pp. 103–107.
  26. Masud, M.; Gaba, G.S.; Choudhary, K.; Alroobaea, R.; Hossain, M.S. A robust and lightweight secure access scheme for cloud based E-healthcare services. Peer-Peer Netw. Appl. 2021, 14, 3043–3057.
  27. Razaque, A.; Amsaad, F.; Hariri, S.; Almasri, M.; Rizvi, S.S.; Frej, M.B.H. Enhanced grey risk assessment model for support of cloud service provider. IEEE Access 2020, 8, 80812–80826.
  28. Razaque, A.; Almiani, M.; Khan, M.J.; Magableh, B.; Al-Dmour, A.; Al-Rahayfeh, A. Fuzzy-gra trust model for cloud risk management. In Proceedings of the 2019 Sixth International Conference on Software Defined Systems (SDS), Rome, Italy, 10–13 June 2019; pp. 179–185.
  29. Li, M.; Yu, S.; Ren, K.; Lou, W. Securing personal health records in cloud computing: Patient-centric and fine-grained data access control in multi-owner settings. In Proceedings of the International Conference on Security and Privacy in Communication Systems, Singapore, 7–9 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 89–106.
  30. Nurmi, D.; Wolski, R.; Grzegorczyk, C.; Obertelli, G.; Soman, S.; Youseff, L.; Zagorodnov, D. The eucalyptus open-source cloud-computing system. In Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, Shanghai, China, 18–21 May 2009; pp. 124–131.
  31. Khan, A.R. Access control in cloud computing environment. ARPN J. Eng. Appl. Sci. 2012, 7, 613–615.
  32. Zissis, D.; Lekkas, D. Addressing cloud computing security issues. Future Gener. Comput. Syst. 2012, 28, 583–592.
  33. Hota, C.; Sanka, S.; Rajarajan, M.; Nair, S.K. Capability-based cryptographic data access control in cloud computing. Int. J. Adv. Netw. Appl. 2011, 3, 1152–1161.
  34. Namasudra, S. Taxonomy of DNA-based security models. In Advances of DNA Computing in Cryptography; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018; pp. 37–52. Available online: https://link.springer.com/article/10.1007/s12652-021-02942-2 (accessed on 4 December 2021).
  35. Chinnasamy, P.; Deepalakshmi, P. HCAC-EHR: Hybrid cryptographic access control for secure EHR retrieval in healthcare cloud. J. Ambient. Intell. Humaniz. Comput. 2021, 1–19.
  36. Kumar, K.S.; Nair, S.A.H.; Roy, D.G.; Rajalingam, B.; Kumar, R.S. Security and privacy-aware Artificial Intrusion Detection System using Federated Machine Learning. Comput. Electr. Eng. 2021, 96, 107440.
  37. Han, S.; Han, K.; Zhang, S. A data sharing protocol to minimize security and privacy risks of cloud storage in big data era. IEEE Access 2019, 7, 60290–60298.
  38. Razaque, A.; Rizvi, S.S. Privacy preserving model: A new scheme for auditing cloud stakeholders. J. Cloud Comput. 2017, 6, 1–17.
  39. Yang, K.; Jia, X. ABAC: Attribute-based access control. In Security for Cloud Storage Systems; Springer: New York, NY, USA, 2014; pp. 39–58.
  40. Almiani, M.; AbuGhazleh, A.; Al-Rahayfeh, A.; Atiewi, S.; Razaque, A. Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory 2020, 101, 102031.
  41. Zhang, Z.; Zeng, P.; Pan, B.; Choo, K.K.R. Large-universe attribute-based encryption with public traceability for cloud storage. IEEE Internet Things J. 2020, 7, 10314–10323.
  42. Pourvahab, M.; Ekbatanifard, G. Digital forensics architecture for evidence collection and provenance preservation in IaaS cloud environment using SDN and blockchain technology. IEEE Access 2019, 7, 153349–153364.
  43. Riad, K.; Hamza, R.; Yan, H. Sensitive and energetic IoT access control for managing cloud electronic health records. IEEE Access 2019, 7, 86384–86393.
  44. Hahn, C.; Kim, J.; Kwon, H.; Hur, J. Efficient Iot Management with Resilience to Unauthorized Access to Cloud Storage. IEEE Trans. Cloud Comput. 2020. Available online: https://ieeexplore.ieee.org/abstract/document/9056529/ (accessed on 4 December 2021).
  45. Alkadi, O.; Moustafa, N.; Turnbull, B.; Choo, K.K.R. A deep blockchain framework-enabled collaborative intrusion detection for protecting IoT and cloud networks. IEEE Internet Things J. 2020, 8, 9463–9472.
  46. Kimmel, J.C.; Mcdole, A.D.; Abdelsalam, M.; Gupta, M.; Sandhu, R. Recurrent Neural Networks Based Online Behavioural Malware Detection Techniques for Cloud Infrastructure. IEEE Access 2020, 9, 68066–68080.
  47. Abdelsalam, M.; Krishnan, R.; Huang, Y.; Sandhu, R. Malware detection in cloud infrastructures using convolutional neural networks. In Proceedings of the 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), San Francisco, CA, USA, 2–7 July 2018; pp. 162–169.
  48. Al Makdi, K.; Sheldon, F.T.; Hussein, A.A. Trusted Security Model for IDS Using Deep Learning. In Proceedings of the 2020 3rd International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates, 25–26 November 2020; pp. 1–4.
  49. Namasudra, S. An improved attribute-based encryption technique towards the data security in cloud computing. Concurr. Comput. Pract. Exp. 2019, 31, e4364.
  50. Namasudra, S.; Roy, P. PpBAC: Popularity based access control model for cloud computing. J. Organ. End User Comput. (Joeuc) 2018, 30, 14–31.
  51. Namasudra, S.; Roy, P.; Balusamy, B.; Vijayakumar, P. Data accessing based on the popularity value for cloud computing. In Proceedings of the 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, India, 17–18 March 2017; pp. 1–6.
  52. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015; pp. 1–6.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
