Article

Poliseek: A Fast XACML Policy Evaluation Engine Using Dimensionality Reduction and Characterized Search

1 Institute of Systems Security and Control, School of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710054, China
2 School of Computer Science and Technology, Xidian University, Xi’an 710071, China
3 School of Cyber Engineering, Xidian University, Xi’an 710071, China
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(23), 4530; https://doi.org/10.3390/math10234530
Submission received: 21 October 2022 / Revised: 21 November 2022 / Accepted: 26 November 2022 / Published: 30 November 2022
(This article belongs to the Special Issue Systems Engineering, Control, and Automation)

Abstract

Owing to limited evaluation performance and compatibility problems with the PDP (Policy Decision Point) in practical information systems, some established schemes struggle to handle massive numbers of complex requests. To address the challenge of fast rule matching on interval values, we propose a novel policy evaluation engine, named Poliseek, built from three modules. The preprocessing module applies a static encoding strategy and converts XACML rules and requests into four-dimensional numeric vectors in an attribute space. Guided by a novel optimization objective that minimizes interval collisions, the dimensionality reducer and diffuser module generates the candidate values associated with each rule vector in the identification space. These values and the requests are handled by a fast policy evaluation module that uses well-constructed hash buckets and a characterized search algorithm. The experimental results show that when the number of requests reaches 10,000, Poliseek finds the target rule approximately 1090, 15, and 15 times faster than the Sun PDP, XEngine, and SBA-XACML, respectively. Poliseek also evaluates 10,000 complex policy rules with interval attribute values in 275.9 ms, which indicates strong robustness and practicality.

1. Introduction

With the rise of artificial intelligence and the emergence of cyberspace security challenges, an SOA (Service Oriented Architecture) [1] needs to process complex requests from a large number of Internet users. In a generic SOA system using resource access control [2,3] and decision response [4], a PDP (Policy Decision Point) is required to perform access control, policy matching, and authorization decisions on user requests [5,6]. Because the PDP is a core component of an SOA, improving its evaluation performance is key to boosting the efficiency of the whole system.
XACML (eXtensible Access Control Markup Language) [7], an XML-based access control policy language, provides the formal description of user requests and PDP resource policies. A PDP application scenario is tied to a specific policy set that contains multiple policies. A policy represented in XACML contains multiple rules, and each rule is composed of Target, Condition, and Effect. Target mainly contains attributes that describe web resources, such as Subject, Resource, and Action, thereby forming a hierarchical structure. A PDP determines a result (Permit or Deny) by comparing each attribute in a user request with the rules in an XACML policy. Designing an efficient PDP evaluation engine therefore centers on fast matching between access requests and policy sets [8,9]. The difficulties that may be encountered are as follows.
(1) The number of rules in a policy used in daily applications may reach the tens of thousands. The emergence of large-scale requests will pose a significant challenge to the PDP evaluation performance.
(2) The available methods of rule match may have some difficulties in dealing with complex types of policies. Some attribute values of rules are set to be intervals when initializing policies. For example, Condition can be defined as 5:00 a.m. ≤ time ≤ 7:00 p.m., which needs to be thoroughly considered and handled during an evaluation process.
(3) The PDP evaluation performance restricts the wide application of an SOA system and requires a particular improvement.
In response to the above problems, Sun et al. [10] propose a clustering and reordering scheme based on statistical analysis and weighted rules. Ngo et al. [11,12] construct multi-data-type interval decision diagrams, and Mourad et al. [13,14] establish a set-based XACML middle layer to normalize user requests and rules with predicate logic. We focus on devising a fast evaluation engine that reduces the dimensionality of a rule and implements an interval conversion scheme. The main contributions of this research are as follows.
(1) A new XACML policy conversion mechanism is designed with rule dimensionality reduction [15] and interval collision optimization [16]. Each rule is converted to a distinguishable identifier, which simplifies the matching process significantly.
(2) A method for rule match is developed to deal with interval values after dimensionality reduction. It provides a new perspective to find the match result and lays the foundation to implement a fast match engine.
(3) The proposed fast evaluation machine, namely Poliseek, handles requests with intervals correctly. It introduces a cascading of adjacent intervals and an improved hash table, thereby realizing a quick determination of the interval range and the matching results.
This paper elaborates on Poliseek and its features in detail. Section 2 reviews related works on reordering and clustering with statistical analysis, decision diagram, finite-state machine, and predicate logic of the formal description. The critical notions and preliminary definitions are described in Section 3. Poliseek with dimensionality reduction and its composition is illustrated in Section 4. Section 5 details the description of the rule dimensionality reduction and diffusion module. Section 6 presents the policy evaluation module and interval-processing mechanism. The experimental results are provided in Section 7. Section 8 concludes this paper and proposes future research directions.

2. Related Work

Considering that conflicts and redundancies of policies have been handled properly [17,18,19,20], we focus more on improving the PDP evaluation performance. Several documented methods are summarized and classified in this section.

2.1. Rearranging Rules in Reordering and Clustering with Statistical Analysis

Sun et al. [10] address a secure policy permutation mechanism to improve evaluation performance by reducing the number of matching operations. According to the appearance patterns of rules and user requests, this scheme calculates the hit frequency and weight of a policy. Then, a clustering algorithm, UWK-means, suitable for large-scale policy sets, is used to optimize the reordering policies by weights. Marouf et al. [21] employ the K-means algorithm to categorize a policy and then conduct statistical analysis to perform policy reordering. The user requests are only compared against the most likely rules when making the match. Both the schemes in [10] and [21] require the user requests to be distributed by specific mathematical characteristics. Due to the need of continuously updating the sorting policies, the two schemes lack execution efficiency and practicality.
The method proposed by Liu et al. [22] is a bi-objective optimization of decision trees based on a dynamic programming algorithm. They build a tree with a reasonable number of nodes and reasonable accuracy, and construct local and global misclassification rates of the tree to design the decision trees and evaluate the number of tree nodes. Iyer et al. [23] put forward an algorithm based on subclass enumeration that discovers the authorization rules at the cost of some computational efficiency.

2.2. Match Process with Decision Diagram and Finite-State Machine

2.2.1. Decision Diagram

Ngo et al. [11] propose a solution that uses numerical interval partitioning aggregation and decision diagrams for rules. The scheme in [11] needs to convert and formalize rules based on the predicate logic and semantic analysis. A multi-type of data is used in the decision diagram to evaluate policies. One can match interval attribute values in the construction of each node value, which can be operated by intersection, union, or complement using the constructed decision diagram. Hence, the calculated intervals are matched according to the order of attributes in the decision diagram.
Liu et al. [24,25] propose a policy evaluation system named XEngine that includes policy normalization and canonical representation. The policy normalization simplifies the logical structure of a policy, while the canonical representation converts the policy into a multi-valued decision diagram. Pina et al. [26] improve the algorithm of policy evaluation by combining XACML policies. This method mainly depends on the matching tree and the combination tree, which are used to search the applicable rules and to perform the matching evaluation, respectively. Unfortunately, XEngine and its improved version outperform the Sun PDP [27] only when all requests arrive at the same time. Both methods also lack practicality because of their excessive memory use.

2.2.2. Finite-State Machine

Aimed at the multi-user scenario in the cloud computing environment, Ayache et al. [28] also construct a policy verification system, X2Automata. It is in the form of a finite-state machine, including an initial state, transition states, a match state, and a non-match state. Moreover, Deng et al. [29] establish a series of suiting bitmaps for the attributes Subject, Resource, and Action to locate and mark whether a rule contains the requested value. The evaluation engine then performs the AND operation on the three bitmaps in the form of a finite-state machine. The automaton can determine the matching results combined with Condition. The automaton methods are easy to implement and fast in calculation but lack the further description about the case of interval attribute values.

2.3. Normalizing Policy Set with Predicate Logic in Formal Description

Mourad et al. [14] construct an SBA-XACML (Set-Based Algebra XACML) middle layer that contains XACML converters and compilers that can convert a traditional XACML policy using a set-based approach. Both the converted policy and the user requests are fed to a policy matching function, which uses the predicate logic to standardize the relationship between attribute values. The function also contains three types of algorithms to perform the rule match in the order of Rule-Policy-PolicySet. Combining predicate logic and standardized formal description language forms the core component of the SBA-XACML framework.
The SBA-XACML framework has a decent match performance for medium-sized complex policy sets but does not test on the large ones. Turkmen et al. [30,31,32] introduce SMTs (Satisfiability Modulo Theories) into the design of the XACML policy matching scheme. SMTs apply the predicate logic to formalize the XACML policies. Each attribute needs to be cascaded with Condition, and the applicable space and the decision space related to each rule describe the match operation. Requests and policies can be normalized in a rigorous specification using SMTs, but its highly abstract process can cause limits in implementation.
Deng et al. [33] innovatively apply the LDA (Latent Dirichlet Allocation) topic model to clustering strategies based on the common method of XDLEngine processing large-scale policy sets. All the rules in the policy set are digitized and vectorized, which reduces the number of comparisons in the rule matching process and improves the matching efficiency of XDLEngine. Fang et al. [34] designed a training pipeline based on a genetic algorithm in which the parameters can be reduced by 30% while achieving similar accuracy to DenseNet.
In conclusion, the rules in policies need to be reasonably converted to simplify the matching process. The formal description methods such as predicate logic and decision diagram are often applied to improve the PDP evaluation performance. However, there is a lack of suitable methods to optimize the evaluation performance of the attribute value interval of different types of rules.

3. Preliminary Definitions

This section describes the definitions and notations used in our work. The essential abbreviations and notations are given below to standardize the construction of Poliseek.

3.1. Abbreviations

The abbreviations used in this paper are listed as follows.
  • R: a complex rule vector in an XACML policy.
  • RQ: a user request vector represented by XACML.
  • SR: a sub-rule vector with atomic attribute values.
  • ISR: a sub-rule vector with atomic attribute intervals.
  • CV: a candidate value, a single-point value in projection results obtained by dimensionality reduction.
  • ICV: a candidate interval, an interval value in projection results.

3.2. Conventional Notations

The conventional notations used in this paper are shown as follows.
  • $\mathbb{Z}$: an integer set.
  • W: an identification space, a one-dimensional space of the dimensionality reduction results where CVs and ICVs are distributed.
  • $|x|$: the absolute value of the variable x.
  • I: a unit matrix.
Moreover, numeric constants are represented by non-italicized uppercase letters, e.g., N, M. Their descriptions will be given at the first appearance.

3.3. Functions

The functions below perform certain mathematical calculations for our work. We use the letter x or X to represent the input. A lowercase letter stands for a value or a random variable, and an uppercase letter stands for a matrix or vector.
  • E(x) calculates the mathematical expectation of a random variable x.
  • f(x) refers to the probability density function of an attribute value x in request vectors.
  • T(X) returns an m × 1 vector whose elements are m sum values of each row in an m × n matrix X.

3.4. Operators

All the binary operators used in this paper are defined as follows.
  • $A \cup B$ represents the union of sets A and B.
  • $A \cdot B$ represents the scalar product of two matrices or vectors A and B.
  • $A \otimes B$ represents the operation of multiplying each element in each row of an n × k matrix B by the element at the corresponding position in a 1 × k row vector A; it returns an n × k matrix.
  • $A \wedge B$ returns a matrix $C_{n \times k}$ whose element $c_{ij}$ is the smaller of the elements at the same position in the n × k matrices A and B.
  • $A \vee B$ returns a matrix $C_{n \times k}$ whose element $c_{ij}$ is the larger of the elements at the same position in the n × k matrices A and B.
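As an informal illustration of the row-sum function T(X) and the matrix operators listed above, the following pure-Python sketch implements them on nested lists. The function names (`T`, `row_scale`, `elementwise_min`, `elementwise_max`) and the sample matrices are assumptions made for this example only; they are not taken from the Poliseek implementation.

```python
from typing import List

Matrix = List[List[float]]


def T(X: Matrix) -> List[float]:
    """Row sums of an m x n matrix, returned as a length-m list (an m x 1 vector)."""
    return [sum(row) for row in X]


def row_scale(a: List[float], B: Matrix) -> Matrix:
    """Multiply every row of the n x k matrix B element-wise by the length-k vector a."""
    return [[a_j * b_ij for a_j, b_ij in zip(a, row)] for row in B]


def elementwise_min(A: Matrix, B: Matrix) -> Matrix:
    """Matrix whose entries are the smaller of the entries of A and B at each position."""
    return [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def elementwise_max(A: Matrix, B: Matrix) -> Matrix:
    """Matrix whose entries are the larger of the entries of A and B at each position."""
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 5], [4, 2]]
B = [[3, 0], [4, 7]]
a = [2, 10]

print(T(A))                   # [6, 6]
print(row_scale(a, B))        # [[6, 0], [8, 70]]
print(elementwise_min(A, B))  # [[1, 0], [4, 2]]
print(elementwise_max(A, B))  # [[3, 5], [4, 7]]
```

Composing these helpers, for instance `T(elementwise_min(row_scale(a, A), row_scale(a, B)))`, gives one value per row, which is the shape needed when one value per rule is required.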
In the following, we present a detailed description of some key notations for a clearer discussion.
Notation 1. Rule Vector and Complex Rule. A rule is composed of three elements: Target, Condition, and Effect. Target is mainly composed of three attributes: Subject, Resource, and Action. Since different rules may take the same value of Effect, we choose the four attributes (Subject, Resource, Action, and Condition) to identify a complex rule vector R uniquely. Thus, a complex rule vector can be defined as four sets of attribute values $t^i$ ($i = 1, 2, 3, 4$), i.e.,
$$R := (t^1, t^2, t^3, t^4) \tag{1}$$
Vector R is the numerical result of a rule produced by the preprocessing module of our work, and the $t^i$ ($i = 1, 2, 3, 4$) in R are the sets of numerical values of Subject, Resource, Action, and Condition, respectively. Moreover, there are many categories of attribute values in XACML 2.0 and 3.0. Thus, assuming that there are M categories with $N_i$ values, the notation in Equation (1) is modified to satisfy the formal representation of the complex rules, as shown in Equation (2):
$$R = (\{t_{lj}^{1}\}, \{t_{lj}^{2}\}, \{t_{lj}^{3}\}, \{t_{lj}^{4}\}) \tag{2}$$
where $l \in [1, M]$, $j \in [1, N_i]$, and $t_{lj}^{i}$ refers to the jth value in the lth category of the ith attribute. When R takes ({1, 2, 3}, {1}, {[3, 6]}, {[4, 9]}), then $\{t_{lj}^{1}\}$ takes {1, 2, 3}, $\{t_{lj}^{2}\}$ takes {1}, and so on.
Notation 2. Sub-rule Vector. The sub-rules formed by extracting the atomic values of $R_k$, the kth complex rule vector in an XACML policy, are referred to as $SR_k$. Specifically, the jth sub-rule derived from $R_k$ is denoted by $SR_k^j$, as shown in Equation (3):
$$SR_k^j = (st_k^1, st_k^2, st_k^3, st_k^4) \tag{3}$$
where $st_k^i$ represents the ith value in the set $t_{lj}^{i}$ of $R_k$. The complex rule $R_k$ can be expressed as $R_k = \{SR_k^j\}$, where j ranges from 1 to the number of sub-rules. Thus, the four different $SR_k^j$ of $R_k$ = ({1, 2}, {1, 3}, {4}, {9}) are ({1}, {1}, {4}, {9}), ({2}, {1}, {4}, {9}), ({1}, {3}, {4}, {9}), and ({2}, {3}, {4}, {9}), respectively. The sub-rule vectors will be processed to provide a numerical identifier through dimensionality reduction. Equation (2) can be extended to Equation (4) to make it appropriate for interval attribute values.
$$ISR_k^j = (ist_k^1, ist_k^2, ist_k^3, ist_k^4) \tag{4}$$
Note that an $ISR_k^j$ can also be derived from the complex rule vector $R_k$.
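The extraction of sub-rules from a complex rule amounts to taking one atomic value (or one atomic interval) per attribute in every possible combination. A minimal sketch of this step is given below, assuming that each attribute set is stored as a Python list and that an interval is stored as a `(low, high)` tuple; the function name `expand_sub_rules` is hypothetical.

```python
from itertools import product


def expand_sub_rules(rule):
    """Return every sub-rule obtained by picking one atomic value per attribute."""
    return [tuple(choice) for choice in product(*rule)]


# The complex rule R_k = ({1, 2}, {1, 3}, {4}, {9}) used in Notation 2.
R_k = ([1, 2], [1, 3], [4], [9])
for sub_rule in expand_sub_rules(R_k):
    print(sub_rule)

# A rule with interval values: intervals are kept as atomic (low, high) tuples.
R_interval = ([2], [1], [(1, 3)], [(1, 9), (213, 260)])
for sub_rule in expand_sub_rules(R_interval):
    print(sub_rule)
```

The first loop prints the same four sub-rules as listed in the text, although in a different enumeration order; the second yields two interval sub-rules because only Condition holds more than one atomic interval.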
Notation 3. Rule Space. We use K to denote the function space of the dimensionality reduction of a policy set, where the numeralization of Subject, Action, Resource, and Condition of each rule is located at four mutually orthogonal axes. Equation (5) can be used to describe the rule space of a specific policy set:
$$K := (L_1, L_2, L_3, L_4) \tag{5}$$
where $L_i$ ($i = 1, 2, 3, 4$) represents a set of values for each $st^i$ in N sub-rules, as shown in Equation (6).
$$L_i = \{st_1^i, st_2^i, st_3^i, \ldots, st_N^i\} \tag{6}$$
Assume that a simplified policy set contains only two sub-rules, ({1}, {1}, {3}, {4.5}) and ({2}, {1}, {2.5}, {6}). The rule space of this set can be shown in Equation (7).
$$K := (\{1, 2\}, \{1, 1\}, \{2.5, 3\}, \{4.5, 6\}) \tag{7}$$
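The rule space of the two-sub-rule example can be reproduced by collecting, for each of the four axes, the values that the sub-rules take on that axis. The sketch below assumes sub-rules are stored as plain 4-tuples; `rule_space` is a hypothetical helper name.

```python
def rule_space(sub_rules):
    """Collect the values of all sub-rules axis by axis (one list per attribute)."""
    return tuple(list(axis) for axis in zip(*sub_rules))


sub_rules = [(1, 1, 3, 4.5), (2, 1, 2.5, 6)]
K = rule_space(sub_rules)
print(K)  # ([1, 2], [1, 1], [3, 2.5], [4.5, 6])
```

The third axis appears here in sub-rule order (3, 2.5), whereas the text writes the same collection in ascending order.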

4. Approach Overview

Poliseek is a fast policy evaluation engine with dimensionality reduction, whose composition is described in this section. It includes a preprocessing module, a rule dimensionality reducer, a rule diffuser, and a policy evaluation module, as shown in Figure 1.
Poliseek receives rules and requests and converts them into a set of four-dimensional numerical vectors through the preprocessing module. The reducer and diffuser handle these four-dimensional vectors to provide optimized candidate values that can represent each rule or request. The candidate values are fed into the evaluation module to perform the matching process. Cascading intervals and a candidate list are designed and included in this module to address the case of attributes with interval values. The evaluation module conducts a quick comparison between the request candidate value and the candidate lists obtained by interval cascading and improved hash tables.

4.1. Policy Preprocessing Module

The policy preprocessing module in Poliseek is responsible for numeralization of the original policy set. This module handles the rule vectors orderly and satisfies the requirements that the attribute value may take the default value or the interval value. A rule can be represented and identified by converting Subject, Resource, Action, and Condition into a four-dimensional real number vector, where the size relationships of attribute values are retained at the same time.

4.2. Rule Dimensionality Reducer and Rule Diffuser

The dimensionality reducer receives the rule vectors or user requests and converts them into candidate values using linear dimensionality reduction. The interval results of the reducer are then improved by the diffuser, where the activated value obtained from the activating function is included as the optimization target, thereby reducing the probability of colliding intervals. The policy diffuser optimizes the converting vectors used in the reducer, while the dimensionality reducer provides the execution driver for the diffuser. The two complement each other and form the optimization model that provides candidate values with a small overlapping area, which increases the flexibility and operability of the policy evaluation.

4.3. Policy Evaluation Module

The policy evaluation module of Poliseek completes the match operation and returns the result. We obtain the numerical value of a user request by passing the request through the preprocessing module, rule dimensionality reducer, and rule diffuser sequentially. When the value is an interval, this module first cascades the neighboring candidate intervals that cover the same candidate value; thus, these intervals are joined and categorized initially. Then, the module carries out the evaluation process by searching for the nearest values of the request value using improved hash tables. A candidate list for the request that contains potential intervals is thus obtained. By comparing the values in the candidate list with the requested value, we can determine the match result. If the value is a single point, the PDP can directly select the rule with the same conversion value. The evaluation module thus achieves fast evaluation by handling both interval values and single point values properly.
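To make the idea of cascading neighboring intervals and then looking up a request value more concrete, the following sketch merges overlapping candidate intervals into groups and finds the rules whose interval covers a given value. It is only an illustration: it uses a sorted list with binary search (`bisect`) instead of the improved hash tables used by Poliseek, and the rule identifiers, interval values, and function names are all hypothetical.

```python
from bisect import bisect_right


def cascade(intervals):
    """Merge overlapping (low, high, rule_id) intervals into groups.

    Each group is [group_low, group_high, members], where members keeps the
    original intervals so that the final comparison can still be exact.
    """
    groups = []
    for low, high, rule_id in sorted(intervals):
        if groups and low <= groups[-1][1]:
            # Overlaps the current group: extend it and remember the member.
            groups[-1][1] = max(groups[-1][1], high)
            groups[-1][2].append((low, high, rule_id))
        else:
            groups.append([low, high, [(low, high, rule_id)]])
    return groups


def match(groups, value):
    """Return the ids of all rules whose candidate interval covers value."""
    starts = [group[0] for group in groups]
    idx = bisect_right(starts, value) - 1  # last group starting at or before value
    if idx < 0 or value > groups[idx][1]:
        return []
    return [rule_id for low, high, rule_id in groups[idx][2] if low <= value <= high]


candidates = [(14, 26, "rule_a"), (20, 30, "rule_b"), (40, 45, "rule_c")]
groups = cascade(candidates)
print(match(groups, 22))  # ['rule_a', 'rule_b']
print(match(groups, 35))  # []
print(match(groups, 42))  # ['rule_c']
```

Grouping first keeps the search over a small number of disjoint ranges; the per-member comparison inside a group corresponds to the final check of the candidate list against the requested value.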

5. Formalization of Rule Dimensionality Reducer and Diffuser

This section gives the formal construction of the rule dimensionality reducer and diffuser. We first detail the description of the preprocessing module that helps us improve the evaluation results.

5.1. Preprocessing Methods for Interval and Default Values

The preprocessing module of Poliseek carries out the numeralization of a policy set by considering whether an attribute value of a rule is an interval. There are three notable cases for the attribute value of each rule.
Case 1: Generic Value. The attribute values of Subject and Action are regarded as the ordinary cases. Their numerical results are assigned according to the order in which they are processed, starting from the integer 1. A bijection is built between the attribute values in generic cases and the integer values to normalize the converted results.
Case 2: Interval Value. For the case where the attribute value is an interval, the module linearly converts the attributes into numerical intervals. To form a coherent value range of the results, the endpoints of converted intervals are based on the other existing conversion values. For example, suppose the Condition attribute of a policy has an interval such as 5:00 a.m. ≤ time ≤ 7:00 p.m. to be processed, and the maximum of the other existing converted values in Condition is 14. Then, the module maps “0:00 a.m.” and “12:00 p.m.” to the numbers 15 and 39, with a unit length of one hour. Hence, 5:00 a.m. ≤ time ≤ 7:00 p.m. is converted into [20, 34], and the maximum of the converted values becomes 39.
When a half-bounded interval such as integer ≤ −1 is to be processed, the module maps the valid endpoint “−1” to 41, and the rest of the interval, which is integer < −1, is mapped to the number 40. Our preprocessing module emphasizes the normalization of the conversion results. Linear mapping for intervals ensures that the numerical features of attribute intervals are preserved. The endpoints of intervals and their relative order are handled in sequence by the module.
Case 3: Default Value. When the attribute takes <AnyValue>, any value of this attribute can be matched. Default values are handled after all the others have been processed. Each default attribute is mapped to an integer that represents the extended range of the current conversion results. These extra numerical points for default values guarantee the continuity of the value range of rule vectors, for computational convenience. An example of an XACML policy is shown in Table 1.
Table 1 contains two rules with the RuleIds R26 and R27. Their Effects are Permit and Deny, respectively, and each rule can be uniquely represented by a four-dimensional vector as defined previously, such that R26 is ({“Julius Hibbert”}, {“comment”}, {“read”, “write”}, {integer ≤ −1}) and R27 is ({“Jessie”}, {“comment”}, {“any”}, {integer ≤ 6, “03:23” < time < “04:10”}). The Action in R27 is a default value, and the Conditions of the two rules are intervals. According to Case 1, one can first obtain the numerical values of Subject, Resource, and Action, as shown in Table 2, by traversing the two rules in order.
There are two types of intervals in Condition, each with its own value range. To handle “integer ≤ −1” first, the module converts the part where the integer is less than −1 into the start number 1 and converts the boundary number −1 into the number 2. Thus, the conversion result of “integer ≤ −1” is [1, 2]. Similarly, “−1 ≤ integer ≤ 6” is mapped into [2, 9].
Hence, according to Case 2, “00:00 AM” and “23:59 PM” can be regarded as 10 and 1449, respectively, in units of one minute. Then, the numerical interval of “03:23” < time < “04:10” is [213, 260]. The default value of Action in R27 is yet to be processed. According to Case 3, it is assigned the number 3, since the other values of Action have already been converted into 1 and 2. Thus, the complex rule vectors of R26 and R27 are expressed in Equations (8) and (9).
$$R_{26} = (\{1\}, \{1\}, \{1, 2\}, \{[1, 2]\}) \tag{8}$$
$$R_{27} = (\{2\}, \{1\}, \{[1, 3]\}, \{[1, 9], [213, 260]\}) \tag{9}$$
Using Notation 2 and Equation (4), the preprocessed sub-rule vectors derived from the two complex rules can be obtained, as shown in Equation (10).
$$\begin{cases} ISR_{26}^{1} = (\{1\}, \{1\}, \{1\}, \{[1, 2]\}) \\ ISR_{26}^{2} = (\{1\}, \{1\}, \{2\}, \{[1, 2]\}) \\ ISR_{27}^{1} = (\{2\}, \{1\}, \{[1, 3]\}, \{[1, 9]\}) \\ ISR_{27}^{2} = (\{2\}, \{1\}, \{[1, 3]\}, \{[213, 260]\}) \end{cases} \tag{10}$$
The above vectors are then fed to the rule dimensionality reducer and the diffuser to provide the optimal candidate intervals. The preprocessing module is a necessary part of our fast evaluation engine because it offers an efficient and practical mechanism for the numeralization of rules.
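The encoding of generic values and time intervals in the R26/R27 example can be sketched as follows. This is a simplified illustration under stated assumptions: generic values receive consecutive integers in order of first appearance (Case 1), a time "HH:MM" is mapped linearly at one unit per minute starting from a base value (Case 2, with base 10 as in the example, since the integer conditions occupy 1 to 9), and a default value takes the next free integer (Case 3). The function names `encode_generic` and `encode_time` are hypothetical.

```python
def encode_generic(values):
    """Assign 1, 2, 3, ... to distinct values in order of first appearance."""
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes) + 1)
    return codes


def encode_time(hhmm, base):
    """Map a time string "HH:MM" to base + minutes elapsed since 00:00."""
    hours, minutes = map(int, hhmm.split(":"))
    return base + hours * 60 + minutes


subjects = encode_generic(["Julius Hibbert", "Jessie"])
actions = encode_generic(["read", "write"])
default_action = len(actions) + 1  # Case 3: next free integer after known values

print(subjects)        # {'Julius Hibbert': 1, 'Jessie': 2}
print(actions)         # {'read': 1, 'write': 2}
print(default_action)  # 3
print(encode_time("00:00", base=10), encode_time("23:59", base=10))  # 10 1449
print(encode_time("03:23", base=10), encode_time("04:10", base=10))  # 213 260
```

The printed values agree with the numbers used for R26 and R27 in Equations (8) and (9).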

5.2. Construction of Rule Dimensionality Reducer

The rule dimensionality reducer, denoted by $P_{dr}$, converts a numerical sub-rule vector in K into a candidate value in an identification space, W, regardless of the form of the values in the vector. The semantic comparison in the original matching method for complex policy sets then reduces to comparing the numerical values of intervals or points in W.
It is essential for dimensionality reduction to minimize the information loss of the original rules, so that the obtained candidate values maintain a direct relationship with the rules. Hence, the reducer of Poliseek draws on the linear dimensionality reduction principle [15]. The input of the reducer is a preprocessed sub-rule vector that may contain interval values, and the output is the related candidate value or interval, as shown in Equation (11):
$$\begin{cases} P_{dr}(SR_k) := \omega^{T} SR_k \\ P_{dr}(ISR_k) := \omega^{T} ISR_k \end{cases} \tag{11}$$
where the converting vector $\omega = (\omega_1, \omega_2, \omega_3, \omega_4)^{T}$ (each $\omega_i$, $i = 1, 2, 3, 4$, is a positive integer) is a 4 × 1 vector that converts $SR_k$ into candidate values (CVs) or $ISR_k$ into candidate intervals (ICVs). Obtaining ω requires the rule diffuser, which optimizes the candidate values toward a relatively even distribution. The candidate values of inputs containing no intervals are calculated by Equation (12).
$$CV = \omega^{T} SR_k = \omega^{T} (st_k^1, st_k^2, st_k^3, st_k^4) \tag{12}$$
The candidate values of the rules with intervals are a pair (ICV, CV), in which the candidate interval ICV can be expressed by Equation (13) with the $ist_k^i$ in $ISR_k$.
$$ICV = \omega^{T} ISR_k = \omega^{T} (ist_k^1, ist_k^2, ist_k^3, ist_k^4) \tag{13}$$
As shown in Equations (12) and (13), (ICV, CV) is also located in the identification space, W, corresponding to (ISR, SR). Under such circumstances, the candidate intervals of $P_{dr}$ should still be a linear combination of the input values. Thus, the size relationship of the different candidate values in W is still in agreement with that of the different numerical rule vector values.
The converting vector ω consists of four values $\omega_i$ ($i = 1, 2, 3, 4$). Inequality (14) shows the range of the linearly transformed results of a sub-rule vector when $\omega_i > 0$ and $ist_k^1 \in [N_0, N_1]$, where $N_0$ and $N_1$ are numeric constants as defined earlier in Section 3.
$$\omega^{T} (N_0, ist_k^2, ist_k^3, ist_k^4) < \omega^{T} (ist_k^1, ist_k^2, ist_k^3, ist_k^4) < \omega^{T} (N_1, ist_k^2, ist_k^3, ist_k^4) \tag{14}$$
Then, we have the relationship below, as shown in Inequality (15), derived from Inequality (14).
$$\omega_1 N_0 + \sum_{i=2}^{4} \omega_i \, ist_k^i < \omega^{T} (ist_k^1, ist_k^2, ist_k^3, ist_k^4) < \omega_1 N_1 + \sum_{i=2}^{4} \omega_i \, ist_k^i \tag{15}$$
Inequality (15) is suitable for the other $ist_k^i$ ($i = 2, 3, 4$) as well. Thus, let $ist_k^i \in [N_0^i, N_1^i]$ and $\omega_i \, ist_k^i \in [M_0^i, M_1^i]$, where $N_0^i$, $N_1^i$, $M_0^i$, and $M_1^i$ are all integer constants. The range of the dimensionality reduction result can be obtained:
$$\min \{M_0^1, M_0^2, M_0^3, M_0^4\} < \omega^{T} (ist_k^1, ist_k^2, ist_k^3, ist_k^4) < \max \{M_1^1, M_1^2, M_1^3, M_1^4\} \tag{16}$$
For the case that $ist_k^i$ is an interval, the candidate intervals calculated by the dimensionality reducer can be expressed by Equation (17).
$$ICV = \omega^{T} ISR_k \in [\min \, \omega^{T} ist_k^i, \ \max \, \omega^{T} ist_k^i] \tag{17}$$
Thus, the parameter CV in a pair (ICV, CV) is covered by its related ICV in most instances. We choose the minimum and maximum over the combinations of all interval endpoints in ISR to bound the converted interval. Thus, the size relationship of the points or intervals in W is still consistent with that of the original vectors in K [35], as illustrated in Figure 2.
For the convenience of discussion, it is assumed in Figure 2 that the rule vectors SR and ISR have only two dimensions to be reduced. Compared with the linear dimensionality reducer $P_{dr}$, hash algorithms struggle to preserve the size relationship of the original data. However, the reducer of Poliseek can meet the actual demands of interval matching only when the diffuser optimizes the converting vectors.
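The linear projection of this section can be sketched in a few lines. A point sub-rule is reduced by a weighted sum, and an interval sub-rule is reduced by projecting every combination of interval endpoints and keeping the minimum and maximum. The converting vector `omega = (4, 3, 2, 1)` below is an arbitrary value chosen for illustration, not one produced by the diffuser, and the function names are hypothetical; intervals are represented as `(low, high)` tuples.

```python
from itertools import product


def reduce_point(omega, sub_rule):
    """Candidate value: the weighted sum of the four attribute values."""
    return sum(w * x for w, x in zip(omega, sub_rule))


def reduce_interval(omega, interval_sub_rule):
    """Candidate interval: min and max projection over all endpoint combinations."""
    endpoints = [v if isinstance(v, tuple) else (v,) for v in interval_sub_rule]
    projections = [reduce_point(omega, combo) for combo in product(*endpoints)]
    return min(projections), max(projections)


omega = (4, 3, 2, 1)  # illustrative converting vector (assumption)

# Point sub-rule ({1}, {1}, {4}, {9}) from Notation 2.
print(reduce_point(omega, (1, 1, 4, 9)))  # 24

# Interval sub-rule ({2}, {1}, {[1, 3]}, {[1, 9]}) from Equation (10).
print(reduce_interval(omega, (2, 1, (1, 3), (1, 9))))  # (14, 26)
```

Because all components of ω are positive, the minimum is always reached at the lower endpoints and the maximum at the upper endpoints; the exhaustive enumeration is kept here only to mirror the wording of the text.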

5.3. Design and Optimization of Rule Diffuser

The candidate intervals in W generated by the reducer may collide, which causes a secondary search for the matching requests and degrades evaluation performance. Thus, the rule diffuser is designed to handle the case in which the number of colliding candidate intervals is excessive.
(1) Preparation
To simplify the construction of our engine and unify the processing logic, the numerical point values are extended to intervals while calculating ω. For a single point value, st, a small deviation, δ, is chosen as the extension value. Therefore, the original value $st^i$ is transformed into the interval $[st^i - \delta, st^i + \delta]$. The narrow span of this type of candidate interval guarantees that the request value falls into the intended interval during policy matching, thereby directly indicating the evaluation result.
Let $L_{m \times 1}$ and $R_{m \times 1}$ be the vectors containing all the left endpoints $l_{ij}$ and right endpoints $r_{ij}$, where m is the number of rules. Given the operators and functions defined in Section 3 and Nomenclature, one can derive $\zeta^{l}$ and $\zeta^{r}$ from L and R by Equations (18) and (19), respectively, i.e.,
$$\zeta^{l} = T((\omega^{T} \otimes L) \wedge (\omega^{T} \otimes R)) \tag{18}$$
$$\zeta^{r} = T((\omega^{T} \otimes L) \vee (\omega^{T} \otimes R)) \tag{19}$$
where $\otimes$ denotes the row-wise multiplication and $\wedge$, $\vee$ denote the element-wise minimum and maximum defined in Section 3.4, and $\zeta^{l}$ and $\zeta^{r}$ contain all the left and right endpoints of candidate intervals. Variance and kurtosis are often used as indexes of the dispersion of numeric values [36,37]. However, they only consider the numerical distribution characteristics and fail to reflect the overlapping part of the intervals. Given the premise that the values of Subject, Action, Resource, and Condition of each rule and request are independently distributed in probability, the following definitions are given to formalize the candidate interval.
Notation 5. Activating Function. The activating function $\varepsilon(c)$ indicates the state of each point in W and shows whether a point c (CV) in W is covered by candidate intervals (ICV). When a single point is covered by some interval, it can be seen as activated and $\varepsilon(c)$ is set to one. Otherwise, $\varepsilon(c)$ is zero, i.e.,
$$\varepsilon(c) = \begin{cases} 1, & c \ge 0 \\ 0, & c < 0 \end{cases} \tag{20}$$
Notation 6. Overlapping Function. The overlapping function $g_i(c)$ of the ith interval uses $\zeta^{l}$ and $\zeta^{r}$ of each interval to represent the candidate interval in W, i.e.,
$$g_i(c) = \varepsilon(c - \zeta_i^{l}) - \varepsilon(c - \zeta_i^{r}) \tag{21}$$
One can deduce the overlapping function of the identification space from all interval overlapping functions, $g_i(c)$. Thus, the function G(c) is defined as the overlapping function of W to give a unified description of $g_i(c)$. It is formed by all the rules in the dimensionality reduction space as shown in Equation (22).
$$G(c) = \sum_{i=1}^{m} g_i(c) = \varepsilon(c - \zeta_1^{l}) - \varepsilon(c - \zeta_1^{r}) + \cdots + \varepsilon(c - \zeta_m^{l}) - \varepsilon(c - \zeta_m^{r}) \tag{22}$$
Note that G(c) indicates the coverage of all the intervals and the activated value of each point in W. Figure 3 shows an illustrative example.
It can be seen that each interval can be represented by combining two activating functions. The whole axis of W can be directly represented by G(c), which helps to reflect the overall distribution of candidate intervals and to calculate the size of the overlap among them.
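The definitions in Equations (20) to (22) can be illustrated with a minimal Python sketch. The function names and the sample intervals are hypothetical and chosen only for illustration; they are not taken from the Poliseek implementation.

```python
def step(c):
    """Activating function of Equation (20): 1 when c >= 0, otherwise 0."""
    return 1 if c >= 0 else 0


def g(c, left, right):
    """Overlapping function of one candidate interval, Equation (21)."""
    return step(c - left) - step(c - right)


def G(c, intervals):
    """Overlapping function of W, Equation (22): how many intervals cover c."""
    return sum(g(c, left, right) for left, right in intervals)


# Hypothetical candidate intervals (left endpoint, right endpoint) in W.
intervals = [(0.0, 2.0), (1.0, 3.0), (5.0, 6.0)]
coverage = [G(c, intervals) for c in (0.5, 1.5, 4.0, 5.5)]
print(coverage)  # [1, 2, 0, 1]
```

Because the step function equals one at zero, each interval is counted as half-open, [left, right), in this sketch. A value of G(c) larger than one marks a point where candidate intervals collide.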
(2) Construction of the optimization model for converting vectors in the diffuser
To minimize the influence of overlapping intervals on the evaluation performance, it is necessary to minimize the expected activated value of the reduced dimensional results under a specific probability distribution. Thus, we define $f_i(x)$ (i = 1, 2, 3, 4) as the probability density function of $L_i$ in K, and then the probability density function $F_W(t)$ of the candidate values in the dimensionality reduction space can be expressed by Equation (23).
$$F_W(t) = \int_{\omega^{T} x = t} \prod_{i=1}^{4} f_i(x_i)\, dx \tag{23}$$
Access requests possess a specific attribute probability distribution that should conform to $f_i(x)$. Hence, minimizing the expectation of activated values minimizes the probability that a request will fall into an overlapping interval. In summary, the optimization model of our diffuser is given by Equation (24):
$$\min_{\omega}\; E(G(c)) = \int_{-\infty}^{+\infty} F_W(t)\, G(t)\, dt \quad \text{s.t.}\; \omega \omega^{T} = I \tag{24}$$
where I is a 4 × 4 unit matrix and G(c) is an important factor that affects the integrability of the model. Therefore, it is necessary to discuss whether G(c) is Riemann integrable on $[-\infty, +\infty]$. According to the necessary and sufficient condition for Riemann integrability, a function on [a, b] is Riemann integrable if and only if it is bounded and its set of discontinuities has measure zero. Since the overlapping function $G(c) \in [0, m]$ consists of a finite number of interval overlapping functions $g_i(c)$, it has at most 2m points of discontinuity, two for each interval.
Therefore, the optimization model in Equation (24) is integrable on $[-\infty, +\infty]$. Since the numerical integration involved would incur a high computing cost and inaccuracy, we assume that each $f_i(x)$ follows a uniform distribution. Then, the mathematical expectation of G(c) can be calculated by Equation (25).
$$E(G(c)) = \frac{\sum_{k=1}^{4} |\omega_k| \sum_{i=1}^{m} |r_{ik} - l_{ik}|}{\max(\zeta_i^{r}) - \min(\zeta_i^{l})} \tag{25}$$
Equation (25) shows that when $f_i(x)$ is a uniform distribution, the mathematical expectation E(G(c)) can be calculated by dividing the sum of the lengths of all candidate intervals by the range that they span. Our optimization diffuser can then be expressed by Equation (26).
$$\min_{\omega}\; E(G(c)) = \frac{\sum_{k=1}^{4} |\omega_k| \sum_{i=1}^{m} |r_{ik} - l_{ik}|}{\sum_{k=1}^{4} |\omega_k (r_k^{\max} - l_k^{\min})|} \quad \text{s.t.}\; \omega \omega^{T} = I \tag{26}$$
The converting vector ω obtained by the diffuser can generate the parameters CV and ICV with a relatively low overlap probability. The diffuser and the rule dimensionality reducer logically constitute one optimization model, and Algorithm 1 specifies the reducer and diffuser.
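As a rough illustration of Equations (18), (19), and (25), the following Python sketch projects four-dimensional interval rules onto W and computes the expected activated value under the uniform-distribution assumption. The function names and the sample rules are hypothetical, and the sketch assumes that the intervals span a non-zero range in W.

```python
def project_intervals(omega, lefts, rights):
    """Project 4-D interval rules onto W, following Equations (18) and (19).

    Each endpoint is scaled by its weight; the smaller products are summed
    into the left endpoint and the larger ones into the right endpoint, so a
    negative weight cannot flip an interval.
    """
    zeta_l, zeta_r = [], []
    for l_row, r_row in zip(lefts, rights):
        zeta_l.append(sum(min(w * l, w * r)
                          for w, l, r in zip(omega, l_row, r_row)))
        zeta_r.append(sum(max(w * l, w * r)
                          for w, l, r in zip(omega, l_row, r_row)))
    return zeta_l, zeta_r


def expected_activated_value(omega, lefts, rights):
    """Equation (25): total projected interval length over the spanned range."""
    zeta_l, zeta_r = project_intervals(omega, lefts, rights)
    total_length = sum(r - l for l, r in zip(zeta_l, zeta_r))
    return total_length / (max(zeta_r) - min(zeta_l))


# Two hypothetical rules with an interval in every attribute.
lefts = [[0, 0, 0, 0], [4, 4, 4, 4]]
rights = [[2, 2, 2, 2], [6, 6, 6, 6]]

e_pos = expected_activated_value([0.5, 0.5, 0.5, 0.5], lefts, rights)
e_neg = expected_activated_value([-0.5, 0.5, 0.5, 0.5], lefts, rights)
print(e_pos, e_neg)
```

With these sample rules, the all-positive weights keep the two projected intervals apart (a ratio of about 0.67), whereas flipping the sign of one weight makes them fill the whole spanned range (a ratio of 1.0). This is the kind of difference that the diffuser tries to exploit when choosing ω.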
Algorithm 1: Calculating Candidate Results for XACML Rules
Input: NV: numerical vectors of XACML rules (SR’s and ISR’s)
Output: CR: candidate results (CV’s and ICV’s)
begin: Calculating_Candidate_Values(SR)
  for $SR_i$ in NV
    $SR_i$ = extend($SR_i$)
  end for
  L, R, ω = Initialize(NV)
  $\zeta^{l} = T((\omega^{T} \odot L) \wedge (\omega^{T} \odot R))$
  $\zeta^{r} = T((\omega^{T} \odot L) \vee (\omega^{T} \odot R))$
  object = Expected_Activated_Value()
  constraint: $\omega \omega^{T} = I$
  ω = Differential_Evolution(object, constraints, NV)
  for $SR_i$ in NV
    $CV_i = \omega^{T} SR_i$
    CR.join($CV_i$)
  end for
  for $ISR_i$ in NV
    $ICV_i = \omega^{T} ISR_i$
    CR.join($ICV_i$)
  end for
  return CR
end
The differential evolution algorithm [38], a heuristic random search algorithm based on group differences, is applied to optimize the converting vectors. The algorithm uses the differences among the previous generations to derive the next generations without encoding and decoding. The differential evolution algorithm can achieve fast convergence and is easy to implement, which is suitable for various optimization problems [39].
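The optimization step can be sketched with a small, self-contained differential evolution loop in Python. This is only an illustration under stated assumptions: the DE/rand/1/bin variant, the parameter values, the sample intervals, and all function names are hypothetical, and the constraint on ω is read here as a unit-length normalization, which is one possible interpretation.

```python
import math
import random

# Hypothetical rule intervals: per-attribute left and right endpoints.
lefts = [[0, 0, 0, 0], [4, 1, 2, 0], [8, 2, 4, 0]]
rights = [[2, 3, 3, 9], [6, 4, 5, 9], [10, 5, 7, 9]]


def objective(omega):
    """Equation (26): weighted interval lengths over weighted attribute ranges."""
    num = sum(abs(w) * sum(r[k] - l[k] for l, r in zip(lefts, rights))
              for k, w in enumerate(omega))
    den = sum(abs(w) * (max(r[k] for r in rights) - min(l[k] for l in lefts))
              for k, w in enumerate(omega))
    return num / den


def normalize(vec):
    """Scale a vector to unit length (assumed reading of the constraint)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def differential_evolution(objective, dim=4, pop_size=20, generations=100,
                           f=0.8, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin loop; returns the best vector and its score."""
    rng = random.Random(seed)
    pop = [normalize([rng.uniform(-1, 1) for _ in range(dim)])
           for _ in range(pop_size)]
    scores = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene is always mutated
            trial = [
                pop[a][k] + f * (pop[b][k] - pop[c][k])
                if (rng.random() < cr or k == forced) else pop[i][k]
                for k in range(dim)
            ]
            if not any(trial):
                continue  # skip the degenerate all-zero vector
            trial = normalize(trial)
            score = objective(trial)
            if score <= scores[i]:  # greedy selection keeps the better vector
                pop[i], scores[i] = trial, score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


best_omega, best_score = differential_evolution(objective)
print(best_omega, best_score)
```

For these sample intervals the objective cannot drop below 0.6, the smallest per-attribute ratio of total interval length to range, so the returned score indicates how close the search came to that bound.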

6. Fast Policy Evaluation Module

The policy evaluation module receives the candidate values and intervals to determine the match results for PDP. The composition of this module is described in Figure 4.
As can be seen from Figure 4, when a request contains no interval, our module directly compares it with the candidate values. Otherwise, to simplify the matching process involving interval evaluation, the module is implemented with targeted designs, including the cascading of candidate intervals and the construction of candidate lists.

6.1. Cascading Adjacent Candidate Intervals

As a direct match between the RQ and ICV’s in W is inefficient and inaccurate, we devise a technique of cascading the adjacent intervals to cut down the number of comparison operations during the interval match.
Notation 7. Cascading List. A cascading list (CAL) refers to a data structure containing a candidate value and a set of candidate intervals that cover the same value, as expressed by Equation (27).
$$CAL_i := \Big\{ CV_i,\ \bigcup_{j} ICV_j \;\Big|\; CV_i \in ICV_j \Big\} \tag{27}$$
Cascading lists can achieve the fuzzy classification of candidate intervals, such that the module completes a quick search of the potential set of results. Figure 5 shows a general situation of cascading the adjacent ICV. Its purpose is to categorize the candidate intervals according to the distribution characteristics. Then, the complicated relationship among the intervals can become single-layered and more flattened.
Algorithm 2 constructs the CAL. It greatly facilitates policy match evaluation, especially the evaluation process for rules with intervals.
Algorithm 2: Cascading Adjacent Intervals
Input: $CV_i$: candidate value of rules
Output: $CAL_i$: cascading intervals
begin: Cascading_Adjacent_Intervals($CV_i$)
  for $ICV_j$ in all candidate intervals
    if $CV_i \in ICV_j$
      $CAL_i$.insert($ICV_j$)
    end if
  end for
  return $CAL_i$
end
Algorithm 2 traverses each candidate interval and cascades the intervals that cover the same candidate value, such that one obtains the $CAL_i$ of $CV_i$. Since the algorithm traverses all $ICV_j$ in W, its time complexity is O(N), where N is the number of ICV’s in W.
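Algorithm 2 can be written as a short Python function. The sketch below assumes that candidate intervals are stored as closed (left, right) pairs in W; the function name and the sample data are illustrative only.

```python
def cascading_adjacent_intervals(cv, candidate_intervals):
    """Return the cascading list of cv: every interval that covers it.

    One pass over all candidate intervals, hence O(N) comparisons.
    """
    return [(left, right) for left, right in candidate_intervals
            if left <= cv <= right]


# Hypothetical candidate intervals in W.
icvs = [(0.0, 2.0), (1.0, 3.0), (5.0, 6.0)]
cal = cascading_adjacent_intervals(1.5, icvs)
print(cal)  # [(0.0, 2.0), (1.0, 3.0)]
```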

6.2. Policy Evaluation for Access Requests

When there is no interval involved, the comparison between CV’s is straightforward, and our module assures the correctness of the direct match results under such conditions. When it comes to the interval match, the cascading operation in Section 6.1 converts the relationship among candidate intervals into the order relationship of candidate points. Thus, we use the candidate value instead of the interval to represent requests and carry out the evaluation, since the cascading intervals help to efficiently find the target intervals that cover the request value. To quickly locate these potential intervals, it is necessary to first find the closest points to the request candidate value in the evaluation process. Then, the candidate set for the request candidate value can be obtained.

6.2.1. Search Values Using Modified Hash Tables

The evaluation module first searches for the two nearest points in W that lie on the left and right sides, respectively, of the request candidate value. To this end, we categorize the candidate values in W according to their numerical precision, following the idea of bucket sorting [40]. The candidate values are stored in a series of specially constructed hash tables named buckets, and an illustrative example is shown in Figure 6.
It can be seen that the first element of each hash table is the maximum of the next table, and the last element is the minimum of the previous table. An improved table allows the two nearest values to be found by searching only one table. The evaluation module can obtain the particular hash table by calculating the index of the target hash table, and the module can directly access it. Then, a conventional search algorithm, such as a binary search, can be applied to find the nearest points of the value. The search speed using the improved hash tables is much faster than the general searching algorithm due to the relatively small scale of each hash bucket.
Normally, a larger bucket with more values needs more time for the additional search algorithm, whereas a small-scale bucket often results in limited space cost. The search cost of finding the nearest points mainly depends on how precisely the buckets are divided, and the precision of bucket division is determined by the distribution of actual values in W. Hence, the ideal search cost can theoretically reach O(1). Having acquired the neighboring points of the candidate value, one can advance the evaluation process to generate the candidate lists and complete the final search of the interval match.
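The bucket idea can be sketched in Python as follows. This is a simplified illustration, not the data structure of Figure 6: the class name, the fixed bucket width, and the sample values are assumptions. Each bucket keeps one extra neighbour on each side, so both nearest values of a query are found inside a single bucket.

```python
import bisect
import math


class BucketIndex:
    """Fixed-width buckets over sorted candidate values (illustrative)."""

    def __init__(self, values, width):
        self.width = width
        ordered = sorted(set(values))
        self.first = math.floor(ordered[0] / width)
        self.last = math.floor(ordered[-1] / width)
        self.buckets = {}
        for index in range(self.first, self.last + 1):
            lo = bisect.bisect_left(ordered, index * width)
            hi = bisect.bisect_left(ordered, (index + 1) * width)
            # Extend the slice by one value on each side when one exists.
            self.buckets[index] = ordered[max(lo - 1, 0):min(hi + 1, len(ordered))]

    def neighbours(self, query):
        """Return the nearest values (left, right) around query, or None."""
        index = math.floor(query / self.width)
        index = min(max(index, self.first), self.last)  # clamp to known buckets
        bucket = self.buckets[index]
        pos = bisect.bisect_left(bucket, query)  # binary search in one bucket
        left = bucket[pos - 1] if pos > 0 else None
        right = bucket[pos] if pos < len(bucket) else None
        return left, right


# Hypothetical candidate values in W.
index = BucketIndex([0.3, 1.2, 1.7, 4.5, 4.9], width=1.0)
print(index.neighbours(1.5))  # (1.2, 1.7)
print(index.neighbours(3.0))  # (1.7, 4.5)
```

Locating the bucket is a single dictionary access, and the binary search only touches the few values stored in that bucket. In this sketch a query that equals a stored value is returned as the right neighbour.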

6.2.2. Candidate List and Interval Match

The module then obtains the two CAL’s belonging to the two nearest points by cascading the possible intervals. Moreover, if there are intervals that are not cascaded by either of the two points but are situated between them, the module merges these intervals with the two CAL’s to form the candidate list. This union contains the match results, and the evaluation module carries out the request match by directly comparing the request with its entries. Therefore, the search space of the evaluation process is drastically reduced to the relatively small candidate list. A brief sketch of this method is shown in Figure 7.
The construction of the candidate list based on cascading intervals avoids searching all candidate intervals in W, so the evaluation module only needs to work within a small range. Thus, the evaluation process can be completed by comparing the vectors in the candidate list and returning the results.
The policy evaluation module completes Poliseek and achieves fast matching using the well-designed candidate list. Algorithm 3 realizes our evaluation module.
Algorithm 3: Interval Policy Match Algorithm
Input: CVRQ: candidate value of requests
Output: MR: match results
begin: Policy_Evaluation(CVRQ)
  hash_tables = Initialize()
  potential_bucket = hash_tables[CVRQ]
  P1, P2 = Searching_Algorithm(potential_bucket)
  $CAL_{P1}$ = Cascading_Adjacent_Intervals(P1)
  $CAL_{P2}$ = Cascading_Adjacent_Intervals(P2)
  candidate_list = $CAL_{P1} \cup CAL_{P2}$
  for interval in [P1, P2]
    if interval $\notin$ candidate_list
      candidate_list.join(interval)
    end if
  end for
  for potential_interval in candidate_list
    MR = vector_compare(rq, potential_interval)
  end for
  return MR
end
The policy match algorithm obtains the potential bucket of the request candidate value by accessing the index of the hash tables improved by the bucket structure. Then, the evaluation algorithm calls the cascading interval algorithm to calculate $CAL_{P1}$ and $CAL_{P2}$. Our fast policy evaluation algorithm requires fewer comparison operations since there are not many intervals in actual application policies.
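The overall flow of Algorithm 3 can be illustrated with the following Python sketch. It is written under several assumptions: a plain binary search over sorted candidate values stands in for the bucket lookup, each interval rule is stored as a hypothetical (name, icv, box) triple, and the final vector comparison checks the request against the original four attribute ranges of a rule.

```python
import bisect
import math


def policy_evaluation(cv_rq, rq, candidate_values, interval_rules):
    """Sketch of the interval policy match for one request.

    cv_rq: projection of the request in W; rq: its 4-D attribute vector.
    candidate_values: sorted CV's in W.
    interval_rules: (name, icv, box) triples, where icv is (left, right) in W
    and box holds four (low, high) attribute ranges.
    """
    # Nearest candidate values on both sides of the request value.
    pos = bisect.bisect_left(candidate_values, cv_rq)
    p1 = candidate_values[pos - 1] if pos > 0 else -math.inf
    p2 = candidate_values[pos] if pos < len(candidate_values) else math.inf

    candidate_list = []
    for name, icv, box in interval_rules:
        left, right = icv
        cascaded = left <= p1 <= right or left <= p2 <= right  # in a CAL
        between = p1 <= left and right <= p2  # lies between P1 and P2
        if cascaded or between:
            candidate_list.append((name, box))

    # Comparing in the attribute space removes projection collisions.
    return [name for name, box in candidate_list
            if all(low <= x <= high for x, (low, high) in zip(rq, box))]


# Hypothetical rules projected with omega = (0.5, 0.5, 0.5, 0.5).
rules = [
    ("A", (0.0, 4.0), [(0, 2), (0, 2), (0, 2), (0, 2)]),
    ("B", (8.0, 12.0), [(4, 6), (4, 6), (4, 6), (4, 6)]),
    ("C", (0.0, 3.0), [(0, 6), (0, 0), (0, 0), (0, 0)]),
]
cvs = [1.0, 5.0, 9.0]

match = policy_evaluation(2.0, (1, 1, 1, 1), cvs, rules)
print(match)  # ['A']
```

In this example rule C reaches the candidate list because its interval in W covers the left neighbour, but the final comparison in the attribute space rejects it. Any interval that covers the request value either covers one of the two neighbours or lies between them, so no matching rule is skipped.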
However, if the rules themselves happen to be similar, they may share a great number of attribute values. Many rules are then converted to the same CV or ICV, resulting in a rather dense distribution of values in W. Under such circumstances, finding the nearest values may be instant, but determining the result vector in the targeted bucket requires extra comparisons. Thus, the search cost of our algorithm can currently reach O(N). Fortunately, such an extreme condition rarely appears in real-world policies, and our algorithm still has room to approach O(1) performance using perfectly divided hash buckets.

7. Evaluation Experiments and Performance Analysis

In this section, evaluation experiments are designed and conducted to verify the availability of Poliseek and to assess its XACML policy matching efficiency on policy sets of up to 10,000 rules. The experimental results are compared with typical existing schemes, including Sun PDP [27], XEngine [24], and SBA-XACML [14]. The results indicate that our engine is reliable and practical.

7.1. Policy Configuration and Experimental Settings

Three practical policies used in the library management system (LMS) [41], virtual meeting system (VMS) [42], and auction sale management system (ASMS) [43] are employed to simulate the real-world applications. They cover the major access control scenarios in daily life, by which the test results can possess strong credibility. We expand the three policies to 3000, 6000 and 9000, respectively, by constructing a policy generator with multiple random distributions, such that they can be closer to the actual situation. Moreover, we adjust the properties of Default Value and Interval Value manually to explore the performance changes in Poliseek.
Having acquired the appropriate test cases, we implement all modules of Poliseek as a runnable program. The experiments are carried out with Python 3.7 on a Windows-based PC with an i5-4200H CPU at 2.3 GHz and 8 GB RAM. Sun PDP, XEngine, and SBA-XACML are also evaluated under the same conditions. It should be noted that although the conversion of rules or requests is a necessary step in Poliseek, the evaluation duration includes neither preprocessing time nor dimensionality reduction time. These should be measured separately when evaluating the matching performance of this scheme.

7.2. Generation of Test Requests

Claims about the availability and efficiency of Poliseek require a credible test policy set. Thus, the test cases need to be generated comprehensively and formally to make the experimental results solid.
Bertolino et al. [44] propose a systematic request generation framework called XCreate. It applies Context Schema in XML to cover input combinations in various practical situations and generate request skeletons for testing suites. XCreate aims to elaborate the test requests using Context Schema systematically. Bertolino et al. later improve it [45] and propose two strategies for automatically generating requests, namely Incremental XPT and Simple Combinatorial. Incremental XPT optimizes the intermediate request generation and policy-under-test analysis. It defines a new criterion that makes the number of requests more manageable so that the generated requests can be customized better. Simple Combinatorial combines the obtained request attribute values to form potential requests.
Martin et al. [46,47] propose a fault model with implementation framework based on Change-Impact analysis. This framework defines and applies the mutation operator to explore a fault model such that it can construct an efficient scheme to complete the policy generation and selection. The framework also increases the coverage of the test policy.
Our evaluation experiments combine the methods above with randomization to simulate the generation of practical requests. Then the credibility of the experimental results and the feasibility of our engine can be guaranteed.

7.3. Design and Constitution of Policy Evaluation Experiments

Sun PDP [27], XEngine [24], and SBA-XACML [14] are chosen as the reference subjects of the evaluation experiments. Sun PDP is an open-source and widely used policy evaluation engine. As an industry standard, Sun PDP has maintained excellent stability and practicability over the years, but it has performance limitations in dealing with the growing number of access requests. XEngine uses a tree structure to simplify the numeralization of the policy set into a manageable flat dataset, which also achieves good evaluation performance. SBA-XACML is selected because it is a typical scheme using predicate logic, and its hierarchical evaluation algorithm can reduce the match time significantly by partially comparing rules.
Our performance evaluation experiments are conducted on the data sets derived from LMS, VMS, and ASMS. The evaluation time of Poliseek is recorded and compared with other schemes. We will also analyze the effects of the Default Value and the Interval Value on the performance of our model and give results of comprehensive test sets. Additionally, our experiments can be explored to discuss the change in other internal parameters.

7.4. Comparisons and Analyses of Policy Evaluation Results

(1) Evaluation Performance of Single Values
In this section, we observe the experimental results of different schemes and conduct comparison and analysis between our model and others. Figure 8 shows the evaluation time of Sun PDP, XEngine, SBA-XACML, and Poliseek under original LMS, VMS, and ASMS containing no interval values.
Figure 8 demonstrates that
(1) Poliseek maintains the shortest evaluation time in all three policy sets, and Sun PDP spends more than ten times the evaluation time of XEngine and SBA-XACML.
(2) When the number of test requests increases to 10,000, Poliseek only costs approximately 10 ms, but Sun PDP encounters severe performance loss. XEngine and SBA-XACML have relatively steady performances with a time cost of approximately 100 ms.
(3) When the number of test requests reaches 10,000, the evaluation times of Sun PDP, XEngine, and SBA-XACML are 691 times, 11 times, and 10 times longer than that of Poliseek in LMS, 1169 times, 10 times, and 14 times longer in VMS, and 1410 times, 25 times, and 21 times longer in ASMS, respectively. Hence, it can be concluded that Poliseek is far superior to the other schemes in evaluation performance.
(4) When the number of test requests exceeds 10,000, with the increase in the number of requests, the evaluation time of Poliseek is always controlled within 10 ms, which indicates that Poliseek has good stability and can effectively be applied to the request evaluation task of large-scale or even super-large-scale policy sets.
(2) Effect of Intervals on Evaluation Performance
We can adjust the proportion of the rules with intervals in the test set and observe the evaluation time of Poliseek. Given the randomly generated test sets, one can see the evaluation performances under different conditions from Figure 9. As the test set grows, the average evaluation time of Poliseek increases from 2.19 ms to 20.53 ms, an increase of approximately 8.6 ms compared with the results without intervals. Additionally, the linear change in evaluation time suggests that there is no performance bottleneck when Poliseek is fed with real-world data.
Table 3 shows that when there are 20%, 40%, and 60% interval rules in 10,000 rules, Poliseek still maintains smooth and efficient performance, consuming 20.928 ms at most. Apart from the results for 3000 and 4000, for the different sizes of test sets, the evaluation time of Poliseek decreases as the interval rate grows. The change in the interval ratio does not significantly affect the performance of Poliseek.
The number of default and interval values in the four attributes of rules can affect the overall evaluation progress. To achieve a more realistic simulation of the actual situation, we reconstructed the test set by randomly generating 20% of interval values and 20% of default values and 60% of ordinary values. Figure 10 shows the performance of Poliseek under this comprehensive policy set. One can see that Poliseek is still able to maintain relatively stable performance under complex policy sets. When test rules reach 10,000, the evaluation time is still less than 300 ms, which is acceptable for users.
Table 4 shows that with the randomly generated comprehensive test set, when the test requests increase from 1000 to 10,000, the evaluation time of Poliseek ranges between 12.04 ms and 275.9 ms, which is much longer than the results for intervals and single values. It can be argued that the large number of interval search and comparison operations caused by excessive default values makes policy matching more difficult for Poliseek.
(3) Effect of Intervals on Other Aspects
When conducting the evaluation experiment, we realize that the length of hash buckets and the number of potential intervals in CAL can also be affected by the number of intervals. Therefore, this part will analyze and observe the performance of these two indicators to acquire the fluctuations of the internal parameters of Poliseek match module and analyze the performance of our model. The average length of hash buckets reflects the search space of the additional search algorithm required by the policy match algorithm, and the number of intervals in CAL indicates the overlap distribution of the intervals obtained by ω.
Figure 11 shows the average length change of hash buckets under a different number of intervals and test rules. It can be seen that as the number of intervals increases, the average length decreases from approximately 3.05 to 3.01, which suggests that the constructed hash buckets can be well adapted to the actual policy set with intervals. As shown in Figure 9 and Figure 11, the variation of the maximum length is small and the evaluation time does not exceed 20 ms, which indicates that Poliseek is stable and effective.
Figure 12 gives the number of CAL intervals under different rates, where the CAL intervals stand for the required additional search space during the evaluation. Without our engine, all intervals have to be considered potential and to be performed by naive search.
Comparing the size of the two kinds of histograms in Figure 12, it is clear that the policy match algorithm used in Poliseek scales down the search space effectively and brings significant performance enhancement. As the rate of interval increases, the average of CAL intervals gradually grows from approximately 4.99 to 6.39, which is in line with the expected trend. The results also illustrate that the search space of the policy evaluation module can maintain a small scale when Poliseek needs to handle the large and complex policy sets. As shown in Figure 9 and Figure 12, when it comes to 10,000 rules, the maximum number of CAL intervals is no larger than 24. It suggests that the additional storage space required by our engine is affordable and there is no serious code bloat during evaluation.
The above experiments indicate that Poliseek has a tremendous performance advantage compared with Sun PDP, XEngine, and SBA-XACML. Poliseek can process large-scale requests in a short time for more intervals and default values, showing high practicability.

8. Conclusions and Future Work

In this paper, we propose and implement a novel fast policy matching model, termed Poliseek, which can handle requests with intervals correctly and can be evaluated on various real-world [48,49,50] test policies. The design of Poliseek integrates three modules for fast policy match by combining linear dimensionality reduction and effectively minimizing interval collision. Equipped with well-constructed hash tables and a search algorithm, Poliseek can flexibly and systematically implement the conversion of requests and policy sets as well as the targeted candidate interval search.
A comprehensive evaluation experiment for different policy sets shows that with the increase in the number of requests, the evaluation time of Poliseek is always shorter than that of Sun PDP, XEngine, and SBA-XACML. The experimental results indicate that Poliseek can handle up to 10,000 requests, and we will study how larger numbers of requests can be handled by Poliseek in future research. When the number of test requests exceeds 10,000, with the increase in the number of requests, the evaluation time of Poliseek is always controlled within 10 ms, which indicates that Poliseek has good stability and can effectively be applied to the request evaluation task of large-scale policy sets. To best leverage the performance advantages of Poliseek, it is recommended to apply it to policy sets without excessive duplicate values. With the help of technologies such as cloud computing, big data and artificial intelligence [51,52,53], Poliseek can have multiple application scenarios in future Web systems.
Poliseek has certain limitations. As the number of requests increases, the evaluation time grows significantly. We should design a multi-granularity search scheme to improve the conversion results of candidate intervals in the matching algorithm. Since Poliseek has a relatively complicated preprocessing module, we can optimize the numeralization strategy in the preprocessing steps as well as the dimensionality reduction efficiency to increase its scalability and continuity. Note that the linear one-dimensionality of W makes it convenient to divide policy sets. This suggests that there are promising opportunities to parallelize the processing of requests by probability in distributed systems.

Author Contributions

Conceptualization, F.D. and Z.Y.; methodology, X.Z. (Xinrui Zhan); software, X.Z. (Xiaolin Zhang); validation, Y.Z. and Z.Q.; formal analysis, X.Z. (Xinrui Zhan) and C.W.; investigation, C.W.; resources, X.Z. (Xiaolin Zhang); data curation, X.Z. (Xiaolin Zhang); writing—original draft preparation, C.W.; writing—review and editing, F.D. and Z.Y.; visualization, X.Z. (Xiaolin Zhang); supervision, C.W.; project administration, F.D. and Z.Y.; funding acquisition, F.D. and Z.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China under Grants 62273272 and 61873277, and in part by the Natural Science Foundation of Shaanxi Province in China under Grants 2022JM–317 and 2022JQ–606.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This research was supported by the Youth Innovation Team of Shaanxi Universities.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

R: A complex rule vector in an XACML policy
RQ: A user request vector represented by XACML
SR: A sub-rule vector with atomic attribute values
ISR: A sub-rule vector with atomic attribute intervals
CV: A candidate value, a single-point value in projection results obtained by dimensionality reduction
ICV: A candidate interval, an interval value in projection results
$\mathbb{Z}$: An integer set
W: An identification space, a one-dimensional space of the dimensionality reduction results where CV’s and ICV’s are distributed
$|x|$: The absolute value of the variable x
I: A unit matrix
E(x): Calculates the mathematical expectation of a random variable x
f(x): Refers to the probability density function of an attribute value x in request vectors
T(X): Returns an m × 1 vector whose elements are the m row sums of an m × n matrix X
$A \cup B$: Represents the union of sets A and B
$A \cdot B$: Represents the scalar product of two matrices or vectors A and B
$A \odot B$: Represents the operation of multiplying each element in an n × k matrix B by the element at the corresponding position in a 1 × k row vector A; it returns an n × k matrix
$A \wedge B$: Returns a matrix $C_{n\times k}$ whose element $c_{ij}$ is the smaller value between the elements in the same position in n × k matrices A and B
$A \vee B$: Returns a matrix $C_{n\times k}$ whose element $c_{ij}$ is the larger value between the elements in the same position in n × k matrices A and B
$t_{lj}^{i}$: The jth value in the lth category of the ith attribute
$SR_k^{j}$: The jth sub-rule derived from $R_k$
$st_k^{i}$: The ith value in the set $t_{lj}^{i}$ of $R_k$
$L_i$: A set of values for each $st_i$ in N sub-rules

References

  1. Qin, X.; Huang, Y.; Yang, Z.; Li, X. LBAC: A lightweight blockchain-based access control scheme for the internet of things. Inf. Sci. 2020, 554, 222–235. [Google Scholar] [CrossRef]
  2. Deng, F.; Yu, Z.; Zhang, L.; Wang, J.; Feng, K.; Kong, W.; Li, L.; Wu, J. ANNPDP: An Efficient and Stable Evaluation Engine for Large-Scale Policy Sets. IEEE Trans. Serv. Comput. 2020, 15, 1926–1939. [Google Scholar] [CrossRef]
  3. Margheri, A.; Masi, M.; Pugliese, R.; Tiezzi, F. A Rigorous Framework for Specification, Analysis and Enforcement of Access Control Policies. IEEE Trans. Softw. Eng. 2017, 45, 2–33. [Google Scholar] [CrossRef] [Green Version]
  4. Gao, L.; Yan, Z.; Yang, L.T. Game Theoretical Analysis on Acceptance of a Cloud Data Access Control System Based on Reputation. IEEE Trans. Cloud Comput. 2016, 8, 1003–1017. [Google Scholar] [CrossRef]
  5. Dammak, M.; Senouci, S.-M.; Messous, M.A.; Elhdhili, M.H.; Gransart, C. Decentralized Lightweight Group Key Management for Dynamic Access Control in IoT Environments. IEEE Trans. Netw. Serv. Manag. 2020, 17, 1742–1757. [Google Scholar] [CrossRef]
  6. Ning, J.; Cao, Z.; Dong, X.; Liang, K.; Wei, L.; Choo, K.-K.R. CryptCloud+: Secure and Expressive Data Access Control for Cloud Storage. IEEE Trans. Serv. Comput. 2018, 14, 111–124. [Google Scholar] [CrossRef]
  7. Parducci, B.; Lockhart, H.; Rissanen, E. eXtensible Access Control Markup Language (XACML) Version 3.0, OASIS Standard. 2010. Available online: http://docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-os-en.html (accessed on 31 October 2022).
  8. Althumali, H.; Othman, M.; Noordin, N.K.; Hanapi, Z.M. Priority-based load-adaptive preamble separation random access for QoS-differentiated services in 5G networks. J. Netw. Comput. Appl. 2022, 203, 103396. [Google Scholar] [CrossRef]
  9. Lee, B.M.; Yang, H. Energy efficient scheduling and power control of massive MIMO in massive IoT networks. Expert Syst. Appl. 2022, 200, 116920. [Google Scholar] [CrossRef]
  10. Sun, P.J. XACML Policy Evaluation Optimization Research Based on Attribute Weighted Clustering and Statistics Reordering. In Proceedings of the 2017 IEEE International Conference on Information and Automation (ICIA), Macao, China, 18–20 July 2017; pp. 1190–1195. [Google Scholar] [CrossRef]
  11. Ngo, C.; Makkes, M.X.; Demchenko, Y.; de Laat, C. Multi-data-types interval decision diagrams for XACML evaluation engine. In Proceedings of the 2013 Eleventh Annual Conference on Privacy, Security and Trust, Tarragona, Spain, 10–12 July 2013; pp. 257–266. [Google Scholar] [CrossRef]
  12. Ngo, C.; Demchenko, Y.; de Laat, C. Decision Diagrams for XACML Policy Evaluation and Management. Comput. Secur. 2015, 49, 1–16. [Google Scholar] [CrossRef]
  13. Mourad, A.; Tout, H.; Talhi, C.; Otrok, H.; Yahyaoui, H. From model-driven specification to design-level set-based analysis of XACML policies. Comput. Electr. Eng. 2016, 52, 65–79. [Google Scholar] [CrossRef]
  14. Mourad, A.; Jebbaoui, H. SBA-XACML: Set-based approach providing efficient policy decision process for accessing Web services. Expert Syst. Appl. 2015, 42, 165–178. [Google Scholar] [CrossRef]
  15. Vasan, K.K.; Surendiran, B. Dimensionality reduction using Principal Component Analysis for network intrusion detection. Perspect. Sci. 2016, 8, 510–512. [Google Scholar] [CrossRef] [Green Version]
  16. DeCarlo, L.T. On the meaning and use of kurtosis. Psychol. Methods 1997, 2, 292–307.
  17. Xia, X. A conflict detection approach for XACML policies on hierarchical resources. In Proceedings of the IEEE International Conference on Green Computing and Communications, Besancon, France, 20 November 2012; pp. 755–760.
  18. Jebbaoui, H.; Mourad, A.; Otrok, H.; Haraty, R. Semantics-based approach for detecting flaws, conflicts and redundancies in XACML policies. Comput. Electr. Eng. 2015, 44, 91–103.
  19. Deng, F.; Zhang, L.-Y. Elimination of policy conflict to improve the PDP evaluation performance. J. Netw. Comput. Appl. 2017, 80, 45–57.
  20. Deng, F.; Chen, P.; Zhang, L.-Y.; Wang, X.-Q.; Li, S.-D.; Xu, H. Policy Decomposition for Evaluation Performance Improvement of PDP. Math. Probl. Eng. 2014, 2014, 1–14.
  21. Marouf, S.; Shehab, M.; Squicciarini, A.; Sundareswaran, S. Adaptive Reordering and Clustering-Based Framework for Efficient XACML Policy Evaluation. IEEE Trans. Serv. Comput. 2010, 4, 300–313.
  22. Liu, X.; Li, T.; Zhou, Z.; Hu, L. An efficient multi-objective reliability-based design optimization method for structure based on probability and interval hybrid model. Comput. Methods Appl. Mech. Eng. 2022, 392, 114682.
  23. Yang, C.; Wang, Z.; Oh, S.-K.; Pedrycz, W.; Yang, B. Ensemble fuzzy radial basis function neural networks architecture driven with the aid of multi-optimization through clustering techniques and polynomial-based learning. Fuzzy Sets Syst. 2022, 438, 62–83.
  24. Liu, A.X.; Chen, F.; Hwang, J.; Xie, T. Designing Fast and Scalable XACML Policy Evaluation Engines. IEEE Trans. Comput. 2010, 60, 1802–1817.
  25. Liu, T.; Wang, Y. Beyond Scale: An Efficient Framework for Evaluating Web Access Control Policies in the Era of Big Data. In Advances in Information and Computer Security, Proceedings of the 10th International Workshop on Security, IWSEC 2015, Nara, Japan, 26–28 August 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 316–334.
  26. Ros, S.P.; Lischka, M.; Mármol, F.G. Graph-based XACML evaluation. In Proceedings of the ACM Symposium on Access Control Models and Technologies, Newark, NJ, USA, 20 June 2012; pp. 83–92.
  27. Sun’s XACML Implementation. Available online: http://sunxacml.sourceforge.net/ (accessed on 21 November 2022).
  28. Ayache, M.; Erradi, M.; Freisleben, B.; Khoumsi, A. Towards an Efficient Policy Evaluation Process in Multi-Tenancy Cloud Environments. In Proceedings of the 2016 ACM on Cloud Computing Security Workshop, New York, NY, USA, 28 October 2016; pp. 55–59.
  29. Deng, F.; Wang, S.-Y.; Zhang, L.-Y.; Wei, X.-Q.; Yu, J.-P. Establishment of attribute bitmaps for efficient XACML policy evaluation. Knowl.-Based Syst. 2018, 143, 93–101.
  30. Turkmen, F.; Hartog, J.D.; Ranise, S.; Zannone, N. Analysis of XACML policies with SMT. In Proceedings of the International Conference on Principles of Security and Trust, London, UK, 11 April 2015; pp. 115–134.
  31. Turkmen, F.; Demchenko, Y. On the use of SMT solving for XACML policy evaluation. In Proceedings of the International Conference on Cloud Computing Technology and Science, Hong Kong, China, 11 December 2017; pp. 539–544.
  32. Turkmen, F.; Hartog, J.D.; Ranise, S.; Zannone, N. Formal analysis of XACML policies using SMT. Comput. Secur. 2017, 66, 185–203.
  33. Deng, F.; Yu, Z.; Liu, W.; Luo, X.; Fu, Y.; Qiang, B.; Xu, C.; Li, Z. An efficient policy evaluation engine for XACML policy management. Inf. Sci. 2021, 547, 1105–1121.
  34. Fang, Z.; Ren, J.; Marshall, S.; Zhao, H.; Wang, S.; Li, X. Topological optimization of the DenseNet with pretrained-weights inheritance and genetic channel selection. Pattern Recognit. 2020, 109, 107608.
  35. Xie, L.; Yin, M.; Yin, X.; Liu, Y.; Yin, G. Low-Rank Sparse Preserving Projections for Dimensionality Reduction. IEEE Trans. Image Process. 2018, 27, 5261–5274.
  36. Ouyang, M.; Jeon, T.; Sotiras, A.; Peng, Q.; Mishra, V.; Halovanic, C.; Chen, M.; Chalak, L.; Rollins, N.; Roberts, T.P.L.; et al. Differential cortical microstructural maturation in the preterm human brain with diffusion kurtosis and tensor imaging. Proc. Natl. Acad. Sci. USA 2019, 116, 4681–4688.
  37. Ximei, L.; Latif, Z.; Changfeng, W.; Latif, S.; Khan, Z.; Wang, X. Mean-variance-kurtosis hybrid multi-objective portfolio optimization model with a defined investment ratio. J. Eng. Technol. 2018, 6, 293–306.
  38. Storn, R.; Price, K. Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359.
  39. Yu, Z.; Si, Z.; Li, X.; Wang, D.; Song, H. A novel hybrid particle swarm optimization algorithm for path planning of UAVs. IEEE Internet Things J. 2022, 9, 22547–22558.
  40. Manaseer, S.; Hwaitat, A.K.A. Measuring parallel performance of sorting algorithms. Mod. Appl. Sci. 2018, 12, 23–31.
  41. Pretschner, A.; Baudry, B. Test-driven assessment of access control in legacy applications. In Proceedings of the International Conference on Software Testing, Verification, and Validation, Lillehammer, Norway, 9 April 2008; pp. 238–247.
  42. Mouelhi, T.; Fleurey, F.; Baudry, B.; Traon, Y.L. A model-based framework for security policy specification, deployment and testing. In Proceedings of the 11th International Conference on Model Driven Engineering Languages and Systems, Toulouse, France, 28 September 2008; pp. 537–552.
  43. Mouelhi, T.; Le Traon, Y.; Baudry, B. Transforming and Selecting Functional Test Cases for Security Policy Testing. In Proceedings of the 2009 International Conference on Software Testing Verification and Validation, Denver, CO, USA, 1–4 April 2009; pp. 171–180.
  44. Bertolino, A.; Lonetti, F.; Marchetti, E. Systematic XACML request generation for testing purposes. In Proceedings of the 36th EUROMICRO Conference on Software Engineering and Advanced Applications, Washington, DC, USA, 1 September 2010; pp. 3–11.
  45. Bertolino, A.; Daoudagh, S.; Lonetti, F.; Marchetti, E. Automatic XACML requests generation for policy testing. In Proceedings of the IEEE Fifth International Conference on Software Testing, Verification and Validation, Montreal, QC, Canada, 18 April 2012; pp. 842–849.
  46. Martin, E.; Xie, T. A fault model and mutation testing of access control policies. In Proceedings of the 16th International World Wide Web Conference, Banff, AB, Canada, 8 May 2007; pp. 667–676.
  47. Martin, E.; Tao, X. Automated test generation for access control policies via change-impact analysis. In Proceedings of the ICSE 2007 Workshops: Third International Workshop on Software Engineering for Secure Systems, Minneapolis, MN, USA, 20 May 2007; pp. 5–6.
  48. Yu, Z.; Sohail, A.; Jamil, M.; Beg, O.A.; Tavares, J. Hybrid algorithm for the classification of fractal designs and images. Fractals 2022, 30, 1–20.
  49. Sohail, A.; Yu, Z.; Arif, R.; Nutini, A.; Nofal, T. Piecewise differentiation of the fractional order CAR-T cells-SARS-2 virus model. Results Phys. 2022, 33, 1–7.
  50. Yu, Z.; Gao, H.; Wang, D.; Alnuaim, A.; Firdausi, M.; Mostafa, A. SEI2RS malware propagation model considering two infection rates in cyber-physical systems. Phys. A Stat. Mech. Its Appl. 2022, 597, 1–12.
  51. Yu, Z.; Wang, H.; Wang, D.; Li, Z.; Song, H. CGFuzzer: A fuzzing approach based on coverage-guided generative adversarial networks for industrial IoT protocols. IEEE Internet Things J. 2022, 9, 21607–21619.
  52. Yu, Z.; Sohail, A.; Nofal, T.; Tavares, J. Explainability of neural network clustering in interpreting the COVID-19 emergency data. Fractals 2022, 30, 1–10.
  53. Li, H.; Zhang, M.; Chen, D.; Zhang, J.; Meng, Y.; Li, Z. Image Color Rendering Based on Hinge-Cross-Entropy GAN in Internet of Medical Things. Comput. Model. Eng. Sci. 2023, 135, 779–794.
Figure 1. Overall structure of policy evaluation engine Poliseek.
Figure 2. Schematic diagram of linear projection of rule dimensionality reducer.
Figure 3. Illustration of enlightening function in W.
Figure 4. Structure and workflow of fast policy evaluation module of Poliseek.
Figure 5. Schematic diagram of cascading adjacent candidate intervals.
Figure 6. Construction of improved hash tables for quick search.
Figure 7. Brief description of implementing evaluation module using candidate lists.
Figure 8. Evaluation performance and comparison of different schemes in LMS, VMS, and ASMS.
Figure 9. Evaluation performance under different interval rates.
Figure 10. Evaluation performance of modified policy set.
Figure 11. Average Length of Hash Buckets of Different Interval Rates.
Figure 12. Average of CAL Intervals of Different Interval Rates.
Table 1. A Segment of an XACML Policy.
A Typical Example of an XACML Rule
<Policy>
  <Rule Effect="Permit" RuleId="R26">
    <Target>
      <Subject>
        <AttributeValue>Julius Hibbert</AttributeValue>
      </Subject>
      <Resource>
        <AttributeValue>comment</AttributeValue>
      </Resource>
      <Action>
        <AttributeValue>read</AttributeValue>
        <AttributeValue>write</AttributeValue>
      </Action>
    </Target>
    <Condition>
      <AttributeValue>integer ≤ 1</AttributeValue>
    </Condition>
  </Rule>
  <Rule Effect="Permit" RuleId="R27">
    <Target>
      <Subject>
        <AttributeValue>Jessie</AttributeValue>
      </Subject>
      <Resource>
        <AttributeValue>comment</AttributeValue>
      </Resource>
      <Action>
        <AttributeValue><AnyValue/></AttributeValue>
      </Action>
    </Target>
    <Condition>
      <AttributeValue>integer ≤ 6</AttributeValue>
      <AttributeValue>03:23 &lt; time &lt; 04:10</AttributeValue>
    </Condition>
  </Rule>
</Policy>
Table 2. An Example of Numerical Values of Attributes.
Subject | Julius Hibbert: 1 | Jessie: 2
Resource | comment: 1 |
Action | read: 1 | write: 2
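The static encoding illustrated by Tables 1 and 2 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the Poliseek preprocessing module itself: the dictionary `ATTRIBUTE_CODES`, the function `encode_rule`, and the simplified XML string `RULE_R26` are hypothetical names introduced here, the numeric codes are copied from Table 2, and the Condition dimension is omitted because Table 2 does not list codes for it.

```python
import xml.etree.ElementTree as ET

# Numeric codes copied from Table 2; the dictionary layout is an assumption.
ATTRIBUTE_CODES = {
    "Subject": {"Julius Hibbert": 1, "Jessie": 2},
    "Resource": {"comment": 1},
    "Action": {"read": 1, "write": 2},
}

# Simplified, well-formed copy of rule R26 from Table 1 (Condition omitted).
RULE_R26 = """
<Rule Effect="Permit" RuleId="R26">
  <Target>
    <Subject><AttributeValue>Julius Hibbert</AttributeValue></Subject>
    <Resource><AttributeValue>comment</AttributeValue></Resource>
    <Action>
      <AttributeValue>read</AttributeValue>
      <AttributeValue>write</AttributeValue>
    </Action>
  </Target>
</Rule>
"""


def encode_rule(rule_xml: str) -> dict:
    """Map each Target attribute of a rule to the list of its numeric codes."""
    rule = ET.fromstring(rule_xml)
    target = rule.find("Target")
    encoded = {"RuleId": rule.get("RuleId"), "Effect": rule.get("Effect")}
    for name, codes in ATTRIBUTE_CODES.items():
        element = target.find(name)
        values = [v.text.strip() for v in element.findall("AttributeValue")]
        # A rule may list several values per attribute (e.g., read and write).
        encoded[name] = [codes[value] for value in values]
    return encoded


print(encode_rule(RULE_R26))
```

Under these assumptions, rule R26 maps to Subject code 1, Resource code 1, and Action codes 1 and 2. How multi-valued attributes and the Condition element are folded into a single numeric vector is left to the encoding strategy described in the article.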
Table 3. Average Evaluation Time of Different Interval Rates (evaluation time: ms).
Requests with Default Value | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10,000
20% | 2.250 | 4.411 | 6.997 | 8.583 | 11.00 | 12.84 | 14.39 | 16.63 | 18.30 | 20.25
40% | 2.385 | 4.483 | 6.353 | 8.571 | 10.56 | 12.72 | 14.91 | 16.76 | 19.46 | 20.93
60% | 1.937 | 4.400 | 8.001 | 9.706 | 10.78 | 12.35 | 14.52 | 16.46 | 18.41 | 20.43
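As a reading aid for Table 3, the short Python sketch below computes two simple summaries from the tabulated values: the spread (maximum minus minimum) across the three rows in each column, and the ratio between the last and first column of each row. The names `TABLE_3`, `spread_per_column`, and `growth_ratio` are hypothetical and introduced only for this illustration; the numbers are transcribed from Table 3 and no new measurements are implied.

```python
# Values transcribed from Table 3 (evaluation time in ms).
COLUMNS = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000]
TABLE_3 = {
    "20%": [2.250, 4.411, 6.997, 8.583, 11.00, 12.84, 14.39, 16.63, 18.30, 20.25],
    "40%": [2.385, 4.483, 6.353, 8.571, 10.56, 12.72, 14.91, 16.76, 19.46, 20.93],
    "60%": [1.937, 4.400, 8.001, 9.706, 10.78, 12.35, 14.52, 16.46, 18.41, 20.43],
}


def spread_per_column(table: dict) -> list:
    """Return max - min across all rows for every column."""
    return [max(column) - min(column) for column in zip(*table.values())]


def growth_ratio(row: list) -> float:
    """Return the ratio of the last column to the first column of one row."""
    return row[-1] / row[0]


for size, spread in zip(COLUMNS, spread_per_column(TABLE_3)):
    print(f"{size:>6}: spread across rows = {spread:.3f} ms")

for label, row in TABLE_3.items():
    print(f"{label}: last/first = {growth_ratio(row):.2f}")
```

For the 10,000 column, the three rows differ by 0.68 ms, and in the 20% row the time grows by a factor of 9.0 while the column label grows tenfold.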
Table 4. Average Evaluation Time of Comprehensive Access Requests (evaluation time: ms).
Requests with Default Value | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10,000
Poliseek | 12.04 | 69.91 | 103.2 | 136.4 | 167.1 | 198.0 | 213.2 | 254.7 | 263.9 | 275.9
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Deng, F.; Yu, Z.; Zhan, X.; Wang, C.; Zhang, X.; Zhang, Y.; Qin, Z. Poliseek: A Fast XACML Policy Evaluation Engine Using Dimensionality Reduction and Characterized Search. Mathematics 2022, 10, 4530. https://doi.org/10.3390/math10234530