Role Minimization Optimization Algorithm Based on Concept Lattice Factor

Abstract: Role-based access control (RBAC) is a widely adopted security model that provides a flexible and scalable approach for managing permissions in various domains. One of the critical challenges in RBAC is the efficient assignment of roles to users while minimizing the number of roles involved. This article presents a novel role minimization optimization algorithm (RMOA) based on the concept lattice factor to address this challenge. The proposed RMOA leverages the concept lattice, a mathematical structure derived from formal concept analysis, to model and analyze the relationships between roles, permissions, and users in an RBAC system. By representing the RBAC system as a concept lattice, the algorithm captures the inherent hierarchy and dependencies among roles and identifies the optimal role assignment configuration. The RMOA operates in two phases: the first phase constructs the concept lattice from the RBAC system's role-permission-user relations, while the second phase performs an optimization process to minimize the number of roles required for access control. It determines the concept lattice factor using the concept lattice interval to discover the minimum set of roles. The optimization process considers both the user-role assignments and the permission-role assignments, ensuring that access requirements are met while reducing role proliferation. Experimental evaluations conducted on diverse RBAC datasets demonstrate the effectiveness of the proposed algorithm. The RMOA achieves significant reductions in the number of roles compared to existing role minimization approaches, while preserving the required access permissions for users. The algorithm's efficiency is also validated by its ability to handle large-scale RBAC systems within reasonable computational time.


Introduction
Role-based access control (RBAC) is a widely used security model that provides a structured approach to managing permissions in various domains [1]. In RBAC systems, users are assigned roles, and roles are associated with specific permissions. However, one of the major challenges in RBAC is assigning roles to users efficiently while minimizing the number of roles involved, which is known as the role mining problem (RMP).
The proliferation of roles in an RBAC system can lead to administrative complexities, increased maintenance efforts, and potential security vulnerabilities. Therefore, there is a need for effective algorithms that can optimize the role assignment process, reducing the number of roles while ensuring that access requirements are met.
The research field of role minimization optimization algorithms based on the concept lattice factor is still relatively limited but growing. The concept of role minimization in RBAC systems has garnered attention due to the challenges posed by role proliferation and its impact on system complexity and security.
Several studies have explored different approaches for role minimization in RBAC systems [2]. Traditional methods often rely on heuristics, graph-based algorithms, or mathematical optimization techniques. However, these approaches may face limitations in terms of computational complexity, scalability, and the ability to handle large-scale RBAC systems.
The introduction of the concept lattice factor as a basis for role minimization algorithms has opened up new possibilities for more efficient and effective solutions. The concept lattice, derived from formal concept analysis, provides a structured framework to capture the relationships between roles, permissions, and users [3]. By leveraging the concept lattice factor, we aim to develop algorithms that can exploit the inherent hierarchy and dependencies among roles to minimize their number.

Related Work
Krra et al. [4] summarized and categorized many methods in recent years to approximate the optimal solutions for role generation and role allocation in access control systems, such as role mining, dynamic user-role assignments, and role refinement.
Role mining was first proposed based on an initial clustering of users who were assigned the same privileges [5]. Basic-RMP [6] finds the smallest set of roles from the user-permission assignments and provides the user-role assignments along with the role permissions.
Role mining algorithms partially automate the construction of an RBAC policy from an access control list (ACL) policy and possibly other information, reducing the cost of migration to RBAC [7]. Xu and Stoller [8] proposed role mining algorithms that can easily be used to optimize a variety of policy quality metrics, including metrics based on policy size, metrics based on the interpretability of the roles with respect to user attribute data, and compound metrics that consider both size and interpretability.
Researchers have found that the role mining problem (RMP), i.e., obtaining a workable set of roles that optimizes the user access mapping, is a well-known NP-hard problem. Polynomial-time approximation algorithms, such as greedy and randomized methods, can be used to obtain a feasible set of roles. For example, Basic-RMP maps to the minimum tiling problem [6] (where each tile corresponds to a role), the minimum biclique cover problem [9] (where each role corresponds to a biclique), and the set cover problem [10] (where each subset corresponds to a role). In Edge-RMP [11], work has been carried out to minimize the administrative burden by optimizing user-role and permission-role assignments. Since Basic-RMP and Edge-RMP are both NP-hard, greedy approximation algorithms have been proposed to optimize the edges (i.e., the user-role assignments (UR) and permission-role assignments (PR)) in RBAC. Ene et al. [12] also introduced fast graph reductions that allow the solution to be recovered from the solution to a problem on a smaller input graph.
An unsupervised role mining method called FastMiner [13] is based on enumerating permission sets under predefined constraints. The Simple Role Mining Algorithm [14] is a heuristic solution for approximating the best set of roles. The user with the fewest privileges becomes the initial entry of the role set, and this selection of the minimum number of permissions is repeated step by step as each user is processed. The role set is then kept up to date by eliminating roles that can be obtained as a union of other roles already inserted into the role set. Li et al. [15] used the operations and resources of permissions as functional information in a role mining algorithm, role mining with functional features (FMiner), to reduce composite roles. The HP Role Minimization Algorithm [7] and Weighted Structural Complexity Optimization [16] are exact variants of RMP because the mined set of roles is fully consistent with the permissions assigned to users. Role mining has also been extended to RBAC extension models such as Temporal RBAC and Generalized Temporal RBAC; this is known as Temporal RMP [17]. Here, role assignments to users and permissions are enabled only for a set of time intervals. In the constrained role miner [18], the proposed role mining algorithm conforms to various constraints while optimizing the assignment of roles to users and permissions.
When the only available information is the user-permission relation, roles with semantic meaning can be discovered based on formal concept lattices [19]. The authors argue that the theory of formal concept analysis provides a solid theoretical foundation for mining roles from the user-permission relation. In [20,21], a dyadic formal context derived from the triadic security context is used to represent role-based access permissions, and attribute exploration from formal concept analysis (FCA) is performed on it. An FCA construction that enriches the incidence relation of a formal context with a set of intervals was used in [22] to investigate lattice-generating interval relations on the context side.
Mathematics 2023, 11, 3047
The existing algorithms mainly group permissions or users, but for role mining, both users and permissions need to be grouped, so it is necessary to find more effective methods for role mining.

Preliminaries
RBAC is an access control model that organizes user permissions based on roles. It simplifies access control management by grouping users with similar access requirements into roles, and then assigning permissions to those roles.
In this paper, we follow the basic definitions in the NIST standard, which is the most widely known formal description of the RBAC model.
The RBAC model contains the following components:
User: An individual or entity that interacts with the system and requires access to resources. Users are assigned roles that define their access rights.
Role: A defined set of permissions that represents a specific job function, responsibility, or level of authority within an organization. Roles are associated with users to determine their access privileges.
Permission: The rights or actions that users are authorized to perform on resources. Permissions are assigned to roles and determine what actions users can take within the system.
User-Role Assignment: The process of associating users with roles based on their job responsibilities, functions, or other attributes. User-role assignments define the roles that each user is authorized to fulfill.
Role-Permission Assignment: The process of associating permissions with roles. Role-permission assignments specify the actions that users in a particular role are authorized to perform on resources [23].
The following definitions formalize the above discussion.
U, R, P: the sets of users, roles, and permissions.
UR ⊆ U × R: a many-to-many user-to-role assignment relation.
RP ⊆ R × P: a many-to-many role-to-permission assignment relation.
UP ⊆ U × P: a many-to-many user-to-permission assignment relation.
Pers(r) = {p ∈ P | (r, p) ∈ RP}: the permission set owned by role r.
PERS(R) = {p ∈ P | ∃r ∈ R, (r, p) ∈ RP}: the permission set owned by the role set R.
Given m users, n permissions, and k roles, the user-role mapping can be represented as an m × k Boolean matrix, where a 1 in cell ij indicates the assignment of role j to user i. Similarly, the role-permission mapping can be represented as a k × n Boolean matrix, where a 1 in cell ij indicates the assignment of permission j to role i. Finally, the user-permission mapping can be represented as an m × n Boolean matrix, where a 1 in cell ij indicates the assignment of permission j to user i.
Definition 1. Role Mining Problem: Given an m × n access control matrix UP, decompose UP into two Boolean matrices UR and RP of sizes m × k and k × n, respectively, such that UP is the Boolean product of UR and RP and k is the smallest among all possible matrix decompositions.
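To make the decomposition concrete, the following minimal sketch checks that a candidate pair (UR, RP) reproduces UP under the Boolean matrix product. The toy matrices and the function name boolean_product are illustrative assumptions, not taken from the paper.

```python
def boolean_product(ur, rp):
    """Boolean matrix product: out[i][j] = OR over roles r of (ur[i][r] AND rp[r][j])."""
    k = len(rp)          # number of roles
    n = len(rp[0])       # number of permissions
    return [
        [int(any(row[r] and rp[r][j] for r in range(k))) for j in range(n)]
        for row in ur
    ]

# Hypothetical toy instance: 3 users, 2 roles, 3 permissions.
UR = [[1, 0],
      [0, 1],
      [1, 1]]
RP = [[1, 1, 0],
      [0, 1, 1]]

UP = boolean_product(UR, RP)
print(UP)
```

Here k = 2 roles suffice; the role mining problem asks for the smallest such k for a given UP.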

Definition 2. A formal context or a dyadic context K is a triple (X, Y, I), where X, called the universe of discourse, is a nonempty and finite set of objects, Y is a nonempty finite set of attributes, and I ⊆ X × Y is a binary relation between X and Y.
Definition 3. For a formal context K, operators ↑: 2^X → 2^Y and ↓: 2^Y → 2^X are defined for every A ⊆ X and B ⊆ Y by A↑ = {y ∈ Y | for each x ∈ A: ⟨x, y⟩ ∈ I} and B↓ = {x ∈ X | for each y ∈ B: ⟨x, y⟩ ∈ I}. The operators ↑ and ↓ are known as concept-forming operators.
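The concept-forming operators can be sketched directly from their definitions. The context below (three users, three permissions) and the function names up and down are hypothetical and chosen only for illustration.

```python
def up(objects, context):
    """A↑: attributes shared by every object in A (all of Y when A is empty)."""
    _, Y, I = context
    return frozenset(y for y in Y if all((x, y) in I for x in objects))

def down(attrs, context):
    """B↓: objects that have every attribute in B (all of X when B is empty)."""
    X, _, I = context
    return frozenset(x for x in X if all((x, y) in I for y in attrs))

# Hypothetical context: users as objects, permissions as attributes.
X = {"u1", "u2", "u3"}
Y = {"a", "b", "c"}
I = {("u1", "a"), ("u1", "b"),
     ("u2", "b"), ("u2", "c"),
     ("u3", "a"), ("u3", "b"), ("u3", "c")}
K = (X, Y, I)

print(sorted(up({"u1", "u3"}, K)))   # permissions common to u1 and u3
print(sorted(down({"a", "b"}, K)))   # users holding both a and b
```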

Definition 4. A formal concept of the context K = (X, Y, I) is a pair (A, B) of A ⊆ X and B ⊆ Y, such that A ↑ = B and B ↓ = A.
We call A the extent and B the intent of the concept (A, B). Formal concepts are naturally ordered by the partial order "≤" using the subconcept-superconcept relation, such that, for any two formal concepts (A_1, B_1) and (A_2, B_2), (A_1, B_1) ≤ (A_2, B_2) if and only if A_1 ⊆ A_2 (equivalently, B_2 ⊆ B_1). The objects and attributes are dual in nature, which forms a Galois connection. This connection exhibits a closure relation among objects and attributes such that, from any set of formal objects, one can identify all the attributes that they have in common.

Definition 5. The collection of all formal concepts of the context K = (X, Y, I) equipped with the subconcept-superconcept partial ordering ≤ is called the concept lattice L(K).
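As an illustration of Definitions 4 and 5, the sketch below enumerates all formal concepts of a small context by closing the object rows under intersection. This is a naive enumeration chosen for brevity, not the lattice construction used in the paper, and the example context is hypothetical.

```python
def all_concepts(X, Y, I):
    """Return every formal concept (extent, intent) of the context (X, Y, I)."""
    row = {x: frozenset(y for y in Y if (x, y) in I) for x in X}
    # Every intent is an intersection of object rows; Y is the empty intersection.
    intents = {frozenset(Y)}
    for x in X:
        intents |= {b & row[x] for b in intents}
    # The extent of an intent B is the set of objects whose row contains B.
    return [(frozenset(x for x in X if b <= row[x]), b) for b in intents]

# Hypothetical context: users as objects, permissions as attributes.
X = {"u1", "u2", "u3"}
Y = {"a", "b", "c"}
I = {("u1", "a"), ("u1", "b"),
     ("u2", "b"), ("u2", "c"),
     ("u3", "a"), ("u3", "b"), ("u3", "c")}

concepts = all_concepts(X, Y, I)
# Sorting by extent size lists subconcepts before their superconcepts.
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

The number of concepts can grow exponentially with the size of the context, so this enumeration is only practical for small examples.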
According to the definitions of RBAC, a formal context K = (U, P, IA) corresponds to an access control matrix, where U is the user set, P is the permission set, and IA represents UP. For u ∈ U and p ∈ P, (u, p) ∈ IA indicates that user u has permission p. Therefore, Table 1 can be used to represent the formal context under the RBAC model.

Proposed Methodology
Since all possible roles can be mined on the concept lattice and concepts correspond one-to-one to roles, finding the minimum set of roles for the access control matrix UP in the role mining problem is equivalent to finding the minimum set of role concepts generated by the concept lattice.
Definition 6. Minimum Role Concept Set: Let K = (U, P, IA), and let S_m be a set of concepts in the concept lattice L(K) generated by the formal context. If S_m satisfies the following two conditions, it is called the minimum role concept set on the access control context K.
Condition 1: The permissions owned by each user in the access control context K can be represented by the union of the intents of several concepts in the concept set S_m.
Condition 2: The number of concepts in the concept set S_m is the smallest.
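The two conditions can be illustrated with a brute-force sketch: covers tests Condition 1 for a candidate set of intents, and minimum_role_set searches candidate sets by increasing size to satisfy Condition 2. The exhaustive search is exponential and is shown only to clarify the definition; it is not the paper's algorithm, and all names and data are illustrative assumptions.

```python
from itertools import combinations

def covers(user_perms, role_intents):
    """Condition 1: every user's permission set is a union of some role intents."""
    for perms in user_perms.values():
        usable = [r for r in role_intents if r <= perms]
        if frozenset().union(*usable) != perms:
            return False
    return True

def minimum_role_set(user_perms, candidate_intents):
    """Condition 2 by exhaustive search: the smallest covering subset of candidates."""
    for k in range(1, len(candidate_intents) + 1):
        for subset in combinations(candidate_intents, k):
            if covers(user_perms, subset):
                return list(subset)
    return None

# Hypothetical user-permission relation and the intents of its concept lattice.
user_perms = {"u1": frozenset("ab"), "u2": frozenset("bc"), "u3": frozenset("abc")}
intents = [frozenset("abc"), frozenset("ab"), frozenset("bc"), frozenset("b")]

best = minimum_role_set(user_perms, intents)
print([sorted(r) for r in best])
```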
In the following discussion, we no longer distinguish between the general formal context and the access control context; both are denoted by K.
Theorem 1. If the concept lattice interval I_ij is nonempty and minimal with respect to ⊆, then I_ij is the concept lattice factor.
Here, a nonempty I_ij is minimal with respect to ⊆ if it does not contain any other nonempty I_i′j′, i.e., I_ij = I_i′j′ whenever I_i′j′ is nonempty and I_i′j′ ⊆ I_ij, for every i′, j′.
Theorem 2. In the formal context K = (U, P, IA), the concept lattice factor is the minimum role concept set.
Proof. We prove that the concept lattice factor satisfies the two conditions for the minimum role concept set. (1) According to Definition 8, concept lattice factors are the concepts included in the minimal intervals, so all concepts in the context K = (U, P, IA) can be represented by the union of their intents. (2) According to Theorem 1, the concept lattice factor, which is minimal with respect to ⊆, satisfies Condition 2.
Theorem 1 and Theorem 2 indicate that the optimal set of roles can be determined by determining the concept lattice factor in context K = (U, P, IA).
We can first calculate all intervals of the context K = (U, P, IA) using the algorithm (Algorithm 1) in reference [24]. For IA ∈ {0, 1}^(n×m), we denote by E(IA) the n × m Boolean matrix given by (E(IA))_ij = 1 iff the interval I_ij is nonempty and minimal with respect to ⊆. G is a collection of possibly overlapping groups of essential 1s, i.e., 1s in E(IA).
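A minimal sketch of computing E(IA) follows. It rests on an assumption that is not stated in this section: the interval I_ij is taken to be the lattice interval between the object concept of user i and the attribute concept of permission j, as in the essential-element approach of [24]. Under that assumption, I_ij is nonempty exactly when IA_ij = 1, and I_i′j′ ⊆ I_ij holds exactly when row i′ ⊆ row i and column j′ ⊆ column j. The function name and the toy matrix are illustrative.

```python
def essential_matrix(ia):
    """E(IA)[i][j] = 1 iff the interval I_ij is nonempty and minimal w.r.t. inclusion.

    ASSUMPTION: I_ij = [object concept of i, attribute concept of j], so that
    I_i'j' is contained in I_ij iff row i' is a subset of row i and
    column j' is a subset of column j (for entries equal to 1).
    """
    n, m = len(ia), len(ia[0])
    rows = [frozenset(j for j in range(m) if ia[i][j]) for i in range(n)]
    cols = [frozenset(i for i in range(n) if ia[i][j]) for j in range(m)]
    ess = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if not ia[i][j]:
                continue  # empty interval
            # Look for a nonempty interval strictly inside I_ij.
            has_smaller = any(
                ia[i2][j2]
                and rows[i2] <= rows[i]
                and cols[j2] <= cols[j]
                and (rows[i2] != rows[i] or cols[j2] != cols[j])
                for i2 in range(n)
                for j2 in range(m)
            )
            ess[i][j] = 0 if has_smaller else 1
    return ess

# Hypothetical 2 x 2 context.
IA = [[1, 1],
      [0, 1]]
E = essential_matrix(IA)
print(E)
```

The quadruple loop repeats one test per pair of entries, which takes O(n²m²) tests in total.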
A concept lattice interval is a set of formal concepts, so we can use a double loop to check whether each set s_i in G = {s_1, ..., s_i, ..., s_j, ..., s_n} is a subset of another set s_j. If it is, then s_i is not one of the sets we are looking for; otherwise, s_i remains a candidate. Finally, for each candidate set s_i, we check whether it is the smallest set, that is, whether there is a set smaller than s_i that can also be a subset of other sets.
Specifically, the algorithm can be implemented as follows (Algorithm 2):
3. For each set s_j and s_i, proceed as follows:
4. If i = j, skip this loop.
... jumps out of the loop.
8. If s_i is not a subset of any set,
9. then s_i is added to the result set R_s.
10. For each set s_i and s_j, proceed as follows:
11. is_minimal = 1 // Initialize the Boolean variable is_minimal to true; it indicates whether s_i is the minimum set.
12. ...
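The first pass described above (skip i = j, leave the inner loop as soon as s_i is found to be a subset of another set, and otherwise add s_i to the result set) can be sketched as follows. The sketch covers only the surviving steps of the listing; the later is_minimal pass is not reproduced. Removing duplicate sets first and using a proper-subset test are assumptions made here so that identical sets do not eliminate each other.

```python
def not_subset_of_others(groups):
    """Keep each set s_i that is not a proper subset of any other set s_j."""
    # ASSUMPTION: duplicates are dropped first, preserving the original order.
    unique = list(dict.fromkeys(frozenset(g) for g in groups))
    result = []
    for i, s_i in enumerate(unique):
        is_subset = False
        for j, s_j in enumerate(unique):
            if i == j:
                continue          # step 4: skip comparing a set with itself
            if s_i < s_j:         # s_i is a proper subset of s_j
                is_subset = True
                break             # jump out of the inner loop
        if not is_subset:
            result.append(s_i)    # steps 8-9: add s_i to the result set
    return result

# Hypothetical collection G of sets of concept identifiers.
G = [{1, 2}, {1}, {2, 3}, {3}, {1, 2}]
R_s = not_subset_of_others(G)
print([sorted(s) for s in R_s])
```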

An Illustrative Example
To demonstrate the effectiveness of our algorithm, we used the example electronic medical record system in reference [25] as a context instance for role mining and semantic assignment, thereby generating role states with semantic meaning and hierarchical structure.
In this example, user positions are divided into two categories: ordinary positions and management positions. Ordinary positions include registrar (1), surgeon (2), physician (3), gynecologist (4), nurse (5), and pharmacist (6). Management positions include surgical director (7), internal medicine director (8), gynecological director (9), medical department head (10), chief nurse (11), pharmacy director (12), and dean (13). Based on the reading and writing of information in various scenarios and the authorized operations for various functions, the permissions used in the system are as follows: reading patient basic information (a), writing patient basic information (b), reading hospitalization information (c), writing hospitalization information (d), reading history records (e), reading diagnostic information (f), reading prescriptions (g), reading nurse reports (h), writing internal medicine history records (i), writing surgical history records (j), writing gynecological history records (k), writing internal medicine diagnostic information (l), writing surgical diagnostic information (m), writing gynecological diagnostic information (n), writing internal medicine prescriptions (o), writing surgical prescriptions (p), writing gynecological prescriptions (q), writing nurse reports (r), physician authorization (s), surgeon authorization (t), gynecologist authorization (u), pharmacist authorization (v), and nurse authorization (w). The attributes used in the department and functional information system are as follows: internal medicine (A), surgery (B), gynecology (C), medication (D), registration (E), diagnosis (F), nursing (G), and director (H). The entire system has 13 types of users, 23 types of permissions, and 8 types of attributes. The correspondence between each type of user and its permissions is listed in Table 2, and the attributes owned by each type of user are listed in Table 3.
Step 1: Construct a user permission concept lattice based on the user permission relationships provided in Table 2, mapping it to candidate role states, as shown in Figure 1.
Step 2: Determine I_ij based on a_ij = 1 and use the algorithm to determine the concept lattice factor. Establish a correspondence between concepts and reduced concepts to obtain the candidate role states for reduction, as shown in Figure 2.
Step 3: Generate a user attribute concept set based on the user attribute relationships provided in Table 3, and sort the generated concept set based on the number of users and permissions to obtain an ordered user attribute concept set.
Step 4: In the concept set, for the extension of the corresponding concept for each role, search for its closest expression in order from top to bottom, and assign semantic meaning to each role. Figures 3 and 4 show the original and minimum roles of the electronic medical record system, respectively.
The role structure produced by the mining algorithm in this article has a simple hierarchy and requires fewer allocation relationships to be added. At the same time, the algorithm in this article uses the nearest neighbor expression of user attributes to assign semantic meaning to roles, which is more accurate than assigning semantic meaning to roles based on their permissions, user functions in the system, and actual positions as in reference [25].
Figure 2. The corresponding role concept lattice factor (nodes marked in red) in Table 2.

Experimental Results
We conducted an experimental study to evaluate our proposed method. The ideal method for evaluating the accuracy of role mining is to use real-world user permission data. However, obtaining such data is extremely difficult, especially those containing complete RBAC states. Therefore, most role mining algorithms use synthesized user permission data as input for evaluation [26]. Similarly, we prepared our input dataset based on the template in reference [27].
To evaluate the performance of our algorithm, we implemented it in Java and ran the program on the synthetic datasets. Our experimental platform was a personal computer with an Intel(R) Core(TM) i5 CPU and 16 GB of memory.
In this study, we conducted experiments and analysis on five different datasets, as shown in Table 4. We used the program shown in Algorithm 3 [28] to prepare the datasets. First, we defined a set of roles based on the above template. Then, we created multiple users and randomly assigned them to each role, specifying the maximum number of users for any given role. Finally, we set the user permissions based on the roles assigned to each user. Our goal is to achieve a 100% reconstruction rate.
Figure 5 illustrates the number of original roles used for preparing the datasets against the number of extracted roles. The numbers of original roles and extracted roles are indicated by red and blue bars, respectively. Notably, the number of extracted roles is close to the number of original roles across the datasets, indicating that our approach is very close to the optimal solution. More specifically, the number of extracted roles is identical to the number of original roles for Dataset1, i.e., the small-scale dataset. For large datasets, the number of extracted roles is slightly lower than the original number. This is because the concept lattice factor completely eliminates concepts whose intents can be expressed as a union of the intents of other concepts.
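The data preparation steps described above can be sketched as follows. This is not Algorithm 3 from [28], which is not reproduced here; the role template, parameter names, and the way the reconstruction rate is computed (the fraction of user-permission pairs reproduced by roles contained in each user's permission set) are assumptions made for illustration.

```python
import random

def make_dataset(roles, n_users, max_users_per_role, seed=0):
    """Randomly assign users to each role, then derive user permissions from roles."""
    rng = random.Random(seed)
    ur = {u: set() for u in range(n_users)}
    for name in sorted(roles):
        size = rng.randint(1, min(max_users_per_role, n_users))
        for u in rng.sample(range(n_users), size):
            ur[u].add(name)
    up = {u: set().union(*(roles[r] for r in assigned)) for u, assigned in ur.items()}
    return ur, up

def reconstruction_rate(up, role_perms):
    """Fraction of user-permission pairs covered by roles that fit inside each user's permissions."""
    total = covered = 0
    for perms in up.values():
        usable = [r for r in role_perms if r <= perms]
        covered += len(set().union(*usable))
        total += len(perms)
    return covered / total if total else 1.0

# Hypothetical role template.
roles = {"clerk": {"a", "b"}, "doctor": {"a", "c", "d"}, "nurse": {"c", "e"}}
ur, up = make_dataset(roles, n_users=10, max_users_per_role=4, seed=1)
print(reconstruction_rate(up, list(roles.values())))
```

With the original roles the rate is 1.0 by construction; a mined role set can be evaluated in the same way.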

Time Complexity
Consider first Algorithm 1. It first computes E(IA), which can be done in O(n²m²) time, since it suffices to repeat the test for each of the nm entries of IA.
Our role minimization optimization algorithm is based on the concept lattice factor, that is, on a factorization of the formal context matrix. A good factorization algorithm computes a factorization of the input matrix IA using a reasonably small number of factors in such a way that the first factors have reasonably good coverage, i.e., they explain a large portion of the data. For this purpose, Radim et al. [24] employed the following function of A ∈ {0, 1}^(n×l) and B ∈ {0, 1}^(l×m), representing the coverage quality of the first l factors delivered by a particular algorithm: c = 1 − E(IA, A ∘ B) / ‖IA‖. In their comparison of factorization algorithms, their algorithm achieves the highest coverage by the first few factors on all datasets, providing the best, almost exact factorizations.
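The coverage function can be sketched as below, assuming that E(IA, A ∘ B) counts the entries in which IA and the Boolean product A ∘ B differ and that ‖IA‖ is the number of 1s in IA; these readings follow common usage in Boolean matrix factorization and are not spelled out in this section. The matrices are a hypothetical toy example.

```python
def boolean_product(a, b):
    """Boolean matrix product of an n x l matrix and an l x m matrix."""
    l, m = len(b), len(b[0])
    return [[int(any(row[f] and b[f][j] for f in range(l))) for j in range(m)] for row in a]

def coverage(ia, a, b):
    """c = 1 - E(IA, A o B) / ||IA||, with E counting differing entries (assumed)."""
    prod = boolean_product(a, b)
    error = sum(x != y for r1, r2 in zip(ia, prod) for x, y in zip(r1, r2))
    ones = sum(map(sum, ia))
    return 1 - error / ones

IA = [[1, 1, 0],
      [0, 1, 1],
      [1, 1, 1]]
# Coverage using only the first factor, then both factors.
c1 = coverage(IA, [[1], [0], [1]], [[1, 1, 0]])
c2 = coverage(IA, [[1, 0], [0, 1], [1, 1]], [[1, 1, 0], [0, 1, 1]])
print(c1, c2)
```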

Conclusions and Future Work
This paper proposes to use the operations and resources of permissions as functional information in role mining and presents a new role mining approach that can reduce composite roles. Our algorithm has two main steps. First, we use formal concept analysis to generate an initial RBAC state in which each permission belongs to only one role. Second, we optimize this RBAC state based on the concept lattice factor, considering both the user-role assignments and the permission-role assignments, ensuring that access requirements are met while reducing role proliferation.
The algorithm demonstrates effectiveness in handling various optimization tasks by reducing the dimensionality of the problem through concept lattice factorization. By identifying and utilizing the inherent relationships and dependencies among variables, it can efficiently explore the solution space and converge towards optimal or near-optimal solutions.
Our approach is purely data-driven, as all performance metrics are directly associated with the inherent features of the dataset. With this approach, we can quickly set the right goal for role mining before actually running any role mining algorithms.
However, there are areas for further improvement and future work. Firstly, the algorithm's performance could be evaluated and compared against existing state-of-the-art optimization algorithms to assess its competitiveness and scalability. Additionally, conducting comprehensive experimental studies on various benchmark problems and real-world applications would help validate its effectiveness and generalizability.
Furthermore, exploring ways to enhance the algorithm's robustness to handle noisy or uncertain data would be valuable. Investigating the algorithm's behavior on large-scale problems and developing strategies to scale it up effectively would also be beneficial.
Overall, the role minimization optimization by concept lattice factor presents a novel approach to optimization that shows promise [29]. Continued research and development could lead to further advancements, making it a valuable tool for solving complex optimization problems in various domains.
Funding: This work was partially supported by the public welfare industry project of Zhejiang Science and Technology Department (LGG18F020012).