The technology disclosed relates generally to processes or apparatus for increasing a system's extension of protection of system hardware, software, or data from maliciously caused destruction, unauthorized modification, or unauthorized disclosure. The technology disclosed more specifically relates to coalescing redundant or outdated roles within an enterprise to reduce over-provisioning of access, thereby reducing risk exposure. In particular, the technology disclosed utilizes artificial intelligence to process role-associated data to identify similar roles and displays cluster analysis tools for examining and coalescing similar roles.
The subject matter discussed in this section should not be assumed to be prior art merely as a result of its mention in this section. Similarly, a problem mentioned in this section or associated with the subject matter provided as background should not be assumed to have been previously recognized in the prior art. The subject matter in this section merely represents different approaches, which in and of themselves can also correspond to implementations of the claimed technology.
Delegation, revocation, and supervision of user access provisioning through role based access control (RBAC) is intended to streamline identity and access management. RBAC tools are intended to allow for access permissions that appropriately reflect a particular employee's job responsibilities to be provisioned automatically. Ideally, RBAC should result in reduced over-provisioning of access, decreased risk exposure, and expedited identification of inappropriate access. Information security has invested deeply in RBAC; however, very little success has been achieved.
Drawbacks such as human bias introduced in establishing roles and permissions, the large volume of data to be analyzed, and the dynamic nature of the access landscape contribute to failures in RBAC implementation.
An opportunity arises for identifying candidate roles within an enterprise for coalescence by computing a similarity measure between pairs of roles with respect to a particular role association entity and utilizing hierarchical clustering dendrogram insights that allow a user to determine important enterprise-specific role-based access knowledge.
In the drawings, like reference characters generally refer to like parts throughout the different views. Also, the drawings are not necessarily to scale, with an emphasis instead generally being placed upon illustrating the principles of the technology disclosed. In the following description, various implementations of the technology disclosed are described with reference to the following drawings.
The following detailed description is made with reference to the figures. Sample implementations are described to illustrate the technology disclosed, not to limit its scope, which is defined by the claims. Those of ordinary skill in the art will recognize a variety of equivalent variations on the description that follows.
Identity governance and administration encompass delegation, revocation, and supervision of provisioning for user access to databases, file structures, and other computing resources. Administrators assign roles with associated privileges to users. The privileges govern access to data elements and process controls found in databases, file structures, and other secure processes.
Enterprises suffer so-called role explosion due to redundant and outdated roles. This is a widely recognized problem that consultants offer to solve, often with disappointing results. Oracle, for instance, offers the role coalescence tool Oracle Identity Role Intelligence (OIRI). There is much room for improvement. A role-based Autonomous Identity (AI) based access control (AIBAC) system is disclosed in U.S. patent application Ser. No. 17/559,911, which will be assigned to Applicant in due course. The AIBAC system discovers role-based access patterns across the organization and recommends consolidated role structures to reduce enterprise security risks. This application takes a different approach.
To address such issues, the technology disclosed detects candidate roles within a role database for coalescence. Through an interactive display, it allows a user, such as a security administrator, to efficiently identify superfluous roles within the enterprise and improve access security.
A robust role coalescence engine is disclosed. This tool provides readily interpretable role mining results. This technology graphically depicts evaluations of role similarity. Users interact with identified candidates for role coalescence. A user can leverage the similarity evaluations via an interactive data visualization process to specify coalescence.
Roles are compiled with corresponding features such as membership, entitlements, and supervisor. The roles can be compiled either directly, from the enterprise role structure, or from a coalesced role list generated by role mining. Role features can include a membership list comprising all members assigned to a specific role and a list of entitlements or access privileges assigned to a particular role. The method calculates similarity measures for the role features between pairs of roles, generating measures such as Jaccard similarity, Hamming distance, or a Sorensen-Dice coefficient. Similarity measures can be expressed as distances or converted to distances. Clustering is applied.
Role pairs can then be clustered using hierarchical clustering. The hierarchy of clusters can be graphically presented in a cluster visualization such as a dendrogram or circle packing graph, as illustrated in
The cluster visualization provides the security administrator with an interpretable, user-accessible form of data analysis that interactively leads to deeper exploration or coalescence of roles. A user may select a particular leaf or branch within a dendrogram, or an interior or containing circle within a circle packing diagram, to drill down into the particular associated cluster(s). Selection causes display of a plurality of role features for roles in the selected clusters. The interactive display supports user comparison and selection for coalescence of roles that have similar role features.
The disclosure for this approach includes the previously submitted “System for Controlling Access to a Plurality of Target Systems and Applications”, identified above and incorporated by reference.
The following section describes an environment for a role coalescence engine.
System 100 includes devices and systems that facilitate control of access to target systems, including security system 102, access control system 155, role coalescence engine system 165, and target systems one through N 108. Security system 102 facilitates specifying information associated with a user of the enterprise system, such as profile data. Security system 102 is operable by a user associated with the enterprise, such as a security administrator. Exemplary profile data may include biographic information, such as a name, user identity, and address, along with enterprise-specific information such as an employment start date, title, grade level, department, manager name, reporting hierarchy, group, years of experience, physical location, and full-time/part-time designation. Target systems one through N 108 correspond to various computers located throughout the enterprise, configured to perform specific tasks, such as an enterprise resource planning (ERP) system, a customer relationship management (CRM) system, and a supply chain management (SCM) system. Each of the target systems one through N 108 may implement a form of access control to prevent unauthorized access.
Moreover, each of the target systems may host various applications, and each application may have its own form of access control to prevent unauthorized access. As used herein, access to a system and/or an application operating on the system is referred to as an entitlement or privilege. Access control system 155 responds to requests for access, coordinating authentication and consent gathering. Access control system 155 includes model 175, which builds association rules that can be leveraged to create functional roles. Various implementations of the technology disclosed may comprise distinct respective systems, such as an autonomous ID engine for AIBAC or other systems for RBAC. Role coalescence engine system 165 processes a list of roles and associated data, such as the membership or entitlements associated with a particular role, to identify similar roles as candidates for coalescence.
In the interconnection of the elements of system 100, network 145 couples security system 102, role coalescence engine system 165, access control system 155, and target systems one through N 108 in communication. The communication path can be point-to-point over public and/or private networks. Communication can occur over various networks, e.g., private networks, VPN, MPLS circuit, or Internet, and can use appropriate application program interfaces (APIs) and data interchange formats, e.g., REST, JSON, XML, or SOAP. The communications can be encrypted in some implementations of the technology disclosed. In many implementations, the communication is over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN), Session Initiation Protocol (SIP)), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet, inclusive of the mobile Internet, via protocols such as EDGE, 3G, 4G LTE, Wi-Fi, and WiMAX.
Further continuing with the description of system 100, the components of
While system 100 is described herein with reference to particular blocks, it is to be understood that the blocks are defined for convenience of description and are not intended to require a particular physical arrangement of component parts. Further, the blocks need not correspond to physically distinct components. To the extent that physically distinct components are used, connections between components can be wired and/or wireless as desired. The different elements or components can be combined into single software modules and multiple software modules can run on the same hardware.
Terminology in use in this document includes the following. “Entitlement” is a unit of privilege and can be fine-grained or coarse-grained. “Membership” comprises one or more users (“Members”) performing a particular role. “Role Features” associated with a particular role include entitlements, membership, supervisor, certification level, etc. “Assignment” is the relationship between user and entitlement, user and membership, or user with any additional role feature.
In one implementation of the technology disclosed, similarity is measured as a Jaccard similarity index value. In other implementations of the technology disclosed, similarity may be measured by a different distance metric such as a Sorensen-Dice coefficient, Euclidean distance, or Hamming distance. A person skilled in the art will recognize that these distance metrics are listed as educational examples and are not exclusive.
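The set-based measures named above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the function names, the treatment of two empty sets as identical, and the sample membership lists are all assumptions introduced for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 1.0  # assumption: two empty feature sets count as identical
    return len(a & b) / len(union)


def sorensen_dice(a: set, b: set) -> float:
    """Sorensen-Dice coefficient: twice the intersection over the summed sizes."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # same empty-set assumption as above
    return 2 * len(a & b) / total


def to_distance(similarity: float) -> float:
    """Convert a similarity in [0, 1] to a distance in [0, 1]."""
    return 1.0 - similarity


# Hypothetical membership lists for two roles.
analysts = {"alice", "bob", "carol"}
auditors = {"bob", "carol", "dave"}

j = jaccard(analysts, auditors)        # 2 shared members / 4 distinct members
d = sorensen_dice(analysts, auditors)  # 2 * 2 shared / (3 + 3)
print(j, d, to_distance(j))
```

For the same pair of sets the Sorensen-Dice coefficient is never smaller than the Jaccard index, so the choice of measure shifts where a fixed threshold falls.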
Following generation of a similarity measure by the pairwise similarity measure logic 236 for each possible role pair combination within all roles extracted by the role feature extractor 226, role pair clustering logic 246 clusters the plurality of role pairs. While role pair clustering logic 246 is described herein with reference to hierarchical agglomerative clustering, hierarchical divisive clustering can also be used without further altering the components of system 200. Cluster visualization generator 256 generates a dendrogram to visually interpret the clustering results produced by role pair clustering logic 246. A user or security administrator can interact with the dendrogram via user interactive cluster display logic 266. User interactive cluster display logic 266 comprises a user selection receiver 276, a cluster drill down logic 286, and a role feature display generator 296. User selection receiver 276 receives a selection of a particular branch from the user, prompting cluster drill down logic 286 to drill down into the particular branch to observe additional data corresponding to the associated roles. Details and an example are described below.
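Hierarchical agglomerative clustering over pairwise distances can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions that the description does not fix: it uses single linkage, treats individual roles as the leaves, and records each merge with its height so that a dendrogram could be drawn from the result. The function name and sample distances are hypothetical.

```python
from itertools import combinations


def agglomerate(items, distances):
    """Single-linkage agglomerative clustering (linkage choice is an assumption).

    `distances` maps frozenset({x, y}) to the distance between items x and y.
    Returns the merge history as (left_cluster, right_cluster, height) tuples,
    ordered from the most similar merge to the least similar.
    """
    clusters = [frozenset([item]) for item in items]
    merges = []

    def linkage(c1, c2):
        # Single linkage: distance between the closest members of each cluster.
        return min(distances[frozenset((x, y))] for x in c1 for y in c2)

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((left, right, linkage(left, right)))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left | right)
    return merges


# Hypothetical distances (for example, 1 - Jaccard similarity of membership).
roles = ["A", "B", "C"]
dist = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.7,
}
merges = agglomerate(roles, dist)
for left, right, height in merges:
    print(sorted(left), sorted(right), height)
```

The merge heights are the values a dendrogram would plot on its distance axis. A production system would more likely rely on an established clustering library than on this quadratic-per-step loop.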
The discussion now turns to role coalescence engine 165 in greater detail, wherein the similarity measurement and clustering procedures are described further.
A plurality of role pair combinations is obtained from the extracted list of roles. Step 382 routes to step 363, wherein steps 342 and 382 are repeated for at least the role pair combinations whose similarity exceeds a floor or threshold. Once all role pairs have a generated similarity measure, role pairs are clustered by similarity in step 304 to identify similar role pairs. In step 324, clusters are utilized to build a cluster visualization illustrating the hierarchy of clusters. The cluster visualization is shown as part of a graphical user interface (GUI) with options for user interaction with the cluster visualization. Following the generation of a cluster visualization, step 344 makes available user selection of any cluster within the cluster visualization.
Upon user selection of a particular cluster, at a lower or higher level of the cluster hierarchy, step 364 drills down on the selected cluster. Role features associated with roles within the selected cluster are displayed in step 384. In one example, a selected cluster may be expanded into a list of all roles within the cluster. A list of associated features for each role may also be displayed, such as role members, role entitlements, or role supervisor. In another example, a single role within the selected cluster and one or more of the role's associated features are listed. In another example, a subset of two or more roles within the total plurality of roles within the selected cluster is analyzed based on role features. Flow diagram 300A may be executed a single time for a single role feature (e.g., membership) or repeated successively for multiple features (e.g., membership, followed by entitlement).
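The drill-down step can be pictured as a lookup from the selected cluster to the features of each role it contains. The following sketch assumes a simple in-memory dictionary of role features; the role names, feature names, and the `drill_down` helper are hypothetical and stand in for whatever store and display logic an implementation actually uses.

```python
# Hypothetical role feature store keyed by role name.
ROLE_FEATURES = {
    "Buyer": {
        "members": {"ann", "ben"},
        "entitlements": {"erp.po.create"},
        "supervisor": "sam",
    },
    "Purchaser": {
        "members": {"ann", "ben", "cy"},
        "entitlements": {"erp.po.create", "erp.po.approve"},
        "supervisor": "sam",
    },
    "Auditor": {
        "members": {"eve"},
        "entitlements": {"erp.ledger.read"},
        "supervisor": "tia",
    },
}


def drill_down(cluster, features, fields=("members", "entitlements", "supervisor")):
    """Return one display row per role in the selected cluster."""
    rows = []
    for role in sorted(cluster):
        record = features[role]
        row = {"role": role}
        for name in fields:
            row[name] = record.get(name)
        rows.append(row)
    return rows


# The user selects the cluster containing Buyer and Purchaser.
rows = drill_down({"Purchaser", "Buyer"}, ROLE_FEATURES)
for row in rows:
    print(row)

# Restricting `fields` shows a single feature, e.g. membership only.
members_only = drill_down({"Buyer"}, ROLE_FEATURES, fields=("members",))
```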
Next, a series of examples for specific role features are described. While this process may be implemented for a wide range of role features, this application focuses on the implementation of the technology disclosed for similarity measurement of role membership and role entitlements, respectively.
The discussion now turns to a description of data structures corresponding to a particular role within an enterprise.
A role 484 comprises attributes such as membership, entitlements, and management. A person skilled in the art will recognize that additional attributes of a role may exist and a limited number of example attributes are listed for clarity. An organization 485 comprises an attribute respective to organizational units (i.e., departments or teams within an enterprise). A membership list 486 comprises attributes for a plurality of members. An entitlement list 487 comprises attributes for a plurality of entitlements. A management entity 488 comprises attributes for at least one supervisor within the enterprise.
Role entities 484 generally have a many-to-at-least-one relationship with organization entities 485 (i.e., a plurality of roles can belong to a single department, but a particular role can belong to one or more departments). Role entities 484 generally have a many-to-many relationship with both membership lists 486 and entitlement lists 487 (i.e., multiple roles can share the same members and multiple members can share the same role). Role entities 484 generally have a many-to-at-least-one relationship with management entities 488 (i.e., multiple roles can share the same supervisor or a plurality of supervisors).
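One way to picture these entity relationships in code is shown below. This is a sketch under stated assumptions rather than the disclosed data model: the class name, field names, and the `validate` and `roles_by_member` helpers are hypothetical, and sets of strings stand in for the organization, membership, entitlement, and management entities.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role with its associated entities, each modeled as a set of names."""
    name: str
    organizations: set = field(default_factory=set)  # at least one expected
    members: set = field(default_factory=set)        # many-to-many
    entitlements: set = field(default_factory=set)   # many-to-many
    supervisors: set = field(default_factory=set)    # at least one expected

    def validate(self) -> None:
        """Enforce the 'at least one' side of the relationships."""
        if not self.organizations:
            raise ValueError(f"{self.name}: needs at least one organization")
        if not self.supervisors:
            raise ValueError(f"{self.name}: needs at least one supervisor")


def roles_by_member(roles):
    """Invert the membership lists to show the many-to-many relationship."""
    index = {}
    for role in roles:
        for member in role.members:
            index.setdefault(member, set()).add(role.name)
    return index


buyer = Role("Buyer", {"Procurement"}, {"ann", "ben"},
             {"erp.po.create"}, {"sam"})
purchaser = Role("Purchaser", {"Procurement", "Finance"}, {"ben", "cy"},
                 {"erp.po.create", "erp.po.approve"}, {"sam"})
for role in (buyer, purchaser):
    role.validate()

index = roles_by_member([buyer, purchaser])
print(index["ben"])  # ben holds both roles
```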
While entity relationships 400B are described herein with reference to particular entities and relationships, it is to be understood that the entities are defined for convenience of description and are not intended to require a particular relationship between entities. Further, the entities need not correspond to physically distinct components.
Each association entity for a particular role 484 may be analyzed for similarity between a pair of roles. Thus far, the discussion has provided an overview of a variety of similarity measures. Next, these similarity measures will be described in further detail.
Whereas illustrations 500A and 500B contrast similarity measures in terms of set theory, illustration 500C contrasts Euclidean distance with Hamming distance metrics. Real line 506 is a one-dimensional Euclidean space composed of real numbers. In Euclidean space, the distance between number five and number two (or number five and number four) is determined by the absolute distance between data points on the line. In contrast to Euclidean distance, Hamming distance counts the number of positions at which the corresponding values of two equal-length strings differ. Hamming distance is often described as the minimum number of errors (substitutions made within respective positions) needed to transform one string into another. String 506 is equivalent to 5 in binary, and string 526 equals 2 in binary. All three bits of the two strings differ; hence, the Hamming distance 508 is three. String 546 is equal to 4 in binary, and string 566 is equal to 5 in binary. Only one bit differs between the two strings; hence, the Hamming distance 528 is one.
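The two worked Hamming examples can be checked with a short sketch. The helper names are assumptions, and the three-bit width is chosen to match the three-bit strings described above.

```python
def to_bits(value: int, width: int = 3) -> str:
    """Render a non-negative integer as a fixed-width binary string."""
    return format(value, f"0{width}b")


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def euclidean_1d(x: float, y: float) -> float:
    """Distance between two points on the real line."""
    return abs(x - y)


print(to_bits(5), to_bits(2), hamming(to_bits(5), to_bits(2)))  # 101 010 3
print(to_bits(4), to_bits(5), hamming(to_bits(4), to_bits(5)))  # 100 101 1
print(euclidean_1d(5, 2), euclidean_1d(5, 4))                   # 3 1
```

For these particular values the Hamming and Euclidean distances happen to coincide. They diverge in general: 3 (011) and 4 (100) are one unit apart on the real line but differ in all three bit positions.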
Now, an example process is described for the cluster visualization of a particular set of roles. For simplicity, only one form of cluster visualization is shown in the
For intersection 57,
In response to the security administrator 202 selecting the ‘Import Roles’ function, the GUI presents a window for the user to upload a file (e.g., .CSV or .TXT file) comprising a list of roles and associated role features to be imported into identity governance and administration tool 232.
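Parsing an uploaded role file might look like the sketch below. The file layout is entirely an assumption made for illustration: the description does not specify column names or delimiters, so the `role`, `members`, and `entitlements` columns and the semicolon separator inside a cell are hypothetical.

```python
import csv
import io


def parse_roles(text: str) -> dict:
    """Parse CSV text into {role name: {"members": set, "entitlements": set}}.

    Assumed layout: a header row with role, members, entitlements columns,
    where multi-valued cells are separated by semicolons.
    """
    roles = {}
    for row in csv.DictReader(io.StringIO(text)):
        roles[row["role"]] = {
            "members": {m for m in row["members"].split(";") if m},
            "entitlements": {e for e in row["entitlements"].split(";") if e},
        }
    return roles


# Hypothetical upload contents.
SAMPLE = (
    "role,members,entitlements\n"
    "AP Clerk,alice;bob,erp.invoice.read;erp.invoice.create\n"
    "AP Analyst,bob;carol,erp.invoice.read\n"
)

imported = parse_roles(SAMPLE)
for name, features in imported.items():
    print(name, sorted(features["members"]), sorted(features["entitlements"]))
```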
The example GUI is shown in
User interface input devices 838 can include a keyboard; pointing devices such as a mouse, trackball, touchpad, or graphics tablet; a scanner; a touch screen incorporated into the display; audio input devices such as voice recognition systems and microphones; and other types of input devices. In general, use of the term “input device” is intended to include all possible types of devices and ways to input information into computer system 800.
User interface output devices 876 can include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem can include an LED display, a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem can also provide a non-visual display such as audio output devices. In general, use of the term “output device” is intended to include all possible types of devices and ways to output information from computer system 800 to the user or to another machine or computer system.
Storage subsystem 810 stores programming and data constructs that provide the functionality of some or all of the modules and methods described herein. Subsystem 878 can be graphics processing units (GPUs) or field-programmable gate arrays (FPGAs).
Memory subsystem 822 used in the storage subsystem 810 can include a number of memories including a main random-access memory (RAM) 832 for storage of instructions and data during program execution and a read only memory (ROM) 834 in which fixed instructions are stored. A file storage subsystem 836 can provide persistent storage for program and data files, and can include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges. The modules implementing the functionality of certain implementations can be stored by file storage subsystem 836 in the storage subsystem 810, or in other machines accessible by the processor.
Bus subsystem 855 provides a mechanism for letting the various components and subsystems of computer system 800 communicate with each other as intended. Although bus subsystem 855 is shown schematically as a single bus, alternative implementations of the bus subsystem can use multiple busses.
Computer system 800 itself can be of varying types including a personal computer, a portable computer, a workstation, a computer terminal, a network computer, a television, a mainframe, a server farm, a widely distributed set of loosely networked computers, or any other data processing system or user device. Due to the ever changing nature of computers and networks, the description of computer system 800 depicted in
We describe some implementations and features for a role coalescence engine in the following discussion.
One implementation discloses a method for identifying candidate roles to coalesce by effectively ranking similar roles as candidates. Roles are compiled from an identity governance and administration tool, and each respective role includes a plurality of role features. These role features may include attributes such as membership or entitlements. A similarity measure is determined for a pair of roles with respect to a single particular role feature, and the process is repeated iteratively for a plurality of role pairs within the compiled roles. The plurality of role pairs comprises each possible combination of a first and second role within the plurality of compiled roles. The plurality of role pairs is then clustered based on the similarity measure. A cluster visualization can be generated based on the clustered role pairs for display to a user. The displayed cluster visualization has controls for selecting a particular cluster within the cluster visualization. Upon receiving a signal indicating user selection of the particular cluster, the particular cluster is drilled down on to display at least one role feature corresponding to at least one role within the cluster.
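Enumerating every role pair and ranking the pairs by similarity can be sketched as follows. The sketch assumes membership as the single role feature and the Jaccard index as the similarity measure; the function name and sample roles are hypothetical.

```python
from itertools import combinations


def rank_role_pairs(memberships: dict) -> list:
    """Score every unordered role pair and return them most similar first.

    `memberships` maps a role name to the set of members assigned to it.
    Similarity here is the Jaccard index (an assumption for this sketch).
    """
    scored = []
    for first, second in combinations(sorted(memberships), 2):
        a, b = memberships[first], memberships[second]
        union = a | b
        similarity = len(a & b) / len(union) if union else 1.0
        scored.append((first, second, similarity))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical compiled roles.
memberships = {
    "Buyer": {"ann", "ben", "cy"},
    "Purchaser": {"ann", "ben", "cy", "dee"},
    "Auditor": {"eve"},
}

ranked = rank_role_pairs(memberships)
for first, second, similarity in ranked:
    print(f"{first} / {second}: {similarity:.2f}")
```

With n compiled roles there are n(n-1)/2 pairs, so the cost of this step grows quadratically with the number of roles.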
In some implementations, the similarity measure can be computed for additional role features such as supervisor, organizational units, risk level, or session (i.e., a mapping between a user and a set of roles to which the user is assigned in the context of a working time).
The methods described in this section and other sections of the technology disclosed can include one or more of the following features and/or features described in connection with additional methods disclosed. In the interest of conciseness, the combinations of features disclosed in this application are not individually enumerated and are not repeated with each base set of features. The reader will understand how features identified in this method can readily be combined with sets of base features identified as implementations.
One implementation of the method further includes the similarity measure being computed with respect to a membership list comprising members assigned to each respective role, for each respective role pair within the plurality of role pairs. Another implementation of the method further includes the similarity measure being computed with respect to an entitlement list comprising entitlements assigned to a respective role, for each respective role pair within the plurality of roles.
In some implementations of the method, the similarity measure is a Jaccard index. In other implementations, the similarity measure is a Hamming distance. In yet other implementations, the similarity measure is a Sorensen-Dice coefficient. A person skilled in the art will appreciate that these similarity measures are explicitly expressed as examples, and a range of other similarity measures exist that may be implemented within particular embodiments of the technology disclosed.
Some implementations of the method further include hierarchical clustering as the method of role pair clustering. Following clustering of the role pairs, cluster visualization may be implemented as a dendrogram in some embodiments. In this implementation of the method, dendrogram branches correspond to the hierarchy within the hierarchical clustering results. The height of any particular dendrogram branch is proportional to distance, where greater distances in height between branches correspond to greater dissimilarity. Roles below a predetermined distance threshold are coalesced. In some implementations, each dendrogram leaf may be respective to a particular cluster. In another implementation, each dendrogram leaf may be respective to a particular role pair. In yet another implementation, each dendrogram leaf may be respective to a particular role.
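Applying a distance threshold to a dendrogram amounts to keeping only the merges that occur below the threshold and reading off the resulting groups. The sketch below illustrates this with a small union-find structure. It assumes that each dendrogram leaf is a single role and that each merge is recorded as a pair of representative leaves plus a height; the function name, merge format, and sample values are hypothetical.

```python
def cut_dendrogram(leaves, merges, threshold):
    """Return groups of roles joined by merges strictly below `threshold`.

    `merges` is a list of (leaf_a, leaf_b, height) tuples, where leaf_a and
    leaf_b are representative leaves of the two branches being joined.
    Singleton groups are omitted because they offer nothing to coalesce.
    """
    parent = {leaf: leaf for leaf in leaves}

    def find(x):
        # Follow parent links to the group representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for leaf_a, leaf_b, height in merges:
        if height < threshold:
            parent[find(leaf_a)] = find(leaf_b)

    groups = {}
    for leaf in leaves:
        groups.setdefault(find(leaf), set()).add(leaf)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical merge history: Buyer and Purchaser join at height 0.25,
# and Auditor joins that branch only at height 1.0.
leaves = ["Auditor", "Buyer", "Purchaser"]
merges = [("Buyer", "Purchaser", 0.25), ("Auditor", "Buyer", 1.0)]

candidates = cut_dendrogram(leaves, merges, threshold=0.3)
print(candidates)
```

Raising the threshold admits more merges and so proposes larger, looser groups of candidate roles.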
Other implementations of the technology disclosed further include a circle packing graph as the cluster visualization. In this implementation of the method, each circle corresponds to a particular cluster and circle nesting corresponds to the hierarchy within the hierarchical clustering results. Clusters above a predetermined size threshold are coalesced.
Other implementations of the method further include extracting candidate roles from an additional RBAC tool such as AIBAC. Some implementations include alternative clustering methods, such as K-means clustering, mean-shift clustering, Gaussian Mixture Models (GMM), or density-based spatial clustering of applications with noise (DBSCAN).
Other implementations of the disclosed technology described in this section can include a tangible non-transitory computer-readable storage media, including program instructions loaded into memory that, when executed on processors, cause the processors to perform any of the methods described above. Yet another implementation of the disclosed technology described in this section can include a system including memory and one or more processors operable to execute computer instructions, stored in the memory, to perform any of the methods described above.
The preceding description is presented to enable the making and use of the technology disclosed. Various modifications to the disclosed implementations will be apparent, and the general principles defined herein may be applied to other implementations and applications without departing from the spirit and scope of the technology disclosed. Thus, the technology disclosed is not intended to be limited to the implementations shown but is to be accorded the widest scope consistent with the principles and features disclosed herein. The scope of the technology disclosed is defined by the appended claims.
This application is related to the following applications which are incorporated by reference herein for all purposes: U.S. application Ser. No. 17/559,911, titled “Role Mining Proximity Analysis for Improved Role-Based Access Control,” filed 22 Dec. 2021 (Attorney Docket. No. FORG 1016-3) which claims priority to and the benefit of U.S. Application No. 63/270,761 filed 22 Oct. 2021 (Attorney Docket. No. FORG 1016-2) and U.S. Application No. 63/255,319 filed 13 Oct. 2021 (Attorney Docket. No. FORG 1016-1); and U.S. application Ser. No. 15/900,475, titled “System for Controlling Access to a Plurality of Target Systems and Applications,” filed 20 Feb. 2018, now U.S. Pat. No. 10,708,274, issued 7 Jul. 2020 (Attorney Docket. No. FORG 1006-1); and U.S. application Ser. No. 16/016,154, titled “System for Controlling Access to a Plurality of Target Systems and Applications,” now U.S. Pat. No. 10,686,795, issued 16 Jun. 2020 (Attorney Docket. No. FORG 1006-2), which is a continuation in part of U.S. Ser. No. 15/900,475.