This description relates to databases, which are organized collections of data, and database management systems that allow the definition, creation, querying, update, and administration of databases.
In computing systems, databases may be thought of as electronic filing systems for organizing and storing data.
A relational database is a database that has a collection of tables of data items, all of which may be formally described and organized according to a relational model. In the relational model of a database, all data may be represented in terms of tuples, grouped into relations. Data in a single table of tuples may represent a relation. Further, the tables in a database may have additionally defined relations with each other. In contrast to a relational database, an object database (also object-oriented database management system) is a database in which information is represented in the form of objects as used in object-oriented programming.
Business or other computing systems may have databases, which include information that can be retrieved by any of a number of users. In a database management system, queries are a primary mechanism for retrieving information from a database. A query directed to a database may consist of questions presented to the database in a predefined format. Many relational database management systems use, for example, the Structured Query Language (SQL) standard query format.
A query directed to a database can be simple query by which data for a single data relationship may be retrieved from the database. An example of a simple query to a “books” database (which contains a table of book titles and authors, and a table of book titles and prices) may be: What are the titles of books with a price greater than $100? Alternatively, a query can be an “aggregate” query by which several data relationships or inter-relationships may be retrieved. An example of an aggregate query to the books database may be: What are the titles of books and what are the numbers of authors associated with each book? Another example of an aggregate query may be: What are the names of authors and what are the average prices of the books for each author?
An aggregate relation or relationship may be viewed as a relationship of relations in the data. To speed up responses to aggregate query responses, some databases may include predefined or pre-computed aggregate relationships in addition to single relationships between data items. The pre-computed aggregate relationships may, for example, enable quick query access to reports in a business data warehouse.
Consideration is now being given to database security and access control. In particular attention is directed to access control techniques for placing selective restriction on user-access to pre-computed aggregate relationships in a database.
In relational database theory, a relation may be about a ‘relationship’ that holds between data values in different domains. Each relationship may be expressed as a tuple, and a relation defined as a set of such tuples (e.g., (d1, d2, . . . , dn), where each element dj is a member of Dj, a data domain). In a computer database, a relation may be represented as a “table” (e.g., in row/column format). A tuple (e.g., a row of data items in a table) may be a set of attribute values, each attribute being identified by its name (e.g., column heading) and not by any ordinal position. A tuple by itself (e.g., a set of one tuple) may be viewed as a single-row table. For convenience in description herein, the term “relation” (or “relationship”) may be used interchangeably to refer to a table (e.g., the relationship between columns within the table), a tuple (e.g., a row in the table) or tuples. The term may also refer to relationships (e.g., links) between tables.
In a general aspect, a computer-implemented method involves constituting an aggregate relation of one or more explicitly-defined relations or relationships (e.g., one or more tuples) in a computer database. Each of the explicit tuples may be associated with a respective security label under a label-based access control scheme for the computer database. The computer-implemented method may further involve computing a security label for the aggregate relation based on the security labels of the one or more explicit relations constituting the aggregate relation.
In a further aspect, the label-based access control scheme for the computer database includes a lattice-based access control model in which objects and subjects of the computer database are represented as elements of a lattice. The method further involves computing an infimum (lattice operator ) of the labels of the explicit relations in each alternate combination of the explicit relations that imply the aggregate relation and computing a supremum (lattice operator ) of the infimums of all the alternate combinations that imply the aggregate relation to arrive at the label for the aggregate relation.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.
Systems and methods for access control (e.g., read access control) of aggregate relations (“aggregates”) in computer database(s) are described herein. Using the systems and methods disclosed herein, the aggregates may be associated with labels that define visibility (e.g., for read access) of the aggregates as a function of user context. In an example business computing scenario, user context may be modeled, for example, by an algebraic structure based on user characteristics or parameters (e.g., user role, job description, experience level, level of detail, etc.). Restricting user access to authorized groups or sets of users defined by user context may encourage use of pre-computed aggregates in the database(s) for query responses. A pre-computed aggregate may be used to respond to a query from any user in an authorized group or set of users. A need to re-compute the aggregate for query responses, user-by-user, can be avoided.
The computer database(s) containing the aggregates may, for example, include one or more geographically decentralized or autonomous databases that are connected or interlinked by a computer network.
The systems and methods for access control described herein may configure and present the computer database(s) (e.g., computer database 100) as a restricted access database (e.g., restricted access database 170) to users. The computer database(s) containing the aggregates may be transparently mapped (e.g., by DBMS 140) into a single restricted access database without actual or physical data integration. The restricted access database may mask data in the computer database(s) so that only a subset of the data appears to exist, without actually segregating data into different tables, schemas, or databases, etc. In an example business implementation, a restricted access database may constrain users (e.g., business sites, departments, individuals, etc.) to operate only on their own records and at the same time allowing more privileged users and operations (e.g. reports, data warehousing, etc.) to access, for example, to all records in the databases.
In an example configuration of the computer database(s) containing the aggregates as a restricted access database, a label-based access control (or rule-based access control) scheme may be deployed, in accordance with the principles of the present disclosure. In the label-based access control scheme, security labels (attached to data items) may be used to control access to data items (e.g., who has read access and who has write access to individual rows in a table in the database). The labels may be organized in a hierarchy of access levels. Privileges of different levels of access may be granted (e.g., by an access control administrator) to different users. The access control administrator may use any criteria to determine which users are granted which level of access (and record the privileges, for example, in an access control matrix available to DBMS 140). Access to data labeled at a certain level may be restricted and given only to users who have been granted that level of access or higher.
A lattice-based access control (LBAC) model, which may be used in the label-based access control scheme (which may also abbreviated as “LBAC”), involves multiple objects (e.g., data items, resources, computers, and applications) and/or subjects (e.g., individuals, groups or organizations) of the database. Access to objects may be determined by the interactions or relationships between any combination of labelled objects and labelled users or subjects. A lattice (“LBAC interaction lattice”) may be used to mathematically define or structure the interactions or relationships between various objects and subjects and the levels of security that an object may have and that a subject may have access to. A subject may be allowed to access an object only if the security level of the subject is greater than or equal to that of the object. The LBAC interaction lattice may have an algebraic structure representing a hierarchy of labels/access levels and may be recorded in an access control matrix available to DBMS 140.
In the example configuration of the computer database(s) containing the aggregates as a restricted access database, each of the relations, explicit or aggregate, in the computer database(s) may be assigned a label or tag (“relation access level label”) that is indicative of an access level required to access (e.g., read access) the relation under the LBAC. Further, users of the computer database(s) may be categorized into one or more user contexts. Each user context may be assigned a label or tag (“user access level label”) indicative a highest access level granted to users in that user context under the LBAC. Database query answering engine 130 may interpose an adjustable mask or filter (e.g., mask 150) between users and the computer database(s) to expose only a subset of data in the computer database(s) to users. Database query answering engine 130 may, for example, use a mask 150 to control exposure of an aggregate relation to a user. Database query answering engine 130 may compare the aggregate relation access level label and the user access level label of the user context/user, with reference to the LBAC interaction lattice. If the access level granted to the user context/user is higher or equal to the access level of the aggregate relation, database query answering engine 130 may configure mask 150 to permit user access to the aggregate relation (
For purposes of illustration of a label comparison process of database query answering engine 130,
LBAC scheme 300 may involve configuring computer database 100 as a database (e.g., restricted access database 170) by assigning respective labels to both explicit relations and pre-computed aggregate relations in the database and user contexts, and using a mask or filter (e.g., mask 150) to expose only relations having certain labels in a user context. The labels used in LBAC scheme 300 for labelling the explicit relations, aggregate relations and user contexts may, for example, be color labels based on the colors of Venn diagram 200 discussed above (
User contexts in LBAC scheme 300 may, for example, be based or defined by user roles (e.g., context A:user role=Customer, context B:user role=Development engineer, context C:user role=Customer service engineer, etc.). For access control of the labelled relations in database 100, each user context or role may be assigned color labels that are associated with respective color sub-masks or filters in mask 150. For example, contexts A, B and C may assigned labels cyan 31, yellow 32 and white 33 that may be associated with a cyan filter 21, a yellow filter 22, and a white filter 23, respectively, in mask 150.
The color of the sub-mask or filter may determine, by label color, which relations can be passed through or exposed in a user context. Venn diagram 200 (
Similarly, in user context B, yellow filter 22 may expose only the relations that have a green label (not shown), a yellow label (e.g., yellow 12) or a red label (e.g., red 11) but block relations that have a blue label (not shown) or a blue color component in the color of the label (e.g., cyan 13). Further, in context C, white filter 23 may expose the relations having any color label (e.g., yellow 12, red 11 and cyan 13).
Computer databases (e.g., computer database 100,
Further, an example of textual code representation (LA) of set of aggregate relations 420 in LDB 400 may be
For LDB 400, user contexts A and B (as discussed with reference to
Under the LBAC scheme, a user (e.g., u 501) may be associated with a yellow label indicating a grant of an l3 (yellow)-access level (based, e.g., on user context). With reference to
A further aspect of the disclosure herein relates to a method for pre-computing aggregate relations for a labeled database (e.g., LDB 600) containing labelled explicit relations (e.g. explicit relations r1, r2, r3, r4, and r5) and labeling the pre-computed aggregate relations for inclusion in or association with the labeled database. The explicit relations in the database may have been labelled according to a LBAC scheme. The aggregate relations may be computed or assembled, for example, by a database query answering engine (e.g., database query answering engine 140,
The operation of method 700 may be understood by application of method 700 to label example aggregate relations a1, a2, and a3 (below), which may have been pre-computed by database query answering engine 140, for example, from explicit relations r1, r2, r3, r4, and r5 in LDB 600.
a1: classifiedAs(ecoCalculatorV1, ServiceWithComingPricelncrease)
a2: classifedAs(ecoCalculatorV1, ServiceWithHighCustomerNr)
a3: classifiedAs(ecoCalculatorV1, LowProfitService)
a4: classifiedAs(ecoCalculatorV1, HighPerformanceService)
Label for Aggregate Relation a1
With reference to the labelled explicit relations r1, r2, r3, r4 and r5 (
Since the infimum () of cyan, yellow and black colors is white (lattice 500 and Venn diagram 200), and the infimum of green, red and black colors is yellow, the expression for label a1 may be simplified to label a1=(white) (yellow).
Further, since the supremum of white and yellow colors is yellow, the expression for the label al may be further simplified as label a1=yellow.
Labels for Aggregates a2 and a3
With reference to the labelled explicit relations r1, r2, r3, r4 and r5 (
Since the infimum () of yellow and black colors is yellow and the infimum of red and black colors is red, the expressions for labels a2 and a3 may be simplified to
Method 700 may further include associating the labelled aggregate relations (e.g., a1 (yellow label), a2 (yellow label), and a3 (red label) with the LDB (e.g., LDB 600). The LDB including the labeled aggregate relations may be configured as a restricted access database (e.g., restricted access database 170,
The various infrastructure, systems, techniques, and methods described herein may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The implementations may be a computer program product, i.e., a computer program tangibly embodied in an information carrier (e.g., in a machine readable storage device or non-transitory medium), for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). A computer program, such as the computer program(s) described above, can be written in any form of programming language, including compiled or interpreted languages, and can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
Method steps may be performed by one or more programmable processors executing a computer program to perform functions by operating on input data and generating output. Method steps also may be performed by, and an apparatus may be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. Elements of a computer may include at least one processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also may include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. Information carriers suitable for embodying computer program instructions and data include all forms of non volatile memory, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in special purpose logic circuitry.
To provide for interaction with a user, implementations may be implemented on a computer having a display device, e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
Implementations may be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation, or any combination of such back end, middleware, or front end components. Components may be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
While certain features of the described implementations have been illustrated as described herein, many modifications, substitutions, changes and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the embodiments.