The present invention relates to repository management, and in particular, to ways of facilitating integration of user-supplied rules with a repository.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
With the advent of computer systems, various techniques are evolving for storing and organizing electronic information. The most ubiquitous of these is the hierarchical file system. In a hierarchical file system, data is stored as a unit of data referred to as a file. Files are stored in persistent storage, such as a disk system. The files are hierarchically organized; a typical file system has directories arranged in a hierarchy, and documents that are contained in the directories.
Typically a file system is an integrated component of an operating system. The operating system provides functions for managing the files in the file system. For example, an operating system provides utilities for copying files, deleting files, and controlling access to files.
Repository
A repository is a computer system that stores and manages access to “resources”. Specifically, a repository is a combination of integrated software components and an allocation of computational resources, such as memory, disk storage, a computer, and processes on the node for executing the integrated software components on a processor, the combination of the software and computational resources being dedicated to managing storage and access to resources.
A resource is a data source. The term resource encompasses a broad range of kinds of data sources. A resource can be not only a file, but also an XML document, including one stored in a file or stored in the tables of a relational database system. A resource may also be a CGI script that, when executed, dynamically generates data.
Similar to a hierarchical file system, resources in a repository are organized according to a hierarchy referred to herein as a resource hierarchy. Each resource may be located or identified by tracing a “path” through the hierarchy to the resource. For a given resource, a path begins at a root directory and proceeds down a hierarchy of directories to eventually arrive at the directory that contains the resource. A repository may associate more than one path with a resource.
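As an illustration only, path-based lookup in a resource hierarchy can be sketched as follows. The sketch assumes an in-memory tree in which directories are dictionaries; the names `Resource`, `hierarchy`, and `resolve`, as well as the directory names, are hypothetical and do not describe any particular repository implementation. Note that the same resource is reachable through two paths.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for a resource stored in a repository."""
    name: str


report = Resource("q1-report")

# Assumed layout: directories are dicts, leaves are Resource objects.
hierarchy = {
    "docs": {
        "reports": {"q1": report},
        "links": {"latest": report},  # a second path to the same resource
    }
}


def resolve(root, path):
    """Trace a path from the root directory down to a resource."""
    node = root
    for step in path.strip("/").split("/"):
        if not isinstance(node, dict) or step not in node:
            raise KeyError(f"no such resource: {path}")
        node = node[step]
    return node
```

Calling `resolve(hierarchy, "/docs/reports/q1")` and `resolve(hierarchy, "/docs/links/latest")` returns the same object, which mirrors a repository associating more than one path with a resource.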
A repository is typically part of an n-tier system, where the repository is in the first tier and one or more applications are in the outer tier. An application, as the term is used herein, is a unit of software that is configured to interact with and use the functions of a repository. In general, applications are comprised of integrated functions and software modules (e.g. programs comprised of machine executable code or interpretable code, dynamically linked libraries) that perform a set of related functions. The applications are configured to interact with a repository by establishing a connection to the repository through one or more interface components configured for interfacing to the repository. Often, but not necessarily, an application and repository are located on different computers; the connection to the repository includes a network connection to the repository.
Unlike a file system, a repository manages resources based on the content of the resource. For example, for a particular directory, a repository may only allow resources that contain certain types of data to be located within the directory, types such as an XML document, or an XML document that conforms to a particular XML schema.
In addition, a repository may be customized through integration of business rules. Business rules are rules made by users of the repository for how a repository should manage resources on behalf of an application. Business rules include rules for how to respond to repository events, events such as accessing a resource, creating a resource in a particular directory, or moving a resource between directories. For example, a business rule may require that only documents with a particular content (e.g. XML documents, images) can be located at a particular directory. When an XML document is added to a particular directory, a business rule may require shredding the XML document (that is, breaking down the document into its constituent parts, e.g. elements, attributes) and storing a representation of the document in an object-relational database system.
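The shredding step mentioned above can be sketched as follows. This is a minimal illustration, assuming that a flat list of rows is an acceptable stand-in for the representation stored in a database; the function name `shred` and the row layout are hypothetical.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Break an XML document into flat (path, kind, value) rows.

    The row layout is an assumption made for illustration; a real
    repository would choose its own storage representation.
    """
    rows = []

    def walk(element, path):
        # One row for the element itself, with its direct text content.
        rows.append((path, "element", (element.text or "").strip()))
        # One row per attribute of the element.
        for name, value in element.attrib.items():
            rows.append((f"{path}/@{name}", "attribute", value))
        for child in element:
            walk(child, f"{path}/{child.tag}")

    root = ET.fromstring(xml_text)
    walk(root, f"/{root.tag}")
    return rows


rows = shred('<PO id="7"><RequesterID>ACME</RequesterID></PO>')
```

Each row could then be inserted into a table, so that elements and attributes become individually queryable.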
A repository may also integrate a more robust set of functions than is typically available with hierarchical file systems. A repository may provide the ability for versioning of resources, or a categorization engine that categorizes the resources.
Integrating Business Rules and Logic
Business rules are expressed by and/or implemented in business logic. Business logic refers to data, including code and instructions, that describe and/or define business rules and that control how the repository manages resources. Business logic is not native to the repository, but is instead supplied and/or input by users of the repository. Native code or data of a repository typically is developed by vendors of repositories.
There are various ways of integrating business rules and logic. The first is the single-application approach. Under this approach, one application, with unfettered access to a repository, substantially contains all the business logic for the business rules. For example, an application, with unfettered access to a repository, is configured to control what directories may be accessed by users or modules of the application. A drawback to this approach is that it does not facilitate sharing and partitioning access to the repository between multiple applications. Any application using a repository may access the resources of another application—a very undesirable situation from a security point of view.
This approach is also impractical to implement when use of third party applications is desired. It is impractical to customize a third party application in order to implement the business rules of a particular customer.
Another approach is the event-callout approach. Under this approach, a user supplies and registers callback routines that are called when certain repository events occur. The callback routines implement business logic and supply output values that indicate to the repository how the repository should manage or respond to the event. An advantage of this approach is its flexibility; a wide range of policies may be implemented using computer code. A disadvantage of this approach is its inefficiency. It involves a callout to a callback routine, which may reside within an application. Further, the repository, which has knowledge of how a repository is implemented, has no knowledge or control of how business logic is implemented in the callout routines, and is therefore unable to optimize how the repository is organized or optimize implementation and/or execution of business logic. Furthermore, the implementers of callback routines have limited knowledge of the repository design and configuration, and are less able to optimize callback routines for use with the repository.
Based on the foregoing, there is clearly a need for a new way to integrate business rules within a repository.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
A method and apparatus are described for integrating business rules within a repository. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
Techniques are described for integrating business logic through the use of “resource configurations”. A resource configuration is a unit of business logic that is supplied, at least in part, by a user to the repository and is associated with a particular resource, such as a directory. Each resource configuration contains one or more configuration items that each defines and/or expresses one or more business rules for managing a resource associated with a resource configuration.
A resource configuration enables a user to “declaratively” and/or “programmatically” integrate business rules within a repository. A business rule is declaratively integrated within a repository by supplying business logic that conforms to a format or syntax recognized by the repository for expressing business rules and operations for managing resources in the repository. The business logic may take the form of values assigned to or associated with particular attributes recognized by the repository to mean that a resource should be managed in a particular way. The business logic can also take the form of code that conforms to a computer-language-like syntax that specifies instructions for performing operations within a repository. The repository interprets, evaluates, and/or analyzes declaratively provided business logic to carry out the business rule expressed by the business logic.
A business rule is programmatically integrated within a repository by specifying, within a resource configuration, a callback routine and an event that triggers invocation of the callback routine. Such callback routines are referred to as event listeners. An event listener implements the business logic for handling a particular event. The output of an event listener indicates to the repository how the repository should handle the event. Further details of event listeners and how a resource configuration may specify them can be found in the Event Listener application.
After defining the business rules in a resource configuration, a user can submit to the repository a command to associate a resource configuration to a directory or file in the repository. A resource that has been associated with a resource configuration is referred to herein as an associated resource.
Each time a repository operation acts upon a resource, the repository carries out the business rule specified in a configuration item of a resource configuration associated with the resource. For example, a configuration item of a resource configuration associated with a resource may specify a list of users who may access the resource. The configuration item is carried out, when, based on the configuration item, the repository limits access to the resource to those on the list.
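The access-list example above can be sketched as follows. This is a simplified illustration, assuming a resource configuration is represented as a dictionary with a hypothetical `allowed_users` configuration item; the class `AclRepository` and its methods are likewise hypothetical.

```python
class AclRepository:
    """Toy repository that carries out an access-list configuration item."""

    def __init__(self):
        self._content = {}  # path -> resource content
        self._configs = {}  # path -> associated resource configuration

    def put(self, path, content):
        self._content[path] = content

    def associate(self, path, config):
        """Associate a resource configuration with a resource."""
        self._configs[path] = config

    def read(self, path, user):
        """Return the content only if the configuration permits the user."""
        config = self._configs.get(path)
        if config is not None and user not in config["allowed_users"]:
            raise PermissionError(f"{user} may not access {path}")
        return self._content[path]


repo = AclRepository()
repo.put("/DA/DB/DB1", "payload")
repo.associate("/DA/DB/DB1", {"allowed_users": {"alice"}})
```

Here `repo.read("/DA/DB/DB1", "alice")` returns the content, while the same call for a user not on the list raises `PermissionError`. The rule is enforced inside the repository rather than by the calling application.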
Illustrative Repository with Resource Configuration
As shall be described in greater detail, an association between a resource and a resource configuration may be based on the resource's position in the hierarchy of resources. Therefore, a description of terminology used to describe hierarchical relationships in a hierarchy of resources is useful. File DB1 descends from directory DB and directory DA. DB1, however, directly descends from directory DB because there is no other directory in the path between directory DB and file DB1. Therefore, DB1 is referred to herein as a child of, an immediate descendant of, or as immediately descending from directory DB. With respect to both DA and DB, DB1 may also be referred to as a descendant of or as descending from both DA and DB. With respect to file DB1, DB is referred to as an immediate ascendant, while both DA and DB may be referred to as an ascendant of DB1. Note also that DA and DB may be referred to as containing DB1.
A subtree is used to refer to a directory and its descendants, where the directory has an ascendant. For example, the directory DB and its descendants, DB1 and DB2, form a subtree. DB has ascendant DA.
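The terminology above can be made concrete with a short sketch. It assumes the hierarchy is written as slash-separated paths in which DA sits directly under the root, DB is inside DA, and DB1 and DB2 are inside DB; the helper names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed layout of the example hierarchy, expressed as paths.
paths = ["/DA", "/DA/DB", "/DA/DB/DB1", "/DA/DB/DB2"]


def immediate_ascendant(path):
    """The directory that directly contains the resource."""
    return str(PurePosixPath(path).parent)


def ascendants(path):
    """Every directory containing the resource, nearest first (root excluded)."""
    return [str(p) for p in PurePosixPath(path).parents if str(p) != "/"]


def subtree(directory, all_paths):
    """The directory together with all of its descendants."""
    root = PurePosixPath(directory)
    return [
        p for p in all_paths
        if PurePosixPath(p) == root or root in PurePosixPath(p).parents
    ]
```

With these helpers, `immediate_ascendant("/DA/DB/DB1")` yields `/DA/DB`, `ascendants("/DA/DB/DB1")` yields both `/DA/DB` and `/DA`, and `subtree("/DA/DB", paths)` yields DB together with DB1 and DB2.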
Associating Resource Configurations with Resources
Resource configurations may be associated with resources explicitly or through inheritance. To explicitly associate a resource configuration with a resource, a user submits to a repository a command that specifies the association. For example, a user may submit a command that identifies a resource and a resource configuration to associate with the resource, causing the repository to associate the resource with the resource configuration.
A resource may be associated with a resource configuration through “inheritance.” Under inheritance, a resource configuration explicitly associated with a directory identifies one or more resource configurations that are to be inherited by a descendant of the directory. Specifically, a configuration item in the resource configuration explicitly associated with the directory identifies a resource configuration that is to be inherited by descendants; the configuration item can specify more than one resource configuration, allowing descendants to inherit multiple resource configurations. When a descendant resource of the directory is created, the resource is associated with the resource configuration specified by the configuration item. In this way, a resource inherits a resource configuration. Once a resource configuration is inherited by a resource, their association persists even if the resource is moved within the repository hierarchy so that the resource is no longer a descendant of the directory from which the resource configuration was inherited.
According to an embodiment, a resource configuration can be associated with the root directory of a repository. In this case, all the resources in the repository can inherit the same resource configuration that is associated with the root directory.
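Inheritance at creation time, and its persistence across a move, can be sketched as follows. The class `InheritingRepository`, its methods, and the configuration name `rc-a` are hypothetical. Collecting configurations from every ascendant, rather than only the nearest one, is an assumption of this sketch.

```python
from pathlib import PurePosixPath


class InheritingRepository:
    """Toy model of resource configurations inherited on creation."""

    def __init__(self):
        self.default_child_configs = {}  # directory -> configs for descendants
        self.associations = {}           # resource path -> inherited configs

    def set_default_child_configs(self, directory, config_names):
        self.default_child_configs[directory] = list(config_names)

    def create(self, path):
        """Associate a new resource with configs named by its ascendants."""
        inherited = []
        for ascendant in PurePosixPath(path).parents:
            inherited.extend(self.default_child_configs.get(str(ascendant), []))
        self.associations[path] = inherited

    def move(self, old_path, new_path):
        """Moving a resource keeps the associations it already has."""
        self.associations[new_path] = self.associations.pop(old_path)


repo = InheritingRepository()
repo.set_default_child_configs("/DA", ["rc-a"])
repo.create("/DA/DB/DB1")
repo.move("/DA/DB/DB1", "/elsewhere/DB1")
```

After the move, the resource is no longer a descendant of `/DA`, yet `repo.associations["/elsewhere/DB1"]` still lists `rc-a`. Setting default child configurations on `/` would make every newly created resource inherit them.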
In resource configuration 201, the element <defaultChildConfig> is a configuration item for defining a default resource configuration to be inherited by descendants. The <configuration> element specifies that a resource configuration, named ‘/sys/resconfigs/rcl.xml’, is to be inherited by the descendants.
Illustrative Declarative Rules
A resource configuration can be used to declaratively specify a variety of business rules for resources that have been associated with the resource configuration.
A resource configuration may specify constraints. For example, a configuration item may specify that a directory may only contain resources of a particular type.
A resource configuration can declare a default Access Control List (“ACL”). An ACL is a mapping between a list of users and types of access privileges assigned to them. For example, an element <defaultChildACL> specifies a default ACL to be assigned to a descendant when it is created. This consequently controls who can access the resource.
A resource configuration can define default properties for resources for when the resources are created. For example, an element <defaultChildConfig> can specify such default properties.
A configuration item may define a mapping of a file name extension to a Multipurpose Internet Mail Extensions (“MIME”) type. Such mappings are referred to herein as MIME type mappings. The MIME type mappings of a resource configuration are applied to its associated resources. Thus, a resource with a particular file name extension that is mapped to a particular MIME type is treated by the repository as belonging to that MIME type. A MIME type is a data format recognized by the Internet Engineering Task Force. Examples of MIME types include GIF and PostScript.
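A MIME type mapping can be sketched as a simple lookup table. The table contents and the function name `mime_type_for` are assumptions made for illustration; the MIME type strings shown are the registered names for GIF, PostScript, and XML.

```python
from pathlib import PurePosixPath

# Hypothetical MIME type mappings carried by a resource configuration.
mime_mappings = {
    "gif": "image/gif",
    "ps": "application/postscript",
    "xml": "application/xml",
}


def mime_type_for(path, mappings):
    """Return the MIME type mapped to the resource's file name extension."""
    extension = PurePosixPath(path).suffix.lstrip(".").lower()
    return mappings.get(extension)  # None when no mapping applies
```

For example, `mime_type_for("/DA/logo.GIF", mime_mappings)` returns `image/gif`, and a name with no extension returns `None`. Lower-casing the extension is a design choice of this sketch, not something the description requires.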
These examples of what business rules may be made declaratively are illustrative and non-limiting. Therefore, it is understood that the present invention is not limited to the illustrative business rules expressly disclosed herein.
Pre-conditions
Each configuration item may be associated with a precondition. The precondition must be satisfied if the configuration item is to apply. This allows a configuration item to be applied to different types of resources, based on their type, content, and properties. The preconditions allow sets of business rules to be applied to different types of resources, independent of where the resources are located in the repository hierarchy. The preconditions can also be used to control which descendant resources inherit resource configurations from an ascendant directory.
Preconditions may be type-based. For example, satisfying a precondition may require that a resource be a particular type based on, for example, file name extensions, or MIME types, or whether the resource is an instance of a particular XML schema.
A file name extension is a string concatenated to another portion of a file name, and is delimited from the file name by punctuation such as a period. For example, in the file name “text.txt”, the file name extension is “txt”.
Preconditions may be content-based. For example, a precondition may require that a resource have an image resolution of 600 dpi, or that an XML document has a particular attribute value, e.g. that the attribute ‘/PO/RequesterID’ in the XML document equals the value ‘ACME’. According to an embodiment, a precondition identifies a property of an XML document resource using XPath notation. XPath is described in XML Path Language (XPath), version 1.0 (W3C Recommendation 16 Nov. 1999), which is incorporated herein by reference. XPath 2.0 and XQuery 1.0 are described in XQuery 1.0 and XPath 2.0 Full-Text, W3C Working Draft 9 Jul. 2004, which is incorporated herein by reference.
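Type-based and content-based preconditions can be sketched as predicates over a resource's name and content. The function names below are hypothetical. Because Python's `xml.etree.ElementTree` supports only a subset of XPath, the sketch checks the root element name separately and resolves the rest of the path relative to the root; a repository with a full XPath engine would not need this workaround.

```python
import xml.etree.ElementTree as ET


def extension_is(expected):
    """Type-based precondition: the file name extension must match."""
    def check(name, content):
        return "." in name and name.rsplit(".", 1)[1] == expected
    return check


def xml_value_equals(xpath, expected):
    """Content-based precondition on an absolute path like /PO/RequesterID."""
    root_tag, _, rest = xpath.lstrip("/").partition("/")

    def check(name, content):
        try:
            root = ET.fromstring(content)
        except ET.ParseError:
            return False  # not well-formed XML, so the precondition fails
        if root.tag != root_tag:
            return False
        return root.findtext(rest or ".") == expected
    return check


po = "<PO><RequesterID>ACME</RequesterID></PO>"
is_acme_order = xml_value_equals("/PO/RequesterID", "ACME")
```

A configuration item would then apply to a resource only when its precondition returns true, for example `is_acme_order("po.xml", po)`.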
Illustrative Resource Configuration
The element <event-listeners> defines an event listener. The child element <events> contains elements that represent events to trigger the event listener. The element <pre-condition> specifies preconditions that must be satisfied in order to invoke an event listener for an associated resource. In this case, one pre-condition is defined. Element <existnodes> declaratively states that the associated resource must have an attribute ContentType=‘image/gif’, or, in other words, the content type must be a GIF image file.
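The interaction of events, preconditions, and event listeners can be sketched as follows. The `EventListener` class, the `dispatch` function, the event names, and the returned strings are all hypothetical; the sketch uses the registered MIME type `image/gif` as the content type value and represents a resource as a dictionary of properties.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EventListener:
    """Hypothetical in-memory form of an event listener declaration."""
    events: set                            # events that trigger the listener
    precondition: Callable[[dict], bool]   # must hold for the resource
    handler: Callable[[str, dict], str]    # output tells the repository what to do


def dispatch(listeners, event, resource):
    """Invoke each listener whose events and precondition both match."""
    return [
        listener.handler(event, resource)
        for listener in listeners
        if event in listener.events and listener.precondition(resource)
    ]


gif_listener = EventListener(
    events={"create", "delete"},
    precondition=lambda res: res.get("ContentType") == "image/gif",
    handler=lambda event, res: f"allow {event}",
)
```

Dispatching a `create` event for a resource whose `ContentType` is `image/gif` invokes the listener; a resource with another content type, or an event not listed, invokes nothing.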
The approach described herein for integrating business rules combines the flexibility of implementing business rules programmatically with the power of being able to integrate business rules declaratively. Because the repository can interpret and understand declaratively provided business logic, the repository can optimize the way it carries out the business logic and/or organizes the repository. Further, the business logic may be executed more efficiently because a callback routine is not invoked to execute the business logic. Finally, an application may share a repository with another application without having to rely on another application, with unfettered access to the repository, to refrain from violating the security of the application. The security may be enforced internally within the repository through resource configurations provided on behalf of the application.
Computer system 300 may be coupled via bus 302 to a display 312, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 314, including alphanumeric and other keys, is coupled to bus 302 for communicating information and command selections to processor 304. Another type of user input device is cursor control 316, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 304 and for controlling cursor movement on display 312. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
The invention is related to the use of computer system 300 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 300 in response to processor 304 executing one or more sequences of one or more instructions contained in main memory 306. Such instructions may be read into main memory 306 from another computer-readable medium, such as storage device 310. Execution of the sequences of instructions contained in main memory 306 causes processor 304 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 304 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 310. Volatile media includes dynamic memory, such as main memory 306. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 302. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 304 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 300 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 302. Bus 302 carries the data to main memory 306, from which processor 304 retrieves and executes the instructions. The instructions received by main memory 306 may optionally be stored on storage device 310 either before or after execution by processor 304.
Computer system 300 also includes a communication interface 318 coupled to bus 302. Communication interface 318 provides a two-way data communication coupling to a network link 320 that is connected to a local network 322. For example, communication interface 318 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 318 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 318 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 320 typically provides data communication through one or more networks to other data devices. For example, network link 320 may provide a connection through local network 322 to a host computer 324 or to data equipment operated by an Internet Service Provider (ISP) 326. ISP 326 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 328. Local network 322 and Internet 328 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 320 and through communication interface 318, which carry the digital data to and from computer system 300, are exemplary forms of carrier waves transporting the information.
Computer system 300 can send messages and receive data, including program code, through the network(s), network link 320 and communication interface 318. In the Internet example, a server 330 might transmit a requested code for an application program through Internet 328, ISP 326, local network 322 and communication interface 318.
The received code may be executed by processor 304 as it is received, and/or stored in storage device 310, or other non-volatile storage for later execution. In this manner, computer system 300 may obtain application code in the form of a carrier wave.
In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
The present invention is related to U.S. patent application Ser. No. ______, entitled A Repository Event Model in a Database System (Attorney Docket No. 50277-2601), filed on the same day herewith by Thuvan Hoang, et al., the contents of which are incorporated herein by reference, and which is herein referred to as the “Event Listener application”.