The field of invention relates generally to relational database management and, more specifically, to weak-reference-based eviction of persistent data from cache.
Relational databases are used to define relationships between items of persistent data. For example,
The relational database entries of
Notions of “navigability” come into play in the design of a relational database. Navigability defines the ordered flow in which elements of data within a relational database can be accessed. For example, according to the simplistic relational database entries observed in
Unidirectional relationships 101, 102 enforce the above policy in which information can be obtained in a first direction of object access flow but not in a second. In a typical application, the Customer A object would include information that defines the unidirectional relationship 101 to Order 1 but the Order 1 object would not include any such information (i.e., only the Customer A object has information that corresponds to relationship 101); and, the Customer B object would include information that defines the unidirectional relationship 102 to Order 2 but the Order 2 object would not include any such information (i.e., only the Customer B object has information that corresponds to relationship 102).
An artifact of the information that defines a relationship is a “reference”. In an object oriented environment, a reference is information that allows a “pointed to” object to be identified from a “source” object. Thus, an artifact of the information that defines relationship 101 would be a reference that allows the Order 1 object to be identified from the source Customer A object. One embodiment of a reference is the identification of a location in memory where the pointed to object is found. The source object includes or calls upon the reference in order to find the pointed to object. References can frequently be viewed as basic features having other uses beyond relational databases (such as a function call where a source object employs a reference to use a method contained by the pointed to object).
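The notion of a unidirectional relationship implemented by a reference can be illustrated with a minimal sketch. The class names `Customer` and `Order`, their fields, and the variable names are hypothetical and chosen only for illustration; the sketch assumes that the source object simply holds the pointed-to object in a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # Hypothetical pointed-to object; it holds no reference back to a customer.
    order_id: int


@dataclass
class Customer:
    # Hypothetical source object; its list of orders holds the references
    # that implement the unidirectional Customer -> Order relationship.
    name: str
    orders: List[Order] = field(default_factory=list)


order_1 = Order(order_id=1)
customer_a = Customer(name="A", orders=[order_1])

# Navigation works in the direction of the relationship...
reachable_from_customer = customer_a.orders[0] is order_1
# ...but the Order object carries nothing that identifies its customer.
order_knows_customer = hasattr(order_1, "customer")
```

In this sketch only the source object carries the information that defines the relationship, so the order can be found from the customer but not the reverse.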
A method is described in which a reference to an item of persistent data is established because the item of persistent data is cached. The reference is maintained whether or not the item of persistent data is used by an application. The reference is also maintained whether or not the item of persistent data is referred to by another reference, where the other reference implements a relational database relationship. The method includes removing the item of persistent data from the cache because the item of persistent data was only referred to by the reference.
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
a shows a collection of cached objects of persistent data that are associated through relational database relationships and that are weakly referred to by a cache object;
b shows an unused cached object of persistent data that is only weakly referred to by the cache object;
a through 5d shows a process by which a collection of cached objects that are associated through relational database relationships are evicted from the cache as a consequence of their not being used;
A persistent data management service 204 can be viewed, at least in one instance, as a “faster” database because it utilizes storage elements (e.g., semiconductor memory) that are integrated closer to the processors that execute the application software than the actual database 202 itself. These proximate storage elements are generally referred to as a cache 205. Here, the application 203 uses the persistent data management service 204 largely as it would an external database (e.g., requesting database entries and updating database entries).
The persistent data management service 204 manages cached database entries and communicates with the external database 202 to read database entries from database 202 into cache 205 and to write database entries from cache 205 to database 202. Because the function of the persistent data management service 204 is heavily involved with cached information, for illustrative convenience, cache 205 is drawn as being part of the persistent data management service 204 itself.
An issue with the persistent data management service's use of the cache 205 is the presence in the cache of database entries that are no longer being used (or at least have not been used for some time and/or are not expected to be used for some time). Because the cache 205 has limited storage resources, populating the cache with database entries that are not being used results in inefficiency if those database entries that are being used cannot be entered into cache 205. As such, some mechanism must exist for “cleansing” the cache 205 of the unused database entries.
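The role of a persistent data management service that sits between an application and an external database can be sketched as follows. This is an illustration under stated assumptions, not the described implementation: the class name `PersistentDataService`, its methods, and the use of plain dictionaries for the database and the cache are all hypothetical.

```python
class PersistentDataService:
    """Hypothetical stand-in for a persistent data management service."""

    def __init__(self, database):
        self._database = database  # stands in for the external database
        self._cache = {}           # stands in for the cache
        self.database_reads = 0    # counts loads from the external database

    def read(self, key):
        # Serve from the cache when possible; otherwise load from the database.
        if key not in self._cache:
            self._cache[key] = self._database[key]
            self.database_reads += 1
        return self._cache[key]

    def write(self, key, value):
        # Update the cached entry and write it through to the database.
        self._cache[key] = value
        self._database[key] = value

    def cached_keys(self):
        return set(self._cache)


database = {"customer_a": {"name": "A"}, "order_1": {"total": 10}}
service = PersistentDataService(database)
service.read("customer_a")
service.read("customer_a")  # the second read is served from the cache
service.write("order_1", {"total": 12})
```

Nothing in this sketch ever removes an entry from the cache, which is exactly the situation that calls for a mechanism to clear out unused entries.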
a relates to a method for cleaning out the cache of its unused database entries.
As such, each of the cached database entries 312 through 316 may be viewed as at least one object of an entity bean (or, more generally, as at least one object within an object oriented software environment). For simplicity, the discussion of
a shows these references schematically as “weak” references 322, 323, 324, 325, 326 to cached persistent data objects 312, 315, 313, 316, 314, respectively. The term “weak” reference is to be contrasted against the term “strong” or “hard” reference. Here, the existence of a “weak” reference only signifies that a persistent data object exists in cache 305 while a “hard” reference signifies not only that a persistent data object exists in the cache 305 but also that it is being used from the cache 305 (e.g., by an application).
The distinction is pertinent because, as will be addressed more fully below, each persistent data object in cache 305 that is only referred to by a weak reference (i.e., the object is not referred to by a hard reference) can be removed from the cache (because without a hard reference it is not being used); while each persistent data object that is referred to by at least one hard reference should remain in the cache because it is being used. Thus, the weak vs. hard reference distinction can be used as a criterion for deciding whether to keep a persistent data object in the cache 305 or to remove it from the cache 305.
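The weak versus hard reference criterion can be illustrated, by analogy, with the weak references of the Python runtime. This is a sketch under assumptions: the class `PersistentObject`, the keys, and the variable names are hypothetical, and a `weakref.WeakValueDictionary` merely stands in for a cache object that weakly refers to its entries. On CPython an object is reclaimed as soon as its last hard reference is dropped; the explicit `gc.collect()` call is included for runtimes that reclaim objects later.

```python
import gc
import weakref


class PersistentObject:
    # Hypothetical cached object of persistent data.
    def __init__(self, key):
        self.key = key


# The cache refers to its entries only weakly.
cache = weakref.WeakValueDictionary()

used_entry = PersistentObject("used")      # the application keeps using this one
unused_entry = PersistentObject("unused")  # the application will drop this one
cache[used_entry.key] = used_entry
cache[unused_entry.key] = unused_entry

# Drop the only hard reference; the cache's weak reference does not keep
# the object alive, so the entry disappears from the cache.
del unused_entry
gc.collect()

remaining = sorted(cache.keys())
```

The entry that is still held by a hard reference stays in the cache, while the entry that was only weakly referred to is removed.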
In
It will be appreciated by those of ordinary skill that a “bean” is a “component” within a Java Beans environment. Component based architectures are well-known in the art and are discussed in more detail at the end of this detailed description.
Because reference 317 is deemed a hard reference, persistent data object 312 would not be removed from the cache 305 if the methodology of
b shows the same situation as in
a shows the situation of
From
Because persistent data objects 313 and 315 are removed, so are the hard references that flowed from them to persistent data objects 314 and 316, respectively. This leaves persistent data objects 314 and 316 exposed as being pointed to only by a weak reference.
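The cascading effect described above, in which removing an object also removes the hard references that flowed from it, can be sketched with an explicit model of the references. The function name `evict_unused`, the object names, and the representation of hard references as (source, target) pairs are assumptions made for illustration only.

```python
def evict_unused(weakly_cached, hard_refs):
    """Repeatedly remove cached objects that no hard reference points to.

    weakly_cached: names of the objects the cache weakly refers to.
    hard_refs: (source, target) pairs; a source that is not itself a cached
               object (e.g. "app") stands for an application using the target.
    Returns the names evicted, grouped by pass.
    """
    cached = set(weakly_cached)
    refs = set(hard_refs)
    passes = []
    while True:
        targets = {target for _, target in refs}
        victims = {name for name in cached if name not in targets}
        if not victims:
            return passes
        passes.append(sorted(victims))
        cached -= victims
        # Hard references that flowed from an evicted object vanish with it.
        refs = {(s, t) for s, t in refs if s not in victims}


objects = {"root", "left", "right", "left_leaf", "right_leaf"}
relationships = {
    ("root", "left"), ("root", "right"),
    ("left", "left_leaf"), ("right", "right_leaf"),
}

# No application uses the root, so the whole collection is evicted in stages.
evicted_when_unused = evict_unused(objects, relationships)
# An application holds a hard reference to the root, so nothing is evicted.
evicted_when_used = evict_unused(objects, relationships | {("app", "root")})
```

Each pass exposes the next layer of objects as being only weakly referred to. Note that in this simplified model a cycle of hard references among cached objects would keep itself in the cache.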
Before moving on to
Before the cache's congestion reaches level 601, only objects that are not referred to at all are removed from the cache. In an embodiment where all cached persistent data objects are at least weakly referred to (e.g., by a cache object), cached objects of persistent data are not removed from the cache while the cache's congestion is below level 601. That is, objects representing persistent data are not evicted, leaving only “other” objects that do not correspond to persistent data to be removed from the cache. This scheme gives persistent data some priority over non-persistent data when the cache is at comparatively low levels of congestion. Moreover, of the “other” objects, those that are not referred to at all are deemed lowest priority (e.g., because of the suggestion that they are not being used).
Once the cache's congestion reaches level 601, however, objects representing persistent data begin to be removed from the cache along with non-referred-to objects that do not correspond to persistent data. The persistent data objects that are removed between levels 601 and 602 are only weakly referred to and must have remained only weakly referred to for some period of time (e.g., X seconds). This scheme essentially identifies persistent data objects that are not just “not being used” but are “not being used and have not been used for some period of time”. Thus the eviction scheme between levels 601 and 602 maintains priority of persistent data objects over “other” objects and also prioritizes unused persistent data objects that have a recent history of use over those that do not have a recent history of use.
Once the cache reaches level 602, however, the distinction between unused persistent data objects that have a recent history of use and those that do not is no longer made. That is, once level 602 is reached, if a persistent data object is only weakly referred to, it is marked for removal regardless of how long it has been only weakly referred to. Beyond level 602, persistent data that is only weakly referred to is marked for removal along with “other” objects that do not correspond to persistent data.
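The three congestion regimes described above can be summarized as a single decision function. This is a minimal sketch under assumptions: the function name `should_evict`, its parameters, and the numeric defaults (`low_level` and `high_level` standing in for levels 601 and 602, and `min_idle_seconds` standing in for the X seconds) are hypothetical and are not specified by the description.

```python
def should_evict(congestion, is_persistent, hard_referenced,
                 seconds_weak_only, low_level=0.5, high_level=0.8,
                 min_idle_seconds=30.0):
    """Decide whether one cached object is marked for removal.

    congestion: current cache fill, on the same scale as the two levels.
    is_persistent: True for an object representing persistent data.
    hard_referenced: True if at least one hard reference points to the object.
    seconds_weak_only: how long the object has been only weakly referred to.
    """
    if hard_referenced:
        return False  # in use, so it stays in the cache
    if not is_persistent:
        return True   # an "other" object that is not referred to at all
    if congestion < low_level:
        return False  # below the first level, persistent data is kept
    if congestion < high_level:
        # Between the levels, only objects unused for some time are removed.
        return seconds_weak_only >= min_idle_seconds
    return True       # at or beyond the second level, idle time is ignored


decisions = [
    should_evict(0.3, True, False, 100.0),   # low congestion
    should_evict(0.6, True, False, 10.0),    # mid congestion, recently used
    should_evict(0.6, True, False, 45.0),    # mid congestion, long unused
    should_evict(0.9, True, False, 1.0),     # high congestion
    should_evict(0.9, True, True, 1000.0),   # hard referenced
    should_evict(0.3, False, False, 0.0),    # unreferenced "other" object
]
```

Expressing the policy as one function makes the ordering of priorities explicit: use by a hard reference always wins, then persistence, then recency of use.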
It is important to re-emphasize that although the above discussion has been directed to examples within an object oriented environment, the teachings provided herein can be extended to non object oriented environments. For example, items of cached persistent data may strongly refer to one another or may be strongly referred to by modules of software that use them. Likewise, items of cached persistent data may be weakly referred to by a software module that represents the cache itself (or some other software module).
Component based software environments use granules of software (referred to as “components” or “component instances”) to perform basic functions. Some examples of component based architectures include Java Beans (JB), Enterprise Java Beans (EJB), Common Object Request Broker Architecture (CORBA), Component Object Model (COM), Distributed Component Object Model (DCOM) and derivatives therefrom.
The functional granularity offered by a plurality of different components provides a platform for developing a multitude of more comprehensive tasks. For example, a business application that graphically presents the results of calculations made to an individual's financial records (e.g., amortization of interest payments, growth in income, etc.) may be created by logically stringing together: 1) an instance of a first component that retrieves an individual's financial records from a database; 2) an instance of a second component that performs calculations upon financial records; and, 3) an instance of a third component that graphically presents financial information.
Moreover, within the same environment, another business application that only graphically presents an individual's existing financial records may be created by logically stringing together: 1) another instance of the first component mentioned just above; and, 2) another instance of the third component mentioned above. That is, different instances of the same component may be used to construct different applications. The number of components within a particular environment and the specific function(s) of each of the components within the environment are determined by the developers of the environment.
Components may also be created to represent separate instances of persistent data (e.g., a first component that represents a first row of database information, a second component that represents a second row of database information, etc.).
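The idea of building different applications from instances of the same components can be sketched as follows. The component classes, the two report functions, and the sample data are hypothetical, and a text formatter stands in for a component that would present information graphically.

```python
class RecordRetriever:
    # Hypothetical first component: retrieves an individual's records.
    def __init__(self, database):
        self._database = database

    def run(self, person):
        return list(self._database[person])


class GrowthCalculator:
    # Hypothetical second component: year-over-year growth in income.
    def run(self, incomes):
        return [later - earlier for earlier, later in zip(incomes, incomes[1:])]


class TextPresenter:
    # Hypothetical third component: presents values (as text in this sketch).
    def run(self, label, values):
        return f"{label}: " + ", ".join(str(v) for v in values)


def growth_report(database, person):
    # First application: retrieve, calculate, then present.
    records = RecordRetriever(database).run(person)
    growth = GrowthCalculator().run(records)
    return TextPresenter().run("growth", growth)


def records_report(database, person):
    # Second application: other instances of the first and third components.
    records = RecordRetriever(database).run(person)
    return TextPresenter().run("records", records)


database = {"alice": [100, 110, 125]}
first_application = growth_report(database, "alice")
second_application = records_report(database, "alice")
```

Both applications are assembled from the same small set of components; only the way the instances are strung together differs.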
Processes taught by the discussion above may be performed with program code such as machine-executable instructions which cause a machine (such as a “virtual machine”, general-purpose processor or special-purpose processor) to perform certain functions. Alternatively, these functions may be performed by specific hardware components that contain hardwired logic for performing the functions, or by any combination of programmed computer components and custom hardware components.
An article of manufacture may be used to store program code. An article of manufacture that stores program code may be embodied as, but is not limited to, one or more memories (e.g., one or more flash memories, random access memories (static, dynamic or other)), optical disks, CD-ROMs, DVD ROMs, EPROMs, EEPROMs, magnetic or optical cards or other type of machine-readable media suitable for storing electronic instructions. Program code may also be downloaded from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a propagation medium (e.g., via a communication link (e.g., a network connection)).
It is believed that processes taught by the discussion above can be practiced within various software environments such as, for example, object-oriented and non-object-oriented programming environments, Java based environments (such as a Java 2 Enterprise Edition (J2EE) environment or environments defined by other releases of the Java standard), or other environments (e.g., a .NET environment or a Windows/NT environment, each provided by Microsoft Corporation).
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.