Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
The term “persistence” refers to storing information on non-volatile media such as a database. Object persistence refers to persistently storing objects that are written in accordance with an object-oriented programming language such as Java. The term “Java persistence” is often used as a convenient way to describe persistently storing Java objects. Conventional approaches to Java persistence include Java Data Objects, object serialization, and bytecode rewriting.
The term “Java Data Objects (JDO)” refers to a persistency technology based, at least in part, on one of the JDO specifications such as Java Specification Request (JSR)-000012 entitled, “The Java Data Objects (JDO) Specification.” The JDO specification specifies a mechanism for transparently persisting Java objects. Transparently persisting Java objects means that the software that is used to access and modify the fields of an object follows the standard practice used in most Java applications. In order to implement JDO, however, it is necessary to identify which classes should be persistent. JDO uses a metadata file formatted in the eXtensible Markup Language (XML) to identify persistent classes.
Object serialization is a mechanism for writing the state of an object (and a graph of the objects it references) to a serial output stream. The serial output stream is written to a destination such as a file. The serial stream can be read from the file to reconstruct the object. Object serialization requires that each object implement a particular interface. For example, Java serialization based on the java.io package requires that all serializable objects implement the interface java.io.Serializable.
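By way of illustration only, the serialization precondition described above can be sketched as follows. The class names SerializablePoint and SerializationSketch are hypothetical and are not part of any embodiment; a byte array stands in for the file so that the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical application class; it must implement Serializable to be written.
class SerializablePoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    SerializablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class SerializationSketch {
    // Writes the object (and the graph it references) to a serial byte stream.
    static byte[] write(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reconstructs the object from the serial byte stream.
    static Object read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializablePoint copy = (SerializablePoint) read(write(new SerializablePoint(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

If SerializablePoint did not implement java.io.Serializable, the call to writeObject would fail with a java.io.NotSerializableException, which is the precondition referred to above.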
Bytecode rewriting is directed to rewriting Java classes as they are loaded to a Java Virtual Machine (JVM). When the JVM detects a reference to an unloaded class, it sends a request to a class loader to load the class from the file system. In standard Java, the class is loaded directly to the JVM. In bytecode rewriting, however, a bytecode transformer is invoked to transform the class before it is loaded. The bytecode transformer can modify the class to add persistence logic.
These conventional approaches to Java persistence each impose a precondition on persistent objects. For example, object serialization requires that persistent objects implement a serializable interface. Similarly, JDO technology involves describing persistent objects with a JDO metadata file. Object persistency based on bytecode rewriting involves post-processing object bytecode.
Embodiments in accordance with the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
Embodiments in accordance with the present disclosure are generally directed to a system and method for persisting objects and object closures. In contrast to conventional persistency solutions, embodiments in accordance with the present disclosure do not impose preconditions or requirements on persistent objects. In an embodiment, objects are scanned with object introspection to identify the members of the object. Introspection refers to the process of inspecting objects to obtain metadata about the objects. The term “dynamic introspection” refers to examining metadata of classes that are already loaded into a virtual machine (e.g., a Java Virtual Machine (JVM)). The members are transformed and stored into an intermediate data structure via a recursive scan and transformation algorithm. The functions of the scan and transformation algorithm can be influenced by a set of rules that allow, for example, avoiding a recursive scan of certain members and/or skipping other members altogether. The term “transform” refers to using an algorithm to determine how a member is represented in the intermediate data structure. The rules allow for the exclusion of parts of an object closure from the recursive scan and transformation. If the object references another object, then the process is repeated recursively on the other object. Thus, in an embodiment, an entire object closure may be scanned and transformed into the intermediate data structure. The intermediate data structure is persisted in a data store.
In an embodiment, a handle is returned when an object is stored. A handle is a Java object similar to a database key in that it uniquely identifies the stored object in the embodiment. In contrast to a database key, a handle is opaque, meaning that only the embodiment knows the exact implementation of a handle and can therefore manipulate it. Handles do not have any methods and applications cannot manipulate them; all applications can do is pass handles on to the embodiment for processing. An object's handle can be used, for example, to retrieve the object from the embodiment. Handles themselves can also be stored in the embodiment, for example for bootstrapping purposes. To later retrieve such a handle from the embodiment, the handle must be stored under a unique name, which can be an arbitrary string. Handles that are stored under a name are also called “named handles”.
When an object is stored in the embodiment, not only the object itself, but also all objects that can be accessed from the object are also stored. A first object can be accessed from a second object if the second object either holds a reference to the first object, or the second object holds a reference to a third object, from which the first object can be accessed. The set of all objects that can be accessed from an object is called the object's closure, while the object is called the closure's anchor object. (Note that an object closure can have more than one anchor object, while not every object in a closure is also an anchor object of the closure.)
Finally, objects must be reachable in order to be retrieved from the embodiment. By definition, all objects referenced by named handles are reachable. Further, all objects that can be accessed from reachable objects are reachable. All objects that are not reachable are called unreachable. Unreachable objects occupy space in the database without serving any purpose. As such, they are removed during database garbage collection.
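The reachability definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stored object is identified by a string identifier and that the references between stored objects are available as a map; the class ReachabilitySketch and its method names are hypothetical and do not describe how any particular persistence manager implements its cleanup.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.HashMap;
import java.util.Arrays;

final class ReachabilitySketch {
    // refs maps the identifier of each stored object to the identifiers it references.
    // roots are the identifiers referenced by named handles.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String id = work.pop();
            if (seen.add(id)) {
                // Everything accessible from a reachable object is reachable as well.
                work.addAll(refs.getOrDefault(id, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    // Stored objects that are not reachable are candidates for garbage collection.
    static Set<String> unreachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> garbage = new HashSet<>(refs.keySet());
        garbage.removeAll(reachable(refs, roots));
        return garbage;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("a", Arrays.asList("b"));
        refs.put("b", Arrays.asList("a", "c"));
        refs.put("c", Collections.<String>emptyList());
        refs.put("d", Arrays.asList("c"));
        System.out.println(unreachable(refs, Arrays.asList("a")));
    }
}
```

In this example object d references a reachable object but is not itself referenced from any reachable object, so it is reported as unreachable.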
In an embodiment, persistence system 100 is part of a multi-tiered network. The multi-tiered network may be implemented using a variety of different application technologies at each of the layers of the multi-tier architecture, including those based on the Java 2 Enterprise Edition™ (“J2EE”) specification (e.g., the Websphere platform developed by IBM Corporation), the Microsoft .NET platform, and/or the Advanced Business Application Programming (“ABAP”) platform developed by SAP AG. The J2EE specification refers to any of the J2EE specifications including, for example, the Java 2 Enterprise Edition Specification v1.3, published on Jul. 27, 2001. None of these technologies, however, are required by an embodiment in accordance with the present disclosure.
Persistence system 100 includes Scan and Transform Engine (STE) 120, object store Application Program Interface (API) 122, configuration manager 130, persistence manager API 142, and cache 150. In alternative embodiments, persistence system 100 includes more elements, fewer elements, and/or different elements. STE 120 uses introspection to scan and transform object closures. The transformed object closure is stored in an intermediate data structure and passed to persistence manager 140. STE 120 is further discussed below with reference to
In an embodiment, configuration manager 130 provides STE 120 with rules that define, at least in part, the scan and transform process. Configuration manager 130 is further discussed below with reference to
Object store API 122 provides the interface between application 110 and STE 120.
Lifecycle methods 210 initiate and end access to a persistence system (e.g., persistence system 100, shown in
The PersistenceManager class is the fully qualified class name of the persistence manager that is used (e.g. persistence manager 140, shown in
The data store identifier specifies which data store to use (e.g., database 144 and file system 146). The value of this property may depend on the persistence manager defined with the PersistenceManager class. For example, for a Java Database Connectivity (JDBC) persistence manager, the value of this property is a JDBC string pointing to a database (e.g., database 144). For a file based persistence manager this could be a file name or the fully qualified name of a directory.
The “configuration file” refers to a name (e.g., the fully qualified name) of a configuration file containing one or more rules for defining the behavior of, for example, STE 120. Referring again to
Open method 214 is directed to applications that implement their own class loading. Persistence system 100 creates instances of application classes when retrieving persistently stored information. To create instances of application classes, persistence system 100 accesses the classes of the objects to be instantiated. The second parameter of open method 214 (e.g., classBroker parameter 217) enables persistence system 100 to have access to classes loaded by the application class loaders. In one embodiment, a class with a method Class classForName (className) uses the application class loaders to find the required classes.
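A class broker of the kind described above might be sketched as follows. Only the method name classForName is taken from the description; the interface name ClassBroker, the implementing class LoaderClassBroker, and the choice to report a missing class as an unchecked exception are assumptions made for this illustration.

```java
// Hypothetical interface through which a persistence system asks the application for classes.
interface ClassBroker {
    Class<?> classForName(String className);
}

// Hypothetical implementation that delegates to a class loader supplied by the application.
final class LoaderClassBroker implements ClassBroker {
    private final ClassLoader loader;

    LoaderClassBroker(ClassLoader loader) {
        this.loader = loader;
    }

    @Override
    public Class<?> classForName(String className) {
        try {
            // Resolve the class through the application's loader without initializing it.
            return Class.forName(className, false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not visible to application loader: " + className, e);
        }
    }
}
```

An application that implements its own class loading would pass such a broker as the second parameter of the open method, so that classes loaded by application class loaders can be found when stored objects are instantiated.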
Close method 216 closes persistence system 100. Closing persistence system 100 assures that clean-up operations, such as closing opened files or data bases, are properly executed to avoid data loss. Closing persistence system 100 also assures that system resources (e.g., memory, etc.) are returned to the system.
Persistency methods 220 provide access to objects stored in persistence system 100.
A handle object is an instance of a class that implements Handle. Handle instances are created by persistence system 100 rather than the application (e.g., application 110, shown in
An application can store and retrieve handles in persistence system 100 using names (e.g., based on java.lang.String). Thus, named handles serve as externally accessible “starting points” for object closures, and are useful for initial object retrieval from persistence system 100. For example, an application could use the class name of the application main class for a named handle, to retrieve an initial object closure. Then, the application can use handles stored within the initial object closure to access further objects in persistence system 100 and other named handles to access further closures.
Store method 310 is substantially similar to store method 305 but it also includes ruleSets parameter 320. RuleSets parameter 320 provides zero or more rule sets to define, at least in part, the behavior of STE 120. More precisely, ruleSets parameter 320 provides the name(s) of the rule sets. The rules/rule sets are defined in the rules/configuration file (e.g., configuration file 132, shown in
Retrieve method 325 is used to retrieve an object from persistence system 100. In an embodiment, the caller identifies an object closure to retrieve with a handle (e.g., handle 328). This handle is returned by store methods 305 and 310 during the store operation. In an alternative embodiment, retrieve method 325 could have an additional parameter ruleSet. This ruleSet is identical to the rule set of store method 310. The benefit of a rule set during a retrieve operation is to reduce the amount of data that has to be read from the database. If, for example, only one member of the anchor object is of interest to the retriever of an object closure, a rule set can help to reduce the load on the database: the rule set can prevent the embodiment from retrieving the whole object closure by excluding all members of the anchor object that refer to other objects. Such an alternative embodiment, however, cannot cache object closures retrieved with a rule set in the same way as object closures retrieved without a rule set: a later caller to retrieve is not aware of how the object closure was retrieved, and that it may be incomplete due to a rule set. As such, the cached object closure cannot be returned. There are at least two ways to handle objects retrieved with rule sets: a) do not cache them at all or b) cache them with the rule set that was used to retrieve them. In the latter case the cached object closure may only be returned if the rule set used during the first retrieve operation is a subset of the rule set used during later retrieve operations.
Remove methods 330 and 335 remove objects from persistence system 100. Remove method 330 identifies the object to be removed by passing its handle. Remove method 335 identifies the object to be removed by passing the object itself. Unlike store methods 305 and 310, remove method 335 is not recursively invoked on referenced objects. That is, only the object itself and not the object closure is removed from persistence system 100. Removing an object from persistence system 100 may render objects of the object closure unreachable. Thus, persistence system 100 periodically triggers a mechanism similar to the Java garbage collection mechanism to remove unreachable objects. The actual cleanup mechanism is implemented by the persistence manager (e.g., persistence manager 140, shown in
RetrieveType methods 340-350 enable a caller to retrieve a set of object closures with common features through one operation. RetrieveType method 340 retrieves all object closures of a specified class. The fully qualified class name is provided as a parameter for RetrieveType method 340.
RetrieveType method 345 is similar to RetrieveType method 340 but it reduces the result set by applying a filter. The filter works on a very low level without restoring the actual objects. The entries in the filter data structure (e.g., HashMap) are name/value pairs. The name of an entry is a string denoting a member of the class. It has the format <fully qualified class name>.<member name>. The fully qualified class name is used to discriminate members in the inheritance hierarchy, for example, when members are overwritten in a subclass. The value of an entry is either a String, or, for primitive types, an instance of the respective boxing class. RetrieveType method 345 returns those object closures that match the values in the filter data structure (e.g., HashMap).
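The low-level filter described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the stored representation of an object is available as a map from member name to value; the class MemberFilterSketch, its method matches, and the example class name com.example.Person are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MemberFilterSketch {
    // Returns true if every name/value pair of the filter is present in the stored entries.
    // Names have the format <fully qualified class name>.<member name>.
    static boolean matches(Map<String, Object> stored, Map<String, Object> filter) {
        for (Map.Entry<String, Object> wanted : filter.entrySet()) {
            if (!stored.containsKey(wanted.getKey())) {
                return false;
            }
            if (!Objects.equals(stored.get(wanted.getKey()), wanted.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        stored.put("com.example.Person.name", "Ada");
        stored.put("com.example.Person.age", 36);   // primitive value held as its boxing class

        Map<String, Object> filter = new HashMap<>();
        filter.put("com.example.Person.age", 36);
        System.out.println(matches(stored, filter));
    }
}
```

Because the comparison operates on the stored name/value pairs, no application object needs to be restored in order to evaluate the filter.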
RetrieveType method 350 is similar to RetrieveType method 345 but it employs a filter that is more powerful and less efficient. All instances of the class are retrieved as object closures from persistence system 100. The retrieved objects are then passed to the filter. The filter (e.g., a filter object) implements a method called accept( ) which can perform arbitrary computations on the object closures, and eventually returns a Boolean value. If the result for an object closure is true, then the object closure is added to the result, otherwise, it is ignored. In an alternative embodiment retrieveType methods could have an additional parameter ruleSet. In analogy to the retrieve method, the rule set could be used to restrict the scope of the retrieve operation with the benefit of less load on the data base.
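A filter object with an accept( ) method, as described above, might be sketched as follows. The interface name ObjectFilter and the helper class AcceptFilterSketch are assumptions introduced for this illustration; only the method name accept is taken from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical filter interface; accept may perform arbitrary computations on a retrieved object.
interface ObjectFilter {
    boolean accept(Object candidate);
}

final class AcceptFilterSketch {
    // Every candidate has already been fully retrieved; the filter only decides what is kept.
    static List<Object> select(List<?> retrieved, ObjectFilter filter) {
        List<Object> result = new ArrayList<>();
        for (Object candidate : retrieved) {
            if (filter.accept(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> retrieved = Arrays.asList("a", "bbb", "cc");
        System.out.println(select(retrieved, candidate -> ((String) candidate).length() > 1));
    }
}
```

The sketch makes the trade-off visible: the filter can express any condition, but every instance has to be retrieved and restored before the filter is consulted.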
Referring again to
In an embodiment, helper methods 240 store and retrieve named handles. SetNamedHandle method 242 stores a handle that is identified by its name. Similarly, getNamedHandle method 244 retrieves a handle that is identified by its name. Method 246 is similar to method 242 except that it creates a unique name which is returned. Method 248 removes a named handle.
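The helper methods above can be illustrated with a minimal in-memory sketch. The method names setNamedHandle and getNamedHandle are taken from the description; the class names OpaqueHandle and NamedHandleRegistry, the names setUniquelyNamedHandle and removeNamedHandle, and the scheme used to generate unique names are assumptions, since the description does not name methods 246 and 248.

```java
import java.util.HashMap;
import java.util.Map;

// Opaque handle: it exposes no methods that an application could use to manipulate it.
final class OpaqueHandle {
}

final class NamedHandleRegistry {
    private final Map<String, OpaqueHandle> byName = new HashMap<>();
    private long counter;

    // Stores a handle under a name chosen by the caller.
    void setNamedHandle(String name, OpaqueHandle handle) {
        byName.put(name, handle);
    }

    // Retrieves the handle stored under the given name, or null if there is none.
    OpaqueHandle getNamedHandle(String name) {
        return byName.get(name);
    }

    // Stores a handle under a newly generated unique name and returns that name.
    String setUniquelyNamedHandle(OpaqueHandle handle) {
        String name;
        do {
            name = "handle-" + (counter++);
        } while (byName.containsKey(name));
        byName.put(name, handle);
        return name;
    }

    // Removes the named handle; the object it referred to is not removed by this call.
    void removeNamedHandle(String name) {
        byName.remove(name);
    }
}
```

An application could, for example, store the handle of its initial object closure under the class name of its main class and look it up under that name on the next start.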
Referring again to
In an embodiment, the intermediate representation of an object is a data structure, with one entry per member of the object. Each entry includes a name/value pair. In one embodiment, the name of each entry is a string consisting of the member's fully qualified class name and the member's name, separated by a dot (‘.’). Note that the member's class name is not necessarily the object's class. For members in which the object's class inherits from a superclass, the superclass's name is used. For example, consider two classes A and B. A defines a member a, and B inherits from A, and defines an additional member b. For an instance i of B, the name for b is B.b, while the name for a is A.a.
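The naming scheme above can be illustrated with standard Java reflection. The classes A and B mirror the example in the preceding paragraph; the class MemberNameSketch and its method memberNames are hypothetical, and skipping static and synthetic fields is an assumption of this sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class A {
    int a;
}

class B extends A {
    int b;
}

final class MemberNameSketch {
    // Collects "<declaring class name>.<member name>" for a class and all of its superclasses.
    static List<String> memberNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue;   // assumption: only instance members are named
                }
                names.add(f.getDeclaringClass().getName() + "." + f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(memberNames(B.class));
    }
}
```

For an instance of B, the inherited member is named after A (the declaring class) rather than after B, which keeps members with the same simple name in different classes of the inheritance hierarchy distinct.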
In one embodiment, the value of each entry is one of: the boxing class of a primitive type, a string, or a handle. The term “boxing” refers to converting a primitive type to a reference type. In one embodiment, strings are represented by the java.lang.String class. As is further described below, handles may be used when a member is itself composed of sub-members (e.g., when the member is an array).
Three examples of scanning and transforming objects are discussed below. The first approach introduces a less complex example of an embodiment in accordance with the present disclosure. The second approach introduces specific handling to improve performance (note that this is optional). The third approach refines the second by introducing rules to control what is stored and how it is stored. Note that numbers in brackets like “(n)” or “(n-m)” refer to code line n or lines n-m in the listings below. It is to be appreciated that the code listings that appear below are pseudo code (e.g., they resemble Java but are not “real” Java).
The first approach, as illustrated in Listing 1, shows the basic principle: All objects ultimately can be broken into arrays and primitive typed values like integer values (int), long values (long), floating point values (float), etc. When an object is stored (1) a distinction is made between arrays (2-3) and other kinds of objects (4-5).
When an object (which is not an array) is stored (8-19) all of the object's members are scanned, transformed, and stored. For each member (belonging to the object's class and its super classes) a dataset is created and collected in an intermediate data structure. The data structure is handed over to the persistence manager (17). There it is stored.
In an embodiment, a so-called class “Member” is used to collect a member's dataset: its name, value, type and the declaring class. The name is a string, the value is an object, the type and the declaring class are Java classes. The scan and transformation process guarantees that the value is only one of the following things: a string, an instance of a boxing class (Integer, Long, Float . . . ) or an instance of Handle. The type corresponds to the value: String.class, Integer.class, Long.class . . . Handle.class. The declaring class is necessary for the following reason: An object's member may not necessarily have been declared in the object's class itself. It may also have been declared in one of the object's superclasses. Consider two classes A and B. A declares a member a, and B inherits from A, and defines an additional member b. For an instance i of B, the declaring class of b is B while the declaring class of a (which is also present in i) is A.
In an embodiment, the intermediate data structure is an array of instances of Member. An object's member m can be either of primitive type (an int, long, float . . . ) or any kind of object. If m is of primitive type, an instance of Member is created and added to the intermediate data structure (13). Note that the primitive typed member is stored as an instance of its corresponding boxing type, for example int 42 is stored as Integer(42); nevertheless the primitive type's class is stored (“int” not “Integer”) as type. The declaring class is the class declaring m (the object's class or one of its superclasses). If m is a kind of object then the algorithm is recursively called again (15) which ultimately results in a handle. The handle is stored instead of the object's value in a further instance of Member. This instance is added to the intermediate data structure, too (15). After all members of the object have been scanned and transformed, the intermediate data structure is handed over to the persistence manager where it is stored (17). Finally the object's handle h is returned to the caller (18). Note that handle h is not a result of storing the intermediate data structure; it is delivered by the persistence manager in a separate call (10). This is necessary for the following (simple) reason: If an object references itself (directly or indirectly) its handle must be known before storing its intermediate data structure, simply because the intermediate data structure contains the handle in one of the covered Member instances.
Storing an array (21-32) in fact resembles the handling of an object (8-19), but with one important difference: An array does not have members like an object has, it has indexed elements. The scan and transformation process walks through all elements of an array and treats them in the same way it treats the members of an object: If element e is of primitive type it creates an instance of Member and adds it to an intermediate data structure (26), or if element e is a kind of object it (recursively) calls our algorithm again (28) and stores the resulting handle instead of the object's value in a further instance of Member. Note that, in the case of arrays, the array's class is taken as a member's declaring class (26) (28). The instance of Member is also stored in the intermediate data structure. Similar to the case of objects, the intermediate data structure is handed over to the persistence manager where it is stored (30). Finally the array's handle h is returned to the caller (31). Note that in the case of arrays instead of member names, which do not exist, an element's index is used as member name (26) (28).
The second approach adds to the first approach a specific handling for strings (see, e.g., code lines 34-35 and 68-74 below). This is done for performance reasons: Instead of storing a string ultimately as an array of characters (with probably hundreds of single character elements) strings are treated as strings. As all major data bases today have a native notion of strings, we assume that every persistence manager implementation also has a native notion of strings.
In an embodiment, when an object is stored (33) a distinction is made between strings (34-35), arrays (36-37), and other kinds of objects (38-39). Storing a string is handled as follows (68-74): The string is treated as a single-member object. The member is given the name “value”. The member's value is the string. The member's class and declaring class is String.class (71). An instance of Member is created and added to an intermediate data structure. The intermediate data structure is handed over to the persistence manager where it is stored (72). Finally, the string's handle h is returned to the caller (73). Storing arrays and other kinds of objects is described above with reference to the first approach. Listing 2 illustrates selected aspects of the second approach.
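The first and second approaches can be illustrated with the following non-limiting sketch, which is not a reproduction of Listing 1 or Listing 2. All names (SketchHandle, SketchMember, SketchNode, ScanTransformSketch) are hypothetical. A map stands in for the persistence manager, static and synthetic fields are skipped, and an identity cache is added as an assumption so that shared and self-referencing objects are stored only once. The sketch further assumes that members are primitives, strings, arrays, or instances of application classes.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Opaque handle: no methods for applications to call.
final class SketchHandle {
}

// One dataset per member: name, value, type, and declaring class.
final class SketchMember {
    final String name;
    final Object value;            // a String, a boxed primitive, a SketchHandle, or null
    final Class<?> type;
    final Class<?> declaringClass;

    SketchMember(String name, Object value, Class<?> type, Class<?> declaringClass) {
        this.name = name;
        this.value = value;
        this.type = type;
        this.declaringClass = declaringClass;
    }
}

// Hypothetical application class used to exercise the sketch.
class SketchNode {
    int weight;
    String label;
    SketchNode next;
}

final class ScanTransformSketch {
    // Stands in for the data store behind a persistence manager.
    final Map<SketchHandle, List<SketchMember>> stored = new HashMap<>();
    // Assumption: an identity cache so that shared or cyclic references are stored once.
    private final Map<Object, SketchHandle> cache = new IdentityHashMap<>();

    SketchHandle store(Object o) {
        SketchHandle known = cache.get(o);
        if (known != null) {
            return known;
        }
        SketchHandle h = new SketchHandle();   // the handle exists before the members are scanned
        cache.put(o, h);
        List<SketchMember> data = new ArrayList<>();
        Class<?> type = o.getClass();
        if (o instanceof String) {
            // A string is treated as a single-member object whose member is named "value".
            data.add(new SketchMember("value", o, String.class, String.class));
        } else if (type.isArray()) {
            // The element index serves as member name; the array class is the declaring class.
            for (int i = 0; i < Array.getLength(o); i++) {
                data.add(transform(String.valueOf(i), Array.get(o, i), type.getComponentType(), type));
            }
        } else {
            for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                        continue;   // assumption: static and synthetic fields are skipped
                    }
                    data.add(transform(f.getName(), read(f, o), f.getType(), c));
                }
            }
        }
        stored.put(h, data);
        return h;
    }

    // Primitive values are kept boxed; any other non-null value is replaced by its handle.
    private SketchMember transform(String name, Object value, Class<?> type, Class<?> declaring) {
        if (type.isPrimitive() || value == null) {
            return new SketchMember(name, value, type, declaring);
        }
        return new SketchMember(name, store(value), SketchHandle.class, declaring);
    }

    private static Object read(Field f, Object target) {
        try {
            f.setAccessible(true);
            return f.get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static SketchMember memberNamed(List<SketchMember> data, String name) {
        for (SketchMember m : data) {
            if (m.name.equals(name)) {
                return m;
            }
        }
        return null;
    }
}
```

Because the handle is registered before the members are scanned, a node whose next member points back to itself is stored with its own handle as the value of that member, which corresponds to the reason given above for obtaining the handle in a separate call.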
The third approach extends the second (and the first) approach by introducing rules to control which and how things are stored (see, e.g., the italic parts of the code lines 75-126 below). The rules introduced in the third approach control whether a member is stored or not, whether the algorithm is applied recursively to a member object or not, and whether the elements of a container (like arrays, lists, sets, vectors . . . ) are excluded from the algorithm.
It is possible to declare that a certain object's member (of primitive type or any kind of object) is to be excluded from being stored (92-93). This is done using OStore's rule file. As a result the member will not be stored and, as a consequence, will be set to a member type specific default when retrieved from the database. An application for this kind of rule is to prevent irrelevant fields from being stored. One example is cached values that are computed and held redundantly. Another example is class constants, which need not be stored.
It is possible to declare that a certain object's member shall not be scanned recursively by the algorithm (97-98). (This rule is only applicable for members that are objects, not for primitive type members.) This is done using OStore's rule file. Instead the object's handle is taken from the cache (97). To keep the object store consistent, however, in an embodiment it is mandatory that a member has been scanned recursively and stored earlier, before it can be excluded. The reason is that the object is then cached and its handle can be retrieved from the cache. It is an error if the object is not in the cache. As introduced in the first approach the handle is stored instead of the object's value (98). Excluding a specific member from being recursively scanned is desirable, for example, when within a tree structure each object not only points to its children but also to its parent. If such a tree's top object is stored, the complete tree is scanned following the children references and scanning back along the parent references is redundant. Excluding the parent link via a rule prevents STE 120 from doing so.
It is possible to declare that a certain container's elements are not scanned recursively by the algorithm (99-102). This is done using OStore's rule file. Containers are objects like lists, sets, maps, vectors . . . and arrays. Ultimately all of the mentioned object types store their elements within arrays. To exclude a container's elements from being scanned recursively the scan and transformation process is slightly modified by switching to the “doNotRecurseElements” mode. To this end, the doNotRecurseElements flag is set to true (100). After the container has been handled the mode is switched back by setting the doNotRecurseElements flag to false (102). The doNotRecurseElements flag is part of the “Flags” class, a simple data structure to cover boolean values (191-194). The structure is handed down recursively to the storeArray( ) method. If the scan and transformation process is in mode “doNotRecurseElements” an array's element (which is not of primitive type) is not scanned but its handle is taken from the cache (119). The handle is stored instead of the object's value (119). To keep the object store consistent, however, in an embodiment it is mandatory that a member has been scanned recursively and stored earlier, before it can be excluded. The reason is that the object is then cached and its handle can be retrieved from the cache. It is an error if the object is not in the cache. One example for applying the exclusion rule is the update of a node in a tree. Assume a tree structure has been persisted in OStore. Each node holds a list of references to its children. Now one node in the tree is to be updated. If the member for the children list cannot be excluded from the scan and transformation process, the whole sub-tree below the node will be updated, too. Excluding the member from the scan and transformation process improves the efficiency of the store method. Listing 3 illustrates selected aspects of the third approach.
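The three kinds of rules of the third approach can be illustrated with a minimal lookup sketch. The format of the rule file is not reproduced here; the enum MemberAction, the class RuleSetSketch, and the convention of keying rules by “<declaring class name>.<member name>” are assumptions made for this illustration, as is the example class name TreeNode.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical actions corresponding to the three kinds of rules; STORE is the default.
enum MemberAction {
    STORE,                  // scan, transform, and store the member recursively
    SKIP,                   // do not store the member at all
    NO_RECURSE,             // store the cached handle of the member without scanning it
    NO_RECURSE_ELEMENTS     // store the container, but take element handles from the cache
}

final class RuleSetSketch {
    private final Map<String, MemberAction> rules = new HashMap<>();

    // Registers a rule for one member of one declaring class.
    RuleSetSketch rule(String declaringClassName, String memberName, MemberAction action) {
        rules.put(declaringClassName + "." + memberName, action);
        return this;
    }

    // Returns the action for a member; members without a rule are stored normally.
    MemberAction actionFor(String declaringClassName, String memberName) {
        return rules.getOrDefault(declaringClassName + "." + memberName, MemberAction.STORE);
    }
}
```

For the tree example above, a rule set could mark the parent member of a hypothetical TreeNode class as NO_RECURSE and its children member as NO_RECURSE_ELEMENTS, so that updating one node neither scans back to the root nor rewrites the whole sub-tree.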
Java objects consist of members having one of the following member types: primitive types (int, long, double . . . ) or objects such as strings, arrays or other kinds of objects.
STE 120 processes each member based, at least in part, on its member type. Table 1 provides processing rules for Java member types according to an embodiment in accordance with the present disclosure. In an alternative embodiment, STE 120 may apply more rules, fewer rules, and/or different rules.
The rules shown above are sufficient to transform even complex object closures because all members of Java objects ultimately consist of primitive types and arrays. While strings might also be handled as character arrays, experience shows that the rule for strings shown above improves performance. This procedure is valid, because even the most primitive persistence managers have a native notion of strings.
Referring again to
Updating objects works in substantially the same way as storing them for the first time. The main difference, with respect to the algorithm, is that instead of creating a new handle for objects that are going to be stored, the object's handle is read from the cache. All objects already having been stored or having been read from the database (which means that they also already have been stored some time ago) are available in the cache as long as the application is accessing them. So instead of asking the persistence manager for a new handle (10) (23) (44) (57) (70) (90) (113) (129), OStore asks the cache for the handle. Finally instead of calling persistenceManager.insert(h,i) (17) (30) (51) (64) (72) (107) (124) (132) OStore calls persistenceManager.update(h,i).
When an object is retrieved, the cache is first checked (see, e.g., code line 136, shown in Listing 4) to determine whether the object is already available. If so, it can simply be returned. If not, the object's class is determined first. The persistence manager (and only the persistence manager) is able to interpret a handle and to return the class of the object that corresponds to the handle (139). Depending on the class, a string (140-141), an array (142-143) or an object of another type (144-145) is retrieved.
To retrieve a string (148-152) its intermediate data structure is first retrieved from the persistence manager (149). The intermediate data structure of a string only has a single element such as an instance of class Member. This instance's value member is taken to create a new string and to return it to the caller (151).
To retrieve an array (168-180) the array's class is first determined with the help of the persistence manager (169). After that the array's intermediate data structure is retrieved from the persistence manager (170). Then a new array with the required size is created (171). Now all instances of class Member, which are covered by the intermediate data structure, are evaluated (172-178): A Member instance's “name” member determines the related array's element index (173). A Member instance's “value” member determines the related array element's value (175). Note that if the value is a handle, the corresponding object has to be retrieved first. To this end, the retrieval process is called recursively (175), ultimately resulting in the object represented by the handle. The retrieved object is assigned to the array's element (175). If the value is not a handle, which means that originally the element's value was of primitive type, the Member instance's “value” member can be assigned directly. Note that because the Member class stores all primitive typed values wrapped by their boxing classes, it is necessary to convert a boxing class object to a primitive type value before the assignment. When all elements of the array have been restored, the array is returned to the caller (179).
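Restoring an array of primitive type from its stored elements can be illustrated with the following sketch, which is not a reproduction of Listing 4. It assumes that the stored elements are available as a map from element index (as a string) to boxed value, and it omits the recursive retrieval of elements that are handles. The class ArrayRestoreSketch is hypothetical. The conversion from boxing class to primitive type is performed by java.lang.reflect.Array.set, which unwraps the value when the array has a primitive component type.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class ArrayRestoreSketch {
    // entries maps each element index (stored as the member name) to its boxed value.
    static Object restore(Class<?> componentType, Map<String, Object> entries) {
        Object array = Array.newInstance(componentType, entries.size());
        for (Map.Entry<String, Object> entry : entries.entrySet()) {
            int index = Integer.parseInt(entry.getKey());
            // Array.set unwraps the boxed value if the component type is primitive.
            Array.set(array, index, entry.getValue());
        }
        return array;
    }

    public static void main(String[] args) {
        Map<String, Object> entries = new LinkedHashMap<>();
        entries.put("0", 7);
        entries.put("1", 8);
        entries.put("2", 9);
        int[] restored = (int[]) restore(int.class, entries);
        System.out.println(Arrays.toString(restored));
    }
}
```

In a fuller sketch, an element whose stored value is a handle would first be resolved by a recursive call to the retrieval process before being assigned to the array.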
To retrieve an object (which is not an array) the object's class is first determined with the help of the persistence manager (155). After that the object's intermediate data structure is retrieved from the persistence manager (156). Then a new object is created (157). Creating the object is further described below with reference to
Referring again to
Lifecycle methods 610 include initiate method 612, release method 614, and setDatastore method 616. Initiate method 612 is used to initialize a persistence manager prior to using it. Release method 614 releases the persistence manager after it is used so that resources can be returned to the system and cleanup, as needed, can be implemented. SetDatastore method 616 is used to define the data store for the persistence manager. This is a persistence manager specific string denoting the location where the data is stored (e.g., a JDBC string pointing to a database or a qualified name of a directory).
Insert method 720 inserts an object (or rather its corresponding data structure) into the persistence manager. In an embodiment, insert method 720 takes as a parameter a handle created by createHandle method 710 to identify the inserted object. Update method 730 is used to update objects that were previously stored. The object is identified by its handle, which was created when the object was inserted.
Delete method 740 deletes an object that is stored in a data store. The object is identified by its handle, which was created when the object was inserted. In an embodiment, only the object identified by the handle is removed. All other objects of the object closure remain unchanged. Select method 750 returns an object (or rather its corresponding data structure) stored in the data store. The object is identified by its handle, which was created when the object was inserted.
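The behavior of the createHandle, insert, update, delete, and select methods can be pictured with the following in-memory sketch. The method signatures, the Handle class, and the use of an Object array as the intermediate data structure are assumptions for illustration; the disclosure names the methods but this section does not fix their signatures.

```java
import java.util.HashMap;
import java.util.Map;

class InMemoryPersistenceManager {
    /** Hypothetical handle: simply wraps an object id. */
    static final class Handle {
        final long objectId;
        Handle(long objectId) { this.objectId = objectId; }
    }

    // Stand-in for the data store: object id -> intermediate data structure.
    private final Map<Long, Object[]> store = new HashMap<>();
    private long nextId = 1;

    Handle createHandle() {
        return new Handle(nextId++);
    }

    void insert(Handle handle, Object[] structure) {
        if (store.containsKey(handle.objectId)) {
            throw new IllegalStateException("object already inserted");
        }
        store.put(handle.objectId, structure);
    }

    void update(Handle handle, Object[] structure) {
        if (!store.containsKey(handle.objectId)) {
            throw new IllegalStateException("object was never inserted");
        }
        store.put(handle.objectId, structure);
    }

    /** Removes only the identified object; objects it references stay stored. */
    void delete(Handle handle) {
        store.remove(handle.objectId);
    }

    /** Returns the stored data structure, or null if nothing is stored. */
    Object[] select(Handle handle) {
        return store.get(handle.objectId);
    }

    public static void main(String[] args) {
        InMemoryPersistenceManager pm = new InMemoryPersistenceManager();
        Handle name = pm.createHandle();
        Handle customer = pm.createHandle();
        pm.insert(name, new Object[] {"John"});
        pm.insert(customer, new Object[] {name}); // customer references name
        pm.delete(customer);
        System.out.println(pm.select(customer) == null); // customer is gone
        System.out.println(pm.select(name) != null);     // referenced object remains
    }
}
```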
In an embodiment, selectHandles method 760 returns the handles of all objects of a certain class stored in the persistence manager. The set of matching objects can be reduced by specifying a filter. In one embodiment, the provided filter is an object of type HashMap. Filters are further discussed above with reference to
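As a rough illustration of selecting handles by class with an optional HashMap filter, consider the sketch below. The filter semantics shown here (member name mapped to a required value, with all entries having to match) are an assumption, as are the StoredObject class and the choice to return plain object ids instead of handle objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class SelectHandlesSketch {
    /** Hypothetical stored object: id, class name, and member values by name. */
    static final class StoredObject {
        final long objectId;
        final String className;
        final Map<String, Object> members;
        StoredObject(long objectId, String className, Map<String, Object> members) {
            this.objectId = objectId;
            this.className = className;
            this.members = members;
        }
    }

    /** Returns the ids of all objects of the class that satisfy every filter entry. */
    static List<Long> selectHandles(List<StoredObject> store, String className,
                                    HashMap<String, Object> filter) {
        List<Long> result = new ArrayList<>();
        for (StoredObject o : store) {
            if (!o.className.equals(className)) {
                continue;
            }
            boolean matches = true;
            if (filter != null) { // a null filter selects every object of the class
                for (Map.Entry<String, Object> e : filter.entrySet()) {
                    if (!Objects.equals(o.members.get(e.getKey()), e.getValue())) {
                        matches = false;
                        break;
                    }
                }
            }
            if (matches) {
                result.add(o.objectId);
            }
        }
        return result;
    }

    static StoredObject customer(long id, String city) {
        Map<String, Object> members = new HashMap<>();
        members.put("city", city);
        return new StoredObject(id, "Customer", members);
    }

    public static void main(String[] args) {
        List<StoredObject> store = new ArrayList<>();
        store.add(customer(1, "Berlin"));
        store.add(customer(2, "Paris"));
        HashMap<String, Object> filter = new HashMap<>();
        filter.put("city", "Paris");
        System.out.println(selectHandles(store, "Customer", null));
        System.out.println(selectHandles(store, "Customer", filter));
    }
}
```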
Referring again to
In an embodiment, helper methods 640 store and retrieve named handles. For example insertNamedHandle method 642 stores a handle that is identified by its name. Similarly, selectNamedHandle method 644 retrieves a handle that is identified by its name.
Method 648 removes a named handle. In an embodiment, getClass method 646 returns the class (e.g., java.lang.Class) of the object identified by the provided handle. In an embodiment, only the persistence manager is able to interpret a handle.
Referring again to
In an embodiment, the following rules are supported: excluding a specific member from being stored; excluding a specific member from being recursively scanned; and excluding the elements of a specific container from being scanned. In an alternative embodiment, more rules, fewer rules, and/or different rules are defined. Excluding a specific member from being stored is desirable, for example, when the member references runtime data that should not become part of the persistent store (e.g., cached data, rendered data, etc.). Excluding a specific member from being recursively scanned is desirable, for example, when each object within a tree structure points not only to its children but also to its parent. If such a tree's top object is stored, the complete tree is scanned by following the child references, so scanning back along the parent references is redundant. Excluding a container's elements from being scanned is desirable, for example, for Vectors, because a Vector stores references to its elements in an element array. Whenever a new element is added to a Vector, it is not necessary to update all elements of the Vector. Rather, it is sufficient to insert the new element and update the element array, which is already stored.
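The first two rules can be sketched as follows. The Node class, the member names, and the representation of rules as sets of member names are hypothetical; the sketch only shows how a scanner might skip a member that is not to be stored and record, but not follow, a member that is not to be scanned. The third rule (container elements) is not shown.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ExclusionRulesSketch {
    /** Hypothetical tree node with a parent back-reference and runtime data. */
    static class Node {
        String label;
        Node parent;        // back-reference: stored, but not recursively scanned
        Node child;
        Object renderCache; // runtime data: not stored at all
        Node(String label) { this.label = label; }
    }

    // Hypothetical rule sets, keyed by member name.
    static final Set<String> NOT_STORED = new HashSet<>(Arrays.asList("renderCache"));
    static final Set<String> NOT_SCANNED = new HashSet<>(Arrays.asList("parent"));

    /** Returns "label.member" for every member that would be stored. */
    static List<String> scan(Node root) {
        List<String> stored = new ArrayList<>();
        scan(root, stored, Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>()));
        return stored;
    }

    private static void scan(Node node, List<String> stored, Set<Node> seen) {
        if (node == null || !seen.add(node)) {
            return; // nothing to do, or already visited
        }
        for (Field f : Node.class.getDeclaredFields()) {
            if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            if (NOT_STORED.contains(f.getName())) {
                continue; // rule 1: member is not stored
            }
            stored.add(node.label + "." + f.getName());
            if (NOT_SCANNED.contains(f.getName())) {
                continue; // rule 2: member is stored but not followed
            }
            try {
                f.setAccessible(true);
                Object value = f.get(node);
                if (value instanceof Node) {
                    scan((Node) value, stored, seen);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node child = new Node("child");
        root.child = child;
        child.parent = root;
        root.renderCache = new Object();
        System.out.println(scan(root));
    }
}
```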
Turning now to
Referring to process block 930, each object member is stored in an intermediate data structure based, at least in part, on its object member type. Table 1 illustrates an algorithm for storing object members according to an embodiment in accordance with the present disclosure. As discussed above with reference to
Referring to process block 940, the intermediate data structure is persisted to a data store. The data store may be a database, a file system, and/or other store capable of providing non-volatile storage. In one embodiment, a persistence manager API provides an interface to one of several types of persistence managers. Each persistence manager is responsible for storing the information contained in the intermediate data structure onto a particular media.
Referring to process block 1040, in a particular embodiment the intermediate data structure is persisted to a single data table file; e.g., a database table in a database. Details of the data table will be discussed below. Referring to process block 1050, if there is another object to be persisted, then the process may repeat by returning to processing block 1000 to process the next object.
As mentioned above, instead of ultimately storing a string as an array of characters (with possibly hundreds of single-character elements), strings are treated as strings. As all major databases today have a native notion of “strings,” we can assume that every persistence manager implementation also has a native notion of strings. String instances may be stored as if they have one artificial member named “˜value” of type String to accommodate character strings of any length. Those instances are referred to, like other objects, by handles in the intermediate data structure. Thus, for example, referring to
Each row in the data table 1300 has the structure as defined in Listing 5. In accordance with the present disclosure, each member record (e.g., 1222, 1224, 1226,
All member records belonging to the same intermediate data structure are identified by a unique “object id” stored in a column 1302 of the data table 1300 called OBJECT_ID. As one intermediate data structure represents (the collection of all members of) one JAVA object instance (e.g., intermediate data structure 1202 in
A member's name (of type String) is mapped to the data table's NAME column 1304 of type LONGNVARCHAR and stored in that column. LONGNVARCHAR is used to allow national character strings of any length (the data type used has to allow at least strings of a member name's maximum length according to JAVA's language specification). For example, the name of the Customer member in the Invoice instance 1114 is “customer”, and so “customer” is stored in the NAME column 1304 of the row that stores the member record 1222.
A member's type (of type class) is mapped to the data table's TYPE column 1310 of type SMALLINT and stored in that column. In some embodiments, we map an object's type to an integer number in the range of 1 . . . N to save disk space, where N is the number of data types defined by the particular language (e.g., JAVA) for representing the intermediate data store. Thus, for example, a data type Byte may be integer “1”, a data type Character may be integer “2”, a data type Short may be integer “3”, and so on. Integer “0” may be used to refer to the Handle type. In some embodiments, a handle is an instance of a JAVA class, which simply wraps an object id used to represent the records in table MEMBER belonging to the same intermediate data structure. In other embodiments, instead of representing data types using integers, the representation may be stored, for example, using a string that reflects the type's name; e.g., “Integer”, “Byte”, “Character”, and so on.
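A minimal sketch of the integer type mapping follows. Only the codes 0 (Handle), 1 (Byte), 2 (Character), and 3 (Short) come from the description above; the remaining codes, the Handle class, and the helper method names are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TypeCodeSketch {
    /** Hypothetical handle: simply wraps an object id. */
    static final class Handle {
        final long objectId;
        Handle(long objectId) { this.objectId = objectId; }
    }

    static final Map<Class<?>, Integer> CODES = new LinkedHashMap<>();
    static {
        CODES.put(Handle.class, 0);
        CODES.put(Byte.class, 1);
        CODES.put(Character.class, 2);
        CODES.put(Short.class, 3);
        // The codes below are assumed; the text only says "and so on".
        CODES.put(Integer.class, 4);
        CODES.put(Long.class, 5);
        CODES.put(Float.class, 6);
        CODES.put(Double.class, 7);
        CODES.put(Boolean.class, 8);
    }

    /** Returns the small integer stored in the TYPE column for this value. */
    static int codeOf(Object value) {
        Integer code = CODES.get(value.getClass());
        if (code == null) {
            throw new IllegalArgumentException("unsupported type: " + value.getClass());
        }
        return code;
    }

    /** Reverse lookup used when a row is read back. */
    static Class<?> typeOf(int code) {
        for (Map.Entry<Class<?>, Integer> e : CODES.entrySet()) {
            if (e.getValue() == code) {
                return e.getKey();
            }
        }
        throw new IllegalArgumentException("unknown type code: " + code);
    }

    public static void main(String[] args) {
        System.out.println(codeOf((short) 12));
        System.out.println(typeOf(0).getSimpleName());
    }
}
```

Because the set of member value types is fixed and small, the mapping can live in code; no mapping table needs to be persisted alongside the data.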
A member's declaring class (of type class) is mapped to the table's DECLARING_CLASS column 1308 of type LONGNVARCHAR and stored in that column. In some embodiments, we represent the declaring class by the class's name. Mapping a class to a number (as done for the instance's type) would be possible too, but would also require storing a mapping table (number mapped to class), since the number of possible classes is potentially unlimited.
In order to rebuild an object instance based on the stored member records that represent the object instance's members, it is necessary to know the object instance's class. It is not enough to know the DECLARING_CLASS of a certain member, which is used to handle object inheritance. Accordingly, we can store the object instance's class in a CLASS column 1306 in the data table 1300. As explained with respect to the DECLARING_CLASS column 1308, we can use LONGNVARCHAR as the data type. Another, less redundant way of storing the object instance's class would be to add an additional record in the data table 1300 (same object id) that represents an artificial member of the related object instance (member name: “˜class”, or something else using a name format forbidden within the JAVA programming language, to avoid naming conflicts). The object instance's class could be stored in the DECLARING_CLASS column 1308, or the class's name could be stored in the data value column 1312 called V_STRING.
A member's value may be mapped to one of the data table's data value columns 1312 and stored in that column. Accordingly, a value of type Byte is stored in V_BYTE of type SMALLINT. A value of type Character is stored in V_CHARACTER of type NCHAR (1). A value of type Short is stored in V_SHORT of type SMALLINT. A value of type Integer is stored in V_INTEGER of type INTEGER. A value of type Long is stored in V_LONG of type BIGINT. A value of type Float is stored in V_FLOAT of type REAL. A value of type Boolean is stored in V_BOOLEAN of type BIT. In some embodiments, for example, the mapping may be based on the following JDBC 4.2 standard (Table 2):
A value of type Handle: A handle is a wrapped object id (as described earlier). Accordingly, values of type Handle may be stored by taking the wrapped object id and storing it in one of the data value columns 1312 called V_OBJECT_ID of type BIGINT (the same type as column OBJECT_ID, or any other type able to store a unique identifier).
As mentioned above, string instances may be stored as if they have one artificial member named “˜value” of type String. These instances of String objects may be referred to, like other objects, using handles in the intermediate data structure. Accordingly, in some embodiments, a member value of type String is a JAVA object and thus is stored as an object reference (a string might be referenced by several different object instances). The V_OBJECT_ID of a member of type String will contain the object id of the related String object instance. For example, String instances may be stored as if they each have a member named “˜value” of type String stored in one of the data value columns 1312 called V_STRING of type LONGNVARCHAR to allow national character strings of any length.
Consider, for example, the Customer instance 1112 as represented in data table 1300. The value “John” for the firstname member may be treated as a String object, which can be represented in the data table 1300 as object id “3”. Similarly the lastname, street, and city members in the Customer instance 1112 can be assigned respective object ids of “4”, “5”, and “6”. Accordingly, the V_OBJECT_ID data value columns 1312 of the member records for firstname, lastname, street, and city are respectively, “3”, “4”, “5”, and “6”.
Listing 6 (lines 1-33) below illustrates an example of Java code that may be used to persist an incoming intermediate data structure (e.g., 1202,
The for loop (lines 2-31) in Listing 6 processes each member record of the incoming intermediate data structure, adding each member record into a row in the data table 1300. The number of member records may be provided in members.length. The type of data that is stored in the member record is identified in (line 4). The methods in (lines 5-9) of the class stmtInsertMember store the object id, the member name, the class, the declaring class, and the data type of the member record in respective columns 1302, 1304, 1306, 1308, and 1310 of the data table 1300.
An if statement (lines 10-30) determines where in the data table 1300 to store the value of the member record based on the data type identified in (line 4). For example, at (lines 12-13), a Byte data value is stored in the V_BYTE data value column 1312 of the data table 1300 using the setByteValue method.
After each member record has been inserted into a corresponding row in the data table 1300, the data table may be written to a storage device in (line 32) such as a hard disk drive or other suitable non-volatile storage. In particular, the new rows, which now contain data comprising the intermediate data structure, may be added to an existing data table file (not shown) that is on the storage device. In this way, the intermediate data structures in an application may be collected in a single data table that is stored in a data table file on a storage device.
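The row-per-member persisting described above can be pictured with the following in-memory sketch. It is not a reproduction of Listing 6: the table is modeled as a list of maps instead of a database table, the type is recorded by name (one of the representations the description permits), and the Handle and Member classes, the method names, and the object ids used for the Customer instance are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PersistSketch {
    /** Hypothetical handle: simply wraps an object id. */
    static final class Handle {
        final long objectId;
        Handle(long objectId) { this.objectId = objectId; }
    }

    /** Hypothetical member record of an intermediate data structure. */
    static final class Member {
        final String name;
        final String declaringClass;
        final Object value;
        Member(String name, String declaringClass, Object value) {
            this.name = name;
            this.declaringClass = declaringClass;
            this.value = value;
        }
    }

    /** In-memory stand-in for the data table: one map per row. */
    static final List<Map<String, Object>> TABLE = new ArrayList<>();

    /** Picks the data value column from the value's type. */
    static String valueColumn(Object v) {
        if (v instanceof Handle) return "V_OBJECT_ID";
        if (v instanceof Byte) return "V_BYTE";
        if (v instanceof Character) return "V_CHARACTER";
        if (v instanceof Short) return "V_SHORT";
        if (v instanceof Integer) return "V_INTEGER";
        if (v instanceof Long) return "V_LONG";
        if (v instanceof Float) return "V_FLOAT";
        if (v instanceof Boolean) return "V_BOOLEAN";
        if (v instanceof String) return "V_STRING";
        throw new IllegalArgumentException("unsupported type: " + v.getClass());
    }

    /** Adds one row per member record of the intermediate data structure. */
    static void persist(long objectId, String className, Member[] members) {
        for (Member m : members) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("OBJECT_ID", objectId);
            row.put("NAME", m.name);
            row.put("CLASS", className);
            row.put("DECLARING_CLASS", m.declaringClass);
            row.put("TYPE", m.value.getClass().getSimpleName());
            Object stored = m.value;
            if (stored instanceof Handle) {
                stored = ((Handle) stored).objectId; // store the wrapped object id
            }
            row.put(valueColumn(m.value), stored);
            TABLE.add(row);
        }
    }

    public static void main(String[] args) {
        // A Customer whose firstname member refers to String object 3.
        persist(2, "Customer", new Member[] {
            new Member("firstname", "Customer", new Handle(3))});
        persist(3, "java.lang.String", new Member[] {
            new Member("~value", "java.lang.String", "John")});
        for (Map<String, Object> row : TABLE) {
            System.out.println(row);
        }
    }
}
```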
Listing 7 (lines 1-48) below illustrates an example of Java code that may be used to rebuild an intermediate data structure (e.g., 1202,
The rows that store the member records comprising the specified intermediate data structure may be retrieved in (line 4) using the objectId, which identifies the specified intermediate data structure. An instance of class ArrayList, named list, will store the data, extracted from the data table 1300, that comprise the specified intermediate data structure.
The loop (lines 5-46) is performed for each of the rows retrieved in (line 4). For each row, the member name, its declaring class, and its data type are retrieved in (lines 7-9) from the data table 1300. The switch statement (lines 11-44) accesses one of the data value columns 1312 of the data table 1300 according to the data type obtained in (line 9). The data are then added to the ArrayList instance list in (line 45).
When all the member records of the specified intermediate data structure have been retrieved, the ArrayList instance list—converted to an array of instances of the Member class—is returned.
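The rebuilding steps can likewise be sketched against an in-memory table. This is not a reproduction of Listing 7: rows are modeled as maps, the TYPE column is assumed to hold the type's name, and the Handle and Member classes and the row helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RebuildSketch {
    /** Hypothetical handle: simply wraps an object id. */
    static final class Handle {
        final long objectId;
        Handle(long objectId) { this.objectId = objectId; }
    }

    /** Hypothetical member record of an intermediate data structure. */
    static final class Member {
        final String name;
        final String declaringClass;
        final Object value;
        Member(String name, String declaringClass, Object value) {
            this.name = name;
            this.declaringClass = declaringClass;
            this.value = value;
        }
    }

    /** Maps a stored type name to the data value column that holds the value. */
    static String valueColumn(String type) {
        switch (type) {
            case "Handle": return "V_OBJECT_ID";
            case "Byte": return "V_BYTE";
            case "Character": return "V_CHARACTER";
            case "Short": return "V_SHORT";
            case "Integer": return "V_INTEGER";
            case "Long": return "V_LONG";
            case "Float": return "V_FLOAT";
            case "Boolean": return "V_BOOLEAN";
            case "String": return "V_STRING";
            default: throw new IllegalStateException("unknown type: " + type);
        }
    }

    /** Collects the member records of one object id into a Member array. */
    static Member[] rebuild(List<Map<String, Object>> table, long objectId) {
        List<Member> list = new ArrayList<>();
        for (Map<String, Object> row : table) {
            if (!Long.valueOf(objectId).equals(row.get("OBJECT_ID"))) {
                continue; // row belongs to a different intermediate data structure
            }
            String type = (String) row.get("TYPE");
            Object value = row.get(valueColumn(type));
            if (type.equals("Handle")) {
                value = new Handle((Long) value); // re-wrap the stored object id
            }
            list.add(new Member((String) row.get("NAME"),
                                (String) row.get("DECLARING_CLASS"), value));
        }
        return list.toArray(new Member[0]);
    }

    /** Hypothetical helper that builds one table row. */
    static Map<String, Object> row(long objectId, String name, String declaringClass,
                                   String type, Object value) {
        Map<String, Object> r = new HashMap<>();
        r.put("OBJECT_ID", objectId);
        r.put("NAME", name);
        r.put("DECLARING_CLASS", declaringClass);
        r.put("TYPE", type);
        r.put(valueColumn(type), value);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> table = new ArrayList<>();
        table.add(row(2, "firstname", "Customer", "Handle", 3L));
        table.add(row(3, "~value", "java.lang.String", "String", "John"));
        Member[] members = rebuild(table, 3);
        System.out.println(members[0].name + " = " + members[0].value);
    }
}
```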
Listing 8 (lines 1-6) below illustrates an example of Java code that may be used to identify the class of an intermediate data structure (e.g., 1202,
The above description illustrates various embodiments of the present disclosure along with examples of how aspects of the particular embodiments may be implemented. The above examples should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of the particular embodiments as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents may be employed without departing from the scope of the present disclosure as defined by the claims.
The present disclosure relates to U.S. patent application Ser. No. 11/023,917 filed Dec. 27, 2004, the content of which is incorporated herein by reference in its entirety for all purposes.