A relational database management system may support the ability to store hierarchically structured documents, such as extensible markup language (XML) documents, natively as columns within relational tables. The relational objects are stored in rows of a base table. The relational objects do not contain the XML data itself, but instead contain the unique XML Document Identifier, called a “Doc ID” herein. The Doc ID is unique across a table. The XML document is stored as XML data nodes, usually in sub-trees, in XML records assigned unique record identifiers (RID). The XML records are stored separately from the base table. The Doc ID stored in the base table is used to refer to the XML records, and links between the XML records are through a Node ID index, which references the Doc ID mapped to the unique Node ID's assigned to the XML data nodes and the RID's of the XML records.
Keeping versions of an XML document after an update of any portion of the XML document may be useful. For applications that tend to have a high volume of concurrent readers, keeping multiple versions of an XML document during update so that the readers can still read the old version without waiting may be important. Multi-versioning can also help provide snapshot semantics and the ability to select from old data.
One approach is to store a version of the whole XML document each time the XML document is modified. However, this approach is inefficient in terms of storage space and time, especially when a large number of sub-document updates occur.
According to one embodiment of the present invention, a method for multi-versioning data of a hierarchically structured document stored in a plurality of data records of a relational database system, comprises: changing document data in one or more data records of the plurality of data records, each data record assigned a record identifier, the data record comprising a plurality of nodes assigned a node identifier, and the hierarchically structured document assigned a document identifier; storing an update timestamp in a base table row referencing the document identifier; storing in each changed data record a start timestamp for a start of a validity period for the changed data record and an end timestamp for an end of the validity period; and storing the start timestamp and the end timestamp in one or more node identifier index entries referencing the document identifier, the record identifier, and the node identifier.
In one aspect of the embodiment of the present invention, the one or more data records are inserted into the plurality of data records: where a current timestamp comprising a time of the inserting is stored in the base table row referencing the document identifier; where the current timestamp is stored in each inserted data record as the start timestamp and a large value is stored as the end timestamp; and where the current timestamp is stored as the start timestamp and the large value is stored as the end timestamp in the one or more node identifier index entries referencing the document identifier, the record identifier, and the node identifier.
In another aspect of the embodiment of the present invention, the one or more data records of the plurality of data records is updated: where a current timestamp comprising a time of the updating is stored in the base table row referencing the document identifier; where for each data record replaced in the updating, the current timestamp is stored in the replaced data record as the end timestamp, and for each replacement data record in the updating, the current timestamp is stored in the replacement data record as the start timestamp and a large value as the end timestamp; and where for each data record replaced in the updating, the current timestamp is stored as the end timestamp in the one or more node identifier index entries referencing the document identifier, a record identifier assigned to the replaced data record, and a node identifier assigned to the replaced data record, and for each replacement data record in the updating, one or more new node identifier index entries referencing the document identifier, a record identifier assigned to the replacement data record, and a node identifier assigned to the replacement data record are inserted, and the current timestamp is stored as a start timestamp and the large value as an end timestamp in the one or more new node identifier index entries.
In another aspect of the embodiment of the present invention, the hierarchically structured document is deleted: where the base table row referencing the document identifier is deleted; where a current timestamp comprising a time of the deleting as the end timestamp is stored in each data record of the deleted hierarchically structured document; and where the current timestamp is stored as the end timestamp in the one or more node identifier index entries for each data record of the deleted hierarchically structured document.
In another aspect of the embodiment of the present invention, a query to select a version of the hierarchically structured document is received, the query comprising the document identifier and a version timestamp; the node identifier index is searched for one or more entries referencing the document identifier and the node identifier, and where the start timestamp of the entry is less than or equal to the version timestamp and the end timestamp of the entry is greater than the version timestamp; one or more data records for the version of the hierarchically structured document are found using the found node identifier entries; and the obtained data records are returned.
In one aspect of the embodiment of the present invention, the version timestamp is obtained from the update timestamp in the base table row referencing the document identifier.
In another aspect of the embodiment of the present invention, the version timestamp is obtained from a timestamp for the query.
System and computer program products corresponding to the above-summarized methods are also described and claimed herein.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java® (Java, and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both), Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer special purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified local function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
In this embodiment, the Node ID index entries are sorted in descending order of the start and end timestamps 206-207, so that the more current version of the XML record is listed before the older versions of the XML record. Since the most recent data are typically accessed more frequently, the sorting of the index entries by timestamps avoids significant impact on system performance due to the multi-versioning method of the present invention.
When the change to the XML document involves an update of the XML document (410), which may be considered a replacement of existing XML record(s) with new XML record(s), the method sets the update timestamp 204 in the XML indicator column of the base table row referencing the Doc ID of the XML document to CTS (411). In this embodiment, the CTS is the time of the update operation. For the replaced XML record, the method sets the end timestamp 212 to CTS (412). For the replacement XML record, the method sets the start timestamp 211 to CTS and the end timestamp 212 to a large value (413). For the Node ID index entries for the replaced XML record, the method sets the end timestamp 207 to CTS (414). For the Node ID index entries for the replacement XML record, the method sets the start timestamp 206 to CTS and the end timestamp 207 to a large value (415). Thus, the validity period for the replaced XML record is defined to be from its existing start timestamp to CTS, while the validity period for the replacement XML record is defined to be from CTS to a time far into the future.
When an XML document is deleted (420), the method deletes the base table row referencing the Doc ID of the XML document (421). The method sets the end timestamp 212 in the XML records of the deleted XML document to CTS (422), and sets the end timestamp 207 in the Node ID index entries for the XML records of the deleted XML document to CTS (423). In this embodiment, the CTS is the time of the delete operation. Thus, the validity period for the deleted XML records is defined to be from the existing start timestamps to CTS.
Referring to both
In
Referring to both
Continuing with the example illustrated in
Continuing with the example illustrated in
For example, the method begins the search of Node ID index entries with the search key (DocID=docid, NodeID=0, START_TS<=ts and END_TS>ts), which returns the root record of the XML document with Doc ID=docid with the validity period defined by the start and end timestamps. The root record is traversed, and the method determines if the XML document contains additional XML records. In response to determining that there are additional XML records, the method searches the Node ID index for an entry with a new nodeid value. This search key (docid, nodeid, START_TS<=ts and END_TS>ts) is then used to find another XML record. This search is repeated until all the XML records for the XML document is fetched and traversed. These XML records are then returned as a particular version of the XML document.
Continuing with the example in
With the embodiment of the method of the present invention, several features may be supported, including but not limited to: last committed read feature, snapshot semantics, current version only feature, select from old data for update and delete feature, converting from non-versioning formats, and the purging of old versions and deleted data.
For the last committed read feature, when a current base table row is locked for an update, a last committed version may be found using the method of the present invention with a valid Doc ID and timestamp. The select operation described above may be used to find the corresponding XML document version. A reader of XML data need not wait for the update to complete before reading the XML document data that was committed.
For the snapshot semantics feature, a query timestamp is used to obtain the XML records instead of a stored timestamp. Using the Doc ID and the query timestamp, the select operation described above may obtain a snapshot of the XML document at the given timestamp.
For the current version only feature, utilities (such as REORG, CHECK DATA, CHECK INDEX, and REBUILD INDEX, each known in the art) may ignore old versions and focus on the current version only by checking only those records with the end timestamp=‘FFFFFFFF’.
For the selection from old data for update and delete feature, the method supports versioning of deleted XML data, so that the deleted XML data may be read back. To select old XML data, the old update timestamp from the base table row is maintained, i.e., multi-versioning of the base table is provided. The method then uses the (Doc ID, old update timestamp) pair in the select operation described above to obtain the old version of the XML document which contains the deleted XML data.
For the converting from non-versioning format feature, when converting from a non-versioning format to the versioning format supported by the method of the present invention, a zero timestamp, a timestamp at the time of conversion, or a default timestamp can be used to fill both the base table row update timestamp and the start timestamp in the Node ID index entries and XML records. Further, the end timestamps can be filled with “FFFFFFFF’ or a default.
For the purging of old versions and deleted data feature, if no one is reading a version older than timestamp ts, the records with an end timestamp <=ts can be deleted. More specifically, after a delete operation or an update operation logically deletes XML records by setting the end timestamp=CTS, the XML records can be purged when both of the following criteria are met: (1) the delete or update operation have been committed; and (2) there are no deferred fetches or readers that still need the logically deleted XML records.
Concerning criteria (1), until the delete or update operation has been committed, these operations may be rolled back. The XML records thus cannot be purged until it is known that the XML records will not be needed for rollback operations. To determine whether the delete or update operation has committed, a lock that is not compatible with a lock held by the delete or update operation can be acquired. However, since acquiring a lock is not efficient, the method may alternatively compare the end timestamp of the XML record with a ‘commit timestamp’ that is tracked for the XML table or for a larger scope. The commit timestamp is the timestamp of the oldest delete or update operation that has not committed. If the end timestamp value of a deleted XML record is older than (less than) the commit timestamp, then the delete or update operation has committed.
Concerning criteria (2), for a select operation, if the base table row was fetched to access the Doc ID and the XML indicator column value (update timestamp), but the XML records were not accessed immediately, the XML records need to persist even if the delete or update operation has logically deleted the version and committed. For the purpose of tracking the readers of XML data, when the base table row is fetched to access the Doc ID and the XML indicator column value, the timestamp for the reader is registered to track a ‘reader timestamp’. The reader timestamp is tracked for the XML table or for a larger scope. The reader timestamp is the timestamp of the oldest active reader with a read interest on the XML table. If the end timestamp value of the deleted XML record is older than (less than) the reader timestamp, then there are no readers that need to read the logically deleted XML record.
In determining if criteria (1) and (2) are met, the lesser of the commit timestamp and the reader timestamp are used to find XML records to purge. When the end timestamp value is less than the lesser of the commit timestamp and the reader timestamp, the XML record may be purged.
In this embodiment of the method of the present invention, a separate background task may be used to perform the actual purging of the XML records. When a delete operation or update operation occurs, the method provides the background task with information needed to purge XML records at a later time. This includes the end timestamp value used for the delete or update operation and information about the XML column that has XML records that were logically deleted. The lowest end timestamp value is kept for the background task. In response to the lowest end timestamp value being less than the lesser of the commit timestamp and the reader timestamp, the background task fetches XML records and determines whether the XML record is to be purged. When the end timestamp value of the XML record that is fetched is less than the lesser of the commit timestamp and the reader timestamp, the background task purges the XML record.
This feature may be helpful for a database reorganization (REORG) utility in reorganizing XML data. The REORG utility may compare the lesser of the commit timestamp and the reader timestamp with the XML record's end timestamp value. In response to the end timestamp value being lower, the REORG utility purges the XML records.
Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.