This invention relates generally to data processing in computer networks. More particularly, this invention relates to techniques for management of objects with valid time stamps and system time stamps (bitemporal objects).
Bitemporal objects are associated with both a valid time that marks when a thing is known in the real world and a system time that marks when the thing is available for discovery in a server. Bitemporal data is necessary whenever there is a requirement to maintain snapshots of a transaction across various time dimensions. For example, financial and insurance industries use bitemporal data to track changes to contracts, policies, and events in a manner that adheres to strict regulation and compliance requirements.
There is a need for improved techniques for managing bitemporal objects.
A machine has a processor and a memory connected to the processor. The memory stores instructions executed by the processor to construct an object collection where each object in the object collection has a common identifier, a valid time start field, a valid time end field, a system time start field and a system time end field. The object collection includes split objects with a legacy object and an updated object with the system time start field set to the system time at which the split objects are formed.
The invention is more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which:
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
Each client device 102 includes standard components, such as a central processing unit connected to input/output devices 112 via a bus 114. The input/output devices 112 may include a keyboard, mouse, touch display and the like. A network interface circuit 116 is also connected to the bus 114 to provide connectivity to network 106. A memory 120 is also connected to the bus 114. The memory 120 stores a data source 122 configured for uploading to server 104.
The server 104 also includes standard components, such as a central processing unit 160, a bus 162, input/output devices 164 and a network interface circuit 166. A memory 170 is connected to bus 162. The memory 170 stores instructions executed by the central processing unit 160 to implement operations of the invention. In one embodiment, the memory stores an object collection curator 172 with instructions to implement the operations shown in connection with
The timestamp assignment operation 210 accommodates a user specified system start time. That is, the system allows one to set a system time start field to a value earlier than the current system time. Thus, an object inserted into a collection can effectively be placed backwards in system time. Most systems only allow objects to be inserted at or after the current system time. This feature is significant because client machines 102_1 through 102_N may be operating in slightly different time frames. An enterprise controlling the client machines may need to observe these distinct time domains.
If there is no user specified system start time, the current system time is used as the system start time. The system time end value is typically set to infinity upon object insertion. The system time end value may be set to system time or a user input value.
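By way of illustration only, the defaulting rules of the timestamp assignment operation 210 may be sketched in plain JavaScript as follows. The function name assignSystemTimes, the field names systemStart and systemEnd, and the use of numeric timestamps are assumptions made for this sketch and are not taken from the disclosed implementation.

```javascript
// Hypothetical helper (illustrative only): applies the defaulting rules
// described above to a plain object representing a document.
function assignSystemTimes(doc, { systemStart, systemEnd, now = Date.now() } = {}) {
  return {
    ...doc,
    // A user specified start time is honored, even if it is earlier than "now";
    // otherwise the current system time is used.
    systemStart: systemStart ?? now,
    // The end time is open-ended (infinity) unless a value is supplied.
    systemEnd: systemEnd ?? Number.POSITIVE_INFINITY,
  };
}

// Default case: the current system time (here fixed at 1000) becomes the start.
const defaulted = assignSystemTimes({ id: "order" }, { now: 1000 });
// Backdated case: the caller places the object earlier in system time.
const backdated = assignSystemTimes({ id: "order" }, { systemStart: 400, now: 1000 });
```

In this sketch a backdated start time is simply accepted; any validation against other versions of the same object would be layered on separately.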
The foregoing is more fully appreciated with reference to various examples. Terms used in connection with the examples include:
Temporal objects have both a valid axis and system axis.
A bitemporal object is managed as a series of versioned objects in a collection. The ‘original’ object inserted into the object collection is kept and never changes. Updates to the object are inserted as new objects with different valid and system times. A delete of the object is also inserted as a new object. In this way, a bitemporal object can be “rolled back” to review, at any point in time, when the information was known in the real world and when it was recorded in the object collection.
Bitemporality is defined on a collection, sometimes referred to as a temporal collection. A temporal collection is a logical grouping of temporal objects that share the same axes with timestamps defined by the same range indices. One can create additional temporal collections for objects that require a different schema for the timestamps. An object can be in any number of forms, including a Resource Description Framework (RDF) object, an eXtensible Markup Language (XML) object, a JavaScript Object Notation (JSON) object and a text object. Objects are sometimes referred to herein as documents.
When a document is inserted into a temporal collection, a URI collection is created for that document. When the document is updated, a new document representing the update is inserted into the document's URI collection. Any new document inserted into the temporal collection has its own unique URI collection that holds all of the versions of that document. The latest version of each document resides in a latest collection.
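The bookkeeping described above may be pictured with the following minimal JavaScript sketch. It is an in-memory model only; the class name TemporalCollectionModel and its members are hypothetical, and time stamps and object splitting are deliberately omitted so that only the URI collection and latest collection behavior is shown.

```javascript
// Illustrative in-memory model (not the disclosed implementation).
class TemporalCollectionModel {
  constructor(name) {
    this.name = name;
    this.uriCollections = new Map(); // URI -> every version of that document
    this.latest = new Map();         // URI -> most recent version only
  }

  // Inserting under a new URI creates its URI collection; inserting under an
  // existing URI appends a new version to the same URI collection.
  insert(uri, content) {
    if (!this.uriCollections.has(uri)) {
      this.uriCollections.set(uri, []);
    }
    const versions = this.uriCollections.get(uri);
    const version = { uri, versionNumber: versions.length + 1, content };
    versions.push(version);
    this.latest.set(uri, version);
    return version;
  }
}

const kool = new TemporalCollectionModel("kool");
kool.insert("koolorder.json", { price: 12 });
kool.insert("koolorder.json", { price: "closing" });
```

After the two inserts, the URI collection for koolorder.json holds both versions, while the latest collection holds only the second.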
The valid and system axes each make use of dateTime range indexes that define the start and end times. For example, the following code creates element range indexes to be used to create the valid and system axes. The following code is in JavaScript®. XQuery® or other languages may also be used in accordance with embodiments of the invention.
System and valid axes may be formed using the following JavaScript® code.
An object collection or temporal collection named “kool” may be created using the previously created system and valid axes. The following code accomplishes this.
Consider the example of a stock trader, John, who places an order to buy some stock. The record of the trade is stored as a bitemporal object. The stock of KoolCo is trading around $12.65. John places a limit order to buy 100 shares of the stock for $12 at 11:00:00 on 3 Apr. 2014 (this is the valid time). The document for the transaction is recorded in the broker's database at 11:00:01 on 3 Apr. 2014 (this is the system time).
The last line of code inserts the document. The temporal collection is “kool”, the URI is “koolorder.json” and the root is the content of the document.
John looks at the trading pattern of the stock over the last week and notices that it always dips during the last minute of the trading day. At 11:30:00, John changes his order to buy the stock at the closing price (15:59:59). The change is recorded as another document in the broker's database at 11:30:01.
This results in three documents with valid and system times as shown in
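The manner in which a single update yields three documents may be sketched as follows. This is a simplified model in plain JavaScript and is offered as an illustration rather than as the disclosed implementation. The function name applyUpdate and the field names are hypothetical, and the valid end times are assumed to be open-ended because the text above states only the start times.

```javascript
const INF = Number.POSITIVE_INFINITY;
// Helper: milliseconds for a time of day on 3 Apr. 2014 (UTC assumed).
const t = (hhmmss) => Date.parse(`2014-04-03T${hhmmss}Z`);

// Returns a new version list after applying an update to the current versions.
function applyUpdate(versions, update) {
  const result = [];
  for (const v of versions) {
    const isCurrent = v.systemEnd === INF;
    const overlaps = v.validStart < update.validEnd && update.validStart < v.validEnd;
    if (!isCurrent || !overlaps) {
      result.push(v);
      continue;
    }
    // Legacy object: same content and valid period, system period now closed.
    result.push({ ...v, systemEnd: update.systemStart });
    // Split objects: the parts of the old valid period not covered by the
    // update keep the old content, with system start set to the split time.
    if (v.validStart < update.validStart) {
      result.push({ ...v, validEnd: update.validStart, systemStart: update.systemStart, systemEnd: INF });
    }
    if (update.validEnd < v.validEnd) {
      result.push({ ...v, validStart: update.validEnd, systemStart: update.systemStart, systemEnd: INF });
    }
  }
  // Updated object carrying the new content.
  result.push({ ...update, systemEnd: INF });
  return result;
}

const original = {
  content: { order: "limit", price: 12 },
  validStart: t("11:00:00"), validEnd: INF,
  systemStart: t("11:00:01"), systemEnd: INF,
};
const change = {
  content: { order: "at close" },
  validStart: t("11:30:00"), validEnd: INF,
  systemStart: t("11:30:01"),
};
const docs = applyUpdate([original], change);
```

Under these assumptions, docs holds the legacy document with its system period closed at 11:30:01, a split document with the original content valid from 11:00:00 to 11:30:00, and the updated document valid from 11:30:00 onward.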
The object collection may now be queried. The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that were in the database between 11:10 and 11:15.
In this example, only Doc 1 meets the search criteria. This is shown pictorially in
The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that have a valid time period that starts after 10:30 and ends at 15:59. ALN_FINISHES is one of the comparison operators described in “Allen Operators” discussed in detail below.
In this example, only Doc 2 meets the search criteria, as shown in
The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that were in the database after 11:20.
ALN_AFTER is one of the comparison operators described below in connection with “Allen Operators”.
In this example, both Doc 2 and Doc 3 meet the search criteria, as shown in
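The two system time queries discussed above (between 11:10 and 11:15, and after 11:20) may be modeled with the following plain JavaScript sketch. It does not use the cts:period-range-query function; it merely filters an in-memory list. The system periods assigned to Doc 1, Doc 2 and Doc 3 are assumptions consistent with the recording times stated in the example, and the function names are hypothetical.

```javascript
const INF = Number.POSITIVE_INFINITY;
// Helper: milliseconds for a time of day on 3 Apr. 2014 (UTC assumed).
const t = (hhmmss) => Date.parse(`2014-04-03T${hhmmss}Z`);

// Assumed system periods for the three documents of the example.
const docs = [
  { name: "Doc 1", systemStart: t("11:00:01"), systemEnd: t("11:30:01") },
  { name: "Doc 2", systemStart: t("11:30:01"), systemEnd: INF },
  { name: "Doc 3", systemStart: t("11:30:01"), systemEnd: INF },
];

// Documents whose system period overlaps the interval [start, end).
function inDatabaseDuring(collection, start, end) {
  return collection
    .filter((d) => d.systemStart < end && start < d.systemEnd)
    .map((d) => d.name);
}

// Documents whose system period begins after the given instant.
function recordedAfter(collection, instant) {
  return collection
    .filter((d) => d.systemStart > instant)
    .map((d) => d.name);
}

const during = inDatabaseDuring(docs, t("11:10:00"), t("11:15:00"));
const after = recordedAfter(docs, t("11:20:00"));
```

With the assumed periods, during contains only Doc 1 and after contains Doc 2 and Doc 3, which agrees with the results stated above.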
The following query searches the temporal documents, using the cts:period-compare-query function to locate the documents that were in the database when the valid time period is within the system time period. ISO_CONTAINS is one of the comparison operators described below in connection with “ISO SQL 2011 Operators”.
In this example, only Doc 3 meets the search criteria, as shown in
The following query uses the cts:and-query to AND two cts:collection-query functions to return the temporal document that is in both the URI collection, koolorder.json, and the latest collection.
In this example, Doc 3 meets the search criteria, as shown in
The system is configured to allow one to manually set the system start time when inserting or updating a document in a collection. This feature is useful when one needs to maintain a “master” system time across multiple clients that are concurrently inserting and updating bitemporal documents, without the need for the clients to communicate with one another in order to coordinate their system times.
The system start times for document versions with the same URI must progress along the system time axis, so that an update to a document cannot have a system start time that is earlier than that of the document that chronicles its last update. However, when managing documents with different URIs in a temporal collection, it is necessary to ensure that the system time progresses at the same rate for every document insert and update.
A special timestamp, called the Last Stable Query Time (LSQT), can be enabled on a temporal collection to manage system start times across documents with different URIs. A temporal document with a system start time before the LSQT can only be queried, and a document with a system start time after the LSQT can be updated/ingested, but not queried. This approach is illustrated in
When LSQT is enabled on a temporal collection, the LSQT value starts at 0 (lowest timestamp). When the LSQT is advanced, document reads and writes are queued until the LSQT is reset to the maximum system start time in the database. For example, the following query first checks to make sure the application time (simulated by the current time) is greater than the LSQT:
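The gating behavior of the LSQT may be approximated by the following plain JavaScript sketch. It is a simplified model and not the disclosed query; the class name LsqtGate and its methods are hypothetical. The sketch assumes that a system start time exactly equal to the LSQT is treated as queryable, a boundary case the description leaves open, and it does not model the queuing of reads and writes.

```javascript
// Illustrative model of LSQT gating (names are hypothetical).
class LsqtGate {
  constructor() {
    this.lsqt = 0;   // LSQT starts at the lowest timestamp
    this.docs = [];
  }

  // Ingest is allowed only when the system start time is after the LSQT.
  insert(doc, systemStart) {
    if (systemStart <= this.lsqt) {
      throw new Error(`system start ${systemStart} is not after LSQT ${this.lsqt}`);
    }
    this.docs.push({ ...doc, systemStart });
  }

  // Advancing the LSQT makes earlier documents stable for queries.
  advance(newLsqt) {
    this.lsqt = newLsqt;
  }

  // Only documents at or before the LSQT are visible to queries (assumption).
  queryable() {
    return this.docs.filter((d) => d.systemStart <= this.lsqt);
  }
}

const gate = new LsqtGate();
gate.insert({ id: "a" }, 100);
const visibleBefore = gate.queryable().length; // not yet stable
gate.advance(150);
const visibleAfter = gate.queryable().length;  // now stable
```

An application would perform the check in insert before each write, comparing its application time against the LSQT, in the manner described above.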
One can use the temporal:document-delete function to delete temporal documents. Deleting a temporal document maintains the document and all of its versions in the URI collection and updates the deleted document and all of its versions that have a system end time of infinity to the time of the delete. Deleting a temporal document removes the document from the latest collection. So the latest collection is the source of all of the documents that are currently valid and the URI collections are the source of the history of each document. Should one insert a document using the same URI as a deleted document, the deleted document, and all of its previous versions remain in the same URI collection as the “newly” inserted document. The newly inserted document is then added to the latest collection.
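The delete semantics described above may be sketched in plain JavaScript as follows. The sketch does not call the temporal:document-delete function; the function name deleteDocument and the store layout (a map of URI collections and a map for the latest collection) are assumptions made for illustration.

```javascript
const INF = Number.POSITIVE_INFINITY;

// Illustrative delete: history is retained, open system periods are closed,
// and the document leaves the latest collection.
function deleteDocument(store, uri, deleteTime) {
  const versions = store.uriCollections.get(uri) ?? [];
  for (const v of versions) {
    if (v.systemEnd === INF) {
      v.systemEnd = deleteTime; // close every version that was still current
    }
  }
  store.latest.delete(uri);
}

// A store holding one closed and one open version of the same document.
const store = {
  uriCollections: new Map([
    ["koolorder.json", [
      { content: "v1", systemStart: 10, systemEnd: 20 },
      { content: "v2", systemStart: 20, systemEnd: INF },
    ]],
  ]),
  latest: new Map([["koolorder.json", { content: "v2" }]]),
};

deleteDocument(store, "koolorder.json", 30);
```

After the call, both versions remain in the URI collection, the formerly open version ends at the delete time, and the latest collection no longer lists the document.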
Returning to the example discussed in connection with
This transaction is recorded as another document with a valid time of 12:10:00, but due to heavy trading, the change is not recorded in the broker's database until 12:10:12. The resulting collection is shown in
At 13:00:00, the purchase order has not been filled and John decides he no longer wants to buy the stock, so he cancels his order. This cancellation is recorded as another document with a valid time of 13:00:00 and recorded in the broker's database at 13:00:02. The following code reflects this activity.
The resulting collection is shown in
The broker's policy is to honor the valid times for all orders. At 13:00:03, the order fulfillment application reviews the valid and system times recorded in the cancellation document, determines that John in fact cancelled his order before it was filled, and does not debit his account for the stock purchase. At 16:00:00, the broker deletes the order, which results in the collection shown in
The query processor 176 may be configured to support Allen interval algebra operators. Allen interval algebra is a calculus for temporal reasoning. The calculus defines possible relations between time intervals and provides a composition table that can be used as a basis for reasoning about temporal descriptions of events. The left side of
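The relations of Allen interval algebra may be illustrated with the following plain JavaScript sketch, which classifies the relation of one interval to another. The function name allenRelation and the relation labels are the conventional names from the algebra and are used here for illustration; the sketch does not show how the query processor 176 maps them to operator identifiers. Each interval is assumed to satisfy start < end.

```javascript
// Classifies the Allen relation of interval x with respect to interval y.
// Intervals are objects of the form { start, end } with start < end.
function allenRelation(x, y) {
  if (x.end < y.start) return "before";
  if (x.end === y.start) return "meets";
  if (y.end < x.start) return "after";
  if (y.end === x.start) return "met-by";
  // From here on, the two intervals share more than a single point.
  if (x.start === y.start && x.end === y.end) return "equals";
  if (x.start === y.start) return x.end < y.end ? "starts" : "started-by";
  if (x.end === y.end) return x.start > y.start ? "finishes" : "finished-by";
  if (x.start > y.start && x.end < y.end) return "during";
  if (x.start < y.start && x.end > y.end) return "contains";
  return x.start < y.start ? "overlaps" : "overlapped-by";
}

const r1 = allenRelation({ start: 5, end: 6 }, { start: 3, end: 4 });
const r2 = allenRelation({ start: 2, end: 4 }, { start: 1, end: 4 });
const r3 = allenRelation({ start: 1, end: 5 }, { start: 2, end: 3 });
```

Exactly one of the thirteen relations holds for any pair of well-formed intervals, which is what allows a query to select documents by naming a single relation.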
The foregoing examples reference two-dimensional object splitting. The disclosed system also supports multi-dimensional object splitting. Consider the multi-temporal time frame illustrated in
Next, there is an update at t2 with content V2. Vp3 is between vp1 and vp2, while vs3 is between vs1 and vs2. This update results in four cubes after the object split. Object “#1” has content V2 and is specified by (vp1, vp3), (vs1, vs3), (t2, INF). Object “#2” has content V1 and is specified by (vp1, vp2), (vs1, vs2), (t1, t2). Object “#3” has content V1 and is specified by (vp1, vp2), (vs2, vs3), (t2, INF). Finally object “#4” has content V1 and is specified by (vp3, vp2), (vs1, vs2), (t2, INF).
Those skilled in the art will appreciate that the disclosed techniques facilitate a number of computer system enhancements. Specified time ranges may be used to migrate selected objects to tiered storage. For example, older objects may be migrated to cheaper, slower storage resources (e.g., magnetic tape).
The query processor 176 may be configured to cache query results in system time segments to support range queries on system time. In this way, query results can be cached and utilized without being evicted from the cache.
The object collection curator 172 may be configured to form a replicated object from an object in the object collection 174. For example, a replica may be from a segment of a master object.
In one embodiment, the object collection curator 172 includes a safety switch that precludes the alteration of the history of an object. For example, the object collection curator 172 may be configured to disable edits to time field values.
An embodiment of the present invention relates to a computer storage product with a non-transitory computer readable storage medium having computer code thereon for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include, but are not limited to: magnetic media, optical media, magneto-optical media and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits (“ASICs”), programmable logic devices (“PLDs”) and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter. For example, an embodiment of the invention may be implemented using JAVA®, C++, or other object-oriented programming language and development tools. Another embodiment of the invention may be implemented in hardwired circuitry in place of, or in combination with, machine-executable software instructions.
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that specific details are not required in order to practice the invention. Thus, the foregoing descriptions of specific embodiments of the invention are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed; obviously, many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications; they thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the following claims and their equivalents define the scope of the invention.
This application claims priority to U.S. Provisional Patent Application Ser. No. 61/976,378, filed Apr. 7, 2014, the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61976378 | Apr 2014 | US