This invention relates to updating data items in a deferred copy of a data message. In particular, the invention relates to creating actual copies of data items that are updated.
Computer data messaging systems include message producers and message consumers for the distribution of application or business data. Message producers and message consumers are typically individual computer systems configured to communicate with each other, such as over a network. Message producers generate data messages that can include a number of data items, such as a header containing administrative information and a payload containing the information that the message is intended to deliver. For example, payload data can include data specific to a particular business application. The data messages are sent to message consumers, such as business processes, which use the data messages.
When a consumer receives a data message, it may make an update to the contents of all or part of the message. For example, a consumer may add information to a message, such as a time-stamp, or update an existing time stored in the message. Such updates to a data message made by a consumer should not be visible to the producer of the message, or other consumers who will also receive the message. This is because an update to the data message made by a particular consumer is specific to that consumer. To overcome the problem of consumers receiving a data message that has been updated by other consumers, each consumer can take a copy of the message. A copy of the message can be updated by a consumer who made the copy without affecting the original message that is visible to all other consumers. While a consumer taking a copy of a message overcomes the problem of updates to the message being visible to other consumers, such copies are unnecessary if a consumer does not update a data message. Consequently, redundant copies of a data message can exist on consumer computer systems. A redundant copy of a data message represents a waste of resources; processing time is wasted generating the redundant copy and storage space is wasted recording the redundant copy. Furthermore, a data message containing a large amount of data will be copied in its entirety even when a consumer may only update a small fraction of the complete message data. For example, a data message containing a small header and a voluminous payload may be copied in its entirety even if a consumer will only change a time-stamp in the message header. Thus, the copy of the voluminous payload is extraneous where only the small header is updated.
One known technique to alleviate some of these problems is to take a ‘deferred’ or ‘lazy’ copy of a data message as opposed to a complete copy of the message. To generate a lazy copy of an original data message, a place-holder is used to represent a copy of the original message, but instead of including a complete copy of all data items in the original message the place-holder simply refers to the original message. Thus, the place-holder is actually a reference to the original message, and data items can be addressed within the place-holder, although no actual copy of the data items exists within the place-holder. Subsequently, if a data item addressed within the place-holder is updated, a full copy of the original data message is generated by populating the place-holder with a complete copy of all data items in the original data message. The required update can then be made to the data item in the copy of the message. In this way the complete message data is not actually copied until an update is made to the copy of the message, so avoiding unnecessary redundant copies of the original data message. However, the technique of lazy copying does not overcome the problem of duplicating all data in an original message when only a small fraction of the original message is updated. For example, if an original data message contains a small header and a voluminous payload, a lazy copy can be generated by a message consumer as a place-holder with a reference to the original data message. Subsequently, if the consumer updates a time-stamp addressed within the place-holder, a complete copy of the original message is generated within the place-holder. Thus, using a lazy copy technique, a complete copy of all message data takes place even when only a small change is required.
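By way of illustration only, the known lazy copy technique described above can be sketched as follows. The class name `LazyMessageCopy`, its methods and the dictionary-based message layout are assumptions made for this sketch and do not appear in the original text.

```python
import copy


class LazyMessageCopy:
    """Hypothetical place-holder that refers to an original message
    until the first update is made through it."""

    def __init__(self, original):
        self._original = original  # reference only; no data is copied yet
        self._items = None         # populated on the first update

    def get(self, key):
        # Before any update, reads are served from the original message.
        source = self._original if self._items is None else self._items
        return source[key]

    def set(self, key, value):
        if self._items is None:
            # First update: the place-holder is populated with a complete
            # copy of ALL data items, however small the change.
            self._items = copy.deepcopy(self._original)
        self._items[key] = value

    def is_populated(self):
        return self._items is not None


# Assumed example message: a small header and a (notionally large) payload.
original = {"header": {"timestamp": 100}, "payload": list(range(5))}

untouched = LazyMessageCopy(original)   # never updated, so never copied
updated = LazyMessageCopy(original)
updated.set("header", {"timestamp": 200})
```

In this sketch the update to the header leaves the original message unchanged, but the payload is duplicated as well, which is the shortcoming identified above.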
It would therefore be desirable to provide a way for a message consumer to make updates to data messages in a messaging system without the updates being visible to other consumers in the messaging system, and without generating redundant copies of unnecessary message data, such as message data that is not updated.
The present invention accordingly provides, in a first aspect, a method for generating a copy of a first message including nested data items in a messaging system. The method has the steps of: creating a second message as a lazy copy of the first message; and, responsive to a determination that one of said data items addressed at the second message location is to be updated, creating an actual copy of the updated data item in the second message.
A message consumer initially creates a lazy copy of a data message. Subsequently, if the consumer updates a data item addressed in the lazy copy of the data message, an actual copy of the data item to be updated is generated. Other data items in the lazy copy of the data message continue to be lazy copies. In this way, only data items that are updated are copied as actual copies, with all other data items being copied as lazy copies. Thus the method avoids unnecessarily generating redundant actual copies of data items or whole data messages that will not be updated. The method also avoids the storage of redundant copies of data items or data messages.
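The behaviour summarised above can be sketched, for a message with a single level of data items, as follows. This is a minimal sketch under stated assumptions: the class `PerItemLazyCopy`, its methods and the dictionary-based message are hypothetical names chosen for illustration, not part of the described method.

```python
import copy


class PerItemLazyCopy:
    """Hypothetical lazy copy in which only updated data items are
    replaced by actual copies; all others remain references."""

    def __init__(self, original):
        self._original = original
        self._copied = {}  # actual copies, keyed by data item name

    def get(self, key):
        # An actual copy takes precedence over the original data item.
        if key in self._copied:
            return self._copied[key]
        return self._original[key]

    def update(self, key, mutate):
        if key not in self._copied:
            # Only the data item being updated is actually copied.
            self._copied[key] = copy.deepcopy(self._original[key])
        mutate(self._copied[key])

    def copied_keys(self):
        return set(self._copied)


# Assumed example message: a small header and a larger payload.
original = {"header": {"timestamp": 100}, "payload": [0] * 1000}

view = PerItemLazyCopy(original)
view.update("header", lambda header: header.update(timestamp=200))
```

Here only the header is copied; the payload addressed through `view` is still the payload of the original message.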
Preferably an actual copy of a parent data item of the updated data item is also created.
Preferably a lazy copy of a child data item of the parent data item is also created.
The present invention accordingly provides, in a second aspect, an apparatus for generating a copy of a first message in a messaging system. The first message includes nested data items, and the apparatus comprises: means for creating a second message as a lazy copy of the first message; and means, responsive to a determination that one of said data items addressed at the second message location is to be updated, for creating an actual copy of the data item in the second message.
The present invention accordingly provides, in a third aspect, a computer program product comprising computer program code stored on a computer readable storage medium that, when executed on a data processing system, instructs the data processing system to carry out the method described above.
A preferred embodiment of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
a is a block diagram of a lazy copy of a data message in the prior art;
b is a block diagram of an actual copy of a data message in the prior art;
a is a block diagram of a lazy copy of a data message in accordance with a preferred embodiment of the present invention;
b is a flowchart illustrating a method for generating a copy of a data message in accordance with a preferred embodiment of the present invention;
a is a block diagram of a lazy copy of a data message in accordance with a preferred embodiment of the present invention; and
b is a block diagram of a lazy copy of the data message of
a is a block diagram of a lazy copy of a data message in the prior art. A data message “A” 102 comprises two data items, “A1” 104 and “A2” 106. The data message “A” 102 is a data message generated by a producer computer system and transmitted on a computer network to consumer computer systems. Alternatively, the data message “A” 102 can be any data message in a messaging system. Data items “A1” 104 and “A2” 106 are elements within the data message “A” 102. For example, data item “A1” 104 can be a header data item and data item “A2” 106 can be a payload data item. A header data item can include information regarding the structure and content of the entire data message “A” 102, such as length information, chronological information (e.g. when the message “A” 102 was generated) or encryption information. A payload data item can include application data relevant to a software application executing on one or more consumer computer systems. For example, a payload data item can include business application data. Alternatively, data items “A1” 104 and “A2” 106 can be single data values such as single data variables.
A consumer in the messaging system receives the data message “A” 102 and generates a lazy copy of the data message “COPY OF A” 108. “COPY OF A” 108 is a place-holder representing a copy of data message “A” 102. “COPY OF A” 108 is not an actual copy of data message “A” 102. Rather, “COPY OF A” 108 includes references to the data items of data message “A” 102 including “REFERENCE TO A1” 110 and a “REFERENCE TO A2” 112. A reference to a data item can be a memory pointer to the data item in a memory of a computer system. Alternatively, a reference to a data item can be an object reference in an object oriented system, or a location of the data item on a non-volatile storage device of the computer system. Thus, “COPY OF A” 108 provides a consumer of the messaging system with a copy of the message “A” 102 while not requiring actual copies of the data items “A1” 104 and “A2” 106 to be generated, as references are provided to “A1” 104 and “A2” 106.
When the consumer updates one of the data items in “COPY OF A” 108 by addressing “REFERENCE TO A1” 110 or “REFERENCE TO A2” 112, the “COPY OF A” 108 is converted into an actual copy of the data message “A” 102 including actual copies of the data items “A1” 104 and “A2” 106. For example, if the consumer updates a value referenced by “REFERENCE TO A2” 112, the data items “REFERENCE TO A1” 110 and “REFERENCE TO A2” 112 that are references to data items “A1” 104 and “A2” 106 of data message “A” 102 will be replaced with actual copies of data items “A1” 104 and “A2” 106. This replacement with actual copies is required so that the lazy copy “COPY OF A” 108 can be updated without affecting the data message “A” 102. Such an actual copy is illustrated in
Thus the prior art techniques of
a is a block diagram of a lazy copy of a data message in accordance with a preferred embodiment of the present invention. The data message “A” 202 and data items “A1” 204 and “A2” 206 are identical to those described with respect to
In contrast to the prior art technique of performing an actual copy of all data items of data message “A” 202, the lazy copy of data message “A” 208 only includes an actual copy of the updated data item “A2” 212. Data item “A1” 210, which is not updated in the lazy copy of data message “A” 208, continues to be a reference to the data item “A1” 204 of data message “A” 202. In this way, the lazy copy of data message “A” 208 of
The technique described above with respect to
A method for updating a data item in a lazy copy of a data message in accordance with a preferred embodiment of the present invention will now be described with reference to
a is a block diagram of a lazy copy of a data message in accordance with a preferred embodiment of the present invention. A data message “A” 402 comprises three data items, “A1” 404, “A2” 406 and “A3” 408. The data message “A” 402 is a data message generated by a producer. The data message “A” 402 is considered to be the parent of data items “A1” 404, “A2” 406 and “A3” 408. Data item “A1” 404 includes nested data items “A11” 410 and “A12” 412. Also, data item “A2” 406 includes nested data items “A21” 414 and “A22” 416. Similarly, data item “A3” 408 includes nested data items “A31” 418 and “A32” 420.
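The nested structure of data message “A” described above can be modelled as a simple tree. The following is a sketch only; the `Item` class and the `outline` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Item:
    """Hypothetical data item (or whole message) with nested children."""
    name: str
    value: Any = None
    children: list = field(default_factory=list)


def build_message_a():
    # Message "A" is the parent of "A1", "A2" and "A3", each of which
    # includes two nested data items.
    return Item("A", children=[
        Item("A1", children=[Item("A11"), Item("A12")]),
        Item("A2", children=[Item("A21"), Item("A22")]),
        Item("A3", children=[Item("A31"), Item("A32")]),
    ])


def outline(item, depth=0):
    """Return the tree as lines; indentation indicates parenthood."""
    lines = ["  " * depth + item.name]
    for child in item.children:
        lines.extend(outline(child, depth + 1))
    return lines


message_a = build_message_a()
print("\n".join(outline(message_a)))
```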
By way of example, the method of
Starting at step 302, an actual copy of data item “A22” 416 is first created. Then at step 304 a loop is commenced through each parent of the data item to be updated. The data item to be updated is “A22” 416, which has a single parent, “A2” 406. Thus only a single iteration of the loop is required. More iterations will be required if the updated data item is more deeply nested in a tree structure of nested data items. Subsequently, at step 306, an actual copy of the parent data item is created as a parent of the actual copy of the updated data item. Thus, following step 306, copies are created as outlined below, with a copy of “A22” (the updated data item) and a copy of the parent of “A22”, which is “A2” (indentation indicates parenthood):

COPY OF A2
  COPY OF A22
Then at step 308, a lazy copy is generated for all children of “A2” 406 other than the updated data item “A22” 416. Thus, a lazy copy placeholder data item is generated referencing each of the other children of “A2” 406. The data item “A2” 406 has one child other than “A22” 416, which is “A21” 414. Thus, a lazy copy placeholder is created referencing data item “A21” 414. Following step 308, copies have been created as outlined below, with a reference to “A21” inserted as a lazy copy of “A21” 414:

COPY OF A2
  REFERENCE TO A21
  COPY OF A22
Subsequently, at step 310 the loop tests if there are more parent data items of the updated data item. The updated data item “A22” 416 has only one parent and so the method proceeds to step 312. Finally, at step 312, the copies created using the method are inserted into “COPY OF A” 422. Thus, the “REFERENCE TO A2” 426 data item is replaced with the “COPY OF A2” data item created by method steps 302 to 308 as structured below:

COPY OF A
  REFERENCE TO A1
  COPY OF A2
    REFERENCE TO A21
    COPY OF A22
  REFERENCE TO A3
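Steps 302 to 312 as walked through above can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the names `Item`, `Ref`, `lazy_copy`, `copy_for_update`, `update_item` and `outline` are hypothetical, the children of the updated data item itself are assumed to remain references, and only the first update beneath a given top-level data item is handled.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Item:
    """Hypothetical data item (or whole message) with nested children."""
    name: str
    value: Any = None
    children: list = field(default_factory=list)


@dataclass
class Ref:
    """A lazy copy: a reference to a data item of the original message."""
    target: Item


def lazy_copy(message):
    # The place-holder holds one reference per top-level data item.
    return Item("COPY OF " + message.name,
                children=[Ref(child) for child in message.children])


def copy_for_update(message, path):
    # Resolve the path of names, e.g. ["A2", "A22"], to original items.
    chain, node = [], message
    for name in path:
        node = next(c for c in node.children if c.name == name)
        chain.append(node)
    original = chain[-1]
    # Step 302: actual copy of the data item to be updated (assumption:
    # any children of that item are kept as references).
    updated = Item("COPY OF " + original.name, original.value,
                   [Ref(c) for c in original.children])
    built = updated
    # Steps 304 to 310: loop through each parent, innermost first.
    for parent in reversed(chain[:-1]):
        parent_copy = Item("COPY OF " + parent.name, parent.value)
        for child in parent.children:
            # Step 306: the copy built so far becomes a child of the
            # parent copy. Step 308: all other children are lazy copies.
            parent_copy.children.append(
                built if child is original else Ref(child))
        built, original = parent_copy, parent
    return built, original, updated


def update_item(message, message_copy, path, new_value):
    built, top_original, updated = copy_for_update(message, path)
    # Step 312: replace the reference in the lazy copy of the message.
    for i, child in enumerate(message_copy.children):
        if isinstance(child, Ref) and child.target is top_original:
            message_copy.children[i] = built
    updated.value = new_value


def outline(node, depth=0):
    if isinstance(node, Ref):
        return ["  " * depth + "REFERENCE TO " + node.target.name]
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


a = Item("A", children=[
    Item("A1", children=[Item("A11"), Item("A12")]),
    Item("A2", children=[Item("A21"), Item("A22", "old")]),
    Item("A3", children=[Item("A31"), Item("A32")]),
])
copy_a = lazy_copy(a)
update_item(a, copy_a, ["A2", "A22"], "new")
print("\n".join(outline(copy_a)))
```

In this sketch the original data item “A22” retains its value, while the lazy copy holds actual copies only of “A2” and “A22”.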
b is a block diagram of a lazy copy of the data message of
Thus, the “COPY OF A” 422 of
Number | Date | Country | Kind |
---|---|---|---|
0329200.0 | Dec 2003 | GB | national |