In computing environments, distributed and clustered database systems provide an efficient and dynamic platform for organizations to maintain their required data. These databases may use various redundancy and versioning mechanisms to ensure that the data for the organization is provided with high availability to any requesting physical or virtual client of the computing environment.
In some implementations, the database systems may be accessible by multiple clients at any one time, which may cause conflicts when multiple client computing systems and/or multiple processes on an individual client system request the same data. To avoid these conflicts, some database systems provide a locking mechanism, which prevents multiple clients or processes from accessing the data at the same time. For example, when a first client requests a row in a database, the lock prevents any other clients or processes from accessing that same row while the first client is accessing the data. However, this locking mechanism introduces inefficiencies, as the other clients and processes are required to wait to access requested data. In other implementations, database systems may generate a separate copy for each of the requesting clients and/or processes. Thus, when a first client requests a data object, a first copy of the data object is provided, and when a second client, or a second process on the same client, requests the data object, a second copy of the data object is provided. Although each of the clients or processes may then access a version of the data object at the same time, generating a copy for each requesting process uses resources inefficiently. In particular, additional memory is required to hold the copies for each of the clients and/or processes, limiting resources that could be provided to other operations.
The technology described herein enhances the management of versioned data objects according to an implementation. In one example, a client in an object computing environment is configured to identify a request for a data object in a first version. Once the request is identified, the client identifies a modification request for the data object to modify the data object from the first version to a second version. In response to the modification request, the client generates an undo log entry to reflect the changes from the first version to the second version and updates the data object to the second version.
In some implementations, the client may further be configured to update a redo log, which can provide information about the update to other clients within the computing environment.
In computing environment 100, clients 110-111 and redo logs 150 may be deployed by an organization to store large data sets, such as those deployed within relational databases or some other object management data structure(s). In the present example, clients 110-111 maintain data object storage 135-136, wherein each data object storage includes a copy of data objects 140-141, and wherein the objects may include tables, columns in a table, rows of a table, individual entries in the tables, or some other similar object. The objects may include data representative of profile information for users, configuration information for servers in a data center (e.g., software-defined switches, firewalls, and router configurations), or some other similar information. As the objects are maintained, processes 130-133 on clients 110-111 may request a data object from the corresponding data object storage. For example, process 130 may request a data object, such as a row reflecting user profile information, from data object storage 135. In response to the request, the requested object may be identified and provided to the requesting process.
In some implementations, such as that depicted in
As depicted in
In some implementations, the request by the process may specify a version of the object that is preferred by the process. For example, data object 141 may be on its tenth version, but process 130 may request version eight. To accommodate the request, operation 200 may be used to apply undo logs 151 to provide data object 141 with the appropriate version, wherein the undo logs include information of how to revert an object to the previous version.
Once the data object is provided to the requesting process on client 110, the process may request, and operation 200 may identify (202), a modification to the data object that changes the data object from the first version to a second version. Returning to the example of process 130 requesting data object 141, if data object 141 represented a user profile, process 130 may make modifications to an entry for the profile, such as an address for the user, a phone number for the user, or some other similar modification to the user profile. Once the modification request is identified for the process, operation 200 further generates (203) an undo log entry to reflect the changes from the first version to the second version, and may update the data object to the second version in data object storage 135.
In some implementations, the generation of the undo log entry by operation 200 may be used to permit processes on client 110 to access multiple versions of the same data object. Referring back to the modification of the user profile, when process 130 makes a modification to the data object representing the user profile, an undo log entry may be generated such that the previous version of the profile can still be accessed by other clients of the computing environment. In particular, the undo log entry (which may comprise a delta file in some implementations) may include information about how to revert the data object to the previous version. Thus, if process 130 were to make a modification to an address of the user profile, an undo log may be generated that provides information of how to revert the address to the previous version.
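As a non-limiting illustration, the generation of an undo log entry alongside a modification may be sketched as follows. The names `UndoEntry`, `VersionedObject`, `modify`, and `revert_last` are hypothetical, as is the assumption that the delta records only the prior values of the changed fields; the description does not prescribe a particular delta format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Sentinel marking a field that did not exist in the previous version.
_ABSENT = object()


@dataclass
class UndoEntry:
    """Hypothetical undo log entry: the delta needed to step back one version."""
    from_version: int
    to_version: int
    prior_values: Dict[str, Any]


@dataclass
class VersionedObject:
    """Hypothetical stored object together with its locally kept undo log."""
    data: Dict[str, Any]
    version: int = 1
    undo_log: List[UndoEntry] = field(default_factory=list)

    def modify(self, changes: Dict[str, Any]) -> UndoEntry:
        # Capture only the fields about to change, before they are overwritten.
        prior = {key: self.data.get(key, _ABSENT) for key in changes}
        entry = UndoEntry(self.version + 1, self.version, prior)
        self.undo_log.append(entry)
        self.data.update(changes)
        self.version += 1
        return entry

    def revert_last(self) -> None:
        # Apply the newest undo entry in place (for brevity) to step back one version.
        entry = self.undo_log.pop()
        for key, value in entry.prior_values.items():
            if value is _ABSENT:
                self.data.pop(key, None)
            else:
                self.data[key] = value
        self.version = entry.to_version


# A user profile whose address is modified, as in the example above.
profile = VersionedObject({"name": "A. User", "address": "1 Old Road"})
entry = profile.modify({"address": "2 New Street"})
```

In this sketch the undo entry holds only the previous address, so its size tracks the size of the change rather than the size of the object.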
In addition to updating the undo logs that are local to the client, it should be understood that the client may also generate redo log entries that are provided to redo logs 150. These redo log entries may be supplied to other clients within the computing environment, permitting the other clients to update locally stored versions of the data objects. In particular, the redo logs may permit each of the clients to locally update a version of the data object and generate an undo log entry to revert the data object to a previous version.
Although the example of
Referring to the example of modifying an address in a user profile, when process 131 makes a modification to the first version of the object, a second version of the object is updated within data object storage 135 and at least one log file is generated that indicates the changes made between the first version and the second version of the object, such that the second version may be reverted to the first version. Once the second version is updated, process 130 may generate a request to modify the first version of the object, wherein the modification may be identified as improper by operation 200 executing on client 110. In response to identifying the improper modification, operation 200 may prevent the update to the data object, permit the update of the data object, supply the newest version of the object to process 130, or some other similar conflict resolution operation. In the example of providing the newest object to process 130 as a conflict resolution, process 130 may update the object with the previously requested modifications, make an alternative modification, or take no action with respect to the data object. If a modification is requested from process 130, then a third version of the object may be updated in data object storage 135, and a new log entry may be generated that reflects the differences between the third version of the object and the second version of the object.
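One way the improper modification described above might be identified is by comparing the version a process originally read against the version currently stored. The following sketch assumes that mechanism, which the description does not mandate; `Store`, `commit`, and `Resolution` are hypothetical names, and the three resolutions mirror the options listed above.

```python
from enum import Enum


class Resolution(Enum):
    REJECT = "reject"                # prevent the update
    ALLOW = "allow"                  # permit the update anyway
    SUPPLY_NEWEST = "supply_newest"  # hand the newest version back to the process


class Store:
    """Hypothetical data object storage holding one object and its undo log."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = 1
        self.undo_log = []  # tuples of (from_version, to_version, prior_values)

    def commit(self, base_version, changes, on_conflict=Resolution.REJECT):
        # A commit based on anything but the current version is treated as stale.
        if base_version != self.version:
            if on_conflict is Resolution.REJECT:
                return ("rejected", self.version, None)
            if on_conflict is Resolution.SUPPLY_NEWEST:
                return ("stale", self.version, dict(self.data))
            # Resolution.ALLOW falls through and applies the change.
        # Assumes fields are only updated; a missing field is recorded as None.
        prior = {key: self.data.get(key) for key in changes}
        self.undo_log.append((self.version + 1, self.version, prior))
        self.data.update(changes)
        self.version += 1
        return ("committed", self.version, None)


store = Store({"address": "1 Old Road"})
# Process 131 modifies the first version, producing the second version.
first = store.commit(1, {"address": "2 New Street"})
# Process 130 then tries to modify the first version and is given the newest one.
second = store.commit(1, {"address": "3 Other Avenue"}, Resolution.SUPPLY_NEWEST)
# Process 130 resubmits its change against the second version.
third = store.commit(second[1], {"address": "3 Other Avenue"})
```

After the resubmission the store holds the third version, and the newest undo entry records the difference between the third and second versions.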
In some implementations, when modifications are being made by processes 130-133, temporary or optimistic logs may be generated prior to committing changes to logs 150-152. In particular, prior to executing a modification or finalizing a change, each of the processes may maintain its own undo and redo logs. Once a modification is committed, which is the operation described in the processes of
The optimistic logs may also assist in managing conflicts between processes executing on the same client. Consider again process 130 requesting and making modifications to data object 141. While process 130 is modifying the data object (prior to committing the modification), process 131 may request the same version of the object that was originally requested by process 130. However, data object 141 stored in data object storage 135 may already include the modifications by process 130. As a result, the optimistic logs maintained by process 130 may be consulted to generate the data object at the proper version for process 131. Once provided, process 131 may make modifications to the object (which in some cases are not reflected in data object storage 135) and maintain its own optimistic logs on the version of the data object provided. Because both processes are making modifications to the data object (wherein one may make modifications to the object itself in data object storage 135 while the other merely makes log entries to track its modifications), when the processes execute or finalize their modifications, a conflict resolution may take place in updating the data object. This resolution may include preventing any modification to the object (i.e., reverting it back to its previous version), permitting both modifications, or some other resolution.
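The role of the optimistic logs may be illustrated with the following sketch, in which a process keeps private undo and redo entries for its uncommitted changes so that the originally requested version can be regenerated for another process. The class `Workspace` and its methods are hypothetical, and the sketch assumes that modifications only update existing fields.

```python
class Workspace:
    """Hypothetical per-process view holding uncommitted (optimistic) logs."""

    def __init__(self, base_data, base_version):
        self.base_version = base_version
        self.view = dict(base_data)  # the object as this process currently sees it
        self.redo = []               # changes to replay if the work is committed
        self.undo = []               # prior values needed to get back to the base

    def modify(self, changes):
        # Record the prior values before applying the uncommitted change.
        self.undo.append({key: self.view.get(key) for key in changes})
        self.redo.append(dict(changes))
        self.view.update(changes)

    def original(self):
        # Regenerate the version originally requested by unwinding the
        # optimistic undo entries newest-first on a copy of the view.
        obj = dict(self.view)
        for prior in reversed(self.undo):
            obj.update(prior)
        return obj


# Process 130 makes two uncommitted changes to the address.
process_130 = Workspace({"address": "1 Old Road"}, base_version=1)
process_130.modify({"address": "2 New Street"})
process_130.modify({"address": "3 Newer Street"})
# Process 131 asks for the version that process 130 originally requested.
for_process_131 = process_130.original()
```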
As depicted, operation 200 identifies a request for a data object from process 130. In response to the request, operation 200 obtains the data object 141 associated with the request, and updates the data object using any undo and redo logs as required. For example, if a previous version of the data object is requested, then the undo logs may be used to revert the data object to the appropriate version. Additionally, in some implementations, operation 200 may monitor the redo logs to determine if any changes to the objects were made by other clients in the computing environment, and apply redo log entries if required to the data object. Once the object is made into the proper version, operation 200 may provide the object to process 130.
After being provided with the data object, process 130 may, if the request was for the most recent version of the data object, generate a request to modify the data object that is identified by operation 200. In some implementations, this request may be made for any modification to the object. However, it should be understood that process 130 may be required to provide a commit or finalize operation to submit the request. Once the request is received, operation 200 may identify undo and redo entries for logs 150-151, update the object in data object storage, and store the log entries in logs 150-151.
In some implementations, when a modification request is received, operation 200 may be required to determine if a conflict has occurred, such as another process in the computing environment committing a modification to the data object prior to the commitment by process 130. Once a conflict is identified, operation 200 may enter the modification, block the modification, or provide some other similar operation with respect to the modification conflict.
In some examples, the redo log entries generated for redo logs 150 may be used to update other clients in the computing environment. In particular, the log entries may be used to update the object storage on each of the clients to reflect the changes on the originating client. Further, the undo log entry for undo logs 151 may be used locally to revert an object back to a previous version. For example, if process 130 were to update a data object to a newer version, then an undo log entry may be generated that includes information of how to revert the object from the newer version to the previously identified version.
Although demonstrated in the example of
As described herein, a computing environment may include one or more computing systems that are used to provide a platform for object storage (such as relational databases). These computing environments may include a plurality of client computing systems with processes executing thereon that access and process data objects that are stored on each of the client computing systems. To provide the ability for processes executing in the computing environment to access different versions of a data object, logs may be generated, wherein the logs maintain information about the differences between each of the versions.
Referring to example data structure 400, version 440 may represent differencing data to revert or undo a second version of an object to a first version of an object. Similarly, version 441 may include undo data to revert a third version of the object to a second version of the object. Similar information may also be maintained for versions 442-443. As a result of the versioning configuration, when a request is received for a particular version, such as the version associated with version 441, the client may apply the undo data for versions 441-443 to obtain the version of the object requested by the process. In applying the versions, the client may first apply version 443, then apply version 442, and finally apply version 441. In this manner, the object may be reverted to each version until the requested version is obtained and provided to the requesting process.
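The newest-first application of undo data described above may be sketched as follows. The function `materialize` is hypothetical, and the sketch assumes that each undo entry maps changed fields to their prior values and that entry `i` (counting from zero) reverts version `i + 2` to version `i + 1`, so that four entries correspond to versions 440-443 and a current fifth version.

```python
def materialize(current, undo_entries, requested_version):
    """Return the object at requested_version plus the entry indexes applied."""
    current_version = len(undo_entries) + 1
    if not 1 <= requested_version <= current_version:
        raise ValueError("requested version does not exist")
    obj = dict(current)  # work on a copy so the stored object is untouched
    applied = []
    # Walk from the newest entry down to the one that yields the request.
    for index in range(len(undo_entries) - 1, requested_version - 2, -1):
        obj.update(undo_entries[index])
        applied.append(index)
    return obj, applied


undo_entries = [
    {"address": "A"},  # version 440: reverts the second version to the first
    {"phone": "111"},  # version 441: reverts the third version to the second
    {"address": "B"},  # version 442: reverts the fourth version to the third
    {"phone": "222"},  # version 443: reverts the fifth version to the fourth
]
current = {"address": "C", "phone": "333"}  # the fifth (newest) version

# Requesting the second version applies 443, then 442, then 441.
second_version, order = materialize(current, undo_entries, 2)
```

Requesting the newest version applies no entries at all, which matches the case in which the object is provided without implementing any undo logs.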
In some implementations, in maintaining the versions, the client may, at various intervals, aggregate versions to improve efficiency in responding to the requests. As a result, rather than requiring the client to apply the undo data for each of the versions, the aggregated version may be used to combine the information in multiple versions to more quickly respond to the request. These aggregated versions may be generated at periodic times, when a defined number of versions have been generated for the object, or at any other similar interval.
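As one possible illustration of aggregation, several consecutive undo entries may be folded into a single entry that has the same effect as applying them one at a time. The sketch below assumes field-level undo entries that record prior values; the function names are hypothetical, and the description does not specify how aggregated versions are encoded.

```python
def aggregate(undo_entries):
    """Fold consecutive undo entries (oldest first) into one equivalent entry."""
    combined = {}
    # Walk newest to oldest so that the oldest recorded value for a field wins,
    # which is what sequential newest-first application would leave behind.
    for entry in reversed(undo_entries):
        combined.update(entry)
    return combined


def apply_sequentially(current, undo_entries):
    """Reference behavior: apply each entry newest-first, one at a time."""
    obj = dict(current)
    for entry in reversed(undo_entries):
        obj.update(entry)
    return obj


current = {"address": "C", "phone": "333"}
# Entries corresponding to versions 441-443, listed oldest first.
entries_441_to_443 = [{"phone": "111"}, {"address": "B"}, {"phone": "222"}]

aggregated = aggregate(entries_441_to_443)
fast = {**current, **aggregated}  # one application of the aggregated entry
slow = apply_sequentially(current, entries_441_to_443)
```

The aggregated entry drops the intermediate phone value, so a request for the older version is answered with one application instead of three.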
Although demonstrated in the example of
As described herein, computing environments may include multiple clients that share versioned data objects, wherein each of the clients may maintain a copy of the versioned data objects. In operational scenario 500, at step 1, process 530 executing on client 510 requests, and is provided access to, data object 541 in data object storage 535. In providing the data object, client 510 may determine that the request was for the most recent version of the data object and, as a result, provide data object 541 without applying any undo logs to the data object. Once the data object is provided to process 530, process 530 may enter, at step 2, a modification to the data object. This modification may include a change to a table or some other similar data structure represented by data object 541. Once the modification is entered, in some cases through a commit command by a user of client 510, an update operation is performed, at step 3a, to update data object 541 in data object storage 535. As an example, if data object 541 corresponded to a table, and a user of client 510 updated a column in the table, when the user committed the changes to the table, the local copy of data object 541 may be updated to reflect the change.
Additionally, when the modification occurs, client 510, at step 3b, generates log entry 523 in undo logs 551 that corresponds to data object 541. Log entry 523 comprises data that can be used to revert data object 541 to a previous version. As a result, if, at a later time, a process on client 510 required an older version of data object 541, client 510 may retrieve the data object from data object storage 535, apply the undo log entries required to reach the earlier version, and provide the resulting data object to the requesting process. Thus, the undo log entries may provide delta files, wherein when one or more delta files are applied to the particular data object, the data object may be reverted to a version associated with the last undo log entry applied.
In addition to updating undo logs 551 that correspond to data object 541, operational scenario 500 further generates, at step 3c, an entry in redo logs 550. In particular, redo logs 550 correspond to a shared log that permits other clients within the computing environment to update local versions of the data objects. In the present implementation, when the modification is generated for data object 541, log entry 528 is created within redo logs 550. This entry permits each client in client(s) 511 to obtain, at step 4, the redo log entries and update a local copy of data object 541, as well as add an undo log entry to support reverting the data object to a previous version. In some implementations, client(s) 511 may be configured to obtain data from the redo log at intervals. However, it should be understood that a computing system responsible for the redo log may instead supply new log entries at periodic intervals, based on a quantity of log entries received, or at any other similar interval.
In some implementations, conflicts may arise in updating the redo log, wherein a first client may have updated a data object prior to a second client in the computing environment. In such implementations, the redo log may be configured to handle the conflicts in a variety of ways. For example, the redo log may accept updates in the order that they are received and block or prevent entries for later modifications. As a result, if a client in client(s) 511 submitted a log entry prior to log entry 528 from client 510, then log entry 528 may be rejected by the redo log. Additionally, an error message may be provided to client 510 indicating that the modification was not accepted. This notification may cause the data object to be reverted to a state prior to the change from process 530, and may further cause undo log entries and data object 541 to be updated based on the entry from the client in client(s) 511. In other implementations, different clients may be provided with different levels of priority that permit modifications of one client to override changes of a different client. As a result, if client 510 had a higher priority than the other clients, modifications from client 510 may be used to supersede the modifications from the other clients in the computing environment.
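The first-received policy described above may be sketched as follows. The class `SharedRedoLog`, the `submit` method, and the use of a base version number to detect a later modification are assumptions made for illustration; priority-based resolution is not shown.

```python
class SharedRedoLog:
    """Hypothetical shared redo log that accepts entries in arrival order."""

    def __init__(self):
        self.entries = []
        self.latest_version = {}  # object id -> newest accepted version

    def submit(self, client_id, object_id, base_version, changes):
        current = self.latest_version.get(object_id, 1)
        # An entry built on an older version arrived too late and is rejected;
        # the caller is told the current version so it can revert and resync.
        if base_version != current:
            return False, current
        new_version = current + 1
        self.entries.append({
            "client": client_id,
            "object": object_id,
            "new_version": new_version,
            "changes": dict(changes),
        })
        self.latest_version[object_id] = new_version
        return True, new_version


log = SharedRedoLog()
# A client in client(s) 511 submits its entry first and is accepted.
earlier = log.submit("client_511", "object_541", 1, {"column": "x"})
# Client 510 submits an entry based on the same version and is rejected.
later = log.submit("client_510", "object_541", 1, {"column": "y"})
```

On rejection, the returned version number gives the rejected client what it needs to revert its local change and apply the accepted entry before retrying.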
Although demonstrated in the present example with steps 3a-3c in a particular order, it should be understood that the operations of step 3 may be implemented in any order. For example, it should be understood that client 510 may determine whether any other modifications have been made to the data object by other clients prior to making the modifications locally at client 510. Thus, step 3c may occur prior to steps 3a and 3b in some implementations.
In operation, computing environments may include multiple clients that share versioned data objects, wherein each of the clients may maintain a copy of the versioned data objects. To support synchronization of the data objects across the multiple clients, redo logs 650 are provided that permit modifications at a first client to be implemented across the other clients in the computing environment. As depicted in the present example, a client in client(s) 611 provides, at step 1, a log entry or entries to redo logs 650. These redo log entries are generated as a result of modifications to data objects shared between the client systems.
Here, log entry 628 is generated by a client in client(s) 611 and provided to redo logs 650, wherein the log entry corresponds to a redo log entry for data object 641. For example, if a user of the client modified a column of data object 641, then data to implement the modification to the column may be provided as the redo log entry to redo logs 650. Once the redo log entry is supplied to redo logs 650, other clients in the computing environment may obtain the redo log entry to make the required updates. Using the example in operational scenario 600, client 610 obtains log entry 628, and updates data object 641 and undo log 651 to reflect the changes. In updating the undo log, client 610 may identify information that would be required to revert the current version of the data object (created by log entry 628) to the previous version of the data object. Thus, if the modification provided by log entry 628 were to make a change within an entry of a data structure, log entry 623 may reflect the information necessary to revert the entry to its previous state.
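The manner in which a client such as client 610 may consume redo log entries while building its own undo log may be sketched as follows. The class `LocalClient`, the `sync` method, the entry layout, and the cursor that tracks how far into the shared log the client has read are all hypothetical.

```python
class LocalClient:
    """Hypothetical client holding a local copy of one shared data object."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = 1
        self.undo_log = []
        self.cursor = 0  # number of shared redo entries already applied

    def sync(self, redo_entries):
        applied = 0
        for entry in redo_entries[self.cursor:]:
            changes = entry["changes"]
            # Record what is needed to revert this update before applying it.
            # Assumes fields are only updated; a missing field is stored as None.
            prior = {key: self.data.get(key) for key in changes}
            self.undo_log.append({"to_version": self.version, "prior": prior})
            self.data.update(changes)
            self.version = entry["new_version"]
            applied += 1
        self.cursor += applied
        return applied


# A redo log entry supplied by another client that modified one field.
shared_redo_log = [{"new_version": 2, "changes": {"status": "active"}}]
client_610 = LocalClient({"status": "pending", "owner": "ops"})
applied = client_610.sync(shared_redo_log)
again = client_610.sync(shared_redo_log)  # nothing new, so nothing is applied
```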
Once the update is made within client 610, process 630 may request the most current version of the data object, and may further request any previous version of the data object. For example, if process 630 were to request a version of the data object that was not the current version, then client 610 may apply one or more of log entries 620-623 to provide the process with the required version of the data object.
Although demonstrated in the present implementation as providing a single log entry from client(s) 611, it should be understood that additional log entries may be supplied for data object 641. Each of the updates may then be supplied to client 610, wherein the entries may be used to update data object 641, and generate log entries within undo log 651. Additionally, while demonstrated as providing log entries for a single data object, it should be understood that entries may be supplied for other data objects in data object storage 635.
Communication interface 760 comprises components that communicate over communication links, such as network cards, ports, radio frequency (RF), processing circuitry and software, or some other communication devices. Communication interface 760 may be configured to communicate over metallic, wireless, or optical links. Communication interface 760 may be configured to use Time Division Multiplex (TDM), Internet Protocol (IP), Ethernet, optical networking, wireless protocols, communication signaling, or some other communication format—including combinations thereof. Communication interface 760 may be configured to communicate with one or more other computing systems that provide support for a data object storage environment. These other computing systems may include other client computing systems, computing systems to store distributed log information, control computing systems, or some other similar computing system.
Processing system 750 comprises a microprocessor and other circuitry that retrieves and executes operating software from storage system 745. Storage system 745 may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Storage system 745 may be implemented as a single storage device, but may also be implemented across multiple storage devices or sub-systems. Storage system 745 may comprise additional elements, such as a controller to read operating software from the storage systems. Examples of storage media include random access memory, read only memory, magnetic disks, optical disks, and flash memory, as well as any combination or variation thereof, or any other type of storage media. In some implementations, the storage media may be non-transitory storage media. In some instances, at least a portion of the storage media may be transitory. It should be understood that in no case is the storage media a propagated signal.
Processing system 750 is typically mounted on a circuit board that may also hold the storage system. The operating software of storage system 745 comprises computer programs, firmware, or some other form of machine-readable program instructions. The operating software of storage system 745 comprises modification operation 720 with undo logs 722, data process 725, and data object storage 730. The operating software on storage system 745 may further include an operating system, utilities, drivers, network interfaces, applications, or some other type of software. When read and executed by processing system 750 the operating software on storage system 745 directs client computing system 700 to operate as a client described herein in
In at least one implementation, modification operation 720, when read and executed by processing system 750, directs processing system 750 to identify a request for a data object in data object storage 730 by data process 725 also executing on client computing system 700. In response to the request, modification operation 720 may permit data process 725 associated with the request to access the data in the storage system. Once access is provided, modification operation 720 may identify a modification to the data object and, in response to the modification, update undo logs 722 to reflect the modification and update the data object within data object storage 730. By updating undo logs 722, versions of the data object may be maintained, such that a second process on the same or a different client may revisit or reprocess a previous version of the same data object. In particular, undo logs 722 may maintain the required information to take an object from a new version to a previous version. Additionally, in some examples, redo logs may be updated such that other client computing systems of the computing environment may update their data objects in accordance with the update. The redo logs may be maintained on a separate computing system or may be distributed across the client computing systems of the computing environment in some implementations.
Although demonstrated in the example of
The included descriptions and figures depict specific implementations to teach those skilled in the art how to make and use the best mode. For the purpose of teaching inventive principles, some conventional aspects have been simplified or omitted. Those skilled in the art will appreciate variations from these implementations that fall within the scope of the invention. Those skilled in the art will also appreciate that the features described above can be combined in various ways to form multiple implementations. As a result, the invention is not limited to the specific implementations described above, but only by the claims and their equivalents.
This application hereby claims the benefit of and priority to U.S. Provisional Patent Application No. 62/552,889, titled “EFFICIENT VERSIONED OBJECT MANAGEMENT,” filed Aug. 31, 2017, and which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
62552889 | Aug 2017 | US