The present disclosure relates generally to data set synchronization and replication and more particularly to a system and method for the synchronization of data sets with reference to one or more replication constraints.
Synchronization may refer to the process of making the data contained in a first data set identical with the data contained in a second data set. Replication may refer to the process of maintaining identical copies of data on the first data set and the second data set. During data set synchronizations, file input/output journals accumulated in spools may become very large, causing the spools to overfill. In addition, data set synchronizations may greatly reduce the performance of a protected application such that users may wish to slow down or stop the synchronization and/or replication process during certain hours of the day to avoid degradation of an application's performance. However, slowing the synchronization may cause the spool to overfill during the synchronization process.
According to one embodiment of the present disclosure, a method for synchronizing and replicating data sets includes receiving a request to synchronize a first data set associated with a first server and a second data set associated with a second server. The method also includes determining, with reference to one or more replication constraints, whether to begin synchronization. The method further includes applying one or more resource control actions in response to determining to begin synchronization.
In some embodiments of the present disclosure, the method may include determining compliance with one or more replication constraints, and determining a synchronization speed in response to non-compliance with the one or more replication constraints. Further embodiments may include determining compliance with one or more replication constraints by determining a system utilization metric of the first server, and comparing the system utilization metric of the first server to the one or more replication constraints.
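By way of illustration only, the comparison of a system utilization metric to a replication constraint, and the resulting choice of synchronization speed, might be sketched as follows. The function name, the parameters, and the linear throttling rule are assumptions made for this sketch; the disclosure does not prescribe a particular formula.

```python
def choose_sync_speed(utilization: float, limit: float, max_speed: float) -> float:
    """Return a synchronization speed for the first server.

    utilization: current system utilization metric as a fraction in [0, 1].
    limit:       replication constraint, the highest utilization at which
                 synchronization may still run at full speed (hypothetical).
    max_speed:   unthrottled synchronization speed, in any consistent unit.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be within [0, 1]")
    if not 0.0 < limit <= 1.0:
        raise ValueError("limit must be within (0, 1]")

    # Compliant: the metric does not exceed the constraint, so no throttling.
    if utilization <= limit:
        return max_speed

    # Non-compliant: scale the speed linearly down to zero as utilization
    # rises from the limit to 1.0. Here limit < 1.0 always holds, because
    # utilization > limit and utilization <= 1.0.
    headroom = (1.0 - utilization) / (1.0 - limit)
    return max_speed * headroom
```

Under these assumptions, a fully loaded first server yields a speed of zero, which corresponds to prohibiting synchronization altogether.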
In particular embodiments of the present disclosure, the method may include generating one or more low content replication journal entries comprising a set of instructions indicating write events at one or more data ranges. In other embodiments, the method may include consolidating the one or more low content replication journals by receiving a first set of instructions indicating write events at one or more data ranges, receiving a second set of instructions indicating write events at one or more data ranges, identifying one or more redundancies between the first and second sets of instructions, and generating a third set of instructions comprising the instructions of the first and second sets of instructions without the one or more redundancies.
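One possible reading of the consolidation described above is sketched below. It assumes that each instruction is a tuple of a file path and a half-open byte range, and that a "redundancy" is any overlap or adjacency between ranges of the same file. Both the representation and the merge rule are assumptions made for illustration, not the journal consolidation algorithm of the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical instruction format: (path, start offset, end offset exclusive).
WriteEvent = Tuple[str, int, int]


def consolidate(first: Iterable[WriteEvent],
                second: Iterable[WriteEvent]) -> List[WriteEvent]:
    """Build a third instruction set covering both inputs without overlaps."""
    by_path: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for path, start, end in list(first) + list(second):
        by_path[path].append((start, end))

    merged: List[WriteEvent] = []
    for path in sorted(by_path):
        ranges = sorted(by_path[path])
        current_start, current_end = ranges[0]
        for start, end in ranges[1:]:
            if start > current_end:
                # Gap between ranges: emit the finished range, start a new one.
                merged.append((path, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlapping or adjacent range: extend the current one.
                current_end = max(current_end, end)
        merged.append((path, current_start, current_end))
    return merged
```

Because low content entries carry no data, the order in which the overlapping writes occurred does not matter in this sketch; only the union of the affected ranges is kept.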
In further embodiments of the present disclosure, the method may include retrieving data from the one or more data ranges, and generating one or more full content replication journal entries comprising the one or more low content journal entries and the retrieved data. In some embodiments, the method may include sending one or more of the one or more full content replication journal entries to the second server for replication, and recording which of the one or more full content replication journal entries have been sent to the second server for replication.
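The pairing of a low content journal entry with the data retrieved from its data range might be sketched as follows. The class names are hypothetical, and the data set is modeled as an in-memory mapping from file path to bytes purely for illustration; an actual embodiment would read from a snapshot of the first data set.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LowContentEntry:
    """Instruction indicating a write event at a data range (no data)."""
    path: str
    start: int  # first byte of the range
    end: int    # one past the last byte of the range


@dataclass(frozen=True)
class FullContentEntry:
    """A low content entry together with the data retrieved for its range."""
    instruction: LowContentEntry
    data: bytes


def build_full_entries(entries: Iterable[LowContentEntry],
                       data_set: Dict[str, bytes]) -> List[FullContentEntry]:
    """Retrieve the data for each indicated range and attach it to the entry."""
    return [
        FullContentEntry(entry, data_set[entry.path][entry.start:entry.end])
        for entry in entries
    ]
```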
Technical advantages of the present disclosure include a system and method that allow for the synchronization and replication of data sets according to one or more replication constraints. Particular embodiments of the present disclosure may allow for replication constraints based on the time of day. For example, a user may wish to set a very low synchronization speed or entirely prohibit synchronization during business hours in order to avoid performance deterioration of the first server. Other embodiments of the present disclosure may allow for replication constraints based on a system utilization metric of the first server such as processor utilization, memory utilization, or network bandwidth. For example, the user may wish to only allow the processes related to synchronization to use a certain portion of available system resources based on the system utilization metric.
Other technical advantages of the present disclosure include generating low content replication journal entries comprising a set of instructions indicating write events at one or more data ranges, which may allow for lower memory utilization and thus a lower possibility of the spool overfilling. For example, low content journal entries produced during the synchronization of a large data set may have a lower likelihood of causing the spool to overfill since they do not include the data to be written to the data ranges. Particular embodiments of the present disclosure may also allow for consolidating the one or more low content replication journals, which may lead to further lowered memory utilization and possibility of the spool overfilling.
Other technical advantages of the present disclosure include recording which of the one or more full content journal entries have been sent to the second server for replication, which may prevent the synchronization process from having to start over. For example, if the synchronization and/or replication process is delayed (for example, because of non-compliance with a replication constraint), the data sets would not be required to restart the synchronization process before continuing replication.
Other technical advantages of the present disclosure will be readily apparent to one skilled in the art from the following figures, descriptions, and claims. Moreover, while specific advantages have been enumerated above, various embodiments may include all, some, or none of the enumerated advantages.
For a more complete understanding of the present disclosure and its advantages, reference is now made to the following descriptions, taken in conjunction with the accompanying drawings, in which:
Embodiments of the present disclosure and its advantages are best understood by referring to
System 100 includes network 110, which may refer to any interconnecting system capable of transmitting audio, video, signals, data, messages, or any combination of the preceding. Network 110 may include all or a portion of a public switched telephone network (PSTN), a public or private data network, a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a local, regional, or global communication or computer network such as the Internet, a wireline or wireless network, an enterprise internet, or any other suitable communication link, including combinations thereof. System 100 may also include a master server 120. Master server 120 may be used to process and/or store data communicated to or from network 110 or system 100. In particular embodiments, master server 120 may be associated with a master data set 122 that is maintained at master server 120. System 100 may also include a replica server 130 that may be used to process and/or store data communicated to or from network 110 or system 100. In particular embodiments, replica server 130 may be associated with a replica data set 132. Replica data set 132 may be data that is replicated from and/or synchronized with master data set 122.
Interface 220 may refer to any suitable device operable to receive input for master server 120, send output from master server 120, perform suitable processing of the input or output or both, communicate to other devices, or any combination of the preceding. Interface 220 may include appropriate hardware (e.g. modem, network interface card, etc.) and software, including protocol conversion and data processing capabilities, to communicate through a LAN, WAN, or other communication system that allows master server 120 to communicate to other devices. Interface 220 may include one or more ports, conversion software, or both.
Memory 230 stores information. Memory 230 may comprise one or more tangible, computer-readable, and/or computer-executable media, and may exclude signals or carrier waves. Examples of memory include computer memory (for example, Random Access Memory (RAM) or Read Only Memory (ROM)), mass computer readable media (for example, a hard disk), removable computer readable media (for example, a Compact Disk (CD) or a Digital Video Disk (DVD)), database and/or network storage (for example, a server), and/or other computer-readable media.
Spool 231 refers to a portion of memory 230 wherein one or more replication journals may be temporarily stored prior to being sent to replica server 130 for replication. During synchronization, master server 120 records all I/O operations that occur at data set 122 and stores them in replication journals as one or more instructions for replication. In particular embodiments, the replication journals 232 may be stored in spool 231 until synchronization is complete, at which point they are sent to replica server 130. Replica server 130 may then perform the one or more instructions for replication by replaying the I/O operations that previously occurred at data set 122 on the files at data set 132. Thus, it will be understood that spool 231 acts as a queue for the instructions for replication located in replication journals 232. In some embodiments, replication may be performed through an online replication process, wherein the replication journals are sent to replica server 130 as soon as master server 120 is ready to send them to replica server 130. In other embodiments, replication may be performed through a periodic replication process, wherein the replication journals are only sent to replica server 130 at certain fixed times.
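The queue behavior of the spool can be illustrated with a minimal sketch. The class below is hypothetical: it models journal entries as raw bytes, enforces a fixed capacity, and drains entries in first-in, first-out order through a caller-supplied send function. It is not intended to represent the actual structure of spool 231.

```python
from collections import deque
from typing import Callable, Deque


class SpoolFullError(Exception):
    """Raised when an entry would exceed the spool's capacity."""


class Spool:
    """Bounded first-in, first-out queue of replication journal entries."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._queue: Deque[bytes] = deque()

    def enqueue(self, entry: bytes) -> None:
        """Store an entry until it can be sent to the replica server."""
        if self.used_bytes + len(entry) > self.capacity_bytes:
            raise SpoolFullError("spool capacity exceeded")
        self._queue.append(entry)
        self.used_bytes += len(entry)

    def drain(self, send: Callable[[bytes], None]) -> None:
        """Send all queued entries in the order in which they were recorded."""
        while self._queue:
            entry = self._queue.popleft()
            send(entry)
            self.used_bytes -= len(entry)
```

In this sketch, an online replication process would call `drain` whenever the master server is ready to send, while a periodic replication process would call it only at the configured times, so entries accumulate in between.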
Applications 240 refer to logic that may be encoded in one or more tangible media and may perform operations when executed by processor 210. For example, master server 120 may run applications such as Microsoft Exchange, Microsoft SharePoint, or Microsoft Dynamics CRM. In accordance with the present invention, one or more applications 240 on master server 120 and the data associated with said applications stored at data set 122 may be synchronized and replicated at replica server 130 and data set 132, respectively.
A number of issues exist in the synchronization and replication of data sets. For instance, the synchronization of large data sets may lead to extremely large replication journals being stored in spool 231. During synchronization, replication journals 232 may be stored in spool 231 until synchronization is complete, at which point they are sent to replica server 130 for replication. Because large data sets require a greater amount of time to synchronize than smaller data sets, replication journal sizes increase for larger data sets. Therefore, it is possible that the size of replication journals 232 required to be stored in spool 231 during the synchronization of a large data set may exceed the capacity of spool 231.
In addition, the synchronization and replication of data sets may require a significant amount of system resources at master server 120 and replica server 130. As a result, the performance of applications on master server 120 may suffer while synchronization and replication occurs. Accordingly, one may desire to slow down and/or restrict synchronization and replication during particular times of day, for instance, during business hours or any other time of day at which the application will utilize more resources. However, slowing and/or restricting synchronization and replication leads to larger replication journals 232 being stored in spool 231 due to the increased amount of time necessary to complete synchronization. Thus, it is possible that the size of replication journals 232 required to be stored in spool 231 during a slowed or restricted synchronization may exceed the capacity of spool 231. It is therefore an object of the present invention to overcome such issues in the art.
At step 320, master server 120 determines whether to begin synchronization of data sets 122 and 132. This determination may be made based on any number of replication constraints. For example, the user may wish to prohibit synchronization during certain times of day in which internal or external network or system resources may be in high demand or otherwise have relatively limited availability. If it is determined at step 320 to not begin synchronization, master server 120 continues to determine whether it should begin the synchronization process until it is determined that it should begin synchronization. If it is instead determined that the synchronization process should begin, master server 120 begins synchronization and applies one or more resource control actions to the synchronization and replication process. The one or more resource control actions may include one or all of the following as explained further below: setting a synchronization speed based on a system utilization metric, postponing synchronization during certain times of day, switching to a periodic replication process and generating low content replication journal entries, and/or consolidating the low content replication journal entries.
At step 330, master server 120 determines whether it is compliant with one or more replication constraints. In particular embodiments, a user may wish to set a replication constraint based on the time of day. For example, the user may wish to entirely discontinue synchronization during business hours or other days and/or times when internal or external network or system resources may be in high demand or otherwise have relatively limited availability. In other embodiments, a user may wish to set a replication constraint based on a system utilization metric related to the performance of one or more applications 240 on master server 120. In such embodiments, the user may wish to adjust synchronization speed based on one or more system utilization metrics such as processor utilization, memory utilization, or network bandwidth. For example, the user may wish to only allow the processes related to synchronization to use a certain portion of available system resources determined from the one or more system utilization metrics. In further embodiments, a user may wish to set replication constraints based on both time of day and performance of application 240.
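A compliance check that combines a time-of-day constraint with a system utilization constraint might be sketched as follows. The field names, the half-open blocked window, and the single utilization threshold are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass(frozen=True)
class ReplicationConstraint:
    """Hypothetical constraint: a blocked time window and a utilization cap."""
    blocked_from: Optional[time] = None      # start of the blocked window
    blocked_until: Optional[time] = None     # end of the blocked window (exclusive)
    max_utilization: Optional[float] = None  # fraction in [0, 1]


def is_compliant(constraint: ReplicationConstraint,
                 now: time, utilization: float) -> bool:
    """Return True if synchronization may proceed under the constraint."""
    start, end = constraint.blocked_from, constraint.blocked_until
    if start is not None and end is not None:
        if start <= end:
            in_window = start <= now < end
        else:
            # The blocked window wraps past midnight, e.g. 22:00 to 06:00.
            in_window = now >= start or now < end
        if in_window:
            return False

    if (constraint.max_utilization is not None
            and utilization > constraint.max_utilization):
        return False
    return True
```

For example, a constraint of `ReplicationConstraint(time(9), time(17), 0.8)` would block synchronization during business hours and, outside those hours, whenever utilization exceeds 80 percent.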
If it is determined at step 330 that system 100 is not compliant with the one or more replication constraints, the method continues to step 332 where master server 120 determines and sets an appropriate synchronization speed in order to comply with the one or more replication constraints. In particular embodiments, master server 120 may set the synchronization speed based on the comparison of a system utilization metric and one or more of the replication constraints. For example, if an application 240 requires a large amount of system resources on master server 120, the synchronization speed may be lowered in order to prevent the deterioration of the performance of application 240. In other embodiments, master server 120 may adjust the synchronization speed, or even prohibit synchronization, for specified periods of time. For example, a user may wish to set a very low synchronization speed or entirely prohibit synchronization during business hours in order to avoid performance deterioration of application 240 on master server 120. Although step 320 is shown in
If it is instead determined at step 330 that system 100 is in compliance with the one or more replication constraints, the method continues to step 340. At step 340, master server 120 switches to a periodic replication process and generates one or more low content replication journals comprising instructions indicating the data ranges where write events have occurred. This is in contrast to full content replication journal entries which include both instructions indicating data ranges where write events have occurred and the data of the write events. For purposes of illustration of these concepts, reference to
At step 350, master server 120 may make use of a journal consolidation algorithm to consolidate the one or more low content replication journal entries, removing any redundancies between the different replication journal entries. This may include receiving first and second sets of instructions indicating write events at one or more data ranges, identifying one or more redundant write events between the first and second sets of instructions, and generating a third set of instructions comprising the instructions of the first and second sets of instructions without the one or more redundancies. Referring to
At step 360, master server 120 retrieves the data in data set 122 to be written to the data ranges indicated in the one or more low content replication journal entries. This may be accomplished by copying the data in the data ranges indicated in the low content replication journal entries. For example, the Volume Shadow Copy Service in Microsoft Windows may be utilized to create Volume Shadow Service (VSS) snapshots of the data located in the indicated data ranges. Referring to
Due to the amount of system resources required to retrieve data in step 360, steps 360, 370, and 380 may take an extended amount of time to complete. Therefore, it is possible that steps 360, 370, and 380 may not fully come to completion before synchronization is stopped due to one or more replication constraints. In prior systems, if the synchronization process were stopped for some reason without completion, the entire synchronization and replication processes would need to be restarted. In some situations, this may require master server 120 to compare data sets 122 and 132 to determine differences in the data contained therein, which may consume a large amount of time and/or system resources. However, according to the present disclosure, master server 120 may keep track of the progress of the synchronization and replication processes in order to avoid restarting the processes. For example, master server 120 may record which of the generated full content replication journal entries have been sent to replica server 130 for replication, avoiding the need for master server 120 to compare data sets 122 and 132. Thus, the synchronization and replication processes may resume from their previous state, and only the unsent full content replication journal entries need to be sent to replica server 130 for replication.
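The progress tracking described above might be sketched as follows. The function and parameter names are hypothetical, journal entries are modeled as bytes, and the record of sent entries is an in-memory set; an actual embodiment would presumably persist that record so that it survives a stop of the process.

```python
from typing import Callable, Sequence, Set


def send_pending(entries: Sequence[bytes],
                 sent: Set[int],
                 send: Callable[[bytes], None],
                 allowed: Callable[[], bool]) -> None:
    """Send every entry not yet recorded as sent, stopping when disallowed.

    entries: full content replication journal entries, in order.
    sent:    indexes of entries already sent; updated in place.
    send:    transmits one entry to the replica server.
    allowed: returns False when a replication constraint halts the process.
    """
    for index, entry in enumerate(entries):
        if index in sent:
            continue  # already replicated during an earlier run
        if not allowed():
            return    # stopped by a constraint; progress is kept in `sent`
        send(entry)
        sent.add(index)
```

Calling `send_pending` again with the same `sent` set resumes from the previous state: only the unsent entries are transmitted, and no comparison of the two data sets is needed.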
At step 420, master server 120 begins to generate online replication journal entries comprising sets of instructions indicating write events at one or more data ranges, and the data that is to be written to the data ranges. In some embodiments, master server 120 will capture the point in time at which the determination is made to switch to an online replication process, and this point in time will act as the watershed for generating online replication journal entries for the online replication process, as opposed to the low content journal entries used in the periodic replication process. In particular embodiments, master server 120 may then begin to take VSS snapshots of those write events occurring after the watershed point in time, and may place the snapshots into the generated online replication journal entries. In further embodiments, the online replication journal entries are then queued behind the previously generated low content journal entries in spool 231.
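The role of the watershed point in time can be illustrated with a short sketch. It assumes that each write event carries a timestamp, that events at or after the watershed are treated as online, and that data is obtained through a caller-supplied `read_range` function standing in for a snapshot read. All names here are hypothetical.

```python
from typing import Callable, Iterable, List, NamedTuple, Optional


class WriteEvent(NamedTuple):
    timestamp: float
    path: str
    start: int
    end: int


class JournalEntry(NamedTuple):
    kind: str              # "low" (ranges only) or "online" (ranges and data)
    path: str
    start: int
    end: int
    data: Optional[bytes]  # None for low content entries


def build_queue(events: Iterable[WriteEvent],
                watershed: float,
                read_range: Callable[[str, int, int], bytes]) -> List[JournalEntry]:
    """Queue online entries behind the previously generated low content entries."""
    low: List[JournalEntry] = []
    online: List[JournalEntry] = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.timestamp < watershed:
            # Before the watershed: record only where the write occurred.
            low.append(JournalEntry("low", ev.path, ev.start, ev.end, None))
        else:
            # At or after the watershed: record the range and its data.
            data = read_range(ev.path, ev.start, ev.end)
            online.append(JournalEntry("online", ev.path, ev.start, ev.end, data))
    return low + online
```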
Once master server 120 has begun to generate online replication journal entries, the method continues to step 430. At step 430, master server 120 retrieves the data to be written to the data ranges indicated in the one or more low content replication journal entries. This may be accomplished, for example, by taking VSS snapshots of the data ranges indicated in the low content replication journal entries. Referring to
Once the data is removed from the online replication journal entries at step 520, master server 120 then switches to a periodic replication process and begins to generate low content replication journal entries at step 530. In particular embodiments, the low content replication journal entries generated may record those write events occurring after the error detection. At step 540, master server 120 consolidates the stripped online replication journal entries and the low content replication journal entries according to the consolidation method above. After consolidation, master server 120 may then check to determine whether it may switch back to the online replication process, as shown in
Although the present disclosure has been described in several embodiments, a myriad of changes, substitutions, and modifications may be suggested to one skilled in the art, and it is intended that the present disclosure encompass such changes, substitutions, and modifications as fall within the scope of the present appended claims.