The disclosure relates to a system for distributing updates made to a master file system, and to a system including a journal database for tracking changes made to a master file system.
A master file system may be connected to multiple client workstations using an Internet-based connection, where each client may maintain its own replica of the master file system. Thus, any changes made to the master file system need to be distributed to the various client workstations in order to update the respective replicas. The various client workstations may be located throughout the world, and sometimes in relatively remote or rural areas where Internet connections have a relatively low bandwidth and are sporadic in connectivity.
In light of the above, it may become challenging to distribute the master file system updates to the client workstations. Indeed, the Internet-based connection between the client workstations and the master file system may have a relatively low bandwidth, thereby making the transfer of data between the master file system and the client workstations slow and cumbersome. Furthermore, it should also be appreciated that the client workstations may not be connected to the master file system for relatively long periods of time due to the sporadic Internet connectivity in the area. In some situations, the client workstations may not have network access for several weeks at a time. Additionally, the master file system as well as the replica stored on the client workstation may both be relatively large in size, thereby compounding the challenges that already exist with the transfer and updating of data.
There are directory mirroring tools currently available that may be used to analyze the differences between the file directories of the master file system and the client workstations in order to determine which files to update. However, these directory mirroring tools may have drawbacks. For example, directory mirroring tools may require a significant amount of time to analyze the differences in the file directories between the master file system and the various client workstations. Depending on the size of the file system being mirrored, it may take several minutes or even hours to analyze the differences between the master file system and the various client workstations each time synchronization is executed. It should be appreciated that the client workstations may only have a few hours at a time to synchronize their copies of files with the master file system, especially if an Internet connection is unreliable, so any additional time spent analyzing the differences significantly reduces the time available for the update itself.
In one aspect, a system for distributing changes to a replica file system on a client workstation is disclosed, and includes a master file system, a journal database, and an application server including a processor. The journal database stores a journal table and a client timestamp, where the journal table includes a plurality of entries listed in chronological order that each represent a modification to the master file system made at a specific point in time. The client timestamp indicates a last point in time when the replica file system was updated. The application server is in communication with the master file system and the journal database. The processor executes instructions for determining the last point in time when the replica file system of the client workstation was updated based on the client timestamp, querying the journal table based on the client timestamp, and locating a specific entry within the journal table based on the client timestamp. The specific entry within the journal table reflects at least one modification to the master file system made subsequent to the last point in time when the replica file system of the client workstation was updated.
In another aspect, a method of updating a replica file system of a client workstation is disclosed. The method comprises initiating a connection between an application server and the client workstation over a network. The application server is in communication with a master file system and a journal database. The method also includes querying the application server, by the client workstation, for any updates which are applicable to the client workstation. The method further includes determining, by a processor of the application server, a last point in time when the replica file system of the client workstation was updated. The method also includes querying a journal table, by the processor of the application server, based on the last point in time when the replica file system was updated. The journal table is stored in a journal database and includes a plurality of entries listed in chronological order that each represent a modification to the master file system made at a specific point in time. Finally, the method includes locating, by the processor of the application server, a specific entry within the journal table based on the last point in time when the replica file system of the client workstation was updated.
Other objects and advantages of the disclosed method and system will be apparent from the following description, the accompanying drawings and the appended claims.
In one non-limiting embodiment, the network 24 may connect the application server 20 to the client workstation 22 over relatively long distances. For example, the application server 20 and the client workstation 22 may be located in different countries, or even on different continents of the world. Furthermore, it should also be appreciated that at times, the network 24 between the application server 20 and the client workstation 22 may have intermittent connectivity. For example, sometimes the client workstation 22 may be unable to connect to the network 24 for a period of several days, or even weeks. The client workstation 22 may also only have continuous connectivity to the network 24 for a relatively short period of time, such as a few hours. However, it is to be appreciated that the present disclosure is not limited to these specific examples, and that in another embodiment the application server 20 and the client workstation 22 may be connected to one another using a relatively reliable Internet connection as well, and the client workstation 22 may be connected to the network 24 on a daily basis.
The application server 20 may be in communication with a journal database 30 and a master file system 32. The client workstation 22 may be in communication with a replica file system 40. The replica file system 40 may be a copy of the master file system 32. Indeed, as explained in greater detail below, the application server 20 may send updates of the master file system 32 over the network 24 to the replica file system 40.
Referring now to
The processor unit 54 executes instructions for software that may be loaded into a storage device, such as the memory 56. The processor unit 54 may be a set of one or more processors or may include multiple processor cores, depending on the particular implementation. Further, the processor unit 54 may be implemented using one or more heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. In another embodiment, the processor unit 54 may be a homogeneous processor system containing multiple processors of the same type.
The memory 56 and the persistent storage 58 are examples of storage devices. As used herein, a storage device is any tangible piece of hardware that is capable of storing information on a temporary basis and/or a permanent basis. The memory 56 may be, for example, a random access memory or any other suitable volatile or non-volatile storage device. The persistent storage 58 may take various forms depending on the particular implementation, and the persistent storage 58 may contain one or more components or devices. For example, the persistent storage 58 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, and/or some combination of the above. The media used by the persistent storage 58 also may be removable. For example, without limitation, a removable hard drive may be used for the persistent storage 58.
A storage device, such as the memory 56 and/or the persistent storage 58, may store data for use with the processes described herein. For example, a storage device may store (e.g., have embodied thereon) computer-executable instructions, executable software components, configurations, layouts, schedules, or any other information suitable for use with the methods described herein. When executed by the processor unit 54, such computer-executable instructions and components cause the processor unit 54 to perform one or more of the operations described herein.
The communications unit 60, in these examples, provides for communications with other computing devices or systems. In the exemplary embodiment, the communications unit 60 is a network interface component. The communications unit 60 may provide communications through the use of either or both physical and wireless communication links.
The input/output unit 62 enables input and output of data with other devices that may be connected to the computing device 50. For example, without limitation, the input/output unit 62 may provide a connection for user input through a user input device, such as a keyboard and/or a mouse. Further, the input/output unit 62 may send output to a printer. The display 64 provides a mechanism to display information, such as any information described herein, to a user. For example, a presentation interface such as the display 64 may display a graphical user interface, such as those described herein.
Instructions for the operating system and applications or programs are located on the persistent storage 58. These instructions may be loaded into the memory 56 for execution by the processor unit 54. The processes of the different embodiments may be performed by the processor unit 54 using computer implemented instructions and/or computer-executable instructions, which may be located in a memory, such as the memory 56. These instructions are referred to herein as program code (e.g., object code and/or source code) that may be read and executed by a processor in the processor unit 54. The program code in the different embodiments may be embodied on different physical or tangible computer-readable media, such as the memory 56 or the persistent storage 58.
The program code 66 is located in a functional form on non-transitory computer-readable media 68 that is selectively removable and may be loaded onto or transferred to the computing device 50 for execution by the processor unit 54. The program code 66 and computer-readable media 68 form computer program product 70 in these examples. In one example, the computer-readable media 68 may be in a tangible form, such as, for example, an optical or magnetic disc that is inserted or placed into a drive or other device that is part of the persistent storage 58 for transfer onto a storage device, such as a hard drive that is part of the persistent storage 58. In a tangible form, the computer-readable media 68 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory that is connected to the computing device 50. The tangible form of the computer-readable media 68 is also referred to as computer recordable storage media. In some instances, the computer-readable media 68 may not be removable.
Alternatively, the program code 66 may be transferred to the computing device 50 from the computer-readable media 68 through a communications link to the communications unit 60 and/or through a connection to the input/output unit 62. The communications link and/or the connection may be physical or wireless in the illustrative examples. The computer-readable media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.
In some illustrative embodiments, the program code 66 may be downloaded over a network to the persistent storage 58 from another computing device or computer system for use within the computing device 50. For instance, program code stored in a computer-readable storage medium in a server computing device may be downloaded over a network from the server to the computing device 50. The computing device providing the program code 66 may be a server computer, a workstation, a client computer, or some other device capable of storing and transmitting the program code 66.
The program code 66 may be organized into computer-executable components that are functionally related. For example, the program code 66 may include one or more part agents, ordering manager agents, supplier agents, and/or any component suitable for practicing the methods described herein. Each component may include computer-executable instructions that, when executed by the processor unit 54, cause the processor unit 54 to perform one or more of the operations described herein.
The different components illustrated herein for the computing device 50 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a computer system including components in addition to or in place of those illustrated for computing device 50. For example, other components shown in
In another example, a bus system may be used to implement the communications fabric 52 and may include one or more buses, such as a system bus or an input/output bus. Of course, the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system. Additionally, a communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter. Further, a memory may be, for example, without limitation, the memory 56 or a cache such as that found in an interface and memory controller hub that may be present in the communications fabric 52.
Turning back to
The journal database 30 may be used to track each update or change made to one of the files or directories of the master file system 32. Turning to
The extraction utility 100 of the application server 20 may monitor the document management system 102 for any updates, such as the addition of new files or directories, or for obsolete files or directories that are deleted. Any new additions to the document management system 102 may be copied to the master file system 32, and the associated update actions may be recorded in the journal database 30. As explained in greater detail below, the associated update actions such as the recording of a new file or directory, as well as the removal or deletion of an obsolete file or directory from the master file system 32 may also be recorded in the journal database 30 by the extraction utility 100.
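The recording of update actions described above may be pictured with a minimal sketch. It assumes an in-memory list standing in for the journal database 30; the names `JournalEntry`, `record_addition`, and `record_deletion`, the action labels, and the numeric timestamps are hypothetical and are not taken from the disclosure. How the extraction utility 100 detects a change is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalEntry:
    timestamp: float  # point in time at which the modification was recorded
    action: str       # hypothetical labels: "add" or "delete"
    path: str         # file or directory affected in the master file system
    size: int         # bytes to transfer; 0 for a deletion


def record_addition(journal: List[JournalEntry], path: str, size: int, now: float) -> None:
    """Record that a new file or directory was copied to the master file system."""
    journal.append(JournalEntry(now, "add", path, size))


def record_deletion(journal: List[JournalEntry], path: str, now: float) -> None:
    """Record that an obsolete file or directory was removed."""
    journal.append(JournalEntry(now, "delete", path, 0))


# Entries are appended as changes occur, so the journal stays in chronological order.
journal: List[JournalEntry] = []
record_addition(journal, "manuals/b.pdf", 250, now=1.0)
record_deletion(journal, "manuals/a.pdf", now=2.0)
```

Because every change is appended at the moment it is made, a later synchronization may read the journal instead of comparing directory trees.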
The journal database 30 may store two tables, namely a journal table 108 and a client table 110.
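One possible layout of the two tables may be sketched with an in-memory SQLite database. The table names, column names, and sample rows below are assumptions made for illustration only; the disclosure states only that the journal entries are chronological and that the client table associates a client with the timestamp of its last update.

```python
import sqlite3

# In-memory database standing in for the journal database 30 (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE journal_table (
    entry_id   INTEGER PRIMARY KEY,
    changed_at TEXT NOT NULL,      -- when the master file system was modified
    action     TEXT NOT NULL,      -- hypothetical labels: 'add' or 'delete'
    path       TEXT NOT NULL,
    size_bytes INTEGER NOT NULL
);
-- An index on the time column lets a lookup by timestamp avoid a full scan.
CREATE INDEX journal_changed_at ON journal_table (changed_at);

CREATE TABLE client_table (
    client_id    TEXT PRIMARY KEY,
    last_updated TEXT NOT NULL     -- last time the client's replica was updated
);
""")

# Made-up sample rows; ISO-8601 strings sort in chronological order.
conn.execute("INSERT INTO client_table VALUES (?, ?)", ("JDoe1", "2024-01-01T00:00:00"))
conn.executemany(
    "INSERT INTO journal_table (changed_at, action, path, size_bytes) VALUES (?, ?, ?, ?)",
    [
        ("2023-12-31T09:00:00", "add", "manuals/a.pdf", 100),
        ("2024-01-02T09:00:00", "add", "manuals/b.pdf", 250),
    ],
)

table_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
row_count = conn.execute("SELECT COUNT(*) FROM journal_table").fetchone()[0]
```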
As seen in
In block 204, the client workstation 22 may query the application server 20 for any updates specific to the document management system 102, which are applicable to the specific client workstation 22. For example, if the client workstation 22 includes the client identifier JDoe1, then the client workstation 22 queries the application server 20 for any updates that are applicable to JDoe1. Method 200 may then proceed to block 206.
In block 206, the processor 54 of the application server 20 queries the client table 110 (
In block 208, the processor 54 of the application server 20 queries the journal table 108 of the journal database 30 based on the timestamp 146 determined in block 206 above to determine the modifications to the master file system 32 that are not reflected within the replica file system 40 of the client workstation 22. Specifically, the processor 54 of the application server 20 queries the journal table 108 based on the timestamp 146, and locates a specific entry 120 within the journal table 108 that reflects a modification to the master file system 32 made subsequent to the last point in time when the replica file system 40 of the client workstation 22 was updated.
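The lookup of block 208 may be sketched as follows, assuming the journal is held as a chronologically ordered list and timestamps are plain integers. The function name `pending_entries` and the tuple layout are hypothetical. Because the entries are ordered, a binary search may locate the first entry made after the client timestamp without comparing any directories.

```python
from bisect import bisect_right
from typing import List, Tuple

Entry = Tuple[int, str, str]  # (timestamp, action, path); layout is an assumption


def pending_entries(journal: List[Entry], client_timestamp: int) -> List[Entry]:
    """Return the entries made strictly after the client's last update."""
    timestamps = [entry[0] for entry in journal]
    # Index of the first entry whose timestamp is greater than client_timestamp.
    first = bisect_right(timestamps, client_timestamp)
    return journal[first:]


journal: List[Entry] = [
    (10, "add", "a.pdf"),
    (20, "add", "b.pdf"),
    (30, "delete", "a.pdf"),
]
pending = pending_entries(journal, client_timestamp=10)
```

In this made-up example, a client last updated at time 10 is missing the entries recorded at times 20 and 30.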
For example, with specific reference to
In block 210, the processor 54 of the application server 20 may then determine a total file size of all of the modifications to the master file system 32 that are not currently reflected in the replica file system 40. Specifically, the application server 20 may determine the file size of all modifications that are not reflected within the replica file system 40 of the client workstation 22 by summing or adding together the file size of each individual entry 120 within the journal table 108 that represents a modification not reflected within the replica file system 40 of the client workstation 22. For example, if the replica file system 40 of the client workstation 22 does not include the modifications that are represented in entries 1202-120N (
In block 212, the application server 20 may then send to the client workstation 22, over the network 24, an indication of the total file size of all the modifications to the master file system 32 that are not currently reflected in the replica file system 40. It should be appreciated that the client workstation 22 may use the total file size in order to update the total progress bar 178 shown in
In block 214, the client workstation 22 may send an indication over the network 24 to the application server 20 signifying that the client workstation 22 is ready to receive updates to its replica file system 40. Method 200 may then proceed to block 216.
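The size calculation of block 210 and the client-side use of the total reported in block 212 may be sketched as follows. The function names, the `(timestamp, size)` tuple layout, and the byte counts are hypothetical, and the handling of an empty update set is an assumption.

```python
from typing import List, Tuple

SizedEntry = Tuple[int, int]  # (timestamp, size in bytes); layout is an assumption


def total_pending_size(journal: List[SizedEntry], client_timestamp: int) -> int:
    """Sum the sizes of all entries not yet reflected in the replica."""
    return sum(size for timestamp, size in journal if timestamp > client_timestamp)


def progress_fraction(bytes_received: int, total_bytes: int) -> float:
    """Fraction of the pending update received so far, for a progress display."""
    # Assumption: with nothing pending, the update is treated as complete.
    return 1.0 if total_bytes == 0 else bytes_received / total_bytes


journal: List[SizedEntry] = [(10, 100), (20, 250), (30, 400)]
total = total_pending_size(journal, client_timestamp=10)  # 250 + 400
halfway = progress_fraction(325, total)
```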
In block 216, the application server 20 may send updates to the client workstation 22 over the network 24. The updates include each and every modification to the master file system 32 made subsequent to the last point in time when the replica file system 40 of the client workstation 22 was updated. In the present example, with reference to
Furthermore, it is to be appreciated that as each individual update is sent to the client workstation 22 over the network 24, the timestamp 146 (
In block 218, once all of the updates have been sent to the client workstation 22, method 200 may terminate. It is to be appreciated that method 200 may also terminate in the event a user of the client workstation 22 terminates the file synchronization. Alternatively, the method 200 may also terminate in the event the network 24 becomes unavailable.
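The sending of updates in block 216, together with an early termination and a later resumption, may be sketched as follows. The sketch assumes that the stored client timestamp is advanced after each update is delivered, that journal timestamps are unique, and that a dictionary stands in for the client table 110; `send_updates` and `make_link` are hypothetical names, and the unreliable network is simulated.

```python
from typing import Callable, Dict, List, Tuple

Entry = Tuple[int, str]  # (timestamp, path); timestamps are assumed unique

delivered: List[Entry] = []


def send_updates(journal: List[Entry], client_table: Dict[str, int],
                 client_id: str, send: Callable[[Entry], bool]) -> int:
    """Send pending entries in order; return how many were delivered."""
    sent = 0
    for entry in journal:
        if entry[0] <= client_table[client_id]:
            continue  # already reflected in the replica
        if not send(entry):
            break  # connection lost; stop without advancing the timestamp
        client_table[client_id] = entry[0]  # advance after each delivered update
        sent += 1
    return sent


def make_link(budget: int) -> Callable[[Entry], bool]:
    """Simulated connection that delivers at most `budget` entries, then drops."""
    remaining = [budget]

    def send(entry: Entry) -> bool:
        if remaining[0] == 0:
            return False
        remaining[0] -= 1
        delivered.append(entry)
        return True

    return send


journal: List[Entry] = [(1, "a.pdf"), (2, "b.pdf"), (3, "c.pdf"), (4, "d.pdf")]
client_table: Dict[str, int] = {"JDoe1": 0}
first_session = send_updates(journal, client_table, "JDoe1", make_link(2))
second_session = send_updates(journal, client_table, "JDoe1", make_link(10))
```

In this simulation the first session is cut off after two updates, and the second session resumes with the third entry, so each entry is delivered exactly once.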
Referring generally to the figures, the disclosed system provides an approach for quickly and efficiently determining the differences between the master file system and the replica file system. It should also be appreciated that the disclosed system does not require differential analysis of the master file system and the replica file system. Moreover, each change may be propagated to a client workstation only once. Furthermore, file synchronization may be interrupted at any time, and the next file synchronization may still resume at the point in time where the last change to the replica file system was performed. Thus, each client workstation may progressively synchronize the data stored within its replica file system in short sessions. This may be especially advantageous if the client workstation is located in an area where Internet connectivity is relatively slow or sporadic.
While the forms of apparatus and methods herein described constitute preferred aspects of this disclosure, it is to be understood that the disclosure is not limited to these precise forms of apparatus and methods, and that changes may be made therein without departing from the scope of the disclosure.