The present invention is related to the field of computer system backup and restoration, and more particularly to the backup and restoration of a computer's operating system.
Existing computer backup/restore systems generally provide for remote backup storage of copies of files from a computer, and later restoration of such files from the remote backup system back to the computer as may be needed. Most existing systems readily support the backup and restoration of user data files, and some support the backup and restoration of system files that form part of the computer's operating system. While systems of the latter type theoretically provide for restoration of a computer's operating system in the event of loss or corruption of critical operating system files, in practice such restoration may be difficult or impossible even though the files have been backed up and are available for restoration. This difficulty arises in part because of the great complexity of modern operating systems and the large amount of dynamic operating state information that they maintain. It may be impossible for a backup system to fully and coherently re-create a usable operating state of an operating system. In many cases it is necessary for a user to painfully re-create a specific operating state by re-installing the operating system and then applying all necessary incremental changes to it.
Some operating systems, notably the Windows® family of operating systems from Microsoft Corporation, provide a “system restore” function that enables the operating system to be fully restored to a specific configuration as represented by a “restore point”. Periodically during operation, the operating system creates a set of restore point files, which are copies of critical operating system files as they exist at the moment the restore point is being created. Examples of such files include certain executable (program) files, dynamically linked libraries (DLLs), and the system “registry”, which is a large collection of files specifying the entire hardware and software configuration of the computer. The restore point files are saved on a local storage device of the computer, typically a magnetic disk drive on which the operating system is also stored. One main use of the system restore function is to “roll back” the operating system from a current operating state to an operating state at a previous restore point. This roll-back operation can be useful to recover from hardware or software changes that introduce problems in the operation of the computer. In this use, the operating system is executing and itself performs the roll-back. In some systems, notably the newer Vista®-based systems, the system restore function can be initiated from a “system recovery environment” to create a functioning operating system from a previously saved restore point.
Existing backup and restore systems/techniques as discussed above may suffer from certain undesirable limitations. Conventional remote backup/restore systems may not adequately provide for restoration of a computer's operating system to a coherent operating state. System restore functions may rely on a locally stored copy of restore points and are therefore vulnerable to certain failures that will render the restore-point files unavailable, such as a failure of the local storage device.
Disclosed is a backup and restoration technique that enables complete recovery of an operating system even in the event of such catastrophic events as severe data corruption or complete failure of a computer's storage device. System files as well as user files can be restored to a target computer (either the original computer from which they were backed up or another computer), without the need for the target computer to be in a bootable state. The technique provides both flexible system restoration and greater reliability due to the use of remote storage.
In a disclosed method, during normal operation of the computer, a backup operation is periodically performed during which user files and system files are copied to a separate backup system for persistent storage. The system files include sets of restore-point files from a source storage device of the computer, the restore-point files having been created by an operating system of the computer and each set of restore-point files being constituents of the operating system at a corresponding point in time. A restoration operation is subsequently performed, for example after an event which causes loss of the restore-point files from the source storage device. The restoration operation includes accessing a recovery storage device which stores (a) a recovery program and (b) copies of the user files and system files as previously provided to the backup system. The computer is first operated in a limited-functionality recovery mode including (a) executing the recovery program to restore the restore-point files from the recovery storage device to a target storage device, which may be the same as or different from the source storage device from which the files were backed up, and (b) executing a system restore function of the computer with a selected set of the restored restore-point files to restore the operating system as constituted at the corresponding point in time. Subsequently, the computer is operated in a full-functionality operating mode including (a) executing the operating system as so restored, and (b) executing the recovery program to restore the user files from the recovery storage device to the computer for subsequent normal use by a user of the computer.
The technique also involves backup and restoration services provided by the backup system as more specifically described and claimed below. The technique provides for robust restoration of an operating system relying on the operating system's own system restore function using restore points, while minimizing dependence on the level of operability of the source storage device and operating system upon occurrence of a data loss event.
The foregoing and other objects, features and advantages will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of various embodiments of the invention.
In operation, each client computer 10 executes an operating system which employs operating system (O/S) files 16, as well as various application programs which generate user data files (USER FILES) 18. Examples of user data files include e-mail files, word processing files, spreadsheets, etc. The collection of O/S files 16 is generally quite large and varied. It includes program or “executable” files, dynamically linked library (DLL) files, registry file(s) that store comprehensive information about the current hardware and software configuration of the client computer, and many other kinds of files which are used by the operating system during normal operation. Examples of common operating systems include UNIX-like systems such as Linux and the Windows® family of operating systems from Microsoft (e.g., Windows XP® and Windows Vista®).
Some operating systems (notably including the Windows family) include a “system restore” function which is used to periodically save certain of the O/S files 16 for potential later use in “rolling back” or restoring the operating system to a previous configuration. The points in time at which these save operations occur are referred to as “restore points”, and the set of files saved at each restore point is referred to herein as a set of “restore-point files”.
In operation, the backup system 12 provides backup and restore services to the client computers 10 as described in more detail herein. Backup operations are generally conducted over the network 14, with the clients 10 using the network 14 to transfer files to the backup system 12 for persistent storage. During restoration operations, the backup system 12 creates a recovery storage device 25 such as an optical disk which includes copies of files to be restored to a client computer 10 as well as a “recovery agent” program, and the recovery storage device 25 is sent to a user of a client computer 10 for use in a restoration process as described in more detail below.
The backup computer system 12 maintains both a pool of files (shown as POOL FILES) 22 and sets of client computer files (shown as CLT FILES) 24 which are specific to the respective client computers 10 (i.e., CLT FILES 24-1 are stored on behalf of client computer 10-1, CLT FILES 24-2 for client computer 10-2, etc.). Each set of client computer files 24-x includes the user files 18 and some of the restore-point files 20 for the respective client computer 10-x. Specifically, those restore-point files 20 that are unique to a given client computer 10 are stored as part of the client computer files 24 for that client computer 10. It will be appreciated that in a typical networked computer system such as that of
At step 44 the backup system 12 generates a recovery storage device 25 which includes (a) the files to be restored and (b) a recovery agent program that can be executed by the target client computer 10 as part of the restoration process. The files to be restored typically include the user data files 18 as well as restore-point files 20 that were previously transferred to the backup system 12 in a backup operation (see
At step 46, the target client computer 10 is booted from the recovery storage device 25, and at step 48 the recovery agent is executed from the recovery storage device 25 to restore the restore-point files to the target storage 30 of the target client computer 10. Once the restore-point files are restored, then at step 50 the O/S system restore function is executed with a user selection of a restore point, thereby restoring the O/S to the selected restore point. At step 52 the recovery agent is used to also restore the user data files to the target storage 30.
Referring to
Also during normal operation 54, the client computer 10 performs a backup operation 66, which again may be done periodically and/or as specifically necessitated or requested. The backup operation 66 may be performed by a backup application program executing on the client computer 10, which may be part of or separate from the operating system. An example of a backup application program is a backup agent which is a component of a Connected® Backup service available from Iron Mountain, Inc. During the backup operation 66, the client computer 10 transfers copies of the user data files 18 and the restore-point files 20 to the backup system 12, where the backup service 58 receives the files and stores them persistently for potential later use in a recovery/restoration operation. As previously indicated, the backup operation 66 and backup service 58 may employ data reduction techniques which reduce the amount of data that is transferred to and stored by the backup system 12. Examples of these data reduction techniques are now described.
A first data reduction technique is aimed at eliminating the transmission and storage of duplicate files, whether by a single client computer 10 or by different client computers 10. When a file is transferred using this technique, it is saved in the pool files 22 (
It will be appreciated that this first data reduction technique can increase backup efficiency when there is significant file duplication across multiple backup operations 66. In the case of the restore-point files 20, there is typically considerable duplication of certain file sets across a series of restore points, including for example executable files and files forming the system registry, and therefore this technique can advantageously provide greater backup efficiency for these files. Of the thousands of files constituting a given restore point, it may be necessary to transfer only a small fraction (e.g., tens) of files that are new since the last restore point.
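By way of illustration only, the duplicate-elimination idea can be sketched as a content-addressed store. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class name BackupPool, the use of SHA-256 digests as file identities, and the in-memory dictionaries standing in for the pool files 22 and the per-client records are all hypothetical, and the sketch keeps every file in a single pool rather than splitting unique files into the client computer files 24.

```python
import hashlib


class BackupPool:
    """Hypothetical content-addressed store: one saved copy per unique file content."""

    def __init__(self):
        self.pool = {}         # content digest -> file bytes (stands in for pool files 22)
        self.client_refs = {}  # client id -> {path: content digest}

    def backup_file(self, client_id, path, data):
        """Record a file for a client; return True only if its bytes had to be transferred."""
        digest = hashlib.sha256(data).hexdigest()
        transferred = digest not in self.pool
        if transferred:
            # First time this content is seen by any client: store it once.
            self.pool[digest] = data
        # Every backup records a reference, even when no bytes are transferred.
        self.client_refs.setdefault(client_id, {})[path] = digest
        return transferred

    def restore_file(self, client_id, path):
        """Look up the single stored copy through the client's reference."""
        return self.pool[self.client_refs[client_id][path]]
```

In this sketch, a file that reappears unchanged in a later restore point, or on a second client computer, costs only a reference entry rather than a second transfer.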
A second data reduction technique sends only the changed data in files, reducing transmission time and storage requirements. This mechanism relies on computing a series of hashes of a file. One hash is computed for each block of the file (where “block” refers to a fixed-size set of successive bytes). The hashes are stored on the client computer 10 for every file that is backed up. A block is not re-sent unless its hash changes in a subsequent backup. This technique can be useful for large files which change only slightly over time, such as some registry files. In this case a complete “base” version of the file is transferred initially, and during subsequent backup operations 66 only the blocks that differ from the base version are transferred. For each changed version that is backed up in this manner, the backup system 12 stores data indicating that the file is a changed version of the base file, along with references to the changed blocks. Over a long enough period of time as the number of changes grows, it may be desirable to re-transmit an up-to-date full version of the file, which could be done at regular intervals for example or based on a size threshold for the number of changed blocks. For example, once it becomes necessary to transfer one-half of the blocks of the file, it may be desirable to instead send a new full copy. Subsequent backups can revert to the changed-block technique with reference to the new full copy. An example of this second technique is a function called Delta Block® which is part of the Connected® Backup of Iron Mountain Inc.
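The changed-block mechanism described above might be sketched as follows. This is an illustrative sketch only and does not describe the Delta Block® function itself: the function names, the 4096-byte block size, the choice of SHA-256, and the dictionary layout of the result are assumptions made for the example. The one-half threshold follows the example given in the text.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the text only requires fixed-size blocks


def block_hashes(data, block_size=BLOCK_SIZE):
    """Compute one hash per fixed-size block of the file contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def plan_backup(new_data, base_hashes, block_size=BLOCK_SIZE, full_threshold=0.5):
    """Decide what to send: a full copy, or only blocks that differ from the base version.

    base_hashes is the list of block hashes kept on the client for the base version.
    """
    new_hashes = block_hashes(new_data, block_size)
    changed = {
        index: new_data[index * block_size:(index + 1) * block_size]
        for index, digest in enumerate(new_hashes)
        if index >= len(base_hashes) or digest != base_hashes[index]
    }
    # No base yet, or too many changed blocks: send a new full copy instead.
    if not base_hashes or len(changed) >= full_threshold * len(new_hashes):
        return {"kind": "full", "data": new_data}
    return {"kind": "delta", "length": len(new_data), "blocks": changed}
```

After a "full" result, the client would replace its stored hashes with those of the new full copy, so that later backups are compared against the new base.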
As a result of the backup operations 66 and backup service 58, the backup system 12 stores backup copies of the user files 18 and restore-point files 20 in the pool files 22 and client computer files 24, with some of the files being stored as respective sets of changed blocks that refer to separate full versions of the files pursuant to the second technique described above. These files are available for use in providing the restoration service 60 upon occurrence of the event 62, as now described.
The client computer 10 transitions from normal operation 54 to the restoration operation 56 upon occurrence of an event 62 that creates the need for restoration of files, and transitions back to normal operation 54 when the restoration operation 56 is completed as indicated at 64. The event 62 may be a hardware or software condition that causes the loss of any/all of the user files 18, O/S files 16 or restore-point files 20. Examples of such conditions include failure of a disk drive at the client computer 10, as well as corruption of critical O/S files by a computer virus program that is permitted to execute on the client computer 10.
It will be appreciated that part of providing the restoration service 60 is to transfer the files stored in the backup system 12 to the client computer 10. If the client computer 10 has sufficient functionality, it may be capable of receiving the files via the network 14. However, it is assumed herein that the event 62 is of a nature that prevents the client computer 10 from executing its full-functionality operating system, so that the client computer 10 cannot receive files in this manner. In this condition, the client computer 10 may also be described as “non-bootable”, meaning that it cannot start up an operating system to the point of normal operation.
In this case, it is assumed that there is an out-of-band process for initiating the restoration service 60 at the backup system 12. For example, a user may place a telephone call to an operator of the backup system 12 or use another computer 10 to issue an electronic request for the restoration service 60. As part of the restoration service 60, the backup system 12 creates the recovery storage device 25 that contains copies of the files to be restored to the client computer 10, these files being obtained from the pool files 22 and client computer files 24 for the particular client computer 10 for which restoration is requested. The files are preferably stored on the recovery storage device 25 in the same form as originally stored in the client computer 10 from which they were backed up, which means that the backup system 12 performs whatever data expansion is necessary to undo any data reduction performed during the backup process. Specifically, this may mean creating multiple copies of files that may have been saved only once pursuant to the first data reduction technique described above, and creating full versions of files that were saved as a base version plus changed blocks pursuant to the second data reduction technique. In this latter case, a full version is obtained by writing the changed blocks to the base version of the file.
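The data expansion described above can be illustrated with a short sketch. It assumes, hypothetically, that a changed version is stored as a mapping from block index to block bytes together with the length of the changed file, and that duplicate-eliminated files are stored once under a content digest; none of these names or layouts come from the disclosure.

```python
def expand_delta(base, changed_blocks, length, block_size=4096):
    """Rebuild a full file version by writing the changed blocks onto the base version."""
    # Start from the base, truncated or zero-padded to the length of the changed version.
    full = bytearray(base[:length].ljust(length, b"\0"))
    for index, block in changed_blocks.items():
        start = index * block_size
        full[start:start + len(block)] = block
    return bytes(full)


def expand_pool(pool, client_refs):
    """Recreate one copy per original path from content that was stored only once.

    pool maps a content digest to file bytes; client_refs maps each path to a digest.
    """
    return {path: pool[digest] for path, digest in client_refs.items()}
```

The result of both functions is a set of files in the same form in which they existed on the client computer, which is the form expected on the recovery storage device 25.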
Included on the recovery storage device 25 are copies of the restore-point files 20 for the client computer 10 to enable restoration of the operating system. The recovery storage device 25 also preferably includes a recovery agent program which is to be executed by the client computer 10 as part of the restoration operation 56 as described below. Once the recovery storage device 25 is created, it is provided to the client computer system 10 for use in the restoration operation 56. As an example, the recovery storage device 25 may include a set of optical storage discs or one or more flash memory devices which are shipped or otherwise delivered to a user of the client computer 10. The user inserts the recovery storage device 25 into an appropriate port of the client computer 10 and then executes the restoration operation 56.
The restoration operation 56 includes an initial recovery mode of operation 68 and a subsequent full-functionality operating mode 70. The recovery mode 68 is an operating mode of limited functionality. It may correspond to the so-called “safe mode” of operating a personal computer. In the newer Windows Vista® operating system, there is a specific “recovery environment” that can be entered in which limited functions are provided to recover the operating system and attempt to reinitiate normal operation with the recovered operating system. These are examples of the recovery mode 68. The full-functionality operating mode 70 corresponds to normal operation of the operating system as normally booted, which of course in the present context presumes that the operations in the recovery mode 68 successfully make the operating system bootable.
In the recovery mode 68, the client computer 10 accesses the recovery storage device 25 which has been provided by the restoration service 60 as described above. In the case of optical media, this will entail reading data from an optical disk drive in the client computer 10, and in the case of a flash memory device it may entail reading data from a USB or similar input/output port. The client computer 10 executes the recovery agent program which is stored on the recovery storage device 25. Under user control, the recovery agent copies the backed-up restore-point files 20 from the recovery storage device to an appropriate system area of target storage 30 (such as a magnetic disk drive) of the client computer 10. In Windows® XP and Vista® systems, the system area is a specific folder named System Volume Information. Newer server operating systems may have a separate partition for system restore. Once the restore-point files 20 have been copied to this location, the client computer 10 then executes a system restore function which typically will be available as part of the recovery environment (i.e., provided by the BIOS or other system software which is permitted to execute in the recovery mode 68). The system restore function is capable of re-building a fully functional instance of the operating system from an individual set of restore-point files 20-x of a given restore point (e.g., restore point 20-2). The system restore function typically enables a user to select which restore point the operating system is to be restored to.
Upon completion of the system restore function, the client computer is re-started into the full-functionality operating mode 70 executing the operating system as restored in the recovery mode 68. At this point the user may again execute the recovery agent from the recovery storage device 25 in order to recover/restore the user data files 18 to the client computer. Upon completion of this operation, the client computer 10 has been fully restored to its normal operating condition, and as shown at 64 it then transitions back to normal operation 54.
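The ordering of the two modes of the restoration operation 56 can be summarized in a small simulation. This is a sketch under stated assumptions only: the classes RecoveryMedia and TargetStorage and both functions are hypothetical stand-ins, in-memory dictionaries take the place of the recovery storage device 25 and target storage 30, and the operating system's own system restore function is represented by a simple copy.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryMedia:
    restore_points: dict  # restore point id -> {path: bytes}
    user_files: dict      # path -> bytes


@dataclass
class TargetStorage:
    system_area: dict = field(default_factory=dict)  # restored restore-point sets
    active_os: dict = field(default_factory=dict)    # files of the restored O/S
    user_files: dict = field(default_factory=dict)
    bootable: bool = False


def recovery_mode(media, target, selected_point):
    """Limited-functionality phase: restore the restore-point files, then run system restore."""
    if selected_point not in media.restore_points:
        raise ValueError(f"unknown restore point: {selected_point}")
    # (a) The recovery agent copies every restore-point set to the system area.
    for point_id, files in media.restore_points.items():
        target.system_area[point_id] = dict(files)
    # (b) Stand-in for the system restore function using the selected restore point.
    target.active_os = dict(target.system_area[selected_point])
    target.bootable = True


def full_functionality_mode(media, target):
    """Full-functionality phase: only after the O/S is restored are user files restored."""
    if not target.bootable:
        raise RuntimeError("operating system must be restored before user files")
    target.user_files.update(media.user_files)
```

The check in full_functionality_mode reflects the sequencing in the text: user data files are restored only once the recovery mode has produced a bootable operating system.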
Although the foregoing describes the restoration of files to the same client computer 10 from which the files were backed up, the disclosed technique can also be used to restore the files to a computer other than the one from which they were backed up. This operation may be desirable when a user's computer is to be replaced, for example. Thus in general the backup and restore operations involve first and second computers 10, which in some cases can be the same computer and in other cases may be distinct computers. Additionally, even in the case of restoring files to the same computer 10, it may be desirable to restore the files to a different storage device than the one the files were backed up from. This may be desirable when the original storage device has failed and is replaced, for example. Thus the backup and restore operations may involve distinct source and target storage devices respectively. Other variations of the disclosed technique may also be utilized.