The present invention relates to data backup. More particularly, the present invention is a method and apparatus for managing backup data.
Many schemes have been developed to protect data from accidental loss or damage. One such approach is hardware redundancy, for example redundant arrays of independent disks (RAID).
Unfortunately, hardware redundancy schemes are ineffective in dealing with logical data loss or corruption. For example, a file deletion or virus infection is often automatically replicated to all of the redundant hardware components and can neither be prevented nor recovered from by such technologies.
To overcome this problem, backup technologies have been developed to retain multiple versions of a production system over time. This has allowed administrators to restore previous versions of data and to recover from data corruption.
One type of data protection system involves making point in time (PIT) copies of data. A first type of PIT copy is a hardware-based PIT copy, which is a mirror of a primary volume onto a secondary volume. The main drawbacks of the hardware-based PIT copy are that the data ages quickly and that each copy takes up as much disk space as the primary volume. A second type is a software-based PIT copy, a so-called “snapshot,” which is a “picture” of a volume at the block level or of a file system at the operating system level.
Backup data is generated in accordance with a data backup policy. Typically, the data backup policy sets an expiration time of each backup. For example, a system may retain all writes to the system for two days to provide any-point-in-time protection, and retain hourly snapshots for two weeks, daily snapshots for two months, and monthly snapshots for one year. Each snapshot has its own expiration time. Typically, the expiration time is determined by a main system clock. The system automatically deletes backup data upon expiration of the timer of each backup in accordance with the main system clock.
If a system operator accidentally or maliciously advances the main system clock, the system would automatically delete any snapshots or other backup data whose expiration timer is set to a time before the accidentally or maliciously advanced time. In that situation, the system may or may not be able to recover the deleted data.
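By way of illustration only, the failure mode described above can be sketched as follows. The names `Backup` and `purge_expired`, and the timestamps used, are hypothetical and chosen only for this example; the sketch assumes expiration is decided by comparing each backup's expiration time against the main system clock.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    expires_at: float  # expiration time, in seconds on the main system clock


def purge_expired(backups, main_clock_now):
    """Return only the backups whose expiration time is still in the future."""
    return [b for b in backups if b.expires_at > main_clock_now]


DAY = 24 * 60 * 60
backups = [
    Backup("hourly", expires_at=14 * DAY),
    Backup("daily", expires_at=60 * DAY),
    Backup("monthly", expires_at=365 * DAY),
]

# With a correct clock reading (day 1), nothing has expired.
kept_normal = purge_expired(backups, main_clock_now=1 * DAY)

# If the clock is advanced by two years, every backup appears expired.
kept_after_jump = purge_expired(backups, main_clock_now=730 * DAY)
```

In this sketch a single clock adjustment is enough to empty the list, which is the exposure the schemes described below are intended to avoid.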
The present invention is a method and apparatus for managing backup data. A data backup system defines a plurality of time windows for creating and maintaining backup data in accordance with a data backup policy. Each of the time windows is assigned a predetermined amount of storage space. When the data backup system creates new backup data, the system determines whether the storage space assigned to the corresponding time window is large enough to accommodate the new backup data. If the storage space is large enough, the new backup data is stored. If it is not, the system deletes the oldest backup data in that window until enough storage space is available, and then stores the new backup data.
The system may assign a predetermined number of data backups to each of the time windows. Newly created backup data is stored if the number of backups does not exceed the assigned number. The system may also use an internal clock, independent from a main clock, in managing backup data.
A more detailed understanding of the invention may be had from the following description of a preferred embodiment, given by way of example, and to be understood in conjunction with the accompanying drawings, wherein:
The present invention will be described with reference to the drawing figures wherein like numerals represent like elements throughout.
It should be noted that the primary data volume 104 and the secondary data volume 108 can be any type of data storage, including, but not limited to, a single disk, a disk array (such as a RAID), or a storage area network (SAN). The main difference between the primary data volume 104 and the secondary data volume 108 lies in the type of data storage device at each location. The primary volume 104 is typically an expensive, fast, and highly available storage subsystem, whereas the secondary volume 108 is typically a cost-effective, high capacity, and comparatively slow (for example, ATA/SATA disks) storage subsystem.
It should be noted that the configurations of the system in
The controller 112 provides overall control of generating, storing, and deleting backup data. The backup data generation unit 114 generates backup data, such as snapshots, under the control of the controller 112 as required by the backup policy. The backup data is stored in a storage unit, such as a secondary volume 108. Each backup has its own expiration time, and the controller 112 deletes a backup when its expiration time has passed.
A process for managing backup data will be explained with reference to
The controller 112 assigns each of the time windows a predetermined amount of storage space (step 304). For example, the controller 112 may assign 100 GB for the any-point-in-time (APIT) window, 100 GB for hourly snapshots, 100 GB for daily snapshots, 100 GB for weekly snapshots, and 100 GB for monthly snapshots.
The backup data generation unit 114 creates backup data under the control of the controller 112 (step 306). For example, if the data backup policy is set to retain every write operation for APIT protection, the backup data generation unit 114 duplicates every write operation in the storage space assigned to the APIT window. In storing the writes, the controller 112 determines whether the assigned storage space is large enough to store the new backup data (step 308). If enough assigned storage space remains to accommodate the new backup data, the new backup data is stored (step 310). However, if the assigned storage space is not large enough, the oldest stored backup data in the assigned storage space is deleted, one backup after another, until enough space is freed to accommodate the newly created backup data (step 312).
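Steps 306 through 312 can be sketched, by way of example only, as a window that evicts its oldest entries until a new backup fits. The class name `SpaceBoundedWindow`, its fields, and the small byte counts are hypothetical. Rejecting a backup larger than the whole window is an assumption of this sketch, since that case is not addressed above.

```python
from collections import deque


class SpaceBoundedWindow:
    """Hypothetical time window that evicts oldest backups when out of space."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.backups = deque()  # (name, size) pairs, oldest on the left

    def store(self, name, size_bytes):
        # Assumption: a backup that can never fit is refused outright.
        if size_bytes > self.capacity_bytes:
            raise ValueError("backup is larger than the window's assigned space")
        # Steps 308 and 312: delete the oldest backups until the new one fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, old_size = self.backups.popleft()
            self.used_bytes -= old_size
        # Step 310: store the new backup.
        self.backups.append((name, size_bytes))
        self.used_bytes += size_bytes


window = SpaceBoundedWindow(capacity_bytes=100)
window.store("w1", 40)
window.store("w2", 40)
window.store("w3", 40)  # does not fit, so "w1" is evicted first
names = [name for name, _ in window.backups]
```

No clock of any kind is consulted in `store`, so an adjustment to the main system clock cannot trigger a deletion in this sketch.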
As previously described with reference to the prior art, each write retained for APIT protection is deleted after a specific expiration time has passed, for example 24 hours, and the passage of time is calculated by the main system clock. In contrast, in accordance with the present invention, the writes are not deleted depending upon the passage of time, but rather depending upon space availability. This is done without regard to the status of the main system clock. In a time period wherein few writes are committed to the primary storage, the APIT window may retain a much longer period of data; whereas in a time period of very high write activity, a shorter period of data may be retained. The duration of retention is a function of the assigned storage space and frequency of write operations. With this scheme, backup data is protected from accidental or malicious adjustment of the main system clock.
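The dependence of retention on write activity can be approximated with a simple calculation. The sketch below assumes a steady write rate, which real workloads rarely have, and the function name and the two example rates are hypothetical; only the 100 GB window size comes from the example above.

```python
def estimated_retention_hours(assigned_bytes, write_bytes_per_hour):
    """Approximate hours of history a space-bounded window holds at a steady write rate."""
    if write_bytes_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return assigned_bytes / write_bytes_per_hour


GB = 10**9
quiet = estimated_retention_hours(100 * GB, 1 * GB)   # light write activity
busy = estimated_retention_hours(100 * GB, 20 * GB)   # heavy write activity
```

Under these assumed rates, the same 100 GB window holds roughly 100 hours of writes in the quiet period and roughly 5 hours in the busy one.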
The controller 112 defines a plurality of time windows for creating and maintaining backup data in accordance with the data backup policy (step 402). The controller 112 assigns each of the time windows a predetermined number of backups (step 404). For example, the controller 112 may assign 100 backups for the APIT window, 50 for hourly snapshots, 10 for daily snapshots, 10 for weekly snapshots, and 20 for monthly snapshots.
The backup data generation unit 114 creates backup data under the control of the controller 112 (step 406). For example, if the data backup policy is set to retain every write operation for APIT protection, the backup data generation unit 114 duplicates every write operation in the storage space assigned to the APIT window. In storing the writes, the controller 112 determines whether the assigned number has been exceeded before storing the new backup data (step 408). If the assigned number has not been exceeded, the new backup data is stored (step 410). However, if the assigned number has been exceeded, the oldest backup data may first be deleted and the new backup data then stored (step 412). Alternatively, if the assigned number has been exceeded, generation of new backup data may be stopped, or interleaving backup data may be deleted before storing the new backup data.
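Steps 406 through 412 can be sketched in the same manner, by way of example only. The class name `CountBoundedWindow` and the `on_full` parameter are hypothetical. The sketch treats the assigned number as exceeded when storing one more backup would go over the limit, which is an interpretation, and it models only two of the alternatives described above: deleting the oldest backup and stopping generation.

```python
from collections import deque


class CountBoundedWindow:
    """Hypothetical time window limited to a fixed number of backups."""

    def __init__(self, max_backups, on_full="evict_oldest"):
        if on_full not in ("evict_oldest", "stop"):
            raise ValueError("unknown on_full policy")
        self.max_backups = max_backups
        self.on_full = on_full
        self.backups = deque()  # oldest on the left

    def store(self, name):
        """Store a backup; return False if it was refused because the window is full."""
        # Step 408: check the assigned number before storing.
        if len(self.backups) >= self.max_backups:
            if self.on_full == "stop":
                return False  # alternative: stop generating new backups
            self.backups.popleft()  # step 412: delete the oldest backup first
        # Steps 410 and 412: store the new backup.
        self.backups.append(name)
        return True


daily = CountBoundedWindow(max_backups=3)
for day in ["mon", "tue", "wed", "thu"]:
    daily.store(day)
kept = list(daily.backups)

frozen = CountBoundedWindow(max_backups=1, on_full="stop")
first_ok = frozen.store("a")
second_ok = frozen.store("b")
```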
As previously described with reference to the prior art, each write retained for APIT protection would typically be deleted after a certain expiration time has passed, and the passage of time is calculated in accordance with the main system clock. In contrast, in accordance with the present invention, the writes are not deleted depending upon the passage of time, but rather depending upon the available number of backups. This is done without regard to the main system clock. Therefore, in a time period wherein few writes are committed to the primary storage, the APIT window may retain a longer period of data; whereas in a time period of very high write activity, a shorter period of data may be retained. The duration of retention is a function of the assigned number and frequency of write operations. With this scheme, backup data is protected from accidental or malicious adjustment of the main system clock.
Backup data is created in accordance with the data backup policy (step 504). The controller 112 determines whether the expiration time for a particular backup has passed in accordance with the internal clock 116 (step 506). Expired backup data is deleted (step 510) and unexpired backup data is maintained (step 508).
The data protection unit 106 deletes expired backup data in accordance with the internal clock 116, rather than the main clock. With this scheme, the data protection unit 106 may maintain the lifespan of data backups independent from an adjustment to the main clock.
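A minimal sketch of steps 504 through 510 follows, offered as an illustration only. The class `ExpiringBackupStore` and its methods are hypothetical. Using Python's `time.monotonic` as the default stand-in for the internal clock 116 is an assumption of this sketch; Python documents that clock as unaffected by system clock updates, but a monotonic counter of this kind does not survive a restart. The clock is passed in as a callable so that the example can use a fake clock.

```python
import time


class ExpiringBackupStore:
    """Hypothetical store that expires backups against an internal clock only."""

    def __init__(self, internal_clock=time.monotonic):
        self._now = internal_clock
        self._backups = {}  # name -> expiration time on the internal clock

    def add(self, name, lifetime_seconds):
        # Step 504: create a backup and stamp it using the internal clock.
        self._backups[name] = self._now() + lifetime_seconds

    def purge(self):
        """Delete expired backups (step 510) and keep the rest (step 508)."""
        now = self._now()  # step 506: consult the internal clock
        expired = [n for n, t in self._backups.items() if t <= now]
        for n in expired:
            del self._backups[n]
        return expired

    def names(self):
        return sorted(self._backups)


# A fake clock makes the behaviour easy to follow.
ticks = [0.0]
store = ExpiringBackupStore(internal_clock=lambda: ticks[0])
store.add("hourly", lifetime_seconds=3600)
store.add("daily", lifetime_seconds=86400)
ticks[0] = 7200.0  # two hours pass on the internal clock
removed = store.purge()
```

Because `purge` never reads the main clock, changing the main clock has no effect on which backups are removed in this sketch.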
Alternatively, the system may record the interval that the system has been up and adjust the internal clock by the last recorded interval. The interval is recorded on a persistent medium. The internal clock may be referred to as an “uptime clock” since the internal clock in this alternative counts only the time that the system is running. When the system recovers from a shutdown, the main clock and the internal clock should be reset. The internal clock is adjusted with the last recorded interval during which the system was up. With this scheme, the internal clock may not jump back or forward by more than one recorded interval. As a consequence, backup data expires based only on the time that the system is running, not counting the time that the system is down.
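The uptime clock can be sketched, as an illustration only, as a counter that is written to persistent storage each time an interval elapses and is read back on restart. The class name `UptimeClock`, the JSON file format, and the 60-second interval are all hypothetical; a temporary directory stands in for the persistent medium.

```python
import json
import os
import tempfile


class UptimeClock:
    """Hypothetical internal clock that counts only time the system is running."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.uptime_seconds = self._load()

    def _load(self):
        # On restart, resume from the last value recorded on persistent media.
        if os.path.exists(self.state_path):
            with open(self.state_path) as f:
                return json.load(f)["uptime_seconds"]
        return 0.0

    def advance(self, interval_seconds):
        """Record that the system has been up for one more interval."""
        self.uptime_seconds += interval_seconds
        with open(self.state_path, "w") as f:
            json.dump({"uptime_seconds": self.uptime_seconds}, f)

    def now(self):
        return self.uptime_seconds


state_file = os.path.join(tempfile.mkdtemp(), "uptime.json")
clock = UptimeClock(state_file)
clock.advance(60)
clock.advance(60)

# Simulate a shutdown and restart: downtime is never added to the clock.
restarted = UptimeClock(state_file)
resumed_at = restarted.now()
```

In this sketch the most that can be lost or gained across a restart is the interval that had not yet been recorded, which corresponds to the one-interval bound described above.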
The foregoing embodiments may be combined with each other. For example, the data backup policy may specify that at least five (5) hourly snapshots should be retained at any given time, as long as the hourly snapshots do not take more than 100 GB of storage space. The system may then take snapshots until the 100 GB is used up. The system may further set an expiration time for each backup in accordance with an internal clock. Thereafter, the system may delete expired backup data even before the 100 GB limit is reached.
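One possible combination, a space limit together with expiration against an internal clock, is sketched below by way of example only. The class `CombinedWindow`, the 30 GB snapshot size, and the 14-day lifetime are hypothetical, and the minimum-count rule from the example above is not modelled. The sketch also assumes every backup in the window has the same lifetime, so the oldest backup is always the first to expire.

```python
from collections import deque


class CombinedWindow:
    """Hypothetical window with both a space limit and internal-clock expiration."""

    def __init__(self, capacity_bytes, lifetime_seconds, internal_clock):
        self.capacity_bytes = capacity_bytes
        self.lifetime_seconds = lifetime_seconds
        self._now = internal_clock
        self.backups = deque()  # (name, size, expires_at), oldest on the left

    def used_bytes(self):
        return sum(size for _, size, _ in self.backups)

    def purge_expired(self):
        # With a common lifetime, expired backups are always at the left end.
        now = self._now()
        while self.backups and self.backups[0][2] <= now:
            self.backups.popleft()

    def store(self, name, size_bytes):
        if size_bytes > self.capacity_bytes:
            raise ValueError("backup is larger than the window's assigned space")
        self.purge_expired()
        # Evict the oldest backups until the new one fits.
        while self.used_bytes() + size_bytes > self.capacity_bytes:
            self.backups.popleft()
        expires_at = self._now() + self.lifetime_seconds
        self.backups.append((name, size_bytes, expires_at))


GB = 10**9
HOUR = 3600
ticks = [0]
hourly = CombinedWindow(100 * GB, 14 * 24 * HOUR, internal_clock=lambda: ticks[0])
for hour in range(6):
    ticks[0] = hour * HOUR
    hourly.store(f"h{hour}", 30 * GB)
after_space_eviction = [name for name, _, _ in hourly.backups]

# Later, two of the remaining snapshots expire although space is still free.
ticks[0] = 4 * HOUR + 14 * 24 * HOUR
hourly.purge_expired()
after_expiration = [name for name, _, _ in hourly.backups]
```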
While specific embodiments of the present invention have been shown and described, many modifications and variations could be made by one skilled in the art without departing from the scope of the invention. The above description serves to illustrate and not limit the particular invention in any way.
This application claims priority from U.S. provisional application Nos. 60/541,626 filed Feb. 4, 2004 and 60/542,011 filed Feb. 5, 2004, which are incorporated by reference as if fully set forth herein.
Number | Date | Country
---|---|---
20050193236 A1 | Sep 2005 | US

Number | Date | Country
---|---|---
60541626 | Feb 2004 | US
60542011 | Feb 2004 | US