The present invention is related to computer storage and in particular to backup and recovery of data.
Several methods are conventionally used to prevent the loss of data. Typically, data is backed up in a periodic manner (e.g., once a day) by a system administrator. Many systems are commercially available which provide backup and recovery of data; e.g., Veritas NetBackup, Legato/Networker, and so on. Another technique is known as volume shadowing. This technique produces a mirror image of data onto a secondary storage system as it is being written to the primary storage system.
Journaling is a backup and restore technique commonly used in database systems. An image of the data to be backed up is taken. Then, as changes are made to the data, a journal of the changes is maintained. Recovery of data is accomplished by applying the journal to an appropriate image to recover data at any point in time. Typical database systems, such as Oracle, can perform journaling.
Except for database systems, however, there are no ways to recover data at any point in time. Even for database systems, applying a journal takes time since the procedure includes:
Recovering data at any point in time addresses the following types of administrative requirements. For example, a typical request might be, “I deleted a file by mistake at around 10:00 am yesterday. I have to recover the file just before it was deleted.”
If the data is not in a database system, this kind of request cannot be conveniently, if at all, serviced. A need therefore exists for processing data in a manner that facilitates recovery of lost data. A need exists for being able to provide data processing that facilitates data recovery in user environments other than in a database application.
A storage system provides data storage services for users and their applications. The storage system performs additional data processing to provide for recovery of lost data, including performing snapshot operations and journaling. Snapshots and journal entries are stored separately from the production data volumes provided for the users. Older journal entries are cleared in order to make room for new journal entries. This involves updating a snapshot by applying one or more of the older journal entries to an appropriate snapshot. Subsequent recovery of lost data can be provided by accessing an appropriate snapshot and applying journal entries to the snapshot to reproduce the desired data state.
Aspects, advantages and novel features of the present invention will become apparent from the following description of the invention presented in conjunction with the accompanying drawings:
The backup and recovery system shown in
The host 110 typically will have one or more user applications (APP) 112 executing on it. These applications will read and/or write data to storage media contained in the data volumes 101 of storage system 100. Thus, applications 112 and the data volumes 101 represent the target resources to be protected. It can be appreciated that data used by the user applications can be stored in one or more data volumes.
In accordance with the invention, a journal group (JNLG) 102 is defined. The data volumes 101 are organized into the journal group. In accordance with the present invention, a journal group is the smallest unit of data volumes where journaling of the write operations from the host 110 to the data volumes is guaranteed. The associated journal records the order of write operations from the host to the data volumes in proper sequence. The journal data produced by the journaling activity can be stored in one or more journal volumes (JVOL) 106.
The host 110 also includes a recovery manager (RM) 111. This component provides high-level coordination of the backup and recovery operations. The recovery manager is discussed in further detail below.
The storage system 100 provides a snapshot (SS) 105 of the data volumes comprising a journal group. For example, the snapshot 105 is representative of the data volumes 101 in the journal group 102 at the point in time that the snapshot was taken. Conventional methods are known for producing the snapshot image. One or more snapshot volumes (SVOL) 107 are provided in the storage system which contain the snapshot data. A snapshot can be contained in one or more snapshot volumes. Though the disclosed embodiment illustrates separate storage components for the journal data and the snapshot data, it can be appreciated that other implementations can provide a single storage component for storing the journal data and the snapshot data.
A management table (MT) 108 is provided to store the information relating to the journal group 102, the snapshot 105, and the journal volume(s) 106.
A controller component 140 is also provided which coordinates the journaling of write operations and snapshots of the data volumes, and the corresponding movement of data among the different storage components 101, 106, 107. It can be appreciated that the controller component is a logical representation of a physical implementation which may comprise one or more sub-components distributed within the storage system 100.
The Journal Header 219 comprises an offset number (JH_OFS) 211. The offset number identifies a particular data volume 101 in the journal group 102. In this particular implementation, the data volumes are ordered as the 0th data volume, the 1st data volume, the 2nd data volume and so on. The offset numbers might be 0, 1, 2, etc.
A field in the Journal Header 219 contains an address (JH_ADR) 212. This is the starting address in the data volume (identified by the offset number 211) to which the write data is to be written. For example, the address can be represented as a block number (LBA, Logical Block Address).
A field in the Journal Header 219 stores a data length (JH_LEN) 213, which represents the data length of the write data. Typically it is represented as a number of blocks.
A field in the Journal Header 219 stores the write time (JH_TIME) 214, which represents the time when the write request arrives at the storage system 100. The write time can include the calendar date, hours, minutes, seconds and even milliseconds. This time can be provided by the disk controller 140 or by the host 110. For example, in a mainframe computing environment, two or more mainframe hosts share a timer and can provide the time when a write command is issued.
A sequence number (JH_SEQ) 215 is assigned to each write request. The sequence number is stored in a field in the Journal Header 219. Every sequence number within a given journal group 102 is unique. The sequence number is assigned to a journal entry when it is created.
A journal volume identifier (JH_JVOL) 216 is also stored in the Journal Header 219. The volume identifier identifies the journal volume 106 associated with the Journal Data 225. The identifier is indicative of the journal volume containing the Journal Data. It is noted that the Journal Data can be stored in a journal volume that is different from the journal volume which contains the Journal Header.
A journal data address (JH_JADR) 217 stored in the Journal Header 219 contains the beginning address of the Journal Data 225 in the associated journal volume 106 that contains the Journal Data.
Journal Header 219 and Journal Data 225 are contained in chronological order in their respective areas in the journal volume 106. Thus, the order in which the Journal Header and the Journal Data are stored in the journal volume matches the order of the assigned sequence numbers. As will be discussed below, an aspect of the present invention is that the Journal Header 219 and the Journal Data 225 wrap within their respective areas 210, 220.
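The Journal Header fields described above can be summarized in a small record type. The following is only an illustrative sketch in Python: the attribute names mirror the JH_ field names, while the Python types, the example values, and the use of a dataclass are assumptions, since the description does not specify field widths or encodings.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class JournalHeader:
    """One Journal Header 219; attribute names mirror the JH_ fields."""
    jh_ofs: int        # JH_OFS 211: index of the data volume within the journal group
    jh_adr: int        # JH_ADR 212: starting address (LBA) in that data volume
    jh_len: int        # JH_LEN 213: length of the write data, in blocks
    jh_time: datetime  # JH_TIME 214: time the write request arrived
    jh_seq: int        # JH_SEQ 215: sequence number, unique within the journal group
    jh_jvol: int       # JH_JVOL 216: journal volume that holds the Journal Data
    jh_jadr: int       # JH_JADR 217: starting address of the Journal Data in that volume


# Hypothetical example values, chosen only for illustration.
header = JournalHeader(
    jh_ofs=1, jh_adr=2048, jh_len=8,
    jh_time=datetime(2003, 6, 26, 10, 0, 0),
    jh_seq=42, jh_jvol=0, jh_jadr=4096,
)
```

Note that JH_ADR locates the write in the data volume, whereas JH_JVOL and JH_JADR together locate the copy of the write data in the journal volume.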
The management table 300 shown in
A journal attribute (GRATTR) 312 is associated with the journal group 102. In accordance with this particular implementation, two attributes are defined: MASTER and RESTORE. The MASTER attribute indicates the journal group is being journaled. The RESTORE attribute indicates that the journal group is being restored from a journal.
A journal status (GRSTS) 315 is associated with the journal group 102. There are two statuses: ACTIVE and INACTIVE.
The management table includes a field to hold a sequence counter (SEQ) 313. This counter serves as the source of sequence numbers used in the Journal Header 219. When creating a new journal, the sequence number 313 is read and assigned to the new journal. Then, the sequence number is incremented and written back into the management table.
The number (NUM_DVOL) 314 of data volumes 101 contained in a given journal group 102 is stored in the management table.
A data volume list (DVOL_LIST) 320 lists the data volumes in a journal group. In a particular implementation, DVOL_LIST is a pointer to the first entry of a data structure which holds the data volume information. This can be seen in
The management table includes a field to store the number of journal volumes (NUM_JVOL) 330 that are being used to contain the data (journal header and journal data) associated with a journal group 102.
As described in
The management table includes fields to store pointers to different parts of the data areas 210, 220 to facilitate wrapping. Fields are provided to identify where the next journal entry is to be stored. A field (JI_HEAD_VOL) 331 identifies the journal volume 106 that contains the Journal Header Area 210 which will store the next new Journal Header 219. A field (JI_HEAD_ADR) 332 identifies an address on the journal volume of the location in the Journal Header Area where the next Journal Header will be stored. The journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 335. A field (JI_DATA_ADR) 336 identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331, 332, 335, 336.
The management table also includes fields which identify the “oldest” journal entry. The use of this information will be described below. A field (JO_HEAD_VOL) 333 identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219. A field (JO_HEAD_ADR) 334 identifies the address within the Journal Header Area of the location of the journal header of the oldest journal. A field (JO_DATA_VOL) 337 identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 338.
The management table includes a list of journal volumes (JVOL_LIST) 340 associated with a particular journal group 102. In a particular implementation, JVOL_LIST is a pointer to a data structure of information for journal volumes. As can be seen in
The management table includes a list (SS_LIST) 350 of snapshot images 105 associated with a given journal group 102. In this particular implementation, SS_LIST is a pointer to snapshot information data structures, as indicated in
Each snapshot information data structure also includes a list of snapshot volumes 107 (
Further in accordance with the invention, a single sequence of numbers (SEQ) 313 is associated with each of one or more snapshots and journal entries as they are created. The purpose of associating the same sequence of numbers with both the snapshots and the journal entries will be discussed below.
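A minimal sketch of a single counter serving both snapshots and journal entries is given here in Python. The class name `SequenceCounter`, the starting value of zero, and the event list are assumptions introduced for illustration; only the read, assign, increment and write-back behavior comes from the description of SEQ 313.

```python
class SequenceCounter:
    """Stands in for the SEQ 313 field of the management table."""

    def __init__(self, start=0):
        self.seq = start

    def assign(self):
        assigned = self.seq      # read the current value
        self.seq = assigned + 1  # increment and write back
        return assigned


# Journal entries and snapshots draw from the same counter, so their
# numbers interleave in creation order.
counter = SequenceCounter()
events = ["journal", "journal", "snapshot", "journal"]
numbered = [(kind, counter.assign()) for kind in events]
```

In this example the snapshot receives a number that falls between those of the journal entries created before and after it.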
Continuing with
In a step 420, the recovery manager 111 will initiate the journaling process. Suitable communication(s) are made to the storage system 100 to perform journaling. In a step 425, the storage system will make a journal entry (also referred to as an “AFTER journal”) for each write operation that issues from the host 110.
With reference to
The fields JI_DATA_VOL 335 and JI_DATA_ADR 336 in the management table identify the journal volume and the beginning of the Journal Data Area 220 for storing the data associated with the write operation. The JI_DATA_VOL and JI_DATA_ADR fields are copied to JH_JVOL 216 and to JH_JADR 217, respectively, of the Journal Header, thus providing the Journal Header with a pointer to its corresponding Journal Data. The data of the write operation is stored.
The JI_HEAD_VOL 331 and JI_HEAD_ADR 332 fields are updated to point to the next Journal Header 219 for the next journal entry. This involves taking the next contiguous Journal Header entry in the Journal Header Area 210. Likewise, the JI_DATA_ADR field (and perhaps JI_DATA_VOL field) is updated to reflect the beginning of the Journal Data Area for the next journal entry. This involves advancing to the next available location in the Journal Data Area. These fields therefore can be viewed as pointing to a list of journal entries. Journal entries in the list are linked together by virtue of the sequential organization of the Journal Headers 219 in the Journal Header Area 210.
When the end of the Journal Header Area 210 is reached, the Journal Header 219 for the next journal entry wraps to the beginning of the Journal Header Area. The Journal Data 225 wraps similarly within the Journal Data Area 220. To prevent overwriting earlier journal entries, the present invention provides for a procedure to free up entries in the journal volume 106. This aspect of the invention is discussed below.
For the very first journal entry, the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 are set to contain the contents of their corresponding “JI_” fields. As will be explained, the “JO_” fields point to the oldest journal entry. Thus, as new journal entries are made, the “JO_” fields do not advance while the “JI_” fields do advance. Update of the “JO_” fields is discussed below.
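The interplay of the “JI_” and “JO_” pointers with wrapping can be sketched as a ring of header slots. This Python sketch makes several simplifying assumptions that are not part of the description: a single journal volume (so the _VOL fields are omitted), fixed-size header slots addressed by index, a `count` field for bookkeeping, and only the Journal Header Area is modeled.

```python
class JournalHeaderArea:
    """Ring of fixed-size header slots in one journal volume (assumed)."""

    def __init__(self, slots):
        self.slots = [None] * slots
        self.ji_head_adr = 0  # JI_HEAD_ADR: slot for the next new header
        self.jo_head_adr = 0  # JO_HEAD_ADR: slot of the oldest header
        self.count = 0        # live entries (bookkeeping assumption)

    def append(self, header):
        if self.count == len(self.slots):
            # Never overwrite an entry that has not been freed.
            raise RuntimeError("journal full: free the oldest entries first")
        self.slots[self.ji_head_adr] = header
        # Advance to the next contiguous slot, wrapping at the end of the area.
        self.ji_head_adr = (self.ji_head_adr + 1) % len(self.slots)
        self.count += 1

    def free_oldest(self):
        if self.count == 0:
            raise RuntimeError("journal empty")
        header = self.slots[self.jo_head_adr]
        self.slots[self.jo_head_adr] = None
        self.jo_head_adr = (self.jo_head_adr + 1) % len(self.slots)
        self.count -= 1
        return header


area = JournalHeaderArea(3)
for name in ("a", "b", "c"):
    area.append(name)
oldest = area.free_oldest()  # frees slot 0
area.append("d")             # wraps around and reuses slot 0
```

Both pointers start at the same slot, which corresponds to setting the “JO_” fields from the “JI_” fields for the very first journal entry. Thereafter only the “JI_” pointer moves on each write, and the “JO_” pointer moves only when entries are freed.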
Continuing with the flowchart of
The snapshot is stored in one (or more) snapshot volumes (SVOL) 107. A suitable amount of memory is allocated for fields 355-357. The information relating to the SVOLs for storing the snapshot are then stored into the fields 355-357. If additional volumes are required to store the snapshot, then additional memory is allocated for fields 355-357.
Recovering data typically requires recovering the data state of at least a portion of the data volumes 101 at a specific time. Generally, this is accomplished by applying one or more journal entries to a snapshot that was taken earlier in time relative to the journal entries. In the disclosed illustrative embodiment, the sequence number SEQ 313 is incremented each time it is assigned to a journal entry or to a snapshot. Therefore, it is a simple matter to identify which journal entries can be applied to a selected snapshot; i.e., those journal entries whose associated sequence numbers (JH_SEQ, 215) are greater than the sequence number (SS_SEQ, 351) associated with the selected snapshot.
For example, the administrator may specify some point in time, presumably a time that is earlier than the time (the “target time”) at which the data in the data volume was lost or otherwise corrupted. The time field SS_TIME 352 for each snapshot is searched until a time earlier than the target time is found. Next, the Journal Headers 219 in the Journal Header Area 210 are searched, beginning from the “oldest” Journal Header. The oldest Journal Header can be identified by the “JO_” fields 333, 334, 337, and 338 in the management table. The Journal Headers are searched sequentially in the area 210 for the first header whose sequence number JH_SEQ 215 is greater than the sequence number SS_SEQ 351 associated with the selected snapshot. The selected snapshot is incrementally updated by applying each journal entry, one at a time, to the snapshot in sequential order, thus reproducing the sequence of write operations. This continues as long as the time field JH_TIME 214 of the journal entry is prior to the target time. The update ceases with the first journal entry whose time field 214 is past the target time.
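The recovery procedure just described can be sketched as follows in Python. The record types, the representation of times as plain numbers, and the one-block-per-entry data model are assumptions made for brevity. Two further choices are assumptions rather than requirements of the description: the most recent snapshot preceding the target time is selected, and an entry whose time equals the target time is not applied.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    ss_seq: int     # SS_SEQ 351
    ss_time: float  # SS_TIME 352
    blocks: dict    # address -> data (simplified snapshot image)


@dataclass
class JournalEntry:
    jh_seq: int     # JH_SEQ 215
    jh_time: float  # JH_TIME 214
    jh_adr: int     # JH_ADR 212
    data: bytes     # Journal Data, one block per entry (assumed)


def recover(snapshots, journal, target_time):
    """Rebuild the data state as of target_time; journal is ordered oldest first."""
    candidates = [s for s in snapshots if s.ss_time < target_time]
    if not candidates:
        raise ValueError("no snapshot precedes the target time")
    base = max(candidates, key=lambda s: s.ss_time)
    image = dict(base.blocks)  # work on a copy of the snapshot image
    for entry in journal:
        if entry.jh_seq <= base.ss_seq:
            continue           # already reflected in the snapshot
        if entry.jh_time >= target_time:
            break              # first entry at or past the target time
        image[entry.jh_adr] = entry.data
    return image


snapshots = [
    Snapshot(ss_seq=10, ss_time=100.0, blocks={0: b"old"}),
    Snapshot(ss_seq=20, ss_time=200.0, blocks={0: b"mid"}),
]
journal = [
    JournalEntry(jh_seq=11, jh_time=110.0, jh_adr=0, data=b"a"),
    JournalEntry(jh_seq=21, jh_time=210.0, jh_adr=1, data=b"b"),
    JournalEntry(jh_seq=22, jh_time=220.0, jh_adr=0, data=b"deleted"),
]
restored = recover(snapshots, journal, target_time=215.0)
```

Here the unwanted write at time 220.0 is never applied, so the restored image reflects the state just before it.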
In accordance with one aspect of the invention, a single snapshot is taken. All journal entries subsequent to that snapshot can then be applied to reconstruct the data state at a given time. In accordance with another aspect of the present invention, multiple snapshots can be taken. This is shown in
If the free space falls below a predetermined threshold, then in a step 720 some of the journal entries are applied to a snapshot to update the snapshot. In particular, the oldest journal entry(ies) are applied to the snapshot.
Referring to
As an observation, it can be appreciated by those of ordinary skill, that the sequence numbers will eventually wrap, and start counting from zero again. It is well within the level of ordinary skill to provide a suitable mechanism for keeping track of this when comparing sequence numbers.
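One possible mechanism, offered only as an example and not taken from the description, is serial-number arithmetic of the kind defined in RFC 1982. The sketch below assumes a 32-bit counter and assumes that fewer than 2^31 sequence numbers ever separate the two values being compared; the function name `seq_after` is hypothetical.

```python
SEQ_BITS = 32                 # assumed counter width
SEQ_MOD = 1 << SEQ_BITS
HALF = 1 << (SEQ_BITS - 1)


def seq_after(a, b):
    """Return True if sequence number a was assigned after b, allowing for wrap."""
    return a != b and ((a - b) % SEQ_MOD) < HALF
```

With this comparison, a small number assigned just after the counter wraps is still treated as later than a large number assigned just before the wrap.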
Continuing with
Thus, in step 730, if the threshold for stopping the process is met (i.e., free space exceeds threshold), then the process stops. Otherwise, step 720 is repeated for the next oldest journal entry. Steps 730 and 720 are repeated until the free space level meets the threshold criterion used in step 730.
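The loop formed by steps 720 and 730 can be sketched as follows in Python. Several details are assumptions: free space is measured as a count of journal entries, separate thresholds are used for starting and stopping (they could equally be the same value), each entry carries a single block, and the snapshot's sequence number is advanced to that of each applied entry so that later comparisons remain consistent.

```python
from collections import deque


def reclaim(journal, snapshot, capacity, start_below, stop_at):
    """Apply the oldest journal entries to the snapshot until enough space is free.

    journal:  deque of (seq, address, data) tuples, oldest first
    snapshot: dict with keys "seq" and "blocks"
    Returns the number of journal entries applied and freed.
    """
    if capacity - len(journal) >= start_below:
        return 0                              # enough free space; nothing to do
    applied = 0
    while journal and capacity - len(journal) < stop_at:
        seq, address, data = journal.popleft()  # oldest entry
        snapshot["blocks"][address] = data      # apply it to the snapshot
        snapshot["seq"] = seq                   # assumed bookkeeping
        applied += 1
    return applied


journal = deque([(11, 0, b"a"), (12, 1, b"b"), (13, 0, b"c"), (14, 2, b"d")])
snapshot = {"seq": 10, "blocks": {}}
freed = reclaim(journal, snapshot, capacity=5, start_below=2, stop_at=3)
```

In this example one slot is free at the start, which is below the starting threshold, so the two oldest entries are folded into the snapshot before the stopping threshold is met.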
If such a snapshot can be found in step 721, then the earlier journal entries can be removed without having to apply them to a snapshot. Thus, in a step 722, the “JO_” fields (JO_HEAD_VOL 333, JO_HEAD_ADR 334, JO_DATA_VOL 337, and JO_DATA_ADR 338) are simply moved to a point in the list of journal entries that is later in time than the selected snapshot. If no such snapshot can be found, then in a step 723 the oldest journal entry is applied to a snapshot that is earlier in time than the oldest journal entry, as discussed for step 720.
Still another alternative for step 721 is simply to select the most recent snapshot. All the journal entries whose sequence numbers are less than that of the most recent snapshot can be freed. Again, this simply involves updating the “JO_” fields so they point to the first journal entry whose sequence number is greater than that of the most recent snapshot. Recall that an aspect of the invention is being able to recover the data state for any desired point in time. This can be accomplished by storing as many journal entries as possible and then applying the journal entries to a snapshot to reproduce the write operations. This last embodiment has the potential effect of removing large numbers of journal entries, thus reducing the range of time within which the data state can be recovered. Nevertheless, for a particular configuration it may be desirable to remove large numbers of journal entries for a given operating environment.
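Freeing journal entries that precede a snapshot amounts to finding the first entry whose sequence number exceeds that of the snapshot and moving the “JO_” fields there. A minimal Python sketch follows; the function name is hypothetical, the journal is viewed as a list of sequence numbers ordered oldest first, and sequence-number wrap is ignored.

```python
def entries_to_free(journal_seqs, snapshot_seq):
    """Return how many of the oldest entries precede the snapshot.

    The result is the index of the first entry that must be kept, i.e. the
    position to which the "JO_" fields would be moved.
    """
    for index, seq in enumerate(journal_seqs):
        if seq > snapshot_seq:
            return index
    return len(journal_seqs)  # every entry predates the snapshot
```

For example, with journal sequence numbers 11, 12, 14, 15 and a snapshot numbered 13, the first two entries can be freed without being applied to any snapshot.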
It can be appreciated that the foregoing described steps can be embodied entirely in the controller 140 (e.g., a disk controller). This can take on the form of pure software, custom logic, or some suitable combination of software and hardware, depending on the particular implementation. More generally, the foregoing disclosed embodiments typically can be provided using a combination of hardware and software implementations. One of ordinary skill can readily appreciate that the underlying technical solution will be determined based on factors including but not limited or restricted to system cost, system performance, legacy software and legacy hardware, operating environment, and so on. The described current and contemplated embodiments can be readily reduced to specific implementations without undue experimentation by those of ordinary skill in the relevant art.
The present application is a Continuation Application of U.S. application Ser. No. 11/802,278, filed on May 22, 2007 (now U.S. Pat. No. 7,783,848), which is a Continuation Application of U.S. application Ser. No. 11/408,831, filed Apr. 20, 2006 (now U.S. Pat. No. 7,243,197), which is a Continuation Application of U.S. application Ser. No. 10/608,391, filed Jun. 26, 2003 (now U.S. Pat. No. 7,111,136), which are hereby incorporated by reference in their entirety for all purposes. This application is related to the following commonly owned and U.S. applications: application Ser. No. 10/621,791, filed Jul. 16, 2003, abandoned, and application Ser. No. 10/627,507, filed Jul. 25, 2003, abandoned.
Number | Date | Country | |
---|---|---|---|
20100274985 A1 | Oct 2010 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 11802278 | May 2007 | US |
Child | 12834557 | | US |
Parent | 11408831 | Apr 2006 | US |
Child | 11802278 | | US |
Parent | 10608391 | Jun 2003 | US |
Child | 11408831 | | US |