1. Field of the Invention
The present invention generally relates to storage technology and, more specifically, to management of backup data.
2. Background of the Invention
Recently, a variety of backup data protection methods, such as the continuous data protection (CDP) method and the copy-on-write snapshot method, have been developed. Storage systems having storage system functions and backup software, as well as the corresponding storage devices (e.g. tape library, NAS and VTL) and media (e.g. tape and disk), have become available for backup and recovery of data. As known to persons of skill in the art, a backup of production data may be taken and the backup data may be maintained in a storage system using a variety of methods (backup data modes).
Each such mode of backup and data protection has a specific set of characteristics, such as media cost, RTO (Recovery time objective) and RPO (Recovery point objective). The value of the RTO parameter, for example, determines how fast the backup data can be restored from the backup media. The RPO, on the other hand, determines how granular the recovery points are.
As will be clearly understood by persons of skill in the art, the value of a specific set of backup data to the business operation of the user who maintains it may change from time to time. Therefore, in order to save costs associated with the storage media used, and to optimize the aforesaid RTO and RPO parameters taking into account the user's actual needs, the backup data mode should also be changed (transitioned) in response to changes in circumstances and in the value of the backup data to the user's business operation. Moreover, it is desirable to have a unified management environment for managing backup data in order to reduce the cost of backup and recovery data management and the operating costs associated with using the various backup methods and systems mentioned above.
However, the conventional industry approaches are deficient in their ability to provide systems for efficient backup data management, which would allow seamless change of backup data mode.
The inventive methodology is directed to methods and systems that substantially obviate one or more of the above and other problems associated with conventional techniques for backup data management.
In accordance with one aspect of the inventive methodology, there is provided a computerized storage system including at least one storage system configured to store backup data and a management computer including a central processing unit and a memory. The management computer is operatively coupled to the at least one storage system via an interconnect. This management computer is configured to maintain backup management information associated with the backup data stored in the at least one storage system; detect an occurrence of a condition for transitioning the backup data; develop a transition plan for transitioning the backup data in accordance with a predetermined scenario; initiate a transition based on the developed plan; and update the backup management information based on the transition.
In accordance with another aspect of the inventive methodology, there is provided a method for transitioning backup data stored in at least one backup data storage system. The inventive method involves maintaining backup management information associated with the backup data stored in the at least one storage system; detecting an occurrence of a condition for transitioning the backup data; developing a transition plan for transitioning the backup data in accordance with a predetermined scenario; performing a transition based on the developed plan; and updating the backup management information based on the transition.
In accordance with a further aspect of the inventive methodology, there is provided a computer readable medium embodying a set of instructions, which, when executed by one or more processors, causes the one or more processors to perform a method for transitioning backup data stored in at least one backup data storage system. The aforesaid method involves maintaining backup management information associated with the backup data stored in at least one storage system; detecting an occurrence of a condition for transitioning the backup data; developing a transition plan for transitioning the backup data in accordance with a predetermined scenario; performing a transition based on the developed plan; and updating the backup management information based on the transition.
In accordance with yet a further aspect of the inventive methodology, there is provided a computerized storage system having continuous data protection capability. The inventive system includes a journal volume configured to store a journal; and a snapshot volume configured to store multiple snapshots. The inventive system is further configured to maintain a first interval between snapshots corresponding to an older journal record to be longer than a second interval between snapshots corresponding to a newer journal record.
In accordance with yet a further aspect of the inventive methodology, there is provided a method for managing a mode of continuous data protection. The inventive method involves storing a journal; and storing multiple snapshots. In accordance with the inventive method, a first interval between snapshots corresponding to an older journal record is longer than a second interval between snapshots corresponding to a newer journal record.
Additional aspects related to the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. Aspects of the invention may be realized and attained by means of the elements and combinations of various elements and aspects particularly pointed out in the following detailed description and the appended claims.
It is to be understood that both the foregoing and the following descriptions are exemplary and explanatory only and are not intended to limit the claimed invention or application thereof in any manner whatsoever.
The accompanying drawings, which are incorporated in and constitute a part of this specification, exemplify the embodiments of the present invention and, together with the description, serve to explain and illustrate principles of the inventive technique. Specifically:
In the following detailed description, reference will be made to the accompanying drawings, in which identical functional elements are designated with like numerals. The aforementioned accompanying drawings show by way of illustration, and not by way of limitation, specific embodiments and implementations consistent with principles of the present invention. These implementations are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other implementations may be utilized and that structural changes and/or substitutions of various elements may be made without departing from the scope and spirit of the present invention. The following detailed description is, therefore, not to be construed in a limited sense. Additionally, the various embodiments of the invention as described may be implemented in the form of software running on a general purpose computer, in the form of specialized hardware, or in a combination of software and hardware.
One embodiment of the invention obviates the deficiencies of the prior art and achieves the aforesaid cost reductions and the optimization of the RTO and RPO parameters corresponding to change in conditions and the current value of the backup data. In one embodiment, the inventive system incorporates a management computer, one or more host computers, a backup server and storage systems. Backup data generated in the aforesaid system may be stored in various storage devices such as disk arrays, tape libraries, VTL (virtual tape library) and VDL (virtual disk library having block interface), which may be implemented using various technologies. The management information associated with the backup data is maintained by the management computer. The management computer conducts transition of the backup data mode by first detecting that a condition to perform the transition has occurred, developing a plan of transition according to a predetermined scenario, performing (initiating) the transition based on the developed plan and updating the management information of the backup data in accordance with the performed transition. By keeping the management information up to date, the management computer can provide a unified method for restoring necessary data from the backup data even if the transition (moving and/or change) of the backup data mode has been performed.
An embodiment of the invention provides methods for controlling intervals between snapshots for a CDP system. By applying the inventive system and methods mentioned above, cost reduction and optimization of the RTO and RPO parameters corresponding to change in conditions and the value of the backup data are successfully achieved.
The main processor 111 performs various processes associated with the array controller 110. The main processor 111 and other components use the following information stored in the memory 200: consistency information 201, volume information 202, mapping information 203 and pool information 204. The main processor 111 performs the aforesaid processes by executing various programs stored in the memory 200.
One or more hosts 500, the backup server 510 and the management computer 520 are connected to the host interface 113 via the SAN 901 (e.g. Fibre Channel, iSCSI (IP)). The VDL (virtual disk library) 710 is connected to the external disk controller 114 via the SAN 901. The host 500, the backup server 510 and the management computer 520 are interconnected via the LAN 903, which may be implemented, for example, using an IP-based network technology.
The management computer 520 is also connected to the array controller 110 via out-of-band Network 902 (e.g. IP). To have capability as computers, Host 500, Backup server 510 and management computer 520 have resources such as processor and memory (not shown in
Volumes (Logical Units) provided by the storage system 100 are produced from a collection of areas in HDDs. They may be protected by storing parity code (i.e. by a RAID configuration). One or more hosts 500 can store data in a volume and utilize the data in the volume. In other words, the host 500 writes data to the volume and reads data from the volume.
Processor 521 performs various processes associated with the management computer 520. The processor 521 and other components use the following information stored in the memory 530: backup data information 531; scenario information 532; term information 533; service information 534 and amount information 535.
The processor 521 performs a backup data management process by executing the OS 536 and the backup data management program 537, which are stored in the memory 530. The detailed description of the backup data management process will be provided below.
As shown in
As shown in
At step 1002, the management computer 520 develops a plan of the transition according to a scenario provided by one or more users by means of the management computer 520.
At step 1003, the management computer 520 performs the transition based on the developed plan.
The details of each of the above steps will be provided below.
Specifically, at step 1101, the management computer 520 checks term information 533.
At step 1102, if there is backup data whose age exceeds the duration of the current term for the backup data, the process proceeds to step 1103. Otherwise, the process proceeds to step 1104. This check is performed by comparing the predetermined duration with the difference between the current time and the beginning time mentioned above.
At step 1103, the management computer 520 determines that the transition of the backup data should be performed.
At step 1104, the management computer 520 determines that no action should be taken. In this case, steps 1002 and 1003 are not executed.
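By way of illustration only, the term-based check of steps 1101 through 1104 may be sketched as follows. The record layout (`TermEntry`, `term_start`, `term_duration`) and the function names are assumptions introduced for this example; the specification does not prescribe a format for the term information 533.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TermEntry:
    """Hypothetical row of the term information 533."""
    backup_id: str
    term_start: datetime      # beginning time of the current term
    term_duration: timedelta  # predetermined duration of the current term


def needs_transition(entry: TermEntry, now: datetime) -> bool:
    """Step 1102: compare the elapsed time with the predetermined duration."""
    return (now - entry.term_start) > entry.term_duration


def detect_by_term(term_info: list[TermEntry], now: datetime) -> list[str]:
    """Steps 1101-1104: return the IDs of backup data to transition.

    An empty list corresponds to step 1104 (no action is taken).
    """
    return [e.backup_id for e in term_info if needs_transition(e, now)]


term_info = [
    TermEntry("db-daily", datetime(2024, 1, 1), timedelta(days=7)),
    TermEntry("db-weekly", datetime(2024, 1, 1), timedelta(days=30)),
]
due = detect_by_term(term_info, now=datetime(2024, 1, 10))
print(due)  # ['db-daily']
```

In this sketch an empty result means that the plan development and transition steps are skipped for the current check.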
At step 1201, the management computer 520 checks service information 534.
At step 1202, the management computer 520 acquires the aforesaid rating of the backup data from the service status associated with the backup data.
At step 1203, if there is backup data that falls below or exceeds the threshold, the process proceeds to step 1204. Otherwise, the process proceeds to step 1205.
At step 1204, the management computer 520 determines that a transition of the backup data needs to be performed.
At step 1205, the management computer 520 determines that no action should be taken. In this case, steps 1002 and 1003 are not executed.
At step 1301, the management computer 520 checks the amount information 535.
At step 1302, if the amount of the backup data falls below or exceeds the threshold, the process proceeds to step 1303. Otherwise, the process proceeds to step 1304.
At step 1303, the management computer 520 decides to perform transition of the backup data.
At step 1304, the management computer 520 determines that no action should be performed. In this case, steps 1002 and 1003 are not executed.
At step 1401, the management computer 520 references the scenario information 532 and the backup data information 531 with respect to the backup data to be transitioned.
At step 1402, the management computer 520 determines a manner of the transition based on the scenario information 532.
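The threshold tests of steps 1203 and 1302 share the same shape: a measured value (a rating or an amount) either stays within bounds or crosses one of them. A minimal sketch follows, assuming a threshold with an optional lower and upper bound; the `Threshold` type, the example volume names and the gigabyte figures are hypothetical and are not taken from the service information 534 or the amount information 535.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Hypothetical threshold pair; either bound may be absent."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def crossed(self, value: float) -> bool:
        """True if the value falls below the lower or exceeds the upper bound."""
        below = self.lower is not None and value < self.lower
        above = self.upper is not None and value > self.upper
        return below or above


def detect_by_threshold(values: dict[str, float], threshold: Threshold) -> list[str]:
    """Return the IDs of backup data whose value crossed the threshold."""
    return sorted(k for k, v in values.items() if threshold.crossed(v))


# Example: amounts of backup data in GB (illustrative numbers only).
amounts_gb = {"db-daily": 820.0, "logs": 40.0, "mail": 300.0}
to_transition = detect_by_threshold(amounts_gb, Threshold(lower=50.0, upper=500.0))
print(to_transition)  # ['db-daily', 'logs']
```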
G.1. Basic process of Journaling
The production volumes 620 constitute a consistency group 610. A generate journal (JNL) function 810 in the storage system 100 obtains update data that is transferred to update the data in the production volumes 620, assigns a sequence number (incremental number) to each journal record per each consistency group 610, and records the update data as a journal record in the journal volumes 630 that are assigned to each consistency group 610. The consistency group information 201 described in
As shown in
Moreover, the make snapshot function 830 in the storage system 100 obtains snapshot of each base volume 640 at predetermined intervals and updates the volume information 202. As described in
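The journaling described in this section, in which each update to a production volume is recorded with a sequence number assigned per consistency group, may be sketched as follows. This is an illustration under assumptions: the class and field names are hypothetical, and the in-memory list merely stands in for the journal volumes 630.

```python
from dataclasses import dataclass, field


@dataclass
class JournalRecord:
    sequence: int   # incremental number within the consistency group
    volume_id: str  # production volume that was updated
    offset: int     # start of the updated area (hypothetical unit: blocks)
    data: bytes     # the update data itself


@dataclass
class ConsistencyGroup:
    group_id: str
    next_sequence: int = 1
    journal: list[JournalRecord] = field(default_factory=list)

    def record_update(self, volume_id: str, offset: int, data: bytes) -> JournalRecord:
        """Assign the next sequence number and append the update to the journal."""
        record = JournalRecord(self.next_sequence, volume_id, offset, data)
        self.next_sequence += 1
        self.journal.append(record)
        return record


group = ConsistencyGroup("cg-1")
group.record_update("vol-a", 0, b"first")
group.record_update("vol-b", 8, b"second")
group.record_update("vol-a", 0, b"third")
print([r.sequence for r in group.journal])  # [1, 2, 3]
```

Because the sequence counter belongs to the group rather than to a single volume, the order of updates across all volumes of the group is preserved.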
G.2. Basic process of Copy on Write Snapshot
For example, in
To achieve the above, the array controller 110 uses the mapping information 203 and the pool information 204.
At step 1801, the array controller 110 checks target volume and target area of write access (update).
At step 1802, the array controller 110 checks the mapping information 203. If another write access to the target segment has occurred after the latest snapshot (point in time), the process proceeds to step 1805. If not, the process proceeds to step 1803.
At step 1803, the array controller 110 assigns a new chunk to store the old data. To do this, the array controller 110 updates the mapping information 203 and the pool information 204.
At step 1804, the array controller 110 copies the data in the target segment to the new chunk.
At step 1805, the array controller 110 stores the new data to the segment.
At step 1806, if the array controller 110 has checked all segments of the target area, the process ends. If not, the array controller 110 advances the check to the next segment (step 1807).
In responding to a read access for a snapshot, the array controller 110 refers to the mapping information 203 to find the chunk holding the old data and the segment storing the data that are needed to reconstruct the specified snapshot. The array controller 110 then provides the required data.
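The copy-on-write write path of steps 1801 through 1807 and the corresponding snapshot read may be sketched as below. The data structures are assumptions made for the example: `mapping` plays the role of the mapping information 203 (one segment-to-chunk table per snapshot) and `pool` plays the role of the chunks tracked by the pool information 204.

```python
class CowVolume:
    """Minimal copy-on-write volume; names and layout are illustrative only."""

    def __init__(self, segments: int, segment_size: int = 4):
        self.segments = [bytes(segment_size) for _ in range(segments)]
        self.pool: list[bytes] = []              # chunks holding preserved old data
        self.mapping: list[dict[int, int]] = []  # per snapshot: segment -> chunk index

    def take_snapshot(self) -> int:
        """Start a new point in time; no data is copied yet."""
        self.mapping.append({})
        return len(self.mapping) - 1

    def write(self, segment: int, data: bytes) -> None:
        if self.mapping:
            latest = self.mapping[-1]
            if segment not in latest:                     # step 1802
                self.pool.append(self.segments[segment])  # steps 1803 and 1804
                latest[segment] = len(self.pool) - 1
        self.segments[segment] = data                     # step 1805

    def write_area(self, first_segment: int, blocks: list[bytes]) -> None:
        for i, block in enumerate(blocks):                # steps 1806 and 1807
            self.write(first_segment + i, block)

    def read_snapshot(self, snapshot: int, segment: int) -> bytes:
        """Return the segment as it was when the snapshot was taken."""
        for saved in self.mapping[snapshot:]:
            if segment in saved:
                return self.pool[saved[segment]]
        return self.segments[segment]  # never overwritten since the snapshot


vol = CowVolume(segments=2)
vol.write(0, b"AAAA")
s0 = vol.take_snapshot()
vol.write(0, b"BBBB")
s1 = vol.take_snapshot()
vol.write_area(0, [b"CCCC", b"EEEE"])
vol.write(0, b"DDDD")  # second write after s1: no additional copy is made
print(vol.read_snapshot(s0, 0), vol.read_snapshot(s1, 0))  # b'AAAA' b'BBBB'
```

Only the first write to a segment after a snapshot consumes a chunk, which is why the old data is copied at most once per segment per point in time.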
Moreover, the apply journal function 840 applies (writes) journal records, from the journal record corresponding to the selected snapshot to the journal record corresponding to the indicated point in time, in the order of the sequence numbers in the metadata 634. The snapshot nearest to the target point in time should be selected to minimize the amount of journal to be applied. The apply journal function 840 can recognize the journal to be applied by referring to the volume information 202. After the journal has been applied, the apply journal function 840 changes the status of the snapshot to accessible (read/write access is allowed).
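A minimal sketch of this recovery step is given below. It assumes that the journal holds after-images that are applied forward in sequence order, so the nearest usable snapshot is the latest one at or before the target sequence number; the tuple layouts and the function name `restore_to` are hypothetical.

```python
from bisect import bisect_right


def restore_to(snapshots, journal, target_seq):
    """Rebuild the volume image at target_seq.

    snapshots: list of (sequence, image) sorted by sequence, where image is a
               dict mapping block number to data.
    journal:   list of (sequence, block, data) sorted by sequence.
    """
    sequences = [seq for seq, _ in snapshots]
    index = bisect_right(sequences, target_seq) - 1  # latest snapshot <= target
    if index < 0:
        raise ValueError("no snapshot at or before the target point in time")
    base_seq, image = snapshots[index]
    restored = dict(image)  # leave the stored snapshot untouched
    applied = 0
    for seq, block, data in journal:
        if base_seq < seq <= target_seq:
            restored[block] = data
            applied += 1
    return restored, applied


journal = [(1, 0, "a1"), (2, 1, "b2"), (3, 0, "a3"), (4, 1, "b4"), (5, 0, "a5")]
snapshots = [(0, {}), (3, {0: "a3", 1: "b2"})]
image, applied = restore_to(snapshots, journal, target_seq=4)
print(image, applied)  # {0: 'a3', 1: 'b4'} 1
```

Starting from the snapshot at sequence 3 requires one journal record to be applied, whereas starting from the snapshot at sequence 0 would require four, which illustrates why the nearest snapshot is preferred.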
Then, as described in
At step 1901, the management computer 520 instructs the array controller 110 to change the mode of the backup data from CDP to snapshot.
At step 1902, the array controller 110 releases the journal of the backup data specified by the instruction while maintaining the snapshots of the backup data. That is, the array controller 110 updates the volume information 202.
At step 1903, the array controller 110 notifies the management computer 520 of the completion of the change.
At step 1904, the management computer 520 updates the backup data information 531 according to the change.
At step 2001, the management computer 520 instructs the backup server 510 to change the mode of the backup data from snapshot to tape backup (or backup using VTL). A VTL (Virtual Tape Library) is a disk-based storage system that emulates a tape library by supporting commands for tape devices.
At step 2002, the backup server 510 reads the specified backup data from the snapshot specified by the instruction. After that, the backup server 510 writes the backup data to the Tape Library (or VTL) 700.
At step 2003, the backup server 510 informs the management computer 520 of the completion of the backup data moving operation.
At step 2004, the management computer 520 updates the backup data information 531 according to the migration of the backup data.
At step 2005, the management computer 520 instructs the array controller 110 to release the snapshots associated with the backup data.
At step 2006, the array controller 110 releases the snapshots regarding the backup data. That is, the array controller 110 makes the related chunks free (unused) by checking and updating the mapping information 203 and the pool information 204. The array controller 110 also updates the volume information 202.
At step 2007, the array controller 110 notifies the management computer 520 of the completion of the change.
At step 2008, the management computer 520 updates the backup data information 531 according to the change.
At step 2101, the management computer 520 instructs the array controller 110 to change the mode of the backup data from snapshot to VDL. A VDL (Virtual Disk Library) is a tape-based storage system that emulates a disk by supporting block access commands for disk devices.
At step 2102, the array controller 110 transfers the specified backup data from the snapshot specified by the instruction to the VDL via the external disk controller 114.
At step 2103, the array controller 110 releases the snapshots regarding the backup data. That is, the array controller 110 makes the related chunks free (unused) by checking and updating the mapping information 203 and the pool information 204. The array controller 110 also updates the volume information 202.
At step 2104, the array controller 110 notifies the management computer 520 of the completion of the change.
At step 2105, the management computer 520 updates the backup data information 531 according to the change.
At step 2201, the management computer 520 instructs the array controller 110 to change the interval of the snapshot for CDP regarding the backup data.
At step 2202, the array controller 110 releases snapshots belonging to a period in the distant past. That is, the array controller 110 makes the related chunks free (unused) by checking and updating the mapping information 203 and the pool information 204. The array controller 110 also updates the volume information 202.
At step 2203, the array controller 110 notifies the management computer 520 of the completion of the change.
At step 2204, the management computer 520 updates the backup data information 531 according to the change.
By the above process, in regard to previously mentioned snapshots used for CDP, interval of the snapshots becomes larger as the snapshot gets old as described in
The above process may be performed by the array controller 110 in the storage system 100 based on the age of the backup data maintained with CDP, using the volume information 202. That is, the above control may be performed automatically by the array controller 110 instead of by the management computer 520.
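One possible way to select the snapshots to release, so that older snapshots end up more widely spaced than newer ones, is sketched below. The tier table of (minimum age, interval) pairs and the use of plain numbers for time (for example, hours) are assumptions made for the example and are not part of the described embodiment.

```python
def snapshots_to_release(times, now, tiers):
    """Return the snapshot times to release so that older snapshots are sparser.

    times: snapshot creation times as numbers (e.g. hours).
    now:   current time in the same unit.
    tiers: list of (min_age, interval) pairs; the pair with the largest
           min_age not exceeding a snapshot's age gives its minimum spacing.
    """
    release = []
    last_kept = None
    for t in sorted(times):  # walk from the oldest snapshot to the newest
        age = now - t
        interval = 0
        for min_age, tier_interval in sorted(tiers):
            if age >= min_age:
                interval = tier_interval
        if last_kept is not None and t - last_kept < interval:
            release.append(t)  # too close to the previously kept snapshot
        else:
            last_kept = t
    return release


# Hourly snapshots over one day; snapshots at least 12 hours old are thinned
# to one every 4 hours, while newer snapshots keep the 1-hour interval.
hourly = list(range(24))
released = snapshots_to_release(hourly, now=24, tiers=[(0, 1), (12, 4)])
print(released)  # [1, 2, 3, 5, 6, 7, 9, 10, 11]
```

In this example the kept snapshots are at hours 0, 4, 8, 12 and every hour thereafter, so the interval for the older journal range is longer than the interval for the newer range.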
To achieve this, the array controller 110 uses the mapping information 203 and the pool information 204.
At step 2301, the host 500 issues a write request and transfers write data to the array controller 110.
At step 2302, the array controller 110 checks the target TPV 670 and the target area of the write access by referring to the write request.
At step 2303, the array controller 110 checks the mapping information 203 for a segment in the target area. If a chunk has already been assigned to the segment, the process proceeds to step 2306. If not, the process proceeds to step 2304.
At step 2304, the array controller 110 assigns a new chunk to store the write data. To do this, the array controller 110 updates the mapping information 203 and the pool information 204.
At step 2305, the array controller 110 stores the write data to the new chunk.
At step 2306, the array controller 110 stores the write data to the existing chunk.
At step 2307, if the array controller 110 has checked all segments of the target area, the process ends. If not, the array controller 110 advances the check to the next segment (step 2308).
In responding to a read request for the TPV 670 from the host 500, the array controller 110 refers to the mapping information 203 and the pool information 204, finds the chunk holding the data to be read and sends the data to the host 500. If no chunk is assigned to the area specified by the read request, the array controller 110 sends data of zero (0) to the host 500.
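The on-demand chunk assignment of steps 2301 through 2308 and the zero-fill read behavior may be sketched as follows. The class name, the fixed chunk size and the dictionary-based tables are assumptions for illustration; `mapping` stands in for the mapping information 203 and `free_chunks` for the pool information 204.

```python
class ThinVolume:
    """Minimal thin provisioned volume (TPV); layout is illustrative only."""

    def __init__(self, pool_chunks: int, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.free_chunks = list(range(pool_chunks))  # unassigned chunks in the pool
        self.chunks: dict[int, bytes] = {}           # chunk id -> stored data
        self.mapping: dict[int, int] = {}            # segment -> chunk id

    def write(self, segment: int, data: bytes) -> None:
        if len(data) != self.chunk_size:
            raise ValueError("data must be exactly one chunk in this sketch")
        chunk = self.mapping.get(segment)
        if chunk is None:                        # step 2303: nothing assigned yet
            if not self.free_chunks:
                raise RuntimeError("pool exhausted")
            chunk = self.free_chunks.pop(0)      # step 2304: assign a new chunk
            self.mapping[segment] = chunk
        self.chunks[chunk] = data                # step 2305 or step 2306

    def read(self, segment: int) -> bytes:
        chunk = self.mapping.get(segment)
        if chunk is None:
            return bytes(self.chunk_size)        # unassigned area reads as zeros
        return self.chunks[chunk]


tpv = ThinVolume(pool_chunks=2)
tpv.write(1000, b"abcd")   # first write assigns a chunk
tpv.write(1000, b"wxyz")   # second write reuses the assigned chunk
print(tpv.read(1000), tpv.read(5), len(tpv.free_chunks))
```

Capacity is consumed only for segments that have actually been written, regardless of how large the segment number is.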
At step 2401, the management computer 520 instructs the array controller 110 to set WORM (write once read many) for the TPV.
At step 2402, the array controller 110 compresses the data stored in the TPV. This leaves some of the chunks assigned to the TPV unused. After that, the array controller 110 forbids modification of the data in the TPV for the retention period specified by the instruction. That is, setting WORM makes the TPV read-only.
At step 2403, the array controller 110 releases the unused chunks by checking and updating the mapping information 203 and the pool information 204.
At step 2404, the array controller 110 informs the management computer 520 of the completion of setting WORM.
At step 2405, the management computer 520 updates the backup data information 531 according to the change.
By the above process, the used capacity (media cost) for the data is reduced and the protection of the data is strengthened. After the retention period expires, the data in the TPV can be modified again (i.e. WORM is removed).
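The effect of steps 2402 and 2403, together with the retention check, may be illustrated with the sketch below. It uses the Python standard library `zlib` module as a stand-in for whatever compression the array controller applies; the chunk size, the class name `WormTpv` and its methods are hypothetical.

```python
import math
import zlib
from datetime import datetime, timedelta

CHUNK_SIZE = 64  # hypothetical chunk size in bytes


class WormTpv:
    """Illustrative TPV that can be compressed and placed under WORM."""

    def __init__(self, data: bytes):
        self.data = data
        self.compressed = False
        self.retention_until = None

    def chunks_used(self) -> int:
        return math.ceil(len(self.data) / CHUNK_SIZE)

    def set_worm(self, now: datetime, retention: timedelta) -> int:
        """Compress, start the retention period, return the chunk count freed.

        The returned value can be negative for data that does not compress.
        """
        before = self.chunks_used()
        self.data = zlib.compress(self.data)
        self.compressed = True
        self.retention_until = now + retention
        return before - self.chunks_used()

    def write(self, now: datetime, data: bytes) -> None:
        if self.retention_until is not None and now < self.retention_until:
            raise PermissionError("TPV is read-only until the retention period ends")
        self.data = data
        self.compressed = False
        self.retention_until = None

    def read(self) -> bytes:
        return zlib.decompress(self.data) if self.compressed else self.data


original = b"backup " * 100
tpv = WormTpv(original)
start = datetime(2024, 1, 1)
released = tpv.set_worm(start, timedelta(days=30))
print(released > 0, tpv.read() == original)
```

A write attempted during the retention period raises `PermissionError` in this sketch, and a write after the period succeeds and clears the WORM state.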
At step 2501, the management computer 520 presents a user with one or more candidate data that can be restored according to the backup data information 531.
At step 2502, the management computer 520 receives a restore request from a user.
At step 2503, the management computer 520 refers to the backup data information 531 and determines a method to restore the data specified by the request.
At step 2504, the management computer 520 instructs the array controller 110 or the backup server 510 to restore the data according to the selected method.
At step 2505, the array controller 110 or the backup server 510 restores data based on the received restore command.
At step 2506, the array controller 110 or the backup server 510 which restores the data notifies the management computer 520 of the completion of restoring the data.
At step 2507, the management computer 520 reports the completion of the restoration of the specified data to the user.
Finally, at step 2508, the user starts to use the restored data.
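The determination made at steps 2503 and 2504 may be sketched as a lookup on the current mode recorded for the backup data. The record fields and the mode-to-executor table are assumptions for illustration: the table mirrors the transitions described above, in which the backup server 510 handles tape and VTL while the array controller 110 handles CDP, snapshots and VDL.

```python
from dataclasses import dataclass


@dataclass
class BackupRecord:
    """Hypothetical row of the backup data information 531."""
    backup_id: str
    mode: str      # "cdp", "snapshot", "vdl", "tape" or "vtl"
    location: str  # where the backup data currently resides


# Assumed mapping from backup data mode to the component that restores it.
EXECUTOR_BY_MODE = {
    "cdp": "array_controller",
    "snapshot": "array_controller",
    "vdl": "array_controller",
    "tape": "backup_server",
    "vtl": "backup_server",
}


def plan_restore(backup_info: dict[str, BackupRecord], backup_id: str) -> tuple[str, str, str]:
    """Steps 2503 and 2504: pick the restore method and the component to instruct."""
    record = backup_info[backup_id]
    executor = EXECUTOR_BY_MODE[record.mode]
    return executor, record.mode, record.location


backup_info = {
    "db-2024-01": BackupRecord("db-2024-01", "tape", "tape-library-700"),
    "db-2024-06": BackupRecord("db-2024-06", "cdp", "storage-system-100"),
}
print(plan_restore(backup_info, "db-2024-01"))
print(plan_restore(backup_info, "db-2024-06"))
```

Because the record is updated after every transition, the caller issues the same request regardless of where the backup data has moved.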
The computer platform 3601 may include a data bus 3604 or other communication mechanism for communicating information across and among various parts of the computer platform 3601, and a processor 3605 coupled with bus 3604 for processing information and performing other computational and control tasks. Computer platform 3601 also includes a volatile storage 3606, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 3604 for storing various information as well as instructions to be executed by processor 3605. The volatile storage 3606 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 3605. Computer platform 3601 may further include a read only memory (ROM or EPROM) 3607 or other static storage device coupled to bus 3604 for storing static information and instructions for processor 3605, such as basic input-output system (BIOS), as well as various system configuration parameters. A persistent storage device 3608, such as a magnetic disk, optical disk, or solid-state flash memory device, is provided and coupled to bus 3604 for storing information and instructions.
Computer platform 3601 may be coupled via bus 3604 to a display 3609, such as a cathode ray tube (CRT), plasma display, or a liquid crystal display (LCD), for displaying information to a system administrator or user of the computer platform 3601. An input device 3610, including alphanumeric and other keys, is coupled to bus 3604 for communicating information and command selections to processor 3605. Another type of user input device is cursor control device 3611, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 3605 and for controlling cursor movement on display 3609. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane.
An external storage device 3612 may be connected to the computer platform 3601 via bus 3604 to provide an extra or removable storage capacity for the computer platform 3601. In an embodiment of the computer system 3600, the external removable storage device 3612 may be used to facilitate exchange of data with other computer systems.
The invention is related to the use of computer system 3600 for implementing the techniques described herein. In an embodiment, the inventive system may reside on a machine such as computer platform 3601. According to one embodiment of the invention, the techniques described herein are performed by computer system 3600 in response to processor 3605 executing one or more sequences of one or more instructions contained in the volatile memory 3606. Such instructions may be read into volatile memory 3606 from another computer-readable medium, such as persistent storage device 3608. Execution of the sequences of instructions contained in the volatile memory 3606 causes processor 3605 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 3605 for execution. The computer-readable medium is just one example of a machine-readable medium, which may carry instructions for implementing any of the methods and/or techniques described herein. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 3608. Volatile media includes dynamic memory, such as volatile storage 3606. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise data bus 3604. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, a flash drive, a memory card, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 3605 for execution. For example, the instructions may initially be carried on a magnetic disk from a remote computer. Alternatively, a remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 3600 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on the data bus 3604. The bus 3604 carries the data to the volatile storage 3606, from which processor 3605 retrieves and executes the instructions. The instructions received by the volatile memory 3606 may optionally be stored on persistent storage device 3608 either before or after execution by processor 3605. The instructions may also be downloaded into the computer platform 3601 via Internet using a variety of network data communication protocols well known in the art.
The computer platform 3601 also includes a communication interface, such as network interface card 3613 coupled to the data bus 3604. Communication interface 3613 provides a two-way data communication coupling to a network link 3614 that is connected to a local network 3615. For example, communication interface 3613 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 3613 may be a local area network interface card (LAN NIC) to provide a data communication connection to a compatible LAN. Wireless links, such as well-known 802.11a, 802.11b, 802.11g and Bluetooth, may also be used for network implementation. In any such implementation, communication interface 3613 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 3614 typically provides data communication through one or more networks to other network resources. For example, network link 3614 may provide a connection through local network 3615 to a host computer 3616, or a network storage/server 3617. Additionally or alternatively, the network link 3614 may connect through gateway/firewall 3617 to the wide-area or global network 3618, such as the Internet. Thus, the computer platform 3601 can access network resources located anywhere on the Internet 3618, such as a remote network storage/server 3619. On the other hand, the computer platform 3601 may also be accessed by clients located anywhere on the local area network 3615 and/or the Internet 3618. The network clients 3620 and 3621 may themselves be implemented based on a computer platform similar to the platform 3601.
Local network 3615 and the Internet 3618 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 3614 and through communication interface 3613, which carry the digital data to and from computer platform 3601, are exemplary forms of carrier waves transporting the information.
Computer platform 3601 can send messages and receive data, including program code, through the variety of network(s) including Internet 3618 and LAN 3615, network link 3614 and communication interface 3613. In the Internet example, when the system 3601 acts as a network server, it might transmit a requested code or data for an application program running on client(s) 3620 and/or 3621 through Internet 3618, gateway/firewall 3617, local area network 3615 and communication interface 3613. Similarly, it may receive code from other network resources.
The received code may be executed by processor 3605 as it is received, and/or stored in persistent or volatile storage devices 3608 and 3606, respectively, or other non-volatile storage for later execution. In this manner, computer system 3601 may obtain application code in the form of a carrier wave.
It should be noted that the present invention is not limited to any specific firewall system. The inventive policy-based content processing system may be used in any of the three firewall operating modes and specifically NAT, routed and transparent.
Finally, it should be understood that processes and techniques described herein are not inherently related to any particular apparatus and may be implemented by any suitable combination of components. Further, various types of general purpose devices may be used in accordance with the teachings described herein. It may also prove advantageous to construct specialized apparatus to perform the method steps described herein. The present invention has been described in relation to particular examples, which are intended in all respects to be illustrative rather than restrictive. Those skilled in the art will appreciate that many different combinations of hardware, software, and firmware will be suitable for practicing the present invention. For example, the described software may be implemented in a wide variety of programming or scripting languages, such as Assembler, C/C++, perl, shell, PHP, Java, etc.
Moreover, other implementations of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. Various aspects and/or components of the described embodiments may be used singly or in any combination in the inventive backup data management system. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.