1. Field of the Invention
This invention generally relates to storage technology and, more specifically, to data protection and recovery of files and data.
2. Description of the Related Art
A conventional method for performing backup and recovery of data is to back up data periodically (e.g., once a day) from a storage system to backup media, such as magnetic tapes. In taking the data backup, a snapshot of a storage area (e.g., a storage volume) is often used to obtain consistent data. That is, the data is read from a snapshot or a quiesced image and written to the backup media. Several methods for providing a snapshot of a storage area in a storage system, using either logical or physical techniques, are well known in the art. The backup data saved on the backup media is static and is copied to a new storage area (e.g., a new volume) when the data needs to be restored.
However, the above conventional method can only restore the image of the data at the time point of the snapshot, and restoring data from the backup data may result in the loss of a certain amount of updates because the backup data may not be entirely up to date. Moreover, if the latest backup data is, for example, inconsistent or corrupt, an older generation of the backup data must be used in the restore operation.
Recently, advanced storage systems have emerged that are capable of performing journaling and of restoring data using the journal. This capability is known as continuous data protection (CDP). With this capability, all updates for a storage area are recorded as a journal, and the data at an arbitrary time point can be restored using the journal. Snapshots may be used in this journaling and restoring operation. That is, besides the journal, snapshots of the storage area are maintained at predetermined intervals, and the data at an arbitrary time point is restored by applying to a snapshot the journal entries recorded between the time point of that snapshot and the desired time point. One system and method for providing the aforesaid CDP capability is disclosed in U.S. Pat. No. 7,111,136.
In this conventional method, however, when user application software uses multiple related files (i.e., files that have some relation to one another), the consistency of such files as a whole may not be achieved in the event that one file has been closed but one or more other files have not been closed during the journaling operation. As would be appreciated by those of skill in the art, a file is in a consistent state when it is closed by application software. Therefore, methods and apparatuses for searching for time points at which all related files have been closed (i.e., each file is in a consistent state) during the journaling operation are needed to achieve the consistency of a group of files as a whole.
In another related case, there may be a database (DB) application that manages files handling relatively large volumes of data. In such a data system, the data is stored as files in the file system area and the management information (location information, etc.) of the data is stored in the database. In this case, for data recovery using the journaling capability, methods and apparatuses for seeking time points at which both the database and the file are consistent as a whole are also needed.
Thus, the conventional technology fails to provide techniques for searching the journal for time points wherein all related files have been closed. In addition, the conventional technology fails to provide methodology for finding a time point when both a database and a related file have consistency as a whole.
The inventive methodology is directed to methods and systems that substantially obviate one or more of the above and other problems associated with conventional techniques for backup and recovery of data.
In accordance with one aspect of the inventive concept, there is provided a computerized data storage system. The inventive system includes a production volume storing application data; a base volume storing a copy of the application data; and a journal volume storing updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The journal volume is additionally operable to store a marker including a status information on at least two related files stored in the data storage system.
In accordance with another aspect of the inventive concept, there is provided a computerized data storage system. The inventive system includes a production volume storing application data; a base volume storing a copy of the application data; and a journal volume storing updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The journal volume is also configured to store a marker including a status information on at least one file stored in the data storage system and commit status information on a database table stored in the data storage system, which is related to the at least one file.
In accordance with one aspect of the inventive concept, there is provided a method involving storing application data in a production volume of a data storage system; storing a copy of the application data in a base volume of the data storage system and storing in a journal volume of the data storage system updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The inventive method further involves storing in the journal volume a marker comprising a status information on at least two related files stored in the data storage system.
In accordance with one aspect of the inventive concept, there is provided a method involving storing application data in a production volume of a data storage system; storing a copy of the application data in a base volume of the data storage system and storing in a journal volume of the data storage system updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The inventive method further involves storing in the journal volume a marker comprising a status information on at least one file stored in the data storage system and commit status information on at least one database table stored in the data storage system, which is related to the at least one file.
In accordance with one aspect of the inventive concept, there is provided a computer-readable medium storing computer-executable instructions implementing a method involving storing application data in a production volume of a data storage system; storing a copy of the application data in a base volume of the data storage system and storing in a journal volume of the data storage system updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The inventive method further involves storing in the journal volume a marker comprising a status information on at least two related files stored in the data storage system.
In accordance with one aspect of the inventive concept, there is provided a computer-readable medium storing computer-executable instructions implementing a method involving storing application data in a production volume of a data storage system; storing a copy of the application data in a base volume of the data storage system and storing in a journal volume of the data storage system updates to the application data stored in the production volume. The production volume, the base volume and the journal volume form a consistency group. The aforesaid method further involves storing in the journal volume a marker comprising a status information on at least one file stored in the data storage system and commit status information on at least one database table stored in the data storage system, which is related to the at least one file.
Additional aspects related to the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. Aspects of the invention may be realized and attained by means of the elements and combinations of various elements and aspects particularly pointed out in the following detailed description and the appended claims.
It is to be understood that both the foregoing and the following descriptions are exemplary and explanatory only and are not intended to limit the claimed invention or application thereof in any manner whatsoever.
The accompanying drawings, which are incorporated in and constitute a part of this specification, exemplify the embodiments of the present invention and, together with the description, serve to explain and illustrate principles of the inventive technique.
In the following detailed description, reference will be made to the accompanying drawings, in which identical functional elements are designated with like numerals. The aforementioned accompanying drawings show by way of illustration, and not by way of limitation, specific embodiments and implementations consistent with principles of the present invention. These implementations are described in sufficient detail to enable those skilled in the art to practice the invention and it is to be understood that other implementations may be utilized and that structural changes and/or substitutions of various elements may be made without departing from the scope and spirit of the present invention. The following detailed description is, therefore, not to be construed in a limited sense. Additionally, the various embodiments of the invention as described may be implemented in the form of software running on a general purpose computer, in the form of specialized hardware, or as a combination of software and hardware.
This invention discloses methods for searching for and finding a recovery time point at which the related data is consistent as a whole. In this invention, operations and processes regarding markers that include the status of related files are provided. As another method, markers indicating a commit time point of a DB and a related file are also provided. Using the inventive techniques, a user, application software and management software can search for a recovery time point with the required consistency. They can also obtain the data as of a time point at which the related data is also consistent.
In one embodiment, the system comprises host computers, a management terminal and a storage system having the journaling capability mentioned above. The storage system can create markers that include status information of related files and insert them in the journal. The storage system provides information regarding the markers so that a user can search for a marker indicating a time point that has the required consistency. The user or application software can then obtain the data as of that time point, with the related data consistent as a whole.
In another embodiment, the storage system can create and insert markers that indicate a database's commit point regarding a file managed by the database. The storage system provides information regarding the markers, so that a user or application software can obtain the data as of a time point at which the related data is consistent as a whole by specifying a marker as the time point to be recovered.
The main processor 101 executes various software programs stored in the memory 200, including, without limitation, read/write process program 211 and the data protection/recovery program 212. The host 500 and the management terminal 520 are connected to the host interface 113 via the SAN 901, which may be implemented using, for example, Fibre Channel, iSCSI(IP) or any other suitable interconnect technology. The host 500 and the management terminal 520 are interconnected via LAN 903, which may be an IP-based network.
The management terminal 520 is also connected to an array controller 110 via an out-of-band network 902, which may also be an IP-based network. Various volumes (Logical Units) provided by the storage system 100 are composed from a collection of storage areas located in HDDs. Data consistency in these storage areas may be protected using a parity code, such as by utilizing the RAID configuration well known to persons of skill in the art.
The production volumes 620 constitute a consistency group 610. The generate journal (JNL) function 810 in the storage system 100 obtains the data that is transferred to update the production volumes 620, assigns a sequence number (an incrementing number) to the journal within each consistency group 610, and records it as a journal on the journal volumes 630 that are assigned to each consistency group 610. The consistency group information 201 described in
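The journal generation described above can be illustrated with a minimal sketch. It assumes that each consistency group keeps its own incrementing counter and that a journal entry carries the target volume, an offset and the written data; the class names, field names and volume identifiers are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JournalEntry:
    seq: int        # sequence number, unique within one consistency group
    volume_id: str  # production volume the update targeted (hypothetical id)
    offset: int     # offset of the update within the volume
    data: bytes     # data transferred to update the volume


@dataclass
class ConsistencyGroup:
    group_id: int
    next_seq: int = 1  # assumed starting value of the incrementing number
    journal: List[JournalEntry] = field(default_factory=list)

    def record_update(self, volume_id: str, offset: int, data: bytes) -> JournalEntry:
        """Assign the next sequence number and append the update to the journal."""
        entry = JournalEntry(self.next_seq, volume_id, offset, data)
        self.next_seq += 1
        self.journal.append(entry)
        return entry


group = ConsistencyGroup(group_id=610)
first = group.record_update("vol-a", 0, b"abc")
second = group.record_update("vol-b", 512, b"def")
```

Because the counter belongs to the group rather than to an individual volume, updates to different production volumes in the same group share one ordering, which is what allows the group to be restored consistently.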
As mentioned above, the application software 501 uses the files stored in the storage system 100 and these files are related to each other from the data consistency perspective. The host 500 manages the related files and their status using the file group information 511.
At step 1102, the management terminal 520 sends the array controller 110 a command with the determined condition to get information about the marker(s) that satisfy the condition. This command is transferred via the SAN 901 or the out-of-band network 902. At step 1103, the array controller 110 receives the command. At step 1104, the array controller 110 finds the marker(s) that satisfy the condition by searching the marker information 203. At step 1105, the array controller 110 sends the information regarding the matching marker(s) to the management terminal 520. The management terminal 520 can show the information to users. At step 1106, using the information about the selected marker(s), the management terminal 520 determines the marker that indicates the time point to be restored. This decision may be made by a user or by the management software 521 on the management terminal 520.
At step 1107, the management terminal 520 sends the array controller 110 a command to obtain restored data based on the determined marker. The command is transferred via the SAN 901 or the out-of-band network 902. At step 1108, the array controller 110 receives the restore command specifying the determined marker. At step 1109, the array controller 110 selects the latest snapshot image before the time point of the specified marker. At step 1110, the array controller 110 applies the journal to the selected snapshot image up to the marker. Finally, at step 1111, the array controller 110 allows access to the restored data.
In order to determine the condition at step 1101 and determine the marker at step 1106, the management terminal 520 can hold the file group information 511 and use this information for the decisions. The file group information 511 in the management terminal 520 can be generated by collecting and aggregating the file group information 511 in each host 500 via the LAN 903.
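The marker search of steps 1102 through 1105 can be sketched as a simple filter. The sketch assumes that each record in the marker information holds a sequence number, a timestamp and a per-file status of "open" or "closed", and that the condition is "every file in a given group is closed within a time window"; the record layout, the function name and the form of the condition are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Marker:
    seq: int                     # position of the marker in the journal
    timestamp: float             # time at which the marker was inserted
    file_status: Dict[str, str]  # file name -> "open" or "closed" (assumed encoding)


def find_markers(marker_info: Iterable[Marker], files: Iterable[str],
                 start: float, end: float) -> List[Marker]:
    """Return markers in [start, end] at which every listed file is closed."""
    wanted = list(files)
    return [
        m for m in marker_info
        if start <= m.timestamp <= end
        and all(m.file_status.get(name) == "closed" for name in wanted)
    ]


marker_info = [
    Marker(10, 100.0, {"a.dat": "closed", "b.dat": "open"}),
    Marker(20, 200.0, {"a.dat": "closed", "b.dat": "closed"}),
    Marker(30, 300.0, {"a.dat": "open", "b.dat": "closed"}),
]
matches = find_markers(marker_info, ["a.dat", "b.dat"], 0.0, 250.0)
```

A file that is missing from a marker's status map is treated as not closed, so a marker is returned only when it positively records the closed state of every requested file.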
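Steps 1109 and 1110 can be sketched as follows. The sketch assumes that snapshots, journal entries and markers are all positioned by the same journal sequence number and that a volume image can be modeled as a mapping from block number to data; these modeling choices and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Snapshot:
    seq: int                  # journal position at which the snapshot was taken (assumed)
    blocks: Dict[int, bytes]  # block number -> data at that position


@dataclass
class JournalEntry:
    seq: int
    block: int
    data: bytes


def restore_to_marker(snapshots: List[Snapshot], journal: List[JournalEntry],
                      marker_seq: int) -> Dict[int, bytes]:
    """Rebuild the image as of the marker from the latest earlier snapshot."""
    earlier = [s for s in snapshots if s.seq <= marker_seq]
    if not earlier:
        raise ValueError("no snapshot precedes the specified marker")
    base = max(earlier, key=lambda s: s.seq)  # step 1109: latest snapshot before the marker
    image = dict(base.blocks)                 # copy so the snapshot itself is not modified
    for entry in sorted(journal, key=lambda e: e.seq):
        if base.seq < entry.seq <= marker_seq:  # step 1110: apply journal up to the marker
            image[entry.block] = entry.data
    return image


snapshots = [Snapshot(0, {0: b"a"}), Snapshot(5, {0: b"b", 1: b"x"})]
journal = [
    JournalEntry(1, 0, b"b"), JournalEntry(3, 1, b"x"),
    JournalEntry(6, 0, b"d"), JournalEntry(8, 1, b"y"),
]
restored = restore_to_marker(snapshots, journal, marker_seq=7)
```

Starting from the latest snapshot that precedes the marker keeps the number of journal entries to replay small, which is the reason for maintaining snapshots alongside the journal.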
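The aggregation of file group information across hosts might look like the sketch below. The specification does not give the layout of the file group information in this passage, so the sketch assumes each host reports a mapping from a group name to the names of the files in that group; the function name, host names and group names are hypothetical.

```python
from typing import Dict, Iterable, Set


def aggregate_file_groups(
    per_host: Dict[str, Dict[str, Iterable[str]]]
) -> Dict[str, Set[str]]:
    """Merge per-host file group tables into one table keyed by group name."""
    merged: Dict[str, Set[str]] = {}
    for groups in per_host.values():
        for group_name, files in groups.items():
            # A group reported by several hosts becomes the union of its files.
            merged.setdefault(group_name, set()).update(files)
    return merged


per_host = {
    "host-1": {"orders": ["orders.idx", "orders.dat"]},
    "host-2": {"orders": ["orders.log"], "audit": ["audit.dat"]},
}
merged = aggregate_file_groups(per_host)
```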
By the processes described above, the time point wherein all related files have been closed can be searched and recovered data with whole consistency regarding the related files can be obtained by users, the application software 501, OS 502, management software 521 and the like.
In the secondary storage system 100, the receive journal function 870 receives journals sent from the primary storage system 100. When the receive journal function 870 receives (detects) a marker, the receive journal function 870 records the information about the marker in marker information 203 in the secondary storage system 100. That is, the marker information 203 is regenerated in the secondary storage system 100.
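The regeneration of marker information on the secondary side can be sketched as follows. The sketch assumes that journal records carry a kind field distinguishing ordinary writes from markers and that a marker's payload is its file status map; the record layout and names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class JournalRecord:
    seq: int
    kind: str                # "write" or "marker" (assumed encoding)
    payload: Dict[str, str]  # for a marker: file name -> status


def receive_journal(records: Iterable[JournalRecord],
                    journal: List[JournalRecord],
                    marker_info: List[Dict]) -> None:
    """Store received records and rebuild marker information as markers arrive."""
    for record in records:
        journal.append(record)
        if record.kind == "marker":
            # Record the detected marker so that it can be searched locally.
            marker_info.append({"seq": record.seq, "file_status": dict(record.payload)})


secondary_journal: List[JournalRecord] = []
secondary_marker_info: List[Dict] = []
receive_journal(
    [
        JournalRecord(1, "write", {}),
        JournalRecord(2, "marker", {"a.dat": "closed"}),
        JournalRecord(3, "write", {}),
    ],
    secondary_journal,
    secondary_marker_info,
)
```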
By performing the same operations explained in the first embodiment, the time point at which all related files have been closed can be found, and recovered data that is consistent as a whole with regard to the related files can be obtained by users, the application software 501, the OS 502, the management software 521 and the like from the secondary storage system 100, because the secondary storage system 100 can hold the various information mentioned in the first embodiment.
In addition to the above process, by performing the similar process of searching a marker and restoring data described in the first embodiment, the time point wherein all related files have been closed can be searched and recovered data with whole consistency regarding the related files can be obtained by users, application software 501, OS 502, management software 521 and so on.
The computer platform 2401 may include a data bus 2404 or other communication mechanism for communicating information across and among various parts of the computer platform 2401, and a processor 2405 coupled with bus 2404 for processing information and performing other computational and control tasks. Computer platform 2401 also includes a volatile storage 2406, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 2404 for storing various information as well as instructions to be executed by processor 2405. The volatile storage 2406 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 2405. Computer platform 2401 may further include a read only memory (ROM or EPROM) 2407 or other static storage device coupled to bus 2404 for storing static information and instructions for processor 2405, such as basic input-output system (BIOS), as well as various system configuration parameters. A persistent storage device 2408, such as a magnetic disk, optical disk, or solid-state flash memory device, is provided and coupled to bus 2404 for storing information and instructions.
Computer platform 2401 may be coupled via bus 2404 to a display 2409, such as a cathode ray tube (CRT), plasma display, or a liquid crystal display (LCD), for displaying information to a system administrator or user of the computer platform 2401. An input device 2410, including alphanumeric and other keys, is coupled to bus 2404 for communicating information and command selections to processor 2405. Another type of user input device is cursor control device 2411, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 2405 and for controlling cursor movement on display 2409. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane.
An external storage device 2412 may be connected to the computer platform 2401 via bus 2404 to provide an extra or removable storage capacity for the computer platform 2401. In an embodiment of the computer system 2400, the external removable storage device 2412 may be used to facilitate exchange of data with other computer systems.
The invention is related to the use of computer system 2400 for implementing the techniques described herein. In an embodiment, the inventive system may reside on a machine such as computer platform 2401. According to one embodiment of the invention, the techniques described herein are performed by computer system 2400 in response to processor 2405 executing one or more sequences of one or more instructions contained in the volatile memory 2406. Such instructions may be read into volatile memory 2406 from another computer-readable medium, such as persistent storage device 2408. Execution of the sequences of instructions contained in the volatile memory 2406 causes processor 2405 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 2405 for execution. The computer-readable medium is just one example of a machine-readable medium, which may carry instructions for implementing any of the methods and/or techniques described herein. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 2408. Volatile media includes dynamic memory, such as volatile storage 2406. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise data bus 2404. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, a flash drive, a memory card, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 2405 for execution. For example, the instructions may initially be carried on a magnetic disk from a remote computer. Alternatively, a remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 2400 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on the data bus 2404. The bus 2404 carries the data to the volatile storage 2406, from which processor 2405 retrieves and executes the instructions. The instructions received by the volatile memory 2406 may optionally be stored on persistent storage device 2408 either before or after execution by processor 2405. The instructions may also be downloaded into the computer platform 2401 via Internet using a variety of network data communication protocols well known in the art.
The computer platform 2401 also includes a communication interface, such as network interface card 2413 coupled to the data bus 2404. Communication interface 2413 provides a two-way data communication coupling to a network link 2414 that is connected to a local network 2415. For example, communication interface 2413 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 2413 may be a local area network interface card (LAN NIC) to provide a data communication connection to a compatible LAN. Wireless links, such as well-known 802.11a, 802.11b, 802.11g and Bluetooth, may also be used for network implementation. In any such implementation, communication interface 2413 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 2414 typically provides data communication through one or more networks to other network resources. For example, network link 2414 may provide a connection through local network 2415 to a host computer 2416, or a network storage/server 2417. Additionally or alternatively, the network link 2414 may connect through gateway/firewall 2417 to the wide-area or global network 2418, such as the Internet. Thus, the computer platform 2401 can access network resources located anywhere on the Internet 2418, such as a remote network storage/server 2419. On the other hand, the computer platform 2401 may also be accessed by clients located anywhere on the local area network 2415 and/or the Internet 2418. The network clients 2420 and 2421 may themselves be implemented based on a computer platform similar to the platform 2401.
Local network 2415 and the Internet 2418 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 2414 and through communication interface 2413, which carry the digital data to and from computer platform 2401, are exemplary forms of carrier waves transporting the information.
Computer platform 2401 can send messages and receive data, including program code, through the variety of network(s) including Internet 2418 and LAN 2415, network link 2414 and communication interface 2413. In the Internet example, when the system 2401 acts as a network server, it might transmit a requested code or data for an application program running on client(s) 2420 and/or 2421 through Internet 2418, gateway/firewall 2417, local area network 2415 and communication interface 2413. Similarly, it may receive code from other network resources.
The received code may be executed by processor 2405 as it is received, and/or stored in persistent or volatile storage devices 2408 and 2406, respectively, or other non-volatile storage for later execution. In this manner, computer system 2401 may obtain application code in the form of a carrier wave.
Finally, it should be understood that processes and techniques described herein are not inherently related to any particular apparatus and may be implemented by any suitable combination of components. Further, various types of general purpose devices may be used in accordance with the teachings described herein. It may also prove advantageous to construct specialized apparatus to perform the method steps described herein. The present invention has been described in relation to particular examples, which are intended in all respects to be illustrative rather than restrictive. Those skilled in the art will appreciate that many different combinations of hardware, software, and firmware will be suitable for practicing the present invention. For example, the described software may be implemented in a wide variety of programming or scripting languages, such as Assembler, C/C++, perl, shell, PHP, Java, etc.
Moreover, other implementations of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. Various aspects and/or components of the described embodiments may be used singly or in any combination in the computerized storage system with journaling capability. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.