The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.
The host apparatus 2 may be a personal computer, workstation, or mainframe. The host apparatus 2 has hardware resources, such as a CPU (Central Processing Unit) 21, a main memory 22, an interface unit 23, a local I/O device 24, and a timer 25, which are interconnected via an internal bus 26. The host apparatus 2 also has software resources, such as a device driver and an operating system (OS). By this configuration, the host apparatus 2 executes various programs under control of the CPU 21, and achieves desired processing in cooperation with the hardware resources. For example, under the control of the CPU 21, the host apparatus 2 executes an application program on the OS. The application program is a program for achieving the processing that the host apparatus 2 primarily intends to execute. Upon its execution, the application program requests access (such as data-read or data-write) to the storage apparatus 4. For such access, a storage manager may be installed on the host apparatus 2. The storage manager is a management program for managing access to the storage apparatus 4. The storage manager may be separate from the OS. Alternatively, it may be incorporated to form a part of the OS. Various programs may be configured as a single module or as a plurality of modules.
Referring back to
The storage apparatus 4 includes a storage unit 41 comprising a plurality of physical disk devices, and a controller 42 for controlling the storage unit 41.
The disk devices are selected from, for example, FC (Fibre Channel) disks, FATA (Fibre Attached Technology Adapted) disks, SATA (Serial AT Attachment) disks, optical disk drives, or similar. In a storage area provided by one or more disk devices, one or more logically defined volumes (hereinafter referred to as logical volumes) are established.
The logical volumes are given an attribute according to their purpose of use, and managed in accordance with their assigned unique identifier (LUN: Logical Unit Number). In this embodiment, a data volume 41a, a journal volume 41b, a snapshot volume 41c, and a command management volume 41d are defined in the storage unit 41. It will be understood that the journal volume 41b, the snapshot volume 41c, and the command management volume 41d function as volumes for data backup.
The data volume 41a is a volume used when the application program reads/writes data. The journal volume 41b is a volume for storing journal data, which is update history information of the data volume 41a. The journal data typically includes: data written to the data volume 41a, an address in the data volume 41a to which the data has been written, and management information, e.g., the time when the data was written. The snapshot volume 41c is a volume for storing snapshot data (images) of the data volume 41a at particular points in time. The command management volume 41d is a volume for temporarily holding specific commands sent from the host apparatus 2.
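By way of illustration only, the content of one piece of journal data described above may be sketched as follows. The class name, the field names, and the dictionary used as a stand-in for the data volume 41a are hypothetical and do not appear in the embodiment; the optional host apparatus timestamp field is likewise an assumption made for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass(frozen=True)
class JournalEntry:
    """One record of update history for the data volume (illustrative names)."""
    lba: int                              # address in the data volume that was written
    data: bytes                           # data written at that address
    storage_time: datetime                # time of the write per the storage apparatus
    host_time: Optional[datetime] = None  # host apparatus timestamp, if one was supplied


def apply_entry(volume: Dict[int, bytes], entry: JournalEntry) -> None:
    """Replay one journal entry onto an in-memory stand-in for the data volume."""
    volume[entry.lba] = entry.data
```

Replaying such entries in time order onto a copy of snapshot data is the basis of the restoration processing described later.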
The logical volumes are accessed in blocks of a specific size. Each block is given a logical block address (LBA). Thus, the host apparatus 2 accesses a target storage area in the logical volumes by specifying, to the controller 42 in the storage apparatus 4, an address based on the above-described identifier and logical block address.
The controller 42 is configured as a system circuit including, among other things, a CPU 421, memory 422, a cache mechanism 423, and a timer 424, and thereby performs overall control over inputs/outputs between the host apparatus 2 and the storage unit 41. Also, the controller 42 may typically include one or more channel adapters and one or more disk adapters (not shown in the drawing). The memory 422 functions as the main memory for the CPU 421. For example, as shown in
The cache mechanism 423 comprises a cache memory, and is used for temporarily storing data input/output between the host apparatus 2 and the storage unit 41. Specifically, commands sent from the host apparatus 2 are temporarily held in the cache memory, and data read from the data volume 41a in the storage unit 41 is temporarily held in the cache memory before being transmitted to the host apparatus 2.
The timer 424 keeps time, and provides the CPU 421 with timestamps, as necessary. The term “timestamp” is used here in a broad sense and includes data indicating a particular point in time, a particular date, or a combination of both. The control program utilizes those timestamps under the control of the CPU 421.
The storage system 1 according to this embodiment is designed to be able to restore data using timestamps based on the time indicated by the timer in the host apparatus 2 (hereinafter referred to as “host apparatus timestamps”), and also to be able to restore data using timestamps based on the time indicated by the timer in the storage apparatus 4 (hereinafter referred to as “storage apparatus timestamps”). More specifically, the host apparatus 2 transmits an obtained host apparatus timestamp together with a specific command to the storage apparatus 4, and the storage apparatus 4 stores that host apparatus timestamp in a specific area. Thereafter, if the storage apparatus 4 receives a restoration request specifying a time in the host apparatus, it restores data using the stored host apparatus timestamps.
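As a minimal sketch of the storing operation just described, the storage apparatus may be pictured as keeping each storage apparatus timestamp together with the host apparatus timestamp received with the command, where one was received. The names `RestorationPoint`, `TimestampLog`, `record`, and `host_stamped` are hypothetical, and the in-memory list merely stands in for the "specific area."

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class RestorationPoint:
    storage_time: datetime            # storage apparatus timestamp
    host_time: Optional[datetime]     # host apparatus timestamp, None if not supplied


@dataclass
class TimestampLog:
    """In-memory stand-in for the area in which timestamps are stored."""
    points: List[RestorationPoint] = field(default_factory=list)

    def record(self, storage_now: datetime,
               host_time: Optional[datetime] = None) -> RestorationPoint:
        # Keep both timestamps in the same record so they stay associated one-to-one.
        point = RestorationPoint(storage_now, host_time)
        self.points.append(point)
        return point

    def host_stamped(self) -> List[RestorationPoint]:
        # Records usable directly for a host-apparatus-based restoration request.
        return [p for p in self.points if p.host_time is not None]
```

Keeping the two timestamps in one record is what later allows a time difference between the host apparatus and the storage apparatus to be derived.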
More specifically, as shown in
The storage manager also monitors whether the system status of the host apparatus 2 satisfies any parameters defined in a restoration point setting parameter table or not (STEP 403). The restoration point setting parameter table is a table that defines parameters for setting restoration points, item by item.
Referring back to
The storage manager also monitors whether or not any restoration request has been given (STEP 405). The restoration request is, for example, given by a user via a dialogue box provided by the storage manager. When receiving any restoration request, the storage manager generates a restoration management request command and then transmits the command to the storage apparatus 4 (STEP 406). As described above, a restoration request is set in the command field of the here-generated restoration management request command. A restoration management request command where a restoration request is set in its command field may also simply be referred to as a restoration request command.
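For illustration, a restoration management request command with a restoration request set in its command field may be modeled as below. This is a sketch under stated assumptions: the enumeration values, the field names, and the function `build_restoration_request` are hypothetical, and the actual command format is not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class CommandField(Enum):
    RESTORATION_POINT_SETTING = "restoration point setting"
    RESTORATION = "restoration"


class TimeBase(Enum):
    HOST = "host apparatus"
    STORAGE = "storage apparatus"


@dataclass(frozen=True)
class RestorationManagementRequest:
    command: CommandField   # what the storage apparatus is asked to do
    timestamp: datetime     # point in time the request refers to
    time_base: TimeBase     # which timer the timestamp is based on


def build_restoration_request(target: datetime,
                              time_base: TimeBase) -> RestorationManagementRequest:
    """Build what the text calls a restoration request command."""
    return RestorationManagementRequest(CommandField.RESTORATION, target, time_base)
```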
Referring to
If the request command is a restoration management request (“Yes” in STEP 709), the controller 42 associates the restoration management request command with the obtained timestamp and writes the resulting command to the command management volume 41d (STEP 709). If the restoration management request command is a restoration point setting request, it includes the host apparatus timestamp. While the controller 42 monitors the command management volume 41d, if any request command exists in the command management volume 41d, the controller 42 also executes the processing according to the request command. The details will be explained below.
Referring to
As shown in
As shown in
Referring to
Referring to
Specifically, the controller 42 determines whether the timestamp included in the restoration request command is based on the time in the host apparatus or based on the time in the storage apparatus. The timestamp included in the restoration request command indicates the particular point in time to which data restoration has been requested, and the command specifies whether data should be restored based on the time in the host apparatus or based on the time in the storage apparatus. The timestamp is, for example, given by a user via a dialogue box provided by a recovery manager. The recovery manager may be designed to query the controller 42 for the points in time at which restoration can be executed, and to present the result to the user so that the user can select a particular time. If the timestamp included in the restoration request command is not based on the time in the host apparatus (“No” in STEP 1301), the controller 42 interprets the timestamp as being based on the time in the storage apparatus, and thus executes the processing described from STEP 1302 to STEP 1307. On the other hand, if the specified timestamp is based on the time in the host apparatus (“Yes” in STEP 1301), the controller 42 executes the processing described from STEP 1401 to STEP 1414 in
More specifically, if the specified timestamp is not based on the time in the host apparatus, the controller 42 interprets the specified timestamp as being based on the time in the storage apparatus, and obtains the storage-based snapshot timestamp closest in time to the designated timestamp. Namely, the controller 42 refers to the snapshot volume list in the volume management table, and extracts one element, i.e., the timestamp indicating the particular point in time that the snapshot processing was executed (snapshot timestamp) SS-TIME(i), from a referenced node (STEP 1302). Then, the controller 42 compares the designated timestamp with the extracted snapshot timestamp SS-TIME(i), and determines whether the designated timestamp is before the extracted snapshot timestamp SS-TIME(i) (STEP 1303). If the designated timestamp coincides with the extracted snapshot timestamp SS-TIME(i), the controller 42 applies the snapshot data corresponding to the extracted snapshot timestamp SS-TIME(i) to the data volume 41a. This results in restoration of data as of the point in time that the system administrator has intended.
If the designated timestamp is not before the extracted snapshot timestamp SS-TIME(i) (“No” in STEP 1303), the controller 42 extracts the next snapshot timestamp SS-TIME(i=i+1) from the snapshot volume list (STEP 1302), and compares it in the same manner (STEP 1303). The extracting and comparing steps are repeated until an applicable snapshot timestamp SS-TIME(i) has been obtained. If an applicable timestamp has not been obtained even after comparing the designated timestamp with all snapshot timestamps SS-TIME(i; 0<i<n+1) in the snapshot volume list, the controller 42 may return an error message to the host apparatus 2 and end the processing. If the designated timestamp is before the extracted snapshot timestamp (“Yes” in STEP 1303), the controller 42 selects the snapshot timestamp SS-TIME(S=i−1) of the preceding node relative to the snapshot timestamp SS-TIME(i) currently referred to, and then restores the data volume 41a based on the snapshot data corresponding to the selected snapshot timestamp SS-TIME(S) (STEP 1304). In this way, data is restored using the snapshot data corresponding to the storage-based snapshot timestamp closest in time to the specified timestamp.
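The selection of STEP 1302 to STEP 1304 may be sketched as a linear scan over the snapshot timestamps, oldest first. The function name and the use of a plain sorted sequence in place of the snapshot volume list are assumptions made for illustration, as is the treatment of a designated time earlier than the oldest snapshot as an error.

```python
from datetime import datetime
from typing import Sequence


def select_snapshot_index(snapshot_times: Sequence[datetime],
                          designated: datetime) -> int:
    """Return the index S of the snapshot to restore from.

    snapshot_times must be sorted oldest first, mirroring the node order
    of the snapshot volume list.
    """
    for i, ss_time in enumerate(snapshot_times):   # STEP 1302: extract SS-TIME(i)
        if designated == ss_time:
            return i                               # exact match: use this snapshot
        if designated < ss_time:                   # "Yes" in STEP 1303
            if i == 0:
                # Assumption: no preceding node exists, so report an error.
                raise LookupError("designated time precedes the oldest snapshot")
            return i - 1                           # STEP 1304: preceding node
    # All timestamps compared without finding an applicable one.
    raise LookupError("no applicable snapshot timestamp found")
```

With hypothetical snapshots at 10:00, 10:30, and 11:00, a designated time of 10:40 selects the 10:30 snapshot.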
Further, the controller 42 refers to the journal header area in the journal volume 41b, and extracts a timestamp indicating a particular point in time that journaling was performed (journal timestamp) JH-TIME(i) (STEP 1305). Then, the controller 42 compares the designated timestamp with the extracted journal timestamp JH-TIME(i), and determines whether the designated timestamp is before the journal timestamp JH-TIME(i) (STEP 1306). If the designated timestamp is not before the journal timestamp JH-TIME(i) (“No” in STEP 1306), the controller 42 extracts the next journal timestamp JH-TIME(i=i+1) (STEP 1305), and compares it in the same manner (STEP 1306). If the designated timestamp is before the extracted journal timestamp JH-TIME(i) (“Yes” in STEP 1306), the controller 42 selects the journal timestamp JH-TIME(J=i−1) of the preceding node relative to the journal timestamp JH-TIME(i) currently referred to, and extracts journal data corresponding to any journal timestamps that are after the time indicated by the snapshot timestamp SS-TIME(S) that has been used for the restoration, and up to the now selected journal timestamp JH-TIME(J). The controller 42 sequentially applies the extracted journal data to the data volume 41a, thereby restoring the data volume 41a (STEP 1307). In this way, with regard to the designated timestamp, data is restored using the journal data corresponding to the storage-based journal timestamp. Accordingly, in combination with the above restoration using the snapshot data, high-speed data restoration is realized.
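The journal application of STEP 1305 to STEP 1307 may be sketched as follows. The tuple layout of a journal entry, the dictionary standing in for the data volume 41a, and the function name are hypothetical; the sketch assumes the journal is held oldest first.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# (journal timestamp, logical block address, data): an illustrative layout
Journal = List[Tuple[datetime, int, bytes]]


def replay_journal(volume: Dict[int, bytes], journal: Journal,
                   snapshot_time: datetime, designated: datetime) -> int:
    """Apply entries with snapshot_time < timestamp <= designated, oldest first.

    Returns the number of entries applied.
    """
    applied = 0
    for jh_time, lba, data in journal:   # STEP 1305: extract JH-TIME(i)
        if jh_time <= snapshot_time:
            continue                     # already reflected in the snapshot data
        if designated < jh_time:         # "Yes" in STEP 1306
            break                        # the preceding entry was the last to apply
        volume[lba] = data               # STEP 1307: apply journal data
        applied += 1
    return applied
```

Because only the journal data after the selected snapshot is replayed, the amount of data to apply stays small, which is the source of the speed advantage noted above.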
If the designated timestamp is found in STEP 1301 to be based on the time in the host apparatus, the controller 42 first refers to the snapshot volume list in the volume management table, and extracts one item, i.e., the host-apparatus-based snapshot timestamp SS-HTIME(i) indicating the host-apparatus-based time that the snapshot processing was executed, from a node (STEP 1401 in
If the designated timestamp is before the snapshot timestamp SS-HTIME(i) (“Yes” in STEP 1402), the controller 42 extracts the host-apparatus-based snapshot timestamp SS-HTIME(S=i−1) of the preceding node relative to the snapshot timestamp SS-HTIME(i) currently referred to, and checks whether there are one or more different snapshot timestamps SS-TIME(p) between the above two snapshot timestamps SS-HTIME(i) and SS-HTIME(S) (STEP 1403). The different snapshot timestamp SS-TIME(p) used here means a timestamp indicating a point in time that the snapshot processing was executed based on the time in the storage apparatus. If the storage apparatus 4 executes a snapshot independently from the host apparatus 2 (in other words, not based on requests from the host apparatus 2), or if a snapshot is executed based on a snapshot request including no host apparatus timestamp, the snapshot timestamp SS-TIME(p) will be stored in the snapshot volume list.
If there is no such different snapshot timestamp SS-TIME(p) (“No” in STEP 1403), the data volume 41a is recovered based on the snapshot data corresponding to the above extracted snapshot timestamp SS-HTIME(S) (STEP 1407). On the other hand, if there is such a different snapshot timestamp SS-TIME(p), the controller 42 performs the steps as depicted from STEP 1404 to STEP 1406, to extract the storage-apparatus-based snapshot timestamp SS-TIME(S′) corresponding to the designated host-apparatus-based timestamp.
Specifically, if there are one or more different snapshot timestamps SS-TIME(p) (“Yes” in STEP 1403), the controller 42 obtains the difference time δT between the host-apparatus-based snapshot timestamp SS-HTIME(S) and the storage-apparatus-based snapshot timestamp SS-TIME(S) (STEP 1404). In other words, the controller 42 refers to the list data, i.e., the snapshot timestamps SS-TIME and SS-HTIME, in the same node of the snapshot volume list in the volume management table shown in
Then, the controller 42 extracts a snapshot timestamp SS-TIME(S.offset), which is an offset timestamp obtained by offsetting the storage-apparatus-based snapshot timestamp SS-TIME(S) using the difference time δT (STEP 1405). The controller 42 compares the offset snapshot timestamp SS-TIME(S.offset) with the storage-apparatus-based snapshot timestamp SS-TIME(p), and determines whether the offset snapshot timestamp SS-TIME(S.offset) is after the storage-apparatus-based snapshot timestamp SS-TIME(p) (STEP 1406). If the snapshot timestamp SS-TIME(S.offset) is not after the snapshot timestamp SS-TIME(p), the controller 42 then obtains the next offset snapshot timestamp SS-TIME(S.offset=S.offset+1) (STEP 1405), and compares those timestamps in the same manner (STEP 1406). If the snapshot timestamp SS-TIME(S.offset) is after the snapshot timestamp SS-TIME(p), the controller 42 restores the data volume 41a based on the snapshot data corresponding to the snapshot timestamp SS-TIME(S.offset) currently referred to (STEP 1407).
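One possible reading of STEP 1401 to STEP 1407 is sketched below: the difference time δT is taken from the most recent node holding both timestamps, and storage-initiated snapshots that follow it are compared with the designated host-apparatus-based time after being offset by δT. The class `SnapshotNode`, the function name, and this interpretation of the offset comparison are assumptions and are not asserted to be the exact procedure of the embodiment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class SnapshotNode:
    storage_time: datetime                # SS-TIME
    host_time: Optional[datetime] = None  # SS-HTIME, absent for storage-initiated snapshots


def select_by_host_time(nodes: Sequence[SnapshotNode], designated: datetime) -> int:
    """Return the index of the latest snapshot not after a host-based time.

    nodes must be sorted oldest first.
    """
    chosen = None
    delta = None
    for i, node in enumerate(nodes):
        if node.host_time is not None:
            if designated < node.host_time:                  # "Yes" in STEP 1402
                break
            chosen = i                                       # candidate SS-HTIME(S)
            delta = node.host_time - node.storage_time       # STEP 1404: difference time
        elif delta is not None:
            # STEP 1405-1406: offset the storage-based time, then compare.
            if node.storage_time + delta <= designated:
                chosen = i
    if chosen is None:
        raise LookupError("no snapshot at or before the designated host time")
    return chosen
```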
Further, the controller 42 refers to the journal header area in the journal volume 41b, and extracts the host-apparatus-based journal timestamp JH-HTIME(j) indicating the host-apparatus-based point in time that journaling was performed (STEP 1408). Then, the controller 42 compares the designated timestamp with the extracted journal timestamp JH-HTIME(j), and determines whether the designated timestamp is before the journal timestamp JH-HTIME(j) (STEP 1409). If the designated timestamp is not older than the journal timestamp JH-HTIME(j) (“No” in STEP 1409), the controller 42 extracts the next journal timestamp JH-HTIME(j=j+1) (STEP 1408), and compares those timestamps in the same manner (STEP 1409). If the designated timestamp is older than the journal timestamp JH-HTIME(j) (“Yes” in STEP 1409), the controller 42 extracts the host-apparatus-based journal timestamp JH-HTIME(J=j−1) of the preceding node relative to the current host-apparatus-based journal timestamp JH-HTIME(j), and further checks whether there are one or more different journal timestamps JH-TIME(q) between the above two journal timestamps JH-HTIME(j) and JH-HTIME(J) (STEP 1410). The different journal timestamp used here means a timestamp indicating a point in time that journaling was performed based on the time in the storage apparatus.
If there is no such different journal timestamp JH-TIME(q) (“No” in STEP 1410), the data volume 41a is restored to its past state by sequentially applying to it the journal data corresponding to journal timestamps that are after the time indicated by the snapshot timestamp SS-TIME used above for reflecting data in the data volume 41a, and up to the above-obtained journal timestamp JH-HTIME(J) (STEP 1414).
If there are one or more different journal timestamps JH-TIME(q), the controller 42 performs steps as depicted from STEP 1410 to STEP 1413, to extract the storage-apparatus-based journal timestamp corresponding to the designated host-apparatus-based timestamp.
Specifically, if there are one or more different journal timestamps JH-TIME(q) (“Yes” in STEP 1410), the controller 42 obtains the difference time δT between the preceding host-apparatus-based journal timestamp JH-HTIME(J) and its corresponding storage-apparatus-based journal timestamp JH-TIME(J) (STEP 1411). The controller 42 extracts a journal timestamp JH-TIME(J.offset), which is the offset timestamp obtained by offsetting the storage-apparatus-based journal timestamp JH-TIME(J) using the difference time δT (STEP 1412). Then, the controller 42 compares the offset journal timestamp JH-TIME(J.offset) with the storage-apparatus-based journal timestamp JH-TIME(j), and thereby determines whether the offset journal timestamp JH-TIME(J.offset) is before the storage-apparatus-based journal timestamp JH-TIME(j) (STEP 1413). If the journal timestamp JH-TIME(J.offset) is not before the journal timestamp JH-TIME(j), the controller 42 extracts the next offset journal timestamp JH-TIME(J.offset=J.offset+1), and compares those timestamps in the same manner. If the journal timestamp JH-TIME(J.offset) is older than the journal timestamp JH-TIME(j), the controller 42 restores the data volume 41a, by sequentially applying journal data corresponding to journal timestamps that are after the snapshot timestamp SS-TIME(S) used above for restoring the data volume 41a, and up to the preceding journal timestamp JH-HTIME(J.offset), to the data volume 41a (STEP 1414). By way of this, data as of a host-apparatus-based point in time as intended by the system administrator is restored.
Referring to
The “normal write” time sequence shows points in time when normal write request commands were issued. In
The “restoration point setting” time sequence shows points in time when restoration point setting request commands were issued. As described before, the restoration point setting request commands are restoration management request commands where restoration point setting has been designated. In the restoration point setting request commands, it is possible to optionally designate whether the relevant processing involves snapshot processing or not. In
The “journaling” time sequence shows points in time when journaling was performed in the storage apparatus 4. Journaling is performed together with the write processing in accordance with the normal write request commands, and it is also performed based on the restoration point setting request commands. Specifically, if a restoration point setting request command designates “Without snapshots,” the storage apparatus 4 performs journaling only. In
The “snapshot” time sequence shows points in time when snapshots were created in the storage apparatus 4. In part, snapshots are executed depending on the host apparatus 2, and, in part, snapshots are executed by the storage apparatus 4 independently of the host apparatus 2. The snapshots dependent on the host apparatus 2 are those executed in accordance with the restoration point setting request commands. In
Based on the above, if a user at the host apparatus 2 wishes to restore data on the data volume 41a as of the time of 10:40, host-apparatus-based time, the user designates the host-apparatus-based timestamp of 10:40, using a dialogue box provided by the storage manager.
In accordance with that user's instructions, the storage manager generates a restoration request command including the host-apparatus-based timestamp of “10:40” and transmits the command to the storage apparatus 4.
In response to the restoration request command, the controller 42 in the storage apparatus 4 refers to the snapshot volume list, searches for the host-apparatus-based snapshot timestamps sequentially from the oldest node, and, as a consequence of this, extracts that closest to the time of 10:40. In the example shown in
The controller 42 next refers to the journal header area in the journal volume, and searches for the host-apparatus-based journal timestamps, sequentially from the oldest node. In this example, however, there is no journal timestamp including any host apparatus timestamp until the journal timestamp indicating the time of 11:00. Thus, referring to the list data storing the host apparatus timestamp and the storage apparatus timestamp, associated one-to-one with each other, the controller 42 obtains a difference time δT between the time in the host apparatus and the time in the storage apparatus. This shows that the time in the host apparatus is 30 minutes ahead of the time in the storage apparatus. The controller 42 searches for any journal timestamp that has been offset from the storage-apparatus-based journal timestamp via the addition of the 30-minute difference time, and that indicates a time not after 10:40. In this example, the journal timestamp showing the time of 10:40, host-apparatus-based time (10:10, storage-apparatus-based time) is extracted.
Subsequently, the controller 42 applies the snapshot data corresponding to the above-extracted closest preceding snapshot timestamp (i.e., the snapshot data at 10:30, host-apparatus-based time) to the data volume 41a, and further applies the journal data corresponding to the above-extracted journal timestamp (i.e., the journal data at 10:10, storage-apparatus-based time) to the data volume 41a to which the above snapshot data had been already applied, thereby obtaining the data volume 41a of the past as intended. By way of this, data on the data volume 41a is restored based on the time in the host apparatus.
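The arithmetic of this example can be checked with a short sketch. It assumes, as the 10:10 and 10:40 pair above indicates, that adding the 30-minute difference time to a storage-apparatus-based time yields the corresponding host-apparatus-based time. The calendar date and the journal times of 10:05 and 10:20 are hypothetical values added only to make the search visible.

```python
from datetime import datetime, timedelta

DELTA_T = timedelta(minutes=30)  # host-based time minus storage-based time in the example


def to_host_time(storage_time: datetime) -> datetime:
    """Offset a storage-apparatus-based time into host-apparatus-based time."""
    return storage_time + DELTA_T


day = datetime(2006, 6, 2)  # arbitrary date, not taken from the example
designated_host = day.replace(hour=10, minute=40)

# Storage-apparatus-based journal timestamps; only 10:10 comes from the example.
journal_storage_times = [
    day.replace(hour=10, minute=5),
    day.replace(hour=10, minute=10),
    day.replace(hour=10, minute=20),
]

# Keep journal timestamps whose offset time is not after the designated time,
# then take the latest of them.
eligible = [t for t in journal_storage_times if to_host_time(t) <= designated_host]
selected = max(eligible)
```

Here `selected` is the 10:10 storage-apparatus-based journal timestamp, whose offset value equals the designated host-apparatus-based time of 10:40.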
Several advantages result from a storage system of the present invention, some of which have been discussed above.
In the storage system 1 according to this embodiment, when the host apparatus 2 accesses the storage apparatus 4, the host apparatus 2 transmits a command including an internally obtained host apparatus timestamp to the storage apparatus 4, and accordingly the storage apparatus 4 stores the transmitted host apparatus timestamp in a specific area. If the storage apparatus 4 receives a restoration request designating a time in the host apparatus from the host apparatus 2, the storage apparatus 4 restores data using the stored host apparatus timestamps. Thus, according to this embodiment, system administrators can restore data based on the time in the host apparatus.
Further, in the storage system 1 according to this embodiment, each restoration request designates whether data should be restored based on the time in the host apparatus or based on the time in the storage apparatus. Thus, according to this embodiment, the system administrators can restore data as of a proper point in time according to the reasons and content of the relevant failure.
Moreover, in the storage system 1 according to this embodiment, where a restoration request is made based on the time in the host apparatus, and even if the storage apparatus 4 does not store the corresponding host apparatus timestamp, the storage apparatus 4 can restore data to a point in time as close as possible to the time in the host apparatus designated in the restoration request, considering the time difference between the times in the host apparatus and in the storage apparatus.
As described above, according to this embodiment, the system administrators can restore data as expected.
The above-described embodiment is just an example for explaining the invention, and is not intended to limit the invention only to that embodiment. The invention can be embodied in other specific forms, without departing from the spirit of the invention. For example, in the above embodiment, the processing has been explained as being executed in sequential steps, but the invention is not limited to that. As long as no contradiction in operation is generated, the order of those steps may be changed, or some steps may be executed in parallel.
Further, in the storage system according to this embodiment, the storage manager is, as a management program, designed to issue various commands to the storage apparatus 4, but the invention is not limited to this. For example, an application program may be designed to issue various commands to the storage apparatus 4, by incorporating all or a part of the functions of the storage manager into the application program.
In addition, in the storage system according to this embodiment, the command management volume is established in the storage unit as one of the backup volumes, but the invention is not limited to this. For example, the command management volume may be established in the local memory in the controller.
The invention can be widely applied to storage apparatuses storing computer-processed data.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2006-154935 | Jun 2006 | JP | national |