Various embodiments of the present invention are hereinafter described in conjunction with the appended drawings:
It is to be noted, however, that the appended drawings illustrate only example embodiments of the invention, and are therefore not considered limiting of its scope, for the invention may admit to other equally effective embodiments.
Various embodiments of this invention, described further below, are directed to a system and method for virtual tape systems which keeps track of such scheduled tasks and utilizes this information to improve the prediction.
In the following, the removable storage medium may be assumed to be magnetic tape storage medium. The invention as it is, however, may be applied also for other kinds of removable storage media, because, as a matter of fact, the nature of how the data is actually stored, be that random or sequential access, or be that magnetic, optical, holographic or any other physical way of storing data, is not decisive for the present invention.
With general reference to the figures and with special reference now to
Each row in table 100 allows deriving a rule. The rule essentially represents the decision to mount a physical tape volume or not.
The administrator of the virtual tape system can associate each scheduled task with the application name 104 of the application which generates the workload. The example of table 100 in
Today's virtual tape systems provide different interfaces for library management (for example, SCSI Media Changer, IBM 3494). Table 100 records the initiator addresses 106 for library management accesses and the used protocol.
The task time window 110 defines the time frame, when the rule described by this row of table 100 is valid. Various formats are possible. One embodiment uses a crontab-like style to specify valid time frames. The crontab command, found in Unix and Unix-like operating systems, is used to schedule commands to be executed periodically. It reads a series of commands and collects them into a file also known as a “crontab”, which is later read and whose instructions are carried out.
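As an illustration of the crontab-like style, the following minimal Python sketch checks whether a point in time falls inside a task time window. The function names `field_matches` and `window_is_valid` are hypothetical, and the simplification that all five fields (minute, hour, day of month, month, day of week) must match is an assumption made for this example; it is not taken from the described embodiment.

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """Return True if value satisfies one crontab-style field.

    Supported forms (assumed subset): '*', single numbers,
    comma-separated lists, and inclusive ranges such as '1-5'.
    """
    if spec == "*":
        return True
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def window_is_valid(window: str, when: datetime) -> bool:
    """Check a five-field window: minute hour day-of-month month day-of-week."""
    minute, hour, dom, month, dow = window.split()
    # crontab counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        and field_matches(dow, cron_dow)
    )
```

Under these assumptions, a window of `"* 20-23 * * 1-5"` would describe a rule that is valid on weekday evenings between 20:00 and 23:59.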
The priority 112 of a rule determines which rule is selected if multiple rules (rows) are valid at a given point in time. For instance, in the scenario shown in table 100 of
The workload type 114 describes the anticipated need for the mount of a physical volume when the application requests to mount a logical volume. Preferably, the following workload types are defined:
read defines a rule where it is anticipated that a mount request to a logical volume is followed by a subsequent read access. For instance, row 9 in Table 100,
write defines a rule where it is anticipated that a mount request to a logical volume is followed by a subsequent write access. For instance, row 6 in Table 100,
read-write (read first) and write-read (write first) define rules where it is anticipated that the application mounts two logical volumes to copy data from one logical volume to another logical volume, for instance, when IBM Tivoli Storage Manager reorganizes the data on tape via space reclamation.
The rule read-write (read first) models the behaviour that the tape using application mounts the input volume first for a read operation and after that the output volume for a write operation.
The rule write-read (write first) models the behaviour that the tape using application mounts the output volume first for a write operation and after that the input volume for a read operation.
Immediate mount defines a rule where the corresponding physical volume should always be mounted when the application mounts a logical volume. For instance, row 8 in Table 100,
Deferred mount defines a rule where the mount of the corresponding physical volume should always be deferred until the first access to data. For instance, row 7 in Table 100,
The interval 116 is only applicable for read-write (read first) and write-read (write first) workload. The interval specifies if a second mount request is considered to be adjacent to a first mount request or not.
The last mount 118 is only applicable for read-write (read first) and write-read (write first) workload and is represented by a time stamp. This time stamp records when the respective rule was last applied. In conjunction with the interval 116, the last mount 118 helps to identify whether an incoming mount request is to be treated as a first mount request or as a second mount request. A mount request is treated as a second mount request if it is adjacent to the corresponding first mount request, as determined by the interval field 116.
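The interplay of the interval 116 and the last mount 118 may be sketched as follows. This is a minimal illustration only: the function name `is_second_mount`, the use of `None` for a rule that has never been applied, and the inclusive comparison at the interval boundary are assumptions of this example.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_second_mount(
    last_mount: Optional[datetime], interval: timedelta, now: datetime
) -> bool:
    """Return True if a mount request at `now` is adjacent to the last one.

    `last_mount` corresponds to field 118 and `interval` to field 116.
    A rule that was never applied (None) always yields a first mount.
    """
    if last_mount is None:
        return False
    elapsed = now - last_mount
    # Adjacent means: not before the last mount and within the interval.
    return timedelta(0) <= elapsed <= interval
```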
The medium 120 defines rules for specific cartridge media types. For instance, row 10 of Table 100,
The tape medium serial number or volume serial (Vol Ser) number may be associated with a range, abbreviated as “volser” range 122. It defines rules for specific tape media serial number ranges. For instance, row 12 of Table 100,
With additional reference now to
In step 404 the eligible rows of Table 100 are determined such that they match the conditions specified by the values of the columns of the Media Changer Address 106, Task Window 110, medium type 120, and medium volser range 122.
In step 406 the controller logic selects the rule with the highest priority. If multiple rows have the same priority 112, a single rule is selected by evaluating further secondary field values. Which secondary fields are evaluated can be configured by the administrator beforehand.
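Steps 404 and 406 can be pictured with the following hedged Python sketch. The `Rule` class, its field names, the use of `None` as a wildcard, the reduction of the task window to a range of hours, the convention that a larger number means a higher priority, and the tie-break on the lowest row number are all assumptions chosen for illustration; the description above leaves these details open.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rule:
    row: int
    changer_address: Optional[str]           # field 106, None = any address
    hours: Optional[Tuple[int, int]]         # simplified window 110: [start, end)
    medium: Optional[str]                    # field 120, None = any medium
    volser_range: Optional[Tuple[str, str]]  # field 122, inclusive bounds
    priority: int                            # field 112, larger = higher (assumed)
    workload: str                            # field 114


def eligible(rule: Rule, address: str, hour: int, medium: str, volser: str) -> bool:
    """Step 404: a row is eligible if every non-wildcard column matches."""
    if rule.changer_address is not None and rule.changer_address != address:
        return False
    if rule.hours is not None and not (rule.hours[0] <= hour < rule.hours[1]):
        return False
    if rule.medium is not None and rule.medium != medium:
        return False
    if rule.volser_range is not None:
        low, high = rule.volser_range
        if not (low <= volser <= high):
            return False
    return True


def select_rule(
    rules: List[Rule], address: str, hour: int, medium: str, volser: str
) -> Optional[Rule]:
    """Step 406: pick the highest priority; ties go to the lowest row (assumed)."""
    candidates = [r for r in rules if eligible(r, address, hour, medium, volser)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, -r.row))
```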
In step 408 the controller logic decides if the workload 114 of the rule which was selected in step 406 indicates ‘immediate’. In the YES case it schedules, step 410, the mount of the respective physical volume immediately and exits this procedure. In the NO case of step 408 the process continues to step 412.
In step 412 the controller logic decides if the workload 114 of the rule which was selected in step 406 indicates ‘deferred’. In the YES case it continues to step 414, where it does not schedule the mount of the respective physical volume, and exits this procedure. The rationale is that no physical drive is occupied until the first I/O to the tape volume is received from the host, which conserves valuable physical resources. In the NO case of step 412 the process flows to step 416.
In step 416 the controller logic decides if the workload 114 of the rule which was selected in step 406 indicates ‘read’. In the YES case it is determined in step 418, if the logical mount request can be satisfied without a physical mount of the corresponding physical volume, for instance, due to the fact that a copy of the logical volume still resides in the disk cache. Then it decides to exit this procedure in step 420, if no mount request is required.
Otherwise, if a mount request is required in the NO branch of decision 418, it decides to immediately schedule a mount request, step 422 and to exit this procedure.
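The decisions of steps 408 to 422 can be summarized in a short Python sketch. The function name `decide_mount`, the string labels for the workload types, and the use of `None` to signal that later steps must handle the request are assumptions of this example, not part of the described embodiment.

```python
from typing import Optional


def decide_mount(workload: str, in_disk_cache: bool) -> Optional[bool]:
    """Return True to schedule the physical mount immediately, False to
    schedule no physical mount, or None if the workload type is handled
    by later steps of the procedure.
    """
    if workload == "immediate":   # decision 408, step 410
        return True
    if workload == "deferred":    # decision 412, step 414
        return False
    if workload == "read":        # decisions 416 and 418
        # A copy in the disk cache makes the physical mount unnecessary
        # (step 420); otherwise the mount is scheduled at once (step 422).
        return not in_disk_cache
    return None
```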
In the NO case of step 416 the process flows to step 424. In step 424 the controller logic decides if the workload 114 of the rule which was selected in step 406 indicates ‘write’. In the YES case step 426 is executed and it does not mount the respective physical volume and exits this procedure in step 420. The data will be written to the disk cache. In the NO case of step 424 the process flows to step 428 of
In step 428 of
In step 430 the control logic updates the last mount time (118) with the current time.
Decision 431 uses the result of step 429 to check if this is the first mount. In the YES case of decision 431 a decision 432 determines if the mount request can be satisfied without a mount of the corresponding physical volume, for instance, due to the fact again, that a copy of the logical volume still resides in the disk cache. If so, the request is serviced from disk cache and it is decided to exit this procedure in step 449. Otherwise, step 438, the request is immediately scheduled as in step 422 above and this procedure exits in step 449.
In the NO case of decision 431 a second mount request within the interval (116) has been determined in step 429. The control logic flows to step 440 and updates the last mount time (118) with a time stamp which refers to a point in time before the interval; thus the next time the rule which was selected in step 406 is evaluated, step 429 again determines a first mount request.
Then the control flow continues with step 442: no physical mount request is scheduled. Instead, a write access is anticipated which will be written to the disk cache. From step 442 it exits this procedure in step 449.
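The read-write (read first) handling of steps 429 to 449 may be sketched as below. The class `RuleState`, the function `handle_read_first`, the exact first-mount test attributed to step 429, and the choice of resetting the time stamp to one second before the interval are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RuleState:
    interval: timedelta                    # field 116
    last_mount: Optional[datetime] = None  # field 118


def handle_read_first(state: RuleState, now: datetime, in_disk_cache: bool) -> bool:
    """Return True if a physical mount should be scheduled immediately."""
    # Step 429 (assumed form): first mount unless the rule was applied
    # within the interval.
    first = state.last_mount is None or now - state.last_mount > state.interval
    # Step 430: record the current time as the last mount time.
    state.last_mount = now
    if first:
        # Decision 432: a read is anticipated, so a physical mount is
        # needed only if no copy resides in the disk cache (step 438).
        return not in_disk_cache
    # Step 440: move the time stamp to before the interval so that the
    # next request is classified as a first mount again.
    state.last_mount = now - state.interval - timedelta(seconds=1)
    # Step 442: a write is anticipated and goes to the disk cache.
    return False
```

Resetting the time stamp in step 440 keeps the state to a single field per rule; the alternate embodiment that counts third and later requests would need additional meta data.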
In an alternate embodiment of step 440 the mount time is not reset. Instead, additional meta data is used to determine if this is a third, fourth, or subsequent mount request to the same rule of table 100 within a certain time interval. In that alternate embodiment of this invention, the administrator can configure the behaviour for the next step 442.
In step 460 of
In step 462 the control logic updates the last mount time 118 with the current time.
Decision 463 uses the result of step 461 to check if this is the first mount. In the YES case of decision 463 it does not schedule the mount of the physical volume and exits in step 480. Instead, a write access is anticipated which will be written to the disk cache.
In the NO case of decision 463, a second mount request within the interval (116) is determined. The control logic flows to step 466 and updates the last mount time (118) with a time stamp which refers to a point in time before the interval; thus the next time the rule which was selected in step 406 is evaluated, step 461 again determines a first mount request. From step 466 the process flows to step 468.
In an alternate embodiment of step 466 the mount time is not reset and additional meta data is used to determine if this is a third, fourth, or subsequent mount request to the same rule of table 100 within a certain time interval. In that alternate embodiment of this invention, the administrator can configure the behaviour for the next step 468.
Then the control logic determines in a decision 468, if the mount request can be satisfied without a mount of the corresponding physical volume, for instance, due to the fact again, that a copy of the logical volume still resides in the disk cache. If so, the request is serviced from disk cache, step 470, and it is decided to exit this procedure, step 480. Otherwise the request is immediately scheduled, 472, as in step 422 above. Then it exits this procedure in step 480.
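The write-read (write first) case mirrors the read-write (read first) case: only the order of the anticipated accesses is swapped. The following table-driven Python sketch makes that symmetry explicit. The names `ANTICIPATED_ACCESS` and `schedule_physical_mount` and the string labels are hypothetical, and whether a request is a first mount is assumed to have been determined beforehand from the interval 116 and the last mount 118.

```python
# Anticipated access for (workload type, is this the first mount?).
ANTICIPATED_ACCESS = {
    ("read-write", True): "read",    # input volume is mounted first
    ("read-write", False): "write",  # output volume follows
    ("write-read", True): "write",   # output volume is mounted first
    ("write-read", False): "read",   # input volume follows
}


def schedule_physical_mount(
    workload: str, is_first_mount: bool, in_disk_cache: bool
) -> bool:
    """Return True if the physical volume should be mounted immediately."""
    access = ANTICIPATED_ACCESS[(workload, is_first_mount)]
    if access == "write":
        # Anticipated writes go to the disk cache; no physical mount.
        return False
    # Anticipated reads need the physical volume unless a copy of the
    # logical volume still resides in the disk cache.
    return not in_disk_cache
```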
In the NO case of decision 460, further cases could be appended if ever necessary. If no conditions remain to be evaluated, the procedure is exited.
A second preferred embodiment uses the basic structural and control flow elements as does the preceding one, presented in
Applications which use tape, for instance backup systems, typically process the following three steps when they access data on a virtual or a physical tape volume:
According to the preceding embodiment, the tape emulation system decides during Step 1 of the algorithm summary shortly above whether to mount the respective physical volume or not. But, during Step 2 more information is available; thus upcoming I/O can be predicted more precisely. This is exploited by the second embodiment, which extends the table of
The preceding embodiment predicts upcoming I/O requests only during Step 1 of the procedure above. The second embodiment evaluates the new field of the Host I/O Address 108, which allows the prediction of upcoming I/O requests to be recalculated during Step 2 of the algorithm summary. Since the recalculation during Step 2 can take into account more information than the calculation during Step 1, it can predict the upcoming workload more precisely than the initial calculation.
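The two-pass prediction can be illustrated with a simplified Python sketch. The `Rule` fields shown here, the function `predict`, the treatment of `None` as "any host", and the convention that a larger number means a higher priority are assumptions of this example; the remaining columns of table 100 are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    changer_address: str            # field 106
    host_io_address: Optional[str]  # field 108, None = any host
    priority: int                   # field 112, larger = higher (assumed)
    workload: str                   # field 114


def predict(
    rules: List[Rule], changer_address: str, host_io_address: Optional[str] = None
) -> Optional[str]:
    """Return the workload type of the best matching rule, or None.

    During Step 1 the host I/O address is not yet known and is passed
    as None; during Step 2 it narrows the set of eligible rules.
    """
    candidates = [
        r
        for r in rules
        if r.changer_address == changer_address
        and (host_io_address is None or r.host_io_address in (None, host_io_address))
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority).workload
```

Calling `predict` once without and once with the host I/O address corresponds to the initial calculation and the recalculation, respectively.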
The method according to the second embodiment executes the algorithm which is introduced above in
After that the method uses the same steps as introduced in the first embodiment.
As should be apparent from the above description, steps 2) (calculating upcoming I/O workload based on said meta data) and 3) (deciding based on said calculation, if or when an incoming mount request for a logical tape volume will be serviced by mounting a physical tape volume) are iterated after the address 108 of the device initiating the input/output (I/O) command has been evaluated. The distinction between the library management initiator 106 and the host I/O initiator 108 helps to describe the task windows more precisely. For instance, with the help of the host I/O initiator 108 the tape emulating system can differentiate during the label verification (Step 2 of the summary algorithm described above) whether the logical tape is accessed by a server application 12 or by a Storage Agent, which is often implemented for so-called LAN-free backup.
Various options are available to configure the rows in table 100. In one embodiment the rows are updated manually. In one embodiment the tape management system extracts the scheduled tasks from a tape using application and updates table 100 automatically. In one embodiment the tape emulating system analyzes the historic data and statistics of past mount requests. The preferred method is to use the statistics of the last six weeks and to correlate the mount activity of each day of the week (Monday, Tuesday, Wednesday, . . . ) separately, because tape using applications very often comprise daily schedules and weekly schedules. In one embodiment the previously described methods can be combined.
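The per-weekday correlation of historic mount activity might be computed along the following lines. This is a hedged sketch: the function name `mounts_per_weekday`, the representation of the history as a list of time stamps, and the choice of an average count per week as the statistic are assumptions of this example.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable


def mounts_per_weekday(
    mount_times: Iterable[datetime], now: datetime, weeks: int = 6
) -> Dict[int, float]:
    """Average number of mounts per week for each weekday (0 = Monday).

    Only mount requests from the last `weeks` weeks are considered, and
    each day of the week is evaluated separately.
    """
    cutoff = now - timedelta(weeks=weeks)
    counts = Counter(t.weekday() for t in mount_times if cutoff <= t <= now)
    return {day: counts.get(day, 0) / weeks for day in range(7)}
```

Weekdays with a consistently high average could then be proposed as candidate task windows for new rows of table 100.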
The present invention can be realized in hardware, software, or a combination of hardware and software. A cache controller of a removable storage medium controller, for example of a virtual tape library system according to the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
This invention can equally be applied to other storage technologies of removable physical storage media such as holographic storage, optical disk storage, magnetic disk storage, optical tape, or solid-state memory such as a memory stick, in addition to magnetic tape storage.
The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system, is able to carry out these methods.
Computer program means or computer program in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following
Furthermore, the method described herein may take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium may be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-RW), and DVD.
Number | Date | Country | Kind |
---|---|---|---|
06118643.3 | Aug 2006 | DE | national |