This invention relates to systems and methods for balancing performance of heterogeneous tape drives and tape cartridges.
Virtual tape systems, such as the IBM TS7700 and Oracle VTL, and disk storage systems incorporating a file system, such as the IBM Spectrum Scale, support functionality referred to as Hierarchical Storage Management (hereinafter simply referred to as HSM). HSM is a storage technology that is configured to efficiently manage multiple tiers of storage media by migrating or placing more frequently accessed data on storage media that is expensive but supports higher access speeds, while migrating or placing less frequently accessed data on inexpensive storage media with lower access speeds. For example, the IBM TS7700 is designed to manage two tiers of storage, namely a disk storage tier and a tape storage tier, such that more frequently accessed data is placed on the disk storage tier and less frequently accessed data is placed on the tape storage tier.
The IBM TS7700 manages virtual tapes as files and supports virtual tapes with sizes up to 25 Gigabytes. With the possibility of such large virtual tapes, it can take a significant amount of time to move virtual tape files between a disk storage tier and a tape storage tier. As a countermeasure to such time-consuming data transfers, large files may be “striped” across multiple tape cartridges to improve I/O performance. Using this technique, a file is divided into multiple blocks and these blocks are transferred to or from the tape cartridges in parallel. While similar techniques are used on arrays of disk drives to increase data access rates, these techniques are not as simple or efficient to implement on append-only storage media such as tape.
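By way of illustration only, the striping technique described above may be sketched as follows. The sketch divides a file into N nearly equal blocks and appends each block to its own cartridge in parallel. The names `stripe` and `write_parallel` are hypothetical, and each tape cartridge is modeled as an in-memory, append-only Python list rather than a real device.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data, n):
    """Divide data into n contiguous blocks whose lengths differ by at most one byte."""
    base, extra = divmod(len(data), n)
    blocks, start = [], 0
    for i in range(n):
        end = start + base + (1 if i < extra else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def write_parallel(blocks, cartridges):
    """Append one block to each cartridge concurrently.

    Each cartridge is a plain list standing in for append-only media.
    Returns the position of the new block on each cartridge.
    """
    def append(pair):
        block, cartridge = pair
        cartridge.append(block)      # append-only: never overwrite in place
        return len(cartridge) - 1

    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        return list(pool.map(append, zip(blocks, cartridges)))


data = bytes(range(10))
cartridges = [[], [], []]            # three hypothetical cartridges
positions = write_parallel(stripe(data, 3), cartridges)

# Reading the blocks back in cartridge order reconstructs the file.
restored = b"".join(c[p] for c, p in zip(cartridges, positions))
```

Equal block sizes are the simplest choice, but they allow the slowest drive and cartridge combination to determine when the whole write completes.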
In view of the foregoing, what are needed are systems and methods to stripe data across multiple tape cartridges in a way that is efficient and takes into account the unique characteristics of append-only storage media. Ideally, such systems and methods may be used to efficiently stripe data across tape using heterogeneous tape cartridges and tape drives.
The invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available systems and methods. Accordingly, systems and methods are disclosed to balance performance of heterogeneous tape drives and/or tape cartridges. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
Consistent with the foregoing, a method for balancing performance of heterogeneous tape drives and/or tape cartridges is disclosed. In one embodiment, such a method includes identifying a plurality of tape cartridges to which data is to be written, and identifying a plurality of tape drives to write the data to the tape cartridges. The method divides the data into a plurality of blocks, where each block is configured to be written to one of the plurality of tape cartridges by one of the plurality of tape drives. The method further sizes each block in accordance with an amount of time required to write the block to the corresponding tape cartridge by the corresponding tape drive. In certain embodiments, the blocks are sized so that they take substantially the same amount of time to be written to their corresponding tape cartridges. Once the blocks are sized appropriately, the method writes the blocks to their corresponding tape cartridges.
A corresponding system and computer program product are also disclosed and claimed herein.
In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through use of the accompanying drawings, in which:
It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the invention, as represented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
The present invention may be embodied as a system, method, and/or computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
The computer readable program instructions may execute entirely on a user's computer, partly on a user's computer, as a stand-alone software package, partly on a user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, a remote computer may be connected to a user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring to
As shown, the network environment 100 includes one or more computers 102, 106 interconnected by a network 104. The network 104 may include, for example, a local-area-network (LAN) 104, a wide-area-network (WAN) 104, the Internet 104, an intranet 104, or the like. In certain embodiments, the computers 102, 106 may include both client computers 102 and server computers 106 (also referred to herein as “host systems” 106). In general, the client computers 102 initiate communication sessions, whereas the server computers 106 wait for requests from the client computers 102. In certain embodiments, the computers 102 and/or servers 106 may connect to one or more internal or external direct-attached storage systems 112 (e.g., arrays of hard-disk drives, solid-state drives, tape drives, etc.). These computers 102, 106 and direct-attached storage systems 112 may communicate using protocols such as ATA, SATA, SCSI, SAS, Fibre Channel, or the like.
The network environment 100 may, in certain embodiments, include a storage network 108 behind the servers 106, such as a storage-area-network (SAN) 108 or a LAN 108 (e.g., when using network-attached storage). This network 108 may connect the servers 106 to one or more storage systems 110, such as arrays 110a of hard-disk drives or solid-state drives, tape libraries 110b or virtual tape systems 110b, individual hard-disk drives 110c or solid-state drives 110c, tape drives 110d, CD-ROM libraries, or the like. To access a storage system 110, a host system 106 may communicate over physical connections from one or more ports on the host 106 to one or more ports on the storage system 110. A connection may be through a switch, fabric, direct connection, or the like. In certain embodiments, the servers 106 and storage systems 110 may communicate using a networking standard such as Fibre Channel (FC).
Referring to
When data is striped across multiple disk drives having different data transfer rates, the disk drive whose data transfer rate is the lowest becomes a bottleneck in completing the write. In this context, the simplest approach to reduce the amount of time required for writing is to adjust the block size written to each of the disk drives in accordance with the data transfer rates of the disk drives. However, additional factors may need to be considered with append-only storage media such as tape. For example, when striping data across multiple tape cartridges 204, the amount of time required to move a head of a tape drive 202 to an appropriate location on tape and the time required to mount a tape cartridge 204 in a tape drive 202 may also need to be considered. Also relevant is the fact that different tape drives 202 may have different data transfer rates depending on the type and format of the tape cartridge 204 that is mounted in the tape drive 202.
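The bottleneck effect for the disk case may be illustrated with a minimal sketch, assuming constant transfer rates and no per-device overhead. The function names and the numeric rates are hypothetical and serve only to compare equal block sizes against block sizes proportional to transfer rate.

```python
def completion_time(sizes, rates):
    """A striped write completes only when the slowest device completes."""
    return max(size / rate for size, rate in zip(sizes, rates))


def equal_sizes(total, n):
    """Give every device the same share of the data."""
    return [total / n] * n


def rate_proportional_sizes(total, rates):
    """Give each device a share proportional to its transfer rate."""
    combined = sum(rates)
    return [total * rate / combined for rate in rates]


rates = [100.0, 200.0, 300.0]   # hypothetical transfer rates, MB/s
total = 6000.0                  # hypothetical file size, MB

t_equal = completion_time(equal_sizes(total, len(rates)), rates)
t_proportional = completion_time(rate_proportional_sizes(total, rates), rates)
```

With these assumed figures, equal blocks complete in 20 seconds because the 100 MB/s device must write 2000 MB, whereas rate-proportional blocks complete in 10 seconds on every device. This sketch deliberately omits mount time and head positioning time, which is why it is not sufficient for tape.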
Referring to
Referring to
Referring to
As shown in
Since the amount of time required to write a block 300 is ideally the same for every tape cartridge 204 (i.e., tape cartridges n=1, 2, . . . , N), a system of equations may be used to solve for the block size for each tape cartridge 204. Specifically, the system of equations shown in
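One possible form of such a system is sketched below. It is an assumption and not necessarily the system of equations of the Figures: the write time for cartridge n is modeled as a fixed overhead o_n (mount time plus head positioning time) plus s_n / r_n, where s_n is the block size and r_n is the transfer rate of the drive and cartridge combination. Requiring every write time to equal a common finish time T, with the block sizes summing to the file size S, gives T = (S + Σ r_n·o_n) / Σ r_n and s_n = r_n·(T − o_n). The function name `size_blocks` is hypothetical.

```python
def size_blocks(total_size, rates, overheads):
    """Size blocks so that every participating drive finishes at the same time.

    Assumed model: time_n = overheads[n] + sizes[n] / rates[n].
    A drive whose overhead alone reaches the common finish time receives
    no data, and the system is solved again without it.
    """
    if total_size <= 0 or not rates or len(rates) != len(overheads):
        raise ValueError("need a positive size and matching, non-empty inputs")

    active = list(range(len(rates)))
    while True:
        finish = (total_size + sum(rates[i] * overheads[i] for i in active)) \
                 / sum(rates[i] for i in active)
        late = [i for i in active if overheads[i] >= finish]
        if not late:
            break
        active = [i for i in active if i not in late]

    sizes = [0.0] * len(rates)
    for i in active:
        sizes[i] = rates[i] * (finish - overheads[i])
    return sizes, finish


# Hypothetical figures: rates in MB/s, overheads in seconds, file size in MB.
sizes, finish = size_blocks(6000.0, [100.0, 200.0, 300.0], [10.0, 20.0, 30.0])
```

With these assumed figures, the common finish time is about 33.3 seconds and the block sizes are roughly 2333 MB, 2667 MB, and 1000 MB. The fastest drive receives the smallest block because its assumed overhead is the largest, which shows why transfer rate alone is not a sufficient basis for sizing blocks on tape.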
Referring to
Referring to
The method 800 then selects 806 N tape drives 202 with which to write the file 200 to the N tape cartridges 204. As mentioned above, this may in certain embodiments be all the functioning tape drives 202 in the virtual tape system 110b. The method 800 then determines 808 the sizes of the blocks to be written to the tape cartridges 204. In certain embodiments, this may be accomplished using a system of equations such as that illustrated in
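Once block sizes have been determined, they generally need to be converted into whole byte counts before the file can be divided. The following is a minimal sketch of one way to do so, assuming the computed sizes are expressed in bytes and sum to approximately the file length. The helper names `to_byte_counts` and `slice_blocks` are hypothetical, and the largest-remainder rounding shown here is an illustrative choice rather than a technique taken from the description.

```python
import math


def to_byte_counts(sizes, total_bytes):
    """Round fractional block sizes to integers that sum exactly to total_bytes."""
    counts = [math.floor(s) for s in sizes]
    shortfall = total_bytes - sum(counts)
    if not 0 <= shortfall <= len(sizes):
        raise ValueError("sizes do not sum to approximately total_bytes")
    # Hand the leftover bytes to the blocks with the largest fractional parts.
    order = sorted(range(len(sizes)),
                   key=lambda i: sizes[i] - counts[i], reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts


def slice_blocks(data, counts):
    """Cut data into consecutive blocks of the given byte counts."""
    blocks, start = [], 0
    for count in counts:
        blocks.append(data[start:start + count])
        start += count
    return blocks


data = bytes(6000)                                    # hypothetical 6000-byte file
counts = to_byte_counts([2333.333, 2666.667, 1000.0], len(data))
blocks = slice_blocks(data, counts)
```

Each resulting block would then be written to its corresponding tape cartridge by its corresponding tape drive.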
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.