1. Field of the Invention
The present invention relates generally to storage devices and methods for storing and retrieving computer data, and more particularly to systems and methods for use in emulated tape drive storage systems.
2. Description of the Related Art
With the increasing popularity of Internet commerce and network centric computing, businesses and other entities are becoming more and more reliant on large amounts of information. Protecting critical data from loss due to systems failures, virus attacks, and the like is therefore of primary importance.
Tape drives have long been a choice for storing archival back-up data in information systems. Historically, many such tape drives have used data compression to maximize the amount of data that can be stored on the tape. Tape, however, is a relatively slow and inefficient storage medium compared to hard drives or disks. Consequently, emulated “tape” drives that use arrays of hard drives or disks have become more popular. For example, a hard drive may appear to a host computer or storage system as a storage tape device, where data is stored, organized, and retrieved as if the hard drive were a tape storage device. The actual data may be stored in any of a variety of fashions, but it will appear to the host processor and the applications running on it that the data is stored on a physical tape storage device. For instance, the data will be stored and retrieved as a long serial sequence of data from the storage system, similar to that of a physical tape storage device.
Emulated tape drive storage systems often rely on software-based data compression techniques to enable the storage of more data. Although such compression techniques may increase the capacity of the drives and assist in emulating a tape drive capacity, the software compression techniques generally decrease the performance or speed of the input/output (I/O) operations of the storage devices because of the delay incurred in serially compressing the data. Therefore, software compression techniques are generally utilized, or turned “on,” only when emulating tape drives and turned “off” otherwise.
Conventional storage systems that may emulate tape storage systems generally include high-speed servers having a Redundant Array of Independent Disks (RAID). Additionally, such systems may perform data protection features such as Error Correction Codes (ECCs) or Cyclic Redundancy Checks (CRCs) through software-based solutions. Such software-based systems generally reduce performance and speed as well.
It is desired that emulated storage tape systems have compression comparable to tape drives in order to approximately double (on average) the storage capacity of the systems while also increasing their performance. It is further desired for inexpensive RAID (either ATA or SATA) systems to have higher reliability for use in storage tape systems.
According to one aspect of the present invention an exemplary storage system for storing data from a host system and emulating a storage device is described. The storage system may include a compression device associated with a controller and at least one storage device where the controller is adapted to receive a sequence of data, divide the sequence of data into two or more blocks of data, and compress at least two of the two or more blocks in parallel. The system may further create an index associated with the blocks of data to output the data as a continuous stream of data. The exemplary compression device may provide for fibre channel, Ethernet, iSCSI, and other host bus interfaces as well as hardware data compression, ECC, CRC, and data encryption. In one example, dedicated hardware is included to perform parallel compression of the blocks, and reconfigurable hardware is included to perform various data protection methods, such as error correction methods and the like.
According to another aspect of the present invention, an exemplary method for storing data and emulating a storage tape device is described. The method may include receiving a stream of data, dividing the stream of data into two or more blocks, compressing at least two of the two or more blocks in parallel, and indexing the two or more blocks. The compressed data may then be stored in a storage device. The method may further include performing the parallel compression of the blocks using dedicated hardware, and performing various data protection methods, such as error correction methods, using reconfigurable hardware.
The present invention and its various embodiments are better understood upon consideration of the detailed description below in conjunction with the accompanying drawings and claims.
The following description is presented to enable any person of ordinary skill in the art to make and use the invention. Descriptions of specific devices, techniques, and applications are provided only as examples. Various modifications to the examples described herein will be readily apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other examples and applications without departing from the spirit and scope of the invention. Thus, the present invention is not intended to be limited to the examples described herein and shown, but is to be accorded the scope consistent with the appended claims.
In the following description, an exemplary compression device is disclosed that provides a hardware compression scheme that may increase the performance and efficiency of a storage system, and specifically, emulated storage tape systems, e.g., magnetic tape libraries or the like. The exemplary compression device includes parallel compression hardware that is configured to compress portions or blocks of the incoming data in parallel before storing the compressed portions of the incoming data. An index associated with the incoming order of the blocks of data is created, and the blocks are output to a main memory as a contiguous stream of compressed data. The exemplary compression device and method may increase the capacity of the storage system as well as increase the speed or throughput of the storage system by compressing the data in parallel. For example, the exemplary compression device may effectively exceed the nominal bandwidth between a host computer and appliances (and between sub-appliances) by compressing the data before transmission, thereby increasing the performance of a storage system.
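As a rough illustration of the bandwidth point above, the effective host-side throughput can be related to the link bandwidth and the compression ratio. The symbols below (r, B_link, B_comp, B_eff) are introduced here for illustration only and do not appear in the original description; the relation is a simplified sketch that ignores protocol overhead.

```latex
\[
r = \frac{S_{\text{orig}}}{S_{\text{comp}}}, \qquad
B_{\text{eff}} = \min\bigl(r \cdot B_{\text{link}},\; B_{\text{comp}}\bigr)
\]
```

Here S_orig and S_comp are the sizes of the data before and after compression, B_link is the bandwidth of the link carrying the compressed data, and B_comp is the rate at which the compression hardware can consume uncompressed data. With an average ratio near 2:1, a link could deliver up to roughly twice its nominal bandwidth in host data, provided the compression hardware keeps pace.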
As described below, the exemplary compression device may include an adapter coupled to a controller of a disk based or hard drive storage system. In one example, the compression device is integral with the controller, e.g., on the controller motherboard, for a disk or hard drive based storage system. Alternatively, the compression device may include a separate device or card that may provide a host interface and compression functions such that an off-the-shelf motherboard may be used. The compression device may further include data protection techniques such as Error Correction Codes (ECC), Cyclic Redundancy Checks (CRC), encryption, and the like.
Additionally, various applications including an exemplary compression device may include the following features alone or in combination. The compression device may provide for fibre channel, Ethernet, internet small computer system interface (iSCSI), and other host bus (box to box) interfaces; hardware data compression, ECC, CRC, and data encryption for additional data protection; high-speed dual ported memory and bridge capabilities between a private high-speed local bus and the main controller high-speed local bus; enhanced emulation functions that include storage device emulation, library emulation and visualization, controller-to-controller coherency, and cache backup for fail-over; I/O acceleration/performance that scales with the number of adapters in the controller; and additional host interfaces via a mezzanine card.
Host computers 20, primary storage units 40, and emulated tape systems 50 can be connected directly, such as by using parallel small computer system interface (SCSI), integrated device electronics (IDE), and the like. Alternatively or additionally, host computers 20, storage units 40, and emulated tape systems 50 can be connected through a network topology such as fibre channel, Ethernet, and the like. If multiple storage units 40 are used, they may be daisy-chained together to increase capacity.
Host computers 20 may include various types of computers and computer systems such as personal computers, personal digital assistants, web-enabled appliances, cell phones, and the like, or any combination thereof. A host computer 20 may also be a server. Moreover, when exemplary information infrastructure 10 includes multiple servers, the servers may communicate through a client network. The servers may include various types of servers such as those based on the Unix, Linux, or Microsoft Windows operating systems, and the like, or a combination thereof. The client network may include any type of network such as the Internet, a corporate intranet, a wide area network, a local area network, a wireless network, and the like, or any combination thereof.
Primary storage unit 40 may be arranged in a number of different types of configurations such as a storage area network (SAN), network attached storage (NAS), direct attached storage, and the like. In other examples, the storage network 30 may reside in the chassis or cabinet of servers, stand-alone storage devices, and the like, or a combination thereof.
Emulated tape systems 50 may include various types of devices capable of storing data such as rack mount servers connected to low cost disk arrays or custom integrated hardware systems and the like. In one exemplary embodiment, an emulated tape system 50 may include a plurality of hard drives or disks configured to emulate a storage tape drive system. For a more detailed description of an exemplary storage unit configured to emulate a storage tape drive system, see U.S. patent application Ser. No. 10/072,437 entitled EMULATED BACKUP TAPE DRIVE USING DATA COMPRESSION, filed on Feb. 5, 2002, and U.S. patent application Ser. No. 10/072,527 entitled STORAGE SYSTEM UTILIZING AN ACTIVE SUBSET OF DRIVES DURING DATA STORAGE AND RETRIEVAL OPERATIONS, filed on Feb. 5, 2002, both of which are incorporated herein by reference in their entirety as if fully set forth herein.
In the exemplary embodiment depicted in
Additionally, in the exemplary system depicted in
It should be recognized that controller 110 and storage unit 52 may include various additional components such as connectors and ports. For example, connectors such as Ethernet connectors 116, 117 and serial connectors 118, 119 may be included. Additionally, it should be recognized that controller 110 and storage unit 52 may include fewer components than those depicted in
In the exemplary system depicted in
In the exemplary embodiment depicted in
When storing data, controller 110 processes incoming data sequences from a host computer and outputs the data through PCI 316 to a Host Bus Adapter (HBA) 320, which is connected to a storage device such as a RAID. It should be recognized that HBA 320 may be a component within or in communication with controller 110 depending on the particular application and design considerations. When retrieving data, controller 110 receives instructions from a host computer, accesses a storage device, retrieves the data from the storage device, and transmits the data to the host computer that requested the data.
In the exemplary embodiment depicted in
In one exemplary method of storing data, compression device 100 compresses data received by controller 110 from the host computer before storing the data in a storage unit. When retrieving data, compression device 100 decompresses the previously compressed data retrieved from a storage unit by controller 110 before transmitting the decompressed data to a host computer.
As described above, during a data storing process, data from a host computer enters through host data interface controller 410 to be stored on a storage unit. The present exemplary controller 410 may include a dual fibre channel (FC) to PCI-X controller. Controller 410 may include any suitable chip, such as a dual fibre channel to PCI-X bridge IC for active/active fail-over manufactured by Qlogic or the like. A dual FC interface is generally desirable to connect to a SAN or the like. Further, a dual interface may allow for active/active fail-over, higher throughput, and a single-IC implementation, and generally conserves board space. Controller 410 may route data to data buffer memory 434 or compression manager 418 depending on the address.
The present exemplary compression device 400 includes a separate local bus that provides data transfer between controller 410 and high-speed dual ported memory. The separate local bus allows the system to isolate data transfers from the main controller local bus thereby improving performance of the local memory.
In the present exemplary embodiment, a stream of data, which is characteristic of data in applications emulating sequential access storage devices such as tape drives, can be compressed using parallel compression devices 442. More particularly, a stream of data can be divided into multiple data portions/blocks, and multiple compression devices 442 compress the data portions/blocks of the data stream in parallel, such as in a ping-pong fashion.
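The division and parallel compression described above can be sketched in software. This is a minimal illustration only, not the hardware implementation: it assumes zlib as a stand-in for the compression devices, a two-worker thread pool as a stand-in for the pair of compressors, and a hypothetical `BLOCK_SIZE`; the function names are likewise invented for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size; the description does not specify one


def split_into_blocks(stream: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Divide a sequential data stream into fixed-size blocks (the last may be shorter)."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def compress_parallel(blocks: list[bytes], workers: int = 2) -> list[bytes]:
    """Compress blocks concurrently; pool.map returns results in the original block order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, blocks))


if __name__ == "__main__":
    data = b"tape record " * 50_000
    compressed = compress_parallel(split_into_blocks(data))
    print(len(data), "bytes in,", sum(len(c) for c in compressed), "bytes out")
```

With two workers, one block can be compressed while the next is being handed to the other worker, which loosely mirrors the ping-pong arrangement mentioned above.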
However, the resulting compressed data stream of data portions/blocks may no longer reside in contiguous physical memory due to the unpredictable efficiency of compression for an arbitrary portion of data. For example, one data portion/block may compress more or less than another data portion/block. Thus, in the present exemplary embodiment, the data portions/blocks of the data stream are indexed, for example using a scatter/gather list, based on their original sequence before being compressed and stored in a storage unit. During a retrieve process, the compressed data can then be retrieved from a storage unit and reassembled into a logically contiguous entity using the index (e.g., the scatter/gather list). It should be noted that the use of compression may also serve as an integrity check of the data. For example, if the data has been corrupted, there is a high probability that the compressed data will not decompress to its original size, so that the error can be detected.
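The indexing and reassembly steps can be illustrated with a small software model. This is a sketch under stated assumptions rather than the described hardware: the `IndexEntry` fields, the `gap` padding used to mimic non-contiguous placement, and the use of zlib are all hypothetical choices made for the example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class IndexEntry:
    sequence: int        # position of the block in the original stream
    offset: int          # where the compressed block landed in the store
    compressed_len: int  # bytes occupied after compression
    original_len: int    # bytes before compression, kept for the size check


def store_blocks(blocks, store: bytearray, gap: int = 16):
    """Append compressed blocks to `store`, leaving gaps to mimic non-contiguous placement."""
    index = []
    for seq, block in enumerate(blocks):
        payload = zlib.compress(block)
        store.extend(b"\0" * gap)  # hypothetical padding between blocks
        index.append(IndexEntry(seq, len(store), len(payload), len(block)))
        store.extend(payload)
    return index


def reassemble(index, store) -> bytes:
    """Gather blocks in original sequence order and verify each decompressed size."""
    out = bytearray()
    for entry in sorted(index, key=lambda e: e.sequence):
        payload = bytes(store[entry.offset:entry.offset + entry.compressed_len])
        block = zlib.decompress(payload)
        if len(block) != entry.original_len:
            raise ValueError(f"block {entry.sequence}: size mismatch after decompression")
        out += block
    return bytes(out)
```

Because each entry records the original sequence number, the blocks can be gathered in stream order regardless of where they were placed, and the recorded original length provides the size-based integrity check mentioned above.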
During the data storing process, data protection functions may be performed on the data. Exemplary data protection functions may include data compression, Error Correction Codes (ECCs), Cyclic Redundancy Checks (CRCs), data encryption, and the like, or a combination thereof. In the present exemplary embodiment, an ECC may include a Reed-Solomon encoder/decoder technique. An exemplary CRC may include a 32-bit CRC with polynomial: $x^{32}+x^{26}+x^{23}+x^{22}+x^{16}+x^{12}+x^{11}+x^{10}+x^{8}+x^{7}+x^{5}+x^{4}+x^{2}+x+1$. It should be recognized by those skilled in the art that other ECC and CRC techniques may be implemented and are contemplated. Various data encryption techniques may also be employed to protect the data as will be appreciated by those skilled in the art.
In the present exemplary embodiment, data protection functions are performed using reconfigurable hardware, such as FPGA 430. In addition to data protection functions, FPGA 430 may be programmed to perform various additional functions such as those of a dual ported memory controller, separate local bus to memory bridge, separate local bus to main controller local bus bridge, main controller local bus to dual ported memory bridge, private local bus arbiter, and the like. One advantage of using an FPGA is that the FPGA can be reconfigured/reprogrammed in the field, e.g., enhancements may be made as desired. An FPGA may also be known or referred to in the art as PAL, PLA, FPLA, EPLD, CPLD, EEPLD, and LCA. However, it should be recognized that a non-programmable logic device, such as an ASIC, may be used depending on the application.
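For illustration, a bit-by-bit software CRC using the 32-bit polynomial listed above might look like the following. The polynomial corresponds to 0x04C11DB7; the remaining conventions (initial value, bit reflection, final XOR) are not stated in the description and are assumed here to be those of the widely used CRC-32 variant implemented by zlib. The function name is hypothetical.

```python
import zlib

# Bit-reversed (reflected) form of 0x04C11DB7, the 32-bit polynomial listed above.
POLY_REFLECTED = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit CRC-32 with assumed conventions: initial value 0xFFFFFFFF,
    reflected input/output, and a final XOR with 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial when the shifted-out bit is 1.
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = b"123456789"
    print(hex(crc32_bitwise(sample)), hex(zlib.crc32(sample)))
```

Under these assumed conventions the result agrees with `zlib.crc32`; a hardware implementation would typically process many bits per clock rather than one bit at a time.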
Additionally, in the present exemplary embodiment, the parallel data compression function may be specifically performed using a dedicated compression hardware device. More particularly, in the exemplary embodiment depicted in
After data protection functions are performed, the processed data, which in the present exemplary compression device 400 has been compressed as data portions/blocks, is then transferred via Direct Memory Access (DMA) into either local dual ported memory or main controller memory to be stored in a storage unit. As described above, the storage unit may include a RAID (ATA or SATA), JBOD storage array, and the like.
In the present exemplary compression device 400, data transfer functions and data protection functions are managed and/or performed via a local microprocessor, compression manager 418. Compression manager 418 processes interrupts and accumulates a sequence of data from the host processor according to program buffer memory 422 and queues up compressors 442. Additionally, emulation functions and upper level management of single or multiple compression adapters may be performed via the controller processor residing on the motherboard (see, e.g.,
The present exemplary compression device 500 includes a dual FC to PCIX controller 510, PCI-X Bus 514, integrated microprocessor compression manager 518, boot flash 519, internal device port 520, DDR memory bus 522, FPGA 530, data buffer memory 534, data bus 538, compression hardware 542, PMC slot 546, and PCI-X bus to the server 550.
The bolded dotted arrows indicate an exemplary path of data to and from the host interface to the main memory in compression device 500. During a storing process, data from a host computer enters through controller 510 to be stored in a storage unit. The flow of data proceeds through separate local bus PCI-X Bus 514, controlled by compression manager 518 as described above, through FPGA 530 and into data buffer memory 534. From data buffer memory 534, the data proceeds to dual compression ASIC chips included in compression hardware 542. As described above, the data may be divided into multiple data portions/blocks and compressed in parallel. The data portions/blocks may be indexed and routed via DMA as a sequential series of compressed data into local dual ported memory or main controller memory to be stored in a storage unit(s).
During readout or a storage retrieval process, data is retrieved from the storage unit(s), decompressed, and reassembled into the original sequential stream of data using the index created when storing the data blocks. In this manner, compression device 500 may emulate a sequential access storage device such as magnetic storage tape devices and/or magnetic tape libraries.
The above detailed description is provided to illustrate exemplary embodiments and is not intended to be limiting. It will be apparent to those skilled in the art that numerous modifications and variations within the scope of the present invention are possible. For example, various hardware implementations with similar functions will be recognized and are contemplated. Further, numerous other processes not explicitly described herein may be used within the scope of the exemplary methods and structures described as will be recognized by those skilled in the art.
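From the host's point of view, the result behaves like a sequential access device. The toy class below sketches that behavior in software; the class name `EmulatedTape`, its methods, and the use of zlib are hypothetical and stand in for the controller, compression hardware, and disk storage described above.

```python
import zlib


class EmulatedTape:
    """Toy sequential-access device backed by a list of compressed blocks."""

    def __init__(self):
        self._blocks = []  # compressed blocks in write order (stands in for disk storage)
        self._pos = 0      # index of the next block to read

    def write(self, block: bytes) -> None:
        """Append a block at the end of the 'tape', compressing it first."""
        self._blocks.append(zlib.compress(block))

    def rewind(self) -> None:
        """Return the read position to the beginning of the 'tape'."""
        self._pos = 0

    def read(self) -> bytes:
        """Return the next block in sequence, or b'' at end of data."""
        if self._pos >= len(self._blocks):
            return b""
        block = zlib.decompress(self._blocks[self._pos])
        self._pos += 1
        return block
```

A caller would write blocks in order, call `rewind()`, and then call `read()` repeatedly to get the blocks back in the same order, which is the access pattern a tape drive presents.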
The present application claims benefit of earlier filed provisional patent application, U.S. application Ser. No. 60/466,450, filed on Apr. 28, 2003, and entitled “MULTI-PORT DATA PROTECTION APPARATUS AND METHODS OF DATA PROTECTION,” which is hereby incorporated by reference as if fully set forth herein.
Number | Date | Country | |
---|---|---|---|
20040243745 A1 | Dec 2004 | US |
Number | Date | Country | |
---|---|---|---|
60466450 | Apr 2003 | US |