A write operation to a rotating disk drive writes to a physical location on the disk. In other words, there is a one-to-one mapping between the received address and the physical address on the disk. Solid state drives (SSDs) may have an indirection table that creates a virtual address space for the received storage operations. The indirection table maps the addresses for the received storage commands to the virtual addresses assigned to the data by the SSD.
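The difference between the two addressing models can be sketched as follows. This is a minimal illustration rather than actual drive firmware: the class name `IndirectionTable`, its methods, and the append-only allocation policy are all assumptions made for the example.

```python
class IndirectionTable:
    """Toy model of an SSD indirection table (hypothetical names)."""

    def __init__(self):
        self._map = {}        # received address -> address assigned by the SSD
        self._next_free = 0   # next unused location (append-only policy is assumed)

    def write(self, received_addr):
        # Each write is placed at a fresh location, so the same received
        # address may be remapped every time it is rewritten.
        location = self._next_free
        self._next_free += 1
        self._map[received_addr] = location
        return location

    def lookup(self, received_addr):
        # Returns None when the address has never been written.
        return self._map.get(received_addr)


def rotating_disk_location(received_addr):
    # One-to-one mapping: the received address is the physical address.
    return received_addr
```

Rewriting the same received address twice yields two different assigned locations in this model, whereas the rotating-disk function always returns its input unchanged.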
The SSD may use a scratch storage area to remap data from one location to another. The larger the available scratch storage area, the more efficient the SSD is in remapping data to different locations. The storage device manufacturer or enterprise user may wish to reserve some percentage of Flash memory in the SSD drive for these scratch operations. For example, the SSD may be able to perform larger data block transfers in a shorter amount of time when a larger scratch storage area is reserved.
Once a storage area in the SSD is used for storing data, it may no longer be available for the scratch storage operations, even if the storage area is never used again for storing data. For example, some storage systems only support read and write operations and have no way to invalidate or “free up” previously used address space.
Some SSD drives provide a trim command that can invalidate data in previously written-to address spaces. However, the trim command is only supported by certain combinations of operating systems and file management systems, such as those used with the Microsoft® Windows® operating system. As mentioned above, other operating systems do not support the trim command and can only issue read and write operations. Even within operating systems that do support the trim command, the trim commands are controlled by that operating system, not by the applications originating the storage operations. Thus, unused or underutilized storage space may unnecessarily reduce the available scratch space in an SSD drive and thus reduce overall drive efficiency.
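The effect of writes and trims on available scratch space can be modeled with a few lines of bookkeeping. The sketch below is a simplified illustration under the assumption that every block not holding valid data counts as scratch space; the class and method names are hypothetical.

```python
class ScratchAccounting:
    """Toy model: blocks that do not hold valid data count as scratch space."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.used = set()   # block addresses that currently hold valid data

    def write(self, addr):
        # A written block stays "used" until it is explicitly invalidated.
        self.used.add(addr)

    def trim(self, addr):
        # Invalidate a previously written block; harmless if never written.
        self.used.discard(addr)

    def scratch_blocks(self):
        return self.total_blocks - len(self.used)
```

Without a call to `trim`, the scratch count in this model can only shrink, which mirrors the situation of a system limited to read and write operations.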
The initiator 100 may be any device or application that writes and/or reads data to and from another device. For example, the initiator 100 may comprise one or more servers, server applications, database applications, routers, switches, client computers, personal computers, Personal Digital Assistants (PDA), smart phones, or any other wired or wireless computing device and/or software that accesses data in target 400 or disk drives 600.
In another example, the initiator 100 may comprise a stand-alone appliance, device, or blade, and the target 400 may comprise a stand-alone storage array of disk drives 500. In yet another example, the initiator 100 may be a processor or software application in a computer that accesses target 400 over an internal or external data bus.
Target 400 may be any device that stores data accessed by another device, application, software, initiator, or the like, or any combination thereof. Target 400 may be located in a personal computer or server, or may be a stand-alone device coupled to the initiator 100 via a computer bus or packet switched network connection.
In one example, the target 400 may comprise storage devices or storage servers that contain storage media such as solid state memory, rotating disk drives, solid state drives (SSD) or the like, or any combination thereof. For example, target 400 may contain multiple disk drives 500 that may exist locally within the same physical enclosure as storage processor 200, within a same enclosure with other targets 400, or exist externally in a chassis connected to target 400 and/or storage processor 200 through an interconnect mechanism.
In one example, the initiator 100, storage processor 200, and/or target 400 are coupled to each other via wired or wireless connections 12A and 12B. Different communication protocols can be used over connection 12A between initiator 100 and storage processor 200 and connection 12B between storage processor 200 and target 400. Example protocols may include Fibre Channel Protocol (FCP), Small Computer System Interface (SCSI), Advanced Technology Attachment (ATA) and encapsulated protocols such as Fibre Channel over Ethernet (FCoE), Internet Small Computer System Interface (iSCSI), Fibre Channel over Internet Protocol (FCIP), ATA over Ethernet (AoE), or the like, or any combination thereof.
Storage processor 200 may be any combination of hardware and/or software located in a storage appliance, wireless or wired router, server, gateway, firewall, switch, computer processing system, rate adapter, host bus adapter (HBA), chip on a motherboard, a piece of logic located in an integrated circuit, or the like, or any combination thereof. The initiator 100 may issue storage commands to the disk drives 500 in target 400 through the storage processor 200. The storage commands may include write commands and read commands that have associated storage addresses. The storage commands may be normalized by the storage processor 200 into block-level commands such as “reads” and “writes” of an arbitrary number of blocks.
Storage processor 200 may include disk drives 600 configured to accelerate storage accesses to disk drives 500. For example, the disk drives 600 may be used as a cache and/or tiering media for storing copies of data contained in disk drives 500. However, disk drives 600 may be used for any operation where storage processor 200 may want to access an alternative storage media.
Examples of how disk drives 600 may be used as a cache and/or tiering media are described in the following co-pending patent applications, which are all herein incorporated by reference in their entirety: U.S. patent application Ser. No. 12/889,732 filed on Sep. 24, 2010; U.S. patent application Ser. No. 12/814,438 filed on Jun. 12, 2010; U.S. patent application Ser. No. 12/605,119 filed on Oct. 23, 2009; U.S. patent application Ser. No. 12/605,160 filed on Oct. 23, 2009; and U.S. patent application Ser. No. 12/684,387 filed on Jan. 8, 2010.
In one example, the disk drives 605 include a drive controller 610 that uses a drive queue 620 to manage the dispatch and ordering of storage commands 110 received from storage processor 200. In one example, the drive controller 610 may be implemented using an application specific integrated circuit (ASIC); however, other types of logic circuitry may also be used. In another example, drive controller 610 may be software executing within a processor, ASIC or other circuitry or device.
Drive controller 610 may access the different memory devices 630 for different storage commands 110. For example, drive controller 610 may stripe data over different combinations of memory devices 630 based on the amount of data associated with the storage commands 110. As a result, data associated with a first storage command 110 may be stored over multiple memory devices A-C, data associated with a second storage command 110 may only be stored in memory device A, and data associated with a third storage command 110 may be stored in memory device B.
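One way such size-dependent striping might look is sketched below. The round-robin placement, the `stripe_unit` parameter, and the `start` offset are assumptions chosen for illustration; the description above does not specify the policy used by drive controller 610.

```python
def stripe(num_blocks, devices, stripe_unit=4, start=0):
    """Assign blocks to memory devices round-robin in stripe_unit chunks.

    Returns a dict mapping device name -> list of block offsets.
    The policy and parameter names are hypothetical.
    """
    placement = {}
    for block in range(num_blocks):
        chunk = block // stripe_unit
        device = devices[(start + chunk) % len(devices)]
        placement.setdefault(device, []).append(block)
    return placement
```

With three devices and a stripe unit of four blocks, a 12-block command lands on devices A through C, a 3-block command stays on device A, and a 2-block command with `start=1` lands on device B alone, matching the three cases described above.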
The disk drives 500 in target 400 may have a structure similar to that of the disk drives 600 shown in
Command queue 300 may be associated with the disk drives 500 in target 400, the disk drives 600 in storage processor 200, or may include one set of command queues for disk drives 500 and a second set of command queues for disk drives 600.
Storage processor 200 may receive a storage command 110 in command queue 300 from initiator 100. Storage processor 200 may send the storage command 110 over connection 310 to disk drives 600 when the address associated with the storage command 110 contains an address for data contained in disk drives 600. When the address does not match an address associated with data in disk drives 600, storage processor 200 may forward the storage command 110 in command queue 300 over connection 12B to disk drives 500.
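The routing decision can be sketched as a simple membership test. How storage processor 200 tracks which addresses are held in disk drives 600 is not specified here, so the set passed in below is an assumption, as are the function name and return labels.

```python
def route_command(address, accelerated_addresses):
    """Return which drives should receive a storage command.

    accelerated_addresses: set of addresses whose data is held in
    disk drives 600 (the tracking mechanism is assumed).
    """
    if address in accelerated_addresses:
        return "disk drives 600"   # handled inside storage processor 200
    return "disk drives 500"       # forwarded over connection 12B to target 400
```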
Storage processor 200 creates shadow SSD drives and uses accesses to the shadow drives to conduct operations not supported by the initiators. For example, the storage processor 200 may interpret accesses to the shadow drives as requests for trim operations that invalidate address locations in the corresponding physical drives 605. These trimming operations free up additional scratch space in the SSD drives 605 that can then be used by the drive controller 610 to more efficiently manage storage space and improve overall performance of disk drives 500 and/or disk drives 600.
Existing applications would require logical changes to utilize the trim operation and function. However, such application changes would not function unless the operating system itself supported the trim function. It is therefore advantageous to support the logical functionality of the trim operation without deviating from the universally supported read and write command model of the operating system. The system below allows for applications to utilize trim without explicit operating system support.
Normally, the storage processor 200 would respond back to the query 310 from initiator 100 by identifying the SSD drives 605A, 605B, and 605C that are actually physically attached to storage processor 200. However, with the utilization of the present system, storage processor 200 provides a response 320 that identifies both the three actual physical drives 605A, 605B, and 605C and three shadow drives 705A, 705B, and 705C.
The shadow drives 705 do not actually exist in the storage processor 200 but when accessed allow the initiator 100 to initiate alternative commands to the physical drives 605.
The three shadow drives 705A, 705B, and 705C are identified as having the same attributes as corresponding SSD drives 605A, 605B, and 605C, respectively. For example, if SSD drive 605A is a 160 gigabyte drive, shadow drive 705A may also be identified as a 160 gigabyte drive. It should also be noted that in at least one example, the same protocol previously used for identifying physical SSD drives 605 is also used for identifying the associated shadow drives 705. The only difference is that the shadow drives 705 do not actually exist and the initiator 100 knows which drives identified by the storage processor 200 are actual physical drives 605 and which drives are associated virtual shadow drives 705.
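A response that lists each physical drive together with a shadow entry carrying the same attributes might be assembled as in the following sketch. The dictionary layout and the `kind` and `shadow_of` keys are hypothetical bookkeeping for the example; an actual response would use whatever identification protocol is already in place.

```python
def build_query_response(physical_drives):
    """Return entries for each physical drive plus one matching shadow entry.

    physical_drives: list of dicts such as {"name": "605A", "capacity_gb": 160}.
    The "kind" and "shadow_of" keys exist only for this sketch.
    """
    response = [dict(d, kind="physical") for d in physical_drives]
    for d in physical_drives:
        # Copy every attribute, then relabel the copy as a shadow entry.
        response.append(
            dict(d, name="shadow-" + d["name"], kind="shadow", shadow_of=d["name"])
        )
    return response
```

Because each shadow entry is built by copying the physical entry, a 160 gigabyte physical drive produces a shadow that also reports 160 gigabytes.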
Any scheme can be used for associating or mapping a shadow/virtual drive 705 with an actual physical drive 605. For example, the storage processor 200 may enumerate the set of physical drives 605 on a first bus segment and enumerate the second set of shadow drives 705 on a second bus segment. In this example, the SSD drive 605A and the associated shadow drive 705A may be assigned a same drive number by the storage processor 200 but assigned to different bus segments, such as on a SCSI bus.
In an alternate embodiment, the first set of physical drives 605A, 605B, and 605C may be identified by the storage processor 200 using even drive identifier values and the second set of shadow drives 705A, 705B, and 705C may be identified by the storage processor 200 using odd drive identifier values. For example, the shadow drive 705A may be assigned a drive identifier value of 1 and the associated physical drive 605A may be assigned a drive identifier value of zero. Similarly, the shadow drive 705B may be assigned a drive identifier value of 3 and the associated physical drive 605B may be assigned a drive identifier value of 2.
Of course, any other type of indexing or mapping may be used to associate the shadow drives 705 with the physical drives 605. In yet another example, the first 32 drive numbers may be associated with physical drives 605 and the second 32 drive numbers may be associated with the shadow drives 705. For example, a first one of the first 32 SSD drive numbers may be associated with physical drive 605A and a first one of the second 32 shadow drive numbers may be associated with shadow drive 705A and therefore linked to associated physical drive 605A.
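The three association schemes described above (separate bus segments, even/odd identifiers, and two blocks of 32 drive numbers) each reduce to a small lookup from a shadow identifier to its physical drive. The following sketch shows one possible form; the function names, the bus numbering, and the zero-based drive numbers are assumptions.

```python
def shadow_to_physical_bus(bus, drive_number, physical_bus=0, shadow_bus=1):
    """Bus-segment scheme: same drive number, different bus segment."""
    if bus != shadow_bus:
        raise ValueError("not on the shadow bus segment")
    return (physical_bus, drive_number)


def shadow_to_physical_even_odd(drive_id):
    """Even/odd scheme: physical drives are even, each shadow is the next odd id."""
    if drive_id % 2 == 0:
        raise ValueError("even identifiers denote physical drives")
    return drive_id - 1


def shadow_to_physical_block(drive_id, block_size=32):
    """Block scheme: ids 0..31 are physical, ids 32..63 are their shadows."""
    if not block_size <= drive_id < 2 * block_size:
        raise ValueError("identifier is not in the shadow range")
    return drive_id - block_size
```

Under the even/odd scheme, shadow identifier 3 resolves to physical identifier 2, consistent with the pairing of shadow drive 705B and physical drive 605B given above.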
In one example, a read command 356 addressed to a shadow drive 705 will perform a no-op. For example, the storage processor 200 will not initiate any read or write operation to the physical drive 605 when the read command 356 is received. In another example, the read command 356 to the shadow drive 705 may cause the storage processor 200 to initiate an alternative storage operation. For example, the read command 356 may cause the storage processor 200 to conduct a cache pre-load or pre-read operation. In another example, the read command 356 may cause the storage processor 200 to read data from a rotating disk drive and store the data into a solid state memory. In yet another example, the read command 356 may cause the storage processor 200 to read or identify performance metrics or other information from the physical drive 605.
In one example, a write command 358 to a shadow drive 705 causes the storage processor 200 to execute a trim operation in the associated physical drive 605. For example, the storage processor 200 may invalidate data in the physical drive 605 at the address identified in the write command 358. In this example, the storage processor 200 determines the write command 358 is directed to one of the shadow drives 705, identifies the associated physical drive 605, and then sends a trim command to the associated physical drive 605 that invalidates the particular block address range identified in the write command 358.
Any commands supported by the SSD drives 605 may be initiated by the storage processor 200 in response to receiving a shadow read command 356 or shadow write command 358. For example, a write command 358 addressed to a first set of shadow drives 705 may cause the storage processor 200 to conduct the trim operation described above. A write command 358 addressed to a second set of shadow drives 705 may cause the storage processor 200 to conduct a pre-load operation, maintenance operation, or any other operation or command that may be supported by the physical drives 605.
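The association between shadow drive sets, command types, and the resulting operations can be expressed as a lookup table. In the sketch below, the set names, the operation labels, and the choice of a no-op default for unlisted combinations are all hypothetical.

```python
# (shadow set, command type) -> operation issued to the physical drive.
# Set names and operation labels are hypothetical.
SHADOW_OPERATIONS = {
    ("set1", "write"): "trim",
    ("set2", "write"): "preload",
    ("set1", "read"): "noop",
}


def operation_for(shadow_set, command_type):
    # Combinations not listed in the table default to doing nothing.
    return SHADOW_OPERATIONS.get((shadow_set, command_type), "noop")
```

Keeping the mapping in a table means that supporting another drive command only requires a new entry, not new control flow.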
Referring to
In operation 806, the initiator 100 may want to conduct an alternate storage operation other than a read or write. For example, the initiator 100 may want to invalidate data in a particular address region of a physical drive 605. The initiator 100 in operation 808 generates a shadow write command that includes the identifier of a shadow drive 705 associated with the physical drive 605 containing the data to be invalidated.
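From the initiator's side, requesting an invalidation amounts to building an ordinary write command addressed to the shadow identifier. The sketch below assumes the even/odd identifier scheme, in which the shadow identifier is the physical identifier plus one; the function name and dictionary layout are hypothetical.

```python
def make_shadow_write(physical_drive_id, start_block, block_count):
    """Build a write command addressed to the shadow of a physical drive.

    Assumes the even/odd scheme: shadow id = physical id + 1.
    The command layout is hypothetical.
    """
    if physical_drive_id % 2 != 0:
        raise ValueError("physical drives use even identifiers in this scheme")
    return {
        "op": "write",
        "drive_id": physical_drive_id + 1,   # address the shadow, not the drive
        "start_block": start_block,
        "block_count": block_count,
    }
```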
For example, the physical drive 605 with the data to be invalidated may have an associated drive identifier of 2 (e.g., physical drive 605B in
When the storage command is directed to a shadow drive in operation 904, the storage processor 200 determines if the command is a write command or a read command in operation 906. When the storage command is a write operation to a shadow drive, the storage processor 200 in operation 908 initiates a trim command to the associated physical drive 605. For example, the storage processor 200 will identify the physical drive 605 associated with the shadow drive 705 identified in the storage command. The storage processor 200 will then send a trim command to the identified physical drive 605 that invalidates the data at the address identified in the write command.
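Operations 904 through 908 can be sketched as a short dispatch routine. The example assumes the even/odd identifier scheme, treats a shadow read as a no-op, and forwards commands for physical drives unchanged; the command layout and the `send_to_drive` callable are hypothetical stand-ins for the real command path.

```python
def handle_command(cmd, send_to_drive):
    """Sketch of operations 904-908 under the even/odd identifier scheme.

    cmd: dict with "op", "drive_id", "start_block", "block_count"
         (hypothetical layout).
    send_to_drive: callable(drive_id, op, start_block, block_count),
         a stand-in for the path to a physical drive.
    """
    is_shadow = cmd["drive_id"] % 2 == 1           # operation 904
    if not is_shadow:
        # Commands for physical drives pass through unchanged (assumed).
        send_to_drive(cmd["drive_id"], cmd["op"],
                      cmd["start_block"], cmd["block_count"])
        return "forwarded"
    physical_id = cmd["drive_id"] - 1              # associated physical drive
    if cmd["op"] == "write":                       # operation 906
        # Operation 908: translate the shadow write into a trim.
        send_to_drive(physical_id, "trim",
                      cmd["start_block"], cmd["block_count"])
        return "trimmed"
    return "noop"                                  # shadow read: nothing issued
```

Passing the drive path in as a callable keeps the sketch independent of any particular drive interface and makes the translation easy to observe in isolation.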
As mentioned above the storage processor 200 may operate as a proxy device between the initiator 100 and target 400 and use the disk drives 600 in
By using read and write operations to invalidate data, the storage processor 200 provides the physical drive with more scratch memory space for redirection and other maintenance operations. For example, the physical drive will have more space for reassembling disjoined blocks of data into a contiguous address block. These reassembled blocks can then be accessed more quickly by the initiator 100.
The shadow drives 705 also allow the initiator 100 to execute alternative storage operations, such as trim operations, that are not supported in some storage systems. For example, the trim operation can be initiated in a storage system that can only send read and write commands.
Hardware and Software
Several examples have been described above with reference to the accompanying drawings. Various other examples are also possible and practical. The systems and methodologies may be implemented or applied in many different forms and should not be construed as being limited to the examples set forth above. Some systems described above may use dedicated processor systems, micro controllers, programmable logic devices, or microprocessors that perform some or all of the commands. Some of the commands described above may be implemented in software or firmware and other commands may be implemented in hardware.
For the sake of convenience, the commands are described as various interconnected functional blocks or distinct software modules. This is not necessary, however, and there may be cases where these functional blocks or modules are equivalently aggregated into a single logic device, program or command with unclear boundaries. In any event, the functional blocks and software modules or features of the flexible interface can be implemented by themselves, or in combination with other commands in either hardware or software.
Digital Processors, Software and Memory Nomenclature
As explained above, embodiments of this disclosure may be implemented in a digital computing system, for example a CPU or similar processor. More specifically, the term “digital computing system,” can mean any system that includes at least one digital processor and associated memory, wherein the digital processor can execute instructions or “code” stored in that memory. (The memory may store data as well.)
A digital processor includes but is not limited to a microprocessor, multi-core processor, Digital Signal Processor (DSP), Graphics Processing Unit (GPU), processor array, network processor, etc. A digital processor (or many of them) may be embedded into an integrated circuit. In other arrangements, one or more processors may be deployed on a circuit board (motherboard, daughter board, rack blade, etc.). Embodiments of the present disclosure may be variously implemented in a variety of systems such as those just mentioned and others that may be developed in the future. In a presently preferred embodiment, the disclosed methods may be implemented in software stored in memory, further defined below.
Digital memory, further explained below, may be integrated together with a processor, for example Random Access Memory (RAM) or FLASH memory embedded in an integrated circuit Central Processing Unit (CPU), network processor or the like. In other examples, the memory comprises a physically separate device, such as a disk drive, storage array, or portable FLASH device. In such cases, the memory becomes “associated” with the digital processor when the two are operatively coupled together, or in communication with each other, for example by an I/O port, network connection, etc. such that the processor can read a file stored on the memory. Associated memory may be “read only” by design (ROM) or by virtue of permission settings, or not. Other examples include but are not limited to WORM, EPROM, EEPROM, FLASH, etc. Those technologies often are implemented in solid state semiconductor devices. Other memories may comprise moving parts, such as a conventional rotating disk drive. All such memories are “machine readable” in that they are readable by a compatible digital processor. Many interfaces and protocols for data transfers (data here includes software) between processors and memory are well known, standardized and documented elsewhere, so they are not enumerated here.
Storage of Computer Programs
As noted, some embodiments may be implemented or embodied in computer software (also known as a “computer program” or “code”; we use these terms interchangeably). Programs, or code, are most useful when stored in a digital memory that can be read by one or more digital processors. The term “computer-readable storage medium” (or alternatively, “machine-readable storage medium”) includes all of the foregoing types of memory, as well as new technologies that may arise in the future, as long as they are capable of storing digital information in the nature of a computer program or other data, at least temporarily, in such a manner that the stored information can be “read” by an appropriate digital processor. The term “computer-readable” is not intended to limit the phrase to the historical usage of “computer” to imply a complete mainframe, minicomputer, desktop or even laptop computer. Rather, the term refers to a storage medium readable by a digital processor or any digital computing system as broadly defined above. Such media may be any available media that is locally and/or remotely accessible by a computer or processor, and it includes both volatile and non-volatile media, removable and non-removable media, embedded or discrete.
Having described and illustrated a particular example system, it should be apparent that other systems may be modified in arrangement and detail without departing from the principles described above. Claim is made to all modifications and variations coming within the spirit and scope of the following claims.
The present application is a continuation application of U.S. Ser. No. 13/039,162, filed on Mar. 2, 2011, now U.S. Pat. No. 8,635,416, issued on Jan. 21, 2014, which is incorporated herein by reference.
Number | Date | Country |
---|---|---|
20140136768 A1 | May 2014 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13039162 | Mar 2011 | US |
Child | 14157945 | | US |