Data storage system garbage collection based on at least one attribute

Description

BACKGROUND

Data storage devices (DSDs) are often used with a host in a data storage system to record data on or to reproduce data from a recording medium. As one type of DSD, a disk drive can include a rotating magnetic disk and a head actuated over the disk to magnetically write data to and read data from the disk. Such disks include a plurality of radially spaced, concentric tracks for recording data.

Shingled Magnetic Recording (SMR) has been introduced as a way of increasing the amount of data that can be stored in a given area on a disk by increasing the number of Tracks Per Inch (TPI). SMR increases TPI by using a relatively wide shingle write head to overlap tracks like roof shingles. The non-overlapping portion then serves as a narrow track that can be read by a narrower read head.

Although a higher number of TPI is ordinarily possible with SMR, the overlap in tracks can create a problem when writing data since new writes to a previously overlapped track affect data written in the overlapping track. For this reason, tracks are sequentially written to avoid affecting previously written data.

Managing sequentially written data for SMR media typically includes the DSD using an indirection system to translate between different addressing schemes to ensure that data is sequentially written. When data is modified for a particular Logical Block Address (LBA), the indirection system allows the DSD to sequentially write the modified data to a new location and remap the LBA for the data to the new location. The old version of the data at the previous location becomes obsolete or invalid data.

In order to free up space on the disk, a Garbage Collection (GC) process can be performed to make the portions of the disk storing invalid or obsolete data available for storing valid data. This can be accomplished during a GC process by relocating the valid data from a particular area on the disk and leaving invalid data to be overwritten. Other types of storage media using indirection, such as solid-state memory, may also use GC to free up portions of the memory storing invalid data.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the embodiments of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings. The drawings and the associated descriptions are provided to illustrate embodiments of the disclosure and not to limit the scope of what is claimed.

FIG. 1 is a block diagram depicting a data storage system according to an embodiment.

FIG. 2 is a block diagram including a Data Storage Device (DSD) of FIG. 1 according to an embodiment.

FIG. 3A is a flowchart for a Garbage Collection (GC) process according to an embodiment.

FIG. 3B is a flowchart for a data coherency process during data relocation according to an embodiment.

FIG. 4 is an implementation environment according to an embodiment.

FIG. 5 is another implementation environment according to an embodiment.

FIG. 6A is a flowchart for a GC process according to an embodiment.

FIG. 6B is a flowchart for a data coherency process during data relocation according to an embodiment.

FIG. 7 is a conceptual diagram illustrating the assignment of zones to different logical volumes and the assignment of a zone as a destination portion according to an embodiment.

FIG. 8 is a conceptual diagram illustrating the assignment of multiple zones to a logical volume and the assignment of a zone as a destination portion according to an embodiment.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth to provide a full understanding of the present disclosure. It will be apparent, however, to one of ordinary skill in the art that the various embodiments disclosed may be practiced without some of these specific details. In other instances, well-known structures and techniques have not been shown in detail to avoid unnecessarily obscuring the various embodiments.

System Overview

FIG. 1 shows data storage system 100 according to an embodiment that includes host 101 and Data Storage Devices (DSDs) 107, 108, and 109. System 100 can be, for example, a computer system (e.g., server, desktop, cloud storage device, data archiving system, etc.) or other electronic device such as a Digital Video Recorder (DVR). In this regard, system 100 may be a stand-alone system or part of a network, such as network 122. Those of ordinary skill in the art will appreciate that system 100 and DSD 106 can include more or fewer elements than those shown in FIG. 1 and that the disclosed processes can be implemented in other environments.

In the example embodiment of FIG. 1, DSDs 106, 107, 108, and 109 can be located in one location or can be separated at different locations. As shown in FIG. 1, DSD 106 is a part of host 101 and stores applications for execution on host 101, while DSDs 107, 108 and 109 primarily store user data of host 101.

Input device 102 can be a keyboard, scroll wheel, or pointing device allowing a user of system 100 to enter information and commands to system 100, or to allow a user to manipulate objects displayed on display device 104. In other embodiments, input device 102 and display device 104 can be combined into a single component, such as a touch-screen that displays objects and receives user input.

In the embodiment of FIG. 1, host 101 includes Central Processing Unit (CPU) 110 which can be implemented using one or more processors for executing instructions including a microcontroller, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), hard-wired logic, analog circuitry and/or a combination thereof. A processor of host 101 as referenced hereinafter can be one or more of the foregoing processors or another processor configured to perform functions described herein. CPU 110 interfaces with host bus 113. Also interfacing with host bus 113 are Random Access Memory (RAM) 112, input interface 114 for input device 102, display interface 116 for display device 104, Read Only Memory (ROM) 118, and network interface 120 for interfacing with network 122.

RAM 112 is a volatile memory of host 101 that interfaces with host bus 113 to provide information stored in RAM 112 to CPU 110 during execution of instructions in software programs such as device drivers 14 or Operating System (OS) 20. More specifically, CPU 110 first loads computer-executable instructions from DSD 106 into a region of RAM 112. CPU 110 can then execute the stored process instructions from RAM 112. Data such as data to be stored in DSDs 106, 107, 108, or 109, or data retrieved from DSDs 106, 107, 108 and 109 can also be stored in RAM 112 so that the data can be accessed by CPU 110 during execution of software programs to the extent that such software programs have a need to access and/or modify the data.

As shown in FIG. 1, DSD 106 can be configured to store one or more of: Garbage Collection (GC) manager 10, application 12, device drivers 14, file system 16, translation module 18, OS 20, and mapping table 28. GC manager 10 includes computer-executable instructions for DSDs 106, 107, 108 and 109 for performing garbage collection processes as discussed in more detail below.

In other embodiments, any one or more of GC manager 10, application 12, device drivers 14, file system 16, translation module 18, OS 20, or mapping table 28 can reside on DSDs 106, 107, 108, or 109. In one such example, GC manager 10 may reside at each of DSDs 106, 107, 108, and 109 so as to distribute execution of GC manager 10 throughout system 100.

Application 12 can include, for example, a program executed by host 101 that can request or modify user data stored in DSDs 107, 108, or 109, such as a data archiving program or multimedia program. Device drivers 14 provide software interfaces on host 101 for devices such as input device 102, display device 104, or DSDs 106, 107, 108, and 109. In addition, DSD 106 can store Operating System (OS) 20, which includes kernel 22, File System (FS) intercept 24, and storage stack 26. The contents of DSD 106 may be loaded into resident memory of host 101 (e.g., RAM 112) for execution and/or state tracking during operation of host 101.

File system (FS) 16 can be a file system implemented in a user space of host 101 with translation module 18 to interface with FS intercept 24, as described below in more detail with reference to the example implementation environment of FIG. 5.

DSD 106 can also store mapping table 28, which can be used to translate or map between logical addresses (e.g., logical block addresses) used by host 101 to refer to data and corresponding physical addresses (e.g., physical block addresses) indicating the location of data in DSDs 106, 107, 108 or 109. As discussed in more detail below with reference to FIG. 2, mapping table 28 may be used as part of an indirection system for Shingled Magnetic Recording (SMR) media or solid-state media to allow for the reassignment of logical addresses to different physical locations in DSDs 106, 107, 108, or 109.

As shown in FIG. 1, DSDs 107, 108, and 109 store user data 30, 33, and 34, respectively. The user data is data that is stored or accessed by host 101.

FIG. 2 depicts a block diagram of DSD 107 according to an embodiment. In the embodiment of FIG. 2, DSD 107 includes both solid-state memory 130 and disk 138 for storing data. In this regard, DSD 107 can be considered a Solid-State Hybrid Drive (SSHD) in that it includes both solid-state Non-Volatile Memory (NVM) media and disk NVM media. In other embodiments, each of disk 138 or solid-state memory 130 may be replaced by multiple Hard Disk Drives (HDDs) or multiple Solid-State Drives (SSDs), respectively, so that DSD 107 includes pools of HDDs or SSDs.

DSD 107 includes controller 124 which comprises circuitry such as one or more processors for executing instructions and can include a microcontroller, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), hard-wired logic, analog circuitry and/or a combination thereof. In one implementation, controller 124 can include a System on a Chip (SoC).

Host interface 128 is configured to interface DSD 107 with host 101 and may interface according to a standard such as, for example, PCI express (PCIe), Serial Advanced Technology Attachment (SATA), or Serial Attached SCSI (SAS). As will be appreciated by those of ordinary skill in the art, host interface 128 can be included as part of controller 124.

Sensor 141 is also connected to controller 124. Sensor 141 may provide controller 124 with an input indicating an environmental condition such as a high temperature or high vibration condition of DSD 107.

In the example of FIG. 2, disk 138 is rotated by a spindle motor (not shown). DSD 107 also includes head 136 connected to the distal end of actuator 132, which is rotated by Voice Coil Motor (VCM) 134 to position head 136 in relation to disk 138. Controller 124 can control the position of head 136 using VCM control signal 36.

As appreciated by those of ordinary skill in the art, disk 138 may form part of a disk pack with additional disks radially aligned below disk 138. In addition, head 136 may form part of a head stack assembly including additional heads with each head arranged to read data from and write data to a corresponding surface of a disk in a disk pack.

Disk 138 includes a number of radially spaced, concentric tracks (not shown) for storing data on a surface of disk 138. The tracks on disk 138 may be grouped together into zones of tracks with each track divided into a number of sectors that are spaced circumferentially along the tracks. In the example of FIG. 2, disk 138 includes zone 140 which can serve as a source portion and zone 142 which can serve as a destination portion for the relocation or Garbage Collection (GC) of data.

Disk 138 may include one or more zones with overlapping tracks resulting from SMR to increase the amount of data that can be stored in a given area on a disk. As noted above, SMR tracks are generally sequentially written to avoid affecting previously written data and can involve using an indirection system to ensure that data is sequentially written. When data is modified for a particular Logical Block Address (LBA), the indirection system allows the DSD to sequentially write the modified data to a new location and remap the LBA for the data from the previous location to the new location.
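The remapping behavior described above can be illustrated with a minimal sketch. The class name `IndirectionTable`, its fields, and the use of a simple integer write pointer are assumptions made for illustration only and are not part of the disclosure; an actual indirection system would track zones, sectors, and persistence.

```python
class IndirectionTable:
    """Toy LBA -> physical location map for sequentially written media (hypothetical)."""

    def __init__(self):
        self.lba_to_pba = {}    # current physical location of each LBA
        self.invalid = set()    # physical locations holding obsolete data
        self.write_pointer = 0  # next sequential physical location

    def write(self, lba):
        """Write (or rewrite) an LBA at the next sequential location and remap it."""
        old = self.lba_to_pba.get(lba)
        if old is not None:
            self.invalid.add(old)  # the previous copy becomes obsolete
        pba = self.write_pointer
        self.lba_to_pba[lba] = pba
        self.write_pointer += 1
        return pba


table = IndirectionTable()
table.write(lba=7)  # first write of LBA 7
table.write(lba=9)  # first write of LBA 9
table.write(lba=7)  # modified data for LBA 7 goes to a new location
```

In this sketch, the locations collected in `invalid` correspond to the obsolete data that a GC process would later reclaim.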

In an SMR storage system in which indirection is used, GC can be used to recapture space used by data that is obsolete. However, performing GC and the relocation of data can decrease a performance bandwidth of system 100 that may otherwise be available to host 101.

In view of the foregoing, the present disclosure provides a tunable approach to improve control over GC and the relocation of data. According to another aspect, some embodiments of the present disclosure also include data management processes and systems to reduce the performance impact of relocating data.

In addition to disk 138, the NVM media of DSD 107 also includes solid-state memory 130 with blocks 131 for storing data. While the description herein refers to solid-state memory generally, it is understood that solid-state memory may comprise one or more of various types of memory devices such as flash integrated circuits, Chalcogenide RAM (C-RAM), Phase Change Memory (PC-RAM or PRAM), Programmable Metallization Cell RAM (PMC-RAM or PMCm), Ovonic Unified Memory (OUM), Resistance RAM (RRAM), NAND memory (e.g., Single-Level Cell (SLC) memory, Multi-Level Cell (MLC) memory, or any combination thereof), NOR memory, EEPROM, Ferroelectric Memory (FeRAM), Magnetoresistive RAM (MRAM), other discrete NVM chips, or any combination thereof.

Solid-state memory 130 may use an indirection system to allow for the mapping of LBAs to different physical locations as part of a wear leveling process for a more even usage of blocks 131. In one implementation, modified data is written to a new physical location in solid-state memory 130 and the LBA for the data is remapped from a previous physical location to the new physical location. As with SMR media, solid-state memory 130 can also employ a GC process to recapture space used by data that is obsolete or no longer valid. Solid-state memory 130 can include a source or destination portion in the GC and data relocation processes discussed below. In some embodiments, DSD 107 may include solid-state memory 130, rotating magnetic media including disk 138, and/or a combination of both types of non-volatile storage.

In FIG. 2, volatile memory 139 can include, for example, a Dynamic Random Access Memory (DRAM), which can be used by DSD 107 to temporarily store data. Data stored in volatile memory 139 can include data read from NVM media (e.g., disk 138 or solid-state memory 130), data to be written to NVM media, instructions loaded from firmware 40 of DSD 107 for execution by controller 124, or data used in executing firmware 40. In this regard, volatile memory 139 in FIG. 2 is shown as storing firmware 40 which can include instructions for execution by controller 124 to implement the data relocation and garbage collection processes discussed below. Firmware 40 may be stored in one of the non-volatile storage media shown, such as solid-state memory 130 and/or rotating magnetic media including disk 138.

In operation, DSD 107 receives read and write commands from host 101 via host interface 128 for reading data from and writing data to the NVM media of DSD 107. In response to a write command from host 101, controller 124 may buffer the data to be written for the write command in volatile memory 139.

For data to be stored in solid-state memory 130, controller 124 receives data from host interface 128 and may buffer the data in volatile memory 139. In one implementation, the data is then encoded into charge values for charging cells in solid-state memory 130 to store the data.

In response to a read command for data stored in solid-state memory 130, controller 124 in one implementation reads current values for cells in solid-state memory 130 and decodes the current values into data that can be transferred to host 101. Such data may be buffered by controller 124 before transferring the data to host 101 via host interface 128.

For data to be written to disk 138, controller 124 can encode the buffered data into write signal 38 which is provided to head 136 for magnetically writing data to the surface of disk 138.

In response to a read command for data stored on disk 138, controller 124 positions head 136 via VCM control signal 36 to magnetically read the data stored on the surface of disk 138. Head 136 sends the read data as read signal 38 to controller 124 for decoding, and the data is buffered in volatile memory 139 for transferring to host 101.

Example Garbage Collection and Data Relocation Processes

FIG. 3A is a flowchart for a garbage collection process that can be performed by either host 101 or by a DSD such as DSD 107 according to an embodiment. In block 302, an initial location is determined for data to be stored based on at least one attribute defined by host 101. Host 101 can use GC manager 10 to define at least one attribute or GC policy to tune or control where garbage collection should be performed (i.e., a source portion), when GC should be performed, where the valid data resulting from the GC should be relocated to (i.e., a destination portion), how to organize the relocated data in the destination portion, or where to initially store data during file creation.

Attributes that host 101 may define can include attributes of the data such as an expiration date for the data, a frequency of access of the data, ownership of the data, or a fragmentation level of the data. Host 101 may also define attributes that are conditions of a source portion that is garbage collected or conditions of a destination portion for storing valid data resulting from the GC operation.

For example, an attribute defined by host 101 may include a “data age” or expiration date used to determine whether certain data has expired. The data age or expiration date may be based on a data retention policy such as to remove all or substantially all files past a certain age or to remove files marked for deletion within a certain time period. In such an example, data may be grouped together by an expiration date so that data having the same expiration date are grouped together into one portion of system 100 (e.g., a particular logical volume, DSD, or portion of a DSD), so that the entire portion can be obsoleted at the same time without having to relocate much valid data.
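As a minimal sketch of the grouping described above, the following example buckets files by a host-defined expiration date so that each bucket could be placed in its own portion of a system. The function name `group_by_expiration` and the sample file names and dates are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from datetime import date


def group_by_expiration(files):
    """Map each expiration date to the names of the files that share it.

    files: iterable of (name, expiration_date) pairs (an assumed representation).
    """
    groups = defaultdict(list)
    for name, expires in files:
        groups[expires].append(name)
    return dict(groups)


# Hypothetical sample data; each resulting group would be stored in one portion.
files = [
    ("a.log", date(2030, 1, 1)),
    ("b.log", date(2030, 6, 1)),
    ("c.log", date(2030, 1, 1)),
]
placement = group_by_expiration(files)
```

When every file in a group expires together, the portion holding that group can be reclaimed with little or no valid data left to relocate.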

In another example, host 101 may define an attribute based on a frequency of access of data (i.e., “hot/cold” attributes) so that data which is accessed less frequently (i.e., “cold data”) can be grouped together in a portion of system 100 or so that data that is accessed more frequently (i.e., “hot data”) is grouped together in a portion of system 100. More frequently accessed data often results in more invalid data for GC, or in data that needs to be relocated more often, than less frequently accessed data. Grouping frequently accessed data in one source portion can therefore ordinarily enhance the effect of a single GC operation since a single GC operation in a source portion of frequently accessed data would free up more space than multiple GC operations in source portions that do not contain as much invalid data. In other words, by grouping the more frequently accessed data together, it is ordinarily possible to precondition certain portions of system 100 for data relocation and thereby reduce an overall number of GC or data relocation operations.

In this regard, host 101 may also define an attribute that considers a level of fragmentation in identifying a source portion for GC, such as an amount of capacity or performance that can be gained by performing GC in a particular source portion, so that GC is performed where it provides the greatest gain in usable capacity.

Host 101 may also define an attribute for GC based on the ownership of data so that data owned by a particular user is grouped together in a portion of system 100. If the data of the user then needs to be deleted or relocated, the GC or relocation of that user's data is then more isolated to a particular portion of system 100 and can have less of an effect on system wide performance.

As noted above, host 101 executing GC manager 10 may also define an attribute based on a condition of a source or destination portion. Such conditions of the source or destination portions can include, for example, a reliability condition, an environmental condition, a wear level, an available data capacity, a distance from previous users of the data, a network bandwidth available between the source and destination portions, an availability of the source or destination portions, or an energy cost in operating the source or destination portions.

In one example, the attribute may include a reliability condition of the source portion or the destination portion such as a status of head 136 (e.g., a head that may need to be disabled) or a level of errors encountered when writing data on disk 138. In another example, the attribute defined by host 101 may identify source portions for garbage collection that have encountered a high level of errors so that data can be relocated to a destination portion with a lower level of errors.

In other examples, host 101 may define an attribute such that data is relocated from a source portion that has been utilized more often to a destination portion that has been utilized less (e.g., based on a wear level). This can ordinarily allow for a longer life for media such as solid-state media or can reduce the negative effects of repeatedly writing to the same location on disk media, as discussed in more detail below. Host 101 may also define an attribute based on an available data capacity so that data is relocated to a destination portion with a greater available data capacity.

In another example, host 101 may define an environmental condition such as a temperature or vibration condition such that data is relocated from a source portion experiencing a high temperature or high vibration condition to a destination portion experiencing a lower temperature or lower vibration condition.

The attribute or attributes may also take into account network considerations so that relocated data can be accessed more quickly, the relocation of data is more efficient, or the data relocation has less of an impact on system performance. In one such example, host 101 defines the at least one attribute based on the location of previous users of data so that data is relocated to a physical location closer to the previous users of the data. Host 101 may also define an attribute so that there is a minimum network bandwidth between the source portion and the destination portion to improve the efficiency in relocating data in system 100.

The availability of the source or destination portions may also be considered. In such an example, host 101 may define an attribute based on an availability of the source or destination portions so that there is less activity or operations being performed at the source or destination portions.

In another implementation, an attribute may be defined by host 101 pertaining to a cost of operating the source and destination portions such that, for example, data is relocated from a source portion with a higher operating cost to a destination portion with a lower operating cost.
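The condition-based attributes discussed above can be sketched as a filter followed by a ranking over candidate destination portions. The `Portion` fields, the thresholds, and the choice to rank by wear level and then error rate are assumptions chosen for illustration; a host could combine conditions in many other ways.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    """Hypothetical summary of a candidate destination portion."""
    name: str
    error_rate: float     # level of errors encountered when writing
    wear_level: int       # how heavily the portion has been used
    temperature_c: float  # environmental condition reported by a sensor
    free_capacity: int    # available data capacity, in arbitrary units


def pick_destination(candidates, needed, max_temperature_c):
    """Return the least-worn eligible portion, or None if none qualifies."""
    eligible = [
        p for p in candidates
        if p.free_capacity >= needed and p.temperature_c <= max_temperature_c
    ]
    if not eligible:
        return None
    # Assumed policy: prefer lower wear, then lower error rate.
    return min(eligible, key=lambda p: (p.wear_level, p.error_rate))


candidates = [
    Portion("zone_a", error_rate=0.02, wear_level=5, temperature_c=40.0, free_capacity=100),
    Portion("zone_b", error_rate=0.01, wear_level=2, temperature_c=65.0, free_capacity=100),
    Portion("zone_c", error_rate=0.03, wear_level=3, temperature_c=42.0, free_capacity=100),
]
chosen = pick_destination(candidates, needed=50, max_temperature_c=50.0)
```

Here `zone_b` has the lowest wear but is excluded by the assumed temperature limit, so the sketch falls back to the next least-worn portion.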

Returning to the process of FIG. 3A, in block 304, a source portion is identified based on the at least one attribute from a plurality of source portions for a GC operation. In an embodiment where GC manager 10 identifies a source portion, the source portion may be a particular logical volume, DSD, or a portion of a DSD in system 100. In an embodiment where a controller of a DSD (e.g., controller 124 of DSD 107) identifies a source portion, the source portion can be a portion of DSD 107 such as zone 140 on disk 138.

In block 306, a destination portion is identified based on the at least one attribute for storing data resulting from garbage collecting the source portion. In an embodiment where GC manager 10 identifies a destination portion, the destination portion may be a particular logical volume, DSD, or portion of a DSD in system 100. In an embodiment where a controller of a DSD identifies a destination portion, the destination portion can be a portion of the DSD such as zone 142 on disk 138.

In block 308, GC is performed in the source portion into the destination portion. As discussed above, GC can be performed by copying valid data from the source portion to the destination portion and freeing the invalid or obsolete areas in the source portion to be overwritten.

In block 310, the source portion is designated as a new destination portion for a new GC operation. By rotating the destination portion, it is ordinarily possible to reduce the likelihood of uneven wear on a particular portion of system 100 that is repeatedly used as a destination portion. Such rotation of the destination portion can also help mitigate problems associated with repeatedly writing in the same location on disk media such as Adjacent Track Interference (ATI) or Wide Area Track Erasure (WATER).
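Blocks 304 through 310 can be sketched as follows under simplifying assumptions: each zone is modeled as a list of `(lba, is_valid)` entries, the attribute used to identify the source portion is the amount of invalid data, and the destination portion is given by name. The function `garbage_collect` and this data layout are hypothetical and not taken from the disclosure.

```python
def garbage_collect(zones, destination):
    """Garbage collect one source zone into the destination zone.

    zones: dict mapping a zone name to a list of (lba, is_valid) entries.
    destination: name of the current destination zone.
    Returns the name of the zone designated as the new destination.
    Assumes at least one zone other than the destination exists.
    """
    sources = {name: entries for name, entries in zones.items() if name != destination}
    # Block 304: identify the source portion (assumed attribute: most invalid data).
    source = max(
        sources,
        key=lambda name: sum(1 for _, valid in sources[name] if not valid),
    )
    # Block 308: copy valid data to the destination and free the source.
    zones[destination].extend(entry for entry in zones[source] if entry[1])
    zones[source] = []
    # Block 310: the freed source becomes the destination for the next GC operation.
    return source


zones = {
    "zone_a": [(1, True), (2, False), (3, False)],
    "zone_b": [(4, True), (5, True)],
    "zone_c": [],
}
new_destination = garbage_collect(zones, destination="zone_c")
```

Returning the freed source as the next destination mirrors the rotation of block 310, so that no single zone is repeatedly written as the destination.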

FIG. 3B is a flowchart for a data coherency process that can be performed by either host 101 or by a DSD such as DSD 107 according to an embodiment. This process can be performed in conjunction with the GC process of FIG. 3A or may be performed as part of another data relocation process to ensure coherency between the data being relocated from the source portion and the relocated data in the destination portion with a reduced impact on performance in the source portion.

In block 312, data to be relocated and/or a destination portion are identified based on an attribute defined by host 101. As discussed above with reference to FIG. 3A, the attribute can include an attribute of the data or a condition of system 100.

In block 314, data is relocated from a source portion to the destination portion in accordance with the identification of data and/or the location of the destination portion in block 312. The source portion and the destination portion may be in the same DSD such as with zones 140 and 142 in FIG. 2, or the source and destination portions may reside in or include different DSDs such as DSD 107 and DSD 108.

In block 316, it is determined whether a change was made in the source portion to relevant data while relocating data in block 314. Relevant data may include data that would otherwise have been relocated in block 314. For example, changes to data in the source portion that was not identified for relocation in block 312 based on the at least one attribute would not be considered a change to relevant data. In some implementations, the changes can be determined based on comparing metadata from before and after relocating the data in block 314. The metadata may result from the use of a Copy On Write (COW)-based file system that generates a change in metadata when there is a change in a file. Scanning the metadata of the file system can then show whether changes took place and where.

If there was a change to relevant data in the source portion during relocation, the changed relevant data is relocated from the source portion to the destination portion in block 318 and the process ends in block 320.

The process of FIG. 3B ordinarily provides for coherency between the source and destination portions while still allowing for the performance of changes to data in the source portion while relocating data in block 314. In some embodiments, changes made to relevant data in the source portion may be blocked in block 318 to further ensure coherency between the data being relocated from the source portion and the relocated data in the destination portion. If there was no change made to relevant data during relocation, the process ends in block 320 without performing block 318.

In other embodiments, controller 124 or host 101 may repeatedly perform blocks 316 and 318 until there is no change made to relevant data. With each iteration of blocks 316 and 318, fewer changes are expected since the time for relocating data should decrease. The process can end once there are no further changes to relevant data in the source portion.
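The iterative variant of the coherency process can be sketched as a loop that relocates data, checks for changes to relevant data, and relocates only what changed. In this sketch the stored values themselves stand in for file system metadata, and host writes arriving during relocation are simulated by the hypothetical `concurrent_writes` argument; neither detail comes from the disclosure.

```python
def relocate_with_coherency(source, destination, relevant, concurrent_writes):
    """Relocate relevant keys and repeat until no relevant data changed.

    source, destination: dicts mapping a key to its current value.
    relevant: set of keys identified for relocation (block 312).
    concurrent_writes: list of dicts; entry i is applied to the source during
        pass i to simulate host changes made while data is being relocated.
    Returns the number of relocation passes performed.
    """
    pending = set(relevant)
    passes = 0
    while pending:
        # Snapshot standing in for metadata captured before relocating.
        before = {key: source[key] for key in relevant}
        for key in pending:
            destination[key] = source[key]  # block 314, or block 318 on later passes
        if passes < len(concurrent_writes):
            source.update(concurrent_writes[passes])  # simulated host writes
        passes += 1
        # Block 316: only changes to relevant data require another pass.
        pending = {key for key in relevant if source[key] != before[key]}
    return passes


source = {"f1": "v1", "f2": "v1", "f3": "v1"}
destination = {}
passes = relocate_with_coherency(
    source, destination, relevant={"f1", "f2"},
    concurrent_writes=[{"f2": "v2", "f3": "v2"}],
)
```

The change to `f3` does not trigger another pass because `f3` was not identified for relocation, while the change to `f2` is picked up and relocated in a second, smaller pass.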

FIG. 4 depicts an example implementation environment including host 101 and DSDs 106, 107, 108, and 109 according to an embodiment. As shown in FIG. 4, host 101 includes a user space and a kernel space.

The user space includes GC manager 10, application 12, and recovery module 19 for reconstructing a file system after an error. Recovery module 19 can include a recovery tool similar to Check Disk (CHKDSK) or File System Consistency Check (FSCK), but on a system-wide level to handle inconsistencies or errors identified across different file systems and/or DSDs of system 100. In addition, recovery module 19 may consult with GC manager 10 for determining source or destination portions when relocating data.

As shown in FIG. 4, GC manager 10 takes in at least one attribute or policy that can include user tunable parameters for controlling GC or data relocation in the DSDs. As discussed above, the at least one attribute can include a condition of system 100 or an attribute of data stored in system 100. The at least one attribute can be used to identify a source portion of system 100, a destination portion, a time for performing GC or data relocation, or particular data to be relocated.

In determining when or where to perform GC, the GC policies or attributes can include an expected Input Output (IO) usage for the DSD such that GC can be performed on a drive when it is expected to have less IO usage so as to have less of an impact on system performance. A history of IO usage may come from information provided by host 101 or from a DSD in system 100. In another example, host 101 may define an attribute based on an IO usage associated with a time of day so that GC takes place when it would have less of an impact on system performance.
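One way to illustrate a time-of-day attribute is to select the hour with the lowest average IO usage from a usage history. The function `pick_gc_hour`, the shape of the history, and the sample values are assumptions for illustration; the disclosure does not specify how a history of IO usage is represented.

```python
def pick_gc_hour(io_history):
    """Return the hour whose observed IO usage has the lowest mean.

    io_history: dict mapping an hour of day (0-23) to a non-empty list of
    observed IO counts for that hour (an assumed representation).
    """
    return min(io_history, key=lambda hour: sum(io_history[hour]) / len(io_history[hour]))


# Hypothetical IO history for three hours of the day.
io_history = {2: [10, 20], 14: [900, 1100], 22: [50, 70]}
gc_hour = pick_gc_hour(io_history)
```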

The at least one attribute may also consider an availability of system resources (e.g., processing, storage, or bandwidth) which can include evaluating a level of activity of host 101 in system 100. In one example, the at least one attribute may specify a level of connection resources between the source portion and the destination portion so as to reduce a time for relocating data.

With reference to FIG. 4, GC manager 10 can relocate data from the source portion to the destination portion and can also influence the data placement policies of the file system, as indicated by the dashed line from GC manager 10 to FS 32, which can include, for example, a file system such as Ext4 or NILFS in the kernel space. This can allow for determining an initial location for data to be stored in system 100 based on the at least one attribute defined by host 101. By initially grouping or consolidating data based on the at least one attribute, the relocation of data is usually made more efficient since the data is less dispersed across different portions of system 100.

In the embodiment shown, GC manager 10 sits above the file system layer and can query portions of system 100 to determine a time or portions for performing GC based on the at least one attribute. GC manager 10 may also include a lower level module that can execute the processes of FIGS. 3A and 3B. In this example where GC manager 10 sits above the file system layer, it can identify valid versus expired/deleted data without relying on SCSI/ATA hints or notifications (e.g., TRIM or UNMAP commands).

The kernel space can be part of OS 20 executed by host 101 and includes storage stack 26 for interfacing with and networking DSDs 106, 107, 108, and 109. FS 32 organizes data stored in system 100 by interfacing with storage stack 26. In addition, application 12 can use FS 32 to retrieve and store data in DSDs 106, 107, 108, and 109 as user data.

FIG. 5 depicts another implementation environment including host 101 and DSDs 106, 107, 108, and 109 according to an embodiment. In the example of FIG. 5, host 101 executes GC manager 10, application 12, and recovery module 19, as with the example implementation environment of FIG. 4. Unlike the implementation environment of FIG. 4, the example of FIG. 5 also includes FS 16 and translation module 18 in the user space and FS intercept 24 in the kernel space.

FS intercept 24 interfaces with application 12 and can intercept read and write commands and pass the commands to FS 16 in the user space. FS 16 can include a file system implementing COW such as Linear Tape File System (LTFS). As noted above, a COW-based file system can allow for a relatively quick identification of changes to relevant data during data relocation by scanning the metadata of the file system for changes.

For its part, FS 16 generates commands that include a block address indicating a logical address for metadata or data associated with the command. The commands are accepted by translation module 18 which can translate the logical address into a device address for a particular DSD and/or translate between different interface protocols (e.g., ATA, SCSI). The device address can also identify a location in a zone of storage media (e.g., zone 142 of DSD 107).
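The address translation performed by translation module 18 can be illustrated with a minimal sketch. The layout assumed here (equal-sized zones, an equal number of zones per DSD, and logical addresses striped linearly across the DSDs) is an assumption made only for illustration, and the names DeviceAddress and translate are hypothetical rather than part of the described system.

```python
from typing import NamedTuple, Sequence


class DeviceAddress(NamedTuple):
    dsd: int     # identifier of the target DSD (illustrative numbering)
    zone: int    # zone index within that DSD
    offset: int  # block offset within the zone


def translate(lba: int, blocks_per_zone: int, zones_per_dsd: int,
              dsd_ids: Sequence[int]) -> DeviceAddress:
    """Map a logical block address to a (DSD, zone, offset) device address.

    Assumes a simple linear layout; a real translation module could use
    any mapping and could also convert between interface protocols.
    """
    blocks_per_dsd = blocks_per_zone * zones_per_dsd
    dsd_index, within_dsd = divmod(lba, blocks_per_dsd)
    if lba < 0 or dsd_index >= len(dsd_ids):
        raise ValueError("logical address out of range")
    zone, offset = divmod(within_dsd, blocks_per_zone)
    return DeviceAddress(dsd_ids[dsd_index], zone, offset)
```

Under these assumptions, a call such as translate(450, 100, 4, [106, 107, 108, 109]) resolves to the second DSD in the list, zone 0, offset 50.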

Translation module 18 passes the translated device address to storage stack 26 with the respective read or write command for the storage media. In practice, translation module 18 can be a plug-in driver without requiring modification to FS 16.

Other embodiments may include a different system implementation than the examples shown in FIGS. 4 and 5. For example, in other embodiments, translation module 18 may instead be part of the kernel space. In this regard, some of the modules may be assigned to different layers/spaces than as shown, and some may be split into additional modules or combined into fewer modules.

FIG. 6A is a flowchart for a GC process that can be performed by either host 101 or by a DSD such as DSD 107 according to an embodiment. In block 602, an initial location is determined for data to be stored in system 100 based on at least one attribute defined by host 101. As noted above, this initial placement of data based on the at least one attribute can later improve the efficiency of GC or data relocation based on the at least one attribute since the relevant data for relocation is not spread across different locations in system 100.

In block 604, a time for performing GC is determined. The time for GC can be determined so as to reduce the impact on system IO performance. Thus, host 101 or a DSD controller such as controller 124 may determine when to perform GC based on an availability of processing resources, an availability of the source portion or the destination portion, a time of day, or a level of activity of host 101. In this regard, GC can take place at a time when there is expected to be less activity in servicing other host commands so that the GC has less of an effect on the performance of system 100 in servicing host commands. For example, the time for GC can be at a time of day that has historically had less activity so that the GC does not interfere with the servicing of commands from host 101. The IO usage patterns can be either learned by a DSD controller or can be observed/provided by host 101. In addition, host 101 or a DSD controller may postpone GC for a source portion if modifications are being made to relevant data in the source portion.
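As one possible illustration of block 604, the following sketch picks an hour of the day for GC from a history of IO usage and postpones GC while relevant data in the source portion is being modified. The hourly sampling of IO counts and the names pick_gc_hour and should_run_gc are assumptions for this example only.

```python
from typing import Mapping, Sequence


def pick_gc_hour(io_history: Mapping[int, Sequence[int]]) -> int:
    """Return the hour of day (0-23) with the lowest mean observed IO count.

    io_history maps an hour of day to IO counts sampled at that hour,
    whether learned by a DSD controller or provided by the host.
    """
    averages = {hour: sum(samples) / len(samples)
                for hour, samples in io_history.items() if samples}
    if not averages:
        raise ValueError("no IO history available")
    # Ties are broken in favor of the earlier hour.
    return min(averages, key=lambda hour: (averages[hour], hour))


def should_run_gc(current_hour: int, gc_hour: int,
                  source_being_modified: bool) -> bool:
    """GC runs in the chosen hour unless the source portion is being modified."""
    return current_hour == gc_hour and not source_being_modified
```

Other inputs named above, such as the availability of processing resources, could be folded into the same decision as additional conditions.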

In block 606, a source portion is identified based on the at least one attribute from a plurality of source portions for a GC operation. In an embodiment where GC manager 10 identifies a source portion, the source portion may be a particular logical volume, DSD, or portion of a DSD in system 100. In an embodiment where a controller of a DSD such as controller 124 of DSD 107 identifies a source portion, the source portion can be a portion of DSD 107 such as zone 140 on disk 138.

In block 608, a destination portion is identified based on the at least one attribute for storing data resulting from garbage collecting the source portion. In an embodiment where GC manager 10 identifies a destination portion, the destination portion may be a particular logical volume, DSD, or portion of a DSD in system 100. Thus the source and destination portions may be on the same volume/DSD/portion of a DSD or on different volumes/DSDs/portions of a DSD. In an embodiment where controller 124 of DSD 107 identifies a destination portion, the destination portion can be a portion of DSD 107 such as zone 142 on disk 138.
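The selection in blocks 606 and 608 can be sketched as follows. The choice of attributes here (a fragmentation level for the source portion, and a wear level together with available capacity for the destination portion) is only one example of host-defined attributes, and PortionInfo, pick_source, and pick_destination are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PortionInfo:
    name: str
    fragmentation: float  # assumed metric: fraction of invalid data, 0.0 to 1.0
    wear: int             # assumed metric: count of prior uses as a destination
    free_blocks: int      # available data capacity in blocks


def pick_source(candidates: Sequence[PortionInfo]) -> PortionInfo:
    """Block 606 (illustrative): choose the most fragmented candidate."""
    return max(candidates, key=lambda p: p.fragmentation)


def pick_destination(candidates: Sequence[PortionInfo],
                     needed_blocks: int) -> PortionInfo:
    """Block 608 (illustrative): least-worn candidate with enough capacity."""
    eligible = [p for p in candidates if p.free_blocks >= needed_blocks]
    if not eligible:
        raise ValueError("no destination portion has enough capacity")
    return min(eligible, key=lambda p: p.wear)
```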

In block 610, GC is performed by identifying valid data in the source portion. Valid data is the most recent version of data that has not been made obsolete. In block 612, the valid data is copied into the destination portion and organized according to the at least one attribute. In one example, the copied data within the destination portion may be organized by an expiration date, frequency of access, or ownership of the data.

In block 614, the source portion is designated as a new destination portion for a new GC operation. As noted above with reference to FIG. 3A, rotating the destination portion can reduce uneven wear on a particular portion of system 100 that might otherwise be repeatedly used as a destination portion. Such rotation of the destination portion can also help mitigate problems associated with repeatedly writing in the same location on disk media such as Adjacent Track Interference (ATI) or Wide Area Track Erasure (WATER).

In block 616, the destination portion is set as available for storing data after completion of GC. This allows for the destination portion to be used for storing user data in addition to the copied data from the above GC process. In future GC operations, the destination portion may then serve as a source portion for performing GC.
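Blocks 610 through 616 can be summarized in a short sketch. The in-memory Record and Portion types, the valid flag, and the expires field are hypothetical stand-ins for on-media data and its metadata; the sketch is not the disclosed implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Record:
    lba: int
    valid: bool   # False once a newer version of this LBA exists elsewhere
    expires: int  # hypothetical attribute, e.g. an expiration day number


@dataclass
class Portion:
    name: str
    records: List[Record] = field(default_factory=list)


def garbage_collect(source: Portion, destination: Portion,
                    attribute: Callable[[Record], int]) -> Tuple[Portion, Portion]:
    """Return (new destination portion, portion now available for storing data)."""
    # Block 610: identify valid data in the source portion.
    valid = [r for r in source.records if r.valid]
    # Block 612: copy valid data, organized by the chosen attribute.
    destination.records.extend(sorted(valid, key=attribute))
    # The source now holds nothing that must be kept, so it can be overwritten.
    source.records.clear()
    # Block 614: the source becomes the new destination portion.
    # Block 616: the old destination is available for storing data.
    return source, destination
```

Passing a different attribute function, such as one keyed on frequency of access or ownership, changes how the copied data is grouped without changing the rest of the flow.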

FIG. 6B is a flowchart for a data coherency process that can be performed by either host 101 or by a DSD such as DSD 107 according to an embodiment. This process can be performed in conjunction with the GC processes of FIG. 3A or 6A, or may be performed as part of another data relocation process to ensure coherency between the data being relocated from the source portion and the relocated data in the destination portion with a reduced impact on performance in the source portion.

In block 618, a time for relocating data is determined. As noted above, the time for data relocation can be determined so as to reduce the impact on system IO performance. Thus, host 101 or a DSD controller such as controller 124 may determine when to relocate data based on an availability of processing resources, an availability of the source portion or the destination portion, a time of day, or a level of activity of host 101. In this regard, data relocation can take place at a time when there is expected to be less activity in servicing other host commands so that the data relocation has less of an effect on the performance of system 100 in servicing host commands. For example, the time for data relocation can be at a time of day that has historically had less activity so that the data relocation does not interfere with the servicing of commands from host 101. The IO usage patterns can be either learned by a DSD controller or can be provided by host 101. In addition, host 101 or a DSD controller may postpone GC for a source portion if modifications are being made to relevant data in the source portion.

In block 620, data to be relocated and/or a destination portion are identified based on an attribute defined by host 101. As discussed above, the attribute can include an attribute of the data or a condition of system 100.

In block 622, data is relocated from a source portion to the destination portion in accordance with the identification of data and/or the location of the destination portion in block 620. The source portion and the destination portion may be in the same DSD such as with zones 140 and 142 in FIG. 2, or the source and destination portions may reside in or include different DSDs such as DSDs 107 and 108.

In block 624, it is determined whether a change was made in the source portion to relevant data while relocating data in block 622. Relevant data can include data that would otherwise have been relocated in block 622. For example, a change in the source portion to data that was not identified for relocation in block 620 based on the at least one attribute may not be considered a change to relevant data. In some implementations, the changes can be determined by comparing metadata from before and after relocating the data. By not blocking changes in the source portion during the relocation of data, the performance of write commands to the source portion is not hindered by the relocation of data.

If there was no change made to relevant data during relocation in block 622, the process proceeds to block 632 to set the destination portion as available for storing data.

On the other hand, if there was a change to relevant data in the source portion during relocation in block 622, the changed relevant data is relocated from the source portion to the destination portion in block 626. As with block 622, the relocation of the changed relevant data does not prevent the performance of write commands in the source portion. Since the time to relocate any changed relevant data in block 626 is likely less than the time to initially relocate data in block 622, it is less likely that there are additional changes made to relevant data while relocating the changed relevant data in block 626.

Another check is performed in block 628 to determine if any changes were made to additional relevant data while relocating the changed relevant data in block 626. If so, the additional changed relevant data is relocated from the source portion to the destination portion in block 630 while blocking further changes to the source portion.

In other embodiments, there may be more iterations of blocks 624 and 626, or blocks 624 and 626 may be performed repeatedly, without blocking changes, until no changes are made to relevant data in the source portion. This ordinarily allows for the IO performance of the source portion to remain generally unchanged during data relocation while maintaining coherency between the relevant data stored in the source and destination portions. With each iteration of blocks 624 and 626, fewer changes are expected since the time for relocating data should decrease.
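The coherency loop of blocks 622 through 630 can be sketched as follows. Concurrent host writes are simulated here by a list of per-pass write batches, which is purely a testing device; the dictionary-based portions and the name relocate_with_coherency are likewise hypothetical.

```python
from typing import Dict, Iterable, Set


def relocate_with_coherency(source: Dict[int, str], destination: Dict[int, str],
                            relevant: Set[int],
                            concurrent_writes: Iterable[Dict[int, str]],
                            max_unblocked_passes: int = 2) -> int:
    """Relocate relevant LBAs from source to destination; return passes made.

    concurrent_writes yields, for each unblocked pass, the host writes that
    land in the source portion while that pass is copying (simulated).
    """
    writes = iter(concurrent_writes)
    pending = set(relevant)
    passes = 0
    for _ in range(max_unblocked_passes):
        if not pending:
            break
        # Blocks 622/626: copy what is currently known to need relocation.
        snapshot = {lba: source[lba] for lba in pending}
        # Host writes are not blocked, so they may land during the copy.
        changed = next(writes, {})
        source.update(changed)
        destination.update(snapshot)
        passes += 1
        # Blocks 624/628: only changes to relevant data need another pass.
        pending = {lba for lba in changed if lba in relevant}
    if pending:
        # Block 630: final pass while further changes to the source are blocked.
        destination.update({lba: source[lba] for lba in pending})
        passes += 1
    return passes
```

With the default of two unblocked passes, the flow follows FIG. 6B; a larger value corresponds to embodiments with more iterations before changes are blocked.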

In block 632, the destination portion is set as available for storing data. This allows for the destination portion to be used for storing user data in addition to the relocated data from the above data relocation process. In future data relocation operations, the destination portion may then serve as a source portion.

In block 634, a new destination portion is identified for the further relocation of data. The identification of the new destination portion can be based on the at least one attribute defined by host 101 without considering the previously used destination portion so that the destination portion rotates within system 100.

FIG. 7 is a conceptual diagram illustrating the assignment of a zone as a destination portion for GC and the assignment of each of the remaining zones as a logical volume used by host 101 for storing data according to an embodiment. In FIG. 7, zones 0, 1 and 2 are each mapped to volumes B, C, and A, respectively, via a file system for the zones and mapping table 28. In other embodiments, different zones may each use a different file system.

A single zone in FIG. 7 can include a particular physical portion of a disk such as zone 140 or zone 142 of disk 138, or a single zone can include a portion of a solid-state memory such as one or more blocks 131 in solid-state memory 130. In other embodiments, a single zone can comprise an entire disk surface or an entire DSD.

The shading of the volumes and the corresponding shading of the zones shows the mapping correspondence and the level of fragmentation for each of the volumes/zones. The darker shading of volumes/zones indicates a higher level of fragmentation for the volume/zone.

Zone N is a floating spare zone in FIG. 7 for storing data resulting from GC of another zone. For example, zone 1 may be garbage collected into zone N or any one or more of a set of floating spare zones along the lines of the GC process described above for FIG. 3A or 6A. After completing GC, zone N is mounted as the new volume C and zone 1 (previously mapped to volume C) is assigned as the new zone N for a subsequent GC operation. By rotating the destination portion, it is ordinarily possible to reduce wear on a particular zone used to store data resulting from GC. In addition, a particular zone with a lower use may be targeted or identified as the destination portion. The host defined attribute would then be based on a previous usage of the destination portion so that a zone with a lower usage or wear is identified as the destination portion. In other embodiments, zone N may include multiple zones for storing data resulting from GC.
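The rotation of the floating spare zone in FIG. 7 amounts to a swap in the volume-to-zone mapping, sketched below. The dictionary form of the mapping and the name rotate_spare are assumptions for illustration and do not describe the actual format of mapping table 28.

```python
from typing import Dict, Tuple


def rotate_spare(mapping: Dict[str, int], spare_zone: int,
                 volume: str) -> Tuple[Dict[str, int], int]:
    """Remap a volume onto the spare zone after GC.

    Returns the updated volume-to-zone mapping and the zone that becomes
    the new floating spare for a subsequent GC operation.
    """
    collected_zone = mapping[volume]   # zone that was just garbage collected
    new_mapping = dict(mapping)        # leave the caller's mapping untouched
    new_mapping[volume] = spare_zone   # spare zone is mounted as the volume
    return new_mapping, collected_zone
```

For the mapping of FIG. 7, with volumes B, C, and A on zones 0, 1, and 2, garbage collecting volume C into zone N leaves volume C on zone N and makes zone 1 the new spare.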

FIG. 8 is a conceptual diagram illustrating the assignment of multiple zones to a logical volume and the assignment of one zone as a destination portion for GC according to an embodiment. In FIG. 8, zones 0 to N-1 are mapped to a single logical volume. The mapping or assignment of multiple zones to a single logical volume can be used to accommodate large files whose size may otherwise exceed the size of a zone. Such large files may, for example, cover a disk platter surface and have a size of thousands of megabytes.

In other embodiments, fewer zones may be mapped to a single logical volume. For example, a first pair of zones can be mapped to a first logical volume and a second pair of zones can be mapped to a second logical volume. In addition, other implementations can include GC of multiple zones into a single zone or GC of a single zone into multiple zones.

As with FIG. 7, a single zone in FIG. 8 can include a particular physical portion of a disk such as zone 140 or zone 142 of disk 138, or a single zone can include a portion of a solid-state memory such as one or more blocks 131 in solid-state memory 130. In other embodiments, a single zone can comprise an entire disk surface or an entire DSD.

In contrast to the implementation of FIG. 7, the implementation depicted in FIG. 8 can allow for the GC process to be hidden from the user level since it is outside of mapping table 28 and resides at a lower level than the file system.

The grey shading indicates that the fragmentation level of the entire volume is an average of the fragmentation level of the corresponding zones. A GC process as in FIG. 3A or 6A is performed at the zone level with zone N serving as a floating spare zone or destination portion which can rotate. Upon completion of a GC process, zone N can be mapped to the volume via the file system and mapping table 28 and one of the zones previously mapped to the logical volume is mapped out to serve as a new floating spare zone or destination portion.
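The arrangement of FIG. 8 can be illustrated with a brief sketch in which the fragmentation level of the volume is the average over its zones and GC swaps one zone of the volume with the floating spare. Selecting the most fragmented zone as the source is an assumption made for this example, as are the function names.

```python
from statistics import mean
from typing import Dict, List, Tuple


def volume_fragmentation(zone_frag: Dict[int, float],
                         volume_zones: List[int]) -> float:
    """Fragmentation of the volume as the average over its zones."""
    return mean(zone_frag[z] for z in volume_zones)


def swap_in_spare(volume_zones: List[int], spare_zone: int,
                  zone_frag: Dict[int, float]) -> Tuple[List[int], int]:
    """Pick the most fragmented zone as the GC source (an assumption here),
    map the spare into its place, and return the new zone list together
    with the zone that becomes the new floating spare.
    """
    source = max(volume_zones, key=lambda z: zone_frag[z])
    new_zones = [spare_zone if z == source else z for z in volume_zones]
    return new_zones, source
```

Because the swap happens below the file system, the logical volume seen at the user level keeps the same identity before and after the GC operation.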

Other Embodiments

Those of ordinary skill in the art will appreciate that the various illustrative logical blocks, modules, and processes described in connection with the examples disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. Furthermore, the foregoing processes can be embodied on a computer readable medium which causes a processor or computer to perform or execute certain functions.

To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, and modules have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Those of ordinary skill in the art may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.

The various illustrative logical blocks, units, modules, and controllers described in connection with the examples disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

The activities of a method or process described in connection with the examples disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The steps of the method or algorithm may also be performed in an alternate order from those provided in the examples. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable media, an optical media, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC).

The foregoing description of the disclosed example embodiments is provided to enable any person of ordinary skill in the art to make or use the embodiments in the present disclosure. Various modifications to these examples will be readily apparent to those of ordinary skill in the art, and the principles disclosed herein may be applied to other examples without departing from the spirit or scope of the present disclosure. The described embodiments are to be considered in all respects only as illustrative and not restrictive and the scope of the disclosure is, therefore, indicated by the following claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims

1. A method for managing data in a data storage system including a host and at least one Data Storage Device (DSD) including a plurality of zones for storing data, the method comprising: assigning one or more zones of the plurality of zones as a destination portion in the at least one DSD for storing data resulting from a garbage collection operation; assigning multiple remaining zones of the plurality of zones as at least one logical volume used by the host for storing data, wherein the one or more zones assigned as the destination portion are outside of the at least one logical volume; identifying, based on at least one attribute defined by the host, a source portion for the garbage collection operation from a plurality of source portions in the multiple remaining zones assigned as the at least one logical volume; and performing garbage collection of data in the source portion into the destination portion.
2. The method of claim 1, further comprising: identifying valid data in the source portion during the garbage collection operation; and organizing the valid data in the destination portion according to the at least one attribute.
3. The method of claim 1, further comprising identifying valid data in the source portion during the garbage collection operation based on the at least one attribute such that the valid data is grouped together with other data in the destination portion having the at least one attribute.
4. The method of claim 1, wherein the at least one attribute includes at least one of an expiration date for the data, a frequency of access of the data, ownership of the data, or a fragmentation level of the data.
5. The method of claim 1, wherein the at least one attribute includes a reliability condition of the source portion or the destination portion, an environmental condition of the source portion or the destination portion, a wear level of the source portion or the destination portion, an available data capacity of the source portion or the destination portion, a distance of the source portion or the destination portion from previous users of the data, a network bandwidth available between the source portion and the destination portion, an availability of the source portion or the destination portion, or an energy cost in operating the source portion or the destination portion.
6. The method of claim 5, wherein the environmental condition of the source portion or the destination portion includes a temperature condition or a vibration condition.
7. The method of claim 5, wherein the reliability condition of the source portion or the destination portion includes a status of a head used for writing data or a level of errors encountered when writing data.
8. The method of claim 1, further comprising determining a time for performing garbage collection.
9. The method of claim 8, wherein determining the time for garbage collection is based on at least one of an availability of processing resources, an availability of the destination portion or the source portion, a time of day, or a level of activity for the host.
10. The method of claim 1, wherein after completion of garbage collection, the method further comprises setting the destination portion as available for storing data.
11. The method of claim 1, wherein the source portion and the destination portion are located in separate DSDs of the at least one DSD.
12. The method of claim 1, wherein the source portion and the destination portion are located in the same DSD of the at least one DSD.
13. The method of claim 1, further comprising: after performing garbage collection of data in the source portion into the destination portion, mapping at least one zone of the one or more zones assigned as the destination portion to the at least one logical volume; mapping at least one zone of the multiple remaining zones out of the at least one logical volume; and assigning the at least one zone mapped out of the at least one logical volume as a new destination portion.
14. A data storage system for storing data, the data storage system comprising: a host including a processor; and at least one Data Storage Device (DSD) in communication with the host, the at least one DSD including a plurality of zones for storing data; wherein the processor is configured to: define at least one attribute for performing garbage collection in the at least one DSD; assign one or more zones of the plurality of zones as a destination portion in the at least one DSD for storing data resulting from a garbage collection operation; assign multiple remaining zones of the plurality of zones as at least one logical volume used by the host for storing data, wherein the one or more zones assigned as the destination portion are outside of the at least one logical volume; identify, based on at least one attribute, a source portion for the garbage collection operation from a plurality of source portions in the multiple remaining zones assigned as the at least one logical volume; and perform garbage collection of data in the source portion into the destination portion.
15. The data storage system of claim 14, wherein the processor is further configured to: identify valid data in the source portion during the garbage collection operation; and organize the valid data in the destination portion according to the at least one attribute.
16. The data storage system of claim 14, wherein the processor is further configured to identify valid data in the source portion during the garbage collection operation based on the at least one attribute such that the valid data is grouped together with other data in the destination portion having the at least one attribute.
17. The data storage system of claim 14, wherein the at least one attribute includes at least one of an expiration date for the data, a frequency of access of the data, ownership of the data, or a fragmentation level of the data.
18. The data storage system of claim 14, wherein the at least one attribute includes a reliability condition of the source portion or the destination portion, an environmental condition of the source portion or the destination portion, a wear level of the source portion or the destination portion, an available data capacity of the source portion or the destination portion, a distance of the source portion or the destination portion from previous users of the data, a network bandwidth available between the source portion and the destination portion, an availability of the source portion or the destination portion, or an energy cost in operating the source portion or the destination portion.
19. The data storage system of claim 18, wherein the environmental condition of the source portion or the destination portion includes a temperature condition or a vibration condition.
20. The data storage system of claim 18, wherein the reliability condition of the source portion or the destination portion includes a status of a head used for writing data or a level of errors encountered when writing data.
21. The data storage system of claim 14, wherein the processor is further configured to determine a time for performing garbage collection.
22. The data storage system of claim 21, wherein the processor is further configured to determine the time for garbage collection based on at least one of an availability of processing resources, an availability of the destination portion or the source portion, a time of day, or a level of activity for the host.
23. The data storage system of claim 14, wherein the processor is further configured to set the destination portion as available for storing data after completion of the garbage collection.
24. The data storage system of claim 14, wherein the source portion and the destination portion are located in separate DSDs of the at least one DSD.
25. The data storage system of claim 14, wherein the source portion and the destination portion are located in the same DSD of the at least one DSD.
26. The data storage system of claim 14, wherein the processor is further configured to: after performing garbage collection of data in the source portion into the destination portion, map at least one zone of the one or more zones assigned as the destination portion to the at least one logical volume; map at least one zone of the multiple remaining zones out of the at least one logical volume; and assign the at least one zone mapped out of the at least one logical volume as a new destination portion.
27. A Data Storage Device (DSD) in communication with a host, the DSD comprising: a non-volatile memory including a plurality of zones for storing data; and a controller configured to: receive at least one attribute defined by the host for performing garbage collection in the non-volatile memory; assign one or more zones of the plurality of zones as a destination portion in the at least one DSD for storing data resulting from a garbage collection operation; assign multiple remaining zones of the plurality of zones as at least one logical volume used by the host for storing data, wherein the one or more zones assigned as the destination portion are outside of the at least one logical volume; identify, based on the at least one attribute defined by the host, a source portion for the garbage collection operation from a plurality of source portions in the multiple remaining zones assigned as the at least one logical volume; and perform garbage collection of data in the source portion into the destination portion.
28. The DSD of claim 27, wherein the controller is further configured to: identify valid data in the source portion during the garbage collection process; and organize the valid data in the destination portion according to the at least one attribute.
29. The DSD of claim 27, wherein the controller is further configured to identify valid data in the source portion during the garbage collection process based on the at least one attribute such that the valid data is grouped together with other data in the destination portion having the at least one attribute.
30. The DSD of claim 27, wherein the at least one attribute includes at least one of an expiration date for the data, a frequency of access of the data, ownership of the data, or a fragmentation level of the data.
31. The DSD of claim 27, wherein the at least one attribute includes a reliability condition of the source portion or the destination portion, an available data capacity of the source portion or the destination portion, an availability of the source portion or the destination portion, or a wear level of the source portion or the destination portion.
32. The DSD of claim 31, wherein the reliability condition of the source portion or the destination portion includes a status of a head used for writing data or a level of errors encountered when writing data.
33. The DSD of claim 27, wherein the controller is further configured to determine a time for performing garbage collection.
34. The DSD of claim 33, wherein the controller is further configured to determine the time for garbage collection based on at least one of an availability of processing resources, an availability of the destination portion or the source portion, a time of day, or a level of activity for the host.
35. The DSD of claim 27, wherein the controller is further configured to set the destination portion as available for storing data after completion of the garbage collection.
36. The DSD of claim 27, wherein the controller is further configured to: after performing garbage collection of data in the source portion into the destination portion, map at least one zone of the one or more zones assigned as the destination portion to the at least one logical volume; map at least one zone of the multiple remaining zones out of the at least one logical volume; and assign the at least one zone mapped out of the at least one logical volume as a new destination portion.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 61/932,113, filed on Jan. 27, 2014, which is hereby incorporated by reference in its entirety.

Related Publications (1)

	Number	Date	Country
	20150212938 A1	Jul 2015	US

Provisional Applications (1)

	Number	Date	Country
	61932113	Jan 2014	US
