The disclosed embodiments relate generally to scheduling operations in non-volatile memory devices (e.g., NAND flash memory devices), and in particular to assigning operations to respective non-volatile memory devices based at least in part on preference values of the non-volatile memory devices for the operations.
In high performance flash memory architectures, performance is maximized by using system resources as efficiently as possible. Conventional systems attempt to improve efficiency by allowing multiple operations to be worked on independently through the use of queues, pipelines, and parallel operations. However, when resources in such systems are poorly scheduled, these systems may exhibit a “slinky effect,” such that a system bottleneck moves from one resource to another over time. For example, in a NAND system, a NAND bus interface may be a short-term bottleneck when data are transferred to sets of idle NAND dies. Once the dies begin programming, the NAND bus is idle. Therefore, the idle time on NAND dies and NAND buses due to poor scheduling leads to poor system performance.
Without limiting the scope of the appended claims, after considering this disclosure, and particularly after considering the section entitled “Detailed Description” one will understand how the aspects of various embodiments are implemented and used to manage operations performed within non-volatile storage devices, in order to improve the performance of non-volatile storage devices. In some embodiments, in a storage controller of a storage system, a plurality of memory operations to be performed by a plurality of non-volatile memory devices coupled to the storage controller are identified. The number of memory operations in the plurality of memory operations typically is no greater than the number of non-volatile memory devices in the plurality of non-volatile memory devices, each memory operation is to be performed by a distinct non-volatile memory device, and the memory operations include host writes, garbage collection writes, and garbage collection reads. For each non-volatile memory device, preference values are assigned to each of the memory operations. Each memory operation is then assigned to a distinct non-volatile memory device, using the preference values assigned to each of the memory operations for each non-volatile memory device.
So that the present disclosure can be understood in greater detail, a more particular description may be had by reference to the features of various embodiments, some of which are illustrated in the appended drawings. The appended drawings, however, merely illustrate pertinent features of the present disclosure and are therefore not to be considered limiting, for the description may admit to other effective features.
In accordance with common practice, the various features illustrated in the drawings may not be drawn to scale. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may not depict all of the components of a given system, method or device. Finally, like reference numerals may be used to denote like features throughout the specification and figures.
Poorly scheduled resources in a storage system negatively impact the storage system performance. Consequently, what is desired are scheduling mechanisms that efficiently use system resources to perform heterogeneous operations (e.g., host writes, garbage collection writes, and/or garbage collection reads). Efficient use of system resources reduces the “slinky effect,” and thereby reduces system bottlenecks. In embodiments described herein, a memory operation scheduling problem (sometimes called a scheduling task, for scheduling a set of memory operations) is mapped into an optimization problem, such as the Stable Marriage Problem or the Assignment Problem, and solved accordingly to improve system performance.
In the Stable Marriage Problem, first and second groups each have n members, where n is an integer greater than one. Each member of each group ranks every member of the other group in order of preference, from 1 to n. Each member of each group is assigned (e.g., married) to a member of the other group, such that there are no two members of different groups who would rather be assigned (e.g., married) to each other than to their assigned partners. In the context of storage systems, the first group is memory operations and the second group is memory devices (e.g., memory dies). The Stable Marriage Problem can be solved, for example, using the Gale-Shapley iterative algorithm.
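As a minimal sketch of how the Gale-Shapley algorithm could be applied in this setting, the following Python function treats memory operations as the proposing group and memory devices as the accepting group. The function name, the dictionary layout, and the choice of which group proposes are illustrative assumptions and are not recited in this disclosure.

```python
def gale_shapley(op_prefs, dev_prefs):
    """Return a stable matching as a dict {operation: device}.

    op_prefs[o]  -- devices ranked by operation o, most preferred first
    dev_prefs[d] -- operations ranked by device d, most preferred first
    Both groups are assumed to have the same number n of members.
    """
    # rank[d][o] is the position of operation o in device d's list (lower is better).
    rank = {d: {o: i for i, o in enumerate(prefs)} for d, prefs in dev_prefs.items()}
    next_choice = {o: 0 for o in op_prefs}  # next device each operation proposes to
    held = {}                               # device -> operation it tentatively holds
    free = list(op_prefs)                   # operations not yet matched

    while free:
        o = free.pop()
        d = op_prefs[o][next_choice[o]]
        next_choice[o] += 1
        if d not in held:
            held[d] = o                     # device was idle: accept the proposal
        elif rank[d][o] < rank[d][held[d]]:
            free.append(held[d])            # device prefers o: release the old match
            held[d] = o
        else:
            free.append(o)                  # device rejects o: o proposes again later
    return {o: d for d, o in held.items()}
```

In this sketch, the matching that results is stable in the sense described above: no operation and device would both prefer each other to their assigned partners.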
In the Assignment Problem, a group of agents and a group of tasks each have n members. Any agent may be assigned to perform any task, at a cost that varies as a function of the agent-task assignment. Each task is assigned a single agent and each agent is assigned a single task such that the total cost is minimized. The Assignment Problem can be solved, for example, using the Hungarian Algorithm.
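For illustration only, the Assignment Problem can be stated in a few lines of Python. The sketch below uses exhaustive search over permutations in place of the Hungarian Algorithm, so it is practical only for small n. The cost-matrix layout and the function name are assumptions made for this example.

```python
from itertools import permutations


def solve_assignment(cost):
    """Minimize total cost for an n x n matrix where cost[a][t] is the
    cost of agent a performing task t.

    Returns (assignment, total), where assignment[a] is the task given to agent a.
    Brute force: tries all n! one-to-one assignments.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[a][perm[a]] for a in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total
```

In a storage-system setting, the agents could be memory devices and the tasks could be memory operations, with each matrix entry derived from the weight a device assigns to an operation.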
Solution of the Stable Marriage Problem or the Assignment Problem in the context of storage-system scheduling thus allows memory operations to be assigned efficiently to respective memory devices.
(A1) More specifically, some embodiments include a method of managing a storage system. The method is performed in a storage system having a storage controller and non-volatile memory devices, and includes identifying a plurality of memory operations to be performed by a plurality of non-volatile memory devices in the storage system. The number of memory operations in the plurality of memory operations is no greater than the number of non-volatile memory devices in the plurality of non-volatile memory devices; each memory operation is to be performed by a distinct non-volatile memory device; and the memory operations include host writes, garbage collection writes, and garbage collection reads. The method also includes, for each non-volatile memory device, assigning preference values to each of the memory operations. The method further includes assigning each memory operation to a distinct non-volatile memory device, using the preference values assigned to each of the memory operations for each non-volatile memory device.
(A2) In some embodiments of the method of A1, the plurality of non-volatile memory devices includes a plurality of memory dies. Each non-volatile memory device includes a distinct memory die of the plurality of memory dies.
(A3) In some embodiments of the method of A1 or A2, the storage controller manages a plurality of processes. Each memory operation is part of a process. A respective process includes memory operations of a common type selected from the group consisting of host writes, garbage collection writes, and garbage collection reads. The memory operations of a respective process include memory operations directed to respective pages in each of the non-volatile memory devices.
(A4) In some embodiments of the method of A3, the plurality of non-volatile memory devices includes a first memory device and remaining memory devices. The respective pages of the remaining memory devices store data. The respective page of the first memory device stores parity information corresponding to the data stored in the respective pages of the remaining memory devices.
(A5) In some embodiments of the method of A3 or A4, assigning preference values includes, for a respective non-volatile memory device: determining that a first memory operation of the plurality of memory operations is associated with a process for which no more than a specified number of memory operations are incomplete, and in response to the determining, assigning preference values to the memory operations that indicate a preference of the respective non-volatile memory device for the first memory operation over other memory operations of the plurality of memory operations.
(A6) In some embodiments of the method of A5, the determining includes determining that the first memory operation is the only remaining incomplete memory operation for its process.
(A7) In some embodiments of the method of any one of A1 to A6, assigning preference values includes, for a respective non-volatile memory device: determining that all memory operations that the respective non-volatile memory device currently can perform are of a first type selected from the group consisting of host writes, garbage collection writes, and garbage collection reads; and, in response to the determining, assigning preference values to the memory operations that indicate a preference of the respective non-volatile memory device for the first type of memory operation over other types of memory operations.
(A8) In some embodiments of the method of any one of A1 to A7, the storage controller includes a front-end controller and a plurality of back-end controllers coupled to the front-end controller. Each back-end controller is coupled to a respective subset of the plurality of non-volatile memory devices. The front-end controller receives host write requests and schedules garbage collection writes in accordance with the host write requests. The back-end controllers schedule garbage collection reads, wherein each garbage collection read corresponds to a respective garbage collection write. The front-end controller performs the identifying, the assigning of preference values, and the assigning of each memory operation.
(A9) In some embodiments of the method of any one of A1 to A8, the method further includes, for each memory operation of the plurality of memory operations, assigning preference values to each of the non-volatile memory devices. Assigning each memory operation to a distinct non-volatile memory device is performed using both the preference values assigned to each of the non-volatile memory devices for each memory operation and the preference values assigned to each of the memory operations for each non-volatile memory device.
(A10) In some embodiments of the method of A9, the storage controller includes a front-end controller and a plurality of back-end controllers coupled to the front-end controller. Each back-end controller is coupled to a respective subset of the plurality of non-volatile memory devices. Assigning the preference values to each of the non-volatile memory devices includes, for a respective memory operation of the plurality of memory operations, assigning preference values that indicate a preference of the respective memory operation for a first subset of the plurality of non-volatile memory devices over other subsets of the plurality of non-volatile memory devices.
(A11) In some embodiments of the method of any one of A1 to A10, assigning preference values to each of the memory operations includes, for each non-volatile memory device, ranking the memory operations in order of preference. Assigning each memory operation to a distinct non-volatile memory device includes solving the Stable Marriage Problem in accordance with the ranking of the memory operations in order of preference for each non-volatile memory device.
(A12) In some embodiments of the method of A11, the method further includes partitioning the memory operations into a group of host writes, a group of garbage collection writes, and a group of garbage collection reads. Ranking the memory operations in order of preference includes ranking the groups and ordering the memory operations by the ranked groups.
(A13) In some embodiments of the method of A11 or A12, the method further includes, for each memory operation of the plurality of memory operations, ranking the non-volatile memory devices in order of preference. Solving the Stable Marriage Problem is performed in accordance with both the ranking of the non-volatile memory devices in order of preference for each memory operation and the ranking of the memory operations in order of preference for each non-volatile memory device.
(A14) In some embodiments of the method of any one of A1 to A10, assigning preference values to each of the memory operations includes, for each non-volatile memory device, assigning weights to the memory operations. Assigning each memory operation to a distinct non-volatile memory device includes solving the Assignment Problem in accordance with the weights of the memory operations for each non-volatile memory device.
(A15) In some embodiments of the method of A14, the method further includes partitioning the memory operations into a group of host writes, a group of garbage collection writes, and a group of garbage collection reads. Assigning weights to the memory operations includes, for each non-volatile memory device, assigning a first weight to each operation in the group of host writes, assigning a second weight to each operation in the group of garbage collection writes, and assigning a third weight to each operation in the group of garbage collection reads, wherein the first, second, and third weights are distinct.
(A16) In some embodiments of the method of A14 or A15, the method further includes, for each memory operation of the plurality of memory operations, ranking the non-volatile memory devices in order of preference. Solving the Assignment Problem is performed in accordance with both the ranking of the non-volatile memory devices in order of preference for each memory operation and the weights of the memory operations for each non-volatile memory device.
(A17) In some embodiments of the method of any one of A1 to A16, identifying the number of memory operations to be performed by the non-volatile memory devices includes determining a ratio of host writes to garbage collection writes, based on a write amplification of the storage system; selecting host writes and garbage collection writes such that the number of host writes and the number of garbage collection writes satisfy the ratio; and selecting garbage collection reads such that the number of garbage collection reads equals the number of garbage collection writes.
(A18) In another aspect, a storage system includes a plurality of non-volatile memory devices, one or more processors, and memory storing one or more programs configured for execution by the one or more processors. The one or more programs include instructions for performing the method of any one of A1 to A17 described above. Alternatively stated, the one or more programs include instructions that when executed by the one or more processors, cause the storage system to perform the method of any one of A1 to A17 described above.
(A19) In yet another aspect, a non-transitory computer-readable storage medium stores one or more programs configured for execution by one or more processors of a storage system that further includes a plurality of non-volatile memory devices. The one or more programs include instructions for performing the method of any one of A1 to A17 described above.
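The preference-value embodiments above (e.g., A5, A6, A12, and A15) can be sketched as follows for a single non-volatile memory device. This is a hypothetical illustration: the `MemOp` record, the function names, the tie-breaking by operation identifier, and the default threshold of one incomplete operation are all assumptions rather than features of the disclosed embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    op_id: int
    kind: str        # "host_write", "gc_write", or "gc_read"
    process_id: int  # process that this operation belongs to


def rank_ops_for_device(ops, incomplete_per_process, group_order, threshold=1):
    """Order ops from most to least preferred for one device.

    incomplete_per_process[p] -- number of unfinished operations in process p
    group_order               -- operation kinds, most preferred group first
    threshold                 -- assumed "specified number" of incomplete operations
    """
    group_rank = {kind: i for i, kind in enumerate(group_order)}

    def key(op):
        nearly_done = incomplete_per_process[op.process_id] <= threshold
        # Operations that would finish a process sort first, then by ranked group.
        return (not nearly_done, group_rank[op.kind], op.op_id)

    return sorted(ops, key=key)


def weights_for_device(ops, group_weight):
    """Give every operation the weight of its group (one distinct weight per group)."""
    return {op.op_id: group_weight[op.kind] for op in ops}
```

A ranking produced this way could serve as input to a Stable Marriage solver, while the per-group weights could populate a cost matrix for an Assignment Problem solver.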
Numerous details are described herein in order to provide a thorough understanding of the example embodiments illustrated in the accompanying drawings. However, some embodiments may be practiced without many of the specific details, and the scope of the claims is only limited by those features and aspects specifically recited in the claims. Furthermore, well-known methods, components, and circuits have not been described in exhaustive detail so as not to unnecessarily obscure pertinent aspects of the embodiments described herein.
In some embodiments, storage medium 132 includes a plurality of non-volatile memory (NVM) devices 140-1 through 140-n. In some embodiments, storage medium 132 is NAND-type flash memory or NOR-type flash memory (e.g., NVM devices 140-1 through 140-n are NAND-type flash memory or NOR-type flash memory). In some embodiments, storage medium 132 includes one or more three-dimensional (3D) memory devices. Further, in some embodiments, storage controller 124 is a solid-state drive (SSD) controller. However, other types of storage media (e.g., other types of NVM devices) may be included in accordance with aspects of a wide variety of embodiments (e.g., PCRAM, ReRAM, STT-RAM, etc.). In some embodiments, a NVM device includes one or more flash memory dies, one or more flash memory packages, one or more flash memory channels or the like. For example, each NVM device 140 (or a respective NVM device) includes a plurality of memory dies, and each of the NVM devices 140-1 through 140-n is a distinct NVM die and/or distinct NVM package. In some embodiments, data storage system 100 can contain one or more storage devices 120.
Computer system 110 is coupled to storage controller 124 through data connections 101. However, in some embodiments computer system 110 includes storage controller 124, or a portion of storage controller 124, as a component and/or as a subsystem. For example, in some embodiments, some or all of the functionality of storage controller 124 is implemented by software executed on computer system 110. Computer system 110 may be any suitable computer device, such as a computer, a laptop computer, a tablet device, a netbook, an internet kiosk, a personal digital assistant, a mobile phone, a smart phone, a gaming device, a computer server, or any other computing device. Computer system 110 is sometimes called a host, host system, client, or client system. In some embodiments, computer system 110 is a server system, such as a server system in a data center. In some embodiments, computer system 110 includes one or more processors, one or more types of memory, a display and/or other user interface components such as a keyboard, a touch-screen display, a mouse, a track-pad, a digital camera, and/or any number of supplemental I/O devices to add functionality to computer system 110. In some embodiments, computer system 110 does not have a display and other user interface components.
Storage medium 132 is coupled to storage controller 124 through connections 103. Connections 103 are sometimes called data connections, but typically convey commands in addition to data, and optionally convey metadata, error correction information and/or other information in addition to data values to be stored in storage medium 132 and data values read from storage medium 132. In some embodiments, however, storage controller 124 and storage medium 132 are included in the same device (e.g., an integrated device) as components thereof. Furthermore, in some embodiments, storage controller 124 and storage medium 132 are embedded in a host device (e.g., computer system 110), such as a mobile device, tablet, other computer or computer controlled device, and the methods described herein are performed, at least in part, by the embedded storage controller. Storage medium 132 may include any number (e.g., one or more) of memory devices including, without limitation, non-volatile semiconductor memory devices, such as flash memory device(s). For example, flash memory device(s) can be configured for enterprise storage suitable for applications such as cloud computing, for database applications, primary and/or secondary storage, or for caching data stored (or to be stored) in secondary storage, such as hard disk drives. Additionally and/or alternatively, flash memory device(s) can also be configured for relatively smaller-scale applications such as personal flash drives or hard-disk replacements for personal, laptop, and tablet computers.
Storage medium 132 is divided into a number of addressable and individually selectable blocks, such as selectable portion 133. In some embodiments, the individually selectable blocks are the minimum size erasable units in a flash memory device. In other words, each block contains the minimum number of memory cells that can be erased without erasing any other memory cells in the same flash memory device. Typically, when a flash memory block is erased, all memory cells in the block are erased simultaneously. Each block is usually further divided into a plurality of pages and/or word lines, where each page or word line is typically an instance of the smallest individually accessible (readable) portion in a block. In some embodiments (e.g., using some types of flash memory), the smallest individually accessible unit of a data set, however, is a sector, which is a subunit of a page. That is, a block includes a plurality of pages, each page contains a plurality of sectors, and each sector is the minimum unit of data for reading data from the flash memory device. For example, in some implementations, each block includes a number of pages, such as 64 pages, 128 pages, 256 pages or another suitable number of pages. Blocks are typically grouped into a plurality of zones. Each block zone can be independently managed to some extent, which increases the degree of parallelism for parallel operations and simplifies management of storage medium 132.
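The block, page, and sector hierarchy described above can be illustrated with a short Python sketch that splits a flat sector index into its coordinates. The function name and the figure of 8 sectors per page are assumptions chosen for the example; 128 pages per block is one of the page counts mentioned above.

```python
def locate_sector(sector_index, pages_per_block=128, sectors_per_page=8):
    """Split a flat sector index into (block, page, sector) coordinates."""
    sectors_per_block = pages_per_block * sectors_per_page
    block, within_block = divmod(sector_index, sectors_per_block)
    page, sector = divmod(within_block, sectors_per_page)
    return block, page, sector
```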
Additionally, if data is written to a storage medium in pages, but the storage medium is erased in blocks, pages in the storage medium may contain invalid (e.g., stale) data, but those pages cannot be overwritten until the whole block containing those pages is erased. In order to write to the pages with invalid data, the pages (if any) with valid data in that block are read and re-written to a new block and the old block is erased (or put on a queue for erasing). This process is called garbage collection. After garbage collection, the new block contains the pages with valid data and may have free pages that are available for new data to be written, and the old block can be erased so as to be available for new data to be written.
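The garbage collection process described above can be sketched in Python using a simplified in-memory model. The model is hypothetical: each block is a list of pages, and each page is either `None` (free) or a `(data, valid)` tuple. An actual storage controller would instead issue garbage collection reads and writes to the memory devices.

```python
def garbage_collect(old_block, new_block):
    """Copy valid pages from old_block into free pages of new_block, then erase old_block.

    Returns the number of pages copied, i.e., the number of garbage
    collection read/write pairs that were needed.
    """
    valid_pages = [page for page in old_block if page is not None and page[1]]
    free_slots = [i for i, page in enumerate(new_block) if page is None]
    if len(valid_pages) > len(free_slots):
        raise ValueError("new block has too few free pages")
    for slot, (data, _) in zip(free_slots, valid_pages):
        new_block[slot] = (data, True)  # one garbage collection read plus one write
    for i in range(len(old_block)):
        old_block[i] = None             # erasure applies to the whole block at once
    return len(valid_pages)
```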
A phenomenon related to garbage collection is write amplification. Write amplification is a phenomenon where the actual amount of physical data written to a storage medium (e.g., NVM devices 140 in storage device 120) is a multiple of the logical amount of data written by a host (e.g., computer system 110) to the storage medium. As discussed above, when a block of storage medium must be erased before it can be re-written, the garbage collection process to perform these operations results in re-writing data one or more times. This multiplying effect increases the number of writes required over the life of a storage medium, which shortens the time it can reliably operate. The write amplification of a storage system is given by the following equation:

$$\text{write amplification} = \frac{\text{amount of data written to a storage medium}}{\text{amount of data written by a host}}$$
Write operations that are performed in response to commands from the host are referred to as host writes, while write operations performed during garbage collection are referred to as garbage collection writes. For a given write amplification, the ratio of the number of garbage collection writes to the number of host writes equals the write amplification minus 1. Furthermore, a garbage collection read is performed for each garbage collection write: data to be written to a new block during garbage collection must be read from an old block. RAM 150 is used to store data read from NVM devices 140 during garbage collection reads until the data is re-written to the NVM devices 140 during garbage collection writes.
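As a rough sketch of how this ratio could drive the selection of operations for one scheduling round, the following Python function picks counts of host writes, garbage collection writes, and garbage collection reads. The function name, the policy of satisfying the ratio exactly, and the use of `limit_denominator(100)` to approximate the write amplification as a simple fraction are assumptions made for illustration.

```python
from fractions import Fraction


def select_operation_mix(write_amplification, num_devices):
    """Return (host_writes, gc_writes, gc_reads) for one scheduling round.

    Chooses the largest counts such that gc_writes / host_writes equals
    write_amplification - 1, gc_reads equals gc_writes, and the total
    number of operations does not exceed num_devices.
    """
    ratio = Fraction(write_amplification).limit_denominator(100) - 1
    if ratio < 0:
        raise ValueError("write amplification must be at least 1")
    # Smallest integer pair with the required ratio: g0 / h0 == ratio.
    h0, g0 = ratio.denominator, ratio.numerator
    per_unit = h0 + 2 * g0  # operations in one (h0 host, g0 GC write, g0 GC read) unit
    units = num_devices // per_unit
    return units * h0, units * g0, units * g0
```

For example, with a write amplification of 2.5 and 16 devices, the sketch selects 4 host writes, 6 garbage collection writes, and 6 garbage collection reads. If there are too few devices to hold even one unit, it returns zeros, and a practical scheduler would likely fall back to an approximate ratio instead.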
In some implementations, for example some embodiments of data storage system 100 shown in
Host interface 129 provides an interface to computer system 110 through data connections 101. Similarly, storage medium interface 128 provides an interface to storage medium 132 through connections 103. In some embodiments, storage medium interface 128 includes read and write circuitry, including circuitry capable of providing reading signals to storage medium 132 (e.g., reading threshold voltages for NAND-type flash memory, as discussed below). In some embodiments, connections 101 and connections 103 are implemented as a communication medium over which commands and data are communicated, using a protocol such as DDR3, SCSI, SATA, SAS, or the like. In some embodiments, storage controller 124 includes one or more processing units (also sometimes called CPUs, processors, microprocessors, or microcontrollers) configured to execute instructions in one or more programs (e.g., in storage controller 124). In some embodiments, the one or more processors are shared by one or more components within, and in some cases, beyond the function of storage controller 124.
In some embodiments, management module 121-1 includes one or more central processing units (CPUs, also sometimes called processors, microprocessors, or microcontrollers) 122-1 configured to execute instructions in one or more programs (e.g., in management module 121-1). In some embodiments, the one or more CPUs 122-1 are shared by one or more components within, and in some cases, beyond the function of storage controller 124. Management module 121-1 is coupled to host interface 129, additional module(s) 125 and storage medium interface 128 in order to coordinate the operation of these components. In some embodiments, one or more modules of management module 121-1 are implemented in management module 121-2 of computer system 110. In some embodiments, one or more processors of computer system 110 (not shown) are configured to execute instructions in one or more programs (e.g., in management module 121-2). Management module 121-2 is coupled to storage device 120 in order to manage the operation of storage device 120.
Additional module(s) 125 are coupled to storage medium interface 128, host interface 129, and management module 121-1. As an example, additional module(s) 125 may include an error control module to limit the number of uncorrectable errors inadvertently introduced into data during writes to memory and/or reads from memory. In some embodiments, additional module(s) 125 are executed in software by the one or more CPUs 122-1 of management module 121-1, and, in other embodiments, additional module(s) 125 are implemented in whole or in part using special purpose circuitry (e.g., to perform encoding and decoding functions). In some embodiments, additional module(s) 125 are implemented in whole or in part by software executed on computer system 110.
As data storage densities of non-volatile semiconductor memory devices continue to increase, stored data is more prone to being stored and/or read erroneously. In some embodiments, error control coding can be utilized to limit the number of uncorrectable errors that are introduced by electrical fluctuations, defects in the storage medium, operating conditions, device history, write-read circuitry, etc., or a combination of these and various other factors.
In some embodiments, an error control module, included in additional module(s) 125, includes an encoder and a decoder. In some embodiments, the encoder encodes data by applying an error control code (ECC) to produce a codeword, which is subsequently stored in storage medium 132. When encoded data (e.g., one or more codewords) is read from storage medium 132, the decoder applies a decoding process to the encoded data to recover the data, and to correct errors in the recovered data within the error correcting capability of the error control code. Those skilled in the art will appreciate that various error control codes have different error detection and correction capacities, and that particular codes are selected for various applications for reasons beyond the scope of this disclosure. As such, an exhaustive review of the various types of error control codes is not provided herein. Moreover, those skilled in the art will appreciate that each type or family of error control codes may have encoding and decoding algorithms that are particular to the type or family of error control codes. On the other hand, some algorithms may be utilized at least to some extent in the decoding of a number of different types or families of error control codes. As such, for the sake of brevity, an exhaustive description of the various types of encoding and decoding algorithms generally available and known to those skilled in the art is not provided herein.
In some embodiments, during a host write operation, host interface 129 receives data to be stored in storage medium 132 from computer system 110. The data received by host interface 129 is made available to an encoder (e.g., in additional module(s) 125), which encodes the data to produce one or more codewords. The one or more codewords are made available to storage medium interface 128, which transfers the one or more codewords to storage medium 132 in a manner dependent on the type of storage medium being utilized.
In some embodiments, a host read operation is initiated when computer system (host) 110 sends one or more host read commands (e.g., via data connections 101, or alternatively a separate control line or bus) to storage controller 124 requesting data from storage medium 132. Storage controller 124 sends one or more read access commands to storage medium 132, via storage medium interface 128, to obtain raw read data in accordance with memory locations (or logical addresses, object identifiers or the like) specified by the one or more host read commands. Storage medium interface 128 provides the raw read data (e.g., comprising one or more codewords) to a decoder (e.g., in additional module(s) 125). If the decoding is successful, the decoded data is provided to host interface 129, where the decoded data is made available to computer system 110. In some embodiments, if the decoding is not successful, storage controller 124 may resort to a number of remedial actions or provide an indication of an irresolvable error condition. In some embodiments, host read operations are performed as corresponding commands are received from the host, and thus are not included in the scheduling techniques (e.g., based on the Stable Marriage Problem or Assignment Problem) disclosed herein.
As explained above, a storage medium (e.g., NVM devices 140) is divided into a number of addressable and individually selectable blocks and each block is optionally (but typically) further divided into a plurality of pages and/or word lines and/or sectors.
While erasure of a storage medium is performed on a block basis, in many embodiments, reading and programming of the storage medium is performed on a smaller subunit of a block (e.g., on a page basis, word line basis, or sector basis). In some embodiments, the smaller subunit of a block consists of multiple memory cells (e.g., single-level cells or multi-level cells). In some embodiments, programming is performed on an entire page. In some embodiments, a multi-level cell (MLC) NAND flash typically has four possible states per cell, yielding two bits of information per cell. Further, in some embodiments, a MLC NAND has two page types: (1) a lower page (sometimes called fast page), and (2) an upper page (sometimes called slow page). In some embodiments, a triple-level cell (TLC) NAND flash has eight possible states per cell, yielding three bits of information per cell. Although the description herein uses TLC, MLC, and SLC as examples, those skilled in the art will appreciate that the embodiments described herein may be extended to memory cells that have more than eight possible states per cell, yielding more than three bits of information per cell. In some embodiments, the encoding format of the storage media (e.g., TLC, MLC, or SLC and/or a chosen data redundancy mechanism) is a choice made (or implemented) when data is actually written to the storage media.
Flash memory devices (in some embodiments, storage medium 132) utilize memory cells (e.g., SLC, MLC, and/or TLC) to store data as electrical values, such as electrical charges or voltages. Each flash memory cell typically includes a single transistor with a floating gate that is used to store a charge, which modifies the threshold voltage of the transistor (e.g., the voltage needed to turn the transistor on). The magnitude of the charge, and the corresponding threshold voltage the charge creates, is used to represent one or more data values. In some embodiments, during a read operation, a reading threshold voltage is applied to the control gate of the transistor and the resulting sensed current or voltage is mapped to a data value.
In some embodiments, RAM 150 is a double-data-rate (DDR) RAM. In some embodiments, RAM 150 is coupled to storage controller 124 (e.g., through a bus 105 that connects to management module 121-1 and/or storage medium interface 128). Alternatively, RAM 150 is part of storage controller 124, or is part of storage medium 132.
Attention is now directed to
As a non-limiting example, data storage system 100 includes storage device 120, which includes an FE controller 130 and a plurality of BE controllers 134-1 through 134-m. Connections 133 couple FE controller 130 with the BE controllers 134-1 through 134-m, each of which is coupled with and controls a respective plurality of NVM devices, e.g., NVM devices 140-1 through 140-k. The FE controller 130 includes a management module 121-1. Similarly, each of the BE controllers 134-1 through 134-m includes a management module 121-3. Together, management modules 121-1 and 121-3 manage the operation of storage device 120.
In this non-limiting example, data storage system 100 is used in conjunction with computer system 110. In some implementations, storage device 120 includes a plurality of NVM devices. In some implementations, NVM devices 140 include NAND-type flash memory or NOR-type flash memory. Further, in some implementations, each FE controller 130 and BE controller 134 is or includes a solid-state drive (SSD) controller. However, one or more other types of storage media may be included in accordance with aspects of a wide variety of implementations.
In some embodiments, the plurality of BE controllers 134 are coupled with FE controller 130 through connections 133. Connections 133 are sometimes called data connections, but typically convey commands in addition to data, and optionally convey metadata, error correction information, and/or other information in addition to data values to be stored in NVM devices 140 and data values read from NVM devices 140. In some embodiments, each BE controller 134 is coupled to a respective subset of the plurality of non-volatile memory devices. For example, as shown in
In some embodiments, FE controller 130, the plurality of BE controllers 134, and NVM devices 140 are included in the same device (e.g., an integrated device such as storage medium 132 of
In some embodiments, storage device 120 includes m memory channels, each of which has a BE controller 134 and a set of NVM devices 140 coupled to the BE controller 134, where m is an integer greater than one. In some embodiments, two or more memory channels share a BE controller 134. In either example, each memory channel has its own distinct set of NVM devices 140. In a non-limiting example, m typically is 8, 16 or 32. In another non-limiting example, the number of NVM devices 140 per memory channel is typically 8, 16, 32 or 64. Furthermore, in some embodiments, the number of NVM devices 140 is different in different memory channels.
In some embodiments, FE controller 130 and each BE controller 134 include a portion of RAM 150, for example implementing a write cache, while in other embodiments only FE controller 130 implements a write cache in RAM 150. In some embodiments, each BE controller 134 optionally includes a management module 121 (e.g., management modules 121-3 of BE controllers 134). The management modules 121-3 of BE controllers 134 also, in some embodiments, include one or more CPUs (not shown in
In some embodiments, management module 121-3 of BE controller 134-1 performs or shares some of the tasks typically performed by management module 121-1 of FE controller 130. For example, in some embodiments, management module 121-3 of BE controller 134-1 monitors the status of executing commands at the NVM devices coupled to that management module, instead of management module 121-1 performing this function (as discussed in more detail below). In some embodiments, management module 121-3 of BE controller 134-1 monitors a portion of NVM devices 140, while management module 121-1 of storage controller 124 monitors the remainder of NVM devices 140. In some embodiments, management module 121-3 of BE controller 134-1 monitors a portion of NVM devices 140 (e.g., all NVM devices associated with the BE controller of which management module 121-3 is a component), and other management modules 121 associated with other BE controllers 134 monitor the remaining NVM devices 140.
As mentioned above, front-end controller 130 receives host write commands from the computer system 110 and determines the number of garbage collection writes to be performed. In some embodiments, the front-end controller 130 performs high-level scheduling, in which host writes and garbage collection writes are scheduled, while the back-end controllers 134-1 through 134-m perform low-level scheduling, in which garbage collection reads are scheduled based on the number of garbage collection writes. In some embodiments, the front-end controller 130 assigns respective operations (e.g., host writes, garbage collection writes, and/or garbage collection reads) to respective NVM devices 140 (e.g., during a specified time period).
Memory 206 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices, and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. Memory 206 optionally includes one or more storage devices remotely located from the CPU(s) 122-1. Memory 206, or alternatively the non-volatile memory device(s) within memory 206, comprises a non-transitory computer readable storage medium.
In some embodiments, memory 206, or the non-transitory computer-readable storage medium of memory 206 stores the following programs, modules, and data structures, or a subset or superset thereof:
Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (e.g., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 206 may store a subset of the modules and data structures identified above. Furthermore, memory 206 may store additional modules and data structures not described above. In some embodiments, the programs, modules, and data structures stored in memory 206, or the non-transitory computer readable storage medium of memory 206, provide instructions for implementing some of the methods described below. In some embodiments, some or all of these modules may be implemented with specialized hardware circuits that subsume part or all of the module functionality.
Although
In some embodiments, respective memory operations are performed as part of a memory process, which is managed by a storage controller (e.g., FE controller 130,
The operations for a respective process are not necessarily performed simultaneously. For example, for host write process 302-1, the operations directed to NVM devices 140-1 through 140-3 and to 140-r have already been performed, as indicated by the check marks, while the operation directed to NVM device 140-4 has not yet been performed, as indicated by the absence of a check mark. A process is complete when all of its operations have been performed.
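As a non-limiting illustration, the per-process completion status described above (the check marks) can be tracked with a small data structure. The class `MemoryProcess` and its method names are hypothetical and are introduced only for this sketch; the device labels mirror the reference numerals used in the example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryProcess:
    """Tracks which operations of a process have been performed (hypothetical helper)."""
    name: str
    # Maps a device label to True once the operation directed to that device is done.
    done: dict = field(default_factory=dict)

    def mark_done(self, device: str) -> None:
        if device not in self.done:
            raise KeyError(f"{device} has no operation in process {self.name}")
        self.done[device] = True

    def remaining(self) -> list:
        """Devices whose operation for this process has not yet been performed."""
        return [device for device, ok in self.done.items() if not ok]

    def is_complete(self) -> bool:
        """A process is complete when all of its operations have been performed."""
        return not self.remaining()


# Host write process 302-1: every operation is done except the one for NVM device 140-4.
process = MemoryProcess(
    "302-1",
    {"140-1": True, "140-2": True, "140-3": True, "140-4": False, "140-r": True},
)
```

In this sketch, `process.remaining()` lists only NVM device 140-4, and the process becomes complete once `process.mark_done("140-4")` is called.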
In some embodiments, the status of processes is used to determine (e.g., for assignment 620,
In some embodiments, an NVM device 140 has a preference for performing a particular operation and/or a particular type of operation (e.g., host write, garbage collection write, or garbage collection read) if there are no other types of memory operations to be performed. For example, NVM device 140-1 has a preference for performing a host write for host write process 302-3, because no garbage collection writes or reads are available for it to perform.
In some embodiments, an NVM device 140 has a preference for performing a garbage collection read if there are no host writes or garbage collection writes for it to perform. The data accessed in the garbage collection read is buffered (e.g., in the RAM 150) for subsequent write back to the storage medium 132.
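As a non-limiting illustration, the type preference described in the two preceding paragraphs can be sketched as a check on the operations currently available to a device. The constant names and the function `preferred_type` are hypothetical and are introduced only for this example.

```python
HOST_WRITE = "host_write"
GC_WRITE = "gc_write"
GC_READ = "gc_read"


def preferred_type(available_ops):
    """Return the operation type a device prefers, or None if there is no forced preference.

    `available_ops` is an iterable of operation types the device can currently perform.
    When every available operation is of a single type, the device prefers that type.
    """
    types = set(available_ops)
    if len(types) == 1:
        return next(iter(types))
    return None


# Only host writes are available, so the device prefers host writes.
only_host_writes = preferred_type([HOST_WRITE, HOST_WRITE])
# No host writes or garbage collection writes remain, so the device prefers a GC read.
only_gc_reads = preferred_type([GC_READ])
```

When more than one type is available, this sketch returns `None` and leaves the choice to whatever ranking the storage system applies.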
In some embodiments, the plurality of NVM devices 140 (e.g., the set of NVM devices in a RAID stripe) include a first memory device and remaining memory devices. For each process, a page in the first memory NVM device 140 stores parity information, while pages in the remaining NVM devices 140 of the plurality of NVM devices store data corresponding to the parity information. For example, in some embodiments, storage device 120 uses deterministic parity: a single NVM device (e.g., NVM device 140-k) always stores parity information, while the other NVM devices store data. In some other embodiments, the storage device 120 uses last-man-out parity: the parity information is written to the last NVM device to be written in a RAID stripe, such that the NVM device storing parity information varies from process to process.
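As a non-limiting illustration, the two parity placements described above can be sketched as follows. The sketch assumes bytewise XOR parity, which this description does not specify, and the function names `xor_parity` and `parity_device` are hypothetical.

```python
from functools import reduce


def xor_parity(pages):
    """Return the bytewise XOR of equal-length pages (assumed parity scheme)."""
    if not pages:
        raise ValueError("at least one page is required")
    length = len(pages[0])
    if any(len(page) != length for page in pages):
        raise ValueError("all pages must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def parity_device(num_devices, write_order=None):
    """Return the index of the device that stores parity for one process.

    Deterministic parity: a fixed device (here, the last index) always stores parity.
    Last-man-out parity: the device written last in the stripe stores parity.
    """
    if write_order is None:
        return num_devices - 1
    return write_order[-1]


data_pages = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity_page = xor_parity(data_pages)
# With XOR parity, a lost data page is the XOR of the parity page and the other data pages.
recovered = xor_parity([parity_page, data_pages[1], data_pages[2]])
```

With last-man-out parity, the value returned by `parity_device` depends on the order in which the devices happen to be written, so it can differ from process to process.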
In some embodiments, memory operations 400 include host writes, garbage collection writes, and garbage collection reads. It is noted that in these embodiments, memory operations 400 do not include host reads, since host reads are directed to specific addresses, which correspond to specific die, and therefore host reads cannot be reassigned to die other than the specific die(s) that contain the data requested by the host reads. Operations 400 are partitioned into distinct groups, and each group is of a type of operation. For example,
After identifying the memory operations 400 and the non-volatile devices (e.g., the dies 402 in a RAID stripe), for each non-volatile memory device (e.g., die 402), the storage system assigns preference values to each of the memory operations. In some embodiments, ranking of the memory operations 400 in order of preference is performed for each non-volatile memory device. For example, in
Having assigned the group preferences for each die, the storage system ranks, for each die, the operations within each group in the order of preference, as shown in
Using the preference values assigned to each of the memory operations for each non-volatile memory device, the storage system then assigns each memory operation to a distinct non-volatile memory device, as shown in
Although
As a non-limiting example, after identifying a plurality of memory operations 500 to be performed by a plurality of non-volatile memory devices (e.g., the dies 502 in a RAID stripe), and assigning preference values for each die, the storage system assigns a weight to each operation, as shown in
In accordance with the weights assigned to each memory operation for each non-volatile memory device, the storage system assigns each memory operation to a distinct non-volatile memory device, as shown in
Although
Additional details concerning each of the processing steps discussed above for scheduling memory operations, including mapping into the Stable Marriage Problem and the Assignment Problem and solving accordingly, are presented below with reference to
In some embodiments, some of the operations of method 600 are performed at a host (e.g., computer system 110) and other operations of method 600 are performed at a storage device (e.g., storage device 120). In some embodiments, method 600 is governed, at least in part, by instructions that are stored in a non-transitory computer-readable storage medium and that are executed by one or more processors of a host (not shown in
For ease of explanation, the following describes the method 600 as performed by a storage device (e.g., by storage controller 124 of storage device 120,
In (602) a storage system having a storage controller (e.g., storage controller 124,
As described above with respect to
Having identified the memory operations and the non-volatile devices, for each non-volatile memory device, the storage system assigns (620) preference values to each of the memory operations. In some embodiments, for a respective non-volatile memory device, the storage system first determines (622) that all memory operations that the respective non-volatile memory device currently can perform are of a first type selected from the group consisting of host writes, garbage collection writes, and garbage collection reads. Based on the determination, the storage device assigns preference values to the memory operations that indicate a preference of the respective non-volatile memory device for the first type of memory operation over other types of memory operations. For example, in
In some embodiments, the preference values are assigned to not only the memory operations, but also the non-volatile memory devices. In some embodiments, for each memory operation of the plurality of memory operations, the storage system assigns (626) preference values to each of the non-volatile memory devices so that assigning (628) each memory operation to a distinct non-volatile memory device is performed using both the preference values assigned to each of the non-volatile memory devices for each memory operation and the preference values assigned to each of the memory operations for each non-volatile memory device. Furthermore, as described below (e.g., see description of operations 648 and 658), in some embodiments, preference values are assigned to non-volatile memory devices, but not to memory operations.
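As a non-limiting illustration, an assignment that uses both sets of preference values can be computed with the Gale-Shapley procedure for the Stable Marriage Problem. The following is a minimal sketch, assuming that each memory operation ranks every device, that each device ranks every operation, and that there are no more operations than devices; the function name `stable_assignment` and the operation and die labels are hypothetical.

```python
def stable_assignment(op_prefs, dev_prefs):
    """Assign each operation to a distinct device using Gale-Shapley (operations propose).

    op_prefs:  dict mapping operation -> list of devices, most preferred first.
    dev_prefs: dict mapping device -> list of operations, most preferred first.
    Returns a dict mapping operation -> device.
    """
    # rank[device][operation] is the position of the operation in the device's list.
    rank = {dev: {op: i for i, op in enumerate(prefs)} for dev, prefs in dev_prefs.items()}
    next_choice = {op: 0 for op in op_prefs}  # index of the next device to propose to
    held = {}                                 # device -> operation tentatively accepted
    free = list(op_prefs)
    while free:
        op = free.pop()
        dev = op_prefs[op][next_choice[op]]
        next_choice[op] += 1
        current = held.get(dev)
        if current is None:
            held[dev] = op                    # device was unassigned: accept
        elif rank[dev][op] < rank[dev][current]:
            held[dev] = op                    # device prefers the new operation
            free.append(current)
        else:
            free.append(op)                   # proposal rejected: try the next device
    return {op: dev for dev, op in held.items()}


op_prefs = {
    "host_write": ["die0", "die1", "die2"],
    "gc_write": ["die0", "die2", "die1"],
    "gc_read": ["die1", "die0", "die2"],
}
dev_prefs = {
    "die0": ["gc_write", "host_write", "gc_read"],
    "die1": ["host_write", "gc_read", "gc_write"],
    "die2": ["gc_read", "gc_write", "host_write"],
}
assignment = stable_assignment(op_prefs, dev_prefs)
```

In the resulting assignment, no operation and device both prefer each other over their assigned partners, which is the stability property of the Stable Marriage Problem.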
In some embodiments, the storage controller (e.g., storage controller 124,
In some embodiments, the storage system (e.g., storage controller 124,
In some embodiments, the memory operations of a respective process comprise memory operations directed to respective pages in each of the non-volatile memory devices. The plurality of non-volatile memory devices comprises (634) a first memory device and remaining memory devices, such that the respective pages of the remaining memory devices store data and the respective page of the first memory device stores parity information corresponding to the data stored in the respective pages of the remaining memory devices. In some embodiments, assigning preference values comprises (636), for a respective non-volatile memory device, first determining that a first memory operation of the plurality of memory operations is associated with a process for which no more than a specified number of memory operations are incomplete, and in response to the determining, assigning preference values to the memory operations that indicate a preference of the respective non-volatile memory device for the first memory operation over other memory operations of the plurality of memory operations. For example, in
In some embodiments, the determining in operation 636 comprises (638) determining that the first memory operation is the only remaining incomplete memory operation for its process. For example, in
In some embodiments, the storage controller (e.g., storage controller 124,
In some embodiments, the storage system assigns operations to non-volatile memory devices through solution of the Stable Marriage Problem, as shown in
In some embodiments, as shown in
In some embodiments, preference values are assigned to each non-volatile device as shown in
In some embodiments, the storage system assigns operations to non-volatile memory devices through solution of the Assignment Problem, as shown in
In some embodiments, as shown in
As explained above, in some embodiments, preference values are assigned to each non-volatile device as shown in
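As a non-limiting illustration, the Assignment Problem formulation discussed above can be sketched as a search for the one-to-one assignment with the best total weight. The sketch below treats each weight as a cost to be minimized, which is an assumption, and uses exhaustive search for clarity; the function name `min_cost_assignment` and the example weights are hypothetical. For larger instances, a polynomial-time method such as the Hungarian algorithm would typically be used instead.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Return (assignment, total) minimizing the summed weights.

    cost[i][j] is the weight of performing operation i on device j (lower is better,
    an assumption of this sketch). assignment[i] is the device index chosen for
    operation i. The number of operations must not exceed the number of devices.
    """
    n_ops = len(cost)
    n_devs = len(cost[0])
    if n_ops > n_devs:
        raise ValueError("more operations than devices")
    best = None
    # Enumerate every way of giving each operation a distinct device.
    for devices in permutations(range(n_devs), n_ops):
        total = sum(cost[i][d] for i, d in enumerate(devices))
        if best is None or total < best[1]:
            best = (devices, total)
    return best


weights = [
    [4, 1, 3],  # operation 0 on devices 0, 1, 2
    [2, 0, 5],  # operation 1
    [3, 2, 2],  # operation 2
]
assignment, total = min_cost_assignment(weights)
```

For these example weights, the search selects device 1 for operation 0, device 0 for operation 1, and device 2 for operation 2, for a total weight of 5.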
It will be understood that, although the terms “first,” “second,” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first transistor could be termed a second transistor, and, similarly, a second transistor could be termed a first transistor, without changing the meaning of the description, so long as all occurrences of the “first transistor” are renamed consistently and all occurrences of the “second transistor” are renamed consistently. The first transistor and the second transistor are both transistors, but they are not the same transistor.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the claims. As used in the description of the embodiments and the appended claims, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain principles of operation and practical applications, to thereby enable others skilled in the art.
This application claims priority to U.S. Provisional Patent Application No. 62/190,183, filed Jul. 8, 2015, which is incorporated by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20170010815 A1 | Jan 2017 | US |
Number | Date | Country | |
---|---|---|---|
62190183 | Jul 2015 | US |