The present disclosure relates to systems and methods for suppressing the latency in a non-volatile solid state device (“SSD”) and specifically for suppressing the worst input-output (IO) latency of NAND flash-based SSDs.
During operation of a non-volatile solid state device (“SSD”), garbage collection (“GC”) can be performed to generate and maintain free memory blocks in the SSD. The free blocks contain free pages that are available for writing new data. The free blocks can be reclaimed from memory blocks that can contain both valid and invalid data. During the garbage collection operation, a block is first identified for reclaiming, e.g., a “victim” block. Any valid pages residing in the victim block are copied to another memory block, and the entire victim block is erased. The garbage collection operation uses read and write operations, in addition to the erase operation. These operations can compete with host user read and write operations for access to the same memory blocks.
In a NAND flash-based SSD, a read/write operation occurs on a particular page of a block, while an erase operation occurs on the entire block. Pages of a particular block can be written at any time if they are free, e.g., have been previously erased. For example, new pages can be written to a block if the block currently has free pages. Data in some pages of the block can become invalid, e.g., stale, and can be replaced with valid data. Replacement pages with valid data cannot overwrite pages with invalid data; rather, they can be written to other free pages in the block. As discussed above, during garbage collection, pages with valid data are read from a first block (the victim block) and are written to a new block. Then the entire victim block can be erased. Garbage collection can result in inconsistent performance of a NAND flash-based SSD, e.g., the worst IO latency happens when garbage collection occurs in an SSD, and the host needs to wait for the garbage collection to complete before it can read or write data to the SSD.
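The page-write and block-erase behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the disclosed implementation: the `Block` class, the four-page block size, and the `garbage_collect` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PAGES_PER_BLOCK = 4  # tiny block size, chosen for illustration only


@dataclass
class Block:
    # Each slot holds page data, or None when the page is free (erased).
    pages: List[Optional[str]] = field(
        default_factory=lambda: [None] * PAGES_PER_BLOCK)
    # valid[i] is True only while pages[i] holds current (non-stale) data.
    valid: List[bool] = field(
        default_factory=lambda: [False] * PAGES_PER_BLOCK)

    def write(self, data: str) -> int:
        """Write into the first free page; pages are never overwritten in place."""
        for i, page in enumerate(self.pages):
            if page is None:
                self.pages[i] = data
                self.valid[i] = True
                return i
        raise RuntimeError("no free page in block")

    def invalidate(self, index: int) -> None:
        """Mark a page stale; its space is reclaimed only by a block erase."""
        self.valid[index] = False

    def erase(self) -> None:
        """Erase acts on the whole block, never on a single page."""
        self.pages = [None] * PAGES_PER_BLOCK
        self.valid = [False] * PAGES_PER_BLOCK


def garbage_collect(victim: Block, destination: Block) -> int:
    """Copy valid pages out of the victim, then erase it. Returns pages copied."""
    copied = 0
    for data, is_valid in zip(victim.pages, victim.valid):
        if is_valid:
            destination.write(data)
            copied += 1
    victim.erase()
    return copied


victim, destination = Block(), Block()
for name in ("a", "b", "c", "d"):
    victim.write(name)
victim.invalidate(1)  # "b" became stale
victim.invalidate(3)  # "d" became stale
copied = garbage_collect(victim, destination)
print(copied, destination.pages, victim.pages)
```

In this model the host would have to wait for every copy and for the erase before touching the victim block, which is the source of the worst-case latency discussed in this disclosure.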
Accordingly, efficient ways to perform garbage collection are desirable that can suppress the worst IO latency in a NAND flash-based SSD.
The present disclosure relates to methods and systems for suppressing the worst input-output (IO) latency in NAND flash-based SSDs.
One embodiment can include a method for reducing latency in a non-volatile memory that includes a flash memory and a storage class memory. The method can include the steps of providing a first region and a second region in the storage class memory, receiving by the non-volatile memory a first command from a host in communication with the non-volatile memory, and determining whether the first command requires access to a block in the flash memory. When the first command does not require access to the block in the flash memory, the method can further determine whether the first command requires access to the second region in the storage class memory. When the first command requires access to the second region in the storage class memory, the method can move data accessed by the first command from the second region in the storage class memory to the first region in the storage class memory.
According to embodiments of the present invention, when the first command does not require access to the second region in the storage class memory, a method for reducing latency in a non-volatile memory can further access the first region in the storage class memory to execute the first command and evict data from the second region in the storage class memory to the flash memory.
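The handling of a command that is served from the storage class memory, as described in the two preceding paragraphs, might be sketched as follows. The dictionaries standing in for the first region, the second region, and the flash memory, the `handle_scm_command` name, and the choice to evict exactly one page per command are all assumptions made for illustration; the text above does not specify the amount of data evicted.

```python
from typing import Dict


def handle_scm_command(lpa: int, r1: Dict[int, bytes], r2: Dict[int, bytes],
                       flash: Dict[int, bytes]) -> bytes:
    """Serve a command whose data lives in the SCM rather than in flash.

    `lpa` is a hypothetical logical page address used as the lookup key.
    """
    if lpa in r2:
        # The command touches the second region: move the data to the first region.
        r1[lpa] = r2.pop(lpa)
        return r1[lpa]
    # The command touches only the first region: serve it from there
    # (a KeyError here would mean the data is not in the SCM at all).
    data = r1[lpa]
    if r2:
        # Use the opportunity to evict one page (assumed amount) from the
        # second region to flash. Dicts keep insertion order, so this picks
        # the entry that has been in the second region the longest.
        evicted_lpa = next(iter(r2))
        flash[evicted_lpa] = r2.pop(evicted_lpa)
    return data


r1 = {1: b"x"}
r2 = {2: b"y", 3: b"z"}
flash: Dict[int, bytes] = {}
first = handle_scm_command(2, r1, r2, flash)   # promoted from R2 to R1
second = handle_scm_command(1, r1, r2, flash)  # served from R1, one R2 page evicted
```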
According to embodiments of the present invention, when the first command requires access to the block in the flash memory, a method for reducing latency in a non-volatile memory can determine whether a garbage collection operation is executed in the flash memory and, when the garbage collection operation is not executed, determine whether the flash memory is in an idle state and evict data from the second region in the storage class memory to the flash memory, when the flash memory is in the idle state.
According to embodiments of the present invention, when the flash memory is not in the idle state, a method for reducing latency in a non-volatile memory can further determine whether the first command is at least one of a read command and a write command, and, when the first command is a write command, the method can evict a data page from the second region in the storage class memory to the flash memory, and execute the write command. When the first command is a read command, the method can execute the read command and evict two data pages from the second region in the storage class memory to the flash memory.
According to embodiments of the present invention, when the garbage collection operation is executed, a method for reducing latency in a non-volatile memory can copy valid pages from a first flash memory chip to the second region of the storage class memory and erase a memory block in the first flash memory chip. The method can further determine that at least one of the garbage collection operation is not executed on the first flash memory chip and the second region of the storage class memory has data from the first flash memory chip and evict data from the first flash memory chip to the second region in the storage class memory.
According to embodiments of the present invention, a method for reducing latency in a non-volatile memory can evict data from the second region of the storage class memory to the flash memory based on an eviction policy based on at least one of the number of pages in a block in the second region and the age of the pages in the block in the second region.
Another embodiment can include a system for reducing latency in a non-volatile memory that comprises a flash memory, a storage class memory, comprising a first region and a second region, and a memory controller in communication with a host, the flash memory, and the storage class memory. The memory controller can be configured to receive a first command from the host and determine whether the first command requires access to a block in the flash memory. When the first command does not require access to the block in the flash memory, the memory controller can further determine whether the first command requires access to the second region in the storage class memory. When the first command requires access to the second region in the storage class memory, the memory controller can further move data accessed by the first command from the second region in the storage class memory to the first region in the storage class memory.
According to embodiments of the present invention, when the first command does not require access to the second region in the storage class memory, the memory controller can further access the first region in the storage class memory to execute the first command and evict data from the second region in the storage class memory to the flash memory.
According to embodiments of the present invention, when the first command requires access to the block in the flash memory, the memory controller can determine whether a garbage collection operation is executed in the flash memory and, when the garbage collection operation is not executed, determine whether the flash memory is in an idle state and evict data from the second region in the storage class memory to the flash memory, when the flash memory is in the idle state.
According to embodiments of the present invention, the memory controller can further determine whether the first command is at least one of a read command and a write command, when the flash memory is not in the idle state. When the first command is a write command, the memory controller can evict a data page from the second region in the storage class memory to the flash memory and execute the write command.
According to embodiments of the present invention, when the first command is a read command, the memory controller can execute the read command and evict two data pages from the second region in the storage class memory to the flash memory.
According to embodiments of the present invention, when the garbage collection operation is executed, the memory controller can copy valid pages from a first flash memory chip to the second region of the storage class memory and erase a memory block in the first flash memory chip.
According to embodiments of the present invention, the memory controller can determine that at least one of the garbage collection operation is not executed on the first flash memory chip and the second region of the storage class memory has data from the first flash memory chip and evict data from the first flash memory chip to the second region in the storage class memory.
According to embodiments of the present invention, the memory controller can evict data from the second region of the storage class memory to the flash memory based on an eviction policy based on at least one of the number of pages in a block in the second region and the age of the pages in the block in the second region.
Various objects, features, and advantages of the present disclosure can be more fully appreciated with reference to the following detailed description when considered in connection with the following drawings, in which like reference numerals identify like elements. The following drawings are for the purpose of illustration only and are not intended to be limiting of the invention, the scope of which is set forth in the claims that follow.
Systems and methods for suppressing the worst input-output (IO) latency of NAND flash-based SSDs are provided. The worst IO latency in a NAND flash-based SSD occurs when the host controller has issued a command, e.g., a read or a write command, for a particular page in a block that undergoes garbage collection. When this happens, the host controller waits for the garbage collection operation to complete, before the command can operate on the particular page.
$$T_{CMD\_WAIT} = T_{GC} \cong T_E + \gamma\,(T_{RD} + T_{WR}),$$
where TE is the block erase latency, TRD is the page read latency, TWR is the page write latency, and “γ” is the number of valid pages in the block that need to be read from the block and written to another block before the block is erased. Typically in NAND flash memories, a write operation takes longer than a read operation. Assuming that the NAND flash has a read latency TRD of 50 μs per 16 KB page, a write latency TWR of 1 ms per 16 KB page, an erase latency TE of 3 ms per block, and a valid page ratio of 10%, e.g., γ=25 pages in a block of 256 pages, the latency to reclaim a block is TCMD_WAIT=3 ms+25*(50 μs+1 ms)=29.25 ms. The garbage collection can be an atomic operation, where all the valid pages from the victim block are migrated to another block, since the objective of copying the pages is to reclaim the block by erasing it. The garbage collection operation can also be pre-emptive in case of a read operation. In this case, the control algorithm can be more complicated to address different scenarios. For example, if one of the pages in the block being reclaimed receives many read requests followed by a write request, then the write request can wait for all the read requests to complete.
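As a quick check of the arithmetic above, the following Python sketch splits the reclaim latency into its erase, read, and write contributions. The function name `gc_wait_breakdown_us` is hypothetical, and integer microseconds are used only to avoid floating-point rounding.

```python
def gc_wait_breakdown_us(valid_pages, erase_us, read_us, write_us):
    """Split T_E + gamma * (T_RD + T_WR) into its contributions, in microseconds."""
    parts = {
        "erase": erase_us,
        "reads": valid_pages * read_us,
        "writes": valid_pages * write_us,
    }
    parts["total"] = sum(parts.values())
    return parts


# Example values from the text: 3 ms erase, 50 us read, 1 ms write, 25 valid pages.
breakdown = gc_wait_breakdown_us(valid_pages=25, erase_us=3000,
                                 read_us=50, write_us=1000)
for name, value_us in breakdown.items():
    print(f"{name:>6}: {value_us / 1000:.2f} ms")
```

With the example values, the page writes account for 25 ms of the 29.25 ms total, so the write step dominates the wait.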
Therefore, the latency during garbage collection increases until the garbage collection operation completes. The latency and the performance of the SSD are related. Accordingly, during garbage collection, the performance of the SSD is also degraded. This is illustrated in
Different prior art approaches attempt to reduce the garbage collection overhead. Some attempt to reduce the garbage collection overhead by increasing the over-provisioning area. However, these prior art attempts increase the NAND flash cost, without completely eliminating the worst case latency. Other prior art approaches use Dynamic Random Access Memory (“DRAM”) or Storage Class Memory (“SCM”) cache buffers to absorb hot, e.g., frequently accessed, writes, or classify data activity, e.g., classify data as hot or cold, and store data of similar activity in the same block. These approaches can decrease the garbage collection frequency; however, they cannot eliminate the worst garbage collection latency, especially when reclaiming blocks that hold a large amount of cold data. Other prior art approaches write small data to fragmented NAND flash pages by scrambling the logical block addresses (LBA). These approaches can reduce the garbage collection latency; however, they need to maintain a very large table for logical block addressing, which makes them impractical for commercial products.
The disclosed systems and methods suppress the worst case garbage collection latency, which can result in a consistent performance of an SSD. A solid-state storage device uses integrated circuit assemblies as memory to store data persistently. For example, an SSD can include one or more integrated circuits of one type of memory, for example, NAND flash memory, or can include more than one type of non-volatile memory. According to aspects of the disclosure, the proposed systems and methods use a hybrid of NAND flash memory and Storage Class Memory. Storage is often thought of as a mechanical hard disk drive (“HDD”) that offers near limitless capacity when compared to DRAM. It is also persistent, which means that information is not lost if the server crashes or loses power. The problem with hard drives is that in many cases they are unable to provide information to the application quickly enough.
Storage Class Memory (“SCM”), such as Magnetoresistive Random-Access Memory (“MRAM”), Phase-Change Memory (“PCM”), Resistive Random-Access Memory (“ReRAM”), and battery-backed DRAM, is a class of storage/memory devices that can provide an intermediate step between high-performance DRAM and cost-effective HDDs. SCMs have much higher endurance than NAND flash memories. SCM is byte-addressable, in comparison to NAND flash, which operates in page units (4 KB-16 KB). SCMs can blur the distinction between memory devices, which can be fast, expensive, and volatile, and storage devices, which can be slow, cheap, and non-volatile, and combine the benefits of both to be low-cost, fast, and non-volatile. SCMs can provide read performance similar to DRAM and write performance that is significantly faster than HDD technology.
The disclosed hybrid NAND flash memory/SCM SSD combines the benefit of an in-place update non-volatile memory as well as the benefit of a block-erasable non-volatile memory. An exemplary architecture 300 of a hybrid NAND flash memory/SCM SSD in communication with a host is illustrated in
According to aspects of the disclosure, the SCM chips 306 can be divided into two regions. This is illustrated in
According to aspects of the disclosure,
According to aspects of the disclosure, the latency reduction during garbage collection based on the proposed systems and methods is illustrated in
If the destination block is located on a different NAND chip, then the second page can be read before the first page is written on the destination block. This is illustrated in
According to aspects of the disclosure, during garbage collection, the valid pages from the victim block in NAND flash 308 can be copied to the second region R2 (404) of SCM 306, as illustrated in
$$T_{CMD\_WAIT} = T_{GC} \cong T_E + \gamma\,T_{RD},$$
where TE is the block erase latency, TRD is the page read latency, and γ is the number of valid pages in the block that is being erased. Assuming that the NAND flash has a read latency TRD of 50 μs per 16 KB page, an erase latency TE of 3 ms per block, and a valid page ratio of 10%, e.g., γ=25 pages in a block of 256 pages, the latency to reclaim a block according to the proposed system is TCMD_WAIT=3 ms+25*50 μs=4.25 ms, significantly reduced from the latency value of 29.25 ms of a conventional NAND flash memory.
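The two latency expressions can be compared side by side with a short Python sketch. The function names and default values are taken from the example figures in the text and are otherwise hypothetical; like the formula above, the sketch leaves out the time needed to write the valid pages into the SCM.

```python
def conventional_wait_us(valid_pages, erase_us=3000, read_us=50, write_us=1000):
    """Conventional NAND flash: T_E + gamma * (T_RD + T_WR), in microseconds."""
    return erase_us + valid_pages * (read_us + write_us)


def hybrid_wait_us(valid_pages, erase_us=3000, read_us=50):
    """Hybrid NAND flash/SCM: T_E + gamma * T_RD, in microseconds."""
    return erase_us + valid_pages * read_us


# Sweep a few valid-page counts for a 256-page block.
for valid_pages in (0, 25, 128, 256):
    conventional = conventional_wait_us(valid_pages)
    hybrid = hybrid_wait_us(valid_pages)
    print(f"gamma={valid_pages:3d}: conventional {conventional / 1000:7.2f} ms, "
          f"hybrid {hybrid / 1000:6.2f} ms")
```

For γ=25 the sketch reproduces the 29.25 ms and 4.25 ms figures, and the gap widens as the number of valid pages in the victim block grows.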
As discussed above, because the SCM is non-volatile, there is no requirement to flush the garbage collection valid data from the second region R2 (404) of the SCM to the NAND flash chip 308 immediately. The data that will eventually be evicted can be determined based on an eviction policy.
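One possible eviction policy, using the number of pages in a block of the second region and the age of those pages, is sketched below. The disclosure names these two criteria but does not say how they are combined; ranking by page count first and by oldest page second is an assumption of this example, as are the `R2Page` record and the `choose_eviction_block` function.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class R2Page:
    block_id: int      # block of the second region that holds the page (assumed bookkeeping)
    arrival_time: int  # logical timestamp at which the page entered the second region


def choose_eviction_block(pages: List[R2Page], now: int) -> Optional[int]:
    """Pick the block whose pages should be evicted to flash next."""
    if not pages:
        return None
    # Per block: [number of pages, age of the oldest page].
    stats: Dict[int, List[int]] = {}
    for page in pages:
        entry = stats.setdefault(page.block_id, [0, 0])
        entry[0] += 1
        entry[1] = max(entry[1], now - page.arrival_time)
    # Assumed ordering: most pages first, oldest page as the tie-breaker.
    return max(stats, key=lambda block_id: (stats[block_id][0], stats[block_id][1]))


pages = [R2Page(1, 0), R2Page(2, 5), R2Page(2, 6), R2Page(3, 1), R2Page(3, 2)]
chosen = choose_eviction_block(pages, now=10)
print(chosen)
```

Blocks 2 and 3 both hold two pages here, and block 3 is chosen because its oldest page has waited longer.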
If the host command requires access to a NAND flash chip, then the method can check whether a garbage collection operation is occurring at the NAND flash chip (914). If there is no garbage collection operation, the method can check whether there is a NAND flash chip in an idle status (916). If there is an idle NAND flash chip, then pages from the second region R2 can be evicted to the NAND flash chip (912). If there is no idle NAND flash chip, then the method can check whether the received host command is a read command or a write command (918). If the host command is a write command, then one page from the second region R2 can be evicted to the NAND flash chips (920) and the write command can complete (922). If the host command is a read command, then the read command can complete (924) and two pages from the second region R2 can be evicted to the NAND flash chips (926).
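The branch taken when no garbage collection is in progress can be sketched as a small Python function that returns the ordered list of actions. The function name, the string-based action log, and the oldest-first eviction order are hypothetical. The text also does not state how many pages are evicted when a chip is idle, or when the host command itself runs on that branch; the sketch evicts everything in R2 and then runs the command, and both choices are assumptions.

```python
from collections import deque
from typing import Deque, List


def handle_flash_command(command: str, idle_chip_available: bool,
                         r2_pages: Deque[str]) -> List[str]:
    """Ordered actions for a flash-bound host command when no GC is running."""
    if command not in ("read", "write"):
        raise ValueError(f"unsupported command: {command}")
    actions: List[str] = []

    def evict(count: int) -> None:
        # Evict up to `count` pages from R2 to flash, oldest first (assumed order).
        for _ in range(min(count, len(r2_pages))):
            actions.append(f"evict {r2_pages.popleft()}")

    if idle_chip_available:
        # Checks 916/912: an idle chip can absorb R2 pages (amount assumed: all).
        evict(len(r2_pages))
        actions.append(f"execute {command}")  # assumed; not stated in the text
    elif command == "write":
        evict(1)                              # step 920
        actions.append("execute write")       # step 922
    else:
        actions.append("execute read")        # step 924
        evict(2)                              # step 926
    return actions


write_actions = handle_flash_command("write", False, deque(["p0", "p1", "p2"]))
read_actions = handle_flash_command("read", False, deque(["p0", "p1", "p2"]))
idle_actions = handle_flash_command("read", True, deque(["p0", "p1"]))
```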
If there is a garbage collection operation when the host operation requires access to the NAND flash, the method can check whether a particular NAND flash chip, e.g., chip i, undergoes the garbage collection operation and whether the second region R2 (404) has any data from the particular NAND flash chip, e.g., chip i (928). If chip i undergoes a garbage collection operation and the second region R2 has no data from chip i, then all valid pages from the victim block can be copied from chip i to the second region R2 (930). In the other case, the method can evict all remaining data of chip i in the second region R2 (932), before all valid pages from the victim block are copied from chip i to the second region R2 (930). Check 928 is a condition that can be used to set the deadline for evicting the garbage collection data in the SCM R2 region (404). The garbage collected data in the second region R2 (404) is static data that can eventually be written to the NAND flash memory. The second region R2 (404) can provide a temporary space for quickly completing the garbage collection process. If condition 928 is true, this can mean that chip i can still have data pages in the second region R2 when chip i triggers the garbage collection again. In this case, the SSD controller can evict the data pages of chip i from the second region R2, because in the worst case the second region R2 can overflow if no deadline condition is set. Finally, the method can erase the victim block (934).
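The garbage collection branch described above can be sketched as follows. The per-chip dictionary for the second region, the `flash_log` list standing in for pages written back to flash, and the function name are hypothetical; the sketch only mirrors the ordering of the checks, not their timing.

```python
from typing import Dict, List


def collect_with_scm(chip: int, victim_valid_pages: List[str],
                     r2: Dict[int, List[str]], flash_log: List[str]) -> List[str]:
    """Run GC on `chip`, staging valid pages in SCM region R2. Returns the actions."""
    actions: List[str] = []
    leftover = r2.get(chip, [])
    if leftover:
        # Check 928 / step 932: the chip still has pages in R2 from an earlier
        # GC, so the deadline has arrived and they are evicted to flash first.
        flash_log.extend(leftover)
        actions.append(f"evict {len(leftover)} pages of chip {chip} from R2")
    # Step 930: copy the valid pages of the victim block into R2.
    r2[chip] = list(victim_valid_pages)
    actions.append(f"copy {len(victim_valid_pages)} valid pages to R2")
    # Step 934: the victim block can now be erased.
    actions.append("erase victim block")
    return actions


r2 = {0: ["old1", "old2"]}
flash_log: List[str] = []
with_leftover = collect_with_scm(0, ["v1", "v2", "v3"], r2, flash_log)
without_leftover = collect_with_scm(1, ["w1"], r2, flash_log)
```

Because each chip's leftover pages are flushed before that chip stages new ones, the space each chip occupies in R2 stays bounded by one victim block's worth of valid pages in this model.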
Those of skill in the art would appreciate that the various illustrations in the specification and drawings described herein can be implemented as electronic hardware, computer software, or combinations of both. To illustrate this interchangeability of hardware and software, various illustrative blocks, modules, elements, components, methods, and algorithms have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware, software, or a combination depends upon the particular application and design constraints imposed on the overall system. Skilled artisans can implement the described functionality in varying ways for each particular application. Various components and blocks can be arranged differently (for example, arranged in a different order, or partitioned in a different way) all without departing from the scope of the subject technology.
Furthermore, an implementation of the communication protocol can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system, or other apparatus adapted for carrying out the methods described herein, is suited to perform the functions described herein.
A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein. The methods for the communications protocol can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system is able to carry out these methods.
Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following a) conversion to another language, code or notation; b) reproduction in a different material form. Significantly, this communications protocol can be embodied in other specific forms without departing from the spirit or essential attributes thereof, and accordingly, reference should be had to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.
The communications protocol has been described in detail with specific reference to these illustrated embodiments. It will be apparent, however, that various modifications and changes can be made within the spirit and scope of the disclosure as described in the foregoing specification, and such modifications and changes are to be considered equivalents and part of this disclosure.