The present invention relates to the field of solid-state data storage devices, and particularly to reducing the read tail latency of solid state data storage devices.
Solid-state data storage devices, which use non-volatile NAND flash memory technology, are being pervasively deployed in various computing and storage systems. In addition to one or more NAND flash memory chips, each solid-state data storage device must contain a controller that manages all the NAND flash memory chips. NAND flash memory cells are organized in an array→block→page hierarchy, where one NAND flash memory array is partitioned into a large number (e.g., thousands) of blocks, and each block contains a certain number (e.g., 256) of pages. The size of each flash memory physical page typically ranges from 8 kB to 32 kB, and the size of each flash memory block is typically tens of MB. Data are programmed and fetched in units of pages. However, flash memory cells must be erased before being re-programmed, and the erase operation is carried out in units of blocks (i.e., all the pages within the same block must be erased at the same time).
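The page-granular program/read and block-granular erase constraint described above can be sketched as a toy model. This is only an illustration: the class name `Block` and its methods are hypothetical, and the page count uses the example value of 256 from the text.

```python
PAGES_PER_BLOCK = 256  # example value from the text; real devices vary


class Block:
    """Toy model of one NAND flash block (hypothetical, for illustration)."""

    def __init__(self):
        # None marks an erased page that is ready to be programmed.
        self.pages = [None] * PAGES_PER_BLOCK

    def program(self, page, data):
        # A page can only be programmed while it is in the erased state.
        if self.pages[page] is not None:
            raise RuntimeError("page must be erased before re-programming")
        self.pages[page] = data

    def read(self, page):
        # Reads operate on a single page.
        return self.pages[page]

    def erase(self):
        # Erase always covers every page of the block at once.
        self.pages = [None] * PAGES_PER_BLOCK
```

In this model, updating a single page in place is impossible without erasing the whole block, which is why erase operations are comparatively rare but long.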
Compared with hard disk drives (HDDs), flash-based solid-state storage devices can achieve significantly higher average I/O throughput and lower average I/O access latency. In addition to average I/O throughput and latency, many applications (e.g., databases) have stringent requirements on read tail latency (e.g., 99th percentile read latency). Nevertheless, solid-state storage devices can be subject to long read tail latency, which can be explained as follows. The read latency of NAND flash memory is typically tens of microseconds (e.g., 30 to 50 μs), while the write and erase latency of NAND flash memory is typically a few milliseconds (e.g., 2 ms). While a flash memory chip or die carries out a page write or block erase operation, it cannot serve any read operations. As a result, write/erase operations can block subsequent read requests from being served for a long time, leading to long read tail latency.
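As a rough worked example of the blocking effect described above, the following sketch uses the example latencies from the text (50 μs per read, 2 ms per write). The function name and the simplifying assumption of strict first-in-first-out service on a single die, with no suspension, are illustrative only.

```python
READ_US = 50      # example read latency from the text (30 to 50 us)
WRITE_US = 2000   # example write/erase latency from the text (2 ms)


def read_latency_us(writes_ahead, write_us=WRITE_US, read_us=READ_US):
    """Latency seen by a read queued behind `writes_ahead` full writes.

    Assumes strict FIFO service on one die and no write suspension.
    """
    return writes_ahead * write_us + read_us


for n in (0, 1, 4):
    print(n, "writes ahead:", read_latency_us(n), "us")
```

Under these assumptions, a single write ahead of a read raises its latency from 50 μs to 2050 μs, roughly a forty-fold increase, which is the kind of outlier that dominates a 99th percentile measurement.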
Accordingly, embodiments of the present disclosure are directed to systems and methods for reducing the read tail latency of solid state data storage devices.
A first aspect provides a storage device, comprising: a set of flash memory chips; and a controller that schedules requests from a host using a set of request queues, wherein the controller includes a queue manager that: reorders high priority read requests over low priority write requests in each request queue; suspends low priority write requests to process high priority read requests; and limits a number of low priority write requests allowed in each request queue to a threshold value smaller than a size of each request queue.
A second aspect provides a storage infrastructure, comprising: a host; a set of flash memory chips; and a controller that schedules requests from the host using a set of request queues, wherein the controller includes a queue manager that: reorders high priority read requests over low priority write requests in each request queue; suspends low priority write requests to process high priority read requests; and limits a number of low priority write requests allowed in each request queue to a threshold value smaller than a size of each request queue.
A third aspect provides a method for scheduling flash memory requests on a controller, comprising: receiving requests from a host; loading the requests into a set of request queues; reordering high priority read requests over low priority write requests in each request queue; suspending low priority write requests to process high priority read requests; and limiting a number of low priority write requests allowed in each request queue to a threshold value smaller than a size of each request queue.
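The three queue manager behaviors named in the aspects above (reordering, suspension, and a cap on low priority writes) can be sketched for a single request queue as follows. This is a minimal software model under stated assumptions, not the disclosed controller: the names `Request`, `QueueManager`, `submit`, `step`, and `complete` are hypothetical, suspension is modeled simply by returning the active write to the head of the queue, and a suspended write may briefly push the queue one entry past its nominal size.

```python
from dataclasses import dataclass


@dataclass
class Request:
    kind: str            # "read" or "write"
    high_priority: bool
    tag: str = ""


def is_high_read(r):
    return r.kind == "read" and r.high_priority


def is_low_write(r):
    return r.kind == "write" and not r.high_priority


class QueueManager:
    """Hypothetical model of one request queue feeding one flash die."""

    def __init__(self, size, write_limit):
        if not 0 < write_limit < size:
            raise ValueError("write_limit must be smaller than the queue size")
        self.size = size
        self.write_limit = write_limit
        self.queue = []      # pending requests, oldest first
        self.active = None   # request currently executing on the die

    def _low_writes(self):
        held = self.queue + ([self.active] if self.active is not None else [])
        return sum(is_low_write(r) for r in held)

    def submit(self, req):
        """Admit a request unless the queue or the write cap is exhausted."""
        if len(self.queue) >= self.size:
            return False
        if is_low_write(req) and self._low_writes() >= self.write_limit:
            return False
        self.queue.append(req)
        return True

    def _dispatch(self):
        # Reordering: the oldest high priority read jumps ahead of the rest.
        for i, r in enumerate(self.queue):
            if is_high_read(r):
                return self.queue.pop(i)
        return self.queue.pop(0) if self.queue else None

    def step(self):
        """Choose what the die works on next and return it."""
        waiting_read = any(is_high_read(r) for r in self.queue)
        if self.active is not None and is_low_write(self.active) and waiting_read:
            # Suspension: park the active write at the head so it resumes first.
            self.queue.insert(0, self.active)
            self.active = None
        if self.active is None:
            self.active = self._dispatch()
        return self.active

    def complete(self):
        """Mark the active request as finished and return it."""
        done, self.active = self.active, None
        return done
```

With `QueueManager(size=4, write_limit=2)`, a third low priority write is refused while two are held, yet a high priority read is still admitted and is served ahead of the write that was already executing.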
The numerous advantages of the present invention may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the presently preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings.
Although effective, the above two design strategies may not always be adequate, especially in the presence of stringent read tail latency constraints. When using either request re-ordering or low-priority write operation suspension, the number of pending low-priority write requests within the request queue will gradually increase as the system keeps postponing/suspending low-priority write operations in favor of serving high-priority read requests. Once a request queue 22 is filled with low-priority write requests (and low-priority read requests, if any), the request queue 22 cannot accept any new requests (including high-priority read requests) until at least one pending request within the queue has been successfully processed. This blocks high-priority read requests, contributing to read tail latency.
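The failure mode described above, and the effect of capping low-priority writes below the queue size, can be illustrated with a small worst-case calculation. The function name, the queue size of 16, and the cap of 12 are assumptions chosen for illustration; the worst case assumed is that no write completes while the burst arrives.

```python
def read_admitted(queue_size, write_cap, arriving_writes):
    """Return True if a high-priority read still finds a free queue slot.

    Worst case assumed: no write completes during the write burst.
    A write_cap equal to queue_size models a queue with no write limit.
    """
    occupied = min(arriving_writes, write_cap, queue_size)
    return occupied < queue_size


# Without a cap, a burst of 32 writes fills a 16-entry queue.
print(read_admitted(queue_size=16, write_cap=16, arriving_writes=32))
# With writes capped at 12, four slots stay available for reads.
print(read_admitted(queue_size=16, write_cap=12, arriving_writes=32))
```

The first call reports that the read is rejected and the second that it is admitted, which is the motivation for keeping the write threshold strictly smaller than the queue size.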
As shown in
When implementing this approach, an important issue is how to quantitatively determine the threshold value of lw. To address this issue, three illustrative options are described, including using a fixed limiter 26 or a dynamic limiter 28 (
It is understood that other approaches for dynamically or statically calculating a threshold value lw may be used within the scope of this invention. It is also understood that the controller 10 may be implemented in any manner, e.g., as an integrated circuit board or a controller card that includes a processing core, I/O, processing logic and/or a software program. Aspects may be implemented in hardware or software, or a combination thereof. For example, aspects of the processing logic may be implemented using field programmable gate arrays (FPGAs), ASIC devices, or other hardware-oriented systems.
Aspects may be implemented with a computer program product stored on a computer readable storage medium. The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, etc. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Python, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by hardware and/or computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
The foregoing description of various aspects of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to an individual in the art are included within the scope of the invention as defined by the accompanying claims.
Number | Date | Country
---|---|---
62545941 | Aug 2017 | US