One important performance metric for a storage system is read latency, i.e., the latency of retrieving data stored in the storage system. The performance of the storage system improves as the read latency decreases. The read latency for a storage system may be decreased if the storage system is able to reliably retrieve error-free data from the storage medium. When error-free data is not retrieved, the storage system may perform additional actions in order to remove the errors from the retrieved data. For example, the storage system may use error correction mechanisms such as error correcting codes (ECC) and/or RAID to remove errors from the retrieved data or otherwise generate error-free data. The use of error correction mechanisms results in an increase in read latency, which is accompanied by a corresponding decrease in performance.
In general, in one aspect, the invention relates to a method for reading data from persistent storage, the method comprising receiving a client read request for data from a client, wherein the client read request comprises a logical address, determining a physical address corresponding to the logical address, wherein the physical address comprises a page number for a physical page in the persistent storage, determining, using one selected from a group consisting of the physical address and the logical address, a retention time for the data, determining a program/erase (P/E) cycle value associated with the physical page, obtaining at least one read threshold value using the P/E cycle value, the retention time, and the page number, issuing a control module read request comprising the at least one read threshold value to a storage module, wherein the storage module comprises the physical page, and obtaining the data from the physical page using the at least one read threshold value.
In general, in one aspect, the invention relates to a system, comprising a storage module comprising a storage module controller and persistent storage, and a control module operatively connected to the storage module and a client, wherein the control module: receives a client read request for data from a client, wherein the client read request comprises a logical address, determines a physical address corresponding to the logical address, wherein the physical address comprises a page number for a physical page in the persistent storage, determines, using one selected from a group consisting of the physical address and the logical address, a retention time for the data stored on the physical page, determines a program/erase (P/E) cycle value associated with the physical page, obtains at least one read threshold value using the P/E cycle value, the retention time, and the page number; and issues a control module read request comprising the at least one read threshold value to the storage module, wherein the storage module comprises the physical page, wherein the storage module: receives the control module read request; and obtains the data from the physical page using the at least one read threshold value in the control module read request.
In general, in one aspect, the invention relates to a non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to: receive a client read request for data from a client, wherein the client read request comprises a logical address, determine a physical address corresponding to the logical address, wherein the physical address comprises a page number for a physical page in a persistent storage, determine, using one selected from a group consisting of the physical address and the logical address, a retention time for the data, determine a program/erase (P/E) cycle value associated with the physical page, obtain at least one read threshold value using the P/E cycle value, the retention time, and the page number, issue a control module read request comprising the at least one read threshold value to a storage module, wherein the storage module comprises the physical page, and obtain the data from the physical page using the at least one read threshold value.
Other aspects of the invention will be apparent from the following description and the appended claims.
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In the following description of
In general, embodiments of the invention relate to increasing the utilization of solid-state storage by dynamically modifying read threshold values over the lifetime of the flash storage. More specifically, embodiments of the invention relate to using P/E cycle values, retention times, and page numbers in order to determine the appropriate read threshold value(s) to use when reading data that has been previously stored in the solid-state storage. The ability to dynamically change the read threshold values on a per read request basis allows for more error-free data to be retrieved from the solid-state storage. When error-free data is retrieved from the solid-state storage, there is no need to invoke error correction mechanisms. As a result, the performance of the system increases.
The following description describes one or more systems and methods for implementing one or more embodiments of the invention.
In one embodiment of the invention, clients (100A, 100M) correspond to any physical system that includes functionality to issue a read request to the storage appliance (102) and/or issue a write request to the storage appliance (102). Though not shown in
In one embodiment of the invention, the client (100A-100M) is configured to execute an operating system (OS) that includes a file system. The file system provides a mechanism for the storage and retrieval of files from the storage appliance (102). More specifically, the file system includes functionality to perform the necessary actions to issue read requests and write requests to the storage appliance. The file system also provides programming interfaces to enable the creation and deletion of files, reading and writing of files, performing seeks within a file, creating and deleting directories, managing directory contents, etc. In addition, the file system also provides management interfaces to create and delete file systems. In one embodiment of the invention, to access a file, the operating system (via the file system) typically provides file manipulation interfaces to open, close, read, and write the data within each file and/or to manipulate the corresponding metadata.
Continuing with the discussion of
In one embodiment of the invention, the storage appliance (102) is a system that includes volatile and persistent storage and is configured to service read requests and/or write requests from one or more clients (100A, 100M). Various embodiments of the storage appliance (102) are described below in
Referring to
Referring to
Those skilled in the art will appreciate that while
Continuing with the discussion of
Continuing with
In one embodiment of the invention, the processor (208) is configured to create and update an in-memory data structure (not shown), where the in-memory data structure is stored in the memory (210). In one embodiment of the invention, the in-memory data structure includes information described in
In one embodiment of the invention, the processor is configured to offload various types of processing to the FPGA (212). In one embodiment of the invention, the FPGA (212) includes functionality to calculate checksums for data that is being written to the storage module(s) and/or data that is being read from the storage module(s). Further, the FPGA (212) may include functionality to calculate P and/or Q parity information for purposes of storing data in the storage module(s) using a RAID scheme (e.g., RAID 2-RAID 6) and/or functionality to perform various calculations necessary to recover corrupted data stored using a RAID scheme (e.g., RAID 2-RAID 6). In one embodiment of the invention, the storage module group (202) includes one or more storage modules (214A, 214N) each configured to store data. One embodiment of a storage module is described below in
In one embodiment of the invention, the storage module controller (300) is configured to receive, from one or more control modules, requests to read and/or write data. Further, the storage module controller (300) is configured to service the read and write requests using the memory (not shown) and/or the solid-state memory modules (304A, 304N).
In one embodiment of the invention, the memory (not shown) corresponds to any volatile memory including, but not limited to, Dynamic Random-Access Memory (DRAM), Synchronous DRAM, SDR SDRAM, and DDR SDRAM.
In one embodiment of the invention, the solid-state memory modules correspond to any data storage device that uses solid-state memory to store persistent data. In one embodiment of the invention, solid-state memory may include, but is not limited to, NAND Flash memory and NOR Flash memory. Further, the NAND Flash memory and the NOR Flash memory may include single-level cells (SLCs), multi-level cells (MLCs), or triple-level cells (TLCs). Those skilled in the art will appreciate that embodiments of the invention are not limited to storage class memory.
The memory includes a mapping of logical addresses (400) to physical addresses (402). In one embodiment of the invention, the logical address (400) is an address at which the data appears to reside from the perspective of the client (e.g., 100A, 100M in
In one embodiment of the invention, the logical address is (or includes) a hash value generated by applying a hash function (e.g., SHA-1, MD-5, etc.) to an n-tuple, where the n-tuple is <object ID, offset ID>. In one embodiment of the invention, the object ID defines a file and the offset ID defines a location relative to the starting address of the file. In another embodiment of the invention, the n-tuple is <object ID, offset ID, birth time>, where the birth time corresponds to the time when the file (identified using the object ID) was created. Alternatively, the logical address may include a logical object ID and a logical byte address, or a logical object ID and a logical address offset. In another embodiment of the invention, the logical address includes an object ID and an offset ID. Those skilled in the art will appreciate that multiple logical addresses may be mapped to a single physical address and that the logical address content and/or format is not limited to the above embodiments.
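As an illustration of the hash-based embodiment, the following minimal sketch derives a logical address by applying SHA-1 to the n-tuple <object ID, offset ID>. The function name `logical_address` and the packing of both IDs as unsigned 64-bit big-endian integers are assumptions made for this example; the description does not prescribe a particular encoding of the n-tuple.

```python
import hashlib
import struct


def logical_address(object_id: int, offset_id: int) -> str:
    """Return a logical address derived from the <object ID, offset ID> n-tuple."""
    # Assumption: each ID fits in an unsigned 64-bit integer and is packed
    # big-endian before hashing. Any stable encoding would serve equally well.
    packed = struct.pack(">QQ", object_id, offset_id)
    return hashlib.sha1(packed).hexdigest()
```

The same n-tuple always yields the same logical address, so the result may be used as a key into the logical-to-physical mapping.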
In one embodiment of the invention, the physical address (402) corresponds to a physical location in a solid-state memory module (304A, 304N) in
In one embodiment of the invention, each physical address (402) is associated with a program/erase (P/E) cycle value (404). The P/E cycle value may represent: (i) the number of P/E cycles that have been performed on the physical location defined by the physical address or (ii) a P/E cycle range (e.g., 5,000-9,999 P/E cycles), where the number of P/E cycles that have been performed on the physical location defined by the physical address is within the P/E cycle range. In one embodiment of the invention, a P/E cycle is the writing of data to one or more pages in an erase block (i.e., the smallest addressable unit for erase operations, typically, a set of multiple pages) and the erasure of that block, in either order.
The P/E cycle values may be stored on a per page basis, a per block basis, on a per set of blocks basis, and/or at any other level of granularity. The control module includes functionality to update, as appropriate, the P/E cycle values (404) when data is written to (and/or erased from) the solid-state storage modules.
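The two representations of a P/E cycle value (an exact count or a range containing the count) may be sketched as follows. This is a minimal example that tracks counts on a per block basis; the class name `PECounter`, its methods, and the range width of 5,000 cycles (chosen to match the 5,000-9,999 example above) are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Assumption: hypothetical range width, chosen so that one range is 5,000-9,999.
RANGE_WIDTH = 5_000


class PECounter:
    """Tracks P/E cycle values per erase block (hypothetical helper)."""

    def __init__(self) -> None:
        self._counts: Dict[int, int] = defaultdict(int)  # block id -> P/E cycles

    def record_cycle(self, block: int) -> None:
        """Record one completed program/erase cycle on the block."""
        self._counts[block] += 1

    def value(self, block: int) -> int:
        """Return the exact number of P/E cycles performed on the block."""
        return self._counts[block]

    def value_range(self, block: int) -> Tuple[int, int]:
        """Return the inclusive P/E cycle range that contains the exact count."""
        low = (self._counts[block] // RANGE_WIDTH) * RANGE_WIDTH
        return (low, low + RANGE_WIDTH - 1)
```

Storing a range rather than an exact count reduces the number of distinct keys needed in any table indexed by P/E cycle value.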
In one embodiment of the invention, all data (i.e., data that the file system on the client has requested be written to solid-state storage modules) (406) is associated with a birth time (408). The birth time (408) may correspond to: (i) the time the data is written to a physical location in a solid-state storage module (as a result of client write request, as a result of a garbage collection operation initiated by the control module, etc.); (ii) the time that the client issued a write request to write the data to a solid-state storage module; or (iii) a unitless value (e.g., a sequence number) that corresponds to the write events in (i) or (ii).
In one embodiment of the invention, the in-memory data structure includes a mapping of <retention time, page number, P/E cycle value> to one or more read threshold values (412). The aforementioned mapping may further include any other system parameter(s) (i.e., one or more parameters in addition to retention time, page number, P/E cycle value) that affect the read threshold (e.g., temperature, workload, etc.). In one embodiment of the invention, the retention time corresponds to the time that has elapsed between the writing of the data to a physical location in a solid-state storage module and the time that the data is being read from the same physical location in the solid-state storage module. The retention time may be expressed in units of time or may be expressed as a unitless value (e.g., when the birth time is expressed as a unitless value). In one embodiment of the invention, the P/E cycle value in <retention time, page number, P/E cycle value> may be expressed as a P/E cycle or a P/E cycle range.
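The mapping of <retention time, page number, P/E cycle value> to read threshold values may be sketched as a dictionary keyed by a tuple. In this minimal example, the retention time is computed from a birth time expressed in seconds and quantized to whole months, and the P/E cycle value is quantized to a range. All table entries, the 30-day month, the range width, and the names `READ_THRESHOLD_TABLE` and `lookup_read_thresholds` are hypothetical values chosen for illustration and are not measured data.

```python
from typing import Dict, Optional, Tuple

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a coarse 30-day month
PE_RANGE_WIDTH = 5_000              # assumption: width of each P/E cycle range

# Key: (retention time in months, page number, start of P/E cycle range).
# Value: shift values per threshold. All entries are made up for illustration.
Key = Tuple[int, int, int]
READ_THRESHOLD_TABLE: Dict[Key, Dict[str, int]] = {
    (4, 3, 30_000): {"B": -2, "C": -3},
    (5, 3, 30_000): {"B": -3, "C": -4},
}


def retention_time(birth_time: float, read_time: float) -> float:
    """Time elapsed between writing the data and reading it, in seconds."""
    return read_time - birth_time


def lookup_read_thresholds(
    birth_time: float, read_time: float, page_number: int, pe_cycles: int
) -> Optional[Dict[str, int]]:
    """Return the read threshold values for the key, or None if no entry exists."""
    months = int(retention_time(birth_time, read_time) // SECONDS_PER_MONTH)
    pe_range_start = (pe_cycles // PE_RANGE_WIDTH) * PE_RANGE_WIDTH
    return READ_THRESHOLD_TABLE.get((months, page_number, pe_range_start))
```

A result of `None` in this sketch stands for the case in which no non-default read threshold value is recorded for the combination.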
In one embodiment of the invention, read threshold value(s) (412) correspond to voltages or shift values, where a shift value corresponds to a voltage shift of a default read threshold value. Each of the read threshold values may be expressed as a voltage or as a unitless number that corresponds to a voltage.
In one embodiment of the invention, the default read threshold value is specified by the manufacturer of the solid-state memory modules. Further, the granularity of the shift values may be specified by a shift value, where the shift value corresponds to a voltage shift of a corresponding default read threshold value.
In one embodiment of the invention, the read threshold values (including the default read threshold values) correspond to voltage values that are used to read data stored in solid-state storage modules. More specifically, in one embodiment of the invention, the logical value (e.g., 1 or 0 for memory cells that are SLCs or 00, 10, 11, 01 for memory cells that are MLCs) is determined by comparing the voltage in the memory cell to one or more read threshold values. The logical value stored in the memory cell may then be ascertained based on the results of the comparison. For example, if a given voltage (V) is above a B threshold and below a C threshold, then the logical value stored in the memory cell is 00 (see e.g.,
In one embodiment of the invention, the read threshold value(s) (412) are ascertained by conducting experiments to determine how the read threshold values should be modified when at least one of the following variables is modified: retention time, P/E cycle value, and page number. The read threshold value(s) (412) are optimized in order to be able to successfully read data from a solid-state memory module. Specifically, for each combination of <retention time, P/E cycle value, page number> an optimal read threshold value is determined. The optimal read threshold value for a given <retention time, P/E cycle value, page number> is the read threshold value that results in the lowest bit error rate (BER) in data retrieved from a solid-state memory module for a given retention time of the data, P/E cycle value of the physical location on which the data is stored, and the page number of the page on which the data is stored in the solid-state memory module.
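The selection of an optimal read threshold value by lowest BER may be sketched as follows. This minimal example assumes that, for one <retention time, P/E cycle value, page number> combination, a known data pattern has been read back once per candidate read threshold value; the function names and the shape of the `reads_by_threshold` argument are assumptions for illustration and do not describe a particular experimental procedure.

```python
from typing import Dict


def bit_error_rate(expected: bytes, observed: bytes) -> float:
    """Fraction of bits that differ between the written and the retrieved data."""
    if not expected or len(expected) != len(observed):
        raise ValueError("buffers must be non-empty and of equal length")
    flipped = sum(bin(a ^ b).count("1") for a, b in zip(expected, observed))
    return flipped / (8 * len(expected))


def optimal_read_threshold(expected: bytes, reads_by_threshold: Dict[int, bytes]) -> int:
    """Return the candidate threshold whose read-back data has the lowest BER."""
    return min(
        reads_by_threshold,
        key=lambda threshold: bit_error_rate(expected, reads_by_threshold[threshold]),
    )
```

Repeating this selection for every combination of the three variables would populate the mapping of combinations to read threshold values.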
By modifying the read threshold value(s) based upon retention time, P/E cycle value, and page number, the storage appliance takes into account the various variables that may alter the voltage stored in a given memory cell at a given retention time, P/E cycle value, and page number. Said another way, when the logical value “01” is to be stored in a memory cell, the storage module controller stores a sufficient number of electrons in the memory cell in order to have a voltage that corresponds to “01”. Over time, the voltage stored in the memory cell varies based upon the retention time, P/E cycle value, and page number. By understanding how the voltage varies over time based on the above variables, an appropriate read threshold value may be used when reading the logical value from the memory cell in order to retrieve “01”.
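The effect of voltage drift on a read may be illustrated with a minimal sketch of an MLC read that compares the cell voltage to thresholds A, B, and C. Only the rule that a voltage between thresholds B and C yields "00" is taken from the description; the ordering of the remaining states ("11", "01", "00", "10" by increasing voltage) and all voltage numbers are assumptions made for this example.

```python
from bisect import bisect_right

# Assumption: logical values ordered by increasing cell voltage. Only the
# placement of "00" between thresholds B and C comes from the description.
STATES = ("11", "01", "00", "10")


def read_cell(voltage: float, a: float, b: float, c: float) -> str:
    """Return the logical value obtained by comparing the voltage to A, B, and C."""
    # bisect_right counts how many thresholds the voltage has reached.
    return STATES[bisect_right([a, b, c], voltage)]


# Hypothetical default read threshold values, in volts.
DEFAULT_A, DEFAULT_B, DEFAULT_C = 1.0, 2.0, 3.0

# A cell programmed to "01" at 1.2 V is assumed to have drifted to 0.95 V.
drifted_voltage = 0.95
value_with_default = read_cell(drifted_voltage, DEFAULT_A, DEFAULT_B, DEFAULT_C)
value_with_shifted = read_cell(drifted_voltage, DEFAULT_A - 0.1, DEFAULT_B, DEFAULT_C)
```

Under these assumed numbers, the default threshold A misreads the drifted cell as "11", while a threshold A lowered by 0.1 V recovers "01".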
For example, a first read threshold value(s) may be used to successfully read data when the retention time is 4 months, the P/E cycle value is 30,000, and the page number is 3, while a second read threshold value(s) may be used to successfully read data when the retention time is 5 months, the P/E cycle value is 30,000, and the page number is 3.
If the default read threshold value is used (instead of a non-default read threshold value), then there is a higher likelihood that an incorrect logical value (e.g., “11” instead of “01”) may be obtained from reading the memory cell. This, in turn, results in the need for ECC or other error correction mechanisms such as RAID reconstruction (i.e., correction of errors within retrieved data using one or more parity values) in order to correct the error in the retrieved data and ultimately provide error-free data to the requesting client. The use of error correction mechanisms increases the time required to service a client read request and consequently decreases the performance of the storage appliance.
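To make the cost of such a fallback concrete, the following minimal sketch reconstructs one lost chunk of a stripe from the surviving chunks and a single XOR parity value. It assumes a single-parity scheme with equally sized chunks; the function names are hypothetical, and schemes that use both P and Q parity require additional arithmetic that is not shown.

```python
from functools import reduce
from typing import Sequence


def xor_parity(chunks: Sequence[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*chunks))


def reconstruct(surviving_chunks: Sequence[bytes], parity: bytes) -> bytes:
    """Recover the single missing chunk from the survivors and the parity."""
    # XOR-ing the parity with every surviving chunk cancels them out,
    # leaving only the missing chunk.
    return xor_parity([*surviving_chunks, parity])
```

Reconstruction requires reading every surviving chunk in the stripe plus the parity, which is why avoiding it through a better read threshold value shortens the time to service a read request.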
In one embodiment of the invention, a read threshold value(s) may be provided for each <retention time, P/E cycle value, and page number> combination. The specific read threshold value(s) for a given <retention time, P/E cycle value, and page number> may correspond to the default read threshold value(s) or a non-default read threshold value(s) (i.e., a read threshold value other than the default read threshold value(s)).
In another embodiment of the invention, memory (210 in
Turning to the flowcharts, while the various steps in the flowcharts are presented and described sequentially, one of ordinary skill will appreciate that some or all of the steps may be executed in different orders, may be combined or omitted, and some or all of the steps may be executed in parallel.
In Step 500, a client read request is received by the control module from a client, where the client read request includes a logical address.
In Step 502, a physical address (which includes the page number) is determined from the logical address. As discussed above, the memory in the control module includes a mapping of logical addresses to physical addresses (see discussion of
In Step 504, the retention time (t) is determined for the data stored at the physical address. The retention time may be determined using the birth time of the data (see
In Step 506, the P/E cycle value for the physical address is determined. The P/E cycle value may be determined by performing a look-up in an in-memory data structure (located in the memory of the control module) using the physical address as the key. The result of Step 506 may be the actual P/E cycle value associated with the physical address (e.g., the P/E cycle value associated with the block in which the physical location corresponding to the physical address is located) or may be a P/E cycle value range (e.g., 5,000-9,999 P/E cycles), where the actual P/E cycle value associated with the physical address is within the P/E cycle value range.
In Step 508, zero or more read threshold values are obtained from an in-memory data structure (see
In one embodiment of the invention, the determination of whether to use a non-default read threshold value may be based on the P/E cycle value (determined in Step 506) or the retention time (determined in Step 504). For example, when the P/E cycle value is below a threshold P/E cycle value, the default read threshold value(s) is used and, as such, Step 508 is not performed. Additionally or alternatively, when the retention time is below a threshold retention time, the default read threshold value(s) is used and, as such, Step 508 is not performed. When the P/E cycle value (determined in Step 506) is above the threshold P/E cycle value and/or the retention time (determined in Step 504) is above the threshold retention time then the look-up described in Step 508 is performed.
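The determination described above may be sketched as a simple predicate. This minimal example implements the variant in which exceeding either threshold triggers the look-up; the threshold constants and the function name `needs_lookup` are hypothetical and chosen only for illustration.

```python
# Assumption: hypothetical cut-off values; real values would be device specific.
PE_CYCLE_THRESHOLD = 10_000
RETENTION_THRESHOLD_MONTHS = 3


def needs_lookup(pe_cycle_value: int, retention_months: float) -> bool:
    """Return True if the look-up of Step 508 should be performed."""
    # Below both thresholds, the default read threshold value(s) are used
    # and the look-up is skipped.
    return (
        pe_cycle_value > PE_CYCLE_THRESHOLD
        or retention_months > RETENTION_THRESHOLD_MONTHS
    )
```

Skipping the look-up for lightly worn, recently written pages keeps the common-case read path short.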
Continuing with the discussion in
In one embodiment of the invention, if there are multiple read threshold values associated with a given read request (e.g., see
In Step 520, the control module read request is received from the control module. In Step 522, a read command is generated by the storage module controller based on the one or more read threshold value(s) and the physical address in the control module read request. In one embodiment of the invention, any given read command generated in Step 522 may specify one or more read threshold values. If the control module read request does not include any read threshold values, then the default read threshold values are used to generate the read command. If the control module read request includes read threshold values that are in the form of shift values (described above), then generating the read command may include obtaining the default read threshold values and modifying one or more read threshold values using the shift value(s). The read command may be in any format that is supported by the solid-state memory modules.
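The generation of a read command in Step 522 may be sketched as follows. In this minimal example, the default read threshold values are hypothetical unitless numbers, the read command is represented as a plain dictionary, and the function name `build_read_command` is an assumption; an actual read command would use whatever format the solid-state memory modules support.

```python
from typing import Dict, Optional

# Assumption: hypothetical unitless default read threshold values.
DEFAULT_READ_THRESHOLDS: Dict[str, int] = {"A": 100, "B": 200, "C": 300}


def build_read_command(
    physical_address: int, shift_values: Optional[Dict[str, int]] = None
) -> Dict[str, object]:
    """Build a read command, applying any shift values to the defaults."""
    thresholds = dict(DEFAULT_READ_THRESHOLDS)  # copy so the defaults stay intact
    for name, shift in (shift_values or {}).items():
        thresholds[name] += shift  # a shift value is an offset from the default
    return {"op": "read", "address": physical_address, "thresholds": thresholds}
```

For instance, `build_read_command(42, {"B": -5, "C": -8})` keeps the default threshold A and lowers thresholds B and C, whereas `build_read_command(42)` uses the default read threshold values unchanged.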
In Step 524, the read command is issued to the solid-state memory module. In Step 526, data is received, by the storage module controller, in response to the read command. In Step 528, the retrieved data is provided to the control module. The control module subsequently provides the data to the client. In one embodiment of the invention, the storage module controller may include functionality to directly transfer the retrieved data to the client without requiring the data to be temporarily stored in the memory on the control module.
Turning to
In this example assume that the solid-state memory module (620, 622) includes MLCs and that the aforementioned look-up returns read threshold values in the form of shift values for threshold B and threshold C (see
The storage module (614) subsequently receives and services the control module read request (612). More specifically, the storage module controller (612) generates and issues a read command (618) to the solid-state memory module that includes the physical location corresponding to the physical address. In this example, the read command is generated using the default read threshold A value, a non-default read threshold B value, and/or a non-default read threshold C value. The non-default read threshold B value is determined using the default read threshold B value and the shift value for threshold B. Further, the non-default read threshold C value is determined using the default read threshold C value and the shift value for threshold C.
The storage module controller subsequently receives the data from the solid-state memory module and then provides the data (in a response (624)) to the client (600). The data may be directly copied from a memory (not shown) in the storage module to a client memory (not shown).
One or more embodiments of the invention may be implemented using instructions executed by one or more processors in the storage appliance. Further, such instructions may correspond to computer readable instructions that are stored on one or more non-transitory computer readable media.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.