This disclosure relates to non-volatile data storage and retrieval within semiconductor memory.
Nonvolatile memory is a type of digital memory that can retain data in the absence of power. Generally speaking, nonvolatile memory is relatively cheap, but it is also relatively slow when compared to other types of memory, such as random access memory (“RAM”). Given this disadvantage in performance, the memory industry continually strives to improve performance characteristics of nonvolatile memory, so as to enable its use as a cheaper replacement for RAM.
Many storage technologies use techniques that store some type of redundancy information that can be used to recover data in the event of an error. Generally speaking, these techniques, referred to as “erasure coding” or “redundancy” storage, process some type of underlying data to generate one or more pieces of error correction (“EC”) information which is stored in addition to the underlying data; if some amount of the underlying data is lost, for example, because data becomes corrupted or because a storage unit becomes unavailable, the underlying data can still be quickly recovered. There are several different types of redundancy schemes including schemes where EC information is stored together with the data (i.e., in the same physical memory structure), and these schemes typically permit recovery of up to a certain number of bit errors. There also exist schemes which store redundancy information across multiple storage units, for example, to permit recovery of contents of one of the multiple storage units should any one of the multiple storage units become unavailable. “Redundant Array of Independent Disks” or simply “RAID” is sometimes conventionally used to refer to these latter techniques, and this term is used herein to refer to any storage technology where error recovery information and/or redundant data is stored across two or more distinct logical or physical storage resources, i.e., including but not limited to disk-based schemes, such that data can still be recovered notwithstanding loss or unavailability of one or more of these storage resources.
It is desirable to extend redundancy techniques (including RAID techniques) so that they can be used with non-volatile memory, i.e., such that these techniques can also be used with multiple storage resources each based in non-volatile memory. One relatively cheap and commonly available class of non-volatile memory utilizes asymmetric programming and erasing of storage units; flash memory and magnetic shingled drives are nonlimiting examples of these asymmetric non-volatile memory (“ANVM”) technologies. With these types of memory technologies, the minimum unit of data that must be written (or “programmed”) at once and the minimum unit of data that must be erased at once are different; with flash memory specifically, data is typically programmed in units of pages and, depending on the specific technology, erased in units of “blocks” or “erase units” (EUs). A typical page size is about 16 kilobytes (KB), while a typical EU size might be hundreds to thousands of pages. A problem that can arise with these types of storage technologies is that entire structural units of memory, such as individual EUs, dies or other structures, can wear out over time, with the result that data and/or associated EC information can be lost. While techniques exist, as mentioned, for providing redundancy across structures, the asymmetric programming and erasing requirement can further complicate erasure coding schemes. To cite some specific problems that can arise with redundancy schemes for ANVM: if EC information is stored with data (e.g., in the same EU) and the EU becomes defective (as is statistically one of the more common forms of ANVM error), it may not be possible to recover the data; and because of the requirement for address translation, caused by the asymmetric programming/erasing framework, challenges exist in protecting data across multiple structures or drives (e.g., using RAID-style redundancy), as the need for address translation makes linking data and associated EC information difficult and substantially increases host traffic, and as substantial spare capacity must typically be devoted to providing effective error recovery capability.
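To make the asymmetry concrete, the following minimal sketch (Python, with hypothetical names and purely illustrative sizes) models an erase unit as a set of pages that can each be programmed only once: an update cannot overwrite a programmed page in place, so it must be written to a fresh page, and the stale page becomes reclaimable only by erasing the entire erase unit.

    PAGE_SIZE_BYTES = 16 * 1024   # a typical page size, per the text above
    PAGES_PER_EU = 256            # illustrative only; real EUs vary by part

    class EraseUnit:
        def __init__(self):
            self.pages = [None] * PAGES_PER_EU    # None means erased (programmable)
            self.stale = [False] * PAGES_PER_EU

        def program(self, page_index, data):
            if len(data) > PAGE_SIZE_BYTES:
                raise ValueError("data exceeds the page size")
            if self.pages[page_index] is not None:
                # Asymmetry: a programmed page cannot be rewritten in place; an
                # update must go to a fresh page somewhere else.
                raise ValueError("page already programmed; write to a fresh page instead")
            self.pages[page_index] = data

        def release(self, page_index):
            # Mark the page stale (its data was updated or moved elsewhere); the
            # space is not reusable until the entire EU is erased.
            self.stale[page_index] = True

        def erase(self):
            # Erasure operates only on the whole EU, never on a single page.
            self.pages = [None] * PAGES_PER_EU
            self.stale = [False] * PAGES_PER_EU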
Techniques are needed for improving the performance of systems rooted in nonvolatile memory, including memory characterized by one or more of the characteristics referenced above, particularly to facilitate application of redundancy techniques to ANVM. Further, techniques are needed for improving the ability of a host to gain insight into the memory configuration, to effectively permit the host to efficiently schedule host requests, e.g., such that the host can better manage communication bandwidth across the various storage resources, and so that the host can effectively provide for greater memory system customization. The present invention addresses these needs and provides further, related advantages.
The subject matter defined by the enumerated claims may be better understood by referring to the following detailed description, which should be read in conjunction with the accompanying drawings. This description of one or more particular embodiments, set out below to enable one to build and use various implementations of the technology set forth by the claims, is not intended to limit the enumerated claims, but to exemplify their application to certain methods and devices. The description set forth below exemplifies techniques that can be practiced in one embodiment by a host, in another embodiment by a memory controller (e.g., within a single drive or across multiple drives), in another embodiment by a flash memory device (e.g., die or integrated circuit) or a drive (or storage aggregate having one or more drives), and in yet another embodiment by a host or memory controller cooperating with one or more other circuits. This disclosure also provides improved designs for a memory controller, host, memory devices, a memory system, a subsystem (such as a drive, e.g., a solid state drive or “SSD”), and associated circuitry, firmware, software and/or other processing logic. The disclosed techniques can also be implemented as instructions for fabricating an integrated circuit (e.g., as a circuit design file or as a field programmable gate array or “FPGA” configuration). While specific examples are presented, particularly in the context of flash memory, the principles described herein may also be applied to other methods, devices and systems.
This disclosure provides techniques for facilitating and/or providing additional redundancy for data stored in asymmetric non-volatile memory (“ANVM”) structures. In one embodiment, the ANVM is flash memory such as implemented by one or more chips of NAND flash memory, formed as part of a solid state drive (“SSD,” or simply “drive”) having a dedicated memory controller; the memory controller can be an integrated circuit primarily concerned with controlling the operation of one or more chips of NAND flash memory (i.e., a “memory controller integrated circuit” for NAND flash memory).
One embodiment provides a memory controller for flash memory and/or a storage drive based on such a memory controller. The memory controller supports storing redundancy across drives (e.g., in memory not managed by that memory controller) by computing error correction (EC) information for data that it writes, and it transmits that EC information off-drive, e.g., in one embodiment to the host, and in a different embodiment to a memory controller for another drive. This embodiment supports many different error correction schemes. Another embodiment provides a memory system with multiple flash memory drives, each having its own dedicated memory controller (i.e., each having one or more memory controller ICs). A memory controller performing a write of data passes information on to the next drive; this information can either include the data which is being written, or it can include EC information based on the data which is being written. The next memory controller (for the next drive) then performs a linear error correction combination to generate new error correction data; for example, the next memory controller (for the next drive) can exclusive-OR (“XOR”) the information received from the prior drive with data at a matched storage location, to compute parity, which it then passes on to the next drive. Such an embodiment provides a scheme where EC data is generated across multiple drives, in a manner not requiring direct involvement of the host for computation of EC information between drives (i.e., which is performed in this embodiment on a peer-to-peer basis). Note that a “write” of data, as used in this context, can include a move of data (e.g., without limitation, a “garbage collection” operation in which a memory controller for flash memory moves or copies data from a first location (for example, a mostly-released EU) to a new physical destination within the same drive, e.g., in an effort to free up the mostly-released EU so that it can be erased and prepared to receive new writes).
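A minimal sketch of the peer-to-peer, daisy-chained combination just described follows (Python; the dictionary-based drives, the simple XOR combination and the notion of a single “matched location” are illustrative assumptions rather than a required implementation): each drive combines the value received from the previous drive with its own data at the matched storage location and forwards the result, so that the value emerging from the final drive is parity across all of the drives.

    def xor_bytes(a, b):
        # Linear combination used in this sketch: bytewise XOR (simple parity).
        return bytes(x ^ y for x, y in zip(a, b))

    def forward_through_chain(drives, matched_location):
        """Peer-to-peer pass: each drive XORs the running value with its own data
        at the matched location and passes the result to the next drive."""
        running = None
        for drive in drives:
            local = drive[matched_location]              # data at the matched location
            running = local if running is None else xor_bytes(running, local)
        return running                                   # parity across all drives

    # Illustrative use: three drives, each holding four bytes at matched location 0;
    # the result can be stored by a final drive or forwarded to the host.
    drives = [{0: bytes([1, 2, 3, 4])}, {0: bytes([5, 6, 7, 8])}, {0: bytes([9, 10, 11, 12])}]
    parity = forward_through_chain(drives, 0)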
The embodiments introduced above facilitate a myriad of different error correction/redundancy schemes, all of which are contemplated by this disclosure as novel techniques. For example, one contemplated implementation uses two different redundancy schemes. Within each drive, a first redundancy scheme is used which provides for redundancy across multiple logical or physical structures within the drive; such a scheme can optionally benefit from teachings of the incorporated-by-reference documents, which use ‘virtual address’ and ‘address space layout’ techniques to provide for cross-structure redundancy and zones or virtual devices (as described below), permitting a wide variety of customization. A second error correction scheme is then applied on a cross-drive basis, e.g., where EC information for each of multiple drives is combined and used to produce redundancy for EC data (referred to herein as “superparity,” i.e., as a term denoting compression of, or combination of, multiple pieces of error information for respective data sets, irrespective of whether the EC information in fact involves ‘parity’); such a scheme provides for space-efficient error protection against multiple failures in any given drive. Recollecting that with ANVM (and flash memory in particular), wear-based failures tend to be on a block-by-block basis, the first error correction scheme can be applied to recover a given number of failures (e.g., a single block or segment only in some embodiments) within a given drive, without requiring extra-drive redundancy, while in the unlikely event that a greater number of blocks simultaneously fail within the given drive (e.g., two blocks or more in some embodiments), the second error correction scheme can be utilized to rebuild second error correction data for the drive having the errors (i.e., by retrieving the stored “superparity” and reverse-computing the EC information for the specific drive having the failures from stored or recomputed EC information from the other drives). It should be observed that this scheme utilizes cross-drive redundancy with extremely efficient storage of recovery data; such a scheme will be explained in further detail below.
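The “superparity” notion can be sketched as follows (Python; the use of XOR as the combining function and the per-drive EC sets shown are simplifying assumptions, since the disclosure is not limited to parity): the EC information from each drive is linearly combined into a single master value, and the EC set for a drive that has failed can later be reverse-computed from that master value together with the EC sets retrieved or recomputed from the surviving drives.

    def xor_sets(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def compute_superparity(ec_per_drive):
        # Combine ("compress") the EC information of all drives into one master value.
        length = len(next(iter(ec_per_drive.values())))
        master = bytes(length)
        for ec in ec_per_drive.values():
            master = xor_sets(master, ec)
        return master

    def rebuild_ec_for(failed_drive, superparity, ec_from_surviving_drives):
        # Reverse-compute the failed drive's EC set from the stored superparity and
        # the EC sets retrieved or recomputed from the surviving drives.
        rebuilt = superparity
        for drive, ec in ec_from_surviving_drives.items():
            if drive != failed_drive:
                rebuilt = xor_sets(rebuilt, ec)
        return rebuilt

    # Illustrative use with three drives' EC sets (four bytes each):
    ec = {"drive0": bytes([1, 1, 1, 1]), "drive1": bytes([2, 2, 2, 2]), "drive2": bytes([3, 3, 3, 3])}
    sp = compute_superparity(ec)
    surviving = {k: v for k, v in ec.items() if k != "drive1"}
    assert rebuild_ec_for("drive1", sp, surviving) == ec["drive1"]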
Note again that many error correction schemes are contemplated by this disclosure, both inside and outside a given drive, and these schemes are not limited to the use of two or more different error recovery systems in combination, as was just introduced. Intra-drive error correction schemes, inter-drive error correction schemes, and combinations of the two are each contemplated by this disclosure; embodiments are expressly contemplated where these various techniques are practiced just in memory, just in a memory controller, just in software, just in a drive, just in a host, just in other architectural elements, in multiple drives or other elements, or in any permutation or combination of these things.
Prior to proceeding to a further detailed description regarding various embodiments, it would be helpful to introduce certain additional terms that will be used in this disclosure.
Specifically contemplated implementations can feature instructions stored on non-transitory machine-readable media. Such instructional logic can be written or designed in a manner that has certain structure (architectural features) such that, when the instructions are ultimately executed, they cause the one or more general purpose machines (e.g., a processor, computer or other machine) to behave as a special purpose machine, having structure that necessarily performs described tasks on input operands in dependence on the instructions to take specific actions or otherwise produce specific outputs. “Non-transitory” machine-readable or processor-accessible “media” or “storage” as used herein means any tangible (i.e., physical) storage medium, irrespective of how data on that medium is embodied, including without limitation, random access memory, hard disk memory, EEPROM, flash, storage cards, optical memory, a disk-based memory (e.g., a hard drive, DVD or CD), server storage, volatile memory and/or other tangible mechanisms where instructions may subsequently be retrieved and used to control a machine. The media or storage can be in standalone form (e.g., a program disk or solid state device) or embodied as part of a larger mechanism, for example, a laptop computer, portable device, server, network, printer, memory drive or unit or other set of one or more devices. The instructions can be implemented in different formats, for example, as metadata that when called is effective to invoke a certain action, as Java code or scripting, as code written in a specific programming language (e.g., as C++ code), as a processor-specific instruction set, or in some other form or language; the instructions can also be executed by a single, common processor or by different (remote or collocated) processors or processor cores, depending on embodiment. Throughout this disclosure, various processes will be described, any of which can generally be implemented as instructions stored on non-transitory machine-readable media. Depending on product design, such products can be fabricated to be in saleable form, or as a preparatory step that precedes other processing or finishing steps (i.e., that will ultimately create finished products for sale, distribution, exportation or importation). Also depending on implementation, the instructions can be executed by a single computer or device and, in other cases, can be stored and/or executed on a distributed basis, e.g., using one or more servers, web clients, or application-specific devices. Each function mentioned in reference to the various FIGS. herein can be implemented as part of a combined program or as a standalone module, either stored together on a single media expression (e.g., single floppy disk) or on multiple, separate storage devices. Throughout this disclosure, various processes will be described, any of which can generally be implemented as instructional logic (e.g., as instructions stored on non-transitory machine-readable media), as hardware logic, or as a combination of these things, depending on embodiment or specific design. “Module” as used herein refers to a structure dedicated to a specific function; for example, a “first module” to perform a first specific function and a “second module” to perform a second specific function, when used in the context of instructions (e.g., computer code), refers to mutually-exclusive code sets. 
When used in the context of mechanical or electromechanical structures (e.g., an “encryption module”), the term “module” refers to a dedicated set of components which might include hardware and/or software. In all cases, the term “module” is used to refer to a specific structure for performing a function or operation that would be understood by one of ordinary skill in the art to which the subject matter pertains as a conventional structure used in the specific art (e.g., a software module or hardware module), and not as a generic placeholder or “means” for “any structure whatsoever” (e.g., “a team of oxen”) for performing a recited function (e.g., “encryption of a digital input”). “Erasure coding” as used herein refers to any process where redundancy information is generated and/or stored, such that underlying information used to generate that redundancy information can be recovered if a memory device or unit of memory is off-line or otherwise inaccessible. “Host-supplied” address means an address supplied by the host (but not necessarily assigned by the host), for example, provided in connection with a host-issued request; for example, this term encompasses an address provided with a read request which seeks specific data (e.g., the address may have been originally assigned by the memory controller, depending on embodiment). “EC information” as used herein refers to information relating to error correction values or the location or existence of error correction values; it can, for example, encompass single or multi-bit parity values or other forms of error correction codes or data and/or it can encompass an identifier, link or address that points to one or more of these values; as non-limiting examples, it can refer to a result of an XOR operation performed on two pieces of multibit data, or a notification that EC values have been stored in a location (i.e., such that the host or a memory controller for a different drive can then request or access those values). “Superparity” and/or “master error correction information” as used herein refer to redundancy information premised on a combination of multiple sets of EC values, regardless of whether those values are in fact parity values (e.g., the term refers to master data representing compressed or combined redundancy protection for any type of EC information). “RAID” as used herein refers to a redundancy scheme that is tolerant to one or more independent devices or storage structures being offline or otherwise inaccessible, by permitting recovery from other independent devices or storage structures that are not offline or inaccessible, for example, encompassing a situation where M page- or other-sized sets or groups of data are received or stored and where L storage locations (or segments, e.g., respective virtual or physical structural elements) are used to store that data (i.e., where L−M≥1) to permit recovery of each of the original M pieces of data notwithstanding that at least one (and potentially 2, 3 or more) of the L respective structural elements is offline; for example, the term RAID as used below is not limited to a disk-based scheme, and can encompass RAID4, RAID6 and other redundancy schemes.
“Redundancy information” as used herein refers to information transmitted by a memory controller to a host, another drive, or another destination outside of a drive in which the memory controller resides, for purposes of providing redundancy; for example, in one embodiment, this term can encompass mirroring newly-written data (or data to be written, copied or moved) to a different drive, which then calculates parity based on the two pieces of data; in another embodiment, this term can encompass EC information based on newly-written data (or data to be written, copied or moved), e.g., which is then transmitted to the host, a different drive, or another off-drive destination.
Having thus introduced some illustrative capabilities provided by various techniques described herein, this description will now proceed to discuss a number of illustrative embodiments in additional detail.
In one embodiment, the memory controller for flash memory receives as inputs one or more of (a) data or EC information from another drive, per function block 113, and/or (b) per numeral 115, new write data from a host, and/or a host-originated request that will result in a write. Also, the memory controller for flash memory can transmit computed EC information to a host (i.e., as represented by numeral 116), or to another drive (i.e., as optionally represented by numeral 117), optionally with an index that links/identifies matched, underlying data, storage locations and/or EC values. Revisiting the discussion earlier, to the effect that in one embodiment, multiple drives can communicate via peer-to-peer communications, numerals 113 and 117 together denote that a given drive can receive data (or an EC information input) from a first adjacent drive (i.e., as implied by box 113), it can operate on that received data (for example, by optionally computing EC information based on both the received data/EC information input and other data/EC information) to generate new EC information, and it can pass that newly generated EC information to a second adjacent drive, via conceptual path 117. Note that this daisy chaining between drives is optional for many embodiments.
In this regard, one embodiment provides for cross-drive error protection. The memory controller for a given drive, such as SSD controller 139 for drive 141, writes data into the ANVM that the particular memory controller controls. As referenced earlier, the given drive then transmits something to either the host or the other drives to provide for redundancy. In one contemplated implementation, as referenced earlier, the SSD controller 139 transmits EC information to the host 133, with the host electively either storing that EC information, transmitting it to another drive, or processing it (e.g., to derive superparity). In another contemplated implementation, the SSD controller 139 for drive 141 uses a peer-to-peer channel 145 (or system bus 143) to send a copy of its write data (or the computed EC information) to another memory controller for another SSD. That second memory controller then uses this received information to compute further EC information, for example, by linearly combining it with similar EC or data information for a matched memory location within that drive; in a relatively simple embodiment, the drive simply XORs the received information with the data found at the matched memory location, and then forwards the resulting EC value(s) on to the next drive, which performs a similar operation. Exemplifying this operation, in one implementation, each of the depicted drives XORs like-order data written to matched memory locations (e.g., the third drive XORs its data with the result of an XOR between the first two drives, and so on), with the final drive then generating a parity result that enables recovery of any given data value (from one of the drives) should one of the matched locations experience error. The final drive can either store the resultant parity (e.g., the final drive might be a fifth drive which simply stores redundancy for four other drives) or forward it to the host for storage elsewhere. The host can also perform this same type of EC combination, i.e., simply by receiving EC information returns from each drive and matching EC information together for corresponding locations. To reverse engineer a given set of data from a single drive, the host in one embodiment simply reads the data from a set of matched storage locations (save the one that experiences error), it recovers the corresponding EC information, and it XORs all of these pieces of data together to exactly recover the corrupted data. Not all embodiments discussed herein use this relatively simple, optional error correction scheme, and linear combination functions other than those based on XOR operations can also or instead be used in some embodiments.
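The recovery path just described can be sketched as follows (Python; the drive layout and the use of simple XOR parity are illustrative assumptions): the host or a controller reads the data at each matched storage location other than the failed one, retrieves the stored parity, and XORs all of these values together to reproduce the lost data.

    def xor_all(chunks):
        out = bytes(len(chunks[0]))
        for c in chunks:
            out = bytes(x ^ y for x, y in zip(out, c))
        return out

    def recover_matched_location(surviving_data, parity):
        """XOR the surviving matched-location data with the stored parity to
        reconstruct the data lost from the drive that experienced the error."""
        return xor_all(surviving_data + [parity])

    # Example: parity was computed across three drives at one matched location;
    # the second drive's copy is lost and is rebuilt from the other two plus parity.
    d0, d1, d2 = bytes([1, 2, 3]), bytes([4, 5, 6]), bytes([7, 8, 9])
    parity = xor_all([d0, d1, d2])
    assert recover_matched_location([d0, d2], parity) == d1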
In identifying matched memory locations, and generating EC information which spans drives in the manner just exemplified, a number of configuration options exist. In a simple case, where the EC information is the result of a linear combination of sets of respective data and/or associated EC information, the memory controller and/or host can be configured to identify a virtual, logical, or physical address offset or other index for each drive in order to identify matched underlying data and/or EC information. For example, where EC information is calculated on the basis of physical address, each memory controller (or the host) would simply obtain data/EC information for a given address offset (e.g., channel 1, die 3, plane 1, EU 24, page 7, etc.) and would combine this data/EC information with similar data/EC information from one or more other drives having exactly the same address offsets (e.g., channel 1, die 3, plane 1, EU 24, page 7, etc.). For logical or virtual addressing, or other forms of index, the same type of operation is performed, but on the basis of this addressing (note that in such an embodiment, when a given memory controller reports back EC information, it also advantageously reports back the corresponding logical or virtual address or other index for the associated underlying data, i.e., which permits the host, for example, to identify matched logical and/or virtual storage locations for the EC information it receives). As these statements imply, the memory controller for a given drive can optionally apply address translation (e.g., a flash translation layer or FTL) at a level below the indexing scheme. For example, as described by the incorporated-by-reference documents, in a virtual scheme, the memory controller can transparently remap a virtual structure corresponding to a portion of a host-supplied address to one or more dedicated physical structures (e.g., a virtual “block” can be mapped to one or more physical blocks in a manner transparent to the host, e.g., where each of the one or more physical blocks is associated with only one virtual block).
As will be discussed further below, address space layout (ASL) and virtual block device (VBD) techniques from the incorporated-by-reference documents can also be used to define zones or VBDs for one or more of the depicted drives. For example, a first zone in one drive can be configured to have no redundancy, while a second zone in that drive can be configured to use internally-stored EC codes only, and a third zone in that drive can be configured to use multiple redundancy schemes, including one rooted in first EC information stored on the drive and one rooted in second EC information stored off the drive; a second drive can be configured differently, for example, having only one zone matched for cross-drive redundancy purposes to the third zone in the first drive (e.g., with cross-drive error protection being applied by the first/second drives to these respective third/first zones only). The memory controllers 139 depicted in the figures can each be configured on this zone-specific basis.
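Configuration metadata for this kind of per-zone customization might be captured along the following lines (Python; the field names and the three redundancy modes shown are purely illustrative assumptions, not a required format):

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class RedundancyMode(Enum):
        NONE = "none"                              # zone stores no EC information
        INTERNAL_EC = "internal_ec"                # EC codes kept inside the drive only
        INTERNAL_PLUS_CROSS_DRIVE = "internal_plus_cross_drive"

    @dataclass
    class ZoneConfig:
        zone_id: int
        redundancy: RedundancyMode
        # For cross-drive protection, identify the peer drive/zone whose matched
        # locations participate in the same EC stripe (None when not applicable).
        peer_drive: Optional[str] = None
        peer_zone: Optional[int] = None

    # First drive: three zones with different redundancy policies, as in the text;
    # a second drive might expose only one zone, matched to zone 2 of the first.
    drive1_zones = [
        ZoneConfig(0, RedundancyMode.NONE),
        ZoneConfig(1, RedundancyMode.INTERNAL_EC),
        ZoneConfig(2, RedundancyMode.INTERNAL_PLUS_CROSS_DRIVE, peer_drive="drive2", peer_zone=0),
    ]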
The host depicted in the accompanying figures can be, for example, a storage server that services requests from one or more clients.
In some implementations, the client(s) can issue packets including file-based access protocols such as the Common Internet File System (CIFS) protocol or Network File System (NFS) protocol over TCP/IP when accessing information in the form of files. In other implementations, the client(s) can issue packets including block-based access protocols such as the Small Computer Systems Interface (SCSI) protocol encapsulated over TCP (iSCSI) and SCSI encapsulated over FC (FCP) when accessing information in the form of LUNs or blocks. Also in some implementations, the storage adapter includes input/output (IO) interface circuitry that couples to SSDs (e.g., 141) over an IO interconnect arrangement, such as a conventional high-performance Fibre Channel serial link topology, NVMe over fabric, Ethernet, InfiniBand, TCP/IP, or indeed, any desired connection protocol.
The depicted storage server manages flash memory using a log-structured copy-on-write file system, optionally in a manner that obviates the need for an extensive FTL layer on the side of the SSD and that better distributes wear, as described in commonly owned U.S. Pat. No. 9,400,749 (incorporated by reference). This is to say, in one contemplated embodiment, each flash-based drive such as drive 141 advantageously has a memory controller that implements operations where either the host or a controller specific to the given drive performs maintenance, and/or the memory controller sends requests to perform maintenance to the host, which then explicitly commands the requested maintenance in the given drive on a fractional basis, in a manner where request issuance is scheduled at a time of the host's choosing, e.g., so as to not collide with queued/upcoming data accesses and/or other operations planned for the given drive. The memory controller IC for each drive also advantageously computes EC information as referenced elsewhere herein, optionally enabling the storage server/host to provide for cross-drive redundancy. Depending on system architecture, host software manages interaction with each such memory controller. This architecture provides for host-memory-controller cooperation in managing NAND flash memory-based storage devices in direct-attached and/or network-attached storage environments. For example, each flash-based drive has a respective controller that also serves to provide information to the host regarding each subdivision of the memory managed by that controller. The storage server (in this case, the host) can manage wear distribution across multiple drives to help lessen wear to any one area of memory; for example, in the context of wear-aware writes, the storage server can collect wear metrics for all flash memory managed as well as for other types of nonvolatile memory, if present, and can redistribute cold data using explicit requests to each of the drives that call for a given drive to transparently move data at a specified address (i.e., without sending that data to the host) or to transfer that data between drives (e.g., optionally using peer-to-peer communication). The storage server can combine bitmaps from multiple drives and can, if desired, allocate new writes to a single drive only if needed to better distribute wear or, conversely, can provide for bins (i.e., distinct SSDs) with differently managed wear parameters as different endurance groups. Advantageously, the host can also electively direct writes of data based on policy, for example, in a manner that avoids fragmentation or that groups certain types of data together based on read-write characteristics. In one embodiment, each SSD memory controller 139 is rooted in a configurable architecture that depends on host capabilities, policies, or other considerations. For example, in such an architecture, if the host processor does not support host-assignment of addresses for new writes, the host configures each SSD memory controller to perform this function and to report back assigned logical, virtual (ideal) or physical addresses as described earlier.
In the case of the example presented in the figures, it should be assumed that each drive applies a first error correction scheme internally (i.e., intra-drive) and that a second error correction scheme is applied on a cross-drive basis; several variations of this arrangement are possible.
In a first example, the “second” error correction scheme is different than the first error correction scheme (used intra-drive) and permits recovery if a second block of data is lost. For example, using a convention that will be repeated below, in a manner akin to RAID6, the first error correction scheme generates EC information “P” (which is stored on-drive in this example) while the second error correction scheme generates EC information “Q” (which is stored off-drive in this example). The memory system thus has L segments of information which are used to recover M segments of the underlying data stripe, where L−M=2 in this example; thus, in this example, because interrelated error correction algorithms are used (e.g., dependent on each other), loss of any two blocks or segments still permits complete recovery of underlying data. Note that schemes where L−M is greater than 2 are also contemplated by this disclosure. Generally speaking, because random failures in practice tend to occur at different times, for many memory implementations, a scheme where L−M=2 is more than adequate in terms of providing on-system failure recovery.
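By way of a concrete, purely illustrative sketch of an L−M=2 arrangement of this general kind, the following Python computes “P” as simple parity and “Q” as a weighted sum over GF(2^8); the field polynomial (0x11D), the generator (2) and the particular recovery case shown are assumptions made for illustration and are not a statement of the specific algorithms contemplated by this disclosure.

    def gf_mul(a, b):
        # Multiply in GF(2^8), using the commonly chosen polynomial x^8+x^4+x^3+x^2+1 (0x11D).
        p = 0
        for _ in range(8):
            if b & 1:
                p ^= a
            carry = a & 0x80
            a = (a << 1) & 0xFF
            if carry:
                a ^= 0x1D
            b >>= 1
        return p

    def gf_pow(a, n):
        r = 1
        for _ in range(n):
            r = gf_mul(r, a)
        return r

    def gf_inv(a):
        return gf_pow(a, 254)    # a^(2^8 - 2) is the multiplicative inverse of a

    def compute_p_q(segments):
        """P = bytewise XOR of the data segments; Q = sum of g^i * D_i over GF(2^8), g = 2."""
        length = len(segments[0])
        p, q = bytearray(length), bytearray(length)
        for i, seg in enumerate(segments):
            coeff = gf_pow(2, i)
            for j, byte in enumerate(seg):
                p[j] ^= byte
                q[j] ^= gf_mul(coeff, byte)
        return bytes(p), bytes(q)

    def recover_with_q(segments, missing_index, q):
        """Recover data segment D_x when both D_x and P are unavailable, using Q."""
        acc = bytearray(q)
        for i, seg in enumerate(segments):
            if i == missing_index or seg is None:
                continue
            coeff = gf_pow(2, i)
            for j, byte in enumerate(seg):
                acc[j] ^= gf_mul(coeff, byte)
        inv = gf_inv(gf_pow(2, missing_index))
        return bytes(gf_mul(inv, b) for b in acc)

    # Example: four data segments; segment 2 and P are both lost, and segment 2 is rebuilt from Q.
    data = [bytes([i, i + 1, i + 2]) for i in range(4)]
    p, q = compute_p_q(data)
    damaged = list(data)
    damaged[2] = None
    assert recover_with_q(damaged, 2, q) == data[2]

Consistent with the convention above, the first scheme (“P”) could be retained on-drive while the second (“Q”) is stored off-drive.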
In a second example, the “second” error correction scheme is used to protect the first error correction information (i.e., the EC information generated using the “first” error correction algorithm). For example, if EC information obtained on the basis of a “data stripe” is written on a round-robin basis into memory 207, and a structure is lost which contains both data and EC information, the second error correction scheme may be used to recover the lost EC information (and thus, the lost data as well). For example, given the assumption of a round-robin placement of the first EC information, it is statistically unlikely that a random structural failure will impact both data and its associated “on-drive” EC information. For such an embodiment, the EC information (as contrasted with the underlying data) is combined across drives (i.e., by the host, or by another drive or drives) using the scheme described earlier, i.e., by linearly combining a “stripe” of EC information across drives, as in the example presented earlier.
It is once again noted that these various error correction schemes are offered as examples only, and that many underlying error protection schemes and combinations of such schemes can be employed by those having ordinary skill in the art. As will be appreciated, by storing EC information for ANVM off-drive, in the manner described, one may provide for enhanced security and reliability for ANVM. Furthermore, such techniques can be facilitated and/or enhanced using one or more of (a) zone/VBD-specific customization, (b) indexing of EC information using virtual, logical and/or physical addressing, (c) virtualized addressing techniques, and (d) a cooperative and/or expositive memory controller, each as described herein and in the incorporated-by-reference documents.
The memory controller, optionally a memory controller integrated circuit (IC) to control flash memory, automatically computes EC information in connection with the data being moved or written per numeral 307, as was referenced earlier. Note that in some detailed embodiments, the memory controller is programmable, e.g., it has a mode register, and EC computation and/or associated options (e.g., EC information type or format) can be specified by a writable value, or can be turned on/off according to a programmed register value, optionally for an entire drive or on the basis of specific zones or VBDs. Per numeral 313, in the depicted embodiment, the memory controller at some point optionally alerts the host; such an alert, per numeral 315, either conveys the EC information or a derivative directly to the host on an unsolicited basis, or it conveys some type of information which enables the host to later read the EC information at a time of the host's choosing or to trigger a peer-to-peer transfer operation for EC values. For example, the alert can convey an address (either corresponding to the memory controller's internal RAM or in ANVM) which the host can then use to directly read the EC values; in some embodiments, as noted elsewhere herein, the host can also direct the memory controller to directly transmit the EC values to another location, for example, directly to another storage drive, by an explicit command which specifies both the source EC values to be transmitted as well as a destination address (e.g., a specific drive, or a specific logical, virtual or physical address in a specific drive).
Numerals 309-314 indicate several further optional operations/features. Per numeral 309, in one embodiment, EC information as it is computed by the memory controller can be stored in memory accessible to the memory controller (that is within the drive, such as in RAM internal to the memory controller IC, or in NVM/ANVM within the drive); in one contemplated embodiment, this EC information is stored in the ANVM memory managed by the memory controller just as if it were data written by the host (or alternatively, in a protected area reserved for EC information), with the host optionally reading that information using a standard read request. Per numeral 310, in one embodiment, the memory controller builds running EC information, for example, conducting incremental operations that linearly modify existing EC information as the memory controller fills a structural unit (virtual or physical); as noted earlier, in one embodiment, simple parity is used as a linear EC process, e.g., the bit values of data at multiple storage locations are simply XORed together or the bit values of data are XORed with existing EC information. Still further, per numeral 311, it is also possible for the memory controller to defer computation of EC information to a later time, e.g., performing the calculation in a manner covering multiple writes, such as only once a zone is filled, or responsive to an express host request to perform this computation. As indicated by numerals 312 and 313, the EC information can be sent to the host (or another drive) intermittently (e.g., at each write, at each time when a block is filled, each time a zone is filled, after a data burst, or at some other milestone), and the memory controller can either directly transmit the EC values to the host or provide the host with some type of message that permits the host to read or otherwise direct the transfer of that information at a time of the host's choosing (e.g., in a manner interleaved by the host amidst upcoming read and write requests, so as to not collide with other queued host requests that are to be issued to the same drive or a different drive). Per numeral 314, in contemplated versions of this embodiment, the ANVM can be flash memory, such that the memory controller is a memory controller tasked with controlling flash memory, and the EC Information can take the form of parity information; optionally also, the memory controller can be in the form of an integrated circuit (“IC”) (as alluded to earlier).
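The “running” EC computation of numeral 310, with the deferred/intermittent transmission of numerals 312-313, might take a form along these lines (Python sketch; the class, the fill threshold and the transmission callback are hypothetical):

    class RunningParityAccumulator:
        """Incrementally XORs each written page into a running parity buffer and
        defers transmission until a structural unit (e.g., a zone or EU) fills."""

        def __init__(self, page_size, pages_per_unit, send_to_host):
            self.page_size = page_size
            self.pages_per_unit = pages_per_unit
            self.send_to_host = send_to_host       # e.g., queues an alert or an EC transfer
            self.parity = bytearray(page_size)
            self.pages_written = 0

        def on_page_write(self, page_data):
            for j, byte in enumerate(page_data):
                self.parity[j] ^= byte             # linear, incremental update
            self.pages_written += 1
            if self.pages_written == self.pages_per_unit:
                self.flush()

        def flush(self):
            # Transmit (or make readable) the accumulated EC information, then reset.
            self.send_to_host(bytes(self.parity))
            self.parity = bytearray(self.page_size)
            self.pages_written = 0

    # Example: accumulate parity across a four-page unit and hand it to the host once full.
    collected = []
    acc = RunningParityAccumulator(page_size=4, pages_per_unit=4, send_to_host=collected.append)
    for _ in range(4):
        acc.on_page_write(bytes([1, 2, 3, 4]))
    assert collected and collected[0] == bytes(4)   # XOR of four identical pages is all zeroes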
The memory controller then performs the operation in question. As indicated by numeral 317, in one embodiment, a move or copy operation can be effectuated as a delegated move/copy operation, i.e., responsive to an express host request, data is retrieved by the memory controller from a first location (e.g., a first physical location) and is rewritten to a second location (e.g., a second physical location) without sending the data to the host. This differs from most conventional host-involved operations, where a host typically reads data via a first request (e.g., via a read request to a first memory controller for a first drive) and then writes data via an independent, second write request issued to a memory controller (e.g., for a second drive). Note that there are several prominent examples of move/copy operations, including without limitation the following: (1) a garbage collection operation, that is, where a memory controller wishes to reset (erase) an erase unit of ANVM so that that erase unit can again receive programming, and where, to preserve data, data which is still valid (i.e., still in active use in servicing reads) must be moved out of that erase unit to a new physical location; (2) a cold data relocation operation, that is, where some erase units are experiencing disproportionate wear because other erase units have data that is never or rarely modified (i.e., these latter erase units have tended to not be garbage collected because a large percentage of their pages remain in active use); by detecting cold data and forcing a move of that data, the memory controller can ensure more even wear for the memory it manages and can avoid concentrating wear in a subset of memory space; (3) an operation where data is modified and/or partly overwritten (e.g., a memory controller is instructed to overwrite a logical address, or performs some type of atomic operation on existing data, or performs an operation where only a small portion of data is being replaced), and where the memory controller must then find a new physical location; and (4) an operation where, for a given structure (e.g., die, zone, VBD, etc.), the memory controller is swapping in a reserved structure to take the place of actively used space; other examples are also possible. As discussed in the incorporated-by-reference documents and elsewhere herein, it is expressly contemplated that these activities can be fractionally managed by the host, depending on embodiment, with the host initiating address-delimited (or otherwise specified) operations, page-by-page or block-by-block, by express request; such an operation permits the host to break up maintenance and/or schedule bits of maintenance at a time of the host's choosing, thereby avoiding conventional issues where a host seeking to read or write data finds a memory controller “offline” (i.e., conducting drive-wide maintenance manifested as an extended sequence of garbage collection operations). While these operations in many embodiments are effectuated responsive to a single, explicit host request, in other embodiments these operations can be performed sua sponte and transparently by the memory controller (e.g., the memory controller acts transparently to the host, for example, to detect and relocate cold data).
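A delegated, on-drive move of the kind described in example (1) above can be sketched as follows (Python; the ToyDrive stand-in and its fields are hypothetical and exist only to make the example self-contained): the data is read and re-programmed by the memory controller without ever crossing the host interface, the data-specific metadata follows the data, and the source page is simply released so that its erase unit can later be reclaimed.

    class ToyDrive:
        """Minimal stand-in for a drive's internals (hypothetical; for illustration only)."""
        def __init__(self):
            self.physical = {}        # physical address -> stored data
            self.data_metadata = {}   # physical address -> metadata that travels with the data
            self.released = set()     # pages marked stale, awaiting erase of their EU

    def delegated_copy(drive, src_physical, dst_physical):
        """Host-requested move/copy performed entirely within the drive: data is read
        and re-programmed by the memory controller without being sent to the host."""
        data = drive.physical[src_physical]                    # stays inside the drive
        drive.physical[dst_physical] = data                    # program the new location
        drive.data_metadata[dst_physical] = drive.data_metadata.pop(src_physical, None)
        drive.released.add(src_physical)                       # stale copy remains until the EU is erased
        return data                                            # available for on-drive EC recomputation

    # Example: move still-valid data out of a mostly-released erase unit ("garbage collection").
    d = ToyDrive()
    d.physical[("eu5", 3)] = b"valid page"
    d.data_metadata[("eu5", 3)] = {"lba": 0x1234}
    delegated_copy(d, ("eu5", 3), ("eu9", 0))
    assert ("eu5", 3) in d.released and d.data_metadata[("eu9", 0)]["lba"] == 0x1234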
In other embodiments, per numeral 319, a second request is received from the host (e.g., the memory controller performs some type of processing pursuant to a first request, stores a result in internal RAM, and is triggered by a second host request to write or commit the processing result to memory, to send information to the host or to send information to another drive).
In one embodiment, the host uses received EC information sets for multiple drives to calculate master error recovery information (i.e., “superparity” as defined earlier). For example, as indicated by numeral 327, the host can linearly combine (e.g., XOR) matched sets of EC information received from the respective drives and store the resulting superparity, which typically occupies far less space than would separate retention of each drive's EC information.
The memory controller tracks subdivision-specific metadata using internal storage 411. In one embodiment, this storage can be volatile memory such as synchronous random access memory (SRAM) or other internal RAM; in another embodiment, this storage can be non-volatile memory, for example an internal flash array. This same storage can be used for the memory controller's various data processing operations, including temporary storage of either write or move data or of EC information codes/parity information. As denoted by reference numeral 413, the internal storage retains information for each subdivision of the memory governed by the memory controller, in this case, for one or more logical, virtual or physical subdivisions of the memory 407. In embodiments where the memory 407 is a NAND flash memory, the storage retains information for each VBD, channel, die, EU, and physical page of the flash memory (e.g., VBDs 1-i, channels 1-j per VBD, dies 1-k per channel, EUs 1-m per die, and pages 1-n per EU, as variously indicated by reference numerals 412-416) for all flash memory managed by that memory controller; these numbers i, j, k, m, n do not have to be homogeneous throughout the flash memory, e.g., one VBD can span 4 channels while another can span 1, and similarly, the number of dies per channel, EUs per die and pages per EU can vary throughout the memory. For example, depending on manufacturer and design, there can be 128-256 pages per EU, with each EU corresponding to a substrate well, and each page corresponding to an independently controlled wordline for memory cells tied to that substrate well. The data tracked for each subdivision can be of two types, as denoted by numerals 417a and 417b, including information regarding the physical state of the (virtual or physical) hierarchical elements of the memory, for example, wear of a memory location, whether constituent elements are erased (rendered writeable) or have been released, and the need for various types of maintenance (417a), and information regarding the data stored in that location, for example, logical address, back references, data age, hot/cold status, and EC information/parity (417b); when data is moved between locations, the data metadata (417b) associated with the data can optionally be moved to or otherwise associated with a new table location to correspond to the new destination, whereas the state metadata (417a) describing the state of the memory location itself stays associated with the old memory location, as it describes the state of that particular location. Note that in other embodiments, the various metadata (417a/417b) does not move, but is rather stored in two tables, with the logical-to-physical (L2P) mapping of one of those tables being changed. The memory controller also has logic 418 that performs various functions, e.g., it is operable to send to a host either some or all of the “raw” information retained in the storage 411, or derived or processed information based on that storage 411. This logic for example can include circuitry within the memory controller adapted to respond to host requests seeking specific metadata; alternatively, this logic can also include circuitry that applies pertinent filters or comparisons and that notifies the host when a tracked metric meets an assigned threshold. This information, or an alert representing a particular condition, can be transmitted to the host via the at least one first interface 409a/b, via a dedicated connection or via a backplane connection.
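The two classes of tracked metadata (417a/417b) might be organized along the following lines (Python sketch; the field names are illustrative only and do not correspond to a required layout):

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class StateMetadata:              # akin to 417a: describes the memory location itself
        wear: int = 0
        erased: bool = True
        released: bool = False
        needs_maintenance: bool = False

    @dataclass
    class DataMetadata:               # akin to 417b: describes the data stored at the location
        logical_address: Optional[int] = None
        back_reference: Optional[str] = None
        age: int = 0
        hot: bool = False
        ec_info: Optional[bytes] = None

    @dataclass
    class PageRecord:
        state: StateMetadata = field(default_factory=StateMetadata)
        data: Optional[DataMetadata] = None

    def move_data(table, src, dst):
        # When data moves, its data metadata moves with it; the state metadata stays
        # with the physical location it describes.
        table[dst].data = table[src].data
        table[src].data = None

    # Example: moving data from page "A" to page "B" carries the data metadata along,
    # while each page keeps its own state metadata (e.g., wear).
    table = {"A": PageRecord(), "B": PageRecord()}
    table["A"].data = DataMetadata(logical_address=0x42)
    move_data(table, "A", "B")
    assert table["B"].data.logical_address == 0x42 and table["A"].data is None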
As will be shown below, the logic 418 can also comprise circuitry and/or instructional logic to assist with offloaded functions from the host, and perform various other functions cooperatively, as will be discussed below. The logic 418 also performs functions such as address translation (e.g., at any one or more structural tiers within the memory) and write and read (i.e., data access) control and/or address assignment and various maintenance operations, and it sends commands to memory via a second interface 419 in order to accomplish these various functions.
Several configurations are also represented by the depicted embodiment, as discussed below.
It was noted earlier that in some embodiments, rather than immediately sending redundancy information to the host (or other memory), the memory controller can hold the calculated redundancy information for future processing, and/or can calculate the redundancy information on a delayed basis. There are various circumstances under which this can be quite useful, e.g., for some EC information schemes, a delayed processing and/or transmission architecture permits the memory controller to save space, command overhead and/or processing cycles by accumulating redundancy information for many writes prior to sending the information to the host or another drive. In addition, in systems in which there are many zones, VBDs, drives or available alternative locations in which to store data, this permits the host to flexibly decide at a later time where redundancy information should be placed; as a non-limiting example, if there are alternate destination choices, one of which has an upcoming write or read operation scheduled by the host, the host can select a different one of the alternate destination choices, so that both operations can be performed concurrently (i.e., without waiting for verified, time-consuming programming of data in a given array, as is conventional for flash memory). To perform this destination selection, as but one example, the host can issue a second request to the memory controller with an identifier for the EC information operand (e.g., if the memory controller supports multiple such operands in flight) and a destination address to which the memory controller is to directly transmit the EC information in question. This can also be done in the reverse, e.g., in another embodiment, the memory controller transmits the EC values of its own accord (e.g., via the command links or peer-to-peer connections referenced earlier), with the destination being a matter of prior configuration rather than of a host request issued at the time of transfer.
For the depicted embodiment, address translation can optionally be limited in scope, for example, maintained on a per-die basis or at some other selected level of the memory's structural hierarchy.
To provide another example of limited address translation, where structures can be remapped at one or more specific tiers or levels of a flash memory hierarchy, the memory controller can be configured to identify a block error and to transparently remap the subject data over to reserved memory space. Because such reassignment might affect only a very small portion of data written to memory, the memory controller can advantageously keep track of this reassignment using either the pertinent FTL table or metadata 463, depending on embodiment. For example, future reads specifying the remapped erase unit, page or other structure are intercepted by the memory controller logic using locally-stored metadata 463 and redirected to the proper physical location in memory for defective blocks. This scheme presents yet another option that permits a memory controller to be freed from having to implement extensive search trees to find physical locations based on supplied logical addresses, i.e., the memory controller need only maintain a per-die FTL (or FTL for some other structural level in the memory's structural hierarchy) and track defective memory reassignments, which ultimately become stale as the memory controller progresses through erase operations, garbage collection and updates of data (the latter being directly written to new pages or erase units). Note that such address translation can be kept very simple if the memory controller simply allocates remapped space to a reserved erase unit or other structure using a like assignment, or conversely, remaps a first physical page in a given erase unit to another physical page in the same erase unit.
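The limited translation just described might reduce to a lookup of roughly the following form (Python sketch; the table names, the (die, block, page) address tuple and the example numbers are assumptions for illustration): a small defect-remap table is consulted first, and only a per-die FTL (rather than a full drive-wide FTL with complex search trees) is needed beneath it.

    def resolve_physical(address, defect_remap, per_die_ftl):
        """Resolve a host-supplied (or idealized) address to a physical location.
        A sparse defect-remap table is checked first, so that reads aimed at a
        structure that was transparently remapped get redirected; only a per-die
        FTL is maintained below that."""
        die, block, page = address
        remapped_block = defect_remap.get((die, block), block)   # usually empty/sparse
        physical_block = per_die_ftl[die].get(remapped_block, remapped_block)
        return (die, physical_block, page)

    # Example: block 24 on die 3 went bad and was remapped to reserved block 200.
    defect_remap = {(3, 24): 200}
    per_die_ftl = {3: {}}
    assert resolve_physical((3, 24, 7), defect_remap, per_die_ftl) == (3, 200, 7)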
The command processing logic 459 receives commands from the host directed to general configuration of memory operations and for queries. Generally speaking, this logic manages and updates the metadata 454 and runs queries on the metadata, returning information to the host as appropriate via the host interface logic 455. The returns to the host can be immediate returns to synchronous commands and immediate or later responses (or alerts) to asynchronous commands. Exemplifying some command options, the command logic can (a) serve information up to the host drawn from metadata 263 for use in wear aware writes, (b) assist with wear leveling (WL), garbage collection (GC), data management (DM), hot/cold management, (c) service queries for EC information and/or EC information computation or transfer, or (d) assist with servicing of other functions in memory.
An exemplary memory controller can assume varying levels of host support in a manner that can be customized to any specific memory system design. That is, memory controller 441 possesses dedicated logic infrastructure to perform WL, GC, DM and hot/cold specific functions (465, 466, 467 and 469, respectively), each of which can be tailored to a specific level of interaction with the host pertinent to the specific implementation. Similarly, the memory controller 451 also possesses logic to perform erasure coding as part of logic 459 (e.g., as part of data management logic 467). Depending on the desired level of interaction, the memory controller 451 helps avoid the need for remote storage and retrieval of large address translation tables and the use of complex search trees, e.g., address translation can be performed using a greatly simplified address translation table or omitted in the memory controller entirely. In addition, the configured level of cooperation can advantageously permit a host to directly assume scheduling of many flash management functions that might interfere with (i.e., compete with) host-directed writes, such as garbage collection, data relocation, wear leveling and so forth. That is to say, an architecture will be described below that permits a memory controller to serve sophisticated information to the host to assist with this scheduling. This, combined with less FTL overhead, provides for faster, more consistent flash response, and facilitates multiple-drive storage aggregates based on solid state (flash) drives (SSDs) as well as mixed or heterogeneous systems that combine SSDs with other memory types.
Note that this is an example only, e.g., the architecture described herein can optionally also support a traditional FTL design, or memory controller management of complex functions.
To assist with host scheduling of flash management tasks, the memory controller can have firmware or hardware logic (or both) dedicated to specific types of host commands and host queries. In the embodiment described here, this logic is subdivided by function, for example, into wear leveling, garbage collection, data management and hot/cold data management logic, as discussed below.
For both embodiments that use wear-aware writes as well as those that do not, the memory controller can include wear leveling logic 465. That is, to account for a limited number of flash memory P/E cycles (typically on the order of tens to hundreds of thousands of cycles for NAND flash), the logic on board the memory controller can be designed to track wear as part of metadata 454 and to provide this information to the host. If over time, certain units of memory are determined from erase counts to represent disproportionately high or low wear relative to overall memory, wear leveling can then be performed. Note that for embodiments where wear-aware writes are used, wear leveling can be highly localized, i.e., performed as a data relocation option simply to redistribute cold data. The memory controller 451 can generate alerts when predetermined wear thresholds are reached, and can otherwise perform low level queries relating to wear leveling. In support of the techniques presented by this disclosure, the wear accounting logic 469 can keep a changing-list of erase units, ranked in order of coldest data, least wear, greatest wear or in another manner. In one embodiment, this logic can be prompted via an explicit host command to synchronously compile such a list or to asynchronously notify the host of erase unit identity any time a wear metric (e.g., an erase count) exceeds a programmably-defined value. Then, when and as wear leveling is scheduled by the host, the host issues a command to the memory controller to relocate cold data and erase the old space (e.g., using relocation logic 470), thereby redistributing that space into a pool of available space used for active writes (and potentially more frequently-cycled data). Note that in an embodiment where the host directly addresses physical or idealized space and performs wear-aware address assignment, distribution of wear can be inherently minimized as part of the write process. However, disproportionate wear can still occur for data that is held for a long time and which is therefore deemed “cold;” that is, cold data can keep erase units out of circulation while other erase units are more frequently recycled. The memory controller architecture presented in this embodiment supports memory controller cooperation with wear management through the use of “limited” data relocation and wear leveling processes (e.g., “fractional operations” directed only to specific address ranges within flash) as well as (if pertinent to the implementation), the scheduling and management of more extensive wear leveling, e.g., for entire flash devices or across multiple flash devices or drives.
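The ranked lists and threshold alerts attributed above to the wear accounting logic 469 might be sketched as follows (Python; the class, the threshold value and the alert callback are hypothetical):

    import heapq

    class WearAccounting:
        def __init__(self, alert_host, wear_threshold):
            self.erase_counts = {}          # erase-unit identifier -> erase count
            self.alert_host = alert_host    # asynchronous notification path to the host
            self.wear_threshold = wear_threshold

        def on_erase(self, eu_id):
            count = self.erase_counts.get(eu_id, 0) + 1
            self.erase_counts[eu_id] = count
            if count >= self.wear_threshold:
                # Asynchronous alert: an erase unit crossed the programmed wear metric.
                self.alert_host(eu_id, count)

        def most_worn(self, n):
            # Synchronous query: the n erase units with the greatest wear.
            return heapq.nlargest(n, self.erase_counts.items(), key=lambda kv: kv[1])

        def least_worn(self, n):
            return heapq.nsmallest(n, self.erase_counts.items(), key=lambda kv: kv[1])

    # Example: the alert fires when an erase unit reaches the programmed threshold.
    alerts = []
    wa = WearAccounting(alert_host=lambda eu, count: alerts.append((eu, count)), wear_threshold=3)
    for _ in range(3):
        wa.on_erase("die0/eu17")
    assert alerts == [("die0/eu17", 3)]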
A copy-on-write process can result in retaining old pages in flash memory that are stale. This is because, in the case of NAND flash memory, a given erase unit can have other pages that are still in use, and the old page location typically cannot be reused until the entire associated erase unit is recycled. Over time, substantial portions of flash memory can be locked up simply because a small fraction of space in many respective erase units is still in use. This situation can occur whether the host or the memory controller performs address translation. To address this, the memory controller of this embodiment tracks the release of pages and the utilization of each erase unit, so that erase units which are mostly stale can be identified as candidates for garbage collection, as discussed next.
Once again, providing this structure in a memory controller is optional. In other embodiments, the host can command data relocation, wear leveling and so forth, and can perform the accounting necessary to such purposes. In still other embodiments, as just referenced, this is done by the memory controller, or on a joint basis by the memory controller for a given drive and the host, working together.
In an embodiment where the host cooperates with the garbage collection task, the host can query the memory controller using a command, with processing of the command performed in cooperation with the release accounting logic 471. In more detailed embodiments, the release accounting logic can be designed to perform low level inquiries, for example, to return a list of EUs where page utilization falls below a specific threshold (e.g., 50%). This type of function can also be managed as an asynchronous task, e.g., the host can request that the memory controller alert the host if at any time an erase unit that has been written to (or that has just had a page released) experiences less than a threshold level of page utilization; in this regard, the release accounting logic 471 tracks explicit page release with each command information update, and can perform any processing necessary to alert the host in response to any asynchronous queries. The release accounting logic 471 also has circuitry and/or firmware that performs other forms of processing, for example, optionally providing a list of “the 10 best” candidates for garbage collection in order of page (under)utilization. In another embodiment, some or all of the data relocation functions can be managed by the memory controller, for example, with relocation logic 472 being delegated specific tasks by the host (such as the identification of erase units to the host for relocation of data, or relocation of data in response to a host-specified target memory address). Once relocation has been performed, with respective metadata updated and associated physical pages released, the full erase unit is reclaimable. In one embodiment, this is performed by the host, which issues an explicit EraseBlock request for an address-specified erase unit; logic 459 processes this command and, once the command is completed, returns the freed erase unit to a pool of available erase units for future data allocation.
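The release accounting query described above (e.g., “list the erase units whose page utilization has fallen below 50%,” or “the 10 best candidates”) could be sketched as follows (Python; the table format, threshold and ranking are illustrative assumptions):

    def gc_candidates(eu_table, pages_per_eu, max_utilization=0.5, limit=10):
        """Return up to 'limit' erase units whose page utilization (fraction of pages
        still in active use) is below 'max_utilization', ranked so that the least
        utilized (best) garbage collection candidates come first."""
        scored = []
        for eu_id, info in eu_table.items():
            utilization = info["valid_pages"] / pages_per_eu
            if utilization < max_utilization:
                scored.append((utilization, eu_id))
        scored.sort()
        return [eu_id for _, eu_id in scored[:limit]]

    # Example: EU 7 has only 12 of 256 pages still valid, EU 9 has 200 still valid.
    eus = {7: {"valid_pages": 12}, 9: {"valid_pages": 200}}
    assert gc_candidates(eus, pages_per_eu=256) == [7]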
Data management logic 467 can support optional functions such as idealized addressing, address assignment, zone and/or VBD definition and management, and so forth. Module 473 performs functions associated with data integrity and managing structures (virtual, logical or physical) in the memory. Module 474 in this embodiment handles EC information computation, retention and rebuild/recovery of EC information or underlying data, and/or notification or relay of EC information, as appropriate.
Hot/cold data management can be performed by tracking data-specific metadata for the various structural elements of memory, in effect providing a form of wear leveling. In one embodiment, the memory controller optionally uses processes that periodically look for “old” data (that is, logical addresses that have not been written to for a long time, or that alternatively have a low read frequency), and remaps the stored data to a different memory location to help distribute lifetime-degrading memory rewrites throughout the memory managed by the memory controller (or throughout a particular VBD), using processes similar to those described earlier; for example, the memory controller sends a notification to the host, the host uses that notification to schedule a data move request and then, at the scheduled time, issues that request to the memory controller to move the data from LBA x to LBA y, optionally all without the data ever having been sent to the host. Consolidation logic 475 can identify hot data, cold data, or other information (e.g., EUs having disparate hot/cold data characteristics), with relocation logic 476 being used to respond to host commands and to update stored metadata (in the manner previously described).
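As another non-limiting illustration, a hot/cold scan of the general type just described might be sketched as follows; the per-address metadata layout, the use of a last-write timestamp as the age metric, and the function name find_cold_data are assumptions made for this example only.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical per-address metadata of the kind consolidation logic might track. */
    typedef struct {
        uint64_t lba;         /* logical (or idealized) block address */
        uint64_t last_write;  /* timestamp or monotonically increasing write counter */
    } lba_meta_t;

    /* Addresses not written for longer than 'age_limit' are deemed "cold" and are
     * candidates for relocation (e.g., the controller notifies the host, which then
     * schedules a move from LBA x to LBA y without the data passing through the host). */
    size_t find_cold_data(const lba_meta_t *meta, size_t n, uint64_t now,
                          uint64_t age_limit, uint64_t *cold_out, size_t max_out) {
        size_t count = 0;
        for (size_t i = 0; i < n && count < max_out; i++) {
            if (now - meta[i].last_write > age_limit)
                cold_out[count++] = meta[i].lba;
        }
        return count;
    }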
As operations are performed in memory, whether as part of a management operation (such as data relocation) or in servicing write or read requests, each operation is relayed as one or more associated commands from the memory controller to a selected memory device, and IO scheduling logic 461 detects completion of each such command. Pertinent information is added by metadata management logic 457 to the stored metadata 454, and the host is then signaled with any completion codes and/or information returns as necessary. For example, if a data relocation operation has been performed, the metadata 454 can be updated with new information for both source and target blocks of memory, and new EC information can be generated, stored and/or sent to the host or another drive, as pertinent to the embodiment.
Reflecting on the structures shown in
More specifically,
For purposes of
In one example, data that is to be written into location 512 might be new data supplied by the host as a replacement for data at the host-supplied (e.g., logical) address; in this case, a number of options exist, depending on implementation. In one embodiment, discussed further below, the memory controller performs an XOR operation between the new and old data, keeps the old data in location 511a and writes the new data to location 512. For example, the memory controller can prepare a new physical space corresponding to the zone transparently to the host, with the host seeing a seemingly unchanging frontier. When and as the new physical space is filled, the memory controller simply swaps the new space in for the old and erases the old physical space, permitting recycling of that space in a manner that is transparent from the host's perspective. Such an embodiment is described below in connection with
Regardless of specific implementation, some data is to be placed by the memory controller into new physical location 512, and the memory controller in this embodiment should be understood to compute new EC information such as parity 515a; such an operation, depending on implementation, might require the memory controller to read location 511a to obtain the old data and to retrieve the old parity information 513a, as just exemplified, but such is not required for all embodiments or all types of operations. New parity data 515a can be sent to the host or another drive (e.g., either immediately or on a deferred basis, as previously discussed). As a further alternative, the memory controller can be configured to compute the new parity 515a in a manner that combines data from new neighboring storage locations, such as those seen as cross-hatched at the right side of drive 505a in
The embodiment of
Also, note that in some embodiments, drive 505n+1 can be used to store host-provided write data in addition to EC information/parity information. For example, the role of storing redundancy data can be cycled among drives/SSDs 505a . . . 505n+1 on a round-robin basis, such that for one “data stripe” or set of associated storage locations, drive 505n+1 is used to store parity/superparity, while for another data stripe or set of associated storage locations, write data is stored in drives 505n-505n+1 and EC information/parity information is stored in drive 505a (e.g., in this case, operation is as described above except that the various drives are also exchanged or rotated on a round-robin basis in terms of their roles from a system perspective).
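For a linear (XOR) parity scheme of the type referenced above, the underlying arithmetic and the round-robin rotation of the parity role can be sketched as follows; the function names and the particular rotation formula are assumptions made for illustration only and are not mandated by the embodiments above.

    #include <stdint.h>
    #include <stddef.h>

    /* Incremental parity update for a linear (XOR) scheme:
     * new_parity = old_parity XOR old_data XOR new_data.
     * Computed within the drive; only the resulting parity need be sent to the
     * host or relayed to a peer drive holding parity/superparity. */
    void xor_update_parity(uint8_t *parity, const uint8_t *old_data,
                           const uint8_t *new_data, size_t len) {
        for (size_t i = 0; i < len; i++)
            parity[i] ^= old_data[i] ^ new_data[i];
    }

    /* Round-robin rotation of the parity role among num_drives drives: for one
     * stripe a given drive holds parity/superparity while the others hold write
     * data, and the role advances by one drive per stripe (RAID-5-style rotation). */
    unsigned parity_drive_for_stripe(unsigned stripe, unsigned num_drives) {
        return (num_drives - 1u) - (stripe % num_drives);
    }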
The EC Information/parity operations in the case of
Reflecting on some of the principles/benefits featured by the embodiment of
It has been mentioned for some embodiments discussed above that a memory controller for a given drive or SSD can return different types of information to the host or to another drive or destination, including, without limitation, the host query and information returns extensively discussed in U.S. Pat. No. 9,400,749.
Having thus introduced these techniques, this disclosure will now proceed to discuss a design associated with a pseudo-expositive memory controller, used in connection with some of the embodiments herein, with reference to
A memory controller in one embodiment can subdivide an incoming memory address into multiple discrete address fields corresponding to respective hierarchical groups of structural elements within a target nonvolatile semiconductor memory system, in which at least one of the discrete address fields constitutes a virtual address for the corresponding physical element within the structural hierarchy. Through hierarchical subdivision, a virtual address portion of an incoming memory address is ensured to resolve to an element within the physical bounds of a larger (hierarchically-superior) structure, but may be freely mapped to any of the constituent physical elements of that larger structure. Accordingly, a host requestor may issue logical memory addresses with address fields purposefully specified to direct read, write and maintenance operations to physically distinct structures within the memory system in a manner that limits performance-degrading conflicts while the memory controller remains free, by virtue of one or more virtualized address fields within the incoming logical addresses, to virtualize localized groups of physical structures and thus mask defective structural elements and swap operational structural elements into and out of service, for example, as they wear or otherwise require maintenance.
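As a non-limiting illustration of this hierarchical subdivision, an incoming address might be split into channel, die, erase-unit and page fields as sketched below; the particular field widths (4/4/12/8 bits) and names are assumptions chosen solely for the example.

    #include <stdint.h>

    /* Assumed field widths, purely for illustration: 4 channel bits, 4 die bits,
     * 12 erase-unit bits and 8 page bits within a host-supplied address. */
    enum { PAGE_BITS = 8, EU_BITS = 12, DIE_BITS = 4, CH_BITS = 4 };

    typedef struct {
        uint32_t channel;  /* resolves a physical channel */
        uint32_t die;      /* resolves a physical die on that channel */
        uint32_t eu;       /* virtual erase-unit field (freely remappable) */
        uint32_t page;     /* page offset within the erase unit */
    } subdivided_addr_t;

    /* Split an incoming address into discrete fields, each corresponding to one
     * level of the structural hierarchy (channel, die, erase unit, page). */
    subdivided_addr_t subdivide(uint32_t host_addr) {
        subdivided_addr_t a;
        a.page    =  host_addr                                      & ((1u << PAGE_BITS) - 1);
        a.eu      = (host_addr >> PAGE_BITS)                        & ((1u << EU_BITS) - 1);
        a.die     = (host_addr >> (PAGE_BITS + EU_BITS))            & ((1u << DIE_BITS) - 1);
        a.channel = (host_addr >> (PAGE_BITS + EU_BITS + DIE_BITS)) & ((1u << CH_BITS) - 1);
        return a;
    }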
The net storage volume of a given drive can be subdivided into discrete performance-isolated storage regions (e.g., VBDs) based on specified system requirements and on the underlying memory system geometry and performance characteristics, with each such storage region being mapped by an independent linear range of logical addresses. Accordingly, each performance-isolated storage region or “VBD” may be presented to one or more host access requestors as an independent block device (i.e., a mass storage unit having a mapped logical address space), so that the nonvolatile memory system may be perceived by that host as being constituted by multiple discrete block devices or VBDs, each having its own performance characteristics and address space. In one embodiment, this address space is continuous, with addressing effectuated using active and reserved structures and modulo and/or remainder-based address resolution as more fully described in U.S. Pat. No. 9,542,118; in another embodiment, structures are applied only on a power-of-two basis, such that bit fields of the host-supplied address map cleanly to structural tiers; and in yet another embodiment, discontinuous addressing is used, such that the host-visible address space presents gaps that are not in fact mapped to ANVM. The mapping of the logical address space within a given VBD, referred to herein as an address space layout, or “ASL,” may vary from one VBD to another (e.g., sequential addresses within logical address ranges of respective VBDs may be distributed within the structural hierarchy of the memory system in different orders) to yield configurable and varied VBD characteristics in terms of erasable segment size, endurance and I/O bandwidth, all via ASL table configuration as introduced above and discussed below. Further, multiple different address space layouts may be applied within different “subspaces” of a given VBD (i.e., discrete portions of the VBD's address range), with, for example, addresses in one subspace being sequentially applied to structural elements at different hierarchical levels of the memory system in a different order than in another subspace; for example, the most significant bits of a host-supplied address can be applied for one VBD to resolve an addressed channel, while the very same most significant bits, or a subset or superset of such bits, can be applied to die selection for another VBD. Also, in a number of embodiments, system requirements specified (e.g., by a user/system designer) in terms of VBD capacity and performance metrics, including, without limitation, read and write bandwidth requirements and the minimum data transfer size required by the VBD, can be translated into corresponding configuration and allocation of structural elements as necessary to meet these high-level requirements, with such configuration and allocation optionally being programmed directly into the nonvolatile memory subsystem and/or a corresponding VBD definition reported to a host access requestor. By this approach, a system designer can configure and allocate VBDs according to the performance requirements of the application at hand without having to resort to the complex and error-prone task of individually allocating and configuring numerous physical resources within the nonvolatile memory system. As noted previously, in one embodiment, the host can dynamically readjust its subdivisions, periodically or on demand, so as to further optimize memory.
For example, a host has the ability not only to select a write location for specific data and/or EC information, but also to change its address space configuration as a general matter (e.g., as a result of machine learning and consequent storage system re-optimization).
The nonvolatile memory subsystem is presented as a flash memory device having at least one SSD and, in many embodiments, multiple SSDs, each with its own respective memory controller, as indicated earlier. Each SSD can be hierarchically arranged in multiple wired signaling channels, each coupled to multiple flash memory dies, with each die including numerous individually erasable blocks (EUs) distributed in one or more access planes, and with each EU including numerous pages constituted by a predetermined number of single-bit or multi-bit nonvolatile storage cells (i.e., channels, dies, erase units and pages constitute, for example and without limitation, respective hierarchical physical elements within the flash memory device). For example, in one embodiment, a memory controller within the flash memory system (e.g., within the drive or SSD) subdivides each incoming host-supplied address into respective channel, die, erase-unit and page address fields, any or all of which may be virtual addresses, and resolves a commanded read or write access to a specific channel indicated by the channel address field, a specific die indicated by the die address field, a specific EU indicated by the EU address field (including possible resolution to two or more virtual and/or physical EUs in the case of multi-plane command sequences) and a specific page indicated by the page address field (including possible resolution to two or more pages in the case of a multi-page operation). Where the memory controller maintains reserve units, for example mapping host-visible address space to reserved structural elements at a given level of the hierarchy, some limited address translation can be performed at the memory controller, e.g., by translating the address of one block in the hierarchy (e.g., an EU) while preserving logical location at other address levels (e.g., preserving page ordering within a remapped EU). Among other advantages, this architecture provides for greatly simplified address translation (e.g., translation which can optionally be implemented entirely in hardware, or mostly in hardware with various alternative, small address translation tables being used for select structural elements in the hierarchy, e.g., for each die), facilitating configurable and predictable I/O latency by shortening address translation time and eliminating associated complexity. As noted previously, in other embodiments, such reserved structural elements can also/instead be used to supply storage units for redundancy information, such as EC information, as specified by the ASL information.
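Building on the illustrative field widths sketched earlier, the limited translation just described (remapping only the erase-unit field while preserving page ordering) might be rendered as follows; the per-die table layout and the names used are again assumptions made for the example only.

    #include <stdint.h>

    #define EUS_PER_DIE 4096u   /* assumed; matches a 12-bit erase-unit field */

    /* Hypothetical per-die remap table: maps a virtual erase-unit index to the
     * physical (possibly reserved) erase unit currently backing it, e.g., a spare
     * swapped in to mask a defective or worn block. */
    typedef struct {
        uint16_t phys_eu[EUS_PER_DIE];
    } die_remap_t;

    /* Translate only the erase-unit field; the channel, die and page fields pass
     * through unchanged, so page ordering within the remapped EU is preserved and
     * translation reduces to a small per-die table (or a hardware equivalent).
     * Field positions match the illustrative 4/4/12/8-bit layout sketched earlier. */
    uint32_t translate_eu(uint32_t channel, uint32_t die, uint32_t virt_eu,
                          uint32_t page, const die_remap_t *remap) {
        uint32_t phys_eu = remap->phys_eu[virt_eu % EUS_PER_DIE];
        return (channel << 24) | (die << 20) | (phys_eu << 8) | (page & 0xFFu);
    }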
Continuing with
As discussed above, the ASL parameters define the manner in which sequential host addresses are assigned and/or distributed within the structural hierarchy of an associated VBD and thus indicate the number of pages within the same EU (i.e., “seqPg”) to which sequential LBAs apply before progressing to page(s) in the next EU, then the number of EUs to be sequentially accessed within a given die (“seqEU”) before progressing to the next die, then the number of dies to be accessed on a given channel (“seqDie”) before progressing to the next channel, and so forth. The feature control parameters include, for example and without limitation, whether read caching and write caching are to be enabled for the VBD or subspace thereof (independently settable via the rdC and wrC fields of the ASL lookup table entry), the number of pages that may be simultaneously or concurrently written to or read from within the same erase unit (nPa), and the number of erase-unit planes to be concurrently accessed in a given write or read command sequence (nPI). In general, read caching is a double-buffering construct that enables data retrieved from an address-selected storage page and stored within the flash die's page register (i.e., a buffer element that temporarily holds outbound page-read data and inbound page-write data) to be output from the flash die concurrently with transfer of subsequently selected storage-page data to the page register, and write caching is a similar double-buffering arrangement that enables concurrency during page-write operations. Thus, the read and write page caching features, when enabled, reduce the net latency of a sequence of read or write operations, respectively. In general, page caching scales (e.g., multiplies according to cache depth) the effective size of the page register and thus correspondingly raises the minimum data transfer size imposed on the host in a given page read or write operation. For simplicity of understanding, page caching in both the read and write directions is disabled (i.e., “off”) within the exemplary ASL lookup table entries shown. Multi-page operation (i.e., nPa set to a value greater than one) and multi-plane operation (nPI set to a value greater than one) likewise raise the minimum data transfer size between the host and the memory controller in order to perform (i.e., proceed with) programming. In the specific examples shown in the ASL lookup table of
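Purely for illustration, an ASL lookup-table entry carrying the parameters named above might be represented as sketched below; the field types and the suggested relationship between nPa/nPI and minimum transfer size are assumptions, not a definitive implementation.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical ASL lookup-table entry for one VBD (or subspace of a VBD),
     * capturing how sequential host addresses are distributed through the
     * structural hierarchy and which feature controls apply. */
    typedef struct {
        uint32_t seqPg;   /* pages written in the same EU before moving to the next EU */
        uint32_t seqEU;   /* EUs accessed within a die before moving to the next die */
        uint32_t seqDie;  /* dies accessed on a channel before moving to the next channel */
        uint8_t  nPa;     /* pages that may be concurrently written/read within an EU */
        uint8_t  nPI;     /* erase-unit planes accessed concurrently per command sequence */
        bool     rdC;     /* read caching enabled (double-buffered page register) */
        bool     wrC;     /* write caching enabled */
    } asl_entry_t;

    /* One plausible relationship (an assumption for this sketch): the minimum
     * transfer size imposed on the host, in pages, grows with the product of the
     * multi-page and multi-plane settings. */
    uint32_t min_transfer_pages(const asl_entry_t *e) {
        uint32_t pages = (uint32_t)e->nPa * (uint32_t)e->nPI;
        return pages ? pages : 1u;
    }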
Still referring to
Concluding with
Note that the definition of a VBD is advantageously used in some embodiments in a manner tied to the way in which EC information is managed across drives. For example, VBDs in other drives matching the 2×2 channel/die configuration of BD0 and/or BD1 from
As noted earlier, for many ANVM designs, structural failures (including wear-based failures) tend to occur as independent random events, each affecting an isolated physical structure, such as a physical block (physical EU).
More particularly,
However, as seen at the right side of
Once again, many variations, combinations and permutations of the principles discussed above will be readily apparent to those having experience in erasure coding, and are contemplated by this disclosure even if not individually detailed above.
It is further assumed for purposes of this FIG. that there is a multi-segment error for one of the drives, as indicated by numeral 1155, i.e., M structures (two in this example) experience errors for exactly one SSD, such that it is necessary in this embodiment to use externally-stored EC information to rebuild the unavailable data for that SSD. For purposes of the EC algorithms discussed above in connection with
The embodiment described in reference to
Reflecting on the foregoing, what has been described is a memory controller architecture that can facilitate efficient redundancy operations and delegated move/copy operations in a manner that does not create excessive competition with host functions for bandwidth (e.g., by reducing the number of data transfers between the host and the drives, and by having the drives themselves compute cross-drive parity) and that facilitates true redundancy across drives.
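As a non-limiting illustration of the cross-drive recovery referred to above, under a linear (XOR) scheme the contents held by an unavailable drive can be rebuilt from the corresponding segments of the surviving drives together with the stored parity, for example as sketched below; the function name and buffer layout are assumptions made for the example only.

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Rebuild the segment held by an unavailable drive under a linear XOR scheme:
     * the missing segment equals the XOR of the corresponding segments of all
     * surviving drives together with the stored parity segment. 'segments' holds
     * num_drives pointers (the data drives plus the parity drive); the entry at
     * index 'missing' is ignored and the reconstruction is written to 'out'. */
    void rebuild_missing(const uint8_t *const *segments, size_t num_drives,
                         size_t missing, uint8_t *out, size_t len) {
        memset(out, 0, len);
        for (size_t d = 0; d < num_drives; d++) {
            if (d == missing)
                continue;
            for (size_t i = 0; i < len; i++)
                out[i] ^= segments[d][i];
        }
    }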
Many options and variations to the above are expressly contemplated by this disclosure. For example, while flash memory is used in a number of examples herein, the techniques described herein may readily be applied to other forms of ANVM. Also, while parity information is used as one example of EC information, and an XOR operation is identified as one “linear” error correction technique, the techniques described herein are applicable to other types of EC information as well, including BCH codes and other EC schemes. The disclosed techniques may also be applied to EC schemes based on non-linear combination/processing. Many types of EC algorithms may be used, and many analogous techniques and applications will occur to those having ordinary skill in the art. Also, while many embodiments introduced above call for error information to be provided to a host, in some contemplated embodiments the memory controller instead directly transfers/forwards the error information to another drive (e.g., without sending the EC information to the host). This is to say, as noted earlier, some embodiments support transfer of error information (e.g., parity values or results of parity processing) directly via a peer-to-peer bus from a memory controller for one ANVM SSD to a memory controller on another drive; in some embodiments, the recipient drive has a memory controller that computes and stores master error information (e.g., superparity), permitting recovery of EC information (by a given drive or the host) should any drive become unavailable or corrupted. Also, while techniques for generating and matching EC data have been presented, all of the techniques described herein may be applied to recover EC information and/or lost data, i.e., via similar EC generation techniques and processing suitable for extracting lost information based on the data and on stored and/or generated EC information. While many examples focus on error protection robust to one or two storage resources becoming unavailable, there are error correction schemes robust to three or more resources becoming corrupted or unavailable (e.g., using the disclosed techniques with extension to three or more EC algorithms); application of the techniques described herein is contemplated for all such schemes. In fact, the described techniques can be applied not only to provide for compression of “Q” information, as discussed above, but also, using lossless compression techniques, to provide insurance against third, fourth, fifth, or additional independent random failures that might affect a given drive.
This disclosure also provides techniques for exposing memory geometry to a host, to permit improvement of host processes in interacting with that memory. In one example, a host and/or a user and/or a memory controller can provide for definition of zones or VBDs in a given SSD. A “zone” as used herein is a subset of ANVM in a single SSD, such that two or more logical and/or physical structures can be defined in a single SSD. A “VBD” or virtual block device as used herein corresponds to a block device described in commonly-owned U.S. Pat. No. 9,542,118, e.g., a region having boundaries which align to one or more integral arrays (or dies), providing for performance isolation, such that, for example, a memory controller managing such a structure can issue parallel requests to one VBD without regard to the status of memory controller commands which are outstanding to a different VBD within the same drive or SSD. A “zone” as used herein does not necessarily have to have boundaries defined in this manner. In some embodiments discussed herein, the host participates with a memory controller for a given drive in defining zones and/or VBDs, providing for tailoring of storage to individual clients and/or applications, with the memory controller being programmable so as to manage and/or perform addressing in each zone on a different basis. In some embodiments, including those embodiments referenced earlier, a host can request a write of data to a zone or VBD, or it can specify a destination for an operation that moves data (e.g., garbage collection, or hot/cold data relocation), simply by identifying a zone or VBD of a given drive (e.g., “zone32”), with the memory controller for the destination resource then assigning an address (i.e., with the memory-controller-assigned address being used by the host in future read requests to identify the data in question); the memory-controller-assigned address can be, but does not have to be, a virtual address or a physical address. For example, as described in incorporated-by-reference U.S. Pat. No. 9,542,118, a level of abstraction can be used such that the host uses an “idealized address” (e.g., a specific type of virtual address), with the memory controller then performing some type of address translation or resolution to identify the specific physical location where requested data is stored. In other examples, instead of using zones per se, the host directly specifies an address (which might have been previously assigned by a memory controller and reported to the host, depending on embodiment), and the memory controller then directly applies and/or translates only a limited field or fields of the host-supplied address (e.g., it modifies/remaps an address subfield dedicated to a structural level such as channel, die, array, erase unit, plane, page/wordline, LUN, and so forth) to resolve a logical or physical structure in ANVM where the data can be found. This operation and associated circuitry are also described in the incorporated-by-reference documents U.S. Pat. Nos. 9,400,749 and/or 9,542,118. A design which provides for exposure of ANVM geometry to a host and/or provides for the use of zones and/or VBDs permits some embodiments to employ a number of the different options introduced above. For example, a memory controller can compute EC information as it fills a zone (e.g., using a linear combination scheme), and can provide either intermittent or completed-zone-wide EC information to a host.
As this statement implies, in various embodiments, EC information can be computed on the basis of any desired logical and/or physical structure, for example, on the basis of virtual storage units (e.g., a “virtual” erase unit or die, as described in the incorporated-by-reference documents), on the basis of an entire die, or on some other basis.
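As a non-limiting illustration of the zone-directed writes and zone-filling EC computation described above, the following sketch shows a controller assigning an address at a zone's write frontier, folding the written data into running (XOR-based) EC information for that zone, and returning the assigned address for use in future host reads; the zone capacity, page size, address encoding and names are assumptions made solely for this example.

    #include <stdint.h>
    #include <stddef.h>

    #define ZONE_PAGES 1024u    /* assumed zone capacity, in pages */
    #define PAGE_BYTES 16384u   /* assumed page size */

    /* Hypothetical per-zone state kept by a drive's memory controller. */
    typedef struct {
        uint32_t zone_id;
        uint32_t write_frontier;             /* next page index to be assigned in this zone */
        uint8_t  running_parity[PAGE_BYTES]; /* EC information accumulated as the zone fills */
    } zone_state_t;

    /* The host names only the zone and supplies the data; the controller assigns
     * the page address at the zone's write frontier, folds the data into the
     * zone's running EC information (linear/XOR accumulation), and returns the
     * assigned address for the host to use in future read requests.  The actual
     * programming of the flash array is omitted from this sketch.  Returns -1 if
     * the zone is full. */
    int64_t zone_write(zone_state_t *z, const uint8_t *data) {
        if (z->write_frontier >= ZONE_PAGES)
            return -1;
        for (size_t i = 0; i < PAGE_BYTES; i++)
            z->running_parity[i] ^= data[i];
        uint32_t assigned = (z->zone_id << 10) | z->write_frontier;  /* 10 bits: 1024 pages/zone */
        z->write_frontier++;
        return (int64_t)assigned;
    }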
It should be noted that, as a step in their fabrication or other reification, the various circuits disclosed herein may be described using computer aided design tools and expressed (or represented) as data and/or instructions embodied in various computer-readable media, in terms of their behavioral, register transfer, logic component, transistor, layout geometries, and/or other characteristics. Formats of files and other objects in which such circuit expressions may be implemented include, but are not limited to, formats supporting behavioral languages such as C, Verilog, and VHDL, formats supporting register level description languages like RTL, and formats supporting geometry description languages such as GDSII, GDSIII, GDSIV, CIF, MEBES and any other suitable formats and languages. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, computer storage media in various forms (e.g., optical, magnetic or semiconductor storage media, whether independently distributed in that manner, or stored “in situ” in an operating system).
When received within a computer system via one or more computer-readable media, such data and/or instruction-based expressions of the above described circuits may be processed by a processing entity (e.g., one or more processors) within the computer system in conjunction with execution of one or more other computer programs including, without limitation, net-list generation programs, place and route programs and the like, to generate a representation or image of a physical manifestation of such circuits. Such representation or image may thereafter be used in device fabrication, for example, by enabling generation of one or more masks that are used to form various components of the circuits in a device fabrication process. Any of the various methods and operational sequences herein may likewise be recorded as one or more sequences of instructions on a computer-readable medium and may be executed on a computing device to effectuate the disclosed method and/or operational sequence.
In the foregoing description and in the accompanying drawings, specific terminology and drawing symbols have been set forth to provide a thorough understanding of the present invention. In some instances, the terminology and symbols may imply specific details that are not required to practice the invention. For example, any of the specific numbers of bits, signal path widths, signaling or operating frequencies, device geometries and numbers of hierarchical structural elements (e.g., channels, dies, planes, erase units, pages, etc.), component circuits or devices and the like may be different from those described above in alternative embodiments. Additionally, links or other interconnections between integrated circuit devices or internal circuit elements or blocks may be shown as buses or as single signal lines. Each of the buses may alternatively be a single signal line, and each of the single signal lines may alternatively be buses. Signals and signaling links, however shown or described, may be single-ended or differential. A signal driving circuit is said to “output” a signal to a signal receiving circuit when the signal driving circuit asserts (or deasserts, if explicitly stated or indicated by context) the signal on a signal line coupled between the signal driving and signal receiving circuits. The term “coupled” is used herein to express a direct connection as well as a connection through one or more intervening circuits or structures. Device “programming” may include, for example and without limitation, loading a control value into a register or other storage circuit within an integrated circuit device in response to a host instruction (and thus controlling an operational aspect of the device and/or establishing a device configuration) or through a one-time programming operation (e.g., blowing fuses within a configuration circuit during device production), and/or connecting one or more selected pins or other contact structures of the device to reference voltage lines (also referred to as strapping) to establish a particular device configuration or operation aspect of the device. The terms “exemplary” and “embodiment” are used to express an example, not a preference or requirement.
Any incorporation by reference of documents above is limited such that no subject matter is incorporated that is contrary to the explicit disclosure herein, and no definition from an incorporated-by-reference document modifies, supplants or appends to any definition set forth herein, i.e., definitions set forth in this document control. Any incorporation by reference of documents above is further limited such that no claims included in the documents are incorporated by reference herein.
While the invention has been described with reference to specific embodiments thereof, it will be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope. For example, features or aspects of any of the embodiments may be applied in combination with any other of the embodiments disclosed herein and/or in materials incorporated by reference or in place of counterpart features or aspects thereof. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
This application is a continuation of U.S. patent application Ser. No. 17/498,629, filed on Oct. 11, 2021, on behalf of first-named inventor Robert Lercari for “Erasure Coding Techniques For Flash Memory,” which in turn is a continuation of U.S. patent application Ser. No. 16/707,934, filed on Dec. 9, 2019, on behalf of first-named inventor Robert Lercari for “Erasure Coding Techniques For Flash Memory” (now issued as U.S. Pat. No. 11,175,984). The aforementioned patent applications are hereby incorporated by reference. U.S. Pat. Nos. 9,400,749 and 9,542,118 are also hereby incorporated by reference.
Number | Name | Date | Kind |
---|---|---|---|
4532590 | Wallach | Jul 1985 | A |
4813002 | Joyce | Mar 1989 | A |
5404485 | Ban | Apr 1995 | A |
5568423 | Jou et al. | Oct 1996 | A |
5652857 | Shimoi et al. | Jul 1997 | A |
5668976 | Zook | Sep 1997 | A |
5860082 | Smith et al. | Jan 1999 | A |
5963977 | Gold | Oct 1999 | A |
6118724 | Higgenbottom | Sep 2000 | A |
6134631 | Jennings, III | Oct 2000 | A |
6145069 | Dye | Nov 2000 | A |
6148354 | Ban | Nov 2000 | A |
6381668 | Lunteren | Apr 2002 | B1 |
6430650 | Miyauchi | Aug 2002 | B1 |
6571312 | Sugai | May 2003 | B1 |
6892287 | Millard | May 2005 | B1 |
7096378 | Stence et al. | Aug 2006 | B2 |
7120729 | Gonzalez | Oct 2006 | B2 |
7339823 | Nakayama et al. | Mar 2008 | B2 |
7383375 | Sinclair | Jun 2008 | B2 |
7404031 | Oshima | Jul 2008 | B2 |
7406563 | Nagshain | Jul 2008 | B1 |
7519869 | Mizuno | Apr 2009 | B2 |
7552272 | Gonzalez | Jun 2009 | B2 |
7555628 | Bennett | Jun 2009 | B2 |
7581078 | Ware | Aug 2009 | B2 |
7702846 | Nakanishi et al. | Apr 2010 | B2 |
7702948 | Kalman | Apr 2010 | B1 |
7710777 | Mintierth | May 2010 | B1 |
7747813 | Danilak | Jun 2010 | B2 |
7752381 | Wong | Jul 2010 | B2 |
7801561 | Parikh et al. | Sep 2010 | B2 |
7809900 | Danilak | Oct 2010 | B2 |
7814262 | Sinclair | Oct 2010 | B2 |
7818489 | Karamcheti et al. | Oct 2010 | B2 |
7836244 | Kim et al. | Nov 2010 | B2 |
7861122 | Cornwell et al. | Dec 2010 | B2 |
7877540 | Sinclair | Jan 2011 | B2 |
7904619 | Danilak | Mar 2011 | B2 |
7934074 | Lee | Apr 2011 | B2 |
7941692 | Royer et al. | May 2011 | B2 |
7970983 | Nochimowski | Jun 2011 | B2 |
7991944 | Lee et al. | Aug 2011 | B2 |
8001318 | Iyer | Aug 2011 | B1 |
8024545 | Kim | Sep 2011 | B2 |
8046524 | Tringali | Oct 2011 | B2 |
8055833 | Danilak et al. | Nov 2011 | B2 |
8065471 | Yano et al. | Nov 2011 | B2 |
8065473 | Ito et al. | Nov 2011 | B2 |
8068365 | Kim | Nov 2011 | B2 |
8069284 | Oh | Nov 2011 | B2 |
8072463 | Van Dyke | Dec 2011 | B1 |
8074022 | Okin et al. | Dec 2011 | B2 |
8082389 | Fujibayashi | Dec 2011 | B2 |
8086790 | Roohparvar | Dec 2011 | B2 |
8099581 | Bennett | Jan 2012 | B2 |
8099632 | Tringali | Jan 2012 | B2 |
8195912 | Flynn | Jun 2012 | B2 |
8219776 | Forhan | Jul 2012 | B2 |
8285918 | Maheshwari | Oct 2012 | B2 |
8291151 | Sinclair | Oct 2012 | B2 |
8291295 | Harari | Oct 2012 | B2 |
8341339 | Boyle | Dec 2012 | B1 |
8347042 | You | Jan 2013 | B2 |
8402249 | Zhu | Mar 2013 | B1 |
8423710 | Gole | Apr 2013 | B1 |
8495280 | Kang | Jul 2013 | B2 |
8539197 | Marshall | Sep 2013 | B1 |
8572331 | Shalvi | Oct 2013 | B2 |
8601202 | Melcher | Dec 2013 | B1 |
8645634 | Cox et al. | Feb 2014 | B1 |
8668894 | Kuehne | Apr 2014 | B2 |
8700961 | Lassa | Apr 2014 | B2 |
8954708 | Kim | Feb 2015 | B2 |
8959307 | Bruce | Feb 2015 | B1 |
8996796 | Karamcheti | Mar 2015 | B1 |
9063844 | Higgins | Jun 2015 | B2 |
9123443 | Chung | Sep 2015 | B2 |
9176864 | Gorobets | Nov 2015 | B2 |
9229854 | Kuzmin et al. | Jan 2016 | B1 |
9286198 | Bennett | Mar 2016 | B2 |
9329986 | Li | May 2016 | B2 |
9335939 | Bennett et al. | May 2016 | B2 |
9348749 | Choi | May 2016 | B2 |
9378149 | Bonwick | Jun 2016 | B1 |
9400749 | Kuzmin et al. | Jul 2016 | B1 |
9405621 | Yu | Aug 2016 | B2 |
9519578 | Kuzmin et al. | Dec 2016 | B1 |
9542118 | Lercari et al. | Jan 2017 | B1 |
9565269 | Malwankar | Feb 2017 | B2 |
9575672 | Yamamoto | Feb 2017 | B2 |
9588904 | Lercari et al. | Mar 2017 | B1 |
9652376 | Kuzmin et al. | May 2017 | B2 |
9696917 | Sareena et al. | Jul 2017 | B1 |
9710377 | Kuzmin et al. | Jul 2017 | B1 |
9727454 | Kuzmin et al. | Aug 2017 | B2 |
9734086 | Flynn | Aug 2017 | B2 |
9785572 | Lercari et al. | Oct 2017 | B1 |
9846541 | Miyamoto et al. | Dec 2017 | B2 |
9858008 | Liu | Jan 2018 | B2 |
10067866 | Sutardja | Sep 2018 | B2 |
10445229 | Kuzmin et al. | Oct 2019 | B1 |
10552058 | Jadon et al. | Feb 2020 | B1 |
10552085 | Chen et al. | Feb 2020 | B1 |
10642505 | Kuzmin | May 2020 | B1 |
10642748 | Lercari | May 2020 | B1 |
10838853 | Kuzmin | Nov 2020 | B1 |
10884915 | Kuzmin | Jan 2021 | B1 |
10915458 | Lercari | Feb 2021 | B1 |
10956082 | Kuzmin | Mar 2021 | B1 |
10977188 | Lercari | Apr 2021 | B1 |
10983907 | Kuzmin | Apr 2021 | B1 |
10996863 | Kuzmin | May 2021 | B1 |
11003586 | Lercari | May 2021 | B1 |
11023315 | Jadon | Jun 2021 | B1 |
11023386 | Lercari | Jun 2021 | B1 |
11023387 | Lercari | Jun 2021 | B1 |
11048643 | Lercari | Jun 2021 | B1 |
11068408 | Kim | Jul 2021 | B2 |
11074175 | Kuzmin | Jul 2021 | B1 |
11075984 | Mercier | Jul 2021 | B1 |
11080181 | Kuzmin | Aug 2021 | B1 |
11086789 | Lercari | Aug 2021 | B1 |
11100006 | Lercari | Aug 2021 | B1 |
11175984 | Lercari | Nov 2021 | B1 |
11188457 | Kuzmin | Nov 2021 | B1 |
11216365 | Kuzmin | Jan 2022 | B1 |
11221959 | Lercari | Jan 2022 | B1 |
11221960 | Lercari | Jan 2022 | B1 |
11221961 | Lercari | Jan 2022 | B1 |
11487678 | Park | Nov 2022 | B2 |
11500766 | Bueb | Nov 2022 | B2 |
11580030 | Das | Feb 2023 | B2 |
20030028733 | Tsunoda et al. | Feb 2003 | A1 |
20030037071 | Harris | Feb 2003 | A1 |
20030065866 | Spencer | Apr 2003 | A1 |
20030188032 | Solomon | Oct 2003 | A1 |
20040083335 | Gonzalez | Apr 2004 | A1 |
20050073844 | Gonzalez | Apr 2005 | A1 |
20050073884 | Gonzalez | Apr 2005 | A1 |
20050144413 | Kuo et al. | Jun 2005 | A1 |
20060004957 | Hand, III | Jan 2006 | A1 |
20060022171 | Maeda et al. | Oct 2006 | A1 |
20070019481 | Park | Jan 2007 | A1 |
20070046681 | Nagashima | Mar 2007 | A1 |
20070058431 | Chung et al. | Mar 2007 | A1 |
20070091497 | Mizuno | Apr 2007 | A1 |
20070143569 | Sanders | Jun 2007 | A1 |
20070168321 | Saito | Jul 2007 | A1 |
20070233939 | Kim | Oct 2007 | A1 |
20070260811 | Merry, Jr. et al. | Nov 2007 | A1 |
20070260841 | Hampel | Nov 2007 | A1 |
20070283428 | Ma et al. | Dec 2007 | A1 |
20080005502 | Kanai | Jan 2008 | A1 |
20080034153 | Lee | Feb 2008 | A1 |
20080082596 | Gorobets | Apr 2008 | A1 |
20080126720 | Danilak | May 2008 | A1 |
20080126724 | Danilak | May 2008 | A1 |
20080147964 | Chow et al. | Jun 2008 | A1 |
20080155204 | Qawami et al. | Jun 2008 | A1 |
20080189485 | Jung et al. | Aug 2008 | A1 |
20080195833 | Park | Aug 2008 | A1 |
20080201517 | Kuhne | Aug 2008 | A1 |
20080209114 | Chow | Aug 2008 | A1 |
20080307192 | Sinclair | Dec 2008 | A1 |
20080320476 | Wingard | Dec 2008 | A1 |
20090036163 | Kimbrell | Feb 2009 | A1 |
20090044190 | Tringali | Feb 2009 | A1 |
20090046533 | Jo | Feb 2009 | A1 |
20090083478 | Kunimatsu | Mar 2009 | A1 |
20090089482 | Traister | Apr 2009 | A1 |
20090089490 | Ozawa et al. | Apr 2009 | A1 |
20090119529 | Kono | May 2009 | A1 |
20090138671 | Danilak | May 2009 | A1 |
20090172219 | Mardiks | Jul 2009 | A1 |
20090172246 | Afriat | Jul 2009 | A1 |
20090172250 | Allen et al. | Jul 2009 | A1 |
20090172257 | Prins et al. | Jul 2009 | A1 |
20090172499 | Olbrich | Jul 2009 | A1 |
20090182964 | Greiner | Jul 2009 | A1 |
20090198946 | Ebata | Aug 2009 | A1 |
20090210616 | Karamcheti | Aug 2009 | A1 |
20090210636 | Karamcheti | Aug 2009 | A1 |
20090240903 | Sauber | Sep 2009 | A1 |
20090254689 | Karamcheti | Oct 2009 | A1 |
20090254705 | Abali et al. | Oct 2009 | A1 |
20090271562 | Sinclair | Oct 2009 | A1 |
20090292839 | Oh | Nov 2009 | A1 |
20090300015 | Kazan et al. | Dec 2009 | A1 |
20090327602 | Moore et al. | Dec 2009 | A1 |
20100011085 | Taguchi | Jan 2010 | A1 |
20100011186 | Bennett | Jan 2010 | A1 |
20100191779 | Hinrichs | Jan 2010 | A1 |
20100030946 | Kano | Feb 2010 | A1 |
20100042655 | Tse et al. | Feb 2010 | A1 |
20100077136 | Ware | Mar 2010 | A1 |
20100083050 | Ohyama | Apr 2010 | A1 |
20100106734 | Calder | Apr 2010 | A1 |
20100115172 | Gillingham et al. | May 2010 | A1 |
20100125702 | Lee | May 2010 | A1 |
20100161882 | Stern et al. | Jun 2010 | A1 |
20100162012 | Cornwell et al. | Jun 2010 | A1 |
20100182838 | Kim et al. | Jul 2010 | A1 |
20100199065 | Kaneda | Aug 2010 | A1 |
20100211737 | Flynn | Aug 2010 | A1 |
20100241866 | Rodorff | Sep 2010 | A1 |
20100262761 | Borchers et al. | Oct 2010 | A1 |
20100262762 | Borchers | Oct 2010 | A1 |
20100262765 | Cheon | Oct 2010 | A1 |
20100262773 | Borchers | Oct 2010 | A1 |
20100281230 | Rabii et al. | Nov 2010 | A1 |
20100287217 | Borchers et al. | Nov 2010 | A1 |
20100287327 | Li | Nov 2010 | A1 |
20100287332 | Koshiyama | Nov 2010 | A1 |
20100299494 | Van Acht | Nov 2010 | A1 |
20100329011 | Lee et al. | Dec 2010 | A1 |
20110033548 | Kimmel et al. | Feb 2011 | A1 |
20110041039 | Harari | Feb 2011 | A1 |
20110055445 | Gee et al. | Mar 2011 | A1 |
20110066792 | Shaeffer | Mar 2011 | A1 |
20110125956 | Danilak | May 2011 | A1 |
20110138114 | Yen | Jun 2011 | A1 |
20110153911 | Sprouse | Jun 2011 | A1 |
20110153917 | Maita | Jun 2011 | A1 |
20110161784 | Sellinger et al. | Jun 2011 | A1 |
20110167199 | Danilak | Jul 2011 | A1 |
20110197014 | Yeh | Aug 2011 | A1 |
20110197023 | Iwamitsu et al. | Aug 2011 | A1 |
20110231623 | Goss | Sep 2011 | A1 |
20110238890 | Sukegawa | Sep 2011 | A1 |
20110238892 | Tsai | Sep 2011 | A1 |
20110238943 | Devendran et al. | Sep 2011 | A1 |
20110264843 | Haines | Oct 2011 | A1 |
20110276756 | Bish et al. | Nov 2011 | A1 |
20110283043 | Schuette | Nov 2011 | A1 |
20110296089 | Seol | Dec 2011 | A1 |
20110296133 | Flynn et al. | Dec 2011 | A1 |
20110314209 | Eckstein | Dec 2011 | A1 |
20120033519 | Confalonieri et al. | Feb 2012 | A1 |
20120054419 | Chen | Mar 2012 | A1 |
20120059972 | Chen | Mar 2012 | A1 |
20120066441 | Weingarten | Mar 2012 | A1 |
20120079174 | Nellans | Mar 2012 | A1 |
20120131270 | Hemmi | May 2012 | A1 |
20120131381 | Eleftheriou | May 2012 | A1 |
20120155492 | Abel | Jun 2012 | A1 |
20120159037 | Kwon | Jun 2012 | A1 |
20120159039 | Kegel et al. | Jun 2012 | A1 |
20120166911 | Takahashi | Jun 2012 | A1 |
20120191921 | Shaeffer | Jul 2012 | A1 |
20120198128 | Van Aken | Aug 2012 | A1 |
20120198129 | Van Aken | Aug 2012 | A1 |
20120204079 | Takefman et al. | Aug 2012 | A1 |
20120221776 | Yoshihashi | Aug 2012 | A1 |
20120246394 | Ou | Sep 2012 | A1 |
20120260150 | Cideciyan | Oct 2012 | A1 |
20120303875 | Benhase | Nov 2012 | A1 |
20130007343 | Rub | Jan 2013 | A1 |
20130013852 | Hou et al. | Jan 2013 | A1 |
20130019057 | Stephens | Jan 2013 | A1 |
20130019062 | Bennett et al. | Jan 2013 | A1 |
20130024460 | Peterson | Jan 2013 | A1 |
20130073793 | Yamagishi | Mar 2013 | A1 |
20130073816 | Seo et al. | Mar 2013 | A1 |
20130097236 | Khorashadi | Apr 2013 | A1 |
20130111295 | Li et al. | May 2013 | A1 |
20130111298 | Seroff | May 2013 | A1 |
20130124793 | Gyl et al. | May 2013 | A1 |
20130138868 | Seroff | May 2013 | A1 |
20130145111 | Murukani | Jun 2013 | A1 |
20130166824 | Shim | Jun 2013 | A1 |
20130166825 | Kim et al. | Jun 2013 | A1 |
20130227201 | Talagala | Aug 2013 | A1 |
20130232297 | Tanaka | Sep 2013 | A1 |
20130242425 | Zayas et al. | Sep 2013 | A1 |
20130282055 | Parker | Oct 2013 | A1 |
20130290619 | Knight | Oct 2013 | A1 |
20130297852 | Fai et al. | Nov 2013 | A1 |
20130326117 | Aune | Dec 2013 | A1 |
20130332656 | Kandiraju | Dec 2013 | A1 |
20130339580 | Brandt | Dec 2013 | A1 |
20140040550 | Nale | Feb 2014 | A1 |
20140047210 | Cohen | Feb 2014 | A1 |
20140047300 | Liang | Feb 2014 | A1 |
20140101371 | Nguyen et al. | Apr 2014 | A1 |
20140122781 | Smith et al. | May 2014 | A1 |
20140189207 | Sinclair | Jul 2014 | A1 |
20140189209 | Sinclair et al. | Jul 2014 | A1 |
20140195725 | Bennett | Jul 2014 | A1 |
20140208004 | Cohen | Jul 2014 | A1 |
20140208062 | Cohen | Jul 2014 | A1 |
20140215129 | Kuzmin et al. | Jul 2014 | A1 |
20140237168 | Prins | Aug 2014 | A1 |
20140297949 | Nagawaka | Oct 2014 | A1 |
20140317346 | Moon | Oct 2014 | A1 |
20150046670 | Kim | Feb 2015 | A1 |
20150067297 | Arroyo et al. | Mar 2015 | A1 |
20150113203 | Dancho et al. | Apr 2015 | A1 |
20150134930 | Huang et al. | May 2015 | A1 |
20150149789 | Seo et al. | May 2015 | A1 |
20150212938 | Chen et al. | Jun 2015 | A1 |
20150193148 | Miwa et al. | Jul 2015 | A1 |
20150261456 | Alcantara et al. | Sep 2015 | A1 |
20150263978 | Olson | Sep 2015 | A1 |
20150309734 | Brondjik | Oct 2015 | A1 |
20150324264 | Vidypoornachy et al. | Nov 2015 | A1 |
20150347041 | Kotte et al. | Dec 2015 | A1 |
20150347291 | Choi | Dec 2015 | A1 |
20150347296 | Kotte et al. | Dec 2015 | A1 |
20150355965 | Peddle | Dec 2015 | A1 |
20150363120 | Chen | Dec 2015 | A1 |
20150378613 | Koseki | Dec 2015 | A1 |
20150378886 | Nemazie | Dec 2015 | A1 |
20160011818 | Hashimoto | Jan 2016 | A1 |
20160018998 | Mohan et al. | Jan 2016 | A1 |
20160019148 | Vekiarides | Jan 2016 | A1 |
20160019159 | Ueda | Jan 2016 | A1 |
20160026564 | Manning | Jan 2016 | A1 |
20160034221 | Zettsu | Feb 2016 | A1 |
20160070496 | Cohen | Mar 2016 | A1 |
20160092116 | Liu | Mar 2016 | A1 |
20160147669 | Huang | May 2016 | A1 |
20160179664 | Camp | Jun 2016 | A1 |
20160202910 | Ravimohan | Jul 2016 | A1 |
20160253091 | Ayyavu | Sep 2016 | A1 |
20160342509 | Kotte | Nov 2016 | A1 |
20160357462 | Nam et al. | Dec 2016 | A1 |
20160364179 | Tsai et al. | Dec 2016 | A1 |
20170031699 | Banerjee et al. | Feb 2017 | A1 |
20170060680 | Halbert | Mar 2017 | A1 |
20170075620 | Yamamoto | Mar 2017 | A1 |
20170139838 | Tomlin | May 2017 | A1 |
20170220287 | Wei | Aug 2017 | A1 |
20170364445 | Allison | Dec 2017 | A1 |
20180046480 | Dong | Feb 2018 | A1 |
20180203761 | Halbert | Jul 2018 | A1 |
20180210787 | Bains | Jul 2018 | A1 |
20200117539 | Sun | Apr 2020 | A1 |
20200151055 | Eom | May 2020 | A1 |
20200363996 | Kanno | Nov 2020 | A1 |
20200371871 | Li | Nov 2020 | A1 |
20210064293 | Cho | Mar 2021 | A1 |
20210223994 | Kanno | Jul 2021 | A1 |
Number | Date | Country |
---|---|---|
2001051889 | Feb 2001 | JP |
2011123863 | Jun 2011 | JP |
2011203916 | Oct 2011 | JP |
2009084724 | Jul 2009 | WO |
2009100149 | Aug 2009 | WO |
2010027983 | Mar 2010 | WO |
Entry |
---|
Bjorling, “LightNVM: The Linux Open-Channel SSD Subsystem,” 15th USENIX Conference on File and Storage Technologies, Feb. 27, 2017, 17 pages. |
Kang et al., “A Superblock-based Flash Translation Layer for NAND Flash Memory,” EMSOFT'06, Seoul, Korea, Oct. 22, 2006, ACM 1-59593-542-8/06/0010, pp. 161-170. |
TN-29-28: Memory Management in NAND Flash Arrays, Micron publication, 2005, 10 pages, available from https://www.micron.com/-/media/client/global/documents/products/technical-note/nand-flash/tn2928.pdf. |
Park et al., “A Reconfigurable FTL (Flash Translation Layer) Architecture for NAND Flash-Based Applications,” 23 pages, ACM Transactions on Embedded Computing Systems, vol. 7, No. 4, Article 38, Publication date: Jul. 2008. |
Gupta et al., “DFTL: A Flash Translation Layer Employing Demand-based Selective Caching of Page-level Address Mappings,” ASPLOS'09, Mar. 7-11, 2009, Washington, DC, USA, 12 pages. |
Ruia, Virtualization of Non-Volatile Ram, Texas A&M Masters Thesis, 77 pages, May 2015, available from: https://oaktrust.library.tamu.edu/bitstream/handle/1969.1/154981/RUIA-THESIS-2015.pdf?sequence=1&isAllowed=y. |
Hsieh et al., “Efficient Identification of Hot Data for Flash Memory Storage Systems,” ACM Transactions on Storage, vol. 2, No. 1, Feb. 2006, 19 Pages (pp. 22-40). |
Wu et al., “An Adaptive Two-Level Management for the Flash Translation Layer in Embedded Systems,” https://dl.acm.org/doi/pdf/10.1145/1233501.1233624, Date of Publication: Nov. 9, 2006. |
Grupp et al., “Characterizing Flash Memory: Anomalies, Observations, and Applications,” https://dl.acm.org/doi/pdf/10.1145/1669112.1669118, Dec. 16, 2009. |
Huang et al., “Unified Address Translation for Memory-Mapped SSDs with FlashMap,” https://dl.acm.org/doi/10.1145/2749469.2750420, Jun. 15, 2015. |
Bang et al., “A Memory Hierarchy-Aware Metadata Management Technique for Solid State Disks,” https://ieeexplore.ieee.org/document/6026485, Aug. 10, 2011. |
Chang et al., “An efficient FTL design for multi-chipped solid-state drives,” 2010 IEEE 16th Int'l. Conference on Embedded and Real-Time Computing Systems and Applications, 2010, pp. 237-246. |
NVM Express, Version 1.0b, Jul. 12, 2011, pp. 1-126, published at http://www.nvmexpress.org/resources/ by the NVM Express Work Group. |
John D. Strunk, “Hybrid Aggregates: Combining SSDs and HDDs in a single storage pool,” Dec. 15, 2012, ACM SIGOPS Operating Systems Review archive, vol. 46 Issue 3, Dec. 2012, pp. 50-56. |
Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Computer Sciences Department, University of Wisconsin-Madison, “De-indirection for Flash-based SSDs with NamelessWrites,” published at https://www.usenix.org/conference/fast12/de-indirection-flash-based-ssds-nameless-writes, Feb. 7, 2012, pp. 1-16. |
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Vijayan Prabhakaran, “Removing the Costs of Indirection in Flash-based SSDs with NamelessWrites,” Jun. 22, 2010, pp. 1-5, published at www.cs.wisc.edu/wind/Publications/hotstorage10-nameless.pdf by Computer Sciences Department, University of Wisconsin-Madison and Microsoft Research. |
Stan Park and Kai Shen, Department of Computer Science, University of Rochester, “FIOS: A Fair, Efficient Flash I/O Scheduler,” Feb. 23, 2012, pp. 1-15, published at www.usenix.org/event/fast12/tech/full_papers/Park.pdf by the Advanced Computing Systems Association, Fast'12, 10th Usenix Conference on File and Storage Technologies, San Jose. |
Eric Seppanen, Matthew T. O'Keefe, David J. Lilja, Department of Electrical and Computer Engineering, University of Minnesota, “High Performance Solid State Storage Under Linux,” Apr. 10, 2010, MSST '10 Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1-12. |
Xiangyong Ouyangyz, David Nellansy, Robert Wipfely, David Flynny, Dhabaleswar K. Pandaz, “Beyond Block I/O: Rethinking Traditional Storage Primitives,” Aug. 20, 2011, published at http://www.sciweavers.org/read/beyond-block-i-o-rethinking-traditional-storage-primitives-327868, by Fusion IO and the Ohio State University. |
Intel Corp., “PCI-SIG SR-IOV Primer: An Introduction to SR-IOV Technology,” 321211-002, Revision 2.5, Jan. 2011, 28 pages. |
Open NAND Flash Interface (ONFI), specification, version 2.0, 174 pp. Feb. 27, 2008. |
Open NAND Flash Interface (ONFI), specification, version 3.1, 296 pp. Sep. 19, 2012. |
NVM Express, V. 1.2.1, 217 pages, Jun. 3, 2016. |
Garth Gibson, Greg Ganger, “Principles of Operation for Shingled Disk Devices,” Carnegie Mellon Parallel Data Laboratory, CMU-PDL-11-107, Apr. 2011, 9 pages. |
Li-Pin Chang, “Hybrid Solid State Disks: Combining Heterogeneous NAND Flash in Large SSDs,” National Chiao-Tung University, Taiwan, ASPDAC 2008, 26 pages. |
Hua Wangx, Ping Huangxz, Shuang Hex, Ke Zhoux, Chunhua Lix, and Xubin He, “A Novel I/O Scheduler for SSD with Improved Performance and Lifetime,” Mass Storage Systems and Technologies (MSST), 2013 IEEE 29th Symposium on, May 6-10, 2013, 5 pages. |
Altera Corp et al., “Hybrid Memory Cube” specification, 2012, 122 pages. |
JEDEC Standard, JESD229, Wide IO, Dec. 2011, 74 pages. |
Li-Pin Chang, “Hybrid Solid State Disks: Combining Heterogeneous NAND Flash in Large SSDs,” National Chiao-Tung University, Taiwan, 978-1-4244-1922-7/08, 2008 IEEE, 6 pages. |
Optimizing NAND Flash Performance, Flash Memory Summit, Santa Clara, CA USA Aug. 2008, Ryan Fisher, pp. 1-23. |
High-Speed NAND Flash: Design Considerations to Maximize Performance, Flash Memory Summit, Santa Clara, CA USA Aug. 11, 2009, Robert Pierce, pp. 1-19. |
NAND 201: An Update on the Continued Evolution of NAND Flash, Jim Cooke, Micron White Paper, Sep. 6, 2011, pp. 1-10. |
Spansion SLC NAND Flash Memory for Embedded, data sheet, S34ML01G1, S34ML02G1, S34ML04G1, Sep. 6, 2012, pp. 1-73. |
Wang et al., “An Efficient Design and Implementation of LSM-Tree based Key-Value Store on Open Channel SSD,” EuroSys '14 Proceedings of the Ninth European Conference on Computer Systems, Article No. 16, Apr. 14, 2014, 14 pages. |
Ouyang et al., “SDF: Software-defined flash for web-scale internet storage systems,” Computer Architecture News—ASPLOS '14, vol. 42 Issue 1, Mar. 2014, 14 pages. |
Macko et al., “Tracking Back References in a Write-Anywhere File System,” FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies, 14 pages, Feb. 23, 2010. |
Ohad, Rodeh, “IBM Research Report Defragmentation Mechanisms for Copy-on-Write File-systems,” IBM white paper, Apr. 26, 2010, 10 pages, available at domino.watson.IBM.com/library/CyberDig.nsf/papers/298A0EF3C2CDB17B852577070056B41F/$File/rj10465.pdf. |
Michael Cornwell, “Anatomy of a solid-state drive,” Oct. 17, 2012, pp. 1-13, https://queue.acm.org/detail.cfm?id=2385276. |
RD527033 A, Mar. 10, 2008. |
C. Kuo et al., “Detecting Solid-State Disk Geometry for Write Pattern Optimization,” 2011 IEEE 17th International Conference on Embedded and Real-Time Computing Systems and Applications, 2011, pp. 89-94. |
S. Im and D. Shin, “Flash-aware RAID techniques for dependable and high-performance flash memory SSD,” IEEE Transactions on Computers, V. 60, No. 1, pp. 80-92, Jan. 2011. |
J. Kim et al., “A methodology for extracting performance parameters in solid state disks (SSDs),” 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2009, pp. 1-10. |
W. Lafi et al., “High level modeling and performance evaluation of address mapping in NAND flash memory,” 2009 16th IEEE International Conference on Electronics, Circuits and Systems (ICECS 2009), Yasmine Hammamet, 2009, pp. 659-662. |
Tennison, RD, “Memory space mapping using virtual addressing to multiple sized memory units,” IP.com, prior art database technical disclosure, Aug. 1, 1975. |
Y. Hu et al., “Achieving page-mapping FTL performance at block-mapping FTL cost by hiding address translation,” 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), Incline Village, NV, USA, May 3, 2010, pp. 1-12, doi: 10.1109/MSST.2010.5496970. |
Number | Date | Country | |
---|---|---|---|
Parent | 17498629 | Oct 2021 | US |
Child | 18199456 | US | |
Parent | 16707934 | Dec 2019 | US |
Child | 17498629 | US |