The present application claims priority to German Patent Application No. 10 2023 108 499.2, filed Apr. 3, 2023, which is hereby incorporated by reference in its entirety for all purposes.
Embodiments of the present disclosure relate to the field of flash memory technology. Specifically, they are directed to a method and an apparatus for restructuring, i.e., rearranging or modifying or both, input data to be stored in a NAND flash memory having a plurality of multi-level flash memory cells arranged in multiple physical pages, each having a defined number N>1 of data pages corresponding to the different levels of the multi-level flash memory cells. The NAND flash memory may particularly comprise one or more flash memory components, such as semiconductor-based memory chips.
Flash memory, particularly NAND flash memory, is one of the most popular data storage solutions and has become prominent in almost all modern electronic devices with storage capability demands, ranging from wearable devices to enterprise servers. Its ever-increasing cost efficiency, non-volatility, increasing capacity, decent performance, and mechanical shock resistance mainly drive this popularity. Flash memory manufacturers have particularly been scaling NAND flash cells into smaller semiconductor technology nodes to cope with the increasing demands for high capacity, to improve performance, and to reduce the cost of flash memory. Inherently, NAND flash technology is an error-prone technology, and hence, its reliability is subject to continuous supervision, correction, and optimization. With increasing flash density, NAND flash memory cells generally become more susceptible to different error types and circuit-level noises, degrading their reliability and endurance. For instance, write-erase (W/E) cycles typically cause gradual permanent damage inside the NAND flash cells, degrading their reliability over time and limiting the NAND flash memory's lifetime (i.e., endurance).
Further advantages, features, and applications of the present solution are provided in the following detailed description and the appended figures, wherein:
In a NAND flash cell, stored data is represented by the amount of electrical charge stored within a storage transistor, i.e., a floating gate (FG) or charge trap (CT) transistor. Initially, NAND flash cells, known as single-level cells (SLC), could only store one bit of data per cell. However, to increase storage density, flash manufacturers turned to using multi-level cells (MLC) that can store multiple bits of data per cell, thereby multiplying (e.g., doubling, tripling or quadrupling) the storage capacity while using the same physical flash memory device, such as a semiconductor chip. Nowadays, technologies commonly use single-level cell (SLC), dual-level cell, or triple-level cell (TLC) NAND flashes. Each type has unique characteristics and applications, as will be discussed in more detail below in connection with
It is noted that while in the electronics industry it is rather common to refer to dual-level cells (somewhat inconsistently with the above co-existing more general definition of MLC) as “Multiple-Level Cells” or “Multi-level Cells”, we adopt herein the above more general definition according to which the terms “Multiple-Level Cell”, “Multi-level Cell” and “MLC” each refer to cells of a NAND flash memory that store multiple bits of data per cell instead of only a single bit as in SLC.
Due to the programming technique and the nature of charge leaking directly after programming, slightly different charges can represent the same stored logical data. Therefore, in MLC memory cells, the stored logical data is represented by a threshold voltage (Vth) range, the threshold voltage defining the minimal voltage required to turn on the transistor (i.e., the storing cell) based on the cell's stored charge, as will also be discussed in more detail below in connection with
Conventional NAND flash memory has a planar (2D) architecture, where the entire memory cell set is laterally arranged in a single plane. However, the increasing demand for high-density, non-volatile memory devices has pushed flash manufacturers to switch to a 3D flash structure instead of a planar flash structure.
Nowadays, planar NAND flash memory usually uses a technology node of 20 nm to 15 nm, and future generations will likely use even smaller nodes. Scaling down the flash cell further while maintaining reliability has been a significant challenge for flash manufacturers. 3D NAND flash memory has been introduced to overcome the aforementioned issues. Unlike planar NAND flash memory, 3D NAND flash uses vertically stacked flash cells in a 3D structure, as will be discussed in more detail below in connection with
Different phenomena affect the NAND flash reliability, disturbing stored data and reducing flash memory devices' lifetime. NAND flash is inherently error-prone, as the stored charge is changed for various reasons. This can result in reliability threats, such as data retention, cell interference, and cell degradation. These threats typically apply concurrently, which can severely worsen the reliability of the NAND flash if countermeasures are not considered. Correction mechanisms (e.g., error correction codes (ECC)), therefore, are always required to maintain reliability and secure the stored data. Correction capabilities are limited by the employed algorithm, and increasing such capability is very costly. Advanced correction mechanisms are used to improve the correction capabilities at an acceptable cost. In addition, modern memory devices employ advanced mitigation mechanisms to reduce the impacts of reliability threats, keeping the error rates at an acceptable level and maintaining reliability.
Even though the advent of 3D NAND flash memory has generally improved reliability over planar 2D NAND flash memory, it brought new reliability threats to the scene. Nowadays, maintaining reliability while achieving a long lifetime in high-density NAND flash memory devices is still challenging. However, while the reliability threats are typically concurrent phenomena, they usually share similar implications and roots. The stored charge (i.e., data) is the primary factor influencing the reliability and lifetime of the NAND flash cell. It can alleviate or intensify specific reliability issues to some extent. For instance, a good distribution of charges among neighboring cells can decrease cell-to-cell interference. Additionally, sustaining flash cells in good condition by minimizing internal degradation is crucial for preserving their reliability and extending their lifetime, which is also influenced by the stored charge level.
It is an object of the present disclosure to improve further the achievable reliability and/or lifetime of flash memory devices, including particularly 3D NAND flash memory devices.
A solution to this problem is provided by the teaching of the independent claims. Various preferred embodiments of the present solution are provided by the teachings of the dependent claims.
A first aspect of the present solution is directed to a method, which may particularly be a computer-implemented method, of restructuring input data to be stored in a NAND flash memory having a plurality of multi-level flash memory cells (i.e., MLC cells) being arranged in multiple physical pages, each having a defined number N>1 of data pages corresponding to the different levels of the multi-level flash memory.
The method, which may particularly be performed by a flash memory controller, comprises:
The restructuring comprises assigning to each output page a respective corresponding input page and selecting the respective input page for assignment to the respective output page based on: (i) the level of the one or more destination data pages being associated with the output page, (ii) the content of the input page, and (iii) a memory reliability optimization goal being defined as a function of the level of the one or more destination data pages being associated with the output page.
The input page is selected so that it or a reversibly transformed version thereof satisfies the memory reliability optimization goal associated with the output page's associated destination data page, and the content of the output page is defined accordingly as that of the input page or its reversibly transformed version, respectively.
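By way of illustration only, and not as part of the claimed subject matter, the assignment described above can be sketched as a simple greedy matching in Python. The function and parameter names (e.g., `goal_satisfied`) are hypothetical placeholders standing in for the memory reliability optimization goal:

```python
def restructure(input_pages, output_levels, goal_satisfied):
    """Greedy sketch: for each output page (identified here by the level of
    its destination data page), pick a yet-unassigned input page whose
    content satisfies the level's reliability optimization goal."""
    unassigned = list(range(len(input_pages)))
    assignment = {}  # output page index -> selected input page index
    for out_idx, level in enumerate(output_levels):
        for i in unassigned:
            if goal_satisfied(level, input_pages[i]):
                assignment[out_idx] = i
                unassigned.remove(i)
                break  # move on to the next output page
    return assignment
```

For instance, with a hypothetical goal preferring a majority of “0” bits for level 0 and a majority of “1” bits for level 1, an input page dominated by zeros would be assigned to the level-0 output page.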
The term “flash memory device” or “memory device” in short, as used herein, refers to a physical device that comprises flash memory. Particularly, a flash memory component, such as a flash memory chip, a higher-integrated system, such as a flash memory stick or card, or an SSD storage of a computer are all examples of flash memory devices.
The term “physical page” as used herein, refers to a physical realization of a unit of data storage in a flash memory device, which is the smallest amount of data that can be written or read independently in the flash memory device, particularly in a flash memory component. The size of a physical page may vary depending on the type and manufacturer of the flash memory device. Currently, the size is typically on the order of kilobytes (KB) to several hundred kilobytes. A physical page is made up of a fixed number of bytes, which can range from 128 bytes in some small flash memories to several megabytes (MB) or even more in larger ones. The number of physical pages in a flash memory device is typically limited by the size of the device and the capacity of the memory cells within it. Each page in a flash memory device typically has its own unique address, which is used to locate the page within the device. The addresses are usually assigned sequentially, with each page having a unique address that increases as more pages are added to the device. In a typical NAND flash memory design, all flash memory cells belonging to a same physical page share a same word line which is connected to and thereby interconnects the control gates of these memory cells.
The term “data page” as used herein, refers to a smaller unit of data storage within a larger physical page in a multi-level flash memory cell. In other words, a data page is a subset of a page that contains a portion of the total data stored in the physical page. The size of a data page can vary depending on the type and manufacturer of the flash memory device, but it is typically on the order of hundreds of bytes to several kilobytes (KB). The concept of data pages is used in multi-level cell (MLC) flash memories to increase storage density. In MLC flash memories, each physical page can store data in multiple levels, i.e., in multiple data pages, the levels corresponding to the different bit levels of the memory cells in the physical page. For example, in a TLC flash memory device with three levels of data storage (L0-L2), each physical page can store data from any of these three levels. These levels correspond to three bit-levels, which are usually denoted as most significant bit (MSB), center significant bit (CSB), and least significant bit (LSB), respectively, as will be discussed in more detail further below in connection with
The term “destination data page” as used herein, refers to a data page being selected as a destination for a storage of at least a part of the content of an output page associated with this data page. Similarly, the term “destination page” as used herein, refers to a physical page to which the destination data page pertains.
The term “page structure”, as used herein, refers to a defined physical or logical structure according to which a set of data is structured into multiple data segments, which may particularly have a same size. For example, in the case of a physical structure, a page structure may be a physical array of memory cells into which data to be structured is loaded and where either the rows or the columns of the array define the segments, so that all data items of a same row (or column) belong to a same segment. In contrast, data items of different rows (or columns) belong to corresponding different segments. In another example, in the case of a logical structure, a page structure may be defined so that a set of data being organized according to a linear address range is logically divided into segments, each having the same defined page size m (i.e., m bits per page), so that every m bits in the range a new page begins.
The term “logical page”, as used herein, refers to a unit of data storage that is defined by the user or application, rather than being determined by the physical structure of the memory device. Logical pages may be larger than the physical pages of the flash memory and typically range in size from a few kilobytes to several megabytes (MB). A logical page typically consists of a contiguous block of data, which can be located across one or more physical pages, particularly their data pages of a same level, within the flash memory device. The number of physical pages needed to store a logical page can vary depending on the type and manufacturer of the flash memory device, as well as the specific application or user requirements.
The term “logical input page” or “input page” in short, as used herein, refers to a logical page (such as a data segment of a larger set of data) provided as part of the input data to be processed in accordance with the method to prepare the logical page for a later storage to a flash memory, e.g., to a same physical page or more particularly to a same data page thereof.
The term “logical output page” or “output page” in short, as used herein, refers to a logical page (such as a data segment of a larger set of data) which results from carrying out the method and which is to be stored to a flash memory, e.g., to a same physical page or more particularly to a same data page thereof.
The terms “memory reliability optimization goal” or “optimization goal” in short, as used herein, can be used interchangeably and refer to one or more defined conditions, the satisfaction of which is positively correlated to the reliability of the flash memory, particularly to a reduction of its degradation (typically caused by gradual degradation of the memory cells), or an increase in its lifetime. A memory reliability optimization goal may particularly be defined based on an assignment scheme of the flash memory, according to which assignment scheme different logical data representations of the flash memory cells are assigned to respective different threshold voltages of the flash memory cells of the flash memory. Thus, the memory reliability optimization goals may depend on the specific type of flash memory in relation to which the method is performed. The one or more conditions may particularly be defined in a quantifiable manner so that a degree of satisfaction below or above a threshold at which the condition becomes satisfied can be quantified. For example, if, in the case of a single condition, the condition is defined so that it is satisfied if the number of bits of (logical) value “1” (one) in a bit pattern is higher than a threshold value k, then for a given bit pattern, e.g., an input page, its associated degree of satisfaction of the condition may be defined as the mathematical difference between the actual number n of bits of value “1” in the bit pattern and the threshold value k, i.e., as n-k. The term “optimization constraint” is used further below to refer to such threshold values k, which can be used to define the optimization goals (see optimization constraints OTH and ZTH).
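As a minimal illustrative sketch of the quantification just described (assuming, purely for illustration, that pages are represented as lists of bits; the names are hypothetical):

```python
def degree_of_satisfaction(page_bits, k):
    """Return n - k, where n is the number of '1' bits in the page and k is
    the threshold value (optimization constraint). Positive values quantify
    the overachievement of the condition n > k."""
    n = sum(page_bits)  # count of bits with logical value 1
    return n - k
```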
The terms “first”, “second”, “third” and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances, and that the embodiments of the present solution described herein are capable of operation in other sequences than described or illustrated herein.
Unless the context requires otherwise, where the term “comprising” or “including” or a variation thereof, such as “comprises” or “comprise” or “include”, is used in the present description and claims, it does not exclude other elements or steps and is to be construed in an open, inclusive sense, that is, as “including but not limited to”.
Where an indefinite or definite article is used when referring to a singular noun e.g., “a” or “an”, “the”, this includes a plural of that noun unless something else is specifically stated.
Appearances of the phrases “in some embodiments”, “in one embodiment” or “in an embodiment”, if any, in the description are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
By the terms “configured” or “arranged” to perform a particular function, (and respective variations thereof) as they may be used herein, it is to be understood that a relevant device or component is already in a configuration or setting in which it can perform the function, or it is at least adjustable—i.e., configurable—in such a way that it can perform the function after appropriate adjustment. In this context, the configuration can be carried out, for example, through a corresponding setting of parameters of a process sequence or of hardware (HW) or software (SW) or combined HW/SW-switches or the like for activating or deactivating functionalities or settings. In particular, the device may have a plurality of predetermined configurations or operating modes, so that the configuration can be performed through a selection of one of these configurations or operating modes.
Accordingly, the method of the first aspect may be used to restructure, i.e., rearrange and/or transform, the input data, namely its input pages, before their storage to the flash memory in a data-aware manner, such that, based on the respective content of the input pages, an optimized assignment of the input pages to output pages, and hence to the output pages' associated destination data pages, is determined in view of the memory reliability optimization goal associated therewith. Accordingly, based on the restructuring, an improved match between the input pages' respective content and the physical constraints of the data pages, as defined in the respectively associated memory reliability optimization goal, can be achieved, and thus stress for the flash memory decreased. For example, a particular first data page may be less stress-resistant when loaded with a bit pattern having a majority of bits of bit value “1” than when loaded with a bit pattern having a majority of bits of bit value “0” (also the location of the “1” or “0” values within the bit pattern may matter in this regard), while for a second data page the opposite is true.
Then, according to the method, a given input page with a majority of bits of bit value “0” can be advantageously assigned to an output page associated with the second data page rather than the first data page. In this way, the stress level for the flash memory cells can be mitigated and the lifetime of the flash memory can be extended.
Typically, due to the physical design of flash memory cells, those bit representations which according to the associated assignment scheme of the flash memory require higher charges for their storage, and thus longer writing times for their writing and higher threshold voltages for their reading from the flash memory, are more prone to causing high stress, especially on the tunnel oxides of the affected memory cells, than bit representations that require respective lower voltages. Accordingly, the method may be considered as an optimization method that selects for given input pages (via assignment to output pages) suitable associated data pages for storing them, wherein the selection is based on the content of the input pages, the technology of the flash memory (particularly its assignment scheme), and the data page type (i.e., its level). The method tries to restructure the input data in such a manner that charging states of the flash cells belonging to the low-voltage side of the Vth distribution, i.e., lower Vth states, are used for the storage as much as possible, while higher Vth states are avoided as much as possible.
Accordingly, the method may particularly serve to suppress the ever-increasing error rates and improve the longevity of multi-level NAND flash memory, such as TLC 3D NAND flash memory, without limitations.
In the following, preferred embodiments of the method of the first aspect are described, which can be arbitrarily combined with each other or with other aspects of the present solution, unless such combination is explicitly excluded or technically impossible.
In some embodiments, the selection of the respective input page for assignment to the respective output page may involve picking a particular input page and searching, among the destination data pages to which no input page has been assigned yet, a suitable one, or, vice versa in other embodiments, picking a particular available destination data page and searching among the yet-unassigned input pages a suitable one, so that the memory reliability optimization goal is satisfied.
In some embodiments, for each output page, selecting the respective corresponding input page comprises determining, for each input page, a respective associated indicator that characterizes the input page. Furthermore, selecting the input page for assignment to the output page comprises: (i) comparing the associated respective indicators of multiple input pages with the memory reliability optimization goal related to the level of the one or more destination data pages being associated with the output page, and (ii) determining one input page among the multiple input pages, which itself or the reversibly transformed version thereof satisfies the memory reliability optimization goal, as the selected input page. In this way, the matching of input pages and suitable output pages can be performed based on a single indicator and thus with high efficiency. The indicator is preferably a number and the memory reliability optimization goal is defined as a mathematical condition so that the comparison can be easily performed as a mathematical calculation. This can further increase the achievable efficiency.
The associated indicator of an input page may particularly be a function of the number of bits within the input page with the same predetermined bit value, e.g., “1”. The function may particularly be the identity function, i.e., the indicator may particularly be equal to the number of bits having said same predetermined bit value. Thus, since counting the occurrences of a particular bit value within a given bit pattern is an operation which typically involves low efforts, the achievable efficiency of determining the indicator(s) can be further increased.
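Counting the occurrences of a fixed bit value is indeed a cheap operation; assuming, for illustration only, that pages are handled as byte strings, such a per-page indicator might be computed as follows (the function name is hypothetical):

```python
def ones_indicator(page: bytes) -> int:
    """Indicator of an input page: the total number of bits with value '1'
    (i.e., the population count over all bytes of the page)."""
    return sum(bin(b).count("1") for b in page)
```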
In some embodiments, the input data is randomized, and before the restructuring, the input pages are defined based on the randomized input data. Randomizing the data is particularly helpful for mitigating cell-to-cell (C2C) interference and hence helps to improve the reliability of the flash memory.
In some embodiments, the respective memory reliability optimization goal associated with a given destination data page is selected as a function of the level of the destination data page as one of the following: (i) maximizing the number of “0” bit values to be stored to the destination data page; (ii) maximizing the number of “1” bit values to be stored to the destination data page; (iii) optimizing a balance between “0” bit values and “1” bit values to be stored to the destination data page, i.e., achieving or at least approximating, as much as possible, an equal number of occurrence of both bit values.
In some embodiments, the flash memory is a 3D NAND flash memory in which the physical pages are distributed over a plurality of stacked layers of the flash memory. In this case, the memory reliability optimization goal applying to a destination data page may be defined as a function of both the level of the destination data page and the layer to which the physical page having the destination data page pertains. In order to further optimize the restructuring of the input data, this definition takes into account that typically the physical properties of the memory cells of a 3D NAND flash memory vary from layer to layer. Specifically, in many 3D NAND flash technologies, memory cells located closer to one end of the stack have slightly different dimensions, e.g., in terms of the thickness of their tunnel oxide, than memory cells located closer to the opposing end of the stack. Accordingly, the stress resistance of the memory cells, e.g., of their tunnel oxides, may vary accordingly. The same can and will usually apply to their associated Error Bit Counts (EBC) which typically positively correlate with the stress resistance of the memory cells.
Accordingly, similarly as described above more generally, in the specific case of a 3D NAND flash memory, the respective memory reliability optimization goal associated with a given destination data page is selected as a function of both the level of the destination data page and the layer to which the physical page having the destination data page pertains as one of the following: (i) maximizing the number of “0” bit values to be stored to the destination data page; (ii) maximizing the number of “1” bit values to be stored to the destination data page; (iii) optimizing a balance between “0” bit values and “1” bit values to be stored to the destination data page, i.e., achieving or at least approximating, as much as possible, an equal number of occurrence of both bit values.
In both above-mentioned cases, i.e., in said more general case (which is particularly also applicable to 2D NAND flash memory) and in the 3D NAND flash memory case, one or more of the following definitions can be applied: (i) maximizing the number of “0” bit values is defined as satisfying the condition that the number of bits in an input page or a reversibly transformed version thereof which have a bit value of one is less than a defined upper threshold; (ii) maximizing the number of “1” bit values is defined as satisfying the condition that the number of bits in an input page or a reversibly transformed version thereof which have a bit value of one is greater than a defined lower threshold; and/or (iii) optimizing a balance between “0” bit values and “1” bit values is defined as satisfying the condition that the number of bits in an input page or a reversibly transformed version thereof which have a bit value of one is greater than a defined lower threshold and less than a defined upper threshold. These definitions have the advantage that they do not require an absolute maximum of the number of “0” bit values or of the number of “1” bit values or an exact balancing thereof, but rather more relaxed requirements which may be satisfied more easily and with less effort (e.g., with fewer iterations for finding a good match between input pages and output pages) and thus more efficiently.
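For illustration only, the three relaxed threshold forms above can be sketched as a single predicate over the number of “1” bits in a page (or its reversibly transformed version). The parameter names `zth` and `oth` are meant to echo the optimization constraints ZTH and OTH mentioned in the text; their exact mapping to lower/upper thresholds is an assumption of this sketch:

```python
def goal_satisfied(ones, goal, zth=None, oth=None):
    """Check the relaxed threshold form of an optimization goal, given the
    number of '1' bits ('ones') in a page or its transformed version."""
    if goal == "max_zeros":   # (i) number of ones below the upper threshold
        return ones < oth
    if goal == "max_ones":    # (ii) number of ones above the lower threshold
        return ones > zth
    if goal == "balance":     # (iii) number of ones between both thresholds
        return zth < ones < oth
    raise ValueError("unknown goal: " + goal)
```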
In some embodiments, when multiple input pages or their reversibly transformed versions, respectively, satisfy the memory reliability optimization goal associated with the one or more destination data pages being associated with a given output page, an input page among these multiple input pages is selected for assignment to the output page for which a degree that quantifies the satisfaction of the memory reliability optimization goal, such as a respective size of an overachievement, if any, is maximized among the multiple input pages. Thus, a selection of a suitable input page can be performed, which selection is both simple (and thus efficient) and, at the same time, highly effective (in view of the desired high memory reliability).
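A sketch of this tie-breaking rule, assuming a hypothetical `degree` function that returns the (signed) degree of satisfaction of a page, with non-negative values indicating satisfaction:

```python
def select_best_page(candidates, degree):
    """Among the candidate input pages that satisfy the goal
    (degree >= 0), pick the one whose degree of satisfaction
    (e.g., size of the overachievement) is maximal."""
    satisfying = [p for p in candidates if degree(p) >= 0]
    return max(satisfying, key=degree) if satisfying else None
```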
In some embodiments, selecting the respective input page for assignment to the respective output page further comprises (i) selecting one input page among the yet-unassigned input pages and transforming it according to a reversible transformation scheme being determined as a function of this input page, to obtain said reversibly transformed version of this input page; and (ii) saving transformation information based on which the transformation can be reversed in a retrievable manner for a future recovery of the untransformed input page from its transformed version. The transformation scheme is defined based on the yet-untransformed input page and the memory reliability optimization goal associated with the output page's associated one or more destination data pages so as to reversibly replace, in the input page, one or more data representations being associated with respective first threshold voltages of the memory flash by a respective number of data representations being associated with respective second threshold voltages of the memory flash being lower than the corresponding first threshold voltages to obtain the transformed version of the input page.
Accordingly, these embodiments specify a way in which the restructuring of the input data can be extended beyond rearranging it in accordance with the above-described way of defining the assignment of output pages to suitable input pages, to also comprise an optimization of the content of one or more of the input pages by way of the transformation(s), so as to avoid using high Vth states as much as possible. The saving of the transformation information, which is required to reverse the related transformation(s), helps to maintain the entropy of the stored data when reading it back and allows the flash memory to function as intended, i.e., to recover the input data from the data stored in the flash memory.
Particularly, the transformation may be performed, when (i.e., if or even only if) in the course of the restructuring, a determination is made that for a given output page no yet-unassigned input page can be found that satisfies the memory reliability optimization goal associated with the output page's associated one or more destination data pages. Accordingly, in some of these embodiments, the transformation is only used when a restructuring involving a mere rearrangement of the input data is found insufficient to satisfy the memory reliability optimization goal, while otherwise a mere rearrangement is performed. Thus, the use of the transformation, which typically involves more effort than a mere rearrangement, can be effectively limited to such situations where it is needed to satisfy the memory reliability optimization goal. In this way, the method can be made more effective than without the possible transformation while maintaining a high level of efficiency.
In some embodiments, the one input page is selected among the yet-unassigned input pages such that, pre-transformation, it minimizes among these input pages a degree that quantifies the size of the remaining gap of the respective input page towards satisfying the memory reliability optimization goal associated with the output page's associated one or more destination data pages. Accordingly, this particular selection of the one input page requires only a minimal modification of its content (i.e., minimal when compared to the required modifications for the other considered input pages) to satisfy the memory reliability optimization goal. This is particularly advantageous in view of the objective of safely avoiding any over-optimization that might adversely impact the reliability of the memory.
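For illustration only, this gap-minimizing selection may be sketched as follows in Python, modeling a page as a list of bits and the memory reliability optimization goal as a required fraction of zero bits; all function names and the goal representation are illustrative assumptions and not part of the claimed solution:

```python
def gap_to_goal(page, required_zeros_fraction):
    """Remaining gap of a page towards a zeros-fraction goal (0 if satisfied)."""
    zeros_fraction = page.count(0) / len(page)
    return max(0.0, required_zeros_fraction - zeros_fraction)

def select_candidate(unassigned_pages, required_zeros_fraction):
    """Pick the yet-unassigned input page that minimizes the remaining gap,
    i.e., the page needing the least modification to satisfy the goal."""
    return min(unassigned_pages,
               key=lambda page: gap_to_goal(page, required_zeros_fraction))
```

In this sketch, a page already satisfying the goal has a gap of zero and is therefore preferred over any page that would still require modification.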
In some embodiments, transforming the one input page comprises inverting, in this input page, at least a subset of the data representations associated with respective first threshold voltages. This approach is highly efficient because inverting (i.e., bit flipping) is a simple and very efficient bit operation. Furthermore, the entropy of the input data can be effectively protected.
Said inverting of at least a subset of the data representations being associated with respective first threshold voltages may comprise applying a masking operation to each of the to-be-inverted data representations individually or collectively. The masking operation may comprise mathematically combining the input page with at least one mask having a defined bit pattern so that the masking operation results in an inversion of said to-be-inverted data representations. The combining may particularly use a Boolean XOR operation and one or more masks having solely bit values of “1”. Using a plurality of different masks, a corresponding plurality of different transformation results can be obtained from the pre-transformation input page, and a suitable result among this plurality of results may then be selected as the transformed version of the input page. The other results can be discarded.
Specifically, for the masking operations, the input page may be segmented into multiple frames of a same frame size in terms of bits, wherein the sizes of the masks in terms of bits are equal to or a multiple of the frame size, while different masks may have same or different sizes. Using this frame-based approach enables a particularly simple implementation of the masking operations and particularly supports the objective of effectively minimizing the needed modifications of the input page to achieve the satisfaction of the applicable memory reliability optimization goal.
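A non-limiting Python sketch of such a frame-based masking operation is given below; it inverts selected frames of a page via XOR with all-ones masks. The frame indices to invert would in practice be chosen by the optimization; here they are simply passed as a parameter, and all names are illustrative:

```python
def invert_frames(page_bits, frame_size, frames_to_invert):
    """Invert the given frames of a page by XOR with an all-ones mask.

    Because XOR inversion is an involution, applying this function again
    with the same frame indices recovers the original page, so only the
    frame indices (the inverting vector) need to be saved as
    transformation information."""
    out = list(page_bits)
    for frame in frames_to_invert:
        start = frame * frame_size
        for i in range(start, start + frame_size):
            out[i] ^= 1  # XOR with bit value "1" flips the bit
    return out
```

The involution property is what keeps the tracking overhead negligible: no copy of the original page needs to be stored, only the inverting vector.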
As already indicated above, the masking operation may be performed multiple times per input page with different masks. Among the multiple results obtained thereby, that result (i.e., transformed page) may be selected as the transformed version of the input page which maximizes, among the multiple results, an achievement degree that quantifies the respective size of an overachievement of the memory reliability optimization goal associated with the output page's associated one or more destination data pages.
When multiple results share the maximum achievement degree, that result among these results may be selected as the transformed version of the input page, which deviates the least from its related untransformed input page in terms of the number of bits inverted under the related transformation.
Specifically, the selection of the one result may be performed using a greedy algorithm. For example, the series of masking operations with the different masks may be defined such that the size of the masks increases over time. Using a greedy algorithm, the series may be terminated as soon as the greedy algorithm finds a result which, although satisfying the memory reliability optimization goal, has a lower achievement degree than the previous result. The result among all results achieved so far which has the best achievement degree may then be chosen as the one result to be selected as the transformed version of the input page.
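For illustration, such a greedy mask search may be sketched in Python as follows, where the achievement degree is modeled as the overachievement of a required zeros fraction, the mask list is assumed to be ordered by increasing mask size, and all names are illustrative assumptions:

```python
def achievement(page, goal):
    """Overachievement of a zeros-fraction goal; negative if not satisfied."""
    return page.count(0) / len(page) - goal

def greedy_mask_search(page, masks, goal):
    """Try masks in order (assumed increasing size); keep the best satisfying
    result; terminate once a satisfying result scores below its predecessor."""
    best = None            # (score, -flipped_bits, result)
    prev_score = None
    for mask in masks:
        result = [b ^ m for b, m in zip(page, mask)]
        score = achievement(result, goal)
        if score >= 0:  # result satisfies the optimization goal
            cand = (score, -sum(mask), result)
            if best is None or cand[:2] > best[:2]:
                best = cand
            if prev_score is not None and score < prev_score:
                break   # greedy termination criterion
            prev_score = score
    return None if best is None else best[2]
```

The search keeps the best satisfying result found so far, breaking ties in favor of the fewest inverted bits, and returns None if no mask in the series reaches the goal.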
In some embodiments, the input data is provided or stored in a cache memory other than the multi-level flash memory, e.g., in an SLC flash memory or an SRAM memory, and the restructuring of the input data is performed in relation to the input data stored in the cache memory to obtain the output data. In this way, the advantages of caching, such as buffering an incoming data stream providing the input data, and the restructuring according to the method may be effectively combined, and the multi-level flash memory itself does not have to be used for the restructuring process. This further helps to avoid unnecessary damage to the multi-level flash memory and thus conserve its lifetime.
In some embodiments, the respective memory reliability optimization goals associated with the one or more destination data pages are each defined as a function of a calibration parameter which defines a common reliability margin of the memory reliability optimization goals. The memory reliability optimization goals may in addition be defined as a function of the destination data pages to which they pertain, e.g., as a function of the layer in a 3D NAND flash in which the data pages are located. While the calibration parameter serves to provide the possibility to fine-tune or otherwise calibrate the memory reliability optimization goals to define an applicable reliability margin, the dependence of the memory reliability optimization goals on the destination data pages may be used to take into account the physical differences of the physical pages to which the destination data pages pertain, so as to achieve an optimal definition of the memory reliability optimization goals for optimizing the reliability of the flash memory.
A second aspect of the present solution is directed to a data processing apparatus, such as a flash memory controller device, configured to carry out the method of the first aspect.
In some embodiments, the data processing apparatus comprises one or more processors having access to a program memory in which one or more programs are stored which, when executed on the one or more processors, cause the data processing apparatus to carry out the method of the first aspect.
Finally, a third aspect of the present solution is directed to a computer program or a computer program product, comprising instructions to cause the data processing apparatus to carry out the method according to the first aspect of the solution. Specifically, the instructions may be defined so that when they are executed on one or more processors of the data processing apparatus, they cause the data processing apparatus to carry out the method according to the first aspect of the solution.
The data processing apparatus of the second aspect may accordingly have a program memory in which the computer program is stored. Alternatively, the data processing apparatus may also be set up to access a computer program available externally, for example on one or more servers or other data processing units, via a communication link, in particular to exchange with it data used in the course of the execution of the computer program or representing outputs of the computer program.
The features and advantages explained with respect to the first aspect of the solution apply accordingly to the further aspects of the solution.
Depending on the number k of different bit patterns that can be stored and distinguished per cell C, k-1 threshold voltages VTH are defined, which are needed to enable a distinction of different charge levels of the flash memory cells which in turn correspond to a stored bit value (for SLC) or to a stored bit pattern (for multi-level flash), respectively, particularly for read-out of stored data. In the example of
Typically, a lower number of levels of the flash memory is more advantageous in terms of high reliability and high performance, while a higher number of levels is more advantageous in terms of high density and low manufacturing costs.
A thin layer of oxide, which is called tunneling oxide, is deposited on top of the substrate to act as a tunnel barrier. This layer helps to control the flow of electrons through the flash memory cells. The next one or more layers up is where the flash memory cells are located. These cells are made up of a series of transistors and capacitors that are interconnected to store data. Each cell can hold multiple bits of data (MLC), and they are arranged in a grid-like pattern within each cell-bearing layer.
Above the flash memory cells is an inspection gate, which acts as a barrier to prevent electrons from escaping during the read process. This gate also helps to improve the read speed and accuracy of the flash memory. A selective gate is another layer that separates the flash memory cells from the control circuitry. It allows a control circuitry to selectively turn on or off the cells to access the stored data. The control circuitry is located at the top of the 3D NAND flash structure. This includes the gate drivers, sense amplifiers, and other components that are necessary for reading and writing data to the flash memory cells.
Connecting the various layers within the 3D NAND flash is a complex network of interconnects. These interconnects allow the control circuitry to communicate with the flash memory cells and read or write data as needed. Multiple insulating layers are used throughout the structure to isolate the different layers and prevent electrical shorts between them. Electrodes are located at the top and bottom of the 3D NAND flash structure, providing a pathway for the electrons to flow through the memory cells.
The exemplary block B (
By stacking these layers on top of each other, the storage density within the 3D NAND flash can be increased significantly compared to traditional 2D flash memories. This allows for more data to be stored in a smaller physical area, making it an attractive option for use in high-density storage devices such as smartphones and laptops.
As already discussed above, NAND flash is organized into blocks and pages. MLC-pages are grouped into a plurality of data pages (one per level) that share the same physical cells. 3D NAND flash memory has its physical pages stacked on multiple layers. The physical pages are organized from bottom to top layer, see
Write-Erase Cycle: On the one hand, writing to a flash memory cell works by storing the representative amounts of charge of the corresponding data bit(s), as shown in
Caching: Storage devices (e.g., a Solid State Drive (SSD)) typically store data upon arrival. However, writing to the flash is an atomic operation, where all the data should be copied to the flash memory device before programming the corresponding pages. For instance, in 3D TLC NAND flash, the three data pages (i.e., LSP, CSP, and MSP) must be available and transferred to the flash device before programming. However, a caching mechanism is compulsory if insufficient data is available or direct writing is not granted. It even helps to improve the writing throughput in MLC flashes, as writing to cache is far faster than writing to MLC flash. Thereafter, data migration is vital, copying the cached data into flash pages. Note that this process is totally hidden from the host and usually does not impact the writing throughput. Moreover, the host (e.g., a computer using the flash memory for data storage) does not know the physical location of the data, which is fully managed by the memory device itself.
Interference: In NAND flashes, adjacent cells influence each other, which alters the stored logical values. As previously mentioned, the stored amount of charge represents the logical value (data) stored in a flash cell. Due to the coupling capacitance between flash cells, the VTH of a flash cell can be changed while its neighboring cells are being programmed, as charge leaks from one cell to another. Cell-to-cell (C2C) interference is a phenomenon where programming a flash cell unintentionally changes the threshold voltage of neighboring victim cell(s). Hence, the logical value of the victim cell becomes incorrect, resulting in corrupted data when reading data back.
Even after programming cells correctly, stored charge can still leak from one cell to another over time, changing the stored logical value of the victim cells, which impacts the data retention of the flash. Data retention defines the time that cells can maintain enough charge in order to read data correctly. Notably, the neighbor-cell state can affect the victim cell. A high similarity of charges (states) within the neighborhood is very destructive, provoking higher interference. On the other hand, the higher the neighbor-cell charge, the lower the threshold voltage shift. Therefore, the randomness of data counteracts the impacts mentioned above, mitigating programming interference and improving data retention. Accordingly, using a randomizer within the memory device generally helps to avoid the worst-case neighboring charges.
Correction: NAND flash loses charge over time, making it inherently prone to errors. Therefore, stored binary data can be incorrectly interpreted due to bit flips. Hence, the stored data becomes corrupted when read. Flash memory devices employ sophisticated mechanisms to detect and correct errors, such as Error Correction Codes (ECC). ECC is commonly used in NAND flash devices to detect and correct bit errors. However, ECC has a limited correction capability per data unit.
The impact of data on NAND flash's reliability can be very significant. As previously mentioned, with every W/E cycle, the internal flash cell degradation is gradually increased. The degradation level is directly proportional to the stress level and stress time (i.e., voltage level and time) applied to the cell. Hence, stress is determined by the applied voltage for writing (programming) and erasing as well as writing time. The present solution considers only degradation caused by writing, which can be—to some degree—mitigated if data is organized and tailored in a reliability-aware manner, as will be explained in more detail further below.
Each data representation has a unique amount of charge that should be stored within the cell. The larger the stored charge (i.e., VTH), the higher the stress (i.e., higher voltage and longer time, see
NAND flash vendors typically urge randomizing the stored data in every page. In turn, this helps to mitigate the C2C interference and, hence, improves reliability. Therefore, NAND flash's reliability is again data-dependent, which is also impacted based on the cell and neighborhood contents.
In the course of developing the present solution, various synthetic data patterns (around 20 data sets) for 3D TLC NAND flash were generated to characterize the reliability-data dependency. The data patterns were synthetically tailored to cover all possible use cases under the awareness and unawareness of the flash technology (e.g., 3D structure and W/E degradation (
(DP1): Data is tailored to cover the order of all unique values of TLC (i.e., three bits of eight states to form three bytes (3×8) for LSB, CSB, and MSB, based on
(DP2): Data is randomly generated for the whole block considering the 3D structure of the NAND flash (i.e., layer number is the seed for the random generator).
(DP3): Data is randomly generated for the whole block considering the 3D structure of the NAND flash. Data is biased only for LSP to contain more binary zero values.
(DP4): Data is randomly generated for the whole block considering the 3D structure of the NAND flash. Data is biased only for LSP to contain more binary one values.
(DP5): Data is randomly generated for the whole block considering the 3D structure of the NAND flash. Data is biased only for MSP to contain more binary zero values.
(DP6): Data is randomly generated for the whole block considering the 3D structure of the NAND flash. Data is biased only for MSP to contain more binary one values.
To quantify reliability degradation caused by each DP combined with W/E, the reliability of the test flash memories described above was examined. Each DP was used to stress a block for W/E=1000 cycles. Afterwards, the reliability of the block was examined ten minutes after programming. The reliability metric in this scope is the Error Bit Count (EBC), which refers to the number of bit flips in a page without using ECC for corrections. Note, in this section, results show solely EBC under W/E to isolate other reliability threats. Further below, reliability under W/E, data retention, and layer-to-layer variations will also be examined, as they have the most significant reliability threats in recent 3D flashes.
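For illustration only, the EBC metric may be computed as in the following Python sketch, which simply counts the bit positions at which a read-back page differs from the originally written page, without any ECC involvement; the names are illustrative:

```python
def error_bit_count(written, read_back):
    """Error Bit Count (EBC): number of bit flips between the written
    page and the page read back, without any ECC correction."""
    assert len(written) == len(read_back)
    return sum(w ^ r for w, r in zip(written, read_back))
```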
Results show that different DPs show different EBC, which is directly related to the internal degradation with every W/E cycle and slight C2C interference, as no data retention is used. For instance, even though the data seems to be bit-wise unique within the DP1 sequence, DP1's EBC is very high (around 600× that of DP2) because the neighborhood has an akin level of charges, violating the randomness constraints, which significantly aggravates the program disturbance as well as C2C interference. This highlights the importance of the randomizer as in DP2. DP2 uses a state-of-the-art randomizer based on vendors' recommendations. However, DP2 predominantly shows a higher maximum EBC compared to the biased DPs. This is expected, as DP3-DP6 are designed to suppress the degradation with a predominantly significant reduction in EBC.
Importantly, there is no single optimal DP for the whole range of pages and different technologies. DP3 and DP6, for instance, alternatively reduce EBC over pages on technology
In summary, the results of the above-described characterization suggest that flash contents can be optimized in view of the above-defined objective of increasing flash reliability and endurance by maximizing the density of the left-side states (in
The method of the present solution, which due to its dependency on the data to be stored may also be designated as a “data-aware flash reliability optimization technique”, will now be described in more detail based on exemplary embodiments.
The technique is based on optimizing data before writing it to flash memory. As the host (which may provide the data to be stored and/or request a read-out of already stored data) does not know the actual physical location of the stored data in the flash memory, the memory device is privileged to organize and modify data based on the management technique it employs. The proposed technique seeks to be generic (i.e., applicable to any NAND flash technology), which is a preferable further objective in view of the variety of flash technologies that have different representations of the stored charge, and the fact that the data representation may differ from one flash vendor to another. Therefore, based on the flash technology's characteristics, the technique may be used as a post-optimization of input data, such as a randomizer's output.
Notably, the technique may particularly be used in connection with input data that is already randomized and cached, or will be treated so upon arrival (e.g., at the memory controller performing the technique). Specifically, the data may be collected into an SRAM cache to complete three pages' size (i.e., as in TLC) and then be randomized and written into the flash memory. In many cases, SLC cache pages are used instead of expensive SRAM. Therefore, in these cases, data is already randomized when read back from the SLC cache. Note that the technique works with data migration, typically done when enough pages are cached. However, the technique may also be employed in cache-less memory devices that use direct writing, by applying it before writing data into the flash, which could slightly impact the writing throughput. The data-aware flash reliability optimization technique aims at suppressing the internal degradation caused by every W/E cycle, reducing EBC, and/or prolonging the memory device's lifetime.
As the characterization described above shows, each data page type should receive a unique data pattern to enable data-aware flash reliability. For instance, an LSP requires more content of logical zeros than ones. The technique helps to increase the probability of desired pattern occurrences while sustaining the randomness of the data. This may particularly support fast runtime decisions and may require fewer resources.
A page's contents, particularly an input page's contents, may be quantified by a suitable associated indicator, such as the sum of occurrences of logical ones and/or zeros (i.e., bits of value “1” or “0”, respectively) in the (input) page. Equations (1) and (2) provided below are examples of this approach:
wherein m is the number of bits per page, Pagezeros and Pageones are indicators defined as the sum of occurrences of logical zeros and ones, respectively, in the page, and “Ones” and “Zeros” are threshold constants (e.g., 60%). Accordingly, a page satisfying the condition expressed in Eq. (1) has more occurrences of “0” bit values in the page than the threshold constant Zeros, and similarly, a page satisfying the condition expressed in Eq. (2) has more occurrences of “1” bit values in the page than the threshold constant Ones.
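A non-limiting Python sketch of these indicators and threshold checks follows; the 60% default mirrors the example threshold constant above, and all names are illustrative assumptions:

```python
def page_indicators(page_bits):
    """Return (Pagezeros, Pageones): occurrence counts of "0" and "1" bits."""
    m = len(page_bits)
    page_zeros = page_bits.count(0)
    return page_zeros, m - page_zeros

def satisfies_zeros_condition(page_bits, zeros_threshold=0.60):
    """Condition of Eq. (1): more "0" occurrences than the threshold Zeros."""
    page_zeros, _ = page_indicators(page_bits)
    return page_zeros / len(page_bits) > zeros_threshold

def satisfies_ones_condition(page_bits, ones_threshold=0.60):
    """Condition of Eq. (2): more "1" occurrences than the threshold Ones."""
    _, page_ones = page_indicators(page_bits)
    return page_ones / len(page_bits) > ones_threshold
```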
Randomized data should not be modified arbitrarily, as it must be de-randomized when reading it back; otherwise, all changes would need to be tracked. Therefore, the technique considers minimal changes by only inverting portions of the input page (i.e., flipping logical bits 0→1 and 1→0), which can easily be tracked at a negligible overhead, as explained later. The technique may be summarized as an optimization algorithm that selects pages' locations based on their contents, the flash technology, and the page types, as well as (optionally) optimizes the page contents. The technique tries to restructure the input data so as to keep the threshold voltages VTH of the flash cells at the left side of the VTH distribution, i.e., at lower VTH states, as much as possible, while sustaining the randomness of the data.
The optimization goals for the exemplary embodiment of the technique can be modeled as defined in the following equations:
wherein Page is the selected page, l is the page's layer number (e.g., l=1 for layer L1) for 3D flash, index i represents a page type (e.g., LSP, or CSP), m is the number of bits in a page, Zth and Oth are optimization constraints as defined in Eq. (4) and Eq. (5) provided below. The page with page type index i is selected, e.g., greedily selected, if it satisfies the optimization goal (memory reliability optimization goal) based on the available meta-data.
The optimization itself may particularly be implemented as a multi-objective optimization process that tries to greedily maximize the profit with a minimal number of modifications, if any, to the input data. Thus, the randomness of the data is sustained with minimal read performance overhead. Note that any generic multi-objective algorithm can be used for the technique under the given optimization goals. The (memory reliability) optimization goals (for a logical page P[i]) can particularly be defined in accordance with Eq. (3) in an abbreviated form as follows:
To consider layer process variations, where optimization levels might be differently required and to make the technique more generic, the model categorizes pages' contents into three levels/regions based on defined constraints, as shown in
The optimization constraints Zth and Oth may be defined as in Eq. (4) and Eq. (5) below for the number of zeros and ones, respectively, considering variations across the layers.
Where ϵz is an experimental “zeros” threshold array, ϵ0 is an experimental “ones” threshold array, α is a calibration constant (which can be used to define a layer-independent variation var of the constraints in
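As the exact forms of Eq. (4) and Eq. (5) are given in the referenced equations rather than reproduced here, the following Python sketch merely assumes a simple additive form Zth(l) = ϵz[l] + α (and analogously for Oth); the per-layer threshold arrays, the value of α, and all names are purely illustrative assumptions:

```python
# Illustrative per-layer experimental threshold arrays and calibration constant.
EPS_Z = {1: 0.58, 2: 0.55, 3: 0.58}  # "zeros" thresholds per layer (assumed)
EPS_O = {1: 0.58, 2: 0.55, 3: 0.58}  # "ones" thresholds per layer (assumed)
ALPHA = 0.02                         # calibration constant (reliability margin)

def z_th(layer):
    """Assumed additive form of Eq. (4): layer-dependent 'zeros' constraint."""
    return EPS_Z[layer] + ALPHA

def o_th(layer):
    """Assumed additive form of Eq. (5): layer-dependent 'ones' constraint."""
    return EPS_O[layer] + ALPHA

def satisfies_goal(page_bits, page_type, layer):
    """Illustrative Eq. (3)-style check: LSP-type pages need a zeros
    fraction above Z_th; other page types a ones fraction above O_th."""
    m = len(page_bits)
    if page_type == "LSP":
        return page_bits.count(0) / m > z_th(layer)
    return page_bits.count(1) / m > o_th(layer)
```

The calibration constant α shifts all layer thresholds by a common reliability margin, mirroring the calibration parameter described further above.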
This is done by inverting portions of the page's contents (see Eq. (6)). Therefore, the exemplary optimization (i.e., data restructuring) algorithm (“algorithm 1”) discussed in more detail below splits the input page into various frames, evaluates all optimization scenarios (i.e., invert frames) as shown in
The method of the exemplary technique is summarized in Algorithm 1 provided in
In order to consider the underlying flash technology, the algorithm then reads the meta-data MD of the flash technology (i.e., VTH distributions, page size m, page types, data coding, etc.), and then it defines the optimization goals G, e.g., per (data) page type (Line 3), e.g., according to Eq. (3).
Then, the algorithm applies a restructuring of the input pages by performing a data-aware flash reliability optimization process involving selecting the pages' contents to satisfy the optimization constraints (see Eq. (3)) to suppress the damage caused by the stored data (Line 5 to Line 17). The optimization works individually for each page type Ptype (Line 5), and then finds the layer l of the current physical destination page DP (Line 6) to select the optimization goal G as a function of this layer l.
Then, the algorithm examines all available solutions (input pages) that satisfy optimization goals and greedily selects the optimal one that maximizes profit (Line 7 to Line 9). The algorithm also works to optimize available pages directly for the desired locations to match the write order constraint on flash pages in a block (i.e., between LSB/CSB/MSB pages and between layers) by only selecting the available three pages in the TLC case and optimizing them considering page type and layer. In case of insufficient data, a memory controller can generate random data, which must also be optimized.
Note that data pages should preferably be randomized to minimize the dependence of the error rate on data values. Good randomization aims to generate an equal number of randomly distributed zeros and ones to have a similar degree of degradation over W/E, leaving little room for optimization. Therefore, the algorithm optimizes the input page contents to change the number of zeros and ones if no solution is found.
The algorithm thus continues by looking for pages satisfying the required optimization goals G (Line 7). This is done by examining number of ones or zeros following the page type and layer number.
(Only) if no solution exists, then the algorithm selects the best candidate page for optimization as the one with the highest probability of satisfying the optimization constraints quickly and not being a solution candidate (Line 11). It is assumed that the best candidate page is the one which has a minimal number of ones or zeros, respectively, depending on the optimization goal G.
Then, the algorithm examines all predefined inverting masks M in Im to greedily find the one mask M whose application satisfies the goal G (Line 13).
The final masking can be formed of a combination of multiple frames F with corresponding masks M, each mask M being represented as an inverting binary vector indicating the corresponding frame's inversion. Note, the smaller the mask size, the better the solution and, hence, the reliability, but at the cost of performance and an increasing overhead of the management data. Importantly, over-optimization might result in inadequate reliability. Over-optimization is optimizing data beyond a limit where high neighborhood similarity is observed, which largely eliminates the randomizer's work. Therefore, the algorithm applies a conservative selection by applying minimal modifications (i.e., minimum masking) to satisfy the optimization goal G.
Afterwards, the algorithm updates the result table T with all needed management data (Line 15), e.g., page contents, inverting vector, etc. The algorithm optimizes all cached pages, assuming enough pages are available. Otherwise, the remaining data waits for the next iteration, or random data can be written. Importantly, the technique is generic and works on any available data for optimization. At the same time, the technique is a multi-objective optimization technique.
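To tie the steps above together, the following non-limiting Python sketch mirrors the overall flow of Algorithm 1 in a much simplified form: for each destination data page, it first looks for a yet-unassigned input page satisfying the goal (mere rearrangement) and, only if none exists, transforms the closest candidate with the first sufficient mask, recording the mask in the result table T for later reversal. Goals are modeled as required zeros fractions; all names are illustrative assumptions, not the claimed implementation:

```python
def restructure(input_pages, destinations, masks, goal_for):
    """Simplified sketch of the restructuring loop of Algorithm 1."""
    def zeros_frac(page):
        return page.count(0) / len(page)

    table = []                            # result table T
    pool = [list(p) for p in input_pages]  # yet-unassigned input pages
    for dest in destinations:
        goal = goal_for(dest)             # goal per page type / layer
        fits = [p for p in pool if zeros_frac(p) >= goal]
        if fits:                          # mere rearrangement suffices
            page = max(fits, key=zeros_frac)
            out, mask = page, None
        else:                             # transform the closest candidate
            page = max(pool, key=zeros_frac)
            mask = next((m for m in masks
                         if zeros_frac([b ^ x for b, x in zip(page, m)]) >= goal),
                        None)
            out = page if mask is None else [b ^ x for b, x in zip(page, mask)]
        pool.remove(page)
        table.append((dest, out, mask))   # mask saved for future reversal
    return table
```

Storing the mask (or None) per destination page corresponds to updating the result table T with the inverting vector needed to recover the original input data on read-back.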
A data generator, such as a host computer, generates and provides input data to be stored in the flash memory (Flash). In the course of the optimization process, the input data is first randomized and then further processed by the data-aware restructuring (optimization) method of the present solution, which receives flash technology data specifying details of the technology of the flash memory and other relevant information for the restructuring process (cf.
The data-aware restructuring (optimization) process yields for each input page in the input data a corresponding output page in the output data of the process as a result of the optimization. The output data is sent to the flash memory for storage therein (Write data).
A set of blocks using both DU and DA scenarios was tested. The first and last 12 layers' pages for DA were optimized (restructured) by increasing the optimization constraints while using lighter constraints for internal layers. Furthermore, the data retention time was extended to highlight further the EBC results of the top layers.
Because of such significant EBC, the endurance and retention of the NAND memory device can be severely limited by the top and bottom layers. At the same time, many pages are still in good condition.
However,
Memory System with Data Processing Apparatus
The memory controller 2 is configured as a data processing device being adapted to perform the method of the present solution, particularly as described below with reference to
While above at least one exemplary embodiment of the present solution has been described, it has to be noted that a great number of variations thereto exists. Furthermore, it is appreciated that the described exemplary embodiments only illustrate non-limiting examples of how the present solution can be implemented and that it is not intended to limit the scope, the application or the configuration of the herein-described apparatuses and methods. Rather, the preceding description will provide the person skilled in the art with constructions for implementing at least one exemplary embodiment of the present solution, wherein it must be understood that various changes of functionality and the arrangement of the elements of the exemplary embodiment can be made, without deviating from the subject-matter defined by the appended claims and their legal equivalents.
Number | Date | Country | Kind
---|---|---|---
10 2023 108 499.2 | Apr 2023 | DE | national