This application is related to U.S. Publication Nos. 2013/0094288, 2013/0094289, and 2013/0094286, filed on even date herewith, the contents of which are incorporated by reference in their entirety.
The present disclosure is related to systems and methods for managing errors in non-volatile, solid-state memory. For example, in one embodiment, a method, apparatus, and/or computer readable medium facilitates assigning cells of a solid-state, non-volatile memory to one of a plurality of groups. Each group is defined by expected symbols stored in the cells in view of actual symbols read from the cells. Based on cell counts within the groups, it can be determined that a shift in a reference voltage will reduce a collective bit error rate of the cells. The shift can be applied to data access operations affecting the cells.
In another embodiment, multi-level cells of a solid-state, non-volatile memory are assigned to one of a plurality of groups p(b,r), where p(b,r) refers to a number of cells having expected symbol values of b being read in voltage region r, and where r represents regions adjacent a reference voltage. The regions are used to determine the symbol values of b. Based on evaluating an expression based on p(b,r), the reference voltage is adjusted to reduce a collective bit error rate of the cells.
These and other features and aspects of various embodiments may be understood in view of the following detailed discussion and accompanying drawings.
In the following diagrams, the same reference numbers may be used to identify similar/same components in multiple figures.
In the following description of various example embodiments, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration various example embodiments. It is to be understood that other embodiments may be utilized, as structural and operational changes may be made without departing from the scope of the claims appended hereto.
The present disclosure is generally related to solid-state, non-volatile memory. In many implementations, solid-state, non-volatile memory uses cells similar to a metal-oxide semiconductor (MOS) field-effect transistor (FET), e.g., having a gate (control gate), a drain, and a source. The cells also include what is known as a “floating gate” that can retain a charge in the absence of external power. When a selected voltage is applied to the control gate, differing values of current may flow through the channel depending on the amount of charge on the floating gate. This current flow can be used to characterize two or more states of the cell that represent data stored in the cell.
The discussion below makes reference to flash memory, which may include NOR flash, NAND flash, 2D NAND flash, 3D NAND flash (also called vertical NAND, or VNAND) and various other technology types. These types of flash memory may be divided into two different classifications: single-level and multi-level cell memory. Single level cell (SLC) flash memory uses floating gate memory cells that store one bit of data per cell by distinguishing between two floating gate voltage levels. In contrast, multi-level cell (MLC) flash memory can be programmed to store two or more bits of information using more than two floating gate voltage levels. Because it can store more data per cell, MLC flash memory is less expensive than SLC on a per-byte basis. However, MLC flash memory has lower margin for error due to the increased resolution needed to differentiate between voltage levels. As a result, MLC may be more error-sensitive than SLC flash memory in response to such factors as repeated use (e.g., wear) and leakage of charge over time (e.g., data retention errors).
In general, a memory cell may be programmed to a number of voltages, M, where M can represent any of 2^m memory states. The value m is equal to the number of bits stored, and is greater than 1 for MLC memory. For example, memory cells programmable to four voltages can store two bits per cell (M=4, m=2); memory cells programmable to eight voltages have a storage capacity of three bits per cell (M=8, m=3), etc. While specific examples below are illustrated as two-bit-per-cell, MLC NAND flash memory, it is not intended that the concepts and embodiments described herein be solely limited to this type of memory. For example, while MLC may be often used in the industry to refer to only two-bit-per-cell memory, in the present disclosure, MLC may refer to any number of multiple bits per cell. The categorization of errors and other features described below may be applicable to other types of non-volatile, solid-state memory, e.g., those devices that share analogous features of the MLC NAND flash devices described herein.
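As a small illustration of the relationship between the number of programmable voltages M and the number of bits m stored per cell, the following sketch recovers m from M. The function name `bits_per_cell` is hypothetical and is not part of the disclosure.

```python
def bits_per_cell(num_levels: int) -> int:
    """Return m such that num_levels == 2**m.

    Raises ValueError when num_levels is not a power of two, since the
    text above relates M voltages to 2^m memory states.
    """
    if num_levels < 2 or num_levels & (num_levels - 1):
        raise ValueError("number of levels must be a power of two >= 2")
    return num_levels.bit_length() - 1


for levels in (2, 4, 8):
    # SLC (M=2), two-bit MLC (M=4), three-bit MLC (M=8)
    print(f"M={levels} -> m={bits_per_cell(levels)}")
```

With M=2 the result is one bit per cell (SLC), while M=4 and M=8 give the two-bit and three-bit examples described above.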
An example of how data is stored in a two-bit-per-cell MLC device is shown in the graph of
It should be noted that in an arrangement such as shown in
A variety of ECC algorithms are known that can correct a known number of bit errors for a predefined word size and number of ECC bits. Flash systems may use multiple levels of error coding (e.g., inner and outer coding) to improve overall error correction performance. The ECC is associated with a bit error rate (BER) metric that is used to gauge flash and error correction performance in general. However, BER may not take into account the underlying causes of the errors, such as whether the error represents a common or unusual physical deviation for the type of media involved. As seen in
Generally, a bit error occurs when the read voltage representing the data deviates from its expected range of values. Thus, in reference again to
In
For the four-level MLC shown in
The second group of errors 108 represents shifts that cause the actual measurements to be located in voltage ranges 102-105 that are not adjacent to the expected voltage ranges 102-105. There are six (M^2−3M+2) of these types of errors 108 for two-bit per cell MLC memory. This group 108 is broken into two subgroups, 110 and 112. Subgroup 110 represents shifts of more than one voltage range, and subgroup 112 represents shifts of more than two voltage ranges.
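The count of non-adjacent error types can be checked by enumeration. The sketch below assumes symbols are indexed 0 to M−1 in order of increasing voltage range (an indexing chosen here for illustration) and splits every (expected, actual) mismatch into adjacent and non-adjacent shifts. The name `classify_symbol_errors` is hypothetical.

```python
from itertools import permutations


def classify_symbol_errors(num_levels):
    """Split all (expected, actual) symbol mismatches by shift distance.

    Symbols are assumed to be numbered by increasing voltage range, so a
    distance of 1 means the cell was read in a neighboring range.
    """
    adjacent, non_adjacent = [], []
    for expected, actual in permutations(range(num_levels), 2):
        if abs(expected - actual) == 1:
            adjacent.append((expected, actual))
        else:
            non_adjacent.append((expected, actual))
    return adjacent, non_adjacent


adjacent, non_adjacent = classify_symbol_errors(4)
print(len(adjacent), len(non_adjacent))
```

For M=4 the enumeration yields six non-adjacent error types, matching M^2−3M+2, and six adjacent ones, for twelve error types in total.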
In reference now to
The reference page data 204 is intended to represent a “correct” version of what is actually stored in the raw page data 202. As a result, the system 200 may take additional measures to ensure the reference page data 204 can be read back correctly and reliably. For example, the reference page data 204 may include user data stored with extra ECC to ensure successful decoding even if there are large numbers of read errors. In other arrangements, extra copies of the data 204 may be stored in areas known to have high reliability, the data 204 may be determined/reconstructed from external resources (e.g., a host), etc. Or, the reference data 204 may include a known pattern that does not require decoding from programmable memory, e.g., may be coded into firmware or hardware.
The raw page data 202 may include page data that is decoded using current system parameters, e.g., adjusted read reference voltages to account for age and other factors but without using correction algorithms such as ECC. An XOR 206 of the raw page data 202 with associated reference data 204 will result in a value of one for any bits that don't agree between the two pages 202, 204, and zero for all other bits. The output 208 of the XOR operator 206 is then analyzed to increment “noise buckets” 210. Each bucket 210 holds a sum related to an error category, as indicated by column headings 212. There are 16 headings 211 for each column, each corresponding to the twelve types of errors 106, 108 shown in
A processor 214 analyzes the noise buckets 210 and uses the data to adjust parameters 216. These parameters 216 may be page-specific, and/or be related to smaller or larger divisions of memory. As will be described in greater detail below, the parameters 216 may be used to adjust reference voltages used in programming/reading the pages. For example, reference voltages used in defining the data (e.g., S0-S3 and R0-R2 shown in
To determine all values 212 for each of the buckets 210 in a multipage architecture, both MSB and LSB pages (see
The actions performed by system 200 related to calculation of the buckets 210 may occur at any time in the life of the associated memory apparatus. At least some of the actions may be triggered by data access operations, such as reads, programs, erasures, garbage collection, error recovery mode, etc. The actions may also be performed based on time intervals and/or usage statistics. Flash characteristics may degrade with program/erase (PE) cycles and retention time (e.g., time since data was last programmed/refreshed). As a result, it may be useful to perform the actions at different times over the life of the system 200.
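The bucket calculation described above for system 200 can be sketched as follows for a two-bit-per-cell device with separate MSB and LSB pages. This is a minimal illustration, not the disclosed implementation: the function names, the list-of-bits page representation, and the keying of buckets by (expected symbol, actual symbol) pairs are assumptions made here.

```python
from itertools import product

# Each symbol is an (MSB, LSB) pair; four symbols for a two-bit cell.
SYMBOLS = list(product((0, 1), repeat=2))


def xor_pages(raw_bits, reference_bits):
    """Return 1 wherever the raw page disagrees with the reference page."""
    return [raw ^ ref for raw, ref in zip(raw_bits, reference_bits)]


def count_noise_buckets(raw_msb, raw_lsb, ref_msb, ref_lsb):
    """Count cells per (expected symbol, actual symbol) combination.

    Yields 16 buckets: 4 where the symbol was read correctly and 12
    corresponding to the possible symbol errors.
    """
    buckets = {(exp, act): 0 for exp in SYMBOLS for act in SYMBOLS}
    for rm, rl, em, el in zip(raw_msb, raw_lsb, ref_msb, ref_lsb):
        buckets[((em, el), (rm, rl))] += 1
    return buckets


# Hypothetical five-cell example.
raw_msb, raw_lsb = [1, 1, 0, 0, 1], [1, 0, 0, 1, 1]
ref_msb, ref_lsb = [1, 1, 0, 0, 0], [1, 1, 0, 1, 1]

msb_errors = xor_pages(raw_msb, ref_msb)
lsb_errors = xor_pages(raw_lsb, ref_lsb)
buckets = count_noise_buckets(raw_msb, raw_lsb, ref_msb, ref_lsb)
print(msb_errors, lsb_errors)
print({k: v for k, v in buckets.items() if v})
```

The XOR output flags which bits disagree, while the bucket keys retain which symbol was expected and which was read, which is the information needed to distinguish error categories rather than only counting bit errors.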
Flash noise statistics as shown being gathered in
One example set of metrics that may be used to categorize MLC flash errors is the set of conditional error probabilities. For example, the conditional error probabilities p(i,j) are defined as the probability of programmed symbol s_i being detected as s_j. This may alternatively be described as the probability of the expected symbol value in view of the actual symbol value detected. For the arrangement shown in
In the example system 200, decoded digital data 202 and/or 204 is used as an indirect indicator of threshold voltages (and/or voltage ranges) detected within the cells in response to a read operation. This data 202, 204 is in turn used to populate data in the buckets 210. However, the system 200 may be adapted to determine the voltage data in other ways, and use alternate bucket categories with this data. For example, a flash memory may have provisions for reading read voltages directly. In such a case, a probability p(i,j) may be expressed as the probability of a cell programmed to threshold level v_i being read back in the range v_1j to v_2j. These ranges may correspond to S0-S3 and/or R0-R2 shown in
Alternate probability distributions may be used in cases where MSB and LSB bits are determined separately using soft data read from the memory. Soft data generally refers to a technique of reading data that provides both an estimate of the stored binary value along with a reliability indicator. For example, multiple reads may be used in order to obtain higher resolution in soft information. In such a case, multiple read reference levels may need to be optimized. The choice of read reference voltage(s) may vary depending on whether the MSB or LSB bit is to be recovered, e.g., as indicated by the LSB and MSB regions in
In
As an example, consider a cell programmed with an LSB of 0 where three reads are carried out as shown in
Another example of soft read results is shown in
Some techniques for generating optimized reference voltage boundary values may use conditional probabilities as input. Other boundary generating methods may instead use the standard deviation or the variance of the noise associated with each signal S_i. Under the assumption of a given disturbance distribution, the conditional probabilities can be converted to standard deviations. For example, for the Gaussian distribution, the probability and the standard deviation are related by the Q-function. In such a case, one way of obtaining the standard deviation from conditional probability buckets is by using a Q-function look-up-table.
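The conversion from bucket counts to conditional probabilities, and from a conditional probability to a standard deviation under a Gaussian assumption, can be sketched as below. Instead of a look-up table, this sketch uses the inverse normal CDF from the Python standard library. The parameter `distance` (the separation between the mean of the programmed level and the read boundary) and all function names are assumptions introduced for illustration.

```python
from statistics import NormalDist


def conditional_probabilities(counts):
    """Normalize counts[i][j] (programmed s_i, detected s_j) row by row."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def q_inverse(p):
    """Inverse of Q(x) = 1 - Phi(x) for the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - p)


def sigma_from_tail_probability(p_error, distance):
    """Estimate sigma assuming p_error = Q(distance / sigma).

    Only meaningful for 0 < p_error < 0.5, i.e., when the boundary lies
    on the far side of the distribution mean.
    """
    if not 0.0 < p_error < 0.5:
        raise ValueError("p_error must be strictly between 0 and 0.5")
    return distance / q_inverse(p_error)


# Hypothetical bucket counts for two neighboring symbols.
counts = [[980, 20], [5, 995]]
probs = conditional_probabilities(counts)
sigma = sigma_from_tail_probability(probs[0][1], distance=0.4)
print(probs[0][1], sigma)
```

A firmware implementation would more likely use a small table indexed by quantized probability, as the text suggests; the closed-form inverse is used here only to keep the example short.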
The conditional probabilities and/or error distributions generated as described above can be used in modifying read reference voltages and program verify voltages in the memory device. In reference now to
The variable “i” in
The letters L and R refer to the value of the left and right bits. In
At block 404, the function sign(M) determines a direction S of adjustment for the reference voltage R. The function sign(x) is defined as follows: sign(x)=1 if x≧0, otherwise sign(x)=−1. At block 406, the magnitude of M is tested by comparison to a reference value T. The value of T may vary with i. If |M| is greater than or equal to T, then the voltage Ri is adjusted 408 by a magnitude Δ(i) in the direction indicated by S calculated at block 404. The variable Δ(i) shown in block 408 is a relatively small value used to increment the reference voltage Ri, and may vary depending on which boundary is being adjusted. If it is determined 406 that |M|<T, then no adjustment 408 is performed.
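The decision logic of blocks 404-408 can be sketched as follows. The metric M is taken as an input because its formula is not reproduced in this text, and the mapping of a positive direction to a higher voltage is an assumption made here; the function names are likewise hypothetical.

```python
def sign(x):
    """sign(x) = 1 if x >= 0, otherwise -1, as defined in the text."""
    return 1 if x >= 0 else -1


def adjust_reference(r_i, m, threshold, delta):
    """Return the (possibly) adjusted reference voltage R_i.

    r_i       -- current reference voltage for boundary i
    m         -- metric M computed from the noise buckets (supplied by caller)
    threshold -- reference value T (may differ per boundary)
    delta     -- small step size, corresponding to the per-boundary step
    """
    s = sign(m)
    if abs(m) >= threshold:
        # Assumption: s = +1 moves the boundary to a higher voltage.
        return r_i + s * delta
    return r_i  # |M| < T: leave the boundary unchanged


print(adjust_reference(2.0, 0.5, threshold=0.3, delta=0.05))
print(adjust_reference(2.0, -0.5, threshold=0.3, delta=0.05))
print(adjust_reference(2.0, 0.1, threshold=0.3, delta=0.05))
```

Keeping the step small and gating it on a threshold means a single noisy measurement moves the boundary by at most one step, and small imbalances leave it untouched.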
The procedure shown in
The updating 422 of buckets/categories may be carried out as described in relation to
The adjusting 424 of read reference voltages may be performed as shown and described in
The concept of noise buckets, in addition to being used for adjusting read levels, can be applied to program verify levels. Generally, a flash cell may be programmed by applying pulses of increasingly higher voltages to a cell. Each programming pulse is followed by a read operation to verify that the read voltage of a cell is greater than the program verify value. Adjusting program verify levels based on noise buckets can reduce read errors similar to the way that adjusting read thresholds can reduce read errors.
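The pulse-and-verify sequence described above can be illustrated with a toy model. The linear relationship between pulse amplitude and stored voltage, along with every numeric value and name below, is a hypothetical simplification chosen only to show the control flow; real cell behavior is considerably more complex.

```python
def program_cell(verify_level, start_pulse=1.0, pulse_step=0.25,
                 gain=0.2, max_pulses=50):
    """Apply increasing pulses until the cell reads above verify_level.

    Returns (final_cell_voltage, pulses_applied). The cell model, in which
    each pulse adds gain * pulse volts, is a toy assumption.
    """
    cell_voltage = 0.0
    pulse = start_pulse
    for count in range(1, max_pulses + 1):
        cell_voltage += gain * pulse      # programming pulse
        if cell_voltage > verify_level:   # verify read after each pulse
            return cell_voltage, count
        pulse += pulse_step               # next pulse is stronger
    raise RuntimeError("cell failed to reach the program verify level")


voltage, pulses = program_cell(verify_level=2.0)
print(voltage, pulses)
```

In this model, raising the program verify level shifts the final stored voltage upward, which is the lever that a noise-bucket-driven adjustment of program verify levels would use.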
There may be complications in adjusting program verify levels based on noise buckets that may not exist for adjusting read boundaries. For example, as described above in relation to
Another factor to consider when adjusting program verify values based on noise buckets relates to multi-page architectures, e.g., where LSB and MSB of a cell are assigned to different pages. An example aspect of this architecture is illustrated in the block diagram of
As with read reference levels, two scenarios may be considered for determining if and how much program verify reference levels need to be modified. The first scenario considered is where only a single adjustment is performed. The process for performing under this condition is shown in the flowchart of
As shown in blocks 602 and 604, the variables L and R (representing left and right, respectively) are used as opposed to 0 and 1 in the previous examples. These variables refer to the value of bit associated with the left or right signal point of each channel, which can be 0 or 1. For example, the left value of channel 502 in
Upon entering the loop 608, the value of M is determined at block 610 using the indicated formula, which performs a natural logarithm (ln) operation on a function (in square brackets) of noise bucket probabilities. This noise bucket function depends in part on the value of K defined at 602 or 604. The natural logarithm in block 610 is multiplied by a distribution function C(i). The C(i) function is memory/distribution dependent, and can be found as shown above in Table 1. Some terms in the computation of M may be omitted, e.g., for some values of i, and/or where p(L,i) or p(R,i) are very small or zero.
At block 612, the loop iterator i is checked. The iterator is incremented at 614 if not equal to the END value defined at 602 or 604. Upon exiting the loop 608, the value S is determined 616 based on the sign of M, and the magnitude of M is checked 618 against a reference value T. If |M|>T, a recommendation is made in the form of a determination 620 as to whether the program value should be changed or not. Note that the change can be either an increase or a decrease, depending on the value of S.
Changing program verify levels may affect multiple channels, and so the decision as to whether a program level should be modified may depend on results calculated for all of the channels. Therefore, the procedure in
In order to illustrate these steps, consider signal point 509 in
In cases where full sequence programming is used (e.g., operating on both MSB and LSB at the same time), there may be no need to consider multiple binary inputs, as the data could be obtained via a single M-input channel. In such a case, the second step described above may not be needed. Other than this change, the process may be much the same as what is described for multipage programming, e.g., in
In the read boundary adjust procedure of
In the case of using hard data for adjusting program verify levels, a plurality of tests against T may be used for each reference voltage S1-S2. In Table 2 below, four different values, M0-M3, are provided using the numbering convention shown in
In the description so far, the program verify and read reference levels have been shown as being modified separately from one another. It is also possible to use noise buckets to modify both iteratively. An example of such a process according to an example embodiment is shown in the flowchart of
In
In
The non-volatile memory 910 includes the circuitry and media used to persistently store both user data and other data managed internally by apparatus 900. The non-volatile memory 910 may include one or more flash dies 912, which individually contain a portion of the total storage capacity of the apparatus 900. The memory contained within individual dies 912 may be further partitioned into blocks, here annotated as erasure blocks/units 914. The erasure blocks 914 represent the smallest individually erasable portions of memory 910. The erasure blocks 914 in turn include a number of pages 916 that represent the smallest portion of data that can be individually programmed or read. In a NAND configuration, for example, the page sizes may range from 4 kilobytes (KB) or more, and the erasure block sizes may be on the order of one megabyte (MB). It will be appreciated that the present embodiments described herein are not limited to any particular size of the pages 916 and blocks 914, and may be equally applicable to smaller or larger data unit sizes.
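The die, erasure block, and page hierarchy can be captured in a small data structure. The page and block sizes below follow the example figures in the text, interpreted as binary units; the number of dies and blocks per die are hypothetical placeholders, as is the class name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashGeometry:
    page_bytes: int = 4 * 1024       # example page size from the text (4 KB)
    block_bytes: int = 1024 * 1024   # example erasure block size (~1 MB)
    blocks_per_die: int = 1024       # hypothetical value
    dies: int = 4                    # hypothetical value

    @property
    def pages_per_block(self) -> int:
        """Pages are the smallest unit that can be programmed or read."""
        return self.block_bytes // self.page_bytes

    @property
    def capacity_bytes(self) -> int:
        """Total capacity across all dies."""
        return self.dies * self.blocks_per_die * self.block_bytes


geometry = FlashGeometry()
print(geometry.pages_per_block, geometry.capacity_bytes)
```

Because erasure operates on whole blocks while programming and reading operate on pages, per-page or per-block parameter sets (such as adjusted reference voltages) could be indexed using a structure of this kind.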
The apparatus 900 includes one or more controllers 904, which may include general- or special-purpose processors that perform operations of the apparatus. The controller 904 may include any combination of microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry suitable for performing the various functions described herein.
Functions that may be provided by the controller 904 include read/write operations, error categorization, media life management, and parameter adjustment, the latter three of which are represented here by functional modules 906-908, respectively. The modules 906-908 may be implemented using any combination of hardware, software, and firmware, and may cooperatively perform functions related to error analysis as described herein. Error categorization module 906 performs operations related to categorizing errors in data stored in the memory 910. These operations may include determining expected values of data stored in the memory 910, e.g., by looking at the stored data itself, or by use of reference data 918. The reference data 918 may be stored within or separate from the main memory 910, and may include duplicate data, additional ECC data, etc.
At some point in time, the error categorization module 906 may make a comparison of data stored in memory 910 with the reference data 918. This comparison may include, for example, a bitwise XOR that can be used to identify particular cells that exhibit errors, as well as physical characteristics of the error, e.g., magnitude and direction of floating gate voltage shifts that resulted in the error. This data may come from analog read/write channels, encoders, and/or decoders that directly interact with the media 910. In some cases the underlying physical characteristics can be derived from digital results, e.g., based on hard decision values of the entire cell data as shown in
The results obtained by the error categorization module 906 can be used by the media life management module 907 and by parameter adjustment module 908. The media life management module 907 monitors read/write operations and other factors related to wear and condition of the media 910. The module 907 may create and update statistics/metrics related to these operations, such as tracking program-erase cycles, time of operation, etc. The statistics may be updated based on patterns detected via the error categorization module 906.
The parameter adjustment module 908 can use the results obtained by the error categorization module 906 to make changes to parameters used in memory access operations. This may include adjustment of reference/boundary voltages used in reading, programming, verifying and/or erasing of particular pages 916, blocks 914 and/or dies 912. These adjustments may be applied iteratively, e.g., applying a first adjustment, measuring the result, applying a second adjustment, measuring the result, etc. At some point, the adjustments may either converge on one or more improved reference levels, or leave the levels unchanged.
In reference now to
The shift is applied 1006 to data access operations affecting the cells. The data access operations may include any combination of read and program-verify operations. The assigning of the cells to the groups and the determining and applying of the shift may be performed iteratively two or more times. Where the data access operation is a program-verify operation, the assignment of cells to the groups may optionally be based on previously performed read operations. In such a case, the shift may be further modified based on one or more conditions affecting the cells, such as program-erase cycles, data retention time, and disturb.
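The overall flow of grouping cells, determining a shift, and applying it iteratively can be illustrated with a toy single-boundary model. The decision metric used here (the difference between the two error-group counts) is a simple stand-in chosen for this sketch and is not the expression used in the disclosure; the simulated Gaussian cell voltages, the step size, the threshold, and all names are likewise assumptions.

```python
import random


def group_cells(cells, reference):
    """Count cells per (expected bit, region) group around one boundary.

    cells: iterable of (expected_bit, read_voltage), where expected bit 0
    is assumed to be the lower-voltage symbol and 1 the higher one.
    """
    groups = {(b, r): 0 for b in (0, 1) for r in ("left", "right")}
    for expected, voltage in cells:
        region = "left" if voltage < reference else "right"
        groups[(expected, region)] += 1
    return groups


def error_count(groups):
    """Cells read on the wrong side of the boundary."""
    return groups[(0, "right")] + groups[(1, "left")]


def tune_reference(cells, reference, step=0.02, threshold=5, max_iters=100):
    """Shift the reference in small steps while the shift does not hurt."""
    for _ in range(max_iters):
        groups = group_cells(cells, reference)
        imbalance = groups[(0, "right")] - groups[(1, "left")]
        if abs(imbalance) < threshold:
            break  # groups are balanced closely enough; leave as is
        candidate = reference + step * (1 if imbalance > 0 else -1)
        if error_count(group_cells(cells, candidate)) > error_count(groups):
            break  # measured result got worse; keep the current level
        reference = candidate
    return reference


rng = random.Random(0)
cells = [(0, rng.gauss(1.0, 0.15)) for _ in range(2000)]
cells += [(1, rng.gauss(2.0, 0.15)) for _ in range(2000)]

start = 1.3  # deliberately too low for this simulated data
tuned = tune_reference(cells, start)
errors_before = error_count(group_cells(cells, start))
errors_after = error_count(group_cells(cells, tuned))
print(start, tuned, errors_before, errors_after)
```

Each accepted step is re-measured before the next one is tried, mirroring the apply, measure, and repeat pattern described above, so the loop either converges on an improved level or leaves the reference unchanged.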
The various embodiments described above may be implemented using circuitry and/or software modules that interact to provide particular results. One of skill in the computing arts can readily implement such described functionality, either at a modular level or as a whole, using knowledge generally known in the art. The structures and procedures shown above are only a representative example of embodiments that can be used to facilitate optimizing reference voltages for reading and programming of memory cells.
The foregoing description of the example embodiments has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the inventive concepts to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. Any or all features of the disclosed embodiments can be applied individually or in any combination, and are not meant to be limiting, but purely illustrative. It is intended that the scope be limited not with this detailed description, but rather determined by the claims appended hereto.
Number | Date | Country | |
---|---|---|---|
20130094290 A1 | Apr 2013 | US |