1. Technical Field
This disclosure relates to solid-state storage devices. More particularly, the disclosure relates to systems and methods for monitoring data retention in solid-state storage devices.
2. Description of the Related Art
Non-volatile solid-state media can become corrupted over time due to various time-related and environmental factors. Periodic data scrubbing (reading data out and reprogramming the error-corrected version) can decrease the likelihood of user data errors becoming uncorrectable by relocating data to a new physical storage location before the data retention properties of a block of data degrade beyond an acceptable threshold. Although various methods may be implemented to monitor data retention properties of solid-state storage, such as by storing real time-stamp information, such methods often require a substantial amount of storage space.
Systems and methods that embody the various features of the invention will now be described with reference to the accompanying drawings.
While certain embodiments are described, these embodiments are presented by way of example only, and are not intended to limit the scope of protection. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the scope of protection.
Overview
Data refreshing, or scrubbing, is commonly used in solid-state drive (SSD) applications to refresh data stored in the SSD before data retention degrades to the point that the data cannot be correctly read out. Data scrubbing is an error correction technique in which a background task periodically inspects memory for errors and then corrects any errors found using an error-correcting code (ECC) or another copy of the data. Data scrubbing can reduce the risk of uncorrectable errors occurring. Certain data storage systems monitor retention characteristics of blocks of memory and perform data scrubbing when such retention characteristics have likely degraded beyond a threshold. In order to determine when blocks of memory require scrubbing, a real time stamp may be used to monitor data age in an SSD system, which may be indicative of retention characteristics. The system then, based on the amount of time since the blocks were programmed, decides whether data refreshing, or scrubbing, is needed.
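By way of a non-limiting illustration of the conventional time-stamp approach described above, the following sketch in C shows how a controller might decide whether a block needs scrubbing based solely on a stored program-time stamp; the structure, names, and fixed retention budget are hypothetical and used for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-block metadata holding a real-time stamp (in seconds)
 * written when the block was programmed. */
typedef struct {
    uint32_t program_time_s;
} block_meta_t;

/* Scrub once the data has aged past a fixed retention budget. Note that this
 * simple check cannot account for temperature or P/E history on its own. */
static bool needs_scrub_by_timestamp(const block_meta_t *meta,
                                     uint32_t now_s,
                                     uint32_t retention_budget_s)
{
    uint32_t age_s = now_s - meta->program_time_s;
    return age_s >= retention_budget_s;
}
```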
In order to implement a real-time-stamp solution, some amount of storage space must be allocated to store the time stamp, potentially as much as a few bytes per stamp. With respect to a solid-state storage module, time stamp data therefore consumes storage that would otherwise be available to the user, reducing user-usable capacity. Furthermore, it may be desirable for the time-stamp storage to be implemented at the page level in order to accurately account for different pages that were programmed at different times, which may require a significant amount of storage space.
In addition to age, environmental factors may also affect the data retention of a solid-state drive, especially for heavily cycled blocks. For example, the temperature at which the device is stored, which can vary over time and location, can affect the retention characteristics of a drive. In fact, according to the Arrhenius equation, the acceleration of data retention degradation due to elevated temperature is an exponential function of temperature, and thus can be very significant. Temperature and other environmental factors (e.g., electric/magnetic fields) are not generally taken into account merely by recording time stamp data, and therefore other techniques/mechanisms may be desirable or required in order to more accurately and/or more efficiently make data retention determinations.
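To make the temperature effect concrete, the following sketch computes the Arrhenius acceleration factor for retention degradation; the activation energy and temperatures used are assumed, illustrative values rather than values specified by this disclosure.

```c
#include <math.h>
#include <stdio.h>

/* Arrhenius acceleration factor: AF = exp((Ea/k) * (1/T_use - 1/T_stress)),
 * with temperatures in kelvin. An AF of 10 means retention degrades roughly
 * ten times faster at the stress temperature than at the use temperature. */
static double arrhenius_af(double ea_ev, double t_use_k, double t_stress_k)
{
    const double k_boltzmann_ev = 8.617e-5; /* eV per kelvin */
    return exp((ea_ev / k_boltzmann_ev) * (1.0 / t_use_k - 1.0 / t_stress_k));
}

int main(void)
{
    /* Assumed activation energy of 1.1 eV; 40 C use vs. 70 C storage. */
    double af = arrhenius_af(1.1, 313.15, 343.15);
    printf("acceleration factor: %.1f\n", af);
    return 0;
}
```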
Some embodiments disclosed herein enable controller firmware, upon power-up or at another time, to check information from designated reference blocks and estimate data retention characteristics of blocks and/or pages of solid-state memory. For example, by reading a reference block, as described in greater detail below, a system may estimate an effective age, and/or other data retention factors, of data programmed on a solid-state drive. Such determinations inherently take into account various environmental conditions experienced by the drive, even while the device is powered off, which is advantageous. With the estimated age, controller firmware may be configured to schedule a data scrubbing process.
The system may save a predetermined value in a reference block and later read data from the block to determine, in some manner, the block's data retention properties. Based on such a reading, the system may determine an equivalent age of the programmed data, taking into account both temperature and the duration of time the SSD spends in a power-off state; information relating to data retention during a power-off period is inherently incorporated in the reference block, and therefore the controller may be able to determine appropriate refreshing/scrubbing timing when powered on again.
The various embodiments described in this disclosure increase the efficiency and/or accuracy of data retention evaluations, which may improve performance of the data storage system.
As used in this application, “non-volatile memory” may refer to solid-state memory such as NAND flash. However, the systems and methods of this disclosure may also be useful in more conventional hard drives and hybrid drives including both solid-state and hard drive components. Solid-state memory may comprise a wide variety of technologies, such as flash integrated circuits, Phase Change Memory (PC-RAM or PRAM), Programmable Metallization Cell RAM (PMC-RAM or PMCm), Ovonic Unified Memory (OUM), Resistance RAM (RRAM), NAND memory, NOR memory, EEPROM, Ferroelectric Memory (FeRAM), MRAM, or other discrete NVM (non-volatile memory) chips. The solid-state storage devices may be physically divided into planes, blocks, pages, and sectors, as is known in the art. Other forms of storage (e.g., battery backed-up volatile DRAM or SRAM devices, magnetic disk drives, etc.) may additionally or alternatively be used.
System Overview
Data Retention Monitoring
In certain solid-state storage devices, such as a storage device comprising NAND flash components, blocks or pages of storage within the device may be configured to withstand a certain number of program and erase (P/E) operations before such operations have a significant adverse effect on the data retention properties of the block or page. Within a solid-state drive comprising NAND flash components, blocks of storage may comprise approximately 2 MB or 4 MB of storage space. In general, once a block has been cycled up to a limit (e.g., 3K cycles for 2×nm MLC), the quality of the block may start to degrade significantly.
Certain embodiments disclosed herein provide for the assignment of one or more blocks across one or more solid-state memory devices as reference blocks. Each reference block may be programmed with a pre-defined data value, such as a random data pattern. For example, the pre-defined data value could be generated by a pseudo-random number generator with a known seed, so that the pre-defined data need not be stored in a particular manner. Reference blocks advantageously may have a device usage history similar to that of the related data blocks, meaning that the information collected from the reference blocks can reflect the actual conditions experienced by the related data blocks closely enough that the reference blocks can be considered valid references with respect to the related data blocks.
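As a non-limiting sketch of how a pre-defined reference pattern might be regenerated from a known seed rather than stored, the following uses a simple xorshift generator; the choice of generator and the seeding scheme are illustrative assumptions only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simple 32-bit xorshift PRNG; any reproducible generator would serve. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Regenerate the expected reference pattern for a given page from a known
 * seed (e.g., 0), so the pattern itself never needs to be stored. */
static void generate_reference_pattern(uint32_t seed, uint32_t page_index,
                                       uint8_t *out, size_t len)
{
    uint32_t state = seed ^ (page_index + 1u);   /* avoid the all-zero state */
    for (size_t i = 0; i < len; i++)
        out[i] = (uint8_t)xorshift32(&state);
}
```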
In certain embodiments, after power-on of the SSD, controller firmware may read data from the reference block(s) and use a data retention estimation mechanism disclosed herein to estimate the elapsed time since the reference block was programmed. Such estimation may allow for the determination of improved data scrubbing schedules. The estimated elapsed time may incorporate information related to the duration before power-off as well as the duration for which the SSD system was in a power-off state.
At block 206, the time until the next data scrubbing operation is determined in response to reading data from the one or more reference blocks. Such estimation may be performed based on one or more of the data retention estimation mechanisms disclosed herein. Once the data scrubbing schedule has been determined, at block 208, data scrubbing is performed according to such schedule, thereby at least partially preserving data retention functionality of the solid-state storage device.
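The following is a minimal, self-contained sketch of the flow just described (read reference data, estimate a scrubbing delay, schedule scrubbing accordingly); all routine names and numeric constants are hypothetical stand-ins for device-specific firmware behavior.

```c
#include <stdint.h>
#include <stdio.h>

/* Placeholder hooks standing in for device-specific firmware routines. */
static uint32_t read_reference_blocks_rber_ppm(void) { return 120; /* stub */ }

static uint32_t estimate_seconds_until_scrub(uint32_t rber_ppm)
{
    /* Stub: assume scrubbing is required once RBER reaches 500 ppm and that
     * RBER grows roughly 10 ppm per hour; both numbers are illustrative. */
    const uint32_t threshold_ppm = 500, growth_ppm_per_hour = 10;
    if (rber_ppm >= threshold_ppm)
        return 0;
    return ((threshold_ppm - rber_ppm) / growth_ppm_per_hour) * 3600u;
}

static void schedule_data_scrub(uint32_t seconds_from_now)
{
    printf("next data scrub in %u s\n", seconds_from_now);
}

/* Power-on retention check corresponding to blocks 206 and 208 above. */
int main(void)
{
    uint32_t rber_ppm = read_reference_blocks_rber_ppm();
    schedule_data_scrub(estimate_seconds_until_scrub(rber_ppm));
    return 0;
}
```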
Scheduling Data Scrubbing Operations
In certain embodiments, the read data is compared to the known data pattern, or patterns, at block 306. At block 308, the bit error rate (e.g., flipped-bit count) may be calculated. In the case of more than one reference block, multiple pages may be read out and the average raw bit error rate (RBER) may be used for estimation. The average RBER may be calculated in a number of ways, including any type of weighted-average calculation. Based on the bit-error calculation, the process 300 may include estimating an effective amount of time since the block(s) was programmed and/or the amount of time until a subsequent data-scrubbing operation may desirably be carried out. The process inherently takes into account environmental factors experienced by the device during a power-off state and/or powered state. For example, if a storage device is subjected to excessive heat, such heat may affect data-retention properties of the reference block, such that analyzing values stored in the reference block accounts for the effects of the heat. Simply using a time stamp, on the other hand, may not account for environmental effects in the same way.
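A sketch of the raw-bit-error-rate calculation over one or more reference pages, assuming the expected pattern can be regenerated as discussed above; the helper names and the use of an unweighted average are illustrative choices, not requirements of this disclosure.

```c
#include <stddef.h>
#include <stdint.h>

/* Count bit flips between data read from a reference page and the expected
 * pattern, using a per-byte population count. */
static size_t count_bit_errors(const uint8_t *read_back,
                               const uint8_t *expected, size_t len)
{
    size_t flips = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t diff = read_back[i] ^ expected[i];
        while (diff) {                      /* Kernighan's bit-count */
            diff &= (uint8_t)(diff - 1);
            flips++;
        }
    }
    return flips;
}

/* Average raw bit error rate across several reference pages (unweighted;
 * a weighted average could be substituted, as the text notes). */
static double average_rber(const size_t *flips_per_page, size_t pages,
                           size_t page_bytes)
{
    size_t total = 0;
    for (size_t p = 0; p < pages; p++)
        total += flips_per_page[p];
    return (double)total / ((double)pages * (double)page_bytes * 8.0);
}
```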
The estimation of block 310 may be performed in any desirable manner. For example, in an embodiment in which the threshold for data scrubbing is ER1 (e.g., where ER1<<1), a period of time before the next data scrubbing operation should be performed may be estimated based on the difference between the measured RBER value and ER1. For example, a small difference between RBER and ER1 (which indicates that the error rate is approaching the threshold) may lead to the next scrubbing being scheduled within a short period of time.
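One possible realization of the difference-based estimation just described is sketched below: the smaller the margin between the measured RBER and the threshold ER1, the sooner the next scrub is scheduled. The linear mapping and the notion of a maximum interval are assumptions for illustration, not a mapping prescribed by this disclosure.

```c
/* Map the remaining error-rate margin (ER1 - RBER) linearly onto a scrub
 * delay between zero and a maximum interval. Purely illustrative. */
static double seconds_until_next_scrub(double rber, double er1,
                                       double max_interval_s)
{
    if (rber >= er1)
        return 0.0;                 /* already at or over threshold: scrub now */
    return max_interval_s * (er1 - rber) / er1;
}
```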
In certain embodiments, the reference blocks are re-assigned dynamically. For example, when a static wear leveling (SWL) event occurs (e.g., the highest-P/E-count block has 400 more P/E cycles than the lowest-P/E-count block across the entire device), data of the highest-P/E-count block (the SWL source) may be moved to one of the reference blocks (the SWL destination). Thereafter, the SWL source block, which may have the highest P/E count, may be erased and assigned to be a reference block. In this fashion, the reference blocks can be re-assigned dynamically, wherein the P/E count of the respective reference blocks remains substantially close to the highest within the device. As garbage collection and data scrubbing continue running in the background, the P/E counts of certain data blocks (user or system) will increase, such that SWL events will continue to occur, at which point other reference blocks may be selected as SWL destinations.
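A sketch of the dynamic reference-block reassignment on a static-wear-leveling event, as described above; the bookkeeping structure and helper routines are hypothetical placeholders for flash-translation-layer operations.

```c
#include <stdint.h>

typedef struct {
    uint32_t pe_count;
    int      is_reference;
} block_info_t;

/* Hypothetical helpers; real firmware would update the flash translation layer. */
static void move_block_data(block_info_t *src, block_info_t *dst) { (void)src; (void)dst; }
static void erase_block(block_info_t *blk)                        { (void)blk; }
static void program_reference_pattern(block_info_t *blk)          { (void)blk; }

/* Called when an SWL event is detected (e.g., the highest P/E count exceeds the
 * lowest by 400 or more): migrate the hottest block's data into a current
 * reference block, then turn the hottest block into the new reference block. */
static void reassign_reference_on_swl(block_info_t *hottest, block_info_t *reference)
{
    move_block_data(hottest, reference);    /* SWL source -> SWL destination     */
    reference->is_reference = 0;            /* destination becomes a data block  */
    erase_block(hottest);
    program_reference_pattern(hottest);     /* source becomes the new reference  */
    hottest->is_reference = 1;
}
```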
In certain embodiments, some, all, or substantially all pages in a reference block are programmed. However, it may only be necessary to read one or two typical pages of the reference block for data retention estimation purposes. Therefore, reference block reading may require only a short period of time during power-on (e.g., on the order of 50 μs). In addition, the pre-defined data pattern for the reference blocks can be generated by a pseudo-random number generator with a known seed (e.g., 0). Therefore, the data pattern may not need to be stored physically.
Estimating Time Since Last Programming Through RBER
The process 300 may be compatible with different solid-state memory devices, and may be performed in order to gather data from the reference blocks to determine an elapsed time since a prior program operation. The relationship between RBER and the elapsed time since last programming can be represented by a pre-characterized curve (curve 402), discussed below.
The slope/shape of the curve 402 can be pre-characterized for the reference memory block. Thus, at a particular point in time (take, for example, time point 408), upon determining the RBER for that time point, the elapsed time since last programming (i.e., the distance between time points 410 and 408) can be estimated. The RBER can also be extrapolated to estimate the time at which the relevant block or blocks will reach the scrubbing threshold (the distance between time points 408 and 406). As can be seen, the higher the RBER, the shorter the time between the current time and the time at which scrubbing is triggered, and vice versa. The extrapolation may be adjusted according to environmental factors, in one embodiment. For example, if the current drive temperature is high, the next scheduled scrubbing may be moved up even further than the time indicated by the graph, or the next scheduled scrubbing may be given a higher priority relative to other tasks in the system, such that it is performed sooner.
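As a non-limiting illustration of applying a pre-characterized RBER-versus-time curve, the following sketch stores the curve as a small table, interpolates the measured RBER to an effective elapsed time, and derives the time remaining until the scrubbing threshold in the same way; the table values are invented for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/* Pre-characterized retention curve sampled as (hours, RBER) points.
 * These numbers are invented for illustration only. */
static const double curve_hours[] = {    0,   24,  168,  720, 2160 };
static const double curve_rber[]  = { 1e-5, 3e-5, 8e-5, 2e-4, 5e-4 };
static const size_t curve_len = sizeof(curve_hours) / sizeof(curve_hours[0]);

/* Invert the curve: given a measured RBER, linearly interpolate the
 * effective hours since the reference block was programmed. */
static double hours_since_program(double rber)
{
    if (rber <= curve_rber[0])
        return curve_hours[0];
    for (size_t i = 1; i < curve_len; i++) {
        if (rber <= curve_rber[i]) {
            double f = (rber - curve_rber[i - 1]) / (curve_rber[i] - curve_rber[i - 1]);
            return curve_hours[i - 1] + f * (curve_hours[i] - curve_hours[i - 1]);
        }
    }
    return curve_hours[curve_len - 1];      /* beyond the characterized range */
}

int main(void)
{
    double now       = hours_since_program(1e-4);   /* measured RBER          */
    double at_thresh = hours_since_program(5e-4);   /* scrubbing threshold    */
    printf("effective age %.0f h, scrub due in %.0f h\n", now, at_thresh - now);
    return 0;
}
```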
Estimating Time Since Last Programming Through Read Voltage Thresholds
In addition to using the bit error rate, other methods or mechanisms may be utilized in accordance with embodiments disclosed herein for estimating data retention properties of a group of blocks within a solid-state storage device. For example, changes in read voltage thresholds (VT) may provide information upon which data retention estimation may be based: as data retention degrades, the threshold-voltage distribution of the cells shifts, and the amount of shift observed on a reference block may serve as an indicator of effective data age.
Estimating the data retention properties and/or data scrubbing schedule based on VT shift may be done in a number of ways, and the VT shift itself may be detected through various mechanisms in different embodiments. For example, the shift observed on a reference block may be compared against a pre-characterized relationship between VT shift and retention time, in a manner analogous to the RBER-based estimation described above.
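As a sketch of how a measured VT shift might be mapped to an effective retention time, assuming the relationship has been pre-characterized in a manner analogous to the RBER curve discussed above; the table values and units are invented for illustration only.

```c
#include <stddef.h>

/* Pre-characterized relationship between downward VT shift (in millivolts)
 * and effective retention time (in hours); values invented for illustration. */
static const double shift_mv[]    = {  0,  20,  50, 100,  200 };
static const double shift_hours[] = {  0,  24, 168, 720, 2160 };
static const size_t shift_len = sizeof(shift_mv) / sizeof(shift_mv[0]);

/* Estimate effective hours since programming from the VT shift observed on a
 * reference block (programmed level minus currently measured level). */
static double hours_from_vt_shift(double measured_shift_mv)
{
    if (measured_shift_mv <= shift_mv[0])
        return shift_hours[0];
    for (size_t i = 1; i < shift_len; i++) {
        if (measured_shift_mv <= shift_mv[i]) {
            double f = (measured_shift_mv - shift_mv[i - 1]) / (shift_mv[i] - shift_mv[i - 1]);
            return shift_hours[i - 1] + f * (shift_hours[i] - shift_hours[i - 1]);
        }
    }
    return shift_hours[shift_len - 1];      /* beyond the characterized range */
}
```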
The process 700 begins at block 702, where a data storage device is powered on. After certain initialization, certain system data may be read from the solid-state memory to check a value (e.g., an activity flag) to determine whether a data scrubbing, garbage collection, and/or static wear leveling operation was running at power-off; such a check is performed at decision block 704. If so, it may be desirable for the storage device to continue execution of such operation(s). If no such operation is pending, the process 700 resumes normal operation, wherein a next data scrubbing event may be scheduled in accordance with one or more embodiments disclosed herein. The checked flag may indicate an unfinished data operation from the last power-on period; therefore, the system may record the block up to which the scrubbing or other operation had progressed before power-off.
At block 706, the scrubbing or other operation is resumed, preferably at a point related to where the operation previously terminated. This allows unfinished operations, possibly including scrubbing, garbage collection, and static wear leveling, to be completed before anything else. In certain embodiments, one or more reference blocks are scrubbed first.
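A sketch of the power-on check and resume logic described for process 700; the persisted activity flag and progress record are represented by hypothetical structures, and the stubbed routines stand in for device-specific operations.

```c
#include <stdint.h>

typedef enum { OP_NONE, OP_SCRUB, OP_GC, OP_SWL } pending_op_t;

/* Hypothetical persisted record of an operation in flight at power-off. */
typedef struct {
    pending_op_t op;          /* activity flag checked at decision block 704 */
    uint32_t     next_block;  /* block up to which the operation progressed  */
} activity_record_t;

/* Stubs standing in for device-specific firmware behavior. */
static void resume_operation(pending_op_t op, uint32_t from_block)
{
    (void)op; (void)from_block;   /* would resume scrub/GC/SWL at from_block */
}
static void schedule_next_scrub_from_reference_blocks(void)
{
    /* would read reference blocks and schedule scrubbing as sketched above */
}

/* Power-on handling corresponding to blocks 702-706 described above. */
static void on_power_up(const activity_record_t *rec)
{
    if (rec->op != OP_NONE)
        resume_operation(rec->op, rec->next_block);  /* finish unfinished work  */
    schedule_next_scrub_from_reference_blocks();     /* then (re)schedule scrubs */
}
```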
The process 700 may continue with reading data from the one or more reference blocks and estimating data retention properties/characteristics based on one or more of the methods disclosed above. Based on the result, the process may include re-scheduling data scrubbing accordingly. Generally, the worse the data retention properties, the sooner data scrubbing will be scheduled.
The various embodiments disclosed herein may provide one or more of the following advantages. First, as discussed above, some known techniques for tracking data retention time add a real-time stamp to the metadata area each time a page is programmed. By doing so, the duration of data retention may be recorded, but the recorded data does not account for P/E cycling or temperature factors. Reference blocks may have substantially the same experience as the data blocks, so the RBER value obtained by reading the raw data from the reference blocks and comparing it against the pre-defined pattern may reflect almost all of the possible factors of characteristic change in the solid-state memory devices. Such factors may include P/E cycling, inter-cell interference, data retention time, and temperature, including device conditions during power-off. Furthermore, the implementation of reference blocks can enable a dynamic data scrubbing scheme which can improve system performance without substantially reducing reliability, such as by using less-frequent data scrubbing for a fresh product and more-frequent data scrubbing for a cycled product.
In addition, because reference blocks are not dedicated blocks, issues associated with wasted storage capacity, such as those found with a real-time-stamp method, are marginalized. Reference blocks can be dynamically assigned from the data blocks each time static wear leveling occurs. In addition, because reference blocks are programmed with a pre-defined data pattern, fine-tuning of read levels using calibration results from the reference blocks may also be enabled.
Other Variations
As used in this application, “non-volatile memory” typically refers to solid-state memory such as, but not limited to, NAND flash. However, the systems and methods of this disclosure may also be useful in more conventional hard drives and hybrid hard drives including both solid-state and hard drive components. The solid-state storage devices (e.g., dies) may be physically divided into planes, blocks, pages, and sectors, as is known in the art. Other forms of storage (e.g., battery backed-up volatile DRAM or SRAM devices, magnetic disk drives, etc.) may additionally or alternatively be used.
Those skilled in the art will appreciate that in some embodiments, other types of data storage systems and/or data retention monitoring can be implemented. In addition, the actual steps taken in the processes described above may differ from those shown in the figures.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of protection. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the protection. For example, the various components illustrated in the figures may be implemented as software and/or firmware on a processor, ASIC/FPGA, or dedicated hardware. Also, the features and attributes of the specific embodiments disclosed above may be combined in different ways to form additional embodiments, all of which fall within the scope of the present disclosure. Although the present disclosure provides certain preferred embodiments and applications, other embodiments that are apparent to those of ordinary skill in the art, including embodiments which do not provide all of the features and advantages set forth herein, are also within the scope of this disclosure. Accordingly, the scope of the present disclosure is intended to be defined only by reference to the appended claims.