The present invention generally relates to memory devices for use with computers and other processing apparatuses. More particularly, this invention relates to a nonvolatile or permanent memory-based mass storage device using background scrubbing to identify storage addresses that could potentially develop retention problems and proactively copy the data to a different location on the same device using idle periods.
Mass storage devices such as advanced technology attachment (ATA) or small computer system interface (SCSI) drives are rapidly adopting nonvolatile memory technology such as flash memory or other emerging solid state memory technology including phase change memory (PCM), resistive random access memory (RRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), organic memories, or nanotechnology-based storage media such as carbon nanofiber/nanotube-based substrates. Currently, the most common technology uses NAND flash memory as inexpensive storage memory.
Despite all its advantages with respect to speed and price, flash memory has the drawback of limited endurance and data retention caused by the physical properties of the floating gate, the charge of which defines the bit contents of each cell. Typical endurance for multilevel cell NAND flash is currently on the order of 10,000 write cycles at 50 nm process technology and approximately 3,000 write cycles at 4x nm process technology, and endurance is decreasing with every process node. Data retention is influenced by factors such as temperature and the number or frequency of accesses, wherein an access can be either a read or a write. The issue of frequency of accesses is not confined to a cell of interest that holds critical data, but can also encompass any cell in the physical proximity of that cell. In more detail, if a cell is accessed for a read, its floating gate charge may be altered slightly, but at the same time all other cells in the same block are exposed to an even stronger electric field, which can potentially alter their contents. In the case of writes, which often also require a preceding block erase, the disturbance is even greater since both writing and erasing are very harsh processes, requiring exposure to extremely high electromagnetic fields to move electrons from or into the floating gate.
Similar to the case of write endurance, retention rates are progressively getting worse with smaller process geometries. This decreased data retention is related to a thinner tunnel oxide layer, which facilitates leakage currents. Moreover, proximity effects such as read disturb and stress-induced leakage current are becoming increasingly important with smaller process geometry because of the interaction of polarization fields as contributing factors for data leakage from the floating gate.
At present, there is no adequate predictability of when cells will start losing their data since operating temperature, changes in temperature, number of accesses, frequency of accesses and ratio between reads and writes influence the retention through interactions that are poorly understood and difficult to model. However, it is obvious that there is a need for some proactive measure to prevent data loss before it happens, and such measures should include the use of error checking and correction mechanisms to avoid or at least minimize the risk of catastrophic failures.
In mass storage systems, checking of data integrity during periods in which no transfers take place is generally referred to as disk scrubbing, as described by Schwartz et al., Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, Proceedings of the IEEE Computer Society's 12th Annual International Symposium, 409 (Oct. 4-8, 2004). The underlying principle is to use idle periods of drives to check for bad blocks and then rebuild the data in a different location. U.S. Pat. No. 5,632,012 to Belsan describes such a disk scrubbing system. U.S. Patent Application 2002/0162075 to Talagala describes disk scrubbing at the disk controller level wherein the disk controller reads back data during idle phases and generates a checksum that is compared to a previously stored checksum for the same data. Any disparity between the checksums of the area scanned is used to identify bad data and initiates rebuilding of the data at different addresses using redundancy mechanisms.
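The checksum comparison underlying this kind of idle-time scrubbing can be illustrated with a minimal sketch. It is not taken from any of the cited references: the function name `find_bad_blocks`, the dictionary-based block store, and the choice of CRC-32 as the checksum are all assumptions made for illustration only.

```python
import zlib


def find_bad_blocks(blocks, stored_checksums):
    """Return the addresses whose current checksum differs from the stored one.

    `blocks` maps an address to the bytes read back during an idle period;
    `stored_checksums` maps the same address to the checksum recorded when
    the data were written. Both containers are hypothetical stand-ins.
    """
    bad = []
    for addr, data in blocks.items():
        # CRC-32 is an assumed checksum; the references do not prescribe one.
        if zlib.crc32(data) != stored_checksums[addr]:
            bad.append(addr)
    return bad
```

Addresses returned by such a check would then be candidates for rebuilding from redundant data at a different location.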
U.S. Pat. No. 6,292,869 to Gerchman et al. describes the interruption of self-timed refresh upon receiving a scrub command from the system to scrub memory arrays. U.S. Pat. No. 6,848,063 to Rodeheffer teaches memory scrubbing of very large memory arrays using timer-based scan rates, wherein the scan rate can be defined depending on the requirements of the system.
None of the above references takes into account usage patterns that can be employed to initiate proactive scrubbing on demand.
The present invention provides methods, systems and devices for increasing the reliability of solid state drives containing one or more NAND flash memory arrays. The methods, systems and devices take into account usage patterns that can be employed to initiate proactive scrubbing on demand, wherein the demand is automatically generated by a risk index that can be based on one or more of various factors that typically contribute to loss of data retention in NAND flash memory devices.
According to a first aspect of the invention, a method is provided that includes logging timestamps of data writes to addresses of the NAND flash memory device, logging the number of read accesses of the data at the addresses, calculating a risk index based on the age of the data at the address, generating a risk warning if the risk index of the data at the address exceeds a predefined threshold, communicating the risk warning to a memory management unit of the mass storage device, issuing a copy command to copy the data at the address to a different address on the NAND flash memory device, and updating a file index of the mass storage device to reflect the different address of the data.
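By way of a non-limiting illustration, the steps of the first aspect might be sketched as follows. The specification does not prescribe a formula for the risk index, so the linear weighting of data age (in days) and read count, the names `BlockLog`, `risk_index`, and `scrub_pass`, and the use of dictionaries for the logs and the file index are all hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400.0


@dataclass
class BlockLog:
    written_at: float  # logged timestamp of the data write, in seconds
    reads: int = 0     # logged number of read accesses since that write


def risk_index(log, now, age_weight=1.0, read_weight=0.0):
    """Hypothetical risk index: weighted data age in days plus weighted reads."""
    age_days = (now - log.written_at) / SECONDS_PER_DAY
    return age_weight * age_days + read_weight * log.reads


def scrub_pass(logs, file_index, free_addresses, now, threshold):
    """Relocate data whose risk index exceeds `threshold`.

    `file_index` maps a file name to its address and is updated in place.
    Returns the list of (old_address, new_address) moves.
    """
    moved = []
    for name, addr in list(file_index.items()):
        if not free_addresses:
            break  # nowhere left to copy to
        if risk_index(logs[addr], now) > threshold:
            # Risk warning raised: a real controller would issue the copy
            # command to the flash device here.
            new_addr = free_addresses.pop(0)
            logs[new_addr] = BlockLog(written_at=now)  # fresh write timestamp
            del logs[addr]
            file_index[name] = new_addr
            moved.append((addr, new_addr))
    return moved
```

In this sketch the relocated data receive a new write timestamp, so their risk index starts again from zero at the new address.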
According to a second aspect of the invention, a method is provided that includes logging timestamps of data writes to first addresses of the NAND flash memory device, logging the number of read accesses of the data at the first addresses, calculating a primary risk index based on the age of the data at the first address, logging additional addresses of additional writes to the NAND flash memory device, generating a proximity value based on spatial relations between the first addresses and the additional addresses, generating a risk level map based on the proximity value, generating a secondary risk index of the data at the first address by combining the primary risk index with the risk level map, generating a risk warning if the secondary risk index exceeds a predefined threshold, communicating the risk warning to a memory management unit of the mass storage device, issuing a copy command to copy the data at the first address to a different address on the NAND flash memory device, and updating a file index of the mass storage device to reflect the different address of the data.
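The combination of the primary risk index with a proximity-derived risk level map might be sketched as below. The linear falloff with block distance, the `reach` parameter, the additive combination, and all function names are assumptions chosen for illustration; the specification leaves the exact form of the proximity value open.

```python
def proximity_value(first_addr, write_addr, reach=2):
    """Hypothetical proximity value: 1.0 for an adjacent block, falling
    linearly to 0.0 beyond `reach` blocks. A write to the address itself
    contributes nothing, since it replaces the data rather than disturbing it.
    """
    distance = abs(first_addr - write_addr)
    if distance == 0 or distance > reach:
        return 0.0
    return (reach - distance + 1) / reach


def risk_level_map(first_addrs, write_addrs, reach=2):
    """Accumulate the proximity values of all logged additional writes."""
    return {
        a: sum(proximity_value(a, w, reach) for w in write_addrs)
        for a in first_addrs
    }


def secondary_risk(primary, level_map, disturb_weight=1.0):
    """Combine the primary risk index with the risk level map (assumed additive)."""
    return {
        a: primary[a] + disturb_weight * level_map.get(a, 0.0)
        for a in primary
    }
```

A risk warning would then be generated for any first address whose secondary risk index exceeds the predefined threshold.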
According to a third aspect of the invention, a system is provided that includes means for performing the steps of either method described above.
According to yet another aspect of the invention, the means for performing the steps of either method may be at system level or entirely contained within the mass storage device.
Objects and advantages of this invention will be better appreciated from the following detailed description.
The present invention is generally applicable to computers and other processing apparatuses, and particularly to computers and apparatuses that utilize nonvolatile (permanent) memory-based mass storage devices, a notable example of which is mass storage devices that make use of NAND flash memory devices.
As understood in the art, the mass storage device 10 is adapted to be accessed by a host system (not shown) with which it is interfaced. In
According to a preferred aspect of the invention, the reliability of the NAND flash memory devices 18 and their data is promoted through the use of a data management system and method that implements background scrubbing to identify storage addresses on the devices 18 that could potentially develop retention problems, and then proactively copy the data to a different location on the same device 18 during idle periods. In the preferred embodiment, the controller 20 is configured to perform memory management, represented in
While the process described above is described as being initiated and performed on the device controller level, the scrubbing operation can instead be initiated on the system level. In addition, the controller 20 or system can be configured to use back-up power during power-down states of the system to autonomously perform the scrubbing operation.
The controller 20 preferably tracks the write activity to all blocks of the memory devices 18 and performs an analysis to assess which memory blocks of each memory device 18 are close enough to be potentially affected by write activity on adjacent blocks. Depending on their distances from a block to which data are written, the risk levels of all blocks in proximity are increased to some degree. Because updating the data within the wear-leveling information of each block would require additional writes to those blocks and potentially lead to cascading write activity, a separate table is preferably utilized to store this information. This write-disturb information does not require ultimate granularity, but rather a high-level map of the physical block addresses may suffice to assign increased risk to particular areas of the memory devices 18. These areas, in turn, can be prioritized for scrubbing by combining the original risk index 30 with the write-disturb parameters to a secondary risk index.
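A separate, coarse-grained write-disturb table of the kind described above might look like the following sketch. The region size of 64 blocks, the integer increments applied to the written region and its two neighbors, and the class and method names are hypothetical; the point illustrated is only that the table is kept apart from the per-block wear-leveling data, so updating it causes no additional flash writes to the affected blocks.

```python
class WriteDisturbTable:
    """High-level map of risk levels per region of physical block addresses."""

    def __init__(self, num_blocks, blocks_per_region=64):
        self.blocks_per_region = blocks_per_region
        num_regions = -(-num_blocks // blocks_per_region)  # ceiling division
        self.levels = [0] * num_regions

    def record_write(self, block):
        """Raise the risk level of the written region and its neighbors."""
        region = block // self.blocks_per_region
        # Assumed increments: nearest blocks (same region) are affected most.
        for offset, bump in ((0, 2), (-1, 1), (1, 1)):
            r = region + offset
            if 0 <= r < len(self.levels):
                self.levels[r] += bump

    def level(self, block):
        return self.levels[block // self.blocks_per_region]

    def prioritize(self, primary_risk):
        """Order blocks for scrubbing by secondary risk index (highest first)."""
        return sorted(
            primary_risk,
            key=lambda b: primary_risk[b] + self.level(b),
            reverse=True,
        )
```

Because the table holds one counter per region rather than per block, its size stays small even for large devices, at the cost of the reduced granularity that the description above regards as acceptable.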
While certain components and steps are represented and, in some cases, preferred for proactive scrubbing-enabled mass storage devices of the type described above, it is foreseeable that functionally-equivalent components could be used or subsequently developed to perform the intended functions of the disclosed components. Therefore, while the invention has been described in terms of a preferred embodiment, it is apparent that other forms could be adopted by one skilled in the art, and the scope of the invention is to be limited only by the following claims.
This application claims the benefit of U.S. Provisional Application No. 61/235,100, filed Aug. 19, 2009. The contents of this prior application are incorporated herein by reference.
Number | Date | Country
---|---|---
61235100 | Aug 2009 | US