The present invention generally relates to memory devices for use with computers and other processing apparatuses. More particularly, this invention relates to non-volatile (permanent) memory-based mass storage devices that use flash memory devices or any similar non-volatile solid-state memory devices for permanent storage of data.
Mass storage devices such as advanced technology attachment (ATA) or small computer system interface (SCSI) drives are rapidly adopting non-volatile solid-state memory technology such as flash memory or other emerging solid-state memory technology, including phase change memory (PCM), resistive random access memory (RRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), organic memories, and nanotechnology-based storage media such as carbon nanofiber/nanotube-based substrates. Currently the most common technology uses NAND flash memory devices as inexpensive storage memory.
NAND flash memory devices (integrated circuits, or ICs) store information in an array of floating-gate transistors (FGTs), referred to as cells. NAND flash cells are organized in what are commonly referred to as pages, which in turn are organized in predetermined sections of the component referred to as blocks. Each block is the minimum erasable physical data structure of the NAND flash memory space, and the pages of each block are the minimum read and write units. The page size of flash memory has evolved from 512 Bytes to 4 kBytes (kB) and recently to 8 kBytes, with future generations of NAND flash memory devices expected to reach 16 or 32 kBytes page sizes. Although it is possible to perform sub-page reads and writes (programming of NAND flash cells), the most commonly used practice for a read-modify-write operation involves reading out the entire page into the page buffer of the flash memory device and then writing the entire page back to either a different page on the same block or a free page on a different block. The page buffer can be SRAM-based or a register.
NAND flash memory is increasingly gaining importance as a storage medium in mass storage devices such as solid state drives (SSDs) used in computer systems. Current file systems use a 4 kByte cluster size as the smallest allocation unit associated with the file system used by the operating system of a computer. Each cluster comprises a contiguous number of physical sectors wherein each sector is associated with a logical block address (LBA). A typical sector size in the case of hard disk drive technology is 512 Bytes plus parity information. However, some hard disk drives are migrating to a 4 kByte sector size, in which case the physical sector size equals the logical cluster size.
A similar situation exists in the case of NAND flash memory. The controller of an SSD that contains NAND flash memory devices includes a flash translation layer (FTL) that generates the physical addresses for mapping units that can correspond to LBAs, clusters or, at least in theory, to any other unit size. As long as the sector or page size on the storage media equals the cluster size of the operating system, a 1:1 ratio between cluster size on the level of the file system and the page size as the physical memory structure is maintained. Accordingly, for each given cluster that is modified, a single page needs to be re-written. The same ratio is achieved in the case of smaller page or sector sizes by consolidating a contiguous number of sectors or pages. Vice versa, rewriting an entire page to reflect a single modified cluster content does not result in redundant or superfluous re-writing of clusters that have not been modified.
The above discussed balance between the file system and the NAND memory architecture, specifically, the page size, is disrupted with the migration to smaller process geometries and the concurrent increase to page sizes that are a multiple of the file system's cluster or allocation unit size. The problem arises if a single cluster is modified, since each write access will always program an entire page. In other words, as soon as the page size increases to a multiple of the cluster size, the update of a single cluster is no longer a seamless 1:1 match between the updated data set and the physical amount of data that need to be written. Rather, even if only a single cluster is updated, a full page containing several clusters needs to be written.
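The mismatch described above can be illustrated with simple arithmetic. The following is a minimal sketch, assuming a 4 kByte cluster and page sizes that are whole multiples of the cluster size; the function name is hypothetical and is used for illustration only.

```python
def clusters_programmed_per_update(page_size_kb: int, cluster_size_kb: int = 4) -> int:
    """Return how many clusters' worth of data are programmed when a
    single cluster is updated and a full page must be written."""
    if page_size_kb % cluster_size_kb != 0:
        # Assumption of this sketch: pages hold a whole number of clusters.
        raise ValueError("page size must be a whole multiple of the cluster size")
    return page_size_kb // cluster_size_kb


for page_kb in (4, 8, 16, 32):
    factor = clusters_programmed_per_update(page_kb)
    print(f"{page_kb} kB page: {factor} cluster(s) programmed per modified cluster")
```

With a 4 kByte page the ratio is 1:1, whereas a 16 kByte page programs four clusters to update one.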
Strictly speaking, it is not necessarily the cluster or allocation unit size that can generate the above-noted problem, but rather the difference between a physical “mapping unit” corresponding to the cluster or allocation unit generated by the FTL and the page size implemented in the various NAND flash devices. However, as a non-limiting example for illustrating the problem and possible solutions, the mapping unit will be considered equivalent to a cluster.
Even when using large pages spanning several clusters, it is possible to write one single cluster to another page. In this case, it is common practice to combine data to be written with other data through the process of write combining. The original file or cluster will be invalidated within its original page on the level of the file system since the pointer now points to a different physical address. However, for the original page, the result will be invalid clusters within a page containing other clusters that are still valid. In other words, any such page contains a heterogeneous mixture of valid and invalid data. However, it is important to understand that, at present, on the NAND flash device level the entire page can only be treated as a single unit without differentiating between valid and invalid data.
The above discussed problem becomes important in the context of limiting write amplification and performing garbage collection in an efficient manner and without involving a host computer system. Specifically, garbage collection works by consolidating valid pages into fully utilized blocks through rewriting the data to spare blocks. In the process, the original pages are rendered invalid on the level of the file system. Once a block contains only pages with data that are flagged by the file system as invalid, it can be erased through a TRIM command.
It is understood that consolidation of pages containing multiple clusters, and the majority of them being invalid, will result in very poor utilization of the actual capacity of the drive in that in the extreme case only a single cluster of all clusters in a page will have valid data. For example, two pages with a capacity of four clusters but each having only a single valid cluster could be consolidated to a third page storing two valid clusters, thereby utilizing only 50% of the page's capacity for valid data. Currently used strategies can solve this problem by loading the data into the controller, buffering them in some form of cache and subsequently discarding invalid data while combining or “packing” valid page fragments to coherent full pages that are then written back to the array.
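The utilization figures in the example above can be checked with a short calculation. This is an illustrative sketch only; the helper names are hypothetical, and a page capacity of four clusters is assumed as in the example.

```python
def page_utilization(valid_clusters: int, clusters_per_page: int) -> float:
    """Fraction of a page's capacity that holds valid data."""
    return valid_clusters / clusters_per_page


def pages_needed_when_packed(valid_per_page: list, clusters_per_page: int) -> int:
    """Number of full-size pages needed once valid clusters from several
    source pages are packed together (ceiling division)."""
    total_valid = sum(valid_per_page)
    return -(-total_valid // clusters_per_page)


# Two source pages with one valid cluster each, merged into a third page.
merged = page_utilization(valid_clusters=2, clusters_per_page=4)
print(f"merged page utilization: {merged:.0%}")

# Packing four such source pages instead fills one destination page completely.
print(pages_needed_when_packed([1, 1, 1, 1], clusters_per_page=4))
```

The second call shows why packing valid fragments from more source pages is needed to reach full utilization of the destination page.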
The drawback of the above discussed solution is that any data traffic involving more than a single monolithic IC will waste precious bandwidth in that, for example, an entire channel of a controller is occupied whenever the above described consolidation of valid data and discarding invalid data occurs.
In light of the above, it is apparent that new strategies are necessary to add further capabilities to the NAND flash device proper, and particularly for the purpose of enabling the memory device itself to address the mismatch between clusters on the level of the file system and physical page size on the level of the NAND flash device, without occupying and involving other ICs or logic.
For the purpose of disambiguation, the following definitions will be used in this disclosure:
Page size: the size of a page within a NAND flash memory device.
Erase block: a block of NAND flash memory that comprises a plurality of NAND flash pages and is the minimum erasable unit in a NAND flash memory device.
Cluster: the smallest number of contiguous LBAs allocated by a host computer system and equivalent to a file system allocation unit or an FTL mapping unit.
Sector: the smallest physical storage area associated with an LBA; several contiguous sectors form a cluster.
Page buffer: A small amount of SRAM or a register used to buffer the contents of a page of NAND flash.
Page buffer segment: a segment of a page buffer corresponding to a cluster containing several contiguous sectors.
Programmable page buffer segment size: variable size of page buffer segments that is programmed during initialization of a NAND flash device.
The present invention provides a non-volatile memory-based mass storage device, for example, a solid-state drive (SSD), that uses at least one non-volatile solid-state memory device, for example, one or more NAND flash memory devices, that defines a memory space for permanent storage of data, and to methods of using such a mass storage device and memory device.
According to a first aspect of the invention, a memory device is used in a mass storage device that is operatively connected to a host computer system having an operating system and a file system. The memory device includes memory cells organized in pages that are organized into memory blocks for storing data, and a page buffer partitioned into segments corresponding to a cluster size of the operating system or the file system of the host computer system. The size of the page buffer is larger than the size of any page of the memory device.
According to a preferred aspect of the invention, the memory device is a NAND flash memory device, the page buffer is a multiple of a page size of the memory device, and segment sizes within the page buffer may be programmed during initialization of the memory device in order to increase access speed and provide flexibility for use in multiple environments and operating systems.
According to another aspect of the invention, the memory device is a NAND flash memory device, and the page buffer is partitioned into segments corresponding to the cluster size of the file system used by the host computer system. A first page containing a mixture of valid and invalid clusters is read into the page buffer and the clusters are associated with or stored in the segments. The segments containing invalidated clusters are marked for purging. A second page is read into the page buffer and segments containing invalid clusters are marked for purging. Segments containing valid data from both of the first and second pages are re-ordered, consolidated to correspond to a full page, aligned with the page boundaries of the memory device, and written to a third page of the memory device. Overflow segments, that is, valid segments exceeding the segment capacity available in a page, are carried over to be combined with segments corresponding to valid clusters from a fourth page read into the page buffer on a subsequent page read access, and written back to a fifth page as soon as the combined segments correspond to a page size.
According to yet another aspect of the invention, data from at least two pages of a memory block of the memory device are read to the page buffer and segments containing invalid clusters are purged. The valid segments are reordered, consolidated and aligned to page boundaries. The aligned valid segments are written to a third page of the memory. Overflow segments are combined with segments containing valid sectors from a fourth page and written to a fifth page within the same or a different memory block. The first, second and fourth pages are marked as invalid. Once all free pages of the memory block are used up, all valid pages are copied to a new memory block of the memory device. Usage of a new block can also start during consolidation of partially valid pages, for example, the fifth page may be written to a different block than the third page.
Other aspects of the invention include methods for reclaiming pages of a memory device that have a capacity of multiple clusters after individual sectors stored in the pages are invalidated. Each page can store a plurality of clusters. The page buffer of the memory device can buffer multiple pages within segments corresponding to individual clusters. A first page of the memory device containing invalid clusters is read into the page buffer and the invalid clusters are purged. A second page of the memory device containing additional valid and invalid clusters is read into the page buffer and the invalid clusters are purged. Segments of the page buffer containing valid clusters of the first and second pages are combined, aligned with page boundaries of a third page of the memory device, and written to the third page. If the number of segments to be combined exceeds the number of clusters that can be stored in a page of the memory device, the overflow segments are combined with additional valid segments from a fourth page and written to a fifth page of the memory device.
Still other aspects of the invention encompass the use of a page buffer for a memory device, in which the page buffer has a capacity that is a multiple of the page size of the memory device. The page buffer is n-way set associative according to the number of clusters that can be stored in segments of the page buffer. The size of the segments can be programmed during initialization of the memory device depending on operational parameters of the host computer system's basic input/output system (BIOS), the extended system configuration data (ESCD), desktop management interface (DMI) or the file system used by the operating system of the host computer system. The page buffer is further configured to intelligently order logically coherent segments containing clusters from a first and a second page to write them to a third page of the memory device. Left-over segments are carried over for combining them with additional modified segments from a second page of the memory device and writing them to a third page of the memory device after reaching a page size of the combined segments or after a time-out period has been exceeded.
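The programmable segmentation described above can be sketched as a simple partitioning step performed at initialization. The following is a minimal illustration, assuming the segment size is supplied by the host as a byte count; the function name and the returned (index, byte offset) layout are hypothetical and do not represent an actual device interface.

```python
def partition_page_buffer(buffer_bytes: int, segment_bytes: int) -> list:
    """Split a page buffer into equally sized segments.

    Returns a list of (segment_index, byte_offset) tuples. The number of
    entries corresponds to the 'n' of the n-way organization.
    """
    if segment_bytes <= 0 or buffer_bytes % segment_bytes != 0:
        # Assumption of this sketch: the buffer holds a whole number of segments.
        raise ValueError("buffer size must be a whole multiple of the segment size")
    return [(i, i * segment_bytes) for i in range(buffer_bytes // segment_bytes)]


# A 32 kB page buffer programmed for 4 kB clusters yields eight segments.
segments = partition_page_buffer(32 * 1024, 4 * 1024)
print(len(segments), segments[3])
```

Programming a different segment size at initialization, for example 8 kBytes, would yield four segments for the same buffer.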
Another aspect of the invention involves operating the host computer system to write a single cluster of data to the mass storage device where it is committed to the memory device. The page buffer holds the cluster in one of the segments thereof and writes the data to the memory cells after enough page buffer segments to fill an entire page of the memory device have been filled with additional writes from the host computer system or data originating in garbage collection of the mass storage device. In case the system is powered down, the page buffer is flushed and the data are committed to the memory cells even if they do not fill an entire page. Similarly, after periods of inactivity that can be specified using a time-out counter, the data can be committed to the memory cells of the memory device.
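The deferred-write behavior described above can be modeled as follows. This is a minimal sketch under stated assumptions: the class name, method names and the explicit `now` time argument are hypothetical, a page is assumed to hold four clusters, and committing a page is modeled as appending to a list rather than programming memory cells.

```python
class DeferredWriteBuffer:
    """Holds single-cluster writes until a full page can be committed."""

    def __init__(self, clusters_per_page: int = 4, timeout_s: float = 1.0):
        self.clusters_per_page = clusters_per_page
        self.timeout_s = timeout_s
        self.pending = []          # clusters waiting in page buffer segments
        self.last_activity = None  # time of the most recent write
        self.committed = []        # pages written to the memory cells

    def write_cluster(self, data: str, now: float) -> None:
        self.pending.append(data)
        self.last_activity = now
        if len(self.pending) >= self.clusters_per_page:
            self._commit()         # a full page has accumulated

    def tick(self, now: float) -> None:
        # Commit a partial page after a period of inactivity.
        if self.pending and now - self.last_activity >= self.timeout_s:
            self._commit()

    def power_down(self) -> None:
        # Flush whatever is pending, even if it does not fill a page.
        if self.pending:
            self._commit()

    def _commit(self) -> None:
        self.committed.append(self.pending)
        self.pending = []


buf = DeferredWriteBuffer()
buf.write_cluster("A", now=0.0)
buf.tick(now=0.5)                  # still within the time-out, nothing written
buf.tick(now=1.0)                  # time-out reached, partial page committed
for i, name in enumerate(["B", "C", "D", "E"]):
    buf.write_cluster(name, now=2.0 + i * 0.1)
print(buf.committed)
```

Passing the time explicitly keeps the sketch deterministic; an actual device would rely on its own time-out counter.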
Other aspects and advantages of this invention will be better appreciated from the following detailed description.
a shows a 4 MB NAND flash memory block with 16 kB page size and 32 kB, 8-way set-associative page buffer having eight 4 kB segments fetching two pages containing valid and invalid clusters in accordance with an embodiment of the invention.
b shows the same block as
c shows the same block as
Though the present invention is generally directed to non-volatile memory-based mass storage devices, for example, solid-state drives (SSDs), that are capable of using a variety of non-volatile solid-state memory devices, the following discussion will refer specifically to mass storage devices that make use of NAND flash memory devices, in part because NAND flash memory is a non-volatile memory with an extremely low cost per Byte, which makes it particularly suitable for use in mass storage devices.
The internal architecture of NAND flash memory devices causes a few functional idiosyncrasies, for example, data always are written and read in the form of entire pages, a plurality of which forms a block, which in turn is the smallest functional unit for erasing data. For the purpose of the current invention, the organization of NAND flash memory devices into pages as the smallest functional unit for read and write accesses is particularly relevant.
Most modern file systems use a uniform size of the smallest data unit associated with the operating system of a host computer system. In the case of Microsoft® Windows NTFS, this smallest data unit is 4 kBytes. Hard disk drives, which are still the prevailing storage media, are typically configured into physical sectors of 512 Bytes. However, the 4 Kbytes data equivalent is maintained by forming contiguous clusters of sectors. In other words, data management is uncomplicated as long as the smallest accessible physical data carrier is smaller than, or equal to, the smallest data unit associated with the file system. In the case of NAND flash-based storage devices, the flash translation layer generates mapping units that are the physical equivalent of the cluster used by the file system. As a result, each cluster of the file system is stored in one mapping unit generated by the FTL.
The situation becomes more complicated if the file system cluster size or the size of the mapping unit is smaller than a physical sector size, which, as discussed above, is the smallest data structure assigned to an LBA or, by extension, accessible by a read or write process. In this case it is necessary to combine multiple clusters in order to fully utilize the capacity of the sector. In the case of NAND flash, it is not sectors but pages that are the smallest functional units for a single read or program (write) access.
As discussed earlier, the page size of NAND flash memory is increasing along with the transition to smaller process geometries. The latest generations of NAND flash already feature 8 kByte pages, meaning that every page will span two FTL mapping units and hold two 4 kByte file system clusters or allocation units. In the near future, the page size is expected to further increase to 16 kBytes or 32 kBytes and, accordingly, each page will be capable of storing four or eight clusters.
In most cases, this will not become an immediate problem since modern controllers as used, for example, in solid state drives are capable of deferred writes and write combining, thereby combining four or eight clusters before writing them to any page in the NAND flash memory array. There is, however, the possibility that a single cluster write may occur, which would leave a page under-utilized.
Likewise, during garbage collection, pages containing a mixture of valid and invalid clusters may allow reclaiming of invalid clusters by reading the entire page into the controller and, on the controller level, recombining valid clusters from different pages while discarding the invalid data from the pages.
Either one of the above situations involves data transfer from the NAND flash IC to the controller, which means that bandwidth is wasted unnecessarily. The current invention targets this issue by adding data management capabilities to the NAND flash IC in order to be able to carry out write-combining and house-keeping functions internally without the involvement of any other control logic.
As shown in
Newer generations of NAND flash use 8 Kbytes page sizes, as represented by of the block of NAND memory shown
The next increment in page size results in a 16 Kbytes page size or an aggregate capacity of four clusters and the currently used full-page transfer mode results in all four clusters being loaded in a single transfer into the page buffer. For alignment purposes, the page buffer may be segmented as shown in
As shown in
If the number of valid clusters from two pages exceeds the capacity of a single page, the page buffer can hold the valid segment and carry it over to the next cycle in order to coalesce it with data from additional pages, align the valid data to page boundaries and then write them back to a free page.
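The consolidation and carry-over behavior described above can be modeled in a few lines. The following is a minimal sketch, not the device's actual logic: the class and method names are hypothetical, a 16 kByte page holding four 4 kByte clusters and a page buffer of two pages are assumed, invalid clusters are represented by `None`, and purging is simplified to dropping invalid clusters as each page is read.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SegmentedPageBuffer:
    clusters_per_page: int = 4   # assumed: 16 kB page / 4 kB cluster
    pages_in_buffer: int = 2     # assumed: buffer is twice the page size
    segments: List[str] = field(default_factory=list)  # valid clusters only

    @property
    def capacity(self) -> int:
        return self.clusters_per_page * self.pages_in_buffer

    def read_page(self, page: List[Optional[str]]) -> None:
        """Load a page and keep only its valid clusters."""
        valid = [c for c in page if c is not None]
        if len(self.segments) + len(valid) > self.capacity:
            raise BufferError("page buffer segments exhausted")
        self.segments.extend(valid)

    def write_back(self) -> List[List[str]]:
        """Emit page-aligned groups of valid clusters; keep the overflow."""
        full_pages = []
        while len(self.segments) >= self.clusters_per_page:
            full_pages.append(self.segments[:self.clusters_per_page])
            self.segments = self.segments[self.clusters_per_page:]
        return full_pages


buf = SegmentedPageBuffer()
buf.read_page(["C0", None, "C2", "C3"])    # first page, three valid clusters
buf.read_page([None, "C5", "C6", None])    # second page, two valid clusters
third_page = buf.write_back()              # four clusters fill one page
buf.read_page(["C8", None, "C10", "C11"])  # fourth page joins the carry-over
fifth_page = buf.write_back()
print(third_page, fifth_page)
```

In this run, five valid clusters are collected from the first two pages, four are written as one full page, and the remaining cluster is carried over and combined with the valid clusters of the next page read.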
To facilitate the proposed structure and operation, it would be advantageous to add several new NAND flash commands to the existing instruction or command set. Possible command extensions are given below as illustrative, non-limiting examples:
1. An extension to existing commands like read, program, copyback
2. New commands for page buffer manipulation
Command encoding may vary depending on the specific NAND flash IC used.
However, in order to maintain backward compatibility, this should also be a new command.
Examples are now given specifically with reference to the figures. It is noted, however, that these examples are non-limiting and for illustrative purposes only, and other instructions that are functionally equivalent could be substituted for those used here:
Use the new read command. In the
Followed by reading C7 into S3.
The implementation of new NAND flash instructions as discussed above in exemplary form, in combination with a segmented page buffer that is larger than a single page, results in a NAND flash device with built-in intelligent features and reduces the workload on the controller in housekeeping operations such as garbage collection and space reclamation. It is further noted that instead of a strict “cluster” or “sector”-based segmentation, it may be advantageous to define the offset on a byte basis in order to account for variable space requirements of the different forms and levels of error correction used.
While certain components are shown and described for non-volatile memory-based mass storage devices of this invention, it is foreseeable that functionally-equivalent components could be used or subsequently developed to perform the intended functions of the disclosed components. Therefore, while the invention has been described in terms of a preferred embodiment, it is apparent that other forms could be adopted by one skilled in the art, and the scope of the invention is to be limited only by the following claims.