1. Field of the Invention
The present invention relates generally to flash and non-volatile memory, and more specifically to increasing file system efficiency for flash memory devices.
2. Description of Related Art
Flash memory is a type of non-volatile memory that is commonly used in a wide variety of processing devices such as computer systems, computer terminals, cameras, handheld devices, music and video players, game consoles, and other electronic systems. Flash memory is a solid state form of memory that is used for the fast, easy and compact storage of data. Examples of flash memory include the BIOS chip of a computer, CompactFlash™ and SmartMedia™ memory cards, PCMCIA flash memory cards used in notebook computers, and the like.
Flash memory may be controlled by a file system of a processing device via a software layer known as the flash translation layer. The flash translation layer may include a series of routines that emulates a sector-addressable device for the file system to enable the file system to access and store data on storage units within the flash memory device. The file system driver manages the file system of the processing device in which the flash device is used, which may be a computer, dumb terminal, PDA, etc. For example, in a Windows environment, a computer terminal may implement an operating system which uses a FAT or NTFS file system. Where the computer terminal or other device includes flash memory and the processor performs reads or writes to the flash memory, the flash translation layer ordinarily receives the read/write requests from the file system driver. Thereupon, the flash translation layer accesses the flash hardware directly, applying an appropriate mapping. Other methods for accessing the flash hardware may be suitable depending on the processing device and applications involved.
In many configurations, flash memory contains sectors, or read/write units referenced by the file system driver for data storage or access. A plurality of sectors may correspond to one or more pages using a mapping scheme. A group of pages may form a memory block. Memory mapping is ordinarily used to translate logical addresses from the file system into physical addresses associated with the flash device. In one implementation, memory mapping is performed at the sector level, so that each logical sector referenced by the file system driver corresponds to a physical page on the flash device. The logical sector number of each page may be stored in the redundant area, which represents a dedicated portion of each sector, outside of the normal data area, for storing certain relevant information about the page, the block containing the page, or the flash device. A mapping table for the entire device may be stored in another memory or storage mechanism used by the underlying device, such as in the random access memory in the case of a personal computer. Ordinarily indexed by a logical sector number, entries in the mapping table contain the page corresponding to the logical sector. The flash translation layer may in some implementations read the redundant area of each page to acquire the logical sector numbers and build the mapping table when the system powers up.
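The conventional power-up scan described above can be sketched as follows. This is a minimal illustration only: the `Page` record, its `logical_sector` field, and the function name are hypothetical and do not correspond to any particular flash translation layer. An erased page is modeled as having no logical sector number.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    # Hypothetical model of one page's redundant area: the logical sector
    # number stored there, or None when the page is erased.
    logical_sector: Optional[int]


def build_map_full_scan(pages: List[Page]) -> Dict[int, int]:
    """Read every page's redundant area and map logical sector -> physical page."""
    mapping: Dict[int, int] = {}
    for physical_page, page in enumerate(pages):
        if page.logical_sector is not None:
            mapping[page.logical_sector] = physical_page
    return mapping
```

In this sketch the number of redundant-area reads equals the number of pages on the device, which is the cost that the remainder of this disclosure seeks to reduce.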
Nand flash file systems generally use a similar mapping of logical units to physical units. Among other features, this mapping allows the flash translation layer to move around data from one block to another for wear-leveling purposes. In particular, it may be undesirable to use the same memory blocks over and over while other blocks remain substantially unused over time. The chronically-used memory blocks may eventually sustain sufficient wear to corrupt the device, a problem which could have been avoided by a broader distribution of the allocation of memory blocks, thereby increasing the life span of the flash device. Wear-leveling techniques accomplish this purpose by randomizing the allocation of new blocks for data storage, and by actively swapping out data from blocks which are updated infrequently.
Flash memory provides great advantages in, among other features, its durability, small size, solid state nature, and its capability to retain data on power off. The use of flash memory, however, is not without its disadvantages. First, the flash memory generally must be initialized upon power-up of the processing device. Each sector's mapping information is typically stored in a redundant area of that sector. Using a redundant area associated with each sector allows the mapping information to be written at the time the sector is written.
A problem that has persisted in the art is the difficulty for file systems to access each sector and create a mapping table in a timely fashion. As noted above, a mapping table may be stored in the volatile memory of a device (e.g., RAM). The process of accessing each redundant area of every sector on a flash device to generate a mapping table may cause highly undesirable slowdowns in the power-up sequence of the device as the file system proceeds to scan every sector. In one illustration, and depending on the configuration, it may take up to or greater than 25 microseconds for a file system to address each sector in order to create the necessary mapping between physical and logical addresses. Such a technique at start up can create unacceptable delays in the power-up of the computer or other underlying device.
A further problem relates to the size of the mapping table. The more information the file system must collect from the flash device, generally the more complex the mapping information. This complexity results in the requirement of more storage space in memory. Accessing each sector of a flash device can produce an undesirably large and complex mapping table, reducing memory availability for other applications.
Various techniques to help address these problems have been proposed in the literature. For example, one method attempts to speed up performance at power-up by consolidating all of the mapping information in a file while the device is running. The file can then be summarily accessed upon power-up, thereby removing certain steps in the initialization process. A major shortcoming of this procedure is that the mapping file must be written or updated every time the system is powered down. This method adds complexity to the initialization and shut down processes, and consumes additional flash space to store the file. Further, this method increases the delay associated with power down of the underlying processing device. Consequently, rather than solving the problem associated with slower computer performance, the proposed method merely shifts the time delay from the power-up stage to the power down stage.
Another proposed method for decreasing power-up initialization time is to increase the size of the mapped unit. For example, instead of using 512-byte sectors, a system could use 1 KByte, 2 KByte, 4 KByte, or larger sectors. However, the use of larger mapped units increases wasted space due to fragmentation, and reduces performance because of the additional write/erase cycles required for the wasted space.
Still another method employs a multi-level mapping system, such that only the first level map is stored in RAM. This method can reduce power-up time in some implementations because it is not necessary to build the entire mapping table at power-up. Multi-level mapping systems, however, increase complexity and slow operations as the tables must be built and rebuilt at run-time.
Accordingly, it is an object of the invention to provide a faster and more efficient method and system for initializing flash memory devices used in a computer or other processing device.
It is a further object of the invention to reduce the storage space requirements for the mapping table generated at power-up.
The objects of the invention are realized in accordance with the principles of the invention by providing a method and system for reducing the time required for initialization of the flash device at power-up and for minimizing storage requirements when generating the mapping table. The method and system rely in part on the natural tendency of file systems to read and write to storage devices sequentially. The initialization routine as disclosed herein may perform a scan of a given block and may build the mapping table without reading all of the pages in the block. The flash device may also be programmed to employ a special flag in the last page of a block to enable the initialization routine to determine whether the optimization process can be applied to the block, or alternatively, whether deleted sectors are present in that block, thereby requiring the process to read each sector of the block to build the map. Power-up initialization of the flash device is generally completed when the mapping table incorporating physical-to-logical addresses is generated for the set of blocks contained within the flash device.
In one aspect of the present invention, a method of generating mapping table data for a memory block within a non-volatile memory device on power-up of a processing device in which the non-volatile memory device is used, the block comprising a plurality of pages, each page associated with a logical sector number and a consecutive sector count, and the last page of the block further comprising a flag indicating whether deleted pages in the block are present, the method comprising reading the last page of the block; identifying the status of the flag, the logical sector number, and the consecutive sector count of the last page; reading one or more additional pages of the block when the consecutive sector count indicates that not all sectors associated with the block are consecutive; and recording mapping table data for the block without reading all of the pages in the block when the flag indicates that no deleted pages in the block are present.
In another aspect of the present invention, a method of initializing a non-volatile memory device of a processing device in which the non-volatile memory device is used, the non-volatile memory device comprising a plurality of units, each unit comprising a known plurality of sub-units, each sub-unit comprising a logical sub-unit number and a consecutive sub-unit count, the method comprising: reading one of the sub-units of one of the units; reading one or more additional sub-units when the consecutive sub-unit count of the one of the sub-units indicates that not all sub-units associated with the one of the units are consecutive; identifying the logical sub-unit number of the sub-units read; and generating mapping table data for the one of the units without reading all of the sub-units in the unit.
In still another aspect of the invention, computer-readable media embodying a program of instructions executable by a computer program to perform a method of initializing a non-volatile memory device of a processing device in which the non-volatile memory device is used, the non-volatile memory device comprising a plurality of units, each unit comprising a known plurality of sub-units, each sub-unit comprising a logical sub-unit number and a consecutive sub-unit count, the method comprising: reading one of the sub-units of one of the units; reading one or more additional sub-units when the consecutive sub-unit count of the one of the sub-units indicates that not all sub-units associated with the one of the units are consecutive; identifying the logical sub-unit number of the sub-units read; and generating mapping table data for the one of the units without reading all of the sub-units in the unit.
In yet another aspect of the invention, a method of generating mapping table data for a memory block within a non-volatile memory device on power-up of a processing device in which the non-volatile memory device is used, the block comprising a plurality of pages, the method comprising: reading a page of the block; identifying a consecutive sector count associated with the page; and recording mapping table data for the block without reading all of the pages in the block when the consecutive sector count indicates that consecutive sectors are present in the block.
In still another aspect of the invention, a processing device comprising a non-volatile memory device, the memory device comprising a block of pages, the processing device further comprising media embodying a program of instructions executable by the processing device to perform a method of generating mapping table data for a memory block within a non-volatile memory device on power-up of a processing device in which the non-volatile memory device is used, the block comprising a plurality of pages, the method comprising reading a page of the block; identifying a consecutive sector count associated with the page; and recording mapping table data for the block without reading all of the pages in the block when the consecutive sector count indicates that consecutive sectors are present in the block.
Other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein only certain embodiments of the invention are shown and described by way of illustration. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modification in various other respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
Aspects of the present invention are illustrated by way of example, and not by way of limitation, in the accompanying drawings, wherein:
The detailed description set forth below in connection with the appended drawings is intended as a description of various embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. Each embodiment described in this disclosure is provided merely as an example or illustration of the present invention, and should not necessarily be construed as preferred or advantageous over other embodiments. The detailed description includes specific details for the purpose of providing a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the present invention.
A block diagram of an exemplary computer system 100 using a flash device 120 is shown in
The flash initialization routine in accordance with one embodiment may be stored in a separate nonvolatile memory on the device. Alternatively, the flash initialization routine may be stored on a hard drive, BIOS, or other storage device. The file system driver may be loaded into RAM 122 upon power-up. The file system driver may thereafter access the flash device 120 using the initialization routine according to the present invention.
The flash translation layer 240 then issues the request to the flash hardware 244, as shown conceptually by arrow 242. The flash translation layer 240 issues page read/write requests directly to the flash hardware, which requests include the physical address(es) of the page(s) to be written.
Mapping may be applied at the sector level, such that, in one configuration, each logical sector number referenced by the file system driver corresponds to a physical page of the flash device. (Note that in alternative implementations, a sector may be larger than a physical page, and will then correspond to multiple physical pages. In this case, the physical pages corresponding to a particular sector will typically be a block of consecutive pages that is mapped as a single unit, with a single logical sector number. For a given implementation, all sectors will be the same size, and will contain the same number of physical pages.) In
In one embodiment, the logical sector number of each page is stored in the redundant area of the page. A mapping table for the entire flash device may be stored in RAM. In
In one aspect of the present invention, an improved initialization method and apparatus is disclosed. The improved method and apparatus may comprise a routine that takes advantage of the fact that write requests originating from the file system driver typically specify sequential sectors. In particular, if the file system requests a write of data to an arbitrary sector designated x, it is likely that the next write request will be to write data to sector x+1. This observation is based on the principle that most file system drivers (e.g., FAT, NTFS, etc.) are optimized for use with disk drives. When using disk drives, it is more efficient as a general matter to store a file's sectors sequentially on the disk to avoid excessive seeks and delays waiting for the disk to spin. Accordingly, all things being equal, the reads and writes to and from the disk will be sequential rather than random or arbitrary.
The present invention takes advantage of this characteristic of file systems. The system and method of power-up initialization of the flash memory device depends on the sequential allocation of physical pages. As the file system driver issues write requests to the flash translation layer, the flash translation layer may allocate pages sequentially within the block. This sequential allocation means that consecutive logical sectors are ordinarily found in consecutive physical pages on the media.
Consecutive Sector Count
Sectors versus Blocks (or Other Units)
Generally, depending on the implementation, consecutive logical sectors are not allocated onto consecutive physical pages of a flash device on an infinitely large scale. As an illustration, Nand flash devices may be organized into blocks, with each block comprising many pages. Each block constitutes, in many implementations, the smallest collection of pages that can be erased in a single erase operation. In these devices, physical pages are typically allocated sequentially within a block, but not necessarily between blocks. Stated differently, after the translation layer writes data to a consecutive sequence of pages within a first block (and proceeds to allocate the remaining pages in that block), the file system may write additional data continuing on a second block that is not sequentially adjacent to the first block.
One reason for non-sequential allocation at the block level is the process of “garbage collection.” Garbage collection is a method for moving non-obsolete data from blocks so that those blocks can be erased and the pages containing the obsolete data can be reused. Another possible reason for this non-sequential allocation is “wear leveling.” As noted earlier, wear leveling is the process of allocating or moving data to and from different blocks to prevent any given block from prematurely wearing out or becoming corrupt due to chronic overuse. In these embodiments, both garbage collection and wear leveling tend to randomize the allocation of data at the block (or larger) level. However, whether or not the block data is substantially random, the data within the blocks will still tend to be sequential. In other embodiments, and depending on the specific implementation of the flash memory and/or the file system, units larger than a block may instead be randomized, with blocks and/or smaller units in some instances remaining sequential.
Initializing Sectors on Power-Up
According to the present invention, it is no longer necessary for the translation layer on power-up to automatically read every sector to build the mapping table. Instead, in one embodiment, the flash translation layer reads the redundant area of the last page of a block, and uses the count of the number of preceding sequential pages in that block to fill in the same number of additional entries in the mapping table. This method, which is based on the assumption that the data in the pages (where data has been written) was written sequentially, means that the flash translation layer need scan nothing but the redundant area of the last sector of the block being addressed. Thus, in the situation where data is allocated sequentially within the entire block, the mapping table can be created for this block by simply reading the redundant area of the last sector. Note that, if at run time a translation layer allocates pages to sequentially increasing addresses, then the mapping table initialization process at an ensuing start-up starts from the highest page in the block and moves to lower addresses. Accordingly, the initialization routine at power-up reads the logical sector number in the redundant area of the last sector. Knowing the number of sectors in the block (or the number of remaining sectors), the routine can simply fill in the logical sector numbers for the preceding consecutive sectors of the block by reducing the logical sector number by one for each previous consecutive sector.
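The fill-in step for a fully sequential block may be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes that every page of the block was allocated sequentially, so that a single read of the last page's redundant area supplies the only logical sector number needed.

```python
from typing import Dict


def map_sequential_block(first_page: int, pages_per_block: int,
                         last_logical_sector: int) -> Dict[int, int]:
    """Fill mapping entries for a whole block from its last page alone.

    Assumes all pages in the block hold consecutive logical sectors, so the
    sector number decreases by one for each step toward lower page addresses.
    """
    mapping: Dict[int, int] = {}
    last_page = first_page + pages_per_block - 1
    for offset in range(pages_per_block):
        mapping[last_logical_sector - offset] = last_page - offset
    return mapping
```

For instance, assuming a four-page block occupying physical pages 4 through 7 whose last page holds logical sector 296, `map_sequential_block(4, 4, 296)` yields entries for sectors 293 through 296 after one redundant-area read rather than four.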
Thus, as an illustration, for initializing block 407 in
In other situations, the pages may not be allocated sequentially, requiring that the flash translation layer read additional pages in the block for initialization. For example, in initializing block 408 at startup, the flash translation layer would read the redundant area of page 11 as well as the redundant area at page 8. This is because in block 408, not all pages were sequentially allocated. Nevertheless, even in this instance, initialization time is saved because the flash translation layer does not have to read every page in the block, as is typical of existing solutions.
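A block containing more than one run of consecutive sectors might be handled as in the following sketch. It assumes, for illustration only, that the redundant area of each page yields a pair (logical sector number, consecutive sector count), where the count is the number of immediately preceding pages in the same block that continue the run. The `read_redundant` callable and the sector numbers in the usage note are hypothetical, and erased or deleted pages are not handled here.

```python
from typing import Callable, Dict, Tuple


def map_block_by_runs(
    first_page: int,
    last_page: int,
    read_redundant: Callable[[int], Tuple[int, int]],
) -> Tuple[Dict[int, int], int]:
    """Map a block by reading only the last page of each consecutive run.

    Returns the mapping (logical sector -> physical page) and the number of
    redundant-area reads that were needed.
    """
    mapping: Dict[int, int] = {}
    reads = 0
    page = last_page
    while page >= first_page:
        sector, count = read_redundant(page)
        reads += 1
        # Never follow a run past the start of the block.
        count = min(count, page - first_page)
        for offset in range(count + 1):
            mapping[sector - offset] = page - offset
        # Jump to the last page of the preceding run, if any.
        page -= count + 1
    return mapping, reads
```

With a four-page block at pages 8 through 11 in which pages 9 to 11 form one run and page 8 stands alone, the routine reads only pages 11 and 8, consistent with the example above.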
After initializing a block, such as block 406 in the example of
Because the translation layer no longer need read the redundant area of each and every page of every block at power-up, substantial time savings can be achieved and the initialization routine can be dramatically quicker than in existing implementations. In addition, memory space may be saved in generating the mapping table because, for many blocks, the mapping table can be built using only a single logical sector number and the total number of sectors in the block.
The actual amount of performance improvement that will be achieved depends on, among other factors, the type of data being stored on the flash device. For instance, database files where page updates are permitted may routinely become fragmented (i.e., nonsequential), and consequently the blocks or other units containing these pages may not benefit much. That is, the file system generally must scan each page of these blocks to build a coherent mapping table, in a manner to be explained further below. Conversely, executable or other files that seldom (if ever) change after they are written to flash will likely significantly contribute to an overall faster flash initialization routine at startup.
Deleted Pages
In determining whether or not the translation layer need only scan the last page of sequences of consecutive pages within any given block as noted above, the translation layer can rely on the presence or absence of deleted pages within that block as explained herein. When the data in a logical sector needs to be updated, the file system in some embodiments writes, via the flash translation layer, the updated data to a new physical page and then “flags” the obsolete physical page for deletion by writing one or more bits to a dedicated flag in the redundant area. Some architectures use other methods, such as relying on sequence numbers to keep track of the most current version of the data. In any case, once a page is deleted in the manner described above, that page can no longer be considered a part of a sequential chain of pages.
Ordinarily, the deleted status of any given page may not be detected during the power-up initialization process because the optimization might cause the redundant area for the deleted page to be skipped. Accordingly, in another aspect of the invention, a flag may be maintained in the redundant area of the last page of each block, which flag will be set if any pages in that block are deleted. For convenience and to distinguish this flag from other “deleted” flags associated with each sector in some architectures, this new flag is referred to herein as the “deleted page(s) present” flag. The power-up initialization routine causes the translation layer to read at least the redundant area of the last page of each block. Where the “deleted page(s) present” flag is set, the translation layer determines that it must scan the entire block to build the mapping information for that block. Stated differently, the “deleted page(s) present” flag in the last page of a block may be used to inform the driver that the optimization techniques described herein should not be used for that block.
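The interaction between the per-page “deleted” flag and the “deleted page(s) present” flag can be sketched as follows. The `Redundant` record and the function names are hypothetical; the sketch models the flags as booleans rather than as bits in a redundant area.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Redundant:
    # Hypothetical in-memory model of one page's redundant area.
    logical_sector: int
    deleted: bool = False                 # per-page "deleted" flag
    deleted_pages_present: bool = False   # meaningful on the last page only


def mark_deleted(block: List[Redundant], index: int) -> None:
    """Run time: flag a page as obsolete and record that fact on the last page."""
    block[index].deleted = True
    block[-1].deleted_pages_present = True


def needs_full_scan(block: List[Redundant]) -> bool:
    """Power-up: one read of the last page decides whether to scan every page."""
    return block[-1].deleted_pages_present


def full_scan(first_page: int, block: List[Redundant]) -> Dict[int, int]:
    """Fallback: read every page and skip the pages flagged as deleted."""
    return {
        r.logical_sector: first_page + i
        for i, r in enumerate(block)
        if not r.deleted
    }
```

The design choice illustrated is that the cost of detecting deletions is paid once, at the time of deletion, so that power-up can still decide from a single read whether the optimization applies.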
For block 507, the translation layer reads the redundant area of the last page and maps the physical page 7 to logical sector number 296. Here, the sixth bit is set and thus no deleted pages are present. Accordingly, the optimization process can be used in connection with block 507, and only the last page of the block need be read in order to create the mapping table for that block. For block 508, the translation layer reads the redundant area of physical page 11 and determines that one or more deleted pages are present. As with block 506, the translation layer will then proceed to read each sector in block 508 to build the mapping table on power-up.
While most flash devices limit the number of times that a page can be written without erasing the block, most allow at least two “partial page program” writes to a page. This operation may be necessary to set the “deleted page(s) present” flag, because the last page may have already been programmed with data before the first deletion occurred. In the typical Nand flash architecture (to which the principles of the present invention are not limited), writes can only change “1” bits to “0” bits, and the input data for all other bits that are not to be programmed must be “1's”. Multiple writes to the same page generally increase the chance of an unintentional bit change—i.e., an error—somewhere else in the same block. However, if the architecture is such that the additional write is restricted to changing only a few bits in one byte (such as may be required for setting a deletion flag as discussed above), the chance of an error may be minimized.
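The programming constraint described above, under which a write can only change “1” bits to “0” bits, behaves like a bitwise AND of the stored value with the input data. A minimal sketch, with hypothetical function names, follows.

```python
def program_byte(current: int, data: int) -> int:
    """Model a Nand-style program operation on one byte.

    A bit ends up 0 if it was already 0 or if the input data bit is 0;
    a 0 bit can never be returned to 1 without erasing the block.
    """
    return current & data & 0xFF


def clear_bits(current: int, mask: int) -> int:
    """Clear only the bits selected by mask; all other input bits are 1."""
    return program_byte(current, ~mask & 0xFF)
```

For example, clearing bit six (mask 0x40) of a stored byte 0xC3 leaves 0x83, and programming 0xFF over any stored value leaves that value unchanged.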
In most embodiments, error correction mechanisms may be used to protect the data content of each page from errors. In some embodiments, one or more separate correction mechanisms may be employed to protect parts of the redundant area, so that it is not necessary to read the data area to verify the integrity of the redundant area and build the mapping information. To save space in the redundant area, in some embodiments the “deleted page(s) present” flag is combined with the last sector's count of consecutive sectors by using an invalid sector count to flag the presence of deleted sectors. In this embodiment, the “deleted page(s) present” flag is set by clearing bit six, and this is done in conjunction with other changes in the byte, which maintain the validity of the parity bit. The lower six bits are no longer needed if the “deleted page(s) present” flag is set, and so can be cleared to maintain parity. One simple solution is to clear the entire byte when a deleted page is present. If there is an error, and one bit fails to clear, the parity will be incorrect, and the error will be detected. If the byte is zero, or parity is invalid, then the block may simply default to a manual scan of every sector to build the mapping table in accordance with principles discussed in this disclosure.
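One possible encoding of such a combined byte is sketched below. The layout is an assumption made for illustration: bits zero through five hold the consecutive sector count, bit six is set while no deleted pages are present, and bit seven is an even-parity bit over the whole byte. Even parity is assumed because an all-zero byte then has valid parity while a single bit that fails to clear does not. The function names are hypothetical, and a fully erased (all FF) redundant area is assumed to be detected separately.

```python
from typing import Optional

COUNT_MASK = 0x3F        # bits 0-5: consecutive sector count (assumed layout)
NO_DELETED_BIT = 0x40    # bit 6: set while no deleted pages are present
PARITY_BIT = 0x80        # bit 7: even-parity bit (assumed position)
DELETED_PRESENT = 0x00   # whole byte cleared when a deleted page is present


def parity_ok(byte: int) -> bool:
    """True when the byte contains an even number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


def encode_count(count: int) -> int:
    """Encode a consecutive sector count with the flag bit and parity."""
    if not 0 <= count <= COUNT_MASK:
        raise ValueError("count must fit in six bits")
    byte = NO_DELETED_BIT | count
    if not parity_ok(byte):
        byte |= PARITY_BIT
    return byte


def decode_count(byte: int) -> Optional[int]:
    """Return the count, or None when every sector of the block must be scanned."""
    if byte == DELETED_PRESENT or not parity_ok(byte):
        return None
    if not byte & NO_DELETED_BIT:
        return None
    return byte & COUNT_MASK
```

Under these assumptions a count of three encodes as 0xC3, and any single bit left standing after an attempt to clear the byte decodes as a request for a full scan.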
Improving Performance for Erased Blocks
In another aspect of the invention, the power-up initialization process may be configured to optimize performance when it encounters erased blocks. In configurations where physical pages are always allocated sequentially, starting from the beginning of a block and never moving to another block until the current allocation block is full, there will generally not be more than one block containing both used pages (which could still contain valid data, or could be flagged for deletion) and erased pages. All other blocks will either be fully erased, or will be full of used (currently in use, or deleted, but not erased) pages. In general, an erased page is one whose data has been returned to its initial state (e.g., all bits are “1's”). A deleted page, conversely, is one that has been marked for erasure but whose data remains in the data field (i.e., the page has not been erased yet). The specific architecture of any given flash memory may be different, but this distinction is made here for the purpose of illustration and clarity in the context of a Nand flash hardware architecture.
Fully used blocks may be detected at power-up by reading the redundant area of the last page of each block, as described above. If, however, the bytes of the redundant area according to one embodiment all contain FF values (using hexadecimal notation), then the bits are set to “1's” and the block is either fully erased or it is the one block from which sectors are currently being allocated. This distinction may be determined by reading the first page of the block. Where both the first sector and the last sector are erased in this embodiment, then the entire block is erased. At this point, there is no need for the file system to read the remaining sectors of the block.
Additional performance enhancements may thereby be achieved for fully erased blocks, because the redundant areas of only the first and last pages of the block need be read. If in this embodiment the last sector is erased and the first sector is not erased, then the block at issue constitutes the single current allocation block. The translation layer must then scan every page of that block to build mapping information on power-up.
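The classification of a block from the redundant areas of its first and last pages may be sketched as follows. The `BlockState` names, the `read_redundant` callable, and the 16-byte redundant area used in the usage note are hypothetical. The first page is read only when the last page turns out to be erased.

```python
from enum import Enum
from typing import Callable


class BlockState(Enum):
    FULLY_USED = "fully used"
    FULLY_ERASED = "fully erased"
    CURRENT_ALLOCATION = "current allocation block"


def is_erased(redundant: bytes) -> bool:
    """An erased redundant area reads back as all FF bytes."""
    return all(b == 0xFF for b in redundant)


def classify_block(first_page: int, last_page: int,
                   read_redundant: Callable[[int], bytes]) -> BlockState:
    """Classify a block using at most two redundant-area reads."""
    if not is_erased(read_redundant(last_page)):
        return BlockState.FULLY_USED
    if is_erased(read_redundant(first_page)):
        return BlockState.FULLY_ERASED
    # Last page erased, first page written: pages are still being allocated here.
    return BlockState.CURRENT_ALLOCATION
```

A block whose first and last redundant areas both read as sixteen FF bytes is classified as fully erased without touching its remaining pages, while a block reported as the current allocation block would still be scanned page by page.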
In other embodiments, another mechanism, such as a separate non-volatile memory, is configured to store information regarding the current allocation block across power cycles in the flash device. In this instance, it is not necessary to read the first page of potentially fully erased blocks. Any time the last sector is erased in this embodiment, and the block is known not to be the current allocation block (in this case using the separate memory), then the entire block is necessarily erased and the file system can skip the remainder of the sectors in that block on initialization.
Some implementations employ an “erase marker” to verify that blocks have been erased properly. In these implementations, the pages containing the erase markers may be included in the scan at power-up.
Illustrative Power-Up Initialization Methods
Referring to
In reading the redundant area of the last page within the block, the translation layer first determines whether the redundant area is completely erased (all bytes are FF hexadecimal) (step 606). If yes, then either the entire block is fully erased, or that block constitutes the current allocation block that contains both used and erased sectors. Referring to
Referring back to
In step 622 of
Note that, in other initialization processes, the decision step 608 may precede the decision step 606 in
Returning to step 608 of
Alternative Implementations
The implementation described herein assumes for clarity that the mapped units (sectors) are the same size as physical pages of the flash device. As noted above, however, the principles of the present invention are equally applicable to situations where the mapped unit is a multiple of the page size. Similarly, the present implementation assumes for clarity and simplicity that the erase blocks are the same size as the wear-leveling blocks, and that the initialization routine only performs the optimization techniques on chains of consecutive sectors within these blocks. However, the consecutive sector block may be a different size than the erase block. Alternatively, the wear-leveling block may be larger than either the erase block or the consecutive sector block.
Additionally, some “journaling” type systems use a sequence number to avoid the necessity of flagging sectors for deletion. Where multiple physical pages contain the same logical sector number, only the page associated with the highest sequence number is valid. These types of systems can also take advantage of the concept of consecutive sectors. In this case, the presence of a “deleted page(s) present” (DPP) flag is not necessary, because no deleted pages are present (at least none that are flagged in the conventional manner). It nevertheless may be convenient, depending on the configuration, to handle consecutive sectors at the sub-block level only, to allow the file system the freedom to move blocks around as they are erased or for wear-leveling purposes.
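The sequence-number rule may be sketched as follows. The record layout (physical page, logical sector number, sequence number) and the function name are hypothetical and serve only to illustrate how the highest sequence number selects the valid page.

```python
from typing import Dict, Iterable, Tuple


def build_map_by_sequence(
    records: Iterable[Tuple[int, int, int]],
) -> Dict[int, int]:
    """Map each logical sector to the page carrying its highest sequence number.

    Each record is (physical_page, logical_sector, sequence_number).
    """
    best: Dict[int, Tuple[int, int]] = {}
    for physical_page, sector, sequence in records:
        if sector not in best or sequence > best[sector][0]:
            best[sector] = (sequence, physical_page)
    return {sector: page for sector, (_, page) in best.items()}
```

If logical sector 10 appears on page 0 with sequence number 1 and again on page 2 with sequence number 3, only page 2 is entered in the mapping table, and no deletion flag on page 0 is required.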
Further, the principles of the present invention are not limited to the Nand flash device, and instead can apply to any type of non-volatile memory that has an architecture that allows for sequential reads and/or writes, at least within some unit (e.g., a block). A mapping table may be generated for one or more of the blocks in a memory device, or for all of the blocks.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.