This application relates generally to managing data in a memory system. More specifically, this application relates to the operation of a memory system to allow for continued operations in re-programmable non-volatile semiconductor flash memory despite an accumulation of bad memory blocks.
Non-volatile memory systems, such as flash memory, are used in digital computing systems as a means to store data and have been widely adopted for use in consumer products. Flash memory may be found in different forms, for example in the form of a portable memory card that can be carried between host devices or as a solid state disk (SSD) embedded in a host device. These memory systems typically work with data units called “blocks” that can be written, read and erased by a storage manager often residing in the memory system. Flash memory systems are typically marketed with a “declared capacity” that identifies a minimum amount of usable storage space available to a user. For example, the SanDisk Corporation produces microSD flash storage devices with a declared capacity of 2, 4, 8 and 16 gigabytes.
When a file system manager (FSM) supports a given storage device, the FSM learns the declared capacity and builds a table of addressable blocks that spans the range of blocks that make up the declared capacity. This range needs to be respected by both the FSM and the device. If there is a discrepancy between the actual range of the device and the range assumed by the FSM based on the declared capacity, data can be lost or parts of the storage device will be unused.
Due to physical processes well known in flash memory systems, blocks tend to fail and become useless over time. In non-flash memory systems, such as memory systems using magnetic media with physical to logical address translation, the storage device would notify the FSM of “bad sectors” and, as a result, the FSM would mark the associated cluster as “bad” and write the data to another physical location. With flash-based storage devices, where the host accesses logical blocks and not physical blocks, the management of bad storage areas has moved into the storage manager on the storage device, and the FSM neither knows about bad physical blocks nor needs to get involved in marking logical addresses as bad. As the FSM is not directly aware of the failure of a physical block, the storage manager of a storage device needs to replace the bad blocks. In general, flash memory systems are designed such that the storage manager maintains a number of spare blocks that are not visible to the FSM because they are not included in the declared capacity and are thus not part of the range of available blocks the FSM uses based on the declared capacity. When there is a need to replace a bad block, the storage manager replaces the physical address of the bad block with a physical address of a spare block and possibly copies data from the bad block into the spare block. This operation is transparent to the FSM.
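By way of illustration only, the transparent replacement described above can be sketched as a logical-to-physical map that the storage manager edits when a block fails. The class name, the method name and the choice of Python are assumptions made for this sketch and are not part of the disclosure; copying of data out of the failed block is omitted.

```python
class StorageManager:
    """Hypothetical model of a storage manager that remaps bad blocks."""

    def __init__(self, operative_count, spare_count):
        # Logical block i initially maps to physical block i.
        self.l2p = {i: i for i in range(operative_count)}
        # Spare physical blocks lie outside the range the FSM can address.
        self.spares = list(range(operative_count, operative_count + spare_count))
        self.bad = set()

    def replace_bad_block(self, logical):
        """Swap the failed physical block behind `logical` for a spare block."""
        if not self.spares:
            raise RuntimeError("spare blocks exhausted")
        failed = self.l2p[logical]
        self.bad.add(failed)
        # The FSM keeps using the same logical address; only the mapping changes.
        self.l2p[logical] = self.spares.pop(0)
        return self.l2p[logical]


sm = StorageManager(operative_count=8, spare_count=2)
sm.replace_bad_block(3)
```

In this sketch the set of logical addresses never changes, which is why the FSM does not observe the replacement; the only externally relevant effect is that the stock of spare blocks shrinks by one.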
A problem arises when the stock of spare blocks is exhausted, and there are no more spares to replace bad blocks. As the FSM expects to see the full range of available blocks, and as the storage device cannot deliver, the storage device cannot continue to serve the FSM in an ordinary way. Some vendors address this situation by declaring the storage device to be faulty once the spare blocks have run out and prevent any further use of the storage device. Other vendors switch the storage device into a “read only” mode, hiding from the FSM the fact that there are some writeable blocks. The FSM can then only retrieve the pre-written data from the storage device and back-up the written data to another storage device. The problem of spare block exhaustion can arise even if the storage device is almost empty, and the user will be disappointed and surprised to discover that nothing more can be written into that storage device. Thus, the storage device may have its ordinary life ended, even though most of the blocks in the storage device may be in perfect condition.
In order to enable a user to continue to use a storage device after the original spare blocks are exhausted or fall below a minimum amount, a system and method for managing bad blocks is disclosed. The storage device may include a first set of blocks, defined as operative blocks, a second set of blocks, defined as spare blocks, and a mechanism for re-defining an operative block as a spare block.
According to a first aspect, a storage device includes a non-volatile memory having a first set of physical blocks, defined as operative blocks, that are visible to a host of the non-volatile memory, and a second set of physical blocks, defined as spare blocks, that are hidden from the host of the non-volatile memory. The storage device also includes a controller in communication with the non-volatile memory, where the controller is configured to re-define operative blocks as spare blocks. The re-definition of operative blocks as spare blocks may be in response to detecting a shortage of spare blocks in the non-volatile memory or to a user request to increase performance by increasing a number of spare blocks. A shortage of spare blocks may be defined as the number of spare blocks falling below a predetermined threshold.
In a second aspect, a method for managing bad memory blocks in a storage device includes detecting a shortage of spare blocks in the non-volatile memory at a controller of the non-volatile memory and, in response to detecting a shortage of spare blocks, the controller re-defining an operative block as a spare block. The controller may redefine an operative block as a spare block by communicating with a file system manager to determine which operative block or blocks to convert to spare blocks. The determination of which operative blocks to convert may be based on a selection of the deepest operative blocks where the capacity of the storage device is recorded as reduced. The deepest block refers to the usable block that is the last operative block in the memory. Alternatively, the determination may be based on a selection by the file system manager (FSM) of clusters associated with blocks other than the deepest block where the FSM maintains the same addressable range but frees up operative blocks by either classifying clusters associated with operative blocks as bad or by creating a dummy file that is never accessed by the FSM.
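As a non-limiting sketch, the shortage test and the selection of the deepest operative blocks might be expressed as follows. The function name and parameters are hypothetical. The sketch assumes that a shortage ends once the spare count is restored to the threshold, and that deeper blocks carry higher logical numbers, which need not hold for every block nomenclature.

```python
def blocks_to_convert(spare_count, threshold, deepest_logical):
    """Return the logical blocks to convert, deepest first ([] if no shortage)."""
    # A shortage is taken to mean: fewer spare blocks than the threshold.
    if spare_count >= threshold:
        return []
    missing = threshold - spare_count
    # Assumption: the deepest block has the highest logical number.
    return list(range(deepest_logical, deepest_logical - missing, -1))


selected = blocks_to_convert(spare_count=1, threshold=3, deepest_logical=99)
```

With these example values two spare blocks are missing, so the two deepest logical blocks (99 and 98) are selected and the recorded capacity would shrink by two blocks.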
In yet another aspect, a method of managing memory blocks is disclosed that permits a user to re-define operative blocks as spare blocks in a storage device even before the stock of spare blocks is exhausted. These additional spare blocks may be used by the storage manager of the storage device for improving performance or endurance of the storage device. The storage device may notify the host of one or more options for the number of operative blocks to be redefined as spare blocks, and the host may then inform the user. The storage device may also provide information to the user via the host regarding the performance benefits that may be achieved for each of the different options.
Although not intended to be limiting in any way, the following terms may be used in this document and may take the meaning provided below.
File System Manager (FSM)—circuitry, software, or a combination of circuitry and software that manages the file system. The FSM may be located on the host of a memory system, but optionally may be in the controller inside the storage device of a memory system.
Cluster—a unit of storage space allocation that the file system manages where a cluster may include one or more sectors.
Block—a unit of storage space that a storage device can manage, read from and write to. A block has one or more pages, where a page is the minimum unit of reading or writing. A block is an internal storage unit that translates into sectors and clusters from the FSM's point of view.
Operative block—a physical block that is currently accessible to the FSM through a group of logical addresses (clusters) referred to as a logical block.
Spare block—a physical block that is hidden by the storage manager from the FSM such that the spare block is not in the addressable logical space visible to the FSM. Spare blocks are kept in reserve by the storage manager and used to replace operative blocks that become unusable.
Bad block—bad blocks are blocks that have been found to be bad or unusable and marked as such so that they are unavailable to the FSM or the storage manager.
Bad cluster—a range of addresses (cluster) that the FSM designates as “bad”, despite not being associated with any defective blocks, so that the operative block or blocks associated with the clusters may be sacrificed for use as spare blocks by the storage device.
Declared Capacity—The storage capacity that is presented by the storage device to the FSM. This capacity is a commercial commitment of the storage vendor.
Deepest Block—the usable block which is the last operative block in the memory. This term is used, instead of the common term “highest block”, as the nomenclature of blocks can be such that the lowest numbered block is the last block.
A non-volatile memory system 100 suitable for use in implementing aspects of the invention is shown in
The host 102 of
A block device driver 118 executed by the controller 120 of the storage device 104 manages communication with the host 102 over the interface 116. The storage manager 121 executed by the controller 120 of the storage device 104 manages the blocks 124 in memory 122. The controller 120 may convert between logical addresses of data used by the FSM 112 and physical addresses of the memory 122 during data programming and reading. The memory 122 includes physical blocks 124 of flash memory that each consist of a group of pages, where a block 124 is a group of pages in a flash storage device and a page is a smallest unit of writing in the memory 122. The blocks 124 in the memory 122 include operative blocks 130 that are represented as logical blocks to the FSM 112. Some of the blocks 124 are bad blocks 128 that have been found to be bad or unusable and marked as such so that they are unavailable to the FSM 112. Others of the blocks 124 are spare blocks 126 that are not available to the FSM 112 and are used by the controller 120 to replace bad blocks 128. A capacity register 119 identifying a current capacity of the storage device 104 may be maintained in non-volatile memory within the controller 120.
In another embodiment, as shown in
Phase 310 shows the state of the storage device 300 after some usage. During the time before phase 310 (not shown), erase cycles were performed on various blocks of the memory. As a result, some blocks are wearing out. Phase 310 occurs when the storage device 300 recognizes that block 318 has become a bad block that needs to be added to the list of bad blocks 306. In order to support the declared capacity of the storage device 300, the block 318, which has become a bad block and therefore unusable, should be replaced with one of the spare blocks 304. Phase 312 shows the state of the storage device 300 after adding block 318 to the list of bad blocks 306 and replacing block 318 with spare block 320. The process of replacing the bad block 318 with the spare block 320 may be accomplished by the storage manager, transparently to the file system manager, reassigning the physical address of the bad block to the spare block. Phase 314 shows the results of phases 310 and 312 where the number of spare blocks 304 is decreased by one and the number of bad blocks 306 is increased by one.
Line 322 in
Referring now to
The storage device 300 first determines the number of missing spare blocks needed to bring the spare block count above the minimum 322. In this example, the storage device needs one spare block. The storage manager 121 reports to the FSM 112 that it needs one spare block. In one embodiment, where the communication protocol between the FSM 112 and the storage manager 121 allows the FSM 112 to be the sole initiator of communication with the storage manager 121, the reporting can be done by way of the storage manager 121 setting a flag to modify the status of the next “write” command to tell the FSM 112 that there is an issue with the spare blocks, and that a dialog with the storage manager 121 is needed. Upon the next “write” command from the FSM 112, the storage manager 121 reports to the FSM 112 that there is a spare block shortage. Alternatively, the reporting by the storage manager 121 to the FSM 112 may be accomplished by way of a polling mechanism where the FSM 112 periodically checks with the storage manager 121 to see if more spare blocks are needed, rather than checking only when a next write command is issued. In another embodiment, the storage manager 121 can interrupt the FSM 112 and then the FSM 112 reads the information from the storage manager 121. In another embodiment, both the storage manager 121 and the FSM 112 can initiate commands and the storage manager 121 can send a message to the FSM 112 identifying the required number of spare blocks.
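The first reporting option, in which the FSM is the sole initiator and the shortage is signalled through the status of a write command, can be sketched roughly as below. The status codes, class and method names are invented for illustration and do not correspond to any particular host interface; the sketch also assumes the request restores the spare count to the minimum level.

```python
STATUS_OK = 0
STATUS_SPARE_SHORTAGE = 1  # hypothetical status value, not a real protocol code


class Device:
    """Hypothetical device that only ever answers host-initiated commands."""

    def __init__(self, spares, minimum):
        self.spares = spares
        self.minimum = minimum
        self.data = {}

    def write(self, lba, payload):
        self.data[lba] = payload
        # The shortage notice rides on the status of an ordinary write.
        if self.spares < self.minimum:
            return STATUS_SPARE_SHORTAGE
        return STATUS_OK

    def query_spares_needed(self):
        # Answered only when the host opens the dialog.
        return max(0, self.minimum - self.spares)


def host_write(device, lba, payload):
    """Write, then start the dialog if the status flags a shortage."""
    status = device.write(lba, payload)
    if status == STATUS_SPARE_SHORTAGE:
        return device.query_spares_needed()
    return 0


dev = Device(spares=2, minimum=3)
needed = host_write(dev, lba=0, payload=b"x")
```

The polling variant would differ only in that the host calls `query_spares_needed` on a timer rather than in reaction to a write status.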
Phase 402 shows the response of the FSM 112 to learning the number of spare blocks needed to bring the total number of spare blocks above the minimum level 322. Here, one block is needed to bring the total number above the minimum 322. The FSM 112, which manages the file system in terms of clusters in logical space, selects a number of clusters that correspond to the number of required blocks (equal or greater). In this example, one cluster is assumed to span an address range equal to a physical block so a one-to-one correlation exists between clusters and blocks. The FSM 112 selects one cluster that is shown in
In one embodiment, the FSM 112 marks the cluster associated with block 406 as a bad cluster in order to free up the block 406 for use as a spare block.
In another embodiment, the FSM 112 may instead create a dummy file, or add the cluster to an existing dummy file, that collects all the clusters the FSM 112 needs to free for the storage manager 121 in order to free up block 406. The storage manager 121 provides either the number of blocks and the size of a block to the FSM 112 or the required space in terms of known units, such as kilobytes or megabytes. The FSM 112 knows the size of a cluster and can determine the number of clusters required based on the size of a block and the number of blocks being requested or by the amount of space specified in known units. The data in the dummy file is not valid and the dummy file may be any necessary size.
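The conversion from a request expressed in blocks, or in known units of space, to a number of clusters is simple arithmetic, sketched below. The function name, parameter names and the sizes used in the example are assumptions. The sketch rounds up so that the clusters cover at least the requested space, and it ignores alignment of clusters to block boundaries.

```python
def clusters_needed(cluster_bytes, block_count=None, block_bytes=None, space_bytes=None):
    """Clusters the FSM must free to cover the requested space (rounded up)."""
    if space_bytes is None:
        if block_count is None or block_bytes is None:
            raise ValueError("give either space_bytes or block_count and block_bytes")
        space_bytes = block_count * block_bytes
    # Integer ceiling division: a partly covered cluster still has to be freed.
    return (space_bytes + cluster_bytes - 1) // cluster_bytes


KIB = 1024
by_blocks = clusters_needed(32 * KIB, block_count=1, block_bytes=128 * KIB)
by_space = clusters_needed(32 * KIB, space_bytes=100 * KIB)
```

With a hypothetical 128 KiB block and 32 KiB cluster, one requested block corresponds to four clusters; a request for 100 KiB also rounds up to four clusters.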
Phase 404 shows the block arrangement in the storage device after it disconnects the physical block 406 from its logical address and adds (as shown by arrow 412) the operative block to the list of spare blocks 304. The number of spare blocks was increased by one as expected, and the ordinary operation of the storage device is rehabilitated such that data may still be written to the storage device. In the two embodiments discussed above with respect to
In another embodiment, a variation of the methods described above with respect to the phases of
Attention is now called to
Phase 504 shows the state where the memory changes the distance to the deepest block of the operative blocks 302 to be one less than the previous value and the number of spare blocks 304 is increased by one as requested. The new deepest block 510 is the block prior to the former deepest block 506. Note that an advantage of the embodiment of
As described herein, reference to the number of spare blocks decreasing below the minimum level is intended to mean that the number of spare blocks has decreased below the minimum level considered necessary for the device to permit write operations, typically preset by the manufacturer, such that rehabilitation is triggered. When rehabilitation is triggered and spare blocks are created from operative blocks in any of the embodiments discussed above, more than the minimum number of spare blocks may be created. The FSM and storage manager 121 may be configured to create more than just the minimum number of spare blocks so that the trigger is not immediately reached again. This hysteresis between the minimum and actual number of spare blocks may be useful if the process of rehabilitation is time or energy consuming. The rehabilitation process may be time and energy consuming if, for example, significant amounts of data need to be moved out of the operative block selected for conversion to a spare block. In that case, it is desirable to bring the number of spare blocks well beyond the minimum level so as to minimize the number of times that the storage device's operations are interrupted.
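The effect of the hysteresis can be illustrated with a small simulation. Everything here is a sketch under assumed values: the function names, the `margin` parameter and the numbers are not taken from the disclosure, and each failure is assumed to consume exactly one spare block.

```python
def spares_to_request(spare_count, minimum, margin):
    """Blocks to convert once the count drops below the minimum (else 0)."""
    if spare_count >= minimum:
        return 0
    # Refill to `margin` blocks beyond the minimum instead of just to it.
    return (minimum + margin) - spare_count


def count_rehabilitations(failures, start_spares, minimum, margin):
    """Count how often rehabilitation interrupts the device."""
    spares = start_spares
    triggers = 0
    for _ in range(failures):
        spares -= 1  # one spare block is consumed per failed block
        request = spares_to_request(spares, minimum, margin)
        if request:
            spares += request
            triggers += 1
    return triggers


without_margin = count_rehabilitations(8, start_spares=2, minimum=2, margin=0)
with_margin = count_rehabilitations(8, start_spares=2, minimum=2, margin=3)
```

In this example, eight block failures cause eight rehabilitations when the count is only restored to the minimum, but two when each rehabilitation adds a margin of three blocks, at the cost of giving up operative capacity earlier.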
Referring now to
When the FSM 112 notices the special status report, it initiates a dialogue with the storage device to learn how many spare blocks are needed. The FSM 112 then frees the required number of operative blocks, starting from the deepest operative block working its way backwards, by moving clusters of data to empty operative blocks 302 (at 906). The FSM 112 then confirms to the storage manager 121 that the clusters are free and that the address of the deepest block can be modified (at 908). The storage manager 121 then changes the address of the deepest block (see 508 in
In the discussion above, although reference is made to the FSM and storage manager arrangement of
In another implementation, there may be times when more operative blocks end up being re-defined as spare blocks than are actually necessary. For example, the storage manager may have requested a default number of additional spare blocks to replenish the spare block supply and provide a buffer beyond the minimum number required based on a presumed rate of failure of operative blocks. If the storage manager later notes that the rate of replacement is lower than first assumed, it may release some of the spare blocks that were converted from operative blocks back to being operative blocks. This re-tasking of a spare block to an operative block, where the spare block was a previously re-tasked operative block, may be implemented for those spare blocks that were previously obtained from operative blocks using any of the methods of
It should be noted that the variations of the embodiments described above can be used to rehabilitate a storage device even if the FSM is very simple and cannot cooperate with the storage device as described above. For example, rehabilitation of a storage device with too few spare blocks is possible if a storage device having a storage manager configured to execute one of the embodiments is a memory card used in a camera or a digital recorder that cannot carry out the role of the FSM as described.
In such cases, the storage device will behave as a typical storage device, and when the reservoir of spare blocks is exhausted, it may implement one of the following variations of the methods of
When doing a recovery formatting, the computer backs up the contents of the storage device, reformats the storage device (it is assumed that the bad blocks will remain bad blocks through the reformatting, so that the storage device is reformatted with an insufficient number of spare blocks), and an application on the computer may handle a user interface to interact with the user to ask the user how much they wish to increase the spare block reservoir. The storage manager sends information to the computer regarding the minimal required amount of spare blocks needed for rehabilitation. In addition, optionally, it can also send the computer a recommendation for a larger number of spare blocks for performance improvements, where performance is considered improved by increasing the spare block level. The computer will accept a response from the user and interact with the FSM and storage manager to achieve the changes authorized by the user. The storage manager will apply the method of
In an embodiment where the storage manager incorporates the method of
After reconfiguring the memory card as indicated above, it will be recognized in a host with a simple FSM, such as the digital camera mentioned above, and the camera will either know what the declared volume of the device is (if the deepest block is changed as in the implementation of
In yet other embodiments, a storage device 104 may be configured to permit adjustment of the amount of spare blocks 126 to enhance performance even before the stock of spare blocks 126 is exhausted. In other words, a user of a new storage device may insert the device into a computer and be presented with a performance table that will allow the user to select a performance level for the device that correlates to a number of spare blocks to be maintained. Generally, increasing the number of spare blocks may improve performance of a storage device, for example, extra spare blocks may be used to create or increase the size of a memory cache in the storage device.
In another embodiment, there may be instances where the storage manager determines that it has converted more operative blocks to spare blocks than it really needed. In this situation, the storage manager may permit the additional spare blocks (that were previously operative blocks) to be converted back into operative blocks. These particular blocks may be released for use as operative blocks by the storage manager changing the declared capacity in the capacity register of the storage device so that the deepest block is increased and the FSM can see that it has more addressable operative blocks. After the device has been in use for a time and more spare blocks are needed for it to remain functional, the storage device can then increase the number of spare blocks by converting operative blocks 130 to spare blocks 126, decreasing usable capacity in any of the ways described above. By increasing or decreasing the number of spare blocks from the original value that the vendor has provided, the user can customize the trade-off between the capacity and the life expectancy of the storage device.
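A minimal sketch of this two-way adjustment follows, modelling the capacity register as a single block count. The class and field names are hypothetical. The sketch assumes that only spare blocks previously taken from operative blocks may be released, and it does not model the data relocation required before a conversion.

```python
from dataclasses import dataclass


@dataclass
class ResizableDevice:
    capacity_blocks: int       # value held in the capacity register
    original_spares: int       # spare blocks provided by the vendor
    converted_spares: int = 0  # spare blocks taken from operative blocks

    def convert(self, n):
        """Re-define the n deepest operative blocks as spare blocks."""
        if n > self.capacity_blocks:
            raise ValueError("cannot convert more blocks than are operative")
        self.capacity_blocks -= n
        self.converted_spares += n

    def release(self, n):
        """Return up to n previously converted spare blocks to operative use."""
        n = min(n, self.converted_spares)  # vendor spares are never released
        self.capacity_blocks += n
        self.converted_spares -= n
        return n


dev = ResizableDevice(capacity_blocks=1000, original_spares=20)
dev.convert(5)
released = dev.release(8)
```

Here a request to release eight blocks releases only the five that were converted earlier, so the capacity register returns to its original value and the vendor-supplied spare blocks are untouched.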
In this embodiment, where the number of extra spare blocks may be reduced by releasing spare blocks that were previously operative blocks, the FSM may present the user with a single recommendation for a performance improvement level, a group of incremental improvement levels or a combination of the two. The user may select an option and that option is relayed via the host user input device to the storage device for implementing as described.
A system and method for managing bad blocks to extend the life of a storage device has been disclosed. A storage device is capable of changing its declared capacity from a fixed, given capacity to a dynamic capacity that decreases with time, giving the FSM and the user the option to continue operating with the reduced capacity. In one embodiment, a procedure for changing the declared capacity of the storage device includes, upon detection of a shortage in spare blocks, the storage device determining the available capacity, reserving a predefined number of spare blocks, and notifying the FSM. The FSM, in turn, shuffles data inside the storage device to release a number of blocks at the end of the storage device and indicates to the storage device that these blocks can be released. Then the storage device turns these blocks into new spare blocks and decreases the deepest block.
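The data shuffle performed by the FSM in this embodiment can be sketched as follows. The representation (a list indexed by logical block, with `None` marking an empty block) and the function name are assumptions for illustration, as is the one-to-one correspondence between clusters and blocks.

```python
def shrink_capacity(blocks, needed):
    """Free the last `needed` blocks by moving their data to empty blocks.

    Returns (kept_blocks, moves), where each move is (source, destination).
    """
    new_len = len(blocks) - needed
    kept = list(blocks[:new_len])
    free = [i for i, data in enumerate(kept) if data is None]
    moves = []
    # Work backwards from the deepest block.
    for src in range(len(blocks) - 1, new_len - 1, -1):
        if blocks[src] is None:
            continue  # nothing to relocate
        if not free:
            raise RuntimeError("not enough empty operative blocks to relocate data")
        dst = free.pop(0)
        kept[dst] = blocks[src]
        moves.append((src, dst))
    return kept, moves


blocks = ["a", None, "b", None, "c", "d"]
kept, moves = shrink_capacity(blocks, needed=2)
```

After the moves, the two deepest blocks hold no valid data, so the storage device can turn them into spare blocks and record a deepest block two positions earlier.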
In another embodiment, upon detection of a shortage in spare blocks the storage device determines the number of missing spare blocks and reports this number to the FSM. The FSM may select a number of clusters that cover the required amount of blocks and marks these clusters as bad clusters. Once the FSM declares these “sacrificed” blocks as bad clusters and notifies the storage device, the storage device disconnects the corresponding physical blocks from their logical addresses, and adds these physical blocks to the list of spare blocks. In this implementation the deepest block is not changed.
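A rough sketch of the bad-cluster variant is given below. The state labels, the function name and the dictionaries are hypothetical, one cluster is assumed to span exactly one block, and only clusters that are already free are sacrificed, so no data relocation is modelled.

```python
FREE, USED, BAD = "free", "used", "bad"


def sacrifice_clusters(cluster_state, l2p, spares, needed):
    """Mark `needed` free clusters as bad and hand their blocks to the device."""
    chosen = [c for c, state in sorted(cluster_state.items()) if state == FREE][:needed]
    if len(chosen) < needed:
        raise RuntimeError("not enough free clusters to sacrifice")
    for cluster in chosen:
        cluster_state[cluster] = BAD         # FSM side: never allocate it again
        spares.append(l2p.pop(cluster))      # device side: unmap and add to spares
    return chosen


cluster_state = {0: USED, 1: FREE, 2: USED, 3: FREE}
l2p = {0: 10, 1: 11, 2: 12, 3: 13}
spares = []
chosen = sacrifice_clusters(cluster_state, l2p, spares, needed=1)
```

The cluster table keeps all four entries after the call, which reflects that the addressable range, and hence the deepest block, is left unchanged in this implementation.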
In another embodiment, upon detection of a shortage in spare blocks, the storage device determines the number of missing spare blocks and reports this number to the FSM. The FSM selects a number of clusters that cover the required amount of blocks and generates a new dummy file that consists of these clusters. By creating this dummy file and leaving it in the storage permanently, the FSM withholds the corresponding blocks from further use. The FSM then notifies the storage device that these blocks can be released, and the storage device then releases the corresponding physical blocks and adds them to the list of spare blocks. Note that in this implementation, as in the previous one, the highest block count (position of the deepest block) is not changed. Instead of creating a new dummy file each time the storage device asks for a rehabilitation of the spare block list, the FSM can add the clusters to an existing, appendable dummy file. Also, as noted above, in other implementations the storage manager can operate to re-define operative blocks as spare blocks without interacting with an FSM.
It is therefore intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that it is the following claims, including all equivalents, that are intended to define the spirit and scope of this invention.
This application claims the benefit of U.S. Provisional App. No. 61/207,555 filed Feb. 13, 2009, the entirety of which is incorporated herein by reference.
Number | Date | Country
---|---|---
61207555 | Feb 2009 | US