Exemplary embodiments disclosed herein pertain to digital memory used in digital electronic devices. More particularly, exemplary embodiments disclosed herein pertain to flash memory devices.
Computers use RAM to hold the program code and data during computation. A defining characteristic of RAM is that all memory locations can be accessed at almost the same speed. Most other technologies have inherent delays for reading a particular bit or byte. Adding more RAM is an easy way to increase system performance.
Early main memory systems built from vacuum tubes behaved much like modern RAM, except the devices failed frequently. Core memory, which used wires attached to small ferrite electromagnetic cores, also had roughly equal access time (the term “core” is still used by some programmers to describe the RAM main memory of a computer). The basic concepts of tube and core memory are used in modern RAM implemented with integrated circuits.
Alternative primary storage mechanisms usually involved a non-uniform delay for memory access. Delay line memory used a sequence of sound wave pulses in mercury-filled tubes to hold a series of bits. Drum memory acted much like the modern hard disk, storing data magnetically in continuous circular bands.
Many types of RAM are volatile, which means that unlike some other forms of computer storage, such as disk storage and tape storage, they lose all data when the computer is powered down. Modern RAM generally stores a bit of data as either a charge in a capacitor, as in “dynamic RAM,” or the state of a flip-flop, as in “static RAM.”
Non-Volatile Random Access Memory (NVRAM) is a type of computer memory chip which does not lose its information when power is turned off. NVRAM is mostly used in computer systems, routers and other electronic devices to store settings which must survive a power cycle (like number of disks and memory configuration). One example is the magnetic core memory that was used in the 1950s and 1960s.
The many types of NVRAM under development are based on various technologies, such as carbon nanotube technology, magnetic RAM (MRAM) based on the magnetic tunnel effect, Ovonic Unified Memory based on phase-change technology, and FeRAM based on the ferroelectric effect. Today, most NVRAM is Flash memory, which is used primarily in cell phones and portable MP3 players.
Flash memory is non-volatile, which means that it does not need power to maintain the information stored in the chip. In addition, flash memory offers fast read access times (though not as fast as volatile DRAM memory used for main memory in PCs) and better shock resistance than hard disks. These characteristics explain the popularity of flash memory for applications such as storage on battery-powered devices.
Flash memory stores information in an array of floating gate transistors, called “cells”, each of which traditionally stores one bit of information. Newer flash memory devices, sometimes referred to as multi-level cell devices, can store more than 1 bit per cell, by varying the number of electrons placed on the floating gate of a cell.
NOR-based flash has long erase and write times, but has a full address/data (memory) interface that allows random access to any location. This makes it suitable for storage of program code that needs to be infrequently updated, such as a computer's BIOS (basic input/output system) or the firmware of set-top boxes. Its endurance is 10,000 to 1,000,000 erase cycles. NOR-based flash was the basis of early flash-based removable media; Compact Flash was originally based on it, though later cards moved to the cheaper NAND flash.
In NOR flash, each cell looks similar to a standard MOSFET, except that it has two gates instead of just one. One gate is the control gate (CG) like in other MOS transistors, but the second is a floating gate (FG) that is insulated all around by an oxide layer. The FG is between the CG and the substrate. Because the FG is isolated by its insulating oxide layer, any electrons placed on it get trapped there and thus store the information.
When electrons are on the FG, they modify (partially cancel out) the electric field coming from the CG, which modifies the threshold voltage (Vt) of the cell. Thus, when the cell is “read” by placing a specific voltage on the CG, electrical current will either flow or not flow, depending on the Vt of the cell, which is controlled by the number of electrons on the FG.
This presence or absence of current is sensed and translated into 1's and 0's, reproducing the stored data. In a multi-level cell device, which stores more than 1 bit of information per cell, the amount of current flow will be sensed, rather than simply detecting presence or absence of current, in order to determine the number of electrons stored on the FG.
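By way of non-limiting illustration, the multi-level sensing described above can be sketched as a comparison of the sensed cell current against a set of reference currents. The reference values, the two-bit encoding, and the name `sense_cell` are assumptions made for this sketch only; actual devices use device-specific references and encodings.

```python
from bisect import bisect_right

# Hypothetical reference currents (microamps) separating four charge levels.
# These numbers are illustrative only and are not taken from any real device.
REFERENCE_CURRENTS_UA = [10.0, 20.0, 30.0]

# Assumed encoding: more electrons on the FG raise Vt and reduce the current,
# so the lowest current band is the most heavily programmed state and the
# highest band is the erased state (all 1's).
LEVEL_TO_BITS = ["00", "01", "10", "11"]


def sense_cell(current_ua: float) -> str:
    """Translate a sensed cell current into the two bits stored in the cell."""
    level = bisect_right(REFERENCE_CURRENTS_UA, current_ua)
    return LEVEL_TO_BITS[level]


if __name__ == "__main__":
    for current in (5.0, 15.0, 25.0, 35.0):
        print(current, sense_cell(current))
```

A single-bit cell is the special case with one reference current and two bands.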
A NOR flash cell is programmed (set to a specified data value) by starting a flow of electrons from the source to the drain; a large voltage placed on the CG then provides an electric field strong enough to draw the electrons onto the FG, a process called hot-electron injection.
To erase (reset to all 1's, in preparation for reprogramming) a NOR flash cell, a large voltage differential is placed between the CG and source, which pulls the electrons off through quantum tunneling. In single-voltage devices (virtually all chips available today), this high voltage is generated by an on-chip charge pump.
Most modern NOR flash memory components are divided into erase segments, usually called either blocks or sectors. All of the memory cells in a block must be erased at the same time. NOR programming, however, can generally be performed one byte or word at a time.
Low-level access to a physical flash memory by device driver software is different from accessing common memories. Whereas a common RAM will simply respond to read and write operations by returning the contents or altering them immediately, flash memories need special considerations, especially when used as program memory akin to a read-only memory (ROM).
While reading data can be performed on individual addresses on NOR memories, unlocking (making available for erase or write), erasing, and writing operations are performed block-wise on all flash memories. A typical block size will be 64, 128, or 256 KiB or higher.
The read-only mode of NOR memories is similar to reading from a common memory, provided the address and data buses are mapped correctly, such that NOR flash memory is much like any address-mapped memory. NOR flash memories can be used as execute-in-place memory, meaning that they behave as a ROM memory mapped to a certain address.
When unlocking, erasing or writing NOR memories, special commands are written to the first page of the mapped memory. These commands are defined as the common flash interface (defined by Intel Corporation of Santa Clara, California) and the flash circuit will provide a list of all available commands to the physical driver.
NAND Flash uses tunnel injection for writing and tunnel release for erasing. NAND flash memory forms the core of the removable USB interface storage devices known as “keydrives.”
NAND flash memories cannot provide execute-in-place due to their different construction principles. These memories are accessed much like block devices such as hard disks or memory cards. When executing software from NAND memories, virtual memory strategies are used: memory contents must first be paged into memory-mapped RAM and executed there, making the presence of a memory management unit (MMU) on the system absolutely necessary.
Because of the particular characteristics of flash memory, it is best used with specifically designed file systems which spread writes over the media and deal with the long erase times of NOR flash blocks. The basic concept behind flash file systems is: when the flash store is to be updated, the file system will write a new copy of the changed data over to a fresh block, remap the file pointers, then erase the old block later when it has time.
One limitation of flash memory is that although it can be read or programmed a byte or a word at a time in a random access fashion, it must be erased a “block” or “sector” at a time. Starting with a freshly erased block, any byte within that block can be programmed. However, once a byte has been programmed, it cannot be changed again until the entire block is erased. In other words, flash memory (specifically NOR flash) offers random-access read and programming operations, but cannot offer random-access rewrite or erase operations.
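The erase-before-rewrite constraint described above can be illustrated with a small model. This is a minimal sketch under stated assumptions: the class `FlashBlock`, its method names, and the 16-byte block size are hypothetical, and the model simply rejects a second program of the same byte rather than modeling the underlying cell physics.

```python
class FlashBlock:
    """Toy model of one erase block: random-access read and program,
    but rewrite only after the whole block has been erased."""

    ERASED = 0xFF  # an erased byte reads as all 1's

    def __init__(self, size: int = 16):
        self.data = bytearray([self.ERASED]) * size
        self.programmed = [False] * size

    def read(self, offset: int) -> int:
        return self.data[offset]

    def program(self, offset: int, value: int) -> None:
        # Once a byte has been programmed it cannot be changed again
        # until the entire block is erased.
        if self.programmed[offset]:
            raise ValueError("byte already programmed; erase the block first")
        self.data[offset] = value
        self.programmed[offset] = True

    def erase(self) -> None:
        # Erase is only available for the block as a whole.
        size = len(self.data)
        self.data = bytearray([self.ERASED]) * size
        self.programmed = [False] * size
```

In this model, changing a single byte that has already been programmed requires erasing, and then reprogramming, every byte of the block that is to be kept.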
When compared to a hard disk drive, a further limitation is the fact that flash memory has a finite number of erase-write cycles (most commercially available EEPROM products are guaranteed to withstand 10^6 programming cycles), so that care has to be taken when moving hard-drive based applications, such as operating systems, to flash-memory based devices such as CompactFlash. This effect is partially offset by some chip firmware or file system drivers by counting the writes and dynamically remapping the blocks in order to spread the write operations between the sectors, or by write verification and remapping to spare sectors in case of write failure.
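As a non-limiting sketch of the write-counting approach mentioned above, a driver might direct each new write to the free block that has been erased the fewest times. The function name `pick_block` and the data layout are assumptions for illustration; real firmware typically combines this with additional policies.

```python
from typing import Dict, Set


def pick_block(erase_counts: Dict[int, int], free_blocks: Set[int]) -> int:
    """Return the free block with the lowest erase count.

    Ties are broken by block number so that the choice is deterministic.
    """
    if not free_blocks:
        raise ValueError("no free blocks available")
    return min(free_blocks, key=lambda blk: (erase_counts.get(blk, 0), blk))


if __name__ == "__main__":
    counts = {0: 5, 1: 2, 2: 2, 3: 9}
    print(pick_block(counts, {0, 1, 2}))  # block 1: fewest erases, lowest number
```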
Flash memory devices of the prior art utilize current according to the operations being performed. Each operation consumes fixed current at a fixed performance level. Current usage rises and falls based on the type of operation being performed. These operations are performed serially; one operation must wait until another completes.
Flash memory devices are used in a variety of environments which imply dynamically changing power sources. Although the supply of current may increase or decrease, the current consumption of flash devices of the prior art is unresponsive to these changes.
Since the current consumption of the prior art flash devices is unresponsive, it is sometimes too low and does not utilize available current to perform at a high rate of speed.
Another limitation of the prior art flash devices is that they may inadvertently exceed a maximum current limit, causing a system failure, such as a “system hang” of a laptop computer.
These and other limitations of the prior art will become apparent to those of skill in the art upon a reading of the following descriptions and a study of the several figures of the drawing.
Certain non-limiting exemplary embodiments provide an improved system consisting of a controller and flash devices, the system being capable of dynamically changing its power consumption by changing an internal constant which controls the rate at which data is processed. The internal constant, a “K-value,” is the parameter in the flash that controls the trade-off between speed and current consumption. By dynamically altering the K-value, it is possible to reduce current consumption when the current consumption levels approach the budgeted current limit of, for example, 70 to 80 milliamps. As is well known to those skilled in the art, current consumption is usually highest during program and erase operations. More generally, a high speed or high power mode may be used in the flash or other non-volatile memory (NVM) device. In this mode, the NVM uses more energy in order to improve its performance.
During periods of high activity in the flash, and especially during program and erase operations, current levels may approach the current limit. Failure to maintain current consumption within the current limit may result in various system failure modes, including but not limited to a system “hang”, such as with a laptop computer.
In accordance with certain embodiments, different operations can be performed in parallel to achieve higher performance. For example, a read operation can be performed in parallel with an erase operation. Concurrent operations such as these may induce an over-current condition if too many of them are performed at the same time. By modeling or dynamically measuring current usage, this kind of problem can be anticipated, and a new K-value can be selected according to present operating conditions which avoids the over-current problem. In one embodiment, a toggle mode is used to prevent operations from occurring concurrently when such concurrency would cause an over-current condition. In another embodiment, the same concurrency would be permitted but with a lower K-value.
The dynamic K-value can also be used to gain higher performance when current consumption is low. For example, the flash controller consumes power when orchestrating communications between the various components. Once an operation has begun, however, the power consumption in the controller is reduced because it pertains to initiating the operation but does not continue during the operation. Alternatively, controller power consumption could be reduced because of other activities unrelated to the operation regarding management of wear leveling, etc. In any case, when the controller detects that current consumption will be reduced, it can signal to the flash that it may operate at a higher K-value even after an operation has already commenced. Thus, a portion of the operation is performed at a lower K-value and another portion of the operation is performed at a higher K-value. It is contemplated that many such K-value changes can be performed during an operation. In general, the K-value can be changed at any point before, during or after an operation to achieve the proper current consumption and maximum performance within that current consumption budget.
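The selection of a K-value under a current limit can be sketched as follows. This is an illustrative model only: the table `CURRENT_MA`, the three K-values, and the function `select_k` are assumptions introduced for this sketch, and the milliamp figures are invented. The sketch assumes that a higher K-value means higher speed and higher current consumption.

```python
from typing import Iterable, Optional

# Hypothetical current draw (mA) of each operation type at each K-value.
CURRENT_MA = {
    "read":    {1: 5,  2: 8,  3: 12},
    "program": {1: 20, 2: 30, 3: 45},
    "erase":   {1: 15, 2: 25, 3: 40},
}
K_VALUES = (3, 2, 1)  # tried from fastest to slowest


def select_k(active_ops: Iterable[str], limit_ma: float,
             controller_ma: float = 0.0) -> Optional[int]:
    """Return the highest K-value whose modeled total current fits the limit.

    Returns None when even the lowest K-value would exceed the limit, in
    which case the operations would have to be serialized instead.
    """
    ops = list(active_ops)
    for k in K_VALUES:
        total = controller_ma + sum(CURRENT_MA[op][k] for op in ops)
        if total <= limit_ma:
            return k
    return None
```

Under these invented figures, a concurrent program and erase fit a 70 mA limit at K-value 2 while the controller draws nothing, but only at K-value 1 while the controller draws 20 mA. Calling `select_k` again once the controller has gone quiet corresponds to raising the K-value part way through an operation.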
As will be appreciated by those skilled in the art, current consumption within any device such as, for example, a camera varies over time. In one embodiment, the flash device would be allocated a fixed current budget of, for example, 70 to 80 milliamps. In another embodiment, this budget could be dynamic. The host may communicate the budget to the flash device in a special communication mode at the driver level. This budget can be the result of direct measurement or modeling by the host of the overall current budget of the device as a whole. This embodiment is just one of many possible alternatives to overall current consumption budgeting.
The dynamic K-value can provide an improved data transfer rate for the NROM based flash card using dynamic power management and, as a result, may provide up to a 50% faster data rate. As will be appreciated by those skilled in the art, these techniques may be used in any non-volatile memory system or, in fact, in any non-volatile memory application, such as a cell, an array, an embedded array, a card, etc. In accordance with the dynamic K-value method, it is necessary to develop a table which enumerates each component according to its power consumption values in its various operational modes. These components include, for example, the card interface, the controller, the FIFO, and the flash. Since these various components have multiple operational modes, the power consumption levels for each mode must be enumerated in order to model the power consumption of the flash.
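A table of the kind described above might, purely by way of example, be represented as a mapping from each component to its current draw in each operational mode. The component names follow the text; the mode names and all milliamp values are hypothetical.

```python
from typing import Dict

# Hypothetical current consumption table (mA): component -> mode -> current.
POWER_TABLE_MA: Dict[str, Dict[str, float]] = {
    "card_interface": {"off": 0, "idle": 1, "active": 5},
    "controller":     {"sleep": 1, "active": 15},
    "fifo":           {"off": 0, "active": 3},
    "flash":          {"idle": 1, "read": 10, "program": 40, "erase": 35},
}


def modeled_current(modes: Dict[str, str]) -> float:
    """Sum the modeled current for the given mode of each component."""
    return sum(POWER_TABLE_MA[component][mode]
               for component, mode in modes.items())


if __name__ == "__main__":
    programming = {"card_interface": "active", "controller": "active",
                   "fifo": "active", "flash": "program"}
    print(modeled_current(programming))
```

Comparing the modeled total against the permitted limit indicates how much headroom remains for raising the K-value or for starting a further operation.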
In certain embodiments, only some of the various components would be mapped into a table defining power consumption in the various modes of operation. Certain embodiments activate only the minimum components needed to conduct a specific operation and depend upon the flash device having a model of power consumption and a knowledge of the permitted current limit, and subsequently adjust the K-value and the on/off power status of the various components to consume the permitted level. The controller is able to manage power budgeting in all of the various operational modes; in certain embodiments, operations are done in a power-absolute sequential manner.
In certain situations, the flash device is supplied with an abundance of power and is relatively unconstrained regarding power consumption. In this mode, maximum parallelism is used to achieve the greatest performance increase; the controller is active all the time. By activating some or all of the components in parallel, various operations such as read operations, write operations, and erase operations are performed concurrently. In some cases, multiple operations of the same type may be performed in parallel. As a non-limiting example, multiple erase operations may be performed at the same time in certain embodiments.
This maximum parallelism is an emergent property of the embodiments described herein; as the current budget increases, more and more concurrency is achieved. This graceful scaling of power consumption to utilize available power may be achieved by a variety of embodiments. In one embodiment, a set of rules is implemented in the hardware and software of the flash device. An example of such a rule is: when a certain phase of an operation is reached (prior to the end or at the end of the operation), signal availability so that another operation may commence. Another example of such a rule is to initiate an operation at a certain power level, and then, upon completion of various management and controller operations, the controller enters a sleep mode. Optionally, the performance of any operations in progress in other portions of the flash device may be boosted to proceed at a higher performance level in order to make use of the power that is made available by the controller's auto sleep.
An alternative to the rule based scheme is to implement a scheduler which applies a performance metric to optimize the operation of the device within the current limit. An example of such a metric is overall throughput of the flash device.
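One way such a scheduler might be sketched is as an exhaustive search over sets of candidate tasks and their power levels, keeping the combination with the greatest modeled throughput that stays within the current limit. The function `best_case`, the task names, and every number below are assumptions for illustration; conflict checking between tasks is deliberately omitted from this sketch.

```python
from itertools import combinations, product
from typing import Dict, List, Tuple

# Each power level is (level name, current in mA, throughput in arbitrary units).
Level = Tuple[str, float, float]


def best_case(tasks: Dict[str, List[Level]],
              limit_ma: float) -> Tuple[Dict[str, str], float]:
    """Return the task-to-level assignment with the highest total throughput
    whose total current does not exceed limit_ma."""
    best_assignment: Dict[str, str] = {}
    best_throughput = 0.0
    names = sorted(tasks)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Vary the power level of every task in the subset.
            for choice in product(*(tasks[name] for name in subset)):
                current = sum(level[1] for level in choice)
                throughput = sum(level[2] for level in choice)
                if current <= limit_ma and throughput > best_throughput:
                    best_throughput = throughput
                    best_assignment = {name: level[0]
                                       for name, level in zip(subset, choice)}
    return best_assignment, best_throughput


if __name__ == "__main__":
    candidates = {
        "host_program": [("low", 20, 4), ("high", 45, 8)],
        "bg_erase":     [("low", 15, 2), ("high", 40, 5)],
    }
    print(best_case(candidates, 70))
```

With these invented figures and a 70 mA limit, the search prefers running the host program at its high level alongside the background erase at its low level. An exhaustive search is practical only for the small task sets contemplated here; a larger set would call for a heuristic.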
Once the current level reaches a certain point, there is such an abundance of current that it is possible to perform with maximum parallelism. It should be noted that this maximum parallelism mode may be initiated during the normal operation of the flash device. It is also possible to exit the maximum parallelism mode or other modes dynamically. A non-limiting example of a situation in which maximum parallelism is possible would be a flash device connected to a laptop computer which is plugged into the wall. Since this kind of condition can be entered and exited at the discretion of the user, the flash device is dynamic with respect to this condition according to certain non-limiting embodiments.
As will be appreciated by those skilled in the art, flash memory must be erased before data can be written or programmed into it. It is the responsibility of the controller to maintain a logical to physical mapping between the logical address space used by the host and the physical address space used within the flash. When writing or programming a specific portion of the logical address space the data may be written to an entirely new physical location which is distinct and separate from the previous physical location associated with that logical address. At some point, the physical location which contained the previous data for that logical address must be erased so that the storage can be reused. These erase operations may, however, be deferred. Similarly, wear-leveling operations induce various data moves and erasures during flash operation. These data moves and erasures can be performed in the background during idle time or concurrently with various operations induced by host activity such as read and write operations.
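The logical-to-physical mapping with deferred erasure described above can be sketched as follows. The class `Mapper` and its methods are hypothetical names used only for this illustration, and each physical block is assumed to hold exactly one logical unit, which is a simplification.

```python
from typing import Dict, List


class Mapper:
    """Toy logical-to-physical map that writes to fresh blocks and defers erases."""

    def __init__(self, num_blocks: int):
        self.free: List[int] = list(range(num_blocks))  # freshly erased blocks
        self.l2p: Dict[int, int] = {}                   # logical -> physical
        self.pending_erase: List[int] = []              # stale blocks awaiting erase
        self.store: Dict[int, bytes] = {}               # physical block contents

    def write(self, logical: int, data: bytes) -> None:
        if not self.free:
            raise RuntimeError("no freshly erased blocks; run deferred erases")
        physical = self.free.pop(0)
        self.store[physical] = data
        stale = self.l2p.get(logical)
        if stale is not None:
            # The old copy is not erased now; it is queued for later.
            self.pending_erase.append(stale)
        self.l2p[logical] = physical

    def read(self, logical: int) -> bytes:
        return self.store[self.l2p[logical]]

    def background_erase(self) -> None:
        # Performed during idle time or when current is abundant.
        while self.pending_erase:
            physical = self.pending_erase.pop()
            del self.store[physical]
            self.free.append(physical)
```

The `RuntimeError` path corresponds to the situation in which erase tasks are so backlogged that a host write cannot proceed until background erasure has run.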
Certain embodiments advantageously increase the operations performed in the background when power is abundant or the device is idle. The controller may, in certain embodiments, employ a prioritization technique which will alter the order in which these background operations are performed. One advantage of this kind of reordering of operations is that it can have an enhancing effect on write performance which is a well-known bottleneck in flash memory performance. As an example, the controller may use locality of reference to predict which portions of the logical address space are likely to be written in the near future. By boosting the priority of maintenance operations in areas of the flash that are likely to be written in the near future, a performance enhancement is achieved.
These and other embodiments and advantages and other features disclosed herein will become apparent to those of skill in the art upon a reading of the following descriptions and a study of the several figures of the drawing.
Several exemplary embodiments will now be described with reference to the drawings, wherein like components are provided with like reference numerals. The exemplary embodiments are intended to illustrate, but not to limit, the invention. The drawings include the following figures:
Certain embodiments use direct memory access to improve performance. As will be appreciated by those of skill in the art, direct memory access allows non-volatile memory system 2 to drive an I/O operation with respect to host device 4.
Scheduling and performance optimization of multiple concurrent tasks may be approached in a variety of ways.
Exemplary embodiments include scheduler 26, executive 28 and maintenance task 30. The various data structures used by these processes include current limit 32, host requests 34, background tasks 36, current consumption table 38 and schedule 40. In addition to interacting with these data structures, the sub-processes of
With continuing reference to
With continuing reference to
In one embodiment, at most a single task is selected. In other embodiments, multiple tasks may be selected.
In certain embodiments, the host request data structure 34 is considered to contain tasks which are of higher priority than those contained in background task data structure 36 because host request data structure 34 contains requests originating from host device 4 and any delay in the execution of these requests is perceived as unresponsiveness. In some situations, however, the tasks represented by background task data structure 36 are considered to be high priority tasks such as the situation where a program operation has been requested but the erase tasks are so backlogged that the host request cannot be performed.
In some embodiments, the priority of various of these background tasks is elevated when there is a depletion of freshly erased blocks in non-volatile memory 12. Operation 50 produces a small set of tasks collected from host request data structure 34 and background task data structure 36. In an operation 52, the set of tasks which was selected in operation 50 is reduced by removing any tasks that conflict with presently scheduled tasks represented in schedule 40. An example of a conflict would be a program operation occurring in parallel with another program operation. In certain embodiments, parallelism of this kind would not be allowed. Conflicts like these may be detected by examining the task set generated by operation 50 and the scheduled tasks represented in schedule data structure 40 and applying a set of rules.
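As a non-limiting sketch of the conflict removal of operation 52, the rules may be held as a set of operation-type pairs that are not permitted to run concurrently. The set `CONFLICTS` contains only the program/program example given above; the function names and the representation of a task by its operation type alone are assumptions for this illustration.

```python
from typing import Iterable, List

# Hypothetical rule table: unordered pairs of operation types that conflict.
CONFLICTS = {frozenset(["program"])}  # program in parallel with program


def in_conflict(op_a: str, op_b: str) -> bool:
    """True when the two operation types may not run concurrently."""
    return frozenset([op_a, op_b]) in CONFLICTS


def remove_conflicts(candidates: Iterable[str],
                     scheduled: Iterable[str]) -> List[str]:
    """Drop every candidate that conflicts with an already scheduled task."""
    running = list(scheduled)
    return [op for op in candidates
            if not any(in_conflict(op, other) for other in running)]


if __name__ == "__main__":
    print(remove_conflicts(["program", "read", "erase"], ["program"]))
```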
Such a set of rules could be encoded as a table in one embodiment such as that shown in
The cases generated by the combinatory process of operation 54 also include variations of power levels derived from current consumption table 38 shown in greater detail in
This variety of cases allows the dynamic reallocation of current according to changing conditions, as signaled by various interrupt signals. A task that had previously been allocated a high current level could be reduced if said reduction is advantageous according to a performance metric.
The same set of tasks may be generated more than once by the combinatory engine of operation 54 because the power levels generated are varied from one case to the next. In an operation 56, the case generated by the combinatory engine of operation 54 is evaluated with respect to the best case that has been seen so far in this invocation of process 26 of
In certain embodiments, various data move operations are considered to have multiple parts or phases. For example, a write operation could be considered to have two phases, one which loads data into volatile memory and another which takes that data in volatile memory and programs the non-volatile memory. In one embodiment, these multiple parts are scheduled as two parts of the same operation and are not separated within the scheduler process 26.
The priorities of the various tasks in host request data structure 34 and background task data structure 36 are, in one embodiment, dynamically evaluated based on when these requests were received and the type of task. Certain embodiments boost the priority of erase tasks when the condition arises that there are no more freshly erased physical sectors on which to write.
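A dynamic priority of this kind might, for example, combine a base value for the task type, an aging term, and a boost for erase tasks when no freshly erased sectors remain. The base values, the aging rate, and the size of the boost are invented for this sketch, as are the task type names.

```python
# Hypothetical base priorities: host requests outrank background tasks.
BASE_PRIORITY = {"host_read": 10.0, "host_write": 10.0,
                 "erase": 2.0, "wear_level": 1.0}
AGING_PER_MS = 0.01   # assumed: older requests slowly gain priority
ERASE_BOOST = 100.0   # assumed: applied when no erased sectors remain


def priority(task_type: str, age_ms: float, free_erased_sectors: int) -> float:
    """Evaluate the current priority of a task (higher runs sooner)."""
    value = BASE_PRIORITY[task_type] + AGING_PER_MS * age_ms
    if task_type == "erase" and free_erased_sectors == 0:
        value += ERASE_BOOST
    return value
```

Under these assumptions an erase task ranks below a host write while erased sectors are available and above it once they are exhausted.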
It should be noted that the power supply is often under the control of a user who can, for example, arbitrarily and without warning disconnect one or more sources of power. In some cases, an alternative source of power is available and thus operations may continue. In certain embodiments, the choice of dynamic power configuration will be limited to those configurations with task sets which may endure a sudden drop in power. In one embodiment, task sets which have a total minimum current level that is greater than the minimal current level that can be supplied after such a sudden loss would not be chosen.
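The restriction described above reduces to a simple test on each candidate task set, sketched here under the assumption that every task has a known minimum current at its lowest power level. The function name and the figures used in the example are hypothetical.

```python
from typing import Iterable


def survives_power_drop(min_currents_ma: Iterable[float],
                        fallback_supply_ma: float) -> bool:
    """True when the task set could continue on the fallback supply.

    A task set is rejected when the sum of its minimum currents is greater
    than the minimal current available after a sudden loss of power.
    """
    return sum(min_currents_ma) <= fallback_supply_ma


if __name__ == "__main__":
    # A program (20 mA minimum) with an erase (15 mA minimum) on a 30 mA fallback.
    print(survives_power_drop([20, 15], 30))
```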
Although various embodiments have been described using specific terms and devices, such description is for illustrative purposes only. The words used are words of description rather than of limitation. It is to be understood that changes and variations may be made by those of ordinary skill in the art without departing from the spirit or the scope of the present invention, which is set forth in the following claims. In addition, it should be understood that aspects of various other embodiments may be interchanged either in whole or in part. It is therefore intended that the claims be interpreted in accordance with the true spirit and scope of the invention without limitation or estoppel.
Number | Date | Country
---|---|---
60739450 | Nov 2005 | US