The present invention relates to memory technology based at least in part on the property of giant magnetoresistance (GMR). More specifically, such memory technology is employed in the context of a generalized state machine to render the state machine nonvolatile.
Computers of all kinds, including personal computers (PCs), store their operating systems (OS) and application programs on nonvolatile media like hard disks (HD). Computing configurations with a small OS and few application programs can store all this software directly in the system memory provided this memory is nonvolatile. Most of the systems known as “general” computing configurations use an HD for storage, as they cannot afford a nonvolatile system memory for technology reasons, cost reasons, or both. Most of the computing configurations known as “embedded” use nonvolatile system memory for storing the OS and application programs. Other computing configurations might use a combination of the two storage methods.
Computers are built as a collection of components which, in aggregate, perform all needed system functions. Normal computer operation is typically referred to as the “active mode.” Configurations embodying sequential machines (e.g., algorithmic state machines which form the basis for controllers, microcontrollers and other sequential machines) have their power cut for the time they are out of operation. Often, power is cut temporarily for various reasons, with the intention of reapplying it when needed; in this case the computer is said to be “on standby” for as long as power is not applied. “Waking up” means activating a component after it was switched into the standby mode. When wakened, the component needs access to the information on the last state of the machine prior to having been put on standby. This is a necessary condition for resuming the active mode. Normally, this information is stored in registers. As semiconductor registers are volatile memory cells, they need to be backed up in a nonvolatile scratch pad prior to interrupting power, and then restored after power has been reapplied. These operations would be unnecessary if the registers were nonvolatile. In principle, this can be achieved by using a nonvolatile semiconductor memory such as flash; in practice, this has not proven feasible because flash does not allow byte access, is too slow, and has a limited number of write cycles.
In the same context, more complex computing configurations allow only a limited number of internal components to be switched into the standby mode and then wakened.
Upon power up from the un-powered mode, the entire computing configuration has to be made operation-ready, a process known as “booting”. In its most general form, the tasks involved in booting ensure that (a) all relevant OS and application-program parts are in the system memory; and (b) all system modules (central processing unit, specialized processing units, storage units, input/output units, communication units, etc.) are initialized. The latter operation typically involves loading all registers for each individual system module with the data required to put the system modules in position to commence task execution. Thus, in order to ready the system, computing configurations that do not have a nonvolatile system memory and store essential software in a separate storage module have to perform both operations (a) and (b); those that store essential software in a nonvolatile system memory, only operation (b). These operations have to be performed every time power is reapplied to the computing configuration after having been removed, for whatever reason.
It is well known in the state of the art that the relevant software can be preserved during power-down by replacing DRAM (the dominant technology for main memory in PCs and other general computing configurations) with nonvolatile memory; this is presently done in small computers. Saving register contents has turned out to be more elusive. The time needed to initialize the work memory and the time needed to initialize all system registers depend greatly on the computing configuration. Memory initialization time for the typical PC results from the transfer of the needed OS routines, peripheral drivers, and basic applications (e.g., 10 MB of program) from the hard disk to the main PC memory. Typical times are around 6 sec.
Register initialization time results from the sequential transfer of the content of all system registers (e.g., 256 64-bit registers) and the parameters needed for the network connection of the PC from main memory to the respective registers and communication module memory. This register initialization also includes the time needed to initialize the PC monitor, the graphics module, and the frame memory of the monitor. Current register initialization times for IBM-type PCs with Windows operating system can be as low as 12 seconds, but also much higher, while that for Apple-type computers with Apple operating system can be as low as 7 seconds, but also much higher. Thus, the total minimum time to boot can be as low as 18 and 13 seconds, respectively, but also much higher. These times grow significantly when computers are part of a networked cluster because of the network initialization routines that need to be loaded or restored.
Typical computing architectures have a limited number of internal busses that connect the system modules with system memory. Most have only a single system bus for economic reasons. In a single-bus configuration, the most common configuration for PCs, the system registers are initialized sequentially, making this operation much more time consuming than if it were done in parallel. In a multi-bus system, which is often implemented in embedded computing configurations, the register initialization time might be reduced if register initialization can be performed simultaneously on more than one bus. Because of continuously increasing complexity of general, embedded and hybrid computing configurations, the number of system registers is increasing steadily, which in turn leads to a steady increase in the system initialization time both for single-bus and multi-bus systems.
The main reason computers are turned off for longer periods (e.g. overnight) is to reduce wear-out and to save power. The main reason computers are temporarily turned off is to save power. Saving power in electronic devices is becoming increasingly important for a multitude of reasons. The following example is for a PC, but the concepts are generic and apply to any other computing configuration as well.
Power consumption needs to be reduced for two main reasons: (a) to preserve battery power, and (b) to minimize heat generation in order to keep component packaging affordable. The more power a package has to dissipate, the more expensive it becomes. Beyond a certain level, no package material can help. This is the reason some high-performance microprocessors currently use on-chip mini fans, while others employ forced water cooling. Both methods significantly increase component costs and power consumption.
Fully functioning computing configurations are normally built by assembling a number of components, each of them able to perform a collection of system functions. As component manufacturing technologies evolve, the distribution of functions among the components (function partitioning) is changing. Function partitioning is also changing in order to improve system performance and reduce system power consumption. A further reason to redistribute the number of functions per component is to minimize the number of components that are in the active mode at any given time, thus saving more system power.
System power is saved in today's computers mainly by cutting it off from those system modules that are idle for periods of time. Because of the volatile nature of the semiconductor system-module register set, this operation involves safeguarding the register contents for all modules, i.e. creating a content backup, in a memory area that is either nonvolatile or permanently powered. After power is restored, the register set is initialized by transferring the safeguarded contents back from this memory to the registers.
The disadvantage of having to perform the system initialization operations (a) and (b) detailed above can be significant in many situations. Both the backup and initialization operations are resource intensive and very time consuming. If this technique is used, it requires complex software routines to back up the register content of the modules that are temporarily put into power-saving modes, and to restore the contents once they are brought back into the active mode. For practical reasons, this technique of saving power by selectively cutting it off from system-module registers while the modules are idle is therefore seldom used. Moreover, the technique cannot be used in systems that need to operate in real time or monitor external events. In addition, as PC hardware and software increase in complexity, users have to wait increasingly longer times for the PC to become operational. In applications where the initialization delay is not acceptable, special hardware is incorporated in order to take care of the system during the initialization period; this increases system costs and power consumption. Component volatility thus prevents the system from realizing its entire power-saving potential.
A more common technique of reducing power consumption is to divide the entire system into zones with individual power supplies. Because they have independent power supplies, powering can be done on an as-needed basis; i.e., zones that are not performing any function at a given time can be put on standby. For power-saving reasons, zoning granularity became finer over time, reaching component level; i.e., components are individually powered on an as-needed basis.
The state of the art today is to zone the semiconductor component itself, i.e., the substrate of the component is partitioned into zones that can be individually powered. This means that parts of a semiconductor die can be in the active mode, with other parts in the standby mode. Powering up and down of the different zones, at both the system and component level, requires complex timing, which in turn requires separate power-management logic. Currently, the logic that controls component-zone powering is implemented within the component itself, while the logic that controls system-zone powering is implemented in a separate component.
Current practice is for the power-management logic in computing systems, either on chip or as a separate component, to detect the transition into the power-down mode and generate a power-down signal. In addition, the power-supply module is typically equipped with a power capacitor that creates a power reserve for a certain number of system cycles, generally around ten. That is, once the power-down condition has been detected and a power-down signal generated, the system will still be able to perform approximately ten more cycles until it grinds to a complete halt. These ten cycles add up to 10 ns for a 1 GHz system clock. This time is not nearly enough to back up the system registers into the nonvolatile portion of the main system memory, even if this exists.
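The timing argument above can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the register count of 256 is borrowed from the register-initialization example given earlier, and the figure of one clock cycle per register transfer is an optimistic assumption rather than a value stated in this description.

```python
def holdup_time_ns(reserve_cycles, clock_hz):
    """Time, in nanoseconds, for which the capacitor reserve keeps the system running."""
    return reserve_cycles * 1e9 / clock_hz


def backup_time_ns(num_registers, cycles_per_register, clock_hz):
    """Time, in nanoseconds, to copy the registers out sequentially over one bus."""
    return num_registers * cycles_per_register * 1e9 / clock_hz


CLOCK_HZ = 1e9            # 1 GHz system clock, as in the text
RESERVE_CYCLES = 10       # approximate capacitor reserve, as in the text
NUM_REGISTERS = 256       # example register count used earlier in the text
CYCLES_PER_REGISTER = 1   # assumption: best case of one cycle per transfer

holdup = holdup_time_ns(RESERVE_CYCLES, CLOCK_HZ)
backup = backup_time_ns(NUM_REGISTERS, CYCLES_PER_REGISTER, CLOCK_HZ)
print(f"hold-up {holdup} ns, sequential backup at least {backup} ns")
```

Even under this best-case assumption the sequential backup takes far longer than the reserve lasts, which is the point the paragraph makes.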
A capacitor-type power-down reserve that provides enough reserve for the computer to save all registers into the nonvolatile portion of the main memory is not economically feasible in stationary computers (like desktop PCs, work stations, servers), and even less so in portable (notebooks, laptop PCs) or mobile computers (palm top PCs), which have severe space constraints.
According to the present invention, a nonvolatile sequential machine is provided which includes a semiconductor controller operable to control operation of the nonvolatile sequential machine according to a state machine comprising a plurality of states. The nonvolatile sequential machine further includes a plurality of state registers operable to store the plurality of states. The state registers comprise nonvolatile random-access memory, the operation of which is based on giant magnetoresistance.
According to a specific embodiment of the invention, a device is provided which includes a semiconductor controller operable to control operation of the device according to a state machine comprising a plurality of states. A plurality of semiconductor registers are operable to store the states during an active mode of the device. A plurality of shadow registers are operable to store the states during a reduced power mode of the device. The shadow registers comprise nonvolatile random-access memory, the operation of which is based on giant magnetoresistance (GMR). Interface circuitry is operable to transmit the states between the semiconductor registers and the shadow registers.
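The relationship between the semiconductor registers, the shadow registers and the interface circuitry may be easier to follow with a behavioral model. The Python sketch below is a software analogy only and is not the claimed hardware; the class name, method names and the use of None to represent lost contents are all assumptions made for illustration.

```python
class ShadowedRegisterFile:
    """Behavioral model: volatile registers backed by nonvolatile shadow registers."""

    def __init__(self, size):
        self.size = size
        self.volatile = [0] * size   # semiconductor registers (lose contents without power)
        self.shadow = [0] * size     # stands in for the GMR-based shadow registers
        self.powered = True

    def write(self, index, value):
        if not self.powered:
            raise RuntimeError("registers are not powered")
        self.volatile[index] = value

    def read(self, index):
        if not self.powered:
            raise RuntimeError("registers are not powered")
        return self.volatile[index]

    def enter_reduced_power(self):
        # Interface circuitry transfers the states to the shadow registers.
        self.shadow = list(self.volatile)
        # Model the volatility of the semiconductor registers.
        self.volatile = [None] * self.size
        self.powered = False

    def wake(self):
        # Interface circuitry transfers the states back on wake-up.
        self.volatile = list(self.shadow)
        self.powered = True


regs = ShadowedRegisterFile(4)
regs.write(2, 0xAB)
regs.enter_reduced_power()
regs.wake()
print(hex(regs.read(2)))
```

In this model the state survives the reduced power mode without any software backup routine, which mirrors the role the description assigns to the shadow registers.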
A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.
FIGS. 14(a) and 14(b) are simplified representations of a differential transpinnor for use with specific embodiments of the present invention.
FIGS. 15(a)-15(d) illustrate four different embodiments in which a transpinnor is used to balance a sense-digit/reference line pair.
FIGS. 16(a)-16(e) illustrate the effect of the trimming technique of the present invention on the balancing of sense-digit/reference line pairs.
FIGS. 19(a) and 19(b) are functional block diagrams of ISA-bus IBM compatible personal computer systems according to specific embodiments of the invention.
Reference will now be made in detail to specific embodiments of the invention including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. In addition, well known features may not have been described in detail to avoid unnecessarily obscuring the invention.
Examples of storage cells for use with the present invention are described in U.S. Pat. No. 5,587,943 for NONVOLATILE MAGNETORESISTIVE MEMORY WITH FULLY CLOSED FLUX OPERATION issued on Dec. 24, 1996, and in U.S. Pat. No. 6,594,175 for HIGH DENSITY GIANT MAGNETORESISTIVE MEMORY CELL issued on Jul. 15, 2003, both of which are incorporated herein by reference in their entireties for all purposes. Specific examples of such storage cells will be described below.
Beginning at the upper right quadrant, both top and bottom layers 130 and 134 are saturated in the same direction. If the applied field H is reduced to substantially zero and then reversed in direction, the layer having the lower coercivity switches first, as shown by the cross section in the upper left quadrant. The switching occurs when the field is equal to the sum of the coercivity of the lower coercivity film plus the coupling field.
As the applied field H is increased in the negative direction, the film layer having a higher coercivity switches directions, as depicted in the lower left quadrant. This switching occurs when the field magnitude is equal to the coercivity of the higher-coercivity film less the value of the exchange coupling. Thus, switching is carried out in such films in a two-step process.
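The two-step switching just described can be sketched as a simple threshold model. The Python example below is a hedged illustration, not a physical simulation: it assumes idealized abrupt switching, a sweep that starts from positive saturation, and hypothetical numeric values for the coercivities and the coupling field. The function name is likewise an assumption.

```python
def layer_states(h_applied, hc_low, hc_high, h_coupling):
    """Return (low, high) layer states, +1 or -1, after sweeping the applied
    field down from positive saturation to h_applied.

    Thresholds follow the description: the lower-coercivity layer reverses at a
    reverse field of hc_low + h_coupling, the higher-coercivity layer at a
    reverse field of hc_high - h_coupling.
    """
    low_threshold = hc_low + h_coupling
    high_threshold = hc_high - h_coupling
    low = -1 if h_applied <= -low_threshold else +1
    high = -1 if h_applied <= -high_threshold else +1
    return low, high


# Hypothetical values in arbitrary field units.
HC_LOW, HC_HIGH, H_COUPLING = 10, 30, 5

for h in (0, -10, -20, -30):
    print(h, layer_states(h, HC_LOW, HC_HIGH, H_COUPLING))
```

With these assumed values the lower-coercivity layer reverses between -10 and -20 field units and the higher-coercivity layer between -20 and -30, giving the intermediate antiparallel state described in the text.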
Readout of the memory cell of
Those of skill in the art will understand how each of the states may be written to memory cell 402. That is, layer 406 is magnetized first by the application of a magnetic field which overcomes the layer's coercivity. Because of its lower coercivity, layer 404 is also magnetized in the same direction, at least initially. The antiparallel state of layer 404 may then be written by application of a second magnetic field of the opposite orientation which is sufficient to overcome the coercivity of layer 404 but not layer 406.
The reading of the information stored in memory cell 402 will now be described with reference to
If, on the other hand, there is no difference between R1 and R2, the initial state could have been either “00” or “10”. If all that is desired is to determine the state of the low coercivity layer 404, i.e., “0” in both instances, no further action need be taken. However, if the state of layer 406 must be determined, a second magnetic field may be applied in the direction opposite to the first magnetic field, e.g., to the left in this example, and a third resistance value R3 measured (column 4). If R3-R2 is greater than zero, the initial state is determined to be “00”; if less than 0, the initial state is determined to be “10” (column 5). The initial state is then rewritten to the cell.
Although the descriptions of specific implementations refer to layers having different coercivities (e.g., layers 404 and 406), it should be noted that embodiments are contemplated which employ layers having the same coercivities, relying on alternative mechanisms to effect storage and readout. An example of such a mechanism is the use of localized fields to switch one layer without switching a nearby layer having the same coercivity. Examples of such embodiments are described below.
According to various other embodiments, memory cell designs are provided in which multiple bits of information may be stored in one memory cell. Specific embodiments will be described below in which 2, 3, or 4 bits of information may be stored in one memory cell and which employ either destructive read out (DRO) or nondestructive read out (NDRO). It will be understood, however, that particular ones of these designs may be generalized to store more bits of information than described.
Three embodiments which employ DRO will now be described with reference to
It should be noted that insulation layers are represented by the blank spaces between the layers shown. These layers were omitted for purpose of clarity. In addition, the various layers are shown having different widths for illustrative purposes. However, the layers of actual embodiments are typically the same width. Finally, it will be understood that the vertical dimension of the figures of the application is often exaggerated for illustrative purposes.
A memory module based on the memory cell of
One can understand how to write to the dibit memory cell 602 of
When, on the other hand, the current in word line 608 is antiparallel to that in sense-digit line 610 and the amplitudes of the currents are substantially equal, the combined field outside of lines 608 and 610 is effectively zero while the field between the lines, i.e., the field experienced by cobalt layer 604, is doubled. Thus, cobalt layer 604 may be written using coincident currents in the word and sense-digit lines of opposite polarity, each of which may have a field less than HC but whose combined sum is greater than HC.
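The coincident-current selection described above can be checked numerically. The following Python sketch covers only the antiparallel case discussed in this paragraph; the function name and the field values (expressed as fractions of HC) are hypothetical.

```python
def coincident_write(h_word, h_digit, hc):
    """Fields produced by antiparallel currents in the word and sense-digit lines.

    h_word and h_digit are the field magnitudes each line contributes on its own.
    """
    field_between = h_word + h_digit   # contributions add between the lines
    field_outside = h_word - h_digit   # contributions cancel outside the lines
    return {
        "field_between": field_between,
        "field_outside": field_outside,
        "switches_604": field_between > hc,
    }


HC = 1.0  # coercivity of cobalt layer 604, normalized (assumption)

both_lines = coincident_write(0.75, 0.75, HC)   # both lines driven
one_line = coincident_write(0.75, 0.0, HC)      # only the word line driven

print(both_lines)
print(one_line)
```

Each line alone contributes a field below HC and therefore leaves the layer undisturbed, while the two together exceed HC between the lines and cancel outside them.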
According to a specific embodiment, the procedure for reading dibit memory cell 602 involves several steps. Initially, the resistance of sense-digit line 610 is measured. A logic state, e.g., a “1”, is then written to cobalt layer 604 with coincident currents in access lines 608 and 610 as described above. The resistance of sense-digit line 610 is then measured again. If it has changed, it is determined that the initial state of layer 604, i.e., the bit of information originally stored in layer 604, is different than the current state, e.g., if the layer was written as a “1” it must have previously been a “0”. If the resistance has not changed, the opposite conclusion is established, i.e., that the bit of information originally stored in layer 604 is the same as in the current state.
The state of layer 606 may subsequently be determined by reversing the state of layer 604 and comparing the resulting resistance to the last resistance measurement. The state of layer 606 may then be determined from whether the resistance increases or decreases. For example, if the top layer is switched from a “1” to a “0” and the resistance decreases, the bottom layer must be a “0”, i.e., the magnetization vectors of the two layers are now aligned. By contrast, if in such a scenario the resistance increased after such a switch, the bottom layer must be a “1”, i.e., the magnetization vectors of the two layers are now antiparallel. After a read operation, the original states of layers 604 and 606 may be rewritten as required.
Of course, it will be understood that a read operation may be performed to determine the state of both of the storage layers as described above, or to determine the state of either of the layers separately.
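The destructive read procedure for the dibit cell can be summarized in code. The Python sketch below is a behavioral illustration under stated assumptions: the cell is modeled with only two resistance values (a lower one when the two layers are magnetized in parallel, a higher one when antiparallel), the numeric resistances are hypothetical, and the class and function names are invented for this example.

```python
R_PARALLEL = 1.00       # hypothetical resistance, layers aligned
R_ANTIPARALLEL = 1.05   # hypothetical resistance, layers opposed


class DibitCell:
    """Two storage layers; 'top' stands for layer 604 and 'bottom' for layer 606."""

    def __init__(self, top, bottom):
        self.top = top
        self.bottom = bottom

    def resistance(self):
        return R_PARALLEL if self.top == self.bottom else R_ANTIPARALLEL


def read_dibit(cell):
    """Destructively read both bits, then rewrite the original state."""
    r_initial = cell.resistance()

    # Write a "1" to the top layer and see whether the resistance changed.
    cell.top = 1
    r_after_write = cell.resistance()
    top = 0 if r_after_write != r_initial else 1

    # Reverse the top layer ("1" to "0") and compare with the last measurement.
    cell.top = 0
    r_after_flip = cell.resistance()
    bottom = 0 if r_after_flip < r_after_write else 1

    # Restore the original state of the top layer (the bottom layer was not
    # disturbed in this simplified model).
    cell.top = top
    return top, bottom


for t in (0, 1):
    for b in (0, 1):
        print((t, b), read_dibit(DibitCell(t, b)))
```

The second comparison follows the example in the text: a decrease in resistance after switching the top layer to "0" means the layers are now aligned, so the bottom layer holds a "0".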
It will be understood that variations on the structure of memory cell 602 (as well as others of the memory cells described herein) may be made without departing from the scope of the present invention. For example, the respective coercivities or compositions of storage layers 604 and 606 may be varied. In addition, the amplitudes of the currents used to access memory cell 602 need not necessarily be equal to enable operation according to the principles of the present invention.
Memory cell 702 has four cobalt storage layers 704, 706, 708 and 710 each of which is capable of storing one bit of information. The cell access lines include a copper word line 712, a copper sense-digit line 714, and a copper inhibit line 716. The term “inhibit line” is used in reference to the inhibit line of the old ferrite core memories which employed three wires per cell. According to one implementation, an inhibit line allows a 3:1 ratio of field at selected to unselected locations, which is larger than the 2:1 ratio when there is no inhibit line. According to some embodiments, the inhibit line links all of the bits in an array. According to other embodiments, the inhibit line does not link all bits in the array. Rather, in such embodiments the inhibit lines are configured to run diagonally through the array and are furnished with their own selection matrix.
As will become apparent, in three-bit embodiments, the magnetization states of storage layers 704 and 710 (and thus the information stored therein) are not independent. That is, each is magnetized in the opposite direction of the other. According to other embodiments (discussed below), this symmetry can be broken using a variety of techniques such that each of the four storage layers may be written and read independently.
According to the three-bit embodiment, the storage layers of memory cell 702 are characterized by substantially equal coercivities and may be written by the application of different combinations of coincident currents in the three access lines. The fields generated as a result of the applied currents are given by:
H1=k{−Iw−Ii−Id} (1)
H2=k{Iw−Ii−Id} (2)
H3=k{Iw+Ii−Id} (3)
H4=k{Iw+Ii+Id} (4)
From these equations, it can be seen that layers 706 and 708 may each be switched with a current pulse combination that will not switch any other film in the cell. For example, if Iw=+Hc/3k and Ii=Id=−Hc/3k, then the field at layer 706 is Hc, while the field at layers 704 and 708 is Hc/3 and the field at layer 710 is −Hc/3. That is, there is a three-to-one ratio between the field at the desired storage layer and each of the other storage layers. It can also be seen, however, that in this particular embodiment where the coercivities of layers 704 and 710 are substantially equal, these layers do not switch independently. That is, a field combination that switches one of these two layers will switch the other in the opposite direction. Thus, in such an embodiment where layers 704 and 710 are interdependent in this way, only three bits of information may be stored in or retrieved from memory cell 702.
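The worked example above can be verified directly from equations (1) through (4). The following Python sketch evaluates the four fields with exact fractions; the function name is hypothetical, and HC and k are normalized to 1 purely for convenience.

```python
from fractions import Fraction


def layer_fields(i_w, i_i, i_d, k=1):
    """Fields at layers 704, 706, 708 and 710 per equations (1) to (4)."""
    return {
        704: k * (-i_w - i_i - i_d),   # H1
        706: k * (i_w - i_i - i_d),    # H2
        708: k * (i_w + i_i - i_d),    # H3
        710: k * (i_w + i_i + i_d),    # H4
    }


HC = Fraction(1)   # coercivity, normalized (assumption)
K = Fraction(1)    # field-per-current constant, normalized (assumption)
unit = HC / (3 * K)

# Iw = +Hc/3k, Ii = Id = -Hc/3k, as in the example in the text.
fields = layer_fields(i_w=+unit, i_i=-unit, i_d=-unit, k=K)
for layer, h in fields.items():
    print(layer, h)
```

The result reproduces the three-to-one ratio between layer 706 and each of the other layers. Equations (1) and (4) also give H1 = -H4 for any current combination, which is the interdependence of layers 704 and 710 noted in the text.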
To effect reading of the information in three-bit memory cell 702, the control electronics for word line 712 and sense-digit line 714 are the same. That is, low-level gates and pre-amps are situated at the ends of each making the word lines, in effect, word-sense lines. The reading of an individual cobalt storage film is achieved in much the same way as described above with regard to dibit memory cell 602. That is, the resistance of the access line to which the storage film of interest is attached is measured. A logic state is then written to the storage film of interest and the resistance of the associated access line measured again. If the resistance changes, the storage film was originally in the opposite state of the logic state that was just written. If the resistance does not change, then the current logic state is the same as the original logic state. Also as described above with reference to dibit memory cell 602, the state of the other storage film associated with the same access line may be determined by switching the first film again and determining whether the resistance goes up or down.
According to various implementations, memory cell 702 is modified such that all four storage layers may be used to store independent bits of information. That is, memory cell 702 has enough storage layers to store four bits of information. However, as discussed above, if the coercivities of the layers are substantially equal, any current pulse sequence which writes storage layer 704 to a particular logic state will also write storage layer 710 to the opposite state.
According to a first embodiment, memory cell 702 becomes a four-bit memory cell with the addition of another access line (placed, for example, above cobalt layer 1) to break the symmetry which results in the interdependency of layers 704 and 710. This embodiment requires an additional masking level and an additional selection matrix to control the added access lines.
According to a second embodiment, the compositions of storage layers 704 and 710 are made sufficiently different such that their switching thresholds require different field strengths for switching. This may be accomplished, for example, by depositing a permalloy layer directly over the cobalt film of storage layer 704. This will give layer 704 a lower coercivity than layer 710. Thus, when coincident currents are applied to the access lines, the resulting fields will write layer 704 before writing layer 710.
According to a third embodiment, the separation spacing between the keepers and the cobalt storage films is adjusted such that demagnetizing fields become significant enough to break the symmetry. This embodiment takes advantage of the fact that even a perfect keeper does not completely cancel the demagnetizing field of a finite size magnetic film spaced a nonzero distance from the keeper. Such a demagnetizing field increases strongly with the distance between the magnetic film and the keeper. This demagnetizing field can be used to break the symmetry and allow both layer 704 and layer 710 to be written to the same state. For example, if one wishes to write a “0” to both layers 704 and 710, a pulse combination may first be applied which writes a “1” to layer 704 and a “0” to layer 710. A “1” is then written into each of layers 706 and 708. This results in a demagnetizing field which tends to bias layers 704 and 710 toward the “0” state. Thus, when a subsequent pulse combination is applied which tends to write layer 704 in the “0” state and layer 710 in the “1” state, only layer 704 is switched. This leaves both layers 704 and 710 in the same state, e.g., “0”. Layers 706 and 708 may then be written independently.
According to a fourth embodiment, a keeper layer replaces a portion of the center of line 716. This shields layers 704, 712 and 706 from the field generated by currents in layers 708, 714 and 710, and vice versa. This removes the redundancy and allows four bits of information to be independently stored.
The four-bit embodiment of memory cell 702 may be read in much the same way as the three-bit embodiment described above. According to a specific embodiment, this may be done by switching only the interior bits (i.e., layers 706 and 708) and using the read procedure described with reference to the dibit memory cell 602 of
According to further embodiments, multi-layer memory cells are stacked to achieve increased information storage density. A double-density stacked memory cell 802 designed according to one such embodiment is shown in
The reading and writing of memory cell 802 will now be described with reference to FIGS. 9(a) and 9(b), which show the resulting magnetic fields from opposing currents in multi-layer GMR structure 804. Current flowing out of the page through GMR structure 804 generates a magnetic field 902 as shown in
In
Because the conductivity of copper is much larger than that of cobalt, the approximation that all of the current in sense-digit line 804 is carried by the copper layers may be made. Using this approximation, it can be seen that the magnitudes of the fields in layers 806 and 812 are approximately three times the magnitudes of the fields in layers 808 and 810. For example, the field experienced by cobalt layer 808 from copper layer 814 is cancelled by the field from copper layer 816, leaving only the field component from copper layer 818. By contrast, cobalt layer 806 experiences positive field contributions from each of the copper layers. This difference in field magnitude is the basis for operating the stacked memory cells.
An exemplary technique for writing of the dibit memory cell 802 will now be described with reference to
A current in sense-digit line 804 will result in a magnetic field in cobalt layer 808 which is equal and opposite to the field experienced by cobalt layer 810. When a coincident current is applied to word line 822, the resulting field will add to the field in one of layers 808 and 810 and subtract from the other. This makes it possible to write to either one of layers 808 or 810 without disturbing the other. So, for example, to write to layer 810, a current which produces a field of magnitude HC/2 at layer 810 is applied to sense-digit line 804 in the direction out of the page (see
To read the information stored in dibit memory cell 802, a read current whose magnitude is ⅓ of that of the write current is applied to sense-digit line 804. This results in a field of HC/2 at layer 806 and −HC/2 at layer 812. The resulting fields at layers 808 and 810 are of magnitude HC/6 and will therefore not cause any switching of these layers. To read the information in layer 808, layer 806 is written, i.e., magnetized, in a first direction and the resistance of sense-digit line 804 is measured. Layer 806 is then written in the other direction and the resistance measured again. The two resistance measurements are then compared. The resistance will be lower when layers 806 and 808 are magnetized in the same direction, and higher when they are magnetized in opposite directions. Therefore, the direction of magnetization of layer 808, i.e., the logic state stored in layer 808, may be determined from the comparison of the resistance values. The reading of layer 810 is achieved using the same procedure with layer 812.
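The field values quoted for the read operation follow from the three-to-one ratio between the outer and inner cobalt layers. The Python sketch below reproduces that arithmetic with exact fractions. The function name is hypothetical, HC is normalized to 1, and the signs assigned to the inner layers are an assumption inferred from the statement that the fields at layers 808 and 810 are equal and opposite.

```python
from fractions import Fraction


def sense_line_fields(inner_field):
    """Fields at the four cobalt layers due to current in sense-digit line 804,
    given the field at inner layer 808. Outer layers see about three times the
    inner field, per the copper-layer approximation in the text."""
    return {
        806: 3 * inner_field,
        808: inner_field,
        810: -inner_field,
        812: -3 * inner_field,
    }


HC = Fraction(1)  # normalized coercivity (assumption)

# Write current: field of magnitude HC/2 at the inner layers.
write_fields = sense_line_fields(HC / 2)

# Read current: one third of the write current.
read_fields = sense_line_fields(HC / 2 / 3)

print("write:", write_fields)
print("read: ", read_fields)
```

At the read current the inner layers see only HC/6, well below the switching threshold, while the outer layers see HC/2 in opposite directions, matching the values given above.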
A quadruple-density stacked memory cell 1002 designed according to another embodiment is shown in
As with dibit cell 802, the four bits of information of quadbit cell 1002 are stored in the two center cobalt layers of each of sense lines 1004 and 1005. The fields on the top and bottom data bit layers of sense line 1004 will be denoted H1 and H2, respectively. The fields on the top and bottom data bit layers of sense line 1005 will be denoted H3 and H4, respectively. The term k will be used to represent the constant of proportionality between the magnetic field and current on the surface of a stripline having the width of those in the memory (k=2π Oe/mA for a line 1 micron wide, and is inversely proportional to the width of the stripline). The current in top sense line 1004 will be denoted i1. The current in copper digit line 1006 will be denoted i2. The current in bottom sense line 1005 will be denoted i3. Using these definitions, the four fields at the four information storage layers are given by:
H1=k(i1/3+i2+i3) (5)
H2=k(−i1/3+i2+i3) (6)
H3=k(−i1−i2+i3/3) (7)
H4=k(−i1−i2−i3/3) (8)
NDRO quadbit cell 1002 has the same control electronics for each of its two sense lines 1004 and 1005 as sense-digit line 804 of dibit cell 802, i.e., low level gates and preamps. From equations 5-8, it can be seen that each of the four information storage layers of quadbit cell 1002 may be written independently of the others by the appropriate combination of coincident current pulses in sense lines 1004 and 1005, digit line 1006 and word line 1008.
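As a rough numerical illustration of equations (5)-(8), the following sketch evaluates the four storage-layer fields for a given set of line currents. It reads the second term of equation (7) as −i2, by analogy with equation (8). The function names and the example currents are assumptions made for illustration.

```python
import math


def k_for_width(width_um):
    """Field-per-current constant in Oe/mA: 2*pi for a 1 micron line,
    inversely proportional to the stripline width (in microns)."""
    return 2 * math.pi / width_um


def storage_layer_fields(i1, i2, i3, width_um=1.0):
    """Return (H1, H2, H3, H4) in Oe for currents i1, i2, i3 in mA.

    i1: top sense line, i2: digit line, i3: bottom sense line.
    """
    k = k_for_width(width_um)
    h1 = k * (i1 / 3 + i2 + i3)    # equation (5)
    h2 = k * (-i1 / 3 + i2 + i3)   # equation (6)
    h3 = k * (-i1 - i2 + i3 / 3)   # equation (7), second term read as -i2
    h4 = k * (-i1 - i2 - i3 / 3)   # equation (8)
    return h1, h2, h3, h4


if __name__ == "__main__":
    # Hypothetical example currents in mA.
    print(storage_layer_fields(i1=1.5, i2=0.5, i3=-0.5))
```

Note that H1 and H2 differ only through the sign of the i1 term, and H3 and H4 only through the sign of the i3 term, which is what allows the two layers in each sense line to be distinguished.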
The read and write techniques described above with reference to dibit memory cell 802 of
According to a specific embodiment, all-metal memory cells may be configured into a memory array 1100 as shown in
According to other embodiments, the bit density of the dibit and quadbit memory cells may be further doubled by changing the shape of the word lines in an array 1100 of such devices and using separate sense and digit lines. This may be understood with reference to FIGS. 12(a) and 12(b). According to such embodiments word lines 1202 are straight and orthogonal to separate sense and digit lines (1204 and 1206, respectively).
As can readily be seen by comparing the array design of
Referring back to
A specific implementation of a transpinnor 1300 is shown in
As mentioned above, the resistance of each leg of transpinnor 1300 may be changed by application of a magnetic field to manipulate the magnetization vectors of the respective GMR film's layers. Such fields are generated by the application of currents in input lines 1310 and 1312 which are electrically insulated from the GMR films. Input line 1310 is coupled to and provides magnetic fields for altering the resistance of GMR films R1 and R3. Input line 1312 is coupled inductively to and provides magnetic fields for altering the resistance of GMR films R2 and R4. If the resistances of all four GMR films are identical, equal currents in input lines 1310 and 1312 change the resistances equally and do not unbalance the bridge, thus resulting in zero output. If, however, unequal currents are applied, an imbalance results, thus resulting in a nonzero output.
The relationship between the output voltage of transpinnor 1300 and a variety of other parameters including power supply voltage, input current, GMR value, leg resistance values, and output resistance will now be described. This analysis assumes the ideal case where the resistance of each of four resistive elements R1-R4 (when in identical magnetic states) is identical, and denotes this resistance value as r. When a positive current is applied at input 1 and a negative current is applied at input 2, the various resistances are given by:
R1=r(1−δ) (9a)
R2=r(1+δ) (9b)
R3=r(1−δ) (9c)
R4=r(1+δ) (9d)
Where
δ=f(H) gmr/2 (10)
gmr is the decimal equivalent of GMR (i.e., gmr=GMR/100), and f(H) is a number less than or equal to one, representing the fraction that a layer has switched.
The output resistance of transpinnor 1300 is denoted r5. The currents in resistive elements R1-R4 and r5 are denoted i1-i5, respectively. The voltage drop across the entire bridge (i.e., the voltage applied to the power lead) is denoted V. From Kirchhoff's laws we then have
i1−i2−i5=0 (11a)
i4−i3+i5=0 (11b)
and from symmetry,
i1=i3 (12a)
i2=i4 (12b)
Because the voltage drop over any path between the power lead and ground must be V,
(1−δ)ri1+(1+δ)ri2=V (13)
(1−δ)ri1+i5r5+(1−δ)ri1=V (14)
Combining equations (11), (13), and (14),
i5=2i1δ/[1+δ+(r5/r)] (15a)
This equation represents the output current of transpinnor 1300.
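The bridge equations can be checked numerically. The sketch below solves the Kirchhoff relations (11)-(14) directly, using the symmetry i3=i1 and i4=i2, and compares the result with the closed form i5=2i1δ/[1+δ+(r5/r)] obtained by eliminating i2 and V. The function names and the component values in the example are assumptions chosen only for illustration.

```python
def solve_bridge(v, r, delta, r5):
    """Solve the bridge for (i1, i5), given supply voltage v, leg resistance r,
    fractional resistance change delta, and output resistance r5.

    Substituting i2 = i1 - i5 (equation 11a) into equation (13) gives
        2*r*i1 - (1 + delta)*r*i5 = v
    and equation (14) with i3 = i1 gives
        2*(1 - delta)*r*i1 + r5*i5 = v
    The resulting 2x2 system is solved by Cramer's rule.
    """
    a11, a12 = 2 * r, -(1 + delta) * r
    a21, a22 = 2 * (1 - delta) * r, r5
    det = a11 * a22 - a12 * a21
    i1 = (v * a22 - a12 * v) / det
    i5 = (a11 * v - a21 * v) / det
    return i1, i5


def closed_form_i5(i1, r, delta, r5):
    """Output current expressed in terms of the leg current i1."""
    return 2 * i1 * delta / (1 + delta + r5 / r)


if __name__ == "__main__":
    # Hypothetical values: 1 V supply, 100 ohm legs, delta = 0.03, 5 ohm output.
    i1, i5 = solve_bridge(v=1.0, r=100.0, delta=0.03, r5=5.0)
    print(i1, i5, closed_form_i5(i1, 100.0, 0.03, 5.0))
```

When r5/r and δ are both much smaller than one, the denominator approaches one and the output current reduces to 2i1δ.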
Also of interest is the dependence of the amplification factor,
A=output current/input current (16)
on the power supply to transpinnor 1300 and the line width of the GMR films. For this analysis we will use the approximation that r5/r<<1. This is due to the fact that the input and output lines of transpinnor 1300 are much thicker than the GMR films (e.g., 20 nm of copper and 300 nm of AlCu vs. 2-4 nm of copper). In addition, δ<<1 (see equation 10). In the case of complete switching, equation 15a then becomes
i5=2i1δ=i1gmr (15b)
The input current must be sufficient to switch the lower coercivity (e.g., permalloy) layer of the GMR films, i.e., sufficient to produce a magnetic field equal to the layer coercivity, HC. The field H produced by a current i in a stripline of width w and length L is found from Maxwell's equation, curl H=J′, to be
H=2πi/w Oe (17)
where i is in mA and w is in microns. (In changing units from Maxwell's equation to those in equation (17) it should be noted that 4π Oe=10³ amps/meter.) Thus, the input current required to produce a field HC is
input current=(1/(2π))HCw mA/(Oe-micron) (18)
To derive the output current, it should be noted with reference to
output current=1000 gmrV/(2r) mA (19)
The amplification factor is then
A=π1000 gmrV/(rHcw) (20a)
It is further useful to write the resistance r as the sheet resistivity, Rsq (ohms per square) multiplied by the number of squares. The number of squares of one of the GMR resistive elements of
A=π1000 gmr V/(HcLRsq) (20b)
where HC is in Oe, and w and L are in microns.
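The chain from equation (18) to equation (20b) can be sanity-checked with a short calculation. The sketch below computes the input current, the output current, and the amplification factor, taking r as Rsq multiplied by L/w squares. The function names and the numerical values in the example are hypothetical and serve only to show how the quantities relate.

```python
import math


def input_current_mA(hc_oe, w_um):
    """Equation (18): current (mA) giving a field hc_oe (Oe) on a line w_um wide."""
    return hc_oe * w_um / (2 * math.pi)


def output_current_mA(gmr, v, r_ohm):
    """Equation (19): output current (mA) for supply voltage v and leg resistance r."""
    return 1e3 * gmr * v / (2 * r_ohm)


def amplification(gmr, v, hc_oe, length_um, rsq_ohm):
    """Equation (20b): amplification in terms of sheet resistivity and length."""
    return math.pi * 1e3 * gmr * v / (hc_oe * length_um * rsq_ohm)


if __name__ == "__main__":
    # Hypothetical film: gmr = 0.06, Hc = 1 Oe, w = 1 um, L = 10 um, 5 ohms/square.
    gmr, v, hc, w, length, rsq = 0.06, 1.0, 1.0, 1.0, 10.0, 5.0
    r = rsq * length / w                      # number of squares is L/w
    ratio = output_current_mA(gmr, v, r) / input_current_mA(hc, w)
    print(ratio, amplification(gmr, v, hc, length, rsq))
```

The two printed values agree, which reflects the cancellation of the line width w when r is written in terms of Rsq and L.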
As discussed above, transpinnors form the basis for the all-metal support electronics for memory 100 of
It turns out that the transpinnor is extremely effective for applications in which a physical signal is to be read above an offset arising from the difference between two unevenly matched input lines. It functions as a transformer at its input, rejecting the common-mode signal between the two lines, and as a differential amplifier at its output, amplifying the physical signal. In memory 100 there is a differential transpinnor 110 coupled to each sense-digit/reference line pair such that the sense-digit line is connected to input 1 of the transpinnor and the corresponding reference line is connected to input 2 (see
When the sense-digit and reference lines of a pair are in the same magnetic state, the output of the differential transpinnor 110 should be zero. However, because of imperfections arising in the fabrication process, the resistance of a sense-digit line will typically be different than that of its reference line. Consequently, when the same voltage is applied to the two lines, different currents enter the two inputs of the associated differential transpinnor 110 causing a nonzero output, and thus the potential for error. According to a specific embodiment, the differential transpinnor 110 for each sense-digit/reference line pair may be trimmed to compensate for this imbalance.
That is, compensation for the resistive imbalance is achieved by reducing the output of the transpinnor through at least partial reversal of one of the high coercivity (i.e., cobalt) layers. According to a specific embodiment, the other side of the transpinnor is operated with the high coercivity layer(s) saturated. The low coercivity layer(s) remains free to react to the input current, thereby producing the dynamic output. By reversing just the right percentage of the cobalt layer, the output of the transpinnor can be made to go to zero when the reference and sense-digit lines are in the same magnetic state, i.e., when it is supposed to be zero.
Equation (15b) represents the case where the currents of inputs 1 and 2 are equal in magnitude and of opposite polarity. When the currents are of the same polarity and different magnitude, the equation becomes
i5=i1(δ1−δ2) (21)
Since the two fractional resistance changes are unequal, i5 is nonzero. In equation (10), f(H) is the fraction of the film for which the magnetization of the high coercivity layer and the low coercivity layer (i.e., the cobalt layer and the permalloy layer) are antiparallel less that for which they are parallel. We can therefore write f(H) as the product of two terms, one representing the high coercivity layer and one representing the low coercivity layer,
f(H)=fc(H)fp(H) (22)
where fc(H) is the fraction of the cobalt layer magnetized in the positive direction less that magnetized in the negative direction and fp(H) is the corresponding fraction for the permalloy layer. This assumes that the layers switch independently of one another which is a reasonable assumption in that the coercivity of cobalt is much higher than that of the permalloy, and the transpinnor is typically operated at low field where only the magnetization of the permalloy changes and that of the cobalt remains fixed. That is,
fc(H)=constant (23)
but the values of fc(H) will in general be different for the two inputs.
The transpinnor can be set up so that the response of the permalloy to the applied field (from the current in the input line) is relatively linear for the current range of interest, i.e.,
fp=kI, |fp|<1 (24)
where the value of the proportionality constant k is determined by the shear of the loop. Denote the current from the reference line by iref and the current from the sense-digit line by isense. Then
δ1=fc1fp gmr/2=fc1k isense gmr/2 (25)
δ2=fc2fp gmr/2=fc2k iref gmr/2 (26)
Then, by equations (21), (25), and (26), the output current i5 of the transpinnor is given by
i5=i1(δ1−δ2)=i1k(gmr/2)(fc1 isense−fc2iref) (27)
Equation (27) reveals that even if the sense current is different than the reference current when the lines are in the same magnetic state, the output current i5 can be made zero by adjusting the magnetization in the cobalt film. Thus, for example, if the current in a sense-digit line is greater than that in the corresponding reference line, the currents can be balanced by saturating the cobalt in the reference leg of the transpinnor in the positive direction so that fc2=1 and partially reversing the cobalt in the sense-digit leg of the transpinnor such that fc1=iref/isense. This balances the input, even though the lines have different resistances. The adjustment is facilitated by the fact that the two cobalt layers can be adjusted independently. It should be noted that this technique can compensate for virtually any resistive inequality in a given sense-digit/reference line pair. This is even the case where the difference in resistance is much greater than the films' gmr values.
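The trimming condition can be illustrated numerically. The sketch below evaluates equation (27) and picks cobalt fractions that null the output for a line pair in the same magnetic state. The case in which the sense current is the larger one follows the example in the text; the opposite case is handled symmetrically, which is an assumption of this sketch. All names and values are illustrative, and both currents are assumed positive.

```python
def output_current(i1, k, gmr, fc1, fc2, i_sense, i_ref):
    """Equation (27): transpinnor output for cobalt fractions fc1 and fc2."""
    return i1 * k * (gmr / 2) * (fc1 * i_sense - fc2 * i_ref)


def trim_fractions(i_sense, i_ref):
    """Return (fc1, fc2) that make fc1*i_sense equal fc2*i_ref.

    The leg carrying the smaller current is left saturated (fraction 1) and
    the other leg is partially reversed. Assumes positive currents.
    """
    if i_sense >= i_ref:
        return i_ref / i_sense, 1.0   # case described in the text
    return 1.0, i_sense / i_ref       # symmetric case (assumed)


if __name__ == "__main__":
    # Hypothetical mismatch: sense line draws 20% more current than reference.
    i_sense, i_ref = 1.2, 1.0
    fc1, fc2 = trim_fractions(i_sense, i_ref)
    print(output_current(1.0, 0.5, 0.06, 1.0, 1.0, i_sense, i_ref))  # untrimmed
    print(output_current(1.0, 0.5, 0.06, fc1, fc2, i_sense, i_ref))  # trimmed
```

The untrimmed output is nonzero even though the lines are in the same state, while the trimmed output vanishes to within floating-point rounding.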
According to various specific embodiments of the present invention, there are a number of ways in which a transpinnor may be connected to a sense-digit/reference line pair. Four of these options will now be described with reference to
The four configurations of
A differential transpinnor exhibits hysteresis unless operated in a specific way. This hysteresis can be avoided if the transpinnor is biased in the hard direction of the low coercivity (e.g., permalloy) layer with a field greater than or equal to the anisotropy field. This eliminates the hysteresis and the permeability becomes very large. The high coercivity (e.g., cobalt) layer is largely unaffected because its anisotropy field is typically much larger than that of the low coercivity layer. The signal field is applied by the input lines of the transpinnor and is in the easy-axis direction.
A second method which requires no bias field is to fabricate the transpinnor with the easy axis of the low coercivity layer perpendicular to the easy axis of the high coercivity layer. The low coercivity layer thus undergoes uniform magnetization rotation rather than wall-motion switching.
A third method of dealing with transpinnor hysteresis is to initialize the transpinnor the same way before each read operation. For example, each read operation could be started by the application of a negative pulse which switches all the low coercivity layers but not any of the high coercivity layers. This erases any previous low coercivity layer history.
According to a fourth method, the low coercivity layer of the transpinnor is initialized antiparallel to the high coercivity layer, leaving it on the very steep part of the device's hysteresis curve where a small input current will produce a large output.
According to a specific embodiment, when a transpinnor is used to balance a sense-digit line against its reference line, the resistive elements of the transpinnor are adjusted such that when the sense-digit and reference lines are in identical magnetic states (i.e., with the same number of ones and zeros in the storage layers of the two lines and at the corresponding locations in each, and with the same corresponding magnetizations in the readout layers of the two lines), the transpinnor gives zero output. When a bit is changed on the reference line but not the sense-digit line, the ratio of resistances changes and the transpinnor gives a nonzero output. That is, the transpinnor is adjusted to give zero output not when both input currents are equal, but when the sense-digit line and the reference line are in the same magnetic state. Note that the voltages applied to the two lines are equal, but because the resistances are unequal, the currents in the lines are unequal. Thus, though the supply to the line pair is a constant current, the individual currents in the pair may be different.
During a read operation, the read current through the trimming transpinnor is large enough to switch its low coercivity layer, but not its high coercivity layer. Therefore, the trimming adjustment is made to the high coercivity layer (which remains in the partially switched state during the read operation), not the low coercivity layer (which needs to be free to change in response to the read current). The high coercivity layer in the transpinnor is not affected by write operations because the resistive elements of the transpinnors are not physically connected to the sense-digit lines.
FIGS. 16(a)-16(e) illustrate the effect of the trimming technique on the balancing of sense-digit/reference line pairs according to a specific embodiment thereof. Each set of three diagrams corresponds to a transpinnor with specific characteristics. In each set the leftmost diagram represents the transpinnor output, the middle diagram the output from a read signal for a “1,” and the rightmost diagram the output from a read signal for a “0.”
When the transpinnor associated with a particular sense-digit/reference line pair is well balanced, i.e., the sense-digit line and the reference line have equal resistances, the outputs for a “1” and a “0” are as shown in
Similarly, if the resistance of the sense-digit line is greater than that of its corresponding reference line, the result is a pedestal of the opposite polarity as illustrated in
It will be understood with reference to the diagrams of
Referring once again to
According to specific embodiments, it is desirable that the GMR films for each of the SpinRAM memory elements 102 have high GMR values to achieve a favorable signal-to-noise ratio. Relatively low coercivities may also be desirable for both the high and low coercivity layers of the memory elements to ensure low switching currents, although the difference in coercivity between the high and low coercivity layers should be sufficiently large to maintain satisfactory operating margins.
The characteristics of the GMR films for the transpinnor-based elements (i.e., 104, 106, and 110) may be similar to those discussed above for the memory elements, but may differ in some respects. That is, like the memory elements, high GMR values are desirable, as is a relatively low coercivity for the low coercivity layers. However, the coercivity of the high coercivity layers can be significantly larger than that which would be acceptable for the corresponding layers of the memory elements. In addition, it is desirable that the GMR values and coercivities of the layers of GMR resistors 108 be relatively high to ensure stability.
A simplified schematic of a transpinnor-based selection matrix is shown in
At each intersection of a power current line 1702 and a transpinnor selection line 1704 is a transpinnor 1706 which delivers current to a selected word (or sense-digit) line 1708. A simplified representation of a transpinnor 1706 is shown in
A power current is applied to the column of transpinnors 1706 which includes the transpinnor corresponding to the line 1708 to be selected via one of power current lines 1702. Power being applied to each resistively balanced transpinnor results in zero output. As discussed above, individual transpinnors may be balanced to achieve this zero output using the technique referred to herein as magnetoresistive trimming. Coincident with the application of the power current, a current is transmitted via the input selection line 1704 corresponding to the transpinnor 1706 to be selected. The field associated with this current unbalances the selected transpinnor by at least partially reversing the magnetization of at least one of the transpinnor's low coercivity layers, and thereby changing the resistance of the corresponding GMR element. The transpinnor imbalance results in a corresponding output current which is delivered to the memory array via the word (or sense-digit) line 1708 connected to the transpinnor output.
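The coincident selection described above can be modeled in a highly simplified way. The sketch below treats each transpinnor as producing output only when its column is powered and its selection line carries current; balanced or unpowered transpinnors give zero. The grid representation, the function name, and the unit output value are assumptions for illustration and do not model the underlying magnetics.

```python
def selection_matrix_outputs(n_rows, n_cols, powered_col, selected_row, i_out=1.0):
    """Return an n_rows x n_cols grid of transpinnor output currents.

    powered_col: index of the power current line that is energized.
    selected_row: index of the selection line carrying the unbalancing current.
    i_out: hypothetical output current of the one selected transpinnor.
    """
    grid = []
    for row in range(n_rows):
        line = []
        for col in range(n_cols):
            powered = col == powered_col        # power applied to this column
            unbalanced = row == selected_row    # selection field unbalances bridge
            line.append(i_out if (powered and unbalanced) else 0.0)
        grid.append(line)
    return grid


if __name__ == "__main__":
    for line in selection_matrix_outputs(4, 4, powered_col=2, selected_row=1):
        print(line)
```

Only the transpinnor at the intersection of the energized power line and the active selection line delivers current to its word (or sense-digit) line.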
Most computer systems are based on the use of volatile main memory which is typically implemented using dynamic random access memory (DRAM) technology. The volatile nature of DRAM and its relatively high cost per bit of storage capacity has, in turn, led to the development of magnetic disk technology as the basis for the permanent mass storage component of computer memory systems. This hybrid architecture has some well known disadvantages which include, among other things, the relatively long access time for magnetic disks, increased operating system complexity, and the risk of data loss during power failures.
The block diagram of
If a requested piece of information is not present in the cache, the information must be retrieved from main memory 1808. Main memory 1808 communicates with microprocessor 1802 via memory interface 1810, is typically much larger (e.g., 16M) and slower (e.g., access times of 70 ns) than cache memory, and is typically implemented in DRAM. This main memory provides microprocessor 1802 with relatively fast access to large blocks of data as well as stores and streams data to the display.
If a requested piece of information is not present in main memory, the information must be retrieved from mass storage. Such mass storage may be provided by one or more magnetic disks 1812 which are coupled to microprocessor 1802 via disk controller 1814 and I/O bus 1816 which may be, for example, an ISA, EISA, PCMCIA, PCI, or CompactPCI bus. The typical storage capacity of such magnetic disk technology is on the order of gigabytes, but the access times are orders of magnitude slower than the other levels of the memory hierarchy (e.g., 12 ms).
The technology described herein provides an architecture in which each of the memories outside of microprocessor 1802 may be implemented with the all-metal giant magnetoresistive memories described herein. These memories will also be referred to herein as SpinRAMs®. A comparison of the memory technologies described herein with the conventional memory technologies they replace is given in Table 1. The SpinRAM technology replacement for DRAM/FLASH is also referred to as SpinRAM2 and the replacement for rotating disk storage is referred to as SpinRAM3. SpinRAM1 is the replacement for SRAM such as that used in cache memories.
An example of a unified memory architecture will now be described with reference to FIGS. 19(a) and 19(b).
It should also be noted that, although all three of the cache, system and hard disk memories are replaced in this example, some other subset of these memories (e.g., just the disk drive and system memory) may be replaced by all-metal giant magnetoresistive memory technology.
With reference to ISA subsystem 1962, ISA SpinRAM hard card 1970 replaces the disk drive and controller of system 1900. The memory architecture of SpinRAM hard card 1970 may be, for example, any of the architectures and memory designs described above with reference to
A block diagram of a specific implementation of a SpinRAM hard card 1970 is shown in
The desired functionality of SpinRAM controller 2004 may be implemented, for example, by modifying an existing chip set, using discrete components, or designing a custom controller ASIC. The final interface between controller 2004 and the actual memory cells of SpinRAM array 2002 comprises module interface circuits (not shown) such as, for example, selection matrices 104 and 106 of
Referring back to
It should be noted that the examples of specific memory architectures described above are tailored to replace an existing installed base of computer systems in which the ways in which the different types of memories are connected to the system are artifacts of the characteristics of the memory technologies themselves, and may not take full advantage of the performance capabilities of the SpinRAM technology described herein. That is, for example, although plugging a SpinRAM hard card as a replacement for a hard disk drive may represent a simple and fast integration of giant magnetoresistive memory technology into the vast installed base of IBM compatible PCs, a more fundamental memory architecture shift is contemplated which will more readily exploit the advantages of all-metal memories.
This may be understood with reference to the architectural constraints of the PC bus system. Because the time required for a CPU to retrieve data from a conventional hard disk is primarily a function of disk access time rather than propagation delay through the bus controller, there is little or no penalty associated with connecting the hard disk to the CPU through the controller. Of course, this is not the case for cache and system memory which are directly connected (architecturally) to the CPU. With the fast access times of SpinRAM technology, it is desirable to connect SpinRAM-based mass storage to the CPU in such a way to avoid the penalty imposed by conventional PC bus architectures. Such an embodiment is shown in
System ROM 2212 is also implemented as a giant magnetoresistive SpinRAM array. System ROM 2212 may be used, for example, to store a PC's BIOS code or user applications for a palm top device. Using the byte alterable SpinRAM for system ROM allows the capability of updating what is typically hard coded information in many of today's computer systems. According to another embodiment, cache memory 2214 may also be implemented using SpinRAM technology.
It will be understood that SpinRAM memory subsystem 2204 may be configured in a variety of ways. That is, subsystem 2204 may comprise different subsets of memories 2208, 2210, 2212 and 2214. In addition, different subsets of these memories may be integrated in the same device or configured as separate modules.
It will be understood by those skilled in the art that changes in the form and details of the memory technologies described above may be made without departing from the spirit or scope of the invention. For example, specific embodiments have been described herein with reference to a selection matrix implemented using single input transpinnors (e.g., see
In addition, it will be understood that the number of memory access lines required to access information in the individual memory cells in a memory array will vary in accordance with the structure of the memory cells and the number of bits stored in each. The number and types of access lines for a given memory cell structure may be determined by one of skill in the art of memory technology from, for example, the descriptions of various GMR memory cells herein.
Furthermore, although an example of a unified memory architecture has been described herein in the context of specific architecture types, it will be understood that a wide variety of memory architectures for computers and other systems are enabled.
As discussed above, in one such architecture a rotating disk is physically but not logically replaced with a SpinRAM array. That is, a memory controller is configured such that the rest of the system operates as if it is connected to a rotating disk, but the controller interacts with the SpinRAM array. Such an architecture eliminates the disadvantages of rotating disk memories (e.g., long access times, susceptibility to environmental conditions) without the need for extensive retrofitting or redesign of the installed computer base.
Another contemplated architecture involves the partial replacement of system memory with SpinRAM technology. The SpinRAM portion of the system memory could, for example, be used to store data that must be preserved in the event of a power failure. According to a specific embodiment, the SpinRAM portion of the system memory stores a small RAM file system which provides very fast access to a subset of the system's overall file stores.
Of course full replacement of system memory with SpinRAM technology is contemplated as well. This would allow expansion of the use of system memory to include data which must be maintained through power loss and system reboots. Such a system could recover much faster than conventional systems after a power down has occurred. All that would need to be done is the normal processor power-up diagnostics and the restoration of the internal machine state. No time would be wasted reloading information from mass storage to system memory.
Another contemplated architecture replaces both system DRAM and magnetic disk storage with SpinRAM technology. The replacement of both of these memories makes possible the unified memory architecture in which most or all of a computer system's memory is implemented using a single technology, i.e., SpinRAM. Further variations of such an architecture include the replacement of other memories with SpinRAM technology including, for example, cache memory and system ROM.
A simplified block diagram of a generalized computer system based on SpinRAM technology is shown in
It should be noted that SpinRAM technology allows the cache paradigm to be carried throughout system 2300 regardless of the number of SpinRAM levels. Thus, for example, CPU 2308 receives data from its level one cache. The level one cache receives data from the level two cache (e.g., cache 2310). The level two cache receives data from main memory 2302. Main memory 2302 acts as a level three cache in concert with SMU 2306. Finally, main memory 2302 receives data from mass storage memory 2304 which acts as a fourth level cache.
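The multi-level cache paradigm described above can be sketched as a simple fall-through lookup. The example models each level as a dictionary and copies a found item into every faster level on the way back. The level names, the fill policy, and the function name are assumptions of this sketch rather than details of the described system.

```python
def fetch(address, levels):
    """Look up address in an ordered list of (name, store) pairs.

    levels is ordered from fastest to slowest. On a hit, the value is copied
    into every faster level so that later accesses are served sooner.
    Returns (value, name_of_level_that_held_it).
    """
    for depth, (name, store) in enumerate(levels):
        if address in store:
            value = store[address]
            for _, faster_store in levels[:depth]:
                faster_store[address] = value
            return value, name
    raise KeyError(address)


if __name__ == "__main__":
    hierarchy = [
        ("level one cache", {}),
        ("level two cache", {}),
        ("main memory", {}),
        ("mass storage", {0x10: "payload"}),
    ]
    print(fetch(0x10, hierarchy))  # served from mass storage
    print(fetch(0x10, hierarchy))  # served from the level one cache
```

Because every level uses the same lookup and fill behavior, adding or removing a level changes only the list, not the access logic.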
The foregoing describes the basic theory of operation underlying SpinRAM technology and transpinnor-based electronics and a few representative examples of the wide variety of applications for which such technology is suited. As should be appreciated at this point, SpinRAM and other transpinnor-based electronics may be employed as the basic building blocks for virtually any type of electronic circuit or system currently implemented using conventional semiconductor technologies. However, given the ubiquitous nature of such conventional technologies, it is desirable to provide interface circuitry which is capable of translating signals between the transpinnor and semiconductor domains. Suitable interface technology is described in U.S. Patent Publication No. US-2004-0075152-A1 published on Apr. 22, 2004 (Attorney Docket No. IMECP016), the entire disclosure of which is incorporated herein by reference. As will become apparent, various embodiments of the invention may employ such interface technology (and any suitable alternatives) to integrate all-metal SpinRAM with conventional semiconductor circuits and devices.
As described above, SpinRAM may be used to implement a wide variety of memory systems and subsystems in virtually any computing configuration. More generally, SpinRAM may also be employed in any type of device the operation of which may be characterized by a state or sequential machine to render such devices nonvolatile. More specifically, embodiments of the invention enable the various components of a system to be kept operations ready when brought up from standby or unpowered modes through the use of an all-metal nonvolatile memory which has true random bit access and virtually unlimited write cycles.
According to various embodiments of the invention, a nonvolatile metal RAM, e.g., SpinRAM, is employed to preserve the last state of a sequential machine, thereby rendering the device based on the state machine nonvolatile. “Metal RAM,” and more generally “metal electronics,” is circuitry based on giant magnetoresistance (GMR) and involves no semiconductors. The foundation for such metal electronics is the transpinnor, an active element made of GMR films as described above. In the case of the magnetic SpinRAM, the support circuitry as well as memory array are made of GMR films. Thus, an entire block of SpinRAM (including support electronics) may be made of metal layers and insulators alone, with no semiconductors.
Systems and devices embodying sequential machines (i.e., algorithmic state machines, which form the basis for controllers, microcontrollers and other computers) have their power removed for the time they are out of operation. In order to resume the active mode, the device needs access to the information on its last state prior to having had its power removed.
The time needed to initialize the work memory and all system registers in a typical computing system depends greatly on the computing configuration. Memory initialization time results from the transfer of the needed OS routines, peripheral drivers, and basic applications from the hard disk to the main memory. Register initialization time results from the sequential transfer of the content of all system registers from a backup system, a memory area that is either nonvolatile or permanently powered.
The current and future needs for system nonvolatility may be considered in the context of three types of processors: 1) embedded control computers, 2) real-time control systems, and 3) general-purpose computers.
In the first category, embedded control computers are becoming ever more pervasive, primarily because they can be implemented as single-chip stand-alone devices that can solve problems previously responsive only to interconnected larger modules. The use of nonvolatile memory can be critical for these systems. Essential state information must be carried forward from one activation to the next. This may not include the entire register and memory content, but is often a significant part of it. Additionally, these control computers are typically generalized, with a set of parameters to be tailored to the specific implementation and system, either when the unit is manufactured, first installed, or configured externally. These parameter values must be maintained even when power is removed. For these systems the volatility problem is completely solved by a nonvolatile register set.
In the past, many of these small computers have been used in isolated systems, where they are totally self-reliant and therefore must provide reliability to the full extent required by the system into which they are embedded. This usage will continue, but a new dimension has been added. Interconnectivity of these small computers is being motivated by the availability of high-speed serial-wired connection technology as well as by optical and wireless technologies. Because single-chip computers can perform both their original functions and these added communication tasks, they need to maintain not only their internal state, but also the state of the communication system and that of close neighbors to avoid continuous reconfiguration. This further increases their need for nonvolatility.
Currently, nonvolatility is provided in these embedded computers by a combination of flash and battery-backed SRAM. Both are problematic. Flash has long write times and a limited number of write cycles. Moreover, like all semiconductor memories, flash is becoming increasingly more vulnerable to radiation as cell size decreases. Roughly half of all soft errors are now accounted for by radioactive impurities in packaging materials. Both of these limitations impose severe constraints on how flash is used in embedded applications. Backup batteries have limited capacities, less than ideal temperature ranges, and may have mechanical mounting issues. Many embedded systems must function in very harsh environments.
In the second category are computers that serve in real-time control systems. These computers—which may be very large, very small, or anywhere in between—function as part of larger systems, in which the real processing involves interaction of machines and equipment that form part of the system. Human interaction is typically limited to operator interfaces. In these systems there is usually enough redundancy so that the processing load can be shifted to other computers if one goes down. Therefore, state restoration can proceed more leisurely than for embedded systems. However, for critical components—and depending on the design of the system—the need for nonvolatility is just as great as for the first category. Typically, these systems have an OS, and state recovery can be functionally built into it.
In the third category are general-purpose computers, e.g., handhelds, laptops, desktops and servers. Their technology is basically human-interface and processing oriented. In these systems, nonvolatile memory, though not as critical as in either of the first two types, is certainly desirable. For example, at the current state of technology (e.g., Windows XP on a 3 GHz computer), power-up can take as much as several minutes. These times can grow significantly depending on the configuration that must become operational, e.g., when the computer is part of a networked cluster. A suitable nonvolatile technology can be a major contributor to improving user and system productivity.
The current state-of-the-art for power-management logic in computing systems, either on chip or as a separate component, is to detect the transition into the power-down mode by generating a power-down signal. The power-supply module is typically equipped with a power capacitor that creates a power reserve for a certain number of system cycles. This ensures that once the power-down condition has been detected and a power-down signal generated, the system will still be able to perform additional cycles, e.g. on the order of 10, until it grinds to a complete halt. For a 1 GHz system clock this translates into 10 ns, not nearly enough to back up the system registers into the nonvolatile portion of the main system memory, even if such nonvolatile memory exists. A capacitor-type power-down reserve that provides enough reserve for the computer to save all registers into the nonvolatile portion of the main memory is not economically feasible in stationary computers (like desktop PCs, work stations, servers), and even less so in portable (notebooks, laptop PCs) or mobile computers (palmtop PCs), which have severe space constraints.
Therefore, according to various specific embodiments of the invention, a metal-electronics system is provided that renders existing semiconductor computing components nonvolatile. According to a specific embodiment, two subsystems are provided on a single chip: a second set of registers (e.g., SpinRAM registers) to duplicate the contents of the semiconductor register set, and an interface between the semiconductor set and the metal set of registers, containing both semiconductor and metal logic.
The properties of SpinRAM that can keep computer components operation-ready when brought into the active mode are that, unlike flash, it has true random-bit access, is fast (SRAM speed or faster), has virtually unlimited write cycles, and is inherently radiation resistant. SpinRAM and transpinnor electronics will thus enable a nonvolatile computer component, even with the time constraints imposed on the system by an economic power-down reserve system. In addition, such use of SpinRAM will allow the computer component to resume operation almost immediately from the point of interruption when returning to the active mode from the standby or unpowered modes.
According to different embodiments of the invention, metal memory cells (e.g., SpinRAM cells) can be used to preserve the last state of the system, and thereby render a sequential machine nonvolatile, in at least two different ways. According to a first set of embodiments, a second set of metal registers is built into existing semiconductor computing components to duplicate the functionality of the semiconductor register set for each system module separately. These registers will be referred to herein as “shadow” registers. According to a second set of embodiments, the entire semiconductor register set is replaced by a SpinRAM array of registers. System-module examples that can be made nonvolatile according to either set of embodiments include central processing units, graphic processing units, arithmetic processing units, input/output units, storage units and many more. In both cases the metal registers realize the desired goal, i.e., register contents are preserved upon loss of power.
Shadow registers are advantageous for some applications because they leave the conventional system module architecture largely unchanged. Replacement of the semiconductor registers by SpinRAM requires more complex system changes, but the register set in such embodiments will not lose its content when power is removed, and there will therefore no longer be need for either the register backup cycle prior to power removal or register initialization cycle after power is restored. That is, initialization time will be zero.
Transpinnor and GMR logic levels are easily adapted to one another and to CMOS levels thereby enabling seamless connection of logic in and among CMOS and GMR circuits. Circuits suitable for implementing interfaces between current-based transpinnor logic levels and voltage-based semiconductor logic levels are described in U.S. Patent Publication No. US-2004-0075152-A1 incorporated herein by reference above. Using such techniques, all types of GMR film blocks can be monolithically embedded into CMOS structures. Shadow registers are an example of the unique capabilities—nonvolatility in this case—that GMR film blocks can confer to mature CMOS system modules such as microcontrollers and microprocessors.
According to various embodiments, shadow registers can be activated in a variety of ways. For example, in an automatic mode, the system module writes into the shadow register set every time a semiconductor register is updated. Alternatively, the contents of all shadow registers may be saved upon receiving a signal from the system, e.g., a power-down signal generated by the power management logic. Still other alternatives might involve updating shadow register contents periodically or in response to specific events.
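The activation alternatives just described can be illustrated with a minimal behavioral sketch. It is a software model only, not the hardware of any embodiment; the class name `ShadowedRegisterSet`, the policy names, and the `period` parameter are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class UpdatePolicy(Enum):
    AUTOMATIC = auto()  # shadow bit written on every register update
    ON_SIGNAL = auto()  # shadow written only when save() is called (e.g. power-down signal)
    PERIODIC = auto()   # shadow written every `period` register updates


@dataclass
class ShadowedRegisterSet:
    size: int
    policy: UpdatePolicy
    period: int = 4  # hypothetical interval, used only by PERIODIC
    regs: list = field(default_factory=list)    # volatile semiconductor registers
    shadow: list = field(default_factory=list)  # nonvolatile shadow registers
    writes: int = 0

    def __post_init__(self):
        self.regs = [0] * self.size
        self.shadow = [0] * self.size

    def write(self, index, value):
        """Update one register and, depending on the policy, its shadow."""
        self.regs[index] = value
        self.writes += 1
        if self.policy is UpdatePolicy.AUTOMATIC:
            self.shadow[index] = value
        elif self.policy is UpdatePolicy.PERIODIC and self.writes % self.period == 0:
            self.save()

    def save(self):
        """Copy all registers into the shadow set at once."""
        self.shadow = list(self.regs)

    def power_cycle(self):
        """Model loss of power followed by restoration from the shadow set."""
        self.regs = [0] * self.size    # volatile contents are lost
        self.regs = list(self.shadow)  # shadow contents are restored
```

In this model the automatic policy never loses a write, while the signal-driven and periodic policies preserve only what was present at the last save.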
According to a specific embodiment, the input and output of a semiconductor register bit and the corresponding shadow register bit are connected. The output of the configuration is fed back to its input.
Embodiments in which the shadow registers are frequently updated are suitable when the write speed of the shadow register bits is equal to or faster than that of the semiconductor register bit. Such embodiments may even be suitable when the write speed of the shadow register bits is somewhat slower than that of the semiconductor register bit, provided the slowdown in speed is acceptable. On the other hand, embodiments employing less frequent updates may be called for when the write speed of the shadow register bits is significantly slower than that of the semiconductor register bits. This avoids the system slowdown that would otherwise result from writing to the slower shadow registers on every write cycle.
According to some embodiments, shadow register bits are in close physical proximity to their corresponding semiconductor register bits. The two bits of a pair may also be logically close, i.e., they may be connected point-to-point with no other logic layer between them. Therefore, any information transfer between the semiconductor register bit and the shadow register bit can take place in a single system clock cycle, e.g. on the order of 1 ns for a 1 GHz system clock. In such embodiments, this is also the time required to transfer information between the two register sets regardless of the number of register bits in the system because the transfers are made in parallel. This is ample time to back up all semiconductor registers into their respective shadow registers in the time typically made available (~10 ns) by an economic power-down reserve system. Because the shadow register set allows cutting off power with only two clock cycles notice (i.e., one clock cycle to enable register output, and one cycle to latch the contents into the shadow registers), and restores the content of all system registers simultaneously in a single clock cycle, the system can reduce power consumption by cutting power more frequently.
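The timing argument above can be checked with a short calculation. The sketch below uses the figures given in the text (a 1 GHz clock, a reserve on the order of 10 cycles, a two-cycle parallel save). The sequential case, with one cycle per register and a 32-register set, is a hypothetical comparison and not a figure from this description.

```python
def cycle_time_ns(clock_hz):
    """Duration of one system clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def reserve_time_ns(reserve_cycles, clock_hz):
    """Time the power-down reserve keeps the system running."""
    return reserve_cycles * cycle_time_ns(clock_hz)


def parallel_save_cycles():
    """One cycle to enable register output plus one to latch into the shadow set."""
    return 2


def sequential_save_cycles(num_registers, cycles_per_register=1):
    """Hypothetical register-by-register backup; cost grows with register count."""
    return num_registers * cycles_per_register


def fits_in_reserve(save_cycles, reserve_cycles):
    """True if the save completes before the reserve is exhausted."""
    return save_cycles <= reserve_cycles


if __name__ == "__main__":
    print(reserve_time_ns(10, 1e9))                        # reserve in ns
    print(fits_in_reserve(parallel_save_cycles(), 10))     # parallel shadow save
    print(fits_in_reserve(sequential_save_cycles(32), 10)) # assumed sequential save
```

Under these assumptions the parallel save fits the reserve regardless of register count, whereas the assumed sequential save does not once the register count exceeds the number of reserve cycles.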
A further advantage associated with the use of shadow register bits as described herein is that both the content backup and content restoration routines may be implemented in the component hardware, thereby eliminating the need for complex software routines to perform these tasks. Such software routines themselves consume significant system resources in terms of software, power and time. Additionally, the occurrence of any system problems during the operation of such routines represents a potential disaster.
As a reference for register initialization times, some subsystems use flash as backup for configuration states; this is the case for FPGA chips. In this situation, recovery requires reading sequentially from flash and setting the SRAM-configuration control registers through a serial shift register. This typically takes on the order of milliseconds rather than the 1 ns or less for the specific embodiments of the invention described herein.
Thus, some embodiments of the present invention provide all-metal registers that are built into semiconductor components and mirror the registers of these components. These registers have the capability to store the module register data and eliminate the need to store these data in the system memory. According to more specific alternative and/or complementary embodiments, (1) the all-metal memory block is automatically updated in a transparent mode every time the corresponding semiconductor register is updated; (2) the all-metal memory block is automatically updated in a transparent mode, as triggered by either a self-initiated or a system-initiated signal; (3) the all-metal memory block is automatically updated whenever power is cut off from the respective module, as triggered by a self-generated or by a system-generated signal; and (4) the module registers are initialized by transferring the contents of the all-metal memory block to the register set every time the system module is powered.
The conversion from GMR logic levels to CMOS or TTL levels requires translation of the GMR output. The GMR output may be characterized by a small open-circuit output voltage or a moderate short-circuit output current. The preferred output is the short-circuit load where the logic levels are currents. Short-circuit output has the advantage that output-load capacitance effects, which normally cause slower responses, are significantly reduced. Open-circuit output operation is also possible, but with load-capacitance effects present.
A variety of exemplary interface circuits will now be described which may be employed to effect conversion between all-metal GMR circuitry and conventional semiconductor circuitry. It will be understood with reference to this description and the accompanying drawings that these or similar interface circuits may be employed, for example, to implement the tight coupling between semiconductor register bits and all-metal GMR register bits referred to above. It will also be understood that these or similar interface circuits may be employed (as will be described with reference to
The conversion of CMOS or TTL logic signals to signals suitable for GMR logic is a simple operation in which the CMOS or TTL output voltages are converted to currents.
A CMOS design can also make use of the excellent current-source capability of CMOS to fashion an output driver that provides the logic-level currents directly in a regulated fashion. The normal practice is to refer current sources to a bandgap-style regulator, which has a reference current output commonly called a PTAT.
“PTAT” is an abbreviation for Proportional To Absolute Temperature. PTAT is a commonly used term in CMOS design. In CMOS analog circuit design, the current practice is to design a power supply regulator commonly referred to as a bandgap regulator. A typical bandgap regulator includes a pair of diodes used to produce a reference current that is PTAT and used to compensate temperature variations to produce the bandgap voltage for the regulator. It is common practice to also use this PTAT current (since it is there already) as a reference to bias the rest of the chip through the extensive use of CMOS mirror circuits. Virtually everywhere a bias current is needed, a PTAT mirror is used—examples include amplifiers, comparators, digital-to-analog reference currents, external device biases, etc.
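For orientation, the textbook relation behind a PTAT current is sketched below. It is not stated in this description and is given only as a commonly used idealization. The symbols are assumptions: $N$ is the ratio of current densities in the two diodes, $R$ is the resistor across which their voltage difference is dropped (its own temperature dependence is neglected), and $m$ is the ratio of a CMOS mirror that copies the reference current.

```latex
\Delta V_D = V_{D1} - V_{D2} = \frac{kT}{q}\,\ln N ,
\qquad
I_{\mathrm{PTAT}} = \frac{\Delta V_D}{R} = \frac{kT}{qR}\,\ln N ,
\qquad
I_{\mathrm{bias}} = m \, I_{\mathrm{PTAT}}
```

Here $k$ is Boltzmann's constant, $q$ the electron charge, and $T$ the absolute temperature, so that under these assumptions the reference current scales linearly with $T$.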
In
The SAVE signal is generated by external circuitry and causes the semiconductor register bits to be copied into the shadow memory bits. According to one implementation, the SAVE signal is only issued at times when the semiconductor contents are in a state that may need to be recovered. The RESTORE signal is also generated by external circuitry and causes the shadow memory contents to be restored into the semiconductor register. The RESTORE signal is issued only when a recovery of the saved state needs to be performed.
Save Control circuit 3606 is a control circuit that causes the contents of the semiconductor register bits to be recorded into the shadow memory bits. The circuitry can be a simple AND gate with appropriate level shifting to convert the voltage implemented logic levels in the semiconductor register into the currents needed to set the state of the GMR memory elements.
Restore Control circuit 3608 is a semiconductor control circuit that converts the current output of the GMR memory elements into voltage levels and then AND's these with the RESTORE signal to set the semiconductor register bits.
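A minimal logic-level sketch of the save and restore gating described above follows. It is a behavioral model under stated assumptions, not the circuit of any embodiment: the drive amplitude `WRITE_CURRENT_MA`, the use of current polarity to encode the bit value, and all function names are hypothetical.

```python
WRITE_CURRENT_MA = 1.0  # hypothetical drive amplitude, not taken from the description


def save_drive_currents(reg_bits, save):
    """Gate each register bit with SAVE and level-shift it to a write current.

    Assumption: the sign of the current encodes the bit; no current flows
    while SAVE is low.
    """
    if not save:
        return [0.0 for _ in reg_bits]
    return [WRITE_CURRENT_MA if bit else -WRITE_CURRENT_MA for bit in reg_bits]


def apply_save(shadow_bits, currents):
    """An undriven shadow element keeps its state; a driven one takes the new bit."""
    return [old if i == 0.0 else int(i > 0) for old, i in zip(shadow_bits, currents)]


def restore_control(reg_bits, shadow_bits, restore):
    """Sense shadow bits as logic levels and gate them with RESTORE."""
    return [s if restore else r for r, s in zip(reg_bits, shadow_bits)]


if __name__ == "__main__":
    regs = [1, 0, 1, 1]
    shadow = [0, 0, 0, 0]
    shadow = apply_save(shadow, save_drive_currents(regs, save=True))
    regs = [0, 0, 0, 0]  # volatile contents lost with power
    regs = restore_control(regs, shadow, restore=True)
    print(regs)
```

With SAVE and RESTORE both low, neither register set is disturbed, which reflects the statement that each signal is issued only when needed.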
As will be understood, the specific implementations of the save and restore controls are dependent on the design of the semiconductor register and on the design of the shadow register. The discussions provided above relating to transpinnor-based circuits and the interfacing of semiconductors and all-metal GMR based circuitry, and the information provided in the patent documents incorporated herein by reference above, are sufficient to enable one of ordinary skill in the art to implement the appropriate design.
According to another generalized set of embodiments, the present invention provides an all-metal nonvolatile register set, built into a semiconductor-based system component to replace the semiconductor-based register set. Such a configuration is shown in
A block diagram of exemplary interface circuitry between the CMOS circuitry of a semiconductor-based component and a SpinRAM array is shown in
The input interface to the semiconductor circuitry shown on the left-hand side corresponds to a CMOS set of controls. The one-bit data line DATA is tri-stated and receives the input value for a write and sends the output value during a read. The chip select SELN (active low) enables read or write operations. A high signal for the read and write control RD/WRN indicates a read, and a low a write. The memory-operation cycle starts when SELN is pulled low or the RD/WRN signal changes. During changes, the signals on the ADDR lines must be stable.
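The control-signal conventions just described (SELN active low, RD/WRN high for read and low for write, a shared one-bit DATA line) can be sketched as a small behavioral model. The class `OneBitMemory` and its method names are hypothetical; timing and address-stability requirements are not modeled.

```python
from enum import Enum


class Op(Enum):
    IDLE = "idle"
    READ = "read"
    WRITE = "write"


def decode(seln, rd_wrn):
    """Decode the operation from SELN (active low) and RD/WRN (high = read)."""
    if seln != 0:
        return Op.IDLE
    return Op.READ if rd_wrn else Op.WRITE


class OneBitMemory:
    """Bit-addressable array behind a single shared DATA line (behavioral model)."""

    def __init__(self, n_addr_bits):
        self.cells = [0] * (1 << n_addr_bits)

    def cycle(self, seln, rd_wrn, addr, data_in=0):
        """Run one memory cycle; return the bit driven onto DATA, or None.

        None models the DATA line not being driven by the memory, as during
        a write or while the chip is deselected.
        """
        op = decode(seln, rd_wrn)
        if op is Op.READ:
            return self.cells[addr]
        if op is Op.WRITE:
            self.cells[addr] = data_in & 1
        return None


if __name__ == "__main__":
    mem = OneBitMemory(n_addr_bits=4)
    mem.cycle(seln=0, rd_wrn=0, addr=5, data_in=1)  # write a 1 to address 5
    print(mem.cycle(seln=0, rd_wrn=1, addr=5))      # read it back
```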
The blocks in the interface perform the following functions. The ADDR BUFFER block is a buffer register with gating logic to capture the address when a read or write operation starts. The register holds the address stable during the operation. The BIT DRIVE SEL block is an address decoder and an analog current generator that produces the half-select bit (column) drive currents for the SpinRAM column-driver transpinnors. Only one of these drivers is active at a time. The currents generated are dual polarity and two levels corresponding to the read currents for switching the SpinRAM soft layer and corresponding to the write currents for switching the SpinRAM hard layer.
The WORD DRIVE SEL block is an address decoder and an analog current generator that produces the half-select word (row) drive currents for the SpinRAM row-driver transpinnors. It operates similarly to the BIT DRIVE SEL circuitry. The DATA BUFFER block is a 1-bit buffer register with logic to control write-operation currents and to receive the bit read during a read operation. During a read, the tri-stated input line is activated to output the bit read.
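The half-select scheme implied by the BIT DRIVE SEL and WORD DRIVE SEL blocks can be illustrated as follows: one row driver and one column driver each supply a half-select drive, and only the cell at their intersection receives the combined drive. The sketch is a simplified model under assumed values. The address-to-row/column mapping, the integer drive units, and the thresholds are hypothetical, and any effect of write currents on soft layers is ignored.

```python
def split_address(addr, n_cols):
    """Assumed mapping of a linear address to (row, column)."""
    return divmod(addr, n_cols)


def cell_drives(row, col, n_rows, n_cols, half):
    """Drive seen by every cell when one row and one column driver are active.

    `half` is the signed half-select amplitude (arbitrary units); its sign
    models the dual-polarity currents.
    """
    return [
        [(half if r == row else 0) + (half if c == col else 0) for c in range(n_cols)]
        for r in range(n_rows)
    ]


def switched_cells(drives, threshold):
    """Cells whose drive magnitude reaches the switching threshold."""
    return [
        (r, c)
        for r, row in enumerate(drives)
        for c, d in enumerate(row)
        if abs(d) >= threshold
    ]


if __name__ == "__main__":
    row, col = split_address(6, n_cols=4)
    # Hypothetical levels: read uses half=2 against a soft-layer threshold of 3,
    # write uses half=5 against a hard-layer threshold of 8.
    print(switched_cells(cell_drives(row, col, 4, 4, half=2), threshold=3))
    print(switched_cells(cell_drives(row, col, 4, 4, half=-5), threshold=8))
```

In both cases only the addressed cell is selected, because cells sharing just the active row or just the active column see a single half-select drive that stays below the threshold.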
The READ/WRITE LOGIC block receives the read or write request along with the SELN signal to start a read or write operation sequence. A state machine sequences through a set of states to drive the SpinRAM memory selectors. For reads, a sequence of operations is performed to determine whether the selected bit is in a 1 state or a 0 state. The proper timing sequences for applying the select currents and gating the output onto the DATA line are also generated in this block. The CLOCK LOGIC & POWER DISTRIB block receives control from the SELN signal going low and initiates a sequence of actions conditioned by the RD/WRN pin state.
In addition to these embodiments and as described above, SpinRAM can also be used as nonvolatile main memory for any computing configuration, thus rendering the memory-initialization cycle unnecessary.
While the invention has been particularly shown and described with reference to specific embodiments thereof, it will be understood by those skilled in the art that changes in the form and details of the disclosed embodiments may be made without departing from the spirit or scope of the invention. In addition, although various advantages, aspects, and objects of the present invention have been discussed herein with reference to various embodiments, it will be understood that the scope of the invention should not be limited by reference to such advantages, aspects, and objects. Rather, the scope of the invention should be determined with reference to the appended claims.
The present application claims priority under 35 U.S.C. 119(e) to U.S. Provisional Application No. 60/501,670 for NONVOLATILE SEQUENTIAL MACHINE filed on Sep. 9, 2003 (Attorney Docket No. IMECP020P), the entire disclosure of which is incorporated herein by reference for all purposes.
Number | Date | Country
---|---|---
60501670 | Sep 2003 | US