IBM® is a registered trademark of International Business Machines Corporation, Armonk, N.Y., U.S.A. Other names used herein may be registered trademarks, trademarks or product names of International Business Machines Corporation or other companies.
1. Field of the Invention
This invention relates to dynamic random access memory (DRAM) and refresh engines, and particularly to systems and methods for a DRAM concurrent refresh engine with processor interface.
2. Description of Background
DRAM is a type of random access memory (RAM) that stores each bit of data in a separate capacitor within an integrated circuit. Since capacitors leak charge, the information eventually fades unless the capacitor charge is refreshed periodically. Since DRAM loses its data when the power supply is removed, it is in the class of volatile memory devices. DRAMs can also include an on-chip cache, thereby having a main memory portion and a cache memory portion. Such cache DRAMs can be implemented in low-end workstations and personal computers, as well as high-end systems as a secondary cache scheme. DRAM uses refresh circuitry for the purpose of maintaining the charge and thus information stored. If the refresh cycle is interrupted for any length of time, the information in the memory is lost. There persists a need for a fast DRAM cache having refresh that does not degrade performance of the DRAM.
Exemplary embodiments include a memory array system having a memory and refresh engine, the system including memory cells requiring periodic refresh at least once each for a specified refresh interval and words of an array organized into banks in which the banks are selected for access by a bank-enable signal, each bank having a word decoder accepting one of two refresh word addresses, wherein one refresh word address is for a normal access, and one refresh word address is for a refresh access, one of the two word addresses selected by two separate enable signals, the enable signals provided by on-macro refresh logic, wherein the on-macro refresh logic includes instructions to select one bank for refresh when no normal access occurs and select one bank for refresh concurrently with a normal access having no bank conflicts, the refresh logic maintaining the refresh status, timing of the refresh interval, and ensuring all memory cells are refreshed within the refresh interval by providing a potential time out flag to a processor/memory controller to request inhibiting of further conflict accesses.
Additional embodiments include a method for managing refresh intervals in DRAM, the method including providing memory cells requiring periodic refresh at least once each for a specified refresh interval, organizing words within an array organized into banks in which the banks are selected for access by a bank-enable signal, each bank having a word decoder accepting one of two refresh word addresses, wherein one refresh word address is for a normal access, and one refresh word address is for a refresh access, one of the two word addresses selected by two separate enable signals, the enable signals provided by on-macro refresh logic, selecting one bank for refresh when no normal access occurs and selecting one bank for refresh concurrently with a normal access having no bank conflicts, the refresh logic maintaining the refresh status, timing of the refresh interval, and ensuring all memory cells are refreshed within the refresh interval by providing a potential time out flag to a processor/memory controller to request inhibiting of further conflict accesses.
System and computer program products corresponding to the above-summarized methods are also described and claimed herein.
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention. For a better understanding of the invention with advantages and features, refer to the description and to the drawings.
As a result of the summarized invention, technically we have achieved a solution which, over a large percentage of the DRAM accesses, allows refresh to occur simultaneously with a normal access, and does so with a very simple refresh engine and an extremely simple interface to the processor for controlling conflicts between normal accesses and a potential refresh period time-out.
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
The detailed description explains the preferred embodiments of the invention, together with advantages and features, by way of example with reference to the drawings.
Exemplary embodiments include concurrent refresh structure and logic for a 2 Mbit eDRAM macro. This macro contains all the refresh controls including the generation of an interrupt (Busy) signal sent back to the CPU/memory controller to stall any new normal accesses when refresh completion becomes imperative. This macro continuously attempts to refresh itself within any given, preset refresh time interval, TRI, required for all bits to be refreshed. The macro also has controls to ensure that each bit is refreshed only once in every full TRI, thus avoiding unnecessary refreshes, which waste power.
In one exemplary embodiment of this logic structure, the structure supports one refresh and one normal access in each eDRAM cycle, unless the two currently pending refresh banks are identical and coincide with a normal access to the same bank. In addition, refresh is free running under its own control and performs only one refresh per word in the TRI refresh time interval. For a pathological case of possible incomplete refresh within the TRI time, the macro issues an INTERRUPT (Busy) signal back to the CPU in advance of the TRI boundary and completes the refresh interval, provided the CPU/memory controller does not issue any more access requests. The refresh engine continues automatically and resets itself for each new refresh interval, TRI. In general, no external control is needed except for start-up initialization of two counters and two shift registers. The refresh time, TRI, can easily be changed, as can the time at which the Interrupt signal is issued within the TRI interval. The system requirements for this exemplary embodiment include a CPU/memory controller that is able to halt normal accessing upon assertion of the eDRAM Interrupt (Busy) signal.
In another exemplary embodiment, a selectable, multi-mode operation includes a refresh engine that has two modes, externally selectable by a single control bit. In one implementation, mode one is similar to the first mode described above. In another exemplary implementation, mode two includes a refresh cycle that can take place only when the CPU has asserted a separate input signal that allows refresh. The macro is now described in detail in the following discussion.
For ease of description, and only as an example, the concurrent refresh engine is described with respect to implementation in a 2 Mbit macro consisting of four separate, physical banks. The 2 Mbit macro can be constructed from the 1 Mbit macro shown in the top or bottom portion of
For a 2 Mbit macro, four pBanks are on top and four are on the bottom, with REFRESH applied to two pBanks simultaneously, one on top and one on the bottom. In addition, one RAC (Refresh Address Counter) is included for the 2×4 pBanks. Bank selection for refresh is determined by two shift registers, as now discussed. The FU (shift Up) register, A, and the FD (shift Down) register, B, point to the next pBanks to be refreshed. The same RAC is used for the word address of all pBanks. Register A or B is shifted after one word is refreshed in all banks. RAC is incremented by +1 (or −1 as logic dictates) only after the same word in all banks has been refreshed, and the shift occurs when the position of register A=B, as described below. It is possible to allow two refreshes per cycle (both A and B) if there is no conflict with a normal access.
For this 2 Mbit macro, each register, A and B, needs four positions (one for each logical bank) since, on any array cycle, one can be stalled by a normal access and it is desirable for the other to be able to continue on one of the remaining pBanks.
A full TRI sec interval starts with RAC=0 and with A and B in positions having maximum separation (e.g. A to the left at position 0, B to the right at position 3). Each time the word line pointed to by RAC is refreshed in the pBank pointed to by A, A is then shifted to A+1. Similarly, each time the same word line as per RAC is refreshed in a different pBank pointed to by B, B is shifted to the next position, B−1.
When the two tokens (1's) in A and B completely circulate and come back together, and that position has also been refreshed, the refresh cycle is complete for the ONE word in RAC, so the RAC is incremented by 1. After the first time A=B, 255 more word refresh cycles are performed, during which RAC increments from 0 to 255 (or 255 to 0). After this event occurs, the full array refresh operation has been completed and must have been done within TRI sec. If the 2 Mbit macro has a cycle time of Ta sec., then since only 1K cycles (1K words=4 banks*256 words/bank) are needed for this full array refresh, this would take only 1K×Ta sec, if one bank refresh occurs every cycle. This scenario is possible if there are no bank refresh collisions with normal access demands. Even with some collisions, in most cases the refresh likely completes in less than TRI since typically TRI>>Ta. Thus, after all 1K words are refreshed, if the refresh logic just continued to operate, there could be excessive numbers of unnecessary refreshes, which consume excessive power. To prevent too many refreshes from being done, the RAC has an Enable-Refresh bit, ER, which starts at 1, is used to enable refresh cycles, and is set to 0 when the RAC increments back to 0. A system clock which counts to TRI is used to set this bit back to 1. Refresh can occur only when this Enable Refresh bit is set to 1.
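The cycle-count arithmetic above can be checked with a short calculation. The following is an illustrative sketch only: the function names are hypothetical, the cycle times of 4 ns and 2 ns are the example values of Ta used elsewhere in this description, and the 40 microsecond refresh interval is an assumed figure chosen for the example, not a value taken from this specification.

```python
BANKS = 4             # logical banks in the example macro
WORDS_PER_BANK = 256  # word lines addressed by the RAC


def full_refresh_cycles(banks: int = BANKS, words_per_bank: int = WORDS_PER_BANK) -> int:
    """Array cycles needed to refresh every word once, one bank refresh per cycle."""
    return banks * words_per_bank


def full_refresh_time_ns(ta_ns: int) -> int:
    """Best-case time for a full array refresh, for a cycle time Ta given in ns."""
    return full_refresh_cycles() * ta_ns


def refresh_share(ta_ns: int, tri_ns: int) -> float:
    """Fraction of the refresh interval TRI consumed by a collision-free refresh."""
    return full_refresh_time_ns(ta_ns) / tri_ns


if __name__ == "__main__":
    ASSUMED_TRI_NS = 40_000  # hypothetical TRI of 40 microseconds
    for ta_ns in (4, 2):
        print(ta_ns, full_refresh_time_ns(ta_ns), refresh_share(ta_ns, ASSUMED_TRI_NS))
```

Under these assumed numbers the collision-free refresh occupies only a small part of TRI, which is why the Enable-Refresh bit is needed to keep the engine idle for the remainder of the interval.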
In a first scenario, with one refresh per cycle, the control functions are significantly simpler if only one bank is refreshed on any given cycle. Register A is given priority and is chosen unless a collision occurs with a normal access, in which case B is chosen.
On the next memory cycle, the above operations repeat but with either A or B pointing to a different pBank, using the same RAC address as previously. The operations continue for a total of four cycles (in the case of four logical banks), which refreshes the same word in each pBank. At this time, register A is aligned with B and this is used as the signal to increment RAC as described below. Additional details for this logic are now discussed.
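The one-refresh-per-cycle scheme described above can be sketched as a small behavioral model. This is a minimal sketch under stated assumptions, not the disclosed circuit: the class and method names are hypothetical, the top and bottom pBanks are collapsed into four logical banks, and the registers are assumed to return to maximum separation (A at 0, B at 3) once the last bank of a word has been refreshed.

```python
from dataclasses import dataclass
from typing import Optional

NUM_BANKS = 4
WORDS_PER_BANK = 256


@dataclass
class RefreshEngine:
    a: int = 0              # position of the token in register A (shifts up)
    b: int = NUM_BANKS - 1  # position of the token in register B (shifts down)
    rac: int = 0            # refresh address counter (word line)
    er: bool = True         # Enable-Refresh bit
    refreshed: int = 0      # bank refreshes performed in this interval

    def cycle(self, normal_bank: Optional[int]) -> Optional[int]:
        """Run one array cycle; return the bank refreshed, or None if no refresh."""
        if not self.er:
            return None
        last_bank = self.a == self.b  # only one bank is left for this word
        if self.a != normal_bank:
            bank = self.a             # A has priority
            if not last_bank:
                self.a += 1
        elif not last_bank:
            bank = self.b             # A collides with the normal access: use B
            self.b -= 1
        else:
            return None               # A == B collides: refresh stalls this cycle
        self.refreshed += 1
        if last_bank:
            self._next_word()
        return bank

    def _next_word(self) -> None:
        # Assumption: registers are re-separated for the next word line.
        self.a, self.b = 0, NUM_BANKS - 1
        self.rac = (self.rac + 1) % WORDS_PER_BANK
        if self.rac == 0:
            self.er = False           # whole array refreshed: disable until next TRI

    def start_interval(self) -> None:
        """Called when the interval counter starts a new TRI."""
        self.er = True
```

With no normal accesses, 1024 calls to `cycle(None)` refresh the whole array and clear ER. With every normal access directed at bank 0, the model refreshes banks 3, 2, and 1 for word 0 and then stalls, which corresponds to the pathological A=B condition discussed later in this description.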
Referring to
The logic for controlling the incrementing of the RAC is shown in
As described above, RAC is incremented only after the word line specified by the current RAC is refreshed in all four logical banks. This condition occurs when A=B and is obtained as follows. Referring to
The logic for controlling the full refresh interval of TRI sec. is shown in
As long as registers A and B do not point to the same logical bank, there is always one bank available for refresh, no matter what bank is accessed for a normal request. Even when A=B and has a collision with the normal access, there are often sufficient extra cycles for the refresh to be completed within TRI sec. However, there are conditions under which the 1K word refresh cycles will not complete within TRI sec, and these require both detection and action. When A=B, there is only one logical bank available for refresh, and this can give rise to a pathological condition. If repeated normal accesses occur to the same bank pointed to by A=B for many cycles, at some point normal accesses must cease and a forced refresh must start. For example, if there are 100 words remaining to be refreshed in all four banks, the total remaining refresh requirement is 100×4×Ta sec/cycle=400×Ta sec. If the elapsed time since the start of the current TRI refresh interval is TRI−400 Ta, then there is only the exact number of cycles remaining for refresh if one refresh occurs each remaining cycle. Therefore, normal access is stalled and refresh goes to completion.
There are options during the TRI refresh cycle in which a forced completion of the refresh, or some portion of it, can occur. In one option, the logic waits until close to the TRI limit and then checks whether a forced refresh is needed. In one case, only three words would have been refreshed (A=B for RAC=0, and is preempted by normal access) and the remaining time in the refresh interval is about 400 Ta. Thus, a forced refresh is necessary and for the last 400 Ta sec of the TRI sec interval all pBanks are unavailable for normal access (i.e. the CPU would have to be able to tolerate a nearly 400 Ta sec lockout). This lockout time can be small, e.g. for Ta=4 or 2 ns, the lockout is 4 or 2 micro sec. respectively. In another option, the TRI sec interval is divided into, say, P intervals, and the logic insists that 1024/P refreshes be completed in each smaller interval. Thus, if these have not been completed, the pBanks are unavailable for only a max of 400 Ta/P sec., which can be made small. However, any P>1 can introduce more total CPU lockouts than P=1 because some of the lockouts in smaller intervals have a high probability of completing and not being noticed in a larger time interval.
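The worked example above (100 words remaining in four banks, with 400 cycles left in the interval) amounts to comparing the cycles still needed against the cycles still available. The sketch below is illustrative only; the function name is hypothetical, time is counted in array cycles of Ta, and the 10,000-cycle interval in the usage lines is an assumed value.

```python
def must_force_refresh(words_remaining: int, elapsed_cycles: int,
                       tri_cycles: int, banks: int = 4) -> bool:
    """Return True when normal accesses must be stalled to finish refresh in time.

    words_remaining: word lines not yet refreshed in any bank
    elapsed_cycles:  array cycles since the start of the current TRI interval
    tri_cycles:      length of the TRI interval, in array cycles
    """
    cycles_needed = words_remaining * banks      # one bank refresh per cycle
    cycles_left = tri_cycles - elapsed_cycles
    return cycles_needed >= cycles_left


if __name__ == "__main__":
    TRI_CYCLES = 10_000  # assumed interval length, for illustration only
    print(must_force_refresh(100, TRI_CYCLES - 400, TRI_CYCLES))  # boundary case
    print(must_force_refresh(100, TRI_CYCLES - 1000, TRI_CYCLES))
```

The comparison uses "greater than or equal" so that the boundary case in the text, where exactly 400 cycles remain for 400 refreshes, triggers the forced refresh.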
In the logic below, a TRI sec interval is used fundamentally. TRI sec is timed by an NRI-bit counter, where NRI is the upper integer value of: NRI = log₂[TRI] = log₁₀[TRI]/log₁₀[2]. Each tick of the clock, which increments this counter, is equal to the Ta sec. required for refresh. In the following discussions, each counter can be an Up-counter (starting from 0 and counting up) or a Down-counter (starting from some preset count and decrementing), which changes the logic slightly. In one implementation, the RAC is a down counter, which starts at 255 and is decremented (counts down) for each “increment” input signal. The refresh interval counter, RI, can be an Up-counter, starting at 0 and counting up to a maximum possible count value and maximum time Tmax, respectively, of:
CountMax = 2^NRI, Tmax = 2^NRI × Ta sec
However, since the given value of TRI may be less than this Tmax, some logic is embedded within this counter macro, which provides a Reset-to-0 signal when the counter reaches the given TRI value. The setting and use of these counters is described in detail below.
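The counter sizing and the Reset-to-0 behavior described above can be sketched as follows. This is a sketch under stated assumptions rather than the disclosed logic: the names are hypothetical, TRI is expressed as a whole number of Ta ticks, and the counter is modeled as wrapping to 0 on the tick at which it would reach the TRI value.

```python
from dataclasses import dataclass


def counter_bits(tri_ticks: int) -> int:
    """Upper integer value of log2(TRI), with TRI given in ticks of Ta.

    For integers >= 1, (n - 1).bit_length() equals ceil(log2(n)) without
    any floating-point rounding.
    """
    return (tri_ticks - 1).bit_length()


def count_max(tri_ticks: int) -> int:
    """Largest count range of the counter: 2 raised to the number of bits."""
    return 2 ** counter_bits(tri_ticks)


@dataclass
class RefreshIntervalCounter:
    tri_ticks: int  # programmed TRI, in ticks of Ta
    count: int = 0

    def tick(self) -> bool:
        """Advance one Ta tick; return True when a new TRI interval begins."""
        self.count += 1
        if self.count == self.tri_ticks:  # embedded compare, Reset-to-0
            self.count = 0
            return True
        return False
```

For example, an assumed TRI of 10,000 ticks needs a 14-bit counter whose natural range is 16,384 counts, so the embedded compare is what ends the interval at the programmed value rather than at the natural wrap.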
A full TRI sec refresh cycle begins when the RI counter toggles to 0. As a result, the refresh is enabled by setting latch ER=1 and the refresh engine runs unattended. At some time, Tx, before the end of the current refresh cycle, (i.e. as the RI counter nears the end of its interval cycle), a check is made, as described previously, to determine if the full refresh has been completed. If not, the Interrupt signal is asserted to ensure full completion of the current refresh cycle. Tx is some pre-specified time, which is less than TRI by at least the full time it would take to refresh the entire macro if no other accesses occurred, i.e. for the macro of
In most cases, the refresh would have been completed long before this Tx time is reached. When the count reaches the value of Tx, the signal, RI, is asserted to indicate this checkpoint has been reached. The “check” is, “Has RAC reached 0?” If it has, all the words have been refreshed in this interval. If RAC is not 0, there are remaining words to be refreshed. Thus, for the rare cases when RAC is not 0 when RI=1, the Interrupt signal, Int, is set to 1, which is used by the CPU/memory controller to halt normal accesses. Refresh now commences automatically because there are no incoming collisions, and goes to completion, indicated by RAC=0. At this point, the interrupt signal, Int, is automatically reset to 0 by RAC=0, so the normal accesses can restart. In general, the CPU can be interrupted for the exact number of cycles required to complete the refresh.
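The checkpoint and interrupt behavior described above can be sketched as a simple latch. This is a minimal sketch, assuming a boolean `refresh_done` input stands in for the "RAC has reached 0" condition; the names `latest_checkpoint` and `InterruptLatch` are hypothetical, and the 1024-cycle figure is the full-array refresh count for the four-bank, 256-word example.

```python
from dataclasses import dataclass


def latest_checkpoint(tri_ticks: int, full_refresh_cycles: int = 1024) -> int:
    """Latest count Tx that still leaves time to refresh the whole macro."""
    return tri_ticks - full_refresh_cycles


@dataclass
class InterruptLatch:
    tx_count: int           # RI counter value at which the check is made
    asserted: bool = False  # the Int (Busy) signal seen by the CPU/controller

    def update(self, ri_count: int, refresh_done: bool) -> bool:
        """Evaluate once per cycle and return the current Int value."""
        if refresh_done:
            self.asserted = False  # completion resets Int automatically
        elif ri_count == self.tx_count:
            self.asserted = True   # checkpoint reached with words outstanding
        return self.asserted
```

Because the latch is cleared by the completion condition itself, the CPU is held off only for as many cycles as the outstanding refreshes require.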
Referring to
The processor cycle is at least 2× as fast as the eDRAM cycle. As a result, with a free-running refresh engine, it is possible that a CPU access request arrives on a boundary that occurs at the midpoint of an ongoing refresh cycle. In such cases, the processor waits additional cycles for the DRAM to become available and for the reload to start. For high-end systems, such an extra delay is either undesirable or unacceptable. This can be avoided by not using a free-running refresh engine and instead letting the CPU provide a control signal that allows refresh cycles to proceed.
In an exemplary embodiment, the CPU/controller provides a CPU Enabled Refresh, CER, signal each cycle, which indicates that a new refresh cycle can proceed. In such a case, the refresh engine distributes an Enable Refresh signal at the beginning of each such cycle, and the signal lasts for only one cycle. The remainder of the logic remains essentially unchanged. Only the turning on and off of ER is different from that in the free-running refresh logic.
A fast eDRAM macro with only CPU controlled refresh is likely to be unsuitable for low and mid range systems. It is possible to have a single DRAM refresh design. In an exemplary embodiment, an Enable Refresh signal to allow or prevent refresh on any cycle can be implemented. For example, one system is a multi-mode refresh engine in which either a free-running mode or a CPU controlled mode is initialized on the macro and it continues in that mode until reset. The resetting can be done at start-up time, or it can be done dynamically, on the fly. Such a system is shown in FIG. 6.
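The difference between the two modes can be summarized as a gating function on the Enable Refresh signal. The following is an illustrative sketch only: the names are hypothetical, and it assumes that in the CPU controlled mode the engine still suppresses refresh once the array has been fully refreshed for the current interval.

```python
from enum import Enum


class RefreshMode(Enum):
    FREE_RUNNING = 0    # engine refreshes whenever the interval still has work
    CPU_CONTROLLED = 1  # engine refreshes only in cycles the CPU allows via CER


def enable_refresh(mode: RefreshMode, interval_pending: bool, cer: bool) -> bool:
    """Per-cycle Enable Refresh value.

    interval_pending: words remain to be refreshed in the current TRI interval
    cer:              CPU Enabled Refresh input for this cycle
    """
    if mode is RefreshMode.FREE_RUNNING:
        return interval_pending           # CER is ignored in this mode
    # Assumption: pending-work gating still applies under CPU control.
    return interval_pending and cer
```

A single mode bit selecting between the two branches is consistent with the externally selectable control bit mentioned earlier in this description.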
An additional system is shown in
The capabilities of the present invention can be implemented in software, firmware, hardware or some combination thereof.
As one example, one or more aspects of the present invention can be included in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. The media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The article of manufacture can be included as a part of a computer system or sold separately.
Additionally, at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform the capabilities of the present invention can be provided.
The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.
While the preferred embodiment to the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.