Memory controller to utilize DRAM write buffers

Information

  • Patent Application
  • Publication Number
    20060123187
  • Date Filed
    December 02, 2004
  • Date Published
    June 08, 2006
Abstract
A method, an apparatus, and a computer program are provided to account for data stored in Dynamic Random Access Memory (DRAM) write buffers. There is difficulty in tracking the data stored in DRAM write buffers. To alleviate the difficulty, a cache line list is employed. The cache line list is maintained in a memory controller, which is updated with data movement. This list allows for ease of maintenance of data without loss of consistency.
Description
FIELD OF THE INVENTION

The present invention relates generally to Dynamic Random Access Memory (DRAM) write buffers, and more particularly, to a memory controller that utilizes DRAM write buffers.


DESCRIPTION OF THE RELATED ART

In current DRAM designs, DRAMs often employ a single port to drive write data and sense read data. A bidirectional bus is used, and since the two different operations employ this resource, conflicts often occur within the recipient device. Due to such conflicts, delays result on write to read turnarounds. Collectively, these delays degrade the performance of the memory.


To combat the latencies associated with resource conflicts that result from write to read turnarounds, Extreme Data Rate (XDR™) DRAMs, which are available from Rambus, Inc., El Camino Real, Los Altos, Calif. 94022, can employ write buffers (WBs). The WBs allow the XDR™ DRAMs (XDRAMs), as well as some other DRAMs, to hold one or more cache lines in buffers, which are not actually written to the XDRAM cores. A cache line can be 128 bytes, but could be more or less depending on the implementation. The cache lines are stored in the WBs until instructed to be written to the XDRAM cores. The act of loading the WBs is called preloading, and the committing of the data to the XDRAM cores is called writing. A write command preloads the data for some future write while writing the oldest WB data to the address specified.
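The distinction between preloading and writing can be illustrated with a minimal sketch. The class name `WriteBufferModel`, its methods, and the dictionary standing in for the DRAM core are hypothetical and chosen only for illustration. The sketch assumes that the buffers hold data only and that the address is supplied by the write command, as the paragraph above suggests.

```python
from collections import deque


class WriteBufferModel:
    """Toy model of a DRAM with write buffers (WBs) of a fixed depth."""

    def __init__(self, depth):
        self.depth = depth
        self.data = deque()  # buffered cache lines, oldest at the left
        self.core = {}       # address -> data committed to the DRAM core

    def is_full(self):
        return len(self.data) >= self.depth

    def preload(self, data):
        """Load data into the WBs without touching the core."""
        if self.is_full():
            raise RuntimeError("WBs are full; a write is required")
        self.data.append(data)

    def write(self, address, new_data):
        """Commit the oldest WB data to `address` and preload `new_data`."""
        oldest = self.data.popleft()
        self.core[address] = oldest
        self.data.append(new_data)


wb = WriteBufferModel(depth=2)
wb.preload("line A")
wb.preload("line B")          # nothing has reached the core yet
wb.write(0x100, "line C")     # commits "line A", preloads "line C"
```

Because the buffers in this sketch do not record addresses, something outside the DRAM has to remember which cache lines are still pending, which is the accounting problem discussed next.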


However, maintaining an accounting of the WBs by a memory controller can be difficult and require additional hardware. To complicate matters, the number and depth of WBs varies architecturally. Therefore, there is a need for a method and/or apparatus for maintaining an accounting by a memory controller of data stored within WBs with a reduced amount of additional hardware that addresses at least some of the problems associated with conventional methods and/or apparatuses.


SUMMARY OF THE INVENTION

The present invention provides an apparatus, a method, and a computer program for utilizing Dynamic Random Access Memory (DRAM) write buffers in a memory system comprised of a plurality of DRAMs. As data is written to the write buffers, a list is generated in the memory controller. The list indicates which cache lines are pending in the DRAM WBs. Once generated, the list is stored in the memory controller for future use and can be updated.




BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:



FIG. 1 is a block diagram depicting an XDR™ Memory system;



FIG. 2 is a conceptual block diagram depicting a simplified XDRAM;



FIG. 3 is a flow chart depicting memory controller utilization of the XDRAM WBs;



FIGS. 4A and 4B are block diagrams depicting the read and write queues;



FIG. 5 is a flow chart depicting the operation of adding new entries to the list of commands to execute; and



FIG. 6 is a flow chart depicting the execution of memory commands from read and write queues.




DETAILED DESCRIPTION

In the following discussion, numerous specific details are set forth to provide a thorough understanding of the present invention. However, those skilled in the art will appreciate that the present invention may be practiced without such specific details. In other instances, well-known elements have been illustrated in schematic or block diagram form in order not to obscure the present invention in unnecessary detail. Additionally, for the most part, details concerning network communications, electromagnetic signaling techniques, and the like, have been omitted inasmuch as such details are not considered necessary to obtain a complete understanding of the present invention, and are considered to be within the understanding of persons of ordinary skill in the relevant art.


It is further noted that, unless indicated otherwise, all functions described herein may be performed in either hardware or software, or some combinations thereof. In a preferred embodiment, however, the functions are performed by a processor such as a computer or an electronic data processor in accordance with code such as computer program code, software, and/or integrated circuits that are coded to perform such functions, unless indicated otherwise.


Referring to FIG. 1 of the drawings, the reference numeral 100 generally designates an XDR™ Memory system. The system 100 comprises a chip 102 and XDRAMs 104. The XDRAMs 104 comprise memory banks 110, WBs 122, and a high speed Input/Output (I/O) (not shown).


Traditionally, Synchronous DRAMs (SDRAMs) perform senses, precharges, and so forth in order to load and store data. XDRAMs, such as the XDRAMs 104, also employ many of the same operations and features common to various types of SDRAM. However, there are some significant differences between XDRAMs and other SDRAMs. For example, the XDRAMs employ differential signaling levels, where there is only a 200 mV difference between logic high and logic low to provide lower power consumption and higher switching speeds. XDRAMs also employ octal data rates to allow for 8 bits of data transmission during a clock cycle; therefore, data transmission for a 400 MHz clock would be 3.2 Gbps. Hence, additional control features are necessary to precisely operate and control the XDRAMs.
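The data rate quoted above follows from multiplying the clock frequency by the number of bits transferred per clock cycle. A short check of the arithmetic, where the function name is hypothetical and the result is assumed to apply to a single data signal:

```python
def data_rate_bps(clock_hz, bits_per_clock=8):
    """Bits per second for an octal data rate (8 bits per clock cycle)."""
    return clock_hz * bits_per_clock


rate = data_rate_bps(400_000_000)   # 400 MHz clock
print(f"{rate / 1e9:.1f} Gbps")     # prints "3.2 Gbps"
```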


To control data, addresses, and commands sent to the XDRAMs, the chip 102 employs a number of on-chip components. Specifically, the chip 102 comprises a Memory Controller 106 and an XDR™ IO Cell (XIO) 108. Commands are communicated from the Memory Controller 106 to the XIO 108 through an internal command bus 112. The internal command bus 112 is unidirectional, allowing communication of commands and addresses from the Memory Controller 106. Additionally, data is communicated between the Memory Controller 106 and the XIO 108 through a pair of internal data buses 114. The internal data buses 114 are unidirectional and allow for data to be transmitted to the XIO 108 during store operations, while allowing data to be transmitted back to the Memory Controller 106 during load operations.


The Memory Controller 106 communicates with the WBs 122 via the XIO 108 to preload data. In other words, data can be preloaded into the WBs 122 without being written to the XDRAM 104 cores. An accounting of the data stored within the WBs 122 is maintained by utilizing a cache line list 124 incorporated into the Memory Controller 106.


Based on the information communicated between the memory controller 106 and the XIO 108, the XDRAMs 104 can be utilized. Control data passes from the XIO 108 to the XDRAMs 104 through an external, unidirectional control bus 116 that sends commands and addresses. Data is communicated between XIO 108 and the XDRAMs 104 through an external, bidirectional data bus 118.


The configuration of WBs 122 is such that data can be queued therein. Referring to FIG. 2 of the drawings, the reference numeral 200 generally designates a simplified XDRAM. The XDRAM 200 comprises the WBs 202, bank0 204, and bank1 206. Two banks are depicted in the XDRAM 200 for the purposes of illustration; however, there can be more banks. Typically, there are 4 to 16 banks.


Data from the XIO, such as the XIO 108 of FIG. 1, is communicated through the data pins 208. For a write operation, once the data is passed to the data pins 208, the WBs 202 relay data through the communication channel 210 to either bank0 204 or bank1 206. For a read operation, the banks 204 and 206 can output data to the data pins 208. The communication channels 212 and 214 are for restoring the contents of DRAM cells on a read, providing background data (typically the rest of the row) on a write, or for refreshing an entire row.


The XDRAM 200 operates by transmitting and receiving data through data pins 208. Write data is relayed to the WBs 202. From there, and under memory controller (not shown) control, the WBs 202 can provide either of the banks 204 and 206 with write data. Once data has been designated to be read from the memory core of the XDRAM 200, the data is transferred from one of the banks 204 and 206 to the data pins 208.


The overall operation of the system 100, though, is complicated as a result of utilizing the WBs 122 (or the WBs 202). Referring to FIG. 3 of the drawings, the reference numeral 300 generally designates a flow chart depicting memory controller utilization of the XDRAM WBs.


When operations begin, there are no entries in the WBs 122 (202). WBs contain store data that is pending a write to the core. As a command is processed, a determination is then made in step 302 as to whether a write or a read is to be performed.


When a read operation is to be performed, steps to perform the read are taken. If the read is dependent on any write including those in the WBs 122, the read is stalled. If the read is not dependent on any writes, a determination is made as to whether the write to read turnaround is met in step 304. Once the write to read turnaround has been met (if the previous operation was a write), the read operations are performed in step 306.
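The read path of FIG. 3 can be sketched as a small decision function. The function name, its parameters, and the cycle-count representation of the turnaround are assumptions made for illustration; `pending_lines` stands for every cache line with an outstanding write, including lines recorded in the cache line list 124.

```python
def next_read_action(line_address, pending_lines, last_op,
                     cycles_since_last_op, write_to_read_turnaround):
    """Decide what to do with a read, loosely following steps 304 and 306."""
    # A read that depends on any pending write, including WB contents, stalls.
    if line_address in pending_lines:
        return "stall"
    # Step 304: the turnaround only matters if the previous operation was a write.
    if last_op == "write" and cycles_since_last_op < write_to_read_turnaround:
        return "wait"
    # Step 306: the read can be performed.
    return "read"
```

For example, a read of a line that is still held in the WBs returns `"stall"` regardless of bus timing, while an independent read issued two cycles after a write with a four-cycle turnaround returns `"wait"`.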


If a write operation is to be performed, then different steps are employed. A determination is made as to whether the read to write turnaround has been met in step 308. This determination compares the last issued command to the desired command. At this point, the reads that are dependent on writes cause those writes to advance to the top of the queue. Then, a determination is made in step 310 as to whether the WBs 122 are full. If the WBs 122 are not full, then data is preloaded into the WBs 122 in step 312.


If, on the other hand, the WBs 122 are full, the banks 110 are analyzed. A determination is made in step 314 as to whether the bank 110 with the oldest or top WB entry is available, where the memory controller 106 waits until the bank becomes available. Once available, the oldest WB entry is written to the banks 110 in step 316. While writing the top entry, a determination is made in step 318 as to whether there is a next write available. If there is a next write, the associated data is preloaded to the WBs 122 in step 320. However, if there is not a next write, then dummy data is preloaded into the WBs 122 in step 322. Step 316 and either step 320 or step 322 are performed at the same time and are accomplished by a single write.
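The write path of steps 310 through 322 can be sketched as follows. This is a simplified model, not the controller's actual logic: the function name, the `DUMMY` marker, the `bank_available` callback, and the tuple return value are all hypothetical, the cache line list is modeled as a `deque` of line addresses, and a dummy entry is assumed never to wait on a bank.

```python
from collections import deque

DUMMY = "dummy"  # hypothetical marker for a dummy entry in the list


def handle_write(cache_line_list, wb_depth, next_write, bank_available):
    """Return (action, committed_line, preloaded_line) for one write slot."""
    # Step 310: are the WBs full, according to the controller's list?
    if len(cache_line_list) < wb_depth:
        if next_write is None:
            return ("idle", None, None)
        cache_line_list.append(next_write)          # step 312: preload only
        return ("preload", None, next_write)
    oldest = cache_line_list[0]
    # Step 314: wait until the bank holding the oldest entry is available.
    if oldest is not DUMMY and not bank_available(oldest):
        return ("wait_for_bank", None, None)
    cache_line_list.popleft()                       # step 316: commit oldest
    # Steps 318, 320, 322: preload the next write, or dummy data if none.
    incoming = next_write if next_write is not None else DUMMY
    cache_line_list.append(incoming)
    return ("write", oldest, incoming)
```

In this sketch the commit of the oldest entry and the preload of the incoming entry happen in the same call, mirroring the single write command described above, and the list is updated in the same step so that it stays consistent with the buffer contents.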


By filling the list 124 with dummy entries, several objectives can be accomplished. During periods where the chip is idle, all data can be committed to DRAM cores. If there are no pending writes in the WBs, reads will not be dependent on them. Also, when the chip is idle, the banks will become available so that additional reads and writes can be done.
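The idle-time behavior can be sketched as a drain loop that keeps issuing writes carrying dummy data until no real cache line remains in the list. The function name and the `DUMMY` marker are hypothetical, and bank availability and timing are ignored for brevity.

```python
from collections import deque

DUMMY = None  # hypothetical marker for a dummy entry


def flush_with_dummies(pending):
    """Commit every real entry in `pending`, replacing each with a dummy."""
    committed = []
    while any(entry is not DUMMY for entry in pending):
        oldest = pending.popleft()   # the write commits the oldest entry
        pending.append(DUMMY)        # the same write preloads dummy data
        if oldest is not DUMMY:
            committed.append(oldest)
    return committed


pending = deque([0x100, 0x180])
print(flush_with_dummies(pending))   # prints [256, 384]
```

After the loop, the list holds only dummy entries, so no later read can depend on WB contents.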


The operation used for reads and writes can be complicated, employing additional modifications to the respective queues. Referring to FIGS. 4A and 4B of the drawings, the reference numerals 400 and 450 generally designate a read queue and a write queue, respectively, that reside in the memory controller. The read queue 400 comprises a number of entry slots 402 and a dependency list 404 that denotes dependencies on write commands. The write queue 450 comprises a number of entry slots 452, a dependency list 456 that denotes dependencies on read commands, and a priority indication 454 that denotes priorities of write commands.


Referring to FIG. 5 of the drawings, the reference numeral 500 generally designates the addition of entries to the queues 400 and 450.


When a command is presented to the queues 400 and 450, a determination is made in step 502 as to whether the command or entry is a read command or a write command. If the command is a read command, then another determination is made in step 504 as to whether the new read command is dependent on a write entry, which is denoted in the write dependency list 404 of the read queue 400. Dependencies on write commands force an existing write command to execute before the new read command executes; therefore, if there is a write dependency, then the write is given a higher priority for execution in step 508 that is denoted in the priority indication 454 of the write queue 450, and then the read command is queued in step 506. If there is not a write dependency, then the read command is queued in step 506.


Under circumstances where the new command is not a read command, but is a write command, other steps are performed. Firstly, a determination is made in step 510 as to whether the new write command is dependent on a read command. If the new write command is dependent, then it is marked as dependent in the read dependency list 456 of the write queue 450 in step 514, and the write command is queued in step 512. If there is no read dependency, then the new write command is queued in step 512.
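The queueing steps of FIG. 5 can be sketched with two small record types. All names here are hypothetical. The sketch assumes that a dependency means an address match against an existing queue entry, and it adds an optional `wb_pending` argument (an assumption, not part of FIG. 5) so that a read can also be marked dependent on a line still held in the WBs.

```python
from dataclasses import dataclass, field


@dataclass
class ReadEntry:
    address: int
    depends_on_write: bool = False   # models dependency list 404


@dataclass
class WriteEntry:
    address: int
    depends_on_read: bool = False    # models dependency list 456
    high_priority: bool = False      # models priority indication 454


@dataclass
class Queues:
    reads: list = field(default_factory=list)    # read queue 400
    writes: list = field(default_factory=list)   # write queue 450


def enqueue(queues, kind, address, wb_pending=()):
    if kind == "read":                                            # step 502
        hits = [w for w in queues.writes if w.address == address]  # step 504
        for write in hits:
            write.high_priority = True                            # step 508
        dependent = bool(hits) or address in wb_pending
        queues.reads.append(ReadEntry(address, dependent))        # step 506
    elif kind == "write":
        dependent = any(r.address == address for r in queues.reads)  # step 510
        queues.writes.append(WriteEntry(address, dependent))  # steps 514, 512
    else:
        raise ValueError(f"unknown command kind: {kind!r}")
```

Raising the priority of the matching write, rather than reordering the queue immediately, keeps the enqueue step cheap and leaves the ordering decision to the execution stage.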


Referring to FIG. 6 of the drawings, the reference numeral 600 generally designates the execution of entries from the queues 400 and 450.


The top or oldest entry (command) of the read queue is examined in step 602. A determination is then made in step 604 as to whether the read command is dependent on any writes. If the read command is not dependent on any writes, then the read can be safely executed in step 606.


However, if dependencies do exist, then other steps are taken. Because the read is at the top of the queue, some device is waiting for the read to occur, and the read should be performed quickly. A determination is made in step 608 as to whether there are writes to be performed. If there are no next writes to be performed (meaning that the write queue is empty), then the read is dependent on a WB entry so the WB entries are committed to the XDRAM cores and dummy entries are preloaded in steps 615 and 616. Then, another determination is made in step 604 as to whether there are still any dependencies on writes. When there are no dependencies, the read is issued in step 606.


In cases where there are writes to be performed, then a determination of priority is made. In step 610, a determination is made as to whether there are any high priority writes. Those higher priority writes are performed one at a time in step 612. If there are no priority writes, the other writes are performed in step 614 (the read is dependent on an existing WB entry if there are no priority writes).
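The execution flow of FIG. 6 can be sketched as a loop that clears every write the top read depends on before issuing the read. This is a simplified model under several assumptions: the names are hypothetical, the write queue is a list of `(address, high_priority)` pairs, `wb_pending` plays the role of the cache line list, dependencies are address matches, and bank availability and turnaround timing are ignored.

```python
from collections import deque

DUMMY = object()  # hypothetical marker for a dummy WB entry


def issue_write(wb_pending, wb_depth, address, log):
    """Preload `address`; if the WBs are full, also commit the oldest entry."""
    if len(wb_pending) >= wb_depth:
        oldest = wb_pending.popleft()
        if oldest is not DUMMY:
            log.append(("commit", oldest))
    wb_pending.append(address)


def execute_top_read(read_address, write_queue, wb_pending, wb_depth):
    log = []

    def dependent():                                   # step 604
        queued = any(addr == read_address for addr, _ in write_queue)
        return queued or read_address in wb_pending

    while dependent():
        if not write_queue:                            # step 608: queue empty
            # Steps 615 and 616: commit WB entries, preloading dummy entries.
            issue_write(wb_pending, wb_depth, DUMMY, log)
            continue
        # Step 610: pick a high priority write if one exists (step 612),
        # otherwise the oldest queued write (step 614).
        index = next((i for i, (_, prio) in enumerate(write_queue) if prio), 0)
        address, _ = write_queue.pop(index)
        issue_write(wb_pending, wb_depth, address, log)
    log.append(("read", read_address))                 # step 606
    return log
```

With a write queue of `[(0x200, False), (0x100, True)]`, empty WBs of depth 2, and a read of `0x100`, the sketch issues the high priority write first, then the remaining write, then one dummy write that pushes line `0x100` into the core, and only then performs the read.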


The use of dependency procedures allows for proper arbitration of limited resources and makes sure read-after-write and write-after-read consistency is adhered to. The arbitration can then efficiently allocate how data can be moved into and out of the XDRAMs. Therefore, the overall performance of the XDR™ memory system can be improved.


It is understood that the present invention can take many forms and embodiments. Accordingly, several variations may be made in the foregoing without departing from the spirit or the scope of the invention. The capabilities outlined herein allow for the possibility of a variety of programming models. This disclosure should not be read as preferring any particular programming model, but is instead directed to the underlying mechanisms on which these programming models can be built.


Having thus described the present invention by reference to certain of its preferred embodiments, it is noted that the embodiments disclosed are illustrative rather than limiting in nature and that a wide range of variations, modifications, changes, and substitutions are contemplated in the foregoing disclosure and, in some instances, some features of the present invention may be employed without a corresponding use of the other features. Many such variations and modifications may be considered desirable by those skilled in the art based upon a review of the foregoing description of preferred embodiments. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the invention.

Claims
  • 1. A method for implementing a system that employs at least one write buffer for a memory system comprised of a plurality of Dynamic Random Access Memories (DRAMs), comprising: generating a list of which cache lines are pending in the DRAMs; and storing the list in a memory controller.
  • 2. The method of claim 1, wherein the method further comprises: determining if the at least one write buffer is full; preloading data to the at least one write buffer if the at least one write buffer is not full; and updating the list.
  • 3. The method of claim 1, wherein the method further comprises: determining if the at least one write buffer is full; writing a top entry to at least one DRAM of the plurality of DRAMs if the at least one write buffer is full; and updating the list.
  • 4. The method of claim 1, wherein the method further comprises: transferring data from the at least one write buffer to at least one DRAM of the plurality of DRAMs; filling the list with dummy entries when there are no available writes to be executed.
  • 5. Extreme Data Rate (XDR) memory comprising: XDR DRAM that is at least configured to have at least one write buffer; and a memory controller having a list to account for data preloaded into at least one write buffer.
  • 6. The XDR memory of claim 5, wherein the XDR memory further comprises an XDR Input/Output Module (XIO) that is at least configured to receive data and commands through at least one bus.
  • 7. The XDR memory of claim 5, wherein the at least one write buffer further comprises: a plurality of banks for storing data; and a plurality of communication channels that enable the data to be transferred to and received from the plurality of banks through at least one data pin.
  • 8. The XDR memory of claim 5, wherein the memory controller is at least configured to fill the list with dummy entries when there are no available writes to be executed.
  • 9. The XDR memory of claim 5, wherein the at least one write buffer is at least configured to write stored data to at least one DRAM memory core when the at least one write buffer is full.
  • 10. A computer program product for maintaining at least one write buffer for a memory system comprised of a plurality of Dynamic Random Access Memories (DRAMs), the computer program product having a medium with a computer program embodied thereon, the computer program product comprising: computer code for generating a list of which cache lines are pending in the DRAMs; and computer code for storing the list in a memory controller.
  • 11. The computer program product of claim 10, wherein the computer program product further comprises: computer code for determining if the at least one write buffer is full; computer code for preloading data to the at least one write buffer if the at least one write buffer is not full; and computer code for updating the list.
  • 12. The computer program product of claim 10, wherein the computer program product further comprises: computer code for determining if the at least one write buffer is full; computer code for writing a top entry to at least one DRAM of the plurality of DRAMs if the at least one write buffer is full; and computer code for updating the list.
  • 13. The computer program product of claim 10, wherein the computer program product further comprises: computer code for transferring data from the at least one write buffer to at least one DRAM of the plurality of DRAMs; computer code for filling the list with dummy entries when there are no available writes to be executed.
  • 14. A processor for implementing at least one write buffer for a memory system comprised of a plurality of DRAMs, the processor including a computer program product comprising: computer code for generating a list of which cache lines are pending in the DRAMs; and computer code for storing the list in a memory controller.
  • 15. The computer program of claim 14, wherein the computer program further comprises: computer code for determining if the at least one write buffer is full; computer code for preloading data to the at least one write buffer if the at least one write buffer is not full; and computer code for updating the list.
  • 16. The computer program of claim 14, wherein the computer program further comprises: computer code for determining if the at least one write buffer is full; computer code for writing a top entry to at least one DRAM of the plurality of DRAMs if the at least one write buffer is full; and computer code for updating the list.
  • 17. The computer program of claim 14, wherein the computer program further comprises: computer code for transferring data from the at least one write buffer to at least one DRAM of the plurality of DRAMs; computer code for filling the list with dummy entries when there are no available writes to be executed.