1. Field of the Invention
This invention relates to memory architectures, and particularly to memory architectures employing fully buffered DIMMs.
2. Description of the Related Art
Many schemes have been developed for the organization and operation of random access memory (RAM) devices accessed by a microprocessor. One traditional “stub bus” RAM architecture, in this case for a DDR memory channel, is shown in
However, the architecture of
Some applications, such as a server computer, require access to large quantities of RAM, often more than can be provided using the stub bus architecture of
As before, each RAM chip stores a group of bits at each unique address. A given data word is generally stored on a particular DIMM, with its data bits typically distributed across all the RAM chips on the DIMM. For example, assuming a DIMM contains nine x8 RAM chips, a 72-bit data word is stored with 8 bits on each of the nine chips. When host 30 sends a ‘read’ command to a particular address, the RAM chips of the appropriate DIMM each deliver their 8 bits to the AMB, which assembles them into a half frame for return to the host via the NB data path.
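The distribution of a 72-bit word across nine x8 chips can be sketched as follows. This is only an illustration: the function names are hypothetical, and the assignment of consecutive bits to each chip (chip 0 holding the least significant byte) is an assumption, since the actual bit-to-chip mapping is not specified here.

```python
CHIPS_PER_DIMM = 9   # nine x8 RAM chips, as in the example above
BITS_PER_CHIP = 8    # an x8 chip stores 8 bits at each address
WORD_BITS = CHIPS_PER_DIMM * BITS_PER_CHIP  # 72


def split_word(word: int) -> list[int]:
    """Split a 72-bit word into nine 8-bit slices, one per chip.

    Assumed ordering: chip 0 holds the least significant 8 bits.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 72 bits")
    mask = (1 << BITS_PER_CHIP) - 1
    return [(word >> (chip * BITS_PER_CHIP)) & mask
            for chip in range(CHIPS_PER_DIMM)]


def assemble_word(slices: list[int]) -> int:
    """Reassemble the 72-bit word from the nine per-chip slices,
    as the AMB would do before returning data to the host."""
    if len(slices) != CHIPS_PER_DIMM:
        raise ValueError("expected one slice per chip")
    word = 0
    for chip, value in enumerate(slices):
        word |= (value & ((1 << BITS_PER_CHIP) - 1)) << (chip * BITS_PER_CHIP)
    return word
```

Under this layout, the loss of one chip removes 8 bits of the same word, which is why a single-bit-correcting ECC cannot recover from a chip failure in this arrangement.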
However, the architecture of
One technique used to enable memory systems to tolerate a failed RAM chip is referred to as a “chipkill” implementation. Here, a memory array is architecturally partitioned to spread out an ECC-enhanced data word over many RAM chips such that any individual chip contributes only one bit of the data word, thereby enabling a data word to be recovered using ECC bits even if an entire RAM chip fails.
Applying the chipkill technique to an FB DIMM architecture as described above would require modifying the
An FB DIMM architecture and protocol are presented which overcome the problems noted above, providing the advantages of a fully buffered architecture while also enabling the system to successfully tolerate the failure of a RAM chip.
The present architecture and protocol comprise a memory controller (host), a first memory channel with first and second DIMMs, and a second memory channel with third and fourth DIMMs. SB and NB data paths are connected between the controller and the first DIMM and between the first DIMM and the second DIMM such that the first and second DIMMs are serially connected to the controller. Another pair of SB and NB data paths serially connects the controller with the third and fourth DIMMs. The SB data paths are used to write the data bits of x-bit wide data words from the controller to the first, second, third and fourth DIMMs, and the NB data paths are used to read the data bits of x-bit wide data words from the first, second, third and fourth DIMMs to the controller.
Each DIMM comprises a plurality of RAM devices, each of which is arranged to store y bits of data at respective addresses, with each DIMM containing x/y RAM chips. Each DIMM also includes an AMB device arranged to receive data from the SB and NB data paths, to encode and decode data for each of the DIMM's RAM devices, and to redrive data received from the SB path to the next device on the SB path, and to redrive data received from the NB path to the next device on the NB path.
To enable the present system to tolerate the failure of a single RAM chip, the system's protocol is arranged such that the bits of any given data word stored in the first and second memory channels are interleaved across the RAM devices such that each RAM stores no more than one bit of the data word. As such, the failure of a RAM chip results in the loss of just one bit of a given data word, which can be corrected via the word's ECC (if used).
Further features and advantages of the invention will be apparent to those skilled in the art from the following detailed description, taken together with the accompanying drawings.
One possible embodiment of an FB DIMM architecture and protocol in accordance with the present invention is illustrated in
In the exemplary embodiment shown, first and second memory channels (Channel 0 and Channel 1) 50 and 52 are interfaced to a memory controller 54. First memory channel 50 includes a first DIMM (DIMM 1) and a second DIMM (DIMM 2). The channel includes a southbound (SB) data path 56 by which data bits of 72-bit wide data words are written to DIMM 1 and DIMM 2, and a northbound (NB) data path 58 by which data bits are read from DIMM 1 and DIMM 2. The SB and NB data paths are connected between memory controller 54 and DIMM 1 and between DIMM 1 and DIMM 2 such that DIMM 1, DIMM 2 and memory controller 54 are serially-connected.
The second memory channel (Channel 1) is similarly configured, containing a third DIMM (DIMM 3) and a fourth DIMM (DIMM 4) serially connected to controller 54, with an SB data path 60 by which data bits are written to DIMM 3 and DIMM 4, and an NB data path 62 by which data bits are read from DIMM 3 and DIMM 4.
Each DIMM includes a plurality of RAM devices 64 and an AMB 66. In general, when data words x bits in length are stored using the first and second memory channels, and each RAM device is arranged to store y bits of data at respective addresses, each DIMM will contain x/y RAM chips. For this example, x=72 and y=4; therefore, each DIMM contains 18 RAM chips.
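The x/y relationship above is simple arithmetic; a minimal sketch, with a hypothetical helper name, is:

```python
def chips_per_dimm(word_bits: int, chip_width: int) -> int:
    """Number of RAM chips per DIMM for x-bit words and y-bit-wide chips (x/y)."""
    if chip_width <= 0 or word_bits % chip_width != 0:
        raise ValueError("word width must be a positive multiple of chip width")
    return word_bits // chip_width


# The example in the text: x = 72, y = 4.
example_chips = chips_per_dimm(72, 4)
```

With x=72 and y=4 this gives 18 chips per DIMM, so the four DIMMs of the two channels together hold 4 × 18 = 72 chips.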
Each AMB is arranged to receive data from a channel's SB and NB data paths, to encode/decode data for each of the DIMM's RAM devices, to redrive data received from the SB path to the next device on the SB path, and to redrive data received from the NB path to the next device on the NB path. Thus, for the example illustrated in
Similarly, the AMB in DIMM 3 receives data from SB data path 60 and NB data path 62, encodes/decodes data for each of DIMM 3's RAM devices, redrives data received from SB path 60 to the AMB in DIMM 4, and redrives data received from the AMB in DIMM 4 via NB path 62 to memory controller 54. The AMB in DIMM 4 receives data from SB data path 60, encodes/decodes data for each of DIMM 4's RAM devices, and drives data to the AMB in DIMM 3 via NB path 62.
A chipkill approach is employed to ensure that the present system can tolerate the failure of one of the RAM chips. The system is arranged such that the bits of any given data word stored in the first and second memory channels are interleaved across the RAM devices such that each RAM chip stores no more than one bit of the data word. This enables the system to tolerate the failure of a single RAM chip, as this results in the loss of just one bit of a given data word, which can be recovered via the word's ECC (assuming that each data word includes ECC bits capable of recovering one lost or corrupted data bit).
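The one-bit-per-chip property can be checked with a short sketch. The mapping used here is hypothetical (bit i of every word is placed on chip i, counted across all four DIMMs, and the four words sharing an address use different data pins of each x4 chip); it is one possible interleave that satisfies the constraint, not necessarily the organization used in the embodiment. DIMMs are indexed 0 to 3 purely for convenience.

```python
WORD_BITS = 72
CHIP_WIDTH = 4                              # x4 RAM devices
NUM_DIMMS = 4                               # two channels, two DIMMs each
CHIPS_PER_DIMM = WORD_BITS // CHIP_WIDTH    # 18
WORDS_PER_ADDRESS = CHIP_WIDTH              # four 72-bit words share an address


def locate_bit(word_index: int, bit_index: int) -> tuple[int, int, int]:
    """Return (dimm, chip, chip_pin) for one bit of one word.

    Hypothetical mapping: bit i of every word lives on global chip i,
    and word w occupies data pin w of each chip.
    """
    if not (0 <= word_index < WORDS_PER_ADDRESS and 0 <= bit_index < WORD_BITS):
        raise ValueError("index out of range")
    dimm, chip = divmod(bit_index, CHIPS_PER_DIMM)
    return dimm, chip, word_index


def bits_lost(failed_dimm: int, failed_chip: int) -> dict[int, int]:
    """Count how many bits of each word are lost if one chip fails."""
    lost = {w: 0 for w in range(WORDS_PER_ADDRESS)}
    for w in range(WORDS_PER_ADDRESS):
        for b in range(WORD_BITS):
            dimm, chip, _ = locate_bit(w, b)
            if (dimm, chip) == (failed_dimm, failed_chip):
                lost[w] += 1
    return lost
```

For any choice of failed chip, `bits_lost` reports exactly one lost bit per word under this mapping, which is the condition a single-bit-correcting ECC needs.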
One way in which the bits of data words to be stored can be arranged is shown in
Note that the organization of data bits shown in
Data is conveyed on the SB and NB data paths in data frames 70. For the NB data path, each data frame is made up of two half-frames 72, 74. For the exemplary embodiment shown in
A given data frame is written to one of the channel's two DIMMs (e.g., DIMM 2 for memory channel 50 in the example shown), and the subsequent data frame is written to the channel's other DIMM (DIMM 1 in this case). As such, the contents of a given SB frame are determined by the contents of a corresponding DIMM.
The SB data path is preferably 10 bits wide, and the NB path is preferably between 12 and 14 bits wide, depending on the particular ECC scheme (if any) employed. In accordance with JEDEC specifications, a 14-bit wide NB path employs two bit lanes for CRC code bits, a 13-bit wide NB path has one CRC bit lane, and a 12-bit wide NB path accommodates no CRC bits. As such, a 72-bit group of data bits for a given half-frame is conveyed up a 12-bit wide NB path 12 parallel bits at a time, requiring six consecutive 12-bit groups to send the entire 72 bits. A 14-bit wide NB path would also convey the data as six consecutive 12-bit groups, but would additionally include 2×6=12 CRC code bits. Similarly, a 72-bit group of bits on the SB path requires eight consecutive 10-bit groups to fill a data frame. The AMB device on each DIMM coordinates the transfer of data bits between its RAM devices and the SB and NB data paths.
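The transfer counts quoted above follow from dividing the payload by the number of data lanes. A minimal sketch of that arithmetic, using hypothetical function names and assuming 12 data lanes on the NB path with any remaining lanes carrying CRC bits:

```python
import math

NB_DATA_LANES = 12   # data lanes on the NB path; extra lanes carry CRC bits
SB_LANES = 10        # preferred SB path width


def transfers_needed(payload_bits: int, lanes: int) -> int:
    """Consecutive parallel groups needed to move payload_bits over `lanes` lanes."""
    return math.ceil(payload_bits / lanes)


def nb_half_frame(nb_width: int, payload_bits: int = 72) -> dict[str, int]:
    """Transfers and CRC bits for one NB half-frame on a 12-, 13- or 14-bit path."""
    crc_lanes = nb_width - NB_DATA_LANES
    if crc_lanes not in (0, 1, 2):
        raise ValueError("NB path width must be 12, 13 or 14 bits")
    transfers = transfers_needed(payload_bits, NB_DATA_LANES)
    return {"transfers": transfers, "crc_bits": crc_lanes * transfers}
```

Here `nb_half_frame(12)` yields six transfers with no CRC bits, `nb_half_frame(14)` yields six transfers with 12 CRC bits, and `transfers_needed(72, SB_LANES)` yields the eight 10-bit groups mentioned for the SB path.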
Memory controller 54 issues write and read commands via the SB data path, with each command including an address. Both of the DIMMs on a channel respond to the same address, such that the two DIMMs essentially act as one DIMM.
The present invention provides an FB DIMM architecture which includes a chipkill functionality, but which only requires two memory channels. This is half the number of channels that might otherwise be needed. As such, significant savings are realized in terms of number of I/O pins (200-300 fewer than a comparable four channel implementation) and required PC board area (due to the reduced number of I/O pins). Because the present scheme requires a response from two AMB devices to fill a frame in response to a read request (with one AMB filling the first half-frame and the second AMB filling the second half-frame), each AMB must differ slightly from the configuration specified by JEDEC.
The premise of the present invention could also be applied to an eight channel FB-DIMM architecture, to reduce it to four channels. Here, each of the four memory channels would contain 2 DIMMs, each of which is populated with x8 RAM chips and an AMB. Each channel is interfaced to a common memory controller via respective SB and NB data paths. As above, the architecture and protocol are arranged such that the bits of any given data word stored in the four memory channels are interleaved across the RAM devices such that each RAM stores no more than one bit of the data word.
The present invention enables the pinout requirements of an eight channel FB-DIMM architecture to be reduced by half, with a consequent reduction in space requirements. For example, a conventional eight channel architecture would comprise 8 DIMMs, each interfaced to the memory controller via respective SB and NB data paths. In a typical arrangement, such a system would store 72-bit data words, each DIMM would consist of 9 RAM chips and an AMB, and each RAM chip would be an x8 device, i.e., with 8 bits stored at each unique address.
An exemplary four channel implementation in accordance with the present invention is shown in
The data words are suitably stored as shown in
While particular embodiments of the invention have been shown and described, numerous variations and alternate embodiments will occur to those skilled in the art. Accordingly, it is intended that the invention be limited only in terms of the appended claims.