The present disclosure generally relates to memory devices and more particularly relates to a memory device with 4N and 8N die stacks.
New designs for memory devices are being developed to enable faster, less-expensive, or more-reliable computing. For example, new communication technologies can increase the efficiency of communications between memory controllers and the memory devices. Concurrently, designers are implementing additional memory dies in memory devices to increase the memory capacity of these devices. Some methods for increasing the capacity of memory devices, however, may not be compatible with new technologies. As a result, these memory devices can operate inefficiently or inaccurately.
Memory devices can be used to implement memory, which stores data to be operated on by a processor. As applications for computing devices become more complex, memory devices are required to store larger amounts of data and communicate that data more quickly. Accordingly, techniques to improve the efficiency and overall capacity of semiconductor devices are needed. One technique to improve the efficiency of semiconductor devices is to develop communication technologies that enable fast and reliable communication of data to and from different components. For example, memory banks can be arranged (e.g., coupled with different buses) such that they can store or return data in a more efficient manner (e.g., communicate with a higher bandwidth). Moreover, the overall capacity of the memory devices can be increased by implementing additional memory dies into the memory device. For example, memory dies can be stacked vertically to increase the number of circuit components in a memory device without increasing the device's footprint.
While these techniques individually can be used to increase memory efficiency and capacity, challenges can arise when trying to increase the capacity of devices designed for these improved communication technologies. Take, for example, a high-bandwidth memory (HBM) device, made up of a stack of multiple memory dies (e.g., DRAM dies). The HBM device may be compliant with an HBM specification (e.g., the HBM3 specification, the HBM4 specification, etc.) and designed to communicate data with an increased bandwidth in comparison to those of other memory devices. An HBM memory device (e.g., an HBM4 memory device) can be designed with 32 channels (e.g., associated with a command and address (CA) bus), where each channel is further subdivided into two pseudo channels (e.g., each associated with a data (DQ) bus). As illustrated in
The HBM device with an 8N architecture can be symmetrical, such that each of the channels has the same number of banks assigned to each of the two pseudo channels, when implemented as an eight-die device. In fact, symmetrical devices can be created at each multiple of eight dies (e.g., a 16-die HBM device referred to as 16H, a 24-die HBM device referred to as 24H, etc.). There exists, however, a need to create HBM devices that do not contain a multiple of eight dies. For example, an HBM device made up of 12 memory dies (referred to as a 12H stack) may be desired. HBM devices made up of multiples of 4 memory dies that are not multiples of 8 (e.g., a 20H stack, a 28H stack, a 36H stack, etc.) may also be desired. If an additional stack of four memory dies implemented with the 8N architecture were added to the HBM device to implement a 12-die device, however, the 32 channels would be distributed evenly across the additional stack of memory dies and associated with a single pseudo channel (e.g., only the first pseudo channel). Given that the first pseudo channel would now include the first half of the eight-die stack as well as the additional stack of four dies, while the second pseudo channel would only be implemented at the second half of the eight-die stack, the HBM device would be asymmetrical (i.e., each pseudo channel would be associated with a different number of banks within the HBM device). Specifically, the device would have twice the number of banks associated with the first pseudo channel compared to those associated with the second pseudo channel. This asymmetry can be difficult for a memory controller to handle. For example, the memory controller would have to issue twice as many commands to the banks associated with the first pseudo channel compared to the banks associated with the second pseudo channel, which can be difficult given that the banks of the first and second pseudo channels are coupled with the same CA buses.
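The asymmetry described above can be illustrated with a short sketch. This is a hypothetical model for illustration only; the figure of 16 banks per pseudo channel per channel is an assumption of the sketch, not a value taken from a particular specification.

```python
# Illustrative model: per-channel bank counts when additional four-die
# stacks using the 8N architecture are added to an eight-die 8N device.
# The 16-banks-per-pseudo-channel figure is an assumption for this sketch.

def pseudo_channel_banks(extra_8n_style_dies):
    """Per-channel (PC0, PC1) bank counts for an eight-die 8N device plus
    `extra_8n_style_dies` additional dies whose banks all land on PC0."""
    pc0 = 16 + (extra_8n_style_dies // 4) * 16  # each extra 4-die stack adds 16
    pc1 = 16                                    # second half of the 8N stack only
    return pc0, pc1

print(pseudo_channel_banks(0))  # (16, 16): symmetric eight-die device
print(pseudo_channel_banks(4))  # (32, 16): 12-die device, PC0 has twice the banks
```

The model reproduces the 2:1 imbalance described above: the first pseudo channel of each channel accumulates all banks of the added four-die stack.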
Alternatively, the HBM device can be implemented with a 4N architecture, as illustrated in
To address these needs and others, embodiments of the present disclosure relate to a 12H HBM device (and other HBM devices with multiples of 4 memory dies that are not divisible by 8, such as 20H HBM devices, 28H HBM devices, etc.) with a first stack of eight memory dies having an 8N architecture and a second stack of four memory dies having a 4N architecture. A first half (e.g., four memory dies) of the first stack of eight memory dies can include 32 channels, and the banks of each of the 32 channels on the first half of dies can be associated with respective first pseudo channels of the 32 channels. A second half (e.g., four memory dies) of the first stack of memory dies can include the 32 channels, and the banks of each of the 32 channels on the second half of dies can be associated with respective second pseudo channels of the 32 channels. The second stack of four memory dies can include 32 channels divided equally amongst the four memory dies, and the banks of each of the 32 channels can be divided equally amongst respective first and second pseudo channels of the 32 channels.
In aspects, the first stack of eight memory dies can be associated with a first stack identifier (SID) and the second stack of four memory dies can be associated with a second SID. Thus, embodiments of the mixed 4N/8N 12-die memory device disclosed herein can reduce the number of distinct SIDs needed to communicate with the memory device compared to a 12-die memory device implemented using three 4N die stacks, which utilizes three distinct SIDs. As discussed, commands may be communicated with increased delays when addressing memory dies with different SIDs. Accordingly, reducing the number of SIDs in the memory device can improve the efficiency of communication. Moreover, given that the mixed 4N/8N memory device includes a symmetric four-die 4N die stack and a symmetric eight-die 8N die stack, the combined 12-die memory device can be symmetric with regard to the number of banks associated with the two pseudo channels of each channel, which can reduce communication complexity.
In additional aspects, individual memory dies of the HBM device of the present disclosure can operate according to the 4N architecture or the 8N architecture. That is, for example, the memory die can be configured so that all of the banks of the memory die are associated with a single pseudo channel, or in the alternative the memory die can be configured so that half the banks are associated with a first pseudo channel and the other half of banks are associated with a second pseudo channel. The memory die can be configured to operate according to the 4N or 8N architecture based on the assembly of the HBM device (e.g., by programming a configuration register of the memory device, blowing an electronic fuse of the memory device, etc.).
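For illustration, the 4N/8N selection described above might be modeled in software as follows. This is a minimal sketch under stated assumptions: the per-die flag is modeled on the C4N indicator described later in this disclosure, the 16-bank die size is assumed, and in the actual device the selection is made in hardware (e.g., via a configuration register or electronic fuse), not in software.

```python
# Illustrative sketch (not the actual device logic): a per-die flag,
# modeled on a C4N-style indicator, selects whether all of a die's banks
# map to one pseudo channel (8N architecture) or split across both (4N).

def map_banks(num_banks, c4n, default_pc):
    """Return a pseudo-channel assignment (0 or 1) for each bank index."""
    if c4n:  # 4N architecture: split banks evenly across PC0 and PC1
        return [0 if b < num_banks // 2 else 1 for b in range(num_banks)]
    # 8N architecture: every bank on the die serves a single pseudo channel
    return [default_pc] * num_banks

print(map_banks(16, c4n=False, default_pc=0))  # 8N die: all banks on PC0
print(map_banks(16, c4n=True, default_pc=0))   # 4N die: 8 banks on each PC
```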
As shown, the host device 302 and the memory device 308 are coupled with one another through the interconnect 314. The processor 304 executes instructions that cause the memory controller 306 of the host device 302 to send signals on the interconnect 314 that control operations at the memory device 308. The memory device 308 can similarly communicate data to the host device 302 over the interconnect 314. The interconnect 314 can include one or more CA buses 316 or one or more DQ buses 318. The CA buses 316 can communicate control signaling indicative of commands to be performed at select locations (e.g., addresses) of the memory device 308. The DQ buses 318 can communicate data between the host device 302 and the memory device 308. For example, the DQ buses 318 can be used to communicate data to be stored in the memory device 308 in accordance with a write request, data retrieved from the memory device 308 in accordance with a read request, or an acknowledgement returned from the memory device 308 in response to successfully performing operations (e.g., a write operation) at the memory device 308. The CA buses 316 can be realized using a group of wires, and the DQ buses 318 can encompass a different group of wires of the interconnect 314. As some examples, the interconnect 314 can include a front-side bus, a memory bus, an internal bus, a peripheral component interconnect (PCI) bus, etc.
The processor 304 can read from and write to the memory device 308 through the memory controller 306. The processor 304 may include the computing device's host processor, central processing unit (CPU), graphics processing unit (GPU), artificial intelligence (AI) processor (e.g., a neural-network accelerator), or other hardware processor or processing unit.
The memory device 308 can be integrated within the host device 302 or separate from the computing device 300. The memory device 308 can include any memory 312, such as integrated circuit memory, dynamic memory, random-access memory (e.g., dynamic random-access memory (DRAM), static random-access memory (SRAM)), or flash memory to name just a few. The memory device 308 can include memory 312 of a single type or memory 312 of multiple types. In general, the memory device 308 can be implemented as any addressable memory having identifiable locations of physical storage. The memory device 308 can include memory-side control logic 310 that executes commands from the memory controller 306. For example, the control logic 310 can decode signals from the memory controller 306 and perform operations at the memory 312.
As a specific example, the memory device 308 can include a high-bandwidth memory (HBM) device. For example, the memory device 308 can include an interface die implementing at least a portion of the memory-side control logic 310 and one or more memories 312 (e.g., memory dies) stacked on the interface die. The memory-side control logic 310 can receive commands from the memory controller 306 through the interconnect 314 and communicate signaling to execute the commands at the memory 312 in an improved manner compared to other memory devices (e.g., with a higher bandwidth). The interconnect 314 can similarly be implemented in accordance with an HBM device. For example, the interconnect 314 can include 32 channels further divided into two pseudo channels per channel. Each channel can be coupled to a CA bus, and each pseudo channel can transmit or receive data through a respective DQ bus. Thus, the interconnect 314 can include twice as many DQ buses 318 (e.g., 64 DQ buses) as CA buses 316 (e.g., 32 CA buses). Further details of the memory device 308 will be described in greater detail with respect to
Control logic 410 (e.g., a portion of the control logic 310 of
The memory die 400 can perform operations in accordance with commands received from a memory controller (e.g., memory controller 306 of
Once the command is determined to be directed to the memory die 400, the command can be analyzed to determine which of the banks 406-1 are targeted by the command. The command can include one or more bits (e.g., in a header) indicating to which of the pseudo channels 404 the command is directed. For example, the command could include a single pseudo channel bit with value “1” when the command is directed to pseudo channel 404-1A (e.g., or one or more of banks 406-1A) and a single pseudo channel bit with value “0” when the command is directed to pseudo channel 404-1B (e.g., or one or more of banks 406-1B). The control logic 410-1 can analyze the command and determine, based on the one or more bits identifying the targeted banks, to which of the banks 406-1 to transmit signaling to perform the operations indicated by the command. In aspects, the control logic 410-1 can determine the banks 406-1A of pseudo channel 404-1A to be the targeted banks. Accordingly, the control logic 410-1 can decode the command to determine a targeted row, a targeted column, and a desired operation associated with the command. The control logic 410-1 can then forward signaling to the banks 406-1A to perform the desired operation at the targeted row and column of the banks 406-1A.
Performing operations at the banks 406-1A can cause data to be returned to the control logic 410-1 for output to the memory controller. For example, if the operation is a read operation, the return data can include data stored in the targeted row and column of the banks 406-1A. Alternatively, if the operation is a write operation, the data can include an acknowledgement (e.g., a success flag or a return of the data that was written) of a successful write operation at the targeted row and column of the banks 406-1A. Given that the banks 406-1A and the banks 406-1B are configured to return data on different DQ buses implemented within the TSVs 408-1, the control logic 410-1 can determine which DQ bus to route the return data to. For example, return data resulting from operations at the banks 406-1A can be routed to the first DQ bus of the TSVs 408-1, and return data resulting from operations at the banks 406-1B can be routed to the second DQ bus of the TSVs 408-1. In aspects, the control logic 410-1 determines where the return data is originating from by analyzing a header of the return data or based on the previous decision regarding where to route the command from the CA bus. Once routed to the associated DQ bus of the TSVs 408-1, the return data can be transmitted to the memory controller using the associated DQ bus.
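The decode-and-route flow described above can be sketched as follows. This is a hypothetical illustration: the single pseudo channel bit convention ("1" selects pseudo channel 404-1A) follows the example in the text, while the dictionary-based command representation and the labels for the bank groups and DQ buses are assumptions of the sketch.

```python
# Hypothetical sketch of routing a command and its return data based on
# the pseudo channel bit. Labels mirror the reference numerals in the
# text; the command format itself is illustrative only.

def route(command):
    """Pick the targeted bank group and the return-data DQ bus."""
    if command["pc_bit"] == 1:             # directed to pseudo channel 404-1A
        banks, dq_bus = "406-1A", "DQ0"    # 406-1A returns data on the first DQ bus
    else:                                  # directed to pseudo channel 404-1B
        banks, dq_bus = "406-1B", "DQ1"    # 406-1B returns data on the second DQ bus
    return banks, dq_bus

print(route({"pc_bit": 1}))  # ('406-1A', 'DQ0')
print(route({"pc_bit": 0}))  # ('406-1B', 'DQ1')
```

Note that the return-data routing decision can simply reuse the earlier command-routing decision, as the text describes.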
In aspects, the internal data path from the banks 406-1A to the control logic 410-1, or vice versa, can be at least partially shared with the internal data path from the banks 406-1B to the control logic 410-1, or vice versa. Thus, data contention can occur when operations at the banks 406-1 cause data to be returned from the banks 406-1A and the banks 406-1B at the same time. Accordingly, it is important to mitigate concurrent returns from two pseudo channels of a same channel on a single die. In other aspects, because the banks 406-1 associated with different pseudo channels share a common CA bus, those banks do not receive commands at the same time, which avoids such data contention. In still other aspects, the banks 406-1A and the banks 406-1B can each be connected to the TSVs 408-1 through independent data paths. In this case, the return data can be routed to the associated DQ buses of the TSVs 408-1 directly through the independent data paths.
Channel 402-2, channel 402-m, and channel 402-n can be similarly configured, where “m” and “n” are positive integers. For example, channel 402-2 can include pseudo channels 404-2 (e.g., pseudo channel 404-2A having banks 406-2A and pseudo channel 404-2B having banks 406-2B) coupled with TSVs 408-2 through control logic 410-2. Channel 402-m and channel 402-n can be similarly arranged with pseudo channels 404-m (e.g., pseudo channel 404-mA having banks 406-mA and pseudo channel 404-mB having banks 406-mB) and pseudo channels 404-n (e.g., pseudo channel 404-nA having banks 406-nA and pseudo channel 404-nB having banks 406-nB) coupled with TSVs 408-m and TSVs 408-n through control logic 410-m and control logic 410-n, respectively. There can be any number of channels 402 on the memory die 400. As a specific example, n can be 8 such that the memory die 400 includes channel 402-1 through channel 402-8. In this way, each rank can include 4 memory dies having 8 channels each, thus implementing 32 channels per rank, as required by the HBM specification.
Although illustrated as a single component of control logic, the control logic 410 associated with the various channels 402 can be implemented as discrete portions of control logic. For example, the control logic 410 can be implemented at any location on or off the memory die 400 (e.g., at an interface die of the memory device). In aspects, portions of the control logic 410 can be implemented at different locations. For example, a portion of the control logic 410 responsible for decoding the command or determining the targeted banks/dies can be separate from a portion of the control logic 410 responsible for routing the return data to an associated DQ bus. Accordingly, it should be appreciated that the control logic 410 is shown schematically in
In aspects, a memory die configured in accordance with an 8N architecture can look similar to the memory die 400 illustrated in
The 4N stack of memory dies 504 can include memory dies 508 (e.g., 508A-508D). The memory dies 508 can include 32 channels distributed equally across the memory dies 508 such that each of the memory dies 508 includes eight channels. Banks of the 32 channels on the memory dies 508 can be divided equally amongst the first and second pseudo channels. The 4N stack of memory dies 504 can have an SID, “1”, different from that of the 8N stack of memory dies 502.
The memory device 500 can further include an interface die 510 in accordance with the HBM specification. The interface die 510 can optimize signaling to/from the memory dies of the memory device 500.
The memory device 600 can also include circuitry (e.g., a fuse, a mode register, a selector) to indicate the configuration of the cores. For example, each of the cores can include a mode register (C4N) or other circuitry that indicates whether the core is configured in accordance with a 4N architecture or an 8N architecture. As illustrated, Core 8 through Core 11 have C4N set to HIGH, thus indicating a 4N architecture, and Core 0 through Core 7 have C4N set to LOW, thus indicating an architecture other than 4N (e.g., 8N).
The memory device 600 also includes DQ buses, DWORD0 and DWORD1, associated with each of the respective first and second pseudo channels of each channel. For example, DWORD0 can be multiple DQ buses used to return data from respective first pseudo channels of each channel (e.g., a pseudo channel identified as PC0), and DWORD1 can be multiple DQ buses used to return data from respective second pseudo channels of each channel (e.g., a pseudo channel identified as PC1). As illustrated, Core 0 through Core 3 each include 16 banks returning data on DWORD0, and Core 4 through Core 7 each include 16 banks returning data on DWORD1. Core 8 through Core 11 are configured in accordance with a 4N architecture. Core 8 through Core 11 each include 16 banks divided equally amongst first and second pseudo channels. Eight banks of each of Core 8 through Core 11 return data on DWORD0, and eight banks of each of Core 8 through Core 11 return data on DWORD1. Thus, in this example, each pseudo channel includes 24 banks. In general, Core 8 through Core 11 include half as many banks on each pseudo channel as Core 0 through Core 7.
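The per-pseudo-channel bank totals above can be checked with simple arithmetic. This sketch only restates the counts given in the text (16 banks per pseudo channel from the 8N cores, 8 from a 4N core whose 16 banks are split evenly):

```python
# Per-channel bank totals for the mixed 4N/8N 12-die example:
# each pseudo channel draws 16 banks from the 8N stack plus 8 banks
# from a 4N core (whose 16 banks are split 8/8 across PC0 and PC1).
pc0 = 16 + 8  # DWORD0: 16 banks from the Core 0-3 side, 8 from a 4N core
pc1 = 16 + 8  # DWORD1: 16 banks from the Core 4-7 side, 8 from a 4N core
print(pc0, pc1)  # 24 24: the pseudo channels remain symmetric
```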
Given that Core 8 through Core 11 include half as many banks on each pseudo channel as Core 0 through Core 7, the number of bank addressing (BA) bits used to address banks of the pseudo channels of Core 8 through Core 11 can be less than the number of BA bits used to address banks of the pseudo channels of Core 0 through Core 7. For example, four bank addressing bits (e.g., BA[0:3]) can be used to address the 16 banks per pseudo channel of Core 0 through Core 7, and three bank addressing bits (e.g., BA[0:2]) can be used to address the 8 banks per pseudo channel of Core 8 through Core 11. If the same number of bits is used to address the banks of Core 0 through Core 7 and of Core 8 through Core 11, the most-significant bit (BA[3]) can equal “0” when Core 8 through Core 11 are addressed.
For a 12-die stack, an SID bit and four BA bits can be used to address the 24 banks of each pseudo channel. The SID can be used to distinguish between the 8N die stack and the 4N die stack, and the BA bits can be used to address specific banks within the channels of the die stacks. Given that the 4N die stack includes half as many banks (e.g., 8 banks) per pseudo channel as the 8N die stack (e.g., 16 banks), one less bit can be used to address the banks of the 4N die stack. Thus, when the 4N die stack is targeted (e.g., SID=“1”), the most-significant BA bit can be set to “0” (e.g., BA[3]=“0”).
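The 12H addressing scheme above can be illustrated with a short sketch. The function below is a hypothetical encoding for illustration: it assumes the 24 banks of a pseudo channel are indexed flat, with indices 0-15 in the 8N stack (SID=0) and 16-23 in the 4N stack (SID=1, where BA[3] is always "0"), consistent with the bit budget described in the text.

```python
# Hedged sketch of 12H bank addressing: one SID bit selects the 8N stack
# (SID=0) or the 4N stack (SID=1); four BA bits address banks within a
# pseudo channel, with BA[3] forced to 0 when the 4N stack is targeted.
# The flat 0..23 bank indexing is an assumption of this sketch.

def encode_bank_address(bank):
    """Map a flat bank index 0..23 (one pseudo channel) to (SID, BA[3:0])."""
    if bank < 16:                  # 16 banks per pseudo channel in the 8N stack
        return (0, bank)           # SID=0, BA spans 0b0000..0b1111
    ba = bank - 16                 # 8 banks per pseudo channel in the 4N stack
    return (1, ba)                 # SID=1, BA[3] is always 0 (ba < 8)

print(encode_bank_address(5))    # (0, 5): bank 5 of the 8N stack
print(encode_bank_address(20))   # (1, 4): bank 4 of the 4N stack, BA[3]=0
```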
For a 16-die stack, two 8N die stacks can be implemented. Thus, the addressing can be the same as the eight-die stack with the addition of an SID bit to distinguish between the two 8N stacks.
As used herein, including in the claims, “or” as used in a list of items (for example, a list of items prefaced by a phrase such as “at least one of” or “one or more of”) indicates an inclusive list such that, for example, a list of at least one of A, B, or C means A or B or C or AB or AC or BC or ABC (i.e., A and B and C). Also, as used herein, the phrase “based on” shall not be construed as a reference to a closed set of conditions. For example, an exemplary step that is described as “based on condition A” may be based on both a condition A and a condition B without departing from the scope of the present disclosure. In other words, as used herein, the phrase “based on” shall be construed in the same manner as the phrase “based at least in part on.”
From the foregoing, it will be appreciated that specific embodiments of the invention have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Rather, in the foregoing description, numerous specific details are discussed to provide a thorough and enabling description for embodiments of the present technology. One skilled in the relevant art, however, will recognize that the disclosure can be practiced without one or more of the specific details. In other instances, well-known structures or operations often associated with memory systems and devices are not shown, or are not described in detail, to avoid obscuring other aspects of the technology. In general, it should be understood that various other devices, systems, and methods in addition to those specific embodiments disclosed herein may be within the scope of the present technology.
The present application claims priority to U.S. Provisional Patent Application No. 63/447,563, filed Feb. 22, 2023, the disclosure of which is incorporated herein by reference in its entirety.