3D memory circuit

Information

  • Patent Grant
    12293108
  • Patent Number
    12,293,108
  • Date Filed
    Monday, January 23, 2023
  • Date Issued
    Tuesday, May 6, 2025
  • Inventors
    • DeLaCruz; Javier A. (San Jose, CA, US)
    • Fisch; David E. (Corrales, NM, US)
  • Examiners
    • Tran; Michael T
  • Agents
    • Haley Guiliano LLP
Abstract
Some embodiments provide a three-dimensional (3D) circuit that has data lines of one or more memory circuits on a different IC die than the IC die(s) on which the memory blocks of the memory circuit(s) are defined. In some embodiments, the 3D circuit includes a first IC die with a first set of two or more memory blocks that have a first set of data lines. The 3D circuit also includes a second IC die that is stacked with the first IC die and that includes a second set of two or more memory blocks with a second set of data lines. The 3D circuit further includes a third IC die that is stacked with the first and second IC dies and that includes a third set of data lines, which connect through several z-axis connections with the first and second sets of data lines to carry data to and from the first and second memory block sets when data is being written to and read from the first and second memory block sets. The z-axis connections in some embodiments electrically connect circuit nodes in overlapping portions of the first and third IC dies, and overlapping portions of the second and third IC dies, in order to carry data between the third set of data lines on the third IC die and the first and second sets of data lines of the first and second memory block sets on the first and second IC dies. These z-axis connections between the dies are very short as the dies are very thin. For instance, in some embodiments, the z-axis connections are less than 10 or 20 microns long. The z-axis connections are through silicon vias (TSVs) in some embodiments.
Description
BACKGROUND

Electronic circuits are commonly fabricated on a wafer of semiconductor material, such as silicon. A wafer with such electronic circuits is typically cut into numerous dies, with each die being referred to as an integrated circuit (IC). Each die is housed in an IC case and is commonly referred to as a microchip, “chip,” or IC chip. According to Moore's law (first proposed by Gordon Moore), the number of transistors that can be defined on an IC die will double approximately every two years. With advances in semiconductor fabrication processes, this law has held true for much of the past fifty years. However, in recent years, the end of Moore's law has been prognosticated as we are reaching the maximum number of transistors that can possibly be defined on a semiconductor substrate. Hence, there is a need in the art for other advances that would allow more transistors to be defined for an IC chip.


BRIEF SUMMARY

Some embodiments provide a three-dimensional (3D) circuit that has multiple stacked IC dies, with a memory circuit that spans two or more of the stacked IC dies. In some embodiments, the memory circuit includes a memory block on one die and data lines for the memory block on another IC die. For instance, in some embodiments, the 3D circuit includes a first IC die with a first set of two or more memory blocks that have a first set of data lines. The 3D circuit also includes a second IC die that is stacked with the first IC die and that includes a second set of two or more memory blocks with a second set of data lines.


The 3D circuit further includes a third IC die that is stacked with the first and second IC dies and that includes a third set of data lines, which connect through several z-axis connections with the first and second sets of data lines to carry data to and from the first and second memory block sets when data is being written to, and read from, the first and second memory block sets. The z-axis connections in some embodiments electrically connect circuit nodes in overlapping portions of the first and third IC dies, and overlapping portions of the second and third IC dies, in order to carry data between the third set of data lines on the third IC die and the first and second sets of data lines of the first and second memory block sets on the first and second IC dies. These z-axis connections between the dies are very short as the dies are very thin. For instance, in some embodiments, the z-axis connections are less than 10 or 20 microns long. The z-axis connections are through silicon vias (TSVs) in some embodiments.


In some embodiments, the first and second memory block sets are part of a single addressable memory circuit, while in other embodiments these memory block sets are part of multiple, separately addressable memory circuits (e.g., the first memory block set is part of a first addressable memory circuit, while the second memory block set is part of a different, second addressable memory circuit). The set of one or more memory circuits formed by the first and second memory block sets in some embodiments includes (1) a set of addressing circuits to activate different addressed locations in the memory blocks, and (2) a set of input/output (I/O) circuits to write/read data to addressed locations in the memory blocks.


In some embodiments, the addressing circuits are implemented at least partially on the first and second dies, while the I/O circuits are implemented at least partially on the third die. For instance, in some embodiments, the addressing circuits include sense amplifiers and bit lines defined on the first and second dies. The first and second memory block sets have numerous bit lines that connect their respective storage cells to their respective first and second data line sets through sense amplifiers that amplify the values stored in the storage cells.


In some embodiments, the I/O circuits include the third data line sets on the third die, which connect to the first and second data line sets. In some of these embodiments, the I/O circuit set further includes a set of buffers defined on the third die. Different buffers are used in different embodiments. Examples of such buffers include inverters, level shifters, stateful storage circuits (e.g., latches, flip flops, etc.), etc. In some embodiments, compute circuits are defined on the third die, and these compute circuits receive through the I/O circuits on the third die the data that is read from the first and second memory blocks. In some of these embodiments, these compute circuits also provide to the I/O circuits data that is to be written to the first and second memory blocks. In some embodiments, these compute circuits are processing cores that implement machine-trained nodes (e.g., neurons) of a machine-trained network (e.g., a neural network).


The preceding Summary is intended to serve as a brief introduction to some embodiments of the invention. It is not meant to be an introduction or overview of all inventive subject matter disclosed in this document. The Detailed Description that follows and the Drawings that are referred to in the Detailed Description will further describe the embodiments described in the Summary as well as other embodiments. Accordingly, to understand all the embodiments described by this document, a full review of the Summary, Detailed Description, the Drawings and the Claims is needed.





BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purposes of explanation, several embodiments of the invention are set forth in the following figures.



FIG. 1 illustrates a 3D circuit of some embodiments of the invention.



FIG. 2 illustrates another perspective view of the components of the memory circuit of FIG. 1.



FIG. 3 illustrates the structure of a DRAM memory block that can be used to implement the memory blocks of FIG. 1.



FIG. 4 illustrates an example where the pass gate transistors of a memory block are controlled by AND'ing a die select signal and a block select signal.



FIG. 5 illustrates buffer circuits of the I/O circuits defined on the fourth IC die of FIG. 1.



FIG. 6 illustrates another 3D circuit of some embodiments.



FIG. 7 illustrates a device that uses a 3D IC of some embodiments of the invention.





DETAILED DESCRIPTION

In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.


Some embodiments provide a three-dimensional (3D) circuit that has multiple stacked IC dies, with a memory circuit that spans two or more of the stacked IC dies. In some embodiments, the memory circuit includes a memory block on one die and data lines for the memory block on another IC die. For instance, in some embodiments, the 3D circuit includes a first IC die with a first set of two or more memory blocks that have a first set of data lines. The 3D circuit also includes a second IC die that is stacked with the first IC die and that includes a second set of two or more memory blocks with a second set of data lines. The 3D circuit further includes a third IC die that is stacked with the first and second IC dies and that includes a third set of data lines, which connect through several z-axis connections with the first and second sets of data lines to carry data to and from the first and second memory block sets when data is being written to, and read from, the first and second memory block sets.


In some embodiments, the first and second memory block sets form a single addressable memory circuit, while in other embodiments these memory block sets are part of multiple, separately addressable memory circuits (e.g., the first memory block set is part of a first addressable memory circuit, while the second memory block set is part of a different, second addressable memory circuit). Examples of such memory circuits include DRAMs (Dynamic Random Access Memories), SRAMs (Static Random Access Memories), ROMs (Read Only Memories), etc.


The set of one or more memory circuits formed by the first and second memory block sets in some embodiments includes (1) a set of addressing circuits to activate different addressed locations in the memory blocks, and (2) a set of input/output (I/O) circuits to write/read data to addressed locations in the memory blocks. In some embodiments, the addressing circuits are implemented at least partially on the first and second dies, while the I/O circuits are implemented at least partially on the third die. For instance, in some embodiments, the addressing circuits include sense amplifiers defined on the first and second dies, while the I/O circuits include the third data line sets on the third die, which connect to the first and second data line sets. In some of these embodiments, the I/O circuit set further includes a set of buffers defined on the third die. Different buffers are used in different embodiments. Examples of such buffers include inverters, level shifters, stateful storage circuits (e.g., latches, flip flops, etc.), etc.


In the discussion above and below, the connections that cross bonding layers (that bond vertically stacked dies) to electrically connect electrical nodes (e.g., circuit points, etc.) on different dies are referred to as z-axis connections. This is because these connections traverse completely or mostly in the z-axis of the 3D circuit (e.g., because these connections in some embodiments cross the bonding layer(s) in a direction normal or nearly normal to the bonded surface), with the x-y axes of the 3D circuit defining the planar surface of the IC die substrate or interconnect layers. These connections are also referred to as vertical connections to differentiate them from the horizontal planar connections along the interconnect layers of the IC dies.


Through silicon vias (TSVs) are one example of z-axis connections used by some embodiments of the invention. In some embodiments, z-axis connections are native interconnects that allow signals to span two different dies with no standard interfaces and no input/output protocols at the cross-die boundaries. In other words, the direct bonded interconnects allow native signals from one die to pass directly to the other die with no modification of the native signal or negligible modification of the native signal, thereby forgoing standard interfacing and consortium-imposed input/output protocols. In some embodiments, z-axis connections are direct unbuffered electrical connections (i.e., connections that do not go through any buffer or other circuit).


A z-axis connection between two dies typically terminates on electrical contacts (referred to as pads) on each die (e.g., on an interconnect or substrate layer of each die). Through interconnect lines and/or vias on each die, the z-axis connection pad on each die electrically connects the z-axis connection with circuit nodes on the die that need to provide the signal to the z-axis connection or to receive the signal from the z-axis connection. For instance, a z-axis connection pad connects to an interconnect segment on an interconnect layer of a die, which then carries the signal to a circuit block on the die's substrate through a series of vias and interconnect lines. Vias are z-axis structures on each die that carry signals between the interconnect layers of the die, and between the IC die substrate and the interconnect layers of the die.


The discussion above and below refers to different circuits or blocks on different dies overlapping with each other. As illustrated in the figures described below, two circuit blocks on two vertically stacked dies overlap when their horizontal cross sections (i.e., their horizontal footprint) vertically overlap (i.e., have an overlap in the vertical direction).



FIG. 1 illustrates a 3D circuit 100 of some embodiments of the invention. The 3D circuit 100 has a memory circuit 105 with different components on different IC dies. Specifically, the 3D circuit 100 includes four dies 120-126 that are vertically stacked on top of each other. To vertically stack these dies on top of each other, some embodiments use commonly known techniques for aligning dies vertically and bonding neighboring dies through a bonding layer. As further described below, some embodiments use z-axis connections 160 (e.g., connections that are orthogonal to the x-y surface of the dies) to electrically connect nodes on vertically mounted dies.


In FIG. 1, the first IC die 120 includes a first set of four memory blocks 130, the second IC die 122 includes a second set of four memory blocks 132, and the third IC die 124 includes a third set of four memory blocks 134. The memory blocks in each of these three dies 120-124 are arranged in a single direction (e.g., a single row or single column), with the cross section of each block (e.g., block 130d on die 120) on each die overlapping the cross section of two other memory blocks on two other dies (e.g., blocks 132d and 134d on dies 122 and 124). In other words, each memory block on one die is vertically aligned with two other memory blocks on two other dies in this example. In other embodiments, the memory blocks are not so aligned, and/or have a different arrangement on each die (e.g., are arranged in a two-dimensional array).
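
As a minimal illustrative sketch, the following Python model captures this grouping, assuming that the block at a given position on each of the three memory dies 120-124 shares a footprint (and hence one set of global data lines) with the blocks at the same position on the other two dies; the names and data structures are illustrative rather than part of the disclosed circuit.

```python
# Illustrative model of the FIG. 1 arrangement: four memory blocks per die on
# three stacked memory dies, grouped by vertical overlap. Die names and block
# indices are assumptions chosen for clarity.

MEMORY_DIES = ["die_120", "die_122", "die_124"]  # dies carrying memory blocks
BLOCKS_PER_DIE = 4                               # blocks arranged along one direction

def overlap_groups():
    """Group memory blocks that share the same horizontal footprint.

    Each group corresponds to one set of global data lines on the fourth die,
    since vertically aligned blocks share a global data line set in this example.
    """
    return {
        f"global_lines_{i}": [(die, f"block_{i}") for die in MEMORY_DIES]
        for i in range(BLOCKS_PER_DIE)
    }

if __name__ == "__main__":
    for lines, blocks in overlap_groups().items():
        print(lines, "serves", blocks)
```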


In some embodiments, each die includes a semiconductor substrate 190 and a set of interconnect layers 192 defined above the semiconductor substrate. On each die, numerous electronic components (e.g., active components, like transistors and diodes, or passive components, like resistors and capacitors) are defined on that die's semiconductor substrate, and are connected to each other through interconnect wiring on the die's set of interconnect layers, in order to form storage cells, microcircuits (e.g., Boolean gates, such as AND gates, OR gates, etc.) and/or larger circuit blocks (e.g., functional blocks, such as memories, decoders, logic units, multipliers, adders, etc.). For instance, in some embodiments, each memory block on each die is defined on that die's semiconductor substrate with the needed interconnect wiring on the die's set of interconnect layers.


Each memory block has a set of local data lines 140 on the same IC die as the memory block. The local data lines 140 of each memory block carry data read from, and written to, the memory block. These local data lines 140 of each memory block connect to global data lines 145 on the fourth IC die 126 through control circuits 165 and z-axis connections 160. As shown, the memory circuit has several sets of global data lines 145 on the fourth IC die 126, with each set of global data lines used by a different set of overlapping memory blocks on the first, second and third IC dies 120-124.


In some embodiments, the global data lines 145 include wiring that is defined on one or more interconnect layers of the fourth IC die 126. The global data lines 145 provide the data read from the memory blocks to the I/O circuits 180 (e.g., circuits on the fourth IC die 126) of the memory circuit 105, and provide data to write to the memory blocks from the I/O circuits 180. In some embodiments, the I/O circuits 180 are implemented at least partially on the fourth die 126. For instance, the I/O circuits in some embodiments include buffer circuits (e.g., inverters, level shifters, stateful storage circuits (e.g., latches, flip flops, etc.), etc.) that are defined on the fourth IC die 126.


The z-axis connections 160 in some embodiments electrically connect circuit nodes in overlapping portions of the local data lines 140 and global data lines 145, in order to carry data between the global data lines and the local data lines. These z-axis connections between the dies are very short as the dies are very thin. For instance, in some embodiments, the z-axis connections are less than 10 or 20 microns. The z-axis connections are through silicon vias (TSVs) in some embodiments.


The memory circuit 105 has row and column addressing circuits 170 and 172 that activate a set of addressed locations in a set of memory blocks based on addresses that they receive from other circuits of the 3D circuit 100. In some embodiments, the memory circuit 105 has different row and column addressing sub-circuits for each memory block that process the received addresses for that memory block. In some embodiments, each memory block's row and column addressing sub-circuits are at least partially defined on that block's die. For instance, as further described below, the addressing sub-circuits of each memory block in some embodiments include sense amplifiers and bit lines that are defined on the memory block's die. In some embodiments, the bit lines of the memory block connect the block's storage cells to their respective block's local data lines through sense amplifiers that amplify the values stored in the storage cells.
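
A hypothetical sketch of how one such per-block addressing sub-circuit might split a received address into row and column selects is shown below; the field widths are assumptions chosen for illustration, as the disclosure does not specify them.

```python
# Illustrative sketch of a per-block addressing sub-circuit splitting a flat
# address into row (word line) and column fields. Field widths are assumptions.

ROW_BITS = 9   # assumed number of row-address bits per memory block
COL_BITS = 5   # assumed number of column-address bits per memory block

def decode(address: int) -> tuple[int, int]:
    """Return the (row, column) selects derived from one received address."""
    column = address & ((1 << COL_BITS) - 1)
    row = (address >> COL_BITS) & ((1 << ROW_BITS) - 1)
    return row, column

if __name__ == "__main__":
    print(decode(0b101110110_01101))  # -> (374, 13)
```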



FIG. 2 illustrates another perspective view of the memory blocks 130-134, the local data lines 140 and global data lines 145 of the memory circuit 105. In this view, the memory circuit 105 is a DRAM that is implemented with a differential logic design. This view illustrates the four memory blocks on each of the first three dies 120-124, with each memory block vertically overlapping two other memory blocks on two other dies and each set of three vertically overlapping memory blocks on the three dies 120-124 sharing one set of global data lines 145. Specifically, it shows the local data lines 140 of each memory block connected through pass gate controls 265 (serving as the control circuits 165) and z-axis connections 160 to the global data lines 145. It further shows the four sets of global data lines 145 for the four sets of overlapping memory blocks on the first, second and third IC dies 120-124.


Each memory block's set of local data lines 140 has two subsets of complementary local data lines (as the design is a differential design), with each subset having several (e.g., 8, 16, 32, 64, etc.) data lines. Similarly, each pass gate control 265 of the memory block has two subsets of pass gates for the two subsets of local data lines, with each subset of pass gates having several (e.g., 8, 16, 32, 64, etc.) pass gates.


In FIG. 2, the pass gate controls 265 receive die select signals that, at any given time, activate the pass gate controls for the memory blocks of just one die. For example, for one set of address values, the pass gate controls 265 of the first IC die 120 would receive an active die select signal DS1 that would turn on their transistors to connect their local data lines 140 to the global data lines 145, while the other pass gate controls 265 of the other IC dies 122 and 124 would not receive active die select signals DS2 and DS3.


A given address in these embodiments would cause each of the memory blocks on one IC die (e.g., the first IC die) to read from or write to one set of storage locations. Hence, under this approach, a large amount of data can be read from, or written to, addressed sets of locations in the memory blocks on one IC die (e.g., the first IC die) concurrently through the local data lines 140 of the memory blocks, their associated pass gate controls 265, and the different sets of global data lines 145.
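
The die-select behavior described above can be sketched behaviorally as follows; the die names, word values, and the notion of returning one word per global data line set are illustrative assumptions, not part of the disclosed circuit.

```python
# Illustrative behavioral sketch of the die-select scheme: only the die whose
# die-select signal is active drives the global data lines, and every block on
# that die does so concurrently through its own set of global data lines.

def read_selected_die(die_data, die_select):
    """die_data maps a die name to the list of words its blocks would output;
    die_select maps a die name to its die-select signal (exactly one active)."""
    active = [die for die, sel in die_select.items() if sel]
    assert len(active) == 1, "one die-select signal is active at a time"
    return list(die_data[active[0]])  # one word per set of global data lines

if __name__ == "__main__":
    data = {"die_120": [1, 2, 3, 4],
            "die_122": [5, 6, 7, 8],
            "die_124": [9, 10, 11, 12]}
    select = {"die_120": True, "die_122": False, "die_124": False}
    print(read_selected_die(data, select))  # -> [1, 2, 3, 4]
```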


In this concurrent accessing scheme, the access to any one memory block on a die is not blocked by the concurrent access of another memory block on the die as the different memory blocks on the same die connect to different global data lines. Also, in this scheme, the global data lines do not have to span all the memory blocks on a given die, and hence have a shorter length than global data lines that are typically used today to span a row or column of memory blocks on a single die. In some embodiments, the span of the global data lines is one length, or less than one length, of a memory block, as each set of global data lines is used for three overlapping memory blocks that have the same footprint (i.e., cross section). Hence, each set of global data lines only needs to be long enough to provide sufficient space for connecting to the z-axis connections from the memory blocks.


The short span of the global data lines is highly advantageous when the memory circuit has a large number of memory blocks (e.g., 8, 16, etc.). In the memory block arrangement illustrated in FIG. 2, the length of the wire and z-axis connections between each memory block's local data lines 140 and its corresponding global data lines 145 is rather short, as the global data lines traverse over the local data lines very near to the memory blocks, and the z-axis connections are very short.


For a given address, the memory circuit 105 in some embodiments sequentially activates the die select signals of the different dies so that after concurrently reading from or writing to addressed locations in all the memory blocks of one die, the memory circuit can then read from or write to the addressed locations of the memory blocks of the other die(s). For instance, in the above-described example, after reading from or writing to the set of address locations in the memory blocks of the first IC die 120, the memory circuit sequentially provides active die select signals to the pass gate controls of the second and third IC dies 122 and 124 so that it can sequentially read from or write to the set of address locations in the memory blocks of the second IC die 122 followed by the set of address locations in the memory blocks of the third IC die 124. In other embodiments, the memory circuit 105 has other schemes for activating the pass gate controls and accessing the memory blocks on different IC dies, as further described below by reference to FIG. 4.
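
The sequential activation described here can be sketched in the same illustrative style, with the die order and data layout assumed purely for the example.

```python
# Illustrative sketch of sequentially asserting the die-select signals for one
# shared address, so that the memory blocks of each die are accessed in turn.

def sequential_reads(die_data, die_order=("die_120", "die_122", "die_124")):
    """Yield, for each die in turn, the words its blocks drive concurrently
    onto the global data lines for a single shared address."""
    for die in die_order:
        yield die, list(die_data[die])

if __name__ == "__main__":
    data = {"die_120": [1, 2, 3, 4],
            "die_122": [5, 6, 7, 8],
            "die_124": [9, 10, 11, 12]}
    for die, words in sequential_reads(data):
        print(die, words)
```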



FIG. 3 illustrates the structure of a DRAM memory block 300 that can be used to implement the memory blocks 130, 132 and 134 when the memory circuit is a DRAM. The memory block 300 has a differential design that is commonly used in many DRAMs today. In this design, each logical storage cell is implemented by a complementary pair of single physical storage cells 310 (e.g., single capacitors) that are accessed through complementary pass gate transistors 315, word lines and bit lines. Each cell's pass gate transistor connects to a bit line, a word line and the cell. The bit and word lines 330 and 332 that connect to the cell's pass gate transistor are complementary (i.e., carry the opposite signal values) to the bit and word lines that connect to that cell's complementary cell.


Specifically, each particular pass gate transistor 315 of each particular cell has its gate connected to a particular word line, while a word line that is complementary to the particular word line connects to the gate of the pass gate transistor of a cell that is the complementary cell to the particular cell. Similarly, each particular pass gate transistor 315 of each particular cell has its second terminal connected to a particular bit line, while a bit line that is complementary to the particular bit line connects to the second terminal of the pass gate transistor of the complementary cell of the particular cell. Lastly, each pass gate transistor's third terminal connects to its storage cell. Hence, in this design, several storage locations in a memory block can be accessed concurrently by activating (i.e., by providing active signals on) complementary word line pairs of the storage locations, so that data can be read from, or written through, the complementary bit line pairs of the storage locations.
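
A behavioral sketch of one such logical storage location, modeled as a pair of complementary cells that are written and read together, is shown below; it only models binary values and omits the mid-range alternative mentioned next, and the class and method names are assumptions made for illustration.

```python
# Illustrative model of one logical storage location realized as a complementary
# pair of physical cells accessed through complementary word/bit line pairs.

class DifferentialCell:
    """A logical cell backed by a true cell and its complement cell."""

    def __init__(self):
        self.cell = 0       # "true" physical storage cell
        self.cell_bar = 1   # complementary physical storage cell

    def write(self, value: int) -> None:
        # Activating the complementary word line pair accesses both cells,
        # so the value and its complement are written in one access.
        self.cell = value
        self.cell_bar = 1 - value

    def read(self) -> tuple[int, int]:
        # The complementary bit line pair carries both values to a sense amplifier.
        return self.cell, self.cell_bar

if __name__ == "__main__":
    cell = DifferentialCell()
    cell.write(1)
    print(cell.read())  # -> (1, 0)
```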


Each pair of complementary bit lines is fed to a differential sense amplifier circuit 340 that amplifies the differential voltage value read from a complementary pair of cells by the bit lines, in order to quickly move the data to the high and low rail values. In some embodiments, each differential pair of cells has one cell that stores a high or low value, while the other cell stores the opposite value or a mid-range value. In these embodiments, the sense amplifiers quickly move the data values to the desired rail values to address any degradation in stored values, or to address the storage of the mid-range value.


The sense amplifier circuits 340 include several differential sense amplifiers (e.g., one for each bit line pair, or one for every several bit line pairs). In some embodiments, each differential sense amplifier is formed as a gated, cross-coupled latch. The bit lines in some embodiments connect to the local data lines 140 of the memory circuit through column addressing controls (not shown) of the column addressing circuit of the memory circuit. With the exception of the z-axis connections, all the components illustrated in FIG. 3 (i.e., the bit and word lines 330 and 332, the local data lines 140, the storage cells 310, the pass gate transistors 315, the sense amplifier circuits 340) in some embodiments are defined entirely on one of the dies 120, 122 or 124.
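
A behavioral sketch of the differential sensing just described is shown below; the voltage values are arbitrary units chosen for illustration, and a real sense amplifier is of course an analog circuit rather than a function.

```python
# Illustrative behavioral model of a differential sense amplifier: whichever
# bit line of the complementary pair is higher is driven to the high rail and
# the other to the low rail.

VDD, VSS = 1.0, 0.0

def sense(bit_line: float, bit_line_bar: float) -> tuple[float, float]:
    """Resolve a small differential on a complementary bit line pair to the rails."""
    if bit_line > bit_line_bar:
        return VDD, VSS
    return VSS, VDD

if __name__ == "__main__":
    # A degraded stored '1' (0.55 vs. a 0.50 mid-range reference) still resolves high.
    print(sense(0.55, 0.50))  # -> (1.0, 0.0)
```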


Instead of controlling the pass gate transistors 265 with die select signals, other embodiments control these pass gate transistors 265 differently. For instance, FIG. 4 illustrates an example where the pass gate transistors 465 of a memory block 400 (e.g., memory block 130, 132 or 134) are controlled by AND'ing a die select signal and a block select signal. By specifying different die and block select signals for different memory blocks, the 3D memory circuit 105 can have any arbitrary combination of non-overlapping memory blocks connect their local data lines 140 to the global data lines 145 through the pass gate transistors 265 and the z-axis connections 160. For instance, for the example illustrated in FIG. 2, a particular combination of die and block select signals can result in the memory blocks 130a, 132b, and 134c outputting their results concurrently on their respective global data lines 145. Also, other embodiments use staggered sets of sense amplifiers such that consecutive bit lines in each set of bit lines are fed to different sense amplifiers (e.g., even complementary bit lines are fed to a sense amplifier to the right of the memory cells while odd complementary bit lines are fed to a sense amplifier to the left of the memory cells).
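
The AND'ing of die select and block select signals can be sketched as a simple Boolean gating function; the particular selections in the example below mirror the 130a/132b/134c combination mentioned above, and the signal values are assumptions made for illustration.

```python
# Illustrative sketch of gating a memory block's pass gates with the AND of a
# die select and a block select, so that any combination of non-overlapping
# blocks can drive the global data lines concurrently.

def pass_gate_enable(die_select: bool, block_select: bool) -> bool:
    """A block's pass gates turn on only when both selects are active."""
    return die_select and block_select

if __name__ == "__main__":
    selections = {
        ("die_120", "block_a"): (True, True),    # block 130a enabled
        ("die_122", "block_b"): (True, True),    # block 132b enabled
        ("die_124", "block_c"): (True, True),    # block 134c enabled
        ("die_120", "block_b"): (True, False),   # die selected, block not selected
    }
    for block, (die_sel, block_sel) in selections.items():
        print(block, pass_gate_enable(die_sel, block_sel))
```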



FIG. 5 illustrates buffer circuits 500 of the I/O circuits 180 defined on the fourth IC die 126 along with the global data lines 145. Different buffers are used in different embodiments. As shown, examples of such buffers include inverters 502, level shifters 504, stateful storage circuits 506 (e.g., latches, flip flops, etc.), etc. The I/O circuits 180 of the memory circuit 105 receive data to store in the memory blocks from, and supply data read from the memory blocks to, circuits defined on the first, second, third and fourth IC dies 120-126. In some embodiments, these circuits include compute circuits 550 defined on the fourth IC die 126, as shown in FIG. 5. In some embodiments, these compute circuits on the fourth IC die 126 are processing cores that implement machine-trained nodes (e.g., neurons) of a machine-trained network (e.g., a neural network), while the memory blocks store values used or computed by these compute circuits (e.g., weight values or activation values).


Other embodiments use other architectures to read data from or write data to the memory blocks 130-134 of the memory circuit 105. For instance, some embodiments have two sets of global data lines 145 for two opposing sides (e.g., right and left sets of global data lines) of each set of stacked memory blocks (e.g., memory blocks 130a, 132a, and 134a), instead of just having one set of global data lines 145 for each set of stacked memory blocks. Some embodiments also employ a multiplexer between the I/O circuits 500 and the compute circuits 550 to connect different subsets of global data lines with the compute circuits at different times. Both of these approaches would increase the number of memory blocks that can be concurrently or sequentially accessed through the global data lines and the z-axis connections.
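
The multiplexer mentioned above can be sketched behaviorally as follows; the subset width and line-set names are illustrative assumptions rather than disclosed parameters.

```python
# Illustrative sketch of a multiplexer between the global data lines and the
# compute circuits, connecting a different subset of global data line sets to
# the compute circuits at different times.

def mux(global_line_sets, select, width=2):
    """Return the `width` consecutive global data line sets chosen by `select`."""
    start = select * width
    return global_line_sets[start:start + width]

if __name__ == "__main__":
    line_sets = ["lines_0", "lines_1", "lines_2", "lines_3"]
    print(mux(line_sets, 0))  # first cycle:  ['lines_0', 'lines_1']
    print(mux(line_sets, 1))  # second cycle: ['lines_2', 'lines_3']
```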


One of ordinary skill will also realize that while some embodiments have been described above by reference to the memory circuit 105, other embodiments of the invention can be implemented differently. For instance, in some embodiments, the memory blocks on one set of stacked IC dies that use the global data lines on another stacked IC die are part of two or more separately addressable memory circuits, instead of the single addressable memory circuit 105. Also, other embodiments use many more memory blocks and global data lines than the memory circuit 105.


For instance, instead of having four sets of overlapping memory blocks on three dies, the memory circuit of other embodiments has eight sets of overlapping memory blocks on three dies. In these embodiments, the memory circuit has eight memory blocks on each of the three stacked dies 120, 122 and 124, and these twenty-four memory blocks form eight sets of three overlapping memory blocks on these dies. Each of these eight sets shares two sets of global data lines that connect to two sets of local data lines that emanate from two sides of each memory block. In addition, other embodiments have different sets of global data lines on different stacked IC dies (e.g., a first set of global data lines on IC die 126 for use by a first set of memory blocks on IC dies 120-124, and a second set of global data lines on IC die 120 for use by a second set of memory blocks on IC dies 122-126).


When all the blocks on one IC die are accessed concurrently through the global data lines, a very large number of memory locations in the memory blocks on one die can be accessed concurrently. This number can be increased three-fold when the memory circuit successively activates the die select signals on each of the three dies so that the memory blocks on each of the three dies can be successively accessed.
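
A back-of-the-envelope sketch of this concurrency, using the eight-blocks-per-die example and an assumed per-block access width, is shown below.

```python
# Illustrative arithmetic for the concurrency described above. The per-block
# access width is an assumption; the disclosure does not specify line widths.

BLOCKS_PER_DIE = 8
DIES = 3
BITS_PER_BLOCK_ACCESS = 64  # assumed width of one block's local data lines

concurrent_bits = BLOCKS_PER_DIE * BITS_PER_BLOCK_ACCESS  # one die accessed at a time
sequential_total = concurrent_bits * DIES                 # after cycling the die selects

print(concurrent_bits, sequential_total)  # -> 512 1536
```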


The four dies 120-126 of the 3D circuit 100 of FIG. 1 are face-to-back mounted, in that the set of interconnect layers of one die is mounted next to the backside of the semiconductor substrate of the other die. In this architecture, TSVs are used as the z-axis connections to carry signals from one die to another. The 3D circuit of other embodiments uses other techniques for vertically stacking the dies.



FIG. 6 illustrates one such alternative approach. It shows a 3D circuit 600 that, like the 3D circuit 100, has four vertically stacked dies, with the first three being face-to-back mounted. However, unlike the 3D circuit 100, the third and fourth dies 124 and 626 of the 3D circuit 600 are face-to-face stacked. In some embodiments, the die 626 is similar to the die 126 in that it includes the global data lines 145 discussed above. However, the die 626 in some embodiments has contacts that facilitate its face-to-face mounting to the die 124.


In FIG. 6, the sets of interconnect layers of the dies 124 and 626 are facing each other and are bonded to each other through a direct bonding process that establishes direct-contact metal-to-metal bonding, oxide bonding, or fusion bonding between these two sets of interconnect layers. An example of such bonding is copper-to-copper (Cu—Cu) metallic bonding between two copper conductors in direct contact. In some embodiments, the direct bonding is provided by a hybrid bonding technique such as DBI® (direct bond interconnect) technology, or other metal bonding techniques (such as those offered by Invensas Bonding Technologies, Inc., an Xperi Corporation company, San Jose, CA). In some embodiments, DBI connections span across silicon oxide and silicon nitride surfaces. The DBI process is further described in U.S. Pat. Nos. 6,962,835 and 7,485,968, both of which are incorporated herein by reference. This process is also described in U.S. Published Patent Application 2018/0102251, which is also incorporated herein by reference.


When the third and fourth dies 124 and 626 are face-to-face bonded, the back side of the fourth die 626 can be used to connect to a ball grid array, which is then used to mount the 3D circuit 600 on a board. Instead of just face-to-face mounting the two dies 124 and 626, other embodiments face-to-face mount two pairs of dies (e.g., dies 120 and 122 and dies 124 and 626) and then back-to-back mount one die from each of these pairs (e.g., dies 122 and 124). Back-to-back stacked dies have the backside of the semiconductor substrate of one die mounted next to the backside of the semiconductor substrate of the other die.



FIG. 7 illustrates a device 702 that uses a 3D IC 100. As shown, the 3D IC 100 includes a cap 750 that encapsulates the four dies of this IC in a secure housing 725. On the back side of the die 120, one or more TSVs and/or interconnect layers are defined to connect the 3D IC to a ball grid array 720 (e.g., a micro bump array) that allows the 3D IC to be mounted on a printed circuit board 730 of the device 702. The device 702 includes other components (not shown). In some embodiments, examples of such components include one or more memory storages (e.g., semiconductor or disk storages), input/output interface circuit(s), one or more processors, etc.


In some embodiments, the die 120 receives data signals through the ball grid array, and routes the received signals to I/O circuits on this and/or other dies through interconnect lines on the interconnect layer, vias between the interconnect layers, and z-axis connections with the other dies. As mentioned by reference to FIG. 6, other embodiments connect the backside of the substrate of the die 626 to the ball grid array.


While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For instance, several embodiments were described above where the data from or to an I/O circuit is written to or read from memory blocks in parallel or concurrently. Other embodiments, however, have data that is read from a first memory block in an IC die written to a second memory block (e.g., a second memory block stacked with the first memory block or offset from the first memory block) through one z-axis connection, or through one set of z-axis connections, a set of global data lines and then another set of z-axis connections. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

Claims
  • 1. A device comprising: an integrated circuit (IC) die stack comprising a first die and a plurality of second die, wherein: the first die comprises global data lines, global input/output (I/O) circuits, and processing cores; each of the plurality of second die comprises: memory blocks, local data lines connected to the memory blocks, an addressing circuit, and a local input/output (I/O) circuit; an active side of the first die is directly bonded to a topmost second die of the plurality of second die; the local data lines of the topmost second die of the plurality of second die are communicatively coupled to the global data lines of the first die; the processing cores are capable of receiving, through the global I/O circuits, data from the memory blocks and providing, through the global I/O circuits, data to the memory blocks; each addressing circuit is capable of activating different memory blocks of the respective second die; and each local input/output (I/O) circuit is capable of writing/reading data to the respective activated memory blocks.
  • 2. The device of claim 1, wherein the global data lines of the first die vertically overlap the local data lines of the topmost second die of the plurality of second die.
  • 3. The device of claim 1, wherein the first die further comprises a multiplexer between the I/O circuits and the processing cores, and the multiplexer is capable of connecting different subsets of the global data lines with the processing cores at different times.
  • 4. The device of claim 1, wherein the active side of the first die is hybrid bonded to the topmost second die of the plurality of second die.
  • 5. The device of claim 1, wherein local data lines of each of the plurality of second die vertically overlap with respective local data lines of one or more adjacent second die.
  • 6. The device of claim 5, wherein the vertically overlapping local data lines are connected through hybrid bonds formed between adjacent ones of the second die.
  • 7. The device of claim 1, wherein memory blocks of each of the plurality of second die vertically overlap with respective memory blocks of one or more adjacent second die.
  • 8. The device of claim 7, wherein each global data line is connected to a respective plurality of overlapping memory blocks.
  • 9. The device of claim 7, wherein: local data lines of each of the plurality of second die vertically overlap with respective local data lines of one or more adjacent second die;the vertically overlapping local data lines are connected through hybrid bonds formed between adjacent ones of the second die; andeach global data line is connected to a respective plurality of overlapping memory blocks through a corresponding plurality of overlapping local data lines.
  • 10. The device of claim 1, wherein each of the plurality of second die comprises pass gate controls capable of connecting each local data line to a respective global data line.
  • 11. The device of claim 1, wherein each of the memory blocks is connected to complementary local data lines.
  • 12. The device of claim 11, wherein the complementary local data lines are connected to a differential sense amplifier circuit.
  • 13. The device of claim 11, wherein the complementary local data lines are connected to a respective global data line through a pass gate control.
  • 14. The device of claim 1, wherein at least some of the local data lines are connected to respective global data lines by TSVs disposed through one or more of the second die.
  • 15. The device of claim 14, wherein one or more of the second die have a TSV density of more than 100,000,000 per square centimeter.
  • 16. The device of claim 1, wherein the IC die stack is capable of reading from and/or writing to memory blocks on each respective second die in parallel.
  • 17. The device of claim 1, wherein the global data lines provide parallel read/write access to respective memory blocks on each second die.
  • 18. The device of claim 1, wherein: the addressing circuits of each second die are capable of activating different addressed locations in the memory blocks; the first die further comprises input/output (I/O) circuits capable of reading data from and writing data to the memory blocks of each second die; and the respective memory blocks and addressing circuits of the second die and I/O circuits of the first die connected thereto collectively form a plurality of memory circuits.
  • 19. The device of claim 1, wherein each memory block comprises logical storage cells, and each logical storage cell is connected to a bit line and a word line.
  • 20. A device comprising: an integrated circuit (IC) die stack comprising a first die and a plurality of second die, wherein: the first die comprises global data lines, global input/output (I/O) circuits, and processing cores; each of the plurality of second die comprises: memory blocks, local data lines connected to the memory blocks, an addressing circuit, and a local input/output (I/O) circuit; an active side of the first die is directly bonded to a topmost second die; the local data lines of the topmost second die are communicatively coupled to the global data lines of the first die; the processing cores are capable of receiving, through the global I/O circuits, data from the memory blocks and providing, through the global I/O circuits, data to the memory blocks; each addressing circuit is capable of activating different memory blocks of the respective second die; each local input/output (I/O) circuit is capable of writing/reading data to the respective activated memory blocks; and wherein the processing cores implement machine-trained nodes of a machine trained network.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 17/098,299, filed Nov. 13, 2020, which claims the benefit of U.S. Provisional Application No. 62/937,749, filed Nov. 19, 2019, the disclosures of which are hereby incorporated by reference herein in their entireties.

US Referenced Citations (211)
Number Name Date Kind
5016138 Woodman May 1991 A
5376825 Tsukamoto et al. Dec 1994 A
5579207 Hayden et al. Nov 1996 A
5621863 Boulet et al. Apr 1997 A
5673478 Beene et al. Oct 1997 A
5717832 Steimle et al. Feb 1998 A
5740326 Boulet et al. Apr 1998 A
5793115 Zavracky et al. Aug 1998 A
5909587 Tran Jun 1999 A
6137746 Kengeri Oct 2000 A
6320255 Terrill et al. Nov 2001 B1
6421654 Gordon Jul 2002 B1
6707124 Wachtler et al. Mar 2004 B2
6844624 Kiritani Jan 2005 B1
6891447 Song May 2005 B2
6909194 Farnworth et al. Jun 2005 B2
6917219 New Jul 2005 B2
6962835 Tong et al. Nov 2005 B2
7046522 Sung et al. May 2006 B2
7099215 Rotenberg et al. Aug 2006 B1
7124250 Kyung Oct 2006 B2
7202566 Liaw Apr 2007 B2
7485968 Enquist et al. Feb 2009 B2
7638869 Irsigler et al. Dec 2009 B2
7692946 Taufique et al. Apr 2010 B2
7863918 Jenkins et al. Jan 2011 B2
8032711 Black et al. Oct 2011 B2
8042082 Solomon Oct 2011 B2
8064739 Binkert Nov 2011 B2
8110899 Reed et al. Feb 2012 B2
8148814 Furuta et al. Apr 2012 B2
8338224 Yoon Dec 2012 B2
8432467 Jaworski et al. Apr 2013 B2
8516409 Coteus et al. Aug 2013 B2
8546955 Wu Oct 2013 B1
8547769 Saraswat et al. Oct 2013 B2
8704384 Wu et al. Apr 2014 B2
8736068 Bartley et al. May 2014 B2
8797818 Jeddeloh Aug 2014 B2
8816506 Kawashita et al. Aug 2014 B2
8860199 Black et al. Oct 2014 B2
8901749 Kim et al. Dec 2014 B2
8907439 Kay et al. Dec 2014 B1
8930647 Smith Jan 2015 B1
8947931 D'Abreu Feb 2015 B1
9067272 Sutanto et al. Jun 2015 B2
9076700 Kawashita et al. Jul 2015 B2
9142262 Ware Sep 2015 B2
9300298 Cordero et al. Mar 2016 B2
9318418 Kawashita et al. Apr 2016 B2
9432298 Smith Aug 2016 B1
9478496 Lin Oct 2016 B1
9497854 Giuliano Nov 2016 B2
9501603 Barowski et al. Nov 2016 B2
9508607 Chua-Eoan et al. Nov 2016 B2
9613689 Takaki Apr 2017 B1
9640233 Sohn May 2017 B2
9645603 Chall et al. May 2017 B1
9647187 Yap et al. May 2017 B1
9691739 Kawashita et al. Jun 2017 B2
9726691 Garibay et al. Aug 2017 B2
9746517 Whetsel Aug 2017 B2
9853053 Lupino et al. Dec 2017 B2
9915978 Dabby et al. Mar 2018 B2
10121743 Kamal et al. Nov 2018 B2
10255969 Eom et al. Apr 2019 B2
10262911 Gong et al. Apr 2019 B1
10269394 Kim et al. Apr 2019 B2
10269586 Chou et al. Apr 2019 B2
10289604 Sankaralingam et al. May 2019 B2
10347354 Zimmerman Jul 2019 B2
10373657 Kondo et al. Aug 2019 B2
10446207 Kim et al. Oct 2019 B2
10446601 Otake et al. Oct 2019 B2
10468379 Liu Nov 2019 B1
10490281 Park et al. Nov 2019 B2
10580735 Mohammed et al. Mar 2020 B2
10580757 Nequist et al. Mar 2020 B2
10580817 Otake et al. Mar 2020 B2
10586786 Delacruz et al. Mar 2020 B2
10593667 Delacruz et al. Mar 2020 B2
10600691 Delacruz et al. Mar 2020 B2
10600735 Delacruz et al. Mar 2020 B2
10600780 Delacruz et al. Mar 2020 B2
10607136 Teig et al. Mar 2020 B2
10672663 Delacruz et al. Jun 2020 B2
10672743 Teig et al. Jun 2020 B2
10672744 Teig et al. Jun 2020 B2
10672745 Teig et al. Jun 2020 B2
10719762 Teig et al. Jul 2020 B2
10762420 Teig et al. Sep 2020 B2
11217516 Muthukumar Jan 2022 B2
11397687 Malladi Jul 2022 B2
11599299 Delacruz et al. Mar 2023 B2
20010017418 Noguchi et al. Aug 2001 A1
20030227795 Seyyedy et al. Dec 2003 A1
20050127490 Black et al. Jun 2005 A1
20060036559 Nugent Feb 2006 A1
20070220207 Black Sep 2007 A1
20080080261 Shaeffer et al. Apr 2008 A1
20090040861 Ruckerbauer Feb 2009 A1
20090070727 Solomon Mar 2009 A1
20090103345 McLaren Apr 2009 A1
20090255705 Pratt Oct 2009 A1
20100085825 Keeth et al. Apr 2010 A1
20100140750 Toms Jun 2010 A1
20100195364 Riho Aug 2010 A1
20100315887 Park Dec 2010 A1
20110121433 Kim May 2011 A1
20110184688 Jetake et al. Jul 2011 A1
20110208906 Gillingham Aug 2011 A1
20110215394 Komori Sep 2011 A1
20120201068 Ware Aug 2012 A1
20120242346 Wang et al. Sep 2012 A1
20120243355 Shin et al. Sep 2012 A1
20120250286 Chi et al. Oct 2012 A1
20120327728 Anderson Dec 2012 A1
20130032950 Perego et al. Feb 2013 A1
20130051116 En et al. Feb 2013 A1
20130070507 Yoon Mar 2013 A1
20130144542 Ernst et al. Jun 2013 A1
20130187292 Semmelmeyer et al. Jul 2013 A1
20130207268 Chapelon Aug 2013 A1
20130242500 Liu et al. Sep 2013 A1
20130275798 Kondo et al. Oct 2013 A1
20130275823 Cordero et al. Oct 2013 A1
20130320567 Thacker Dec 2013 A1
20140022002 Chua-Eoan et al. Jan 2014 A1
20140040698 Loh et al. Feb 2014 A1
20140133246 Kumar et al. May 2014 A1
20140181417 Loh Jun 2014 A1
20140189257 Aritome Jul 2014 A1
20140323046 Asai et al. Oct 2014 A1
20150016172 Loh et al. Jan 2015 A1
20150121052 Emma et al. Apr 2015 A1
20150199126 Jayasena Jul 2015 A1
20150213860 Narui et al. Jul 2015 A1
20150228584 Huang et al. Aug 2015 A1
20150355763 Miyake et al. Dec 2015 A1
20160028395 Bains Jan 2016 A1
20160071556 Venkata Mar 2016 A1
20160111386 England et al. Apr 2016 A1
20160181214 Oh et al. Jun 2016 A1
20160225431 Best et al. Aug 2016 A1
20160233134 Lim et al. Aug 2016 A1
20160329312 O'Mullan et al. Nov 2016 A1
20160379115 Burger et al. Dec 2016 A1
20170092615 Oyamada Mar 2017 A1
20170092616 Su et al. Mar 2017 A1
20170139635 Jayasena May 2017 A1
20170148737 Fasano et al. May 2017 A1
20170154655 Seo Jun 2017 A1
20170194038 Jeong et al. Jul 2017 A1
20170213787 Alfano et al. Jul 2017 A1
20170278213 Eckert et al. Sep 2017 A1
20170278789 Chuang et al. Sep 2017 A1
20170285584 Nakagawa et al. Oct 2017 A1
20170301625 Mahajan et al. Oct 2017 A1
20180005697 Park et al. Jan 2018 A1
20180017614 Leedy Jan 2018 A1
20180033466 Lee Feb 2018 A1
20180047432 Kondo et al. Feb 2018 A1
20180286800 Kamal et al. Oct 2018 A1
20180330992 Delacruz et al. Nov 2018 A1
20180330993 Delacruz et al. Nov 2018 A1
20180331037 Mohammed et al. Nov 2018 A1
20180331038 Delacruz et al. Nov 2018 A1
20180331072 Nequist et al. Nov 2018 A1
20180331094 Delacruz et al. Nov 2018 A1
20180331095 Delacruz et al. Nov 2018 A1
20180350775 Delacruz et al. Dec 2018 A1
20180366471 Harari Dec 2018 A1
20180373975 Yu et al. Dec 2018 A1
20180374788 Nakagawa et al. Dec 2018 A1
20190042377 Teig et al. Feb 2019 A1
20190042912 Teig et al. Feb 2019 A1
20190042929 Teig et al. Feb 2019 A1
20190043832 Teig et al. Feb 2019 A1
20190051641 Lee et al. Feb 2019 A1
20190088581 Yu Mar 2019 A1
20190096453 Shin Mar 2019 A1
20190109057 Hargan et al. Apr 2019 A1
20190115052 Seong et al. Apr 2019 A1
20190123022 Teig et al. Apr 2019 A1
20190123023 Teig Apr 2019 A1
20190123024 Teig Apr 2019 A1
20190146870 Cha et al. May 2019 A1
20190164914 Hu May 2019 A1
20190196742 Yudanov Jun 2019 A1
20190229089 Zhou Jul 2019 A1
20190244933 Or-Bach et al. Aug 2019 A1
20190268086 Wuu Aug 2019 A1
20190278511 Lee et al. Sep 2019 A1
20190287584 Hollis Sep 2019 A1
20190363098 Lung Nov 2019 A1
20190371391 Cha et al. Dec 2019 A1
20190385981 Chen Dec 2019 A1
20200013699 Liu et al. Jan 2020 A1
20200194052 Shaeffer et al. Jun 2020 A1
20200203318 Nequist et al. Jun 2020 A1
20200219771 Delacruz et al. Jul 2020 A1
20200227389 Teig et al. Jul 2020 A1
20200273798 Mohammed et al. Aug 2020 A1
20200293872 Teig et al. Sep 2020 A1
20200294858 Delacruz et al. Sep 2020 A1
20210050347 Brewer Feb 2021 A1
20210149586 Delacruz et al. May 2021 A1
20220208247 Inuzuka Jun 2022 A1
20220208252 Kim Jun 2022 A1
20220270665 Park Aug 2022 A1
20230146659 Nam May 2023 A1
Foreign Referenced Citations (14)
Number Date Country
105244312 Jun 2018 CN
208608187 Mar 2019 CN
109643700 Sep 2019 CN
111181006 May 2021 CN
102018130035 Apr 2020 DE
3698401 Aug 2020 EP
3698402 Aug 2020 EP
5952923 Jul 2016 JP
153683 Jul 2009 SG
441308 Jun 2014 TW
2017138121 Aug 2017 WO
2019079625 Apr 2019 WO
2019079631 Apr 2019 WO
WO-2019066986 Apr 2019 WO
Non-Patent Literature Citations (19)
Entry
International Search Report and Written Opinion of Commonly Owned International Patent Application PCT/US2018/056565 (XCEL.P0017PCT), mailed Apr. 2, 2019, 17 pages, International Searching Authority (European Patent Office).
International Search Report and Written Opinion of Commonly Owned International Patent Application PCT/US2018/056559 (XCEL.P0047PCT), mailed Mar. 29, 2019, 17 pages, International Searching Authority (European Patent Office).
“Hybrid Memory Cube—HMC Gen2 HMC Memory Features”, Micron Technology, Inc., 2018, pp. 1-105.
Bansal, Samta , “3D-IC is Now Real: Wide-IO is Driving 3D-IC TSV”, Cadence Flash Memory Summit, Cadence Design Systems, Inc., Aug. 2012, 14.
Black, B. , et al., “3D processing technology and its impact on iA32 microprocessors”, In proceedings of the 2004 IEEE International Conference on Computer Design, Oct. 2004, 316-318.
Black, Bryan , et al., “Die Stacking (3D) Microarchitecture”, Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, IEEE, Orlando, Florida, Dec. 9-13, 2006., 11.
Black, Bryan , “Die Stacking is Happening!”, Advanced Micro Devices, Inc., Santa Clara, California, Dec. 9, 2013, 53.
Hajkazemi, Mohammad Hossein, et al., “Wide I/O or LPDDR? Exploration and Analysis of Performance, Power and Temperature Trade-Offs of Emerging DRAM Technologies in Embedded MPSoCs”, Proceedings of 33rd IEEE International Conference on Computer Design (ICCD), IEEE, New York City, New York, Oct. 18-21, 2015, 8.
Kim, Jung-Sik , et al., “A 1.2 V 12.8 GB/s 2 GB Mobile Wide-I/O DRAM With 4x128 I/Os Using TSV Based Stacking”, IEEE Journal of Solid-State Circuits, Jan. 2012, 10 pages, vol. 47, No. 1, IEEE.
Lin, F. , et al., “Memory Interface Design for Hybrid Memory Cube (HMC)”, IEEE, 2016, pp. 1-5.
Loh, Gabriel H., et al., “Processor Design in 3D Die-Stacking Technologies”, IEEE Micro, May-Jun. 2007, 18 pages, vol. 27, Issue 3, IEEE Computer Society.
Nakamoto, Mark , et al., “Simulation Methodology and Flow Integration for 3D IC Stress Management”, 2010 IEEE Custom Integrated Circuits Conference, Sep. 19-22, 2010, 4 pages, IEEE, San Jose, CA, USA.
Patti, Bob , “A Perspective on Manufacturing 2.5/3D”, IEEE 3D System Integration Conference (2013), pp. 1-30.
Pawlowski, J. Thomas, “Hybrid Memory Cube (HMC)”, Micron Technology, Inc., Aug. 4, 2011, pp. 1-24.
Sangki, Hong , “3D Super-Via for Memory Applications”, Micro-Systems Packaging Initiative (MSPI) Packaging Workshop, Jan. 31, 2007, pp. 1-35.
Tran, Kevin , et al., “Start Your HBM/2.5D Design Today”, High-Bandwidth Memory White Paper, Mar. 29, 2016, 6 pages, Amkor Technology, Inc., San Jose, CA, USA.
Unknown , “Fact Sheet: New Intel Architectures and Technologies Target Expanded Market Opportunities”, Intel Corporation, Santa Clara, California, Dec. 2, 2018, 9.
Unknown , “Vector Supercomputer SX Series: SX-Aurora Tsubasa”, NEC Corporation, Oct. 2017, 2.
Wu, Xiaoxia , et al., “Electrical Characterization for Intertier Connections and Timing Analysis for 3-D ICs”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Dec. 6, 2010, 5 pages, IEEE.
Related Publications (1)
Number Date Country
20230376234 A1 Nov 2023 US
Provisional Applications (1)
Number Date Country
62937749 Nov 2019 US
Continuations (1)
Number Date Country
Parent 17098299 Nov 2020 US
Child 18100110 US