The present invention relates to a memory device.
In recent years, the structure of semiconductor devices has changed rapidly, and the storage capacity of semiconductor devices has increased continuously. Memory devices are widely used as the storage components of many products. With increasing applications, memory devices with small dimensions and large memory capacity are desired. To fulfill this requirement, a memory device having high density and small dimensions is needed.
According to some embodiments of the present disclosure, a memory device includes a memory interposer, memory array regions, logic chips, and interconnection lines. The memory array regions are in the memory interposer, in which the memory array regions include at least one memory having NAND architecture. The logic chips are over the memory interposer. The interconnection lines connect the logic chips to each other, and connect the logic chips to the memory array regions.
In some embodiments, the memory array regions further include a volatile memory different from the memory having NAND architecture.
In some embodiments, the volatile memory is a DRAM.
In some embodiments, among the memory array regions, a number of the memory having NAND architecture is greater than a number of the volatile memory.
In some embodiments, the memory device further includes a controller chip over the memory interposer, in which the controller chip is configured to refresh the memory having NAND architecture.
In some embodiments, an endurance of the memory having NAND architecture is in a range from about 10⁶ to about 10¹⁰.
In some embodiments, a retention of the memory having NAND architecture is in a range from 1 second to about 1 year.
In some embodiments, a number of inputs/outputs of the memory having NAND architecture is equal to or greater than 1024.
In some embodiments, each of the logic chips includes about 10⁰ to about 10⁴ cores.
In some embodiments, the memory having NAND architecture includes a bit line, word lines, memory units, and a transistor. The memory units are connected in series, in which the word lines are electrically connected to the memory units, respectively. The transistor connects one of the memory units to the bit line.
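The claimed string topology — memory units in series, each gated by its own word line, with a select transistor tying the string to the bit line — can be pictured with a brief, purely illustrative Python sketch (the class and its names are hypothetical and not part of the disclosed embodiments):

```python
class NandString:
    """Sketch of a NAND string: memory units connected in series, each
    gated by its own word line, with a select transistor connecting one
    end of the string to the bit line."""

    def __init__(self, n_cells):
        self.programmed = [False] * n_cells  # threshold state per memory unit

    def program(self, i):
        self.programmed[i] = True

    def read(self, i):
        # To read cell i, its word line is driven at a read voltage while
        # every other word line is driven at a pass voltage, so all other
        # cells in the series chain conduct regardless of their state.
        # The bit line then sees current only if cell i is erased.
        string_conducts = not self.programmed[i]
        return 0 if string_conducts else 1  # convention: programmed reads as 1

s = NandString(8)
s.program(3)
```

Because the cells sit in series, reading any one cell requires driving every other word line in the string, which is what the pass-voltage step models.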
According to some embodiments of the present disclosure, a memory device includes a first memory chip and a second memory chip stacked over the first memory chip and electrically connected to the first memory chip. The first and second memory chips each include a bit line, word lines, memory units, and a transistor. The memory units are connected in series, in which the word lines are electrically connected to the memory units, respectively. The transistor connects one of the memory units to the bit line.
In some embodiments, the second memory chip is stacked over the first memory chip in a staircase manner.
In some embodiments, the memory device further includes a conductive via in contact with a bottom surface of the second memory chip and electrically connected to the second memory chip.
In some embodiments, the memory device further includes a third memory chip stacked over the second memory chip, in which the third memory chip is electrically connected to the first memory chip via through silicon vias vertically extending through the second memory chip.
In some embodiments, the memory device further includes a dielectric layer, a fan-out metal layer, a conductive via, and a bump. The dielectric layer surrounds the first memory chip and the second memory chip. The fan-out metal layer is in contact with a bottom surface of the second memory chip and is electrically connected to the second memory chip, in which the fan-out metal layer laterally extends from the bottom surface of the second memory chip to the dielectric layer. The conductive via is in the dielectric layer and is in contact with a bottom surface of the fan-out metal layer. The bump is disposed on a bottom surface of the dielectric layer and in contact with the conductive via.
In some embodiments, the memory device further includes a third memory chip electrically connected to the first and second memory chips, in which the third memory chip includes a volatile memory.
In some embodiments, the volatile memory is a DRAM.
In some embodiments, an endurance of the first memory chip is in a range from about 10⁶ to about 10¹⁰.
In some embodiments, a retention of the first memory chip is in a range from 1 second to about 1 year.
In some embodiments, a number of inputs/outputs of the first memory chip is equal to or greater than 1024.
It is to be understood that both the foregoing general description and the following detailed description are by way of example, and are intended to provide further explanation of the invention as claimed.
Aspects of the present disclosure are best understood from the following detailed description when read with the accompanying figures. It is noted that, in accordance with the standard practice in the industry, various features are not drawn to scale. In fact, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion.
The following disclosure provides many different embodiments, or examples, for implementing different features of the provided subject matter. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. For example, the formation of a first feature over or on a second feature in the description that follows may include embodiments in which the first and second features are formed in direct contact, and may also include embodiments in which additional features may be formed between the first and second features, such that the first and second features may not be in direct contact. In addition, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Further, spatially relative terms, such as “beneath,” “below,” “lower,” “above,” “upper” and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. The spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. The apparatus may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein may likewise be interpreted accordingly.
The basic unit of computation in a neural network is a neuron. A neuron receives inputs from other neurons, or from an external source, and computes an output, for example as a sum of products Σᵢ wᵢ·xᵢ passed through an activation function.
In the sum-of-products expression above, each product term is a product of a variable input xᵢ and a weight wᵢ. The weight wᵢ can vary among the terms, corresponding, for example, to coefficients of the variable inputs xᵢ. Similarly, outputs from the other neurons in the hidden layer can also be calculated. The outputs of the two neurons in the hidden layer 110 act as inputs to the output neuron in the output layer 104.
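The neuron computation just described can be sketched in a few lines of Python (all names, weights, and the ReLU activation here are illustrative assumptions, not values from the disclosure):

```python
def neuron_output(inputs, weights, bias=0.0):
    """Compute a neuron's activation: f(sum_i(w_i * x_i) + bias)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    # A simple ReLU stands in for the activation function f.
    return max(0.0, s)

# Two hidden neurons feeding one output neuron, as in a small network.
x = [0.5, -1.0]                        # variable inputs x_i
h1 = neuron_output(x, [0.8, 0.2])      # hidden neuron 1
h2 = neuron_output(x, [-0.4, 0.6])     # hidden neuron 2
y = neuron_output([h1, h2], [1.0, 1.0])  # output neuron
```

Each call evaluates one sum-of-products term by term, mirroring the wᵢ·xᵢ products described above.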
Neural networks can be used to learn patterns that best represent a large set of data. The hidden layers closer to the input layer learn high level generic patterns, and the hidden layers closer to the output layer learn more data-specific patterns. Training is a phase in which a neural network learns from training data. During training, the connections in the synaptic layers are assigned weights based on the results of the training session. Inference is a stage in which a trained neural network processes input data and produces output data based on its predictions.
In the neural network 100 of
The purpose of training the neural network is to improve the learning ability of the network. In greater details, neural network calculates a predicted result of an input via forward calculation, and the predicted result is compared with a standard answer. The difference between the predicted result and the standard answer will be sent back to the neural network via backward propagation. The weights of the neural network will be updated according to the difference. Generally, the forward calculation can be regarded as proceeding sum-of-products, layer by layer, along the +X direction of
Once the training is completed, the trained neural network can be applied to a real situation along the X direction of
After the above operations are completed, the memory data will be changed once or twice. For example, in forward calculation, little of the memory data is changed, while in backward propagation, much of the memory data is changed.
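The forward/backward asymmetry described above can be sketched as a single training step for one linear neuron (an illustrative Python sketch under simplifying assumptions; the learning rate, data, and target are hypothetical):

```python
def train_step(weights, x, target, lr=0.1):
    # Forward calculation: the predicted result is a sum of products.
    pred = sum(w * xi for w, xi in zip(weights, x))
    # Difference between the predicted result and the standard answer.
    error = pred - target
    # Backward propagation: every weight is updated from the difference,
    # so the backward pass rewrites far more memory than the forward pass.
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, error

w = [0.0, 0.0]
for _ in range(50):
    w, err = train_step(w, x=[1.0, 2.0], target=3.0)
```

The forward pass only reads the weights; the update step writes all of them, which is why backward propagation dominates the write traffic.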
When the model width (Y) and the batch size (Z) increase, the parallelism will increase. That is, the amount of read/write is large, and thus more time is needed for processing the data.
Moreover, if the model depth (X) increases, the calculation time will increase, and the data will be stored for longer time.
From yet another aspect, if the model depth (X) and the model width (Y) increase, more memories are needed.
Accordingly, in neural-network calculation, data is stored for a longer time, the latency requirement is relaxed, and more memory is needed. Volatile memories, such as SRAM and DRAM, are commonly used as conventional working memory, because SRAM and DRAM have greater endurance and lower latency. However, SRAM and DRAM have large memory cells, and thus their memory capacity is low, which is not suitable for the calculation of big data and artificial intelligence.
To solve the above issue, the present disclosure provides a volatile memory having NAND architecture, which has greater endurance than conventional non-volatile NAND, and is beneficial for calculation of big data and artificial intelligence.
The memory having NAND architecture also includes word lines WL, which are electrically connected to the memory units 30, respectively. In some embodiments, each word line WL is electrically connected to a gate of a corresponding memory unit 30.
The NAND strings 31, 32 are connected to corresponding bit lines BL-1, BL-2 through respective string select transistors 36, and are connected to a common source line 35 through respective ground select transistors.
In a conventional non-volatile NAND memory, the memory unit is small and thus the memory capacity is large. Furthermore, non-volatile NAND memory generally has high retention, high latency, and poor endurance. Thus, non-volatile NAND memory is commonly used in storage devices, such as solid-state drives (SSDs).
In the present disclosure, the disclosed memory having NAND architecture achieves greater endurance by tuning the thickness or material of the charge trapping material, or by changing the program/erase method. In some embodiments, the endurance of the memory having NAND architecture is in a range from about 10⁶ times to about 10¹⁰ times. In some embodiments, the retention of the memory having NAND architecture is less than the retention of conventional non-volatile NAND memory. For example, the retention of conventional non-volatile NAND memory can be about 10 years, while the retention of the memory having NAND architecture can be from about 1 second to about 1 year. In some embodiments, the disclosed memory having NAND architecture may have a “volatile” property, and thus a refresh mechanism is needed to maintain the data. Thus, the disclosed memory having NAND architecture may also be referred to as volatile NAND memory. In some embodiments, the number of inputs/outputs of the disclosed memory having NAND architecture is equal to or greater than 1024. In some embodiments, the number of inputs/outputs of the disclosed memory having NAND architecture is in a range from about 10³ to about 10⁷. Here, the term “endurance” indicates the number of times that a memory device can perform the program/erase cycle before it fails to read back the proper data. The term “retention” refers to the longest time that stored data can be maintained in a memory cell.
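The trade-off between endurance, retention, and refresh can be illustrated with a rough back-of-the-envelope calculation (all figures below are hypothetical examples, not values specified by the disclosure; the model assumes each refresh consumes one program/erase cycle):

```python
def max_lifetime_years(endurance_cycles, refresh_period_s, retention_s):
    """Rough upper bound on device lifetime when every refresh costs one
    program/erase cycle and data must be refreshed within its retention."""
    if refresh_period_s > retention_s:
        raise ValueError("refresh period must not exceed retention")
    # Total seconds of operation before the endurance budget is exhausted.
    seconds = endurance_cycles * refresh_period_s
    return seconds / (365 * 24 * 3600)

# Example: endurance of 1e8 cycles, refresh every hour, retention of one day.
life = max_lifetime_years(10**8, 3600, 24 * 3600)
```

Under these assumed numbers the endurance budget far outlasts any practical device lifetime, which is why a modest refresh mechanism suffices once endurance reaches the 10⁶–10¹⁰ range.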
Based on the above discussion, the disclosed memory having NAND architecture not only preserves the high-density advantage of conventional non-volatile NAND memory, but also has greater endurance. Although the disclosed memory having NAND architecture may have poor latency, the calculation of big data and artificial intelligence has a relaxed latency requirement, as discussed above. Accordingly, the “volatile” memory having NAND architecture is beneficial for the calculation of big data and artificial intelligence.
The memory device 200 includes a memory interposer 210. Here, the memory interposer may indicate using memory as an interposer, which means that the interposer includes memory. In some embodiments, the memory interposer may include one or more memory chips including independent I/O. In some embodiments, the area of the memory interposer 210 may be about 8.5 cm².
The memory interposer 210 includes several memory array regions M1, M2, and M3. Although in the embodiments of
In some embodiments, the memory array regions of the memory interposer 210 only include the memory having NAND architecture. For example, all of the memory array regions M1, M2, and M3 include the memory having NAND architecture.
In other embodiments, the memory array regions of the memory interposer 210 can be hybrid memory array regions. That is, the memory interposer 210 can include the volatile memory having NAND architecture and other types of volatile memories (such as DRAM or SRAM). For example, parts of the memory array regions M1, M2, and M3 include the volatile memory having NAND architecture, while other parts of the memory array regions M1, M2, and M3 include other types of volatile memories (such as DRAM or SRAM). However, in the memory interposer 210, the number of the volatile memories having NAND architecture is greater than the number of other types of volatile memories. For example, two of the memory array regions M1, M2, and M3 include the volatile memory having NAND architecture, and one of the memory array regions M1, M2, and M3 includes other types of volatile memories (such as DRAM or SRAM).
In some embodiments, the memory having NAND architecture of the memory array regions M1, M2, and M3 can include 2D arrangement as illustrated in
In some embodiments, the retention of the core memory data of each of the memory array regions M1, M2, and M3 is in a range from about 1 second to about 1 year. In some embodiments, the endurance of the memory array regions M1, M2, and M3 can be greater than 10⁶ times. The total inputs/outputs of each of the memory array regions M1, M2, and M3 can be greater than 1024. In some embodiments, the total inputs/outputs of each of the memory array regions M1, M2, and M3 is in a range from about 10³ to about 10⁷.
As mentioned above, because the memory having NAND architecture has a “volatile” property, the memory array regions M1, M2, and M3 can include an integrated refresh controller. In some embodiments, an external refresh controller may be used to refresh the memory array regions M1, M2, and M3.
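A refresh controller for such regions could, in principle, rewrite each region before its retention window expires. The sketch below is purely illustrative (the scheduler, its parameters, and the retention figure are assumptions, not the disclosed controller):

```python
import heapq

def refresh_schedule(regions, retention_s, horizon_s):
    """Yield (time, region) refresh events, refreshing each region one
    retention period after its previous write or refresh."""
    heap = [(retention_s, name) for name in regions]
    heapq.heapify(heap)
    events = []
    while heap:
        due, name = heapq.heappop(heap)
        if due > horizon_s:
            break  # past the simulation horizon
        events.append((due, name))
        # After a refresh, the region is due again one retention later.
        heapq.heappush(heap, (due + retention_s, name))
    return events

# Three regions, a (hypothetical) 100 s retention, simulated for 350 s.
ev = refresh_schedule(["M1", "M2", "M3"], retention_s=100, horizon_s=350)
```

Each region is refreshed three times within the 350 s horizon; an integrated or external controller would implement the same bookkeeping in hardware.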
The memory device 200 includes several logic chips 220 stacked over the memory interposer 210. In the embodiments of
It is understood that, in the era of big data and artificial intelligence, a large number of small cores is commonly used, with parallel calculation and deep learning employed to solve different problems. In some embodiments, each logic chip 220 may include a large number of small cores, for example, about 10⁰ to about 10⁴ cores. For example, the small cores of the logic chips 220 may include GPUs, TPUs, extremely small CPUs, DPUs, APUs, or the like.
The logic chips 220 can be electrically connected to the memory interposer 210. As shown in the embodiments of
The memory interposer 210 may include several interconnection lines, in which the interconnection lines include interconnection lines 240A connecting the logic chips 220 to each other, and interconnection lines 240B connecting the logic chips 220 to the memory array regions M1, M2, and M3. The interconnection lines 240A can be used for communication between the logic chips 220, and the interconnection lines 240B allow the logic chips 220 to access memory data from the memory array regions M1, M2, and M3 at different positions.
In some embodiments, the interconnection lines 240A and 240B include at least one conductive line extending laterally, and several conductive vias vertically extending from the top surface and/or the bottom surface of the lateral conductive line. For example, each interconnection line 240A includes a conductive line extending laterally, and conductive vias that extend upwardly from opposite sides of the lateral conductive line, so as to connect the logic chips 220 over the memory interposer 210 to each other. In some embodiments, the interconnection lines 240A may be electrically connected to the logic chips 220 through bumps 230.
On the other hand, each interconnection line 240B includes a conductive line extending laterally, one conductive via that extends upwardly from one side of the lateral conductive line, and another conductive via that extends downwardly from another side of the lateral conductive line, so as to connect the logic chips 220 down to the memory array regions M1, M2, and M3. In the embodiments of
The memory device 300 includes vertically stacked memory chips 310A, 310B, 310C, and 310D. In the embodiments of
In some embodiments, the memory chips 310A, 310B, 310C, and 310D only include the memory having NAND architecture. For example, all of the memory chips 310A, 310B, 310C, and 310D include the memory having NAND architecture.
In other embodiments, the memory chips 310A, 310B, 310C, and 310D can be hybrid memory chips. That is, the memory chips 310A, 310B, 310C, and 310D can include the volatile memory having NAND architecture and other types of volatile memories (such as DRAM or SRAM). For example, parts of the memory chips 310A, 310B, 310C, and 310D include the volatile memory having NAND architecture, while other parts of the memory chips 310A, 310B, 310C, and 310D include other types of volatile memories (such as DRAM or SRAM). However, in the memory chips 310A, 310B, 310C, and 310D, the number of the volatile memory having NAND architecture is greater than the number of other types of volatile memories.
In some embodiments, each of the memory chips 310A, 310B, 310C, and 310D includes several through silicon vias (TSVs) 320. In some embodiments, the memory chips 310A, 310B, 310C, and 310D can be electrically connected to each other through micro bumps 330. In other embodiments, the memory chips 310A, 310B, 310C, and 310D can be electrically connected to each other through Cu-Cu bonding. Using the through silicon vias (TSVs) 320 is beneficial for minimizing the device size.
The memory device 400 includes vertically stacked memory chips 410A, 410B, 410C, and 410D. In the embodiments of
In some embodiments, the memory chips 410A, 410B, 410C, and 410D only include the memory having NAND architecture. For example, all of the memory chips 410A, 410B, 410C, and 410D include the memory having NAND architecture.
In other embodiments, the memory chips 410A, 410B, 410C, and 410D can be hybrid memory chips. That is, the memory chips 410A, 410B, 410C, and 410D can include the volatile memory having NAND architecture and other types of volatile memories (such as DRAM or SRAM). For example, parts of the memory chips 410A, 410B, 410C, and 410D include the volatile memory having NAND architecture, while other parts of the memory chips 410A, 410B, 410C, and 410D include other types of volatile memories (such as DRAM or SRAM). However, in the memory chips 410A, 410B, 410C, and 410D, the number of the volatile memory having NAND architecture is greater than the number of other types of volatile memories.
In some embodiments, the memory chips 410A and 410B are separated from each other by a dielectric layer 420, the memory chips 410B and 410C are separated from each other by a dielectric layer 420, and the memory chips 410C and 410D are separated from each other by a dielectric layer 420. In some embodiments, the widths of the memory chips 410A, 410B, 410C, and 410D are substantially the same as the widths of the dielectric layers 420.
The memory device 400 includes a dielectric layer 425 that surrounds the memory chips 410A, 410B, 410C, and 410D, and the dielectric layer 420.
The memory device 400 includes fan-out metal layers 430A, 430B, and 430C. In some embodiments, the fan-out metal layer 430A is electrically connected to the memory chip 410B through the bottom surface of the memory chip 410B, and the fan-out metal layer 430A extends laterally from the dielectric layer 420 to the dielectric layer 425. Stated another way, a portion of the fan-out metal layer 430A is in contact with the dielectric layer 420, and another portion of the fan-out metal layer 430A is in contact with the dielectric layer 425. Similarly, the fan-out metal layer 430B is electrically connected to the memory chip 410C through the bottom surface of the memory chip 410C, and the fan-out metal layer 430C is electrically connected to the memory chip 410D through the bottom surface of the memory chip 410D. In some embodiments, the fan-out metal layer 430C extends farther than the fan-out metal layer 430B, and the fan-out metal layer 430B extends farther than the fan-out metal layer 430A.
The memory device 400 includes conductive vias 435A, 435B, 435C, and 435D. The conductive via 435A is in contact with the bottom surface of the memory chip 410A, and is electrically connected to the memory chip 410A. The conductive via 435A extends downwardly from the bottom surface of the memory chip 410A to the bottom surface of the dielectric layer 425. On the other hand, the conductive via 435B is in contact with the portion of the fan-out metal layer 430A extending to the dielectric layer 425, and extends downwardly to the bottom surface of the dielectric layer 425. Similarly, the conductive via 435C is in contact with the portion of the fan-out metal layer 430B extending to the dielectric layer 425, and extends downwardly to the bottom surface of the dielectric layer 425. The conductive via 435D is in contact with the portion of the fan-out metal layer 430C extending to the dielectric layer 425, and extends downwardly to the bottom surface of the dielectric layer 425.
The memory device 400 includes micro bumps 440. In some embodiments, the micro bumps 440 are electrically connected to the conductive vias 435A, 435B, 435C, and 435D, respectively. In some embodiments, the micro bumps 440 can be connected to another substrate (not shown), so as to electrically connect the memory chips 410A, 410B, 410C, and 410D to that substrate.
The memory device 500 includes vertically stacked memory chips 510A, 510B, 510C, and 510D. In the embodiments of
In some embodiments, the memory chips 510A, 510B, 510C, and 510D only include the memory having NAND architecture. For example, all of the memory chips 510A, 510B, 510C, and 510D include the memory having NAND architecture.
In other embodiments, the memory chips 510A, 510B, 510C, and 510D can be hybrid memory chips. That is, the memory chips 510A, 510B, 510C, and 510D can include the volatile memory having NAND architecture and other types of volatile memories (such as DRAM or SRAM). For example, parts of the memory chips 510A, 510B, 510C, and 510D include the volatile memory having NAND architecture, while other parts of the memory chips 510A, 510B, 510C, and 510D include other types of volatile memories (such as DRAM or SRAM). However, in the memory chips 510A, 510B, 510C, and 510D, the number of the volatile memory having NAND architecture is greater than the number of other types of volatile memories.
In some embodiments, the memory chips 510A, 510B, 510C, and 510D are stacked in a staircase manner. For example, one side of the memory chip 510B extends beyond one side of the memory chip 510A, one side of the memory chip 510C extends beyond one side of the memory chip 510B, and one side of the memory chip 510D extends beyond one side of the memory chip 510C.
The memory device 500 includes dielectric layers 520A, 520B, and 520C. In some embodiments, the memory chips 510A and 510B are separated from each other by the dielectric layer 520A, the memory chips 510B and 510C are separated from each other by the dielectric layer 520B, and the memory chips 510C and 510D are separated from each other by the dielectric layer 520C. In some embodiments, the dielectric layer 520A substantially covers the top surface of the memory chip 510A, and has substantially the same width as the memory chip 510A. Similarly, the dielectric layer 520B substantially covers the top surface of the memory chip 510B, and has substantially the same width as the memory chip 510B. The dielectric layer 520C substantially covers the top surface of the memory chip 510C, and has substantially the same width as the memory chip 510C.
The memory device 500 includes a dielectric layer 525 that surrounds the memory chips 510A, 510B, 510C, and 510D, and the dielectric layers 520A, 520B, and 520C. In some embodiments, the bottom surface of the dielectric layer 525 is substantially level with the bottom surface of the memory chip 510A.
The memory device 500 includes conductive vias 535A, 535B, and 535C. The conductive via 535A is in contact with the bottom surface of the memory chip 510B, and is electrically connected to the memory chip 510B. The conductive via 535A extends downwardly from the bottom surface of the memory chip 510B to the bottom surface of the dielectric layer 525. Similarly, the conductive via 535B is in contact with the bottom surface of the memory chip 510C, and is electrically connected to the memory chip 510C. The conductive via 535B extends downwardly from the bottom surface of the memory chip 510C to the bottom surface of the dielectric layer 525. The conductive via 535C is in contact with the bottom surface of the memory chip 510D, and is electrically connected to the memory chip 510D. The conductive via 535C extends downwardly from the bottom surface of the memory chip 510D to the bottom surface of the dielectric layer 525.
The memory device 500 includes micro bumps 540. In some embodiments, some of the micro bumps 540 are in contact with the bottom surface of the memory chip 510A and are electrically connected to the memory chip 510A. On the other hand, other micro bumps 540 are electrically connected to the conductive vias 535A, 535B, and 535C. In some embodiments, the micro bumps 540 can be connected to another substrate (not shown), so as to electrically connect the memory chips 510A, 510B, 510C, and 510D to that substrate.
The memory interposer 210 of the memory device 600 may include several interconnection lines, in which the interconnection lines include interconnection lines 242A connecting the logic chips 220 to the switching matrix chip 222 (or the memory controller chip 224), and interconnection lines 242B connecting the switching matrix chip 222 (or the memory controller chip 224) to the memory array regions M1, M2, and M3. The switching matrix chip 222 (or the memory controller chip 224) is electrically connected to the logic chips 220 through the interconnection lines 242A, so as to operate and switch the logic chips 220, such that the logic chips 220 can communicate with the memory array regions M1, M2, and M3 at different positions.
In some embodiments, the interconnection lines 242A are similar to the interconnection line 240A discussed in
According to the aforementioned embodiments, it can be seen that the present disclosure offers advantages in fabricating integrated circuits. It is understood, however, that other embodiments may offer additional advantages, that not all advantages are necessarily disclosed herein, and that no particular advantage is required for all embodiments. In the calculation of big data and artificial intelligence, parallel calculation and deep learning are commonly used to solve different problems. Thus, a large and deep structure needs large memories. Data will be stored for a longer time, and the read/write requirement is reduced. One advantage of the disclosure is that, with a volatile memory having NAND architecture used in the calculation of big data and artificial intelligence, the memory density can be increased, the total inputs/outputs can be increased, and the device performance can be further improved.