Reconfigurable AI system

Information

  • Patent Grant
  • Patent Number
    12,299,597
  • Date Filed
    Friday, August 27, 2021
  • Date Issued
    Tuesday, May 13, 2025
Abstract
A system in package platform includes a processor chip having a runtime processor core, an accelerator core and a processor-memory interface exposed on a chip-to-chip bonding surface, a first memory chip such as 3D NAND flash memory storing a collection of executable models of inference engines, and a second memory chip storing weights of a selected executable model. The second memory chip can comprise a nonvolatile, random access memory, such as phase change memory. Direct vertical connections, such as via-to-via connections, are provided between the processor chip and the second memory chip.
Description
BACKGROUND
Field

The present invention relates to computation platforms for performing inference operations using artificial intelligence models and models generated using machine learning, and more particularly to such platforms suitable for use in edge devices.


Description of Related Art

Systems executing computation models that are developed using machine learning, including artificial intelligence models, involve executing large numbers of arithmetic operations across input arrays using large arrays of coefficients. The coefficients are often referred to as weights. In a platform executing these models, off-chip memory access can be a limiting power and performance issue. Because of the size of the arrays of coefficients used in these models, on-chip memory can be insufficient, particularly in systems in which it is desirable to utilize more than one model.


It is desirable to provide a platform for performing inference operations that addresses these issues.


SUMMARY

A reconfigurable inference platform is described, suitable for implementation using a system in package (“SiP”) configuration. A platform as described herein can comprise a processor chip, a first memory chip suitable for storing collections of executable models, and a second memory chip suitable for storing arrays of weights. The platform can be implemented as a multichip module in a single package. The package can be mounted on a circuit board or other type of substrate, and connected to sensors and other components that can generate data consumed by the execution of the models, and consume data generated by execution of the models.


A processor chip in implementations of the platform can include a runtime processor core, an accelerator core and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip. A first memory chip in implementations of the platform can include a nonvolatile, high capacity memory, such as 3D NAND flash memory. The first memory chip can store a collection of executable models of inference engines, where each executable model includes a set of weights to be applied in execution of the model, and in some cases a computation graph for the inference engine. A second memory chip can store at least the set of weights of a selected executable model. The second memory chip can comprise a nonvolatile, random access memory, such as phase change memory. The second memory chip can include a memory-processor interface exposed on a surface of the second memory chip, and complementary to the processor-memory interface on the processor chip. Direct vertical connections, such as via-to-via connections, between the processor-memory interface and the memory-processor interface are provided, which enable low power, high throughput, and low latency transfer of data between the chips in support of execution of the selected model.
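
For illustration only, and not as part of the patented hardware disclosure, the composition described above can be pictured as a simple data model. The following minimal sketch is in Python; the names ExecutableModel and Platform are hypothetical, introduced here only to make the roles of the two memory chips concrete.

    # Hypothetical sketch of the platform composition described above; all names
    # are illustrative, since the disclosure defines hardware rather than software.
    from dataclasses import dataclass

    @dataclass
    class ExecutableModel:
        # One entry in the collection held by the first memory chip (3D NAND).
        name: str
        weights: bytes            # set of weights applied in execution of the model
        computation_graph: bytes  # compiled graph used to configure the accelerator core

    @dataclass
    class Platform:
        model_collection: list    # ExecutableModel entries, resident in the first memory chip
        weight_memory_size: int   # capacity of the second memory chip (e.g. phase change memory)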


In an example described herein, the processor chip and the second memory chip are stacked and disposed on an interposer. The first memory chip is also disposed on the interposer, which includes interconnection wiring forming at least part of a data path between the first memory chip and the second memory chip. The processor chip can include an input/output interface in addition to the processor-memory interface, and the data path can include a connection from the interconnection wiring of the interposer to the input/output interface of the processor chip.


In an example described herein, the processor chip has access to instruction memory, which can be included on the processor chip or accessible in off-chip storage, storing instructions to perform a runtime procedure. The runtime procedure can include selecting an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core on the processor chip, transferring the set of weights of the selected model to the second memory chip, and executing the selected model. Also, the runtime procedure can include changing the model in response to a control event in the field. Thus, the runtime procedure can include changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
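
As a concrete illustration of this runtime procedure, the following Python sketch walks through the select/load/transfer/execute sequence. It is a minimal sketch assuming hypothetical driver objects (accelerator, weight_memory) and the Platform/ExecutableModel names introduced above; it is not an actual software interface of the platform.

    # Minimal sketch of the runtime procedure; accelerator and weight_memory are
    # hypothetical driver objects standing in for the hardware described above.
    def load_and_run(platform, model_name, accelerator, weight_memory):
        # Select an executable model from the collection in the first memory chip.
        model = next(m for m in platform.model_collection if m.name == model_name)
        # Load the computation graph for the selected model, configuring the
        # accelerator core on the processor chip.
        accelerator.configure(model.computation_graph)
        # Transfer the set of weights of the selected model to the second memory chip.
        weight_memory.write(model.weights)
        # Execute the selected model; during execution the weights are fetched over
        # the direct vertical connections between the chips.
        return accelerator.execute()

Changing the model in response to a control event repeats the same sequence with a different entry from the collection.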


An example of a reconfigurable inference method is described, comprising: providing a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; storing a collection of executable models of an inference engine for a model implemented by machine learning in a first memory chip accessible by the processor chip, each model including a set of weights to be applied in execution of the model; selecting, in response to a control event, an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, and transferring the set of weights of the selected executable model from the first memory chip to a second memory chip, the second memory chip including a memory-processor interface disposed on a surface of the second memory chip and complementary to the processor-memory interface; and executing the selected executable model using direct vertical connections between the processor-memory interface and the memory-processor interface.


Other aspects and advantages of the present invention can be seen on review of the drawings, the detailed description and the claims, which follow.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is an illustration of a multichip module including an inference platform as described herein.



FIG. 2 is an illustration of another embodiment of a multichip module including an inference platform as described herein.



FIG. 3 is an illustration of yet another embodiment of a multichip module including an inference platform as described herein.



FIG. 4 is a simplified functional block diagram of an inference platform as described herein.



FIG. 5 is a flowchart of a runtime procedure which can be executed by an inference platform as described herein.





DETAILED DESCRIPTION

A detailed description of embodiments of the present technology is provided with reference to FIGS. 1-5.



FIG. 1 illustrates a reconfigurable inference platform that includes a processor chip 101, a first memory chip 103 (model collection), and a second memory chip 102 (weight memory). In this example, the processor chip 101 and the second memory chip 102 are stacked, and the combination of the processor chip 101 stacked with the second memory chip 102, together with the first memory chip 103, is mounted on an interposer 110. The assembly is configured as a multichip module 120 in a single package.


The processor chip 101 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 101 includes a chip-to-chip bonding surface on which a processor-memory interface 131 is exposed for connection to the second memory chip 102. The second memory chip includes a memory-processor interface 132 exposed on a surface of the second memory chip, and complementary to the processor-memory interface 131 on the processor chip 101. In this example, direct vertical connections are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 101 includes an input/output interface 113 disposed on the surface of the chip 101. The input/output interface 113 is connected by vertical connectors, such as through-silicon via (TSV) connections, to interconnection wiring 111 on the interposer 110.


The first memory chip 103 includes an interface 112 for connection to the interconnection wiring 111 on the interposer 110.


Thus, interconnection wiring 111 provides part of the data path between the first memory chip and the second memory chip through the processor chip 101.


In the example illustrated in FIG. 1, the processor chip 101 includes another input/output interface 122 for connection to external contact structures 121 of the multichip module 120.



FIG. 2 illustrates another configuration of an inference engine as described herein. This configuration includes a processor chip 201, a first memory chip 203 (model collection), and a second memory chip 202 (weight memory). In this example, the processor chip 201 and the second memory chip 202 are stacked, and the combination of the processor chip 201 stacked with the second memory chip 202, together with the first memory chip 203, is mounted on an interposer 210. The assembly is configured as a multichip module 220 in a single package.


The processor chip 201 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 201 includes a chip-to-chip bonding surface on which a processor-memory interface 231 is exposed for connection to the second memory chip 202. The second memory chip includes a memory-processor interface 232 exposed on a surface of the second memory chip, and complementary to the processor-memory interface 231 on the processor chip 201. In this example, direct vertical connections at the surfaces are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise very short length copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 201 includes an input/output interface 213 disposed on the surface of the chip 201. The input/output interface 213 is connected by vertical connectors, such as through-silicon via (TSV) connections, to interconnection wiring 211 on the interposer 210.


Also, the second memory chip 202 includes an input/output interface 241 exposed on the surface opposite the processor chip 201, which connects to a complementary interface 240 on the interposer 210, for connection to the interconnection wiring 211 of the interposer 210.


The first memory chip 203 includes an interface 212 for connection to the interconnection wiring 211 on the interposer 210.


Thus, the interconnection wiring 211 of the interposer provides part of a data path between the first memory chip and the second memory chip, as an alternative to a data path through the processor chip 201.


In the example illustrated in FIG. 2, the processor chip 201 includes another input/output interface 222 for connection to external contact structures 221 of the multichip module 220.



FIG. 3 illustrates another configuration of an inference engine as described herein. This configuration includes a processor chip 302, a first memory chip 303 (model collection), and a second memory chip 301 (weight memory). In this example, the processor chip 302 and the second memory chip 301 are stacked, and the combination of the processor chip 302 stacked with the second memory chip 301, together with the first memory chip 303, is mounted on an interposer 310. In this example, the processor chip 302 is between the second memory chip 301 and the interposer 310. The assembly is configured as a multichip module 320 in a single package.


The processor chip 302 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 302 includes a chip-to-chip bonding surface (top surface) on which a processor-memory interface 332 is exposed for connection to the second memory chip 301. The second memory chip 301 includes a memory-processor interface 331 exposed on a surface of the second memory chip (bottom surface), and complementary to the processor-memory interface 332 on the processor chip 302. In this example, direct vertical connections are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 302 includes an input/output interface 313 disposed on the bottom surface of the chip 302. The input/output interface 313 is connected to vertical connectors, which connect to interconnection wiring 311 on the interposer 310.


Also, the processor chip 302 includes an input/output interface 361 exposed on the bottom surface opposite the second memory chip 301, which connects to a complementary interface 362 on the interposer 310, for connection to the interconnection wiring 350 of the interposer 310.


The first memory chip 303 includes an interface 312 for connection to the interconnection wiring 311 on the interposer 310.


Thus, the interconnection wiring 311 of the interposer provides part of the data path between the first memory chip and the second memory chip, which passes through the interposer interconnection wiring 311 and through the processor chip 302.


The interposer 310 includes an interface 352 for connection of the interconnection wiring 350 of the interposer (which can be connected to or part of the interconnection wiring 311 of the interposer). Wiring connections are provided from the interface 352 to external contact structures 351 of the multichip module 320.


In other embodiments, the interface 352 can be replaced or supplemented by an interface or interfaces on the side or bottom surfaces of the interposer.



FIGS. 1-3 provide example arrangements of a platform as described herein, showing varieties of configurations of the chips and connections among the chips, the interposer and external contacts of the package. Other arrangements can be implemented as suits a particular need.



FIG. 4 is a simplified functional block diagram of a platform implemented as described with reference to FIGS. 1-3. The platform includes a processor chip 401, a first memory chip 403, and a second memory chip 402. The processor chip 401 in this example includes a CPU or processor core 410, an accelerator core 411, on-chip memory 412, such as SRAM, which can be used as working memory and as a cache memory, a first I/O interface 413 and a second I/O interface 414. A bus system 420 provides for intra-chip communications among the components.


The first memory chip 403 in this example comprises a high capacity, nonvolatile memory 440, such as 3D NAND implemented using charge trapping storage technology. The first memory chip 403 includes a first memory I/O interface 441 for off-chip communications. The first memory I/O interface 441 can comprise a high-speed serial port, such as an SPI-compatible port, or a parallel port, depending on the particular implementation of the memory chip utilized. A data path 451 is provided in this example between the first memory I/O interface 441 and the first I/O interface 413 on the processor chip 401.
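
One way to picture the data path 451 is as a block-wise copy of a selected model image out of the first memory chip, through the processor chip's I/O interfaces, toward the weight memory. The Python sketch below is hypothetical: the read/write calls and the block size are assumptions for illustration, not details given in the disclosure.

    # Hypothetical block-wise transfer: first memory chip -> processor chip over
    # data path 451, then processor chip -> second memory chip over the vertical
    # interconnections 450. BLOCK is an illustrative transfer granularity.
    BLOCK = 4096

    def copy_weights(nand, src_addr, pcm, dst_addr, length):
        for offset in range(0, length, BLOCK):
            n = min(BLOCK, length - offset)
            block = nand.read(src_addr + offset, n)  # e.g. over an SPI-compatible port
            pcm.write(dst_addr + offset, block)      # toward the weight memory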


The second memory chip 402, in this example, comprises a high-speed, random-access nonvolatile memory 430, such as a memory implemented using 3D phase change storage technology. In other examples, the second memory chip 402 can comprise NOR flash memory using charge trapping storage technology, or other suitable random-access technologies such as resistive RAM (e.g. metal oxide memory), magnetic RAM, ferroelectric RAM and so on.


The second memory chip 402 includes a memory I/O interface 431 for off-chip communications, directly connected by vertical interconnections 450 to the second I/O interface 414 on the processor chip 401.


DRAM is an option to bond into the SiP in cases in which the on-chip SRAM is not large enough. Thermal (heat) management can be used to guarantee data retention.


An accelerator core (e.g. accelerator core 411), as the term is used herein, is a configurable logic circuit including components designed or suitable for execution of some or all of the arithmetic operations of an inference model. Configuration of the accelerator core can include loading a set of weights used in the inference model, or parts of the set of weights. In some embodiments, configuration of the accelerator core can include loading some or all of the computation graphs of the inference model to define the sequence and architecture of the operation of the inference model. The inference model can comprise a computation graph of a deep learning neural network, in some examples having a plurality of fully connected and partially connected layers, activation functions, normalization functions and so on.
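
By way of example only, a computation graph in this sense can be as simple as an ordered list of operator nodes with per-layer parameters. The sketch below assumes hypothetical accelerator calls (add_node, load_weights); a real compiled graph, such as a bit file for configurable logic, would be an opaque binary image.

    # Illustrative computation graph as an ordered list of operator nodes; the
    # operators and parameters are generic examples, not the patent's format.
    graph = [
        ("conv2d",  {"out_channels": 16, "kernel": 3}),
        ("relu",    {}),
        ("dense",   {"units": 10}),
        ("softmax", {}),
    ]

    def configure(accelerator, graph, weights_by_layer):
        # Configuring the accelerator core: define the graph structure, then load
        # the weights (or parts of the set of weights) for each layer that has them.
        for i, (op, params) in enumerate(graph):
            accelerator.add_node(op, params)          # hypothetical call
            if i in weights_by_layer:
                accelerator.load_weights(i, weights_by_layer[i])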


An accelerator core can be implemented using configurable logic, like arrays of configurable units used in field programmable gate arrays for example, in which compiled computation graphs are configured using bit files. An accelerator core can be implemented using a hybrid of data flow configurable logic and sequential processing configurable logic.


The runtime processor core (e.g. CPU 410) can execute a runtime program to coordinate operation of the accelerator core to accomplish real time inference operations, including data input/output operations, loading computation graphs, moving the set of weights to be applied in the inference operation into and out of the accelerator core, delivering input data to the accelerator core, and performing parts of the computation graphs.



FIG. 5 is a flowchart illustrating an example of logic of a procedure executed by an inference platform, such as platforms described with reference to FIGS. 1-4. The logic can be implemented using computer programs stored in memory, such as the SRAM on-chip memory 412, or other memory accessible by the CPU 410. In this example, the procedure includes downloading a collection of executable artificial intelligence models from an external source, such as a network, and loading the collection in the high capacity NAND flash memory on the platform (501). During runtime, the procedure waits for a control event (502). The control event can include a reset, an expiration of a timer, a message received from a communication network or other external source, data generated by execution of an inference engine in the processor chip itself, or other signals. As long as no control event is detected, the procedure loops.


When the control event is detected, the procedure includes selecting an artificial intelligence model from the collection stored in the NAND flash memory (503). The selected model, or at least a set of weights of the selected model, is then transferred from the NAND flash memory to the weight memory (504). The procedure includes configuring the accelerator core using parameters of the selected model read from the NAND flash memory (505). After loading the weights and configuring the accelerator core, the procedure includes executing an inference procedure using the parameters of the selected model stored in the weight memory, including transferring parameters such as weights using the direct vertical connections between the processor chip 401 and the second memory chip 402 (506).


Thus, the procedure of FIG. 5 includes a procedure to select an executable model from the collection of executable models stored in the first memory chip to load a computation graph for the selected model including configuring the accelerator core, to transfer the set of weights of the selected model to the second memory chip, and to execute the selected model. Also, as shown in FIG. 5, after executing or beginning to execute the selected model, the process loops to step 502, to wait for a next control event. Upon detection of the next control event, the steps 503 to 506 are traversed, and can include changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
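
The control flow of FIG. 5 can be summarized as an event loop. The sketch below is a hedged illustration in Python: the event source, the model-selection policy and the driver objects are all hypothetical stand-ins, and the step numbers in the comments refer to the flowchart steps discussed above.

    # Sketch of the FIG. 5 control flow; events, select, accelerator and
    # weight_memory are hypothetical stand-ins for platform facilities.
    def runtime_loop(platform, accelerator, weight_memory, events, select):
        # Step 501: the collection of executable models is assumed to have been
        # downloaded already into the NAND flash memory (first memory chip).
        while True:
            event = events.wait()            # step 502: wait for a control event
            model = select(platform, event)  # step 503: select a model from the collection
            weight_memory.write(model.weights)              # step 504: transfer weights
            accelerator.configure(model.computation_graph)  # step 505: configure accelerator
            accelerator.execute()            # step 506: execute using the weight memory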


It will be appreciated with reference to FIG. 5, that many of the steps can be combined, performed in parallel or performed in a different sequence without affecting the functions achieved. In some cases, as the reader will appreciate, a rearrangement of steps will achieve the same results only if certain other changes are made as well. In other cases, as the reader will appreciate, a rearrangement of steps will achieve the same results only if certain conditions are satisfied. Furthermore, it will be appreciated that the flow charts herein show only steps that are pertinent to an understanding of the invention, and it will be understood that numerous additional steps for accomplishing other functions can be performed before, after and between those shown.


An SiP platform is described in which one or more 3D NAND chips store a collection including multiple different AI models (computation graphs and weights), one or more weight memory chips store the weights of a selected AI model, and a processor chip, which can be a special purpose AI logic chip (CPU + AI accelerator), is included with the memory system to execute the selected AI model using the parameters (e.g. weights) and hyperparameters (e.g. neural network computation graphs or architectural details, such as layers, normalization functions, activation functions, etc.) needed by the CPU/NPU.


Inter-chip bonding between the AI logic chip and the weight memory chip can be via-to-via copper bonding or other 3D or 2.5D bonding technologies.


While the present invention is disclosed by reference to the preferred embodiments and examples detailed above, it is to be understood that these examples are intended in an illustrative rather than in a limiting sense. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention and the scope of the following claims.

Claims
  • 1. A reconfigurable inference platform, comprising: a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; a first memory chip accessible by the processor chip to store a collection of executable models of an inference engine, each model including a set of weights to be applied in execution of the model; a second memory chip to store the set of weights of a selected executable model, the second memory chip including a memory-processor interface exposed on a surface of the second memory chip and complementary to the processor-memory interface; and direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein the processor chip includes a second input/output interface, the data path including a connection from the interconnection wiring of the interposer to the second input/output interface on the processor chip.
  • 2. The platform of claim 1, wherein the direct vertical connections comprise via-to-via connections.
  • 3. The platform of claim 1, wherein the processor core has access to instruction memory, storing executable instructions to perform a procedure including: selecting an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, transferring the set of weights of the selected model to the second memory chip, and executing the selected model.
  • 4. The platform of claim 1, wherein the processor core has access to instruction memory, storing executable instructions to perform a procedure in response to a control event, including changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
  • 5. The platform of claim 1, wherein the interposer is below the second memory chip, and the processor chip is disposed above the second memory chip.
  • 6. The platform of claim 1, wherein the interposer is below the processor chip and the second memory chip is disposed above the processor chip.
  • 7. The platform of claim 1, wherein the first memory chip comprises a charge trapping, NAND-architecture memory, and the second memory chip comprises nonvolatile random access memory.
  • 8. The platform of claim 7, wherein the nonvolatile random access memory is phase change memory.
  • 9. The platform of claim 7, wherein the nonvolatile random access memory is a charge trapping, NOR-architecture memory.
  • 10. The platform of claim 1, wherein the processor chip, first memory chip and second memory chip are disposed in a multichip package.
  • 11. A reconfigurable inference method, comprising: providing a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; storing a collection of executable models of an inference engine for a model implemented by machine learning in a first memory chip accessible by the processor chip, each model including a set of weights to be applied in execution of the model; selecting in response to a control event an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, and transferring the set of weights of the selected executable model from the first memory chip to a second memory chip, the second memory chip including a memory-processor interface disposed on a surface of the second memory chip and complementary to the processor-memory interface; and executing the selected executable model using direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein the processor chip includes a second processor-memory interface, and including transferring data from the first memory chip to the processor chip on a data path including a connection from the interconnection wiring of the interposer to the second processor-memory interface on the processor chip.
  • 12. The method of claim 11, wherein the direct vertical connections comprise via-to-via connections.
  • 13. The method of claim 11, including changing, in response to a second control event, the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
  • 14. The method of claim 11, wherein the first memory chip comprises a charge trapping, NAND-architecture memory, and the second memory chip comprises nonvolatile random access memory.
  • 15. The method of claim 14, wherein the nonvolatile random access memory is phase change memory.
  • 16. A reconfigurable inference platform, comprising: a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; a first memory chip accessible by the processor chip to store a collection of executable models of an inference engine, each model including a set of weights to be applied in execution of the model; a second memory chip to store the set of weights of a selected executable model, the second memory chip including a memory-processor interface exposed on a surface of the second memory chip and complementary to the processor-memory interface; and direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein (i) the interposer is below the second memory chip and the processor chip, and (ii) the processor chip is disposed above the second memory chip or the second memory chip is disposed above the processor chip.
Related Publications (1)
Number Date Country
20230067190 A1 Mar 2023 US