Reconfigurable AI system

Information

  • Patent Grant
  • Patent Number
    12,299,597
  • Date Filed
    Friday, August 27, 2021
  • Date Issued
    Tuesday, May 13, 2025
Abstract
A system in package platform includes a processor chip having a runtime processor core, an accelerator core and a processor-memory interface exposed on a chip-to-chip bonding surface, a first memory chip such as 3D NAND flash memory storing a collection of executable models of inference engines, and a second memory chip storing weights of a selected executable model. The second memory chip can comprise a nonvolatile, random access memory, such as phase change memory. Direct vertical connections, such as via-to-via connections, are provided between the processor chip and the second memory chip.
Description
BACKGROUND
Field

The present invention relates to computation platforms for performing inference operations using artificial intelligence models and models generated using machine learning, and more particularly to such platforms suitable for use in edge devices.


Description of Related Art

Systems executing computation models that are developed using machine learning, including artificial intelligence models, involve executing large numbers of arithmetic operations across input arrays using large arrays of coefficients. The coefficients are often referred to as weights. In a platform executing these models, off-chip memory access can be a limiting power and performance issue. Because of the size of the arrays of coefficients used in these models, on-chip memory can be insufficient, particularly in systems in which it is desirable to utilize more than one model.


It is desirable to provide a platform for performing inference operations that addresses these issues.


SUMMARY

A reconfigurable inference platform is described, suitable for implementation using a system in package (“SiP”) configuration. A platform as described herein can comprise a processor chip, a first memory chip suitable for storing collections of executable models, and a second memory chip suitable for storing arrays of weights. The platform can be implemented as a multichip module in a single package. The package can be mounted on a circuit board or other type of substrate, and connected to sensors and other components that can generate data consumed by the execution of the models, and consume data generated by execution of the models.


A processor chip in implementations of the platform can include a runtime processor core, an accelerator core and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip. A first memory chip in implementations of the platform can include a nonvolatile, high capacity memory, such as 3D NAND flash memory. The first memory chip can store a collection of executable models of inference engines, where each executable model includes a set of weights to be applied in execution of the model, and in some cases a computation graph for the inference engine. A second memory chip can store at least the set of weights of a selected executable model. The second memory chip can comprise a nonvolatile, random access memory, such as phase change memory. The second memory chip can include a memory-processor interface exposed on a surface of the second memory chip, and complementary to the processor-memory interface on the processor chip. Direct vertical connections, such as via-to-via connections, between the processor-memory interface and the memory-processor interface are provided, which enable low power, high throughput, and low latency transfer of data between the chips in support of execution of the selected model.
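
For illustration only, and not as part of the patented hardware disclosure, the composition described above can be pictured as a simple data model. The following minimal sketch is in Python; the names ExecutableModel and Platform are hypothetical, introduced here only to make the roles of the two memory chips concrete.

    # Hypothetical sketch of the platform composition described above; all names
    # are illustrative, since the disclosure defines hardware rather than software.
    from dataclasses import dataclass

    @dataclass
    class ExecutableModel:
        # One entry in the collection held by the first memory chip (3D NAND).
        name: str
        weights: bytes            # set of weights applied in execution of the model
        computation_graph: bytes  # compiled graph used to configure the accelerator core

    @dataclass
    class Platform:
        model_collection: list    # ExecutableModel entries, resident in the first memory chip
        weight_memory_size: int   # capacity of the second memory chip (e.g. phase change memory)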


In an example described herein, the processor chip and the second memory chip are stacked and disposed on an interposer. The first memory chip is also disposed on the interposer, which includes interconnection wiring forming at least part of a data path between the first memory chip and the second memory chip. The processor chip can include an input/output interface in addition to the processor-memory interface, and the data path can include a connection from the interconnection wiring of the interposer to the input/output interface of the processor chip.


In an example described herein, the processor chip has access to instruction memory, which can be included on the processor chip or accessible in off-chip storage, storing instructions to perform a runtime procedure. The runtime procedure can include selecting an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core on the processor chip, transferring the set of weights of the selected model to the second memory chip, and executing the selected model. Also, the runtime procedure can include changing the model in response to a control event in the field. Thus, the runtime procedure can include changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
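
As a concrete illustration of this runtime procedure, the following Python sketch walks through the select/load/transfer/execute sequence. It is a minimal sketch assuming hypothetical driver objects (accelerator, weight_memory) and the Platform/ExecutableModel names introduced above; it is not an actual software interface of the platform.

    # Minimal sketch of the runtime procedure; accelerator and weight_memory are
    # hypothetical driver objects standing in for the hardware described above.
    def load_and_run(platform, model_name, accelerator, weight_memory):
        # Select an executable model from the collection in the first memory chip.
        model = next(m for m in platform.model_collection if m.name == model_name)
        # Load the computation graph for the selected model, configuring the
        # accelerator core on the processor chip.
        accelerator.configure(model.computation_graph)
        # Transfer the set of weights of the selected model to the second memory chip.
        weight_memory.write(model.weights)
        # Execute the selected model; during execution the weights are fetched over
        # the direct vertical connections between the chips.
        return accelerator.execute()

Changing the model in response to a control event repeats the same sequence with a different entry from the collection.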


An example of a reconfigurable inference method is described, comprising: providing a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; storing a collection of executable models of an inference engine for a model implemented by machine learning in a first memory chip accessible by the processor chip, each model including a set of weights to be applied in execution of the model; selecting, in response to a control event, an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, and transferring the set of weights of the selected executable model from the first memory chip to a second memory chip, the second memory chip including a memory-processor interface disposed on a surface of the second memory chip and complementary to the processor-memory interface; and executing the selected executable model using direct vertical connections between the processor-memory interface and the memory-processor interface.


Other aspects and advantages of the present invention can be seen on review of the drawings, the detailed description and the claims, which follow.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is an illustration of a multichip module including an inference platform as described herein.



FIG. 2 is an illustration of another embodiment of a multichip module including an inference platform as described herein.



FIG. 3 is an illustration of yet another embodiment of a multichip module including an inference platform as described herein.



FIG. 4 is a simplified functional block diagram of an inference platform as described herein.



FIG. 5 is a flowchart of a runtime procedure which can be executed by an inference platform as described herein.





DETAILED DESCRIPTION

A detailed description of embodiments of the present technology is provided with reference to FIGS. 1-5.



FIG. 1 illustrates a reconfigurable inference platform that includes a processor chip 101, a first memory chip 103 (model collection), and a second memory chip 102 (weight memory). In this example, the processor chip 101 and the second memory chip 102 are stacked, and the combination of the processor chip 101 stacked with the second memory chip 102, together with the first memory chip 103, is mounted on an interposer 110. The assembly is configured as a multichip module 120 in a single package.


The processor chip 101 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 101 includes a chip-to-chip bonding surface on which a processor-memory interface 131 is exposed for connection to the second memory chip 102. The second memory chip includes a memory-processor interface 132 exposed on a surface of the second memory chip, and complementary to the processor-memory interface 131 on the processor chip 101. In this example, direct vertical connections are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 101 includes an input/output interface 113 disposed on the surface of the chip 101. The input/output interface 113 is connected by vertical connectors, such as through-silicon via (TSV) connections, to interconnection wiring 111 on the interposer 110.


The first memory chip 103 includes an interface 112 for connection to the interconnection wiring 111 on the interposer 110.


Thus, interconnection wiring 111 provides part of the data path between the first memory chip and the second memory chip through the processor chip 101.


In the example illustrated in FIG. 1, the processor chip 101 includes another input/output interface 122 for connection to external contact structures 121 of the multichip module 120.



FIG. 2 illustrates another configuration of an inference engine as described herein. This configuration includes a processor chip 201, a first memory chip 203 (model collection), and a second memory chip 202 (weight memory). In this example, the processor chip 201 and the second memory chip 202 are stacked, and the combination of the processor chip 201 stacked with the second memory chip 202, together with the first memory chip 203, is mounted on an interposer 210. The assembly is configured as a multichip module 220 in a single package.


The processor chip 201 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 201 includes a chip-to-chip bonding surface on which a processor-memory interface 231 is exposed for connection to the second memory chip 202. The second memory chip includes a memory-processor interface 232 exposed on a surface of the second memory chip, and complementary to the processor-memory interface 231 on the processor chip 201. In this example, direct vertical connections at the surfaces are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise very short length copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 201 includes an input/output interface 213 disposed on the surface of the chip 201. The input/output interface 213 is connected by vertical connectors, such as through-silicon via (TSV) connections, to interconnection wiring 211 on the interposer 210.


Also, the second memory chip 202 includes an input/output interface 241 exposed on the surface opposite the processor chip 201, which connects to a complementary interface 240 on the interposer 210, for connection to the interconnection wiring 211 of the interposer 210.


The first memory chip 203 includes an interface 212 for connection to the interconnection wiring 211 on the interposer 210.


Thus, the interconnection wiring 211 of the interposer provides part of a data path between the first memory chip and the second memory chip, as an alternative to a data path through the processor chip 201.


In the example illustrated in FIG. 2, the processor chip 201 includes another input/output interface 222 for connection to external contact structures 221 of the multichip module 220.



FIG. 3 illustrates another configuration of an inference engine as described herein. This configuration includes a processor chip 302, a first memory chip 303 (model collection), and a second memory chip 301 (weight memory). In this example, the processor chip 302 and the second memory chip 301 are stacked, and the combination of the processor chip 302 stacked with the second memory chip 301, together with the first memory chip 303, is mounted on an interposer 310. In this example, the processor chip 302 is between the second memory chip 301 and the interposer 310. The assembly is configured as a multichip module 320 in a single package.


The processor chip 302 can include a runtime processor core (e.g. CPU) and an accelerator core, such as an artificial intelligence accelerator (e.g. AIAcc) or a neuron processing unit. The processor chip 302 includes a chip-to-chip bonding surface (top surface) on which a processor-memory interface 332 is exposed for connection to the second memory chip 301. The second memory chip 301 includes a memory-processor interface 331 exposed on a surface of the second memory chip (bottom surface), and complementary to the processor-memory interface 332 on the processor chip 302. In this example, direct vertical connections are provided between the processor-memory interface and the memory-processor interface. The direct vertical connections can comprise copper via-to-via conductors or other chip-to-chip contact technologies suitable for high speed, low latency, and low power communication between the chips.


In this example, processor chip 302 includes an input/output interface 313 disposed on the bottom surface of the chip 302. The input/output interface 313 is connected to vertical connectors, which connect to interconnection wiring 311 on the interposer 310.


Also, the processor chip 302 includes an input/output interface 361 exposed on the bottom surface opposite the second memory chip 301, which connects to a complementary interface 362 on the interposer 310, for connection to the interconnection wiring 350 of the interposer 310.


The first memory chip 303 includes an interface 312 for connection to the interconnection wiring 311 on the interposer 310.


Thus, the interconnection wiring 311 of the interposer provides part of the data path between the first memory chip and the second memory chip, which passes through the interposer interconnection wiring 311 and through the processor chip 302.


The interposer 310 includes an interface 352 for connection of the interconnection wiring 350 of the interposer (which can be connected to or part of the interconnection wiring 311 of the interposer). Wiring connections are provided from the interface 352 to external contact structures 351 of the multichip module 320.


In other embodiments, the interface 352 can be replaced or supplemented by an interface or interfaces on the side or bottom surfaces of the interposer.



FIGS. 1-3 provide example arrangements of a platform as described herein, showing varieties of configurations of the chips and connections among the chips, the interposer and external contacts of the package. Other arrangements can be implemented as suits a particular need.



FIG. 4 is a simplified functional block diagram of a platform implemented as described with reference to FIGS. 1-3. The platform includes a processor chip 401, a first memory chip 403, and a second memory chip 402. The processor chip 401 in this example includes a CPU or processor core 410, an accelerator core 411, on-chip memory 412, such as SRAM, which can be used as working memory and as a cache memory, a first I/O interface 413 and a second I/O interface 414. A bus system 420 provides for intra-chip communications among the components.


The first memory chip 403 in this example comprises a high capacity, nonvolatile memory 440, such as 3D NAND implemented using charge trapping storage technology. The first memory chip 403 includes a first memory I/O interface 441 for off-chip communications. The first memory I/O interface 441 can comprise a high-speed serial port, such as an SPI-compatible port, or a parallel port, depending on the particular implementation of the memory chip utilized. A data path 451 is provided in this example between the first memory I/O interface 441 and the first I/O interface 413 on the processor chip 401.
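
One way to picture the data path 451 is as a block-wise copy of a selected model image out of the first memory chip, through the processor chip's I/O interfaces, toward the weight memory. The Python sketch below is hypothetical: the read/write calls and the block size are assumptions for illustration, not details given in the disclosure.

    # Hypothetical block-wise transfer: first memory chip -> processor chip over
    # data path 451, then processor chip -> second memory chip over the vertical
    # interconnections 450. BLOCK is an illustrative transfer granularity.
    BLOCK = 4096

    def copy_weights(nand, src_addr, pcm, dst_addr, length):
        for offset in range(0, length, BLOCK):
            n = min(BLOCK, length - offset)
            block = nand.read(src_addr + offset, n)  # e.g. over an SPI-compatible port
            pcm.write(dst_addr + offset, block)      # toward the weight memory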


The second memory chip 402, in this example, comprises a high-speed, random-access nonvolatile memory 430, such as a memory implemented using 3D phase change storage technology. In other examples, the second memory chip 402 can comprise NOR flash memory using charge trapping storage technology, or other suitable random-access technologies such as resistive RAM (e.g. metal oxide memory), magnetic RAM, ferroelectric RAM and so on.


The second memory chip 402 includes a memory I/O interface 431 for off-chip communications, directly connected by vertical interconnections 450 to the second I/O interface 414 on the processor chip 401.


DRAM is an option to bond into the SiP in cases in which the on-chip SRAM is not large enough. Thermal (heat) management can be used to guarantee data retention.


An accelerator core (e.g. accelerator core 411), as the term is used herein, is a configurable logic circuit including components designed or suitable for execution of some or all of the arithmetic operations of an inference model. Configuration of the accelerator core can include loading a set of weights used in the inference model, or parts of the set of weights. In some embodiments, configuration of the accelerator core can include loading some or all of the computation graphs of the inference model to define the sequence and architecture of the operation of the inference model. The inference model can comprise a computation graph of a deep learning neural network, in some examples having a plurality of fully connected and partially connected layers, activation functions, normalization functions and so on.
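
By way of example only, a computation graph in this sense can be as simple as an ordered list of operator nodes with per-layer parameters. The sketch below assumes hypothetical accelerator calls (add_node, load_weights); a real compiled graph, such as a bit file for configurable logic, would be an opaque binary image.

    # Illustrative computation graph as an ordered list of operator nodes; the
    # operators and parameters are generic examples, not the patent's format.
    graph = [
        ("conv2d",  {"out_channels": 16, "kernel": 3}),
        ("relu",    {}),
        ("dense",   {"units": 10}),
        ("softmax", {}),
    ]

    def configure(accelerator, graph, weights_by_layer):
        # Configuring the accelerator core: define the graph structure, then load
        # the weights (or parts of the set of weights) for each layer that has them.
        for i, (op, params) in enumerate(graph):
            accelerator.add_node(op, params)          # hypothetical call
            if i in weights_by_layer:
                accelerator.load_weights(i, weights_by_layer[i])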


An accelerator core can be implemented using configurable logic, like arrays of configurable units used in field programmable gate arrays for example, in which compiled computation graphs are configured using bit files. An accelerator core can be implemented using a hybrid of data flow configurable logic and sequential processing configurable logic.


The runtime processor core (e.g. CPU 410) can execute a runtime program to coordinate operation of the accelerator core to accomplish real time inference operations, including data input/output operations, loading computation graphs, moving the set of weights to be applied in the inference operation into and out of the accelerator core, delivering input data to the accelerator core, and performing parts of the computation graphs.



FIG. 5 is a flowchart illustrating an example of logic of a procedure executed by an inference platform, such as platforms described with reference to FIGS. 1-4. The logic can be implemented using computer programs stored in memory, such as the SRAM on-chip memory 412, or other memory accessible by the CPU 410. In this example, the procedure includes downloading a collection of executable artificial intelligence models from an external source, such as a network, and loading the collection in the high capacity NAND flash memory on the platform (501). During runtime, the procedure waits for a control event (502). The control event can include a reset, an expiration of a timer, a message received from a communication network or other external source, data generated by execution of an inference engine in the processor chip itself, or other signals. As long as no control event is detected, the procedure loops.


When the control event is detected, the procedure includes selecting an artificial intelligence model from the collection stored in the NAND flash memory (503). The selected model, or at least a set of weights of the selected model, is then transferred from the NAND flash memory to the weight memory (504). The procedure includes configuring the accelerator core using parameters of the selected model read from the NAND flash memory (505). After loading the weights and configuring the accelerator core, the procedure includes executing an inference procedure using the parameters of the selected model stored in the weight memory, including transferring parameters such as weights using the direct vertical connections between the processor chip 401 and the second memory chip 402 (506).


Thus, the procedure of FIG. 5 includes a procedure to select an executable model from the collection of executable models stored in the first memory chip to load a computation graph for the selected model including configuring the accelerator core, to transfer the set of weights of the selected model to the second memory chip, and to execute the selected model. Also, as shown in FIG. 5, after executing or beginning to execute the selected model, the process loops to step 502, to wait for a next control event. Upon detection of the next control event, the steps 503 to 506 are traversed, and can include changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
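
The control flow of FIG. 5 can be summarized as an event loop. The sketch below is a hedged illustration in Python: the event source, the model-selection policy and the driver objects are all hypothetical stand-ins, and the step numbers in the comments refer to the flowchart steps discussed above.

    # Sketch of the FIG. 5 control flow; events, select, accelerator and
    # weight_memory are hypothetical stand-ins for platform facilities.
    def runtime_loop(platform, accelerator, weight_memory, events, select):
        # Step 501: the collection of executable models is assumed to have been
        # downloaded already into the NAND flash memory (first memory chip).
        while True:
            event = events.wait()            # step 502: wait for a control event
            model = select(platform, event)  # step 503: select a model from the collection
            weight_memory.write(model.weights)              # step 504: transfer weights
            accelerator.configure(model.computation_graph)  # step 505: configure accelerator
            accelerator.execute()            # step 506: execute using the weight memory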


It will be appreciated with reference to FIG. 5, that many of the steps can be combined, performed in parallel or performed in a different sequence without affecting the functions achieved. In some cases, as the reader will appreciate, a rearrangement of steps will achieve the same results only if certain other changes are made as well. In other cases, as the reader will appreciate, a rearrangement of steps will achieve the same results only if certain conditions are satisfied. Furthermore, it will be appreciated that the flow charts herein show only steps that are pertinent to an understanding of the invention, and it will be understood that numerous additional steps for accomplishing other functions can be performed before, after and between those shown.


An SiP platform is described in which one or more 3D NAND chips store a collection including multiple different AI models (computation graphs and weights), one or more weight memory chips store the weights of a selected AI model, and a processor chip, which can be a special purpose AI logic chip (CPU + AI accelerator), is included with the memory system to execute the selected AI model using the parameters (e.g. weights) and hyperparameters (e.g. neural network computation graphs or architectural details, such as layers, normalization functions, activation functions, etc.) needed by the CPU/NPU.


Inter-chip bonding between the AI logic chip and the weight memory chip can be via-to-via copper bonding or other 3D or 2.5D bonding technologies.


While the present invention is disclosed by reference to the preferred embodiments and examples detailed above, it is to be understood that these examples are intended in an illustrative rather than in a limiting sense. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention and the scope of the following claims.

Claims
  • 1. A reconfigurable inference platform, comprising: a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; a first memory chip accessible by the processor chip to store a collection of executable models of an inference engine, each model including a set of weights to be applied in execution of the model; a second memory chip to store the set of weights of a selected executable model, the second memory chip including a memory-processor interface exposed on a surface of the second memory chip and complementary to the processor-memory interface; and direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein the processor chip includes a second input/output interface, the data path including a connection from the interconnection wiring of the interposer to the second input/output interface on the processor chip.
  • 2. The platform of claim 1, wherein the direct vertical connections comprise via-to-via connections.
  • 3. The platform of claim 1, wherein the processor core has access to instruction memory, storing executable instructions to perform a procedure including: selecting an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, transferring the set of weights of the selected model to the second memory chip, and executing the selected model.
  • 4. The platform of claim 1, wherein the processor core has access to instruction memory, storing executable instructions to perform a procedure in response to a control event, including changing the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
  • 5. The platform of claim 1, wherein the interposer is below the second memory chip, and the processor chip is disposed above the second memory chip.
  • 6. The platform of claim 1, wherein the interposer is below the processor chip and the second memory chip is disposed above the processor chip.
  • 7. The platform of claim 1, wherein the first memory chip comprises a charge trapping, NAND-architecture memory, and the second memory chip comprises nonvolatile random access memory.
  • 8. The platform of claim 7, wherein the nonvolatile random access memory is phase change memory.
  • 9. The platform of claim 7, wherein the nonvolatile random access memory is a charge trapping, NOR-architecture memory.
  • 10. The platform of claim 1, wherein the processor chip, first memory chip and second memory chip are disposed in a multichip package.
  • 11. A reconfigurable inference method, comprising: providing a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; storing a collection of executable models of an inference engine for a model implemented by machine learning in a first memory chip accessible by the processor chip, each model including a set of weights to be applied in execution of the model; selecting in response to a control event an executable model from the collection of executable models stored in the first memory chip, loading a computation graph for the selected model including configuring the accelerator core, and transferring the set of weights of the selected executable model from the first memory chip to a second memory chip, the second memory chip including a memory-processor interface disposed on a surface of the second memory chip and complementary to the processor-memory interface; and executing the selected executable model using direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein the processor chip includes a second processor-memory interface, and including transferring data from the first memory chip to the processor chip on a data path including a connection from the interconnection wiring of the interposer to the second processor-memory interface on the processor chip.
  • 12. The method of claim 11, wherein the direct vertical connections comprise via-to-via connections.
  • 13. The method of claim 11, including changing, in response to a second control event, the selected model to a different model in the collection of executable models, loading a computation graph for the different model including configuring the accelerator core, transferring the set of weights of the different model to the second memory chip, and executing the different model.
  • 14. The method of claim 11, wherein the first memory chip comprises a charge trapping, NAND-architecture memory, and the second memory chip comprises nonvolatile random access memory.
  • 15. The method of claim 14, wherein the nonvolatile random access memory is phase change memory.
  • 16. A reconfigurable inference platform, comprising: a processor chip including a runtime processor core, an accelerator core, on-chip memory and a processor-memory interface exposed on a chip-to-chip bonding surface of the processor chip; a first memory chip accessible by the processor chip to store a collection of executable models of an inference engine, each model including a set of weights to be applied in execution of the model; a second memory chip to store the set of weights of a selected executable model, the second memory chip including a memory-processor interface exposed on a surface of the second memory chip and complementary to the processor-memory interface; and direct vertical connections between the processor-memory interface and memory-processor interface, wherein the processor chip and the second memory chip are stacked and disposed on an interposer, and the first memory chip is disposed on the interposer, the interposer including interconnection wiring forming part of a data path between the first memory chip and the second memory chip, and wherein (i) the interposer is below the second memory chip and the processor chip, and (ii) the processor chip is disposed above the second memory chip or the second memory chip is disposed above the processor chip.
Related Publications (1)
Number Date Country
20230067190 A1 Mar 2023 US