Embodiments of the present disclosure generally relate to a deep neural network (DNN) device utilizing a plurality of spin-orbit torque (SOT) cells.
Deep neural networks (DNNs) are a promising and quickly evolving area of technology used in artificial intelligence (AI). DNNs are composed of multiple layers (two or more) between the input layer and the final output layer. DNNs transform data at each layer, creating a new representation at each layer's output. Generally, when a DNN is under training, many of its parameter weights are updated, whereas during inference the DNN's parameter weights are already fixed by pre-training. When DNNs are used for inference, the states/values of the weights are known. In implementations where non-volatile memory cells are configured for DNN applications with weights stored in the cells, the amount and magnitude of current needed to set or read the states of the cells is known as well.
A core feature of many DNNs involves matrix multiplication/summation followed by an activation function (e.g., a non-linear transfer function). Many DNNs currently rely solely on a traditional computing architecture with discrete memory and processor components to perform both the matrix multiplication/summation and the activation function. Traditional computing architecture-based implementations of a DNN generally require more data movement between a main memory and a CPU/GPU, which consumes more power and memory and is slower. Hardware compute-in-memory implementations of DNNs promise lower energy, inherent non-linearity, and higher density for AI applications. However, current compute-in-memory hardware implementations of DNNs are still limited.
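The core layer operation described above can be sketched in software for illustration. The following minimal Python sketch (all function names and numeric values are assumptions for illustration, not from the disclosure) shows a matrix multiplication/summation followed by a non-linear activation function:

```python
# Minimal sketch of one DNN layer: matrix multiplication/summation
# followed by a non-linear activation. All names and values here are
# illustrative assumptions only.

def relu(v):
    # Example non-linear transfer function (activation).
    return [max(0.0, x) for x in v]

def layer_forward(inputs, weights):
    # inputs:  list of n input values
    # weights: n x m nested list; weights[i][j] connects input i to output j
    n, m = len(weights), len(weights[0])
    sums = [sum(inputs[i] * weights[i][j] for i in range(n)) for j in range(m)]
    return relu(sums)

out = layer_forward([1.0, 2.0], [[0.5, -1.0], [0.25, 0.5]])
```

In the compute-in-memory approach of this disclosure, the multiply/summation step of such a layer is carried out by the memory array itself rather than by a discrete processor.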
Therefore, there is a need in the art for new hardware implementations for DNNs for inference.
The present disclosure is generally related to a deep neural network (DNN) device comprising a plurality of spin-orbit torque (SOT) cells. The DNN device comprises an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of SOT cells, each SOT cell comprising: at least one SOT layer, at least one ferromagnetic (FM) layer, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells. The FM layer may comprise two or more domains, two or more elliptical arms, or two or more states.
In one embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a SOT layer, a ferromagnetic (FM) layer disposed on the SOT layer, the FM layer being configured with a plurality of domain walls, electrodes disposed adjacent to the SOT layer, the electrodes being spaced from the FM layer, and a first current line configured to apply current through a first electrode of the electrodes, to the SOT layer, to a second electrode of the electrodes, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
In another embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a first ferromagnetic (FM) layer, a SOT layer disposed on the first FM layer, a second FM layer disposed on the SOT layer, and electrodes disposed adjacent to the SOT layer, the electrodes being spaced from the first and second FM layers, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
In yet another embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a first SOT layer, a first ferromagnetic (FM) layer disposed on the first SOT layer, the first FM layer having a multi-elliptical shape creating two or more magnetic states at two or more corners or two or more arms formed by the multi-elliptical shape, and a current input configured to apply current through: a first current path across the first SOT layer and a second current path across the first SOT layer, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
So that the manner in which the above recited features of the present disclosure can be understood in detail, a more particular description of the disclosure, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this disclosure and are therefore not to be considered limiting of its scope, for the disclosure may admit to other equally effective embodiments.
FIG. 3B1 depicts an example spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) non-volatile memory cell of the apparatus of
FIG. 3B2 depicts another example SOT MRAM non-volatile memory cell of the apparatus of
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation.
In the following, reference is made to embodiments of the disclosure. However, it should be understood that the disclosure is not limited to specific described embodiments. Instead, any combination of the following features and elements, whether related to different embodiments or not, is contemplated to implement and practice the disclosure. Furthermore, although embodiments of the disclosure may achieve advantages over other possible solutions and/or over the prior art, whether or not a particular advantage is achieved by a given embodiment is not limiting of the disclosure. Thus, the following aspects, features, embodiments and advantages are merely illustrative and are not considered elements or limitations of the appended claims except where explicitly recited in a claim(s). Likewise, reference to “the disclosure” shall not be construed as a generalization of any inventive subject matter disclosed herein and shall not be considered to be an element or limitation of the appended claims except where explicitly recited in a claim(s).
The present disclosure is generally related to a deep neural network (DNN) device comprising a plurality of spin-orbit torque (SOT) cells for inference. The DNN device comprises an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of SOT cells, each SOT cell comprising: at least one SOT layer, at least one ferromagnetic (FM) layer, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells. The FM layer may comprise two or more domains, two or more elliptical arms, or two or more states.
Technology is described for using non-volatile memory cells to perform matrix multiplication in deep neural networks (DNNs). In particular, technology is described for using spin-orbit torque (SOT) non-volatile memory cells to perform matrix-vector multiplication in a neuromorphic computing system. A neuromorphic computing system may be used to implement an artificial neural network.
Matrix-vector multiplication may be performed by taking the dot product of a vector with each column vector of a matrix. A vector dot product is the sum of products of the corresponding elements of two equal length vectors. Accordingly, a non-volatile memory system that performs matrix-vector multiplication also may be referred to as a multiplier-accumulator (MAC).
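The dot-product and MAC relationship described above can be sketched as follows; the helper names are assumptions for illustration:

```python
# Sketch of the vector dot product underlying the MAC operation: the sum
# of products of corresponding elements of two equal-length vectors, and
# matrix-vector multiplication built from it.

def dot(a, b):
    # a and b must be equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def matvec(x, W):
    # Matrix-vector multiply: dot x with each column vector of W.
    m = len(W[0])
    return [dot(x, [row[j] for row in W]) for j in range(m)]

result = matvec([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]])
```

Each output element here is one multiply-accumulate result; in the memory system described below, each such element is produced physically as a bit line current.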
In an embodiment, a non-volatile memory system includes an array that includes n rows and m columns of nodes, with each node including a non-volatile memory cell. In this regard, the array is an n×m array of non-volatile memory cells. In an embodiment, each row of nodes is coupled to one of n first conductive lines (e.g., word lines), and each column of nodes is coupled to one of m second conductive lines (e.g., bit lines).
In an embodiment, each non-volatile memory cell includes an SOT non-volatile memory cell. Thus, in an embodiment each row of SOT non-volatile memory cells is coupled to one of n first conductive lines (e.g., word lines), and each column of SOT non-volatile memory cells is coupled to one of m second conductive lines (e.g., bit lines).
As used herein, the value of a weight stored in an SOT non-volatile memory cell is also referred to as a “multiplicand.” While in some approaches each SOT non-volatile memory cell can be a “binary non-volatile memory cell,” which is a non-volatile memory cell that can be repeatedly switched between two physical states, embodiments disclosed herein are directed to multi-state non-volatile memory cells, which are non-volatile memory cells that may be repeatedly switched among more than two physical states.
In binary weight DNN implementations, each memory cell in the n×m array of SOT non-volatile memory cells is configured to store one bit of information. In such cases, each SOT non-volatile memory cell may be programmed to either a low resistance state (also referred to herein as an “ON state”) or a high resistance state (also referred to herein as an “OFF-state”). The low resistance state may be used to represent the first weight value (e.g., “1”), and the high resistance state may be used to represent the second weight value (e.g., “0”). In contrast, multi-state weight cells of the disclosed embodiments can have more than two weight values.
In an embodiment, n input voltages (also referred to herein as “multiply voltages”) are applied to the first conductive lines (e.g., word lines). In an embodiment, each of the n multiply voltages represents a single-bit binary input, and has either a first input value (e.g., “1V”) or a second input value (e.g., “0V”). Other binary voltage values may be used for the first input value and the second input value. In an embodiment, the n multiply voltages constitute an n-element input vector (also referred to herein as a “multiply vector”).
In an embodiment, the memory cells in the n×m array of SOT non-volatile memory cells generate m output currents at the m second conductive lines (e.g., bit lines). In an embodiment, the m output currents constitute a result of multiplying the n-element input vector (multiply vector) by the n×m array of weights stored in the SOT non-volatile memory cells, where the weights can be more than two states. In an embodiment, each of the m output currents provides an output that can go beyond a binary output, since there are output values representing multiple discrete weights. In an embodiment, the m output currents constitute an m-element output vector.
In this regard, multiplication is performed by applying a multiply voltage to a node and processing a current from the SOT non-volatile memory cell in the node. In an embodiment, each multiply voltage has a magnitude that represents a multiplier. In an embodiment, the multiply voltage is applied across two terminals of the SOT non-volatile memory cell.
In an embodiment, the SOT non-volatile memory cell responds to the multiply voltage by conducting a memory cell current in the second conductive line (e.g., bit line) coupled to the SOT non-volatile memory cell. The magnitude of the memory cell current represents a product of the multiplier applied to the node and the multiplicand stored in the SOT non-volatile memory cell in the node.
As described above, in an embodiment each SOT non-volatile memory cell may be programmed to multiple states beyond a low resistance ON-state or a high resistance OFF-state, and each of the n multiply voltages has either a first input value (e.g., “1V”) or a second input value (e.g., “0V”). As a result, each of the m output currents represents an output that can go beyond a binary output and has an output that represents the multiplication of the multiply input voltage and the multiple weight states in the cell.
As used herein, “multiplier” is used for the magnitude of the multiply voltage, and “multiplicand” is used for the value of the weight stored in the SOT non-volatile memory cell in the node. This is for the convenience of discussion. The terms “multiplier” and “multiplicand” are interchangeable.
An example memory system 100 in which embodiments may be practiced will be discussed.
As depicted, memory system 100 includes a memory chip controller 104 and a memory chip 106. Although a single memory chip 106 is depicted, memory system 100 may include more than one memory chip (e.g., four, eight or some other number of memory chips). Memory chip controller 104 may receive data and commands from host 102 and provide data to host 102. In an embodiment, memory system 100 is used to perform matrix-vector multiplication. In an embodiment, memory system 100 is used to perform matrix-vector multiplication in a neuromorphic computing system.
Memory chip controller 104 may include one or more state machines, page registers, SRAM, decoders, sense amplifiers, and control circuitry for controlling the operation of memory chip 106. The one or more state machines, page registers, SRAM, and control circuitry for controlling the operation of memory chip 106 may be referred to as managing or control circuits.
The managing or control circuits may facilitate one or more memory operations, such as programming, reading (or sensing) and erasing operations. In an embodiment, the managing or control circuits are used to perform multiplication using non-volatile memory cells. Herein, multiplication will be referred to as a type of memory operation.
In some embodiments, the managing or control circuits (or a portion of the managing or control circuits) that facilitate one or more memory array operations, including programming, reading, erasing and multiplication operations, may be integrated within memory chip 106. In some embodiments, the managing or control circuits may include an on-chip memory controller for determining row and column address, bit line, source line and word line addresses, memory array enable signals, and data latching signals.
Memory chip controller 104 and memory chip 106 may be arranged on a single integrated circuit. In other embodiments, memory chip controller 104 and memory chip 106 may be arranged on different integrated circuits. In some cases, memory chip controller 104 and memory chip 106 may be integrated on a system board, logic board, or a PCB.
Memory chip 106 includes memory core control circuits 108 and a memory core 110. In an embodiment, memory core control circuits 108 include circuits that generate row and column addresses for selecting memory blocks (or arrays) within memory core 110, and generate voltages to bias a particular memory array into a read or a write state. In an embodiment, memory core control circuits 108 include circuits for generating voltages to bias a memory array to perform matrix-vector multiplication using non-volatile memory cells in memory core 110.
Memory chip controller 104 controls operation of memory chip 106. In an embodiment, once memory chip controller 104 initiates a memory operation (e.g., read, write, or multiply), memory core control circuits 108 generate the appropriate bias voltages for bit lines, source lines and/or word lines within memory core 110, and generate the appropriate memory block, row, and column addresses to perform memory operations.
In an embodiment, memory core 110 includes one or more arrays of non-volatile memory cells used to perform matrix-vector multiplication. In an embodiment, memory core 110 includes one or more arrays of SOT non-volatile memory cells used to perform matrix-vector multiplication in a neuromorphic computing system. Memory core 110 may include one or more two-dimensional or three-dimensional arrays of SOT non-volatile memory cells.
In an embodiment, memory core control circuits 108 and memory core 110 are arranged on a single integrated circuit. In other embodiments, memory core control circuits 108 (or a portion of memory core control circuits 108) and memory core 110 may be arranged on different integrated circuits.
In an embodiment, memory core 110 includes a three-dimensional memory array of SOT non-volatile memory cells in which multiple memory levels are formed above a single substrate, such as a wafer. The memory structure may include SOT non-volatile memory that is monolithically formed in one or more physical levels of arrays of non-volatile memory cells having an active area disposed above a silicon (or other type of) substrate.
Read/write/multiply circuit 124 includes circuitry for reading and writing non-volatile memory cells in memory core 110. In an embodiment, transfer data latch 126 is used for intermediate storage between memory chip controller 104 (
In an embodiment, when host 102 instructs memory chip controller 104 to write data to memory chip 106, memory chip controller 104 writes a page of host data to transfer data latch 126. Read/write/multiply circuit 124 then writes data from transfer data latch 126 to a specified page of non-volatile memory cells.
In an embodiment, when host 102 instructs memory chip controller 104 to read data from memory chip 106, read/write/multiply circuit 124 reads from a specified page of non-volatile memory cells into transfer data latch 126, and memory chip controller 104 transfers the read data from transfer data latch 126 to host 102.
Read/write/multiply circuit 124 also includes circuitry for performing multiplication operations using non-volatile memory cells. In an embodiment, read/write/multiply circuit 124 stores multiplicands (e.g., weights) in the non-volatile memory cells.
In an embodiment, read/write/multiply circuit 124 is configured to apply multiply voltages to SOT non-volatile memory cells that store multiplicands (e.g., weights). As described above, in an embodiment each multiply voltage has a magnitude that represents a multiplier. In an embodiment, the non-volatile memory cell in a node conducts a memory cell current in response to the multiply voltage applied to the non-volatile memory cell. In an embodiment, the magnitude of the non-volatile memory cell output current depends on the physical state of the non-volatile memory cell and the magnitude of the multiply voltage.
For example, in an embodiment the magnitude of a SOT non-volatile memory cell current depends on the resistance or other magnetization status of the SOT non-volatile memory cell and the voltage applied across two terminals of the SOT non-volatile memory cell. In an embodiment, the magnitude of the non-volatile memory cell current depends on the non-volatile memory cell's weight state.
The multiply voltage may be similar in magnitude to a read voltage, in that the multiply voltage may cause the SOT non-volatile memory cell to conduct a memory cell current without changing the physical state of the SOT non-volatile memory cell. However, whereas a read voltage may have a magnitude that is selected to delineate between physical states, the magnitude of a multiply voltage is not necessarily selected to delineate between physical states. The following examples of a SOT non-volatile memory cell programmed to one of two states will be used to illustrate the basic concept, though the various embodiments provide for the cells being programmed to store more than two states.
In a read operation, after a read voltage is applied the SOT memory cell current may be sensed and compared with a reference current to determine which state the memory cell is in. For example, the magnitude of the output current corresponding to the read voltage may be compared to a reference current to delineate between the two states. However, the multiply voltage could have one of many different magnitudes, depending on what multiplier is desired. Moreover, the memory cell current that results from applying the multiply voltage is not necessarily compared to a reference current.
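The read-sensing step described above can be sketched as follows; the read voltage, reference current, and conductance values are assumptions for illustration, not from the disclosure:

```python
# Sketch of binary read sensing: the cell current produced by a fixed
# read voltage is compared against a reference current to decide which
# state the cell is in. All numeric values below are assumed.

READ_VOLTAGE = 0.1   # volts (assumed)
I_REFERENCE = 5e-6   # amps; chosen between the two state currents (assumed)

def sense_state(cell_conductance_siemens):
    # Ohm's law: cell current = read voltage * cell conductance.
    i_cell = READ_VOLTAGE * cell_conductance_siemens
    # Current above the reference -> low-resistance ON state ("1");
    # otherwise high-resistance OFF state ("0").
    return 1 if i_cell > I_REFERENCE else 0
```

By contrast, the multiply operation described above uses the cell current directly as an analog product, without comparing it to a reference.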
In an embodiment, read/write/multiply circuit 124 simultaneously applies a corresponding multiply voltage to each node. Each multiply voltage may correspond to an element of an input vector. The current in each bit line generates a vector multiplication result signal that represents multiplication of the input vector by a corresponding column vector of the stored weight array.
Voltage generators for selected control lines 122a may be used to generate program, read, and/or multiply voltages. In an embodiment, voltage generators for selected control lines 122a generate a voltage whose magnitude is based on a multiplier for a mathematical multiplication operation. In an embodiment, the voltage difference between the voltages for two selected control lines is a multiply voltage.
Voltage generators for unselected control lines 122b may be used to generate voltages for control lines that are connected to memory cells that are not selected for a program, read, or multiply operation. Signal generators for reference signals 122c may be used to generate reference signals (e.g., currents, voltages) to be used as a comparison signal to determine the physical state of a memory cell.
In an embodiment, non-volatile memory cells are used to perform matrix-vector multiplication in a neuromorphic computing system. A neuromorphic computing system may be used to implement an artificial neural network.
In an embodiment, each input neuron x1, x2, x3, . . . , xn has an associated value, each output neuron y1, y2, y3, . . . , ym has an associated value, and each weight w11, w12, w13, . . . , wnm has an associated value. The value of each output neuron y1, y2, y3, . . . , ym may be determined as follows:

yj=x1×w1j+x2×w2j+ . . . +xn×wnj  (1)
In matrix notation, equation (1) may be written as y=xTW, where y is an m-element output vector, x is an n-element input vector, and W is an n×m array of weights, as depicted in
The matrix-vector multiplication operation depicted in
So, for example, with n=4 and m=3,
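A worked numeric instance with n=4 and m=3 can make the multiply-accumulate concrete; the input and weight values below are arbitrary examples, not from the disclosure:

```python
# Worked n = 4, m = 3 example of y = xT W. All values are arbitrary
# illustrative examples.

x = [1.0, 0.0, 1.0, 1.0]          # n = 4 input neuron values
W = [[0.2, 0.5, 0.1],             # row i holds weights wi1, wi2, wi3
     [0.4, 0.3, 0.6],
     [0.1, 0.2, 0.3],
     [0.5, 0.1, 0.2]]             # n x m = 4 x 3 weight array

# yj = x1*w1j + x2*w2j + ... + xn*wnj, per equation (1)
y = [sum(x[i] * W[i][j] for i in range(4)) for j in range(3)]
# y is approximately [0.8, 0.8, 0.6]
```

In the hardware described below, the x values are applied as word line voltages, the W values are stored as cell conductances, and the y values appear as bit line currents.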
In an embodiment, a cross-point memory array is used to perform the multiply and accumulate operations described above.
Cross-point memory array 210 includes n rows and m columns of nodes 21211, 21212, . . . , 21234. Each row of nodes 21211, 21212, . . . , 21234 is coupled to one of n first conductive lines (e.g., word lines WL1, WL2, WL3, WL4). Each column of nodes 21211, 21212, . . . , 21234 is coupled to one of m second conductive lines (e.g., bit lines BL1, BL2, BL3). Persons of ordinary skill in the art will understand that cross-point memory arrays may include more or fewer than four word lines, more or fewer than three bit lines, and more or fewer than twelve nodes.
In an embodiment, each node 21211, 21212, . . . , 21234 of cross-point memory array 210 includes a non-volatile memory cell having an adjustable resistance. In an embodiment, the non-volatile memory cells in nodes 21211, 21212, . . . , 21234 may be programmed to store a corresponding weight or state of an n×m array of weights w11, w12, w13, . . . , w34, respectively. Thus, each node 21211, 21212, . . . , 21234 is labeled with a corresponding weight w11, w12, w13, . . . , w34, respectively, programmed in the corresponding non-volatile memory cell of the node. In an embodiment, each weight w11, w12, w13, . . . , w34 corresponds to a conductance of the non-volatile memory cell in each node 21211, 21212, . . . , 21234, respectively. The weights may be programmed, for example, during a training phase of the neural network. A common training method involves the weights being selectively and/or iteratively updated using an algorithm such as back propagation.
Input voltages Vin1, Vin2, Vin3 and Vin4 are shown applied to word lines WL1, WL2, WL3, WL4, respectively. The magnitudes of input voltages Vin1, Vin2, Vin3 and Vin4 correspond to the associated values of input neurons x1, x2, x3 and x4, respectively. A bit line select voltage (BL_Select) is applied to each bit line to select that bit line. For ease of explanation, it will be assumed that BL_Select is zero volts, such that the voltage across the non-volatile memory cell in each node 21211, 21212, . . . , 21234 is the word line voltage.
In an embodiment, the non-volatile memory cells in nodes 21211, 21212, . . . , 21234 conduct currents i11, i12, . . . , i34, respectively. Each of currents i11, i12, . . . , i34 is based on the voltage applied to the corresponding non-volatile memory cell and the conductance of the corresponding non-volatile memory cell in the node. This “memory cell current” flows to the bit line connected to the non-volatile memory cell. The memory cell current may be determined by multiplying the word line voltage by the conductance of the non-volatile memory cell.
Stated another way, each non-volatile memory cell current corresponds to the result of multiplying one of the elements of an input vector by the weight stored in the non-volatile memory cell. So, for example, the non-volatile memory cell in node 21211 conducts a current i11 that corresponds to the product Vin1×w11, the non-volatile memory cell in node 21212 conducts a current i12 that corresponds to the product Vin2×w12, the non-volatile memory cell in node 21223 conducts a current i23 that corresponds to the product Vin3×w23, and so on.
Bit lines BL1, BL2, BL3 conduct bit line currents Iout1, Iout2, Iout3, respectively. Each bit line current is the summation of the currents of the memory cells connected to that bit line. For example, bit line current Iout1=i11+i12+i13+i14, bit line current Iout2=i21+i22+i23+i24, and bit line current Iout3=i31+i32+i33+i34. Thus, each bit line current Iout1, Iout2, Iout3 may be viewed as representing a sum of products of the input vector with corresponding weights in a column of the n×m array of weights:
The magnitudes of bit line currents Iout1, Iout2 and Iout3 constitute elements of an output vector, and correspond to the associated values of output neurons y1, y2 and y3, respectively and constitute the result of the matrix-vector multiplication operation depicted in
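The crossbar behavior above can be sketched as follows: each cell current is the word line voltage times the cell conductance (Ohm's law), and each bit line sums the currents of the cells in its column (Kirchhoff's current law). The conductance and voltage values are assumptions for illustration:

```python
# Sketch of the cross-point MAC physics: per-cell Ohm's law followed by
# per-bit-line current summation. All numeric values below are assumed
# examples, not from the disclosure.

def bitline_currents(v_word, G):
    # v_word: n word-line voltages (volts)
    # G:      n x m cell conductances (siemens); G[i][j] is the cell on
    #         word line i and bit line j
    n, m = len(G), len(G[0])
    return [sum(v_word[i] * G[i][j] for i in range(n)) for j in range(m)]

Iout = bitline_currents([1.0, 0.0, 1.0, 1.0],      # Vin1..Vin4
                        [[1e-5, 2e-5, 1e-5],       # conductances, word line 1
                         [2e-5, 1e-5, 3e-5],       # word line 2
                         [1e-5, 1e-5, 1e-5],       # word line 3
                         [2e-5, 1e-5, 2e-5]])      # word line 4
```

Because the summation happens on the shared bit line, the multiply and accumulate steps occur in a single analog operation, which is the source of the energy and density advantages noted above.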
Apparatus 300 is a cross-point memory array that includes n rows and m columns of nodes 30211, 30212, . . . , 302mn. Apparatus 300 will also be referred to herein as cross-point memory array 300. In an embodiment, each of nodes 30211, 30212, . . . , 302mn includes a corresponding non-volatile memory cell S11, S12, . . . , Smn, respectively. In other embodiments, cross-point memory array 300 may include more than one non-volatile memory cell per node.
Each row of nodes 30211, 30212, . . . , 302mn is coupled to one of n first conductive lines 304, also referred to herein as word lines WL1, WL2, . . . , WLn. For example, the row of nodes 30211, 30221, 30231, . . . , 302m1 is coupled to word line WL1, the row of nodes 30213, 30223, 30233, . . . , 302m3 is coupled to word line WL3, and so on.
In an embodiment, each column of nodes 30211, 30212, . . . , 302mn is coupled to one of m second conductive lines 306, also referred to herein as bit lines BL1, BL2, . . . , BLm. For example, the column of nodes 30211, 30212, 30213, . . . , 3021n is coupled to bit line BL1, the column of nodes 30221, 30222, 30223, . . . , 3022n is coupled to bit line BL2, and so on.
In an embodiment, each row of nodes 30211, 30212, . . . , 302mn is coupled to one of n third conductive lines 308, also referred to as programming lines PL1, PL2, . . . , PLn. For example, the row of nodes 30211, 30221, 30231, . . . , 302m1 is coupled to programming line PL1, the row of nodes 3021n, 3022n, 3023n, . . . , 302mn is coupled to programming line PLn, and so on. The programming lines PL1, PL2, . . . , PLn program a weight or state of each non-volatile memory cell S11, S12, . . . , Smn.
Each non-volatile memory cell S11, S12, . . . , Smn has a first terminal A11, A12, . . . , Amn, respectively, coupled to one of the n word lines WL1, WL2, . . . , WLn, a second terminal B11, B12, . . . , Bmn, respectively, coupled to one of the m bit lines BL1, BL2, . . . , BLm, and a third terminal C11, C12, . . . , Cmn, respectively, coupled to one of the n programming lines PL1, PL2, . . . , PLn. To simplify this discussion and to avoid overcrowding the diagram, access devices are not depicted in
For example, non-volatile memory cell S11 has a first terminal A11 coupled to word line WL1, a second terminal B11 coupled to bit line BL1, and a third terminal C11 coupled to programming line PL1. Likewise, non-volatile memory cell S32 has a first terminal A32 coupled to word line WL2, a second terminal B32 coupled to bit line BL3, and a third terminal coupled C32 to programming line PL2.
In the following figures, various non-volatile cell embodiments for implementing a multi-weight state SOT DNN approach are disclosed. Briefly, the free layers with domain walls or magnetic shape anisotropy in
The MTJ approach will be described first using FIGS. 3B1-3B2, and
In an embodiment, each non-volatile memory cell S11, S12, . . . , Smn is a spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) non-volatile memory cell, such as the example SOT MRAM non-volatile memory cell 310a depicted in FIG. 3B1. SOT MRAM non-volatile memory cell 310a includes a first terminal A, a second terminal B, a third terminal C, a MTJ 312a, and a Spin Hall Effect (SHE) layer 314.
MTJ 312a includes a reference (or pinned) layer (PL) 316a, a free layer (FL) 318a, and a tunnel barrier (TB) 320 positioned between pinned layer 316a and free layer 318a. Tunnel barrier 320 is an insulating layer, such as magnesium oxide (MgO) or other insulating material. Pinned layer 316a is a ferromagnetic layer with a fixed direction of magnetization. Free layer 318a is a ferromagnetic layer and has a direction of magnetization that can be switched. Further description of the free layer 318a with domain walls or magnetic shape anisotropy is provided below in
Pinned layer 316a is usually a synthetic antiferromagnetic layer which includes several magnetic and non-magnetic layers, but for the purpose of this illustration is depicted as a single layer 316a with a fixed direction of magnetization. Pinned layer 316a and free layer 318a each have a perpendicular direction of magnetization. Accordingly, SOT MRAM non-volatile memory cell 310a is also referred to herein as “perpendicular stack SOT MRAM non-volatile memory cell 310a.”
When the direction of magnetization of free layer 318a is parallel to the direction of magnetization of pinned layer 316a, the resistance of perpendicular stack SOT MRAM non-volatile memory cell 310a is relatively low. When the direction of magnetization of free layer 318a is anti-parallel to the direction of magnetization in pinned layer 316a, the resistance of perpendicular stack SOT MRAM non-volatile memory cell 310a is relatively high.
Thus, the resistance of perpendicular stack SOT MRAM non-volatile memory cell 310a may generally be used to store one bit of data. In an embodiment, SOT MRAM non-volatile memory cell 310a may be programmed to either a low resistance ON state or a high resistance OFF state. In an embodiment, the low resistance ON state may be used to represent a first weight value (e.g., “1”), and the high resistance OFF state may be used to represent a second weight value (e.g., “0”). The data (“0” or “1”) in SOT MRAM non-volatile memory cell 310a may be read by measuring the resistance of SOT MRAM non-volatile memory cell 310a. However, according to the domain wall or magnetic shape anisotropy based free layer approaches disclosed, discernable or discrete intermediate values between the binary “0” and “1” may be stored and read out, providing a mechanism for multi-weight state use in training and inference.
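One way to picture how more than two resistance levels could encode multi-state weights is sketched below; the four-level mapping and all numeric values are illustrative assumptions, not from the disclosure:

```python
# Sketch of a multi-state weight read-out: the measured cell resistance
# is quantized to the nearest of several programmed levels, each of which
# maps to a discrete weight value. All values below are assumed.

RESISTANCE_LEVELS = [10e3, 20e3, 40e3, 80e3]   # ohms, low to high (assumed)
WEIGHT_LEVELS = [1.0, 2 / 3, 1 / 3, 0.0]       # matching weights (assumed)

def weight_from_resistance(r_cell):
    # Pick the programmed level nearest the measured resistance and
    # return the weight it encodes.
    idx = min(range(len(RESISTANCE_LEVELS)),
              key=lambda k: abs(RESISTANCE_LEVELS[k] - r_cell))
    return WEIGHT_LEVELS[idx]
```

With only the first and last levels this reduces to the binary ON/OFF case; the intermediate levels are what the domain wall and shape anisotropy free layers add.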
FIG. 3B2 is a cross-sectional view of another SOT MRAM non-volatile memory cell 310b that may be included in each non-volatile memory cell S11, S12, . . . , Smn (
When the direction of magnetization of free layer 318b is parallel to the direction of magnetization of pinned layer 316b, the resistance of in-plane stack SOT MRAM non-volatile memory cell 310b is relatively low. When the direction of magnetization of free layer 318b is anti-parallel to the direction of magnetization in pinned layer 316b, the resistance of in-plane stack SOT MRAM non-volatile memory cell 310b is relatively high.
Thus, the resistance of in-plane stack SOT MRAM non-volatile memory cell 310b may be used to store one bit of data. In an embodiment, SOT MRAM non-volatile memory cell 310b may be programmed to either a low resistance ON state or a high resistance OFF state. In an embodiment, the low resistance ON state may be used to represent a first weight value (e.g., “1”), and the high resistance OFF state may be used to represent a second weight value (e.g., “0”). The data (“0” or “1”) in SOT MRAM non-volatile memory cell 310b may be read by measuring the resistance of SOT MRAM non-volatile memory cell 310b. However, according to the domain wall or magnetic shape anisotropy based free layer approaches disclosed herein, additional values beyond the binary “0” and “1” may be stored and read out, providing a mechanism for multi-weight state use in training and inference.
Referring again to FIG. 3B1, in an embodiment, SHE layer 314 comprises a heavy metal with strong spin orbit coupling and a large effective Spin Hall Angle. Examples of heavy metal materials include platinum, tungsten, tantalum, platinum gold (PtAu), and bismuth copper (BiCu). In other embodiments, SHE layer 314 comprises a topological insulator, such as bismuth antimony (BiSb), bismuth selenide (Bi2Se3), bismuth telluride (Bi2Te3) or antimony telluride (Sb2Te3). In particular embodiments, SHE layer 314 comprises BiSb with (012) orientation, which is a narrow gap topological insulator with both giant Spin Hall Effect and high electrical conductivity. In other embodiments, SHE layer 314 comprises a topological semi-metal (TSM) material, such as BiSb, YPtBi, FeSi, or CoSi.
The spin of an electron is an intrinsic angular momentum. In a solid, the spins of many electrons can act together to affect the magnetic and electronic properties of a material, for example endowing it with a permanent magnetic moment as in a ferromagnet. In many materials, electron spins are equally present in both up and down directions. However, various techniques can be used to generate a spin-polarized population of electrons, resulting in an excess of spin up or spin down electrons, to change the properties of a material. This spin-polarized population of electrons moving in a common direction through a common material is referred to as a spin current.
The Spin Hall Effect is a transport phenomenon that may be used to generate a spin current in a sample carrying an electric current. The spin current is in a direction perpendicular to the plane defined by the electrical current direction and the spin polarization direction. The spin polarization direction of such a SHE-generated spin current is in the in-plane direction orthogonal to the electrical current flow.
For example, an electrical current 322 through SHE layer 314 (from third terminal C to second terminal B) results in a spin current 324 being injected up into free layer 318a, and having a direction of polarization into the page. Spin current 324 injected into free layer 318a exerts a spin torque (or “kick”) on free layer 318a, which causes the direction of magnetization of free layer 318a to oscillate in the y-z plane or to be changed entirely. This can be leveraged for programming the weight state in the free layer during the training phase.
In an embodiment, during a “programming phase,” each SOT MRAM non-volatile memory cell S11, S12, . . . , Smn is programmed to store a corresponding weight of an n×m array of weights w11, w12, w13, . . . , wnm, respectively. For example, any suitable programming may be used to store weights w11, w12, w13, . . . , wnm in SOT MRAM non-volatile memory cell S11, S12, . . . , Smn, respectively. As described above, in an embodiment, each of weights w11, w12, w13, . . . , wnm can represent one of several states in a multi-state weight scheme.
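The programming phase described above can be sketched in software terms. The `Cell` class and the uniform quantization of weights onto discrete states are assumptions for illustration, not part of the disclosed hardware:

```python
# Sketch of the "programming phase": each cell S[i][j] stores a quantized
# version of weight w[i][j]. The Cell abstraction and the uniform
# quantization scheme are illustrative assumptions.
class Cell:
    def __init__(self):
        self.state = 0

    def program(self, level: int):
        self.state = level  # nonvolatile multi-state weight

def program_array(weights, n_levels=4):
    """Quantize each weight in [0, 1] to one of n_levels states and store it."""
    n, m = len(weights), len(weights[0])
    cells = [[Cell() for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            level = round(weights[i][j] * (n_levels - 1))
            cells[i][j].program(level)
    return cells
```

With `n_levels=2` this reduces to the binary ON/OFF scheme; larger `n_levels` corresponds to the multi-state weight schemes described above.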
After SOT MRAM non-volatile memory cells S11, S12, . . . , Smn have been programmed with weights w11, w12, w13, . . . , wnm, respectively, e.g., as part of training a neural network, cross-point memory array 300 may be used during an “inferencing phase” to perform the matrix-vector multiplication operation depicted in
In an embodiment using the MTJ-based cell of FIGS. 3B1-3B2, during the inferencing phase, third conductive lines 308 (programming lines PL1, PL2, . . . , PLn) are not used, and may be floated. In addition, for simplicity it will be assumed that bit line select voltages of 0 volts are applied to each of bit lines BL1, BL2, . . . , BLm to select those bit lines. In an embodiment, read/write/multiply circuit 124 is configured to apply bit line select voltages of 0 volts to bit lines BL1, BL2, . . . , BLm.
During the inferencing phase, each SOT MRAM non-volatile memory cell S11, S12, . . . , Smn conducts a memory cell current that corresponds to the result of multiplying one of the elements of the n-element input vector (multiply vector) by the corresponding weight stored in the non-volatile memory cell. For example, SOT MRAM non-volatile memory cell S11 conducts a memory cell current that corresponds to the product Vin1×w11, SOT MRAM non-volatile memory cell S12 conducts a memory cell current that corresponds to the product Vin2×w12, SOT MRAM non-volatile memory cell S23 conducts a memory cell current that corresponds to the product Vin3×w23, and so on.
During the inferencing phase, the memory cell currents in SOT MRAM non-volatile memory cells S11, S12, . . . , Smn flow to the bit line BL1, BL2, . . . , BLm connected to the memory cell. Bit lines BL1, BL2, . . . , BLm conduct bit line currents Iout1, Iout2, . . . , Ioutm, respectively. Each bit line current is the summation of the memory cell currents of the memory cells connected to that bit line. Thus, each bit line current Iout1, Iout2, . . . , Ioutm may be viewed as representing a sum of products of the multiply vector with corresponding weights in a column of the n×m array of weights:
The magnitudes of bit line currents Iout1, Iout2, . . . , Ioutm constitute elements of an m-element output vector, and correspond to the associated values of output neurons y1, y2, . . . , ym, respectively, and constitute the result of the matrix-vector multiplication operation depicted in
The magnitude of each individual bit line current Ik represents a vector-vector multiplication result. That is, the magnitude of bit line current Ik represents the result of multiplying the input vector Vin1, Vin2, . . . , Vinn by the k-th column vector of the n×m array of weights w11, w12, w13, . . . , wnm.
Collectively, bit line currents Iout1, Iout2, . . . , Ioutm represent a result of matrix-vector multiplication. In an embodiment, bit line currents Iout1, Iout2, . . . , Ioutm represent output neurons y1, y2, y3, . . . , ym, respectively, of artificial neural network 200 of
In an embodiment, a sense amplifier is used to compare the magnitude of each bit line current Iout1, Iout2, . . . , Ioutm to a reference current. The sense amplifier may output a signal (e.g., one bit of information) that indicates whether the magnitude of the bit line current is less than or greater than the reference current. In an embodiment, the magnitude of the bit line current may be input to an activation function in an artificial neural network. The activation function may take various forms (e.g., Rectified Linear Unit (ReLU)) and may involve operations on the bit line current other than comparing to a reference current. In some applications, the activation function outputs a “fire” or “don't fire” signal based on the magnitude of the summed signal.
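The inferencing behavior described above, per-cell currents summing on each bit line followed by an activation function, can be modeled numerically. This is a behavioral sketch only; treating each stored weight as a conductance and choosing ReLU as the activation are assumptions for illustration:

```python
# Sketch of the inferencing phase: each bit line current is the sum of the
# per-cell currents Vin_i * G_ij (the conductance G_ij encodes weight w_ij),
# followed by a ReLU activation. Units and scaling are assumed.
def crossbar_inference(v_in, conductance):
    """v_in: n input voltages; conductance: n x m array of cell conductances.
    Returns the m bit line outputs after a ReLU activation."""
    n, m = len(conductance), len(conductance[0])
    i_out = [sum(v_in[i] * conductance[i][j] for i in range(n))
             for j in range(m)]
    return [max(0.0, i) for i in i_out]   # ReLU as the activation function
```

Each element of the returned vector corresponds to one output neuron y1, . . . , ym; a sense-amplifier comparison to a reference current would instead threshold each sum to a single bit.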
As described above, to avoid overcrowding the diagram, access devices are not depicted in
In an embodiment, first access device T11a and second access device T11b are each MOS transistors, although other types of access device may be used. First access device T11a has a first drain/source terminal coupled to signal line S1, a second drain/source terminal coupled to first terminal A11 of SOT MRAM non-volatile memory cell S11, and a control (gate) terminal coupled to first conductive line 304 (word line WL1). Second access device T11b has a first drain/source terminal coupled to signal line S1, a second drain/source terminal coupled to third terminal C11 of SOT MRAM non-volatile memory cell S11, and a control (gate) terminal coupled to third conductive line 308 (programming line PL1). It is noted that
During inferencing, first conductive line 304 (word line WL1) is HIGH, third conductive line 308 (programming line PL1) is LOW, first access device T11a is ON, second access device T11b is OFF, and multiply voltage Vin1 is applied to signal line S1 while a bit line select voltage (e.g., 0V) is applied to second conductive line 306 (bit line BL1). As a result, multiply voltage Vin1 is applied across first terminal A11 and second terminal B11 of SOT MRAM non-volatile memory cell S11, and SOT MRAM non-volatile memory cell S11 conducts a memory cell current that corresponds to the product Vin1×w11.
Similar programming and inferencing techniques to those described above in connection with SOT MRAM non-volatile memory cell S11 of
While FIGS. 3B1-3C above disclose various MTJ based SOT cell embodiments for leveraging the magneto-resistive property of the MTJ for reading out (i.e., multiplying) the weight states,
The SOT cell 400 of
During operation, to set a weight, a current is applied to the I+/I− current line (across terminals B and C) such that current flows through the first electrode 406a, to the SOT layer 404, to the second electrode 406b in the x-direction. Due to the spin Hall effect, the magnetic state of the SOT cell 400 is set by an induced spin current flowing in the y-direction. In this manner, the weight (i.e., a preset resistance) is set in the FM layer 408. The exact weight/state is decided during the training process. Once the weight is preset, the preset current can be removed. Due to the nonvolatile nature of SOT cells, the weight is maintained without power (i.e., current).
To read the state of the FM layer 408 during inference, in the inverse spin Hall effect case, the Vin current line is supplied with current from the supply voltage (Vdd) through the transistor 420 (via terminal A). A read output (Vout1) current is read through the I+/I− current line to read the state of the FM layer 408 based on the inverse spin Hall effect. The Vout1 values at terminal B for the respective cells are, as shown in
Alternatively, the direct spin Hall effect may be used, in which case, the read output is through Vout2. In this case, an input voltage would be provided at terminal C (instead of A), which in the context of
The SOT cell 425 of
The first and second FM layers 408, 412 may comprise different materials. For example, the first FM layer 408 may comprise a high coercivity (Hk) material and the second FM layer 412 may comprise a low coercivity material, or vice versa. Examples of low Hk materials include CoFe, NiFe, CoFeB, CoB, CoHf, and combinations thereof. Examples of high Hk materials include Pt-containing alloys, such as CoFePt, CoPt, and CoPtCrB. By adjusting the concentration of Pt, the material properties of the FM layers 408, 412 can be effectively tuned, resulting in a wide range of Hk values. The SOT cell 425 has four possible states: 1) the first FM layer 408 being 0 and the second FM layer 412 being 1; 2) the first FM layer 408 being 1 and the second FM layer 412 being 0; 3) the first and second FM layers 408, 412 both being 0; or 4) the first and second FM layers 408, 412 both being 1. By doing so, multi-state weights can be achieved instead of binary weights.
The first and second FM layers 408, 412 may have different states, and can be individually programmed via the I+/I− current line. The FM layer 408 or 412 comprising the high coercivity material requires a greater amount of current to set its state. In one embodiment, the FM layer 408 or 412 comprising the high coercivity material is programmed first, and the state of the FM layer 408 or 412 comprising the low coercivity material is programmed second with a smaller current than that needed for the high coercivity material, such that the state of the higher coercivity layer is not disturbed.
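The two-step programming order can be sketched behaviorally as follows. The threshold and drive current values are hypothetical, chosen only to show that the second, smaller pulse leaves the high-coercivity layer undisturbed:

```python
# Sketch of the coercivity-ordered programming sequence: the high-coercivity
# (hard) layer is set first with a large current; the low-coercivity (soft)
# layer is then set with a smaller current that cannot switch the hard layer.
# All current values are hypothetical illustration values.
class FMLayer:
    def __init__(self, threshold_current):
        self.threshold = threshold_current  # minimum |current| that switches it
        self.state = 0

    def apply(self, current):
        if abs(current) >= self.threshold:
            self.state = 1 if current > 0 else 0

def program_two_layers(high_state, low_state):
    high_hk = FMLayer(threshold_current=10.0)  # hard layer (assumed units)
    low_hk = FMLayer(threshold_current=3.0)    # soft layer
    # Step 1: large current sets the hard layer (and incidentally the soft one).
    i1 = 12.0 if high_state else -12.0
    for layer in (high_hk, low_hk):
        layer.apply(i1)
    # Step 2: small current re-targets only the soft layer.
    i2 = 5.0 if low_state else -5.0
    low_hk.apply(i2)
    high_hk.apply(i2)  # below the hard-layer threshold: no effect
    return high_hk.state, low_hk.state
```

Any of the four state combinations can be reached this way, which is what makes the four-state weight encoding of the SOT cell 425 addressable.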
The state of the FM layers 408, 412 may be read out either at Vout1 (terminal B), by utilizing the inverse spin Hall effect, or at Vout2 (terminal B′), by utilizing the direct spin Hall effect. When the direct spin Hall effect is used, an input voltage would be provided at terminal C (instead of A), which in the context of
With the two FM layers present, the output signal would be a composite, representative of the magnetization states of the two FM layers. For example, if both layers are at a “high” state direction, the composite output signal would be “high.” Similarly, the output signal would be “low” if both FM layers are at a “low” state magnetization direction. A third state between “high” and “low” would represent the scenario when one FM layer has an opposite magnetization relative to that of the other FM layer. So at least three states can be encoded with the use of two FM layers. The same logic would apply to any arrangement with additional FM layer(s), including the example embodiment of
The SOT cell 450 of
The first, second, and third FM layers 408, 412, 416 comprise different materials or materials having different coercivities. The SOT cell 450 has eight possible states, where each binary digit of the following numbers represents the state of one of the FM layers 408, 412, 416, respectively: 1) 000; 2) 100; 3) 110; 4) 111; 5) 010; 6) 001; 7) 101; and 8) 011. As noted above, the read output signal is a composite signal representing the magnetization of the FM layers, and provides discrete levels to distinguish among these eight states. The states of the FM layers 408, 412, 416 are read out either at Vout1 (terminal B), by utilizing the inverse spin Hall effect, or at Vout2 (terminal B′), by utilizing the direct spin Hall effect, as described above. Such multi-state FM layers enable multi-state weights compared to binary weights (BNN).
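One way to see how a composite read signal can distinguish all eight states is to assume each FM layer contributes a distinct amplitude to the output. The binary-weighted amplitudes below are an assumption for illustration; the disclosure does not specify per-layer amplitudes:

```python
# Sketch of the composite read output for stacked FM layers: modeling the
# output as a weighted sum of per-layer states. With distinct (here,
# binary-weighted) amplitudes -- an illustrative assumption -- each of the
# 2**k layer-state combinations maps to a unique discrete output level.
def composite_output(layer_states, amplitudes=(4, 2, 1)):
    """layer_states: list of 0/1 magnetization states, one per FM layer.
    Returns the composite output level."""
    return sum(s * a for s, a in zip(layer_states, amplitudes))
```

With equal amplitudes instead, states such as 100, 010, and 001 would collapse onto one intermediate level, giving the three-level behavior described for the two-layer cell.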
The SOT layers 404 and 414 may comprise a topological insulator (TI) material or a topological semi-metal (TSM) material, such as BiSb, YPtBi, FeSi, or CoSi. Owing to their giant spin Hall angle and spin-momentum-locked surface states, TI and TSM materials are promising spin current sources. Hence, TI and TSM materials have the potential for ultra-low-power consumption when utilized in SOT cells. The SOT layers 404 and 414 each have a thickness in the y-direction of about 10 nm to about 20 nm, and a length in the x-direction of about 10 nm to about 1 μm.
The electrodes 406a, 406b may comprise Cu, Al, AlN, or regular metals, have a thickness in the y-direction of about 200 nm to about 400 nm, and a length in the x-direction of about 10 nm to about 1 μm. The FM layers 408, 412, 416 may comprise CoFe, NiFe, CoFeB, CoB, CoHf, CoFePt, CoPt, CoPtCrB, or a combination thereof, have a thickness in the y-direction of about 5 nm to about 20 nm, and a length in the x-direction of about 10 nm to about 1 μm. The cap layer may comprise Spinel, HfN, NiFeGe, MgO, Ru, Ta, W, or a combination thereof, have a thickness in the y-direction of about 4 nm to about 10 nm, and a length in the x-direction of about 10 nm to about 1 μm.
The FM layer 500 of
Varying amounts of current applied in the x-direction are used to change the state of each of domains 500a-500d. For example, a first current with a large magnitude may be applied to the FM layer 500 to reset the state of each of domains 500a-500d to 0. A second, smaller current may then be applied to the FM layer 500 in the opposite direction to change the first domain 500a to 1 and drive the domain wall movement. The time duration of the second current is set such that the state of one or more domains 500a-500d is changed. Thus, the longer the dwell time of the second current, the more domains 500a-500d are changed. In an SOT cell having multiple FM layers, such as the SOT cells 425 and 450 of
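The reset-then-pulse programming of the multi-domain FM layer can be sketched as follows. The rate of one domain flipped per unit of dwell time is an illustrative assumption; the actual domain wall velocity would depend on the material and current density:

```python
# Sketch of dwell-time programming for a multi-domain FM layer: a large reset
# current clears all domains to 0, then a smaller reverse-direction pulse
# flips domains in order as the domain wall advances. The assumed rate is
# one domain per unit of dwell time (illustration only).
def program_domains(n_domains, dwell_time):
    """Return the domain states after a reset followed by a reverse pulse
    of the given dwell time."""
    domains = [0] * n_domains                 # reset: all domains to 0
    flipped = min(dwell_time, n_domains)      # wall advances with dwell time
    for i in range(flipped):
        domains[i] = 1                        # domains flip in wall order
    return domains
```

A layer with four domains programmed this way holds one of five distinguishable states (zero to four domains flipped), which is the multi-state weight mechanism described above.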
The FM layer 525 of
The FM layer 550 of
The varying directions and magnitudes of the current (I) applied in the xz-plane inside the SOT layer 404, which is disposed below the FM layer 550, are used to control/set the magnetic orientation of the FM layer 550 along one of the axes for the corners 552a-552d. The current is applied directionally using two current sources: Iz+/Iz− and Ix+/Ix−. The associated terminals are marked with the corresponding terminal notations in
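The two-axis current steering can be sketched as a simple selection function. The mapping of sign pairs to the specific corners 552a-552d is a hypothetical assignment for illustration; the disclosure does not fix which sign combination corresponds to which corner:

```python
# Sketch of orientation selection for the multi-elliptical FM layer: the sign
# combination of the two in-plane drive currents (Ix, Iz) selects one of four
# corner orientations. The specific sign-to-corner mapping below is assumed,
# not specified by the disclosure.
def select_corner(i_x, i_z):
    """Map the signs of the two drive currents to one of four corner states."""
    corner_map = {
        (True, True): "552a",
        (False, True): "552b",
        (False, False): "552c",
        (True, False): "552d",
    }
    return corner_map[(i_x >= 0, i_z >= 0)]
```

Because each corner encodes a distinct magnetic state, the two current sources together address four weight states in a single FM layer.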
The FM layer 575 of
Therefore, utilizing a plurality of SOT cells in an artificial neural network or DNN enables the neural network to have ultra-low-power consumption. Furthermore, utilizing two or more FM layers, FM layers having multiple domains, or FM layers having a multi-elliptical shape within each SOT cell enables encoding of weight states beyond binary states, thus improving the accuracy of the neural network.
In one embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a SOT layer, a ferromagnetic (FM) layer disposed on the SOT layer, the FM layer being configured with a plurality of domain walls, electrodes disposed adjacent to the SOT layer, the electrodes being spaced from the FM layer, and a first current line configured to apply current through a first electrode of the electrodes, to the SOT layer, to a second electrode of the electrodes, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
The SOT layer comprises BiSb, YPtBi, FeSi, or CoSi. The FM layer has a plurality of cutting notches, so that at least one cutting notch separates a domain from another adjacent domain. The FM layer comprises a first surface and a second surface, the first and second surfaces being zigzagged. The DNN device further comprises a tunnel barrier layer on the FM layer, a second FM layer on the tunnel barrier layer, wherein the FM layer, the tunnel barrier layer and the second FM layer form a magnetic tunnel junction and a magnetic state of the FM layer is read via a magneto-resistive effect. The DNN device further comprises a transistor coupled to the SOT cell for applying an input voltage flowing perpendicular to a plane of the SOT layer and FM layer, wherein a magnetic state of the FM layer is read via the inverse spin Hall effect. A magnetic state of the FM layer is read via the direct spin Hall effect, where an input current is driven between the first and second electrodes in a plane of the SOT layer. The controller is further configured to use a magnetic state of each of a plurality of domains of the FM layer to encode a multi-state weight value having three or more states.
In another embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a first ferromagnetic (FM) layer, a SOT layer disposed on the first FM layer, a second FM layer disposed on the SOT layer, and electrodes disposed adjacent to the SOT layer, the electrodes being spaced from the first and second FM layers, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
At least one of the first FM layer or the second FM layer has a multi-elliptical shape creating two or more states, wherein each state of the two or more states is configured to have a weight value. At least one of the first FM layer or the second FM layer has a plurality of cutting notches, so that at least one cutting notch separates a domain from another adjacent domain. At least one of the first FM layer or the second FM layer comprises a first surface and a second surface, the first and second surfaces being zigzagged. The first FM layer and the second FM layer comprise different materials. The first FM layer comprises CoFe, NiFe, CoFeB, CoB, CoHf, or a combination thereof, and the second FM layer comprises CoFePt, CoPt, CoPtCrB, or a combination thereof. The first FM layer has a higher coercivity than the second FM layer. The controller is further configured to use a magnetic state of the first FM layer and a magnetic state of the second FM layer to encode a multi-state weight value having three or more states.
In yet another embodiment, a deep neural network (DNN) device, the DNN device comprising an array comprising n rows and m columns of nodes, each row of nodes coupled to one of n first conductive lines, each column of nodes coupled to one of m second conductive lines, each node of the n rows and m columns of nodes comprising a plurality of spin orbit torque (SOT) cells, each SOT cell comprising: a first SOT layer, a first ferromagnetic (FM) layer disposed on the first SOT layer, the first FM layer having a multi-elliptical shape creating two or more magnetic states at two or more corners or two or more arms formed by the multi-elliptical shape, and a current input configured to apply current through: a first current path across the first SOT layer and a second current path across the first SOT layer, and a controller configured to store at least one corresponding weight of an n×m array of weights of a neural network in each of the SOT cells.
The DNN device further comprises: a second FM layer, a second SOT layer, and a third FM layer. The first, second, and third FM layers each comprise a different material. The first, second, and third FM layers each comprise a material selected from the group consisting of CoFe, NiFe, CoFeB, CoB, CoHf, CoFePt, CoPt, CoPtCrB, and combinations thereof. One or more of the second FM layer and the third FM layer has a multi-elliptical shape creating two or more magnetic states at two or more corners or two or more arms formed by the multi-elliptical shape. The first and second current paths are perpendicular to each other.
While the foregoing is directed to embodiments of the present disclosure, other and further embodiments of the disclosure may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
This application is a continuation-in-part of co-pending U.S. patent application Ser. No. 17/172,155, filed Feb. 10, 2021, which is herein incorporated by reference.
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 17172155 | Feb 2021 | US |
| Child | 18954415 | | US |