Embodiments generally relate to systems and devices for analog computation, machine learning and artificial intelligence, and, more particularly, to systems and devices directed to utilizing one or more magnetic tunnel junctions (MTJ) and to multiplications of non-binary matrices.
Artificial intelligence (AI) has a wide range of applications in modern life (smart cities, smart appliances, self-driving vehicles, information processing, speech recognition, patient monitoring, etc.). AI can leverage machine learning to perform two primary functions: training and inference. Machine learning in the context of neural networks is generally referred to as “deep learning.” Algorithms for these tasks require multiplication of large matrices, such as in updating the synaptic weight matrices in deep learning networks, which is an essential feature of training a neuronal circuit.
There are current devices, referred to as “hardware accelerators,” directed to performing matrix multiplications. However, these have shortcomings. One is data volatility, i.e., when the hardware accelerator is powered down, all computational results within the accelerator are lost. Another is high power consumption. Another is large footprint. Optical matrix multipliers have low energy consumption but a large footprint, whereas electronic matrix multipliers have a small footprint but high energy consumption.
Accordingly, what is needed is a non-volatile, low power consumption, small footprint hardware accelerator for matrix multiplication. Magnetic matrix multipliers can satisfy this need; however, magnetic multipliers that are currently extant are binary in nature, i.e., they can multiply only matrices whose elements are either 0 or 1 and cannot multiply non-binary matrices whose elements take other values. This is inadequate for most artificial intelligence and machine learning tasks. The current invention is a non-binary magnetic matrix multiplier that can multiply any two matrices, not just binary ones, which makes it much more powerful than currently existing magnetic matrix multipliers.
A matrix multiplier has two components: a multiplier and an accumulator. In an embodiment a non-volatile MTJ based accumulator is provided, which can include a spin torque generating conducting body and, non-conductively secured to or against an external surface of the spin torque generating conducting body, a hard-layer-soft layer MTJ, where the soft layer acts as a domain wall synapse.
An embodiment of a non-volatile multiplier is also provided which includes a straintronic MTJ whose conductance can be changed by straining the soft layer of the MTJ with a gate voltage applied to a piezoelectric substrate upon which the MTJ is fabricated. A magnetic field may be applied in the plane of the soft layer to ensure that the MTJ conductance versus gate voltage characteristic has a linear region. The MTJ is biased in that linear region with a dc voltage source to make it act as a multiplier. The multiplier and the multiplicand are encoded in the gate voltage and another voltage applied across the MTJ. The current that flows through the MTJ (and any resistor connected in series with it) is proportional to the product of the multiplier and the multiplicand. This current is passed through the spin torque generating conducting body of the accumulator to combine the multiplier with the accumulator and implement the matrix multiplier.
This Summary identifies example features and aspects and is not an exclusive or exhaustive description of disclosed subject matter. Whether features or aspects are included in or omitted from this Summary is not intended as indicative of relative importance of such features or aspects. Additional features are described, explicitly and implicitly, as will be understood by persons of skill in the pertinent arts upon reading the following detailed description and viewing the drawings, which form a part thereof.
Example embodiments include a straintronic MTJ based non-volatile multiplier-non-volatile MTJ-based accumulator that can provide, among other features and advantages, reliable non-volatile non-binary multiplication of matrices and other sum-of-products processing at low energy and low hardware footprint cost. Example applications can include, but are not limited to, artificial intelligence (AI) and other applications that can require high speed multiplication of row-column matrices.
According to various embodiments, a non-volatile multiplier-accumulator can include a straintronic MTJ based multiplier that is spin orbit torque (SOT) coupled to a magnetic tunnel junction (MTJ) based accumulator. The multiplier and multiplicand are encoded in voltage pulses provided as inputs to the multiplier, and the output of the multiplier is a current pulse whose magnitude is proportional to the product of the multiplier and multiplicand. In an embodiment, the accumulator can comprise a heavy metal (HM) strip and can be configured to receive the output current pulses from the multiplier and, in response, effectuate generation by the HM strip of a spin orbit torque (SOT) pulse that can move the domain wall in the soft layer of the accumulator MTJ by a distance proportional to the current injected into it and hence proportional to the product of the multiplier voltage and the multiplicand voltage. Successive current pulses, each proportional to the product of an element along the row of the first matrix and an element along the column of the second matrix, move the domain wall by an amount proportional to the product, and these domain wall displacements add up (or accumulate) to produce a total displacement that is proportional to the sum of the products of the row elements and column elements, thereby producing one element of the product matrix. According to various embodiments, the MTJ based accumulator can comprise an accumulator MTJ that includes a hard ferromagnetic (FM) layer and an FM soft layer, non-conductively supported on the HM strip in an arrangement that produces, corresponding to the SOT pulse, a SOT coupling between the HM strip and the MTJ's FM soft layer.
In the various embodiments, the arrangement of the FM soft layer and configuration of the HM strip can combine such that, within a defined range of multiplier voltages and multiplicand voltages, the corresponding SOT coupling with the FM soft layer deterministically effects a change in the non-volatile magnetization state proportional to the SOT pulse. The change is therefore proportional to the product of the multiplier voltage and the multiplicand voltage.
According to various embodiments, the multiplier providing this SOT generation function can include features that, in combination, effectuate flow of a multiplication result current pulse through the HM strip that is both proportional to the product of the multiplier voltage and the multiplicand voltage, and has a current density through the HM strip within a density range in which the HM strip produces a SOT coupling with the FM soft layer that obtains a deterministic change in the magnetization that is proportional to the SOT coupling, i.e., proportional to the multiplication product. According to various embodiments, the current path can include a voltage controlled conductance which can be controlled by the multiplicand voltage pulse, e.g., via a multiplicand terminal. The above configuration can effectuate, through the voltage controlled current path and its HM strip, in response to concurrent reception at the multiplier terminal and the multiplicand terminal of, respectively, the multiplier voltage pulse and the multiplicand voltage pulse, a current through the HM strip proportional to the product of the multiplier voltage and the multiplicand voltage.
According to various embodiments, input-output functionality of the multiplier 102 can include effectuating, responsive to concurrently receiving a first operand voltage pulse Vin1 at the first operand input terminal 102A and a second operand voltage pulse Vin2 at the second operand input terminal, a pulse of SOT coupling between the HM conductor body 104 and the FM soft layer of the non-volatile SOT coupled MTJ accumulator 106. In further accordance with various embodiments, such functionality of the multiplier 102 can also include effectuating the above described pulse of SOT coupling to be, conjunctively: i) proportional in magnitude to a multiplication product of the second operand voltage pulse Vin2 and the first operand voltage pulse Vin1, and ii) generated with temporal-spatial characteristics, including magnitude of the coupling at the FM soft layer, that deterministically produce a corresponding change in the magnetization of the FM soft layer.
According to various embodiments, features of the multiplier 102, the HM conductor body 104, and the FM soft layer of the non-volatile SOT coupled MTJ accumulator 106, in providing the above-described multiplication product dependent changes in FM soft layer magnetization, can include instantiation, in the FM soft layer when in an initialized fully parallel magnetic anisotropic state, of an anti-parallel domain having an area deterministically proportional to the magnitude of the SOT coupling. In such embodiments, features can also include subsequent successive enlargements of the instantiated anti-parallel domain, each enlargement being proportional to a magnitude of a corresponding SOT coupling, i.e., proportional to a multiplication product of a column element of a given row in a first matrix A by a row element of a corresponding column in a second matrix B. In one or more embodiments, instantiation of the anti-parallel domain can include establishing a domain wall between said domain and the remaining area of the FM soft layer. In such embodiments, the instantiation effectively creates an anti-parallel domain and a parallel domain, separated by a domain wall.
Regarding examples of first operand voltage pulses Vin1 and second operand voltage pulses Vin2, in a matrix multiplication process, e.g., multiplying a first matrix A by a second matrix B, first operand voltage pulses Vin1 and second operand voltage pulses Vin2 may be provided to the NVM, SOT coupled MTJ based MXP 100 as a series of blocks. Continuing with the example, each of the blocks can correspond to a particular row of the matrix A and a particular column of the matrix B, and in such an example, each of the blocks can comprise a sequence of integer R operand pairs, each pair having another column element of the particular row of matrix A and another row element of the particular column of matrix B. By operation described in more detail in subsequent paragraphs, the multiplier 102 can perform, responsive to each of the integer R operand pairs, a multiplication using the first element in the pair as a multiplier and the second element as a multiplicand.
According to various embodiments, the NVM, SOT coupled MTJ based MXP 100 can further include an initialization/reset logic block 110 that can be configured to selectively reset, e.g., to a fully parallel magnetization state, the FM soft layer of the non-volatile SOT coupled MTJ accumulator 106. In an illustrative example of multiplying two K×K row-column matrices, a process can include K² repeats of feeding K operand pairs, each being another column element from a row of the first matrix and a corresponding row element from a column of the second matrix. Responsive to each of the K operand pairs, the magnetization state of the FM soft layer of the non-volatile SOT coupled MTJ is changed by an amount proportional to the multiplication product. After the K multiplications, the conductance of the FM soft layer is detected, which indicates the sum of the multiplication products and, hence, the value of another element of the product matrix. The initialization/reset logic block 110 then re-initializes or resets the magnetization state of the FM soft layer, e.g., to a fully parallel state, aligned with the FM hard layer. The process then repeats, using another row of the first matrix or another column of the second matrix, or both.
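The K² repeats of K multiply-and-accumulate steps described above can be sketched in software. The following is a minimal idealized model, not the disclosed hardware: the function names and the `gain` parameter (wall displacement per unit product) are hypothetical, the accumulator state is a single number standing in for the domain wall displacement, and the readout is assumed to be perfectly linear.

```python
def accumulate_element(row, col, gain=1.0):
    """Return one product-matrix element from K multiply-and-accumulate steps.

    `gain` is a hypothetical constant: wall displacement per unit product.
    """
    displacement = 0.0  # reset: models the re-initialized soft layer
    for a, b in zip(row, col):
        # Each operand pair shifts the wall by an amount proportional to a*b.
        displacement += gain * a * b
    # Idealized readout: convert accumulated displacement back to matrix units.
    return displacement / gain


def matmul_by_mac(A, B):
    """Multiply two K x K matrices (lists of lists) with K*K accumulate-read-reset cycles."""
    K = len(A)
    C = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            column = [B[r][j] for r in range(K)]  # j-th column of B
            C[i][j] = accumulate_element(A[i], column)
    return C


if __name__ == "__main__":
    print(matmul_by_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The inner function corresponds to one accumulate-then-read cycle followed by a reset; the two outer loops correspond to the K² repeats over row and column selections.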
As described above, after performing the K multiplications, e.g., a row by a column, a resource can read the resulting conductance of the non-volatile SOT coupled MTJ accumulator 106. Implementation can comprise a Detect Product Matrix Elements Ci,j block 112 to perform this function.
The FM soft layer 202S is shown in an example magnetization state comprising a p-domain, an anti-p domain, and a domain wall, each respectively labeled by cross-hatching according to the cross-hatching legend on the figure. As described above in reference to the
Each of the current pulses Iout-k passes through the HM strip configured SOT coupler 304 and, because of spin-orbit interactions in the heavy metal material of the structure 304, a spin orbit torque (SOT) is generated. The SOT can extend out from the surface of the HM strip configured SOT coupler 304 supporting the MTJ accumulator 302, through the thin insulating layer 306, through the metal layer 308 and into the FM soft layer 302. The SOT then injects spins into the FM soft layer 302, causing domain wall motion due to the spin Hall effect. . . . The velocity of the domain wall movement can be proportional to the current density through the HM strip configured SOT coupler 304, i.e., the magnitude of the current pulse Iout-k divided by the cross-sectional area of the body 304. As described above, in accordance with various embodiments, the duration of each current pulse Iout-k is fixed. Therefore, the distance of domain wall movement effectuated by each Iout-k can be proportional to the amplitude of the Iout-k pulse.
After a number of Iout-k pulses, a fraction of the soft layer will have magnetization parallel to that of the hard layer, a small fraction will be un-magnetized and will serve as a boundary or “domain wall” between the parallel magnetized portion and the remainder, and the remainder will have magnetization anti-parallel to that of the FM hard layer, which is not explicitly visible in
Conductance of the p-MTJ, measured between the FM hard layer and the FM soft layer is a combination of three parallel conductances, one being the conductance of the parallel domain region, another being the conductance of the anti-parallel domain, and the third being the conductance of the domain wall DW.
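The three-parallel-conductance picture can be illustrated with a simple model. This sketch is an assumption made for illustration and is not the source's Equation (1), which is not reproduced in this text: each region is assumed to contribute a conductance proportional to the fraction of the soft layer length it occupies, and all names (`length`, `wall_width`, `g_p`, `g_ap`, `g_dw`) are hypothetical.

```python
def mtj_conductance(x, length, wall_width, g_p, g_ap, g_dw):
    """Total conductance for a wall at position x (assumed length-weighted model).

    x          : length of the parallel domain (same units as `length`)
    g_p, g_ap  : conductance of a fully parallel / fully anti-parallel layer
    g_dw       : conductance the layer would have if it were all domain wall
    """
    if not 0.0 <= x <= length - wall_width:
        raise ValueError("domain wall position out of range")
    frac_p = x / length
    frac_dw = wall_width / length
    frac_ap = (length - wall_width - x) / length
    # Parallel conductances add.
    return g_p * frac_p + g_dw * frac_dw + g_ap * frac_ap


def wall_position(g, length, wall_width, g_p, g_ap, g_dw):
    """Invert the model: recover the wall position from a measured conductance."""
    g_at_zero = mtj_conductance(0.0, length, wall_width, g_p, g_ap, g_dw)
    slope = (g_p - g_ap) / length  # conductance change per unit wall displacement
    return (g - g_at_zero) / slope
```

Under this assumed model the conductance is linear in the wall position, so a conductance reading maps directly to the accumulated sum of products.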
The Gp-MTJ (x) conductance can be represented by the following Equation (1)
where
The MTJ straintronic configured non-volatile NMG, SOT coupled MTJ matrix multiplier 600 according to one or more embodiments can include an elliptical MTJ 602 having an elliptical hard layer 602H and elliptical soft layer 602S, separated by an intervening insulating spacer layer. It will be understood that as used herein, in this context, “hard” and “soft” mean hard magnetically and soft magnetically. The elliptical soft layer 602S can be magnetostrictive and placed in an elastic contact with an underlying poled piezoelectric thin film 604 that can be deposited on a conducting substrate. Such construction can constitute a 2-phase multiferroic. Two electrically shorted electrodes, 606A and 606B, (collectively “electrically shorted electrode pair 606A-606B”) on the piezoelectric thin film 604 can be arranged to flank the elliptical MTJ 602, and the back of the substrate can be connected to ground.
In an operation, a (gate) voltage VG applied to the electrically shorted electrode pair 606A-606B can generate biaxial strain in the piezoelectric thin film 604. The biaxial strain can transfer to the elliptical soft layer 602S. The strain can be either compressive along the major axis and tensile along the minor axis, or vice versa, depending on the voltage polarity. These biaxial strains can rotate the elliptical soft layer 602S magnetization by an angle via the Villari effect, while the elliptical hard layer 602H magnetization can remain unaffected. The resistance of the elliptical MTJ 602 depends on the angle between the magnetizations of the hard layer 602H and soft layer 602S. Therefore, the biaxial strain induced by the gate voltage VG changes the elliptical MTJ 602 resistance.
In an implementation of functionality of the multiplier, a constant current source Ibias is connected between the hard and soft layers (terminals ‘1’ and ‘2’), as shown in
In practice, the respective ranges of the multiplier voltage and the multiplicand voltage at the straintronic MTJ can be determined by modeling the rotation θSS of the soft layer's magnetization as a function of the gate voltage VG in the presence of thermal noise. Such modeling can use, for example, stochastic Landau-Lifshitz-Gilbert simulations. The resistance of the straintronic MTJ can be expressed according to Equation (2)
where,
RAP is the MTJ resistance when the magnetizations are antiparallel.
From the θSS versus VG relation, the RMTJ versus VG characteristic can be calculated. With a proper choice of MTJ parameters, a region can be found in which
i.e., the transfer characteristic V0 versus VG is roughly hyperbolic. When VG is chosen in such a region, one can perform an analog multiplication of two voltages Vin1 and Vin2 with a single s-MTJ by using a (variable) voltage source δ.
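To see why a hyperbolic characteristic yields multiplication, consider a minimal sketch. It assumes, purely for illustration, that within the operating region the MTJ resistance is exactly c / VG (equivalently, that the conductance is linear in VG), that the first operand sets the gate voltage and that the second operand is applied across the MTJ. The constant `c`, the region limits `vg_min` and `vg_max`, and the function names are hypothetical values and names, not parameters from the source.

```python
def mtj_resistance(v_gate, c, vg_min, vg_max):
    """Assumed hyperbolic characteristic R = c / VG, valid only inside [vg_min, vg_max]."""
    if not vg_min <= v_gate <= vg_max:
        raise ValueError("gate voltage outside the assumed hyperbolic region")
    return c / v_gate


def output_current(v_in1, v_in2, c=1.0e3, vg_min=0.05, vg_max=0.5):
    """Current through the MTJ when v_in1 sets the gate and v_in2 is applied across it.

    With R = c / v_in1, Ohm's law gives I = v_in2 / R = v_in1 * v_in2 / c,
    i.e., a current proportional to the product of the two operand voltages.
    """
    r = mtj_resistance(v_in1, c, vg_min, vg_max)
    return v_in2 / r
```

In this idealization, doubling either operand voltage doubles the output current; outside the assumed region the proportionality is not expected to hold, which is why the sketch rejects such inputs.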
The energy dissipated in the HM strip 202 per accumulation operation can be estimated as $E = I^2 R\,\Delta t$, where I is the current inducing the domain wall motion, R is the resistance of the HM strip 202 and Δt is the pulse width. Since the example HM strip 202 is assumed to have a width of 500 nm and a thickness of 5 nm, the example cross-sectional area is 2500 nm². A non-limiting example current density through the HM strip 202 that, subject to an appropriate configuration of the insulating layer 208 and metal conducting layer 210, can induce domain wall motion in the MTJ FM soft layer 206S can be on the order of 10¹¹ A/m².
In this example, the current passing through the 2500 nm² cross-sectional area of the HM strip 202 can be approximately 250 μA. The resistivity of Pt, the assumed material for the HM strip 202, is 10⁻⁷ ohm-m. The resistance R of this example HM strip 202, having the example cross-sectional area of 2500 nm² and length LT of 1 μm, will therefore be R = 40 ohms. Assuming, for purposes of example, a pulse width Δt of 1 ns, the energy dissipation per accumulation operation would be 2.5 fJ. Assuming a straintronic implementation of the voltage controlled conductance 106 portion of the multiplier 102, which can consume energy per multiplication that can be substantially smaller than 2.5 fJ, a total energy dissipated per multiply-and-accumulate (MAC) operation can be approximately 2.5 fJ. This is an illustration of the feature of small energy cost that can be provided, together with small footprint and non-volatility, by multiplier-accumulator devices and methods according to disclosed embodiments.
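The arithmetic in this example can be checked with a few lines of code. The sketch below uses only the example values stated in the text (500 nm by 5 nm cross-section, 1 μm length, resistivity of 10⁻⁷ ohm-m, current density of 10¹¹ A/m², 1 ns pulse) and assumes simple Joule heating in the strip, E = I²RΔt; the function and variable names are illustrative.

```python
def hm_strip_energy(width=500e-9, thickness=5e-9, length=1e-6,
                    resistivity=1e-7, current_density=1e11, pulse_width=1e-9):
    """Return (area, current, resistance, energy) in SI units for the example strip."""
    area = width * thickness                        # m^2, about 2.5e-15 (2500 nm^2)
    current = current_density * area                # A, about 2.5e-4 (250 uA)
    resistance = resistivity * length / area        # ohm, about 40
    energy = current ** 2 * resistance * pulse_width  # J, about 2.5e-15 (2.5 fJ)
    return area, current, resistance, energy


if __name__ == "__main__":
    area, current, resistance, energy = hm_strip_energy()
    print(f"area = {area:.3e} m^2, I = {current:.3e} A, "
          f"R = {resistance:.1f} ohm, E = {energy:.3e} J")
```

Because the energy scales with the square of the current, halving the current density in this model would reduce the per-pulse dissipation by a factor of four, at the cost of a shorter wall displacement per pulse.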
The amplitudes of the operand voltage pulses Vin1 and Vin2 are proportional to the two matrix elements a and b that are to be multiplied. The operand voltage pulses Vin1 and Vin2 can have a fixed width, Δt. The i-th current Iout will be represented as (Iout)i and has an amplitude that is proportional to a multiplication of value ai by value bi, with ai encoded in the amplitude of the i-th pulse of Vin1 and bi encoded in the amplitude of the i-th pulse of Vin2.
The i-th current therefore has an amplitude $(I_{out})_i \propto a_i b_i$.
The i-th current pulse can move the domain wall by an amount Δxi, in accordance with Equation (6): $\Delta x_i = v_i\,\Delta t$
where vi is the domain wall velocity imparted by the i-th current pulse. The domain wall velocity can be proportional to current density, over ranges of current density reasonably related to practices according to disclosed embodiments. Accordingly, the domain wall velocity can be proportional to the amplitude of the current pulse. Therefore, based on Equation (6), the movement amount Δxi can be represented by Equation (7): $\Delta x_i \propto (I_{out})_i\,\Delta t \propto a_i b_i$
As can be seen from Equation (7), in practices according to disclosed embodiments the domain wall moves after each pulse by an amount that is proportional to the product of the two values ai and bi, i.e., to the product of the respective elements of the first matrix and the second matrix.
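The accumulation over K pulses that follows from Equations (6) and (7) can be written out explicitly. In the sketch below, the proportionality constants κ (output current per unit product), μ (domain wall velocity per unit current density) and A (cross-sectional area of the HM strip) are symbols introduced here for illustration and are not defined in the text; the domain wall is assumed to remain in the linear regime for all K pulses.

```latex
\begin{aligned}
(I_{out})_i &= \kappa\, a_i b_i \\
v_i &= \mu\, \frac{(I_{out})_i}{A} \\
\Delta x_i &= v_i\, \Delta t = \frac{\mu \kappa\, \Delta t}{A}\, a_i b_i \\
x_K &= \sum_{i=1}^{K} \Delta x_i = \frac{\mu \kappa\, \Delta t}{A} \sum_{i=1}^{K} a_i b_i
\end{aligned}
```

Under these assumptions the total displacement after K pulses is a fixed constant times the sum of products, which is one element of the product matrix.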
The example is described with perpendicular anisotropic layers. This is an example configuration and is not intended as a limitation. In one or more other embodiments, the hard layer and soft layer can be in-plane anisotropic layers. The operation of domain movement and corresponding incremental, integrating movement of domain walls provided by such embodiments has some similarities and some differences.
It is noted that, as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation. As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present invention. Any recited method can be carried out in the recited order of events or in any other order which is logically possible. Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the range. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention. While exemplary embodiments of the present invention have been disclosed herein, one skilled in the art will recognize, upon reading this disclosure in its entirety, that various changes and modifications may be made without departing from the scope as defined by the following claims.
This application claims the benefit under 35 U.S.C. § 119 (e) of U.S. Provisional Patent Application No. 63/261,382, filed Sep. 20, 2021, which is hereby incorporated by reference in its entirety.
This invention was made with Government Support under Grant Nos. CCF-2001255 and CCF-2006843 awarded by the National Science Foundation. The Government has certain rights in the invention.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2022/076727 | 9/20/2022 | WO | |

| Number | Date | Country |
|---|---|---|
| 63261382 | Sep 2021 | US |