The present application relates to artificial neural networks, and more specifically, to a spin orbit torque based electronic neuron.
Artificial neural networks (ANNs) attempt to replicate the remarkable efficiency of the biological brain in performing cognitive tasks such as learning, pattern recognition, and classification. At the heart of any ANN is an artificial neuron whose transfer function mimics that of a biological neuron. One of the most widely used models of an artificial neuron with an output $y$ and a transfer function $f$ can be written as $y = f\left(\sum_i w_i x_i + b\right)$, where $x_i$ is an input to the neuron, $w_i$ is its corresponding synaptic weight, and $b$ is a constant bias term. Thus, the two main computational units of the artificial neuron are a weighted summation of inputs followed by a thresholding operation. Traditionally, ANNs have been implemented in software running on a von Neumann type general-purpose computer. The implementation of large-scale ANNs on general-purpose computers requires significant computational capability and consumes energy that is orders of magnitude larger than that of their biological counterparts. Recent developments in the field of neuromorphic computation attempt to bridge this gap by emulating artificial neurons using custom analog/digital CMOS circuits. However, the emulation of artificial neurons using CMOS circuits remains highly inefficient in terms of energy consumption and silicon area. The inefficiency in CMOS-based ANNs arises from the significant mismatch between the functionality of a biological neuron and that of CMOS devices, which are better suited for Boolean logic. Therefore, improvements are needed in the field.
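The neuron model above can be illustrated with a minimal sketch. The function names `step` and `neuron_output`, the choice of a hard threshold returning +1 or -1, and the numeric values are assumptions made for illustration only; the text does not prescribe a specific transfer function.

```python
from typing import Callable, Sequence


def step(z: float) -> int:
    """Hypothetical hard-threshold transfer function: +1 if z >= 0, else -1."""
    return 1 if z >= 0.0 else -1


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    transfer: Callable[[float], int] = step,
) -> int:
    """Compute y = f(sum_i w_i * x_i + b) for one artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted summation of the inputs plus the constant bias term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Thresholding operation.
    return transfer(z)


# Illustrative values only: 0.2 - 0.2 - 0.1 + 0.05 is negative, so y is -1.
y = neuron_output([1.0, 0.5, -1.0], [0.2, -0.4, 0.1], bias=0.05)
```

The two stages in `neuron_output` correspond to the weighted summation and the thresholding operation named in the text.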
The present disclosure provides an electronic neuron device that includes a thresholding unit which utilizes current-induced spin-orbit torque (SOT). A two-step switching scheme is implemented with the device. In the first step, a charge current through a heavy metal (HM) layer places the magnetization of a nano-magnet along the hard axis (i.e., an unstable point for the magnet). In the second step, the device receives a current (from an electronic synapse) which moves the magnetization from the unstable point to one of the two stable states. The polarity of the net synaptic current determines the final orientation of the magnetization. A resistive crossbar array may also be provided which functions as the synapse, generating a bipolar current that is a weighted sum of the inputs of the device.
In the following description and drawings, identical reference numerals have been used, where possible, to designate identical features that are common to the drawings.
The attached drawings are for purposes of illustration and are not necessarily to scale.
In the following description, some aspects will be described in terms that would ordinarily be implemented as software programs. Those skilled in the art will readily recognize that the equivalent of such software can also be constructed in hardware, firmware, or micro-code. Because data-manipulation algorithms and systems are well known, the present description will be directed in particular to algorithms and systems forming part of, or cooperating more directly with, systems and methods described herein. Other aspects of such algorithms and systems, and hardware or software for producing and otherwise processing the signals involved therewith, not specifically shown or described herein, are selected from such systems, algorithms, components, and elements known in the art. Given the systems and methods as described herein, software not specifically shown, suggested, or described herein that is useful for implementation of any aspect is conventional and within the ordinary skill in such arts.
In the HM layer 102, when the spin Hall effect (SHE) is the dominant underlying physical mechanism in play, a flow of charge current through the HM layer 102 generates pure spin current in the direction transverse to the charge current due to preferential scattering of different spins to different directions. This pure spin current is then used to control the free layer 104 on top, via the spin-transfer torque effect.
In certain embodiments, the HM layer 102 may comprise beta-Tantalum, Tungsten, or Platinum. The free layer 104 and pinned layer 110 may comprise any ferromagnetic material including, but not limited to, CoFe or CoFeB. The oxide tunnel barrier 108 may comprise an oxide material such as MgO.
According to certain aspects, a two-step switching scheme is applied to the thresholding device 100. The two-step switching scheme a) minimizes the required current from the electronic synapse, and b) utilizes the spin Hall effect for an energy-efficient thresholding operation. As shown in
The direction and the magnitude of the spin current and its spin polarization in the SHE can be determined from the relationship $J_S = \theta_{SH}(\sigma \times J_q)$, where $J_S$ and $J_q$ are the transverse spin current and the charge current, respectively, $\theta_{SH}$ is a material-dependent spin Hall angle, and $\sigma$ is the polarization of the spin current. Magnetization dynamics of the free layer 104 are obtained by solving the Landau-Lifshitz-Gilbert equation with an additional term to account for the torque due to the transverse spin current, per equation (1) below:
where $\hat{m}$ is the unit vector of the free layer magnetization, $\gamma = 2\mu_B\mu_0/\hbar$ is the gyromagnetic ratio for the electron [rad·m/(A·s)], $H_{eff}$ is the effective magnetic field [A/m], and $I_S = \theta_{SH}(A_{MTJ}/A_{HM}) I_q \hat{\sigma}$ is the spin current injected into the free layer [A]. $N_s$ is the number of spins in the free layer, defined as $M_S V/\mu_B$, where $M_S$ is the saturation magnetization [A/m], $V$ is the volume of the free layer [m$^3$], and $\mu_B$ is the Bohr magneton [A·m$^2$]. The effective field $H_{eff}$ includes the shape anisotropy field $H_{shape} = -(N_{XX}, N_{YY}, N_{ZZ}) M_S$, where $N_{XX}$, $N_{YY}$, and $N_{ZZ}$ are the demagnetization factors for elliptical disks; the magnetocrystalline anisotropy field $H_{Ku2}$ perpendicular to the plane of the free layer 104; the external magnetic field $H_a$; and the thermal fluctuation field $H_{thermal}$ given by
where $G_{0,1}$ is a Gaussian distribution with zero mean and unit standard deviation, $k_B$ is the Boltzmann constant, $T$ is the temperature, and $\delta t$ is the simulation time step, chosen as 0.1 ps in this example.
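The display equations referred to in the preceding paragraphs are not reproduced in this text. For orientation only, the forms below are those commonly used for a macrospin with an injected spin current and a stochastic thermal field, written with the symbols defined above. They are an assumption and may differ from the exact expressions intended for equation (1) and for the thermal fluctuation field. The Gilbert damping constant $\alpha$ and the electron charge $q$ are not defined in the surrounding text and are introduced here as assumptions.

```latex
\begin{align}
\frac{d\hat{m}}{dt} &= -\gamma\,(\hat{m}\times H_{eff})
  + \alpha\left(\hat{m}\times\frac{d\hat{m}}{dt}\right)
  + \frac{1}{q N_s}\,(\hat{m}\times I_S\times\hat{m}) \\
H_{thermal} &= \sqrt{\frac{\alpha}{1+\alpha^{2}}\,
  \frac{2 k_B T}{\gamma\,\mu_0\,M_S\,V\,\delta t}}\;G_{0,1}
\end{align}
```

With $\gamma$ in rad·m/(A·s), both expressions are dimensionally consistent with the units quoted above: the first has units of 1/s and the second has units of A/m.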
To determine the appropriate magnitude of clock and write currents for the device 100, a switching phase diagram for a range of clock and write currents is constructed as shown in
According to one embodiment, an arrangement of a neural network 400 with m inputs 402, h hidden layer neurons 404, and outputs 406 is shown in
In order to implement bipolar weights, two row crossbars 504 are used for each input I as shown in
Therefore, the charge current $I_q$ is proportional to the summation of the inputs weighted by the synaptic weights, which are set by the conductances of the crossbar. The sign of the charge current determines the direction of the resultant spin current and hence the final state of the nano-magnet in the device 100.
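The bipolar weighted summation and the resulting thresholding can be sketched behaviorally as follows. This is a simplified illustration, not a circuit model: it assumes the inputs are applied as voltages, that each input drives one positive-weight and one negative-weight conductance, and that the net current is their difference summed over inputs. The names `net_synaptic_current`, `final_state`, `g_pos`, and `g_neg`, and all numeric values, are hypothetical. Thermal noise, which matters near zero current, is ignored.

```python
from typing import Sequence


def net_synaptic_current(
    voltages: Sequence[float],
    g_pos: Sequence[float],
    g_neg: Sequence[float],
) -> float:
    """Net bipolar current I = sum_i V_i * (G+_i - G-_i), in amperes.

    Assumes one positive and one negative conductance per input (siemens).
    """
    if not (len(voltages) == len(g_pos) == len(g_neg)):
        raise ValueError("all sequences must have the same length")
    return sum(v * (gp - gn) for v, gp, gn in zip(voltages, g_pos, g_neg))


def final_state(current: float) -> int:
    """Map the polarity of the net current to a magnet state.

    Returns +1 or -1 for the two stable states, and 0 when the current is
    exactly zero, where this deterministic sketch cannot decide the outcome.
    """
    if current > 0.0:
        return 1
    if current < 0.0:
        return -1
    return 0


# Illustrative values only: 1e-6 A from the first input, -2e-6 A from the second.
i_net = net_synaptic_current([1.0, 1.0], [2e-6, 1e-6], [1e-6, 3e-6])
state = final_state(i_net)  # negative net current selects the -1 state
```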
According to one embodiment, interlayer communication is performed using a read circuit as shown in
In one test example, the neural network was designed to recognize the first 4 digits from the MNIST machine learning dataset. The images were downscaled to 8×8 pixels, and 100 images were used to evaluate the performance of the network. The network consisted of 25 hidden layer neurons and 4 output layer neurons. The presently disclosed neuromorphic architecture falls into the category of hardware that utilizes off-chip learning. The weights and biases obtained from offline training of the network using the backpropagation algorithm were mapped to conductance values of a resistive crossbar network similar to RCN 502. The mapping was done assuming a 5-bit discretization in the resistance levels of the crossbar network and a dynamic range (ratio of highest to lowest resistance in the array) of 20. Input currents obtained from SPICE simulations of the RCN 502 were then used to solve the stochastic magnetization dynamics for the SOT-based neuron. For the first stage of the switching process, a charge current of ~85 μA (from
In order to perform an iso-throughput comparison with digital CMOS technology, neural network hardware was synthesized using a standard cell library in 45 nm commercial CMOS technology. A 5-bit precision was used for the weights, and each neuron was pipelined after every stage of multiplication and addition. The average power consumption per neuron was ~1.06 mW.
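The weight mapping described for the test network (5-bit discretization of resistance levels and a dynamic range of 20) can be sketched as follows. The text does not state how the levels are spaced or how weights are scaled, so this sketch assumes uniformly spaced resistance levels, a linear mapping from weight magnitude to target conductance, and rounding to the nearest available level. The function names and the 10 kΩ minimum resistance are hypothetical.

```python
from typing import List


def resistance_levels(
    r_min: float, dynamic_range: float = 20.0, bits: int = 5
) -> List[float]:
    """Return 2**bits resistance levels from r_min to r_min * dynamic_range.

    Uniform spacing in resistance is an assumption of this sketch.
    """
    n = 2 ** bits
    r_max = r_min * dynamic_range
    spacing = (r_max - r_min) / (n - 1)
    return [r_min + k * spacing for k in range(n)]


def weight_to_resistance(w_abs: float, w_max: float, levels: List[float]) -> float:
    """Pick the resistance level whose conductance best matches |w|.

    The weight magnitude is mapped linearly onto [G_min, G_max], then the
    level with the nearest conductance is selected.
    """
    g_min = 1.0 / levels[-1]
    g_max = 1.0 / levels[0]
    frac = min(max(w_abs / w_max, 0.0), 1.0)  # clamp to [0, 1]
    g_target = g_min + frac * (g_max - g_min)
    return min(levels, key=lambda r: abs(1.0 / r - g_target))


levels = resistance_levels(r_min=1e4)  # hypothetical 10 kOhm lowest resistance
r_strong = weight_to_resistance(1.0, 1.0, levels)  # largest weight: lowest resistance
r_weak = weight_to_resistance(0.0, 1.0, levels)    # zero weight: highest resistance
```

A signed weight would be handled by applying this mapping to its magnitude and placing the result on the positive or negative row of the crossbar.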
In order to assess the functionality of the presently disclosed device due to the presence of a finite delay between the IClock and IWrite signals, we determine the variation of the probability of switching PSW of the free layer 104 with the synaptic current, corresponding to a clocking current of 85 μA (
where HK is the effective anisotropy field. Using the simulation parameters given above, the relaxation time constant τD is calculated as 3.5 ns. As a result, if the delay time between IClock and IWrite is less than τD, the functionality of the proposed neuron would not be significantly affected. A worst-case simulation of the feed-forward ANN with an average delay of 1 ns between the clocking and synaptic currents for each neuron in the network shows a degradation in classification accuracy of only ~5%. The inherent error resiliency of such neural computing algorithms largely offsets the effect of the delay between the clocking and synaptic currents.
In certain embodiments, the device 100 may be implemented as part of a random number generator as shown in
Various aspects described herein may be embodied as systems or methods. Accordingly, various aspects herein may take the form of an entirely hardware aspect, an entirely software aspect (including firmware, resident software, micro-code, etc.), or an aspect combining software and hardware aspects. These aspects can all generally be referred to herein as a “service,” “circuit,” “circuitry,” “module,” or “system.”
The invention is inclusive of combinations of the aspects described herein. References to “a particular aspect” and the like refer to features that are present in at least one aspect of the invention. Separate references to “an aspect” (or “embodiment”) or “particular aspects” or the like do not necessarily refer to the same aspect or aspects; however, such aspects are not mutually exclusive, unless so indicated or as are readily apparent to one of skill in the art. The use of singular or plural in referring to “method” or “methods” and the like is not limiting. The word “or” is used in this disclosure in a non-exclusive sense, unless otherwise explicitly noted.
The invention has been described in detail with particular reference to certain preferred aspects thereof, but it will be understood that variations, combinations, and modifications can be effected by a person of ordinary skill in the art within the spirit and scope of the invention.
The present patent application is related to and claims the priority benefit of U.S. Provisional Patent Application Ser. No. 62/300,852, filed Feb. 28, 2016, the contents of which are hereby incorporated by reference in their entirety into the present disclosure.
Number | Date | Country |
---|---|---|
62300852 | Feb 2016 | US |