The field of invention pertains generally to the electronic arts, and, more specifically, to an electronic neural network circuit having a resistance based learning rule circuit.
In the field of computing science, artificial neural networks may be used to implement various forms of cognitive science such as machine learning and artificial intelligence. Essentially, artificial neural networks are adaptable information processing networks having a design that is structured similarly to the human brain and characterized as having a number of neurons interconnected by synapses.
A better understanding of the present invention can be obtained from the following detailed description in conjunction with the following drawings, in which:
A class of neural networks referred to as “spiking” neural networks has synaptic messages that take the form of spikes. Here, a neuron “fires” a spike/message to the neurons it is connected to if its state reaches a particular value. Simplistically, the value of a neuron's state will change as it receives spikes/messages from other neurons. If the magnitude of the received spiking activity reaches a certain intensity, the receiving neuron's state may change to a level that causes it to fire.
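The firing behavior described above can be illustrated with a minimal software sketch. The `Neuron` class, the threshold value and the reset-to-zero behavior after firing are assumptions made only for illustration; the description does not specify how a neuron's state is represented or reset.

```python
from dataclasses import dataclass


@dataclass
class Neuron:
    """Toy neuron: accumulates received spikes and fires at a threshold."""
    threshold: float = 1.0  # hypothetical firing level
    state: float = 0.0      # current value of the neuron's state

    def receive(self, weighted_spike: float) -> bool:
        """Add an incoming spike to the state; return True if the neuron fires."""
        self.state += weighted_spike
        if self.state >= self.threshold:
            self.state = 0.0  # assumed reset after firing
            return True
        return False


neuron = Neuron(threshold=1.0)
# Three spikes of magnitude 0.4 arrive; only the third pushes the state past 1.0.
fired = [neuron.receive(0.4) for _ in range(3)]
```

Here `fired` records, spike by spike, whether the receiving neuron fired.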
The weight of a synapse affects the magnitude of the message it transports. Spike Timing Dependent Plasticity (STDP) is a learning function for changing the weight of a synapse in a spiking neural network in response to the spike timing difference on either end of the synapse. There are generally two types of STDP learning functions: inhibitory and excitory. An inhibitory learning function is used for synapses whose messages tend to reduce the firing activity of the receiving neuron. By contrast, an excitory learning function is used for synapses whose messages tend to contribute to the firing activity of the receiving neuron.
Through application of the learning functions, the weight of a synapse will change in view of the observed pre and post neuron firings which, in turn, corresponds to the learning activity of the network.
A problem with the implementation and construction of a practical spiking neural network is the sheer number of synapses. Here, note from
The manufacture of a semiconductor chip whose constituent circuitry is designed to implement a spiking neural network therefore faces the challenge of attempting to implement a synapse with a reduced number of active devices so as to reduce its overall size and manufacturing complexity.
A solution to the problem described in the Background is to construct a synapse circuit with a magnetic tunneling junction (MTJ) device. An MTJ device exhibits high or low resistance depending on the relative orientation of two magnetic moments within the device. Here, according to one type of MTJ implementation, when a first magnetic layer of the device (e.g., a fixed layer) has a magnetic moment that points in the same direction as that of a second magnetic layer of the device (e.g., a free layer), the device has low resistance (RL). By contrast, referring to
As observed in
A timing measurement circuit 403 measures the difference between firing times of the two neuron circuits 401, 402 and generates an output signal (e.g., a digital signal, a voltage or a current) that is representative, in terms of magnitude and polarity, of the firing time difference. Specifically, if the post neuron 402 fires after the pre neuron 401 (where the spike/message propagates from the pre neuron 401 to the post neuron 402 along an execution path), Δt is positive and the timing circuit 403 will generate a signal of a first polarity (e.g., positive) whose magnitude is representative of the difference in time.
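As a software analogy of the timing measurement, the sketch below computes the magnitude and polarity of Δt from two firing times. The function name, the time units and the treatment of Δt = 0 as positive are assumptions; the negative case is taken to be the mirror image of the positive case described above.

```python
from typing import Tuple


def timing_difference(t_pre: float, t_post: float) -> Tuple[float, int]:
    """Return (magnitude, polarity) of delta_t = t_post - t_pre.

    Polarity is +1 when the post neuron fires after the pre neuron and
    -1 when it fires before. A zero difference is reported as +1, which
    is an arbitrary choice made for this sketch.
    """
    delta_t = t_post - t_pre
    polarity = 1 if delta_t >= 0 else -1
    return abs(delta_t), polarity


# Post neuron fires 2.5 time units after the pre neuron: positive polarity.
after = timing_difference(t_pre=10.0, t_post=12.5)
# Post neuron fires 2.5 time units before the pre neuron: negative polarity.
before = timing_difference(t_pre=12.5, t_post=10.0)
```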
The signal is then applied to a learning rule circuit 404 having an MTJ device 405 in the high resistance state. The input signal from the timing measurement circuit 403 is processed by the learning rule circuit 404 in a manner that causes a representative signal to be applied to the MTJ device 405, and the resistance of the device is measured.
For example, if a voltage that is representative of Δt is applied across the MTJ device's terminals, the resultant current that flows through the MTJ device is measured to determine the MTJ device's resistance. Likewise, if a current that is representative of Δt is driven through the MTJ device, the resultant voltage across the MTJ device is measured to determine the device's resistance. The measured resistance is then used to generate an input signal to a weight circuit 406.
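The two measurement approaches both reduce to Ohm's law (R = V/I). The sketch below shows each one; the function names and the example readings are hypothetical and stand in for whatever the meter circuitry actually reports.

```python
def resistance_from_voltage_drive(applied_voltage: float,
                                  measured_current: float) -> float:
    """A voltage representing delta_t is applied; the resulting current is measured."""
    if measured_current == 0:
        raise ValueError("no current measured; resistance is undefined")
    return applied_voltage / measured_current


def resistance_from_current_drive(driven_current: float,
                                  measured_voltage: float) -> float:
    """A current representing delta_t is driven; the resulting voltage is measured."""
    if driven_current == 0:
        raise ValueError("no current driven; resistance is undefined")
    return measured_voltage / driven_current


# Hypothetical readings: 0.5 V applied with 0.25 mA measured,
# and 0.1 mA driven with 0.3 V measured.
r_voltage_mode = resistance_from_voltage_drive(0.5, 0.25e-3)
r_current_mode = resistance_from_current_drive(0.1e-3, 0.3)
```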
Here, recall that the measured resistance of the MTJ device represents a change in the weight of the synapse between the two neurons 401, 402. In response to the input signal received from the learning rule circuit 404, a weight circuit 406 calculates a new weight value for the synapse. Messages may then continue to proceed from the pre-neuron to the post-neuron along an execution path through the weight circuit 406 so as to apply the new weight to the message. A new weight may therefore be applied for the synapse each time the timing measurement circuit sends a new signal along the learning path.
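The role of the weight circuit can be sketched in software as follows. The description states only that the measured resistance represents a change in the weight; the linear `scale` factor that converts a signed resistance into a weight change, the initial weight, and the multiplicative application of the weight to a message are all assumptions made for illustration.

```python
class WeightCircuit:
    """Toy model of a weight circuit that holds and updates a synapse weight."""

    def __init__(self, weight: float = 0.5, scale: float = 1e-4) -> None:
        self.weight = weight
        self.scale = scale  # hypothetical ohms-to-weight conversion factor

    def learn(self, signed_resistance: float) -> float:
        """Learning path: fold a signed resistance reading into the weight."""
        self.weight += self.scale * signed_resistance
        return self.weight

    def apply(self, message: float) -> float:
        """Execution path: scale a message by the current weight (assumed)."""
        return self.weight * message


circuit = WeightCircuit(weight=0.5, scale=1e-4)
strengthened = circuit.learn(1000.0)   # positive reading raises the weight
weighted_message = circuit.apply(2.0)  # message passes through the new weight
weakened = circuit.learn(-2000.0)      # negative reading lowers the weight
```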
In various embodiments, the learning rule circuit 404 may implement an inhibitory or excitory rule depending on an input control signal provided as a value from a register (not shown). Here, the value in the register may be loaded as part of the configuration of the neural network circuit.
Referring to
According to the learning rule circuit of
Current source circuitry 505 therefore accepts an input that indicates the magnitude of the time difference between the pre and post neurons (e.g., as provided by timing measurement circuit 403 of
The polarity of the time difference measurement has no effect on an inhibitory learning rule output. Thus the inhibitory learning channel 501 is just a straight read of the MTJ device resistance. In an embodiment, the resistance of the MTJ device, as provided by logic circuit 508, is taken to be a value having positive polarity.
By contrast, the polarity of the time difference measurement does have an effect on the excitory learning rule output. Specifically, in the case of a negative Δt measurement, the output resistance is positive and is therefore just the output of logic circuit 508. By contrast, in the case of a positive Δt measurement, the output value of the learning rule is negative and the correct output is the value of the resistance from logic circuit 508 but having a negative polarity. As such, the excitory channel includes two sub-channels, one that provides positive resistance and one that provides negative resistance. Multiplexer 510 selects the positive sub-channel in the case of a negative Δt measurement and selects the negative sub-channel in the case of a positive Δt measurement.
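The channel selection just described can be summarized by the following sketch. The function name, the string values used for the rule and the integer encoding of the Δt polarity are hypothetical; the sign convention follows the preceding two paragraphs.

```python
def learning_rule_output(resistance: float, delta_t_polarity: int,
                         rule: str) -> float:
    """Select the learning rule output from a positive resistance reading.

    resistance       -- positive resistance value (as from the logic circuit)
    delta_t_polarity -- +1 for a positive delta_t, -1 for a negative delta_t
    rule             -- "inhibitory" or "excitory" (e.g., from a register)
    """
    if rule == "inhibitory":
        # Polarity of delta_t has no effect: a straight read of the resistance.
        return resistance
    if rule == "excitory":
        # Negative delta_t selects the positive sub-channel;
        # positive delta_t selects the negative sub-channel.
        return resistance if delta_t_polarity < 0 else -resistance
    raise ValueError(f"unknown learning rule: {rule!r}")


inhibitory_out = learning_rule_output(1500.0, +1, "inhibitory")
excitory_neg_dt = learning_rule_output(1500.0, -1, "excitory")
excitory_pos_dt = learning_rule_output(1500.0, +1, "excitory")
```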
As such, as the magnitude of the time difference (Δt) increases, the amount of voltage applied to the MTJ device 603 increases. A current meter circuit 602 measures the current through the MTJ device 603 while a voltage meter circuit 605 measures the voltage across the MTJ device 603. A division circuit 606 receives the outputs of the current and voltage meters 602, 605 and determines the resistance of the MTJ device (V/I=R).
A pair of switching circuits 608, 609 provide the output of the learning rule circuit as a function of: 1) a control signal (e.g., as provided by configuration register) that indicates whether the inhibitory or excitory learning rule is to be effected; and, 2) the polarity of the Δt time difference signal. In the embodiment of
By contrast, if the control signal indicates that the excitory rule is to be applied, the NFET device of switch 609 is “off” and the PFET device of switch 609 is “on”. In this state, the output of the learning circuit is either the positive polarity output of the division circuit 606 (if Δt is positive, the NFET of switch 608 is “on” and the PFET of switch 608 is “off”), or a negative polarity output of the division circuit 606 as crafted by a unity inverting amplifier 607 (if Δt is negative, the NFET of switch 608 is “off” and the PFET of switch 608 is “on”). For illustrative ease, a pass gate structure 609 has been shown. To avoid voltage drop issues across the gate structure, a transmission gate structure may be used in place of the pass gate structure 609.
An implementation improvement on the circuitry of
In an embodiment, each of the various parallel resistances has two programmable states: open circuit or resistance R. When in the open circuit state, a parallel resistance has no effect on the circuit. When activated in the resistance R state, however, a parallel resistance reduces the effective resistance presented by the MTJ device, which, in turn, reduces the slope of the learning rule and its vertical axis intercept. Each parallel resistance is individually set in the open/R state so as to permit a wide range of different learning rule slopes by activating/inactivating different combinations of parallel resistances. The more parallel resistances that are activated, the more the slope and height of the circuit's learning rule are reduced. Other embodiments may choose to implement programmable resistance ranges (e.g., each resistance can be set to any one of open circuit, maximum resistance R and a number of resistance values between open and R). Conceivably, similar changes to the shape of the resistance/rule curve can be implemented by placing programmable resistance values in series with the MTJ. In still other implementations, the programmable resistances may be replaced with MTJs to enable learning functions with different height(s) and slope(s).
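The effect of activating parallel resistances follows from the standard formula for resistances in parallel. The sketch below assumes each programmable branch is either open or equal to the same value R; the function name and the numeric values are hypothetical.

```python
from typing import Sequence


def effective_resistance(r_mtj: float, r_branch: float,
                         active: Sequence[bool]) -> float:
    """Resistance of an MTJ in parallel with the activated branches.

    r_mtj    -- resistance of the MTJ device alone
    r_branch -- resistance R of each activated branch
    active   -- one flag per branch: True = resistance R, False = open circuit
    """
    conductance = 1.0 / r_mtj
    for is_on in active:
        if is_on:
            conductance += 1.0 / r_branch  # open branches contribute nothing
    return 1.0 / conductance


# Hypothetical 2 kOhm MTJ with up to three 2 kOhm programmable branches.
none_active = effective_resistance(2000.0, 2000.0, [False, False, False])
one_active = effective_resistance(2000.0, 2000.0, [True, False, False])
all_active = effective_resistance(2000.0, 2000.0, [True, True, True])
```

Each additional activated branch lowers the effective resistance, which is consistent with the reduced slope and height of the learning rule described above.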
Referring back to
Although embodiments above have been described in reference to an MTJ, in various other embodiments another type of resistive element may be utilized in place of the MTJ as described above. Here, any resistive device that has the property of voltage dependent resistance with certain slope and height can be used for STDP learning.
The neural network circuitry discussed herein may be embodied in various semiconductor circuits at least some of which may be integrated with a computing system (such as an intelligent machine learning peripheral such as a voice or image recognition peripheral).
An applications processor or multi-core processor 950 may include one or more general purpose processing cores 915 within its CPU 901, one or more graphics processing units 916, a memory management function 917 (e.g., a memory controller) and an I/O control function 918. The general purpose processing cores 915 typically execute the operating system and application software of the computing system. The graphics processing units 916 typically execute graphics intensive functions to, e.g., generate graphics information that is presented on the display 903. The memory management function 917 interfaces with the system memory 902. The power management control unit 912 generally controls the power consumption of the system 900.
The touchscreen display 903, the communication interfaces 904-907, the GPS interface 908, the sensors 909, the camera 910, and the speaker/microphone codec 913, 914 can each be viewed as a form of I/O (input and/or output) relative to the overall computing system including, where appropriate, an integrated peripheral device as well (e.g., the camera 910). Depending on implementation, various ones of these I/O components may be integrated on the applications processor/multi-core processor 950 or may be located off the die or outside the package of the applications processor/multi-core processor 950.
Embodiments of the invention may include various processes as set forth above. The processes may be embodied in machine-executable instructions. The instructions can be used to cause a general-purpose or special-purpose processor to perform certain processes. Alternatively, these processes may be performed by specific hardware components that contain hardwired logic for performing the processes, or by any combination of programmed computer components and custom hardware components.
Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, FLASH memory, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, propagation media or other types of media/machine-readable media suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.