The present technology relates to a multiply-accumulate operation device performing a multiply-accumulate operation.
A multiply-accumulate operation is an operation in which each of a plurality of input values is multiplied by a corresponding load (weight) and the resulting products are added together. The multiply-accumulate operation is utilized, for example, to recognize images and speech by a neural network. A multilayer-perceptron-type neural network model may be used in multiply-accumulate operation processing. The processing may be performed by a general-purpose digital calculator or a digital application-specific integrated circuit, and a specific example in which the digital application-specific integrated circuit is utilized is described in Non-Patent Literature 1. The example in Non-Patent Literature 1 uses a spiking neuron model, which is one type of neural network model. The multiply-accumulate operation by the spiking neuron model is described in Non-Patent Literature 2.
Here, from a perspective of decreasing electric energy consumption of the multiply-accumulate operation processing, it is conceivable that an analog integrated circuit is more preferable than a digital integrated circuit. In order to realize an integration degree and electric energy consumption similar to those of the brain of an organism (ultimately, a human), adoption of the analog integrated circuit is being studied, and the content thereof is described in Non-Patent Literature 3.
However, processing of a multiply-accumulate operation in which positive loads and negative loads coexist with each other is not clarified in Non-Patent Literatures 2 and 3, and there is a problem that, in a case that the positive loads and the negative loads coexist, how to implement the analog integrated circuit is unclear.
In view of the above circumstances, it is an object of the present invention to provide a multiply-accumulate operation device capable of performing a processing of a multiply-accumulate operation in which positive loads and negative loads coexist with each other by an analog method.
A multiply-accumulate operation device according to the present invention in accordance with the object performs a series of processing by using an analog circuit, the series of processing including causing each of N electric signals being given to correspond to each of loads, multiplies each of values of the electric signals by each of values of the loads corresponding to the electric signals to obtain N multiplied values, and derives a sum of the N multiplied values,
The multiply-accumulate operation device according to the present invention, in which the multiply-accumulate operation device may
The multiply-accumulate operation device according to the present invention, in which
A multiply-accumulate operation device according to the present invention may perform, by an analog method, a multiply-accumulate operation in which positive loads and negative loads coexist with each other, because an analog circuit in the multiply-accumulate operation device includes: N+ first output means that cause each of N+ electric signals given in a predetermined period T1 to correspond to each of positive loads and output electric charges, each of the electric charges having a size depending on the value of the electric signal and the value of the positive load corresponding to the electric signal; a first capture-and-storage means in which the electric charges output from each of the N+ first output means are stored; (N−N+) second output means that cause each of (N−N+) electric signals given in the period T1 to correspond to each of absolute values of negative loads and output electric charges, each of the electric charges having a size depending on the value of the electric signal and the absolute value of the negative load corresponding to the electric signal; a second capture-and-storage means in which the electric charges output from each of the (N−N+) second output means are stored; and a multiply-accumulate deriving means. The multiply-accumulate deriving means calculates a first multiply-accumulate value when detecting that a voltage held in the first capture-and-storage means reaches a first threshold, calculates a second multiply-accumulate value when detecting that a voltage held in the second capture-and-storage means reaches a second threshold, and obtains a sum of the N multiplied values by subtracting the second multiply-accumulate value from the first multiply-accumulate value. Here, the first multiply-accumulate value is a sum of N+ multiplied values obtained by multiplying each of the positive loads corresponding to the N+ electric signals by the value of the corresponding electric signal, and the second multiply-accumulate value is a sum of (N−N+) multiplied values obtained by multiplying each of the absolute values of the negative loads corresponding to the (N−N+) electric signals by the value of the corresponding electric signal. Moreover, the multiply-accumulate operation device may derive the first multiply-accumulate value and the second multiply-accumulate value in a period T2 that follows the period T1 and has the same length as the period T1, so that the analog circuit may have a simpler structure. Furthermore, the multiply-accumulate operation device executes the multiply-accumulate operations for the positive loads and for the negative loads independently in circuits of the same type, and each of the electric charges moves in only one direction. As a result, the electric energy consumption of each circuit operation may be decreased.
Subsequently, with reference to the attached drawings, embodiments of the present invention will be described to aid understanding of the present invention.
As shown in
A value of the electric signal Ii (hereinafter, simply referred to as “electric signal”) is referred to as xi, and the N electric signals (in the present embodiment, pulse signals) are given to the one analog circuit 11 in a predetermined period T1. A calculated-object value of the analog circuit 11 (in other words, the sum of the N multiplied values) is shown below.
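The expression itself does not survive in this text. From the definition of the calculated-object value as the sum of the N multiplied values, and writing wi for the load corresponding to the i-th electric signal (an assumption about the notation, taken from the later paragraphs), it would presumably read:

```latex
\sum_{i=1}^{N} w_i \, x_i
```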
As shown in
Each of the analog circuits 11 in the upper layer obtains the calculated-object value by causing each of the values of the electric signals transmitted from the plurality of analog circuits 11 in the lowest layer (in other words, the lower layer) to correspond to each of the loads wi, and transmits electric signals indicating the calculated-object value to the analog circuits 11 in a further upper layer. In the present embodiment, the multiply-accumulate operation device 10 is designed to be applicable to a neural network. By repeating, a plurality of times, the processing in which each of the analog circuits 11 in the upper layer obtains the calculated-object value on the basis of the calculated-object values obtained in the analog circuits 11 in the lower layer, the multiply-accumulate operation device 10 performs, for example, recognition of an image and the like.
First, a structure of an analog circuit 11a according to a reference example, which obtains the calculated-object value by performing basically the same processing as the analog circuit 11, and the processing performed by the analog circuit 11a will be described for the case where the value xi of the electric signal is a variable that is 0 or more and 1 or less. Note that the load wi includes a positive load wi+ and a negative load wi−, which are calculated independently. In the analog circuit 11a, however, the load wi is treated without distinguishing between the positive load wi+ and the negative load wi−.
As shown in
Each of the output means 13 includes an input terminal 15 to which each of the electric signals is given, a resistance 16 to which the input terminal 15 is connected in series, and a diode 17 to which the resistance 16 is connected in series. A size of the load wi of each of the output means 13 may be determined by a resistance value of each of the resistances 16.
Electric signals having different sizes are given to the input terminal 15 of each of the output means 13 at different timings in the period T1.
A length of the period T1 is referred to as Tin, and the timing at which the electric signals are given to the input terminals 15 of the output means 13 is referred to as ti. As shown in
[Math. 2]
ti=Tin(1−xi) (Equation 1)
Where a waveshape that is produced from the timing ti at which the electric signal is given and increases or decreases proportionally to elapse of time t is referred to as response-waveshape W (See
[Math. 3]
ki=λwi (Equation 2)
Note that λ is a positive constant.
Here, as shown in
β is a sum of the loads wi, and the β is expressed by an equation 4 described below.
On the basis of the equations 1 to 4, a calculated-object value of the analog circuit 11a is expressed by an equation 5 described below.
Here, all of the loads wi are assumed to be positive values. When the value xi of each of the electric signals given to all of the input terminals 15 is the minimum value 0, the left side of the equation 5 is 0. As a result, the timing tν is the latest, and the timing tνmin is expressed by an equation 6 described below.
That the left side of the equation 5 is 0 means that an output timing corresponding to the voltage held in the capture-and-storage means 14 is the latest.
On the other hand, when the value xi of each of the electric signals given to all of the input terminals 15 is the maximum value 1, the left side of the equation 5 is β. As a result, the timing tν is the earliest, and the timing tνmax is expressed by an equation 7 described below.
That the left side of the equation 5 is β means that an output timing corresponding to the electric-charge amount stored in the capture-and-storage means 14 is the earliest. As a result, on the basis of the equations 6 and 7, a period T2 in which the pulse signal corresponding to the voltage held in the capture-and-storage means 14 is output is [tνmax,tνmin], and a time length Tν of the period T2 is given by an equation 8 described below.
[Math. 9]
Tν=tνmin−tνmax=Tin (Equation 8)
Thus, the time length Tν of the period T2 in which the pulse signal corresponding to the voltage held in the capture-and-storage means 14 is output is equal to the time length Tin of the period T1 in which the electric signal is given to each of the output means 13.
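Equations 3 to 7 referred to above are not reproduced in this text. Assuming that each response-waveshape is a ramp of slope ki starting at the timing ti, and that tν denotes the timing at which the sum of the ramps reaches the threshold θ, the following reconstruction is consistent with the equations 1, 2, 8, and 10 to 13. It is offered as a sketch under those assumptions, not as the original wording.

```latex
\begin{align*}
\sum_{i=1}^{N} k_i \,(t_\nu - t_i) &= \theta \tag{Equation 3} \\
\beta &= \sum_{i=1}^{N} w_i \tag{Equation 4} \\
\sum_{i=1}^{N} w_i x_i &= \frac{\theta/\lambda + \beta\,(T_{in} - t_\nu)}{T_{in}} \tag{Equation 5} \\
t_\nu^{\min} &= \frac{\theta}{\lambda\beta} + T_{in} \tag{Equation 6} \\
t_\nu^{\max} &= \frac{\theta}{\lambda\beta} \tag{Equation 7}
\end{align*}
```

Substituting the equations 1 and 2 into the equation 3 and solving for the weighted sum gives the equation 5. Setting every xi to 0 or to 1 in the equation 5 gives the equations 6 and 7, whose difference is Tin, in agreement with the equation 8.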
To make the calculated-object value of the analog circuit 11a reflect all of the electric signals given to each of the output means 13, it is necessary that the period T2 exists after the period T1 and that the threshold θ has a suitable value. Thus, a condition expressed by an equation 9 described below is necessary.
[Math. 10]
tνmax>Tin (Equation 9)
On the basis of the equation 7, the equation 9 may be replaced with an equation 10 described below.
[Math. 11]
θ>λβTin (Equation 10)
Here, a minute number ε (>0) is defined. By using the ε, the threshold θ is expressed by an equation 11 described below.
[Math. 12]
θ=(1+ε)λβTin (Equation 11)
On the basis of the equation 11, it is necessary for the threshold θ to have a size proportional to a product of the sum β of the loads wi and the length Tin of the period T1.
Moreover, an equation 12 described below is obtained by the equations 6 and 11, and an equation 13 described below is obtained by the equations 7 and 11.
[Math. 13]
tνmin=2Tin+εTin (Equation 12)
[Math. 14]
tνmax=Tin+εTin (Equation 13)
Thus, a time range of the period T2 may be expressed by the equations 12 and 13.
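The timing-based calculation of the reference example can be sketched numerically. The following Python fragment is a minimal illustration, assuming ideal linear ramps, all loads positive, and a crossing condition of the form "sum of λ·wi·(tν − ti) equals θ"; the function name `weighted_sum_from_timing` and its parameters are hypothetical and do not come from the specification.

```python
def weighted_sum_from_timing(x, w, t_in=1.0, lam=1.0, eps=0.1):
    """Recover sum(w_i * x_i) from the threshold-crossing time of summed ramps.

    Assumes every load in w is positive and every value in x lies in [0, 1].
    """
    beta = sum(w)                              # sum of the loads
    theta = (1.0 + eps) * lam * beta * t_in    # threshold chosen as in equation 11
    timings = [t_in * (1.0 - xi) for xi in x]  # input timings from equation 1
    # Each input contributes a ramp lam * w_i * (t - t_i). Because theta exceeds
    # lam * beta * t_in, the crossing happens after all ramps have started,
    # so the crossing time has a closed form.
    t_nu = (theta / lam + sum(wi * ti for wi, ti in zip(w, timings))) / beta
    # Invert the crossing condition to obtain the weighted sum.
    value = (theta / lam + beta * (t_in - t_nu)) / t_in
    return t_nu, value


if __name__ == "__main__":
    t_nu, value = weighted_sum_from_timing([0.2, 0.5, 1.0], [1.0, 2.0, 0.5])
    print(t_nu, value)
```

With these assumptions, the returned timing always falls between Tin+εTin and 2Tin+εTin, which is the range given by the equations 12 and 13, and the recovered value equals the directly computed sum of products up to floating-point rounding.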
Next, the analog circuits 11 obtaining the calculated-object values where the loads wi are separated into the positive loads wi+ and the negative loads wi− will be described.
Each of the analog circuits 11 causes each of N+ (N+ is a natural number less than or equal to N) electric signals of the N electric signals given in the period T1 to correspond to each of the positive loads wi+, and causes each of (N−N+) electric signals of the N electric signals given in the period T1 to correspond to each of absolute values of the negative loads wi−.
As shown in
Each of the first output means 18 includes an input terminal 20 to which the electric signal is given, a PMOS transistor 21 whose source side is connected to the input terminal 20, and a diode (rectifier) 22 that is connected to a drain side of the PMOS transistor 21. A voltage-output terminal 23 that gives a gate voltage (bias voltage) to each of the PMOS transistors 21 is connected to each of the first output means 18, and each of the first output means 18 has a state in which the same resistance is produced in the PMOS transistor 21 of each of the first output means 18.
In the present embodiment, the first capture-and-storage means 19 is a capacitor (a gate capacitance of a MOS transistor may also be used as the capacitor), and a signal-transmission part 24 is connected to the first capture-and-storage means 19. The signal-transmission part 24 outputs a pulse signal at a timing (hereinafter also referred to as “first timing”) at which a voltage held in the first capture-and-storage means 19 reaches a preset first threshold. A size of the voltage held in the first capture-and-storage means 19 is determined by an amount of the electric charge stored in the first capture-and-storage means 19.
Moreover, the analog circuit 11 includes N− (N−=N−N+) second output means 26 that cause each of the N− electric signals given in the period T1 to correspond to each of the absolute values of the negative loads wi− and output electric charges having sizes depending on each of values of the N− electric signals given in the period T1 and each of the absolute values of the negative loads wi− corresponding to the N− electric signals respectively, and a second capture-and-storage means 27 to which the N− second output means 26 are connected in parallel and in which the electric charges output from each of the N− second output means 26 are stored. Each of the second output means 26 corresponds to each of the negative loads wi−.
Thus, a period in which the N+ electric signals are given to the N+ first output means 18 coincides with a period in which the N− electric signals are given to the N− second output means 26.
Each of the second output means 26 includes an input terminal 28 to which the electric signal is given, a PMOS transistor 29 whose source side is connected to the input terminal 28, and a diode 30 that is connected to a drain side of the PMOS transistor 29. A voltage-output terminal 31 that gives a gate voltage to each of the PMOS transistors 29 is connected to each of the second output means 26, and each of the second output means 26 has a state in which the same resistance is produced in the PMOS transistor 29 of each of the second output means 26. A size of the load wi+ of each of the first output means 18 is determined by the PMOS transistor 21 of each of the first output means 18, and a size of the absolute value of the load wi− of each of the second output means 26 is determined by the PMOS transistor 29 of each of the second output means 26.
In the present embodiment, the second capture-and-storage means 27 is a capacitor (a gate capacitance of a MOS transistor may also be used as the capacitor), and a signal-transmission part 32 is connected to the second capture-and-storage means 27. The signal-transmission part 32 outputs a pulse signal at a timing (hereinafter also referred to as “second timing”) at which a voltage held in the second capture-and-storage means 27 reaches a preset second threshold. A size of the voltage held in the second capture-and-storage means 27 is determined by an amount of the electric charge stored in the second capture-and-storage means 27.
Here, a size of the first threshold is referred to as θ+, a size of the second threshold is referred to as θ−, a sum of the N+ positive loads wi+ is referred to as β+, and a sum of the absolute values of the N− negative loads wi− is referred to as β−. The β+ and the β− are expressed by equations 14 and 15 described below respectively.
Where the first timing is referred to as tν+ and the second timing is referred to as tν−, on the basis of N=N++N−, β=β+−β−, and the equation 3, the θ+ and θ− are expressed by equations 16 and 17 described below respectively.
Note that, in the equations 16 and 17, λ=1 is assumed.
Thus, where the calculated-object value (sum of N multiplied values) is separated into the positive loads wi+ and the negative loads wi−, equations 18 and 19 described below are obtained.
In the present embodiment, a value calculated by the equation 18 is referred to as a first multiply-accumulate value (sum of N+ multiplied values obtained by multiplying each of the positive loads wi+ corresponding to the N+ electric signals by each of the values of the N+ electric signals respectively), and a value calculated by the equation 19 is referred to as a second multiply-accumulate value (sum of N− multiplied values obtained by multiplying each of the absolute values of the negative loads wi− corresponding to the N− electric signals by each of the values of the N− electric signals respectively). As shown in
Thus, the arithmetic part 33 calculates the first multiply-accumulate value when detecting that the voltage held in the first capture-and-storage means 19 reaches the first threshold θ+, and calculates the second multiply-accumulate value when detecting that the voltage held in the second capture-and-storage means 27 reaches the second threshold θ−. As a result, the arithmetic part 33 obtains the calculated-object value by subtracting the second multiply-accumulate value from the first multiply-accumulate value. In the present embodiment, the signal-transmission parts 24 and 32 and the arithmetic part 33 are mainly included in a multiply-accumulate deriving means 34 deriving the calculated-object value.
An equation for obtaining the calculated-object value is expressed by an equation 20 described below.
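Equations 14 to 20 are not reproduced in this text. Applying the same ramp-crossing assumption as in the reference example separately to the positive side and the negative side, with λ=1, gives the following reconstruction. It is a sketch under those assumptions; in particular, the index i on the negative side runs over the N− signals assigned to negative loads, and the exact notation of the original is not known.

```latex
\begin{align*}
\beta^{+} &= \sum_{i=1}^{N^{+}} w_i^{+} \tag{Equation 14} \\
\beta^{-} &= \sum_{i=1}^{N^{-}} \lvert w_i^{-} \rvert \tag{Equation 15} \\
\theta^{+} &= \sum_{i=1}^{N^{+}} w_i^{+} \,(t_\nu^{+} - t_i) \tag{Equation 16} \\
\theta^{-} &= \sum_{i=1}^{N^{-}} \lvert w_i^{-} \rvert \,(t_\nu^{-} - t_i) \tag{Equation 17} \\
\sum_{i=1}^{N^{+}} w_i^{+} x_i &= \frac{\theta^{+} + \beta^{+}\,(T_{in} - t_\nu^{+})}{T_{in}} \tag{Equation 18} \\
\sum_{i=1}^{N^{-}} \lvert w_i^{-} \rvert \, x_i &= \frac{\theta^{-} + \beta^{-}\,(T_{in} - t_\nu^{-})}{T_{in}} \tag{Equation 19} \\
\sum_{i=1}^{N} w_i x_i &= \frac{\theta^{+} - \theta^{-} + \beta^{+}\,(T_{in} - t_\nu^{+}) - \beta^{-}\,(T_{in} - t_\nu^{-})}{T_{in}} \tag{Equation 20}
\end{align*}
```

The last line is the equation 18 minus the equation 19, and its right side contains the products of tν+ with β+ and of tν− with β−.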
Here, it is assumed that the arithmetic part 33 calculates the first multiply-accumulate value and the second multiply-accumulate value in the period T2. To make the calculated-object value reflect all of the electric signals given to each of the first output means 18 and all of the electric signals given to each of the second output means 26, it is necessary that the period T2 exists after the period T1. Both of the time length of the period T1 and the time length of the period T2 are the Tin. Thus, it is necessary for the first threshold θ+ and the second threshold θ− to satisfy equations 21 and 22 described below respectively.
[Math. 22]
θ+=(1+ε)λβ+Tin (Equation 21)
[Math. 23]
θ−=(1+ε)λβ−Tin (Equation 22)
By the equations 21 and 22, the first threshold θ+ is made to have a size proportional to a product of the sum β+ of the N+ loads wi+ and the length Tin of the period T1, and the second threshold θ− is made to have a size proportional to a product of the sum β− of the absolute values of the N− loads wi− and the length Tin of the period T1. As a result, it is shown that all of the electric signals given to each of the first output means 18 and all of the electric signals given to each of the second output means 26 may be reflected by the calculated-object value. In the present embodiment, the values θ+ and θ− are set in order that the equations 21 and 22 are satisfied respectively.
Moreover, a product of the tν+ and the β+ and a product of the tν− and the β− exist on the right side of the equation 20, and to calculate the calculated-object value on the basis of the equation 20, a circuit unit 35 being analog in
Thus, in the present embodiment, to simplify the circuit structure of the arithmetic part 33, a dummy load w0 (expressed by an equation 23 described below) is introduced. The dummy load w0 corresponds to a virtual electric signal having a value 0 and is obtained by multiplying −1 by the difference between the β+ (in other words, the sum of the N+ positive loads wi+) and the β− (in other words, the sum of the absolute values of the N− negative loads wi−). The absolute value of the dummy load w0 is added to the smaller one of the β+ and the β−.
By adding the dummy load w0, the β+ and the β− become equal to each other, and on the basis of the equations 21 and 22, θ+=θ−. Thus, where β+=β−=β0 is assumed, the equation 20 may be replaced with an equation 24 described below.
On the basis of the equation 24, where the positive loads wi+ and the negative loads wi− coexist with each other, the calculated-object value is obtained on the basis of the difference between the first timing and the second timing.
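The effect of the dummy load can be checked with a short numerical sketch. The Python fragment below assumes ideal linear ramps and takes the dummy load to be w0 = −(β+ − β−), attached to an input of value 0 on whichever side has the smaller load sum; under those assumptions the signed sum reduces to β0·(tν− − tν+)/Tin, which is how the equation 24 is read here. The function names are hypothetical.

```python
def crossing_time(x, w, theta, t_in=1.0, lam=1.0):
    """Time at which the summed ramps lam * w_i * (t - t_i) reach theta."""
    timings = [t_in * (1.0 - xi) for xi in x]
    return (theta / lam + sum(wi * ti for wi, ti in zip(w, timings))) / sum(w)


def signed_weighted_sum(x, w, t_in=1.0, lam=1.0, eps=0.1):
    """Signed weighted sum obtained from the difference of two crossing times."""
    # Split the inputs by the sign of their load; negative loads are stored
    # as absolute values, as in the second output means.
    pos = [(xi, wi) for xi, wi in zip(x, w) if wi > 0]
    neg = [(xi, -wi) for xi, wi in zip(x, w) if wi < 0]
    beta_pos = sum(wi for _, wi in pos)
    beta_neg = sum(wi for _, wi in neg)
    beta0 = max(beta_pos, beta_neg)
    if beta0 == 0:
        raise ValueError("at least one non-zero load is required")
    # Dummy load: tied to a virtual input of value 0, so it changes the load
    # sum of the smaller side without changing the weighted sum.
    w0 = -(beta_pos - beta_neg)
    if w0 > 0:
        pos.append((0.0, w0))
    elif w0 < 0:
        neg.append((0.0, -w0))
    theta = (1.0 + eps) * lam * beta0 * t_in   # common threshold for both sides
    t_pos = crossing_time([p[0] for p in pos], [p[1] for p in pos], theta, t_in, lam)
    t_neg = crossing_time([n[0] for n in neg], [n[1] for n in neg], theta, t_in, lam)
    return beta0 * (t_neg - t_pos) / t_in


if __name__ == "__main__":
    print(signed_weighted_sum([0.9, 0.1, 0.4, 0.7], [0.5, -1.0, 2.0, -0.25]))
```

Because both sides share the same threshold and the same load sum after the dummy load is added, only the time difference between the two pulses has to be measured, which is the simplification the dummy load is meant to provide.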
As shown in
As shown in
As shown in
As shown in
Moreover, the present invention may be applied to a spiking neural network model showing information by spike pulses. Hereinafter, with reference to
As shown in
In a normal neural network model, a bias value is input to a neuron. Thus, the signal output terminal 51 gives an electric signal indicating a value 1 as the bias value to each of the analog circuits 11, and as a result, the multiply-accumulate operation device 50 treats the bias value input to the neuron by a multiply-accumulate operation.
The neural network model repeats a processing in which a value obtained by the multiply-accumulate operation in each of the neurons is transformed nonlinearly by an activation function f expressed by an equation 25 described below and is delivered to each of the neurons in an upper layer.
In a deep neural network model in recent years, a so-called ramp function or ReLU function (See equation 26 described below) is used as the activation function.
Thus, in the multiply-accumulate operation device 50, ReLU function circuits (an example of a switch mechanism) 52 performing processing of the ReLU function (an example of the activation function) are provided between each of the layers, and the plurality of analog circuits 11 are hierarchically connected via the ReLU function circuits 52.
In a case that the calculated-object value obtained by each of the analog circuits 11 in the lower layer is positive or 0, each of the ReLU function circuits 52 transmits the electric signals indicating the calculated-object value to each of the analog circuits 11 in the upper layer as they are, and in a case that the calculated-object value output from each of the analog circuits 11 in the lower layer is negative, each of the ReLU function circuits 52 transmits electric signals indicating a value 0 (zero) to each of the analog circuits 11 in the upper layer.
Because tν+≤tν− holds in the case that the calculated-object value is positive or 0, each of the ReLU function circuits 52 transmits the electric signals indicating the calculated-object value to each of the analog circuits 11 in the upper layer in a case that the first timing is the same as or earlier than the second timing. In a case that the first timing is later than the second timing, each of the ReLU function circuits 52 transmits the electric signals indicating the value 0 to each of the analog circuits 11 in the upper layer.
Each of the ReLU function circuits 52 may include a circuit in
Each of switches 59 and 60 is connected to each of the output terminals 55 and 58 respectively, and a control part 61 that controls on/off of each of the switches 54, 57, 59, and 60 is connected to each of the switches 54, 57, 59, and 60. Moreover, a signal-transmission part 62 that transmits the electric signal corresponding to the value 0 to each of the output terminals 55 and 58 when each of the switches 59 and 60 is on is connected to each of the switches 59 and 60.
In the case that the first timing is the same as the second timing, or, in the case that the first timing is earlier than the second timing, the control part 61 turns each of the switches 54 and 57 to on and each of the switches 59 and 60 to off, and makes each of the output terminals 55 and 58 output the calculated-object value.
In the case that the first timing is later than the second timing, the control part 61 turns each of the switches 54 and 57 to off and each of the switches 59 and 60 to on, and makes each of the output terminals 55 and 58 output the electric signal indicating the value 0.
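The switching rule of the control part 61 can be written as a small decision function. The sketch below is a behavioral model only, assuming that the two timings are available as numbers and that the output is represented by a single value rather than by signals on the output terminals 55 and 58; the function name and the dictionary of switch states are hypothetical.

```python
def relu_by_timing(t_pos, t_neg, value):
    """Model of the timing comparison used by the ReLU function circuit.

    t_pos: first timing (positive side), t_neg: second timing (negative side),
    value: calculated-object value carried by the pair of timings.
    """
    pass_through = t_pos <= t_neg  # the value is positive or 0 in this case
    switches = {
        54: pass_through,          # on: forward the lower-layer signal
        57: pass_through,
        59: not pass_through,      # on: forward the signal for the value 0
        60: not pass_through,
    }
    output = value if pass_through else 0.0
    return switches, output


if __name__ == "__main__":
    print(relu_by_timing(1.2, 1.5, 0.3))
    print(relu_by_timing(1.5, 1.2, -0.3))
```

The comparison uses only the order of the two pulses, so no explicit evaluation of the sign of the calculated-object value is needed in this model.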
Moreover, a reason that the dummy load is adopted is to make each of the sum of the positive loads and the sum of the negative loads equal to each other. In a case of the neural network having the plurality of layers in
Furthermore, similarly, by changing the signal line from the lowest layer with a pair of signal lines, the dummy load is unnecessary. As shown in
As shown in
Moreover, in the embodiment described above, it is effective to apply a nonvolatile memory device such as a resistance random-access memory device or a ferroelectric-gate-type MOS transistor to the resistance or the MOS transistor that sets each of the load values, or to each of the change-over switches, in order that each of the resistance values and each of the thresholds are variable.
Next, a numerical-simulation experiment performed to confirm the effect of the present invention will be described.
In the present simulation, a multiply-accumulate operation device to which 500 input parts giving electric signals to analog circuits and signal output terminals giving bias values are connected is used, and it is confirmed whether the calculated-object value obtained in a case that the dummy load is adopted is equal to the calculated-object value obtained in a case that the dummy load is not adopted.
Each of positive loads, each of values of electric signals corresponding to the positive loads, each of negative loads, and each of values of electric signals corresponding to the negative loads are shown in
According to the result of the simulation, both in the case that the dummy load is adopted and in the case that the dummy load is not adopted, the calculated-object value is 4.718, and it is confirmed that the calculated-object values are equal to each other.
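A comparison of this kind can be reproduced in outline with the following Python sketch. It does not use the loads and signal values of the simulation described above, which are not available in this text; instead it draws 500 random input values and loads plus one bias input of value 1, with λ=1, and evaluates the weighted sum both without the dummy load (separate thresholds, in the form assumed here for the equation 20) and with the dummy load (common threshold, time difference only). All names and the random data are assumptions for illustration.

```python
import random


def crossing_time(x, w, theta, t_in):
    """Time at which the summed ramps w_i * (t - t_i) reach theta (lambda = 1)."""
    timings = [t_in * (1.0 - xi) for xi in x]
    return (theta + sum(wi * ti for wi, ti in zip(w, timings))) / sum(w)


def compare_dummy_load(n=500, seed=0, t_in=1.0, eps=0.1):
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)] + [1.0]        # last input is the bias
    w = [rng.uniform(-1.0, 1.0) for _ in range(n + 1)]
    direct = sum(wi * xi for wi, xi in zip(w, x))       # reference value

    xp = [xi for xi, wi in zip(x, w) if wi > 0]
    wp = [wi for wi in w if wi > 0]
    xn = [xi for xi, wi in zip(x, w) if wi < 0]
    wn = [-wi for wi in w if wi < 0]
    bp, bn = sum(wp), sum(wn)

    # Without the dummy load: each side has its own threshold.
    th_p = (1.0 + eps) * bp * t_in
    th_n = (1.0 + eps) * bn * t_in
    tp = crossing_time(xp, wp, th_p, t_in)
    tn = crossing_time(xn, wn, th_n, t_in)
    without = (th_p - th_n + bp * (t_in - tp) - bn * (t_in - tn)) / t_in

    # With the dummy load: equalize the load sums, then use one threshold.
    b0 = max(bp, bn)
    if bp < bn:
        xp, wp = xp + [0.0], wp + [bn - bp]
    elif bn < bp:
        xn, wn = xn + [0.0], wn + [bp - bn]
    th0 = (1.0 + eps) * b0 * t_in
    tp0 = crossing_time(xp, wp, th0, t_in)
    tn0 = crossing_time(xn, wn, th0, t_in)
    with_dummy = b0 * (tn0 - tp0) / t_in
    return direct, without, with_dummy


if __name__ == "__main__":
    print(compare_dummy_load())
```

In this idealized model the two results agree with each other and with the directly computed sum up to floating-point rounding, which mirrors the agreement reported for the simulation.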
The embodiments of the present invention have been described above. However, the present invention is not limited only to the embodiments, and a change of a condition or the like may be made without departing from the gist of the present invention in the applicable range of the present invention.
For example, the dummy load may not necessarily be adopted to obtain the calculated-object value. In the case that the dummy load is not adopted, a circuit obtaining the calculated-object value on the basis of the equation 20 may be provided.
Moreover, the processing by the activation function is not necessarily needed.
A multiply-accumulate operation device according to the present invention may improve an operation capacity of a neural network, and as a result, it is expected that the multiply-accumulate operation device according to the present invention will be applied to, and further developed for, an IoT sensing edge terminal and the like.
Number | Date | Country | Kind |
---|---|---|---|
JP2016-161182 | Aug 2016 | JP | national |
This application is a Continuation Application of application Ser. No. 16/325,003, filed Feb. 12, 2019, which is a 371 Nationalization of PCT/JP2017/028247, filed Aug. 3, 2017 and claims the benefit of Japanese Priority Patent Application JP 2016-161182 filed on Aug. 19, 2016, the entire contents of which are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
10831447 | Morie | Nov 2020 | B2 |
20050160130 | Korekado et al. | Jul 2005 | A1 |
20150339570 | Scheffler | Nov 2015 | A1 |
20170243124 | Wang | Aug 2017 | A1 |
20180357527 | Benosman et al. | Dec 2018 | A1 |
20190237137 | Buchanan | Aug 2019 | A1 |
Number | Date | Country |
---|---|---|
2004110421 | Apr 2004 | JP |
2016099707 | May 2016 | JP |
19900700894 | Dec 1990 | KR |
Entry |
---|
T. Morie et al., “An Analog-Digital Merged Neural Circuit Using Pulse Width Modulation Technique”, IEICE Trans. Fundamentals, vol. E82-A, No. 2., Feb. 1999. |
T. Tohara et al., “Silicon Nanodisk Array With a Fin Field-Effect Transistor for Time-Domain Weighted Sum Calculation Toward Massively Parallel Spiking Neural Networks”, Applied Physics Express 9, #034201 pp. 1-4, Feb. 12, 2016. |
Morie, Takashi et al., “An Analog-Digital Merged Neural Circuit Using Pulse Width Modulation Technique,” Analog Integrated Circuits and Signal Processing, vol. 25, No. 3, pp. 319-328, 2000. |
Iwata A., et al., “A Multinanodot Floating-Gate Mosfet Circuit for Spiking Neuron Models”, IEEE Transactions on Nanotechnology, IEEE Service Center, Piscataway, NJ, vol. 2, No. 3, Sep. 1, 2003, pp. 158-164 XP011100549. |
International Search Report, form PCT/ISA/210 dated Oct. 3, 2017 for corresponding International Application No. PCT/JP2017/028247. |
Extended European Search Report dated Jul. 19, 2019 for corresponding European Application No. 17841390.2. |
Japanese Notice of Allowance dated Mar. 16, 2021 for corresponding Japanese Application No. 2018-534339. |
Korean Office Action dated Aug. 23, 2021 for corresponding Korean Application No. 10-2019-7003622. |
Number | Date | Country | |
---|---|---|---|
20210081176 A1 | Mar 2021 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16325003 | US | |
Child | 17039002 | US |