This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2016-126941, filed on Jun. 27, 2016, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are directed to a neural network apparatus, and a control method of the neural network apparatus.
A biological brain contains many neurons, each of which receives signals from many other neurons and outputs signals to many other neurons. A neural network is an attempt to realize this mechanism of the brain on a computer: an engineering model that mimics the behavior of a biological nerve cell network. There are various types of neural networks, including, for example, the hierarchical neural network, which is often used for object recognition, and the undirected graph (bidirectional graph) neural network, which is used for optimization problems and image restoration.
As one example of the hierarchical neural network, a perceptron composed of two layers, an input layer and an output layer, is illustrated in
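Though the figure itself is not reproduced here, the behavior of such a two-layer perceptron can be sketched in a few lines of Python; the weight and bias values below are arbitrary illustrative choices, not values from the application:

```python
def perceptron_output(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation,
    as in a two-layer (input/output) perceptron."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0

# Hypothetical weights realizing a logical AND of two binary inputs.
assert perceptron_output([1, 1], [0.5, 0.5], -0.7) == 1
assert perceptron_output([1, 0], [0.5, 0.5], -0.7) == 0
```

Each output neuron thus fires only when the weighted sum of its inputs crosses a threshold set by the bias.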
Defining the energy E(x) of the undirected graph neural network as the expression illustrated in
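The expression for E(x) is not reproduced here; for an undirected graph network of binary units, however, the commonly used energy takes the form E(x) = −(1/2) Σij wij xi xj − Σi bi xi. A minimal Python sketch assuming that standard (Hopfield/Boltzmann) form with symmetric weights:

```python
def network_energy(x, w, b):
    """E(x) = -1/2 * sum_ij w[i][j]*x[i]*x[j] - sum_i b[i]*x[i]
    for binary states x[i] in {0, 1} and symmetric weights w."""
    n = len(x)
    pair = sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return -0.5 * pair - sum(b[i] * x[i] for i in range(n))

# Two mutually excitatory units: the state (1, 1) has the lowest energy.
w = [[0, 1], [1, 0]]
b = [0, 0]
assert network_energy([1, 1], w, b) == -1.0
```

Minimizing this energy over the states x is what the network's circuit operation performs when solving optimization problems.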
When the neural network is implemented as software, a large number of parallel operations must be executed, resulting in slow processing. Hence, there is a proposed technique for improving the processing speed of the neural network by implementing it as a hardware circuit (refer to, for example, Patent Document 1 and Non-Patent Documents 1, 2).
Examples of implementing the neural network as a circuit will be explained referring to
The digital adder 2311 adds weighted input signals w1x1, w2x2, w3x3, . . . , wnxn inputted into the neuron unit 2310 to obtain a total sum. The DA converter 2312 outputs an analog signal obtained by digital-analog converting the total sum of the weighted inputs outputted from the digital adder 2311. The ΔΣ-AD converter 2313 analog-digital converts the analog signal outputted from the DA converter 2312 into a pulse signal as a digital signal according to the amplitude of the analog signal, and outputs the pulse signal. The digital arithmetic unit 2320 multiplies a pulse signal y outputted from the neuron unit 2310 (ΔΣ-AD converter 2313) by a weight w, and outputs a weighted signal wy.
The DA converters 2331 output analog signals made by digital-analog converting weighted input signals w1x1, w2x2, w3x3, . . . , wnxn inputted into the neuron unit 2330 respectively. The analog adder 2332 adds the analog signals outputted from the DA converters 2331 to obtain the total sum. The ΔΣ-AD converter 2333 analog-digital converts the analog signal outputted from the analog adder 2332 into a pulse signal as a digital signal according to the amplitude of the analog signal, and outputs the pulse signal. The digital arithmetic unit 2340 multiplies a pulse signal y outputted from the neuron unit 2330 (ΔΣ-AD converter 2333) by a weight w, and outputs a weighted signal wy.
When the ΔΣ-AD converter is used as a determiner as in the circuit configurations illustrated in
For example, where the frequency band of the input signal is BW and the sampling frequency is fs in the ΔΣ-AD converter, the oversampling rate (OSR) is defined as fs/(2BW). Where N is the number of valid bits of resolution of the ΔΣ-AD converter, the SNR in the bandwidth of the input signal is approximately (6.02N + 1.76 − 5.17 + 30 log10(OSR)) dB for a first-order ΔΣ-AD converter, and approximately (6.02N + 1.76 − 12.9 + 50 log10(OSR)) dB for a second-order ΔΣ-AD converter. Accordingly, when the oversampling rate increases tenfold, namely, when the sampling frequency fs increases tenfold, the SNR increases by about 30 dB in the first-order ΔΣ-AD converter and by about 50 dB in the second-order ΔΣ-AD converter. As explained above, increasing the sampling frequency decreases the quantization noise in the frequency band of the input signal.
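These SNR relations can be checked numerically. The sketch below simply evaluates the two expressions given above; the resolution and OSR values are illustrative:

```python
import math

def delta_sigma_snr(n_bits, osr, order=1):
    """Approximate in-band SNR (dB) of a delta-sigma AD converter,
    using the first-order and second-order formulas from the text."""
    if order == 1:
        return 6.02 * n_bits + 1.76 - 5.17 + 30 * math.log10(osr)
    if order == 2:
        return 6.02 * n_bits + 1.76 - 12.9 + 50 * math.log10(osr)
    raise ValueError("only order 1 or 2 are covered here")

# A tenfold increase in OSR gains ~30 dB (first order) or ~50 dB (second).
gain1 = delta_sigma_snr(1, 640) - delta_sigma_snr(1, 64)
gain2 = delta_sigma_snr(1, 640, order=2) - delta_sigma_snr(1, 64, order=2)
```

This is why the embodiments can trade operating frequency (and hence power) against accuracy: raising the clock raises the OSR, which raises the in-band SNR.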
In the conventional neural network apparatus, a clock signal having a fixed frequency is supplied as the operating clock to the neuron units and the digital arithmetic units, so that all circuits operate at the same constant frequency regardless of the layer and the operation time. The same is true of the neural network apparatus of the undirected graph neural network. Therefore, the operating frequency of the neural network apparatus is restricted by the circuit that must operate at the highest frequency, and all circuits operate at the operating frequency of a layer where the calculated amount is large and high accuracy (SNR) is required, resulting in increased power consumption.
An aspect of the neural network apparatus includes: a plurality of neuron units each including: an adder that performs addition processing and one or more digital analog converters that perform digital-analog conversion processing, relating to a plurality of weighted inputs; and a delta-sigma analog digital converter that converts an analog signal indicating an added value obtained by adding all of the plurality of weighted inputs obtained from the adder and the one or more digital analog converters, into a pulse signal according to an amplitude, and outputs the pulse signal; a plurality of arithmetic units each of which multiplies the pulse signal outputted from one neuron unit by a weighted value, and outputs a result to another neuron unit; and an oscillator that is capable of changing a frequency of a clock signal to be outputted and supplies the clock signal to the neuron unit and the arithmetic unit according to control from a control unit.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Hereinafter, embodiments will be explained with reference to drawings.
A first embodiment will be explained.
Each of the neuron units 10A includes a digital adder 11, a digital analog converter (DAC) 12, and a delta-sigma analog digital converter (ΔΣ-ADC) 13. The digital adder 11 adds all weighted input signals inputted into the neuron unit 10A to obtain a total sum. The DA converter 12 digital-analog converts a total sum value of weighted inputs outputted from the digital adder 11, and outputs an analog signal according to the total sum value. The ΔΣ-AD converter 13 analog-digital converts the analog signal outputted from the DA converter 12 into a pulse signal y as a digital signal according to the amplitude of the analog signal, and outputs the pulse signal y.
The integrator 220 includes an adder 221 and a delay circuit 222. The adder 221 adds the signal u outputted from the adder (subtracter) 210 and an output from the delay circuit 222. The delay circuit 222 delays the output from the adder 221 and outputs a result. The integrator 220 integrates the signal u outputted from the adder (subtracter) 210 by using the adder 221 and the delay circuit 222, and outputs a result as a signal w. The integrator 220 is, for example, an analog integrator that includes an operational amplifier 301, a resistor 302, and a capacitor 303, and integrates an input signal VIN and outputs a result as an output signal VOUT.
The comparator (quantizer) 230 performs quantization processing on the signal w outputted from the integrator 220, and outputs a result as a 1-bit digital signal y. NQ denotes a quantization noise. The comparator 230 is, for example, a comparator whose circuit configuration is illustrated in
Further, when the clock signal CLK is at a high level, the switch 411 is turned on and the switches 407 to 410 are turned off, whereby the comparator 230 illustrated in
The delay circuit 240 delays the signal (the 1-bit digital signal y) outputted from the comparator 230 and outputs a result. The DA converter 250 digital-analog converts the digital signal delayed by the delay circuit 240 and outputs a result. The DA converter 250 gives a gain corresponding to a reciprocal of the gain of the comparator 230 to the analog signal to be outputted. Note that the configuration of the ΔΣ-AD converter and its internal configurations illustrated in
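The first-order ΔΣ-AD converter described above, a subtracter, an integrator, a 1-bit comparator (quantizer), and a delayed feedback path through a DA converter, can be sketched behaviorally as follows. This is an idealized discrete-time model for illustration, not the circuit of the embodiment:

```python
def delta_sigma_modulate(samples):
    """First-order delta-sigma modulation of a sample stream in [-1, 1]:
    integrate the input-minus-feedback error and quantize to +/-1, so
    that the output pulse density tracks the input amplitude."""
    integ = 0.0   # integrator state (adder 221 + delay circuit 222)
    fb = 0.0      # delayed 1-bit output fed back through the DA converter
    out = []
    for s in samples:
        integ += s - fb                     # subtracter 210 + integrator 220
        y = 1.0 if integ >= 0 else -1.0     # comparator (quantizer) 230
        out.append(y)
        fb = y                              # delay circuit 240 + DAC 250
    return out

# A constant input of 0.5 yields pulses whose mean approaches 0.5.
pulses = delta_sigma_modulate([0.5] * 1000)
mean = sum(pulses) / len(pulses)
```

Averaging (low-pass filtering) the pulse train recovers the input amplitude, which is exactly the property the neuron units exploit when the pulse signal y is re-weighted downstream.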
Returning to
The variable frequency oscillator 30 is an oscillator capable of changing the frequency of a clock signal CK to be outputted, and outputs a clock signal CK having a frequency according to a control signal CTL outputted from the control unit 40. The control unit 40 performs control relating to functional units to control operations to be executed in the neural network apparatus.
The DA converter 501 includes, for example, a resistor ladder circuit 511 and a switch circuit 512 as illustrated in
The voltage control oscillator 502 is, for example, a ring oscillator in which an odd number of inverters 521 are connected as illustrated in
The neural network apparatus illustrated in
For example, a variable frequency oscillator 30-(n−1) in the (n−1)-th layer supplies a clock signal CKn−1 having a frequency according to a control signal CTLn−1 relating to the (n−1)-th layer to a ΔΣ-AD converter and a digital arithmetic unit (not illustrated) in an (n−2)-th layer, and to a digital adder 11-(n−1) and a DA converter 12-(n−1) in the (n−1)-th layer. A variable frequency oscillator 30-n in the n-th layer supplies a clock signal CKn having a frequency according to a control signal CTLn relating to the n-th layer to a ΔΣ-AD converter 13-(n−1) and the digital arithmetic unit 20-(n−1) in the (n−1)-th layer, and to a digital adder 11-n and a DA converter 12-n in the n-th layer. A ΔΣ-AD converter 13-n in the n-th layer is supplied with a clock signal CKn+1 having a frequency according to a control signal CTLn+1 relating to an (n+1)-th layer from a variable frequency oscillator 30-(n+1) in the (n+1)-th layer.
Arranging the variable frequency oscillator 30 capable of changing the frequency of the clock signal CK to be outputted for each layer in the neural network apparatus makes it possible to set the operating frequency for each layer. Thus, a layer required to have high accuracy (SNR) is made to operate with high frequency to reduce the quantization noise existing in a frequency band of the input signal by noise shaping, and the other layers are made to operate with low frequency to suppress power consumption. Accordingly, it is possible to suppress and reduce the power consumption in the whole neural network apparatus while keeping high accuracy. Further, by controlling the operating frequency according to the accuracy (SNR) required, even the same layer is made to operate with low frequency to suppress power consumption in a period when low accuracy is allowable and made to operate with high frequency in a period when high accuracy is required, thereby making it possible to achieve a balance between the power consumption and the accuracy.
Each of the neuron units 10B includes DA converters (DACs) 16, an analog adder 17, and a ΔΣ-AD converter (ΔΣ-ADC) 13. The DA converters 16 individually digital-analog convert weighted inputs inputted into the neuron unit 10B, and output analog signals according to the weighted inputs. The analog adder 17 adds the analog signals outputted individually from the DA converters 16 to obtain a total sum. The ΔΣ-AD converter 13 analog-digital converts the analog signal outputted from the analog adder 17 into a pulse signal y as a digital signal according to the amplitude of the analog signal, and outputs the pulse signal y.
As in the neural network apparatus illustrated in
Also the neural network apparatus illustrated in
For example, a variable frequency oscillator 30-(n−1) in the (n−1)-th layer supplies a clock signal CKn−1 having a frequency according to a control signal CTLn−1 relating to the (n−1)-th layer to a ΔΣ-AD converter and a digital arithmetic unit (not illustrated) in an (n−2)-th layer, and to a DA converter 16-(n−1) in the (n−1)-th layer. A variable frequency oscillator 30-n in the n-th layer supplies a clock signal CKn having a frequency according to a control signal CTLn relating to the n-th layer to a ΔΣ-AD converter 13-(n−1) and the digital arithmetic unit 20-(n−1) in the (n−1)-th layer, and to a DA converter 16-n in the n-th layer. A ΔΣ-AD converter 13-n in the n-th layer is supplied with a clock signal CKn+1 having a frequency according to a control signal CTLn+1 relating to an (n+1)-th layer from a variable frequency oscillator 30-(n+1) in the (n+1)-th layer.
Also in the thus configured neural network apparatus illustrated in
Note that the neural network apparatus including the variable frequency oscillator 30 in each layer is illustrated in the above-explained configuration example, but the neural network apparatus is not limited to this configuration, and the clock signal CK may be supplied from one variable frequency oscillator 30 to a plurality of layers. Besides, the clock signal CK may be supplied from one variable frequency oscillator 30 to all of the neuron units 10A(10B) and the digital arithmetic units 20 as illustrated, for example, in
Next, a control example of the neural network apparatus of the hierarchical neural network in the first embodiment will be explained.
A first control example in which the neuron unit and the digital arithmetic unit are made to operate with an operating frequency according to the required level of the accuracy (SNR) for each layer in the neural network apparatus will be explained.
In a convolution layer 702, a product-sum operation of input data 701 of an image and the numerical values of a filter is repeated, and the result of the product-sum operation is outputted to the next layer via an output function. In a max-pooling layer 703, the output having the highest numerical value is selected from each block of the output from the convolution layer 702, thereby reducing the number of data pieces and the calculation amount. In a convolution layer 704, the same processing as in the convolution layer 702 is performed using the output data from the max-pooling layer 703. In a max-pooling layer 705, the same processing as in the max-pooling layer 703 is performed on the output from the convolution layer 704.
In a full-connect layer 706, the output values from the neuron units in the max-pooling layer 705 are weighted and added all together. In a ReLU layer 707, any negative output from the full-connect layer 706 is converted into 0. In a full-connect layer 708, the output values from the neuron units in the ReLU layer 707 are weighted and added all together. In a softmax layer 709, final recognition is performed to determine what the input data 701 represents. Of the above layers, the convolution layers 702, 704 and the full-connect layers 706, 708 must perform an extremely large number of arithmetic operations and are required to have high accuracy. On the other hand, the max-pooling layers 703, 705 and the ReLU layer 707 may have low accuracy. The softmax layer 709 is required to have accuracy at a middle level, though not as high as that of the convolution layers 702, 704 and the full-connect layers 706, 708.
Hence, as illustrated in
Supplying a clock signal of an appropriate frequency to each layer in the neural network apparatus makes it possible to operate the neuron units and the digital arithmetic units in each layer at that frequency, thereby reducing power consumption compared with operating all of them at the same constant frequency. Further, the neuron units and the digital arithmetic units in a layer required to have high accuracy are made to operate at a high frequency and can thus maintain that accuracy.
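The per-layer assignment described above amounts to a lookup from required accuracy to clock frequency. A minimal sketch, where the accuracy levels follow the text but the frequency values (in MHz) are assumptions for illustration, not values from the application:

```python
# Required accuracy per layer, as described in the text.
LAYER_ACCURACY = {
    "conv_702": "high", "maxpool_703": "low",
    "conv_704": "high", "maxpool_705": "low",
    "fc_706": "high",   "relu_707": "low",
    "fc_708": "high",   "softmax_709": "middle",
}
# Hypothetical clock frequencies (MHz) for each accuracy level.
CLOCK_MHZ = {"high": 200, "middle": 100, "low": 50}

def layer_clock_mhz(layer):
    """Clock frequency to supply to a layer's variable frequency
    oscillator, chosen by the layer's required accuracy level."""
    return CLOCK_MHZ[LAYER_ACCURACY[layer]]
```

A control unit would use such a mapping to drive each layer's oscillator control signal CTL.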
Next, a second control example will be explained, in which a test is performed each time learning has been iterated a certain number of times to detect an accuracy rate, and the operating frequency of the neuron units and the digital arithmetic units is switched according to the detection result. In the case where learning is performed at a certain constant learning rate in the hierarchical neural network, the accuracy rate, measured by a test each time learning has been iterated a certain number of times, changes like an accuracy rate 801 illustrated in
In the neural network apparatus implemented with a circuit of the hierarchical neural network as explained above, it is considered that, when the SNR is low, the values calculated in learning become buried in noise once the accuracy rate has risen to some level, and the accuracy rate no longer increases even if learning is iterated further. For example, as illustrated in
To solve this, it suffices to operate at a high operating frequency to increase the SNR and thereby the accuracy of the circuit. However, operating at a high frequency from the start of learning wastes power. In the second control example, the operating frequency of the neuron units and the digital arithmetic units is therefore switched according to the detected accuracy rate so that the operating frequency increases stepwise. In more detail, when the accuracy rate measured by the test performed after a certain number of learning iterations is not higher than the accuracy rate at the previous time, namely, is equal to or lower than it, the operating frequency is switched to the next stage, higher than the current operating frequency.
For example, as in an example illustrated in
At step S1002, the control unit 40 starts learning in the neural network apparatus and executes the circuit operation a certain number of times. Then, after the circuit operation has been performed the certain number of times, the control unit 40 performs a test to acquire an accuracy rate (A1) at step S1003.
Next, at step S1004, the control unit 40 performs learning in the neural network apparatus and executes the circuit operation a certain number of times. Then, after the circuit operation has been performed the certain number of times, the control unit 40 performs a test to acquire an accuracy rate (A2) at step S1005. Subsequently, at step S1006, the control unit 40 compares the accuracy rate (A1) acquired at the previous time with the accuracy rate (A2) acquired at this time. When the accuracy rate (A2) is higher than the accuracy rate (A1), namely, when the accuracy rate of the test at this time is higher than that of the test at the previous time, the control unit 40 substitutes the accuracy rate (A2) into the accuracy rate (A1) (updates the accuracy rate (A1) with the accuracy rate (A2)) at step S1007, and returns to step S1004 to perform learning without changing the operating frequency.
On the other hand, when the accuracy rate (A2) is not higher than the accuracy rate (A1) as a result of the comparison at step S1006, namely, when the accuracy rate of the test at this time is equal to or lower than that of the test at the previous time, the control unit 40 determines at step S1008 whether the current operating frequency is the highest set value. When the current operating frequency is not the highest set value, the control unit 40 substitutes the accuracy rate (A2) into the accuracy rate (A1) (updates the accuracy rate (A1) with the accuracy rate (A2)) at step S1009, increases the operating frequency by an arbitrary value (sets the operating frequency to that at the next stage) at step S1010, and returns to step S1004 to perform learning at an operating frequency higher than the previous one (with an increased SNR).
On the other hand, when the current operating frequency is the highest set value as a result of the determination at step S1008, the control unit 40 performs control relating to data analysis processing of executing final processing at step S1011, obtains a final result, and then ends the operation. Note that the processing at step S1009 and the processing at step S1010 explained above need not be performed in this order; the processing at step S1010 may be performed before, or concurrently with, the processing at step S1009.
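The flow of steps S1002 to S1011 can be sketched as a control loop. The callback interface and the toy accuracy model below are illustrative assumptions, not the apparatus itself:

```python
def train_with_adaptive_clock(run_learning, freqs):
    """Sketch of the second control example: iterate learning, test the
    accuracy rate, and step up to the next operating frequency whenever
    the rate stops improving; finish once the highest frequency saturates.
    `run_learning(freq)` is an assumed callback that performs a fixed
    number of learning iterations at `freq` and returns the accuracy rate."""
    level = 0
    a1 = run_learning(freqs[level])          # S1002-S1003: first test
    while True:
        a2 = run_learning(freqs[level])      # S1004-S1005: learn and retest
        if a2 > a1:                          # S1006: still improving
            a1 = a2                          # S1007: keep frequency
        elif level + 1 < len(freqs):         # S1008: not at highest yet
            a1 = a2                          # S1009: update accuracy
            level += 1                       # S1010: next-stage frequency
        else:
            return a2, freqs[level]          # S1011: final processing

# Assumed toy model: accuracy climbs by 0.1 per block but saturates at a
# cap set by the SNR available at each operating frequency (MHz).
def make_model():
    state = {"acc": 0.0}
    def run(freq):
        cap = {100: 0.6, 200: 0.8, 400: 0.9}[freq]
        state["acc"] = min(cap, state["acc"] + 0.1)
        return state["acc"]
    return run

final_acc, final_freq = train_with_adaptive_clock(make_model(), [100, 200, 400])
```

Under this model the loop trains at 100 MHz until the accuracy plateaus, then steps through 200 MHz and 400 MHz, reaching the highest accuracy only at the highest frequency, while earlier learning runs at lower power.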
Next, a third control example will be explained, in which switching of the operating frequency of the neuron units and the digital arithmetic units is controlled according to the number of times learning has been iterated (the learning rate) in the neural network apparatus. For example, in AlexNet, a type of hierarchical neural network, when the learning rate is decreased each time learning has been iterated a certain number of times and learning is then iterated further, the accuracy rate improves. When control is performed to decrease the learning rate in this manner, it is conceivable that, if the SNR is low, the values calculated in learning become buried in noise once the learning rate is decreased, and learning is no longer performed normally.
The above-explained inconvenience can be solved by operating at a high operating frequency to increase the SNR. However, operating at a high frequency from the start of learning, where the learning rate is set high, wastes power. In the third control example, in the neural network apparatus controlled to decrease the learning rate each time learning has been iterated a certain number of times, the operating frequency of the neuron units and the digital arithmetic units is switched according to the number of iterations (the learning rate) so as to increase the operating frequency stepwise.
For example, as in an example illustrated in
Next, at step S1202, the control unit 40 performs learning in the neural network apparatus and executes the circuit operation a certain number of times. Then, after the circuit operation has been performed the certain number of times, the control unit 40 decreases the learning rate to that at the next stage and increases the operating frequency by an arbitrary value (sets the operating frequency to that at the next stage) at step S1203. Subsequently, at step S1204, the control unit 40 determines whether the operating frequency is the highest set value. When the operating frequency is not the highest set value, the control unit 40 returns to step S1202 and performs learning at an operating frequency higher than the previous one (with an increased SNR). On the other hand, when the operating frequency is the highest set value, the control unit 40 performs control relating to data analysis processing of executing final processing at step S1205, obtains a final result, and then ends the operation.
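The paired schedule of steps S1202 to S1204, lowering the learning rate one stage while raising the operating frequency one stage per block of iterations, can be sketched as follows; all numeric values are assumed for illustration:

```python
def annealed_schedule(lr0, lr_decay, freq0, freq_step, freq_max):
    """Sketch of the third control example: each time a block of learning
    iterations finishes, decrease the learning rate one stage and raise
    the operating frequency one stage, until the frequency reaches its
    highest set value. Returns the (learning_rate, frequency) pairs used."""
    stages = []
    lr, freq = lr0, freq0
    while freq <= freq_max:
        stages.append((lr, freq))   # S1202: learn at this setting
        lr *= lr_decay              # S1203: next-stage learning rate
        freq += freq_step           # S1203: next-stage frequency
    return stages

stages = annealed_schedule(lr0=0.1, lr_decay=0.5,
                           freq0=100, freq_step=100, freq_max=400)
```

Early stages thus combine a high learning rate with a low-power clock, and only the final fine-tuning stages, where small updates must not be buried in quantization noise, run at the highest frequency.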
Next, a second embodiment will be explained.
In the neural network apparatus illustrated in
Each of the neuron units 1410 includes a digital adder 1411, a DA converter (DAC) 1412, and a ΔΣ-AD converter (ΔΣ-ADC) 1413. The digital adder 1411 adds all weighted input signals inputted into the neuron unit 1410 to obtain a total sum. The DA converter 1412 digital-analog converts a total sum value of the weighted inputs outputted from the digital adder 1411, and outputs an analog signal according to the total sum value. The ΔΣ-AD converter 1413 analog-digital converts the analog signal outputted from the DA converter 1412 into a pulse signal y as a digital signal according to the amplitude of the analog signal, and outputs the pulse signal y.
The digital arithmetic unit 1420 multiplies the pulse signal y, inputted as a digital signal, by a weight value w, and outputs a weighted signal. The variable frequency oscillator 1430 is an oscillator capable of changing the frequency of a clock signal CK to be outputted, and outputs a clock signal CK having a frequency according to a control signal CTL outputted from the control unit 1440 to all of the neuron units 1410 and the digital arithmetic units 1420 of the neural network apparatus. The control unit 1440 performs control relating to functional units to control operations to be executed in the neural network apparatus.
Note that the configuration of the ΔΣ-AD converter 1413 and its internal configurations and the configuration of the variable frequency oscillator 1430 are the same as the configuration of the ΔΣ-AD converter 13 and its internal configurations and the configuration of the variable frequency oscillator 30 in the first embodiment. Besides, the neuron unit 1410 obtains the total sum of the weighted inputs using the digital adder in FIG. 14, but may be a circuit that obtains the total sum of the weighted inputs using an analog adder similarly to the neuron unit 10B in the first embodiment. In the case of using the neuron unit that obtains the total sum of the weighted inputs using the analog adder, the variable frequency oscillator 1430 supplies the clock signal CK having the frequency according to the control signal CTL to the DA converters and the ΔΣ-AD converters of the neuron units 1410 and to the digital arithmetic units.
Arranging the variable frequency oscillator 1430 capable of changing the frequency of the clock signal CK to be outputted makes it possible to change the operating frequency in the neural network apparatus according to the required accuracy, the temperature parameter in the Boltzmann machine and the like. This makes it possible to suppress and reduce the power consumption in the whole neural network apparatus while keeping high accuracy to achieve a balance between the power consumption and the accuracy. For example, the neural network apparatus is made to operate with high frequency to reduce the quantization noise existing in a frequency band of the input signal by noise shaping in a period when high accuracy is required, whereas the neural network apparatus is made to operate with low frequency to suppress the power consumption in a period when low accuracy is allowable.
Next, a control example of the neural network apparatus of the undirected graph neural network in the second embodiment will be explained.
The temperature parameter in the Boltzmann machine will be explained.
In the Boltzmann machine, applying thermal noise enables transitions even in the direction of increasing energy, up to a certain magnitude; the larger the value of the temperature parameter T, the larger the thermal noise, and the larger the energy difference across which a transition remains possible. For example, in the Boltzmann machine, applying appropriate thermal noise via the temperature parameter T enables the circuit operation to converge to the optimal solution 1601 even if the energy has converged to one of the local solutions 1602 to 1605.
For example, an artificial neuron 1701 is assumed to output 1 when a value obtained by adding a noise n to a local field hi (= x1wi1 + . . . + xjwij + . . . + xNwiN + bi), the total sum of the weighted inputs, is 0 or more, and to output 0 when it is less than 0, as illustrated in
The function indicating this probability is the sigmoid function, and its gradient changes according to the value of the temperature parameter T, as in the example illustrated in
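As a sketch, the firing probability of such a stochastic neuron is commonly written as a sigmoid of the local field scaled by the temperature, P(output = 1) = 1/(1 + exp(−hi/T)); the values below are illustrative:

```python
import math

def firing_probability(local_field, temperature):
    """Probability that a stochastic neuron in a Boltzmann machine
    outputs 1: a sigmoid of the local field h_i scaled by the
    temperature parameter T. Larger T flattens the curve (more noise)."""
    return 1.0 / (1.0 + math.exp(-local_field / temperature))

# Higher temperature pushes the probability toward 0.5 (more randomness),
# which is what allows escapes from local energy minima.
p_cold = firing_probability(1.0, 0.5)
p_hot = firing_probability(1.0, 10.0)
```

At low T the neuron behaves almost deterministically, while at high T its output approaches a coin flip, matching the gradient change of the sigmoid described above.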
Returning to
Next, at step S1504, the control unit 1440 executes the circuit operation of the neural network apparatus (Boltzmann machine) a certain number of times. Then, after the circuit operation has been performed the certain number of times, the control unit 1440 decreases the value of the temperature parameter T by an arbitrary value at step S1505, and increases the operating frequency by an arbitrary value at step S1506. Subsequently, at step S1507, the control unit 1440 determines whether the value of the temperature parameter T is the minimum set value (end value). When the value of the temperature parameter T is not the minimum set value (end value), the control unit 1440 returns to step S1504 and performs the circuit operation at an operating frequency higher than the previous one (with an increased SNR). On the other hand, when the value of the temperature parameter T is the minimum set value (end value), the control unit 1440 performs control relating to data analysis processing of executing final processing at step S1508, obtains a final result, and then ends the operation.
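The loop of steps S1504 to S1507 can be sketched as an annealing schedule that pairs each temperature stage with an operating frequency; the parameter values are assumptions for illustration:

```python
def anneal_with_clock(t_start, t_end, t_step, f_start, f_step):
    """Sketch of the second embodiment's control flow: run the circuit,
    lower the temperature parameter T by a fixed amount (S1505), raise
    the operating frequency (S1506), and stop once T reaches its end
    value (S1507). Returns the (T, frequency) pairs used per stage."""
    t, f = t_start, f_start
    schedule = []
    while t > t_end:
        schedule.append((t, f))   # S1504: circuit operation at (T, freq)
        t -= t_step               # S1505: decrease temperature parameter
        f += f_step               # S1506: increase operating frequency
    schedule.append((t, f))       # final run at the end value of T
    return schedule

sched = anneal_with_clock(t_start=4.0, t_end=1.0, t_step=1.0,
                          f_start=100, f_step=50)
```

High-temperature stages, which tolerate noise, run at a low-power clock, while the final low-temperature stages, which need a high SNR, run at the highest frequency, which is the balance the second embodiment aims for.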
Such control of the temperature parameter and the operating frequency adjusts the frequency of the clock signal CK outputted from the variable frequency oscillator 1430 so that the operating frequency 2002 increases every time the value 2001 of the temperature parameter decreases, as illustrated in
It should be noted that the above embodiments merely illustrate concrete examples of implementing the present invention, and the technical scope of the present invention is not to be construed in a restrictive manner by these embodiments. That is, the present invention may be implemented in various forms without departing from the technical spirit or main features thereof.
In an aspect of the embodiments, it is possible to control the operating frequency according to the required accuracy, thereby reducing the power consumption in the whole apparatus while keeping high accuracy.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.