The present invention relates to artificial neural networks. In particular, the present invention relates to techniques for implementing artificial neurons.
The idea of artificial neural networks has existed for a long time. Nevertheless, the limited computation ability of hardware had long been an obstacle to related research. Over the last decade, there has been significant progress in the computation capabilities of processors and in machine learning algorithms. Only recently has it become possible for an artificial neural network to generate reliable judgements. Gradually, artificial neural networks are being experimented with in many fields, such as autonomous vehicles, image recognition, natural language understanding, and data mining.
Neurons are the basic computation units in a brain. Each neuron receives input signals from its dendrites and produces output signals along its single axon (usually provided to other neurons as input signals). The typical operation of an artificial neuron can be modeled as:

y = f( Σᵢ (wᵢ · xᵢ) + b )
wherein xᵢ represents the i-th input signal and y represents the output signal. Each dendrite multiplies its input signal xᵢ by a weight wᵢ; this parameter is used to simulate the strength of the influence of one neuron on another. The symbol b represents a bias contributed by the artificial neuron itself. During the process of machine learning, the weights and the bias of a neuron may be modified over and over again. Therefore, these parameters are also called learnable parameters. The symbol f represents an activation function and is generally implemented as a sigmoid function, a hyperbolic tangent (tanh) function, or a rectified linear function in practical computation.
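For illustration only, the neuron model above can be sketched in a few lines of software; the function name, the example values, and the choice of tanh as the activation function f are assumptions made for this sketch rather than limitations of the model.

```python
import math

def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Single artificial neuron: y = f(sum_i(w_i * x_i) + b)."""
    # Each dendrite multiplies its input signal x_i by a weight w_i.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # The bias b is contributed by the neuron itself; f is the activation function.
    return activation(weighted_sum + bias)

# Example with three input signals and arbitrarily chosen learnable parameters.
print(neuron_output([0.5, -1.0, 2.0], [0.8, 0.2, -0.5], bias=0.1))
```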
Currently, most artificial neural networks are designed with a multi-layer structure. The layers serially connected between the input layer and the output layer are called hidden layers. The input layer receives external data and does not perform computation. In a hidden layer or the output layer, the input signals are the output signals generated by the previous layer, and each artificial neuron therein performs a computation according to the aforementioned equation; a sketch of this layer-by-layer computation is given below. Each hidden layer and the output layer can respectively be a convolutional layer or a fully-connected layer. At present, there are a variety of network structures, and each structure has its unique combination of convolutional layers and fully-connected layers. Taking the AlexNet structure proposed by Alex Krizhevsky et al. in 2012 as an example, the network includes 650,000 artificial neurons that form five convolutional layers and three fully-connected layers connected in series. When a complicated judgement is required, an artificial neural network may include up to twenty-nine computational layers.
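The following sketch stacks the neuron equation into fully-connected layers to make the layer-by-layer computation concrete; the layer sizes and parameter values are illustrative assumptions, and convolutional layers are omitted for brevity.

```python
import math

def layer_forward(inputs, weight_matrix, biases, activation=math.tanh):
    """One fully-connected layer: every neuron applies the neuron equation
    to the output signals of the previous layer."""
    return [activation(sum(w * x for w, x in zip(weights, inputs)) + b)
            for weights, b in zip(weight_matrix, biases)]

# The input layer only forwards external data; hidden and output layers compute.
external_data = [0.2, 0.7]
hidden = layer_forward(external_data, [[0.1, -0.4], [0.3, 0.8], [0.5, 0.5]], [0.0, 0.1, -0.2])
output = layer_forward(hidden, [[0.6, -0.1, 0.2]], [0.05])
print(output)
```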
To handle such a huge amount of computation, an artificial neural network is, at present, usually implemented on a supercomputer or a multi-core central processing unit. Because these large-scale processors are originally designed for performing diverse computations, they contain many generic computation units (e.g., circuits for performing addition, subtraction, multiplication, division, trigonometric functions, exponential functions, logarithmic functions, etc.) and many logic units (e.g., AND gates, OR gates, XOR gates, etc.). However, for the computations in an artificial neural network, many circuits in these large-scale processors are unnecessary or unsuitable. Implementing an artificial neural network in this way therefore usually wastes hardware resources. In other words, such an artificial neural network may include many dispensable circuits, and the overall cost is raised.
To solve the aforementioned problem, a new artificial neuron and controlling method thereof are provided.
One embodiment according to the invention is a neural network including a controller and a plurality of neurons. The controller is configured to generate a forward propagation instruction in a computation process. Each neuron includes an instruction register, a storage device, and an application-specific computation circuit. The instruction register is configured to receive and temporarily store the forward propagation instruction provided by the controller. The storage device is configured to store at least one input and at least one learnable parameter for this neuron. The application-specific computation circuit is dedicated to computations related to the neuron. In response to the forward propagation instruction received by the instruction register, the application-specific computation circuit performs a computation on the at least one input and the at least one learnable parameter according to an activation function and feeds back a computation result to the storage device.
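A minimal software sketch may help clarify the interplay between the controller and one such neuron. The names below (FORWARD_PROPAGATION, Neuron.receive, Neuron.execute) and the use of tanh as the activation function are hypothetical choices made for this sketch; in the embodiment itself the neuron is a hardware circuit, not software.

```python
import math
from dataclasses import dataclass

FORWARD_PROPAGATION = "forward"   # hypothetical encoding of the controller's instruction

@dataclass
class Neuron:
    inputs: list           # input data held in the storage device
    weights: list          # learnable parameters held in the storage device
    bias: float
    instruction: str = ""  # contents of the instruction register
    result: float = 0.0    # computation result fed back to the storage device

    def receive(self, instruction):
        # The instruction register temporarily stores the instruction from the controller.
        self.instruction = instruction

    def execute(self):
        # The application-specific computation circuit reacts to the stored instruction.
        if self.instruction == FORWARD_PROPAGATION:
            s = sum(w * x for w, x in zip(self.weights, self.inputs)) + self.bias
            self.result = math.tanh(s)   # activation function assumed to be tanh here

# The controller provides the forward propagation instruction to every neuron.
neurons = [Neuron([0.3, 0.9], [0.5, -0.2], 0.1), Neuron([0.3, 0.9], [0.7, 0.4], -0.3)]
for n in neurons:
    n.receive(FORWARD_PROPAGATION)
    n.execute()
print([round(n.result, 3) for n in neurons])
```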
Another embodiment according to the invention is an artificial neuron including a storage device and a computation circuit. The storage device is configured to store at least one input, at least one learnable parameter, and a look-up table including plural sets of parameters that describe an activation function. The computation circuit is configured to first generate an index based on the at least one input and the at least one learnable parameter, and then to find, based on the look-up table, the output value of the activation function corresponding to the index as a computation result of this neuron.
Another embodiment according to the invention is a controlling method for an artificial neuron. First, an index is generated based on at least one input and at least one learnable parameter of this artificial neuron. Then, based on a look-up table including plural sets of parameters that describe an activation function, the output value of the activation function corresponding to the index is found and taken as a computation result of this artificial neuron.
The advantage and spirit of the invention may be understood by the following recitations together with the appended drawings.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
The figures described herein include schematic block diagrams illustrating various interoperating functional modules. It should be noted that such diagrams are not intended to serve as electrical schematics and interconnections illustrated are intended to depict signal flow, various interoperations between functional components and/or processes and are not necessarily direct electrical connections between such components. Moreover, the functionality illustrated and described via separate components need not be distributed as shown, and the discrete blocks in the diagrams are not necessarily intended to depict discrete electrical components.
One embodiment according to the invention is a neural network including a controller and a plurality of neurons.
Please refer to
If plural sets of training data are sequentially provided to the neural network 100, the controller 140 can alternately and repeatedly send out the above two instructions to the neurons. The learnable parameters of the neurons will accordingly be modified over and over again until the difference between the ideal results and the training results converges below a predetermined threshold. The training process is completed at that time. Thereafter, in normal computation processes, the controller 140 can generate and send out a forward propagation instruction, so as to request the neurons to perform computations according to the learnable parameters determined by the training process.
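The training sequence described above can be sketched as the following loop. The toy single-neuron model, the delta-style update, the learning rate, and the threshold value are all assumptions made for illustration; they are not part of the embodiment.

```python
# Toy "network": one weight and one bias, trained on pairs (input, ideal output).
w, b, lr = 0.0, 0.0, 0.1
training_set = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]   # ideal output = input + 1
threshold = 1e-4

for epoch in range(10_000):
    worst_error = 0.0
    for x, ideal in training_set:
        y = w * x + b                    # forward propagation instruction: compute the result
        error = y - ideal
        w -= lr * error * x              # backward propagation instruction: modify the
        b -= lr * error                  # learnable parameters over and over again
        worst_error = max(worst_error, abs(error))
    if worst_error < threshold:          # difference converged below the predetermined threshold
        break

print(epoch, round(w, 3), round(b, 3))   # training completed; parameters fixed for normal use
```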
In the neural network 100, each neuron respectively includes an instruction register, a storage device, and an application-specific computation circuit. The neuron 121 in the hidden layer 120 is taken as an example, and the connections between its components are illustrated in
The application-specific computation circuit 121C is specifically designed for the computations for which the neuron 121 is responsible. In other words, the application-specific computation circuit 121C is dedicated to computations related to the neuron 121. First, consider computations related to a forward propagation instruction. In response to a forward propagation instruction received by the instruction register 121A, the application-specific computation circuit 121C performs computations on the input data and the learnable parameters stored in the storage device 121B. Then, the computation result is fed back to the storage device 121B. If the activation function of the neuron 121 is a hyperbolic tangent function, then, with respect to computations related to a forward propagation instruction, the application-specific computation circuit 121C can be fixedly configured to include only the circuits needed for performing a hyperbolic tangent function. For example, the application-specific computation circuit 121C can include only plural multipliers, one adder, one divider, and a circuit for performing an exponential function. The multipliers are configured to multiply each input datum by a corresponding weight w. The adder sums up the weighted values together with a bias b; this summation result is the input value of the activation function. Then, based on the input value, the divider and the exponential-function circuit can generate the corresponding output value of the hyperbolic tangent function.
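For reference, the fixed tanh datapath described above can be mirrored in software as follows; writing tanh as (e^(2z) - 1) / (e^(2z) + 1) illustrates why plural multipliers, one adder, one exponential-function circuit, and one divider suffice. The parameter values are illustrative assumptions.

```python
import math

def tanh_neuron(inputs, weights, bias):
    # Multipliers: weight each input; adder: sum the weighted values with the bias b.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # One exponential-function circuit and one divider realize the activation:
    # tanh(z) = (e^(2z) - 1) / (e^(2z) + 1)
    e = math.exp(2.0 * z)
    return (e - 1.0) / (e + 1.0)

print(tanh_neuron([0.5, -1.5], [0.4, 0.3], 0.2))   # equals math.tanh of the same sum
```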
Now consider computations related to a backward propagation instruction. In response to a backward propagation instruction received by the instruction register 121A, the application-specific computation circuit 121C performs a backward propagation computation and feeds back its computation results (i.e., modified learnable parameters) to the storage device 121B. With respect to computations related to a backward propagation instruction, the application-specific computation circuit 121C can be fixedly configured to include only a subtractor, an adder, and a multiplier.
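One simple way to read the subtractor/adder/multiplier combination is a delta-rule style parameter update, sketched below. The error definition and learning rate are assumptions for illustration, not the only backward propagation computation the circuit could implement.

```python
def backward_step(weights, bias, inputs, ideal, actual, learning_rate=0.01):
    """Backward propagation sketch using only subtraction, addition, and multiplication."""
    error = ideal - actual                          # subtractor
    new_weights = [w + learning_rate * error * x    # multipliers and adder
                   for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * error
    # The modified learnable parameters are fed back to the storage device.
    return new_weights, new_bias

print(backward_step([0.5, -0.2], 0.1, [1.0, 2.0], ideal=1.0, actual=0.6))
```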
Practically, since a forward propagation instruction and a backward propagation instruction do not occur at the same time, the circuits for performing these two instructions can be shared, so as to further reduce the number of components in the application-specific computation circuit 121C.
It is noted that the computation details related to the above two instructions may vary considerably. For example, the activation function of the neuron 121 can instead be a sigmoid function, a rectified linear function, or a multi-segment linear function. For different activation functions, the circuit components included in the application-specific computation circuit 121C may be different. For another example, the same activation function can usually be represented by a variety of mathematical equations; accordingly, the required circuit components would also differ. The variations of each activation function, the computation details, and the corresponding circuit components will be understood by those skilled in the art and are not enumerated here.
In summary, the application-specific computation circuit 121C can include only circuit components for performing computations related to forward and backward propagation instructions. Compared with a large-scale processor, the neuron structure and number of circuits shown in
The scope of the invention is not limited to a specific storage mechanism. Practically, the storage device 121B can include one or more volatile or non-volatile memory devices, such as a dynamic random access memory (DRAM), a magnetic memory, an optical memory, a flash memory, etc. Physically, the storage device 121B can be a single device disposed adjacent to the application-specific computation circuit 121C. Alternatively, the storage devices of plural neurons can be integrated into a larger memory.
Moreover, the controller 140 can be implemented by a variety of fixed and/or programmable logic, such as field-programmable logic, application-specific integrated circuits, microcontrollers, microprocessors, and digital signal processors. The controller 140 may also be designed to execute a process stored in a memory as executable instructions.
In one embodiment, the storage device 121B further stores a look-up table including plural sets of parameters that describe an activation function. More specifically, the plural sets of parameters describe the input/output relationship of the activation function. Under this condition, the application-specific computation circuit 121C can be configured to include only plural multipliers and one adder for generating an index based on the input data and the learnable parameters of the neuron 121. The index is an input to the activation function. Subsequently, based on the look-up table, the application-specific computation circuit 121C finds the output value of the activation function corresponding to the index. This output value is the computation result of the neuron 121. The advantage of utilizing a look-up table here is that the non-linear computations related to the activation function can be omitted and the circuit components in the application-specific computation circuit 121C can be further simplified. For instance, the divider and the exponential-function circuit are not required.
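The look-up-table variant can be sketched as follows; the table range, the step size, and the choice of tanh as the tabulated activation function are assumptions made for this example.

```python
import math

# Plural sets of parameters describing the input/output relationship of the activation function.
STEP, LOW, HIGH = 0.05, -4.0, 4.0
N = int(round((HIGH - LOW) / STEP))                 # number of steps covered by the table
TABLE = [math.tanh(LOW + i * STEP) for i in range(N + 1)]

def lut_neuron(inputs, weights, bias):
    # Plural multipliers and one adder generate the index value (the activation input).
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The non-linear computation is replaced by a table access: quantize and clamp the index.
    i = min(N, max(0, round((z - LOW) / STEP)))
    return TABLE[i]

print(lut_neuron([0.5, -1.5], [0.4, 0.3], 0.2))     # close to math.tanh of the same sum
```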
In another embodiment, the application-specific computation circuit 121C is dedicated to a limited number of computations respectively corresponding to different activation functions. For example, the application-specific computation circuit 121C can include two sets of circuits: one set for performing computations corresponding to a hyperbolic tangent function, and the other set for performing computations corresponding to a multi-segment linear function. When the neural network 100 is dealing with complicated judgements, the user can, through the controller 140, request the application-specific computation circuit 121C to take the hyperbolic tangent function as its activation function. Conversely, when the neural network 100 is dealing with simple judgements, the application-specific computation circuit 121C can be set to take the multi-segment linear function as its activation function. It is noted that circuit components related to these two functions may be shared. The advantage of this practice is that considerable computation flexibility can be provided without adding too much hardware to the application-specific computation circuit 121C.
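A software sketch of the two selectable activation options might look as follows; the breakpoints of the multi-segment linear function and the mode names are assumptions made for illustration.

```python
import math

def multi_segment_linear(z):
    """Piecewise-linear saturating activation (segment breakpoints assumed for illustration)."""
    if z <= -1.0:
        return -1.0
    if z >= 1.0:
        return 1.0
    return z

def configurable_neuron(inputs, weights, bias, mode="tanh"):
    # The weighted-sum circuitry (multipliers and adder) is shared by both options.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The controller selects which dedicated activation circuit is used.
    return math.tanh(z) if mode == "tanh" else multi_segment_linear(z)

print(configurable_neuron([0.5, 2.0], [0.8, -0.3], 0.1, mode="tanh"))     # complicated judgements
print(configurable_neuron([0.5, 2.0], [0.8, -0.3], 0.1, mode="linear"))   # simple judgements
```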
In another embodiment, the neural network 100 is reconfigurable. In other words, the routing between neurons can be modified so as to change the structure of the neural network 100. Under this condition, the controller 140 can be configured to perform a reconfiguration process in which some neurons in the neural network 100 can optionally be abandoned. For example, if, after a training process, the controller 140 finds that the neurons 123 and 124 have little influence on the final output generated by the output layer 130, the controller 140 can generate an abandoning instruction and provide the abandoning instruction to the neurons 123 and 124. Thereby, the controller 140 requests the application-specific computation circuits in the neurons 123 and 124 not to perform any computation.
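The abandoning instruction can be pictured with the following sketch; the instruction encoding and the zero-output convention for an abandoned neuron are assumptions made for illustration.

```python
ABANDON = "abandon"   # hypothetical encoding of the abandoning instruction

class SimpleNeuron:
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias
        self.abandoned = False

    def receive(self, instruction):
        if instruction == ABANDON:
            self.abandoned = True   # this neuron will no longer perform any computation

    def forward(self, inputs):
        if self.abandoned:
            return 0.0              # an abandoned neuron contributes nothing to later layers
        return sum(w * x for w, x in zip(self.weights, inputs)) + self.bias

# After training, the controller abandons neurons with little influence on the final output.
neuron_123, neuron_124 = SimpleNeuron([0.01, -0.02], 0.0), SimpleNeuron([0.0, 0.01], 0.0)
for neuron in (neuron_123, neuron_124):
    neuron.receive(ABANDON)
print(neuron_123.forward([1.0, 2.0]), neuron_124.forward([1.0, 2.0]))
```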
Another embodiment according to the invention is an artificial neuron including a storage device and a computation circuit. Practically, the computation circuit herein can be the application-specific computation circuit shown in
Another embodiment according to the invention is a controlling method for an artificial neuron. The flowchart of this controlling method is shown in
With the examples and explanations above, the features and spirit of the invention are hopefully well described. Additionally, mathematical expressions are contained herein, and the principles conveyed thereby are to be taken as being thoroughly described therewith. It is to be understood that where mathematics is used, it is for succinct description of the underlying principles being explained and, unless otherwise expressed, no other purpose is implied or should be inferred. It will be clear from this disclosure overall how the mathematics herein pertains to the present invention and, where embodiment of the principles underlying the mathematical expressions is intended, the ordinarily skilled artisan will recognize numerous techniques to carry out physical manifestations of the principles being mathematically expressed.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.