Embodiments of the present disclosure described herein relate to a method and system for training a spiking neural network based on a conversion aware training model.
Neuromorphic technology imitates the human neural structure in hardware, and has been proposed to overcome the limitations of existing computing architectures, which have very low efficiency and high power consumption compared to humans when performing cognitive processing functions. Accordingly, neuromorphic technology is attracting attention as a way to drive edge devices, which have limited power and battery capacity, with low power and low energy.
A typical example of the neuromorphic technology is a spiking neural network (SNN). The SNN is a neural network designed to imitate the characteristics of the human brain, which has a neuron-synapse structure, and synapses connecting neurons transfer information as spike-type electrical signals. The SNN processes information based on the time difference between transmissions of spike signals. In this case, the SNN transfers information with binary spike signals, that is, in the form of a set of ‘0’ or ‘1’ binary spikes. Signals are transferred from neuron to neuron through synapses in the SNN, and whether or not spikes occur is determined by differential equations representing various biological processes. In detail, when a spike arrives at an input of a neuron, the input spike is decoded and multiplied by a synaptic weight, and the result is accumulated on the membrane potential of the neuron. When the accumulated membrane potential reaches a value greater than or equal to a threshold value, the neuron generates an output spike, and the spike is transferred to the next neuron. In this process, the membrane potential of the neuron that generated the spike is initialized to ‘0’.
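As an illustration of the integrate-and-fire behavior described above, the following minimal sketch accumulates weighted input spikes on a membrane potential and emits an output spike when a threshold is crossed. The function and parameter names are illustrative and do not appear in the disclosure.

```python
import numpy as np

def integrate_and_fire(input_spikes, weights, threshold=1.0):
    """Accumulate weighted binary input spikes on a membrane potential
    and emit an output spike whenever the threshold is reached.

    input_spikes: array of shape (timesteps, n_inputs) containing 0/1 spikes
    weights:      array of shape (n_inputs,) of synaptic weights
    """
    membrane = 0.0
    output_spikes = []
    for t in range(input_spikes.shape[0]):
        # Decode the incoming spikes and accumulate the weighted sum.
        membrane += np.dot(input_spikes[t], weights)
        if membrane >= threshold:
            output_spikes.append(1)   # fire an output spike
            membrane = 0.0            # reset the membrane potential to 0
        else:
            output_spikes.append(0)
    return np.array(output_spikes)
```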
As such, since an SNN performs computation only when a spike occurs, low-power hardware may be implemented.
Embodiments of the present disclosure provide a spiking neural network training method and system that minimizes data loss occurring during conversion of ANN training data into SNN training data.
In addition, embodiments of the present disclosure provide a spiking neural network training method and system to which an ANN training model similar to the SNN training model is applied.
According to an embodiment of the present disclosure, a spiking neural network training method based on conversion aware training includes an ANN generation operation of generating an analog artificial neural network (ANN) model and inputting variable data, a conversion aware training operation of simulating a spiking neural network (SNN) model by using one or more activation functions with respect to the analog ANN model, and an SNN generation operation of generating the SNN model by correcting parameters and weights of layers based on a result of the simulation.
In addition, according to an embodiment of the present disclosure, a spiking neural network training system based on conversion aware training includes an ANN generator that generates an analog artificial neural network (ANN) model and inputs variable data, a conversion aware training unit that simulates a spiking neural network (SNN) model by using one or more activation functions with respect to the analog ANN model, and an SNN generator that generates the SNN model by correcting parameters and weights of layers based on a result of the simulation.
The above and other objects and features of the present disclosure will become apparent by describing in detail embodiments thereof with reference to the accompanying drawings.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings to the extent that those skilled in the art can easily carry out the technical idea of the present disclosure.
As illustrated in
The general ANN-to-SNN conversion technique achieves higher accuracy than training with the SNN alone, but has the problem that data loss occurs while ANN data having analog values is converted into discrete spikes that occur at specific times. To minimize such data loss, a method is used in which the ANN and the SNN are trained separately, the result values of the two networks are compared, and the weights are normalized or the parameters of the neural network layers are adjusted based on the comparison result. However, this method increases the burden on hardware driving, and thus an energy-efficient ANN-to-SNN conversion technique is required.
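For reference, one widely used form of the weight adjustment mentioned above is data-based weight normalization, in which each layer's weights are rescaled by the maximum activation observed in the trained ANN. The sketch below illustrates only that conventional idea; it is not the conversion aware training method of the present disclosure, and the function name is illustrative.

```python
import numpy as np

def normalize_weights_by_activation(weights, activations):
    """Conventional ANN-to-SNN weight normalization (illustrative only).

    weights:     list of per-layer weight matrices of the trained ANN
    activations: list of per-layer activation tensors recorded on sample data
    Returns weights rescaled so that activations stay within the spiking range.
    """
    normalized = []
    prev_scale = 1.0
    for w, act in zip(weights, activations):
        scale = np.max(act)                   # maximum activation of this layer
        normalized.append(w * prev_scale / scale)
        prev_scale = scale                    # propagate the scale to the next layer
    return normalized
```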
Referring to
In this case, the analog artificial neural network (ANN) model may be a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), or the like, but is not limited thereto, and may be any artificial neural network other than a Spiking Neural Network (SNN) model.
In the conversion aware training operation (S120), the activation function may be used for one or more layers of the ANN model. In this case, the activation function may include at least one of a ReLU function, a Clip function, and a Time to First Spike (TTFS) function, but is not limited thereto, and may be a function enabling SNN simulation.
In this case, the activation function is a function that transfers a signal to a neuron in another layer by converting the result value of the previous layer, and introduces non-linearity so that the ANN model can represent more complex functions. The ReLU function and the Clip function are well-known activation functions, and may be expressed by the following equations.
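In their commonly used forms, with x denoting the result value of the previous layer and θ an upper clipping bound, the two functions may be written as follows (standard definitions, given here for reference).

```latex
\mathrm{ReLU}(x) = \max(0,\, x), \qquad
\mathrm{Clip}(x) = \min\bigl(\max(0,\, x),\ \theta\bigr)
```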
However, the Time to First Spike (TTFS) function is an optimal activation function developed to implement the present disclosure, and may be expressed by the following equation.
Here, ‘T’ is time, ‘κ_l’ is the kernel of layer l, ‘τ’ is the time constant of the layer, ‘t_l^ref’ is the start time of a spike, and ‘θ_0’ is a set threshold value.
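The exact TTFS equation of the disclosure is defined in terms of the quantities above. As a purely illustrative sketch of time-to-first-spike coding, the following function assumes an exponentially decaying kernel in which larger inputs fire earlier; this functional form and the helper name are assumptions made for illustration, not the disclosed equation.

```python
import numpy as np

def ttfs_activation(x, T=16.0, tau=4.0, t_ref=0.0, theta0=1.0):
    """Illustrative time-to-first-spike style activation (assumed form).

    Larger inputs cross the threshold theta0 earlier, producing an earlier
    first spike; the returned activation is the value of an exponentially
    decaying kernel evaluated at that first-spike time. Parameter names
    mirror the symbols in the text, but the form is an assumption.
    """
    x = np.asarray(x, dtype=float)
    # First-spike time: decreases as x grows; inputs <= 0 never fire in [t_ref, T].
    t_spike = np.where(x > 0, t_ref + tau * np.log1p(theta0 / np.maximum(x, 1e-12)), T)
    t_spike = np.minimum(t_spike, T)
    # Kernel value at the first-spike time (zero if no spike occurred within T).
    kernel = np.exp(-(t_spike - t_ref) / tau)
    return np.where(t_spike < T, kernel, 0.0)
```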
The activation function will be described in detail later with reference to
In the SNN generation operation (S130), the SNN model may be generated by converting parameters and weights with respect to the layers using at least one of the activation functions.
In the conversion aware training operation (S120), the activation function may be used with respect to the layers of the ANN model in the order of the ReLU function, the Clip function, and the TTFS function, but is not limited thereto.
Referring to
After using the ReLU function, a stable SNN simulation operation may be performed by using the Clip function as the second activation function (S220).
After using the Clip function, an SNN simulation operation with improved accuracy may be performed using the TTFS function developed to implement the present disclosure as a third activation function (S230).
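One way to picture the staged use of the three activation functions in operations S210 to S230 is to swap the activation applied to the ANN layers in successive training phases. The sketch below is illustrative only; the phase schedule, the saturating stand-in for the TTFS-style activation, and the helper names are assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def clip(x, theta=1.0):
    return np.minimum(np.maximum(0.0, x), theta)

def ttfs_like(x, theta=1.0):
    # Saturating stand-in for a TTFS-style activation (illustrative only).
    pos = np.maximum(x, 0.0)
    return pos / (pos + theta)

def staged_forward(x, weights, phase):
    """Forward pass whose activation depends on the training phase:
    phase 0 -> ReLU (S210), phase 1 -> Clip (S220), phase 2 -> TTFS-style (S230)."""
    act = [relu, clip, ttfs_like][phase]
    for w in weights:
        x = act(x @ w)
    return x

# Example: run the same toy network through the three phases.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 4)) for _ in range(2)]
x = rng.standard_normal((1, 4))
outputs = [staged_forward(x, weights, phase) for phase in range(3)]
```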
As illustrated in
Examples 1 to 3 are applied to commercially available datasets, CIFAR10, CIFAR100, and Tiny-ImageNet, respectively. Table 1 is a table comparing the accuracy of the data resulting from training the datasets with respect to the ANN model.
Example 1 represents that only the Clip function is used as an activation function with respect to each layer of the ANN model, Example 2 represents that only the TTFS function is used as an activation function with respect to a first input layer of the ANN model, and Example 3 represents that the TTFS function is used as an activation function with respect to all layers of the ANN model.
Referring to Table 1, it may be seen that the drop in accuracy of Example 3 is less than that of Examples 1 and 2. Therefore, it may be seen that the accuracy of the data is improved when the TTFS function is used as an activation function with respect to all layers of the ANN model.
Table 2 is a table comparing the performances of the prior art T2FSNN model (comparative example) and Examples 4 to 6 according to the present disclosure. In this case, the comparative example T2FSNN model indicates the conventional ANN-to-SNN conversion technique that is disclosed in the paper “T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding” (authors S. Park, S. Kim, B. Na and S. Yoon).
Table 2 is a table comparing the performances by applying the comparative example and Examples 4 to 6 to commercially available datasets, CIFAR10, CIFAR100, and Tiny-ImageNet, respectively.
Referring to the experimental conditions, VGG16 is used as the network, the training length is 200 epochs, the optimizer is SGD (momentum 0.9), and the learning rate is 0.1 (multiplied by 0.1 at epochs 80, 120, and 160).
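These conditions correspond to a fairly standard image classification training recipe. The following sketch shows how such a configuration might be expressed with PyTorch; torchvision's VGG16 and the multi-step learning rate schedule are assumptions made for illustration, as the disclosure does not specify a framework.

```python
import torch
from torch import nn, optim
from torchvision import models

model = models.vgg16(num_classes=100)   # e.g., a CIFAR100-sized output layer
criterion = nn.CrossEntropyLoss()

# SGD with momentum 0.9 and an initial learning rate of 0.1,
# multiplied by 0.1 at epochs 80, 120, and 160 over 200 epochs.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120, 160], gamma=0.1)

for epoch in range(200):
    # for inputs, targets in train_loader:   # data loading omitted
    #     optimizer.zero_grad()
    #     loss = criterion(model(inputs), targets)
    #     loss.backward()
    #     optimizer.step()
    scheduler.step()
```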
As illustrated in Table 2, in Examples 4 to 6, logarithmic base and time conditions are applied differently, and the TTFS function according to the present disclosure is used as an activation function.
When comparing the performance values of the comparative example and Examples 4 to 6, it may be seen that the performance of the present disclosure is better in all conditions. In particular, the T2FSNN model may not be applied to complex datasets such as the Tiny-ImageNet.
Therefore, according to the present disclosure, high-performance training that could previously be processed only by an existing ANN training model may be easily performed using the SNN training model.
Referring to
In this case, the conversion aware training unit 200 may use the activation functions with respect to one or more layers of the ANN model. The activation functions may include at least one of a ReLU function, a Clip function, and a Time to First Spike (TTFS) function, but are not limited thereto, and may be any function enabling SNN simulation. A detailed description of the activation functions is as given above.
In addition, in the conversion aware training unit 200, the activation functions may be used with respect to the layers of the ANN model in the order of the ReLU function, the Clip function, and the TTFS function, but the order is not limited thereto.
In addition, the SNN generator 300 may generate the SNN model by converting parameters and weights with respect to the layers using at least one activation function.
Referring to
The input generator includes a 48 KB input buffer and min-find units, and merges input spikes. The PE array is composed of 128 PEs and four 90 KB weight buffers, and includes the spiking neural network training system based on conversion aware training according to the present disclosure.
The output processing device is composed of a post-processing unit (PPU) and a spike encoder; it processes the output of the PE array into spikes, stores the output spikes in an output buffer, and then transmits the spike information to a DRAM. In addition, the output control device controls the entire processing device, and a DMA engine manages data access to an off-chip DRAM.
In this case, the input spikes are processed in an aligned manner in the input generator, and the aligned spikes are supplied to the PE array and accumulated as a membrane potential. The output of the PE array is transferred to the output processing device and is encoded into output spikes (fire operation).
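The dataflow just described (merge and time-align input spikes, accumulate weighted contributions in the PE array, then fire and encode output spikes) can be sketched at a high level as follows. Buffer sizes, unit names, and the sorting step are simplifications for illustration, not a description of the actual hardware.

```python
import numpy as np

def accelerator_step(spike_events, weights, threshold=1.0):
    """High-level software model of one processing pass (illustrative).

    spike_events: list of (time, input_index) tuples for incoming spikes
    weights:      array of shape (n_inputs, n_outputs) held in the weight buffers
    Returns a list of (time, output_index) tuples for emitted output spikes.
    """
    # Input generator: merge and time-order the incoming spikes.
    ordered = sorted(spike_events, key=lambda e: e[0])

    membrane = np.zeros(weights.shape[1])   # PE array: per-output membrane potentials
    output_spikes = []
    for t, idx in ordered:
        membrane += weights[idx]            # accumulate the weighted contribution
        fired = membrane >= threshold       # PPU: fire operation
        for out_idx in np.flatnonzero(fired):
            output_spikes.append((t, int(out_idx)))
        membrane[fired] = 0.0               # reset fired neurons
    return output_spikes                    # spikes to be encoded and sent to DRAM
```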
Accordingly, by applying the spiking neural network training system based on conversion aware training according to the present disclosure to the processing device, data loss occurring during conversion of ANN training data into SNN training data may be minimized.
Therefore, according to the present disclosure, it is possible to drive hardware with low power while performing, with the SNN training model, high-performance training that could previously be processed only by the existing ANN training model.
According to an embodiment of the present disclosure, data loss occurring during conversion of ANN training data into SNN training data may be minimized.
In addition, it is possible to drive hardware with low power while performing, with the SNN training model, high-performance training that could previously be processed only by the existing ANN training model.
The above descriptions are specific embodiments for carrying out the present disclosure. In addition to the embodiments described above, embodiments in which the design is simply changed or easily modified may also be included in the present disclosure, as may technologies that are easily changed and implemented by using the above embodiments. While the present disclosure has been described with reference to embodiments thereof, it will be apparent to those of ordinary skill in the art that various changes and modifications may be made thereto without departing from the spirit and scope of the present disclosure as set forth in the following claims.
Number | Date | Country | Kind
---|---|---|---
10-2022-0121032 | Sep. 23, 2022 | KR | national
10-2023-0030377 | Mar. 8, 2023 | KR | national
This application claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2022-0121032 filed on Sep. 23, 2022 and Korean Patent Application No. 10-2023-0030377 filed on Mar. 8, 2023, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.