Pure integer quantization method for lightweight neural network (LNN)

Information

  • Patent Grant
  • Patent Number
    11,934,954
  • Date Filed
    Wednesday, September 22, 2021
  • Date Issued
    Tuesday, March 19, 2024
Abstract
A pure integer quantization method for a lightweight neural network (LNN) is provided. The method includes the following steps: acquiring a maximum value of each pixel in each of the channels of the feature map of a current layer; dividing a value of each pixel in each of the channels of the feature map by a t-th power of the maximum value, t∈[0,1]; multiplying a weight in each of the channels by the maximum value of each pixel in each of the channels of the corresponding feature map; and convolving the processed feature map with the processed weight to acquire the feature map of a next layer. The algorithm is verified on SkyNet and MobileNetv2; lossless INT8 quantization is achieved on SkyNet, and the highest quantization accuracy to date is achieved on MobileNetv2.
Description
CROSS REFERENCE TO THE RELATED APPLICATIONS

This application is the national phase entry of International Application No. PCT/CN2021/119513, filed on Sep. 22, 2021, which is based upon and claims priority to Chinese Patent Application No. 202110421738.5, filed on Apr. 20, 2021, the entire contents of which are incorporated herein by reference.


TECHNICAL FIELD

The present invention relates to a quantization method for a lightweight neural network (LNN).


BACKGROUND

Recently, a great deal of work has explored quantization techniques for traditional models. However, when these techniques are applied to lightweight networks, there is a large loss of accuracy. For example, when MobileNetv2 is quantized, the accuracy on the ImageNet dataset drops from 73.03% to 0.1% (Benoit Jacob et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018: pp. 2704-2713). In another example, a 2% loss of accuracy is reported (Raghuraman Krishnamoorthi. Quantizing Deep Convolutional Networks for Efficient Inference: A Whitepaper. CoRR, abs/1806.08342, 2018). To recover these losses of accuracy, much work opts for retraining or training-time quantization techniques, but these techniques are time-consuming and require dataset support. To solve these problems, Nagel et al. proposed a data-free quantization (DFQ) algorithm. They attributed the poor performance of traditional quantization methods on models adopting depthwise separable convolutions (DSCs) to differences in weight distribution, and therefore proposed cross-layer weight balancing to adjust the balance of weights between different layers. This technique is only applicable to network models using the rectified linear unit (ReLU) as an activation function; however, most current lightweight networks use ReLU6, and directly replacing ReLU6 with ReLU causes a significant loss of accuracy. Furthermore, the method proposed by Nagel et al. is not suitable for pure integer quantization.


SUMMARY

The technical problems to be solved by the present invention are as follows: a simple combination of a lightweight neural network (LNN) and a quantization technique leads to a significantly reduced accuracy or a long retraining time. In addition, many current quantization methods quantize only the weights and feature maps, while the offsets and quantization coefficients remain floating-point numbers, which is unfavorable for deployment on an application specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).


In order to solve the above technical problems, the present invention adopts the following technical solution: a pure integer quantization method for an LNN, including the following steps:

    • step 1: supposing that one feature map has N channels, N≥1, and acquiring a maximum value of each pixel in each of the channels of the feature map of a current layer;
    • step 2: processing each pixel in each of the channels of the feature map as follows:
    • dividing a value of each pixel in an n-th channel of the feature map by a t-th power of a maximum value in the n-th channel acquired in step 1, t∈[0,1]; and
    • acquiring N groups of weights corresponding to the N channels of the feature map of a next layer, each of the groups of weights being composed of N weights corresponding to the N channels of the feature map of the current layer, and processing each of the groups of weights as follows:
    • multiplying the N weights in an n-th group respectively by the maximum value of each pixel in the N channels acquired in step 1; and
    • step 3: convolving the feature map processed in step 2 with the N groups of weights processed in step 2 to acquire the feature map of the next layer.


Preferably, when t=0, no imbalance transfer is performed; and when t=1, all imbalances between the channels of the feature map of the current layer are transferred to the weights of the next layer.
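
For illustration only, the following is a minimal NumPy sketch of steps 1 to 3 together with the imbalance transfer coefficient t. The function name imbalance_transfer, the tensor shapes, and the choice to scale the weights by the same t-th power of the per-channel maximum (so that the convolution result is preserved, with t=1 recovering the full transfer described above) are assumptions of this sketch rather than a definitive implementation of the claimed method.

    import numpy as np

    def imbalance_transfer(feature_map, next_weights, t=1.0):
        # feature_map:  array of shape (N, H, W), the feature map of the current layer.
        # next_weights: array of shape (M, N, kH, kW), the M groups of weights of the
        #               next layer; each group holds N weights aligned with the N channels.
        # t:            imbalance transfer coefficient, t in [0, 1].
        n_channels = feature_map.shape[0]

        # Step 1: maximum value of each pixel in each channel (per-channel maximum).
        ch_max = np.abs(feature_map).reshape(n_channels, -1).max(axis=1)
        ch_max = np.maximum(ch_max, 1e-12)  # guard against all-zero channels

        # Step 2: divide channel n of the feature map by the t-th power of its maximum,
        # and multiply the weights that consume channel n by the same factor so that
        # the convolution result of step 3 is unchanged.
        scale = ch_max ** t
        fm_balanced = feature_map / scale[:, None, None]
        w_balanced = next_weights * scale[None, :, None, None]

        # Step 3 is the ordinary convolution of fm_balanced with w_balanced,
        # performed after both tensors are quantized.
        return fm_balanced, w_balanced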


Preferably, the current layer is any layer except a last layer in the LNN.


The algorithm provided by the present invention is verified on SkyNet and MobileNetv2; lossless INT8 quantization is achieved on SkyNet, and the highest quantization accuracy to date is achieved on MobileNetv2.





BRIEF DESCRIPTION OF THE DRAWINGS

The FIGURE is a schematic diagram of an imbalance transfer for 1×1 convolution.





DETAILED DESCRIPTION OF THE EMBODIMENTS

The present invention will be described in detail below with reference to specific embodiments. It should be understood that these embodiments are only intended to describe the present invention, rather than to limit the scope of the present invention. In addition, it should be understood that various changes and modifications may be made to the present invention by those skilled in the art after reading the content of the present invention, and these equivalent forms also fall within the scope defined by the appended claims of the present invention.


The analysis and modeling of the quantization process of a neural network shows that the balance of a tensor can be used as a predictive index of the quantization error. Guided by this predictive index, the present invention proposes a tunable imbalance transfer algorithm to optimize the quantization error of a feature map. The specific contents are as follows:


In the current neural network computing mode, weights can be quantized channel by channel, but the feature maps can only be quantized layer by layer. Therefore, the quantization error of the weights is small, but the quantization error of the feature maps is large.
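
As an informal illustration of this point (not taken from the patent), the following NumPy sketch quantizes a feature map whose channels have very different dynamic ranges, once with a single layer-wise scale and once with per-channel scales, and compares the mean absolute quantization errors. Symmetric INT8 quantization and the particular channel ranges are assumptions of the sketch.

    import numpy as np

    rng = np.random.default_rng(0)
    # Three channels with very different dynamic ranges, as is common in lightweight networks.
    fm = np.stack([rng.normal(0.0, s, (8, 8)) for s in (0.1, 1.0, 30.0)])

    def fake_quant_int8(x, scale):
        # Symmetric INT8 quantization followed by dequantization.
        return np.clip(np.round(x / scale), -127, 127) * scale

    # Layer-wise (per-tensor): one scale shared by all channels of the feature map.
    scale_layer = np.abs(fm).max() / 127
    err_layer = np.abs(fm - fake_quant_int8(fm, scale_layer)).mean()

    # Channel-wise: one scale per channel, as weights are normally quantized.
    scale_channel = np.abs(fm).reshape(3, -1).max(axis=1) / 127
    err_channel = np.abs(fm - fake_quant_int8(fm, scale_channel[:, None, None])).mean()

    print(f"layer-wise error: {err_layer:.4f}, channel-wise error: {err_channel:.4f}")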


The present invention divides a value of each pixel in each of the channels of the feature map of a current layer in a neural network by a maximum value of each pixel in the channel, and then performs quantization to achieve equivalent channel-by-channel quantization. In order to ensure that the calculation result remains unchanged, for weights convolved with the feature map, the value of each of the channels is multiplied by the maximum value of each pixel in the channel of the corresponding feature map. As a result, the imbalances between the channels of the feature map of the current layer are all transferred to the weights of the next layer.


In fact, however, transferring all the imbalances between the channels of the feature map is not the optimal solution. In order to tune the level of the imbalance transfer, the present invention additionally introduces a hyperparameter, the imbalance transfer coefficient t. In the above steps, the value of each pixel in each of the channels of the feature map is divided by the t-th power of the maximum value of each pixel in the channel, where t ranges from 0 to 1. When t=0, no imbalance transfer is performed. When t=1, all the imbalances are transferred as mentioned above. By tuning t, the present invention can obtain optimal quantization accuracy. This operation is applicable to any network model and any convolution kernel size.
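
One possible way to tune t is a simple grid search, as in the following illustrative sketch. It sweeps t from 0 to 1 and keeps the value minimizing a combined INT8 quantization error of the transformed feature map and weights; using that error as a stand-in for the post-quantization accuracy of a full model, together with the random data and tensor shapes, is an assumption of the sketch, whereas the invention tunes t against the actual quantization accuracy.

    import numpy as np

    rng = np.random.default_rng(1)
    fm = np.stack([rng.normal(0.0, s, (8, 8)) for s in (0.1, 1.0, 30.0)])  # (N, H, W)
    w = rng.normal(0.0, 1.0, (4, 3, 1, 1))                                 # (M, N, 1, 1)

    def int8_error(x):
        scale = max(np.abs(x).max() / 127, 1e-12)
        return np.abs(x - np.clip(np.round(x / scale), -127, 127) * scale).mean()

    ch_max = np.maximum(np.abs(fm).reshape(3, -1).max(axis=1), 1e-12)

    best_t, best_err = 0.0, float("inf")
    for t in np.linspace(0.0, 1.0, 11):
        fm_t = fm / (ch_max ** t)[:, None, None]      # transfer part of the imbalance ...
        w_t = w * (ch_max ** t)[None, :, None, None]  # ... onto the next layer's weights
        err = int8_error(fm_t) + int8_error(w_t)      # proxy for post-quantization accuracy
        if err < best_err:
            best_t, best_err = t, err

    print(f"selected t = {best_t:.1f} (proxy error {best_err:.4f})")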


The FIGURE shows an imbalance transfer for 1×1 convolution. The dotted tensors share the same quantization coefficients. The value of each pixel in each of the channels of A1 is divided by the maximum value of each pixel in the channel, and the corresponding channel of W2 is multiplied by this maximum value. This operation ensures that the calculation result remains unchanged, while the balance of A1 is greatly increased and the balance of the weights is not significantly decreased. Therefore, the quantization error of the feature map can be reduced, thereby improving the accuracy of the quantized model.
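
The invariance described for the FIGURE can be checked numerically. The following sketch (illustrative only; the random data and the einsum-based 1×1 convolution are assumptions) divides each channel of A1 by its maximum, multiplies the corresponding channel of W2 by the same maximum, and verifies that the 1×1 convolution output is unchanged up to floating-point tolerance.

    import numpy as np

    rng = np.random.default_rng(2)
    A1 = rng.normal(0.0, 1.0, (3, 5, 5))   # feature map, (N, H, W)
    W2 = rng.normal(0.0, 1.0, (4, 3))      # 1x1 convolution weights, (M, N)

    def conv1x1(a, w):
        # A 1x1 convolution is a per-pixel matrix multiply over the channel axis.
        return np.einsum("mn,nhw->mhw", w, a)

    ch_max = np.abs(A1).reshape(3, -1).max(axis=1)
    A1_balanced = A1 / ch_max[:, None, None]   # divide each channel of A1 by its maximum
    W2_balanced = W2 * ch_max[None, :]         # multiply the matching channel of W2 by it

    assert np.allclose(conv1x1(A1, W2), conv1x1(A1_balanced, W2_balanced))
    print("1x1 convolution output unchanged after the imbalance transfer")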

Claims
  • 1. A pure integer quantization method for implementing a lightweight neural network (LNN) in an application specific integrated circuit (ASIC) or a field-programmable gate array (FPGA), comprising the following steps: step 1: setting, by the ASIC or FPGA, a feature map with N channels, N≥1, and acquiring a maximum value of each pixel in each of N channels of a first feature map of a current layer; step 2: processing, by the ASIC or FPGA, each pixel in each of the N channels of the first feature map as follows: dividing, by the ASIC or FPGA, a value of each pixel in an n-th channel of the first feature map by a t-th power of a maximum value in the n-th channel acquired in step 1, t∈[0,1]; and acquiring, by the ASIC or FPGA, N groups of weights corresponding to N channels of a second feature map of a next layer, wherein each of the N groups of the weights comprises N weights corresponding to the N channels of the first feature map of the current layer, and processing each of the N groups of the weights as follows: multiplying, by the ASIC or FPGA, the N weights in an n-th group respectively by the maximum value of each pixel in the N channels acquired in step 1; step 3: convolving, by the ASIC or FPGA, the first feature map processed in step 2 with the N groups of the weights processed in step 2 to acquire the second feature map of the next layer; and step 4: obtaining, by the ASIC or FPGA, a quantization accuracy based on a result of step 3, and tuning, by the ASIC or FPGA, the t value to obtain a maximum quantization accuracy.
  • 2. The pure integer quantization method for the LNN according to claim 1, wherein the current layer is any layer except a last layer in the LNN.
Priority Claims (1)
Number Date Country Kind
202110421738.5 Apr 2021 CN national
PCT Information
Filing Document Filing Date Country Kind
PCT/CN2021/119513 9/22/2021 WO
Publishing Document Publishing Date Country Kind
WO2022/222369 10/27/2022 WO A
US Referenced Citations (7)
Number Name Date Kind
10527699 Cheng et al. Jan 2020 B1
20190042948 Lee Feb 2019 A1
20190279072 Gao Sep 2019 A1
20190294413 Vantrease et al. Sep 2019 A1
20200401884 Guo Dec 2020 A1
20210110236 Shibata Apr 2021 A1
20220086463 Coban Mar 2022 A1
Foreign Referenced Citations (12)
Number Date Country
105528589 Apr 2016 CN
110930320 Mar 2020 CN
111311538 Jun 2020 CN
111402143 Jul 2020 CN
111937010 Nov 2020 CN
112418397 Feb 2021 CN
112488070 Mar 2021 CN
112560355 Mar 2021 CN
113128116 Jul 2021 CN
WO-0074850 Dec 2000 WO
WO-2005048185 May 2005 WO
2018073975 Apr 2018 WO
Non-Patent Literature Citations (7)
Entry
Cho et al. (Per-channel Quantization Level Allocation for Quantizing Convolutional Neural Networks, Nov. 2020, pp. 1-3) (Year: 2020).
Kang et al. (Decoupling Representation and Classifier for Long-Tailed Recognition, Feb. 2020, pp. 1-16) (Year: 2020).
Polino et al. (Model Compression via Distillation and Quantization, Feb. 2018, pp. 1-21) (Year: 2018).
Benoit Jacob, et al., Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, IEEE, 2018, pp. 2704-2713.
Raghuraman Krishnamoorthi, Quantizing deep convolutional networks for efficient inference: A whitepaper, 2018, pp. 1-36.
Markus Nagel, et al., Data-Free Quantization Through Weight Equalization and Bias Correction, Qualcomm AI Research, 2019.
Liu Guanyu, et al., Design and implementation of real-time defogging hardware accelerator based on image fusion, Hefei University of Technology, Master's Dissertation, 2020, pp. 1-81.
Related Publications (1)
Number Date Country
20230196095 A1 Jun 2023 US