This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2019-167657, filed on Sep. 13, 2019, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an information processing device, an information processing method, and a recording medium storing an information processing program.
By quantizing variables to be used in a neural network and carrying out arithmetic operations thereon, both the amount of arithmetic operations and the amount of memory used are reduced as compared with a case where arithmetic operations are carried out using floating-point numbers. For example, there is proposed a method in which a fixed-point representation of a variable for each channel is determined based on statistical information for each channel of a floating-point variable generated by pre-training (for example, see Japanese Laid-open Patent Publication No. 2019-32833). In addition, a method is proposed in which update values of fixed-point weights used for arithmetic operations of a neural network are accumulated, and when the accumulated value is equal to or greater than a threshold value, the weights are updated by using the accumulated update values (for example, see Japanese Laid-open Patent Publication No. 2019-79535).
There is proposed a method in which processing results by pooling processing after convolution operation of a neural network are integrated to calculate an average value and a standard deviation, and a result of the pooling processing is subjected to normalization processing in the arithmetic operations of the next layer by using the calculated average value and standard deviation (for example, see Japanese Laid-open Patent Publication No. 2017-156941).
For example, Japanese Laid-open Patent Publication Nos. 2019-32833, 2019-79535, and 2017-156941 are disclosed.
According to an aspect of the embodiments, an information processing device includes a memory and a processor coupled to the memory and configured to: determine a plurality of bit ranges after quantization for at least one of a plurality of types of variables to be used in a neural network; calculate a plurality of recognition rates of the neural network by using each of a plurality of variable groups, each of which includes the plurality of types of variables and differs in the bit range of at least one of the plurality of types of variables; and determine to use, for calculation of the neural network, the variable group having a maximum recognition rate among the plurality of calculated recognition rates.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
When a variable to be used in a neural network is quantized, the amount of arithmetic operations and the amount of memory used are reduced, and the calculation time is shortened. However, the accuracy of the calculation of the neural network is lowered. When the accuracy of the calculation is lowered, the accuracy of the learning of the neural network is lowered, and for example, a recognition rate may deteriorate in inference processing using the neural network after the learning.
In one aspect, an object of the disclosure is to reduce the deterioration in a recognition rate when calculations of a neural network are carried out using quantized variables.
Hereinafter, embodiments are described with reference to drawings.
The information processing device 100 includes a central processing unit (CPU) 10, a memory 20, and an accelerator 30 that are coupled to one another via a communication bus 40. The information processing device 100 may include another processor instead of the CPU 10. The information processing device 100 may include elements other than those illustrated in the drawing; alternatively, the information processing device 100 may omit the accelerator 30 and cause the CPU 10 to execute the calculation processing otherwise executed by the accelerator 30.
The CPU 10 includes a bit range determination unit 12, a recognition rate calculation unit 14, and a variable determination unit 16. The CPU 10 further includes an arithmetic unit (not illustrated).
At least one of the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 may be implemented by hardware. In this case, the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 may not be included in the CPU 10, but included in a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or the like (not illustrated).
The bit range determination unit 12 determines a plurality of bit ranges after quantization for at least one of a plurality of types of variables to be used in the neural network. For example, the plurality of types of variables includes a weight, an activity, and a gradient.
The recognition rate calculation unit 14 causes the accelerator 30 to execute learning of the neural network by using each of a plurality of variable groups, each of which includes the plurality of types of variables with bit ranges that differ between the groups, and calculates a recognition rate of the neural network for each variable group. Each of the plurality of variable groups includes any of the plurality of bit ranges determined for the determination target variable. For example, at least one determination target variable included in each of the plurality of variable groups differs in bit range from one variable group to another.
The variable determination unit 16 determines to use a variable group having a maximum recognition rate among a plurality of recognition rates calculated by the recognition rate calculation unit 14 for subsequent learning of the neural network. Examples of operations of the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 will be described later.
The memory 20 stores various programs such as an operating system (OS), an application program and an information processing program, and also stores data, variables, and the like to be used in neural network processing. When the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 are implemented by the information processing program executed by the CPU 10, the memory 20 stores the information processing program.
The accelerator 30 is, for example, a graphics processing unit (GPU), a digital signal processor (DSP), or a dedicated processor for deep learning, and is capable of executing calculation of the neural network. For example, the accelerator 30 may include a large number of fixed-point arithmetic units (not illustrated), and may not include a floating-point arithmetic unit. The accelerator 30 may include a large number of fixed-point arithmetic units and a large number of floating-point arithmetic units which are not illustrated.
For example, the information processing device 100 inputs each of a plurality of pieces of learning data (input data) included in a mini-batch to the input layer, and sequentially executes calculations of the convolutional layer, the pooling layer, and the like, thereby carrying out forward propagation processing in which the information obtained by the arithmetic operations is sequentially transmitted from the input side to the output side. The mini-batch is obtained by dividing a data set (batch) to be used for learning into a plurality of pieces, and it includes a predetermined number of pieces of input data (image data or the like). For example, in the convolutional layer, activities which are output data (intermediate data) from the previous layer, and weights prepared in advance as learning data are subjected to a convolutional arithmetic operation, and activities which are output data obtained by the convolutional arithmetic operation are output as input data of the next layer.
After the execution of the forward propagation processing by the mini-batch, backward propagation processing is executed to calculate gradients in order to reduce a difference (for example, a square sum of errors) between output data output from the output layer and correct answer data. Subsequently, update processing to update variables such as weights is carried out based on the execution of the backward propagation processing. For example, as an algorithm for determining an update width of the weight to be used for the calculation of the backward propagation processing, a method of gradient descent is used. For example, after the variable is updated, a recognition rate (correct answer rate) is calculated by operating the neural network by using data for determination.
In the following, the weights, activities, and gradients to be used in the calculation of the neural network are also referred to as variables. By executing the forward propagation processing, the backward propagation processing, and the update processing of variables in each of the plurality of mini-batches, the recognition rate gradually increases, and the deep neural network is optimized.
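The forward propagation, backward propagation, and variable update processing described above can be sketched in simplified form. The sketch below is an illustration only, assuming a plain linear model trained by the gradient descent method over mini-batches; it is not the implementation of the embodiment.

```python
import numpy as np

def train_minibatch_sgd(X, y, epochs=200, batch_size=4, lr=0.1):
    """One learning run: for each mini-batch, forward propagation,
    backward propagation (gradient of the squared error), and update
    processing of the weights by gradient descent."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(X))    # divide the data set into mini-batches
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            pred = xb @ w                                  # forward propagation
            grad = 2.0 * xb.T @ (pred - yb) / len(xb)      # backward propagation
            w -= lr * grad                                 # update processing
    return w

# toy data realizing y = 2*x0 - 1*x1; the weights converge toward (2, -1)
X = np.array([[1., 0.], [0., 1.], [1., 1.], [2., 1.], [1., 2.], [3., 1.]])
y = X @ np.array([2., -1.])
w = train_minibatch_sgd(X, y)
```

Repeating this loop over the plurality of mini-batches corresponds to the gradual optimization described above.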
<8, 3> indicates that the fixed-point number has 8 bits, in which the decimal part uses the lower 3 bits, and the integer part uses the upper 4 bits while excluding the sign bit S. <8, 4> indicates that the fixed-point number has 8 bits, in which the decimal part uses the lower 4 bits, and the integer part uses the upper 3 bits while excluding the sign bit S.
<16, 10> indicates that the fixed-point number has 16 bits, in which the decimal part uses the lower 10 bits, and the integer part uses the upper 5 bits while excluding the sign bit S. <16, 12> indicates that the fixed-point number has 16 bits, in which the decimal part uses the lower 12 bits, and the integer part uses the upper 3 bits while excluding the sign bit S.
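The <total bits, fraction bits> notation above can be illustrated by a small sketch that quantizes a value into such a fixed-point format. This is an illustration only (rounding to nearest and saturation at the format bounds), not the embodiment's arithmetic unit.

```python
def to_fixed(x, total_bits, frac_bits):
    """Quantize x to a signed fixed-point number <total_bits, frac_bits>:
    one sign bit, (total_bits - 1 - frac_bits) integer bits, and
    frac_bits fractional bits.  Returns the nearest representable value."""
    scale = 1 << frac_bits
    q = round(x * scale)              # rounding to the nearest LSB step
    lo = -(1 << (total_bits - 1))     # saturation bounds of the format
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, q))
    return q / scale

# <8, 3>: step 0.125, representable range [-16, 15.875]
to_fixed(1.3, 8, 3)     # -> 1.25
# <8, 4>: finer step 0.0625, but narrower range [-8, 7.9375]
to_fixed(1.3, 8, 4)     # -> 1.3125
```

Widening the decimal part reduces the rounding error but narrows the integer range, which is the trade-off behind choosing between bit ranges such as <8, 3> and <8, 4>.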
Intermediate data (the activity, the gradient, or the weight optimized by the learning) output from a certain node is, for example, an arithmetic operation result by the fixed-point arithmetic unit, and is stored in an accumulator (for example, 40 bits) of the fixed-point arithmetic unit every time an arithmetic operation by the node is executed. It is assumed that the initial weight has a predetermined number of bits (8 bits, 16 bits, or the like).
When the initial weight or the intermediate data is floating-point number data, the floating-point number data is converted into fixed-point number data of 40 bits or less, resulting in the state illustrated in
Further, the bit range determination unit 12 may determine, based on distribution of the most significant bits of the variable, the bit range 1 to the bit range 3 from the most significant bit side of the distribution. At this time, the bit range determination unit 12 may determine, based on the distribution of the most significant bits of the variable, the bit ranges 1 to 3 from the most significant bit side included in an effective range of the distribution.
For example, the effective range of the distribution is set by excluding a predetermined ratio of variables with respect to the total number of variables from the distribution in descending order of magnitude of the values. By determining the plurality of bit ranges based on the distribution of the most significant bits of the variable, it is possible to reduce a quantization error with respect to the original variable (for example, floating-point type) before quantization, compared to a case where the distribution of the most significant bits is not used. As a result, it is possible to reduce the deterioration in the recognition rate obtained by the learning of the neural network.
By determining a plurality of bit ranges from the most significant bit side of the distribution within the effective range of the distribution of the most significant bits, it is possible to determine the plurality of bit ranges by using a region in which the appearance frequency of the variable is high in the distribution. As a result, in the calculation of the neural network, it is possible to reduce an error between a case where the quantized variable is used and a case where the original variable before quantization is used, and it is also possible to execute the learning while suppressing a decrease in accuracy. This makes it possible to reduce the deterioration in the recognition rate.
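The determination of the effective range described above can be sketched as follows; the exclusion ratio of 1% and the histogram values are assumptions for illustration.

```python
def effective_top_bit(msb_counts, exclude_ratio=0.01):
    """Given a histogram {bit position: count} of the variables' most
    significant bits, return the highest bit position remaining after
    excluding exclude_ratio of the variables from the distribution in
    descending order of magnitude of the values."""
    total = sum(msb_counts.values())
    budget = total * exclude_ratio
    removed = 0
    for pos in sorted(msb_counts, reverse=True):   # largest values first
        removed += msb_counts[pos]
        if removed > budget:
            return pos       # this bin cannot be excluded within the ratio
    return min(msb_counts)

# 1000 variables with a few large-valued outliers at bit positions 10 and 8
hist = {10: 2, 8: 5, 6: 300, 5: 500, 4: 193}
effective_top_bit(hist, 0.01)   # -> 6: the 7 outliers are excluded
```

The plurality of bit ranges would then be taken from bit position 6 downward, covering the region in which the appearance frequency of the variable is high.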
When quantizing a variable into a bit range, the information processing device 100 carries out saturation processing, in which bits on the upper side relative to the most significant bit of the bit range are saturated into the bit range, and rounding processing, in which bits on the lower side relative to the least significant bit of the bit range are rounded into the bit range.
When the bit range determination unit 12 determines the bit range based on the distribution of the most significant bits of the variable, the information processing device 100 may include a statistical information acquisition unit configured to acquire the distribution of the most significant bits of the variable. In this case, the statistical information acquisition unit may acquire the distribution of the most significant bits of only the determination target variable of the plurality of bit ranges.
The bit range determination unit 12 may determine a plurality of bit ranges in ascending order of quantization errors when a variable is quantized into the plurality of bit ranges. In this case, the information processing device 100 may include a quantization error calculation unit configured to calculate quantization errors when a variable is quantized into the plurality of bit ranges. The quantization error calculation unit may calculate quantization errors only for the determination target variable of the plurality of bit ranges.
In the example illustrated in
The recognition rate calculation unit 14 generates a plurality of variable groups by combining the bit ranges of each variable determined by the bit range determination unit 12. Numbers of the weight, activity, and gradient of the variable groups illustrated in
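The generation of variable groups by combining the bit ranges of each variable can be sketched as follows; the candidate quantization ranges listed are hypothetical.

```python
from itertools import product

# hypothetical candidate bit ranges determined for each variable type
candidates = {
    "weight":   ["<8,3>", "<8,4>", "<8,5>"],
    "activity": ["<8,2>", "<8,3>", "<8,4>"],
    "gradient": ["<8,4>", "<8,5>", "<8,6>"],
}

# one variable group picks one bit range per variable type
variable_groups = [dict(zip(candidates, combo))
                   for combo in product(*candidates.values())]
len(variable_groups)   # -> 27 groups (3 candidates per each of 3 variable types)
```

Each element of variable_groups is one combination of bit ranges, and the recognition rate is calculated for each such combination.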
The learning execution period is an example of a calculation execution period in which calculation of the neural network is executed using the variable group determined in the group determination period. The learning cycle, in which the operations in the group determination period and the operations in the learning execution period are executed, is an example of a calculation cycle and corresponds to, for example, a predetermined number of epochs. One epoch corresponds to one pass of learning over a data set input by a user.
First, in Step S10, the information processing device 100 determines a plurality of bit ranges for each variable. Processing in Step S10 is carried out by the bit range determination unit 12. Next, in Step S12, the information processing device 100 selects one variable group including any of a plurality of bit ranges for each variable. Processing in Step S12 is executed by the recognition rate calculation unit 14 or another functional unit of the CPU 10.
Next, in Step S14, the information processing device 100 executes the learning of the neural network using each variable quantized in accordance with the bit range corresponding to the selected variable group, and calculates a recognition rate. Processing in Step S14 is carried out by the recognition rate calculation unit 14.
Next, in Step S16, the information processing device 100 determines whether the learning of the neural network using the variables of all the variable groups is completed. When the learning using the variables of all the variable groups is completed, the process proceeds to Step S18. When there is a variable group that has not been used for the learning, the process returns to Step S12 in order to execute the learning using the variables of the variable group that has not been used for the learning.
In a case where the information processing device 100 recognizes in advance that there are several variable groups having similar recognition rates among the plurality of variable groups, the information processing device 100 may execute the learning by using any of the variable groups having similar recognition rates as a representative. In this case, the number of times of the learning may be reduced, so that the learning time may be reduced.
As described in
Next, in Step S18, the information processing device 100 determines to use a variable group having a maximum recognition rate among the plurality of recognition rates calculated in Step S14, for subsequent learning. Processing in Step S18 is carried out by the variable determination unit 16. Next, in Step S20, the information processing device 100 executes the learning in the learning execution period using the variables of the variable group determined in Step S18.
Next, in Step S22, when the information processing device 100 has executed the learning (one epoch) corresponding to a data set input by the user a predetermined number of times, for example, the process illustrated in
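The loop of Steps S12 to S18 can be sketched as follows, where train_and_evaluate is a hypothetical stand-in for the learning and recognition rate calculation of Step S14.

```python
def group_determination(variable_groups, train_and_evaluate):
    """Steps S12-S18: learn once with each variable group, record the
    recognition rate, and determine the group with the maximum rate for
    the subsequent learning execution period."""
    rates = {}
    for i, group in enumerate(variable_groups):    # S12: select one group
        rates[i] = train_and_evaluate(group)       # S14: learn, get recognition rate
    best = max(rates, key=rates.get)               # S18: maximum recognition rate
    return variable_groups[best], rates[best]

# toy evaluator standing in for actual learning runs
groups = ["A", "B", "C"]
best_group, best_rate = group_determination(
    groups, lambda g: {"A": 0.82, "B": 0.91, "C": 0.88}[g])
# best_group -> "B", best_rate -> 0.91
```

The loop over the groups corresponds to the completion check of Step S16: the process returns to Step S12 until every variable group has been used for the learning.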
In the present embodiment, in the group determination period set in the first half of the learning cycle, the use of the variable group that brings the maximum recognition rate among the plurality of variable groups is determined, and the determined variable group is used for the learning in the learning execution period set in the second half of the learning cycle. As a result, it is possible to raise the possibility of improving the recognition rate as compared with the case where the learning cycle is repeatedly executed while being fixed to one variable group. In addition, in each of the plurality of learning cycles, by determining the variable group having the maximum recognition rate before the learning execution period, it is possible to raise the possibility of improving the recognition rate as compared to the case of using the variable group having the non-maximum recognition rate in each learning execution period.
As described with reference to
Since the floating-point type data is converted into the fixed-point type data by quantization and then the learning is executed, the calculation time is shortened and the memory usage is reduced. Accordingly, by executing the learning while using a plurality of variable groups (fixed-point type) in which combinations of bit ranges of a plurality of types of variables are different, it is possible to reduce the calculation time while reducing the deterioration in the recognition rate.
In contrast, as illustrated in a comparative example (broken line), when learning of the neural network is executed using one bit range for each fixed-point type variable (for example, one variable group is used), the recognition rate may not be improved even when the learning is repeated. When the learning of the neural network is executed using one variable group, there exists a learning cycle (for example, learning for a predetermined number of epochs), but neither a group determination period nor a learning execution period is present.
As described above, in the present embodiment, the information processing device 100 determines a plurality of bit ranges for each variable to be used for learning of the neural network, and executes the learning of the neural network using a plurality of variable groups including any of the plurality of bit ranges for each variable. The information processing device 100 executes the subsequent learning by using the variable group having the maximum recognition rate among the plurality of recognition rates obtained by the learning of the neural network executed by using each of the plurality of variable groups. By executing the learning of the neural network using a variable group having a higher recognition rate than other variable groups, it is possible to reduce the deterioration in the recognition rate as compared to a case where the learning is executed using one bit range for each fixed-point type variable. Even when a fixed-point type is used for a variable, the recognition rate may be gradually improved by continuing the learning.
The information processing device 100 determines the use of a variable group that brings the maximum recognition rate among the plurality of variable groups, in the learning executed in the group determination period that is set in the first half of the learning cycle. The information processing device 100 uses the determined variable group for the learning in the learning execution period that is set in the second half of the learning cycle. As a result, it is possible to raise the possibility of improving the recognition rate as compared with a case where the learning is executed while being fixed to one variable group in the learning cycle.
In addition, in each of the plurality of learning cycles, by determining the variable group having the maximum recognition rate before the learning execution period, it is possible to raise the possibility of improving the recognition rate as compared to the case of using the variable group having the non-maximum recognition rate in each learning execution period. As described above, by executing the learning while using a plurality of variable groups including fixed-point type variables, it is possible to reduce the calculation time while reducing the deterioration in the recognition rate.
By determining a plurality of bit ranges (quantization positions) from the most significant bit side of the distribution based on the distribution of the most significant bits of the fixed-point type variables, it is possible to reduce a quantization error with respect to the original variable before quantization, compared to a case where the distribution of the most significant bits is not used. As a result, it is possible to reduce the deterioration in the recognition rate obtained by the learning of the neural network.
By determining a plurality of bit ranges from the most significant bit side of the distribution within the effective range of the distribution of the most significant bits, it is possible to determine the plurality of bit ranges by using a region in which the appearance frequency of the variable is high in the distribution. As a result, in the calculation of the neural network, it is possible to reduce an error between a case where the quantized variable is used and a case where the original variable before quantization is used, and it is also possible to execute the learning while suppressing a decrease in accuracy. This makes it possible to reduce the deterioration in the recognition rate.
For example, the information processing device 100A is a server, and includes a CPU 10A, a memory 20, an accelerator 30, an auxiliary storage device 50, a communication interface 60, and an input and output interface 70 that are coupled to one another via a communication bus 40. The information processing device 100A may include a constituent element other than the illustrated constituent elements.
The CPU 10A includes a statistical information acquisition unit 11 in addition to the configuration of the CPU 10 illustrated in
The statistical information acquisition unit 11, the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 are implemented by the CPU 10A executing an information processing program stored in the memory 20. At least one of the statistical information acquisition unit 11, the bit range determination unit 12, the recognition rate calculation unit 14, and the variable determination unit 16 may be implemented by hardware.
The auxiliary storage device 50 stores various programs to be executed by the CPU 10A such as an operating system (OS) and an information processing program, and also stores data and various variables such as weights to be used for calculation of the neural network, and the like. For example, the programs stored in the auxiliary storage device 50 are transferred to the memory 20 and are executed by the CPU 10A. The data and various variables to be used for the calculation of the neural network that are stored in the auxiliary storage device 50 are transferred from the auxiliary storage device 50 to the memory 20 when learning of the neural network is executed.
The communication interface 60 has a function of communicating with another information processing device and the like via a network, for example. Therefore, a plurality of information processing devices may be used to execute the calculation of the neural network in parallel. The input and output interface 70 has a function of inputting data from and outputting data to a recording medium 80 to be coupled to the information processing device 100A.
For example, the recording medium 80 is a Compact Disc (CD; registered trademark), a Digital Versatile Disc (DVD; registered trademark), a Universal Serial Bus (USB) memory, or the like, and the information processing program may be recorded therein. The information processing program recorded in the recording medium 80 is transferred to the auxiliary storage device 50 via the input and output interface 70, and is then developed over the memory 20 and executed by the CPU 10A.
The bit range determination unit 12 determines a plurality of bit ranges as quantization ranges, based on the distribution of the most significant bits of the variable (
In this example, the decimal point is set between a bit position indicated by the reference symbol a6 and a bit position indicated by the reference symbol a7. A quantization range of a bit range 1 is set to <7, 1>, a quantization range of a bit range 2 is set to <7, 2>, and a quantization range of a bit range 3 is set to <7, 3>. The bit range determination unit 12 may determine a bit range from the most significant bit side included in an effective range, which is set by excluding a predetermined ratio of variables with respect to the total number of variables from the distribution in descending order of magnitude of the values.
For example, in the bit range 1 where the most significant bit in the distribution coincides with the most significant bit of the bit range, the CPU 10A carries out rounding processing on the bits on the lower side relative to the least significant bit of the bit range 1, and quantizes the variable. On the other hand, in the bit range 2 and the bit range 3 where the bit ranges are included inside the distribution, the CPU 10A carries out saturation processing on the bits on the upper side relative to the most significant bit of the bit range and rounding processing on the bits on the lower side relative to the least significant bit of the bit range, and quantizes the variable.
First, in
Next, in
Next, in
In Step S8, the information processing device 100A acquires the distribution of the most significant bits for each type of variable (for example, weight, activity, and gradient) used in the neural network. Processing in Step S8 is carried out by the statistical information acquisition unit 11. In Step S10, the information processing device 100A determines a plurality of bit ranges for each variable based on the distribution of the most significant bits for each type of variable acquired in Step S8.
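The acquisition of the distribution of the most significant bits in Step S8 can be sketched as follows; representing the most significant bit of a value v as floor(log2|v|) is an assumption for illustration.

```python
import math

def msb_histogram(values):
    """Distribution of the most significant bits: for each nonzero value,
    record the bit position of its leading 1."""
    hist = {}
    for v in values:
        if v == 0:
            continue                         # zero has no significant bit
        pos = math.floor(math.log2(abs(v)))  # bit position of the leading 1
        hist[pos] = hist.get(pos, 0) + 1
    return hist

msb_histogram([0.3, 0.7, 1.5, 6.0, 0.0])
# 0.3 -> bit -2, 0.7 -> bit -1, 1.5 -> bit 0, 6.0 -> bit 2
```

Such a histogram, built per variable type, is the input from which the bit range determination unit 12 determines the plurality of bit ranges.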
An example of transitions of recognition rates when learning of the neural network is executed is similar to that in
Thus, also according to the embodiment described with reference to
Furthermore, in the embodiment described with reference to
The information processing device 100B includes a CPU 10B instead of the CPU 10A in
Based on the distribution of the most significant bits for each variable acquired by the statistical information acquisition unit 11, the quantization error calculation unit 13 calculates a quantization error when quantization is carried out in each of a plurality of bit ranges for each variable used in the learning of the neural network. For example, the number of bit ranges for which the quantization error is calculated is preferably larger than the number of bit ranges determined by the bit range determination unit 12.
The bit range determination unit 12 determines a plurality of bit ranges from the most significant bit side of the distribution of the most significant bits acquired by the statistical information acquisition unit 11 in ascending order of quantization errors calculated by the quantization error calculation unit 13, for each variable used in the learning of the neural network. Accordingly, the bit range determination unit 12 does not necessarily determine the plurality of bit ranges in the order from the most significant bit side of the distribution.
Similarly to
Based on the distribution of the most significant bits of the variable, the quantization error calculation unit 13 calculates a quantization error when the bit range of the variable is set to a predetermined quantization range (
For example, the quantization error is calculated by carrying out saturation processing on the bits positioned on the left side of the most significant bit of the quantization range and carrying out rounding processing on the bits positioned on the right side of the least significant bit of the quantization range. In the example illustrated in
Quantization error = (a1·b1 + a2·b2) − (a3·b1 + a3·b2) + (a10·b10 + a11·b11) − (a9·b10 + a9·b11) (1)
In Equation (1), “·” indicates a product, the first and second terms indicate a saturation error, and the third and fourth terms indicate a rounding error. Then, for example, an average value of the calculated quantization errors for each variable is calculated and is determined as the quantization error to be compared with a threshold value.
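Equation (1) can be sketched as follows for a histogram of most significant bits. Taking 2**i as the representative value a_i of bit position i, and taking the signed sum exactly as Equation (1) is written, are assumptions for illustration.

```python
def quantization_error(bin_value, counts, top_bin, bottom_bin):
    """Equation (1), applied literally: the saturation error replaces the
    value of each histogram bin above the quantization range with the value
    of the range's top bin, and the rounding error replaces each bin below
    the range with the value of the bottom bin.  bin_value[i] is the
    representative value a_i of bit position i; counts[i] is the
    frequency b_i from the distribution of the most significant bits."""
    err = 0.0
    for i, b in counts.items():
        if i > top_bin:                     # saturated bins (a1, a2 vs a3)
            err += (bin_value[i] - bin_value[top_bin]) * b
        elif i < bottom_bin:                # rounded bins (a10, a11 vs a9)
            err += (bin_value[i] - bin_value[bottom_bin]) * b
    return err

# representative value of bit position i taken as 2**i (an assumption)
value = {i: 2.0 ** i for i in range(-4, 8)}
counts = {6: 3, 2: 40, 1: 7}     # MSB histogram of one variable
quantization_error(value, counts, top_bin=5, bottom_bin=2)
# -> 82.0 (saturation error 96.0 plus rounding error -14.0)
```

Comparing this value across candidate quantization ranges allows the bit range determination unit 12 to order the bit ranges by quantization error.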
In
An example of transitions of recognition rates when learning of the neural network is executed is similar to that in
As discussed thus far, also in this embodiment, in the same manner as in the above-described embodiment, by executing the learning using a variable group having a higher recognition rate than other variable groups, it is possible to reduce the deterioration in the recognition rate when executing the learning of the neural network using quantized variables. Further, in this embodiment, the quantization error is calculated by the quantization error calculation unit 13, so that the bit range determination unit 12 may determine a plurality of bit ranges in ascending order of quantization errors. Accordingly, in the calculation of the neural network, it is possible to reduce an error between a case where a quantized variable (fixed-point type) is used and a case where the original variable before quantization (floating-point type) is used, and it is also possible to execute the learning while suppressing a decrease in accuracy. As a result, the deterioration in the recognition rate due to the learning of the neural network may be further reduced.
In the above-described embodiment, an example is described in which learning in the group determination period is executed using a plurality of variable groups, and learning in the learning execution period is executed using the variable group having the maximum recognition rate obtained by the learning. However, in the group determination period, the variable group to be used for learning in the learning execution period may instead be determined based on a loss function (Train loss or Test loss) at the time of learning using the plurality of variable groups.
In the above-described embodiment, an example is described in which learning in the group determination period is executed using a plurality of variable groups including any of a plurality of bit ranges for each variable, and learning in the learning execution period is executed using a variable group having the maximum recognition rate obtained by the learning. However, inference of the neural network may be carried out using a plurality of variable groups including any of a plurality of bit ranges for each variable, and the subsequent inference may be carried out using a variable group having the maximum recognition rate obtained by the inference.
In this case, for example, the learning cycles described above may be read as inference cycles, with the group determination period and the execution period applied to inference in the same manner.
As a result, also in the inference using the neural network, it is possible to obtain the same effects as in the case of executing the learning in the learning execution period using a variable group having the maximum recognition rate obtained by executing the learning in the group determination period using a plurality of variable groups. For example, by executing the inference using a variable group having a higher recognition rate than other variable groups, it is possible to reduce the deterioration in the recognition rate when executing the inference of the neural network using quantized variables in comparison with a case where the inference is executed using floating-point type variables. Even when a fixed-point type is used for variables, the recognition rate may be gradually improved by continuing the inference.
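The inference-time variant above amounts to a trial period followed by committed use: each candidate variable group is tried on some inputs, and subsequent inference uses the group with the highest observed recognition rate. The control flow below is an illustrative sketch; the group names and rates are hypothetical.

```python
def choose_group_for_inference(groups, evaluate):
    """Run trial inference with each candidate variable group and return the
    group whose recognition rate is highest; subsequent inference then uses
    that group (illustrative control flow)."""
    rates = {name: evaluate(name) for name in groups}
    best = max(rates, key=rates.get)
    return best, rates

# Hypothetical recognition rates observed during the trial inference.
observed = {"8bit_all": 0.91, "8bit_w_16bit_act": 0.94, "16bit_all": 0.93}
best, rates = choose_group_for_inference(list(observed), observed.get)
print(best)
```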
In the above-described embodiment, an example is described in which a plurality of bit ranges is determined for each type of variable used in the neural network. For example, an example is described in which a plurality of bit ranges common to all weights (or activities or gradients) used in the neural network is determined. However, for example, a plurality of bit ranges may be determined for each type of variable in units of a predetermined number of layers of the neural network. For example, a plurality of bit ranges may be determined for each of a weight (or activity or gradient) used in a certain layer and a weight (or activity or gradient) used in another layer.
A plurality of bit ranges may be determined for a weight (or activity or gradient) used in a certain layer, and one bit range may be determined for a weight (or activity or gradient) used in another layer. For example, a plurality of bit ranges may be determined for a weight and an activity used in the neural network, and one bit range may be determined for a gradient used in the neural network. Which of the variables is to be given a plurality of bit ranges is determined in consideration of a calculation load and an effect of improvement in the recognition rate by the past learning.
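The per-layer, per-variable-type assignment described above can be sketched as simple bookkeeping: variable types selected for the search receive several candidate bit ranges, while the remaining types are pinned to a single range. The function name, layer names, and bit-range tuples are illustrative assumptions.

```python
def build_bit_range_plan(layers, multi_range_vars, single_range, candidates):
    """Assign candidate bit ranges per layer and per variable type: types in
    `multi_range_vars` get several candidates to search over, all other
    types get the one fixed range (illustrative bookkeeping)."""
    plan = {}
    for layer in layers:
        plan[layer] = {
            var: (candidates if var in multi_range_vars else [single_range])
            for var in ("weight", "activity", "gradient")
        }
    return plan

# Weights and activities are searched over several ranges; gradients use one,
# reflecting the trade-off between calculation load and recognition-rate gain.
plan = build_bit_range_plan(
    layers=["conv1", "conv2"],
    multi_range_vars={"weight", "activity"},
    single_range=(5, 2),
    candidates=[(5, 2), (3, 4), (1, 6)],
)
print(plan["conv1"]["gradient"])
```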
The following supplementary appendices are further disclosed related to the embodiments described above.
(Appendix 1)
An information processing device including a processor, wherein
the processor is configured to:
determine a plurality of bit ranges after quantization for at least one of a plurality of types of variables to be used in a neural network;
calculate recognition rates of the neural network by using each of a plurality of variable groups which includes the plurality of types of variables, and in which a bit range of at least one of the plurality of types of variables is different; and
determine to use the variable group having the maximum recognition rate among the plurality of calculated recognition rates, for calculation of the neural network.
(Appendix 2)
The information processing device according to appendix 1, wherein
the processor executes the calculation of the neural network in such a manner that the calculation is executed in a plurality of calculation cycles each of which includes a group determination period and a calculation execution period,
the calculating of the recognition rates and the determining of the variable group having the maximum recognition rate are performed in the group determination period, and
in each of the plurality of calculation cycles, the calculation in the calculation execution period is executed using the variable group determined in the group determination period.
(Appendix 3)
The information processing device according to appendix 1 or 2, wherein
the determining of the bit ranges determines, based on distribution of most significant bits when a determination target variable of the plurality of bit ranges is represented by a fixed-point number, the plurality of bit ranges from the most significant bit side of the distribution.
(Appendix 4)
The information processing device according to appendix 3, wherein
the determining of the bit ranges determines the plurality of bit ranges from the most significant bit side of the distribution within an effective range of the distribution of the most significant bits.
(Appendix 5)
The information processing device according to appendix 3 or 4, wherein
the processor calculates each of quantization errors when the determination target variable of the plurality of bit ranges is quantized in the plurality of bit ranges, and
the determining of the bit ranges determines the plurality of bit ranges from the most significant bit side of the distribution in ascending order of the calculated quantization errors.
(Appendix 6)
The information processing device according to any one of appendices 3 to 5, wherein
the processor acquires distribution of the most significant bits of the determination target variable of the plurality of bit ranges among the plurality of types of variables calculated by the calculation of the neural network.
(Appendix 7)
The information processing device according to any one of appendices 1 to 6, wherein
the processor executes learning of the neural network by using the determined variable group having the maximum recognition rate.
(Appendix 8)
The information processing device according to appendix 7, wherein
the plurality of types of variables includes a weight, an activity, and a gradient.
(Appendix 9)
The information processing device according to any one of appendices 1 to 6, wherein
the processor executes inference of the neural network by using the determined variable group having the maximum recognition rate.
(Appendix 10)
An information processing method executed by a processor included in an information processing device, the method including:
determining a plurality of bit ranges after quantization for at least one of a plurality of types of variables to be used in a neural network;
calculating recognition rates of the neural network by using each of a plurality of variable groups which includes the plurality of types of variables, and in which a bit range of at least one of the plurality of types of variables is different; and
determining to use the variable group having the maximum recognition rate among the plurality of calculated recognition rates, for calculation of the neural network.
(Appendix 11)
An information processing program for causing a processor included in an information processing device to execute a process, the process including
determining a plurality of bit ranges after quantization for at least one of a plurality of types of variables to be used in a neural network;
calculating recognition rates of the neural network by using each of a plurality of variable groups which includes the plurality of types of variables, and in which a bit range of at least one of the plurality of types of variables is different; and
determining to use the variable group having the maximum recognition rate among the plurality of calculated recognition rates, for calculation of the neural network.
Features and advantages of the embodiments will be apparent from the foregoing detailed description. The scope of the claims is intended to cover the features and advantages of the embodiments as described above without departing from their spirit and scope. Any person having ordinary knowledge in the art may readily conceive of improvements and changes. Accordingly, there is no intention to limit the scope of the inventive embodiments to those described above, and it is also possible to rely on appropriate modifications and equivalents included in the scope disclosed in the embodiments.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind
---|---|---|---
2019-167657 | Sep. 13, 2019 | JP | national