One or more aspects of embodiments according to the present disclosure relate to artificial neural networks, and more particularly to a system and method for efficient processing for neural network inference operations.
In an artificial neural network, processing for inference operations may involve large numbers of multiplication and addition operations. In some circumstances, it is advantageous for the processing to be performed quickly, and it may also be advantageous for the processing to consume little energy.
It is with respect to this general technical environment that aspects of the present disclosure are related.
According to an embodiment of the present disclosure, there is provided a system, including: a circuit configured to multiply a first number by a second number, the first number being represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.
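The 13-bit representation described above (one sign bit, five exponent bits, and seven stored mantissa bits, with an implicit leading one yielding an eight-bit full mantissa) can be illustrated with a small decoding sketch. The function name, the field layout, and the exponent bias of 15 (matching the IEEE 754 convention for a five-bit exponent) are assumptions made for illustration only; the disclosure does not fix a particular encoding, and subnormals, zeros, and infinities are ignored here.

```python
def decode_fp13(bits):
    """Decode a hypothetical 13-bit float: 1 sign, 5 exponent, 7 mantissa bits.
    A bias of 15 (as in IEEE 754 half precision) is assumed for illustration;
    special values (zero, subnormals, infinities) are not handled."""
    sign = (bits >> 12) & 0x1
    exponent = (bits >> 7) & 0x1F
    mantissa = bits & 0x7F
    # the implicit leading one extends the 7 stored bits to an 8-bit full mantissa
    full_mantissa = (1 << 7) | mantissa
    # scale by 2 ** -7 to place the binary point after the implicit leading one
    return (-1.0) ** sign * full_mantissa * 2.0 ** (exponent - 15 - 7)

# example: sign=0, biased exponent=15 (i.e., 2 ** 0), mantissa=0 encodes 1.0
print(decode_fp13(0b0_01111_0000000))  # 1.0
```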
In some embodiments, the circuit includes: a first multiplier; a second multiplier; a third multiplier; and a fourth multiplier, each of the first multiplier, the second multiplier, the third multiplier, and the fourth multiplier being a 4-bit by 8-bit multiplier.
In some embodiments, the second number is represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.
In some embodiments, the circuit is configured, in a first configuration, to multiply the mantissa of the first number and the mantissa of the second number using the first multiplier and the second multiplier.
In some embodiments, the circuit is configured, in a second configuration, to calculate an approximate product of a third number and a fourth number, each of the third number and the fourth number being represented as: a sign bit, five exponent bits, and ten mantissa bits, representing an 11-bit full mantissa.
In some embodiments, the circuit is configured, in the second configuration, to: multiply eight bits of the full mantissa of the third number by eight bits of the full mantissa of the fourth number using the first multiplier and the second multiplier, multiply three bits of the full mantissa of the third number by eight bits of the full mantissa of the fourth number using the third multiplier, and multiply eight bits of the full mantissa of the third number by three bits of the full mantissa of the fourth number using the fourth multiplier.
In some embodiments, the circuit is configured not to calculate the product of the three least significant bits of the full mantissa of the third number and the three least significant bits of the full mantissa of the fourth number.
In some embodiments, the second number is an 8-bit integer and the circuit is configured to multiply 8 bits of the full mantissa of the first number by the second number using the first multiplier and the second multiplier.
In some embodiments, the second number is a 4-bit integer and the circuit is configured to multiply 8 bits of the full mantissa of the first number by the second number using the first multiplier.
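The reuse of 4-bit by 8-bit multiplier units across the configurations above can be sketched as follows. This is an illustrative software model, not the disclosed circuit: mul_4x8 stands in for one hardware multiplier unit, and an 8-bit by 8-bit product is assembled from two such units by splitting one operand into its 4-bit halves and shift-adding the partial products.

```python
def mul_4x8(a4, b8):
    """Model of one 4-bit by 8-bit hardware multiplier unit."""
    assert 0 <= a4 < 16 and 0 <= b8 < 256
    return a4 * b8

def mul_8x8(a8, b8):
    """An 8x8 product assembled from two 4x8 units: split a8 into its
    high and low 4-bit halves and shift-add the two partial products."""
    lo = mul_4x8(a8 & 0xF, b8)       # low nibble of a8 times b8
    hi = mul_4x8(a8 >> 4, b8)        # high nibble of a8 times b8
    return (hi << 4) + lo
```

In this model, a 4-bit integer operand occupies a single unit, an 8-bit integer or an eight-bit full mantissa occupies two, consistent with the configurations described above.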
According to an embodiment of the present disclosure, there is provided a method including: multiplying, by a circuit, a first number by a second number, the first number being represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.
In some embodiments, the circuit includes: a first multiplier; a second multiplier; a third multiplier; and a fourth multiplier, each of the first multiplier, the second multiplier, the third multiplier, and the fourth multiplier being a 4-bit by 8-bit multiplier.
In some embodiments, the second number is represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.
In some embodiments, the circuit is configured, in a first configuration, to multiply the mantissa of the first number and the mantissa of the second number using the first multiplier and the second multiplier.
In some embodiments, the circuit is configured, in a second configuration, to calculate an approximate product of a third number and a fourth number, each of the third number and the fourth number being represented as: a sign bit, five exponent bits, and ten mantissa bits, representing an 11-bit full mantissa.
In some embodiments, the circuit is configured, in the second configuration, to: multiply eight bits of the full mantissa of the third number by eight bits of the full mantissa of the fourth number using the first multiplier and the second multiplier, multiply three bits of the full mantissa of the third number by eight bits of the full mantissa of the fourth number using the third multiplier, and multiply eight bits of the full mantissa of the third number by three bits of the full mantissa of the fourth number using the fourth multiplier.
In some embodiments, the circuit is configured not to calculate the product of the three least significant bits of the full mantissa of the third number and the three least significant bits of the full mantissa of the fourth number.
In some embodiments, the second number is an 8-bit integer and the circuit is configured to multiply 8 bits of the full mantissa of the first number by the second number using the first multiplier and the second multiplier.
In some embodiments, the second number is a 4-bit integer and the circuit is configured to multiply 8 bits of the full mantissa of the first number by the second number using the first multiplier.
According to an embodiment of the present disclosure, there is provided a system, including: means for multiplying a first number by a second number, the first number being represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.
In some embodiments, the means for multiplying includes: a first multiplier; a second multiplier; a third multiplier; and a fourth multiplier, each of the first multiplier, the second multiplier, the third multiplier, and the fourth multiplier being a 4-bit by 8-bit multiplier.
These and other features and advantages of the present disclosure will be appreciated and understood with reference to the specification, claims, and appended drawings wherein:
The detailed description set forth below in connection with the appended drawings is intended as a description of exemplary embodiments of a system and method for efficient processing for neural network inference operations provided in accordance with the present disclosure and is not intended to represent the only forms in which the present disclosure may be constructed or utilized. The description sets forth the features of the present disclosure in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and structures may be accomplished by different embodiments that are also intended to be encompassed within the scope of the disclosure. As denoted elsewhere herein, like element numbers are intended to indicate like elements or features.
An artificial neural network performing inference operations may perform large numbers of multiplication and addition operations. These operations may form a significant part of the computational burden of the artificial neural network, so that reducing the cost of such operations may have a significant impact on the performance (e.g., the processing speed) of the artificial neural network. Further, inference operations may in some situations be performed in a portable device, e.g., in a mobile phone. In such a situation it may be advantageous to limit both circuit size and power consumption, and hardware that is capable of performing multiplications and additions in a relatively small amount of chip area, and using little power, may be advantageous.
Some of (e.g., the majority of) the multiplication operations performed by an artificial neural network may be multiplication operations each forming the product of a weight and an activation. In some circumstances both the weight and the activation may be represented as floating point numbers.
As shown in
In some embodiments, an approximate product of two FP16 numbers (each of which has a full mantissa with 11 bits) may be calculated using significantly smaller hardware than a full 11-bit by 11-bit (or “11 x 11”) multiplier. As shown in
These partial products may then be summed to form the approximate product of the two FP16 numbers. The partial product of the three least significant bits of the first full mantissa and the three least significant bits of the second full mantissa may be omitted, reducing the size of the hardware employed to perform the calculation, at the cost of at most 0.02% accuracy loss (and, in some cases, no accuracy loss) compared to an exact FP16 multiplication.
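The approximation described above can be sketched as follows, with each 11-bit full mantissa split into its eight most significant bits and three least significant bits, and the low-by-low partial product dropped. The function name and the error check are illustrative and not part of the disclosure.

```python
def approx_mul_11x11(ma, mb):
    """Approximate product of two 11-bit full mantissas.
    Each mantissa is split into 8 high bits and 3 low bits; three of the
    four partial products are retained, and the product of the two
    3-bit low parts is omitted."""
    a_hi, a_lo = ma >> 3, ma & 0b111
    b_hi, b_lo = mb >> 3, mb & 0b111
    return (a_hi * b_hi << 6) + (a_hi * b_lo << 3) + (a_lo * b_hi << 3)

# the omitted term (a_lo * b_lo) is at most 7 * 7 = 49, so for normal FP16
# full mantissas (at least 2 ** 10) the relative error stays below 49 / 2 ** 20
worst = max(m * m - approx_mul_11x11(m, m) for m in range(1 << 10, 1 << 11))
print(worst)  # 49
```

Note that the error is exactly the omitted low-by-low term, which is why the bound is so small relative to the full product.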
As used herein, a “multiplier” or “means for multiplying” is a digital circuit for calculating products, the digital circuit being a state machine that is not a stored-program computer. As used herein, “a portion of” something means “at least some of” the thing, and as such may mean less than all of, or all of, the thing. As such, “a portion of” a thing includes the entire thing as a special case, i.e., the entire thing is an example of a portion of the thing. As used herein, when a second quantity is “within Y” of a first quantity X, it means that the second quantity is at least X-Y and the second quantity is at most X+Y. As used herein, when a second number is “within Y%” of a first number, it means that the second number is at least (1-Y/100) times the first number and the second number is at most (1+Y/100) times the first number. As used herein, the term “or” should be interpreted as “and/or”, such that, for example, “A or B” means any one of “A” or “B” or “A and B”.
As used herein, when a method (e.g., an adjustment) or a first quantity (e.g., a first variable) is referred to as being “based on” a second quantity (e.g., a second variable) it means that the second quantity is an input to the method or influences the first quantity, e.g., the second quantity may be an input (e.g., the only input, or one of several inputs) to a function that calculates the first quantity, or the first quantity may be equal to the second quantity, or the first quantity may be the same as (e.g., stored at the same location or locations in memory as) the second quantity.
It will be understood that, although the terms “first”, “second”, “third”, etc., may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed herein could be termed a second element, component, region, layer or section, without departing from the spirit and scope of the inventive concept.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concept. As used herein, the terms “substantially,” “about,” and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art.
As used herein, the singular forms “a” and “an” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Further, the use of “may” when describing embodiments of the inventive concept refers to “one or more embodiments of the present disclosure”. Also, the term “exemplary” is intended to refer to an example or illustration. As used herein, the terms “use,” “using,” and “used” may be considered synonymous with the terms “utilize,” “utilizing,” and “utilized,” respectively.
Any numerical range recited herein is intended to include all sub-ranges of the same numerical precision subsumed within the recited range. For example, a range of “1.0 to 10.0” or “between 1.0 and 10.0” is intended to include all subranges between (and including) the recited minimum value of 1.0 and the recited maximum value of 10.0, that is, having a minimum value equal to or greater than 1.0 and a maximum value equal to or less than 10.0, such as, for example, 2.4 to 7.6. Similarly, a range described as “within 35% of 10” is intended to include all subranges between (and including) the recited minimum value of 6.5 (i.e., (1 - 35/100) times 10) and the recited maximum value of 13.5 (i.e., (1 + 35/100) times 10), that is, having a minimum value equal to or greater than 6.5 and a maximum value equal to or less than 13.5, such as, for example, 7.4 to 10.6. Any maximum numerical limitation recited herein is intended to include all lower numerical limitations subsumed therein and any minimum numerical limitation recited in this specification is intended to include all higher numerical limitations subsumed therein.
Although exemplary embodiments of a system and method for efficient processing for neural network inference operations have been specifically described and illustrated herein, many modifications and variations will be apparent to those skilled in the art. Accordingly, it is to be understood that a system and method for efficient processing for neural network inference operations constructed according to principles of this disclosure may be embodied other than as specifically described herein. The invention is also defined in the following claims, and equivalents thereof.
The present application claims priority to and the benefit of U.S. Provisional Application No. 63/293,400, filed Dec. 23, 2021, entitled “FP13 AND APPROXIMATED FP16 FOR THE DEEP LEARNING NEURAL NETWORK ACCELERATION”, the entire content of which is incorporated herein by reference.