The present disclosure relates to circuits and methods for implementing trigonometric functions, and in particular to circuits and methods for hardware-efficient adaptive calculation of floating-point trigonometric functions using the coordinate rotation digital computer (CORDIC) algorithm.
An accelerator circuit such as a graphical processing unit (GPU) may include circuits configured to perform the calculations of numerical functions. The numerical functions may convert one or more input values into one or more output values according to certain mathematical relations defined by the numerical functions. Examples of the numerical functions may include trigonometric functions that are widely used in practical applications such as image processing and machine learning.
The types of operators used to carry out the calculations of the numerical functions determine the complexity of circuits implementing these numerical functions and the time needed to perform these calculations. It is known that the circuit implementation of a multiplication operator is much more complex compared to the circuit implementation of a shift operator or an addition operator. Thus, circuits with small footprint integrated circuits (e.g., small-footprint field-programmable gate array (FPGA) circuits) often do not support direct calculation of a multiplication operator. For such applications, the coordinate rotation digital computer (CORDIC) algorithm is employed to perform calculations of a wide range of numerical functions. The CORDIC algorithm uses rotations rather than multiplications to perform the calculations, thus significantly reducing the complexity of the hardware circuit implementing these numerical functions.
The disclosure will be understood more fully from the detailed description given below and from the accompanying drawings of various implementations of the disclosure. The drawings, however, should not be taken to limit the disclosure to the specific implementations, but are for explanation and understanding only.
The CORDIC algorithms as described in the disclosure are developed to compute trigonometric functions for fixed-point input values. The CORDIC algorithms employ a series of iterative steps of rotations to approximate the trigonometric functions with respect to one or more input values. The trigonometric functions may include $\sin x$, $\cos x$, $\sin^{-1} x$ (or $\arcsin(x)$), $\cos^{-1} x$ (or $\arccos(x)$), $\tan^{-1} x$ (or $\arctan(x)$), etc. Because each iterative step of the CORDIC algorithms involves rotation calculations without invoking multiplication calculations, the circuit supporting implementations of the CORDIC algorithms can be much simpler and can be realized in a small circuit footprint (i.e., small circuit areas) implemented on an FPGA circuit board.
The input values, such as real numbers, can be represented as fixed-point numbers or floating-point numbers when calculating the trigonometric functions. Current implementations of the CORDIC algorithms are primarily for fixed-point input values. In computing, a fixed-point number representation of a real number includes a first fixed number of bits for representing the integer portion of the real number and a second fixed number of bits for representing the fractional portion of the real number. An n-bit (binary) fixed-point number can be thought of as an n-bit integer divided by a scale factor, $2^m$. This is equivalent to treating the number as though there were a radix point between bits m and m−1. The following example assumes an 8-bit number with a scale factor of $2^5$, so the radix point is between bits 5 and 4.
In this case, the bit pattern 0101_1001 is interpreted as the real number $89/2^5 = 2.78125$.
Fixed-point numbers usually represent negative numbers the same way as integers deal with signed numbers, using 2's complement representation, instead of an explicit sign bit.
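The fixed-point interpretation described above can be illustrated with a short Python sketch. The helper name `fixed_to_real` and its parameters are illustrative assumptions, not part of the disclosure; the sketch simply divides the two's complement integer by the scale factor $2^m$.

```python
def fixed_to_real(bits: int, n: int, m: int) -> float:
    """Interpret an n-bit two's complement pattern with m fractional bits.

    Hypothetical helper for illustration only.
    """
    if bits & (1 << (n - 1)):      # top bit set: value is negative
        bits -= 1 << n             # undo the two's complement wrap-around
    return bits / (1 << m)         # divide by the scale factor 2^m

# The 8-bit example with scale factor 2^5:
print(fixed_to_real(0b0101_1001, n=8, m=5))   # 2.78125
# The same format with the top bit set is read as a negative number:
print(fixed_to_real(0b1101_1001, n=8, m=5))   # -1.21875
```

Both values are exact because the divisor is a power of two.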
The floating-point number representation of a real number includes a sign bit, a fixed number of bits representing significant digits (or significand), and an exponent for scaling the significand. For example, in the IEEE floating-point number representation, a real number is represented as $\pm 1.m \times 2^{exp}$, where the mantissa 1.m is a number in the range [1.0, 2.0), with fraction m of some fixed number of bits where the number of bits is implementation-dependent. The exponent exp is an integer in a range that is also implementation-dependent. A sign bit is used to indicate the sign (+ or −). In the case of IEEE single precision floating-point, 23 bits are used for the fractional part, m. The exponent exp has range −126 to 127. IEEE floating-point number representation also includes representations of special cases such as denormals and infinities.
The CORDIC algorithm uses rotations rather than multiplications in calculating trigonometric functions, allowing for efficient hardware implementations of calculations of trigonometric functions. Using the calculations of sine (i.e., sin( )) and cosine (i.e., cos( )) functions as an example, the CORDIC algorithm computes the trigonometric functions sin x and cos x by repeatedly applying the identities
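As an illustration of the single precision layout, the following Python sketch splits a value into its sign bit, unbiased exponent, and 23-bit fraction. The function name `decompose_float32` is a hypothetical helper, and the sketch assumes a normal (non-denormal, finite) input.

```python
import struct

def decompose_float32(x: float):
    """Return (sign, unbiased exponent, 23-bit fraction) of x as IEEE single.

    Hypothetical helper; only meaningful for normal (non-denormal) values.
    """
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # raw 32-bit pattern
    sign = raw >> 31                   # 1 bit
    biased_exp = (raw >> 23) & 0xFF    # 8 bits, bias 127
    fraction = raw & 0x7FFFFF          # 23 bits, the "m" in 1.m
    return sign, biased_exp - 127, fraction

# -6.5 = -1.625 * 2^2, and 0.625 = 0.101 in binary
sign, exp, frac = decompose_float32(-6.5)
print(sign, exp, hex(frac))            # 1 2 0x500000
```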
sin(x+y)=sin(x)cos(y)+cos(x)sin(y)
cos(x+y)=cos(x)cos(y)−sin(x)sin(y)
The above equations can be rewritten as:
sin(x+y)=cos(y)[sin(x)+tan(y)cos(x)]
cos(x+y)=cos(y)[cos(x)−tan(y)sin(x)]
Choosing $x=\theta_i=\tan^{-1}(2^{-i})$, the above equations can be written as
$\sin(\theta_i+y)=\cos(\theta_i)[\sin(y)+\tan(\theta_i)\cos(y)]$
$\cos(\theta_i+y)=\cos(\theta_i)[\cos(y)-\tan(\theta_i)\sin(y)]$
which, because $\tan(\theta_i)=2^{-i}$, can be expanded to:
$\sin(\theta_i+y)=\cos(\theta_i)[\sin(y)+\cos(y)/2^i]$
$\cos(\theta_i+y)=\cos(\theta_i)[\cos(y)-\sin(y)/2^i]$
where the division by $2^i$ can be implemented (for fixed-point) in hardware as a right shift by $i$ bits.
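A given input angle $\alpha$ in the first quadrant can be approximated as $\alpha_n=\sum_{i=0}^{n}\delta_i\theta_i$, where $\delta_i=\pm 1$. The accuracy of the approximation is determined by the number of terms, n. Given an $\alpha$, the trigonometric function values $\sin\alpha_n$ and $\cos\alpha_n$ can be calculated using the following steps:
$\sin(\theta_0)=1/\sqrt{2}$
$\cos(\theta_0)=1/\sqrt{2}$
$\sin(\delta_1\theta_1+\theta_0)=\cos(\theta_1)[\sin(\theta_0)+\delta_1\cos(\theta_0)/2^1]$
$\cos(\delta_1\theta_1+\theta_0)=\cos(\theta_1)[\cos(\theta_0)-\delta_1\sin(\theta_0)/2^1]$
$\sin(\delta_2\theta_2+\delta_1\theta_1+\theta_0)=\cos(\theta_2)[\sin(\delta_1\theta_1+\theta_0)+\delta_2\cos(\delta_1\theta_1+\theta_0)/2^2]$
$\cos(\delta_2\theta_2+\delta_1\theta_1+\theta_0)=\cos(\theta_2)[\cos(\delta_1\theta_1+\theta_0)-\delta_2\sin(\delta_1\theta_1+\theta_0)/2^2]$
These formulae can be generalized as the following recurrence:
$\sin(\delta_i\theta_i+\delta_{i-1}\theta_{i-1}+\dots+\theta_0)=\cos(\theta_i)[\sin(\delta_{i-1}\theta_{i-1}+\dots+\theta_0)+\delta_i\cos(\delta_{i-1}\theta_{i-1}+\dots+\theta_0)/2^i]$
$\cos(\delta_i\theta_i+\delta_{i-1}\theta_{i-1}+\dots+\theta_0)=\cos(\theta_i)[\cos(\delta_{i-1}\theta_{i-1}+\dots+\theta_0)-\delta_i\sin(\delta_{i-1}\theta_{i-1}+\dots+\theta_0)/2^i]$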
Note that the calculation of this sequence includes a multiplication by $\cos(\theta_i)$ at each step. This can be avoided by recognizing that these multiplications can be factored out so that there is a single multiplication by the product $K_n=\prod_{i=0}^{n}\cos(\theta_i)$. To take advantage of this fact, the recurrence can be rewritten so that:
$X_0=1/\sqrt{2}$
$Y_0=1/\sqrt{2}$
$X_i=X_{i-1}+\delta_i Y_{i-1}/2^i$
$Y_i=Y_{i-1}-\delta_i X_{i-1}/2^i$
The $\sin\alpha_n$ and $\cos\alpha_n$ can be recovered by multiplying at the end with $K_n$, so
$\sin(\alpha_i)=K_i X_i$
$\cos(\alpha_i)=K_i Y_i$
If n can be predetermined based on the accuracy of the approximation to $\alpha$, then the final multiplication with $K_n$ can be avoided by instead pre-multiplying with $K_n$. This is equivalent to initializing $X_0$ and $Y_0$ with $K_n$. So,
$X_0=K_n/\sqrt{2}$
$Y_0=K_n/\sqrt{2}$
$\sin(\alpha_n)=X_n$
$\cos(\alpha_n)=Y_n$
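To make the role of the scale factor concrete, the following Python sketch evaluates the product of the $\cos(\theta_i)$ terms for several values of n. The function name `cordic_gain` is an illustrative assumption; the sketch uses floating-point arithmetic only to show the numerical value, whereas a hardware implementation would store the constant.

```python
import math

def cordic_gain(n: int) -> float:
    """K_n = product over i = 0..n of cos(theta_i), theta_i = atan(2^-i).

    Hypothetical helper for illustration only.
    """
    k = 1.0
    for i in range(n + 1):
        # cos(atan(t)) = 1 / sqrt(1 + t^2), here with t = 2^-i
        k *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

for n in (0, 1, 2, 8, 16, 32):
    print(n, cordic_gain(n))
```

The product settles quickly toward roughly 0.60725, so for a predetermined n a single precomputed constant can serve as the pre-multiplied initial value.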
Given an angle α for computing sin(α) or cos(α), the CORDIC algorithm includes the calculation of the δi at each step i such that eventual αn can best approximate α. The standard approach is to pick + or − based on whether the current approximation αi is less than α or not.
The pseudo code for the calculation of sin and/or cos using CORDIC is:
The code includes a minor variation: instead of initializing $A_i$/Y/X with the values corresponding to $\theta_0$, they are initialized outside the loop as 0/0/$K_n$. At the end of the first iteration, they correspond to the values for $\theta_0$. As shown, the CORDIC algorithm for calculating sin and/or cos functions involves shift operators (>>) and addition/subtraction operators (+/−) but no multiplication operators. Thus, the CORDIC algorithm can be implemented in small footprint circuits such as FPGA circuits.
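A minimal Python sketch of such a shift-and-add loop is given here; it is not the original listing. The fixed-point format (`FRAC_BITS`), the step count (`N_STEPS`), and all identifiers are assumptions chosen for illustration. Following the 0/0/$K_n$ initialization, Y accumulates the sine and X accumulates the cosine. Floating-point math is used only to build the constant table and to convert the input and output.

```python
import math

FRAC_BITS = 30                 # assumed format: 30 bits after the radix point
N_STEPS = 30                   # assumed number of CORDIC steps
ONE = 1 << FRAC_BITS

# theta_i = atan(2^-i) as fixed-point constants (a small ROM in hardware).
THETA = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_STEPS)]
# K_n = prod cos(theta_i), folded into the initial value of X.
K_N = round(math.prod(math.cos(math.atan(2.0 ** -i))
                      for i in range(N_STEPS)) * ONE)

def cordic_sin_cos(alpha: float):
    """Return (sin(alpha), cos(alpha)) for alpha in [0, pi/2] radians."""
    target = round(alpha * ONE)
    a, y, x = 0, 0, K_N        # A/Y/X initialized as 0/0/K_n
    for i in range(N_STEPS):
        if a < target:         # delta_i = +1: rotate by +theta_i
            a, x, y = a + THETA[i], x - (y >> i), y + (x >> i)
        else:                  # delta_i = -1: rotate by -theta_i
            a, x, y = a - THETA[i], x + (y >> i), y - (x >> i)
    return y / ONE, x / ONE

s, c = cordic_sin_cos(0.6)
print(s, math.sin(0.6))
print(c, math.cos(0.6))
```

Inside the loop only comparisons, shifts, additions, and subtractions appear, which mirrors the operator set named above.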
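Some implementations may use a double rotation $2\theta_i$ rather than a single rotation $\theta_i$. In double rotation, instead of adding $\pm\theta_i$, each step adds $\pm 2\theta_i$.
$\sin(x+\delta_i 2\theta_i)=\sin(x)\cos(\delta_i 2\theta_i)+\cos(x)\sin(\delta_i 2\theta_i)$
$\cos(x+\delta_i 2\theta_i)=\cos(x)\cos(\delta_i 2\theta_i)-\sin(x)\sin(\delta_i 2\theta_i)$
After expansion
$\sin(x+\delta_i 2\theta_i)=\sin(x)[\cos^2(\theta_i)-\sin^2(\theta_i)]+\cos(x)2\delta_i\sin(\theta_i)\cos(\theta_i)$
$\cos(x+\delta_i 2\theta_i)=\cos(x)[\cos^2(\theta_i)-\sin^2(\theta_i)]-\sin(x)2\delta_i\sin(\theta_i)\cos(\theta_i)$
Rearranging and factoring provides
$\sin(x+\delta_i 2\theta_i)=\cos^2(\theta_i)[\sin(x)+\delta_i 2\tan(\theta_i)\cos(x)-\tan^2(\theta_i)\sin(x)]$
$\cos(x+\delta_i 2\theta_i)=\cos^2(\theta_i)[\cos(x)-\delta_i 2\tan(\theta_i)\sin(x)-\tan^2(\theta_i)\cos(x)]$
Expanding $\tan(\theta_i)=2^{-i}$ provides
$\sin(x+\delta_i 2\theta_i)=\cos^2(\theta_i)[\sin(x)+\delta_i\cos(x)/2^{i-1}-\sin(x)/2^{2i}]$
$\cos(x+\delta_i 2\theta_i)=\cos^2(\theta_i)[\cos(x)-\delta_i\sin(x)/2^{i-1}-\cos(x)/2^{2i}]$
The recurrence relationship for the double rotation is:
$X_i=X_{i-1}+\delta_i Y_{i-1}/2^{i-1}-X_{i-1}/2^{2i}$
$Y_i=Y_{i-1}-\delta_i X_{i-1}/2^{i-1}-Y_{i-1}/2^{2i}$
In this case, $\alpha_n=\sum_{i=0}^{n}\delta_i 2\theta_i$ and $K_n=\prod_{i=0}^{n}\cos^2(\theta_i)$.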
The code for implementing sin/cos using double rotation CORDIC is
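A minimal Python sketch of a double rotation loop follows; it is not the original listing. The fixed-point format, the step count, and all identifiers are assumptions. In this sketch X tracks the sine and Y tracks the cosine, the division by $2^{i-1}$ is written as a one-bit left shift followed by a right shift by i (so that i = 0 needs no special case), and the scale factor is the product of the $\cos^2(\theta_i)$ terms.

```python
import math

FRAC_BITS = 30                 # assumed format: 30 bits after the radix point
N_STEPS = 30                   # assumed number of double-rotation steps
ONE = 1 << FRAC_BITS

# 2*theta_i as fixed-point constants.
TWO_THETA = [round(2.0 * math.atan(2.0 ** -i) * ONE) for i in range(N_STEPS)]
# K_n = prod cos^2(theta_i) = prod 1 / (1 + 2^-2i)
K_N = round(math.prod(1.0 / (1.0 + 4.0 ** -i) for i in range(N_STEPS)) * ONE)

def cordic_sin_cos_double(alpha: float):
    """Return (sin(alpha), cos(alpha)); alpha in radians, roughly |alpha| < 3.4."""
    target = round(alpha * ONE)
    a, x, y = 0, 0, K_N                          # angle 0: sin = 0, cos = K_n
    for i in range(N_STEPS):
        xs, ys = (x << 1) >> i, (y << 1) >> i    # X / 2^(i-1), Y / 2^(i-1)
        xq, yq = x >> (2 * i), y >> (2 * i)      # X / 2^(2i),  Y / 2^(2i)
        if a < target:                           # delta_i = +1
            a, x, y = a + TWO_THETA[i], x + ys - xq, y - xs - yq
        else:                                    # delta_i = -1
            a, x, y = a - TWO_THETA[i], x - ys - xq, y + xs - yq
    return x / ONE, y / ONE

s, c = cordic_sin_cos_double(0.6)
print(s, math.sin(0.6))
print(c, math.cos(0.6))
```

Because each step rotates by twice the angle, the first step alone moves the accumulated angle by a quarter turn, which is why the sketch accepts inputs beyond the first quadrant.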
As to the calculation of $\sin^{-1}(v)$ using CORDIC, the algorithm may choose a series of $\delta_i$ to build $\alpha_n$ such that $\sin(\alpha_n)$ approximates v. The algorithm chooses + or − for $\delta_{i+1}$ based on whether $\sin(\alpha_i)$ is less than v or not. Because the recurrence computes $X_i$ and $Y_i$ instead of $\sin\alpha_i$ and $\cos\alpha_i$, this approach may need to be modified to evaluate:
$\sin(\alpha_i)<v \equiv K_i X_i<v \equiv X_i<v/K_i$
Now, let $v_i=v/K_i$. In that case, the following recurrence (for double rotation CORDIC) can be used.
$v_i=v/K_i=v_{i-1}/\cos^2(\theta_i)=v_{i-1}(1/\cos^2(\theta_i))$
Note that for the single rotation CORDIC, the $1/\cos^2(\theta_i)$ term would be replaced by $1/\cos(\theta_i)$, which would need a multiplication to implement. In the double rotation CORDIC, because $1/\cos^2(\theta_i)=1+\tan^2(\theta_i)$, the recurrence for $v_i$ can be simplified as
$v_i=v_{i-1}(1+2^{-2i})$
which can be implemented using a shift and an add.
The code for arcsin is:
Correspondingly, the arccos of an input can be computed from the arcsin by using the relation:
$\cos^{-1}x=\pi/2-\sin^{-1}x$
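A minimal Python sketch of a double rotation arcsin loop follows; it is not the original listing. The fixed-point format, step count, and identifiers are assumptions. X and Y are left unscaled, and the comparison target `t` is grown by a shift and an add at each step so that it tracks the input divided by the accumulated scale factor. The sketch is intended for inputs comfortably inside (−1, 1); inputs very close to ±1 would need additional care that is omitted here.

```python
import math

FRAC_BITS = 30                 # assumed format: 30 bits after the radix point
N_STEPS = 30                   # assumed number of double-rotation steps
ONE = 1 << FRAC_BITS
TWO_THETA = [round(2.0 * math.atan(2.0 ** -i) * ONE) for i in range(N_STEPS)]

def cordic_arcsin(v: float) -> float:
    """Approximate asin(v) in radians for v well inside (-1, 1)."""
    t = round(v * ONE)                 # t tracks v_i = v / K_i
    a, x, y = 0, 0, ONE                # unscaled: X ~ sin / K_i, Y ~ cos / K_i
    for i in range(N_STEPS):
        xs, ys = (x << 1) >> i, (y << 1) >> i    # X / 2^(i-1), Y / 2^(i-1)
        xq, yq = x >> (2 * i), y >> (2 * i)      # X / 2^(2i),  Y / 2^(2i)
        if x < t:                      # sine too small: rotate by +2*theta_i
            a, x, y = a + TWO_THETA[i], x + ys - xq, y - xs - yq
        else:                          # sine too large: rotate by -2*theta_i
            a, x, y = a - TWO_THETA[i], x - ys - xq, y + xs - yq
        t += t >> (2 * i)              # v_i = v_(i-1) * (1 + 2^-2i)
    return a / ONE

def cordic_arccos(v: float) -> float:
    """arccos via the relation acos(v) = pi/2 - asin(v)."""
    return math.pi / 2 - cordic_arcsin(v)

print(cordic_arcsin(0.5), math.asin(0.5))
print(cordic_arccos(0.5), math.acos(0.5))
```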
It is possible to use the CORDIC infrastructure to compute $\tan^{-1}(v)$. The standard approach is to initialize X to v and Y to 1, and then force X to 0. This results in the following code:
While the initial values are set as Y=1 and X=v, they are not limited to these initial values. In practice, any initial values that follow the relation of Y/X=tan(v) may work. Alternatively, the initial values of X and Y can be set so that X/Y=tan(v), and then return $\pi/2-\alpha_i$.
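A minimal Python sketch of the "force X to 0" approach follows; it is not the original listing. The fixed-point format, step count, and identifiers are assumptions. X starts at v and Y at 1; each step rotates the pair so that X moves toward zero while the applied angles are accumulated.

```python
import math

FRAC_BITS = 30                 # assumed format: 30 bits after the radix point
N_STEPS = 30                   # assumed number of CORDIC steps
ONE = 1 << FRAC_BITS
THETA = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_STEPS)]

def cordic_arctan(v: float) -> float:
    """Approximate atan(v) in radians by driving X to zero."""
    x, y, a = round(v * ONE), ONE, 0   # X = v, Y = 1, accumulated angle = 0
    for i in range(N_STEPS):
        if x > 0:              # X still positive: remove another theta_i
            a, x, y = a + THETA[i], x - (y >> i), y + (x >> i)
        else:                  # X overshot: give theta_i back
            a, x, y = a - THETA[i], x + (y >> i), y - (x >> i)
    return a / ONE

print(cordic_arctan(0.75), math.atan(0.75))
print(cordic_arctan(-2.0), math.atan(-2.0))
```

No scale-factor correction is needed in this sketch because only the accumulated angle is returned, not X or Y.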
The above sections include a description of fixed-point CORDIC. In a fixed-point implementation, there are a fixed number of bits after the radix point, limiting the number of bits of precision available. If there are N bits after the radix point, the granularity of numbers that can be represented is $2^{-N}$. This generally means that $\alpha_{N+1}$ can be an exact representation of the number, resulting in very good numerical evaluation of the trigonometric functions described above. Thus only a small number (N) of steps of the recurrence needs to be evaluated, depending on the number of bits used for the precision.
Compared to the fixed-point number representation, the floating-point number representation includes an exponent that can be very small, such as $2^{-126}$ for non-denormal IEEE single precision floating-point numbers or $2^{-1022}$ for double precision floating-point numbers. The smallest granularity that can be represented is so small that it would require a very large number of steps of the recurrence to be evaluated if the fixed-point CORDIC is used to evaluate the trigonometric functions with a floating-point input value. On the other hand, if only a small number of bits is used to represent the small floating-point number, the relative error can be very large although the absolute error can be small. For example, if 31 bits (a granularity of $2^{-31}$) are used to represent a floating-point number on the order of $2^{-55}$, the relative error can be as high as $2^{24}$, which may mean that all the bits in the mantissa of the evaluated trigonometric functions are incorrect. Thus, it is not hardware-efficient or accurate to apply the fixed-point CORDIC algorithm directly to floating-point number representations when the input value is very small.
Instead of directly applying fixed-point CORDIC to floating-point number representations, implementations of the disclosure first determine whether the floating-point input value is small. Responsive to determining that the floating-point input value is not small, implementations may employ the CORDIC algorithm to calculate trigonometric functions. Responsive to determining that the floating-point input value is small, implementations may employ an approximation approach to calculate trigonometric functions. Thus, the values of trigonometric functions can be evaluated in an adaptive manner. One implementation of the disclosure may use the first term of the Taylor series expansions as the approximation of the trigonometric functions with a small floating-point input. These approximations based on the first terms of the Taylor series expansions for small input value α are:
$\sin(\alpha)\approx\alpha$
$\cos(\alpha)\approx 1-\alpha^2/2$
$\sin^{-1}(\alpha)\approx\alpha$
$\tan^{-1}(\alpha)\approx\alpha$
where the input value α is measured in radians. The trigonometric functions of large input values may be calculated using the CORDIC algorithm. In one implementation, instead of just the first term of the Taylor series, the approximation for the small input value α may also include the second term (or more terms) of the Taylor series. Because the input value is a small floating-point number, the multiplication result of the higher-order terms (second order or above) may be represented using fewer bits (e.g., 8 bits), and the multiplication circuits for the small floating-point input values can be low-bit (e.g., 4 or 5 bits) multiplication circuits, which are cheaper to implement than the standard 16-bit or 32-bit multiplication circuits.
Compared to the fixed-point number representation, the floating-point number representation includes an exponent that can be very small, such as $2^{-126}$ for non-denormal IEEE single precision floating-point numbers or $2^{-1022}$ for double precision floating-point numbers. The smallest granularity that can be represented is so small that it would require a very large number of steps of the recurrence to be evaluated compared to the fixed-point number representation. Thus, it is not efficient to apply the fixed-point CORDIC algorithm directly to floating-point number representations when the input value is very small. The adaptive calculation of trigonometric functions may allow hardware-efficient implementations of the trigonometric functions.
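The adaptive selection between the approximation and CORDIC can be sketched in Python as follows. The cutoff value, the step count, and all identifiers are assumptions for illustration. The CORDIC path is modeled in floating point, with scaling by `2.0 ** -i` standing in for the hardware right shift, so the sketch shows the control flow rather than a bit-accurate circuit.

```python
import math

SMALL_CUTOFF = 2.0 ** -11      # assumed threshold for a "small" input
N_STEPS = 40                   # assumed number of CORDIC steps

THETA = [math.atan(2.0 ** -i) for i in range(N_STEPS)]
K_N = math.prod(math.cos(t) for t in THETA)

def cordic_sin(alpha: float) -> float:
    """Floating-point model of the CORDIC recurrence, |alpha| <= pi/2."""
    a, s, c = 0.0, 0.0, K_N
    for i in range(N_STEPS):
        d = 1.0 if a < alpha else -1.0           # delta_i = +1 or -1
        # multiplying by 2.0 ** -i models a right shift by i bits
        a, s, c = a + d * THETA[i], s + d * c * 2.0 ** -i, c - d * s * 2.0 ** -i
    return s

def adaptive_sin(alpha: float) -> float:
    """Small inputs use sin(alpha) ~ alpha; other inputs use CORDIC."""
    if abs(alpha) < SMALL_CUTOFF:
        return alpha
    return cordic_sin(alpha)

print(adaptive_sin(2.0 ** -20), math.sin(2.0 ** -20))
print(adaptive_sin(0.5), math.sin(0.5))
```

With this split, the number of CORDIC steps is set by the cutoff rather than by the smallest representable floating-point input.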
At 104, the method may include determining whether the input value is a small number. The determination of whether the input value is small may be based on one or more factors including, but not limited to, estimated absolute errors, estimated relative errors, types of the trigonometric functions, types of floating-point number representations, or the hardware constraints. In one implementation, the determination of whether the floating-point input value is small can be based on an estimated absolute error. For example, a target bound of the estimated absolute error may be represented by a number (N) of bits (e.g., $2^{-N}$). Thus, a floating-point input value smaller than the target bound of the estimated absolute error (e.g., $2^{-N}$) is determined to be small; a floating-point input value that is equal to or larger than the target bound is determined to be not small.
Similarly, in another implementation, the determination of whether the floating-point input value is small can be based on an estimated relative error, for example, by comparison against a target bound of the estimated relative error. Current implementations of CORDIC are fixed-point algorithms. An n-step CORDIC algorithm may have a residual error of $k2^{-n}$. For floating-point number representations, however, the residual errors are in a range that is determined by the size of the exponent. In the case of single-precision numbers, values can be as small as $2^{-149}$. Covering this range using purely a CORDIC approach would require approximately 149 steps. Using the first term of the Taylor expansion as an approximation for a small input value may reduce the complexity of the calculation.
As discussed above, instead of performing the CORDIC algorithm, the first term of the Taylor expansion can be used as the approximation of a trigonometric function for small input values. The residual errors for using the first term of Taylor expansions to approximate the trigonometric functions depend on the functions themselves. For example, for sin(x) functions, the residual errors are bounded by $x^3/3!$; for cos(x) functions, the residual errors are bounded by $x^4/4!$. Thus, the trigonometric functions may determine the bounds of the residual errors of different approximations. These bounds of the residual errors may be used to determine the bounds of absolute errors or relative errors, thereby determining whether the input value is small.
Using sin with a single-precision floating-point input value as an example, only 24 significant bits are needed. If the approximation used is $\sin(\theta)\approx\theta$, the relative error is approximately $\theta^2/6$. If $\theta\le 2^{-11}$, then the relative error is less than $2^{-24}$, so the approximation will differ by at most the least significant bit. This reduces the problem to making CORDIC accurate for input values greater than $2^{-11}$. If the input is $2^{-10}$, then the result is approximately $2^{-10}$. If the result is to be in single precision format, then the least significant bit is $2^{-34}$. This means that approximately 34 CORDIC steps will be necessary to reduce the error to the least significant bit.
If an error in the last 2k+1 bits is acceptable, then for single precision, the cutoff between a small value and a non-small value can be set at $2^{-11+k}$. This will keep the error in the desired range. Then, the CORDIC algorithm only needs 34−3k steps: the smallest result generated is $2^{-11+k}$ and the smallest bit position which needs to be accurate is at 24−2k. So, if an error in the 5 least significant bits (LSBs) is acceptable (k=2), the cutoff can be set at $2^{-9}$, and the CORDIC algorithm needs at most 28 steps.
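The single precision cutoff reasoning can be checked numerically with a short Python sketch. The function name `sine_cutoff_and_steps` is a hypothetical helper that simply evaluates the two expressions stated above; the relative error check uses double precision arithmetic as a reference.

```python
import math

def sine_cutoff_and_steps(k: int):
    """For an acceptable error in the last 2k+1 bits, return
    (exponent of the small/non-small cutoff, number of CORDIC steps).

    Hypothetical helper evaluating 2^(-11+k) and 34-3k.
    """
    return -11 + k, 34 - 3 * k

# Relative error of sin(theta) ~ theta at the k = 0 cutoff, theta = 2^-11
theta = 2.0 ** -11
rel_err = (theta - math.sin(theta)) / math.sin(theta)
print(rel_err < 2.0 ** -24)            # True
print(sine_cutoff_and_steps(0))        # (-11, 34)
```

Raising k trades accuracy in the low-order bits for a larger cutoff and fewer CORDIC steps.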
Responsive to determining that the input value is a small value, at 106, the method may include using an approximation of the trigonometric function with respect to the small input value. In one implementation, the approximation to the trigonometric functions with a small floating-point input value is the first term of Taylor expansion. These approximations based on the first terms of the Taylor series expansions for small input value α are:
$\sin(\alpha)\approx\alpha$
$\cos(\alpha)\approx 1-\alpha^2/2$
$\sin^{-1}(\alpha)\approx\alpha$
$\tan^{-1}(\alpha)\approx\alpha$
where the input value α is measured in terms of radians.
Customarily, when dealing with trigonometric functions, angles are given in units of radians. However, using radians may require taking the modulus with respect to 2π to remove the periodicity for computing functions such as sin/cos, and then, possibly require further dividing by π/2 to identify the quadrant and to constrain the input to one quadrant. This can be computationally expensive. An alternative is to express any angles in terms of multiples of π/2 (or of some other rational multiple of π, such as π or 2π). If angles are expressed in multiples of π/2, then:
Adopting this approach may greatly simplify the handling of angles for trigonometric functions. The Taylor series approximations work for small angles expressed in radians. If angles are expressed as multiples of π/2, then:
$\sin(\alpha)\approx\alpha\pi/2$
$\cos(\alpha)\approx 1-\alpha^2\pi^2/8$
$\sin^{-1}(\alpha)\approx 2\alpha/\pi$
$\tan^{-1}(\alpha)\approx 2\alpha/\pi$
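To illustrate why expressing angles as multiples of π/2 simplifies range reduction, the following Python sketch evaluates the sine of an angle given in quarter turns. The function name `sin_quarter_turns` is an illustrative assumption, and the library `math.sin`/`math.cos` calls stand in for whatever first-quadrant evaluator (approximation or CORDIC) an implementation would use. The integer part of the input selects the quadrant and the fractional part is the position within it, so no modulus with respect to 2π is needed.

```python
import math

def sin_quarter_turns(u: float) -> float:
    """sin of an angle u expressed in multiples of pi/2 (hypothetical helper)."""
    q = math.floor(u)                  # whole quarter turns
    f = u - q                          # position inside the quadrant, in [0, 1)
    r = f * (math.pi / 2)              # first-quadrant angle in radians
    quadrant = q % 4                   # Python's % is non-negative here
    if quadrant == 0:
        return math.sin(r)
    if quadrant == 1:
        return math.cos(r)
    if quadrant == 2:
        return -math.sin(r)
    return -math.cos(r)

print(sin_quarter_turns(1.0))                          # 1.0
print(sin_quarter_turns(2.5), math.sin(2.5 * math.pi / 2))
# For a small input the result is close to u * pi / 2:
print(sin_quarter_turns(2.0 ** -20), 2.0 ** -20 * math.pi / 2)
```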
Responsive to determining that the input value is not a small value, at 108, the method may include using the CORDIC algorithms to calculate the trigonometric functions with respect to the input value, where the CORDIC algorithms may have been implemented as circuit logic blocks (e.g., on an FPGA circuit board) including shift operator circuit and/or addition operator circuit. Thus, implementations of the disclosure may reuse the circuit logic blocks implementing CORDIC algorithms and save the circuit area.
Standard IEEE floating-point representations include representations of special values such as +/−Infinity, signaling and quiet NaNs, and denormals. In addition, IEEE floating-point representations can identify certain exceptions, specifically INVALID, OVERFLOW, UNDERFLOW, DIVIDE-BY-ZERO and INEXACT. The table below summarizes the actions recommended for the operations.
In one implementation, system 200 may provide a co-processor (or accelerator circuit) 204 with designated circuits to support the adaptive calculation of floating-point trigonometric functions 210. Co-processor 204 can be part of processor 202 or a separate logic circuit communicatively connected to processor 202. For example, co-processor 204 can be an accelerator circuit implemented on an FPGA board to accelerate the calculation of trigonometric functions. Co-processor 204 may include a register file 218, a determination circuit 212, an approximation circuit 214, and a CORDIC recurrence circuit 216.
Register file 218 may include instruction registers 220 and data registers 222. Instruction registers 220 may receive instructions from the execution pipeline of processor 202 executing floating-point trigonometric function 210 and store the instructions therein. In one implementation, the instructions may include trigonometric calculation instructions for evaluating trigonometric functions with respect to an input value. Data registers 222 may store input values and output values associated with a corresponding trigonometric calculation function instruction. In one implementation, data registers 222 may include floating-point data registers that may store floating-point input values and floating-point output values. The execution pipeline of processor 202 may store the input values associated with a trigonometric calculation function in data registers 222 and retrieve the results of executing the trigonometric calculation function from data registers 222.
Determination circuit 212 may identify an instruction for calculating a trigonometric function from instruction register 220 and a corresponding input value associated with the instruction from data register 222. Responsive to identifying the instruction including the input value, determination circuit 212 may parse the instruction and the input value, and further determine whether the input value is a small value. As discussed above, the determination circuit 212 may determine whether the input value is a small value based on one or more factors including, but not limited to, estimated absolute errors, estimated relative errors, types of the trigonometric functions, types of floating-point number representations, and the hardware constraints. Determination circuit 212 may include a switch circuit (e.g., a multiplexer) that may route the input value based on the determination of whether the input value is a small value.
Responsive to determining that the input value is a small value, determination circuit 212 may route the input value to approximation circuit 214 to calculate an approximation of the trigonometric function. The trigonometric functions supported by approximation circuit 214 may include $\sin$, $\cos$, $\sin^{-1}$, and $\tan^{-1}$. For example, approximation circuit 214 may include logic circuits that implement the first term Taylor approximation of trigonometric functions $\sin$, $\cos$, $\sin^{-1}$, and $\tan^{-1}$, respectively. The first term Taylor approximations of trigonometric functions are described above. The output of approximation circuit 214 can be the evaluation of the trigonometric function and can be transmitted to processor 202.
Responsive to determining that the input value is not a small value, determination circuit 212 may route the input value to CORDIC recurrence circuit 216 to evaluate the trigonometric function based on the input value. CORDIC recurrence circuit 216 may include logic circuits that respectively implement CORDIC recurrence for trigonometric functions including $\sin$, $\cos$, $\sin^{-1}$, and $\tan^{-1}$. In particular, CORDIC recurrence circuit 216 may include shift circuits and addition circuits that may be combined to perform different CORDIC algorithms for trigonometric functions. The output of CORDIC recurrence circuit 216 can be the evaluation of the trigonometric function and can be transmitted to processor 202. As such, co-processor 204 may implement in hardware circuits the adaptive calculation of trigonometric functions in a hardware-efficient way.
To take special input values, as shown in Table 1, into consideration when computing one of the trigonometric functions, the method and the system may include:
Using the CORDIC algorithm to compute trigonometric functions involves the following steps:
When the input numbers fall in the range appropriate to using CORDIC, it is possible to restrict the numbers as well as the intermediate results to fall in the range [0 . . . 4). This means that only 2 bits above the radix point are required.
As shown in
Certain components of inner stage 300 may be reconfigured to implement different elementary functions and/or trigonometric functions. In particular, comparator 308 is configured to be a greater comparator (“>”) except for the arctan function. In the arctan function case, comparator 308 is configured to be a lesser or equal comparator (“≤”). Summation/subtraction circuit 312B is configured to evaluate trigonometric functions, but is configured as a subtraction/summation circuit for the evaluation of elementary functions. Subtraction/summation circuits 310A, 310B are configured to evaluate trigonometric functions, but are configured as summation/subtraction circuits for the evaluation of elementary functions. Multiplexers 302A, 302B may select ai/A for trigonometric functions, and select Y/V for elementary functions. The index values i for shifters 304A, 304C, 306A are sequential for trigonometric functions, but include repeating terms as described above for elementary functions.
In one implementation of the disclosure, the input value to the trigonometric functions can be an IEEE single precision, double precision, quadruple precision, or octuple precision floating-point number. For a single precision implementation, the fixed-point numbers are represented as 61-bit numbers, with 59 bits after the (implicit) radix point.
For this implementation, the calculation of sin/cos functions may include the following:
Implementations of the disclosure provide an adaptive calculation of trigonometric functions with floating-point input values using an approximation function and the CORDIC algorithm. Implementations may leverage a common CORDIC circuit block to reduce the circuit area. For each type of trigonometric function, the adaptation of CORDIC to floating-point includes identifying alternative approximation techniques that can be used in each of these cases to deal with ranges of inputs that may be costly if computed using CORDIC algorithm alone.
While the disclosure has been described with respect to a limited number of implementations, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this disclosure.
A design may go through various stages, from creation to simulation to fabrication. Data representing a design may represent the design in a number of manners. First, as is useful in simulations, the hardware may be represented using a hardware description language or another functional description language. Additionally, a circuit level model with logic and/or transistor gates may be produced at some stages of the design process. Furthermore, most designs, at some stage, reach a level of data representing the physical placement of various devices in the hardware model. In the case where conventional semiconductor fabrication techniques are used, the data representing the hardware model may be the data specifying the presence or absence of various features on different mask layers for masks used to produce the integrated circuit. In any representation of the design, the data may be stored in any form of a machine readable medium. A memory or a magnetic or optical storage such as a disc may be the machine readable medium to store information transmitted via optical or electrical wave modulated or otherwise generated to transmit such information. When an electrical carrier wave indicating or carrying the code or design is transmitted, to the extent that copying, buffering, or re-transmission of the electrical signal is performed, a new copy is made. Thus, a communication provider or a network provider may store on a tangible, machine-readable medium, at least temporarily, an article, such as information encoded into a carrier wave, embodying techniques of implementations of the present disclosure.
A module as used herein refers to any combination of hardware, software, and/or firmware. As an example, a module includes hardware, such as a micro-controller, associated with a non-transitory medium to store code adapted to be executed by the micro-controller. Therefore, reference to a module, in one implementation, refers to the hardware, which is specifically configured to recognize and/or execute the code to be held on a non-transitory medium. Furthermore, in another implementation, use of a module refers to the non-transitory medium including the code, which is specifically adapted to be executed by the microcontroller to perform predetermined operations. And as can be inferred, in yet another implementation, the term module (in this example) may refer to the combination of the microcontroller and the non-transitory medium. Often module boundaries that are illustrated as separate commonly vary and potentially overlap. For example, a first and a second module may share hardware, software, firmware, or a combination thereof, while potentially retaining some independent hardware, software, or firmware. In one implementation, use of the term logic includes hardware, such as transistors, registers, or other hardware, such as programmable logic devices.
Use of the phrase ‘configured to,’ in one implementation, refers to arranging, putting together, manufacturing, offering to sell, importing and/or designing an apparatus, hardware, logic, or element to perform a designated or determined task. In this example, an apparatus or element thereof that is not operating is still ‘configured to’ perform a designated task if it is designed, coupled, and/or interconnected to perform said designated task. As a purely illustrative example, a logic gate may provide a 0 or a 1 during operation. But a logic gate ‘configured to’ provide an enable signal to a clock does not include every potential logic gate that may provide a 1 or 0. Instead, the logic gate is one coupled in some manner that during operation the 1 or 0 output is to enable the clock. Note once again that use of the term ‘configured to’ does not require operation, but instead focus on the latent state of an apparatus, hardware, and/or element, where in the latent state the apparatus, hardware, and/or element is designed to perform a particular task when the apparatus, hardware, and/or element is operating.
Furthermore, use of the phrases ‘to,’ ‘capable of/to,’ and/or ‘operable to,’ in one implementation, refers to some apparatus, logic, hardware, and/or element designed in such a way to enable use of the apparatus, logic, hardware, and/or element in a specified manner. Note as above that use of to, capable to, or operable to, in one implementation, refers to the latent state of an apparatus, logic, hardware, and/or element, where the apparatus, logic, hardware, and/or element is not operating but is designed in such a manner to enable use of an apparatus in a specified manner.
A value, as used herein, includes any known representation of a number, a state, a logical state, or a binary logical state. Often, the use of logic levels, logic values, or logical values is also referred to as 1's and 0's, which simply represents binary logic states. For example, a 1 refers to a high logic level and 0 refers to a low logic level. In one implementation, a storage cell, such as a transistor or flash cell, may be capable of holding a single logical value or multiple logical values. However, other representations of values in computer systems have been used. For example, the decimal number ten may also be represented as a binary value of 1010 and a hexadecimal letter A. Therefore, a value includes any representation of information capable of being held in a computer system.
Moreover, states may be represented by values or portions of values. As an example, a first value, such as a logical one, may represent a default or initial state, while a second value, such as a logical zero, may represent a non-default state. In addition, the terms reset and set, in one implementation, refer to a default and an updated value or state, respectively. For example, a default value potentially includes a high logical value, i.e. reset, while an updated value potentially includes a low logical value, i.e. set. Note that any combination of values may be utilized to represent any number of states.
The implementations of methods, hardware, software, firmware or code set forth above may be implemented via instructions or code stored on a machine-accessible, machine readable, computer accessible, or computer readable medium which are executable by a processing element. A non-transitory machine-accessible/readable medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine, such as a computer or electronic system. For example, a non-transitory machine-accessible medium includes random-access memory (RAM), such as static RAM (SRAM) or dynamic RAM (DRAM); ROM; magnetic or optical storage medium; flash memory devices; electrical storage devices; optical storage devices; acoustical storage devices; other forms of storage devices for holding information received from transitory (propagated) signals (e.g., carrier waves, infrared signals, digital signals); etc., which are to be distinguished from the non-transitory mediums that may receive information therefrom.
Instructions used to program logic to perform implementations of the disclosure may be stored within a memory in the system, such as DRAM, cache, flash memory, or other storage. Furthermore, the instructions can be distributed via a network or by way of other computer readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), such as, but not limited to, floppy diskettes, optical disks, Compact Disc Read-Only Memory (CD-ROMs), magneto-optical disks, Read-Only Memory (ROMs), Random Access Memory (RAM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), magnetic or optical cards, flash memory, or a tangible, machine-readable storage used in the transmission of information over the Internet via electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Accordingly, the computer-readable medium includes any type of tangible machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
Reference throughout this specification to “one implementation” or “an implementation” means that a particular feature, structure, or characteristic described in connection with the implementation is included in at least one implementation of the present disclosure. Thus, the appearances of the phrases “in one implementation” or “in an implementation” in various places throughout this specification are not necessarily all referring to the same implementation. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more implementations.
In the foregoing specification, a detailed description has been given with reference to specific exemplary implementations. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the disclosure as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense. Furthermore, the foregoing use of implementation and other exemplary language does not necessarily refer to the same implementation or the same example, but may refer to different and distinct implementations, as well as potentially the same implementation.
This application is a national stage application of PCT/US2020/018975, filed Feb. 20, 2020, which claims priority to U.S. provisional application 62/807,852, filed Feb. 20, 2019. The contents of the above-identified applications are hereby incorporated by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/018975 | 2/20/2020 | WO | |

Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2020/172368 | 8/27/2020 | WO | A |
Number | Date | Country | |
---|---|---|---|
20220137925 A1 | May 2022 | US |
Number | Date | Country | |
---|---|---|---|
62807852 | Feb 2019 | US |