1. Field of the Invention
The present invention relates to processor-based implementations of mathematical functions and, more specifically but not exclusively, to coordinate rotation digital computer (CORDIC) processing.
2. Description of the Related Art
This section introduces aspects that may help facilitate a better understanding of the invention. Accordingly, the statements of this section are to be read in this light and are not to be understood as admissions about what is prior art or what is not prior art.
The CORDIC algorithm, also referred to as the digit-by-digit method or Volder's algorithm, is a hardware-efficient iterative bound convergence method for evaluating trigonometric and other mathematical functions. Using the CORDIC algorithm, elementary functions can be realized using only table look-up, shift, and add operations, which makes the CORDIC algorithm very attractive for hardware implementations.
Assuming a vector (Xi,Yi) rotates by an angle αi and arrives at (Xi+1, Yi+1) in the Cartesian plane, the angular rotation can be described by Equations (1) and (2) as follows:
$$X_{i+1} = X_i \cos\alpha_i - Y_i \sin\alpha_i = \cos\alpha_i \,(X_i - Y_i \tan\alpha_i) \tag{1}$$
$$Y_{i+1} = Y_i \cos\alpha_i + X_i \sin\alpha_i = \cos\alpha_i \,(Y_i + X_i \tan\alpha_i) \tag{2}$$
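If the rotation angle is restricted such that tan αi = ±2^(−i), then the multiplication by the tangent can be reduced to a simple shift operation. If, in addition, the common factor cos αi is omitted (a pseudo-rotation), then the magnitude R of the vector grows at each iteration according to Equation (3) as follows:
$$R_{i+1} = R_i / \cos\alpha_i \tag{3}$$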
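As a quick numerical check of Equation (3), the following Python sketch applies one pseudo-rotation (Equations (1) and (2) with the cos αi factor dropped and tan αi = 2^(−i)) and records the measured magnitude gain next to 1/cos αi. The function name `pseudo_rotate` and the sample vector are illustrative assumptions and do not come from the original disclosure.

```python
import math

def pseudo_rotate(x, y, i, d=1):
    """One pseudo-rotation by d*arctan(2**-i), omitting the cos(alpha_i) factor."""
    t = 2.0 ** -i  # tan(alpha_i); in hardware this is a right shift by i bits
    return x - d * y * t, y + d * x * t

x0, y0 = 3.0, 4.0  # arbitrary sample vector (assumption)
gains = []         # pairs of (measured gain, 1/cos(alpha_i))
for i in range(4):
    x1, y1 = pseudo_rotate(x0, y0, i)
    alpha = math.atan(2.0 ** -i)
    measured = math.hypot(x1, y1) / math.hypot(x0, y0)
    gains.append((measured, 1.0 / math.cos(alpha)))
```

Each pair in `gains` should agree to within floating-point rounding, since both quantities equal sqrt(1 + 2^(−2i)).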
With the above analysis, the classical CORDIC functions in each iteration i of circular rotation mode assuming pseudo-rotation are given by Equations (4)-(6) as follows, where Z stands for θ:
$$X_{i+1} = X_i - d_i Y_i 2^{-i} \tag{4}$$
$$Y_{i+1} = Y_i + d_i X_i 2^{-i} \tag{5}$$
$$Z_{i+1} = Z_i - d_i \arctan(2^{-i}), \quad d_i = \operatorname{sign}(Z_i) \in \{-1, 1\} \tag{6}$$
Considering a series of n pseudo-rotations, the vector magnitude increases by a factor of 1/K, where K = Π cos(arctan(2^(−i))), with the product taken over the n iterations i = 0, ..., n−1. The coordinates (Xn,Yn) after this series of n pseudo-rotations are given by:
$$X_n = \frac{1}{K}\,(X_0 \cos\theta - Y_0 \sin\theta) \tag{7}$$
$$Y_n = \frac{1}{K}\,(Y_0 \cos\theta + X_0 \sin\theta) \tag{8}$$
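Equations (7) and (8) can be checked numerically. The sketch below, written in Python with floating-point arithmetic for readability, computes K as the product of the cos(arctan(2^(−i))) terms, applies n pseudo-rotations to an arbitrary start vector, and compares the result with the scaled true rotation. The helper names, the choice n = 32, and the sample inputs are assumptions made for illustration only.

```python
import math

def scale_factor(n):
    """K = product of cos(arctan(2**-i)) for i = 0 .. n-1."""
    k = 1.0
    for i in range(n):
        k *= math.cos(math.atan(2.0 ** -i))
    return k

def pseudo_rotate_series(x, y, theta, n):
    """Apply n pseudo-rotations per Equations (4)-(6); returns (x_n, y_n, z_n)."""
    z = theta
    for i in range(n):
        d = 1 if z >= 0 else -1  # d_i = sign(Z_i), taking sign(0) as +1
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x, y, z

n, theta = 32, 0.5   # sample iteration count and angle in radians (assumptions)
x0, y0 = 2.0, 1.0    # arbitrary start vector (assumption)
k = scale_factor(n)
xn, yn, zn = pseudo_rotate_series(x0, y0, theta, n)

# Right-hand sides of Equations (7) and (8)
x_ref = (x0 * math.cos(theta) - y0 * math.sin(theta)) / k
y_ref = (y0 * math.cos(theta) + x0 * math.sin(theta)) / k
```

For large n, K approaches roughly 0.60725, and (xn, yn) should match (x_ref, y_ref) up to the small residual angle left in zn.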
If the initial values X0 and Y0 are chosen to be K and 0, respectively, and Z0 is initialized to the angle θ, then, after a predetermined set of pseudo-rotations through the angles arctan(2^(−i)), Xn and Yn converge to the cosine and sine of the angle θ, respectively, as Z converges to zero.
In general, to achieve n bits of precision, n pre-computed rotation angles arctan(2^(−i)) are stored in a look-up table (LUT), and n CORDIC iterations of Equations (4)-(6) are performed. High accuracy therefore demands a large number of iterations, and the main challenge is to reduce the number of iterations and thus speed up the computation.
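The LUT-based procedure described above can be sketched in Python using integers as fixed-point values, so that each iteration uses only shifts, adds, and one table look-up. The 30-bit fractional format, the function names, and the use of floating point to pre-compute the table and K are assumptions of this sketch and are not taken from the original disclosure.

```python
import math

FRAC_BITS = 30          # fixed-point fractional bits (assumption)
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0

def build_lut(n):
    """Pre-computed rotation angles arctan(2**-i), i = 0 .. n-1, in fixed point."""
    return [round(math.atan(2.0 ** -i) * ONE) for i in range(n)]

def cordic_sin_cos(theta, n):
    """Rotation-mode CORDIC with n iterations; theta in radians, |theta| <= ~1.74."""
    lut = build_lut(n)
    k = 1.0
    for i in range(n):
        k *= math.cos(math.atan(2.0 ** -i))
    # Initial values: X0 = K, Y0 = 0, Z0 = theta
    x, y, z = round(k * ONE), 0, round(theta * ONE)
    for i in range(n):
        d = 1 if z >= 0 else -1  # d_i = sign(Z_i), taking sign(0) as +1
        # Equations (4)-(6); multiplying by d is an add-or-subtract selection
        x, y, z = x - d * (y >> i), y + d * (x >> i), z - d * lut[i]
    return y / ONE, x / ONE  # (sine, cosine)

sin_24, cos_24 = cordic_sin_cos(math.pi / 6, 24)
sin_8, cos_8 = cordic_sin_cos(math.pi / 6, 8)
```

Comparing the 8-iteration and 24-iteration results against `math.sin` and `math.cos` illustrates how the attainable accuracy is tied to the number of iterations and LUT entries.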
The double-step CORDIC method combines two classical CORDIC angle rotations (iterations) into a single step. It was originally proposed for redundant binary signed digit (BSD) arithmetic as an enhancement over the branching CORDIC method.
In one embodiment, the present invention is a double-step CORDIC processor comprising a plurality of iteration stages connected in series. At least one iteration stage generates a set of outgoing signals from a set of incoming signals. The at least one iteration stage comprises one or more equation blocks, a selector, and a decision block. The one or more equation blocks implement CORDIC equation functions based on the set of incoming signals generated by a previous iteration stage to generate at least a first subset of intermediate signals and a second subset of intermediate signals. The selector selects one of the at least first and second subsets of intermediate signals to be the set of outgoing signals based on a control signal. The decision block generates the control signal for the selector, wherein the decision block operates in parallel with the one or more equation blocks.
Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which like reference numerals identify similar or identical elements.
As represented in
The speed of the overall circuit (comprising n instances of stage 200) is limited by the delay of the adders (usually in the X or Y signal path), which depends on the carry-propagation network. With the use of fast addition schemes such as carry-lookahead or carry-select addition, the delay of the overall circuit will be a logarithmic function of the word length.
As stated earlier, the double-step CORDIC algorithm was originally proposed for redundant binary signed digit (BSD) arithmetic as an enhancement over the branching CORDIC method. Since the control logic of that BSD implementation is extremely complex and results in a large area penalty, the present invention explores the basic idea of the double-step CORDIC algorithm using conventional 2's complement arithmetic.
In a conventional implementation (using 2's complement notation) of the double-step CORDIC algorithm, two iterations of the classical CORDIC algorithm of
$$Z_{i+2} = Z_i - d_i \arctan(2^{-i}) - d_{i+1} \arctan(2^{-(i+1)}), \quad \text{where } d_i = \operatorname{sign}(Z_i),\; d_{i+1} = \operatorname{sign}(Z_{i+1}) \tag{9}$$
Since signal di+1 is required in advance to evaluate Equation (9) in a single step, two computations (“α” and “β”) are performed in parallel for the two possible values of signal di+1 (i.e., for “α”, di+1=di and, for “β”, di+1=−di) as shown in Equations (10) and (11) as follows:
$$Z_{k+1}^{\alpha} = Z_k - d_k \arctan(2^{-2k}) - d_k \arctan(2^{-(2k+1)}) \tag{10}$$
$$Z_{k+1}^{\beta} = Z_k - d_k \arctan(2^{-2k}) + d_k \arctan(2^{-(2k+1)}) \tag{11}$$
Note that each step k in this double-step method is equivalent to two classical iterations.
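To illustrate the equivalence noted above, the following Python sketch evaluates both candidates of Equations (10) and (11) for a step k and compares them with the result of two classical iterations (i = 2k and i = 2k+1) of Equation (6). It uses floating-point values for clarity; the function names are hypothetical and chosen only for this example.

```python
import math

def sign(v):
    """sign(v) in {-1, 1}, taking sign(0) as +1."""
    return 1 if v >= 0 else -1

def double_step_candidates(z, k):
    """Return (z_alpha, z_beta) per Equations (10) and (11) for step k."""
    d = sign(z)
    a0 = math.atan(2.0 ** -(2 * k))
    a1 = math.atan(2.0 ** -(2 * k + 1))
    z_alpha = z - d * a0 - d * a1  # candidate for d_{i+1} = d_i
    z_beta = z - d * a0 + d * a1   # candidate for d_{i+1} = -d_i
    return z_alpha, z_beta

def two_classical_iterations(z, k):
    """Reference: classical iterations i = 2k and i = 2k + 1 of Equation (6)."""
    for i in (2 * k, 2 * k + 1):
        z -= sign(z) * math.atan(2.0 ** -i)
    return z
```

For any input angle, the value returned by `two_classical_iterations` should coincide with exactly one of the two candidates, which is why a subsequent selection step is needed.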
Once signals Zk+1α and Zk+1β are generated, the decision block shown in the following pseudocode is used to determine the correct path.
As represented in
The magnitude comparison in Decision_Block_C( ) after the Zk+1α and Zk+1β signal generation of CORDIC_Equations_C( ) implies a full word-length addition/subtraction in decision block 330. As a result, direct implementation of the double-step CORDIC algorithm using stage 300 will not yield any better performance than two classical CORDIC iterations of stage 200 of
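The dependency described above can be made concrete with a small Python model of the Z signal path, in which both candidates are formed first and a magnitude comparison then selects one of them. This is only an illustrative sketch under the assumption that the decision is the comparison |Zk+1α| < |Zk+1β|; it is not the original Decision_Block_C( ) pseudocode, and the function names are hypothetical.

```python
import math

def sign(v):
    """sign(v) in {-1, 1}, taking sign(0) as +1."""
    return 1 if v >= 0 else -1

def double_step_z_path(theta, steps):
    """Z path of a conventional double-step scheme with a magnitude-based decision."""
    z = theta
    for k in range(steps):
        d = sign(z)
        a0 = math.atan(2.0 ** -(2 * k))
        a1 = math.atan(2.0 ** -(2 * k + 1))
        z_alpha = z - d * a0 - d * a1  # Equation (10)
        z_beta = z - d * a0 + d * a1   # Equation (11)
        # The comparison needs both full-width results, so it cannot start
        # until the candidate additions have finished.
        z = z_alpha if abs(z_alpha) < abs(z_beta) else z_beta
    return z

def classical_z_path(theta, n):
    """Reference: n classical iterations of Equation (6)."""
    z = theta
    for i in range(n):
        z -= sign(z) * math.atan(2.0 ** -i)
    return z
```

For typical inputs, `double_step_z_path(theta, m)` is expected to track `classical_z_path(theta, 2 * m)`, so the step count is halved while the comparison stays serialized behind the candidate additions.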
Double-Step CORDIC Algorithm and Implementation with Decision Postponing
As discussed previously, the magnitude comparison (|Zk+1α|<|Zk+1β|) in decision block 330 of
The following pseudocode can be used to implement the Z signal path; analogous pseudocode would be implemented for the X and Y signal paths.
In
Note that equation blocks 410 and 420 and decision block 430 are executed in parallel. The instance of decision block 430 shown in
Although stage 400 can be used to implement each of the (n/2+1) stages for an implementation of the double-step CORDIC with decision postponing algorithm, the first (i.e., k=0) stage and the last (i.e., k=n/2) stage do not require all of the components of stage 400.
In theory, the double-step CORDIC with decision postponing algorithm can be implemented in the context of any processor-based signal-processing application in which CORDIC-based computations can be performed. These signal-processing applications include, for example, a programmable tone generator used in a read channel for a hard disk drive.
Table I presents experimental results for four different CORDIC implementations. The classical CORDIC implementation with one iteration per stage corresponds to
As can be seen from Table I, the conventional implementation of the double-step CORDIC algorithm does not yield any better performance than two classical CORDIC iterations combined together into a single stage. On the other hand, the double-step CORDIC with decision postponing algorithm significantly reduces the critical path delay for the chosen word length.
The present invention may be implemented as (analog, digital, or a hybrid of both analog and digital) circuit-based processes, including possible implementation as a single integrated circuit (such as an ASIC or an FPGA), a multi-chip module, a single card, or a multi-card circuit pack. As would be apparent to one skilled in the art, various functions of circuit elements may also be implemented as processing blocks in a software program. Such software may be employed in, for example, a digital signal processor, micro-controller, general-purpose computer, or other processor.
The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as magnetic recording media, optical recording media, solid state memory, floppy diskettes, CD-ROMs, hard drives, or any other non-transitory machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, for example, stored in a non-transitory machine-readable storage medium including being loaded into and/or executed by a machine, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
It should be appreciated by those of ordinary skill in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate as if the word “about” or “approximately” preceded the value or range.
It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the scope of the invention as expressed in the following claims.
The use of figure numbers and/or figure reference labels in the claims is intended to identify one or more possible embodiments of the claimed subject matter in order to facilitate the interpretation of the claims. Such use is not to be construed as necessarily limiting the scope of those claims to the embodiments shown in the corresponding figures.
It should be understood that the steps of the exemplary methods set forth herein are not necessarily required to be performed in the order described, and the order of the steps of such methods should be understood to be merely exemplary. Likewise, additional steps may be included in such methods, and certain steps may be omitted or combined, in methods consistent with various embodiments of the present invention.
Although the elements in the following method claims, if any, are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those elements, those elements are not necessarily intended to be limited to being implemented in that particular sequence.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments. The same applies to the term “implementation.”
The embodiments covered by the claims in this application are limited to embodiments that (1) are enabled by this specification and (2) correspond to statutory subject matter. Non-enabled embodiments and embodiments that correspond to non-statutory subject matter are explicitly disclaimed even if they fall within the scope of the claims.