1. Field of the Invention
The invention relates to digital devices for performing rotation of an arbitrary input. In particular, the invention relates to circuits and methods for reducing the complexity and power dissipation in physical circuits that perform angle rotation.
2. Related Art
An angle rotator performs the conceptually straightforward operation of rotating an arbitrary point (X0, Y0) in the X-Y plane, counter-clockwise, about the origin in the plane, through a given angle θ. A digital angle rotator performs such rotations on data points whose coordinate values are specified by digital data words. It can also be insightful to view the rotation operation as a rotation of a complex number X0+jY0 in the complex plane. The digital angle rotator may be used to implement a digital modulator or a digital mixer in a communication system, as well as for implementing angle-rotation operations for other popular signal-processing systems, such as, but not limited to, discrete Fourier transformers and/or trigonometric interpolators to provide some examples.
The digital angle rotator differs significantly from a traditional digital mixer that employs an interconnection of two subsystems: a direct digital frequency synthesizer (DDFS) and a complex multiplier. Crucial insight into the digital angle rotator derives from the observation that the multiplication of an input complex number by a special complex number, one having the form cos θ+j sin θ, is a more specialized operation than simply the multiplication of two arbitrary complex numbers. Namely, this is the multiplication of an arbitrary complex number by a complex number having magnitude one, which makes the complex multiplication a counter-clockwise rotation of an input complex number, about the origin in the complex plane, through the angle θ. The complexity of a conventional angle rotator is approximately the same as that of the multiplier block alone in the traditional digital mixer implementation. Therefore, what is needed is an angle rotator that is reduced in complexity and power dissipation as compared to the conventional prior-art angle rotator.
The present invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the reference number.
The following detailed description of the present invention refers to the accompanying drawings that illustrate exemplary embodiments consistent with this invention. References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment.
Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
Other embodiments are possible, and modifications may be made to the embodiments within the spirit and scope of the invention. Therefore, the detailed description is not meant to limit the invention. Rather, the scope of the invention is defined by the appended claims.
X2=X0 cos θ−Y0 sin θ
Y2=Y0 cos θ+X0 sin θ. (1)
As shown in
X1=X0 cos θM−Y0 sin θM
Y1=Y0 cos θM+X0 sin θM (2)
and
X2=X1 cos θL−Y1 sin θL
Y2=Y1 cos θL+X1 sin θL, (3)
where (2) represents the coarse rotation of the complex input signal 150, and (3) represents the fine rotation of the intermediate complex number 154.
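By way of illustration only, the decomposition in (2) and (3) may be checked numerically. The following Python sketch is not part of the described circuit; the function names rotate and two_stage_rotate and the sample angle values are assumptions introduced here for the example.

```python
import math

def rotate(x, y, theta):
    """Rotate the point (x, y) counter-clockwise about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return x * c - y * s, y * c + x * s

def two_stage_rotate(x0, y0, theta_m, theta_l):
    """Coarse rotation by theta_m as in (2), then fine rotation by theta_l as in (3)."""
    x1, y1 = rotate(x0, y0, theta_m)      # intermediate point (X1, Y1)
    return rotate(x1, y1, theta_l)        # output point (X2, Y2)

# Illustrative split of a rotation angle into a coarse part and a fine part.
theta_m, theta_l = 0.75, 0.044
direct = rotate(1.0, 0.5, theta_m + theta_l)
staged = two_stage_rotate(1.0, 0.5, theta_m, theta_l)
```

With exact sine and cosine values, the staged result agrees with the direct rotation by θM+θL up to floating-point rounding.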
Note that normalized angle values are shown in
In the conventional angle rotator 100, a coarse rotation butterfly circuit 104 performs the coarse rotation of the complex input signal 150 to produce the intermediate complex number 154 having coordinates (X1, Y1) using the sin θl and [cos θl] values from the ROM 102. More specifically, the coarse rotation butterfly circuit 104 calculates:
X1=X0[cos θl]−Y0 sin θl
Y1=Y0[cos θl]+X0 sin θl, (4)
for the given most significant portion
A fine rotation butterfly circuit 106 performs the fine rotation of the intermediate complex number 154 to produce the complex output signal 152. More specifically, the fine rotation butterfly circuit 106 calculates:
X2=X1 cos θL−Y1 sin θL
Y2=Y1 cos θL+X1 sin θL, (5)
for a given least significant portion
As will be understood by those skilled in the relevant art(s) from the teachings provided herein, the angle rotator 200 may be readily implemented in hardware, software, or a combination of hardware and software. For example, based on the teachings provided herein, those skilled in the relevant art(s) could implement the angle rotator 200 via a combination of one or more application specific integrated circuits and a processor core for implementing software commands stored in one or more attached memories and/or a state machine having digital logic devices in integrated or hybrid form to provide some examples. However, these examples are not limiting, and other implementations are within the spirit and scope of the present invention.
The angle rotator 200 rotates a complex input signal 250 having coordinates (X0, Y0) to produce a rotated complex output signal 252. Rotation of the complex input signal 250 in the X-Y plane counterclockwise, around the origin, by the angle θ, results in the rotated complex output signal 252 having coordinates (X2, Y2). The complex output signal 252 is related to the complex input signal 250 as:
X2=X0 cos θ−Y0 sin θ
Y2=Y0 cos θ+X0 sin θ. (6)
The rotation of the complex input signal 250 may be decomposed into two stages: a coarse rotation of the complex input signal 250 by θM followed by a fine rotation of an intermediate complex number 254 by θL. More specifically, the rotation of the complex input signal 250 may be decomposed into:
X1=X0 cos θM−Y0 sin θM
Y1=Y0 cos θM+X0 sin θM (7)
and
X2=X1 cos θL−Y1 sin θL
Y2=Y1 cos θL+X1 sin θL, (8)
where (7) represents the coarse rotation of the complex input signal 250, and (8) represents the fine rotation of the intermediate complex number 254.
The angle θ may be separated into a coarse angle θM and a fine angle θL. In an exemplary embodiment shown in
A coarse rotation butterfly circuit 204 performs the coarse rotation of the complex input signal 250 to produce the intermediate complex number 254 having coordinates (X1, Y1) based upon the coarse angle θM. More specifically, the coarse rotation butterfly circuit 204 generates the intermediate complex number 254 by calculating:
X1=X0 cos θM−Y0 sin θM
Y1=Y0 cos θM+X0 sin θM. (9)
Unlike the conventional coarse rotation butterfly circuit 104, the coarse rotation butterfly circuit 204 implements the multiplications, such as, but not limited to, X0 cos θM, Y0 sin θM, Y0 cos θM, and X0 sin θM, in shift and add/subtract signed-power-of-two (SPT) form rather than simply storing and retrieving multiplier coefficient values using a memory storage device, such as the ROM 102.
The specific cosine and sine values, denoted as (C, S), such as, but not limited to, the cos θM and/or the sin θM, may be expressed in an SPT numeral system. Those skilled in the art will recognize that the specific cosine and sine values (C, S) may also define a point (C, S) in the plane that is within, but close to the boundary of the unit circle without departing from the spirit and scope of the present invention. The specific cosine and sine values (C, S) for any angle θ may be expressed using a positional numbering system, such as, but not limited to, the binary numeral system. In the positional numbering system, specific cosine and sine values (C, S) may be represented by a unique series of digits. Each digit position may be represented by an associated weight of b^i, where b represents the base, or radix of the number system. A general form of specific cosine and sine values (C, S) in such a system is given as:
d_(p−1) d_(p−2) . . . d_1 d_0 • d_(−1) d_(−2) . . . d_(−n), (10)
where there are p digits to the left of a point, called the radix point, and n digits to the right of the point. The value of the specific cosine and sine values (C, S) may be expressed as:
where D represents the summation of each digit multiplied by the corresponding power of the radix. The binary numeral system is a positional numbering system wherein each digit position has a weight of 2^i. For example, for an angle θ≈0.794 radians, the specific cosine and sine values (C, S) may be expressed in the binary numeral system as:
cos θ=0.101101010
sin θ=0.10111, (12)
where the binary representations of cos θ and sin θ have been truncated to nine bits and five bits, respectively.
In a signed positional numeral system, the specific cosine and sine values (C, S) may be represented by a non-unique series of digits. Each digit position may be represented by an associated weight of b^i, where b represents the base, or radix of the number system, and an associated sign. The value of the specific cosine and sine values (C, S) may be expressed as:
where D represents the summation of each digit multiplied by the corresponding power of the radix and the corresponding sign. The SPT numeral system is a signed positional numeral system wherein each digit position has a corresponding weight of 2^i and a corresponding sign. For example, for the angle θ≈0.794 radians, the specific cosine and sine values (C, S) may be expressed in the SPT numeral system as:
cos θ=1.0
sin θ=1.0
where the symbol
Unlike the uniqueness of the positional number system, a representation of specific cosine and sine values (C, S) in the signed positional numeral system is non-unique. In other words, the representation of the specific cosine and sine values (C, S) in the signed positional numeral system may include one or more non-unique series of digits. For example, the binary numeral system representation of a decimal integer 21 is a unique representation 10101. However, the decimal integer 21 may also be represented by one or more non-unique series of digits in the SPT numeral system, such as 1101̄1̄, with an overbar marking a digit of −1 (or 24−3 in decimal), 10
As a special case of the SPT numeral system, the specific cosine and sine values (C, S) may be represented in a Canonic Signed-Digit (CSD) numeral system. In the CSD numeral system, the specific cosine and sine values (C, S) may be represented by a unique series of signed digits. Each digit position may be represented by an associated weight of b^i, where b represents the base, or radix of the number system, and an associated sign. The value of the specific cosine and sine values (C, S) may be expressed as:
where D represents the summation of each digit multiplied by the corresponding power of the radix and the corresponding sign. Conversion of the specific cosine and sine values (C, S) from a representation in the positional number system, such as the binary numeral system, to the CSD numeral system is well known in the art. The CSD form ensures the specific cosine and sine values (C, S) contain the fewest number of non-zero bits. For example, for the angle θ≈0.794 radians, the specific cosine and sine values (C, S) may be expressed in the CSD numeral system as:
cos θ=1.0
sin θ=1.0
where the symbol
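As a non-limiting illustration of the signed-digit representations discussed above, the following Python sketch evaluates a signed-power-of-two digit string and converts a non-negative integer to its CSD form. The function names spt_value and to_csd, and the use of the standard non-adjacent-form recoding, are assumptions made for this example; the source does not prescribe a particular conversion procedure.

```python
from fractions import Fraction

def spt_value(digits, frac_bits=0):
    """Value of a signed-digit string (each digit -1, 0, or +1), most
    significant digit first, with frac_bits digits right of the radix point."""
    total = 0
    for d in digits:
        total = 2 * total + d
    return Fraction(total, 2 ** frac_bits)

def to_csd(n):
    """CSD digits (most significant first) of a non-negative integer n.
    Uses non-adjacent-form recoding, so no two neighboring digits are non-zero."""
    digits = []
    while n:
        if n % 2:
            d = 2 - (n % 4)   # +1 when n mod 4 == 1, -1 when n mod 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits[::-1] or [0]
```

For instance, spt_value([1, 0, 1, 0, 1]) and spt_value([1, 1, 0, -1, -1]) both evaluate to 21, which illustrates the non-uniqueness of the SPT form, while to_csd(21) returns the single canonic form [1, 0, 1, 0, 1]. A fraction truncated to nine bits may be handled by passing its integer numerator to to_csd and frac_bits=9 to spt_value.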
The coarse rotation butterfly circuit 204 employs the SPT numeral system representation of sine and cosine values to implement the multiplications in shift and add/subtract SPT form to calculate the intermediate complex number 254. However, those skilled in the relevant art(s) will recognize that the teachings herein may be equally applied to representations in the CSD numeral system without departing from the spirit and scope of the present invention. From the discussion above, the coarse rotation butterfly circuit 204 calculates the intermediate complex number 254 according to:
X1=X0 cos θM−Y0 sin θM
Y1=Y0 cos θM+X0 sin θM, (17)
which may be expressed as:
where C may represent the cos θM and S may represent the sin θM. Consider an angle θ whose specific cosine and sine values (C, S) may be represented by an n-bit number in the SPT numeral system and a k-bit number in the SPT numeral system, respectively. The specific cosine value C for the angle θ may be represented as C=<c0•c−1c−2 . . . c−n>. Likewise, the specific sine value S for the angle θ may be represented as S=<s0•s−1s−2 . . . s−k>. Thus, for the angle θ, the desired computation for the coarse rotation butterfly circuit 204 rotation of the complex input signal 250 may be expressed as:
where the coefficients C0 through C−n and S0 through S−k represent the magnitude of the coefficients c0 through c−n and s0 through s−k. Using “rtsh(X0, d)” to denote a right-shift of the X0 coordinate of the complex input signal 250 by d bits, which is equivalent to multiplying X0 by 2^−d, the computation of the intermediate complex number 254 may be expressed as:
As an example, consider a specific computation of the coarse rotation butterfly circuit 204 for an angle θ whose specific cosine value C is represented by the 8-bit binary fraction of 0.11100010 and whose specific sine value S is represented by the 5-bit binary fraction of 0.01111. However, this example is not limiting, those skilled in the relevant art(s) will recognize that this example is for illustrative purposes only. From the discussion above, the binary representations for the specific cosine and sine values (C, S) may be expressed in the SPT numeral system as C=1.00
Once again, using “rtsh(X0, d)” to denote the right-shift of the X0 coordinate of the complex input signal 250 by d bits, the computation of the intermediate complex number 254 may be expressed as:
The representations of approximations to specific cosine and sine values (C, S) for the angle θ may be implemented using a signed positional numeral system, such as the SPT numeral system, or the CSD numeral system to facilitate the use of multiplexers in the coarse-rotation butterfly circuit 204. The non-uniqueness feature of the SPT numeral system allows (20) and/or (22) to be adjusted such that certain specific pairs of shifting operations will never occur in any expression for specific (C, S) values in the SPT numeral system. For example, the specific (C, S) values in (22) may be adjusted such that at most one of rtsh(X0,2), rtsh(X0,3), and rtsh(X0,6) and/or at most one of Y0, rtsh(Y0,5), rtsh(Y0,6) is necessary to produce the X1 coordinate of the intermediate complex number 254. In other words, the specific (C, S) values representing cosine and sine for the angle θ may be adjusted such that no specific value C in the coarse-rotation butterfly circuit 204 will have more than one of its fractional bits 2, 3, and 6 nonzero, and similarly, no specific sine value S in the coarse-rotation butterfly circuit 204 will have more than one of its bits 0, 5 and 6 nonzero.
The non-uniqueness feature of the SPT numeral system also allows (20) and/or (22) to be adjusted such that certain specific pairs of coefficients will never occur in any expression of the specific (C, S) values in the SPT numeral system. For example, the coefficients C0 through C−8 and S0 through S−6 in (20) may be adjusted such that at most one of C−2, C−3, and C−6 will be non-zero, at most one of C−4 and C−5 will be non-zero, and at most one of C−7 and C−8 will be non-zero. Also, at most one of S−1 and S−2 will be non-zero, at most one of S−3 and S−4 will be non-zero, and at most one of S0, S−5 and S−6 will be non-zero.
From the discussion above, the computation of the X1 coordinate and the Y1 coordinate of the intermediate complex number 254 may involve a right shift of the X0 coordinate of the complex input signal 250 by one or more bits and a right shift of the Y0 coordinate of the complex input signal 250 by one or more bits. The right shifting of the X0 coordinate may occur simultaneously with and/or separately from the right shifting of the Y0 coordinate.
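For purposes of illustration only, the shift and add/subtract computation may be sketched in Python for the example values C=0.11100010 and S=0.01111 given above. The signed-power-of-two forms used here, C=1−2^−3+2^−7 and S=2^−1−2^−5, are one possible choice worked out for this example and are not asserted to be the forms used in the original expressions. The helper rtsh is modeled as an exact division by a power of two, which is an idealization; a hardware shifter would truncate the low-order bits.

```python
from fractions import Fraction

def rtsh(v, d):
    """Idealized right shift by d bits: multiply v by 2**-d without truncation."""
    return Fraction(v) / 2 ** d

def coarse_rotate_spt(x0, y0):
    """Coarse rotation with C = 1 - 2**-3 + 2**-7 and S = 2**-1 - 2**-5,
    computed with shifts, additions, and subtractions only."""
    cx = x0 - rtsh(x0, 3) + rtsh(x0, 7)   # C * X0
    cy = y0 - rtsh(y0, 3) + rtsh(y0, 7)   # C * Y0
    sx = rtsh(x0, 1) - rtsh(x0, 5)        # S * X0
    sy = rtsh(y0, 1) - rtsh(y0, 5)        # S * Y0
    return cx - sy, cy + sx               # (X1, Y1)
```

In this sketch, each product requires at most two add/subtract operations, in place of a general-purpose multiplier fed by stored coefficient values.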
Likewise, the “hard-wired” method of shifting may shift A0 by two bits to produce <b0b0b0 . . . bn−4bn−3bn−2>, denoted as A2 in
Similarly, the “hard-wired” method of shifting may shift the data stream A0 by k bits to produce a bit sequence <b0b0b0 . . . bn−(k+2)bn−(k+1)bn−k>, denoted as Ak, in
Alternatively, one or more shift registers or any other suitable method of shifting that is capable of right shifting the X0 coordinate by one or more bits and right shifting the Y0 coordinate of the complex input signal 250 by one or more bits, such as an algorithm implemented by a software routine to provide an example, may be used to right shift the X0 coordinate and the Y0 coordinate of the complex number.
The operation of the coarse rotation butterfly circuit 204 may be explained using an angle θ whose specific cosine value C may be represented by an 8-bit number in the SPT numeral system and whose specific sine value S may be represented by a 6-bit number in the SPT numeral system, such that:
However, this example is not limiting; those skilled in the relevant art(s) will recognize that (23) is solely used to illustrate the operation of one embodiment of the coarse rotation butterfly circuit 204. For example, those skilled in the relevant art(s) may implement (23) differently in accordance with the teachings herein without departing from the spirit and scope of the present invention.
As shown in
The multi-addition term generator 502 includes a shifting module 510. The shifting module 510 shifts the X0 coordinate of the complex input signal 250 according to the methods of shifting discussed in
The multi-addition term generator 502 additionally includes AND gates 514.1 through 514.3 to produce the corresponding multi-addition terms p2 through p4 according to the corresponding control bits z2 through z4. The multi-addition terms p2 through p4 are the subset of shifted digital signals X2 through X8 that have been selected for combination. For example, the AND gate 514.1 produces the multi-addition term p4 based upon the control bit z4. However, those skilled in the relevant art(s) will recognize that the use of AND gates is not required; one or more suitable logic gates may be used to provide a means of causing any unnecessary digits to be set to zero without departing from the spirit and scope of the present invention.
The coarse rotation butterfly circuit 204 includes an adder module 516 to combine the multi-addition terms p1 through p4 from the multi-addition term generator 502 and the multi-addition terms p5 through p7 from the multi-addition term generator 504 and an adder module 520 to combine the multi-addition terms q1 through q4 from the multi-addition term generator 506 and the multi-addition terms q5 through q7 from the multi-addition term generator 508. The adder module 520 operates in a substantially similar manner as adder module 516, as will be apparent to those skilled in the relevant art(s), and therefore will not be described in further detail herein.
The adder module 516 performs a conditional negation or inversion of the multi-addition terms p2 through p7 based upon a corresponding control bit s2 through s7. In other words, the determination of whether a corresponding multi-addition term p2 through p7 is to be added to and/or subtracted from the X1 coordinate of the intermediate complex number 254 is based upon the corresponding control bit s2 through s7. For example, CSA 518.1 adds and/or subtracts the multi-addition term p2 based upon control bit s2, adds and/or subtracts the multi-addition term p3 based upon control bit s3, and adds and/or subtracts the multi-addition term p7 based upon control bit s7.
As shown in
The coarse rotation butterfly circuit 204 includes an adder module 616 to combine the multi-addition terms p1 through p4 from the multi-addition term generator 602 and the multi-addition terms p5 through p7 from the multi-addition term generator 604 and an adder module 620 to combine the multi-addition terms q1 through q4 from the multi-addition term generator 606 and the multi-addition terms q5 through q7 from the multi-addition term generator 608. As will be apparent to those skilled in the relevant art(s), the adder module 620 operates in a substantially similar manner as the adder module 616, and therefore will not be described in further detail herein.
Referring back to
Referring to flowchart 650, in step 652, a digital input signal is received representing an input complex number having a first coordinate and a second coordinate. In step 654, a representation of a rotation angle value is received having a coarse angle value and a fine angle value. In step 656, the input complex number is rotated based on the coarse angle value to generate an intermediate complex number. For example, the coarse stage butterfly circuit 204 rotates the input complex number 250 according to a normalized rotation angle
Referring to flowchart 6C, in step 658, a plurality of shifted digital signals are generated based on the digital input signal. In step 660, a plurality of control bits are retrieved from a memory device based on the coarse angle value. For example, the memory device could be a read-only-memory (ROM), or other type of device. In step 662, a subset of the shifted digital signals is selectively combined based on the control bits retrieved from the memory device, to produce at least one coordinate of an intermediate complex number.
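The three steps just described (generating shifted signals, retrieving control bits, and selectively combining a subset of the shifted signals) may be sketched as follows. This Python fragment is a simplified model under stated assumptions: the table CONTROL_ROM, its two entries, and the function names are hypothetical and are introduced only for this example; shifts are modeled as exact divisions by powers of two.

```python
from fractions import Fraction

# Hypothetical control memory: coarse-angle index -> signed digits of C and S,
# stored as {shift amount: sign}. The entries are illustrative only.
CONTROL_ROM = {
    0: {"C": {0: +1}, "S": {}},                            # C = 1, S = 0
    1: {"C": {0: +1, 3: -1, 7: +1}, "S": {1: +1, 5: -1}},  # C = 0.8828125, S = 0.46875
}

def shifted_signals(v, max_shift=8):
    """Generate v right-shifted by 0..max_shift bits (idealized, no truncation)."""
    return [Fraction(v) / 2 ** k for k in range(max_shift + 1)]

def combine(shifted, controls):
    """Add or subtract only the shifted signals selected by the control bits."""
    return sum((sign * shifted[k] for k, sign in controls.items()), Fraction(0))

def coarse_stage(x0, y0, coarse_index):
    """Produce (X1, Y1) from (X0, Y0) for the selected coarse angle."""
    ctl = CONTROL_ROM[coarse_index]          # retrieve control bits
    xs, ys = shifted_signals(x0), shifted_signals(y0)
    x1 = combine(xs, ctl["C"]) - combine(ys, ctl["S"])
    y1 = combine(ys, ctl["C"]) + combine(xs, ctl["S"])
    return x1, y1
```

In this model the memory holds only selection and sign information rather than multiplier coefficient values, which mirrors the distinction drawn above between the coarse rotation butterfly circuit 204 and a coefficient look-up approach.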
In one embodiment, the first coordinate of said digital input signal includes a plurality of bit positions numbered 1-n having a corresponding number of bits. In step 664, the plurality of bits are right-shifted by a predetermined number of bit positions to generate the first shifted digital signal of the plurality of shifted digital signals. In step 668, the step of right-shifting is repeated to generate each of the plurality of shifted digital signals. For example, as
In step 670, a subset of shifted digital signals is selected from said plurality of shifted digital signals using one or more multiplexers that are controlled by corresponding control bits of said plurality of control bits. For example, in
In step 672, at least one pair of the selected shifted digital signals are added or subtracted to produce an intermediate complex number. For example, returning to
From the discussion of
It may be also useful to scale the single bit sequence S by a scaling factor to produce a scaled single bit sequence T having the bit sequence <T1T2T3 . . . T>. For example, as shown in
A CORDIC-type angle rotation performs one or more sub-rotations of the X0 coordinate and the Y0 coordinate of the complex input signal 250 to produce the X1 coordinate and the Y1 coordinate of the intermediate complex number 254. More specifically, the coarse rotation butterfly circuit 204 may include rotation stages 802.1 to 802.6 to rotate the X0 coordinate and the Y0 coordinate of the complex input signal 250 by six sub-rotations to produce the X1 coordinate and the Y1 coordinate of the intermediate complex number 254. In an exemplary embodiment, the coarse rotation butterfly circuit 204 includes six rotation stages 802.1 through 802.6. However, this example is not limiting; those skilled in the relevant art(s) will recognize that the teachings herein may be used to implement the coarse rotation butterfly circuit 204 using any suitable number of rotation stages without departing from the spirit and scope of the present invention.
An input-output relationship for a corresponding rotation stage, such as the rotation stage 802.2, may be expressed as:
where the ± and ∓ operators in (25) determine whether a rotation to be performed by the corresponding rotation stage is clockwise or counter-clockwise. For example, the six rotation stages 802.1 through 802.6 may be expressed as:
However, in an exemplary embodiment as shown in
Because the determinant of (25)-(27) will exceed one, each rotation stage will provide a magnitude scaling operation in addition to a rotation. The scaling operation will increase the magnitude of the rotated vector. However, the magnitude scaling amount is invariant regardless of whether the rotation to be performed by any rotation stage is clockwise or counter-clockwise.
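The direction-independent scaling may be illustrated with a short Python sketch. The stage form assumed here, in which the output X equals X∓2^−k·Y and the output Y equals Y±2^−k·X, is the customary CORDIC sub-rotation and is an assumption of this example; the function names are likewise introduced only for illustration.

```python
import math

def cordic_stage(x, y, k, counter_clockwise=True):
    """One sub-rotation by arctan(2**-k) in the selected direction."""
    sigma = 1 if counter_clockwise else -1
    t = 2.0 ** -k
    return x - sigma * t * y, y + sigma * t * x

def stage_gain(k):
    """Magnitude scaling of one stage: sqrt(1 + 2**(-2k)), the square root
    of the determinant of the stage matrix, regardless of direction."""
    return math.sqrt(1.0 + 2.0 ** (-2 * k))
```

Under this assumption the determinant of each stage matrix is 1+2^−2k, which exceeds one for every k, and the same gain results whether the sub-rotation is clockwise or counter-clockwise.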
Referring to
wherein Xout,1 represents an X coordinate of the first rotation stage expressed in carry-save notation and Yout,1 represents a Y coordinate of the first rotation stage expressed in carry-save notation.
From (28), both the Xout,1 coordinate and the Yout,1 coordinate of the first rotation stage are expressed in carry-save notation. Any subsequent rotation stage, such as the rotation stages 802.2 through 802.6, requires more carry-save adders. For example, the rotation stage 802.2 requires four carry-save additions: one carry-save addition for each summation component, for a total of two carry-save additions, and one additional carry-save addition for each carry component, for a total of four carry-save additions. For example, computing the Xout coordinate for the subsequent stages involves computing
Xtemp=Xa+Xb−rtsh(Ya,k), (32)
followed by
Xout=Xatemp+Xbtemp−rtsh(Yb,k). (33)
The subscript a and the subscript b denote the two words, such as Xatemp and Xbtemp whose sum represents a value of a carry-save number, such as Xtemp.
An alternate embodiment may obtain a greater computational advantage by keeping the system's computations expressed in terms of the original data (X and Y) for several rotation stages. For example, if the rotation stages 802.1 and 802.2 perform
X1=X−rtsh(Y,p) X2=X1−rtsh(Y1,q)
and,
Y1=Y+rtsh(X,p) Y2=Y1+rtsh(X1,q) (34)
respectively, only four CSA computations are used, which may be determined from the product of the two rotation matrices, namely:
X1=X−rtsh(Y,p)−rtsh(Y,q)
Y1=Y+rtsh(X,p)+rtsh(X,q)
X2=Xa1+Xb1−rtsh(rtsh(X,p),q)
Y2=Ya1+Yb1−rtsh(rtsh(Y,p),q). (35)
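The equivalence of (34) and (35) may be confirmed with the following Python sketch, offered for illustration only. The shifts are modeled as exact divisions by powers of two, and the carry-save representation is not modeled; the function names two_stages and merged_stages are assumptions of this example.

```python
from fractions import Fraction

def rtsh(v, d):
    """Idealized right shift by d bits: multiply v by 2**-d without truncation."""
    return Fraction(v) / 2 ** d

def two_stages(x, y, p, q):
    """Two successive sub-rotations, as in (34)."""
    x1 = x - rtsh(y, p)
    y1 = y + rtsh(x, p)
    return x1 - rtsh(y1, q), y1 + rtsh(x1, q)

def merged_stages(x, y, p, q):
    """The same result written in terms of the original X and Y, as in (35)."""
    x1 = x - rtsh(y, p) - rtsh(y, q)
    y1 = y + rtsh(x, p) + rtsh(x, q)
    return x1 - rtsh(rtsh(x, p), q), y1 - rtsh(rtsh(y, p), q)
```

Because every term of the merged form is a shifted copy of the original X or Y, no term depends on the completed result of an earlier stage. With truncating hardware shifts, the two forms may differ in their low-order bits.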
In a further exemplary embodiment, the fixed rotation stage, such as the rotation stage 802.1, and the first two variable-direction rotation stages, such as the rotation stages 802.2 and 802.3, may be expressed as:
However, this example is not limiting; those skilled in the relevant art(s) will recognize that any suitable number of variable-direction rotation stages may be expressed in a similar manner without departing from the spirit and scope of the present invention. The expression in (37) provides a choice of one of four cases, each involving a single matrix:
All four entries differ in (38), from case to case, but all may be represented in the form:
where each entry in (38) has the same determinant of P^2+Q^2 and the matrices of (38) all scale an input vector by exactly the same magnitude scaling factor, namely √(P^2+Q^2)≈1.1614, where any one of the entries in (38) may be used to compute the magnitude scaling operation.
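The common scaling factor may be reproduced with a short Python sketch. The shift amounts of 1, 2, and 3 bits for the three stages are an assumption of this example, chosen because they yield the stated factor of approximately 1.1614; the function name combined_matrix and the matrix form [[P, −Q], [Q, P]] are likewise introduced here for illustration.

```python
import math
from fractions import Fraction
from itertools import product

def combined_matrix(sigma2, sigma3, shifts=(1, 2, 3)):
    """Return (P, Q) such that the product of one fixed stage and two
    variable-direction stages equals [[P, -Q], [Q, P]].
    sigma2 and sigma3 are +1 or -1 and select each variable stage's direction."""
    p, q = Fraction(1), Fraction(0)
    for sigma, k in zip((1, sigma2, sigma3), shifts):
        t = Fraction(sigma, 2 ** k)
        p, q = p - t * q, q + t * p     # multiply by the stage [[1, -t], [t, 1]]
    return p, q

# The four direction choices for the two variable-direction stages.
cases = {s: combined_matrix(*s) for s in product((1, -1), repeat=2)}
gain = math.sqrt(float(cases[(1, 1)][0] ** 2 + cases[(1, 1)][1] ** 2))
```

Under these assumptions the four (P, Q) pairs are all different, yet each satisfies P^2+Q^2=5525/4096, so a single scaling correction serves all four cases.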
Expressing each entry in (38) in the SPT numeral system may result in:
which may readily be implemented as the addition/subtraction of signed-power-of-two (SPT) form, as will be apparent to those skilled in the relevant art(s). However, this example is not limiting; those skilled in the relevant art(s) will recognize that any suitable number of fixed rotation stages, variable-direction rotation stages, and/or combinations of fixed rotation stages and variable-direction rotation stages may be expressed in a similar manner without departing from the spirit and scope of the present invention.
The CORDIC-type angle rotation implementation for the coarse rotation butterfly circuit 204 shown in
Referring back to
The memory storage device 202 generates control information for the coarse stage butterfly circuit 204 and the fine stage butterfly circuit 206 based upon the normalized coarse angle
Recalling from the discussion above, the normalized angle
Since the coarse stage actually performs a rotation by the angle arctan(S/C) instead of the desired rotation of θM+offset, the fine stage may be used to compensate for the coarse-stage angular-rotation error. Thus, the rotation angle for the fine stage becomes (θM+offset)−arctan(S/C)+(π/4)
One more constraint arises as a result of the fine-stage angle (θM+offset)−arctan(S/C)+(π/4)
0≦(θM+offset)−arctan(S/C)+(π/4)
which holds for all values of
L<(4/π)( 1/16)−
and therefore, using one extreme
1/16≦(4/π)( 1/16)−
and using the other extreme
0≦
Thus,
0≦
Therefore, for each coarse-stage segment M, the C and S values may be chosen such that
−(1−π/4)( 1/16)+θM+offset≦arctan(S/C)≦θM+offset.
These constraints specify a relatively narrow interval below the angle θM+offset within which the angle arctan(S/C) would have to be located. Namely,
(θM+offset)−0.0134≦arctan(S/C)≦(θM+offset).
To satisfy such constraints would typically require many bits in the specification of C and S, which would tend to reduce computational efficiency in hardware implementations. The specifications for C and S that derive from the coarse stage constraints of Fu 1 are, for example, a five-bit S value and an eight-bit C value. It is improbable that such values could be found that satisfy the above inequality constraints. A modified approach is next developed.
The modified approach begins by handling two cases separately: the case of large fine-stage rotation angles and the case of small fine-stage rotation angles. These two cases may be designated by the value of the first (high-order) bit of the normalized fine-stage angle
(θM+offset)+(π/4)
is easily satisfied.
Case 1, (
Here,
(θM+offset)+(π/4)(2^−4−2^−13)−1/16<arctan(S/C)≦(θM+offset)+(π/4)(1/32) (44)
These inequalities provide a reasonably large interval around θM+offset within which one might expect to find a suitable value for arctan(S/C).
Case 2, (
Here,
(θM+offset)+(π/4)(2^−5−2^−13)−1/16<arctan(S/C)≦(θM+offset).
Again, these inequalities yield a reasonably large interval below θM+offset within which one might expect to find a suitable value for arctan(S/C).
The process explained above shows how to determine upper and lower limits on the angle arctan(S/C) associated with each of the M coarse-stage angular sectors. Each of the coarse-stage rotations actually employs sines and cosines of this rotation angle to perform the rotation computations. For this reason, two values S and C that represent these sine and cosine values, respectively, are of concern. The two values S and C also define a point (C, S) in the plane that is within, but close to the boundary of the unit circle.
For each coarse-stage segment M, one of the principal concerns in determining the appropriate values of C and S is the following. When performing a coarse-stage rotation of a point (X0, Y0) by the angle arctan(S/C), to obtain the rotated output of the coarse stage (X1, Y1), using the mapping:
the magnitude scaling of the vector [X0 Y0]^T must not be excessive. In other words, the magnitudes of the two vectors [X1 Y1]^T and [X0 Y0]^T may only differ by a scale factor that is suitably close to 1.
It is well known that when C=cos θ and S=sin θ, for any angle θ, the above mapping becomes a pure rotation and, as such, the magnitudes of both vectors [X1 Y1]^T and [X0 Y0]^T will be equal; that is, the mapping's magnitude scaling factor is then exactly 1. This fact can also be derived directly by computing the Euclidean norm of each vector (i.e., ∥[X Y]^T∥=√(X^2+Y^2)) and using the fact that, for any angle θ, sin^2 θ+cos^2 θ=1.
When C and S do not satisfy C=cos θ and S=sin θ exactly (for some angle θ), then the mapping will impose a magnitude scaling as well as a rotation. The determinant of the mapping's matrix (i.e., C^2+S^2) can indicate whether the magnitude scaling factor is greater than one (when C^2+S^2>1) or less than one (when C^2+S^2<1). When a rotation by an angle θM is desired, and θl is the angle for which sin θl=S=[sin θM], with [sin θM] an approximation of sin θM ([sin θM] being, in fact, a truncated and then rounded-up version of sin θM), and if C=[cos θl] is a truncated version of cos θl, then S^2+C^2=sin^2 θl+[cos θl]^2<1, and hence the mapping's scale factor is less than 1. By choosing a value of θl such that sin θl is a 5-bit approximation to sin θM and [cos θl] is an 8-bit truncation of cos θl, the mapping's magnitude scaling factor can be sufficiently close to 1 that it is possible to correct the scaling error by multiplying the coordinates of the rotated vector by 1+δ, where δ is the binary fraction 0.00000000xxxx; that is, each rotated vector coordinate can be scaled by performing a multiplication of the coordinate by a four-bit number xxxx and shifting the resulting product and adding it to the coordinate; this is a reasonably efficient scaling correction process. Similar efficiency may be achieved by an appropriate choice of C and S values. This will require C^2+S^2≈1, a matter that will now be made more precise.
Let the point p=(C, S) be a point in the plane that is inside the unit circle, but close to it. For example, let the point p lie outside the circle of radius r=1/(1+2−8), centered at the origin. Then, by multiplying the coordinates of p by a factor f to put the resulting point on the unit circle, f will be closer to 1 than the f value that scales the lower limit point pr=1/(1+2−8). Let f=1+δr, then f×pr=(1+δr)(1/(1+2−8)=1 implies δr=2−8=24×2−12. That is, for any point within the two concentric circles, the scaling factor that moves the point to the unit circle is 1+δ where δ<δr, that is 0.00000000xxxx. Those skilled in the relevant art(s) will recognize that a similar procedure could be followed with other values of r, leading to δ values that are smaller than, or larger than, 2−8 without departing from the spirit and scope of the invention.
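The scaling correction described above can be illustrated with a minimal sketch. It is not the circuit of the invention; the function names `delta_nibble` and `rescale`, the example pair (S=8/32, C=247/256), and the choice to truncate (rather than round) δ to twelve fractional bits are all assumptions made for illustration.

```python
import math

def delta_nibble(c, s):
    """Return the 4-bit integer xxxx such that delta ~= xxxx * 2**-12.

    Assumes (c, s) lies strictly outside the circle of radius
    1/(1 + 2**-8) and on or inside the unit circle.
    delta is truncated, not rounded (an illustrative choice).
    """
    r = math.hypot(c, s)
    if not (1.0 / (1.0 + 2.0**-8) < r <= 1.0):
        raise ValueError("point is outside the assumed bounding region")
    delta = 1.0 / r - 1.0               # the exact factor would be 1 + delta
    return math.floor(delta * 2**12)    # bits 9..12 of the binary fraction

def rescale(coord, nibble):
    """Shift-and-add correction: coord * (1 + nibble * 2**-12)."""
    return coord + (coord * nibble) / 2**12

# Hypothetical coarse-stage pair: 5-bit sine, 8-bit truncated cosine.
S = 8 / 32
C = math.floor(math.sqrt(1 - S * S) * 256) / 256
n = delta_nibble(C, S)
print(n, math.hypot(rescale(C, n), rescale(S, n)))
```

Because δ is truncated in this sketch, the corrected magnitude falls short of 1 by less than 2⁻¹².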
The above discussion has shown how one can define 16 slightly curved rectangles (“boxes”), for each of the two possibilities (Case 1 and Case 2) of normalized fine-stage rotation words, within which a suitable point (C, S) may be found. The boxes shown in
The radial width of each box is 1−1/(1+2⁻⁸)≈2⁻⁸, hence, due to the slanting orientation of each box, if a horizontal X-row intersects a box, it is virtually a certainty that one X (but usually just one X) lies inside the box.
From inequalities (44), it is evident that the angular length of a box is
(θM+offset)+(π/4)(1/32)−(θM+offset)−(π/4)(2⁻⁴−2⁻¹³)+1/16≈(π/128)−(π/64)+1/16=1/16−π/128≈0.038
while the distance between X-rows is the smaller amount 2⁻⁵=0.03125. These facts make it virtually inevitable that each box will be intersected by at least one (and possibly two) X-rows. As the box orientations tend to have more shallow slopes near the top of the first octant, it follows that their vertical height then becomes shorter (≈0.038/√2=0.027 in the limit) but the vertical height of all boxes is still close enough to 0.03125 that it is quite reasonable to expect intersections with X-rows, and they do, indeed, occur for all M sectors, as is shown in
A similar situation exists for the boxes that are created for Case 2 angles, as shown in
A complete set of C and S pairs, with two such pairs corresponding to each of the M=16 sectors [one pair for each of the Case 1 and Case 2 possibilities] can be generated in the manner described above. These values can be used to generate arctan(S/C) values, which can be employed in the construction of ROM tables containing θM−θm values, in the spirit of the ROM 102, where now the content of each ROM entry is, according to inequalities (41), (θM+offset)−arctan(S/C), this value representing the difference between the desired coarse-stage rotation value and the actual coarse-stage rotation value. Upon the adding of this ROM value to the (π/4)
For each of the M sectors it is also possible to store delta values (δ) in a ROM and they can be retrieved during the operation of the mixer; these are used to direct the magnitude-scaling operation that compensates for the small amount of coarse-stage magnitude reduction that is produced by the circuits performing the coarse-stage rotations by the arctan(S/C) angles.
It is evident that a radial bounding region of similar type to that discussed in the previous section could be located just outside the unit circle, rather than just inside it. Doing so would lead to S and C values for which the coarse stage rotation would slightly lengthen the rotated vectors rather than slightly shortening them. This would require a compensating scaling procedure that shortens, rather than lengthens, the coarse stage output vectors. Following a procedure similar to that described previously, we can, for example, consider a point p=(C, S) in the plane that lies outside the unit circle, but close to it. Namely, let p lie inside the circle of radius r=1+2⁻⁸ centered at the origin. Then, if we multiply the components of p by a factor f to put the resulting point on the unit circle, f will be closer to 1 than the f value that scales the upper limit point pr=1+2⁻⁸. Let f=1−δr; then f×pr=(1−δr)(1+2⁻⁸)=1 implies δr≈2⁻⁸=2⁴×2⁻¹². That is, for any point within the two concentric circles, the scaling factor that moves the point to the unit circle is 1−δ where δ=0.00000000xxxx. (Again, other values of r can be employed.) By continuing to follow this kind of development, patterned after the discussion of the previous section, it will be clear to those of ordinary skill in the art that a similar method of determining suitable locations for points (C, S) can be obtained. It will also be clear that in the overall system, compensating scaling circuitry can be employed that is similar to that used for the previous (C, S) points, except subtractions of “delta” values could be used rather than additions.
In addition, it will be clear to those of ordinary skill in the art that it is possible to determine radial bounding regions of the sort we have been describing that extend both inside and outside the unit circle, but remain close to it. It may happen that this leads to scaling compensation circuitry that must consider both up-scaling and down-scaling of the length of the rotated vector. This could entail circuitry capable of conditionally adding or subtracting, in the processing of δ values. The number of nonzero bits employed in the representation of the δ values could differ from the four bits employed in the preceding discussion. It will also be evident to one of ordinary skill in the art that, by creating such bounds, we may increase the horizontal width of the boxes discussed in the preceding section, which could provide additional intersections between the points on the horizontal X-rows and the boxes, leading to an expansion in the possible options available for the overall system design, in the spirit of the techniques discussed elsewhere in this patent—techniques described in the next section, for example.
The bit sequence representing the specific cosine value C and/or the specific sine value S may be extended by one or more bits to create interstitial rows to ensure that all boxes in
Referring to
Since one radian corresponds to approximately 57.3 degrees, which is greater than the π/4 value that defines the upper limit of an interval of angles of interest, it is evident that the larger angles in this set would never be accessed by the unnormalized address word θM, i.e., by the four most-significant bits of the angle θ=(π/4)×
The angle rotation performed by coarse stage 204 discussed above may alter the magnitude scaling of its X1, Y1 outputs because the coefficient values employed to approximate cos θ and sin θ do not exactly satisfy cos² θ+sin² θ=1. This is discussed in detail in Fu I and Fu, “Efficient Synchronization Architectures for Multimedia Communications,” Ph.D. dissertation, Univ. California, Los Angeles, 2000 (Fu II), each of which is incorporated by reference in its entirety. Thus, in addition to providing a small further rotation according to the fine angle, the fine stage corrects for small coarse-stage angular rotation imprecisions and re-scales the output values to correct for slight magnitude scaling errors caused by the coarse stage.
Fine stage 1506 may include a magnitude scaling module 1510 and an angle-rotation module 1520. Magnitude scaling module 1510 is configured to correct for scaling errors introduced by coarse stage 204. Magnitude scaling module 1510 is discussed in detail in Section 3.3 below. Angle-rotation module 1520 corrects for small coarse-stage angular rotation imprecisions. Angle-rotation module 1520 is discussed in Section 3.2 below. As illustrated in
As illustrated in
where the matrix element sin φ is an approximation to the sine of the angle through which the fine stage must rotate the input vector [Xin Yin]T. That approximate value is, according to Fu I, sin φ≈θl, where θl is generated at the top of
The magnitude-scaling operation that the fine stage 106 of the system of
where the matrix element 1+δ[cos θ
is employed.
In computing the products indicated by each entry in this matrix, the error analysis, described in Fu I, justifies the omission of the terms that result from multiplying two elements that are both small; more specifically, the product terms δ[cos θ
Another way to organize the same fine-stage computations is described mathematically as follows. The fine stage angle-rotation matrix of (45) is written in detail, using sin φ≈θl and cos φ≈(1−θl²/2), and a diagonal matrix containing the approximations used for the cos φ terms is factored out. The following equation results:
The error made in approximating θl/(1−θl²/2) by the much simpler expression θl is on the order of θl³/2. Thus, this approximation is used to obtain:
The magnitude-scaling matrix for the fine-stage computations is applied by multiplying the matrix product shown above by 1+δ[cos θ
and, when multiplying the two diagonal matrices, the same omission of the small term δ[cos θ
As discussed above, the diagonal matrices that were factored out of the fine-stage matrix could have been located on either side of the remaining matrix (because they are more special than just “diagonal”—they are a multiple of the identity matrix). Thus, the same types of magnitude-scaling circuitry could be employed at the output of the fine-stage (e.g., as illustrated in
Equation (52) exactly describes the revised fine-stage architecture shown in
As described above, the cosine multipliers, along with their added magnitude scaling feature, are factored-out from the fine stage. These computations can be re-introduced either at the input to the remaining fine-stage circuitry or at the output of the remaining fine-stage circuitry. In the example embodiment of
The magnitude-scaling module 1610 employs the same multiplication and addition operations as those appearing in the corresponding circuitry of
Performing a multiplication by a four-bit coefficient (as feeds the 4×5 multipliers in
Angle-rotation module 1620 can be structured in various ways. Fundamentally, it is a block having a pair of inputs that can be considered the coordinates of a point in the X-Y plane, and an input specifying an angle θl (in radians) by which the point is to be rotated to arrive at a new point in the X-Y plane whose coordinates are the pair of output values X2, Y2. Moreover, θl is a sufficiently small angle that the approximation sin θl≈θl applies. The function required by this angle-rotation block is similar to the function required in various publications in the current literature and, depending upon the mixer's application, one or another of the existing techniques may be the most preferable approach. Many publications have discussed the use of the CORDIC algorithm for this task. For more details on the use of the CORDIC algorithm for this task, see J. Volder, “The CORDIC Trigonometric Computing Technique,” IEEE Trans. Computers, vol. EC-8, pp. 330-334, September 1959 (Volder), J. Vankka and K. Halonen, Direct Digital Synthesizers: Theory, Design, Applications. Boston: Kluwer, 2001 (Vankka), Y. Ahn, et al., “VLSI Design of a CORDIC-based Derotator,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 2, May 1998, pp. 449-452 (Ahn), D. DeCaro, N. Petra, and A. G. M. Strollo, “A 380 MHz, 150 mW Direct Digital Synthesizer/Mixer in 0.25 μm CMOS,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, February 2006, pp. 258-259 (DeCaro I), and D. DeCaro, N. Petra, and A. G. M. Strollo, “A 380 MHz Direct Digital Synthesizer/Mixer with Hybrid CORDIC Architecture in 0.25-μm CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 1, pp. 151-160, January 2007 (DeCaro II), each of which is incorporated by reference in its entirety.
In addition, a modified CORDIC approach has been proposed. For more details on the modified CORDIC approach, see A. Madisetti, “VLSI Architectures and IC Implementations for Bandwidth Efficient Communications,” Ph.D. dissertation, Univ. California, Los Angeles, 1996 (Madisetti I) and A. Madisetti and A. Y. Kwentus, “Method and Apparatus for Direct Digital Frequency Synthesizer,” U.S. Pat. No. 5,737,253, issued Apr. 7, 1998 (Madisetti II), each of which is incorporated by reference in its entirety.
Additionally, angle-rotation may also be performed using the minority-select technique described in U.S. Patent Publication No. 2006/0167962, entitled “Method and Apparatus for Improved Direct Digital Frequency Synthesizer,” filed Jan. 26, 2005, which is incorporated herein by reference in its entirety. In some of these solutions (e.g., Madisetti, minority-select), it may be required to incorporate an additional (conditional) rotation of the (X1, Y1) point. Alternatively, this conditional rotation could occur at various locations in the datapath, i.e., at the block's input (or even before that), or at the block's output, or at some internal point along the datapath.
The fine-stage datapath can be constructed as a sequence of subrotations, with each rotation controlled by certain information provided by the rotation angle θl of
3.2.1.1 Minority Select with Conditional Rotation
Initial conditional rotation stage 1780 receives the input coordinates (X1, Y1) and at least a portion of the bit sequence b1b2b3 . . . bn representing θl and performs an initial conditional rotation of the coordinates. In an embodiment, the initial conditional rotation is the maximum fine rotation. In this embodiment, the output of the initial conditional rotation stage 1780 is over-rotated. The subsequent minority select stages then must subtract rotation (i.e., perform a clockwise rotation).
It is clear that the complete system of
Returning to
Multiplexer 1790 determines whether to provide the over-rotated input coordinates or the unrotated input coordinates to minority select stages 1792 based on the value of the minority bit controlling the multiplexer. The output of multiplexer 1790 is represented as intermediate coordinates, Xint, Yint. For example, if the minority bit indicates that “0” bit is in the minority, multiplexer 1790 outputs the over-rotated input coordinates from initial conditional rotation stage 1780. Alternatively, if the minority bit indicates that “1” bit is in the minority, multiplexer 1790 outputs the unrotated input coordinates.
Minority select stages 1792 receive the intermediate coordinates and either rotate the coordinates clockwise (subtracting rotations) or counter clockwise (adding rotations) based on the shift and zero signals received from minority bit detector 1770. For example, if the intermediate coordinates are the over-rotated input coordinates, minority select stages 1792 are used to subtract rotation (i.e., rotate intermediate coordinates in the clockwise direction). For information on minority select stages, see U.S. Patent Publication 2006/0167962.
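The selection logic of the minority-select arrangement can be modeled in the angle domain with a minimal sketch. It works on integer angle units rather than coordinates, the function name `minority_select_plan` is hypothetical, and the handling of a tie between the counts of ones and zeros (treated here as "1" in the minority) is an assumption.

```python
def minority_select_plan(bits):
    """Plan the rotations for a fine angle given as a list of 0/1 bits, MSB first.

    Returns (initial_rotation, steps), both in units of the LSB weight.
    Positive steps are counter-clockwise, negative steps are clockwise.
    """
    n = len(bits)
    weights = [1 << (n - 1 - i) for i in range(n)]
    ones = sum(bits)
    zeros = n - ones
    if zeros < ones:
        # "0" is in the minority: start from the over-rotated point
        # (maximum fine rotation) and subtract once per zero bit.
        initial = (1 << n) - 1
        steps = [-w for b, w in zip(bits, weights) if b == 0]
    else:
        # "1" is in the minority (or a tie): start unrotated and
        # add once per one bit.
        initial = 0
        steps = [w for b, w in zip(bits, weights) if b == 1]
    return initial, steps

initial, steps = minority_select_plan([0, 1, 1])
print(initial, steps)   # net rotation is initial + sum(steps)
```

In this model the number of steps never exceeds half the word length, which is the saving the minority-select approach is after.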
3.2.1.2 Minority Select with Offset
If a two-stage mixer is designed such that the coarse-stage rotation can efficiently offer a choice of two outputs, one differing from the other by a built-in offset rotation of a fixed angular amount that is appropriate for supplying the conditional rotation needed by a minority-select fine stage, then this feature can lead to an attractive overall system. An alternative, when it is apparent that the cost of generating a choice of two coarse-stage outputs would be excessive, is a system wherein, for each data point processed, the specific minority-select fine-stage angle rotation requirements are made apparent prior to performing (or prior to completing) the coarse-stage rotation. Then, for each data point processed, rather than providing a choice of two output pairs (Xout1, Yout1), (Xout2, Yout2), the coarse stage need only provide the one output pair (Xout, Yout) that is most appropriate for improving the fine stage's computational efficiency via “minority select.”
In this embodiment, the fine stage of the two-stage mixer includes a minority-select rotation feature for a range of consecutive bits within θl, as described previously; most typically this range would be all bits of the angle θl. Rather than explicitly providing a conditional rotation for this minority-select fine stage system by, for example, the use of the circuit in
Consider, for simplicity, the example situation where there are just three bits (B=3) in the fine stage angle θl and a minority-select fine-stage rotation involving a single addition of a scaled Y value to the X value is desired (and, of course, a similar computation for generating a rotated Y output). When ‘one’ bits are in the minority, just one addition is needed. When ‘zero’ bits are in the minority, however, the minority-select method requires the performance of the conditional rotation by 111 and then the processing of the minority-zero bit. For example, to provide the rotation 011 a conditional (counter-clockwise) rotation of 111 would be “built-in” and the single fine stage (clockwise) rotation by 100 then performed. However, when adding 111 to whatever overall rotation is otherwise required, if the result is that the fine-stage rotation angle were to change from 011 (three) to 1010 (ten=three+seven) then, assuming the leading (high-order) bit of 1010 (i.e., “eight”) becomes a part of whatever angle-rotation processing precedes the minority-select fine stage, it would be necessary for the fine stage to interpret the new three-bit pattern 010 as if it were 101, which, while correctly compensating, by rotating (clockwise) for a counter-clockwise rotation by 8 (because 3=8−5), would not be a rotation driven by a single ‘one’ bit. Moreover, when this process is employed on other minority-zero patterns, a similar situation results. In general:
011=>1010=>8−(101)
110=>1101=>8−(010)
101=>1100=>8−(011)
111=>1110=>8−(001)
which does not always yield the desired simple fine-stage processing.
The problem being encountered here is the need to always provide a clockwise rotation of 001 which compensates for the offset rotation by 8, which is larger than the 7 that is intended. Assuming the larger “eight” offset can be implemented at “no cost,” the larger “eight” offset can be used along with the original three-bit pattern to indicate the “zero-bit driven” clockwise-rotation processing. Additionally, another “clean-up” rotation of −1 is then required (an additional one-LSB subtraction must also be performed in all cases where “0” is the minority bit).
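The arithmetic behind the "eight" offset and the one-LSB clean-up can be checked with a short sketch. It models angles as integers in LSB units; the function name `offset_minority_select` and the tie rule (no offset when the counts are equal) are assumptions made for illustration.

```python
def offset_minority_select(word, nbits=3):
    """Return (offset, steps, cleanup) in LSB units for an nbits-wide fine angle.

    When "0" is the minority bit, the offset 2**nbits is assumed to be
    absorbed upstream at no cost, each zero bit of the ORIGINAL pattern
    drives one clockwise step, and a final clockwise step of one LSB
    compensates for using 2**nbits instead of 2**nbits - 1.
    """
    ones = bin(word).count("1")
    zeros = nbits - ones
    if zeros < ones:
        offset = 1 << nbits
        steps = [-(1 << i) for i in range(nbits) if not (word >> i) & 1]
        cleanup = -1
    else:
        offset = 0
        steps = [(1 << i) for i in range(nbits) if (word >> i) & 1]
        cleanup = 0
    return offset, steps, cleanup

for w in range(8):
    off, st, cl = offset_minority_select(w)
    print(format(w, "03b"), off, st, cl, off + sum(st) + cl)
```

For the pattern 011, for instance, the sketch yields an offset of 8, one clockwise step of 4, and the clean-up of 1, for a net rotation of 3.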
As described above, minority bit detector 1970 receives the input angle θl bit sequence. Minority bit detector 1970 then determines whether the minority of bits in the bit sequence are “1” or “0” bits. Minority bit detector 1970 generates a minority bit signal 1972 indicating which bit is in the minority. In an embodiment, the minority bit signal 1972 is used to determine whether a final rotation should be applied to the output of the minority select stages. Minority bit detector 1970 also generates the shift and zero signals 1974 for the minority-select stages 1992.
As described above, minority select stages 1992 receive the input coordinates (Xin, Yin) and either rotate the coordinates clockwise (subtracting rotations), counter clockwise (adding rotations) or perform no rotation, based on the shift and zero signals 1974 received from minority bit detector 1970.
Final rotation stage 1994 is configured to rotate the output of the minority select stages clockwise by a predetermined number (e.g., 1) to compensate for the rotation error introduced by the offset. In an embodiment, final rotation stage 1994 receives the minority select bit 1972 and the offset angle 1976. The final rotation stage 1994 then provides the final rotation, based on the offset angle, only when “0” bit is in the minority. As would be appreciated by persons of skill in the art, various mechanisms for performing this selection, including, but not limited to multiplexers, could be used in the present invention.
In summary, this minority-select implementation encounters a cost of one additional subtraction, which could be an acceptable cost in the overall system, particularly if the string of minority-select bits is long—longer than the three bits used in this simple example. Table 5 shows a tally of the cost for various B-bit lengths for the minority-select method. The right-most column shows other data for comparison purposes.
The issue of whether or not the “carry bit” (e.g., the “eight” in the example we've been considering), can be included for free in previous angle rotation processing must also be addressed. If, for example, the overall system consisted of a coarse-stage rotation driven by a radian-dimensioned address word, and if the “carry bit” simply incremented the coarse-rotation sector by one, then such a system could provide the sort of “free” inclusion of the carry bit that is desirable for the above-described type of system.
While the minority-select fine-stage architecture provides a good means of improving the fine-stage efficiency of the
As illustrated in
In an embodiment, the radian-valued angle is a nine-bit word and each subrotation module 2022 receives a 3-bit group of the nine-bit angle word. As demonstrated below, it is useful to group all rotation bit positions into three-bit groups and to rotate (or not rotate) once for the entire group in either a clockwise or a counter-clockwise direction. The discussion below considers one such three-bit group, and examines how many additions, on average, are required to perform its complete three-bit rotation operation.
Table 1 shows each of the eight possible values that the group's three bits can represent. As shown in Column 3 of Table 1, except for two cases (3 and 5) whose rotation requirements are indicated by asterisks (***), each of the other six cases requires no more than a single add/subtract operation. Also, among those six cases, in two cases (6 and 7) a presumption is made that it is possible to perform the angle-rotation associated with a current three-bit group prior to the next higher-order (i.e., more significant) three-bit group, thereby making it possible to carry a “one” into that next higher-order group before its processing begins. In this regard, when such intergroup carries are being employed, a system must be prepared to process an incoming carry bit. This can lead to the possibility of one more line being required in Table 1, which deals with a “three-bit value” of 8. Here, one more case exists where a carry-out must be passed on to the next higher-order 3-bit group.
For each of the cases in Table 1 where the asterisks appear, two rotations could be employed (avoiding the use of any “minority-select” methods here). In fact, the case of 5 is treated as representing 8−3, and if, similar to the method discussed in the previous paragraph, a carry into the next higher-order group is used to handle the “8” part, for both *** cases it suffices to be able to multiply the input being added or subtracted by “three” prior to the add/subtract. Therefore, the three-times X value is simply computed, which costs one additional add (to get the 3X value, a one-bit-shifted copy of the input X value is added to the input X value itself). Notice that, when the rotation stages are associated with “single-bit angles” that represent sufficiently small angles, only a single three-times value is required, and it can be employed repeatedly, whenever needed, in any of the three-bit groups. Thus, assuming such sufficiently small angles are involved in the complete set of these rotations, there is just the single penalty of one extra add operation that must be paid to facilitate the complete processing of all of the three-bit groups. Note that, of course, a rotated Y value must also be computed; hence a three-times Y value is needed as well. With this in mind, the values appearing in the fourth column of Table 1 are recognized. In the processing of any three-bit group, when the “one add/subtract” corresponding to a case of 3 or 5 is performed, the precomputed three-times value is simply used.
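The three-bit-group processing with inter-group carries can be sketched as a digit recoding. Table 1 is not reproduced here, so the mapping below (values 0 through 4 used directly, values 5 through 8 treated as 8 minus a remainder with a carry into the next group) is an assumption chosen to be consistent with the text; the function name `recode_groups` is hypothetical.

```python
def recode_groups(word, ngroups=3):
    """Recode a (3*ngroups)-bit angle word into signed digits, LS group first.

    Each digit lies in -3..4, so its magnitude is 0, 1, 2, 3 or 4:
    a shifted copy of the input, or a shifted copy of the precomputed
    three-times value. Returns (digits, carry_out).
    """
    digits = []
    carry = 0
    for g in range(ngroups):
        v = ((word >> (3 * g)) & 0b111) + carry   # 0..8 with incoming carry
        if v >= 5:
            digits.append(v - 8)   # e.g. 5 is handled as 8 - 3
            carry = 1
        else:
            digits.append(v)
            carry = 0
    return digits, carry

digits, carry_out = recode_groups(0b101_011_111)
print(digits, carry_out)
```

Each group contributes at most one add/subtract, and the single extra add that forms the three-times value gives the total of N+1 for N groups.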
A benefit of this development is that the new three-bit-group processing method provides a means of performing all angle rotations at a cost of no more than N+1 add/subtract operations, where N is the number of three-bit groups. To appreciate the method's efficiency, it is instructive to compare its maximum number of add/subtracts with those of the minority-select method, for various numbers of three-bit groups.
Table 2 illustrates that whenever the number of three-bit groups is larger than N=1, no more total add/subtracts are required by the non-minority-select method than are required by the minority-select method. Moreover, the non-minority-select method has fewer add/subtracts whenever N>3 (i.e., whenever there are more than 9 total stages in the overall rotations that are being performed by minority-select or non-minority-select means). Again, however, this add/subtract number depends on keeping the set of angle bits being processed by these methods restricted to those where Madisetti's lookahead processing would apply. That is, the represented angles must be sufficiently small.
The excess-threes offset technique is a further variation on the above-mentioned non-minority-select method that avoids the necessity to do the “inter-group carry” processing described above. Suppose the overall angle rotation that is specified for the fine subrotation module includes, for each 3-bit group, an extra rotation by 3×2−t
Notice that, in addition to the simplifying of the processing of each angle rotation by removing the need to handle inter-group carry propagations in some manner, a further advantage over the minority-select method is achieved. That is, there is no need for the conditional rotations.
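The excess-threes interpretation can be sketched as follows. The sketch assumes the built-in extra rotation of three units per group is supplied elsewhere in the system, so that the fine stage receives a word that already includes the offset; the names `excess_threes_digits` and `total_offset` are hypothetical, and angles are modeled as integers in LSB units.

```python
OFFSET_DIGIT = 3   # "excess-threes": each group is read as (group value - 3)

def excess_threes_digits(excess_word, ngroups=3):
    """Signed digit for each 3-bit group of the offset-bearing word, LS first."""
    return [((excess_word >> (3 * g)) & 0b111) - OFFSET_DIGIT
            for g in range(ngroups)]

def total_offset(ngroups=3):
    """Net offset, in LSB units, contributed by three units in every group."""
    return sum(OFFSET_DIGIT * 8**g for g in range(ngroups))

word = 0b110_000_011
digits = excess_threes_digits(word)
net = sum(d * 8**g for g, d in enumerate(digits))
print(digits, net, word - total_offset())
```

No carries pass between groups in this scheme: every group is decoded independently into a digit between −3 and 4.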
An alternative embodiment of the previously described excess-threes non-minority-select method employs an excess-fours offset instead of excess-threes. Table 4 illustrates an exemplary excess-fours offset technique.
It is perhaps not immediately obvious that any systematic advantage exists by which one could determine whether one should choose the excess-threes offset method or the excess-fours offset method. A comparison of Table 3 and Table 4 indicates comparable computational complexity for both. There is, however, one somewhat subtle observation that could lead a person of skill in the art to utilize the “excess-fours” technique. Notice that, unlike Table 3, it would be possible to add a ninth row (call it group value 8) to Table 4, wherein a binary value of 8 could be accommodated by a single addition of 4. Thus, the “treat as” column of Table 4 would then span the range −4 to 4, not just −4 to 3. This 4 value could easily be processed by a single add/subtract, just like the processing of the other eight rows.
One practical use of this “excess-fours” feature relates to its facilitation of an efficient means of rounding the binary value being represented by the bits of the word from which the three-bit groups are taken. For the least significant one of these three-bit groups, if the excess-fours method is being used, one can include one additional LSB, which would bring in the requirement that that three-bit group must be prepared to represent values spanning 0 through 8. But since this representation is translated into the range “−4 through 4” by the excess-fours feature, it is quite possible to accommodate this extension with essentially no additional computational cost. There will be one add/subtract operation for processing the least-significant three-bit group whether or not that group is extended into a four-bit group as an implementation of the rounding of the additional fourth bit.
Returning to
A subrotation module 2022 may have an associated fine stage magnitude scaling module 2028. In an embodiment, only the first subrotation module 2022A has an associated magnitude scaling module 2028 because the subsequent fine-stage subrotations employ sufficiently small angles. That is, undesired magnitude scaling associated with each of these n-bit groups (of the radian-valued angles) is substantially smaller than that of the first group. In this embodiment, magnitude-scaling compensation for the subsequent fine-stage subrotations is ignored.
The fine stage magnitude scaling module 2028 corrects for the by-product of the process of doing each of these n-bit subrotations—production of undesired magnitude enhancement. The size of this magnitude enhancement can be assessed as follows.
Suppose a rotation of a vector [x y]T by an angle α is desired. Then the following matrix multiplication produces this rotation exactly (i.e., with no magnitude scaling):
Suppose, however, the angle α is a sufficiently small angle that the approximations sin α≈α, cos α≈1, and tan α≈α apply. Then, the simpler rotation matrix
can be used instead of the original matrix. This simpler matrix is also used in the angle rotation block of
More precisely, however, the simpler matrix actually provides a rotation by the angle β=arctan α. In other words, the exact relationship between a pure rotation and the mapping produced by the simpler matrix above, involves the pure rotation by β radians. It is expressed by:
Clearly, when the simpler matrix (54) is used in place of the original matrix of (53), an angular rotation error is introduced, in that the rotation is by β, not α, and a magnitude scaling error is introduced in the rotated vector, where the scaling factor is 1/cos β=√(1+α²).
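The two errors named above can be checked numerically with a minimal sketch. It assumes the simpler matrix has rows [1, −α] and [α, 1], which is the counter-clockwise sign convention; the function name `simple_rotate` and the test angle 2⁻⁷ are illustrative choices.

```python
import math

def simple_rotate(x, y, alpha):
    """Apply the small-angle matrix [[1, -alpha], [alpha, 1]] to (x, y)."""
    return x - alpha * y, alpha * x + y

alpha = 2.0**-7
x2, y2 = simple_rotate(1.0, 0.0, alpha)

angle = math.atan2(y2, x2)   # actual rotation, expected arctan(alpha)
gain = math.hypot(x2, y2)    # actual scaling, expected sqrt(1 + alpha**2)

print(angle, math.atan(alpha))
print(gain, math.sqrt(1 + alpha * alpha))
```

The rotation falls slightly short of α and the output vector is slightly longer than the input, as the text states.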
In a first fine-stage “excess-threes” rotation, described below (according to Table 3), the rotation is by the small angle α and the scaling factor that must be used to compensate for the magnitude enhancement is 1+μ=1/√(1+α²), from which μ≈−α²/2 can easily be computed. If the rotation angles used in the first subrotation are, for example, 0, 1, 2, 3, and 4 times the amount α=±2⁻⁷, then the corresponding μ amounts of 0, −2⁻¹⁵, −2⁻¹³, −(9/8)2⁻¹²≈−2⁻¹², and −2⁻¹¹ result.
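The μ amounts listed above follow from μ≈−α²/2 with α equal to k×2⁻⁷. A short sketch using exact rational arithmetic reproduces them; the function name `mu_approx` is hypothetical.

```python
from fractions import Fraction
import math

def mu_approx(k, alpha_unit=Fraction(1, 2**7)):
    """Second-order approximation mu ~= -alpha**2 / 2 for alpha = k * alpha_unit."""
    a = k * alpha_unit
    return -a * a / 2

for k in range(5):
    exact = 1 / math.sqrt(1 + float(k * Fraction(1, 2**7)) ** 2) - 1
    print(k, mu_approx(k), float(mu_approx(k)), exact)
```

For k=3 the exact approximation is −9×2⁻¹⁵, which the text rounds to the single shift −2⁻¹².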
Each subrotation module 2122 includes two 4-to-1 multiplexers 2134a and 2134b. The outputs of the two 4-to-1 multiplexers 2134 are fed as input to a 2-to-1 multiplexer 2136. The output of the 2-to-1 multiplexer is the subrotation value for that subrotation stage. Each subrotation module 2122 receives a three-bit group of the nine-bit excess angle. The subrotation module must then interpret the three-bit group as three less than the binary representation of the bit group. For example, subrotation module 2122₁ receives three-bit group b1b2b3, subrotation module 2122₂ receives three-bit group b4b5b6, and subrotation module 2122₃ receives three-bit group b7b8b9. The bits in each three-bit group are used to control the multiplexers in their associated subrotation module.
For example, in subrotation module 2122₁ of
Rotation circuit 2126 includes three adder circuits 2142. In an embodiment, adder circuits 2142 are carry-save adders (CSAs). Adder 2142a is configured to rotate an input coordinate by the subrotation value generated by subrotation module 2122₁. Adder 2142b receives the output from adder 2142a and the subrotation value generated by subrotation module 2122₂. Thus, adder 2142b rotates an input coordinate by the additional subrotation value generated by subrotation module 2122₂. Similarly, adder 2142c receives the output from adder 2142b and the subrotation value generated by subrotation module 2122₃. Adder 2142c rotates an input coordinate by the additional subrotation value from subrotation module 2122₃. The output of adder 2142c is a coordinate (e.g., X2 or Y2) of the rotated complex number.
Note that in
Fine stage magnitude scaling module 2128 is configured to provide magnitude scaling for the fine-stage angle rotation. Magnitude scaling module 2128 includes two 4-to-1 multiplexers 2154a and 2154b. The outputs of the two 4-to-1 multiplexers 2154 are fed as input to a 2-to-1 multiplexer 2156. Multiplexers 2154a and 2154b are controlled by bits b2b3 of the three-bit group associated with subrotation module 21221. Multiplexer 2156 is controlled by bit b1 of the three-bit group.
Multiplexer 2154a receives input coordinate X1 shifted by eleven at the 3 (1 1) input, input coordinate X1 shifted by twelve at the 2 (1 0) input, input coordinate X1 shifted by thirteen at the 1 (0 1) input, and input coordinate X1 shifted by fifteen at the 0 (0 0) input. Multiplexer 2154b receives a zero at the 3 (1 1) input, input coordinate X1 shifted by fifteen at the 2 (1 0) input, input coordinate X1 shifted by thirteen at the 1 (0 1) input, and input coordinate X1 shifted by twelve at the 0 (0 0) input. Multiplexer 2156 receives the output of multiplexer 2154a at the 1 input and the output of multiplexer 2154b at the 0 input. The output of multiplexer 2156 is inverted and fed as an input to adder 2142a of the rotation circuit 2126.
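By way of illustration only, the selection performed by multiplexers 2154a, 2154b, and 2156 can be summarized as a lookup on the three-bit group b1b2b3. In the following Python sketch, the function name `fine_scale_term` and the modeling of each shift as an integer right shift are assumptions made for the example; the inversion applied before adder 2142a is not modeled.

```python
# Shift amounts wired to multiplexers 2154a and 2154b, indexed by the
# two-bit select value b2b3. None stands for the hard-wired zero input.
MUX_2154A = {3: 11, 2: 12, 1: 13, 0: 15}    # selected when b1 = 1
MUX_2154B = {3: None, 2: 15, 1: 13, 0: 12}  # selected when b1 = 0

def fine_scale_term(x1, b1, b2, b3):
    """Return the shifted X1 value chosen by multiplexer 2156 (before inversion)."""
    select = (b2 << 1) | b3
    shift = (MUX_2154A if b1 else MUX_2154B)[select]
    return 0 if shift is None else x1 >> shift

x1 = 1 << 15
terms = [fine_scale_term(x1, g >> 2, (g >> 1) & 1, g & 1) for g in range(8)]
print(terms)  # [8, 4, 1, 0, 1, 4, 8, 16]
```

Listed in order of increasing group value, the selected terms are symmetric about the group value 3, which is the group that an excess-three interpretation reads as zero.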
As would be appreciated by persons of skill in the art, various modifications to the circuits of
In step 2210, one or more input coordinates and at least a portion of the fine rotation angle are received. In an embodiment, the received input coordinate(s) are the output coordinates of a coarse rotation stage.
In step 2220, an initial rotation of the input coordinate(s) by an initial rotation angle is performed. In an embodiment, the initial rotation angle is the maximum possible fine rotation. That is, the initial rotation angle includes the sum of all full angles corresponding to the final rotation stages. As would be appreciated by persons of skill in the art, other initial rotation angles could be used with the present invention.
In step 2230, the bit value in the minority in the received fine rotation angle is determined. In an embodiment, this determination is made in minority bit detector 1770. Minority bit detector 1770 also generates a minority select bit signal 1772 which indicates which bit is in the minority in the received fine rotation angle.
In step 2240, either the initially rotated input coordinate(s) or the unrotated input coordinate(s) are selected based on the value of the minority select bit signal 1772. In an embodiment, the minority select bit signal 1772 controls one or more multiplexers which then output the selected coordinate(s) for processing by the minority select stages 1792. For example, if bit ‘0’ is in the minority, the rotated input coordinate(s) are output.
In step 2250, a determination is made for each minority select stage whether a positive (counter clockwise), negative (clockwise), or no rotation is required based on the value of the minority select bit and the bit value of the fine rotation angle corresponding to the minority select stage. In an embodiment, minority bit detector 1770 transmits a shift and zero signal 1774 to each minority select stage 1792.
For example, if the ‘0’ bit is in the minority, for a ‘1’ bit in the fine rotation angle, no rotation is needed because the rotation has already been performed in the initial rotation stage 1780. However, for a ‘0’ bit in the fine rotation angle, a negative (clockwise) rotation is needed. If the ‘1’ bit is in the minority, for a ‘1’ bit in the fine rotation angle, a positive (counter clockwise) rotation is needed. For a ‘0’ bit in the fine rotation angle, no rotation is required.
In step 2260, the minority select stages 1792 perform the necessary rotations to produce at least one rotated input coordinate.
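The decisions made in steps 2220 through 2260 can be sketched in software as follows. The sketch is illustrative only and assumes that each stage contributes a fixed angle when its bit is ‘1’; the function name `minority_select_plan`, the example stage angles, and the handling of a tie between the two bit values are assumptions, not details taken from the embodiments.

```python
def minority_select_plan(angle_bits, stage_angles):
    """Return (initial_rotation, per_stage_rotations) for one fine angle.

    angle_bits   -- list of 0/1 values, one per minority select stage
    stage_angles -- rotation angle contributed by each stage when its bit is 1
    """
    ones = sum(angle_bits)
    zeros = len(angle_bits) - ones
    # Assumption: a tie is resolved by treating '1' as the minority bit.
    zero_is_minority = zeros < ones

    # If '0' is in the minority, start from the maximum fine rotation.
    initial = sum(stage_angles) if zero_is_minority else 0.0

    rotations = []
    for bit, angle in zip(angle_bits, stage_angles):
        if zero_is_minority:
            # '1' bits are already covered by the initial rotation.
            rotations.append(-angle if bit == 0 else 0.0)
        else:
            rotations.append(angle if bit == 1 else 0.0)
    return initial, rotations

bits = [1, 1, 0, 1, 1, 1, 0, 1]
angles = [2.0 ** -(k + 8) for k in range(len(bits))]  # hypothetical stage angles
initial, rotations = minority_select_plan(bits, angles)
total = initial + sum(rotations)
active = sum(1 for r in rotations if r != 0.0)
print(active)  # 2
```

In this example only two of the eight stages perform a rotation, while the total rotation still equals the sum of the stage angles whose bits are ‘1’.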
In step 2310, one or more input coordinates and at least a portion of the fine rotation angle are received. Additionally, final rotation stage 1994 receives the offset angle 1976. In an embodiment, the received input coordinate(s) are the output coordinates of a coarse rotation stage.
In step 2320, the bit value in the minority in the received fine rotation angle is determined. In an embodiment, this determination is made in minority bit detector 1970. Minority bit detector 1970 also generates a minority select bit signal 1972 which indicates which bit is in the minority in the received fine rotation angle.
In step 2330, a determination is made for each minority select stage whether a positive (counter clockwise), negative (clockwise), or no rotation is required based on the value of the minority select bit and on the bit value of the fine rotation angle corresponding to the minority select stage. In an embodiment, minority bit detector 1970 transmits a shift and zero signal 1974 to each minority select stage 1992A-N.
In step 2340, the minority select stages 1992A-N perform the necessary rotations to produce at least one temporary rotated input coordinate.
In step 2350, a determination is made whether ‘0’ is the minority bit. If ‘0’ is the minority bit, operation proceeds to step 2360. If ‘1’ is the minority bit, operation proceeds to step 2370.
In step 2360, final rotation stage 1994 performs a final rotation by the offset angle to produce at least one rotated output coordinate.
In step 2370, no final rotation is performed. In this situation, the output from minority select stages 1992A-N is the rotated output of the fine stage.
In step 2410, the fine-stage selects one of the input coordinates as the first term. In an embodiment, the selection is based on data retrieved from memory (not shown). Step 2410 is optional. Step 2410 is typically present in embodiments requiring only a single rotated output coordinate such as the embodiments of
In step 2415, a fine-stage subrotation module receives at least one coordinate of the input complex number. In fine-stage rotation, the input complex number is the intermediate complex number generated by the coarse stage.
In step 2420, the fine-stage subrotation module receives at least one coordinate of the scaled input complex number. As described above, the scaled input complex number is scaled by an integer, m. In an embodiment, the input complex number (e.g., intermediate complex number from coarse stage) is scaled by three.
In step 2425, the fine-stage subrotation module receives an n-bit group from the bit sequence representing an excess fine rotation angle. The excess fine rotation angle is greater than the representation of the input angle by a predetermined number. For example, in an excess three embodiment, each n-bit group is three greater than the value of the corresponding n-bit group of the input fine rotation angle. In an excess four embodiment, each n-bit group is four greater than the value of the corresponding n-bit group of the input fine rotation angle.
In step 2430, a plurality of shifted input signals based on the input coordinate are generated. For example, in the first subrotation module 21221 of
In step 2435, at least one shifted input signal based on the scaled input coordinate is generated. For example, in the first subrotation module 21221 of
In step 2440, the subrotation module interprets the bit-group from the excess fine rotation angle. For example, in an excess three embodiment, the subrotation module interprets the bit-group as three smaller than its binary representation. In an excess four embodiment, the subrotation module interprets the bit-group as four smaller than its binary representation.
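As a non-limiting illustration of the interpretation performed in step 2440, the following Python sketch splits a nine-bit excess-three angle into three-bit groups and reads each group as three less than its binary value. The function names and the recombination of the signed digits by positional weight are assumptions introduced for the example.

```python
def decode_excess_groups(excess_angle, n_bits=3, n_groups=3, excess=3):
    """Split an excess-coded angle into signed digits, most significant first."""
    mask = (1 << n_bits) - 1
    digits = []
    for i in range(n_groups):
        shift = n_bits * (n_groups - 1 - i)
        group = (excess_angle >> shift) & mask
        digits.append(group - excess)  # interpret as 'excess' less than binary
    return digits

def digits_to_value(digits, n_bits=3):
    """Recombine signed digits using the positional weight of each group."""
    value = 0
    for d in digits:
        value = (value << n_bits) + d
    return value

digits = decode_excess_groups(0b101_011_000)
print(digits)                   # [2, 0, -3]
print(digits_to_value(digits))  # 125
```

Each digit lies between -3 and 4, so every subrotation module handles a small signed range, and the recombined value equals the excess-coded word minus the constant 0b011011011.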
In step 2445, the subrotation module uses the interpreted excess angle bit-group to produce a subrotation value for the subrotation module. For example, the bits in the bit-group are used to control one or more multiplexers in the subrotation module. As illustrated in subrotation module 2122 of
Steps 2415-2445 are performed for each subrotation module in the fine stage. The performance does not necessarily occur sequentially. For example, a subrotation module may be performing one of the steps 2415-2445 while another subrotation module is performing another of the steps (or the same step of) 2415-2445. That is, multiple subrotation modules may be performing some of the steps 2415-2445 in parallel.
Accordingly, in step 2450, a determination is made whether the subrotation module has completed steps 2415-2445. When at least one subrotation module has completed steps 2415-2445, operation proceeds to step 2455.
In step 2455, a fine-stage magnitude scaling factor is generated for one or more of the subrotation modules. Step 2455 is described in further detail in
In step 2460, at least one input coordinate is scaled by the fine-stage magnitude scaling factor generated in step 2455 and rotated by the subrotation values generated by each subrotation module. Note that step 2460 may begin after the first subrotation module has completed steps 2415-2445 but prior to completion of those steps by the remaining subrotation modules. For example, in the rotation circuit 2126 of
In step 2510, a plurality of shifted signals based on an input coordinate are generated. For example, in the Ydatapath processing circuit depicted in
In step 2520, a sequence of control bits is received from ROM 202. In an embodiment, a first set of the control bits controls a set of input multiplexers and a second set of the control bits controls a final selection multiplexer.
In step 2530, the magnitude scaling value is generated as the output of the final selection multiplexer. That is, the magnitude scaling circuit outputs 0 or one of the shifted input coordinate values.
The magnitude scaling value is then either added to the input coordinate datapath, as illustrated in
As described in detail above, coarse stage 204 introduces magnitude-scaling errors which require compensation. In the coarse stage, the angle rotations discussed previously are performed by matrices of the form:
where the S and C values are always such that S²+C²≦1. Therefore, a by-product of the coarse-stage rotations is the scaling down of the magnitude of the output vector. Thus, the coarse-stage magnitude scaling correction that must be applied is a scaling up of the output magnitude.
If the fine-stage rotation angle θl is sufficiently small that the term θl²/2 can be ignored in the magnitude scaling factor, then the magnitude scaling factor becomes just the four-bit δ value (i.e., δ[cos θ
Magnitude scaling module 2610 includes two 4-to-1 multiplexers 2616, 2618 and two adders 2612, 2614 (e.g., CSAs). Multiplexer 2616 receives scaled input coordinate X1×3 (i.e., 3X1) shifted by ten at the 3 (1 1) input, input coordinate X1 shifted by nine at the 2 (1 0) input, input coordinate X1 shifted by ten at the 1 (0 1) input, and a zero at the 0 (0 0) input. Multiplexer 2618 receives scaled input coordinate X1×3 (i.e., 3X1) shifted by twelve at the 3 (1 1) input, input coordinate X1 shifted by eleven at the 2 (1 0) input, input coordinate X1 shifted by twelve at the 1 (0 1) input, and a zero at the 0 (0 0) input.
ROM 202 stores two correction values, δ, for each of the 16 sectors.
As mentioned above, δ values are stored in a ROM and retrieved during operation. These δ values are shown in
The high-order pair of bits δ4δ3 of the received correction value control 4-to-1 multiplexer 2616 and the low-order pair of bits δ2δ1 of the received correction value control 4-to-1 multiplexer 2618. The multiplexers each provide the X1 data value, or the scaled X1 data value shifted, as appropriate, so that it can be added to the X datapath value.
Adder 2612 receives the X input coordinate and the output of multiplexer 2616 as inputs. Adder 2614 receives the output of adder 2612 and the output of multiplexer 2618 as inputs. The output of adder 2614 is the magnitude-scaled X input coordinate. In an embodiment, adders 2612 and 2614 are CSAs.
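The arithmetic carried out by multiplexers 2616 and 2618 and adders 2612 and 2614 can be sketched as follows. The sketch is illustrative only: the function name `coarse_magnitude_scale` is hypothetical, the shifts are modeled as integer right shifts (so low-order bits are truncated), and the carry-save form of the adders is not modeled.

```python
def coarse_magnitude_scale(x, x1, delta):
    """Add the delta-controlled correction to X using shifts of X1 and 3*X1.

    delta is the four-bit correction value (delta4 delta3 delta2 delta1).
    """
    x1_times_3 = 3 * x1
    hi = (delta >> 2) & 0b11  # delta4 delta3 -> multiplexer 2616
    lo = delta & 0b11         # delta2 delta1 -> multiplexer 2618

    # Inputs 0..3 of each multiplexer, as listed in the text.
    mux_2616 = [0, x1 >> 10, x1 >> 9, x1_times_3 >> 10]
    mux_2618 = [0, x1 >> 12, x1 >> 11, x1_times_3 >> 12]

    partial = x + mux_2616[hi]      # adder 2612
    return partial + mux_2618[lo]   # adder 2614

x1 = 1 << 12
print(coarse_magnitude_scale(1000, x1, 0b1011))  # 1011
```

With these multiplexer inputs, the value added to the X datapath works out to δ times X1 scaled by 2^-12, apart from truncation in the shifts.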
In an embodiment, an assumption is made that the fine-stage is employing the “excess-threes offset” technique, discussed above, where the coarse-stage output (X1) has been determined, as has three times that value (3X1). Alternatively, other fine-stage computation schemes and architectures, such as the minority-select method, could be employed as well. The excess-threes technique, however, has an additional advantage, regarding the
Notice that the scaling value being added into the X-component datapath is not necessarily a scaled version of that same X-component (which could also be employed), but rather, a scaled version of X1, the coarse-stage output. In this example, simulations have shown that adequate accuracy in the system output is retained when using the X1 value. The implementation shown in
It is evident that the coarse-stage output X1 data and 3X1 data being employed in
As described above, magnitude scaling to compensate for error introduced by the coarse stage may be performed prior to the fine angle stage or after the fine angle stage. For ease of description, flowchart 2700 generally refers to input coordinates. As would be appreciated by persons of skill in the art, the input coordinates may be the coarse rotated coordinates output by the coarse stage if magnitude scaling occurs at the input to the fine stage or, alternatively, the input coordinates may be the final rotated coordinates output by the fine stage if magnitude scaling occurs at the output of the fine stage.
In step 2710, magnitude scaling module 2610 (2810) receives one or more input coordinates and one or more scaled input coordinates. In an embodiment, the scaled input coordinates are three times the input coordinate (e.g., 3X1).
In step 2720, a plurality of shifted signals based on an input coordinate(s) and the scaled input coordinate(s) are generated. For example, in the Xdatapath processing circuit depicted in
In step 2730, an n-bit correction value is received from ROM 202. In an embodiment, the n-bit correction value has four bits. In the exemplary scaling circuit of
In step 2740, the magnitude scaling value is combined with the input coordinate(s). For example, as illustrated in
As illustrated in
As shown in
X path 2840 includes three subrotation modules 2822-X, a rotation circuit 2826-X, a fine stage magnitude scaling module 2828-X, and a magnitude scaling module 2810-X. Subrotation modules 2822-X and fine stage magnitude scaling module 2828-X were described above in reference to
Rotation circuit 2826-X includes multiple adders (e.g., CSAs). A first CSA 2841-X combines the three inputs from fine-stage rotation multiplexers with outputs controlled by bits b1, b4 and b7, on the left side of
Similarly, Y path 2860 includes three subrotation modules 2822-Y, a rotation circuit 2826-Y, a fine stage magnitude scaling module 2828-Y, and a magnitude scaling module 2810-Y. Subrotation modules 2822-Y and fine stage magnitude scaling module 2828-Y were described above in reference to
Rotation circuit 2826-Y includes multiple adders (e.g., CSAs). A first CSA 2841-Y combines the three inputs from fine-stage rotation multiplexers with outputs controlled by bits b1, b4 and b7, on the right side of
Thus, the fine-stage output is completed by using five CSAs on the X datapath and, simultaneously, five CSAs on the Y datapath. Both X and Y datapath legs experience simultaneous delays of just four CSAs. In this manner, the overall fine-stage processing happens to take approximately the same amount of time as the coarse-stage processing. Of course, for other examples, or for other choices of implementation methods for either the coarse or fine stages, the speeds could differ from each other.
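For readers less familiar with carry-save addition, the following Python sketch shows the 3-to-2 compression that each CSA performs and how a chain of CSAs defers carry propagation to a single final adder. It is a generic illustration of the technique with hypothetical function names; it chains the CSAs serially and does not reproduce the specific arrangement of the five CSAs described above.

```python
def carry_save_add(a, b, c):
    """3:2 compression: return (sum_word, carry_word) with sum + carry == a + b + c."""
    sum_word = a ^ b ^ c                              # bitwise sum without carries
    carry_word = ((a & b) | (a & c) | (b & c)) << 1   # carries, moved up one bit
    return sum_word, carry_word

def add_many(values):
    """Reduce a list of addends with chained CSAs, then one carry-propagating add."""
    s, c = values[0], 0
    for v in values[1:]:
        s, c = carry_save_add(s, c, v)
    return s + c  # stands in for the final carry-ripple adder

print(add_many([13, 7, 21, 2, 9]))  # 52
```

Because no carry ripples inside a CSA, its delay does not grow with the word length; only the final addition propagates carries.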
It will be understood by those of ordinary skill in the art that, in order to meet circuit data-rate requirements, it may be necessary to subdivide the coarse and/or fine rotation stages into substages that can operate simultaneously, in a “pipeline” manner. This can be done by, for example, inserting registers into the datapath at appropriate locations, such that the amount of computation required by the circuitry between registers is small enough. Clearly, an inspection of
Quadrature modulator 2900 may also include a module (not shown) to truncate the M-bit output of the phase accumulator to W bits (e.g., 16 bits). The truncation module may be a stand-alone module or may be included in the phase accumulator 2914. The output of the truncation module is the sequence of bits φ̂1φ̂2 . . . φ̂16.
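The truncation of the phase accumulator output can be illustrated with the following Python sketch. The accumulator width M = 32, the name `fcw` for the per-step increment, and the function name are assumptions made for the example; only the truncation to W = 16 bits is taken from the description above.

```python
def phase_accumulator(fcw, steps, m_bits=32, w_bits=16):
    """Yield the W-bit truncated phase for each clock step.

    fcw is a hypothetical frequency control word added every step.
    """
    mask = (1 << m_bits) - 1
    acc = 0
    for _ in range(steps):
        acc = (acc + fcw) & mask          # M-bit accumulator wraps modulo 2**M
        yield acc >> (m_bits - w_bits)    # keep the W most significant bits

print(list(phase_accumulator(fcw=1 << 30, steps=5)))
# [16384, 32768, 49152, 0, 16384]
```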
Direct digital frequency synthesizer (DDFS) 2950 can be considered a special case of a digital mixer. While the mixer rotates an arbitrary point in the plane by an angle specified by the normalized rotation angle
DDFS 2950 provides two outputs: sin 2πφ̂ and cos 2πφ̂. Multiplier 2962 receives as input X0 and cos 2πφ̂. Multiplier 2964 receives as input Y0 and sin 2πφ̂. The outputs of multiplier 2962 and multiplier 2964 are combined by adder 2970 to produce Xout.
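The single-output computation described above can be sketched as follows. The sketch is illustrative only; in particular, the sign with which adder 2970 combines the two products is not stated above and is assumed here to be the subtraction required for the real part of (X0+jY0) multiplied by cos 2πφ̂ + j sin 2πφ̂.

```python
import math

def x_out(x0, y0, phi_hat):
    """Single-output mixer: real part of (x0 + j*y0) * exp(j*2*pi*phi_hat)."""
    cos_term = x0 * math.cos(2 * math.pi * phi_hat)  # multiplier 2962
    sin_term = y0 * math.sin(2 * math.pi * phi_hat)  # multiplier 2964
    # Assumption: adder 2970 subtracts the sine product, as the real part requires.
    return cos_term - sin_term

print(round(x_out(0.0, 1.0, 0.25), 12))  # -1.0
```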
Conventional two-stage mixer architectures, such as described in Fu I, have previously been used for two-output mixers, in which the point in the plane (X0, Y0) is rotated counterclockwise about the origin, through a specified angle θ, first rotating by a coarse-stage approximation of the rotation angle and then rotating again using a fine-stage angle. Normally, this produces a point in the plane whose coordinates (Xout, Yout) are both required, as they represent the result of multiplying the complex number (X0+jY0) by the complex number e^(jθ). In modulator 3000, the computation of the two coarse-stage outputs is retained, but a factor, cos θM, is factored out of the coarse-rotation matrix, which simplifies coarse-stage computations. One row of the fine-rotation matrix is retained and the cosine factor is used as a scaling multiplier for the (single) fine-stage output. This simplification, of course, saves hardware and lowers power consumption.
The removal of the cosine factor from the coarse stage causes the sine multiplication coefficients to become tangents, along with causing the cosine multiplication coefficients to become unity. Thus:
In (58) the coarse-rotation angle is θM and the fine rotation angle is θL.
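The factorization described above can be checked numerically with the following Python sketch. It is a floating-point illustration under stated assumptions: the function names are hypothetical, exact trigonometric values are used in place of the shift-and-add approximations of the embodiments, and the X output is arbitrarily chosen as the retained fine-stage row.

```python
import math

def rotate(x, y, theta):
    """Reference rotation by theta (counterclockwise)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y

def factored_single_output(x0, y0, theta_m, theta_l):
    """Coarse rotation with cos(theta_m) factored out, one fine-stage row kept."""
    t = math.tan(theta_m)
    # Coarse stage: cosine coefficients become unity, sine coefficients become tangents.
    x1 = x0 - t * y0
    y1 = y0 + t * x0
    # Fine stage: only the row producing the X output is evaluated.
    x2 = math.cos(theta_l) * x1 - math.sin(theta_l) * y1
    # The factored-out cosine is applied as a scaling multiplier on the single output.
    return math.cos(theta_m) * x2

direct_x, _ = rotate(0.6, -0.3, 0.5 + 0.01)
print(abs(factored_single_output(0.6, -0.3, 0.5, 0.01) - direct_x) < 1e-12)  # True
```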
The phase accumulator (not shown in
The phase accumulator provides a set of bits from the truncated angle to conditional angle negation module 3030. As is known to persons skilled in DDFS technology, it suffices to deal in detail with angles lying within the first octant (Octant 0) only. For example, rather than storing data for all needed values of φ̂ within [0, 1), it suffices to have a ROM that contains only data for 0≦φ̂≦¼ (first quadrant represented by Octants 0 and 1). Values of the sine function for angles in the other three quadrants can easily be determined from values of the sine within the first quadrant: a conditional two's complement negation of φ̂ (applied only in the second and fourth quadrants) and a conditional negation of the sin 2πφ̂ output value (applied only in the third and fourth quadrants) extend the definition of sin 2πφ̂ to the complete interval 0≦φ̂≦1.
Conditional angle negation module 3030 receives bit φ̂3 of the truncated angle as a first input and bits φ̂4φ̂5φ̂6 . . . φ̂16 as a second input. The conditional angle negation module 3030 outputs 13 bits, φ̂4φ̂5φ̂6φ̂7φ̂8φ̂9 . . . φ̂16. After being processed by the conditional negation block, angles φ̂ are represented as φ (i.e., without the “hat”). Conditional angle negation module provides bits φ4φ5φ6φ7φ8 to coarse stage ROM 3040 and bits φ1φ2φ3 and φ8φ9φ10φ11 . . . φ16 to fine stage ROM 3060.
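The quadrant symmetry of the sine function described above can be sketched in Python as follows. The word length W = 10, the function names, and the use of `math.sin` as a stand-in for stored first-quadrant data are assumptions for the example; the sketch works at quadrant granularity and does not model the octant-level processing of module 3030.

```python
import math

W = 10  # hypothetical word length for the normalized angle

def first_quadrant_sine(r):
    """Stand-in for the stored data: sin(2*pi*r/2**W) for 0 <= r <= 2**(W-2)."""
    return math.sin(2 * math.pi * r / (1 << W))

def sine_from_first_quadrant(phi_word):
    """Evaluate sin(2*pi*phi) for a W-bit phi using first-quadrant data only."""
    quadrant = phi_word >> (W - 2)               # two most significant bits
    r = phi_word & ((1 << (W - 2)) - 1)          # position within the quadrant
    if quadrant in (1, 3):                       # second and fourth quadrants
        r = (1 << (W - 2)) - r                   # conditional negation of the angle
    value = first_quadrant_sine(r)
    if quadrant in (2, 3):                       # third and fourth quadrants
        value = -value                           # conditional negation of the output
    return value
```

Note that the negated angle can equal exactly one quarter, which is why the stored range is stated as the closed interval from 0 to ¼.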
As mentioned above, Octant 0 is partitioned into 16 sectors. The standard method of partitioning wherein the bit pattern of the four sector-bits C4C5C6C7 of the normalized angle φ specify the lower angular boundary on each sector is not optimal in transferring state-of-the-art DDFS architectures to the combined DDFS/modulator embodiment of
Multiple θm angles are created, one within each of the 16 sectors into which Octant 0 (i.e., 0≦θ<π/4) has been partitioned.
In doing the above manipulations, two such sets of boxes are required, one for θ in which large normalized fine-stage angles φL are present and one for small φL angles. In embodiments, the computations that employ cos θm are implemented by use of signed-powers-of-two (SPT) representations of cos θm. This results in a minimal number of add/subtract operations when multiplying a data word by cos θm. As described herein, such multiplications are performed using hard-wired shifts of the data words into multiplexers that are followed by carry-save adders. This “cheap implementation” yields low power dissipation. Unlike previous implementations, rather than storing coarse-angle cosine and sine values in ROM tables, suitable bits for controlling multiplexers and for indicating add/subtract choices for each of the various, e.g., cos θm, multiplications are stored in the ROM tables.
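A multiplication by a signed-powers-of-two coefficient can be sketched as follows. The sketch is illustrative only: the function name and the example coefficient are hypothetical and are not values stored in the ROM tables, and each hard-wired shift is modeled as a truncating integer right shift.

```python
def spt_multiply(x, terms):
    """Multiply integer x by sum(sign * 2**-shift) using only shifts and add/subtract.

    terms -- list of (sign, shift) pairs with sign in {+1, -1}
    """
    result = 0
    for sign, shift in terms:
        shifted = x >> shift              # hard-wired shift of the data word
        result += shifted if sign > 0 else -shifted
    return result

# Hypothetical coefficient: 1 - 2**-3 + 2**-6 = 0.890625 (not a value from the ROM tables)
COEFF_TERMS = [(+1, 0), (-1, 3), (+1, 6)]
print(spt_multiply(1 << 12, COEFF_TERMS))  # 3648
```

The number of add/subtract operations equals the number of nonzero SPT terms, which is why representations with few terms are preferred.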
ROM 3040 is configured to store the control bits and the add/subtract indications (referred to herein generally as “control bits” for ease of discussion) for the coarse stage 3004 and coarse stage scaling module 3008. In an embodiment, ROM 3040 includes two ROMs 3042 and 3044. ROM 3042 stores the control bits for large angles and ROM 3044 stores the control bits for small angles. In an alternate embodiment, the control bits for large angles and small angles are stored in a single ROM. Because the total number of bits stored in ROM 3040 is very small, in embodiments, ROM 3040 (or ROMs 3042 and 3044) could be implemented in combinatorial logic.
The various contents of ROM 3042 and 3044 have been determined in such a way as to maximize the bit pattern similarities when using SPT representations.
The coarse stage ROM 3040 selects from either the large angle control bits (e.g., from ROM 3042) or the small angle control bits (e.g., from ROM 3044) depending on bit φ8 (i.e., the appropriate ROM for small or large fine-stage angles). After determining whether to access the large angle data (ROM 3042) or the small angle data (ROM 3044), ROM 3040 selects one of the 16 coarse-sector ROM values using four address bits φ4φ5φ6φ7 (i.e., C4C5C6C7). The coarse stage ROM outputs the set of coarse stage control bits associated with the sector to coarse stage 3004 and the set of coarse stage scaling control bits associated with the sector to coarse stage scaling module 3008.
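The addressing just described can be sketched in Python as follows. The sketch is illustrative only: the helper names and placeholder ROM contents are hypothetical, and the polarity of bit φ8 (which value selects the large-angle data) is an assumption, since it is not stated above.

```python
def bit(word, k, width=16):
    """Return bit k of a word whose bits are numbered 1 (MSB) to width (LSB)."""
    return (word >> (width - k)) & 1

def coarse_rom_lookup(phi_word, rom_large, rom_small):
    """Pick the coarse-stage control word for a 16-bit conditioned angle phi."""
    sector = 0
    for k in (4, 5, 6, 7):                    # address bits phi4..phi7
        sector = (sector << 1) | bit(phi_word, k)
    # Assumption: phi8 = 1 selects the large-angle table; the polarity is not given.
    table = rom_large if bit(phi_word, 8) else rom_small
    return sector, table[sector]

rom_large = [f"L{i}" for i in range(16)]  # placeholder control words
rom_small = [f"S{i}" for i in range(16)]
print(coarse_rom_lookup(0b000_1010_1_00000000, rom_large, rom_small))  # (10, 'L10')
```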
Coarse stage 3004 receives X0, Y0 coordinates of a complex number as input. Coarse stage 3004 also receives the set of coarse stage control bits from ROM 3040. Coarse stage 3004 is described in further detail in Section 5.1 below.
Modulator 3000 also includes a multiplier 3050. Multiplier 3050 receives as inputs a set of bits (e.g., φ8φ9 . . . φ16) from conditional negation block 3030. Multiplier 3050 converts the angle into a radian-valued angle by multiplying it by an approximation of π/4. The output of multiplier 3050 is nine bits (e.g., θ8 . . . θ16) of the radian-valued angle, θ. The π/4 value reflects the 2π/8 value that would be applied to a normalized “Octant-0 angle” where normalized values within the interval [0,1) correspond to radian-valued angles within the Octant-0 interval [0, π/4).
While a routine specialization of a mixer architecture discussed herein would employ a π/4 multiplier (as appears in the fine-angle signal fed to the mixer's fine stage in
The difference between the normalized rotation angle and the unnormalized rotation angle is simply the factor π/4. Consider the normalized angle
The minority-select DDFS fine stage already has its ROM storage doubled in order to provide two ROM-rotation options. The present “π/4-elimination” technique will, effectively, re-double this amount of ROM storage. That is, assuming the use of a minority select architecture (as described in U.S. Patent Publication No. 2006/0167962), two ROMs of the type shown in
For the use of the “excess threes” or “excess fours” fine-stage architecture, just one ROM is needed, but it requires the storage of (3π/4)X1 and (3π/4)Y1 values, in addition to the fine stage ROM values shown in
Fine stage ROM 3060 includes a ROM configured to store a set of control bits for fine stage processing. ROM 3060 may further include a ROM configured to store a first term (X1) and a negate output. This ROM is addressed by the first three bits of the normalized angle O1O2O3.
Fine stage 3006 receives the outputs of coarse stage 3004 and a bit sequence b1b2b3b4b5b6b7b8b9 representing a fine rotation angle in the same manner as in the case of the 4-multiplexer angle rotator. Fine stage 3006 also receives a set of control bits from fine stage ROM 3060. Fine stage 3006 may use the minority select system or the “excess threes” or “excess fours” system discussed herein. Fine stage 3006 is described in further detail in Section 5.2 below.
Coarse-stage scaling circuit 3008 receives the output (X2) and optionally a scaled output (e.g., 7X2) from fine stage 3006. Coarse-stage scaling circuit 3008 also receives a set of control bits from ROM 3040. Coarse-stage scaling circuit 3008 is described in further detail in Section 5.3 below.
The reduction in the overall power dissipation in the circuit for this special two-multiplier mixer case can be expected to approximate half that of the general improved four-multiplier-type angle-rotator circuit that constitutes the major topic of the previous discussion. The coarse-stage circuit can have a complexity less than that of a single Booth multiplier. Moreover, the fine-stage circuit—consisting of just half of the more general fine-stage circuit, along with the cos θM scaling—will have a lower complexity than that of the more general (two output) fine-stage circuit, hence a lower complexity than that of a normal two-output direct digital frequency synthesizer (DDFS) circuit, which would traditionally be employed to generate the sin θ and cos θ values appearing in the conventional implementation of the
Coarse stage 3304 includes a Ydatapath processing section 3310 and an Xdatapath processing section 3350. Ydatapath processing section 3310 includes a shifting module 3312. Shifting module 3312 shifts the X0 coordinate of the input complex number and provides the shifted data signals to a series of multiplexers 3314a, 3314b, and 3314c. Multiplexer 3314a is controlled by the δ2δ1 control bits received from ROM 3040. Multiplexer 3314a outputs term p3. Multiplexer 3314b is controlled by the δ4δ3 control bits received from ROM 3040. The output of multiplexer 3314b is provided to AND gate 3316. AND gate 3316 also receives control bit z1 from ROM 3040. The output of AND gate 3316 is term p1. Multiplexer 3314c is controlled by the δ6δ5 control bits received from ROM 3040. Multiplexer 3314c outputs term p2.
Adder 3318 receives term p3 from multiplexer 3314a, term p2 from multiplexer 3314c, and term p1 from AND gate 3316. Term p3 is always added. However, adder 3318 performs a conditional negation or inversion of term p2 based on the value of control bit s2 and a conditional negation or inversion of term p1 based on the value of control bit s1.
Multiplexer 3320 receives the Y0 coordinate of the input complex number and a shifted Y0 coordinate. Multiplexer 3320 is controlled by the δ7 control bit received from ROM 3040. Adder 3322 receives term p4 from multiplexer 3320 and the output from adder 3318. Term p4 is always added.
Xdatapath processing section 3350 includes a shifting module 3352. Shifting module 3352 shifts the Y0 coordinate of the input complex number and provides the shifted data signals to a series of multiplexers 3354a, 3354b, and 3354c. Multiplexer 3354a is controlled by the δ2δ1 control bits received from ROM 3040. Multiplexer 3354a outputs term q3. Multiplexer 3354b is controlled by the δ4δ3 control bits received from ROM 3040. The output of multiplexer 3354b is provided to AND gate 3356. AND gate 3356 also receives control bit z1 from ROM 3040. The output of AND gate 3356 is term q1. Multiplexer 3354c is controlled by the δ6δ5 control bits received from ROM 3040. Multiplexer 3354c outputs term q2.
Adder 3358 receives term q3 from multiplexer 3354a, term q2 from multiplexer 3354c, and term q1 from AND gate 3356. Term q3 is always subtracted. However, adder 3358 performs a conditional negation or inversion of term q2 based on the value of control bit s2 and a conditional negation or inversion of term q1 based on the value of control bit s1.
Multiplexer 3360 receives the X0 coordinate of the input complex number and a shifted X0 coordinate. Multiplexer 3360 is controlled by the δ7 control bit received from ROM 3040. Adder 3362 receives term q4 from multiplexer 3360 and the output from adder 3358. Term q4 is always added. The outputs of coarse stage, X1 and Y1, are provided to CRA 3364 and CRA 3324, respectively. The outputs of CRA 3364 and CRA 3324 are in turn provided to fine-stage 3006.
As illustrated in
The fine-rotation stage 3006 is driven by the nine least significant bits of the 16-bit rotation angle θ. The bits have been relabeled b1 through b9 for notational simplicity. These bits are grouped into 3-bit groups and each group drives a sub-rotation operation. Fine stage 3006 also incorporates various operations that are driven by the octant bits O1O2O3. Such operations are typically performed by a DDFS in a separate output stage. The price paid to eliminate the output stage is that several more complicated multiplexer control bit values must be computed. In embodiments, these values are used because several multiplexers are consolidated into a single 4-to-1 multiplexer (and an AND gate) in each of four sub-circuits (see, e.g.,
The fine-rotation stage can be implemented as described above in Section 3. There will still be comparable coarse and fine magnitude-scaling compensation circuits required. The fine-stage complexity will be halved, however, since only half of it need be built (as discussed above). This is illustrated by the following equation:
In this equation just the second component of the [X2 Y2]T output vector is retained. Alternatively, just the first component instead could have been retained, which would be computed by the following similar equation:
In an embodiment, only one of the two outputs must be implemented. However, the system does not simply pick one of the two parts shown, for example, in the “remaining angle-rotation block” of
Input selection circuit 3425 is configured to allow selection of either the X1 or Y1 input. During the circuit's operation, a determination as to whether it is the X or Y output that is being produced is given by data in a ROM table 3402. Table 3402 is addressed by the value of the octant associated with the octant part of the normalized rotation angle θ (i.e., the highest-order three bits of
Input selection circuit 3425 includes a pair of 2-to-1 multiplexers controlled by a data bit from ROM table 3402. For example, if the ROM table data is a “0” for the first term value, then input coordinate X1 is provided to the subrotation modules 3422 and input coordinate Y1 is provided to the rotation circuit 3426. Note that one of these signals is multiplied by three in a block labeled “×3” prior to input at the subrotation modules.
Rotation circuit 3426 includes three adders 3442A, 3442B, and 3442C. In an embodiment, the adders are CSAs. Adder 3442A receives the output from input selection circuit 3425 and the output from subrotation module 34221. Adder 3442A also receives a negate first term bit from ROM table 3402. The negate first term bit determines whether the first term should be negated during operation. Each of adders 3442A, 3442B, and 3442C receives the negate second term bit from ROM table 3402. The negate second term bit determines whether the second term should be negated during operation. The output of adder 3442C is provided to a carry ripple adder 3490.
As would be appreciated by persons of skill in the art, fine-stage circuit 3400 may be used, with modifications, as the fine-stage circuit of
It is possible to simplify the fine-stage circuit somewhat by consolidating several multiplexers into a single 4-to-1 multiplexer (and an AND gate) in each of four sub-circuits—as will be understood by one having ordinary skill in the art.
Fine-stage circuit 3500 includes three subrotation modules 3522, a rotation circuit 3526, a fine stage magnitude scaling module 3528, and an input selection circuit 3525. Each subrotation module 3522 includes a 4-to-1 multiplexer 3534 and an AND gate 3536. The 4-to-1 multiplexer 3534 of subrotation module 35221 is controlled by bit b3 of the three-bit input group for the subrotation module and β2 which is calculated as:
β2=(b1∩b2)∪(
AND gate 3536 of subrotation module 35221 receives the output of 4-to-1 multiplexer 3534 and β1 which is calculated as:
β1=
Similarly, the 4-to-1 multiplexer 3534 of subrotation module 35222 is controlled by bit b6 of the three-bit input group for the subrotation module and β5 and AND gate 3536 of subrotation module 35222 receives the output of 4-to-1 multiplexer 3534 and β4. The 4-to-1 multiplexer 3534 of subrotation module 35223 is controlled by bit b9 of the three-bit input group for the subrotation module and β8 and AND gate 3536 of subrotation module 35223 receives the output of 4-to-1 multiplexer 3534 and β7.
Fine stage magnitude scaling module 3528 includes a 4-to-1 multiplexer 3554 and AND gate 3556. The 4-to-1 multiplexer 3554 is controlled by bit b3 of the three-bit input group for the subrotation module and β2. AND gate 3556 receives the output of 4-to-1 multiplexer 3554 and β1.
Input selection circuit 3525 is configured to allow selection of either the X1 or Y1 input as the “first term” (the other input becomes the “second term”). During the circuit's operation, a determination as to whether it is the X or Y input that is being used as the first term depends on the Octant data O1, O2, O3 and it is given by data in a ROM table 3502. Table 3502 is addressed by the value of the octant associated with the octant part of the normalized rotation angle
Input selection circuit 3525 includes a pair of 2-to-1 multiplexers controlled by a data bit from ROM table 3502. For example, if the ROM table data is a “0” for the first term value, then input coordinate X1 is provided to the subrotation modules 3522 and input coordinate Y1 is provided to the rotation circuit 3526. Note that one of these signals is effectively multiplied by three in a block labeled “×1.5” prior to input at the subrotation modules.
When this type of fine stage is used, where inputs can come from ROMs (such as in
As would be appreciated by persons of skill in the art, fine-stage circuit 3500 may be used, with modifications, as the fine-stage circuit of
The coarse-stage scaling circuit is configured to multiply the fine-stage output by cos θm. Because θm has been carefully chosen such that cos θm is “simple,” it is possible to perform the scaling operation with three CSAs and one CRA. It is not obvious that the scaling can be implemented this efficiently. For example, it is not obvious that, whichever cos θm value is being handled, the signed-powers-of-two representation of cos θm never has more than one non-zero bit in bit-positions 7, 8, 9, and 10, as the multiplexer in the lower right corner of
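To illustrate what a signed-powers-of-two representation is, the following Python sketch recodes a constant into digits +1 and −1 (a non-adjacent form) and multiplies by it using only shifts, additions, and subtractions. This is a generic recoding, not the particular choice of cos θm values or the multiplexer arrangement of the embodiment. The function names, the 12-bit fraction length, and the angle 0.3 rad are assumptions chosen only for the example.

```python
import math


def to_signed_digits(value, frac_bits):
    """Return [(digit, position)] with digit in {+1, -1}.

    The sum of digit * 2**position equals round(value * 2**frac_bits),
    and no two non-zero digits occupy adjacent positions.
    """
    n = round(value * (1 << frac_bits))
    digits, position = [], 0
    while n != 0:
        if n & 1:
            digit = 2 - (n & 3)      # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= digit
            digits.append((digit, position))
        n >>= 1
        position += 1
    return digits


def scale(x, digits, frac_bits):
    """Multiply integer x by the recoded constant using shifts and add/subtract."""
    acc = 0
    for digit, position in digits:
        acc += (x << position) if digit > 0 else -(x << position)
    return acc >> frac_bits


FRAC_BITS = 12                                    # illustrative precision
TARGET = round(math.cos(0.3) * (1 << FRAC_BITS))  # 0.3 rad is an arbitrary angle
COS_DIGITS = to_signed_digits(math.cos(0.3), FRAC_BITS)
```

Each non-zero digit costs one adder input, so a constant with few non-zero signed digits needs few adders, which is the sense in which a carefully chosen cos θm is “simple.”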
Term generation block 3630A generates term r1. Term generation block 3630A includes two 4-to-1 multiplexers which are controlled by a pair of bits (e.g., δ3δ2) from the correction value. The outputs of the two 4-to-1 multiplexers are fed as inputs to a single 2-to-1 multiplexer controlled by a bit (e.g., δ4) in the correction value. The output of the 2-to-1 multiplexer is term r1.
Term generation block 3630B generates term r2. Term generation block 3630B includes two 4-to-1 multiplexers which are controlled by a pair of bits (e.g., δ6δ5) from the correction value. The outputs of the two 4-to-1 multiplexers are fed as inputs to a single 2-to-1 multiplexer controlled by a bit (e.g., δ7) in the correction value. The output of the 2-to-1 multiplexer is term r2.
Term generation block 3630C generates term r3. Term generation block 3630C includes a 4-to-1 multiplexer which is controlled by a pair of bits (e.g., δ9δ8) from the correction value and a 2-to-1 multiplexer which is controlled by a bit (e.g., δ8) in the correction value. The outputs of the 4-to-1 multiplexer and the 2-to-1 multiplexer are fed as inputs to a single 2-to-1 multiplexer controlled by a bit (e.g., δ10) in the correction value. The output of the 2-to-1 multiplexer is fed into an AND gate along with a control bit z2. The output of the AND gate is the term r3.
Term generation block 3630D generates term r4. Term generation block 3630D includes two 4-to-1 multiplexers which are controlled by a pair of bits (e.g., δ12δ11) from the correction value. The outputs of the 4-to-1 multiplexers are fed as inputs to a single 2-to-1 multiplexer controlled by a bit (e.g., δ13) in the correction value. The output of the 2-to-1 multiplexer is fed into an AND gate along with a control bit z3. The output of the AND gate is the term r4.
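The multiplexer tree described for blocks 3630A, 3630B, and 3630D (two 4-to-1 multiplexers sharing a pair of select bits, followed by a 2-to-1 multiplexer) behaves as an 8-to-1 selection, optionally gated to zero by an AND gate. A behavioral Python sketch follows. The candidate inputs to the multiplexers are not specified in this passage, so the lists passed in are placeholders, and the ordering of the select bits is an assumption.

```python
def mux4(inputs, sel_hi, sel_lo):
    """4-to-1 multiplexer: two select bits choose one of four inputs."""
    return inputs[(sel_hi << 1) | sel_lo]


def term_block(low_inputs, high_inputs, pair_hi, pair_lo, top_bit, enable=1):
    """Two 4-to-1 muxes feeding a 2-to-1 mux, then an AND-gate style enable.

    pair_hi/pair_lo model the shared select pair (e.g., δ12δ11), top_bit
    models the 2-to-1 select (e.g., δ13), and enable models a control bit
    such as z3. Blocks without an AND gate correspond to enable=1.
    """
    low = mux4(low_inputs, pair_hi, pair_lo)
    high = mux4(high_inputs, pair_hi, pair_lo)
    chosen = high if top_bit else low
    return chosen if enable else 0
```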
Term generation block 3630E generates term r5. Term generation block 3630E includes a 2-to-1 multiplexer which is controlled by a bit (e.g., δ1) in the correction value. The output of the 2-to-1 multiplexer is the term r5. Adder 3636A combines terms r2, r3, and r5. Term r5 is always added.
However, adder 3636A performs a conditional negation or inversion of term r2 based on the value of control bit u2 and a conditional negation or inversion of term r3 based on the value of control bit u3. Adder 3636B combines the output of adder 3636A and term r1. Adder 3636B performs a conditional negation or inversion of term r1 based on the value of control bit u1. Adder 3636C combines the output of adder 3636B and term r4. Adder 3636C performs a conditional negation or inversion of term r4 based on the value of control bit u4. The output of adder 3636C is the scaled output coordinate (e.g., scaled X2).
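The arithmetic performed by the three-adder chain can be summarized behaviorally as follows. This Python sketch models only the signed sum, not the carry-save structure, and it assumes that a control bit value of 1 means “negate”; that polarity is not stated in this passage.

```python
def combine_terms(r1, r2, r3, r4, r5, u1, u2, u3, u4):
    """Signed sum of the five terms; r5 is always added."""
    def signed(term, u):
        # Assumption: u == 1 selects negation, u == 0 passes the term through.
        return -term if u else term

    stage_a = r5 + signed(r2, u2) + signed(r3, u3)   # adder 3636A
    stage_b = stage_a + signed(r1, u1)               # adder 3636B
    stage_c = stage_b + signed(r4, u4)               # adder 3636C
    return stage_c
```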
As before, the 2×2 angle-rotation matrix is represented as the product of a coarse rotation matrix and a fine rotation matrix. Again, half of one matrix can be eliminated resulting in the following:
and, hence, either
In the latter expression, for example, the approximations cos θL≈1 and tan δL≈θL, lead to
It is evident that “half of the coarse stage” can be implemented by a circuit of the type shown in
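As a numerical illustration of the coarse/fine decomposition with only one output computed, the following Python sketch splits the angle into a quantized coarse part and a small fine part, applies the small-angle approximations cos θL≈1 and sin θL≈θL in the fine stage, and evaluates only the X row of the coarse rotation. It is a floating-point sketch under stated assumptions, not the fixed-point circuit: the function name, the 256-step coarse quantization, and the stage ordering are all assumptions (planar rotations commute, so the ordering does not affect the result).

```python
import math


def rotate_x_two_stage(x0, y0, theta, coarse_steps=256):
    """Approximate x0*cos(theta) - y0*sin(theta) via a fine and a coarse stage."""
    step = 2 * math.pi / coarse_steps
    theta_m = round(theta / step) * step      # coarse angle (quantized)
    theta_l = theta - theta_m                 # fine angle, |theta_l| <= step / 2

    # Fine stage with cos(theta_l) ~ 1 and sin(theta_l) ~ theta_l.
    x1 = x0 - theta_l * y0
    y1 = y0 + theta_l * x0

    # "Half" of the coarse stage: only the X row of the rotation matrix.
    return math.cos(theta_m) * x1 - math.sin(theta_m) * y1
```

The approximation error grows roughly as θL squared, so a finer coarse quantization directly tightens the result.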
Other minor variations on the embodiments described above will be evident to one of ordinary skill in the art. One quite simple example of this is the minor change that is often encountered in the circuits of
An analog downconverter 3802 downconverts the communication signal 3850 to baseband or any suitable intermediate frequency (IF) to produce a downconverted communication signal 3852. The analog downconverter 3802 may be implemented using a direct-conversion receiver (DCR) or homodyne receiver, a superheterodyne receiver, or any other suitable receiver capable of receiving the communication signal 3850 without departing from the spirit and scope of the present invention.
An analog to digital converter (ADC) 3804 digitizes the downconverted communication signal 3852 to produce a digital communication signal 3854. More specifically, the ADC 3804 samples the downconverted communication signal 3852 according to one or more sampling clocks to produce the digital communication signal 3854. However, this example is not limiting; the communication system 3800 may use any suitable means to assign the downconverted communication signal 3852 to one or more digital representations without departing from the spirit and scope of the present invention.
A digital receiver 3806 receives the digital communication signal 3854 from the ADC 3804. The digital receiver 3806 processes the digital communication signal 3854 to produce a recovered communication signal 3858. The recovered communication signal 3858 may represent an approximation to a transmitted communication signal before transmission through the communication channel. In an exemplary embodiment, the digital receiver 3806 includes an angle rotator 3808 and a post processor 3810. However, this embodiment is not limiting; the digital receiver 3806 may include any suitable means to produce the recovered communication signal 3858 from the digital communication signal 3854 without departing from the spirit and scope of the present invention.
The angle rotator 3808 produces a derotated communication signal 3856 based upon the digital communication signal 3854. More specifically, the angle rotator 3808 rotates the digital communication signal 3854 by the angle θ to remove angular offsets in the digital communication signal 3854 as a result of the communication channel or any other suitable means that will be apparent to those skilled in the relevant art(s). In an exemplary embodiment, angle rotator 3808 may include one or more angle rotators 200 as discussed in
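By way of a simple behavioral illustration, derotation of complex baseband samples can be modeled as multiplication by a unit-magnitude complex number. The Python sketch below uses floating-point complex arithmetic rather than the angle-rotator circuits described herein; the function name and the sample values are assumptions made for the example.

```python
import cmath


def rotate_samples(samples, theta):
    """Rotate each complex sample counter-clockwise by theta radians."""
    phasor = cmath.exp(1j * theta)            # cos(theta) + j*sin(theta)
    return [s * phasor for s in samples]


# Example: a channel-induced angular offset is removed by rotating through
# the opposite angle.
offset = 0.4                                  # illustrative offset in radians
sent = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = rotate_samples(sent, offset)
recovered = rotate_samples(received, -offset)
```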
The post processor 3810 produces the recovered communication signal 3858 based upon the derotated communication signal 3856. The post processor 3810 may include forward error correction (FEC) decoders, deinterleavers, timing loops, carrier recovery loops, equalizers, digital filters, or any other suitable means that may be used to produce the recovered communication signal 3858. In an exemplary embodiment, angle rotator 3808 may include one or more angle rotators 200 as discussed in
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant art(s) that various changes in form and detail can be made therein without departing from the spirit and scope of the invention. Thus the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
This patent application claims the benefit of Provisional Patent Application No. 61/646,824, filed on May 14, 2012 and is a continuation-in-part of U.S. patent application Ser. No. 13/205,525, filed on Aug. 8, 2011, titled “Excess-Fours Processing in Direct Digital Synthesizer Implementations” which is a continuation-in-part of U.S. patent application Ser. No. 11/938,252 (now U.S. Pat. No. 8,131,793) filed on Nov. 9, 2007 which claims the benefit of Provisional Patent Application No. 60/857,778, filed Nov. 9, 2006, entitled “Improved Angle Rotator,” each of which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61646824 | May 2012 | US
60857778 | Nov 2006 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13205525 | Aug 2011 | US
Child | 13894321 | | US
Parent | 11938252 | Nov 2007 | US
Child | 13205525 | | US