This application relates generally to direct digital synthesizers and particularly to direct digital synthesizers implementing excess-fours processing.
Direct digital synthesizers, commonly referred to as DDS or DDFS, are widely used in digital devices. A direct digital frequency synthesizer (DDS) can be considered a special case of a digital mixer. While the mixer rotates an arbitrary point in the plane by an angle specified by the normalized rotation angle, θ, the DDS always rotates a fixed point, which can be considered to be the point (1, 0).
The phase accumulator in a DDS employs a relatively long phase word (e.g., the word length M=32 bits for the examples we have been using here, and M=48 bits has been used in commercial products described in “1 GSPS Direct Digital Synthesizer with 14-Bit DAC,” AD9912 Data Sheet, Analog Devices, Inc., 2007-2010 and “2.7 GHz DDS-Based AgileRF™ Synthesizer,” AD9956 Data Sheet, Analog Devices, Inc., 2004). When incrementing the phase accumulator by adding a frequency control word (FCW) to it, a long carry-ripple delay can be problematic. For example, the use of Artisan library cells for TSMC 0.18-μm CMOS can require a carry-ripple delay that is sufficiently long as to make running the phase accumulator at a desired 250-MHz speed expensive in terms of power dissipation. A well-known technique for increasing the frequency at which a DDS phase accumulator can be updated is to employ some form of pipelining of the phase accumulator.
When pipelined, a 32-bit phase accumulator can run at 250 MHz in TSMC 0.18-μm CMOS. In addition to the increased hardware expense incurred by the pipelining circuitry, one residual problem remains: the inherent pipeline-induced delay and/or complexity when one desires to instantaneously change the frequency being generated by changing the FCW. (Instantaneous frequency changing is one of the very desirable capabilities of a DDS; indeed, such a feature is perhaps unique to a DDS, in comparison with other types of oscillators.) When changing to a new FCW value in a pipelined-phase-accumulator system, the least-significant part of the phase accumulator must be incremented in an earlier output-data cycle than the one in which the most-significant part of the phase accumulator is incremented. Solving this and related problems can require additional and more complicated circuitry and/or performance compromises.
What is therefore needed is a DDS that solves the phase accumulator speed-up problem.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.
FIG. 4 depicts a direct digital synthesizer eliminating the π/4 multiplier.
The present invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers can indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number may identify the drawing in which the reference number first appears.
Embodiments of the present invention that integrate excess-fours processing into direct digital synthesizer implementations improve upon many facets of the implementation and operation of a direct digital synthesizer (DDS), sometimes referred to as a direct digital frequency synthesizer (DDFS) or as a numerically controlled oscillator (NCO). Perhaps the most significant is the improved operation of the DDS phase accumulator. Issues related to the “too-long carry ripple” of a DDS phase accumulator have been discussed and dealt with in the literature for virtually as long as the DDS has been in existence.
The solution to this problem is obtained by use of excess-fours processing.
DDS 100 includes an adder 112, a phase accumulator 114, a truncate module 120, and a phase-to-amplitude mapper 130. In embodiments, adder 112 is an unsigned overflowing adder that is repeatedly incremented by the M-bit FCW 102. The output of the adder, φ̂, is stored in the phase accumulator 114. The phase accumulator 114 in embodiments is an M-bit register. The sequence of phase values that results from the repeated incrementing by the FCW is a sequence of unsigned numbers lying within the interval [0, 1), specifying a sequence of points on the unit circle, each point corresponding to a radian-valued angle that lies within the interval [0, 2π), where “[” is used to indicate inclusion of the end point in the interval and “)” is used to indicate exclusion of the end point from the interval.
Thus, the phase accumulator values can be viewed as normalized angles that become radian-valued angles if they are multiplied by 2π. When an overflow occurs, the integer part of φ̂ is lost, which elegantly causes the remaining fractional part of φ̂ to be represented by an angle within [0, 2π), but normalized to [0, 1). The normalized angles are fed to the phase-to-amplitude mapper 130. The phase-to-amplitude mapper 130 is configured to compute the sequence of amplitudes of a sinusoid, e.g., y=sin 2πφ̂ for each normalized angle φ̂.
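The behavior of the adder, phase accumulator, and phase-to-amplitude mapper described above can be illustrated with a minimal software sketch. The function name `dds_samples` and its parameters are hypothetical, and floating-point `math.sin` stands in for the mapper; the sketch is not a model of any particular hardware embodiment.

```python
import math

def dds_samples(fcw, m_bits=32, n=8):
    """Return n (normalized_phase, amplitude) pairs from an M-bit accumulator."""
    modulus = 1 << m_bits
    acc = 0
    out = []
    for _ in range(n):
        phase = acc / modulus                 # normalized angle in [0, 1)
        out.append((phase, math.sin(2 * math.pi * phase)))
        acc = (acc + fcw) % modulus           # overflow discards the integer part
    return out
```

With `fcw = 1 << 30` and `m_bits = 32`, the accumulator advances by one quarter of a cycle per sample and wraps after four samples.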
Various innovations have been adopted in the conventional use of a DDS over the past four decades. The earliest DDS implementations simply employed lookup tables (ROMs) for getting the y=sin 2πφ̂ values, using the normalized phase angle φ̂ to address the ROMs. Since the ROM size grows exponentially with the bit-length of φ̂, various techniques have been devised to reduce lookup-table size. One early technique was to use only the most significant W bits of normalized angle φ̂ as the ROM address. This technique is depicted in
Another technique that is almost universally applied for reducing ROM storage requirements is the exploitation of certain symmetries of sinusoidal functions defined over [0, 2π). For example, rather than storing the values of sin 2πφ̂ for all needed values of φ̂ within [0, 1), it suffices to have a ROM that contains only the values of sin 2πφ̂ for 0≦φ̂≦¼ (first quadrant represented by Octants 0 and 1 in
Similarly, when both sin 2πφ̂ and cos 2πφ̂ are being generated, it suffices to store data in ROMs for both sin 2πφ̂ and cos 2πφ̂ for values of φ̂ within the first octant 0≦φ̂≦⅛. Various computation-based approaches for getting sin 2πφ̂ and/or cos 2πφ̂ over the quadrant or octant intervals, as well as combinations of table-lookups and computations, have been employed. The article, “A 100-MHz 16-b, direct digital frequency synthesizer with a 100-dBc spurious-free dynamic range,” by A. Madisetti, A. Y. Kwentus, and A. N. Willson, published in IEEE J. Solid-State Circuits, vol. 34, pp. 1034-1043 (August 1999) (hereinafter “Madisetti”) shows that a modified-CORDIC rotation can be applied, using the value of θ to rotate the point (1, 0) in the plane to get to a point (a, b) on the unit circle, with the result that a and b represent the desired cos 2πφ̂ and sin 2πφ̂ values, respectively. Using CORDIC, the rotations are performed, one after the other, each rotating further the result of the previous rotation, using positive or negative rotation angles having strictly monotone decreasing magnitude.
The first four rotations for the method of Madisetti are actually performed by a single small ROM table (a coarse rotation ROM), and the remaining rotations are each performed by a matrix operator that provides an approximate pure rotation through a small angle α. For α sufficiently small, a point (x, y) in the plane can be approximately rotated to obtain the point (u, v) as follows:
\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1 & -\alpha \\ \alpha & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (1) \]
where the vector (u, v)ᵀ is close to the point obtained by a pure rotation of (x, y)ᵀ around the origin, counter-clockwise, through the positive angle α, because a pure rotation (by an angle β, a so-called Givens rotation) would employ the following matrix involving cos β and sin β. The relations are
\[ \begin{bmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{bmatrix} = \cos\beta \begin{bmatrix} 1 & -\tan\beta \\ \tan\beta & 1 \end{bmatrix}, \qquad \tan\beta = \alpha. \]
That is, equation (1) above introduces an angular-rotation error, in that (1) rotates by β radians, where β=atan α, rather than α radians, and it introduces a magnitude-scaling error in the rotated vector, where the scaling factor is 1/cos β=√(1+α²). The length of the rotated vector is slightly increased. But clearly, if α is sufficiently small then β=atan α≈α and also 1/cos β≈1.
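The two error terms named above can be checked numerically. The following sketch assumes the small-angle operator has the form [[1, −α], [α, 1]], which is the form consistent with the stated rotation angle β=atan α and scaling factor √(1+α²); the function names are hypothetical.

```python
import math

def approx_rotate(x, y, alpha):
    """Apply the small-angle operator [[1, -alpha], [alpha, 1]] to (x, y)."""
    return x - alpha * y, y + alpha * x

def rotation_errors(alpha):
    """Return (actual rotation angle, magnitude gain) when rotating (1, 0)."""
    u, v = approx_rotate(1.0, 0.0, alpha)
    beta = math.atan2(v, u)      # angle actually rotated through
    gain = math.hypot(u, v)      # length of the rotated unit vector
    return beta, gain
```

For a small angle such as α=2⁻⁷, both the angle error α−β and the excess gain are far smaller than α itself, which is why the approximation is acceptable for fine-stage rotations.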
The article, “A two-stage angle-rotation architecture and its error analysis for efficient digital mixer implementation,” by D. Fu and A. N. Willson, Jr., published in IEEE Trans. Circuits Syst. I, vol. 53, pp. 604-614 (March 2006) (hereinafter “Fu”) formalized the notion of a two-stage rotation process, comprising first a ROM-based coarse rotation and then a fine, computation-based rotation.
DDS 300 may also include a module 320 to truncate the M-bit output of the phase accumulator to W bits (e.g., 16 bits). Module 320 may be a stand-alone module or may be included in the phase accumulator 314. The output of module 320 is the sequence of bits φ̂1, φ̂2 . . . φ̂16. Truncate module 320 is coupled to a conditional two's complement negation mapping module 332 and to an output stage 350. After truncating the M-bit phase accumulator to W bits (see
Conditional two's complement negation mapping module 332 receives two inputs. The first input is φ̂3 and the second input is bits φ̂4φ̂5φ̂6 . . . φ̂16. The conditional two's complement negation mapping module 332 outputs 13 bits, φ4φ5φ6φ7φ8φ9 . . . φ16. After being processed by the conditional negation block, angles φ̂ are represented as φ (i.e., without the “hat”).
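As a rough illustration of the conditional negation, the sketch below negates the 13-bit field modulo 2¹³ when the control bit φ̂3 is set. The function name and the constant `W_FRAC` are hypothetical, and the sketch ignores any special-case handling an actual embodiment may apply.

```python
W_FRAC = 13  # number of bits in the field φ̂4..φ̂16 of the 16-bit example

def conditional_negate(phi3, frac):
    """Return the 13-bit output for control bit phi3 and 13-bit input frac."""
    mask = (1 << W_FRAC) - 1
    if phi3:
        return (-frac) & mask    # two's complement negation, modulo 2**13
    return frac                  # control bit clear: pass through unchanged
```

Note that an all-zero input with the control bit set maps back to zero, because the negated value 2¹³ does not fit in 13 bits.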
These angles are converted into radian-valued angles by multiplying them by an approximation to π/4. Multiplier 334 receives as input bits φ8φ9 . . . φ16 and outputs nine bits φ8 . . . φ16 of the radian-valued angle, θ. The π/4 value reflects the 2π/8 value that would be applied to a normalized “Octant-0 angle” where normalized values within the interval [0, 1) correspond to radian-valued angles within the Octant-0 interval [0, π/4).
Rotation may be decomposed into two stages: a coarse rotation of the input signal followed by a fine rotation of an intermediate pair of numbers. Coarse rotation stage 336 receives as input, bits φ4φ5φ6φ7 from the conditional two's complement mapping module 332. In embodiments, coarse rotation stage 336 includes a Read Only Memory (ROM). The coarse rotation stage 336 produces an intermediate pair of numbers (X, Y). The fine rotation stage 338 receives the intermediate numbers (X, Y) and performs the fine rotation of the intermediate pair (representing a point in the plane) counter-clockwise around the origin, to produce an output signal. Various techniques have also been advocated for reducing the number of fine-stage rotations, including the use of “look-ahead rotations” described in Madisetti, and minority-select fine-stage rotations described in U.S. Pat. No. 7,539,716 to A. Torosyan (hereinafter “Torosyanl”). Output stage 350, controlled by φ̂1φ̂2φ̂3, remaps the fine-stage rotated result into its correct octant.
Output stage 350 receives the three high order bits φ̂1φ̂2φ̂3 from truncation module 320 and the output from fine rotation stage 338. The three high order bits φ̂1φ̂2φ̂3 identify the octant (0 through 7) to which the output of the fine rotation stage 338 belongs. Output stage 350 may provide two (so-called quadrature) outputs, where both sin 2πφ̂ and cos 2πφ̂ are generated simultaneously for each φ̂ value. Alternatively, output stage 350 may provide a single output (e.g., cos 2πφ̂ or sin 2πφ̂).
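The truncation and the octant identification described above amount to simple bit-field extraction. The sketch below assumes M=32 and W=16 as in the running example; the function name `split_phase` is hypothetical.

```python
def split_phase(acc, m_bits=32, w_bits=16):
    """Truncate an M-bit accumulator to W bits; return (octant, remaining bits)."""
    word = acc >> (m_bits - w_bits)              # keep the W most significant bits
    octant = word >> (w_bits - 3)                # three high-order bits φ̂1 φ̂2 φ̂3
    remainder = word & ((1 << (w_bits - 3)) - 1) # bits φ̂4 .. φ̂16
    return octant, remainder
```

The octant value selects the remapping in the output stage, while the remaining 13 bits go on to the conditional negation and the two rotation stages.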
While it is evident that the coarse rotation stage can be driven by a normalized angle (the conditionally-negated normalized angle φ4φ5φ6φ7 in
Coarse rotation stage 436 receives as input, bits φ4φ5φ6φ7 from the conditional two's complement mapping module 432. The coarse rotation stage 436 produces an intermediate pair of numbers (X, Y). The fine rotation stage 438 receives the intermediate numbers (X, Y) from coarse rotation stage 436 and φ8 . . . φ16 from conditional two's complement negation mapping module 432 and performs the fine rotation of the intermediate pair (representing a point in the plane) counter-clockwise around the origin, to produce an output signal.
While we have illustrated the
In embodiments of the present invention, the number of fine rotations required in a 2-stage angle-rotation DDS, such as the DDS depicted in
2.1 Excess Four Basics
Table 1 below depicts an excess four rotation table. There are two inputs to an excess four rotation table (such as Table 1): a three-bit pattern, specifying a normalized-angle rotation value from 0 through 7, and a one-bit value (not used in the fine rotation stage example of
Notice that the BIAS=1 column has values that could be obtained from the BIAS=0 column by shifting each row up by one row and including a “100 (4)” entry on the last row. This, of course, corresponds to an additional “001” rotational increment for the BIAS=1 value. Notice also, however, that the BIAS=1 column has values that could be obtained from the BIAS=0 column if the BIAS=0 column is just flipped up/down and negated. Moreover, the up/down flipping can effectively be achieved by simply addressing a table row by the ones' complement of the 3-bit input data (in the column labeled “bit pattern”). This means that the operations indicated in Table 1 can be performed by simply building a circuit that implements only the BIAS=0 column and then, when using the circuit to perform a rotation, replacing the three input bits φaφbφc by the three bits γaγbγc, obtained with use of an Exclusive-OR (XOR): γk=φk⊕BIAS, for k=a, b, c. The table's output add/subtract bit, described above, is used after Exclusive-ORing it with BIAS. In situations always dealing with the BIAS=0 case, the above-mentioned XORs may be omitted. The capability, exhibited in Table 1, to increase an angle-rotation value by one bit, on the fly (by merely using a single-bit XOR gate), provides a powerful means for incrementing the phase value “in place,” i.e., without requiring a long carry ripple. Various examples of significant benefits deriving from this feature are described herein.
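The flip-and-negate property can be checked with a short sketch. It assumes, as an inference from the properties described above rather than from the table itself, that the signed rotation value for a three-bit pattern n is n−4+BIAS; both function names are hypothetical.

```python
def excess_four(bits, bias=0):
    """Assumed signed rotation value for a 3-bit pattern: value - 4 + BIAS."""
    return bits - 4 + bias

def excess_four_via_xor(bits, bias):
    """Same value using only the BIAS=0 column and XORs with BIAS."""
    gamma = bits ^ (0b111 if bias else 0)   # XOR each input bit with BIAS
    offset = excess_four(gamma, 0)          # look up the BIAS=0 column
    return -offset if bias else offset      # add/subtract bit XORed with BIAS
```

Under this assumption the BIAS=1 column runs from −3 to +4, so the top entry (+4) is reached without any carry leaving the three-bit group.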
2.2 Fine Stage Rotation Using Excess Four Technique: Conventional Processing
An excess fours processor can be used conventionally (i.e., with a fixed BIAS=0 value on each three-bit group). This conventional processing corresponds to the usual DDS processing of the fine-stage data, with the full-angle data having been obtained from a truncated phase accumulator. The fine-stage data could either be un-normalized data (i.e., radian valued, where the normalized phase value has been multiplied by π/4—as shown in
Subrotation module 610 includes an AND gate 612, three 2-to-1 multiplexers 614, 615, and 616, and an exclusive-OR gate 618. Subrotation module 610 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. AND gate 612 receives as a first input, (π/4)Yk, shifted by 5 (i.e., by five bits in the LSB direction) and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 612 is provided at the 0 input to multiplexer 614. Multiplexer 614 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 614 is controlled by bit φ9 of the three-bit input group. Multiplexer 615 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 615 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. The output of multiplexer 614 is provided at the 0 input of multiplexer 616 and the output of multiplexer 615 is provided at the 1 input of multiplexer 616. Multiplexer 616 is controlled by bit φ10 of group 1. The output of multiplexer 616 is provided as a first input to exclusive-OR gate 618. Exclusive-OR gate 618 receives bit φ8 of group 1 as its second input.
Subrotation module 620 includes an AND gate 622, three 2-to-1 multiplexers 624, 625, and 626, and an exclusive-OR gate 628. Subrotation module 620 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 622 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, bit φ11 of the three-bit input group, which is negated at the input. The output of AND gate 622 is provided at the 0 input to multiplexer 624. Multiplexer 624 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 624 is controlled by bit φ12 of group 2. Multiplexer 625 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 625 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. The output of multiplexer 624 is provided at the 0 input of multiplexer 626 and the output of multiplexer 625 is provided at the 1 input of multiplexer 626. Multiplexer 626 is controlled by bit φ13 of group 2. The output of multiplexer 626 is provided as a first input to exclusive-OR gate 628. Exclusive-OR gate 628 receives bit φ11 of group 2 as its second input.
Subrotation module 630 includes an AND gate 632, three 2-to-1 multiplexers 634, 635, and 636, and an exclusive-OR gate 638. Subrotation module 630 receives group 3, φ14φ15φ16, of the fine-stage rotation bits. AND gate 632 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, bit φ14 of group 3, which is negated at the input. The output of AND gate 632 is provided at the 0 input to multiplexer 634. Multiplexer 634 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 634 is controlled by bit φ15 of group 3. Multiplexer 635 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 635 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 634 is provided at the 0 input of multiplexer 636 and the output of multiplexer 635 is provided at the 1 input of multiplexer 636. Multiplexer 636 is controlled by bit φ16 of group 3. The output of multiplexer 636 is provided as a first input to exclusive-OR gate 638. Exclusive-OR gate 638 receives bit φ14 of the third three-bit input group as its second input.
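The gate-level wiring shared by the three subrotation modules can be modeled in a few lines. In this sketch one unit is (π/4)Y at the group's largest shift (7, 10, or 13), so that a shift two places smaller counts as four units; the function name and the unit convention are illustrative assumptions.

```python
def subrotation(b_hi, b_mid, b_lo):
    """Model one subrotation module; return (magnitude_units, negate).

    b_hi, b_mid, b_lo are the group's three bits, most significant first.
    """
    and_out = 0 if b_hi else 4           # AND gate passes 4 units unless b_hi = 1
    mux_a = 2 if b_mid else and_out      # first multiplexer, controlled by b_mid
    mux_b = 1 if (b_hi ^ b_mid) else 3   # second multiplexer: (π/4)Y or (3π/4)Y
    magnitude = mux_b if b_lo else mux_a # third multiplexer, controlled by b_lo
    return magnitude, b_hi               # XOR gate negates when b_hi = 1
```

Enumerating all eight patterns n shows that the magnitude is |n−4| and the signed contribution is 4−n, which matches an excess-four reading of the three-bit group.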
Both FIGS. 6A/B depict the use of the additional coarse-stage outputs (3π/4)X and (3π/4)Y. It will be understood by one of ordinary skill in the art that there are alternate computation-based techniques to generate the needed (3π/4)X and (3π/4)Y values (e.g., given the (π/4)X value, (3π/4)X can be obtained with one addition and a hard-wired shift, as (3π/4)X=(π/4)X+2(π/4)X). By using additional (3π/4)X and (3π/4)Y ROM values, however, the system of FIG. 6A/B is computation free, except for the additions that appear in the vertical path down the center of FIG. 6A/B. The cost of storing the (3π/4)X and (3π/4)Y data would be 2×2⁴=32 additional ROM words, which brings the total ROM storage cost up to 6×2⁴=96 words for a system having four ROM address bits, or 6×2⁵=192 words for a five ROM address bit system, etc.
Fine stage magnitude scaling module 680 is configured to provide magnitude scaling for the fine rotation stage 600. The fine stage magnitude scaling module 680 of
Magnitude scaling module 680 includes an AND gate 682 and three 2-to-1 multiplexers 684, 685, and 686. The AND gate 682 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 682 is provided at the 0 input of multiplexer 684. Multiplexer 684 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 684 is controlled by bit φ9 of the three-bit input group. Multiplexer 685 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 685 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. The output of multiplexer 684 is provided at the 0 input of multiplexer 686 and the output of multiplexer 685 is provided at the 1 input of multiplexer 686. Multiplexer 686 is controlled by bit φ10 of group 1.
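The shift selected by the magnitude scaling module for each group-1 bit pattern can be tabulated with a small model of the described wiring. The function name is hypothetical, and `None` is used here to denote a zero output from the AND gate.

```python
def scaling_shift(b_hi, b_mid, b_lo):
    """Return the right-shift applied to (π/4)X, or None when the output is zero."""
    and_out = None if b_hi else 11        # AND gate: shift 11 unless b_hi = 1
    mux_a = 13 if b_mid else and_out      # first multiplexer, controlled by b_mid
    mux_b = 15 if (b_hi ^ b_mid) else 12  # second multiplexer: shift 15 or 12
    return mux_b if b_lo else mux_a       # third multiplexer, controlled by b_lo
```

One possible reading, offered as an inference rather than something stated in the text, is that the selected term tracks d²·2⁻¹⁵ for an excess-four offset d=n−4, with 9 approximated by 8 when |d|=3.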
Adder 642 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 610. Adder 642 receives as inputs, Xk, the output of exclusive-OR gate 618 of the first subrotation module 610, and the negation of the output of multiplexer 686 of the fine stage magnitude scaling module 680. In an embodiment, adder 642 is a carry-save adder (CSA). Adder 642 may further receive bit φ8 of group 1 of the fine-stage rotation bits as the carry_in value, which causes the XOR 618 output, when negating a signal, to provide a fully two's-complemented value to adder 642.
Adder 644 receives the output from adder 642 and the subrotation value generated by the second subrotation module 620. Thus, adder 644 rotates Xk by the additional subrotation value generated by subrotation module 620. In an embodiment, adder 644 is also a carry-save adder (CSA). Adder 644 may further receive bit φ11 of group 2 of the fine-stage rotation bits as the carry_in value. Similarly, adder 646 receives the output from 644 and the subrotation value generated by the third subrotation module 630. Thus, adder 646 rotates Xk by the additional subrotation value generated by subrotation module 630. In an embodiment, adder 646 is also a carry-save adder (CSA). Adder 646 may further receive bit φ14 of group 3 of the fine-stage rotation bits as the carry_in value.
A final adder 648 receives the output from adder 646. In an embodiment, adder 648 is a carry ripple adder. Adder 648 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 648 is a coordinate, Xfine, of the rotated pair of output values.
Notice that, when implemented using carry-save adders, the total computation cost of the
The circuit 600B of
2.3 Ones' Complement Negation in Fine Stage Rotation Using Excess Four Technique
Since a two's complement negation can be accomplished by starting with a ones' complement negation and including a carry-in bit into the ones' complement result, the conventional excess fours processing on a system for which the odd octants have had the ones' complement operation conditionally performed can be used instead of the two's complement negation. Then, when doing the fine-stage processing, an excess four processor is used for the least-significant three-bit group having the BIAS value set to “bit-3” (i.e., φ̂3), or some equivalent computation. Thus, this method to implement the conditional two's complement negation requires only that the least-significant three-bit group's excess four processor be built to accommodate both columns of Table 1. This implementation would be applied to the normalized phase accumulator value, since it is this point in the conventional DDS processing where the conditional two's complement negation occurs.
1.00→0.11+(BIAS=1)=1.00 (invariant)
1.01→0.10+(BIAS=1)=0.11 →
1.10→0.01+(BIAS=1)=0.10 →
1.11→0.00+(BIAS=1)=0.01 →
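The equivalence between a two's complement negation and a ones' complement followed by a BIAS on the least-significant group can be checked exhaustively for a small case. The sketch assumes a 9-bit angle split into three 3-bit groups and an excess-four value of n−4+BIAS per group; these choices and all names are illustrative.

```python
WEIGHTS = (64, 8, 1)  # weights of the three 3-bit groups in a 9-bit angle

def split_groups(v):
    """Split a 9-bit value into three 3-bit groups, most significant first."""
    return [(v >> 6) & 7, (v >> 3) & 7, v & 7]

def negated_offset_sum(v):
    """Sum of excess-four values after ones' complement, BIAS=1 on last group."""
    groups = [g ^ 7 for g in split_groups(v)]   # ones' complement of every bit
    biases = (0, 0, 1)                          # BIAS only on the last group
    return sum((g - 4 + b) * w for g, b, w in zip(groups, biases, WEIGHTS))
```

For every 9-bit v the result equals (512−v) minus the constant 4·(64+8+1)=292 contributed by the excess-four offsets, including v=0, where an ordinary 9-bit two's complement negation would wrap around to zero.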
Subrotation module 810 includes an AND gate 812, three 2-to-1 multiplexers 814, 815, and 816, and an exclusive-OR gate 818. Subrotation module 810 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. AND gate 812 receives as a first input, (π/4)Yk, shifted by 5 and as a second input, φ8cn (referred to herein as the ‘cn’ version or as having been conditionally negated) which is then negated at the input. The conditional negation operation is defined using the equation:
φkcn=φk⊕φ̂3, for k=8, . . . , 14
The output of AND gate 812 is provided at the 0 input to multiplexer 814. Multiplexer 814 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 814 is controlled by φ9cn. Multiplexer 815 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 815 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. In an alternative embodiment, multiplexer 815 could be controlled by the ‘cn’ versions of the control signals (i.e., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both inputs.
The output of multiplexer 814 is provided at the 0 input of multiplexer 816 and the output of multiplexer 815 is provided at the 1 input of multiplexer 816. Multiplexer 816 is controlled by φ10cn. The output of multiplexer 816 is provided as a first input to exclusive-OR gate 818. Exclusive-OR gate 818 receives φ8cn as its second input.
Subrotation module 820 includes an AND gate 822, three 2-to-1 multiplexers 824, 825, and 826, and an exclusive-OR gate 828. Subrotation module 820 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 822 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, φ11cn, which is negated at the input. The output of AND gate 822 is provided at the 0 input to multiplexer 824. Multiplexer 824 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 824 is controlled by the conditional negation of bit φ12 of group 2, φ12cn. Multiplexer 825 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 825 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. In an alternative embodiment, multiplexer 825 could be controlled by the ‘cn’ versions of the control signals (i.e., φ11cn⊕φ12cn) since the XOR output would be unaffected by inverting both inputs.
The output of multiplexer 824 is provided at the 0 input of multiplexer 826 and the output of multiplexer 825 is provided at the 1 input of multiplexer 826. Multiplexer 826 is controlled by the conditional negation of bit φ13 of group 2, φ13cn. The output of multiplexer 826 is provided as a first input to exclusive-OR gate 828. Exclusive-OR gate 828 receives the conditional negation of bit φ11 of group 2, φ11cn, as its second input.
The third (least significant) fine-stage group (830) in
Subrotation module 830 includes an AND gate 832, three 2-to-1 multiplexers 834, 835, and 836, and an exclusive-OR gate 838. Subrotation module 830 receives group 3, φ14φ15φ16, of the fine-stage rotation bits. AND gate 832 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, bit φ14 of group 3, which is negated at the input. The output of AND gate 832 is provided at the 0 input to multiplexer 834. Multiplexer 834 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 834 is controlled by bit φ15 of group 3. Multiplexer 835 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 835 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 834 is provided at the 0 input of multiplexer 836 and the output of multiplexer 835 is provided at the 1 input of multiplexer 836. Multiplexer 836 is controlled by bit φ16 of group 3. The output of multiplexer 836 is provided as a first input to exclusive-OR gate 838. Exclusive-OR gate 838 receives the conditional negation of bit φ14 of Group 3, φ14cn as its second input.
As illustrated in
Like
Magnitude scaling module 880 includes an AND gate 882 and three 2-to-1 multiplexers 884, 885, and 886. AND gate 882 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, φ8cn, which is negated at the input. The output of AND gate 882 is provided at the 0 input of multiplexer 884. Multiplexer 884 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 884 is controlled by φ9cn. Multiplexer 885 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 885 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. In an alternative embodiment, multiplexer 885 could be controlled by the ‘cn’ versions of the control signals (i.e., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both of its inputs. The output of multiplexer 884 is provided at the 0 input of multiplexer 886 and the output of multiplexer 885 is provided at the 1 input of multiplexer 886. Multiplexer 886 is controlled by φ10cn.
Adder 842 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 810. Adder 842 receives as inputs, Xk, the output of exclusive-OR gate 818 of the first subrotation module 810, and the negation of the output of multiplexer 886 of the fine stage magnitude scaling module 880. In an embodiment, adder 842 is a carry-save adder (CSA). Adder 842 may further receive φ8cn as the carry_in value.
Adder 844 receives the output from adder 842 and the subrotation value generated by the second subrotation module 820. Thus, adder 844 rotates Xk by the additional subrotation value generated by subrotation module 820. In an embodiment, adder 844 is also a carry-save adder (CSA). Adder 844 may further receive φ11cn as the carry_in value. Similarly, adder 846 receives the output from 844 and the subrotation value generated by the third subrotation module 830. Thus, adder 846 rotates Xk by the additional subrotation value generated by subrotation module 830. In an embodiment, adder 846 is also a carry-save adder (CSA). Adder 846 may further receive φ14cn as the carry_in value.
A final adder 848 receives the output from adder 846. In an embodiment, adder 848 is a carry ripple adder. Adder 848 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 848 is a coordinate, Xfine, of the rotated pair of output values.
The circuit 800B of
When two's complement negation is being employed to map odd octants into Octant 0, conventional DDS architectures require special processing for the four normalized angles representing π/4, 3π/4, 5π/4, and 7π/4. Basically, this is because there is no ROM data available to represent the sine and/or cosine values for these special angles. (All addresses φ4φ5φ6φ7 for the ROM tables of
2.4 Phase Accumulator Rounding
The following embodiments incorporate phase accumulator rounding into DDS architectures. The excess-four processor's BIAS bit allows processing of one additional LSB (a “½ LSB”) of the phase accumulator. Notice that, because the excess four processor can handle the BIAS bit “in place,” i.e., within the least-significant three-bit group's processing unit (without needing to send out a C_out bit that could ripple toward the MSB), phase accumulator rounding can be achieved without having to deal with a W-bit carry-ripple delay. This phase accumulator rounding can be applied to either a normalized or an un-normalized phase accumulator value. The BIAS input is used to do a rounding of the last 3-bit group's fourth (extra) LSB regardless of whether an odd octant or even octant is being processed.
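The claim that rounding can be absorbed without a carry ripple can be illustrated on a small case. The sketch assumes a 9-bit angle in three 3-bit groups plus one extra LSB, and an excess-four value of n−4+BIAS for each group; the function names and bit widths are illustrative assumptions.

```python
def round_conventional(v10):
    """Round a 10-bit value to 9 bits by adding the half-LSB (carry may ripple)."""
    return (v10 + 1) >> 1

def round_in_place(v10):
    """Excess-four sum with the extra LSB applied as BIAS on the last group."""
    v9, extra = v10 >> 1, v10 & 1
    groups = [(v9 >> 6) & 7, (v9 >> 3) & 7, v9 & 7]
    offsets = [groups[0] - 4, groups[1] - 4, groups[2] - 4 + extra]
    return sum(o * w for o, w in zip(offsets, (64, 8, 1)))
```

Adding back the constant 292 contributed by the three excess-four offsets reproduces the conventionally rounded value for every input, even when the last group is 111 and the extra bit is 1, a case in which a conventional adder would propagate a carry into the higher groups.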
Prior DDS systems did not utilize phase accumulator rounding, possibly because of the additional carry-ripple computation that it would entail. The embodiments of the processing system described herein, however, avoid the W-bit carry ripple. Alternatively, phase accumulator rounding may not have been used because it can be shown (perhaps surprisingly) to cause no improvement in the DDS spurious-free dynamic range. However, the systems described herein can accomplish phase accumulator rounding with essentially zero computational cost. While not improving the DDS spurs, the phase accumulator rounding does provide a small improvement in the signal-to-noise ratio of the DDS output. Furthermore, the circuit that implements phase accumulator rounding may also be reused to facilitate other desirable design goals.
Subrotation module 1010 includes an AND gate 1012, three 2-to-1 multiplexers 1014, 1015, and 1016, and an exclusive-OR gate 1018. Subrotation module 1010 receives the group 1, φ8φ9 φ10, of the fine-stage rotation bits. AND gate 1012 receives as a first input, (π/4)Yk, shifted by 5 and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 1012 is provided at the 0 input to multiplexer 1014. Multiplexer 1014 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 1014 is controlled by bit φ9 of the three-bit input group. Multiplexer 1015 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 1015 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. The output of multiplexer 1014 is provided at the 0 input of multiplexer 1016 and the output of multiplexer 1015 is provided at the 1 input of multiplexer 1016. Multiplexer 1016 is controlled by bit φ10 of group 1. The output of multiplexer 1016 is provided as a first input to exclusive-OR gate 1018. Exclusive-OR gate 1018 receives bit φ8 of group 1 as its second input.
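The selection logic of subrotation module 1010 can be checked with a small behavioral model. The following Python sketch is illustrative only. It assumes that one unit equals (π/4)Yk shifted by 7, so that the shift-by-5, shift-by-6, and shift-by-7 inputs carry 4, 2, and 1 units and the (3π/4)Yk input carries 3 units. The function name `subrotation_1010` is hypothetical, and the sign convention (φ8=0 meaning a negative, clockwise rotation) is an assumption drawn from the excess-four description.

```python
def subrotation_1010(phi8, phi9, phi10):
    """Behavioral sketch of the mux tree; returns the signed rotation in units."""
    and_out = 0 if phi8 else 4             # AND gate 1012: 4 units, gated by NOT phi8
    mux_1014 = 2 if phi9 else and_out      # 1 input carries 2 units
    mux_1015 = 1 if (phi8 ^ phi9) else 3   # 0 input: 3 units, 1 input: 1 unit
    magnitude = mux_1015 if phi10 else mux_1014
    # phi8 also drives XOR gate 1018, i.e. it selects the rotation direction.
    return magnitude if phi8 else -magnitude


for v in range(8):
    bits = (v >> 2) & 1, (v >> 1) & 1, v & 1
    print(f"{v:03b} -> {subrotation_1010(*bits):+d}")
```

Under these assumptions, each three-bit pattern with binary value v selects a rotation of v − 4 units, which is the excess-four behavior.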
Subrotation module 1020 includes an AND gate 1022, three 2-to-1 multiplexers 1024, 1025, and 1026, and an exclusive-OR gate 1028. Subrotation module 1020 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 1022 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, bit φ11 of the three-bit input group, which is negated at the input. The output of AND gate 1022 is provided at the 0 input to multiplexer 1024. Multiplexer 1024 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 1024 is controlled by bit φ12 of group 2. Multiplexer 1025 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 1025 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. The output of multiplexer 1024 is provided at the 0 input of multiplexer 1026 and the output of multiplexer 1025 is provided at the 1 input of multiplexer 1026. Multiplexer 1026 is controlled by bit φ13 of group 2. The output of multiplexer 1026 is provided as a first input to exclusive-OR gate 1028. Exclusive-OR gate 1028 receives bit φ11 of group 2 as its second input.
Subrotation module 1030 includes an AND gate 1032, three 2-to-1 multiplexers 1034, 1035, and 1036, and an exclusive-OR gate 1038. Subrotation module 1030 receives group 3, φ14φ15φ16φ17, of the fine-stage rotation bits. AND gate 1032 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, the exclusive-OR of bits φ14 and φ17 of group 3 of the fine rotation bits, which is negated at the input. The output of AND gate 1032 is provided at the 0 input to multiplexer 1034. Multiplexer 1034 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 1034 is controlled by the exclusive-OR of bits φ15 and φ17 of group 3 of the fine rotation bits. Multiplexer 1035 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 1035 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 1034 is provided at the 0 input of multiplexer 1036 and the output of multiplexer 1035 is provided at the 1 input of multiplexer 1036. Multiplexer 1036 is controlled by the exclusive-OR of bits φ16 and φ17 of group 3. The output of multiplexer 1036 is provided as a first input to exclusive-OR gate 1038. Exclusive-OR gate 1038 receives bit φ14 of the three-bit input group as its second input.
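The effect of folding bit φ17 into the control signals of subrotation module 1030 can be illustrated with a behavioral model. The Python sketch below is an assumption-laden illustration rather than the circuit itself: rotation amounts are expressed in units of (π/4)Yk shifted by 13, the function name `subrotation_1030` is hypothetical, and φ14=0 is taken to mean a negative rotation.

```python
def subrotation_1030(phi14, phi15, phi16, phi17):
    """Behavioral sketch of module 1030; returns the signed rotation in units."""
    and_out = 0 if (phi14 ^ phi17) else 4          # AND gate 1032, gated by NOT(phi14 XOR phi17)
    mux_1034 = 2 if (phi15 ^ phi17) else and_out   # controlled by phi15 XOR phi17
    mux_1035 = 1 if (phi14 ^ phi15) else 3         # controlled by phi14 XOR phi15
    magnitude = mux_1035 if (phi16 ^ phi17) else mux_1034
    return magnitude if phi14 else -magnitude      # XOR gate 1038 uses phi14


for bias in (0, 1):
    row = []
    for v in range(8):
        bits = (v >> 2) & 1, (v >> 1) & 1, v & 1
        row.append(subrotation_1030(*bits, bias))
    print(f"phi17={bias}: {row}")
```

In this model the group output equals v − 4 + φ17 for every three-bit value v, so the extra ½-LSB is absorbed inside the group and no carry leaves it.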
Like
Fine stage magnitude scaling module 1080 is configured to provide magnitude scaling for the fine rotation stage 1000. The fine stage magnitude scaling module 1080 of
Magnitude scaling module 1080 includes an AND gate 1082 and three 2-to-1 multiplexers 1084, 1085, and 1086. AND gate 1082 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 1082 is provided at the 0 input of multiplexer 1084. Multiplexer 1084 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 1084 is controlled by bit φ9 of the three-bit input group. Multiplexer 1085 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 1085 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. The output of multiplexer 1084 is provided at the 0 input of multiplexer 1086 and the output of multiplexer 1085 is provided at the 1 input of multiplexer 1086. Multiplexer 1086 is controlled by bit φ10 of group 1.
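One possible reading of the shift amounts in magnitude scaling module 1080 is that they follow the square of the group-1 rotation amount. The Python sketch below is an illustration under stated assumptions: the group-1 rotation is taken to be r = v − 4 units for a three-bit value v, the correction is assumed to scale with r², and the function name `scaling_shift_1080` is hypothetical. The comparison column is simply 15 − 2·log2|r| rounded to the nearest integer.

```python
import math


def scaling_shift_1080(phi8, phi9, phi10):
    """Return the right-shift applied to (pi/4)*Xk, or None when the output is zero."""
    and_shift = None if phi8 else 11        # AND gate 1082, gated by NOT phi8
    mux_1084 = 13 if phi9 else and_shift    # controlled by phi9
    mux_1085 = 15 if (phi8 ^ phi9) else 12  # controlled by phi8 XOR phi9
    return mux_1085 if phi10 else mux_1084  # controlled by phi10


def r_squared_shift(r):
    """Shift suggested by an r**2 dependence (an assumed model, not from the source)."""
    return None if r == 0 else round(15 - 2 * math.log2(abs(r)))


for v in range(8):
    bits = (v >> 2) & 1, (v >> 1) & 1, v & 1
    print(f"{v:03b}: r={v - 4:+d}, shift={scaling_shift_1080(*bits)}, "
          f"model={r_squared_shift(v - 4)}")
```

Under these assumptions the selected shift matches the r² model for all eight patterns, including the zero-rotation case where no correction is applied.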
Adder 1042 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 1010. Adder 1042 receives as inputs, Xk, the output of exclusive-OR gate 1018 of the first subrotation module 1010, and the negation of the output of multiplexer 1086 of the fine stage magnitude scaling module 1080. In an embodiment, adder 1042 is a carry-save adder (CSA). Adder 1042 may further receive bit φ8 of group 1 of the fine-stage rotation bits as the carry_in value, which causes the XOR 1018 output, when negating a signal, to provide a fully two's complemented value to adder 1042.
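The role of the carry_in bit can be illustrated numerically. The Python sketch below assumes a 16-bit data path purely for illustration (the width and the names `WIDTH` and `conditional_negate` are hypothetical). It shows that a bitwise XOR with the control bit, followed by adding that same bit as a carry-in, yields the two's complement negation.

```python
WIDTH = 16                      # assumed data-path width, for illustration only
MASK = (1 << WIDTH) - 1


def conditional_negate(value, negate):
    """XOR every bit with `negate`, then add `negate` as the carry-in."""
    ones_complement = value ^ (MASK if negate else 0)  # XOR gate stage
    return (ones_complement + negate) & MASK           # carry_in supplies the +1


print(hex(conditional_negate(0x0005, 0)))  # value passes through unchanged
print(hex(conditional_negate(0x0005, 1)))  # two's complement of 5 in 16 bits
```

Because the +1 is supplied as a carry-in to an adder that is already present, no separate incrementer is needed for the negation in this sketch.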
Adder 1044 receives the output from adder 1042 and the subrotation value generated by the second subrotation module 1020. Thus, adder 1044 rotates Xk by the additional subrotation value generated by subrotation module 1020. In an embodiment, adder 1044 is also a carry-save adder (CSA). Adder 1044 may further receive bit φ11 of group 2 of the fine-stage rotation bits as the carry_in value. Similarly, adder 1046 receives the output from adder 1044 and the subrotation value generated by the third subrotation module 1030. Thus, adder 1046 rotates Xk by the additional subrotation value generated by subrotation module 1030. In an embodiment, adder 1046 is also a carry-save adder (CSA). Adder 1046 may further receive bit φ14 of group 3 of the fine-stage rotation bits as the carry_in value.
A final adder 1048 receives the output from adder 1046. In an embodiment, adder 1048 is a carry ripple adder. Adder 1048 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 1048 is a coordinate, Xfine, of the rotated pair of output values.
The circuit 1000B of
As illustrated in FIGS. 10A/B, even less hardware is required for this phase-accumulator rounding operation than that described above in reference to FIGS. 8A/B for getting conditional two's complement negation via ones' complement negation. As compared to the system of FIGS. 6A/B, the fine-stage processing, shown in FIGS. 10A/B, requires just three additional single-bit XOR gates to get MUX control signals that can be shared by both (X and Y) data paths.
2.5 Ones' Complement Negation and Phase Accumulator Rounding
As shown in
DDS further includes an exclusive-OR gate 1125. Exclusive-OR gate 1125 receives as a first input bit, {circumflex over (φ)}3 and as a second input, bits {circumflex over (φ)}4{circumflex over (φ)}5 {circumflex over (φ)}6{circumflex over (φ)}7. The output of exclusive-OR gate 1125, φ4φ5φ6φ7 is provided as input to coarse rotation stage 1136. Coarse rotation stage 1136 outputs coordinates (X, Y), values (πX/4, πY/4) and values (3πX/4, 3πY/4) to fine rotation stage 1138. Fine rotation stage 1138 further receives as input, bits {circumflex over (φ)}8{circumflex over (φ)}9 . . . {circumflex over (φ)}16{circumflex over (φ)}17 and bit {circumflex over (φ)}3. As discussed in further detail below, phase-accumulator fine-stage processing is performed in three-bit groups, where the conditionally inverted fine stage values are used in each three-bit group. In the least-significant three-bit group, however, a BIAS bit that is the ½-LSB is included and the odd/even octant-designating bit (bit-3) is further used to determine whether the stage output is added or subtracted into the data path. FIGS. 12A/B provide a detailed implementation of fine rotation stage 1138.
The output of fine rotation stage 1138 is provided as an input to output stage 1150. Output stage 1150 further receives as input, bits {circumflex over (φ)}1{circumflex over (φ)}2 {circumflex over (φ)}3. The output of output stage 1150 is cos 2πφ and/or sin 2πφ.
Subrotation module 1210 includes an AND gate 1212, three 2-to-1 multiplexers 1214, 1215, and 1216, and an exclusive-OR gate 1218. Subrotation module 1210 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. AND gate 1212 receives as a first input, (π/4)Yk, shifted by 5 and as a second input, φ8cn (referred to herein as the ‘cn’ version) which is negated at the input. The ‘cn’ version is determined using the equation:
φkcn=φk⊕{circumflex over (φ)}3 for k=8, . . . ,14
The output of AND gate 1212 is provided at the 0 input to multiplexer 1214. Multiplexer 1214 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 1214 is controlled by φ9cn. Multiplexer 1215 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 1215 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. In an alternative embodiment, multiplexer 1215 could be controlled by the ‘cn’ versions of the control signals (e.g., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both inputs.
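The remark that the XOR output is unaffected by inverting both inputs can be confirmed exhaustively. The Python sketch below is illustrative; the helper names `cn` and `cn_group` are hypothetical, and the bit `phi3_hat` stands for the odd/even octant-designating bit.

```python
def cn(bit, phi3_hat):
    """'cn' version of a single control bit: the bit XORed with the octant bit."""
    return bit ^ phi3_hat


def cn_group(value, phi3_hat, width=3):
    """Apply 'cn' to every bit of a `width`-bit group (ones' complement if phi3_hat=1)."""
    mask = (1 << width) - 1
    return value ^ (mask if phi3_hat else 0)


# XOR of two 'cn' bits equals XOR of the original bits, for every combination.
for phi3_hat in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            assert cn(a, phi3_hat) ^ cn(b, phi3_hat) == a ^ b

print([cn_group(v, 1) for v in range(8)])  # each group value v maps to 7 - v
```

The second helper shows the group-level view: with the octant bit set, every three-bit group is replaced by its ones' complement.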
The output of multiplexer 1214 is provided at the 0 input of multiplexer 1216 and the output of multiplexer 1215 is provided at the 1 input of multiplexer 1216. Multiplexer 1216 is controlled by φ10cn. The output of multiplexer 1216 is provided as a first input to exclusive-OR gate 1218. Exclusive-OR gate 1218 receives φ8cn as its second input.
Subrotation module 1220 includes an AND gate 1222, three 2-to-1 multiplexers 1224, 1225, and 1226, and an exclusive-OR gate 1228. Subrotation module 1220 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 1222 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, φ11cn which is negated at the input. The output of AND gate 1222 is provided at the 0 input to multiplexer 1224. Multiplexer 1224 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 1224 is controlled by conditional negation of bit φ12 of Group 2, φ12cn. Multiplexer 1225 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 1225 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits and the output of multiplexer 1225 is provided at the 1 input of multiplexer 1226. In an alternative embodiment, multiplexer 1225 could be controlled by the ‘cn’ versions of the control signals (e.g., φ11cn⊕φ12cn) since the XOR output would be unaffected by inverting both inputs.
Multiplexer 1226 is controlled by the conditional negation of bit φ13 of Group 2, φ13cn. The output of multiplexer 1226 is provided as a first input to exclusive-OR gate 1228. Exclusive-OR gate 1228 receives the conditional negation of bit φ11 of Group 2, φ11cn as its second input.
As described above, in the least-significant three-bit group, a BIAS bit that is the ½-LSB is included and the odd/even octant-designating bit (bit 3) is further used to determine whether the stage output is added or subtracted into the data path. Since the three-bit sub-rotation should perform the conditional rounding before the conditional negation (cn), and since the two's complement negation just requires a negation of the sub-rotation output (see Tables 1 and 2, and see the presence of “cn” in the third stage of
In all four situations, the phase-accumulator rounding operation requires little computation beyond that required by a conventional system using phase-accumulator truncation—essentially, just a few more single-bit XOR gates. Moreover, the absence of a carry-ripple yields a system requiring less computational delay than a conventional two's-complement conditional negation implementation requires.
Subrotation module 1230 includes an AND gate 1232, three 2-to-1 multiplexers 1234, 1235, and 1236, and an exclusive-OR gate 1238. Subrotation module 1230 receives group 3, φ14φ15φ16, of the fine-stage rotation bits in addition to bit φ17 (the ½-LSB). AND gate 1232 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, the exclusive-OR of bit φ14 of group 3 of the fine rotation bits and bit φ17, which is negated at the input. The output of AND gate 1232 is provided at the 0 input to multiplexer 1234. Multiplexer 1234 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 1234 is controlled by the exclusive-OR of bit φ15 of group 3 of the fine rotation bits and bit φ17. Multiplexer 1235 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 1235 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 1234 is provided at the 0 input of multiplexer 1236 and the output of multiplexer 1235 is provided at the 1 input of multiplexer 1236. Multiplexer 1236 is controlled by the exclusive-OR of bit φ16 of group 3 and bit φ17. The output of multiplexer 1236 is provided as a first input to exclusive-OR gate 1238. Exclusive-OR gate 1238 receives the conditional negation of bit φ14 of Group 3, φ14cn, as its second input.
Magnitude scaling module 1280 includes an AND gate 1282 and three 2-to-1 multiplexers 1284, 1285, and 1286. AND gate 1282 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, φ8cn, which is negated at the input. The output of AND gate 1282 is provided at the 0 input of multiplexer 1284. Multiplexer 1284 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 1284 is controlled by φ9cn. Multiplexer 1285 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 1285 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. In an alternative embodiment, multiplexer 1285 could be controlled by the ‘cn’ versions of the control signals (e.g., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both inputs. The output of multiplexer 1284 is provided at the 0 input of multiplexer 1286 and the output of multiplexer 1285 is provided at the 1 input of multiplexer 1286. Multiplexer 1286 is controlled by φ10cn.
Adder 1242 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 1210. Adder 1242 receives as inputs, Xk, the output of exclusive-OR gate 1218 of the first subrotation module 1210, and the negation of the output of multiplexer 1286 of the fine stage magnitude scaling module 1280. In an embodiment, adder 1242 is a carry-save adder (CSA). Adder 1242 may further receive φ8cn as the carry_in value.
Adder 1244 receives the output from adder 1242 and the subrotation value generated by the second subrotation module 1220. Thus, adder 1244 rotates Xk by the additional subrotation value generated by subrotation module 1220. In an embodiment, adder 1244 is also a carry-save adder (CSA). Adder 1244 may further receive φ11cn as the carry_in value. Similarly, adder 1246 receives the output from adder 1244 and the subrotation value generated by the third subrotation module 1230. Thus, adder 1246 rotates Xk by the additional subrotation value generated by subrotation module 1230. In an embodiment, adder 1246 is also a carry-save adder (CSA). Adder 1246 may further receive φ14cn as the carry_in value.
A final adder 1248 receives the output from adder 1246. In an embodiment, adder 1248 is a carry ripple adder. Adder 1248 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 1248 is a coordinate, Xfine, of the rotated pair of output values.
The circuit 1200B of
2.6 Improved DDS Phase Accumulator
Another use for the excess fours processor concerns improving the performance of the DDS phase accumulator. As discussed above, a typical DDS phase accumulator employs a relatively long phase word (e.g., the word length M=32 bits for the examples we have been using here, and M=48 bits has been used in commercial products described in REFs 6 and 7). When incrementing the phase accumulator by adding FCW to it, a long carry-ripple delay can be problematic. A well-known technique for increasing the frequency at which a DDS phase accumulator can be updated is to employ some form of pipelining of the phase accumulator. More details regarding this technique can be found in F. Lu, H. Samueli, J. Yuan, and C. Svensson, “A 700-MHz 24-b pipelined accumulator in 1.2-μm CMOS for application as a numerically controlled oscillator,” IEEE J. Solid-State Circuits, vol. 28, pp. 878-886, August 1993 (hereinafter “LU”), J. Vankka and K. Halonen, Direct Digital Synthesizers: Theory, Design and Applications. Dordrecht, Netherlands: Kluwer, 2001, and J. D. Betowski and V. Beiu, “Considerations for phase accumulator design for direct digital frequency synthesizers,” IEEE Int. Conf. Neural Networks & Signal Proc., Nanjing, China, Dec. 14-17, 2003, each of which is incorporated herein by reference in its entirety.
When pipelined, a 32-bit phase accumulator can run at 250-MHz in TSMC 0.18-μm CMOS. In addition to the increased hardware expense incurred by the pipelining circuitry, one residual problem remains: the inherent pipeline-induced delay and/or complexity when one desires to instantaneously change the frequency being generated—by changing FCW. (Instantaneous frequency changing is one of the very desirable capabilities of a DDS; indeed, such a feature is perhaps unique to a DDS, in comparison with other types of oscillators.) When changing to a new FCW value, in a pipelined-phase-accumulator system, it can be a problem that the least-significant part of the phase accumulator must be incremented in a previous output-data cycle to that in which the most-significant part of the phase accumulator is incremented, and solving this and related problems can require additional and more complicated circuitry and/or performance compromises.
The excess fours processor can provide an elegant solution to this phase-accumulator speed-up problem.
Upper phase accumulator half 1310A receives the upper half of the FCW (FCWH) and lower phase accumulator half 1310B receives the lower half of the FCW (FCWL). For a 32-bit FCW, the upper phase accumulator half 1310A will receive the 16 most significant bits of the FCW and the lower phase accumulator half 1310B will receive the remaining bits. Each phase accumulator half includes an adder and a register. Adder 1312B of the lower phase accumulator half 1310B receives as input a portion of the frequency control word (FCWL) and the output of register 1314B. Adder 1312B provides an output to register 1314B. In addition, adder 1312B outputs C_out. C_out, the carry-out bit of the least-significant part, is held in a single-bit register (SBR) 1318. This bit is used on the next phase-accumulator updating cycle as a carry-in bit for adder 1312A of upper phase accumulator half 1310A, which ensures that the sequence of most significant parts of the phase accumulator (along with the bit held in the single-bit register) will always contain the correct values. (The least-significant part of the phase accumulator will, of course, always have the correct value.) Notice that, if one desires to change the DDS frequency instantaneously, at an arbitrary time, it suffices simply to change both halves of the FCW at that time. No pipelining is used and no undesired transient or synchronization issues arise; both upper and lower FCW halves are added into the upper and lower halves of the phase accumulator simultaneously. When the upper FCW half is used to increment the upper half of the phase accumulator, the C_in bit from the single-bit register is included in the normal manner—no special processing is needed.
The upper half of the phase accumulator 1310A is used as the W-bit truncated phase accumulator value. Here, the processing can proceed normally with the exception that the C_out value that gets stored in SBR 1318 must also be included as a part of the truncated phase accumulator value used by the DDS. This is where the excess fours processor elegantly provides the required capability. It processes this bit without requiring additional carry-ripple delays.
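The bookkeeping of the split phase accumulator can be illustrated with a short simulation. The Python sketch below is a behavioral model under stated assumptions: a 32-bit FCW split into two 16-bit halves, all registers starting at zero, and hypothetical names (`simulate_split_accumulator`, `HALF`, `HALF_MASK`). It runs a conventional 32-bit accumulator alongside the split version and changes the FCW partway through the run.

```python
HALF = 16
HALF_MASK = (1 << HALF) - 1
FULL_MASK = (1 << (2 * HALF)) - 1


def simulate_split_accumulator(fcw_sequence):
    """Return (reference_upper, upper_register, sbr) after each update cycle."""
    full = 0                      # conventional 32-bit accumulator (reference)
    upper = lower = sbr = 0       # split halves plus the single-bit register
    history = []
    for fcw in fcw_sequence:
        full = (full + fcw) & FULL_MASK
        fcw_h, fcw_l = fcw >> HALF, fcw & HALF_MASK
        low_sum = lower + fcw_l
        upper = (upper + fcw_h + sbr) & HALF_MASK  # carry-in is last cycle's C_out
        lower = low_sum & HALF_MASK
        sbr = low_sum >> HALF                      # this cycle's C_out, held for next cycle
        history.append((full >> HALF, upper, sbr))
    return history


# The FCW changes abruptly after 300 cycles; both halves change in the same cycle.
trace = simulate_split_accumulator([0x12345678] * 300 + [0xFEDCBA98] * 300)
mismatches = sum(((u + s) & HALF_MASK) != ref for ref, u, s in trace)
print("cycles where upper + SBR differs from the reference:", mismatches)
```

In this model the upper register alone can lag the reference by one count, while the upper register plus the held carry bit always reproduces the upper half of the conventional accumulator, including across the FCW change.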
Note that while there is no apparent purpose to be served by tracking the normal occurrence of overflows of the phase accumulator during the real-time operation of a DDS, if an application does need this information for some special purpose then subsequent observations will give useful insights and techniques to easily achieve this goal. The one-cycle deferring of the inclusion of the SBR input into the {circumflex over (φ)}H update can, conceivably, delay the appearance of a normal {circumflex over (φ)}H overflow (even though our DDS architecture assures normal behavior with respect to the DDS output sequence).
DDS 1300 further includes a conditional two's complement negation mapping module 1332, coarse rotation stage 1336, augmented excess fours fine-rotation processor 1338, and an output stage 1350. Conditional two's complement negation mapping module 1332 receives bit {circumflex over (φ)}3 and bits {circumflex over (φ)}4{circumflex over (φ)}5 . . . {circumflex over (φ)}16 from the output of register 1314A of the first part of the phase accumulator 1310A as inputs. The conditional two's complement negation mapping module 1332 generates a first output φ4φ5φ6φ7 and a second output φ8φ9 . . . φ16. Coarse rotation stage 1336 receives as input, bits φ4φ5φ6φ7 from the conditional two's complement mapping module 1332. Coarse rotation stage 1336 outputs coordinates (X, Y), values (πX/4, πY/4) and values (3πX/4, 3πY/4) to fine rotation processor 1338. Fine rotation processor 1338 further receives as input, C_out, a bit that is also being sent to the single bit register 1318, bits φ8φ9 . . . φ16 from conditional two's complement negation mapping module 1332, and bit {circumflex over (φ)}3 from the first part of phase accumulator 1310A.
By comparing the system of
The output of fine rotation processor 1338 is provided as an input to output stage 1350. Output stage 1350 further receives as input, bits {circumflex over (φ)}1{circumflex over (φ)}2{circumflex over (φ)}3 from the first part of the phase accumulator 1310A. The output of output stage 1350 is cos 2π{circumflex over (φ)} and/or sin 2π{circumflex over (φ)}.
2.7 Augmented Excess Fours Processor
The augmented excess fours processor for the
The Table 3 fine-stage phase word is again divided into three-bit groups, as in
The BIAS=0 column of Table 3 is addressed by using the φa φb φc bits directly. The BIAS=1 column is addressed by using inverted bits
BIAS=0 applies when C_out=1 and {circumflex over (φ)}3=1;
BIAS=1 applies when C_out=0;
BIAS=2 applies when C_out=1 and {circumflex over (φ)}3=0. (2)
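The three conditions above amount to a small decoder. The Python sketch below restates them directly; the function names `bias_value` and `bias_one_hot` are hypothetical, and `phi3_hat` denotes the octant-designating bit.

```python
def bias_value(c_out, phi3_hat):
    """Return the BIAS column (0, 1, or 2) selected by C_out and the octant bit."""
    if c_out == 0:
        return 1
    return 0 if phi3_hat else 2


def bias_one_hot(c_out, phi3_hat):
    """Return (BIAS0, BIAS1, BIAS2) as one-hot select signals."""
    bias0 = c_out & phi3_hat
    bias1 = c_out ^ 1
    bias2 = c_out & (phi3_hat ^ 1)
    return bias0, bias1, bias2


for c_out in (0, 1):
    for phi3_hat in (0, 1):
        print(c_out, phi3_hat, bias_value(c_out, phi3_hat), bias_one_hot(c_out, phi3_hat))
```

Exactly one of the three select signals is active for each input combination, so the signals can drive a multiplexed choice among the three table columns.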
As mentioned above, the mapping of the φa φb φc control bits causes φa to be replaced by φa⊕φb and φb to be replaced by
For example, when φaφb φc=“000,” Table 3 shows that when BIAS=2 the value is treated as “−2” (due to the excess four feature as well as the BIAS). Thus, ideally this case is processed as the BIAS=0 processing would do if the input bit pattern were “010” (see the BIAS=0 entry in Table 3 for that bit pattern). The BIAS=2 specifications above give the results A=0, B=1, and C=0 when φa φb φc=“000.”
The manner in which As might differ from A when BIAS=2 reflects the minor alteration required by the last one or two entries in the rightmost column of Table 3. In Table 3 it is evident that there is just one element in the BIAS=2 column that is outside the usual parameter range for the sub-stage processor, and that is the bottom-row value 5 (101). This value is represented as “8-3” because both the eight and the three can easily be implemented. There is another irregular feature exhibited by the BIAS=2 values of Table 3, and that is the sign of the output produced by the sub-stage processor. Here, output negations for φa φb φc entries having bit patterns 000, 001, and 111 (the top two entries and the “−3” part of the bottom one) are required. There is also the row where φa φb φc=010 for which the output sign could be either plus or minus, since the output is zero. Beyond this, however, the value of four that applies to the next-to-last row of Table 3 may be represented as “8-4.” Then, the higher-order sub-stage processor can be requested to provide a rotation of 8 for both of the bottom two rows. This makes it easier to find a simple expression for the sign-bit control signal for the BIAS=2 column. Namely, φa ⊕φb can now specify the situations in which the application of an output negation is desired. If the reliance on another sub-stage processor's help is limited to just the “8-3” case, then a slightly more complicated logical expression for specifying just the three rows (1, 2, 7) for negation would be required.
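The BIAS=2 handling can be explored with a behavioral model. The Python sketch below rests on several assumptions that should be read as hypotheses: the BIAS=2 entries are taken to be v − 2 for a three-bit value v (consistent with the values −2, 4, and 5 quoted in the text), the control-bit remapping is taken to be A=φa⊕φb, B=NOT φb, C=φc (consistent with the A=0, B=1, C=0 result quoted for “000”), and the function names are hypothetical.

```python
def bias0_substage(a, b, c):
    """Excess-four mux tree for BIAS=0; returns the signed rotation in units."""
    and_out = 0 if a else 4
    low = 2 if b else and_out
    high = 1 if (a ^ b) else 3
    magnitude = high if c else low
    return magnitude if a else -magnitude


def bias2_substage(a, b, c):
    """Assumed BIAS=2 remapping; returns (local rotation, request for +8)."""
    A, B, C = a ^ b, b ^ 1, c        # assumed remapping of the control bits
    request_eight = a & b            # rows 110 and 111 use "8-4" and "8-3"
    return bias0_substage(A, B, C), request_eight


for v in range(8):
    a, b, c = (v >> 2) & 1, (v >> 1) & 1, v & 1
    local, eight = bias2_substage(a, b, c)
    print(f"{v:03b}: local={local:+d}, plus 8? {eight}, total={local + 8 * eight:+d}")
```

In this model the local output is negative only in rows where φa⊕φb is 0, and the total (local output plus the requested 8) equals v − 2 in every row.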
The excess four method employs the existence of an excess (four) rotation amount in the rotated values stored in the coarse-rotation ROMs. For a specific rotation sub stage, this amount is an excess rotation in the positive (counter-clockwise) direction of four binary units “100.” This is why, in Table 1 and Table 3, the BIAS=0 column shows the first entry as “−4,” since that is what would be required to compensate for the presence of the excess four amount that was built in. Clearly, the minus part of the −4 rotation the tables call for here is a rotation in the clockwise direction. From this insight we establish that references made throughout this document to “negating the output” of an excess-four sub stage are, in fact, referring to the direction of the rotation being negative, i.e., clockwise.
Notice that, from equation (1) with α>0, the negative “−α” appearing in the first row of the 2×2 matrix causes a positive rotation: the first row computation is X−αY and since both X and Y have positive values throughout Octant 0, it is clear that the result of X−αY is to reduce the positive value of X. Similarly, the second-row computation of equation (1) is αX+Y, which shows that the positive value Y is made larger. Obviously, making the X coordinate smaller and making the Y coordinate larger is consistent with a positive rotation in Octant 0. Such insight makes it clear that, for example, the first four BIAS=0 table entries being negative, hence calling for clockwise rotations, dictate that, in equation (3), As=φa=0. This, in terms of equation (1), makes the first row become X+αY (and the second row, −αX+Y)—clearly a negative (clockwise) rotation (increasing X, and decreasing Y).
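The direction argument can be checked numerically. The Python sketch below is illustrative only: the starting angle of 0.3 radians and the value α=2^-8 are arbitrary choices, and the function name `small_rotate` is hypothetical. It applies the two row computations X−αY and αX+Y and compares the resulting angle with the starting angle.

```python
import math


def small_rotate(x, y, alpha):
    """Apply the row computations X - alpha*Y and alpha*X + Y."""
    return x - alpha * y, alpha * x + y


theta = 0.3                                  # an arbitrary point inside Octant 0
x, y = math.cos(theta), math.sin(theta)
alpha = 2.0 ** -8

xp, yp = small_rotate(x, y, alpha)           # alpha > 0: counter-clockwise
xn, yn = small_rotate(x, y, -alpha)          # sign flipped: clockwise

print("positive:", math.atan2(yp, xp) - theta)
print("negative:", math.atan2(yn, xn) - theta)
print("length after one step:", math.hypot(xp, yp))
```

The angle changes by atan(α) in the stated direction, and the vector length grows by a factor of sqrt(1+α²), which is second order in α.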
A final insight resulting from this discussion is that no matter which rotation direction applies, all excess-four sub stage implementations will have opposite negation/no-negation specifications for the control signals applied to the substage output XOR gates (e.g., φ8 is applied to XOR gate 618 while
Subrotation module 1510 includes an AND gate 1512, three 2-to-1 multiplexers 1514, 1515, and 1516, and an exclusive-OR gate 1518. Subrotation module 1510 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. AND gate 1512 receives as a first input, (π/4)Yk, shifted by 5 and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 1512 is provided at the 0 input to multiplexer 1514. Multiplexer 1514 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 1514 is controlled by bit φ9 of the three-bit input group. Multiplexer 1515 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 1515 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. The output of multiplexer 1514 is provided at the 0 input of multiplexer 1516 and the output of multiplexer 1515 is provided at the 1 input of multiplexer 1516. Multiplexer 1516 is controlled by bit φ10 of group 1. The output of multiplexer 1516 is provided as a first input to exclusive-OR gate 1518. Exclusive-OR gate 1518 receives bit φ8 of group 1 as its second input.
Subrotation module 1520 includes an AND gate 1522, three 2-to-1 multiplexers 1524, 1525, and 1526, and an exclusive-OR gate 1528. Subrotation module 1520 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 1522 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, φ11stg3 where φkstg3=φk⊕(φ14 ∩φ15 ∩BIAS0), for k=11, 12 and 13.
The output of AND gate 1522 is provided at the 0 input to multiplexer 1524. Multiplexer 1524 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 1524 is controlled by φ12stg3. Multiplexer 1525 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 1525 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. The output of multiplexer 1524 is provided at the 0 input of multiplexer 1526 and the output of multiplexer 1525 is provided at the 1 input of multiplexer 1526. Multiplexer 1526 is controlled by φ13stg3. The output of multiplexer 1526 is provided as a first input to exclusive-OR gate 1528. Exclusive-OR gate 1528 receives φ11stg3 as its second input. Note that the modifications to the control bits of subrotation module 1520 yield the needed “8” rotations in Table 3.
Subrotation module 1530 is described above in reference to
$\mathrm{BIAS0} = C_{out} \cap \hat{\varphi}_3$
$\mathrm{BIAS1} = \overline{C_{out}}$
$\mathrm{BIAS2} = C_{out} \cap \overline{\hat{\varphi}_3}$
A, B, etc., of equation (3) above can be defined as:
A=(φ14∩BIAS0)∪(
Fine stage magnitude scaling module 1580 is configured to provide magnitude scaling for the fine rotation stage 1500. The fine stage magnitude scaling module 1580 of
Magnitude scaling module 1580 includes an AND gate 1582 and three 2-to-1 multiplexers 1584, 1585, and 1586. AND gate 1582 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, bit φ8 of group 1, which is negated at the input. The output of AND gate 1582 is provided at the 0 input of multiplexer 1584. Multiplexer 1584 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 1584 is controlled by bit φ9 of the three-bit input group. Multiplexer 1585 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 1585 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. The output of multiplexer 1584 is provided at the 0 input of multiplexer 1586 and the output of multiplexer 1585 is provided at the 1 input of multiplexer 1586. Multiplexer 1586 is controlled by bit φ10 of group 1.
Adder 1542 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 1510. Adder 1542 receives as inputs, Xk, the output of exclusive-OR gate 1518 of the first subrotation module 1510, and the negation of the output of multiplexer 1586 of the fine stage magnitude scaling module 1580. In an embodiment, adder 1542 is a carry-save adder (CSA). Adder 1542 may further receive bit φ8 of group 1 of the fine-stage rotation bits as the carry_in value.
Adder 1544 receives the output from adder 1542 and the subrotation value generated by the second subrotation module 1520. Thus, adder 1544 rotates Xk by the additional subrotation value generated by subrotation module 1520. In an embodiment, adder 1544 is also a carry-save adder (CSA). Adder 1544 may further receive φ11stg3 as the carry_in value. Similarly, adder 1546 receives the output from adder 1544 and the subrotation value generated by the third subrotation module 1530. Thus, adder 1546 rotates Xk by the additional subrotation value generated by subrotation module 1530. In an embodiment, adder 1546 is also a carry-save adder (CSA). Adder 1546 may further receive As as the carry_in value.
A final adder 1548 receives the output from adder 1546. In an embodiment, adder 1548 is a carry ripple adder. Adder 1548 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 1548 is a coordinate, Xfine, of the rotated pair of output values.
The circuit 1500B of
2.8 Other Split Phase-Accumulator DDS Systems
Upper phase accumulator half 1610A receives the upper half of the FCW (FCWH) and lower phase accumulator half 1610B receives the lower half of the FCW (FCWL). Each phase accumulator half includes an adder and a register. Adder 1612B of the lower phase accumulator half 1610B receives as input a portion of the frequency control word (FCWL) and the output of register 1614B. Adder 1612B provides an output to register 1614B. In addition, adder 1612B outputs C_out. C_out, the carry-out bit of the lower (least-significant) phase-accumulator half, is held in a single-bit register (SBR) 1618. This bit is used on the next phase-accumulator updating cycle as a carry-in bit for adder 1612A of upper phase accumulator half 1610A, which ensures that the sequence of most significant parts of the phase accumulator (along with the bit currently held in the single-bit register) will always contain the correct values.
The upper half of the phase accumulator 1610A is used as the W-bit truncated phase accumulator value. Here, the processing can proceed normally with the exception that the C_out value that gets stored in SBR 1618 must also be included as a part of the truncated phase accumulator value used by the DDS.
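The behavior of the two-part split can be sketched in Python as follows. This is a behavioral model under stated assumptions, not the circuit itself: the class name `SplitAccumulator`, its attribute names, and the convention that the truncated phase is read from the registered values together with the bit held in the single-bit register are all illustrative choices. The check compares the model against an ordinary 32-bit accumulator.

```python
HALF = 16
HALF_MASK = (1 << HALF) - 1
FULL_MASK = (1 << (2 * HALF)) - 1


class SplitAccumulator:
    """16/16 split accumulator with a one-cycle-delayed carry (hypothetical model)."""

    def __init__(self):
        self.low = 0    # lower half register
        self.high = 0   # upper half register
        self.sbr = 0    # single-bit register holding the delayed carry

    def update(self, fcw):
        low_sum = self.low + (fcw & HALF_MASK)
        # The upper adder uses the carry saved on the previous cycle.
        high_sum = self.high + ((fcw >> HALF) & HALF_MASK) + self.sbr
        self.low = low_sum & HALF_MASK
        self.high = high_sum & HALF_MASK
        self.sbr = low_sum >> HALF      # C_out of the lower half, saved for next cycle

    def truncated_phase(self):
        # The pending carry must be included in the W-bit truncated value.
        return (self.high + self.sbr) & HALF_MASK


def matches_reference(fcw, cycles=1000):
    """Compare the split model with a plain 32-bit accumulator."""
    acc, ref = SplitAccumulator(), 0
    for _ in range(cycles):
        acc.update(fcw)
        ref = (ref + fcw) & FULL_MASK
        if acc.truncated_phase() != ref >> HALF or acc.low != ref & HALF_MASK:
            return False
    return True


if __name__ == "__main__":
    print(all(matches_reference(f) for f in (0x0001FFFF, 0x12345678, 0xFFFFFFFF)))
```

The model makes the point of the paragraph above concrete: the upper half alone lags the true value by the pending carry, while the upper half together with the bit held in the single-bit register equals the upper 16 bits of a conventional accumulator.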
DDS 1600 further includes an exclusive-OR gate 1625. Exclusive-OR gate 1625 receives as a first input, bit φ̂3, and as a second input, bits φ̂4φ̂5φ̂6φ̂7. The output of exclusive-OR gate 1625, φ4φ5φ6φ7, is provided as input to coarse rotation stage 1636. Coarse rotation stage 1636 outputs coordinate pair (X, Y), values (πX/4, πY/4), and values (3πX/4, 3πY/4) to excess-fours fine rotation “cn” and phase accumulator rounding stage 1638. Fine rotation stage 1638 further receives as input bits φ̂8φ̂9 . . . φ̂16, bit φ̂3, and the C_out bit being fed to the single-bit register 1618.
The output of fine rotation stage 1638 is provided as an input to output stage 1650. Output stage 1650 further receives as input bits φ̂1φ̂2φ̂3 from the first part of the phase accumulator 1610A. The output of output stage 1650 is cos 2πφ̂ and/or sin 2πφ̂.
Subrotation module 1710 includes an AND gate 1712, three 2-to-1 multiplexers 1714, 1715, and 1716, and an exclusive-OR gate 1718. Subrotation module 1710 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. AND gate 1712 receives as a first input, (π/4)Yk, shifted by 5 and as a second input, φ8cn (referred to herein as the ‘cn’ version) which is negated at the input. The ‘cn’ version is determined using the equation:
φkcn = φk ⊕ φ̂3, for k = 8, . . . , 14
The output of AND gate 1712 is provided at the 0 input to multiplexer 1714. Multiplexer 1714 receives (π/4)Yk shifted by 6 at its 1 input. Multiplexer 1714 is controlled by φ9cn. Multiplexer 1715 receives (3π/4)Yk shifted by 7 at its 0 input and (π/4)Yk shifted by 7 at its 1 input. Multiplexer 1715 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1 of the fine-stage rotation bits. In an alternative embodiment, multiplexer 1715 could be controlled by the ‘cn’ versions of the control signals (i.e., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both inputs.
The output of multiplexer 1714 is provided at the 0 input of multiplexer 1716 and the output of multiplexer 1715 is provided at the 1 input of multiplexer 1716. Multiplexer 1716 is controlled by φ10cn. The output of multiplexer 1716 is provided as a first input to exclusive-OR gate 1718. Exclusive-OR gate 1718 receives φ8cn as its second input.
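The selection performed by the AND gate and the three multiplexers of subrotation module 1710 can be sketched in Python. The sketch counts magnitudes in units of (π/4)Yk shifted by 7, so that the shift-by-5 term is 4 units, the shift-by-6 term is 2 units, and the (3π/4)Yk shift-by-7 term is 3 units. The function name, the unit convention, and the reading of the result as the magnitude of an excess-four value are assumptions made for illustration.

```python
from itertools import product


def subrotation_units(c8, c9, c10, x8, x9):
    """Model of the mux tree; c* are the 'cn' control bits, x8/x9 the plain bits.

    Returns (magnitude in units of (pi/4)*Yk >> 7, inversion control bit).
    """
    and_out = 0 if c8 else 4            # AND gate 1712: 4-unit term gated by inverted control
    mux_1714 = 2 if c9 else and_out     # 1-input carries the 2-unit term
    mux_1715 = 1 if (x8 ^ x9) else 3    # 0-input: 3 units, 1-input: 1 unit
    mux_1716 = mux_1715 if c10 else mux_1714
    return mux_1716, c8                 # XOR gate 1718 is driven by the c8 control bit


def cn(bit, phi3_hat):
    """Conditional negation of a rotation bit."""
    return bit ^ phi3_hat


if __name__ == "__main__":
    for b8, b9, b10, phi3 in product((0, 1), repeat=4):
        c8, c9, c10 = cn(b8, phi3), cn(b9, phi3), cn(b10, phi3)
        # Inverting both inputs leaves the XOR unchanged.
        assert (c8 ^ c9) == (b8 ^ b9)
        magnitude, _ = subrotation_units(c8, c9, c10, b8, b9)
        assert magnitude == abs(4 * c8 + 2 * c9 + c10 - 4)
    print("mux tree magnitude equals |value - 4| for every control pattern")
```

The exhaustive loop also confirms the remark above that multiplexer 1715 may be controlled by either the plain or the ‘cn’ versions of the two bits.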
Subrotation module 1720 includes an AND gate 1722, three 2-to-1 multiplexers 1724, 1725, and 1726, and an exclusive-OR gate 1728. Subrotation module 1720 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 1722 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, φ11cn; which is negated at the input. The output of AND gate 1722 is provided at the 0 input to multiplexer 1724. Multiplexer 1724 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 1724 is controlled by the conditional negation of bit φ12 of Group 2, φ12cn. Multiplexer 1725 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 1725 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. The output of multiplexer 1724 is provided at the 0 input of multiplexer 1726 and the output of multiplexer 1725 is provided at the 1 input of multiplexer 1726. In an alternative embodiment, multiplexer 1725 could be controlled by the ‘cn’ versions of the control signals (i.e., φ11cn⊕φ12cn) since the XOR output would be unaffected by inverting both inputs.
Multiplexer 1726 is controlled by the conditional negation of bit φ13 of Group 2, φ13cn. The output of multiplexer 1726 is provided as a first input to exclusive-OR gate 1728. Exclusive-OR gate 1728 receives the conditional negation of bit φ11 of Group 2, φ11cn as its second input.
Subrotation module 1730 includes an AND gate 1732, three 2-to-1 multiplexers 1734, 1735, and 1736, and an exclusive-OR gate 1738. Subrotation module 1730 receives group 3, φ14φ15φ16, of the fine-stage rotation bits in addition to C_out. AND gate 1732 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, the exclusive-OR of bit φ14 of group 3 of the fine rotation bits and C_out. The output of AND gate 1732 is provided at the 0 input to multiplexer 1734. Multiplexer 1734 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 1734 is controlled by the exclusive-OR of bit φ15 and C_out. Multiplexer 1735 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 1735 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 1734 is provided at the 0 input of multiplexer 1736 and the output of multiplexer 1735 is provided at the 1 input of multiplexer 1736. Multiplexer 1736 is controlled by the exclusive-OR of bit φ16 of group 3 and C_out. The output of multiplexer 1736 is provided as a first input to exclusive-OR gate 1738. Exclusive-OR gate 1738 receives the conditional negation of bit φ14 of Group 3, φ14cn, as its second input.
Magnitude scaling module 1780 includes an AND gate 1782 and three 2-to-1 multiplexers 1784, 1785, and 1786. AND gate 1782 receives as a first input, (π/4)Xk, shifted by 11 and as a second input, φ8cn, which is negated at the input. The output of AND gate 1782 is provided at the 0 input of multiplexer 1784. Multiplexer 1784 receives (π/4)Xk shifted by 13 at its 1 input. Multiplexer 1784 is controlled by φ9cn. Multiplexer 1785 receives (π/4)Xk shifted by 12 at its 0 input and (π/4)Xk shifted by 15 at its 1 input. Multiplexer 1785 is controlled by the exclusive-OR of bits φ8 and φ9 of group 1. In an alternative embodiment, multiplexer 1785 could be controlled by the ‘cn’ versions of the control signals (i.e., φ8cn⊕φ9cn) since the XOR output would be unaffected by inverting both inputs. The output of multiplexer 1784 is provided at the 0 input of multiplexer 1786 and the output of multiplexer 1785 is provided at the 1 input of multiplexer 1786. Multiplexer 1786 is controlled by φ10cn.
Adder 1742 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 1710. Adder 1742 receives as inputs, Xk, the output of exclusive-OR gate 1718 of the first subrotation module 1710, and the negation of the output of multiplexer 1786 of the fine stage magnitude scaling module 1780. In an embodiment, adder 1742 is a carry-save adder (CSA). Adder 1742 may further receive φ8cn as the carry_in value.
Adder 1744 receives the output from adder 1742 and the subrotation value generated by the second subrotation module 1720. Thus, adder 1744 rotates Xk by the additional subrotation value generated by subrotation module 1720. In an embodiment, adder 1744 is also a carry-save adder (CSA). Adder 1744 may further receive φ11cn as the carry_in value. Similarly, adder 1746 receives the output from adder 1744 and the subrotation value generated by the third subrotation module 1730. Thus, adder 1746 rotates Xk by the additional subrotation value generated by subrotation module 1730. In an embodiment, adder 1746 is also a carry-save adder (CSA). Adder 1746 may further receive φ14cn as the carry_in value.
A final adder 1748 receives the output from adder 1746. In an embodiment, adder 1748 is a carry ripple adder. Adder 1748 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module, making it into a two's-complement correction. The output of adder 1748 is a coordinate, Xfine, of the rotated complex number.
The circuit 1700B of
2.9 Other Phase-Accumulator Splits
The 16/16 two-part split in the 32-bit phase accumulator systems of
Lower portion 1810C includes an adder 1812C coupled to a register 1814C. Adder 1812C receives as inputs the 11 least-significant bits of the FCW (FCWL) and the output of register 1814C. Adder 1812C outputs C_out1 to single-bit register (SBR1) 1818. Middle portion 1810B includes an adder 1812B coupled to a register 1814B. Adder 1812B receives as inputs 11 bits of the FCW (FCWM), the output of SBR1 1818 (C_in1), and the output of register 1814B. Adder 1812B outputs C_out2 to single-bit register (SBR2) 1819.
Upper portion 1810A includes an adder 1812A coupled to a register 1814A. Adder 1812A receives as inputs the 10 most-significant bits of the FCW (FCWH), the output of SBR2 1819 (C_in2), and the output of register 1814A. The outputs of registers 1814A, B, and C are provided to truncation module 1820. Truncation module 1820 is configured to cut off a least-significant part of the received input; in this example it cuts off the 16 least-significant bits of the phase accumulator output, leaving a truncated normalized rotation angle having 16 bits.
Adder 1912C of lower portion 1910C receives as inputs, FCWL and the output of register 1914C. Adder 1912C outputs C_out1 to single-bit register (SBR1) 1918. This bit is used on the next phase-accumulator updating cycle as an input (C_in1) to adder 1912B of middle portion 1910B. Adder 1912B further receives as input FCWM and the output of register 1914B. Adder 1912B outputs C_out2 to single-bit register (SBR2) 1919. This bit is used on the next phase-accumulator updating cycle as an input (C_in2) to adder 1912A of upper portion 1910A. The output of the phase accumulator may be further truncated. For example, in
DDS 1900 further includes an exclusive-OR gate 1925. Exclusive-OR gate 1925 receives as a first input, bit φ̂3, and as a second input, bits φ̂4φ̂5φ̂6φ̂7. The output of exclusive-OR gate 1925, φ4φ5φ6φ7, is provided as input to coarse rotation stage 1936. Coarse rotation stage 1936 outputs coordinates (X, Y), values (πX/4, πY/4), and values (3πX/4, 3πY/4) to excess-fours fine rotation “cn” and phase accumulator rounding stage 1938.
DDS 1900 also includes an AND gate 1980. AND gate 1980 receives as a first input, C_out1 from adder 1912C and as a second input, a special “five_ones” output from adder 1912B. The output of AND gate 1980, C_out16, is provided as input to fine rotation stage 1938. Fine rotation stage 1938 further receives as input bits φ̂11φ̂12 . . . φ̂16, bits φ̂8φ̂9φ̂10, and C_out2 from adder 1912B.
The special “five_ones” output bit of the adder 1912B for φ̂M in
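The bookkeeping of the 10/11/11 split can be sketched in Python as a behavioral model. Several points are assumptions made for illustration: the class and attribute names are hypothetical, the truncated phase is read from the registered values together with the bits held in the two single-bit registers, and “five_ones” is taken to mean that the five least-significant bits of the middle part are all ones. Under those assumptions, the model reproduces the upper 16 bits of a conventional 32-bit accumulator.

```python
class ThreePartAccumulator:
    """10/11/11 split accumulator with two delayed carries (hypothetical model)."""

    def __init__(self):
        self.low = self.mid = self.high = 0
        self.sbr1 = self.sbr2 = 0

    def update(self, fcw):
        low_sum = self.low + (fcw & 0x7FF)
        mid_sum = self.mid + ((fcw >> 11) & 0x7FF) + self.sbr1    # C_in1
        high_sum = self.high + ((fcw >> 22) & 0x3FF) + self.sbr2  # C_in2
        self.low, self.sbr1 = low_sum & 0x7FF, low_sum >> 11      # C_out1
        self.mid, self.sbr2 = mid_sum & 0x7FF, mid_sum >> 11      # C_out2
        self.high = high_sum & 0x3FF

    def truncated_phase(self):
        five_ones = int((self.mid & 0x1F) == 0x1F)  # assumed meaning of "five_ones"
        c_out16 = self.sbr1 & five_ones             # AND gate: carry into bit 16
        word = (self.high << 6) | (self.mid >> 5)   # ten upper bits, six middle bits
        # The C_out2 carry enters at the boundary between bits 10 and 11.
        return (word + c_out16 + (self.sbr2 << 6)) & 0xFFFF


def matches_reference(fcw, cycles=2000):
    """Compare against the upper 16 bits of a plain 32-bit accumulator."""
    acc, ref = ThreePartAccumulator(), 0
    for _ in range(cycles):
        acc.update(fcw)
        ref = (ref + fcw) & 0xFFFFFFFF
        if acc.truncated_phase() != ref >> 16:
            return False
    return True


if __name__ == "__main__":
    print(all(matches_reference(f) for f in (0x0000F801, 0x12345678, 0xFFFFFFFF)))
```

The sketch shows why two separate carry signals reach the fine stage: one has the weight of the least-significant truncated bit, while the other has the weight of the least-significant bit of the upper part.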
The above discussion of the implementation of the three-way split (10/11/11) phase accumulator for a DDS can be extended to various other split phase accumulators. Splitting a 32-bit phase accumulator into three parts of 10/11/11 bits, which yields an eleven-bit carry-ripple delay, may not seem practical, since other processing delays within the overall system may already exceed so short a delay. A three-part split could, however, be a useful means of implementing a 48-bit phase accumulator having three 16-bit phase accumulator parts.
Subrotation module 2110 receives the group 1, φ8φ9φ10, of the fine-stage rotation bits. There are interesting special issues that relate to the implementation of this most-significant φ8φ9φ10-driven sub-stage. The reason for this difference is that the way the phase accumulator has been split in this example causes a C_out2 bit to appear between the MSB fine-rotation sub-stage and the other two fine-rotation sub-stages. A need exists to accommodate the 0/1 possibilities of C_out2 as well as to accommodate the conditional negation that all three sub-stages must deal with. This presents four combinations of cases for this MSB sub-stage 2110, which is a somewhat similar situation to issues encountered in an embodiment described above. Several choices exist as to how the DDS fine stage can be designed, each with their own advantages and disadvantages.
In a first approach, normal excess-fours processing is employed for this sub-stage rotation. This approach is suggested by the treatment of bits φ8φ9φ10 in
Another approach is to use an excess-three stage for this sub-rotation. Then, when φ̂3=1, Table 3, BIAS=0 processing is required when C_out2=0 and BIAS=1 processing is required when C_out2=1. In both cases, the output result must be negated. Similarly, when φ̂3=0, Table 3, BIAS=1 processing is required when C_out2=0, and Table 3, BIAS=2 processing is required when C_out2=1. Thus, the “+5” result is needed when φ8φ9φ10=111 and C_out2=1. In the previous encounter with a similar situation, the technique was used wherein 5 was represented as “8−3” and a higher-order fine-stage processor (i.e., the middle processor in the
One solution to the DDS design problem would require a slight increase in coarse-stage ROM storage. Effectively, the ROM would provide the extra “8” rotation when needed. There is actually no difficulty in simply increasing the ROM address (specified by bits φ4φ5φ6φ7) by “0001” to get the extra “8” value needed except for the case when these bits happen to have the value “1111” and in this case we would perhaps be reluctant to modify the octant bits φ1φ2φ3. But this one situation, where C_out2=1, φ̂3=1, φ8φ9φ10=111, and φ4φ5φ6φ7=1111, could be accommodated by simply having one alternate ROM entry for both X and Y ROMs that would have an appropriate additional offset value incorporated into it that would account for the extra rotation needed due to the C_out2=1 bit. There would be several possible ways to organize the ROM-aided rotation details, but the price, in terms of additional ROM storage, would be minimal and no extra computation in the fine stage would be needed. This would possibly be the best solution for this implementation.
A second approach to solving the problem involves adding extra hardware outside the fine-stage processor. In a manner similar to the generation of the five_ones output of the middle phase accumulator adder, both three_ones and seven_ones outputs for 1912A, the top adder of
With use of the modifications to
BIAS=0 applies when C_out2=0 and φ̂3=0;
BIAS=1 applies when φ̂3⊕C_out2=1;
BIAS=2 applies when C_out2=1 and φ̂3=1.
and
Notice that for BIAS=2, this specifies that all rotations in the rightmost column of Table 3 are now positive (counter-clockwise) except for the top two entries. (And also recognize that the bottom entry in the column is actually a “don't care” for the
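The three BIAS conditions listed above can be written as a small Python selector. The function name `select_bias` is hypothetical; the sketch simply transcribes the three conditions and then notes, as an observation rather than a statement from the figures, that the selected value equals the arithmetic sum of the two control bits.

```python
def select_bias(c_out2, phi3_hat):
    """Return the Table 3 BIAS value selected by the two control bits."""
    if c_out2 == 0 and phi3_hat == 0:
        return 0
    if c_out2 ^ phi3_hat:
        return 1
    return 2    # remaining case: both bits are one


if __name__ == "__main__":
    for c_out2 in (0, 1):
        for phi3_hat in (0, 1):
            bias = select_bias(c_out2, phi3_hat)
            # The three conditions are equivalent to adding the two bits.
            assert bias == c_out2 + phi3_hat
            print(c_out2, phi3_hat, bias)
```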
Subrotation module 2120 includes an AND gate 2122, three 2-to-1 multiplexers 2124, 2125, and 2126, and an exclusive-OR gate 2128. Subrotation module 2120 receives group 2, φ11φ12φ13, of the fine-stage rotation bits. AND gate 2122 receives as a first input, (π/4)Yk, shifted by 8 and as a second input, the conditionally negated bit φ11cn, which is negated at the input. The output of AND gate 2122 is provided at the 0 input to multiplexer 2124. Multiplexer 2124 receives (π/4)Yk shifted by 9 at its 1 input. Multiplexer 2124 is controlled by the conditionally negated bit φ12 of Group 2, φ12cn. Multiplexer 2125 receives (3π/4)Yk shifted by 10 at its 0 input and (π/4)Yk shifted by 10 at its 1 input. Multiplexer 2125 is controlled by the exclusive-OR of bits φ11 and φ12 of group 2 of the fine rotation bits. The output of multiplexer 2124 is provided at the 0 input of multiplexer 2126 and the output of multiplexer 2125 is provided at the 1 input of multiplexer 2126. In an alternative embodiment, multiplexer 2125 could be controlled by the ‘cn’ versions of the control signals (i.e., φ11cn⊕φ12cn) since the XOR output would be unaffected by inverting both inputs.
Subrotation module 2130 includes an AND gate 2132, three 2-to-1 multiplexers 2134, 2135, and 2136, and an exclusive-OR gate 2138. Subrotation module 2130 receives group 3, φ14φ15φ16, of the fine-stage rotation bits. AND gate 2132 receives as a first input, (π/4)Yk, shifted by 11 and as a second input, φ14Cout16, which is negated at the input. The output of AND gate 2132 is provided at the 0 input to multiplexer 2134. Multiplexer 2134 receives (π/4)Yk shifted by 12 at its 1 input. Multiplexer 2134 is controlled by φ15Cout16. Multiplexer 2135 receives (3π/4)Yk shifted by 13 at its 0 input and (π/4)Yk shifted by 13 at its 1 input. Multiplexer 2135 is controlled by the exclusive-OR of bits φ14 and φ15 of group 3. The output of multiplexer 2134 is provided at the 0 input of multiplexer 2136 and the output of multiplexer 2135 is provided at the 1 input of multiplexer 2136. Multiplexer 2136 is controlled by φ16Cout16. The output of multiplexer 2136 is provided as a first input to exclusive-OR gate 2138. Exclusive-OR gate 2138 receives φ14cn as its second input.
Adder 2142 is configured to rotate input coordinate, Xk, by the value generated by first subrotation module 2110. Adder 2142 receives as inputs, Xk, the output of exclusive-OR gate 2118 of the first subrotation module 2110, and the negation of the output of multiplexer 2186 of the fine stage magnitude scaling module 2180. In an embodiment, adder 2142 is a carry-save adder (CSA). Adder 2142 may further receive AS as the carry_in value.
Adder 2144 receives the output from adder 2142 and the subrotation value generated by the second subrotation module 2120. Thus, adder 2144 rotates Xk by the additional subrotation value generated by subrotation module 2120. In an embodiment, adder 2144 is also a carry-save adder (CSA). Adder 2144 may further receive φ11cn as the carry_in value. Similarly, adder 2146 receives the output from adder 2144 and the subrotation value generated by the third subrotation module 2130. Thus, adder 2146 rotates Xk by the additional subrotation value generated by subrotation module 2130. In an embodiment, adder 2146 is also a carry-save adder (CSA). Adder 2146 may further receive φ14cn as the carry_in value.
A final adder 2148 receives the output from adder 2146. In an embodiment, adder 2148 is a carry ripple adder. Adder 2148 receives a 1 value as a carry_in input. The carry_in=1 bit completes the ones'-complement negation of the correction value from the fine stage magnitude scaling module 2180, making it into a two's-complement correction. The output of adder 2148 is a coordinate, Xfine, of the rotated pair of output values.
The circuit 2100B of
The choice among the above solutions when implementing a multi-part split phase accumulator system will necessarily be guided by the relative importance of considerations such as minimizing power consumption, maximizing processing speed, or meeting a combination of such design goals.
2.10 Tracking Phase Accumulator Overflows
While it is not necessary for normal DDS implementations to track phase accumulator overflows, one new issue must be considered if there is a special requirement that one must track the normal occurrence of phase accumulator overflows. It does not necessarily suffice to simply let each of the top two phase accumulator parts 1810A and 1810B be represented by the phase accumulator value (φ̂H or φ̂M) and the contents of an associated SBR input. One situation can occur in which the C_out1 value fed to SBR1 1818 can cause a carry ripple that goes completely through the center part, yielding a carry-out bit into the upper SBR2 1819. This happens when (and only when) φ̂M has a value of all ones and the bit being sent to SBR1 1818 is a one.
Consider the example in
The special “five_ones” output bit of the adder for φ̂M in
2.11 Using a Carry-Save Phase Accumulator
One way to speed up the (possibly long) carry-ripple-limited phase-accumulator updating is to use a carry-save adder. If the phase-accumulator value is maintained in carry-save form, then each update, which produces another carry-save result, will involve a delay of just a single one-bit addition. The excess-fours system can be useful in such a system, as will be recognized by one of ordinary skill in the art. The faster phase updating may, however, be offset by as much, or more, extra time needed to compute the DDS outputs from the carry-save value; excess-fours processing could help to reduce this time. Various compromises between carry-save, carry-ripple, and excess-fours phase accumulator systems can also be employed.
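A carry-save phase accumulator can be sketched in Python as follows. The word length, the function names `csa_update` and `resolve`, and the example FCW are assumptions chosen for illustration. Each update combines the stored sum word, the stored carry word, and the FCW with one layer of full adders (a 3:2 compression), so no carry ripples during the update; the ordinary binary value is recovered only when the two words are added.

```python
M = 32                          # illustrative accumulator word length (assumption)
MASK = (1 << M) - 1


def csa_update(sum_word, carry_word, fcw):
    """One carry-save update: bitwise full adders, no carry propagation."""
    new_sum = sum_word ^ carry_word ^ fcw
    majority = (sum_word & carry_word) | (sum_word & fcw) | (carry_word & fcw)
    new_carry = (majority << 1) & MASK      # carries move up one bit position
    return new_sum, new_carry


def resolve(sum_word, carry_word):
    """Convert the carry-save pair to an ordinary binary value (needs a full add)."""
    return (sum_word + carry_word) & MASK


def matches_reference(fcw, cycles=1000):
    """Compare the carry-save accumulator with a conventional accumulator."""
    s = c = ref = 0
    for _ in range(cycles):
        s, c = csa_update(s, c, fcw)
        ref = (ref + fcw) & MASK
        if resolve(s, c) != ref:
            return False
    return True


if __name__ == "__main__":
    print(matches_reference(0x9E3779B9))
```

The `resolve` step is where the time saved in updating can be spent again, which is the trade-off noted in the paragraph above.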
2.12 Excess-Two and Excess-One Stages
For example, a five-bit stage can be split into a three-bit stage and a two-bit stage. The two-bit part can be dealt with in the following manner. An excess-two system can be used for processing just two fine-stage bits. In this environment, the two-bit group is driven by a two-bit part of the phase accumulator word that coincides with an excess-two coarse-rotation operation, i.e., part of the coarse-stage rotation would include (possibly among other offsets) a position-weighted “10” (two) value in the rotation result stored in the ROM. Then, when processing the relevant two fine-stage bits, the “Excess Two” processing specified in Table 4 can be used.
In a further example, a four-bit stage can be split into two two-bit sub-stages, or can be split into one three-bit sub-stage and another one-bit sub-stage. For the one-bit sub-stage an excess one system can be employed that can use the “Excess One” processing, specified in Table 5, in a manner similar to the other (excess four, three, and two) systems.
Examples of the processing circuits for an excess-two and an excess-one stage are shown in
2.13 Avoiding the Addition of Zero
In the excess-n processors (for n=4, 3, 2, 1) presented herein, there are cases in which an AND gate having an inverted control input is employed to provide the possibility of obtaining a zero output—such output then being added or subtracted from the signal proceeding down the center path in the fine-stage processor circuit (e.g., as in
In the DDS implementation described above, a person of ordinary skill in the art would recognize that the values of the contents of the coarse-rotation ROM could be altered such that they somewhat compensate for approximation errors that occur elsewhere in the system—e.g., within the fine-stage. To make such alterations, various techniques have been employed by DDS designers. As would be known to a person of skill in the art, embodiments disclosed herein could include not just excess-fours DDS in which the “ideal” coarse ROM data are used, but also such DDS in which altered ROM values are used to improve the output accuracy.
The examples shown above have focused on hardware embodiments. Such focus serves well to explain the working details of the new systems. Nonetheless, it is certainly possible to advantageously employ the invention in a software embodiment, as will be recognized by one of ordinary skill in the art. Presently existing DDS technology employing such embodiments would be good candidates for improvements to their operating speed and/or power consumption by use of the methods for DDS improvement explained above. Examples of software platforms in which the above-described invention may be employed are general-purpose digital processors and programmable hardware such as field programmable gate arrays (FPGAs). The scope and spirit of this disclosure are intended to cover all such software methods of implementation.
Processing unit 2603 may represent a computer, a hand-held computer, a lap top computer, a personal digital assistant, a mobile phone, and/or any other type of data processing device. The type of processing device used to implement the embodiments above is implementation specific.
Processing unit 2603 includes a communications medium 2610 (such as a bus, for example) to which other modules are attached.
Processing unit 2603 also includes one or more processors 2620 and a main memory 2630. Main memory 2630 may be RAM, ROM, or any other memory type, or combinations thereof.
Processing unit 2603 may also include secondary storage devices 2640 such as but not limited to hard drives 2642 or computer program product interfaces 2644. Computer program product interfaces 2644 are devices that access objects (such as information and/or software) stored in computer program products 2650. Examples of computer program product interfaces 2644 include, but are not limited to, floppy drives, CD drives, DVD drives, ZIP drives, JAZ drives, optical storage devices, etc. Examples of computer program products 2650 include, but are not limited to, floppy disks, CDs, DVDs, ZIP and JAZ disks, memory sticks, memory cards, or any other medium on which objects may be stored.
The computer program products 2650 include a computer useable medium 2652 on which objects may be stored, such as but not limited to optical mediums, magnetic mediums, etc.
Control logic or software may be stored in main memory 2630, secondary storage device(s) 2640, and/or computer program products 2650.
More generally, the term “computer program product” refers to any device in which control logic (software) is stored, so in this context a computer program product could be any memory device having control logic stored therein. The invention is directed to computer program products having stored therein software that enables a computer/processor to perform functions of the invention as described herein.
Processing unit 2603 may also include an interface 2660 which may receive objects (such as data, applications, software, images, etc.) from external entities 2680 via any communications media including wired and wireless communications media. In such cases, objects 2670 are transported between external entities 2680 and interface 2660 via signals 2665, 2675. In other words, signals 2665, 2675 include or represent control logic for enabling a processor or computer to perform the functions of the invention. According to embodiments of the invention, such signals 2665, 2675 are also considered to be computer program products, and the invention is directed to such computer program products.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
This application is a continuation-in-part of application Ser. No. 11/938,252, filed Nov. 9, 2007, entitled “Efficient Angle Rotation Configured for Dynamic Adjustment” which claims benefit to Application No. 60/857,778, filed on Nov. 9, 2006, both of which are incorporated herein by reference in their entirety.
Number | Name | Date | Kind |
---|---|---|---|
4896287 | O'Donnell et al. | Jan 1990 | A |
5276633 | Fox et al. | Jan 1994 | A |
5673212 | Hansen | Sep 1997 | A |
5737253 | Madisetti et al. | Apr 1998 | A |
RE36388 | Fox et al. | Nov 1999 | E |
5991788 | Mintzer | Nov 1999 | A |
7203718 | Fu et al. | Apr 2007 | B1 |
7228325 | Willson, Jr. et al. | Jun 2007 | B2 |
7440987 | Song et al. | Oct 2008 | B1 |
7532989 | Torosyan | May 2009 | B1 |
7539716 | Torosyan | May 2009 | B2 |
8131793 | Willson, Jr. | Mar 2012 | B2 |
20060167962 | Torosyan | Jul 2006 | A1 |
20140195579 | Willson, Jr. | Jul 2014 | A1 |
Entry |
---|
Ken Gentile, Direct Digital Synthesis Primer, Analog Devices, May 2003, pp. 1-50. |
Lionel Cordesses, Direct Digital Synthesis: A Tool for Periodic Wave Generation (Part 1), IEEE Signal Processing Magazine, Jul. 2004, pp. 50-54. |
Lionel Cordesses, Direct Digital Synthesis: A Tool for Periodic Wave Generation (Part 2), IEEE Signal Processing Magazine, Sep. 2004, pp. 110-112 and 117. |
Vankka, J., and Halonen, K., “Direct Digital Synthesizers: Theory, Design, Applications,” Boston: Kluwer Academic Publishers, 2001, 218 pages. |
Ahn, Y. et al., “VLSI design of a CORDIC-based derotator,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 2, May 1998, pp. 449-452. |
Analog Devices, CMOS 200 MHz Quadrature Digital Upconverter: AD9856, Rev. C, 2005, 36 pages. |
Avizienis, A., “Signed-digit number representation for fast parallel arithmetic,” IRE Trans. Electron. Comp., vol. EC-10, pp. 389-400, 1961. |
Booth, A.D., “A signed binary multiplication technique,” Quart. J. Mech. Appel. Math., vol. 4, 1951, pp, 236-240. |
Bull, D.R. and Horrocks D.H., “Primitive operator digital filters,” Proc. Inst. Elect. Eng., vol. 138, pt. G, pp. 401-412, Jun. 1991. |
Chen, C.L. and Willson, Jr., A.N., “A trellis search algorithm for the design of FIR filters with signed-powers-of-two coefficients,” IEEE Trans. Circuits and Systems-II, vol. 46, No. 1, pp. 29-39, Jan. 1999. |
Ching, A.Y., “A 12-bit direct digital frequency synthesizer/mixer in 0.8 μm CMOS,” M.S. thesis, Univ. California, Los Angeles, 1999, 50 pages. |
DeCaro, D. et al., “A 380 MHz direct, digital synthesizer/mixer with hybrid CORDIC architecture in 0.25-mm CMOS” IEEE J. Solid-State Circuits, vol. 42, No. 1, pp. 151-160, Jan. 2007. |
DeCaro, D. et al., “A 380MHz, 150mW direct digital synthesizer/mixer,” in 0.25mm CMOS, in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2006, 10 pages. |
Dempster, A.G. and Macleod, M.D., “Constant integer multiplication using minimum adders,” IEE Circuits Devices Syst., vol. 141, pp. 407-413, Oct. 1994. |
Dempster, A.G. and Macleod, M.D., “Multiplication by two integers using the minimum number of adders,” in Proc. IEEE Int. Symp. Circuits Syst., May 23-26, 2005, pp. 1814-1818. |
Dempster, A.G. and Macleod, M.D., “Use of minimum-adder multiplier blocks in FIR digital filters,” IEEE Trans. Circuits and Systems-II, vol. 42, No. 9, pp. 569-577, Sep. 1995. |
Fu, D. and Willson, Jr., A.N., “A high-speed processor for digital sine/cosine generation and angle rotation,” in Proc. 32nd Asilomar Conf. Signals, Syst. Comput., vol. 1, Nov. 1998, pp. 177-181. |
Fu, D. and Willson, Jr., A.N., “A two-stage angle-rotation architecture and its error analysis for efficient digital mixer implementation,” IEEE Trans. Circuits and Systems-I, vol. 53, No. 3, pp. 604-614, Mar. 2006. |
Fu, D., “Efficient synchronization architectures for multimedia communications,” Ph.D. dissertation, Univ. California, Los Angeles, 2000, 176 pages. |
Gentile, K., “Digital upconverter IC tames complex modulation,” Microwaves & RF Magazine, Aug. 2000, 6 pages. |
Golub, et al., “Matrix Computations,” Baltimore: Johns Hopkins Press, 1996. |
Gustafsson, O. et al., “Multiplier blocks using carry-save adders,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 2, May 23-26, 2004, pp. 473-476. |
Hartley, R., “Subexpression sharing in filters using canonic signed digit multipliers,” IEEE Trans. Circuits and Systems-II, vol. 43, No. 10, pp. 677-688, Oct. 1996. |
Koren, I., “Computer Arithmetic Algorithms,” 2nd ed., Natick, MA: A.K. Peters, 2002. |
Macleod, M.D. and Dempster, A.G., “A common subexpression elimination algorithm for low-cost multiplierless implementation of matrix multipliers,” Electron. Lett., vol. 40, No. 11, pp. 651-652, May 2004. |
Madisetti, A. et al., “A 100-MHz, 16-b, direct digital frequency synthesizer with a 100-dBc spurious-free dynamic range,” IEEE J. Solid-State Circuits, vol. 34, No. 8, pp. 1034-1043, Aug. 1999. |
Madisetti, A., “VLSI architectures and IC implementations for bandwidth efficient communications,” Ph.D. dissertation, Univ. California, Los Angeles, 1996, 151 pages. |
Samueli, H., “An improved search algorithm for the design of multiplierless FIR filters with powers-of-two coefficients,” IEEE Trans. Circuits and Systems, vol. 36, No. 7, pp. 1044-1047, Jul. 1989. |
Song, et al., “A Quadrature Digital Synthesizer/Mixer Architecture Using Fine/Coarse Coordinate Rotation to Achieve 14-b Input, 15-b Output, and 100-dBc SFDR,” IEEE J. Solid-State Circuits, vol. 39, No. 11, Nov. 2004, pp. 1853-1861. |
Tan, L. and Samueli, H., “A 200-MHz quadrature frequency synthesizer/mixer in 0.8-μm CMOS,” IEEE J. Solid-State Circuits, vol. 30, No. 3, pp. 193-200, Mar. 1995. |
Torosyan, A. et al., “A 300-MHz quadrature direct digital synthesizer/mixer in 0.25-μm CMOS,” IEEE J. Solid-State Circuits, vol. 38, No. 6, pp. 875-887, Jun. 2003. |
Torosyan, A. et al., “A 300-MHz quadrature direct digital synthesizer/mixer in 0.25-μm CMOS,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2002, pp. 132-133. |
Volder, J., “The CORDIC trigonometric computing technique,” IEEE Trans. Computers, vol. EC-8, pp. 330-334, Sep. 1959. |
Wang, S. et al., “Hybrid CORDIC algorithms,” IEEE Trans. Computers, vol. 46, pp. 1202-1207, Nov. 1997. |
Willson, et al., “A Direct Digital Frequency Synthesizer with Minimized Tuning Latency of 12ns,” IEEE ISSCC Dig. Tech. Papers, Feb. 2011, pp. 138-139. |
Wu, Y. et al., “A 415 MHz direct digital quadrature modulator in 0.25-μm CMOS,” in Proc. IEEE Custom Integrated Circ. Conf., May 21-24, 2003, pp. 287-290. |
Xu, F. et al., “Efficient algorithms for common subexpression elimination in digital filter design,” in Proc. IEEE Int. Conf. Acoust. Speech, Signal Processing, vol. 5, May 17-21, 2004, pp. 137-140. |
Xu, F., “On designing high-performance signal processing algorithms for a ring-structured multiprocessor,” Ph.D. dissertation, Univ. California, Los Angeles, 2001. |
Lu, et al., “A 700-MHz 24-b pipelined accumulator in 1.2-μm CMOS for application as a numerically controlled oscillator,” IEEE J. Solid-State Circuits, vol. 28, pp. 878-886, Aug. 1993. |
Betowski, et al., “Considerations for phase accumulator design for direct digital frequency synthesizers,” IEEE Int. Conf. Neural Networks & Signal Proc., Nanjing, China, Dec. 14-17, 2003. |
“1 GSPS Direct Digital Synthesizer with 14-bit DAC,” AD9912 Data Sheet, Analog Devices, Inc., 2007-2010. |
“2.7 GHz DDS-Based AgileRF™ Synthesizer,” AD9956 Data Sheet, Analog Devices, Inc., 2004. |
Torosyan, A., “Direct Digital Frequency Synthesizers: Complete Analysis and Design Guidelines,” University of California, Los Angeles, 2003. |
Number | Date | Country
---|---|---
60857778 | Nov 2006 | US

 | Number | Date | Country
---|---|---|---
Parent | 11938252 | Nov 2007 | US
Child | 13205525 | | US