The present invention relates generally to methods and apparatuses for performing finite impulse response (FIR) filter operations, and more particularly to a method and apparatus for performing a finite impulse response filter operation that consumes low power without degrading the FIR filter's performance.
Modem applications require low cost chips containing integrated FIR filters. The widely known and understood FIR filter essentially performs a sum-of-products computation. In Very Large Scale Integration architectures, a fast FIR filter normally contains a set of multipliers to weight the input samples by the filter coefficients and a set of adders to accumulate the multiplier results. Given that a hardware multiplier essentially adds multiple partial products, the multiplier is often combined with the adders to make the FIR filter structure.
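The sum-of-products computation described above can be sketched in software as follows. This is only a behavioral illustration, not the hardware structure; the function name `fir_filter` and the convention that samples before time zero are zero are assumptions made for the example.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:  # assumption: samples before time 0 are zero
                acc += c * samples[n - k]  # one multiply, one accumulate
        out.append(acc)
    return out
```

Each pass through the inner loop corresponds to one multiplier weighting a sample and one adder accumulating the result.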
A well-known technique for accumulating several numbers is to use a carry-save format. A carry-save adder (CSA) does not propagate the carry in the normal manner but rather stores the carry in a separate vector. Carry-save adders are faster and more efficient than carry-propagate adders (CPA). Carry-save adders are normally configured with each adder taking four operand vectors and producing two result vectors (a sum and a carry vector).
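A carry-save adder can be modeled bitwise on non-negative integers, as in the minimal sketch below. The names `csa_3_2` and `csa_4_2` are illustrative, and building the four-operand adder from two three-operand stages is one common construction assumed here, not necessarily the circuit used in the figures.

```python
def csa_3_2(a, b, c):
    """3-2 carry-save adder: three operands in, (sum, carry) vectors out."""
    s = a ^ b ^ c                                # per-bit sum, no propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit carry, stored separately
    return s, carry


def csa_4_2(a, b, c, d):
    """4-2 carry-save adder modeled as two cascaded 3-2 stages (an assumption)."""
    s1, c1 = csa_3_2(a, b, c)
    return csa_3_2(s1, c1, d)
```

In both cases the two result vectors add up to the total of the operands, and a carry-propagate adder is needed only when a single resolved value is finally required.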
In
The advantages of carry-save arithmetic include reduced propagation delay, reduced integration area and reduced power dissipation, because one adder is eliminated at each multiplier output. A disadvantage is that more flip-flops are needed when pipelining the carry-save filter accumulation paths. These additional flip-flops lead to increased power consumption, mainly due to increased current drain on the source of the clock signal.
The present invention is therefore directed to the problem of developing a method and apparatus for performing an FIR filter operation without consuming much power and without any degradation in filter performance while maintaining reduced propagation delay.
The present invention solves this problem by using a partial carry-save format for the filter output representation thereby reducing the number of flip-flops or registers and hence the power. By replacing the least significant bit processing section on the output side of the finite impulse response (FIR) filter with a combined carry-save adder and carry-propagate adder followed by a single register rather than two registers or flip-flops, the present invention reduces the load on the clock and achieves a reduced propagation delay.
To further improve the performance of the FIR filter, the present invention employs a simpler carry-save adder than heretofore was possible by using a single register at the input to each of the carry-save adders in the least significant bit portion rather than two registers or flip-flops, one for the carry and one for the sum. The combination of halving the number of registers or flip-flops in that portion and substituting a simpler carry-save adder for each of the carry-save adders results in a significant improvement in the overall filter performance.
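The register saving can be illustrated with simple arithmetic. The sketch below assumes a pipeline stage of `width` bits in which the lowest `lsb_bits` are kept in resolved form; both parameter names and the example values are hypothetical, and any extra flip-flop for a carry between the two sections is ignored.

```python
def flip_flops_per_stage(width, lsb_bits):
    """Return (full carry-save count, partial carry-save count) per stage."""
    full = 2 * width                              # sum and carry bit for every position
    partial = 2 * (width - lsb_bits) + lsb_bits   # one bit per resolved LSB position
    return full, partial
```

For instance, under these assumptions `flip_flops_per_stage(32, 8)` gives 64 flip-flops for the full carry-save format and 56 for the partial format, so the least significant bit portion uses half as many.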
One aspect of the present invention involves improving the speed of a multiplier. As the inventors have recognized, the propagation delays through a multiplier circuit are not equal. Therefore, the individual bits of the result will arrive at different times even though the inputs might arrive simultaneously. The reason for this is demonstrated in
The outputs of the carry-save adders 21–26 are accumulated in registers 27, 28. If the output is a 32-bit value, then each register 27, 28 will contain 37 bits. The partial products (PPs) in a multiplier are all shifted before they are added, because a multiplication is essentially a shift-add operation. The least-significant bits (LSBs) of the result are computed from the addition of fewer partial products than the middle bits (e.g., S0 is derived from PP0, whereas S4 is derived from PP0, PP1, PP2, PP3 and PP4). Therefore, the evaluation of S0 is completed earlier than S4 because it has a simpler Boolean function. The early arrival of the least significant bits in a multiplier can be exploited to reduce the number of flip-flops by incorporating the least significant bits of a carry-propagate adder into the least significant bit portions of the multiplier itself. The least significant bits of the carry-propagate adder are simply a ripple-carry adder and the most significant bits are computed using a faster, more parallel structure (such as carry-select, carry-skip or look-ahead adder, for example). By placing the least significant bits of a carry-propagate adder into the multiplier, only a single flip-flop is needed for each of the least significant bits. This concept is demonstrated in
Referring to
In sum, the carry and save registers 49a, 49b which relate to the least significant bits have been replaced with a single register 48 and a carry-propagate adder comprised of full adders 38–40 and half adder 37. The result is a faster, lower power multiplier 30.
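Two points from the multiplier discussion can be checked with a small behavioral model: result bit k depends only on partial products PP0 through PPk, and the low bits of a carry-save pair can be resolved by a ripple-carry chain of one half adder followed by full adders. All function names below are illustrative, operands are assumed unsigned, and the default of four resolved bits simply mirrors the one half adder and three full adders mentioned above.

```python
def partial_products(a, b, n):
    """Rows PP_i = (a AND bit i of b) shifted left by i, for an n-bit b."""
    return [(a * ((b >> i) & 1)) << i for i in range(n)]


def low_bits_from_rows(a, b, n, k):
    """Bits 0..k of a*b, computed from rows PP_0..PP_k only."""
    rows = partial_products(a, b, n)[: k + 1]
    return sum(rows) & ((1 << (k + 1)) - 1)


def half_adder(x, y):
    return x ^ y, x & y


def full_adder(x, y, cin):
    s = x ^ y ^ cin
    cout = (x & y) | (x & cin) | (y & cin)
    return s, cout


def resolve_lsbs(sum_vec, carry_vec, k=4):
    """Ripple-carry add the low k bits of a carry-save pair.

    Returns (resolved low k bits, carry out of bit k-1).
    """
    low, c = half_adder(sum_vec & 1, carry_vec & 1)  # bit 0 has no carry in
    for i in range(1, k):
        bit, c = full_adder((sum_vec >> i) & 1, (carry_vec >> i) & 1, c)
        low |= bit << i
    return low, c
```

Because the resolved low bits occupy a single vector, one register bit per position suffices for them, while the remaining bits stay in sum and carry form.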
According to another aspect of the present invention, the above pipeline retiming of a multiplier is applied to a finite impulse response filter, as shown in
Turning to
The application of pipeline retiming to the finite impulse response filter in
An exemplary embodiment 50 of one aspect of the present invention is drawn in
The least significant bits of every accumulate stage in the filter 50 are replaced by a 3-2 carry-save adder 52 (rather than a 4-2 carry save adder 1 used in
If some speed degradation is acceptable, some of the Cmsb and Smsb bits can also be reduced. One such implementation 60 is shown in
As shown in
For a tree multiplier, it can be shown that some of the most significant bits can be reduced with the structure in
According to another aspect of the present invention, a method for performing a finite impulse response filter operation reduces the number of registers or flip-flops required by receiving a least significant bit input with a single input register. On the most significant bit section, two registers or flip-flops are used, one for a most significant bit carry input and one for a most significant bit sum input. In the adder sections, a carry-save adder coupled to a carry-propagate adder coupled to a single register is employed in each adding stage in a least significant bit portion of the finite impulse response filter. In the most significant bit portion of the finite impulse response filter, each adding stage employs a carry-save adder coupled to two flip-flops, one for a carry output and one for a sum output.
The above architecture permits the use of a carry-save adder with fewer inputs in the least significant bit portion of the finite impulse response filter than the carry-save adder in the most significant bit portion of the finite impulse response filter. For example, the carry-save adder in the least significant bit portion of the finite impulse response filter may consist of a 3-2 carry-save adder, whereas the carry-save adder in the most significant bit portion of the finite impulse response filter may consist of a 4-2 carry-save adder.
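One adding stage in this partial carry-save format can be modeled behaviorally as sketched below. This is a value-level illustration under stated assumptions, not the circuit itself: the state layout `(lsb_reg, s_msb, c_msb)`, the split point `k`, and the function names are hypothetical; operands are unsigned and unbounded; and the carry out of the least significant bit section is simply added into the most significant bit carry vector, without modeling how the hardware injects it.

```python
def csa_3_2(a, b, c):
    """3-2 carry-save adder on non-negative integers."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def csa_4_2(a, b, c, d):
    """4-2 carry-save adder modeled as two cascaded 3-2 stages (an assumption)."""
    s1, c1 = csa_3_2(a, b, c)
    return csa_3_2(s1, c1, d)


def accumulate_stage(state, p_sum, p_carry, k):
    """Add a product given as (p_sum, p_carry) into the stage state."""
    lsb_reg, s_msb, c_msb = state
    mask = (1 << k) - 1
    # LSB portion: 3-2 CSA, then carry-propagate add, then a single register.
    s_l, c_l = csa_3_2(lsb_reg, p_sum & mask, p_carry & mask)
    low_total = s_l + c_l
    new_lsb = low_total & mask
    carry_out = low_total >> k
    # MSB portion: 4-2 CSA feeding separate sum and carry registers.
    s_m, c_m = csa_4_2(s_msb, c_msb, p_sum >> k, p_carry >> k)
    return new_lsb, s_m, c_m + carry_out  # carry injection modeled arithmetically


def value(state, k):
    """Resolved numeric value represented by a stage state."""
    lsb_reg, s_msb, c_msb = state
    return lsb_reg + ((s_msb + c_msb) << k)
```

In this model the least significant bit portion has three operands (one stored vector plus the two product vectors), which is why a 3-2 adder is enough there, whereas the most significant bit portion has four. After every stage, `value` equals the running total of all products added so far.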
This is a divisional application of the U.S. patent application Ser. No. 09/526,836 filed Mar. 16, 2000 now U.S. Pat. No. 6,687,722.
Number | Date | Country |
---|---|---|
20040117424 A1 | Jun 2004 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09526836 | Mar 2000 | US |
Child | 10724506 | | US |