EFFICIENT IMPLEMENTATION OF CASCADED BIQUADS

Information

  • Patent Application
  • 20160112033
  • Publication Number
    20160112033
  • Date Filed
    October 15, 2014
  • Date Published
    April 21, 2016
Abstract
An improved biquad infinite impulse response filter is shown that may be implemented in a very long instruction word digital signal processor as well as in other processing circuitry. The new filter structure modifies the feedback path in the filter, resulting in a significant reduction in execution cycles.
Description
TECHNICAL FIELD OF THE INVENTION

The technical field of this invention is digital signal processing, and more particularly to infinite impulse response filters.


BACKGROUND OF THE INVENTION

One of the most widely used digital filter forms is the biquad. A biquad is a second-order (two poles and two zeros) Infinite Impulse Response (IIR) filter. Its order is high enough to be useful on its own, and because higher-order filters are sensitive to coefficient errors, the biquad is often used as the basic building block for more complex filters. For instance, a biquad low-pass filter has a cutoff slope of 12 dB/octave, useful for tone controls; if a 24 dB/octave filter is needed, two biquads can be cascaded, and the result will have fewer coefficient sensitivity problems than a single fourth-order design.
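The slope figures quoted above can be checked numerically. The following Python sketch is an illustration only: it assumes the low-pass coefficient formulas from the widely circulated audio EQ cookbook, a 100 Hz cutoff, and a 48 kHz sample rate, none of which come from this disclosure. The function names `lowpass_biquad` and `gain_db` are hypothetical.

```python
import cmath
import math

def lowpass_biquad(f0, fs, q=1 / math.sqrt(2)):
    """Low-pass biquad coefficients (audio EQ cookbook form; an assumption)."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    cosw = math.cos(w0)
    num = [(1 - cosw) / 2, 1 - cosw, (1 - cosw) / 2]   # feedforward terms
    den = [1 + alpha, -2 * cosw, 1 - alpha]            # feedback terms
    return num, den

def gain_db(num, den, f, fs, stages=1):
    """Gain in dB at frequency f for `stages` identical biquads in cascade."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 evaluated on the unit circle
    h = (num[0] + num[1] * z1 + num[2] * z1 * z1) / \
        (den[0] + den[1] * z1 + den[2] * z1 * z1)
    # Cascading multiplies the responses, so the dB gains add.
    return stages * 20 * math.log10(abs(h))

FS = 48000.0
num, den = lowpass_biquad(f0=100.0, fs=FS)

# Attenuation gained over one octave (1 kHz to 2 kHz), well above the cutoff.
slope_one = gain_db(num, den, 1000.0, FS) - gain_db(num, den, 2000.0, FS)
slope_two = (gain_db(num, den, 1000.0, FS, stages=2)
             - gain_db(num, den, 2000.0, FS, stages=2))
print(round(slope_one, 1), round(slope_two, 1))
```

Under these assumptions the single section rolls off at roughly 12 dB per octave and the two-section cascade at roughly twice that.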


Biquads come in several forms. The most obvious, a direct implementation of the second-order difference equation

y[n] = a0*x[n] + a1*x[n−1] + a2*x[n−2] − b1*y[n−1] − b2*y[n−2],

is called direct form I and is shown in FIG. 1.


Direct form I is the best choice for implementation on a fixed-point processor because it has a single summation point.
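As an illustration only, the direct form I difference equation can be sketched in Python as follows. The function name `direct_form_1`, the state variable names, and the coefficient values are assumptions made for this example and are not part of the original disclosure.

```python
def direct_form_1(x, a0, a1, a2, b1, b2):
    """Filter the sequence x with the direct form I difference equation."""
    x1 = x2 = 0.0  # x[n-1], x[n-2]
    y1 = y2 = 0.0  # y[n-1], y[n-2]
    y = []
    for xn in x:
        # Single summation point: all five products are combined in one place.
        yn = a0 * xn + a1 * x1 + a2 * x2 - b1 * y1 - b2 * y2
        x2, x1 = x1, xn   # shift the input delay line
        y2, y1 = y1, yn   # shift the output delay line
        y.append(yn)
    return y

# Impulse response for hypothetical coefficients (illustrative values only).
impulse = [1.0, 0.0, 0.0, 0.0]
h = direct_form_1(impulse, a0=0.5, a1=0.25, a2=0.125, b1=-0.5, b2=0.25)
print(h)  # [0.5, 0.5, 0.25, 0.0]
```

Note that this form keeps four delay elements: two for past inputs and two for past outputs.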


We can take direct form I and split it at the summation point as shown in FIG. 2, and then take the two halves and swap them, so that the feedback half (the poles) comes first as shown in FIG. 3. Now one pair of z delays is redundant, storing the same information as the other pair. Merging the two pairs yields the direct form II configuration shown in FIG. 4.


In floating point applications, direct form II is preferred because it reduces memory requirements, and floating point computation is not sensitive to overflow in the way fixed point computations are.
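A minimal Python sketch of the direct form II structure described above follows. It uses the same coefficient convention as the difference equation in this section (a0, a1, a2 feedforward; b1, b2 feedback). The function name `direct_form_2`, the intermediate signal `w`, and the coefficient values are assumptions for illustration.

```python
def direct_form_2(x, a0, a1, a2, b1, b2):
    """Direct form II: the pole half runs first and shares one pair of delays."""
    w1 = w2 = 0.0  # the single shared pair of delay elements
    y = []
    for xn in x:
        wn = xn - b1 * w1 - b2 * w2            # feedback (pole) half
        y.append(a0 * wn + a1 * w1 + a2 * w2)  # feedforward (zero) half
        w2, w1 = w1, wn                        # shift the shared delay line
    return y

# Same hypothetical coefficients as a direct form I filter would use;
# the impulse response is unchanged, but only two state values are stored.
impulse = [1.0, 0.0, 0.0, 0.0]
h = direct_form_2(impulse, a0=0.5, a1=0.25, a2=0.125, b1=-0.5, b2=0.25)
print(h)  # [0.5, 0.5, 0.25, 0.0]
```

The memory saving mentioned above is visible in the state: two stored values instead of the four that direct form I needs.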


We can improve on this configuration by transposing the filter. To transpose a filter, the signal flow direction is reversed. Output becomes input, distribution nodes become summers, and summers become nodes as shown in FIG. 5. The characteristics of the filter are unchanged, but in this case the floating-point behavior is better. Floating-point computation is more accurate when the values being summed are of similar magnitude (adding a small number to a large number loses more precision than adding numbers of similar size).
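The transposed structure can be sketched in Python as shown here. This is only an illustration using the coefficient convention of the difference equation in this section. The function name `transposed_direct_form_2`, the state names `s1` and `s2`, and the coefficient values are assumptions.

```python
def transposed_direct_form_2(x, a0, a1, a2, b1, b2):
    """Transposed direct form II: two state values, each a short running sum."""
    s1 = s2 = 0.0
    y = []
    for xn in x:
        yn = a0 * xn + s1                 # output is the input term plus state s1
        s1 = a1 * xn - b1 * yn + s2       # first state absorbs the second
        s2 = a2 * xn - b2 * yn            # second state holds the oldest terms
        y.append(yn)
    return y

# Hypothetical coefficients; the impulse response matches the other forms.
impulse = [1.0, 0.0, 0.0, 0.0]
h = transposed_direct_form_2(impulse, a0=0.5, a1=0.25, a2=0.125, b1=-0.5, b2=0.25)
print(h)  # [0.5, 0.5, 0.25, 0.0]
```

Both state updates depend on the freshly computed output `yn`, which is the feedback dependency that the detailed description later shortens.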


SUMMARY OF THE INVENTION

An improved biquad filter is described that is optimized for wide instruction word digital signal processors. The feedback path of the filter is modified, resulting in significant performance improvements.





BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of this invention are illustrated in the drawings, in which:

FIG. 1 shows a direct form I biquad filter;

FIGS. 2 and 3 show intermediate forms of the biquad;

FIG. 4 shows a direct form II biquad filter;

FIG. 5 shows a transposed direct form II biquad filter;

FIG. 6 illustrates an implementation of a biquad filter on a DSP;

FIG. 7 shows a modified biquad implementation; and

FIG. 8 shows a comparison of the prior art and an implementation according to this invention.





DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS


FIG. 6 shows the transposed direct form II structure used in some implementations on Texas Instruments digital signal processors (DSPs). This implementation requires more than 10 cycles in the feedback path. Between 3 and 6 cycles are used in addition block 601, and 4 cycles in multipliers 602 and 603. As shown in the figure, the feedback path to multipliers 602 and 603 originates at output 604.



FIG. 7 shows the improved implementation described in this invention. The feedback path to multipliers 702 and 703 originates from the output of storage element 701 instead of the output of summation block 704. To compensate, the coefficient in multiplier 706 is changed from b1 to b1+a1, and the coefficient in multiplier 707 is changed from b2 to b2+a2. With this change the overall feedback path requires 7 cycles: 3 cycles in addition block 705 and 4 cycles in multipliers 702 and 703.
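The coefficient change can be checked by substituting out = in + d0 into the prior-art state updates, in which the feedback is taken from the output. The following is a worked restatement only. It uses the coefficient naming of this section (b1, b2 multiply the input, a1, a2 multiply the fed-back signal), and the primes, which denote the values stored for the next sample, are notation introduced here for illustration.

```latex
\begin{aligned}
\text{out} &= \text{in} + d_0 \\
d_0' &= b_1\,\text{in} + a_1\,\text{out} + d_1
      = b_1\,\text{in} + a_1(\text{in} + d_0) + d_1
      = (b_1 + a_1)\,\text{in} + a_1 d_0 + d_1 \\
d_1' &= b_2\,\text{in} + a_2\,\text{out}
      = b_2\,\text{in} + a_2(\text{in} + d_0)
      = (b_2 + a_2)\,\text{in} + a_2 d_0
\end{aligned}
```

The right-hand sides no longer contain out, so the multiplications by a1 and a2 can start from the stored value d0 without waiting for the output addition to finish.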



FIG. 8 further demonstrates the implementation of this invention. The signal flow in the prior art is shown in Table 1, and Table 2 shows the signal flow with the improved feedback path.











TABLE 1

out = in + d0
d0 = b1 * in + a1 * out + d1
d1 = b2 * in + a2 * out



















TABLE 2

out = in + d0
t1 = (b1 + a1) * in + d1
t0 = a2 * d0
d0 = a1 * d0 + t1
d1 = (b2 + a2) * in + t0
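The two signal flows of Tables 1 and 2 can be compared with a short Python sketch. It is an illustration under stated assumptions, not the DSP implementation. The function names, the test signal, and the coefficient values are hypothetical, and the input is named `x` because `in` is a reserved word in Python.

```python
def biquad_prior_art(samples, a1, a2, b1, b2):
    """Table 1 flow: both state updates wait for the output `out`."""
    d0 = d1 = 0.0
    result = []
    for x in samples:
        out = x + d0
        d0 = b1 * x + a1 * out + d1
        d1 = b2 * x + a2 * out
        result.append(out)
    return result

def biquad_modified(samples, a1, a2, b1, b2):
    """Table 2 flow: feedback products are taken from the stored state d0."""
    c1 = b1 + a1  # combined coefficients, computed once outside the loop
    c2 = b2 + a2
    d0 = d1 = 0.0
    result = []
    for x in samples:
        out = x + d0
        t1 = c1 * x + d1
        t0 = a2 * d0       # must use the old d0, before it is overwritten
        d0 = a1 * d0 + t1
        d1 = c2 * x + t0
        result.append(out)
    return result

# Hypothetical stable coefficients and an arbitrary short test signal.
samples = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.0, 0.0]
coeffs = dict(a1=0.5, a2=-0.25, b1=0.3, b2=0.1)
ref = biquad_prior_art(samples, **coeffs)
new = biquad_modified(samples, **coeffs)
max_diff = max(abs(r - n) for r, n in zip(ref, new))
print(max_diff)
```

The two outputs agree up to floating-point rounding. Rounding can differ slightly because the combined coefficients b1 + a1 and b2 + a2 are formed before the multiplication.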










Table 3 shows performance benchmarks of the improved biquad filter executing on Texas Instruments C674x and C66x digital signal processors using single precision 32-bit floating point arithmetic, and Table 4 benchmarks filter performance using mixed/double precision floating point arithmetic on the same digital signal processors.
















TABLE 3

| Function | Exclusive cycle count, C674x per biquad | Exclusive cycle count, C66x per biquad | C66x/C674x improvement | C674x bytes | C66x bytes | Comments, C674x | Comments, C66x |
|---|---|---|---|---|---|---|---|
| Cascade biquad, 1 channel, 2 stage | 4.5 | 4 | 1.11x | 268 | 416 | Loop carried dependency bound 8, resource bound is 4 | Loop carried dependency bound 16, resource bound is 7, loop unroll 2x |
| Cascade biquad, 2 channel, 4-stage, same coefficient | 2.125 | 1.375 | 1.35x | 1128 | 904 | Loop carried dependency bound 8, resource bound is 16 | Loop carried dependency bound 10, resource bound is 8 |
| Cascade biquad, 2 channel, 3-stage, same coefficient | 2 | 1.33 | 1.34x | 536 | 656 | Loop carried dependency bound 10, resource bound is 12 | Loop carried dependency bound 8, resource bound is 7 |





















TABLE 4

| Cascaded biquad | Exclusive cycle count, C674x per biquad | Exclusive cycle count, C66x per biquad | C66x/C674x improvement | Comments, C674x | Comments, C66x |
|---|---|---|---|---|---|
| 1 channel, 2 stage, single precision | 4.5 | 4 | 1.11x | Loop carried dependency bound 8, resource bound is 4 | Loop carried dependency bound 16, resource bound is 7, loop unroll 2x |
| 1 channel, 2 stage, same coefficient, mixed/double precision | 9.75 | 4 | 2.4x | Loop carried dependency bound 37, resource bound is 32, loop unroll 2x | Loop carried dependency bound 10, resource bound is 10 |
| 1 channel, 3 stage, same coefficient, mixed/double precision | 15.33 | 3.33 | 4.6x | Loop carried dependency bound 20, resource bound is 24 | Loop carried dependency bound 8, resource bound is 9 |
| 2 channel, 2 stage, same coefficient, mixed/double precision | 15.25 | 3.5 | 4.36x | Loop carried dependency bound 17, resource bound is 32 | Loop carried dependency bound 7, resource bound is 14 |








Claims
  • 1. A method of performing infinite impulse response filtering, the method comprising the steps of: computing the filter output by setting out = in + d0; t1 = (b1 + a1) * in + d1; t0 = a2 * d0; d0 = a1 * d0 + t1; d1 = (b2 + a2) * in + t0; where a1, a2, b1, b2 are coefficients and d0, d1, t0, t1 are intermediate results.
  • 2. The method of claim 1, wherein: the output is computed using a digital signal processor.
  • 3. The method of claim 1, wherein: the digital signal processor is a very long instruction word type of digital signal processor.
  • 4. An apparatus for performing infinite impulse response filtering, the apparatus comprising: a digital signal processor operable to compute the filter output by performing the following steps: out = in + d0; t1 = (b1 + a1) * in + d1; t0 = a2 * d0; d0 = a1 * d0 + t1; d1 = (b2 + a2) * in + t0; where a1, a2, b1, b2 are coefficients and d0, d1, t0, t1 are intermediate results.
  • 5. The apparatus of claim 4, wherein: the digital signal processor is a very long instruction word type of digital signal processor.