Nonuniform oversampled filter banks for audio signal processing

Information

  • Patent Grant
  • 7346137
  • Patent Number
    7,346,137
  • Date Filed
    Friday, September 22, 2006
  • Date Issued
    Tuesday, March 18, 2008
Abstract
A non-uniform filter bank is created by joining sections of oversampled uniform filter banks that are based on complex exponential modulation (as opposed to cosine modulation). Each filter bank handles a given, non-overlapping frequency band. The bands are not of uniform bandwidth, and the filters of different banks have different bandwidths. The frequency bands of the different filter banks cover the frequency range of interest with gaps in the neighborhoods of the filter band edges. A set of transition filters fills those gaps.
Description
BACKGROUND OF THE INVENTION

This application is a continuation of U.S. patent application Ser. No. 11/233,322, filed Sep. 22, 2005, which is a continuation of U.S. Pat. No. 6,996,198 issued Feb. 7, 2006, and it claims priority under provisional application No. 60/243,811 filed on Oct. 27, 2000.


This invention relates to filters and, more particularly, to filter banks for audio applications.


In many areas of audio processing, it is desirable to analyze an audio signal in approximately the same time-frequency form as the human ear (i.e., with bandwidths on the order of one Bark) and with a time resolution that is commensurate with the bandwidth of the filter. In addition, it is desirable to process the signals in each of the bands and then reconstruct them in a manner such that when the bands are unmodified, the filter bank has a nearly perfect reconstruction characteristic. Because the signals might be modified, and different bands might be routed to different devices, not only must the filters provide approximately exact reconstruction, they must also prevent aliasing due to the unequal processing, or modification, of adjacent frequency bands. Hence, an oversampled filter bank is required where aliasing introduced due to unequal processing of bands is below the level of human hearing.


One application for this kind of filter bank is the problem of separating an individual audio signal into its direct and indirect parts for the purpose of rerouting, in real time, the direct and indirect signals to drivers that reproduce them appropriately. In such an application, a filter bank that approximates the critical bandwidths is essential to identifying the part of a signal with direct cues, and the ability of the reconstruction filter bank to prevent substantial aliasing when adjacent bands are added incoherently (as opposed to coherently) is also an absolute requirement. Hence the need for an oversampled critical band filter bank.

In applications that require nonuniform division of the signal spectrum, iterated cascades of uniform filter banks are often used. Iterated filter banks, however, impose considerable structure on the equivalent filters, which results in inferior time-frequency localization compared to direct designs. A study of critically sampled nonuniform filter banks has been reported by J. Princen in “The Design of Nonuniform Filter Banks,” IEEE Transactions on Signal Processing, Vol. 43, No. 11, pp. 2550-2560, November 1995. Nonuniform filter banks studied by Princen are obtained by joining pseudo QMF filter bank sections that are nearly perfect reconstruction filter banks based on cosine modulation and the principle of adjacent channel aliasing cancellation. R. Bernardini et al published “Arbitrary Tilings of the Time-Frequency Plane using Local Bases,” IEEE Transactions on Signal Processing, Vol. 47, No. 8, pp. 2293-2304, August 1999, wherein they describe a cosine-modulation-based structure that allows for time-adaptive nonuniform tiling of the time-frequency plane.

Despite their many fine features that are relevant to coding purposes, however, these approaches do not have good aliasing attenuation properties in each of the subbands independently. This makes them unsuitable for tasks where processing effects need to be contained within the bands directly affected. Perfect, or nearly perfect, reconstruction properties of these filter banks in the presence of upsampling are also not clear. The pseudo QMF bank, for instance, loses its aliasing cancellation property if the subband components are not critically downsampled.


Oversampled uniform filter banks based on cosine modulation were studied by Bölcskei et al, and reported in “Oversampled Cosine Modulated Filter Banks with Perfect Reconstruction,” IEEE Transactions on Circuits and Systems II, Vol. 45, No. 8, pp. 1057-1071, August 1998, but the cosine modulation places stringent aliasing attenuation requirements.


SUMMARY OF THE INVENTION

An advance in the art is attained with non-uniform filter banks created by joining sections of oversampled uniform filter banks that are based on complex exponential modulation (as opposed to cosine modulation). Each filter bank handles a given, non-overlapping frequency band, and the filters of different banks have different bandwidths. The frequency bands of the different filter banks cover the frequency of interest with gaps in the neighborhoods of the filter band edges. A set of transition filters fills those gaps.





BRIEF DESCRIPTION OF THE DRAWING


FIG. 1 shows an analysis filter bank stage, followed by a signal processing stage and a synthesis filter bank stage;



FIG. 2 shows the FIG. 1 arrangement, modified with a subsampling element interposed between the output of each analysis filter bank and the signal processing stage, and an upsampling element interposed between the signal processing stage and each synthesis filter bank;



FIG. 3 shows the frequency response of the filter arrangement disclosed herein; and



FIG. 4 presents a block diagram of the filter arrangement disclosed herein.





DETAILED DESCRIPTION

In accordance with the principles disclosed herein, nonuniform, oversampled filter banks are obtained by joining sections of different uniform filter banks with the aid of transition filters. The uniform filter banks that are used are nearly perfect-reconstruction, oversampled, modulated filter banks.



FIG. 1 shows an analysis filter bank 100, followed by a processing stage 200, and a synthesis filter bank 300. Analysis filter bank 100 is shown with N filter elements (of which elements 11, 12, 13 are explicitly shown), where each is a modulated version of a given window function ν[n]. That is, the time response of the i-th filter in the bank is:

$$h_i[n] = \nu[n]\, e^{j\frac{2\pi}{N}(i-1)n}, \qquad i = 1, 2, \ldots, N. \tag{1}$$

The term $e^{j\frac{2\pi}{N}(i-1)n}$ creates filters that are offset in frequency from one another, through a complex modulation of the fundamental filter function ν[n], and some artisans call these filters “modulates” of the filter ν[n].
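The modulation in equation (1) can be sketched numerically. The following is a minimal illustration, not the patent's implementation: the channel count N = 8, the length-32 Hann prototype, and the helper name `modulates` are all assumptions chosen for the example.

```python
import numpy as np

def modulates(window, N):
    """Return the N filters h_i[n] = v[n] * exp(j*2*pi*(i-1)*n/N), i = 1..N."""
    n = np.arange(len(window))
    return np.array([window * np.exp(2j * np.pi * (i - 1) * n / N)
                     for i in range(1, N + 1)])

# Illustrative values (assumptions, not from the source).
N = 8
window = np.hanning(32)
bank = modulates(window, N)

# Magnitude responses on a 256-point DFT grid.
# The filter with index i is centered at frequency 2*pi*(i-1)/N.
responses = np.abs(np.fft.fft(bank, 256, axis=1))
peak_bins = responses.argmax(axis=1)
```

Each row of `bank` is one modulate of the prototype; `peak_bins` locates the center of each channel on the DFT grid, which makes the uniform frequency offsets visible.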


It can be shown that, when there is no subsampling in the channels of such a filter bank (and, indeed, analysis filter 100 shows no subsampling), a necessary and sufficient condition for perfect and numerically stable reconstruction of any signal, x, from its subband components yi, where

$$y_i[n] = \sum_{k} x[k]\, h_i[n-k], \tag{2}$$

is given by

$$0 < A \le \sum_{i=0}^{N-1} \left| V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) \right|^2 \le B < \infty, \qquad \omega \in (-\pi, \pi). \tag{3}$$

In accord with the principles disclosed herein, however, a subclass of windows is considered that satisfies the equation (3) condition with A=B=N, which reduces equation (3) to:

$$\sum_{i=0}^{N-1} \left| V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) \right|^2 = N \quad \text{for all } \omega \in (-\pi, \pi). \tag{4}$$

The power complementarity condition of equation (4) permits good control over the effects of processing in the subband domain, and also assures stability.
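The condition of equation (4) can be checked numerically for a candidate window. The sketch below is illustrative only and rests on one fact that follows from equation (4) directly: a real, unit-energy window no longer than N samples satisfies it exactly, whereas longer windows generally need to be designed for it. The window choices and the helper name `power_sum` are assumptions.

```python
import numpy as np

def power_sum(window, N, omegas):
    """Evaluate sum_i |V(e^{j(w - 2*pi*i/N)})|^2 at each frequency in omegas."""
    n = np.arange(len(window))
    total = np.zeros(len(omegas))
    for i in range(N):
        shifted = omegas - 2 * np.pi * i / N
        V = np.exp(-1j * np.outer(shifted, n)) @ window   # DTFT samples
        total += np.abs(V) ** 2
    return total

N = 8
omegas = np.linspace(-np.pi, np.pi, 201)

# Length-N taper with unit energy: equation (4) holds exactly.
short_win = np.hanning(N + 2)[1:-1]
short_win = short_win / np.linalg.norm(short_win)
flat = power_sum(short_win, N, omegas)

# Length-4N Hann window with unit energy: not designed for equation (4).
long_win = np.hanning(4 * N)
long_win = long_win / np.linalg.norm(long_win)
ripple = power_sum(long_win, N, omegas)
```

Here `flat` stays at N across the whole frequency grid, while `ripple` varies with frequency, which is why longer windows have to be optimized under the power complementarity constraint.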


When the analysis filters satisfy this condition, the norm of the input signal is related to the norm of the corresponding subband components yi as follows:

$$\|x\|^2 = \frac{1}{N} \sum_{i} \|y_i\|^2. \tag{5}$$

It is noted that the considered signals belong to the space of square summable sequences, so the norm of x corresponds to

$$\|x\| = \left( \sum_{n} |x[n]|^2 \right)^{1/2}.$$
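The energy relation of equation (5) can be verified on a small example. This is a sketch under stated assumptions: a rectangular unit-energy window of length N (which satisfies equation (4)), a random test signal, channels indexed from 0, and no subsampling.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
v = np.ones(N) / np.sqrt(N)          # unit-energy window of length N (assumed)
n = np.arange(N)
x = rng.standard_normal(100)         # arbitrary finite-energy test signal

# Subband signals y_i = x * h_i (full linear convolution, no subsampling).
subbands = [np.convolve(x, v * np.exp(2j * np.pi * i * n / N)) for i in range(N)]

energy_x = np.sum(np.abs(x) ** 2)
energy_y = sum(np.sum(np.abs(y) ** 2) for y in subbands) / N
```

The two energies agree to rounding error, which is the tight-frame behavior that equation (5) expresses.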





If some processing modifies the subband components yi to yi+ei (that is, the input signals to filter bank 300 in FIG. 1 are the signals yi+ei), the total distortion ex in the signal x that is synthesized from the modified subband components is bounded by

$$\|e_x\|^2 \le \frac{1}{N} \sum_{i=0}^{N-1} \|e_i\|^2. \tag{6}$$

On the other hand, if the window ν satisfies the looser form of the perfect reconstruction condition given by equation (3), the distortions in the subband components and the distortion in the synthesized signal are related as

$$\frac{1}{B} \sum_{i=0}^{N-1} \|e_i\|^2 \le \|e_x\|^2 \le \frac{1}{A} \sum_{i=0}^{N-1} \|e_i\|^2. \tag{7}$$

Equation (7) indicates that the distortion in the synthesized signal may grow considerably out of proportion when A is small. Thus, the distortion limit of equation (6) is one advantage that arises from adopting the power complementarity condition of equation (4). Another advantage that arises from adopting the power complementarity condition of equation (4) is that an input signal can be perfectly reconstructed using a synthesis filter bank that consists of filters $\tilde{V}\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right)$, which are time-reversed versions of the analysis filters $V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right)$.

It is such filters that are depicted in filter bank 300 of FIG. 1 (with designation $\tilde{V}(z)$ corresponding to the time-reversed version of filter V(z)). That provides the convenience of not having to deal with the design of a synthesis window. Thus, the synthesis filter bank shown in FIG. 1 depicts filters 31, 32, and 33 that are the time-reversed versions of corresponding analysis filters 11, 12, and 13. It may be noted that the power complementarity condition of equation (4) also holds for the synthesis filters; i.e.,

$$\sum_{i=0}^{N-1} \left| \tilde{V}\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) \right|^2 = N \quad \text{for all } \omega \in (-\pi, \pi).$$

FIG. 2 shows an analysis filter bank where, following each of the filters, there is an associated subsampling circuit. That is, circuits 21, 22, and 23 follow filters 11, 12, and 13, respectively. Correspondingly, in the synthesis filter bank there are upsampling circuits 24, 25 and 26 that respectively precede filters 31, 32, and 33. In the case of the FIG. 2 arrangement where there is subsampling by K in the analysis channels, and the input signal is reconstructed from the subband components using time-reversed versions of the analysis filters preceded by K-factor upsampling, the reconstructed signal, in the Fourier domain, is given by

$$X_r(e^{j\omega}) = \frac{1}{K}\, X(e^{j\omega}) \sum_{i=0}^{N-1} \left| V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) \right|^2 + \frac{1}{K} \sum_{k=1}^{K-1} X\!\left(e^{j\left(\omega - \frac{2\pi}{K} k\right)}\right) A_k(e^{j\omega}) \tag{8}$$

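where $A_k(e^{j\omega})$ are aliasing components

$$A_k(e^{j\omega}) = \sum_{i=0}^{N-1} V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i - \frac{2\pi}{K} k\right)}\right). \tag{9}$$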

Based on M. R. Portnoff, “Time-Frequency Representation of Digital Signals and Systems Based on Short-Time Fourier Analysis,” IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 28, No. 1, pp. 55-69, February 1980; and Z. Cvetkovic, “On Discrete Short-Time Fourier Analysis,” IEEE Transactions on Signal Processing, Vol. 48, No. 9, pp. 2628-2640, September 2000, it can be shown that the aliasing components reduce to zero, and a perfect reconstruction condition is attained, if the window satisfies the constraint:

$$\sum_{j} \nu[k + jK]\, \nu[k + iN + jK] = \frac{1}{K}\, \delta[i], \qquad k = 0, 1, \ldots, K-1 \tag{10}$$

where

$$\delta[i] = \begin{cases} 1 & \text{when } i = 0 \\ 0 & \text{otherwise} \end{cases} \tag{11}$$
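The FIG. 2 arrangement can be sketched end to end for a window that satisfies constraint (10) exactly. Everything specific here is an assumption made for illustration: a rectangular unit-energy window of length N, N = 8, K = 4 (so K divides N), channels indexed from 0, conjugated time-reversed synthesis filters (the conjugate is needed because the modulated filters are complex), and the helper names. The gain K/N compensates for the factor N/K that equation (8) gives when equation (4) holds.

```python
import numpy as np

def modulated_bank(window, N):
    """Filters h_i[n] = v[n] * exp(j*2*pi*i*n/N), i = 0..N-1."""
    n = np.arange(len(window))
    return [window * np.exp(2j * np.pi * i * n / N) for i in range(N)]

def analyze(x, bank, K):
    # Filter with each h_i, then keep every K-th sample.
    return [np.convolve(x, h)[::K] for h in bank]

def synthesize(subbands, bank, K):
    N = len(bank)
    out = 0
    for d, h in zip(subbands, bank):
        up = np.zeros(len(d) * K, dtype=complex)
        up[::K] = d                                    # K-factor upsampling
        out = out + np.convolve(up, np.conj(h[::-1]))  # time-reversed filter
    return (K / N) * out

N, K = 8, 4
window = np.ones(N) / np.sqrt(N)      # satisfies constraint (10) for K | N
bank = modulated_bank(window, N)

rng = np.random.default_rng(1)
x = rng.standard_normal(50)
x_rec = synthesize(analyze(x, bank, K), bank, K)
delay = len(window) - 1               # delay of the causal time-reversed filters
```

With unmodified subbands, `x_rec[delay:delay + len(x)]` reproduces `x` up to rounding error. A rectangular window has poor frequency selectivity, so it only demonstrates the reconstruction property, not the aliasing containment the text goes on to require.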







As indicated above, the windows that are considered herein are those that satisfy the power complementarity condition in equation (4), and provide a nearly perfect reconstruction by having high enough attenuation for ω>2π/K. That is, there is no significant aliasing contribution due to subsampling; i.e.,

$$V(e^{j\omega})\, V\!\left(e^{j\left(\omega - \frac{2\pi}{K}\right)}\right) \approx 0 \quad \text{for all } \omega. \tag{12}$$

This makes aliasing sufficiently low in each subband independently.
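One way to quantify how well a window meets equation (12) is to evaluate the product on a dense frequency grid and report its peak relative to the squared passband peak. This is a rough sketch, not a design procedure from the source; the function name `aliasing_level_db`, the grid size, and the example windows are assumptions.

```python
import numpy as np

def aliasing_level_db(window, K, num_points=4096):
    """Peak of |V(e^{jw}) V(e^{j(w - 2*pi/K)})| relative to max|V|^2, in dB.

    num_points must be divisible by K so that 2*pi/K is a whole number of bins.
    """
    V = np.abs(np.fft.fft(window, num_points))
    shift = num_points // K                 # 2*pi/K expressed in DFT bins
    product = V * np.roll(V, shift)         # V(w) * V(w - 2*pi/K)
    return 20 * np.log10(product.max() / V.max() ** 2)

K = 4
rect_db = aliasing_level_db(np.ones(64), K)
hann_db = aliasing_level_db(np.hanning(64), K)
```

A window with faster sidelobe decay (the Hann window here) yields a much lower figure than the rectangular window, which is the property equation (12) asks for.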


In order to facilitate the design of transition filters, it is convenient to deal with windows that also have sufficiently high attenuation for frequencies ω>2π/N. Sufficiently high attenuation for that purpose means that in the power complementarity formula of equation (4), for any 2πi/N<ω<2π(i+1)/N, the only significant contributions come from the filter having a center frequency at 2πi/N and the filter having a center frequency at 2π(i+1)/N; the contribution of other filters is negligible. Stated in other words, a filter spills energy into the band of no other filters except, possibly, the immediately adjacent filters. Stated in still other words, the attenuation of a filter centered at 2πi/N at frequencies ω<2π(i−1)/N and ω>2π(i+1)/N is greater than a selected value.


The above addresses uniform filters, with subsampling of K. One aspect of the arrangement disclosed herein, however, is the use of filters that are not necessarily of uniform bandwidth, so that a signal can be analyzed with a time resolution that is commensurate with the bandwidth of the filters, while a nearly perfect reconstruction of the signal remains possible. Consider, therefore, a first filter section, such as section 100 in FIG. 2, with subsampling of K1, and a window function $\nu_1$ (with filters $V_1\!\left(e^{j\left(\omega - \frac{2\pi}{N_1} i\right)}\right)$), and a second filter section, such as section 100 in FIG. 2, with subsampling of K2, and a window function $\nu_2$ (with filters $V_2\!\left(e^{j\left(\omega - \frac{2\pi}{N_2} i\right)}\right)$).

Consider further that window ν1 filters are employed below the “break-over” frequency ω0 and window ν2 filters are employed above ω0. Consequently, there is a gap in the frequency response function, as shown by the filters on lines 10 and 30 in FIG. 3 (where, illustratively, N1=16 and N2=12) and the resulting gap region 40. This gap is filled with a transition analysis filter $V_{1,2}(e^{j(\omega - \omega_0)})$ having a window ν1,2.


The shape of window ν1,2 is designed to provide for the aforementioned near perfect reconstruction of a signal (in the absence of processing between the analysis filters and the synthesis filters). When the transition analysis filter is chosen to be subsampled at rate K2 (and the transition synthesis filter upsampled at rate K2), the above constraint also means that the shape of window ν1,2 satisfies the expression

$$\frac{1}{K_1} \sum_{i=0}^{n_1 - 1} \left| V_1\!\left(e^{j\left(\omega - \frac{2\pi}{N_1} i\right)}\right) \right|^2 + \frac{1}{K_2} \left| V_{1,2}\!\left(e^{j(\omega - \omega_0)}\right) \right|^2 + \frac{1}{K_2} \sum_{i=n_2}^{N_2 - 1} \left| V_2\!\left(e^{j\left(\omega - \frac{2\pi}{N_2} i\right)}\right) \right|^2 \approx R, \tag{13}$$

where R is a constant. This can be achieved by selecting the ν1,2 window to have the shape of the ν2 at positive frequencies, and the shape of the ν1 at negative frequencies; but the latter being scaled by √(N1/N2). That is the arrangement shown on line 30 of FIG. 3. The physical filter arrangement is shown in FIG. 4, with filter bank 150 providing the response of line 10 in FIG. 3, filter bank 160 providing the response of line 30 in FIG. 3, and filter 170 providing the response of line 20 in FIG. 3.


In a relatively simple embodiment, the subsampling factors (K1, K2) and window bandwidths (2π/N1, 2π/N2) are selected so that N1/K1=N2/K2=R, and the “break-over” frequency, ω0, is chosen to satisfy the condition

$$2\pi n_1/N_1 = 2\pi n_2/N_2 = \omega_0, \tag{14}$$

where n1 and n2 are integers. In FIG. 3, for example, N1=16, N2=12, n1=4 and n2=3. Filter bank 150, in accordance with the above, includes filters $V_1(z e^{-j2\pi i/N_1})$, where i=0, 1, 2, . . . , n1−1, N1−n1+1, N1−n1+2, . . . , N1−1, and filter bank 160 includes filters $V_2(z e^{-j2\pi i/N_2})$, where i=n2+1, n2+2, . . . , N2−n2−1.
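The integer condition on the break-over frequency can be explored with a few lines of code. This sketch is illustrative: the helper names are hypothetical, and the search is restricted to break-over frequencies strictly between 0 and π, which is an assumption made here rather than a constraint stated in the text.

```python
from math import gcd

def breakover_pairs(N1, N2):
    """All integer pairs (n1, n2) with n1/N1 == n2/N2 and 0 < 2*pi*n1/N1 < pi."""
    g = gcd(N1, N2)
    pairs = []
    m = 1
    while 2 * m < g:                      # m/g < 1/2 keeps the frequency below pi
        pairs.append((m * N1 // g, m * N2 // g))
        m += 1
    return pairs

def section_indices(N1, n1, N2, n2):
    """Channel indices retained in the low-frequency and high-frequency sections."""
    low = list(range(0, n1)) + list(range(N1 - n1 + 1, N1))
    high = list(range(n2 + 1, N2 - n2))
    return low, high

pairs = breakover_pairs(16, 12)                     # FIG. 3 values N1=16, N2=12
low, high = section_indices(16, 4, 12, 3)
```

For the FIG. 3 values the only admissible pair is n1 = 4, n2 = 3, which places the break-over frequency at π/2; the index lists mirror the ranges quoted in the text for the two uniform sections.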


Based on the above, one can conclude that in order to have nearly perfect reconstruction in a nonuniform filter bank that comprises uniform filter bank sections, the window of each of the uniform filter banks should have sufficiently high attenuation for ω greater than 2π divided by the respective value of K of the filter, and also sufficiently high attenuation for ω greater than 2π divided by the respective value of N of the filter. The attenuation for ω>2π/K controls aliasing, while the high attenuation for ω>2π/N facilitates design of transition filters for attaining nearly perfect reconstruction with the nonuniform filter bank. Roughly speaking, the attenuation at ω=2π/N, and twice that attenuation at ω=2π/K, expressed in decibels, have a comparable effect on the error in the approximation to equation (13).


To design a window that has a high attenuation in the band π/N to π, one can simply impose the requirement that the integral of the energy in that band should be minimized. As indicated above, some spilling of energy close to π/N is permissible, but it is considered important that the attenuation that is farther removed from π/N should be large. To achieve this result, it is suggested that the function to be minimized might be one that accentuates higher frequencies. For example, one might choose to minimize the weighted integral

$$E_N = \int_{\pi/N}^{\pi} \left| V(e^{j\omega}) \right|^2 \omega^3 \, d\omega. \tag{15}$$

This energy function is given by the quadratic form

$$E_N = \nu^T T \nu, \tag{16}$$

where ν is the window (time) function in the form of a column vector and the matrix T comprises the elements:

$$[T]_{i,j} = \int_{\pi/N}^{\pi} \omega^3 \cos\big((i - j)\,\omega\big)\, d\omega. \tag{17}$$
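The equivalence between the weighted integral and the quadratic form can be checked numerically. The sketch below uses a simple trapezoidal rule for both; the window length, channel count, grid density, and helper names are assumptions made for the illustration, and a closed-form evaluation of the matrix entries would work equally well.

```python
import numpy as np

def trapezoid(values, grid):
    """Composite trapezoidal rule along the last axis."""
    return np.sum((values[..., 1:] + values[..., :-1]) * np.diff(grid) / 2, axis=-1)

def weight_matrix(L, N, num_points=2001):
    """Matrix with entries int_{pi/N}^{pi} w^3 cos((i-j) w) dw, plus the grid used."""
    omega = np.linspace(np.pi / N, np.pi, num_points)
    lag = np.arange(L)[:, None] - np.arange(L)[None, :]          # i - j
    integrand = omega ** 3 * np.cos(lag[..., None] * omega)      # shape (L, L, points)
    return trapezoid(integrand, omega), omega

L, N = 16, 4                       # illustrative sizes
T, omega = weight_matrix(L, N)
v = np.hanning(L)                  # any real window works for the check

# Direct evaluation of the weighted stopband energy.
V = np.exp(-1j * np.outer(omega, np.arange(L))) @ v
E_direct = trapezoid(np.abs(V) ** 2 * omega ** 3, omega)

# Quadratic-form evaluation.
E_quadratic = v @ T @ v
```

Both evaluations agree because, for a real window, the squared magnitude response is a sum of cosines in the lag i − j weighted by products of window samples.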








The window design requires minimization of the quadratic form in equation (16) under the power complementarity condition which, expressed in the time domain, takes the form of the following set of quadratic constraints:

$$\sum_{n} \nu[n]\, \nu[n + iN] = \delta[i]. \tag{18}$$
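The step from the frequency-domain condition of equation (4) to these time-domain constraints can be made explicit. The following short derivation is a sketch that assumes a real-valued window and writes the lag index as $\ell$:

```latex
\begin{aligned}
\sum_{i=0}^{N-1} \left| V\!\left(e^{j\left(\omega - \frac{2\pi}{N} i\right)}\right) \right|^2
 &= \sum_{n}\sum_{m} \nu[n]\,\nu[m]\, e^{-j\omega (n-m)}
    \sum_{i=0}^{N-1} e^{j\frac{2\pi}{N} i (n-m)} \\
 &= N \sum_{\ell} \Big( \sum_{n} \nu[n]\,\nu[n+\ell N] \Big)\, e^{j\omega \ell N}.
\end{aligned}
```

The inner sum over $i$ equals $N$ when $n-m$ is a multiple of $N$ and zero otherwise, which leaves only the terms $m = n + \ell N$. The right-hand side equals $N$ for all $\omega$ exactly when the bracketed autocorrelation samples equal $\delta[\ell]$.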








This is a perfectly valid design approach, but it can be numerically very expensive and hard to implement for long windows. A simpler and faster approach to window design is found in an article by the inventor herein, Z. Cvetkovic, “On Discrete Short-Time Fourier Analysis,” IEEE Transactions on Signal Processing, Vol. 48, No. 9, September 2000, pp. 2628-2640, which is hereby incorporated by reference, and is briefly described below. This simpler design approach represents a window using a linear combination of discrete prolate spheroidal sequences that are eigenvectors of matrices SL(α), with matrix elements:

$$[S_L(\alpha)]_{i,j} = \frac{\sin\big((i-j)\alpha\big)}{(i-j)\pi}, \qquad 1 \le i, j \le L, \quad \alpha = \pi/N. \tag{19}$$

Given a column vector ν, the quadratic form $\nu^T S_L(\alpha)\, \nu$ gives the energy of ν in the frequency band (0, π/N). The eigenvectors ρ0, ρ1, . . . , ρL−1 of SL(α) are orthogonal, and the corresponding eigenvalues λ0, λ1, . . . , λL−1 are distinct and positive. Sorting the eigenvectors so that λi>λi+1, the window to be designed is constructed from a linear combination of the first L/N+k+1 eigenvectors,

$$\nu = \sum_{i=0}^{L/N + k} a_i\, \rho_i \tag{20}$$
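The eigenvector basis used in this expansion can be computed directly from the matrix of equation (19). The sketch below is illustrative: L = 64, N = 8, and k = 5 are assumed values (the text only suggests k between 5 and 10), the helper name is hypothetical, and the diagonal entries are taken as the limit α/π of the expression in equation (19).

```python
import numpy as np

def concentration_matrix(L, alpha):
    """Matrix with entries sin((i-j)*alpha) / ((i-j)*pi); diagonal equals alpha/pi."""
    idx = np.arange(L)
    lag = idx[:, None] - idx[None, :]
    # np.sinc(x) = sin(pi*x)/(pi*x), so this equals sin(lag*alpha)/(lag*pi).
    return (alpha / np.pi) * np.sinc(lag * alpha / np.pi)

L, N, k = 64, 8, 5
S = concentration_matrix(L, np.pi / N)

eigvals, eigvecs = np.linalg.eigh(S)          # symmetric matrix, ascending order
order = np.argsort(eigvals)[::-1]             # re-sort so eigenvalues decrease
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

num_basis = L // N + k + 1
basis = eigvecs[:, :num_basis]                # columns rho_0 ... rho_{L/N+k}
```

The leading eigenvalues are close to 1, meaning the corresponding sequences keep almost all of their energy inside the passband; restricting the window to the span of `basis` is what shrinks the optimization from L unknowns to L/N+k+1 coefficients.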








where we take k to be a number between 5 and 10. The constraints of equation (18) translate into the following set of constraints in the expansion coefficients,

$$\sum_{l,m=0}^{L/N + k} c_{lm}^{(k)}\, a_l\, a_m = \delta[k], \qquad k = 1, 2, \ldots, L/N - 1 \tag{21}$$

where

$$c_{lm}^{(k)} = \sum_{n} \rho_l[n]\, \rho_m[n + kN]. \tag{22}$$

The design then amounts to finding the minimum of the quadratic form $a^T T_a\, a$ under the constraints of equation (21), where

$$a = [a_0 \;\ldots\; a_{L/N+k}]^T, \tag{23}$$

$T_a = r^T T r$, and r is the matrix of the first L/N+k+1 eigenvectors ρi,

$$r = [\rho_0 \;\rho_1 \;\ldots\; \rho_{L/N+k}]. \tag{24}$$


A transition window ν1,2 for joining a uniform section of an N1-channel filter bank based on a window ν1 at low frequencies with a uniform section of an N2-channel filter bank based on a window ν2 at high frequencies is designed by approximating, as closely as possible, the frequency response of ν1 at negative frequencies, and the frequency response of ν2 at positive frequencies. This amounts to minimizing an energy function Etr given by

$$E_{tr} = \int_{-\pi}^{0} \left| V_1(e^{j\omega}) - V_{1,2}(e^{j\omega}) \right|^2 d\omega + \int_{0}^{\pi} \left| \beta\, V_2(e^{j\omega}) - V_{1,2}(e^{j\omega}) \right|^2 d\omega \tag{25}$$

where β=√(N1/N2), and carrying out the design process as discussed above.


It is noted that the choice of ω0 is not completely unconstrained if one wishes the relationship

$$\omega_0 = 2\pi\, \frac{n_1}{N_1}$$

to hold, where both n1 and N1 are integers; this is particularly true when the value of N1 is constrained by other design preferences. This, however, is not an absolute limitation of the design approach disclosed herein because as long as a gap is left between filter banks (like gap 40) and a transition filter is designed that meets the requirements of equation (13), near perfect reconstruction performance is attainable.


The above disclosed an arrangement where the frequency band of interest is divided into two bands, with each of the two bands being handled by one filter bank, and with a single transition filter between the two filter banks. It should be realized that this design approach could be easily extended to a non-uniform filter bank that includes any number of uniform filter banks, with each uniform filter bank segment having a lower cutoff frequency, $\omega_i^{\text{lower}}$, and an upper cutoff frequency, $\omega_i^{\text{upper}}$, and the number of filters in the bank being dictated by the two cutoff frequencies and the desired bandwidth, 2π/Ni.


The above disclosed an approach for creating a filter bank that performs a non-uniform decomposition of a signal having a given bandwidth, by means of an example where two different filter bank sections are joined using a transition filter. A more general non-uniform filter bank can be created by joining non-overlapping sections of any number of uniform filter banks, using a plurality of transition filters. Many variations can be incorporated by persons skilled in the art based on the disclosed approach, without departing from the spirit and scope of this invention. For example, assume that three bands are desired, with “break-over” frequencies ω0 and ω1. A set of constants can be selected so that 2πn1/N1=2πn2,1/N2=ω0 and 2πn2,2/N2=2πn3/N3=ω1; filters V1, V2, V3 can be selected, and transition filters V1,2 and V2,3 can be designed, as described above.

Claims
  • 1. An arrangement comprising: a collection of filters indexed by i, where i=0, 1, 2, . . . n1, wherein n1 is an integer less than N1, each having an input that is responsive to a different signal, where filter i has the transfer function
  • 2. The arrangement of claim 1 further comprising a second collection of filters, indexed by i, where i=n2+1, n2+2, . . .
  • 3. A non-uniform filter bank comprising: a B plurality of filter bank sections, each spanning a chosen non-overlapping frequency band, where filter bank section j includes a filter Vj of bandwidth 2π/Nj and nj of its modulates, each responsive to a different signal and each developing an output signal, with said nj output signals combined to form an output of section j; and at least one transition filter for joining two adjacent filter bank sections j=k and j=k+1, that is a modulate of a filter Vk,k+1 with a shape of Vk+1 at positive frequencies and a shape of Vk at negative frequencies, with the latter scaled by √(Nk/Nk+1).
US Referenced Citations (1)
Number Name Date Kind
5568142 Velazquez et al. Oct 1996 A