The present invention relates to a system and method for low-delay filtering with low computational complexity, and more particularly to noise reduction in an In-car Communication System that includes efficient low-delay filtering.
In an In-Car Communication (ICC) system that supports speech communication between passengers, the signal recorded by the microphone(s) is processed and played back over one or more loudspeakers within the car. Communication can often be improved via noise reduction. Noise reduction typically involves a high processing delay. If the processing delay is too large, the signal may sound reverberant, similar to being in a bathroom. Such a large delay is unsuitable for an ICC system.
Noise reduction is usually performed by filtering in the frequency domain. With standard filterbank-based frequency-domain methods, the low-delay requirements are typically not met because the filterbank introduces a relatively large delay. A common way to implement low-delay filtering is the so-called overlap-save method. However, implementing time-invariant filtering using the overlap-save method is computationally expensive compared to commonly used filterbank methods. Methods have been proposed to reduce the complexity; however, they do so at the cost of introducing slight signal distortions.
Basics of the Discrete Fourier Transformation (DFT)
The DFT and the inverse DFT can be performed with X=Fx and x=F−1X, respectively. Here, x denotes the time-domain vector, X is the frequency-domain vector, and F denotes the N×N DFT matrix with elements F(λ,n)=exp(−j2πλn/N) for λ, n=0, . . . , N−1.
If N is a power of 2, the DFT can be performed efficiently by a Fast Fourier Transform (FFT).
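The relation between the matrix form and the FFT can be checked numerically. The following is a minimal sketch, assuming the common unnormalized DFT convention with elements exp(−j2πλn/N) (the convention used by NumPy); the frame length N=8 and the random test vector are illustrative choices only.

```python
import numpy as np

N = 8                                   # illustrative frame length (a power of 2)
n = np.arange(N)
# Assumed convention: F[lambda, n] = exp(-j*2*pi*lambda*n/N), no normalization
F = np.exp(-2j * np.pi * np.outer(n, n) / N)

rng = np.random.default_rng(0)
x = rng.standard_normal(N)              # arbitrary time-domain vector

X_matrix = F @ x                        # X = F x (N*N multiplications)
X_fft = np.fft.fft(x)                   # same spectrum computed by an FFT
x_back = np.linalg.solve(F, X_matrix)   # x = F^-1 X
```

Both routes give the same spectrum; the FFT merely needs far fewer operations for large N.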
Multiplication in the Frequency Domain
It is known from Fourier theory that a convolution in time domain corresponds to a multiplication in frequency domain. In case of the DFT, this holds true only if one of the two signals is assumed to be a periodic signal with period length N where N is also the order of the DFT. This is called the cyclic convolution property of the DFT. In many practical applications, however, signals are not periodic. Hence, in general a multiplication in frequency domain does not lead to the same result as the time domain convolution. Therefore effects of cyclic convolution have to be avoided.
A large benefit of frequency domain processing is that a filtering operation can be realized simply by applying frequency dependent weights W(λ) to a spectrum X(λ)
Y(λ)=W(λ)X(λ), (1)
rather than performing convolutions in time domain. Here, λ denotes the frequency index. Except for very low order filters, this reduces the processing load dramatically. To benefit from this efficient way of implementing a convolution, while obtaining the desired non-circular result, it must be ensured that the time-domain filter coefficients w(n)=IFFT{W(λ)} exhibit Q=N−P trailing zeros at the end
w=[w(0), . . . , w(P−1),0, . . . , 0]T (2)
If this is the case, the time-domain result y=IFFT{W(λ)X(λ)} contains P−1 samples [y(0), . . . , y(P−2)] which correspond to a cyclic convolution. The remaining Q+1 samples [y(P−1), . . . , y(N−1)] correspond to a non-cyclic convolution.
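The split into P−1 cyclic samples and Q+1 non-cyclic samples can be illustrated with a short numerical sketch. The values N=16 and P=5 and the random data are assumptions made only for this example.

```python
import numpy as np

N, P = 16, 5                      # illustrative frame length and filter length
Q = N - P                         # number of trailing zeros
rng = np.random.default_rng(1)
x = rng.standard_normal(N)        # one input frame
w = np.zeros(N)
w[:P] = rng.standard_normal(P)    # filter laid out as in Eq. 2

# Multiplication in the frequency domain (Eq. 1) followed by the IFFT
y = np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)).real

# Reference: ordinary (non-cyclic) convolution of the frame with the P taps
y_lin = np.convolve(x, w[:P])

valid = np.arange(P - 1, N)       # indices y(P-1), ..., y(N-1)
```

Only the samples listed in `valid` agree with `y_lin`; the first P−1 samples contain wrap-around contributions.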
Finally, the following should be noted. Consider a filter vector w(n) with P non-zero coefficients, of which {tilde over (P)} are located at its end (these can be considered non-causal coefficients which reoccur at the end of the frame because of the cyclic property of the DFT):
w=[w(0), . . . , w(P−{tilde over (P)}−1),0, . . . , 0,w(N−{tilde over (P)}), . . . , w(N−1)]T (3)
In this case, the output samples that are free from cyclic convolution effects are
[y(P−{tilde over (P)}−1), . . . , y(N−{tilde over (P)}−1)]. (4)
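The index range in Eq. 4 can be verified with a small sketch. Here `Pt` stands for {tilde over (P)}; the sizes N=16, P=5, Pt=2 and the variable names are assumptions for illustration.

```python
import numpy as np

N, P, Pt = 16, 5, 2               # Pt plays the role of P-tilde (non-causal taps)
rng = np.random.default_rng(2)
taps = rng.standard_normal(P)     # h(-Pt), ..., h(P-Pt-1) in natural time order

w = np.zeros(N)
w[:P - Pt] = taps[Pt:]            # causal part at the start of the frame
w[N - Pt:] = taps[:Pt]            # non-causal part wrapped to the end (Eq. 3)

x = rng.standard_normal(N)
y = np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)).real

# Non-cyclic reference; np.convolve treats taps[0] as time 0, so shift by Pt
z = np.convolve(x, taps)
first, last = P - Pt - 1, N - Pt - 1      # index range from Eq. 4
y_ref = z[first + Pt:last + Pt + 1]
```

The number of valid samples is still N−P+1; only their position within the frame moves by {tilde over (P)}.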
Overlap-Save Filtering
Time-Invariant Filtering
In order to save computational power, signal filtering can be performed in the frequency domain. Instead of sample-based convolution, the filtering in the frequency domain can be performed by multiplication. The basic structure of the overlap-save method is depicted in
W=Fw. (5)
The result is then applied 105 to the signal vectors
Y(k)=W{circle around (x)}X(k) (6)
where the symbol {circle around (x)} stands for elementwise multiplication. As w obeys Eq. 2, the IFFT 107 of the filtered spectra Y(k), i.e., y(k)=F−1Y(k), has the property that its last Q+1 elements are valid, i.e., they correspond to a non-circular convolution. To obtain the output signal stream yout(n), one block of R samples is extracted 109 from y(k)=[y(0, k), . . . , y(N−1, k)] for each frame k, e.g., yout(kR+n)=y(N/2+n, k) for 0≦n<R. If R≦Q+1, the correct (non-cyclic) convolution can be realized. If the filter vector does not change over time, the operation in Eq. 5 can be calculated only once before the start of processing rather than for each frame.
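The overlap-save steps described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the embodiment itself: the function name `overlap_save`, the parameter `start` (extraction offset within the frame), and all sizes are hypothetical.

```python
import numpy as np

def overlap_save(x, taps, N, R, start):
    """Filter x with FIR taps using N-point frames, frame shift R and
    extraction offset `start` (hypothetical helper for illustration)."""
    P = len(taps)
    # The extracted block must lie inside the non-cyclic region y(P-1..N-1)
    assert P - 1 <= start <= N - R
    W = np.fft.fft(taps, N)                 # Eq. 5: zero-padded taps, computed once
    buf = np.zeros(N)                       # most recent N input samples
    out = []
    for k in range(len(x) // R):
        buf = np.concatenate((buf[R:], x[k * R:(k + 1) * R]))
        Y = W * np.fft.fft(buf)             # Eq. 6: elementwise multiplication
        y = np.fft.ifft(Y).real
        out.append(y[start:start + R])      # R samples free of wrap-around
    return np.concatenate(out)

rng = np.random.default_rng(3)
N, R = 32, 8
taps = rng.standard_normal(9)               # P = 9 coefficients, Q = 23 zeros
x = rng.standard_normal(10 * R)
y_out = overlap_save(x, taps, N, R, start=N // 2)

delay = N - R - N // 2                      # delay implied by extracting at N/2
y_ref = np.convolve(x, taps)[:len(x)]       # sample-based reference
```

In this sketch the extracted block starts at N/2, so the output equals the sample-based convolution delayed by N−R−N/2 samples; choosing `start=N-R` would remove that extra delay.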
Time-Variant Filtering
There are also applications known where the filter is not fixed as described above. In J. J. Shynk: Frequency-domain and multirate adaptive filtering, IEEE Signal Processing Magazine, pp. 14-37, January 1992, which is hereby incorporated by reference herein, a frequency-domain adaptive filter method is described where the filter coefficients change due to updates in the frequency-domain {tilde over (W)}(k)=W(k−1)+Δ(k). This is realized by a constraint in the time-domain:
W(k)=Fdiag{c}F−1{tilde over (W)}(k). (7)
Multiplication with F and F−1 represents the application of the FFT and the inverse FFT, respectively. In Shynk, a rectangular window c=chard=[1, . . . , 1, 0, . . . , 0] with P=N/2 ones and Q=N/2 zeros is used. Applying this window sets the filter coefficients selectively to zero (hard constraint). As this operation has to be performed for each modification of {tilde over (W)} (i.e., for each filter update) this procedure can be quite expensive. Even applying the constraint matrix Chard=Fdiag{chard}F−1 directly in the frequency domain to the unconstrained filter {tilde over (W)}(k) can be expensive.
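The hard constraint of Eq. 7 can be sketched both with FFTs and with the explicit matrix Chard. The update term `delta`, the frame length N=16, and the unnormalized DFT convention are assumptions for illustration.

```python
import numpy as np

N = 16
P = Q = N // 2
rng = np.random.default_rng(4)

# Previous constrained filter and an arbitrary frequency-domain update
W_prev = np.fft.fft(np.r_[rng.standard_normal(P), np.zeros(Q)])
delta = 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
W_unc = W_prev + delta                       # unconstrained filter

c_hard = np.r_[np.ones(P), np.zeros(Q)]      # rectangular window

# Eq. 7 via IFFT, windowing and FFT
W_con = np.fft.fft(c_hard * np.fft.ifft(W_unc))

# The same operation as a matrix-vector product with C_hard = F diag{c} F^-1
n = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(n, n) / N)
C_hard = F @ np.diag(c_hard) @ (F.conj() / N)
w_con = np.fft.ifft(W_con)                   # time-domain filter after constraint
```

The FFT route costs two transforms per filter update, while the dense matrix product costs on the order of N² complex multiplications, which motivates the approximation discussed next.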
Alternatively, in order to save computations, this procedure can be approximated in the frequency domain, as proposed by G. Enzner, P. Vary: A soft-partitioned frequency-domain adaptive filter for acoustic echo cancellation, Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2003, which is hereby incorporated by reference herein. Here, a soft constraint is applied instead of a hard constraint. As opposed to a rectangular time-domain window with hard rectangular edges that reflect a hard constraint and cause spurious effects/leakage, a time-domain window function that represents a soft constraint has smooth transitions/edges. Illustratively, and without limitation, a Hann window wH(n) may be used. Other soft windows may also be utilized, as known in the art, such as, without limitation, a Hamming window, a cosine window, a Gaussian window, a Bartlett-Hann window, a Blackman window, a Kaiser window, and other parametric windows.
The Hann window wH(n) may have length P=N/2 and be zero-padded by Q=N/2 zeros
csoft=[wH(0), . . . , wH(P−1),0, . . . , 0] (8)
with
wH(n)=½(1−cos(2πn/P)). (9)
The corresponding frequency-domain constraint matrix
Csoft=Fdiag{csoft}F−1 (10)
is a Toeplitz matrix and exhibits negligible values apart from the main diagonal and a few secondary diagonals. Setting these values to zero, the matrix-vector multiplication W=Csoft{tilde over (W)} reduces to a short convolution in the frequency domain
W(λ)=Σ_{l=−L}^{L} Csoft(l){tilde over (W)}(λ+l), (11)
where the index λ+l is taken modulo N.
The coefficients Csoft(l) are taken from the main and secondary diagonals of Csoft
Csoft(l)=Csoft|n,n+l, for l∈[−L, . . . , L]. (12)
The coefficients for L=3 and for the example setting from paragraph [0015] are (note that the symbol * indicates the complex conjugate):
Csoft(0)=0.25, (13)
Csoft(1)=C*soft(−1)=0.21221j, (14)
Csoft(2)=C*soft(−2)=0.125, (15)
Csoft(3)=C*soft(−3)=0.042441j. (16)
The constraint matrix Csoft,approx that corresponds to the convolution in Eq. 11 is still a Toeplitz matrix but sparse.
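The concentration of Csoft around its main diagonal can be inspected numerically. The sketch below assumes N=256 and the unnormalized DFT convention, and it compares magnitudes only, since the phase of the off-diagonal entries depends on the chosen indexing and sign conventions.

```python
import numpy as np

N = 256                                            # illustrative frame length
P = N // 2
n = np.arange(P)
w_hann = 0.5 * (1 - np.cos(2 * np.pi * n / P))     # Eq. 9
c_soft = np.r_[w_hann, np.zeros(N - P)]            # Eq. 8

idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N)   # DFT matrix (assumed convention)
F_inv = F.conj() / N                               # inverse DFT matrix
C_soft = F @ np.diag(c_soft) @ F_inv               # Eq. 10

row = 10                                           # arbitrary row of the matrix
# Magnitudes on the main diagonal (l=0) and secondary diagonals l=1..5
diag_mag = np.array([abs(C_soft[row, row + l]) for l in range(6)])
```

Under these assumptions the magnitudes for l=0, . . . , 3 come out close to 0.25, 0.21221, 0.125 and 0.042441, while the entries for l=4 and l=5 are far smaller, which is what makes the truncation to a few diagonals reasonable.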
Time-Variant Filtering for Spectral Weighting
For the purposes of noise reduction it is desirable to specify a filter response directly in the frequency domain, where for each frequency bin λ a real-valued weighting factor is determined dynamically. Because of the real values, the time-domain representation of the filter is symmetric with respect to time index n=0 (considering a periodic extension of the buffer in both time directions). The filter weights may have been calculated with a certain filter characteristic which usually evaluates the signal-to-noise ratios of the different frequency bins. Examples for such filter characteristics can be found in G. Schmidt: Single-Channel Noise Suppression Based on Spectral Weighting—An Overview, Eurasip Newsletter, Vol. 15, No. 1, pp. 9-24, March 2004, which is hereby incorporated by reference in its entirety.
Illustratively, in a prior art in-car communication system, for each frame k a vector of frequency-dependent filter weights {tilde over (W)}(k) was calculated according to an arbitrary filter characteristic. As the corresponding time-domain filter is symmetric with respect to n=0, whereas the maximum of the time-domain constraint window from Eq. 8 is at N/4 and Chard from paragraph [0013] is also symmetric with respect to N/4, the filter coefficients were shifted in time by N/4 samples. This was performed in the frequency domain by applying a linear phase rotation
{tilde over (W)}D(λ,k)=exp(−jπλ/2){tilde over (W)}(λ,k)=(−j)^λ{tilde over (W)}(λ,k). (17)
After that the approximated constraint matrix was applied
W(k)=Csoft,approx{tilde over (W)}D(k) (18)
and the result in turn was applied to the input signal spectra
Y(k)=W(k)X(k). (19)
The filtered time-domain signal buffer was obtained as y(k)=F−1Y(k). R samples of it were used as the output signal
yout(kR+n)=y(N/2+n,k), for n∈[0, . . . , R−1]. (20)
The filter weights in Eq. 17 were modified by phase rotation in order to match to the location of the constraint window, aligned to time index 0. This phase rotation undesirably adds complexity into the system.
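The equivalence between the phase rotation of Eq. 17 and a cyclic time shift by N/4 samples can be sketched as follows. The frame length N=16 and the random real-valued weights are assumptions for illustration.

```python
import numpy as np

N = 16
lam = np.arange(N)                           # frequency index lambda
rng = np.random.default_rng(5)

# Real-valued weights with W(lambda) = W(N - lambda), as for a real signal
half = rng.uniform(0.1, 1.0, N // 2 + 1)
W_real = np.r_[half, half[-2:0:-1]]

w_time = np.fft.ifft(W_real).real            # symmetric with respect to n = 0
w_imag = np.fft.ifft(W_real).imag            # vanishes for such weights

rot = np.exp(-1j * np.pi * lam / 2)          # linear phase term of Eq. 17
W_D = rot * W_real                           # now complex-valued
w_shifted = np.fft.ifft(W_D)                 # time-domain filter after rotation
```

The rotation turns the real-valued weights into complex-valued ones, so every subsequent multiplication with them becomes a complex multiplication.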
In accordance with an embodiment of the invention, a method for adaptive digital filtering is provided. Illustratively, the method includes determining a time domain representation of a soft constraint window, the soft constraint window for zero padding an adaptive digital filter. The time domain representation of the soft constraint matrix is circularly shifted to align in the frequency domain with the adaptive digital filter. The time domain representation of the circularly shifted soft constraint matrix is transformed to a frequency domain representation of the circularly shifted soft constraint matrix. A frequency domain representation of the adaptive digital filter is determined. The frequency domain representation of the circularly shifted soft constraint matrix window is applied to the frequency domain representation of the adaptive digital filter, such that the resulting adaptive digital filter corresponds to a digital filter in the time domain that includes a plurality of consecutive zeros.
In accordance with related embodiments of the invention, the resulting frequency domain representation of the digital filter is applied to a frequency domain representation of an input signal. An overlap-save methodology may be used. The corresponding time domain representation of the soft constraint window may be substantially a Hann window.
In accordance with another embodiment of the invention, a method of frequency-domain filtering is provided. The method includes cascading a plurality of filters Wi=1,I in the frequency domain, each of the filters Wi being constrained and having the same length, to form a combined filter Wall=W1W2 . . . WI.
In accordance with related embodiments of the invention, Wall may be applied to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The frequency domain representation of the output signal may be transformed to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a method of frequency-domain filtering is provided that includes a plurality of filters, the plurality of filters including at least one constrained filter(s) Wi=1,I and at least one unconstrained filter(s) {tilde over (W)}k=1,K. The method includes cascading the {tilde over (W)}k=1,K unconstrained filter(s). A single constraint window C is applied to the cascaded {tilde over (W)}k=1,K unconstrained filter(s). The Wi=1,I constrained filter(s) are cascaded with the constrained cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K)W1 . . . WI.
In accordance with related embodiments of the invention, Wall may be applied to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The frequency domain representation of the output signal may be transformed to a time domain representation of the output signal. The Wi=1,I constrained filters may be a fixed equalizer WfixEQ and the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ, such that Wall=WfixEQ C {tilde over (W)}dynEQ. WfixEQ may be a hard constraint. WfixEQ may be determined offline before the start of processing, with, for example, a rectangular window applied.
In accordance with another embodiment of the invention, a method of frequency-domain filtering is provided that includes a plurality of unconstrained filters {tilde over (W)}k=1,K. The method includes cascading the {tilde over (W)}k=1,K unconstrained filters. A single constraint window C is applied to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ and a noise reducer {tilde over (W)}NR, such that Wall=C({tilde over (W)}dynEQ{tilde over (W)}NR). Wall may be applied to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The frequency domain representation of the output signal may be transformed to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a method of beamforming and noise reduction is provided. The method includes applying a spectral filtering WBF,m to a plurality of microphone inputs Xm(k), 0≦m<M in the frequency domain to form a beamformed input signal XBF(k).
{tilde over (W)}k=1,K unconstrained noise reduction filter(s) are cascaded. A single constraint window C is applied to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form Wall=C({tilde over (W)}1 . . . {tilde over (W)}K); Wall is applied to the beamformed input signal to form a frequency domain representation of an output signal Y(k)=WallXBF(k).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic noise reducer {tilde over (W)}NR, such that Wall*=C {tilde over (W)}NR. Applying Wall to the beamformed input signal may form a frequency domain representation of an output signal Y(k)=(C{tilde over (W)}NR(k))XBF(k). The frequency domain representation of the output signal Y(k) may be transformed to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a system for adaptive digital filtering is provided. The system includes means for determining a frequency domain representation of a digital filter. The system further includes means for applying a frequency domain representation of a soft constraint window to the frequency domain representation of the digital filter, such that the resulting frequency domain representation of the digital filter corresponds to a digital filter in the time domain that includes a plurality of consecutive zeros. The frequency domain representation of the soft constraint window is based, at least in part, on a time domain representation of a soft constraint window that has been circularly shifted such that the frequency domain representation of the constraint window matches a property of the frequency domain representation of the digital filter.
In accordance with related embodiments of the invention, the system may include means for applying the resulting frequency domain representation of the digital filter to a frequency domain representation of an input signal. The system may include means for performing an overlap-save method. The time domain representation of the soft constraint window may be substantially a Hann window.
In accordance with another embodiment of the invention, a system for frequency-domain filtering is provided. The system includes means for cascading a plurality of filters Wi=1,I in the frequency domain, each of the filters Wi being constrained and having the same length, to form a combined filter Wall=W1 W2 . . . WI.
In accordance with related embodiments of the invention, the system may further include means for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The system may further include means for transforming the frequency domain representation of the output signal to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a system for frequency-domain filtering is provided. The system includes a plurality of filters, the plurality of filters including at least one constrained filter(s) Wi=1,I and at least one unconstrained filter(s) {tilde over (W)}k=1,K. The system further includes means for cascading the {tilde over (W)}k=1,K unconstrained filter(s). The system further includes means for applying a single constraint window C to the cascaded {tilde over (W)}k=1,K unconstrained filter(s), and means for cascading the Wi=1,I constrained filter(s) with the constrained cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K)W1 . . . WI.
In accordance with related embodiments of the invention, the system may further include means for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The system may further include means for transforming the frequency domain representation of the output signal to a time domain representation of the output signal. The Wi=1,I constrained filters may be a fixed equalizer WfixEQ and the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ, such that Wall=WfixEQ C {tilde over (W)}dynEQ.
In accordance with another embodiment of the invention, a system for frequency-domain filtering is provided that includes a plurality of unconstrained filters {tilde over (W)}k=1,K. The system includes means for cascading the {tilde over (W)}k=1,K unconstrained filters, and means for applying a single constraint window Ccasc to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ and a noise reducer {tilde over (W)}NR such that Wall=C({tilde over (W)}dynEQ {tilde over (W)}NR). The system may further include means for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The system may further include means for transforming the frequency domain representation of the output signal to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a system for beamforming and noise reduction is provided. The system includes means for applying a spectral filtering WBF,m to a plurality of microphone inputs Xm(k), 0≦m<M in the frequency domain to form a beamformed input signal XBF(k).
The system further includes means for cascading {tilde over (W)}k=1,K unconstrained noise reduction filter(s), and means for applying a single constraint window C to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form Wall=Ccasc({tilde over (W)}1 . . . {tilde over (W)}K). The system further includes means for applying Wall to the beamformed input signal to form a frequency domain representation of an output signal Y(k)=WallXBF(k).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic noise reducer {tilde over (W)}NR, such that Wall*=C {tilde over (W)}NR. Applying Wall to the beamformed input signal may form a frequency domain representation of an output signal Y(k)=(C{tilde over (W)}NR(k))XBF(k). The system may further include means for transforming the frequency domain representation of the output signal Y(k) to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a computer program product for use on a computer system for adaptive digital filtering is provided. The computer program product includes at least one non-transitory computer readable medium having computer executable program code thereon. The computer executable program code includes program code for defining a frequency domain representation of a digital filter. Further the product includes program code for applying a frequency domain representation of a soft constraint window to the frequency domain representation of the digital filter, such that the resulting frequency domain representation of the digital filter corresponds to a digital filter in the time domain that includes a plurality of consecutive zeros. The frequency domain representation of the soft constraint window is based, at least in part, on a time domain representation of a soft constraint window that has been circularly shifted such that the frequency domain representation of the constraint window matches a property of the frequency domain representation of the digital filter.
In accordance with related embodiments of the invention, the computer program product may include program code for applying the resulting frequency domain representation of the digital filter to a frequency domain representation of an input signal. The computer program product may include program code for performing an overlap-save method. The time domain representation of the soft constraint window may be substantially a Hann window.
In accordance with another embodiment of the invention, a computer program product for use on a computer system for frequency-domain filtering is provided. The computer program product includes at least one non-transitory computer readable medium having computer executable program code thereon. The computer executable program code includes program code for cascading a plurality of filters Wi=1,I in the frequency domain, each of the filters Wi being constrained and having the same length, to form a combined filter Wall=W1W2 . . . WI.
In accordance with related embodiments of the invention, the computer program product may include program code for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The computer program product may include program code for transforming the frequency domain representation of the output signal to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a computer program product for use on a computer system for frequency-domain filtering is provided. The computer program product includes at least one non-transitory computer readable medium having computer executable program code thereon. The computer executable program code includes program code for cascading the {tilde over (W)}k=1,K unconstrained filter(s), and program code for applying a single constraint window C to the cascaded {tilde over (W)}k=1,K unconstrained filter(s). The computer executable program code further includes program code for cascading the Wi=1,I constrained filter(s) with the constrained cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K)W1 . . . WI.
In accordance with related embodiments of the invention, the computer program product may further include program code for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The computer program product may further include program code for transforming the frequency domain representation of the output signal to a time domain representation of the output signal. The Wi=1,I constrained filters may be a fixed equalizer WfixEQ and the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ, such that Wall=WfixEQ C {tilde over (W)}dynEQ.
In accordance with another embodiment of the invention, a computer program product for use on a computer system for frequency-domain filtering is provided. The computer program product includes at least one non-transitory computer readable medium having computer executable program code thereon. The computer executable program code includes program code for cascading {tilde over (W)}k=1,K unconstrained filters. The computer executable program code further includes program code for applying a single constraint window Ccasc to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form a resulting filter Wall*=C({tilde over (W)}1 . . . {tilde over (W)}K).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic equalizer {tilde over (W)}dynEQ and a noise reducer {tilde over (W)}NR such that Wall=C({tilde over (W)}dynEQ {tilde over (W)}NR). The computer executable program code may further include program code for applying Wall to a frequency domain representation of an input signal to form a frequency domain representation of an output signal. The computer executable program code may further include program code for transforming the frequency domain representation of the output signal to a time domain representation of the output signal.
In accordance with another embodiment of the invention, a computer program product for use on a computer system for beamforming and noise reduction is provided. The computer program product includes at least one non-transitory computer readable medium having computer executable program code thereon. The computer executable program code includes program code for applying a spectral filtering WBF,m to a plurality of microphone inputs Xm(k), 0≦m<M in the frequency domain to form a beamformed input signal XBF(k).
The computer executable program code further includes program code for cascading {tilde over (W)}k=1,K unconstrained noise reduction filter(s), and applying a single constraint window C to the cascaded {tilde over (W)}k=1,K unconstrained filter(s) to form Wall=Ccasc({tilde over (W)}1 . . . {tilde over (W)}K). The computer executable program code further includes program code for applying Wall to the beamformed input signal to form a frequency domain representation of an output signal Y(k)=Wall XBF(k).
In accordance with related embodiments of the invention, the {tilde over (W)}k=1,K unconstrained filters may be a dynamic noise reducer {tilde over (W)}NR, such that Wall*=C {tilde over (W)}NR. Applying Wall to the beamformed input signal may form a frequency domain representation of an output signal Y(k)=(C{tilde over (W)}NR(k))XBF(k). The program code may further include code for transforming the frequency domain representation of the output signal Y(k) to a time domain representation of the output signal.
In accordance with further related embodiments to the above-described embodiments of the invention, the frequency domain representation of the single constraint window C may be based, at least in part, on a time domain representation of a single constraint window C that has been circularly shifted such that the frequency domain representation of the constraint window matches a property of the frequency domain representation of the cascaded {tilde over (W)}k=1,K unconstrained filters. The time domain representation of the single constraint window may be a soft constraint window, for example, substantially a Hann window.
The foregoing features of embodiments will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawings, in which:
In illustrative embodiments of the invention, an efficient low-delay method and system for dynamic filtering in the frequency domain is provided. Various embodiments include a modification to the “overlap-save” technique known from adaptive filtering to match dynamic noise reduction requirements. Furthermore, this technique has been extended to a frequency domain framework that supports multiple filter operations within one processing frame. Thus, for example, equalization and noise reduction can be applied without additional delay. In addition, the framework has been extended to multi-channel operation with beamforming and adaptive mixing to support multiple speakers. Thereby, the delay and the computational effort are kept at a minimum. Details are discussed below.
Embodiments of the invention may be applied to, without limitation, an In-Car Communication (ICC) system that supports communication between passengers inside the car by using built-in microphones and loudspeakers to reinforce the speech signal. Signal processing like beamforming, noise reduction, or equalizing may be applied for improving the quality of the ICC output signal. Thereby, a low processing delay is crucial in order to avoid an unnatural reverberant sound. Other applications of the system and method include, for example, live mixing scenarios over teleconferencing systems, and communication/speech/audio systems within other types of vehicles and environments. Other embodiments may be applicable to a wide variety of other digital signal processing environments including, without limitation, sonar and radar signal processing, sensor array processing, spectral estimation, statistical signal processing, digital image processing, control of systems, biomedical signal processing, and seismic data processing.
Efficient Constraint for Real-Valued Frequency-Domain Filters
As described above, the filter weights in Eq. 17 were conventionally modified by phase rotation in order to match the location of the constraint window. Instead, in accordance with various embodiments of the invention, the constraint window is modified to match the properties of the filter. Illustratively, as the original time-domain filter weights are symmetric with respect to n=0, the time-domain constraint window may be left-shifted by N/4 samples by a cyclic shift. Thus, the phase rotation in Eq. 17 can be avoided and the filter weights remain real-valued. Furthermore, the constraint window is then symmetric with respect to n=0, which corresponds to a real-valued frequency-domain constraint matrix C. This has no effect on delay but reduces the computational complexity, since the matrix-vector product C{tilde over (W)}(k) requires only real-valued multiplications rather than complex-valued ones. For L=3, the resulting real-valued coefficients are
Csoft(0)=0.25, (21)
Csoft(1)=Csoft(−1)=0.21221, (22)
Csoft(2)=Csoft(−2)=0.125, (23)
Csoft(3)=Csoft(−3)=0.042441. (24)
The output signal is then
yout(kR+n)=y(N/4+n,k), for n ε[0, . . . , R−1]. (25)
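The effect of the cyclic shift can be sketched numerically. The example below assumes N=256, L=3, the unnormalized DFT convention, and a hypothetical helper `constrain_approx` that implements the short frequency-domain convolution with the 2L+1 retained coefficients.

```python
import numpy as np

N, L = 256, 3
P = N // 2
n = np.arange(P)
c_soft = np.r_[0.5 * (1 - np.cos(2 * np.pi * n / P)), np.zeros(N - P)]  # Eqs. 8, 9
c_shift = np.roll(c_soft, -(N // 4))       # cyclic left shift by N/4 samples

C_full = np.fft.fft(c_shift) / N           # first column of the circulant matrix C
C_taps = np.array([C_full[(-l) % N] for l in range(L + 1)])  # diagonals l = 0..L

def constrain_approx(W_unc):
    """Sparse constraint: sum over |l| <= L of C(l) * W_unc(lambda + l)."""
    out = C_taps[0].real * W_unc
    for l in range(1, L + 1):
        # C(l) = C(-l) because the shifted window is real and symmetric
        out = out + C_taps[l].real * (np.roll(W_unc, -l) + np.roll(W_unc, l))
    return out

rng = np.random.default_rng(6)
half = rng.uniform(0.0, 1.0, N // 2 + 1)
W_unc = np.r_[half, half[-2:0:-1]]         # real weights, W(lambda) = W(N - lambda)

W_exact = np.fft.fft(c_shift * np.fft.ifft(W_unc))   # exact soft constraint
W_approx = constrain_approx(W_unc)                   # real multiplications only
```

Under these assumptions `C_taps` reproduces the values of Eqs. 21 to 24, and the sparse result stays close to the exact soft constraint while using only real-valued arithmetic.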
Concept of Multi-Stage Frequency-Domain Filtering
Multiple filters Wi may be cascaded in frequency domain if they exhibit the same length N. Their time-domain representation can exhibit different numbers of zeros Qi, valid filter coefficients Pi and acausal coefficients {tilde over (P)}i. For the combined filter
Wall=W1W2 . . . WI (26)
the number of valid coefficients is
Pall=(Σ_{i=1}^{I} Pi)−I+1 (27)
and the number of zeros is
Qall=N−Pall. (28)
The numbers of acausal coefficients accumulate to
{tilde over (P)}all=Σ_{i=1}^{I} {tilde over (P)}i. (29)
For the filtered signal in time domain
y(k)=F−1(WallX(k)) (30)
the samples y(Pall−{tilde over (P)}all−1, k), . . . , y(N−{tilde over (P)}all−1, k) correspond to non-circular convolution. The indices Pall−{tilde over (P)}all−1, . . . , N−{tilde over (P)}all−1 may exceed the range [0, . . . , N−1]; in this case the output buffer y(k) has to be extended periodically. As long as Qall+1≥R, there is no need to return to the time domain between the filter stages, as would be done in a straightforward approach.
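The bookkeeping of Eqs. 27 and 28 and the condition Qall+1≥R can be illustrated with a short Python sketch. The function name cascade_lengths and the returned dictionary keys are hypothetical, and the sketch assumes that the acausal coefficient counts of the individual stages simply add up.

```python
def cascade_lengths(n_fft, frame_shift, stages):
    """Length bookkeeping for a cascade of frequency-domain filters.

    stages is a list of (P_i, P_tilde_i) pairs: the number of valid
    time-domain coefficients and how many of them are acausal.
    """
    num_stages = len(stages)
    p_all = sum(p for p, _ in stages) - num_stages + 1   # Eq. 27
    q_all = n_fft - p_all                                # Eq. 28
    pt_all = sum(pt for _, pt in stages)                 # assumption: counts add up
    return {
        "P_all": p_all,
        "Q_all": q_all,
        "P_tilde_all": pt_all,
        # Index range of the samples free of cyclic convolution effects.
        "first_valid": p_all - pt_all - 1,
        "last_valid": n_fft - pt_all - 1,
        # True if one frame shift of output can be taken per frame.
        "feasible": q_all + 1 >= frame_shift,
    }


# Example: a causal stage with R taps followed by a symmetric stage
# with 2R+1 taps, R of them acausal (N=256, R=64).
info = cascade_lengths(256, 64, [(64, 0), (129, 64)])
```

For these example values the combined filter has 192 valid coefficients and 64 zeros, so the condition Qall+1≥R is just met.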
Cascading Unconstrained Filters in the Frequency Domain
In Eq. 26 constrained filters are cascaded, i.e., the time-domain vectors wi exhibit Qi trailing zeros. For some applications one or more of these filters may be calculated as spectral weights regardless of any constraint. For example, for unconstrained noise reduction or dynamic equalization the weighting factors may be dynamically calculated separately for each band. In such cases, a constraint Ci may be applied to each unconstrained filter {tilde over (W)}i, as is described above, which has the advantage that the order of each filter can be controlled through the respective constraint. This, however, may not be computationally efficient.
Consider the general case of a cascade of I constrained filters Wi and K unconstrained weight vectors {tilde over (W)}k. The overall frequency-domain filter is
Wall=C1{tilde over (W)}1 . . . CK{tilde over (W)}KW1 . . . WI. (31)
Here, each unconstrained filter vector is constrained prior to cascading. In illustrative embodiments of the invention, alternatively, one single constraint matrix Ccasc may be applied to the cascade of unconstrained filter vectors
Wall=Ccasc({tilde over (W)}1 . . . {tilde over (W)}K)W1 . . . WI. (32)
Ccasc is characterized by Pcasc time-domain filter coefficients, of which {tilde over (P)}casc are acausal, and Qcasc zeros. Instead of assigning a specific filter length Pk to each stage {tilde over (W)}k, the accumulated filter {tilde over (W)}1 . . . {tilde over (W)}K is treated as a whole.
This method may be advantageous for real-time applications where {tilde over (W)}k are determined dynamically and the constrained filters Wi are fixed filters which have been designed prior to processing.
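The idea of constraining the product of unconstrained weight vectors once, rather than each vector separately, may be sketched as follows. This is only an illustration under stated assumptions: the constraint is realized here as a hard rectangular time-domain window via an inverse and forward DFT (not as a multiplication with a frequency-domain constraint matrix), the weight vectors w_1 and w_2 are made-up examples, and the helper names dft, idft, and constrain are hypothetical.

```python
import cmath


def dft(x):
    """Plain O(N^2) DFT, adequate for a small illustration."""
    n_fft = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * mu * n / n_fft) for n in range(n_fft))
        for mu in range(n_fft)
    ]


def idft(spec):
    """Inverse of dft()."""
    n_fft = len(spec)
    return [
        sum(spec[mu] * cmath.exp(2j * cmath.pi * mu * n / n_fft) for mu in range(n_fft))
        / n_fft
        for n in range(n_fft)
    ]


def constrain(weights, p_taps, p_acausal):
    """Keep p_taps time-domain taps, p_acausal of them at negative cyclic indices."""
    n_fft = len(weights)
    keep = {n % n_fft for n in range(-p_acausal, p_taps - p_acausal)}
    taps = idft(weights)
    return dft([t if n in keep else 0.0 for n, t in enumerate(taps)])


N, R = 16, 4
# Two hypothetical real-valued weight vectors, symmetric in the bin index.
w_1 = [1.0 + 0.1 * min(mu, N - mu) for mu in range(N)]
w_2 = [1.0 / (1.0 + 0.2 * min(mu, N - mu)) for mu in range(N)]

# Multiply first, then apply a single constraint to the product.
product = [a * b for a, b in zip(w_1, w_2)]
w_casc = constrain(product, p_taps=2 * R - 1, p_acausal=R - 1)
taps_casc = idft(w_casc)
```

Because the window is symmetric with respect to n=0 and the product is real and symmetric, the constrained weights stay real up to rounding, and only one constraint operation is needed regardless of the number of unconstrained stages.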
For the examples below, the following parameters, without limitation, may be applied:
Sample rate: fs=16000 Hz
FFT order: N=256
Frameshift: R=N/4
Number of microphones: M=2
The cascade of constrained filters W1 . . . WI may then be combined with the constrained cascade of unconstrained filters {tilde over (W)}1 . . . {tilde over (W)}K to form a resulting filter Wall=C({tilde over (W)}1 . . . {tilde over (W)}K)W1 . . . WI. Wall may be applied to a frequency-domain representation of an input signal X(k) to form a frequency-domain representation of an output signal Y(k). The frequency-domain representation of the output signal Y(k) may then be transformed back into a time-domain representation of the output signal.
More particularly, a fixed and a dynamic equalizer may be realized jointly within the above-proposed framework, in accordance with various embodiments of the invention. Illustratively, the fixed part WfixEQ corresponds to a precalculated causal filter of length R with Pfixed=R and {tilde over (P)}fixed=0. The dynamic part {tilde over (W)}dynEQ consists of real-valued weights in the frequency domain and thus corresponds to a symmetric time-domain filter. Before the dynamic filter is applied in the frequency domain, a constraint matrix CEx1 ensures that 2R−1 time-domain coefficients are (approximately) zero (Eq. 3 with Pdyn=2R+1 and {tilde over (P)}dyn=R). Thus, the overall resulting filter has Poverall=Pfixed+Pdyn−1=3R valid coefficients, of which {tilde over (P)}overall=R are acausal and Poverall−{tilde over (P)}overall=2R are causal:
WEx1=WfixEQCEx1 {tilde over (W)}dynEQ. (33)
The valid output signal can be retrieved from y(k)=F−1(WEx1X(k)) as [y(2R, k), . . . , y(3R−1, k)]. This segment has length N−Poverall=R. Thus, the output signal stream results as
yout(kR+n)=y(2R+n,k). (34)
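The frame-wise processing of this example can be checked numerically with a small Python sketch. It uses N=16 and R=4 instead of the parameters listed above to keep the plain DFT cheap, and it makes several assumptions that are not taken from the description: both filter parts are given directly by made-up time-domain taps (so no constraint matrix is needed), frame k is assumed to hold the input samples kR, . . . , kR+N−1, and all helper names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) DFT, adequate for a small illustration."""
    n_fft = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * mu * n / n_fft) for n in range(n_fft))
        for mu in range(n_fft)
    ]


def idft(spec):
    """Inverse of dft()."""
    n_fft = len(spec)
    return [
        sum(spec[mu] * cmath.exp(2j * cmath.pi * mu * n / n_fft) for mu in range(n_fft))
        / n_fft
        for n in range(n_fft)
    ]


N, R = 16, 4
w_fix = [0.5, 0.3, 0.15, 0.05]             # causal fixed part, P_fixed = R
half = [0.4, 0.2, 0.07, 0.02, 0.01]        # taps 0..R of the symmetric dynamic part
w_dyn = half[:0:-1] + half                 # taps -R..R, P_dyn = 2R + 1

# Overall filter: h[j] holds the tap at lag m = j - R, for m = -R..2R-1.
h = [0.0] * (len(w_fix) + len(w_dyn) - 1)
for a, tap_fix in enumerate(w_fix):
    for b, tap_dyn in enumerate(w_dyn):
        h[a + b] += tap_fix * tap_dyn

# Cyclic placement: acausal lags wrap around to the end of the buffer.
h_cyc = [0.0] * N
for j, tap in enumerate(h):
    h_cyc[(j - R) % N] = tap
W = dft(h_cyc)

x = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(N + 4 * R)]
num_frames = (len(x) - N) // R + 1

y_out = []
for k in range(num_frames):
    spec = dft(x[k * R:k * R + N])                     # assumed frame alignment
    y = idft([w * s for w, s in zip(W, spec)])
    y_out.extend(v.real for v in y[2 * R:3 * R])       # Eq. 34


def direct(t):
    """Reference: non-circular convolution with the acausal filter at time t."""
    return sum(h[j] * x[t - (j - R)] for j in range(len(h)))
```

With this frame alignment, y_out[j] should agree with direct(j + 2 * R) up to rounding, which indicates that the extracted segment of length R is free of cyclic convolution effects.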
Illustratively, a dynamic equalization and a noise reduction can be combined, in accordance with an embodiment of the invention. The unconstrained filters {tilde over (W)}dynEQ and {tilde over (W)}NR both are real-valued. They are multiplied before applying a constraint
WEx2=CEx2({tilde over (W)}dynEQ {tilde over (W)}NR) (35)
In this case, PEx2=2R−1 and {tilde over (P)}Ex2=R−1. Thus, CEx2=Csoft,approx can be applied.
A further embodiment of the invention is the combination of a beamformer and a noise reduction within the low-delay filtering framework, as shown in
First, a predetermined spectral filter WBF,m may be applied within the beamformer individually to each microphone input channel m in order to realize a time delay compensation. These filters have been designed as causal fractional delay filters of length P=R+2 in the time domain and have been transformed by FFT into complex-valued frequency-domain filters. Applying them to a signal means that only N−R−1 samples could be used if the filtered signals were transformed back into the time domain. However, the filtered signals are kept in the frequency domain and added:
XBF(k)=Σ_{m=1}^{M} WBF,mXm(k). (36)
The summation does not introduce any additional cyclic convolution effects. After that, dynamic spectral weighting may be applied.
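The statement that adding the filtered channels in the frequency domain leaves the set of valid samples unchanged can be illustrated with a short Python sketch. The tap values below are made up and are not actual fractional delay designs, the per-channel spectra are denoted X[m] as an assumed notation, and N=16, R=4 are chosen only to keep the plain DFT cheap.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) DFT, adequate for a small illustration."""
    n_fft = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * mu * n / n_fft) for n in range(n_fft))
        for mu in range(n_fft)
    ]


def idft(spec):
    """Inverse of dft()."""
    n_fft = len(spec)
    return [
        sum(spec[mu] * cmath.exp(2j * cmath.pi * mu * n / n_fft) for mu in range(n_fft))
        / n_fft
        for n in range(n_fft)
    ]


N, R, M = 16, 4, 2
P = R + 2                                   # causal filter length per channel

# Hypothetical causal taps, one filter per microphone channel.
taps = [
    [0.1, 0.6, 0.25, 0.05, 0.0, 0.0],
    [0.0, 0.05, 0.25, 0.6, 0.1, 0.0],
]
bufs = [[math.sin(0.4 * n + m) for n in range(N)] for m in range(M)]

W_bf = [dft(t + [0.0] * (N - P)) for t in taps]      # zero-padded to length N
X = [dft(b) for b in bufs]

# Filter each channel and add the results while staying in the frequency domain.
X_bf = [sum(W_bf[m][mu] * X[m][mu] for m in range(M)) for mu in range(N)]
y = [v.real for v in idft(X_bf)]

# Samples P-1..N-1 are free of cyclic effects, exactly as for a single channel.
valid = y[P - 1:]
reference = [
    sum(taps[m][j] * bufs[m][n - j] for m in range(M) for j in range(P))
    for n in range(P - 1, N)
]
```

In this sketch the valid segment has N−R−1 samples and matches the sum of the per-channel non-circular convolutions, since the summation is linear and every channel uses the same filter length.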
Within the noise reduction, a real-valued weighting factor {tilde over (W)}NR(k) is determined dynamically for each frequency bin. After introducing the constraint CEx3=Csoft,approx with PEx3=2R−1 and {tilde over (P)}Ex3=R−1, the filter can be applied:
Y(k)=(CEx3{tilde over (W)}NR(k))XBF(k). (37)
The present invention described in the above embodiments may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof.
Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, linker, or locator.) Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
The computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web.)
Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL.)
Although various exemplary embodiments of the invention have been disclosed, it should be apparent to those skilled in the art that various changes and modifications can be made which will achieve some of the advantages of the invention without departing from the true scope of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/035356 | 5/5/2011 | WO | 00 | 11/25/2013 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2012/150942 | 11/8/2012 | WO | A |
Entry |
---|
PCT International Search Report and Written Opinion of the ISA dated Feb. 8, 2012; for PCT Pat. App. No. PCT/US2011/035356; 12 pages. |
PCT International Preliminary Report on Patentability and Written Opinion of the ISA dated Nov. 14, 2013; for PCT Pat. App. No. PCT/US2011/035356; 9 pages. |
Number | Date | Country | |
---|---|---|---|
20140105338 A1 | Apr 2014 | US |