The present invention relates to digital signal processing, and more particularly, to a digital signal processing system for use in an audio system such as a hearing aid.
The combination of spatial processing using beamforming techniques (i.e., multiple microphones) and binaural listening is applicable to a variety of fields and is particularly applicable to the hearing aid industry. This combination joins the benefits associated with spatial processing, i.e., noise reduction, with those associated with binaural listening, i.e., sound localization capability and improved speech intelligibility.
Beamforming techniques, typically utilizing multiple microphones, exploit the spatial differences between the target speech and the noise. In general, there are two types of beamforming systems. The first type of beamforming system is fixed, thus requiring that the processing parameters remain unchanged during system operation. As a result of using unchanging processing parameters, if the source of the noise varies, for example due to movement, the system performance is significantly degraded. The second type of beamforming system, adaptive beamforming, overcomes this problem by tracking the moving or varying noise source, for example through the use of a phased array of microphones.
Binaural processing uses binaural cues to achieve both sound localization capability and speech intelligibility. In general, binaural processing techniques use interaural time difference (ITD) and interaural level difference (ILD) as the binaural cues, these cues obtained, for example, by combining the signals from two different microphones.
Fixed binaural beamforming systems and adaptive binaural beamforming systems have been developed that combine beamforming with binaural processing, thereby preserving the binaural cues while providing noise reduction. Of these systems, the adaptive binaural beamforming systems offer the best performance potential, although they are also the most difficult to implement. In one such adaptive binaural beamforming system disclosed by D. P. Welker et al., the frequency spectrum is divided into two portions with the low frequency portion of the spectrum being devoted to binaural processing and the high frequency portion being devoted to adaptive array processing. (Microphone-array Hearing Aids with Binaural Output-part II: a Two-Microphone Adaptive System, IEEE Trans. on Speech and Audio Processing, Vol. 5, No. 6, 1997, 543–551).
In an alternate adaptive binaural beamforming system disclosed in co-pending U.S. patent application Ser. No. 09/593,728, filed Jun. 13, 2000, two distinct adaptive spatial processing filters are employed. These two adaptive spatial processing filters have the same reference signal from two ear microphones but have different primary signals corresponding to the right ear microphone signal and the left ear microphone signal. Additionally, these two adaptive spatial processing filters have the same structure and use the same adaptive algorithm, thus achieving reduced system complexity. The performance of this system is still limited, however, by the use of only two microphones.
An adaptive binaural beamforming system is provided which can be used, for example, in a hearing aid. The system uses more than two input signals, and preferably four input signals, the signals provided, for example, by a plurality of microphones.
In one aspect, the invention includes a pair of microphones located in the user's left ear and a pair of microphones located in the user's right ear. The system is preferably arranged such that each pair of microphones utilizes an end-fire configuration with the two pairs of microphones being combined in a broadside configuration.
In another aspect, the invention utilizes two stages of processing with each stage processing only two inputs. In the first stage, the outputs from two microphone pairs are processed utilizing an end-fire array processing scheme, this stage providing the benefits of spatial processing. In the second stage, the outputs from the two end-fire arrays are processed utilizing a broadside configuration, this stage providing further spatial processing benefits along with the benefits of binaural processing.
In another aspect, the invention is a system such as used in a hearing aid, the system comprised of a first channel spatial filter, a second channel spatial filter, and a binaural spatial filter, wherein the outputs from the first and second channel spatial filters provide the inputs for the binaural spatial filter, and wherein the outputs from the binaural spatial filter provide two channels of processed signals. In a preferred embodiment, the two channels of processed signals provide inputs to a pair of transducers. In another preferred embodiment, the two channels of processed signals provide inputs to a pair of speakers. In yet another preferred embodiment, the first and second channel spatial filters are each comprised of a pair of fixed polar pattern units and a combining unit, the combining unit including an adaptive filter. In yet another preferred embodiment, the outputs of the first and second channel spatial filters are combined to form a reference signal, the reference signal is then adaptively combined with the output of the first channel spatial filter to form a first channel of processed signals and the reference signal is adaptively combined with the output of the second channel spatial filter to form a second channel of processed signals.
In yet another aspect, the invention is a system such as used in a hearing aid, the system comprised of a first channel spatial filter, a second channel spatial filter, and a binaural spatial filter, wherein the binaural spatial filter utilizes two pairs of low pass and high pass filters, the outputs of which are adaptively processed to form two channels of processed signals.
A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.
In the following description, “RF” denotes right front, “RB” denotes right back, “LF” denotes left front, and “LB” denotes left back. Each of the four microphones 101–104 converts received sound into a signal: xRF(n), xRB(n), xLF(n) and xLB(n), respectively. Signals xRF(n), xRB(n), xLF(n) and xLB(n) are processed by an adaptive binaural beamforming system 107. Within system 107, each microphone signal is processed by an associated filter with frequency responses of WRF(f), WRB(f), WLF(f) and WLB(f), respectively. System 107 output signals 109 and 110, corresponding to zR(n) and zL(n), respectively, are sent to speakers 111 and 112, respectively. Speakers 111 and 112 provide processed sound to the user's right ear and left ear, respectively.
To maximize the spatial benefits of system 100 while preserving the binaural cues, the coefficients of the four filters associated with microphones 101–104 should be the solution of the following optimization equation:
minW
where C^T W=g, E(f)=0, and L(f)=0. In these equations, C and g are the known constraint matrix and vector; W is a weight matrix consisting of WRF(f), WRB(f), WLF(f) and WLB(f); E(f) is the difference in the ITD before and after processing; and L(f) is the difference in the ILD before and after processing. As Eq. (1) is a nonlinear constrained optimization problem, it is very difficult to find the solution in real time.
In the embodiment shown in
An advantage of the embodiment shown in
Further explanation will now be provided for the related adaptive algorithms for RSF 201, LSF 203 and BSF 205. With respect to the adaptive processing of RSF 201 and LSF 203, preferably a fixed polar pattern based adaptive directionality scheme is employed as illustrated in
The adaptive algorithm for two nearby microphones in an end-fire array for LSF 203 is primarily based on an adaptive combination of the outputs from two fixed polar pattern units 301 and 302, so that the null of the combined polar pattern of the LSF output always points toward the direction of the noise. The null of one of these two fixed polar patterns is at zero degrees (straight ahead of the subject) and the other's null is at 180 degrees. These two polar patterns are both cardioid. The first fixed polar pattern unit 301 is implemented by delaying the back microphone signal xLB(n) by the value d/c with a delay unit 303 and subtracting it from the front microphone signal, xLF(n), with a combining unit 305, where d is the distance separating the two microphones and c is the speed of sound. Similarly, the second fixed polar pattern unit 302 is implemented by delaying the front microphone signal xLF(n) by the value d/c with a delay unit 307 and subtracting it from the back microphone signal, xLB(n), with a combining unit 309.
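The two delay-and-subtract units described above can be sketched as follows. This is only an illustrative sketch: it assumes the acoustic delay d/c equals a whole number of samples k, and the function names `delay` and `fixed_polar_units` are hypothetical, not taken from the specification.

```python
def delay(x, k):
    """Delay sequence x by k whole samples, padding the start with zeros."""
    n = len(x)
    k = max(0, min(k, n))
    return [0.0] * k + list(x[:n - k])


def fixed_polar_units(x_front, x_back, k):
    """Return (x1, x2), the outputs of the two fixed polar pattern units.

    Assumption: the inter-microphone delay d/c is exactly k samples.
    x1 = front - delayed back  -> cancels sound arriving from behind
    x2 = back - delayed front  -> cancels sound arriving from straight ahead
    """
    back_delayed = delay(x_back, k)
    front_delayed = delay(x_front, k)
    x1 = [f - b for f, b in zip(x_front, back_delayed)]
    x2 = [b - f for b, f in zip(x_back, front_delayed)]
    return x1, x2
```

For a source straight ahead, the back microphone receives the front signal delayed by k samples, so x2 cancels; for a source directly behind, x1 cancels.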
The adaptive combination of these two fixed polar patterns is accomplished with combining unit 311 by applying an adaptive gain to the output of the second polar pattern unit. Combining unit 311 provides the output yL(n) for the next stage of processing in BSF 205. By varying the gain value, the null of the combined polar pattern can be steered to different angles. The value of this gain, W, is updated by minimizing the power of the unit output yL(n) as follows:
$$W_{opt} = \frac{R_{12}}{R_{22}} \qquad (2)$$
where R12 represents the cross-correlation between the first polar pattern unit output xL1(n) and the second polar pattern unit output xL2(n), and R22 represents the power of xL2(n).
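As a brief worked step, the closed-form gain follows from a standard least-squares argument. The sketch below assumes the combining unit forms yL(n) = xL1(n) − W·xL2(n) (a sign convention chosen to agree with the LMS update of Equation (3)) and introduces R11 for the power of xL1(n), a symbol that does not appear in the original text:

```latex
\begin{aligned}
E\!\left[y_L^2(n)\right] &= E\!\left[\bigl(x_{L1}(n) - W\,x_{L2}(n)\bigr)^2\right]
  = R_{11} - 2W R_{12} + W^2 R_{22},\\
\frac{\partial}{\partial W} E\!\left[y_L^2(n)\right] &= -2R_{12} + 2W R_{22} = 0
  \;\Longrightarrow\; W_{opt} = \frac{R_{12}}{R_{22}}.
\end{aligned}
```

Because R22 is a power and therefore non-negative, the quadratic is convex in W and the stationary point is the minimum.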
In a real-time application, the problem becomes how to adaptively update the optimization gain Wopt with available samples xL1(n) and xL2(n) rather than cross-correlation R12 and power R22. Utilizing available samples xL1(n) and xL2(n), a number of algorithms can be used to determine the optimization gain Wopt (e.g., LMS, NLMS, LS and RLS algorithms). The LMS version for getting the adaptive gain can be written as follows:
W(n+1)=W(n)+λxL2(n)yL(n) (3)
where λ is a step parameter which is a positive constant less than 2/P and P is the power of xL2(n).
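A sample-by-sample version of the gain update of Equation (3) might look like the sketch below. It assumes the combined output is formed as yL(n) = xL1(n) − W(n)·xL2(n); the function name `lms_gain` and its parameters are hypothetical.

```python
def lms_gain(x1, x2, step, w0=0.0):
    """Adapt a single gain W with the LMS rule W <- W + step * x2 * y.

    Assumption: the combined output is y(n) = x1(n) - W(n) * x2(n).
    Returns the final gain and the list of output samples.
    """
    w = w0
    y = []
    for a, b in zip(x1, x2):
        out = a - w * b          # combined polar pattern output
        y.append(out)
        w = w + step * b * out   # move W so as to reduce output power
    return w, y
```

If x1 is an exact scaled copy of x2, the gain converges to that scale factor and the output decays toward zero, provided the step is below 2/P.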
For improved performance, λ can be made time varying, as in the normalized LMS algorithm, that is,
$$\lambda(n) = \frac{\mu}{P_{L2}(n)} \qquad (4)$$
where μ is a positive constant less than 2 and PL2(n) is the estimated power of xL2(n).
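The normalized variant can be sketched in the same way. The specification does not say how the power PL2(n) is estimated, so the sketch below assumes a first-order recursive average with forgetting factor `rho`, seeded with the first sample's power; `nlms_gain`, `rho` and `eps` are illustrative names and values only.

```python
def nlms_gain(x1, x2, mu=0.5, rho=0.9, eps=1e-12, w0=0.0):
    """Adapt a single gain with a step size normalized by the power of x2.

    Assumptions: y(n) = x1(n) - W(n) * x2(n); the power estimate is a
    recursive average seeded with the first sample (not from the source).
    """
    w = w0
    p = None
    y = []
    for a, b in zip(x1, x2):
        # Running estimate of the power of x2
        p = b * b if p is None else rho * p + (1.0 - rho) * b * b
        out = a - w * b
        y.append(out)
        w += (mu / (p + eps)) * b * out   # time-varying step mu / P(n)
    return w, y
```

The small constant `eps` only guards against division by zero during silence.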
Equations (3) and (4) are suitable for a sample-by-sample adaptive model.
In accordance with another embodiment of the present invention, a frame-by-frame adaptive model is used. In frame-by-frame processing, the following steps are involved in obtaining the adaptive gain. First, the cross-correlation between xL1(n) and xL2(n) and the power of xL2(n) at the m'th frame are estimated according to the following equations:
$$\hat{R}_{12}(m) = \frac{1}{M}\sum_{n=1}^{M} x_{L1}(n)\,x_{L2}(n) \qquad (5)$$
$$\hat{R}_{22}(m) = \frac{1}{M}\sum_{n=1}^{M} x_{L2}^{2}(n) \qquad (6)$$
where M is the number of samples in a frame and the sums run over the samples of the m'th frame. Second, R12 and R22 of Equation (2) are replaced with the estimates R̂12 and R̂22, and the estimated adaptive gain is then obtained from Equation (2).
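The two steps above can be sketched for a single frame as follows. The function name `frame_gain` and the guard constant `eps` are hypothetical additions, and the averages are taken as plain sample means over the frame.

```python
def frame_gain(x1_frame, x2_frame, eps=1e-12):
    """Estimate the gain for one frame as R12_hat / R22_hat.

    R12_hat is the sample cross-correlation of the two unit outputs and
    R22_hat is the sample power of the second unit output.
    """
    m = len(x1_frame)
    r12 = sum(a * b for a, b in zip(x1_frame, x2_frame)) / m
    r22 = sum(b * b for b in x2_frame) / m
    return r12 / (r22 + eps)   # eps avoids division by zero on silent frames
```

For example, `frame_gain([0.5, 1.0, -0.5, 1.5], [1.0, 2.0, -1.0, 3.0])` is approximately 0.5, since the first frame is half the second.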
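In order to obtain a better estimate and achieve smoother frame-by-frame processing, the cross-correlation between xL1(n) and xL2(n) and the power of xL2(n) at the m'th frame can be estimated according to the following equations:
$$\hat{R}_{12}(m) = \frac{\alpha}{M}\sum_{n=1}^{M} x_{L1}(n)\,x_{L2}(n) + \beta\,\hat{R}_{12}(m-1) \qquad (7)$$
$$\hat{R}_{22}(m) = \frac{\alpha}{M}\sum_{n=1}^{M} x_{L2}^{2}(n) + \beta\,\hat{R}_{22}(m-1) \qquad (8)$$
where α and β are two adjustable parameters with 0≤α≤1, 0≤β≤1, and α+β=1. If α=1 and β=0, Equations (7) and (8) reduce to Equations (5) and (6), respectively.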
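A sketch of the smoothed frame-by-frame estimate is given below. It assumes that α weights the current frame's average and β = 1 − α weights the previous frame's estimate, which is the reading under which α = 1, β = 0 gives the unsmoothed case. The name `smoothed_frame_gains`, the zero initial estimates, and the choice to drop a trailing partial frame are all illustrative assumptions.

```python
def smoothed_frame_gains(x1, x2, frame_len, alpha=0.7, eps=1e-12):
    """Return one gain per complete frame using recursively smoothed statistics.

    Assumption: estimate(m) = alpha * frame_average(m) + beta * estimate(m - 1),
    with beta = 1 - alpha and the estimates starting from zero.
    """
    beta = 1.0 - alpha
    r12_prev = 0.0
    r22_prev = 0.0
    gains = []
    for start in range(0, len(x1) - frame_len + 1, frame_len):
        f1 = x1[start:start + frame_len]
        f2 = x2[start:start + frame_len]
        r12 = alpha * sum(a * b for a, b in zip(f1, f2)) / frame_len + beta * r12_prev
        r22 = alpha * sum(b * b for b in f2) / frame_len + beta * r22_prev
        gains.append(r12 / (r22 + eps))
        r12_prev, r22_prev = r12, r22
    return gains
```

With α < 1, a sudden change in the underlying gain shows up gradually over several frames rather than in a single jump.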
As previously noted, the adaptive algorithms described above also apply to RSF 201, assuming the replacement of xLF(n) and xLB(n) with xRF(n) and xRB(n), respectively.
Since BSF 205 has only two inputs and is similar to the case of a broadside array with two microphones, the implementation scheme illustrated in
WR(n)=[WR1(n), WR2(n), …, WRN(n)]^T and
WL(n)=[WL1(n), WL2(n), …, WLN(n)]^T
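Adaptive filters 401 and 403 provide the outputs 405 (aR(n)) and 407 (aL(n)), respectively, as follows:
$$a_R(n) = W_R^T(n)\,R(n) = \sum_{i=1}^{N} W_{Ri}(n)\,r(n-i+1) \qquad (9)$$
$$a_L(n) = W_L^T(n)\,R(n) = \sum_{i=1}^{N} W_{Li}(n)\,r(n-i+1) \qquad (10)$$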
where R(n)=[r(n), r(n−1), . . . , r(n−N+1)]T and N is the length of adaptive filters 401 and 403. Note that although the length of the two filters is selected to be the same for the sake of simplicity, the lengths could be different. The primary signals at adaptive filters 401 and 403 are yR(n) and yL(n). Outputs 109 (zR(n)) and 110 (zL(n)) are obtained by the equations:
zR(n)=yR(n)−aR(n) (11)
zL(n)=yL(n)−aL(n) (12)
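The output stage of Equations (11) and (12) can be sketched for fixed (non-adapting) weights as below. The filter outputs are taken to be the inner product of the weight vector with the most recent reference samples, and samples before the start of the record are assumed to be zero. The names `fir_output` and `bsf_outputs` are hypothetical.

```python
def fir_output(weights, r, n):
    """a(n) = sum_i weights[i] * r(n - i); samples before the start count as zero."""
    return sum(w * r[n - i] for i, w in enumerate(weights) if n - i >= 0)


def bsf_outputs(y_right, y_left, r, w_right, w_left):
    """Subtract the filtered reference from each primary signal.

    z_R(n) = y_R(n) - a_R(n) and z_L(n) = y_L(n) - a_L(n), where a_R and a_L
    are the outputs of two FIR filters driven by the same reference r(n).
    """
    z_right = [y_right[n] - fir_output(w_right, r, n) for n in range(len(r))]
    z_left = [y_left[n] - fir_output(w_left, r, n) for n in range(len(r))]
    return z_right, z_left
```

The two filters share one reference input but keep separate weights, so the two weight vectors may have different lengths in this sketch.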
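The weights of adaptive filters 401 and 403 are adjusted so as to minimize the average power of the two outputs, that is,
$$\min_{W_R} E\left[z_R^2(n)\right] \qquad (13)$$
$$\min_{W_L} E\left[z_L^2(n)\right] \qquad (14)$$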
In the ideal case, r(n) contains only the noise part and the two adaptive filters provide the two outputs aR(n) and aL(n) by minimizing Equations (13) and (14). Accordingly, the two outputs should be approximately equal to the noise parts in the primary signals and, as a result, outputs 109 (i.e., zR(n)) and 110 (i.e., zL(n)) of BSF 205 will approximate the target signal parts. Therefore the processing used in the present system not only realizes maximum noise reduction by two adaptive filters but also preserves the binaural cues contained within the target signal parts. In other words, an approximate solution of the nonlinear optimization problem of Equation (1) is provided by the present system.
Regarding the adaptive algorithm of BSF 205, various adaptive algorithms can be employed, such as LS, RLS, TLS and LMS algorithms. Assuming an LMS algorithm is used, the coefficients of the two adaptive filters can be obtained from:
WR(n+1)=WR(n)+ηR(n)zR(n) (15)
WL(n+1)=WL(n)+ηR(n)zL(n) (16)
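where η is a step parameter which is a positive constant less than 2/P and P is the power of the input r(n) of these two adaptive filters. The normalized LMS algorithm can be obtained as follows:
$$W_R(n+1) = W_R(n) + \frac{\mu}{\|R(n)\|^2}\,R(n)\,z_R(n) \qquad (17)$$
$$W_L(n+1) = W_L(n) + \frac{\mu}{\|R(n)\|^2}\,R(n)\,z_L(n) \qquad (18)$$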
where μ is a positive constant less than 2.
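The adaptation of the two BSF filters can be sketched with a normalized LMS loop as below. The normalization by the energy of the current reference vector is the textbook NLMS form and is an assumption here, as are the function name `bsf_adapt`, the zero initial weights, and the guard constant `eps`. How the reference r(n) is formed from the two channel outputs is left to the caller.

```python
def bsf_adapt(y_right, y_left, r, n_taps=4, mu=0.5, eps=1e-12):
    """Run two NLMS filters that share the reference r(n).

    Each output sample is the primary signal minus the filtered reference;
    each weight vector is updated with its own output as the error signal.
    Assumption: the step is mu divided by the energy of the reference vector.
    """
    w_r = [0.0] * n_taps
    w_l = [0.0] * n_taps
    z_r, z_l = [], []
    for n in range(len(r)):
        # R(n) = [r(n), r(n-1), ..., r(n-N+1)], zero before the record starts
        ref = [r[n - i] if n - i >= 0 else 0.0 for i in range(n_taps)]
        zr = y_right[n] - sum(w * v for w, v in zip(w_r, ref))
        zl = y_left[n] - sum(w * v for w, v in zip(w_l, ref))
        step = mu / (sum(v * v for v in ref) + eps)
        w_r = [w + step * v * zr for w, v in zip(w_r, ref)]
        w_l = [w + step * v * zl for w, v in zip(w_l, ref)]
        z_r.append(zr)
        z_l.append(zl)
    return z_r, z_l, w_r, w_l
```

When the primary signals contain only filtered copies of the reference, both outputs decay toward zero, which corresponds to the ideal noise-only case described in the text.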
Based on the frame-by-frame processing configuration, a further modified algorithm can be obtained as follows:
where k represents the k'th repetition within the same frame. It is noted that the frame-by-frame algorithm for LSF 203 differs from that for BSF 205 primarily because only an adaptive gain is involved in LSF 203.
In yet another alternate embodiment of BSF 205, a fixed filter replaces the adaptive filter. The fixed filter coefficients can be the same in all frequency bins. If desired, delay-summation or delay-subtraction processing can be used to replace the adaptive filter.
In yet another alternate embodiment, the adaptive processing used in RSF 201 and LSF 203 is replaced by fixed processing. In other words, the outputs xL1(n) and xR1(n) of the first polar pattern units serve as outputs yL(n) and yR(n), respectively. In this case, the delay can be a value other than d/c so that different polar patterns can be obtained. For example, by selecting a delay of 0.342 d/c, a hypercardioid polar pattern can be achieved.
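The effect of the delay value on the fixed pattern can be illustrated with a short calculation. Under an idealized free-field, plane-wave assumption (not stated in the specification), a sound arriving at angle θ reaches the back microphone (d/c)·cos θ later than the front one, so the front-minus-delayed-back unit cancels when the internal delay plus (d/c)·cos θ equals zero. The function name `null_angle_deg` is hypothetical.

```python
import math


def null_angle_deg(delay_ratio):
    """Null direction, in degrees from straight ahead, of a delay-and-subtract pair.

    delay_ratio is the internal delay divided by d/c (between 0 and 1).
    Assumption: ideal plane wave and matched microphones, so the null
    satisfies cos(theta) = -delay_ratio.
    """
    if not 0.0 <= delay_ratio <= 1.0:
        raise ValueError("delay_ratio must lie between 0 and 1")
    return math.degrees(math.acos(-delay_ratio))
```

Under this model a ratio of 1.0 places the null at 180 degrees (the cardioid case), and a ratio of 0.342 places it close to 110 degrees.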
In yet another alternate embodiment, the adaptive gain in RSF 201 and LSF 203 can be replaced by an adaptive FIR filter. The algorithm for designing this adaptive FIR filter can be similar to that used for the adaptive filters of
As will be understood by those familiar with the art, the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. For example, although an LMS-based algorithm is used in RSF 201, LSF 203 and BSF 205, as previously noted, LS-based, TLS-based, RLS-based and related algorithms can be used with each of these spatial filters. The weights could also be obtained by directly solving the estimated Wiener-Hopf equations. Accordingly, the disclosures and descriptions herein are intended to be illustrative, but not limiting, of the scope of the invention which is set forth in the following claims.
The present application is a continuation-in-part of U.S. patent application Ser. No. 09/593,266, filed Jun. 13, 2000, the disclosure of which is incorporated herein in its entirety for any and all purposes.
Number | Date | Country | |
---|---|---|---|
20020041695 A1 | Apr 2002 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09593266 | Jun 2000 | US |
Child | 10006086 | | US |