1. Field of the Invention
The invention relates generally to the field of audio processing, and more particularly, to an audio processing apparatus in a communication system with a microphone array.
2. Description of the Related Art
In a communication system, a microphone picks up three components: a source signal, interference, and echo. The source signal is the desired signal, such as the voice of a speaker, and only the source signal needs to be sent to the far-end side. Thus, echo and interference are considered to be the most objectionable artifacts occurring in communication systems. The echo can result from a mismatch at the hybrid network, as in the network echo case, or from reflections caused by a reverberant environment, as with acoustic echo. An echo can manifest from the originator in a speech signal, wherein the originator is able to hear his/her own speech after a certain delay. With either kind of echo, the annoyance factor increases as the delay increases.
Meanwhile, interference, such as environment noise, also disrupts the proper operation of various subsystems of a communications system, such as the codec. Different kinds of environment noise can vary widely in their characteristics, and a practical noise reduction scheme has to be capable of handling noises with different characteristics.
In order to properly remove the interference and echo picked up by the microphone (or microphone array), an adaptive beamforming filter and an adaptive echo cancellation filter are respectively adopted in communications systems. However, as the echo and interference increase, the filtering performance degrades. Thus, a novel audio processing method and apparatus in a communication system with a microphone array are proposed.
Audio processing apparatuses are provided. An embodiment of an audio processing apparatus comprises a beamformer, a blocking matrix, a first adaptive filter and a second adaptive filter. The beamformer receives input signals and processes the input signals to generate a first processed signal. The input signals include at least one of a source signal and interference. The blocking matrix receives the input signals and operates to cancel the source signal from the input signals to generate a second processed signal. The first adaptive filter has adaptable first filter coefficients, generates a first filtered signal approximating the interference according to the first and second processed signals and continuously adapts the first filter coefficients according to the first filtered signal and the first processed signal. The second adaptive filter has adaptable second filter coefficients, generates a second filtered signal approximating the interference according to the first and second processed signals and selectively adapts the second filter coefficients according to the first filter coefficients and an output signal.
Another embodiment of an audio processing apparatus comprises an adaptive beamforming filter and an adaptive echo canceller. The adaptive beamforming filter receives a plurality of input signals, comprising at least one of a source signal, interference and echo, in a first acoustic path from a microphone array of the system and operates to cancel the interference from the input signals to generate a first processed signal and selectively change an adaptation step size of a plurality of filter coefficients according to a control signal. The adaptive echo canceller is coupled between the first acoustic path and at least one loudspeaker in a second acoustic path of the system and operates to cancel the echo from the first processed signal to generate a second processed signal, wherein the control signal is generated according to the presence of the echo in the input signals.
A detailed description is given in the following embodiments with reference to the accompanying drawings.
The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
FIG. 5a shows an exemplary waveform of a speech signal;
FIG. 5b shows an exemplary waveform of another speech signal with SNR=−6 dB;
FIG. 5c shows exemplary waveforms of the analysis results, including the obtained energy levels, power ratios and control signals; and
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
As shown in
A blocking matrix 103 is disposed in another audio processing path to receive the input signals and operates to cancel the source signal from the input signals so as to generate another processed signal SBM. According to an embodiment of the invention, the blocking matrix 103 receives the delay compensated input signals from the delay compensation unit 201 and may cancel the source signal by subtraction. According to another embodiment of the invention, the beamformer 102 and the blocking matrix 103 may also be integrated as a signal generator 109 for outputting the processed signals SBF and SBM. Because the input signals are synchronized after delay compensation, the processed signal SBM containing essentially only interference is obtained by subtracting one channel from another. An exemplary blocking matrix WC is shown as:
where the dimension M′ of WC can be determined as M′=M−1 and M represents the number of microphones in the microphone array.
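The subtraction described above can be illustrated with a minimal sketch. It assumes the common adjacent-channel difference form, in which each of the M−1 rows subtracts one delay-compensated channel from its neighbor. This particular matrix and the names `blocking_matrix` and `apply_blocking` are illustrative assumptions and are not necessarily the matrix WC of the embodiment.

```python
def blocking_matrix(m):
    """Build an (m-1) x m adjacent-difference matrix (assumed form)."""
    if m < 2:
        raise ValueError("at least two microphones are required")
    rows = []
    for i in range(m - 1):
        row = [0.0] * m
        row[i] = 1.0       # keep channel i
        row[i + 1] = -1.0  # subtract channel i+1
        rows.append(row)
    return rows


def apply_blocking(matrix, channels):
    """Multiply the matrix by one time-aligned sample per channel."""
    return [sum(w * x for w, x in zip(row, channels)) for row in matrix]
```

With such a matrix, a source that arrives identically on all synchronized channels cancels in every row, so only components that differ between channels (essentially interference) remain in the output.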
According to an embodiment of the invention, the audio processing apparatus 100 comprises two adaptive filters 104 and 105, instead of the single adaptive filter of the conventional design, and a characteristic analyzer 106 and a controller 107 to improve interference filtering performance. The interference filtering performance is improved, especially when the audio processing apparatus 100 is disposed in a noisy environment with a low signal-to-noise ratio (SNR). The adaptive filters 104 and 105 are coupled between the beamformer 102 and the blocking matrix 103 and respectively have a plurality of adaptable filter coefficients.
According to the embodiment of the invention, the filter coefficients of the adaptive filter (104 and/or 105) may be adapted according to the normalized least mean squares (NLMS) algorithm to minimize the cost for a next adaptation. The NLMS algorithm updates the coefficients of an adaptive filter by using the following equation:
where the error signal e(n)=d(n)−y(n), d(n) is the input signal of the adaptive filter, y(n) is the output signal from the adaptive filter, $\vec{w}(n)$ is the filter coefficient vector, $\vec{u}(n)$ is the filter input vector, and μ is the step size for the coefficient adaptation of the adaptive filter. In this way, the interference portion is processed through the adaptive filter 105 to minimize the output power of the output signal Sout, which is equivalent to minimizing the interference content of the output signal Sout.
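A single NLMS iteration using the quantities defined above can be sketched as follows. The sketch assumes the standard textbook update, in which the coefficient vector is moved by μ·e(n)·u(n) divided by the squared norm of u(n). The small regularization constant `eps` and the function name `nlms_step` are assumptions added for illustration and may differ from the exact form used in the embodiment.

```python
def nlms_step(w, u, d, mu=0.5, eps=1e-8):
    """One NLMS iteration; returns (new_coefficients, error)."""
    y = sum(wi * ui for wi, ui in zip(w, u))   # filter output y(n)
    e = d - y                                  # error e(n) = d(n) - y(n)
    norm = sum(ui * ui for ui in u) + eps      # input power, regularized (assumption)
    w_new = [wi + mu * e * ui / norm for wi, ui in zip(w, u)]
    return w_new, e
```

Setting `mu` to zero leaves the coefficients unchanged, which is how suspending the adaptation can be expressed in this form.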
According to an embodiment of the invention, the step size for the coefficient adaptation of the adaptive filter 105, such as the value μ shown in Eq. 2, may vary with the characteristics of the coefficients of the adaptive filter 104. The characteristic analyzer 106 is coupled to the adaptive filter 104 for analyzing the characteristics of the coefficients of the adaptive filter 104. As an example, the characteristic analyzer 106 monitors the coefficients of the adaptive filter 104 and analyzes the energy level of the coefficients. According to the embodiment of the invention, when the source signals are substantially picked up by the microphone array 101A˜101M in the desired direction (the direction directed to the position of a speaker), the resulting signals output from the beamformer 102 and the blocking matrix 103 will hypothetically diverge. That is, the difference between the processed signals SBM and SBF will be large. In this case, since the coefficients of the adaptive filter 104 are continuously adapted to minimize the output energy, the coefficient energy of the adaptive filter 104 would be larger than the coefficient energy in other cases. Thus, according to the embodiment of the invention, the controller 107 coupled between the characteristic analyzer 106 and the adaptive filter 105 generates a control signal Sctrl according to the energy level of the coefficients of the adaptive filter 104, which is analyzed by the characteristic analyzer 106, so as to direct the adaptive filter 105 to change its adaptation step size according to the control signal Sctrl.
According to the embodiment of the invention, when the energy level of the coefficients of the adaptive filter 104 increases, the controller 107 may direct the adaptive filter 105 to reduce the adaptation step size. Further, if the energy level exceeds a predetermined threshold, the controller 107 may further direct the adaptive filter 105 to suspend adaptation of the filter coefficients. As previously discussed, although the source signals are substantially picked up in the desired direction, the blocking matrix 103 may not be able to completely remove the source signal from the input signals, and some source signals may still remain in the processed signal SBM. As a result, the output signal Sout, which is supposed to be a clean version of the desired source signal, would be distorted by subtracting the filtered signal SF2 from the processed signal SBF. Thus, in this case, the adaptation step size of the adaptive filter 105 is preferably reduced, or even set to zero, so as to slow down or suspend the adaptation. On the other hand, when the energy level of the coefficients of the adaptive filter 104 decreases, the controller 107 may direct the adaptive filter 105 to increase or maintain the adaptation step size, or to resume adaptation (if it was suspended).
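The energy-based control described above can be sketched as follows. The sum-of-squares energy measure, the linear mapping from energy to step size, and the names `coefficient_energy`, `step_size_from_energy`, `base_mu` and `threshold` are assumptions chosen for illustration; the embodiment only requires that the step size shrink as the energy grows and that adaptation stop above a threshold.

```python
def coefficient_energy(coeffs):
    """Energy of the coefficients of the monitored filter (104), assumed as a sum of squares."""
    return sum(c * c for c in coeffs)


def step_size_from_energy(energy, base_mu=0.5, threshold=1.0):
    """Map the coefficient energy to a step size for the second filter (105).

    Assumed mapping: linear reduction up to `threshold`, zero above it.
    """
    if energy > threshold:
        return 0.0                                # suspend adaptation
    return base_mu * (1.0 - energy / threshold)   # shrink as energy grows
```

When the energy later falls back below the threshold, the same mapping returns a nonzero step size again, which corresponds to resuming the adaptation.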
where PA+PB represents the power of the subband signal of the processed signals SBF, and PA−PB represents the power of the subband signal of the processed signals SBM.
As previously described, when the source signals are substantially picked up by the microphone array 101A˜101M in the desired direction, the resulting signals output from the beamformer 102 and the blocking matrix 103 will hypothetically diverge. That is, the difference between the processed signals SBM and SBF will be large. Thus, it can be seen from Eq. 3 that the obtained power ratio will be small. According to an embodiment of the invention, in addition to referring to the energy level of the coefficients of the adaptive filter 104, the controller 107 may generate the control signal Sctrl according to the power ratio obtained by the subband signal analyzer 108 to further improve interference filtering performance. As an example, when the energy level increases or the power ratio decreases, the controller 107 accordingly directs the adaptive filter 105 to reduce the adaptation step size. Further, when the energy level exceeds a first predetermined threshold or the power ratio does not exceed a second predetermined threshold, the controller 107 accordingly directs the adaptive filter 105 to suspend adaptation. On the other hand, when the energy level decreases or the power ratio increases, the controller 107 accordingly directs the adaptive filter 105 to maintain or increase the adaptation step size, or to resume the adaptation (if it was suspended).
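The two-threshold rule described above can be sketched as follows. The sketch assumes that the power ratio is the subband power of SBM divided by the subband power of SBF, which is inferred from the statement that the ratio is small when the source is picked up in the desired direction. The function names, the default thresholds and the constant `eps` are illustrative assumptions.

```python
def mean_power(samples):
    """Average squared amplitude of a block of subband samples."""
    return sum(s * s for s in samples) / len(samples)


def power_ratio(sbm_block, sbf_block, eps=1e-12):
    """Assumed ratio: subband power of S_BM over subband power of S_BF."""
    return mean_power(sbm_block) / (mean_power(sbf_block) + eps)


def should_suspend(energy, ratio, energy_threshold=1.0, ratio_threshold=0.1):
    """Suspend when the energy exceeds its threshold or the ratio does not exceed its own."""
    return energy > energy_threshold or ratio <= ratio_threshold
```

For example, a block in which SBF is ten times larger in amplitude than SBM gives a ratio near 0.01, which falls below the assumed threshold of 0.1 and suspends adaptation.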
FIGS. 5a˜5c show some experimental results according to the embodiment of the invention. In
decision_value=Function1(SEnergy)+Function2(SPowerRatio) Eq. 4
and
The functions Function1( ) and Function2( ) may be designed flexibly according to different scenarios and thus, the controller 107 may obtain the decision value with adjustable weighting for the energy level signal SEnergy and the power ratio signal SPowerRatio. In the embodiment of the invention, when Sctrl=1, which means the desired signal is present, the adaptive filter 105 suspends the adaptation of its filter coefficients. On the other hand, when Sctrl=0, the adaptive filter 105 may resume adaptation. As can be seen from
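One possible way to combine the two signals according to Eq. 4 is sketched below. The weighted indicator forms of `function1` and `function2`, all threshold and weight values, and the final comparison that maps the decision value to Sctrl are hypothetical choices made for illustration; the embodiment leaves these functions open to design.

```python
def function1(s_energy, weight=0.5, threshold=1.0):
    """Hypothetical Function1: votes for 'source present' on high coefficient energy."""
    return weight if s_energy > threshold else 0.0


def function2(s_power_ratio, weight=0.5, threshold=0.1):
    """Hypothetical Function2: votes for 'source present' on a low power ratio."""
    return weight if s_power_ratio <= threshold else 0.0


def control_signal(s_energy, s_power_ratio, decision_threshold=0.5):
    """Return S_ctrl: 1 suspends adaptation of filter 105, 0 lets it resume."""
    decision_value = function1(s_energy) + function2(s_power_ratio)  # Eq. 4
    return 1 if decision_value >= decision_threshold else 0  # assumed thresholding
```

With equal weights of 0.5 and a decision threshold of 0.5, either indicator alone is enough to set Sctrl=1. Raising the decision threshold above 0.5 would instead require both indicators to agree.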
According to an embodiment of the invention, the rate of filter adaptation (i.e. the step size μ shown in Eq. 2) of the ABF 601 is controlled by the control signal Sctrl generated according to the extent of interference remaining in the processed signal SAEC and presence of the echo in the input signals. As shown in
Table 1 shows the decision rule for controlling the adaptation step size of the filter coefficients of the ABF 601.
As shown in Table 1, when the echo detector 603 detects that the echo is present and the interference detector 604 detects that interference remains in the processed signal SAEC, the controller 605 generates the control signal Sctrl accordingly so as to direct the ABF 601 to reduce the adaptation step size. When the echo detector 603 detects that the echo is present and the interference detector 604 detects that interference is cancelled, the controller 605 generates the control signal Sctrl accordingly so as to direct the ABF 601 to suspend the adaptation. And when the echo detector 603 detects that the echo is not present, the controller 605 generates the control signal Sctrl accordingly so as to direct the ABF 601 to maintain or increase the adaptation step size. As an example, when the ABF 601 is directed to suspend adaptation, the step size μ may be controlled by setting:
μ=μ·0 Eq. 6
When the ABF 601 is directed to reduce the adaptation step size, the step size μ may be controlled by setting:
When the ABF 601 is directed to increase the adaptation step size, the step size μ may be controlled by setting:
It is noted that in the conventional design, the AEC is usually disposed in front of the ABF for achieving better filtering performance. However, a drawback of such an implementation is that the number of AEC filters must equal the number of microphones so as to perform echo cancellation for each individual noisy channel. Thus, the computation cost increases as the number of microphones increases. According to the embodiment of the invention, the ABF 601 is designed to be disposed in front of the AEC 602. Thus, only one AEC is required in the audio processing apparatus 600. Further, the adaptation step size of the ABF 601 is appropriately controlled as shown in Table 1 in accordance with the extent of the interference remaining in the processed signal SAEC and the presence of the echo in the input signals. In this way, compared with the conventional design, the proposed structure not only greatly reduces the computation cost, but also improves the filtering performance by appropriately controlling the adaptation step size of the ABF.
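The decision rule of Table 1, as described in the text above, can be sketched as follows. The function name `abf_step_size`, the scaling factors `reduce_factor` and `increase_factor`, and the upper bound `mu_max` are hypothetical values chosen for illustration. Only the multiplication by zero for suspension is taken from Eq. 6.

```python
def abf_step_size(base_mu, echo_present, interference_remaining,
                  reduce_factor=0.5, increase_factor=1.0, mu_max=1.0):
    """Return the effective ABF step size for the current detector outputs."""
    if echo_present:
        if interference_remaining:
            return base_mu * reduce_factor   # echo and residual interference: reduce
        return base_mu * 0.0                 # echo, interference cancelled: suspend (Eq. 6)
    # No echo: maintain (factor 1.0) or increase (factor > 1.0), bounded by mu_max.
    return min(base_mu * increase_factor, mu_max)
```

The sketch scales a nominal step size `base_mu` on each call rather than overwriting a stored μ, so that a suspension does not leave the step size stuck at zero once the echo disappears.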
While the invention has been described by way of example and in terms of preferred embodiment, it is to be understood that the invention is not limited thereto. Those who are skilled in this technology can still make various alterations and modifications without departing from the scope and spirit of this invention. Therefore, the scope of the present invention shall be defined and protected by the following claims and their equivalents.