A preferred embodiment of the present invention relates to a signal processing device, a teleconferencing device, and a signal processing method that obtain sound of a sound source by using a microphone.
Japanese Unexamined Patent Application Publication No. 2009-049998 and International publication No. 2014/024248 disclose a configuration to enhance a target sound by the spectral subtraction method. The configuration of Japanese Unexamined Patent Application Publication No. 2009-049998 and International publication No. 2014/024248 extracts a correlated component of two microphone signals as a target sound. In addition, each configuration of Japanese Unexamined Patent Application Publication No. 2009-049998 and International publication No. 2014/024248 is a technique of performing noise estimation in filter processing by an adaptive algorithm and performing processing of enhancing the target sound by the spectral subtraction method.
A signal processing method performs echo reduction processing on at least one of a collected sound signal of a first microphone, a collected sound signal of a second microphone, or both the collected sound signal of the first microphone and the collected sound signal of the second microphone, and calculates a correlated component between the collected sound signal of the first microphone and the collected sound signal of the second microphone, using a collected sound signal of which echo has been reduced by the echo reduction processing.
As in the conventional art, in a case of a device that obtains sound of a sound source, using a microphone, the sound outputted from a speaker may be diffracted as an echo component. Since the echo component is inputted as the same component to two microphone signals, the correlation is very high. Therefore, the echo component becomes a target sound and the echo component may be enhanced.
In view of the foregoing, an object of a preferred embodiment of the present invention is to provide a signal processing device, a teleconferencing device, and a signal processing method that are able to calculate a correlated component with higher accuracy than conventional configurations.
The microphone 10A and the microphone 10B are disposed at an outer peripheral position of the housing 70 on an upper surface of the housing 70. The speaker 50 is disposed on the upper surface of the housing 70 so that sound may be emitted toward the upper surface of the housing 70. However, the shape of the housing 70, the placement of the microphones, and the placement of the speaker are merely examples and are not limited to these examples.
The signal processor 15 includes a CPU or a DSP. The signal processor 15 performs signal processing by reading out a program 151 stored in the memory 150 being a storage medium and executing the program. For example, the signal processor 15 controls the level of a collected sound signal Xu of the microphone 10A or a collected sound signal Xo of the microphone 10B, and outputs the signal to the I/F 19. It is to be noted that, in the present preferred embodiment, the description of an A/D converter and a D/A converter is omitted, and all of the various signals are digital signals unless otherwise described.
The I/F 19 transmits a signal inputted from the signal processor 15, to other devices. In addition, the I/F 19 receives an emitted sound signal from other devices and inputs the signal to the signal processor 15. The signal processor 15 performs processing such as level adjustment of the emitted sound signal inputted from other devices, and causes sound to be outputted from the speaker 50.
The echo reducer 20 receives a collected sound signal Xo of the microphone 10B, and reduces an echo component from an inputted collected sound signal Xo (S11). It is to be noted that the echo reducer 20 may reduce an echo component from the collected sound signal Xu of the microphone 10A or may reduce an echo component from both the collected sound signal Xu of the microphone 10A and the collected sound signal Xo of the microphone 10B.
The echo reducer 20 receives a signal (an emitted sound signal) to be outputted to the speaker 50. The echo reducer 20 performs echo reduction processing with an adaptive filter. In other words, the echo reducer 20 estimates a feedback component to be obtained when an emitted sound signal is outputted from the speaker 50 and reaches the microphone 10B through a sound space. The echo reducer 20 estimates the feedback component by processing the emitted sound signal with an FIR filter that simulates an impulse response in the sound space. The echo reducer 20 subtracts the estimated feedback component from the collected sound signal Xo. The echo reducer 20 updates a filter coefficient of the FIR filter using an adaptive algorithm such as LMS or RLS.
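The adaptive FIR echo reduction described above can be sketched as follows. This is a minimal time-domain illustration only, using NLMS (a normalized variant of LMS, chosen here for simplicity). The function name `nlms_echo_reducer`, the tap count, the step size `mu`, and the simulated room response are all assumptions made for illustration and are not taken from the embodiment.

```python
import random

def nlms_echo_reducer(emitted, collected, num_taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the collected sound signal."""
    coeffs = [0.0] * num_taps   # FIR filter simulating the impulse response
    recent = [0.0] * num_taps   # latest emitted samples, newest first
    output = []
    for x, d in zip(emitted, collected):
        recent = [x] + recent[:-1]
        echo_estimate = sum(c * r for c, r in zip(coeffs, recent))
        err = d - echo_estimate          # collected signal with echo reduced
        power = sum(r * r for r in recent) + eps
        # NLMS coefficient update (assumed choice of adaptive algorithm).
        coeffs = [c + mu * err * r / power for c, r in zip(coeffs, recent)]
        output.append(err)
    return output

# Hypothetical test signal: the microphone picks up only the speaker echo.
random.seed(0)
emitted = [random.uniform(-1.0, 1.0) for _ in range(2000)]
room = [0.5, 0.3, -0.2]   # assumed impulse response of the sound space
collected = [sum(h * emitted[n - j] for j, h in enumerate(room) if n - j >= 0)
             for n in range(len(emitted))]
reduced = nlms_echo_reducer(emitted, collected)
# Energy of the residual echo relative to the original echo, after adaptation.
residual_ratio = (sum(v * v for v in reduced[-200:])
                  / sum(v * v for v in collected[-200:]))
```

In this sketch, `residual_ratio` compares the energy of the last 200 output samples with that of the unprocessed signal, so a small value indicates that the filter has converged toward the assumed room response.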
The noise estimator 21 receives the collected sound signal Xu of the microphone 10A and an output signal of the echo reducer 20. The noise estimator 21 estimates a noise component, based on the collected sound signal Xu of the microphone 10A and the output signal of the echo reducer 20.
It is to be noted that the noise estimator 21 applies the Fourier transform to each of the collected sound signal Xo and the collected sound signal Xu, and converts the signals into a signal Xo(f, k) and a signal Xu(f, k) of a frequency axis. The “f” represents a frequency and the “k” represents a frame number.
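The conversion into signals of a frequency axis can be illustrated with a short sketch. It assumes non-overlapping rectangular frames and a direct DFT for readability; the frame length and the function name `to_frequency_frames` are hypothetical, and a practical device would more likely use overlapping windowed frames and an FFT. The result is indexed as `X[k][f]`, corresponding to X(f, k) in the text.

```python
import cmath

def to_frequency_frames(signal, frame_len=8):
    """Return X[k][f]: DFT bin f of the k-th non-overlapping frame."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        spectrum = [sum(x * cmath.exp(-2j * cmath.pi * f * n / frame_len)
                        for n, x in enumerate(frame))
                    for f in range(frame_len)]
        frames.append(spectrum)
    return frames

# Hypothetical input: a constant signal, whose energy falls entirely in bin 0.
xo_fk = to_frequency_frames([1.0] * 16)
```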
The gain adjuster 212 extracts a target sound by multiplying the collected sound signal Xu(f, k) by the gain W(f, k) for each frequency. The filter calculator 211 updates the gain of the gain adjuster 212 in update processing by the adaptive algorithm. However, the target sound to be extracted by processing of the gain adjuster 212 and the filter calculator 211 is only a correlated component of direct sound from a sound source to the microphone 10A and the microphone 10B. The impulse response corresponding to a component of indirect sound is ignored. Therefore, the filter calculator 211, in the update processing by the adaptive algorithm such as NLMS or RLS, performs update processing with only several frames being taken into consideration.
Then, the noise estimator 21, in the adder 213, as shown in the following equations, reduces the component of the direct sound, from the collected sound signal Xo(f, k), by subtracting the output signal W(f, k)·Xu(f, k) of the gain adjuster 212 from the collected sound signal Xo(f, k) (S13).
E(f,k)=Xo(f,k)−W(f,k)Xu(f,k) [Equation 1]
Accordingly, the noise estimator 21 is able to estimate a noise component E(f, k) in which the correlated component of the direct sound has been reduced from the collected sound signal Xo(f, k).
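The noise estimation of Equation 1 can be sketched as follows, with one complex gain W per frequency bin. The text names NLMS or RLS only as examples of the adaptive algorithm, so the single-frame NLMS update used here, the step size `mu`, and the function name `estimate_noise` are assumptions for illustration.

```python
def estimate_noise(xo_frames, xu_frames, mu=0.5, eps=1e-12):
    """Return (noise E, correlated component W*Xu, final gains W)."""
    num_bins = len(xo_frames[0])
    w = [0j] * num_bins                  # gain W(f, k), one complex tap per bin
    noise_frames, correlated_frames = [], []
    for xo, xu in zip(xo_frames, xu_frames):
        noise, correlated = [], []
        for f in range(num_bins):
            c = w[f] * xu[f]             # W(f, k) * Xu(f, k)
            e = xo[f] - c                # Equation 1
            # Single-frame NLMS update (assumed form of the adaptive algorithm).
            w[f] += mu * e * xu[f].conjugate() / (abs(xu[f]) ** 2 + eps)
            noise.append(e)
            correlated.append(c)
        noise_frames.append(noise)
        correlated_frames.append(correlated)
    return noise_frames, correlated_frames, w

# Hypothetical input: Xo is exactly 0.8 times Xu, so everything is correlated.
xu_frames = [[complex(1 + k, f) for f in range(3)] for k in range(40)]
xo_frames = [[0.8 * v for v in frame] for frame in xu_frames]
noise_frames, correlated_frames, gains = estimate_noise(xo_frames, xu_frames)
```

Because the two inputs in this example are fully correlated, the gains approach 0.8 and the estimated noise component approaches zero; with real signals, E(f, k) would retain whatever is not explained by the single gain per bin.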
Subsequently, the signal processor 15, in the noise suppressor 23, performs noise suppression processing by the spectral subtraction method, using the noise component E(f, k) estimated by the noise estimator 21 (S14).
Herein, β(f, k) is a coefficient to be multiplied by a noise component, and has a different value for each time and frequency. The β(f, k) is properly set according to the use environment of the signal processing device 1. For example, the value of β may be set larger for a frequency at which the level of the noise component is higher.
In addition, in the present preferred embodiment, a signal to be subtracted by the spectral subtraction method is an output signal X′o(f, k) of the sound enhancer 22. The sound enhancer 22, before the noise suppression processing by the noise suppressor 23, as shown in the following equation 3, calculates an average of the signal Xo(f, k) of which the echo has been reduced and the output signal W(f, k)·Xu(f, k) of the gain adjuster 212 (S141).
X′o(f,k)=0.5×{Xo(f,k)+W(f,k)Xu(f,k)} [Equation 3]
The output signal W(f, k)·Xu(f, k) of the gain adjuster 212 is a component correlated with the Xo(f, k) and is equivalent to a target sound. Therefore, the sound enhancer 22, by calculating the average of the signal Xo(f, k) of which the echo has been reduced and the output signal W(f, k)·Xu(f, k) of the gain adjuster 212, enhances sound that is a target sound.
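The effect of the averaging in Equation 3 can be checked with a small worked example. The decomposition of Xo(f, k) into a target part and an uncorrelated residue, and the function name `enhance`, are assumptions made only for this illustration.

```python
def enhance(xo_frame, correlated_frame):
    """Equation 3: average Xo(f, k) and W(f, k)*Xu(f, k) for each bin."""
    return [0.5 * (xo + c) for xo, c in zip(xo_frame, correlated_frame)]

# Hypothetical single-bin frame: target S plus an uncorrelated residue N.
target, residue = 1 + 1j, 0.4 - 0.2j
enhanced = enhance([target + residue], [target])
```

If the correlated component equals the target exactly, the average keeps the target at full level while halving the residue, which is one way to read the statement that the target sound is enhanced.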
The gain adjuster 232 calculates an output signal Yn(f, k) by multiplying the spectral gain |Gn(f, k)| calculated by the filter calculator 231 by the output signal X′o(f, k) of the sound enhancer 22.
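The gain calculation and gain application can be sketched as below. The exact form of the spectral gain is not reproduced in this text, so the sketch uses a common textbook form of magnitude spectral subtraction, which may differ from the embodiment; the function name `spectral_subtraction` and the clamping at zero are assumptions.

```python
def spectral_subtraction(enhanced, noise, beta):
    """Apply a magnitude spectral-subtraction gain to each frequency bin."""
    output = []
    for x, e, b in zip(enhanced, noise, beta):
        mag = abs(x)
        # Assumed gain: subtract the scaled noise magnitude, clamp at zero.
        gain = max(mag - b * abs(e), 0.0) / mag if mag > 0 else 0.0
        output.append(gain * x)      # Yn(f, k) = |Gn(f, k)| * X'o(f, k)
    return output

# Hypothetical one-frame example with three bins and beta = 1 in every bin.
y = spectral_subtraction([2 + 0j, 1j, 0j],
                         [1 + 0j, 2 + 0j, 1 + 0j],
                         [1.0, 1.0, 1.0])
```

Passing `beta` as a per-bin list reflects the description of β(f, k) as a coefficient that differs for each frequency; a per-frame list can be supplied to vary it over time as well.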
It is to be noted that the filter calculator 231 may further calculate spectral gain G′n(f, k) that causes a harmonic component to be enhanced, as shown in the following equation 4.
Here, i is an integer. According to the equation 4, the integral multiple component (that is, a harmonic component) of each frequency component is enhanced. However, when the value of f/i is not an integer, interpolation processing is performed as shown in the following equation 5.
Subtraction processing of a noise component by the spectral subtraction method subtracts a larger number of high frequency components, so that sound quality may be degraded. However, in the present preferred embodiment, since the harmonic component is enhanced by the spectral gain G′n(f, k), degradation of sound quality is able to be prevented.
As shown in
The gain calculator 241 performs noise suppression processing by the spectral subtraction method, as shown in the following equation 6. However, the multiplication coefficient γ of a noise component is a fixed value and is a value different from a coefficient β(f, k) in the noise suppressor 23.
The gain calculator 241 further calculates an average value Gth(k) of the level of all the frequency components of the signal that has been subjected to the noise suppression processing. Mbin is the upper limit of the frequency. The average value Gth(k) is equivalent to a ratio between a target sound and noise. The ratio between a target sound and noise is reduced as the distance between a microphone and a sound source is increased and is increased as the distance between a microphone and a sound source is reduced. In other words, the average value Gth(k) corresponds to the distance between a microphone and a sound source. Accordingly, the gain calculator 241 functions as a distance estimator that estimates the distance of a sound source based on the ratio between a target sound (the signal that has been subjected to the sound enhancement processing) and a noise component.
The gain calculator 241 changes the gain Gf(k) of the gain adjuster 25 according to the value of the average value Gth(k) (S16). For example, as shown in the equation 6, in a case in which the average value Gth(k) exceeds a threshold value, the gain Gf(k) is set to the specified value a, and, in a case in which the average value Gth(k) is not larger than the threshold value, the gain Gf(k) is set to the specified value b (b<a). Accordingly, the signal processing device 1 does not collect sound from a sound source far from the device, and is able to enhance sound from a sound source close to the device as a target sound.
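The threshold-based gain switching can be sketched as follows. The equations for the average value Gth(k) and the gain Gf(k) are not reproduced in this text, so the sketch assumes that the "level" of each frequency component is the normalized spectral-subtraction gain with a fixed coefficient γ; the function name `distance_gain` and all default values (`gamma`, `threshold`, `a`, `b`) are hypothetical.

```python
def distance_gain(enhanced, noise, gamma=1.0, threshold=0.5, a=1.0, b=0.0):
    """Return (Gf, Gth) for one frame from enhanced and noise spectra."""
    gains = []
    for x, e in zip(enhanced, noise):
        mag = abs(x)
        # Assumed per-bin level: normalized spectral-subtraction gain.
        gains.append(max(mag - gamma * abs(e), 0.0) / mag if mag > 0 else 0.0)
    g_th = sum(gains) / len(gains)       # average over all frequency bins
    return (a if g_th > threshold else b), g_th

# Hypothetical frames: low noise (near source) and high noise (far source).
near_gain, near_gth = distance_gain([1 + 0j] * 4, [0.1 + 0j] * 4)
far_gain, far_gth = distance_gain([1 + 0j] * 4, [0.8 + 0j] * 4)
```

Under these assumptions a frame dominated by the target sound yields a high average and the larger gain a, while a frame dominated by noise yields a low average and the smaller gain b.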
It is to be noted that, in the present preferred embodiment, the sound of the collected sound signal Xo of the non-directional microphone 10B is enhanced, subjected to gain adjustment, and outputted to the I/F 19. Alternatively, the sound of the collected sound signal Xu of the directional microphone 10A may be enhanced, subjected to gain adjustment, and outputted to the I/F 19. However, the microphone 10B is a non-directional microphone and is able to collect sound of the whole surroundings. Therefore, it is preferable to adjust the gain of the collected sound signal Xo of the microphone 10B and to output the adjusted sound signal to the I/F 19.
The technical idea described in the present preferred embodiment will be summarized as follows.
1. A signal processing device includes a first microphone (a microphone 10A), a second microphone (a microphone 10B), and a signal processor 15. The signal processor 15 (an echo reducer 20) performs echo reduction processing on at least one of a collected sound signal Xu of the microphone 10A, or a collected sound signal Xo of the microphone 10B. The signal processor 15 (a noise estimator 21) calculates an output signal W(f, k)·Xu(f, k) being a correlated component between the collected sound signal of the first microphone and the collected sound signal of the second microphone, using a signal Xo(f, k) of which echo has been reduced by the echo reduction processing.
As with Japanese Unexamined Patent Application Publication No. 2009-049998 and International publication No. 2014/024248, in a case in which echo is generated when a correlated component is calculated using two signals, the echo component is calculated as a correlated component, which causes the echo component to be enhanced as a target sound. However, because the signal processing device according to the present preferred embodiment calculates the correlated component using a signal of which the echo has been reduced, the device is able to calculate the correlated component with higher accuracy than conventional configurations.
2. The signal processor 15 calculates an output signal W(f, k)·Xu(f, k) being a correlated component by performing filter processing by an adaptive algorithm, using a current input signal or the current input signal and several previous input signals.
For example, Japanese Unexamined Patent Application Publication No. 2009-049998 and International publication No. 2014/024248 employ the adaptive algorithm in order to estimate a noise component. In an adaptive filter using the adaptive algorithm, a calculation load becomes excessive as the number of taps is increased. In addition, since a reverberation component of sound is included in processing using the adaptive filter, it is difficult to estimate a noise component with high accuracy.
On the other hand, in the present preferred embodiment, the output signal W(f, k)·Xu(f, k) of the gain adjuster 212, as a correlated component of direct sound, is calculated by the filter calculator 211 in the update processing by the adaptive algorithm. As described above, the update processing is update processing in which an impulse response that is equivalent to a component of indirect sound is ignored and only one frame (a current input value) is taken into consideration. Therefore, the signal processor 15 of the present preferred embodiment is able to remarkably reduce the calculation load in the processing to estimate a noise component E(f, k). In addition, the update processing of the adaptive algorithm is the processing in which an indirect sound component is ignored. In the update processing of the adaptive algorithm, the reverberation component of sound has no effect, so that a correlated component is able to be estimated with high accuracy. However, the update processing is not limited only to one frame (the current input value). The filter calculator 211 may perform update processing including several past signals.
3. The signal processor 15 (the sound enhancer 22) performs sound enhancement processing using a correlated component. The correlated component is the output signal W(f, k)·Xu(f, k) of the gain adjuster 212 in the noise estimator 21. The sound enhancer 22, by calculating an average of the signal Xo(f, k) of which the echo has been reduced and the output signal W(f, k)·Xu(f, k) of the gain adjuster 212, enhances sound that is a target sound.
In such a case, since the sound enhancement processing is performed using the correlated component calculated by the noise estimator 21, sound is able to be enhanced with high accuracy.
4. The signal processor 15 (the noise suppressor 23) uses a correlated component and performs processing of reducing the correlated component.
5. More specifically, the noise suppressor 23 performs processing of reducing a noise component using the spectral subtraction method. The noise suppressor 23 uses the signal of which the correlated component has been reduced by the noise estimator 21, as a noise component.
Because the noise suppressor 23 uses the highly accurate noise component E(f, k) calculated in the noise estimator 21 as the noise component in the spectral subtraction method, the noise suppressor 23 is able to suppress a noise component with higher accuracy than conventional configurations.
6. The noise suppressor 23 further performs processing of enhancing a harmonic component in the spectral subtraction method. Accordingly, since the harmonic component is enhanced, the degradation of the sound quality is able to be prevented.
7. The noise suppressor 23 sets a different gain β(f, k) for each frequency or for each time in the spectral subtraction method. Accordingly, a coefficient to be multiplied by a noise component is set to a suitable value according to environment.
8. The signal processor 15 includes a distance estimator 24 that estimates a distance of a sound source. The signal processor 15, in the gain adjuster 25, adjusts a gain of the collected sound signal of the first microphone or the collected sound signal of the second microphone, according to the distance that the distance estimator 24 has estimated. Accordingly, the signal processing device 1 does not collect sound from a sound source far from the device, and is able to enhance sound from a sound source close to the device as a target sound.
9. The distance estimator 24 estimates the distance of the sound source, based on a ratio of a signal X′o(f, k) on which sound enhancement processing has been performed using the correlated component and a noise component E(f, k) extracted by the processing of reducing the correlated component. Accordingly, the distance estimator 24 is able to estimate a distance with high accuracy.
Finally, the foregoing preferred embodiments are illustrative in all points and should not be construed to limit the present invention. The scope of the present invention is defined not by the foregoing preferred embodiment but by the following claims. Further, the scope of the present invention is intended to include all modifications within the scopes of the claims and within the meanings and scopes of equivalents.
The present application is a continuation of International Application No. PCT/JP2017/021616, filed on Jun. 12, 2017, the entire contents of which are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
6738482 | Jaber | May 2004 | B1 |
8462962 | Itou | Jun 2013 | B2 |
9426566 | Takahashi | Aug 2016 | B2 |
9510095 | Takahashi | Nov 2016 | B2 |
20090067642 | Buck | Mar 2009 | A1 |
20140328490 | Mohammad | Nov 2014 | A1 |
20140376742 | Hetherington | Dec 2014 | A1 |
20150181329 | Mikami | Jun 2015 | A1 |
Number | Date | Country |
---|---|---|
101527875 | Sep 2009 | CN |
S63262577 | Oct 1988 | JP |
2009049998 | Mar 2009 | JP |
2013061421 | Apr 2013 | JP |
2014229932 | Dec 2014 | JP |
2015070291 | Apr 2015 | JP |
2009104252 | Aug 2009 | WO |
2014024248 | Feb 2014 | WO |
2015049921 | Apr 2015 | WO |
Entry |
---|
English Language Translation of JP2015070291A, Takahashi. (Year: 2015). |
International Search Report issued in Intl. Appln. No. PCT/JP2017/021616 dated Aug. 1, 2017. English translation provided. |
Written Opinion issued in Intl. Appln. No. PCT/JP2017/021616 dated Aug. 1, 2017. |
Extended European Search Report issued European Appln. No. 17913502.5 dated Dec. 16, 2020. |
Office Action issued in Japanese Appln. No. 2019-524558 dated Dec. 22, 2020. English machine translation provided. |
Office Action issued in Chinese Appln. No. 201780091855.1 dated Dec. 15, 2020. English translation provided. |
Number | Date | Country |
---|---|---|
20200105290 A1 | Apr 2020 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2017/021616 | Jun 2017 | US |
Child | 16701771 | | US |