The present disclosure relates generally to wind noise suppression for sound pickup devices such as headsets and the like.
The use of communication devices in windy conditions is an everyday occurrence for people around the world, but the microphone pickup of wind noise often interferes with effective communication. A basic characteristic of wind noise is that it is highly dynamic and non-stationary in time, much like the characteristic of speech, making it difficult to separate the wind noise from a noisy speech signal. Current state-of-the-art headsets, handsets, car kits and the like utilize multiple microphones in array configurations, along with noise reduction algorithms, to reduce or remove acoustic background noise. Because wind noise is heavily weighted toward the low frequencies, its interference is often addressed by using high-pass filters in single-channel methods (sometimes in an adaptive manner). These methods reduce the audible wind noise, but such filters cut all low-frequency sounds, including those of the desired speech signals, producing a deterioration of sound quality and a reduction of speech intelligibility.
Wind noise is created at a microphone's input by the turbulent pressure fluctuations developed by moving air. These pressure fluctuations are effectively uncorrelated at multiple spaced-apart microphones because the spatial coherence of the fluctuations decays rapidly with distance. Thus, wind noise picked up by spaced-apart microphones is essentially uncorrelated, while the desired signal is correlated.
As disclosed herein, a wind noise suppression device for suppressing wind noise in one or more of at least first and second channels includes a differencing module configured to obtain a magnitude difference of signals in the first and second channels, a summing module configured to obtain a magnitude sum of signals in the first and second channels, a ratioing module configured to obtain a ratio of the magnitude difference to the magnitude sum, one or more attenuators each associated with a channel, an attenuation generator configured to generate an attenuation value based on the ratio from the ratioing module, and an attenuation steering module configured to select an attenuator based on the magnitude difference, the selected attenuator operative to attenuate the signal in the associated channel by the attenuation value.
Also as disclosed herein, a method for suppressing noise in one or more of at least first and second channels includes obtaining a magnitude difference of signals in the first and second channels, obtaining a magnitude sum of signals in the first and second channels, obtaining a ratio of the magnitude difference to the magnitude sum, generating an attenuation value based on the ratio, selecting an attenuator based on the magnitude difference, and attenuating a signal in a channel by the attenuation value using the selected attenuator.
Also as disclosed herein, a nonvolatile program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method for suppressing noise in one or more of at least first and second channels, the method including obtaining a magnitude difference of signals in the first and second channels, obtaining a magnitude sum of signals in the first and second channels, obtaining a ratio of the magnitude difference to the magnitude sum, generating an attenuation value based on the ratio, selecting an attenuator based on the magnitude difference, and attenuating a signal in a channel by the attenuation value using the selected attenuator.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more examples of embodiments and, together with the description of example embodiments, serve to explain the principles and implementations of the embodiments.
Example embodiments are described herein in the context of a multi-channel wind noise suppression method and system. Those of ordinary skill in the art will realize that the following description is illustrative only and is not intended to be in any way limiting. Other embodiments will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to implementations of the example embodiments as illustrated in the accompanying drawings. The same reference indicators will be used to the extent possible throughout the drawings and the following description to refer to the same or like items.
In the interest of clarity, not all of the routine features of the implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art having the benefit of this disclosure.
In accordance with this disclosure, the components, process steps, and/or data structures described herein may be implemented using various types of operating systems, computing platforms, computer programs, and/or general purpose machines. In addition, those of ordinary skill in the art will recognize that devices of a less general purpose nature, such as hardwired devices, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used without departing from the scope and spirit of the inventive concepts disclosed herein. Where a method comprising a series of process steps is implemented by a computer or a machine and those process steps can be stored as a series of instructions readable by the machine, they may be stored on a tangible medium such as a computer memory device (e.g., ROM (Read Only Memory), PROM (Programmable Read Only Memory), EEPROM (Electrically Erasable Programmable Read Only Memory), FLASH Memory, Jump Drive, and the like), magnetic storage medium (e.g., tape, magnetic disk drive, and the like), optical storage medium (e.g., CD-ROM, DVD-ROM, paper card, paper tape and the like) and other types of program memory.
The term “exemplary” is used exclusively herein to mean “serving as an example, instance or illustration.” Any embodiment or arrangement described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments.
A basic characteristic of wind noise is that it is highly dynamic and non-stationary in time, much like the characteristic of speech, making it difficult to separate the wind noise from a noisy speech signal. However, in multi-microphone systems, wind noise, which is created at the sound inlet, or “port,” of microphones, is poorly correlated between spaced-apart microphones.
As disclosed herein, a new multi-channel wind noise reduction method and system exploits the spatial independence of wind noise at physically separated sensor inputs. It takes advantage of the fact that wind disturbances affect each microphone differently; in particular, at different energy levels. By design, when there are N signals, or channels, the wind noise is individually suppressed in each channel, and N channels of wind-noise reduced signals are output. Operation can be partially or fully implemented in the time or frequency domains.
In recognition that wind turbulence noise is poorly correlated in space and time, the wind noise effects present in the output signals of separate microphones will be different from each other. In particular, the magnitude difference between the signals from the microphones generally will be much larger than that of the desired signal. When the signals are broken into very small temporal increments, the probability that there is significant wind noise energy in more than one signal at the same time approaches zero. Similarly, by breaking each signal into very small frequency increments, the probability that there is significant wind noise energy in more than one signal frequency increment at the same time also approaches zero.
Turning to
The system 100 divides the signals from each of the microphones 102, 104 into small increments, which can be time or frequency increments, or both. A signal domain converter 106 converts the time domain signals from the microphones into the frequency domain. Then, in each corresponding time/frequency increment, a set of new signals indicative of the magnitude sums and differences of the original individual array microphone signals are generated. One approach for accomplishing this can be in accordance with Equation 1 below, and can be performed using summing module 108 and differencing module 110. The resultant sum and difference signals are applied to a divider module 112, which divides the differences by the sums. Attenuation values are generated for each increment in an attenuation value generator 114. The attenuation value for each increment is described in Equation 1 as follows:
wherein ATW is the wind noise attenuation value, L and R are the left (102) and right (104) microphone signals in this exemplary two-channel system, and i is a time index.
It should be noted that as described herein, the procedure of “obtaining the magnitude sum” is intended to encompass both 1) summing the signals involved then taking the magnitude of the result as in Equation 1 above, and 2) obtaining the magnitudes of each of the signals involved, then summing these magnitudes together. Thus when referring to “obtaining the magnitude sum,” either of these approaches is contemplated. Similarly, the procedure of “obtaining the magnitude difference” is intended to encompass both 1) subtracting one signal from the other then determining the magnitude of the difference and 2) determining the magnitude of each of the signals involved, and then determining the difference of these magnitudes (the latter approach is taken in Equation 1). In “obtaining the magnitude difference,” the sign of the difference (that is, which channel is larger in magnitude, which is indicative of which channel contains the greater wind noise component), is tracked in order to properly direct the attenuation to the appropriate (left or right) channel. This attenuation directing, or steering, is performed by a steering module 114a. It is further intended that the procedure of “obtaining the magnitude” includes obtaining a signal amplitude value, signal rms value, signal energy value, or any other signal level measure.
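The two readings of each procedure can be put side by side in a short Python sketch. The function names and the `magnitude_first` flag are hypothetical, chosen only for illustration, and `abs` stands in for whichever level measure (amplitude, rms, energy) an implementation prefers. When the signals are subtracted before the magnitude is taken, the sign is lost, so the sketch tracks the sign separately from the per-channel magnitudes.

```python
def magnitude_sum(l, r, magnitude_first=False):
    """Return the 'magnitude sum' of two samples or complex frequency bins."""
    if magnitude_first:
        return abs(l) + abs(r)   # approach 2: take magnitudes, then sum
    return abs(l + r)            # approach 1: sum, then take the magnitude


def magnitude_difference(l, r, magnitude_first=True):
    """Return (unsigned difference, sign).

    sign is +1 when the left magnitude is larger, -1 when the right is
    larger, and 0 when they are equal; it is what steers the attenuation.
    """
    level_gap = abs(l) - abs(r)
    sign = (level_gap > 0) - (level_gap < 0)
    if magnitude_first:
        return abs(level_gap), sign   # approach 2: difference of magnitudes
    return abs(l - r), sign           # approach 1: magnitude of the difference
```

For in-phase, equal signals both readings of the sum agree; they diverge as the two channels become uncorrelated, which is the situation wind noise creates.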
Moreover, in the discussion below, it should be noted that the less attenuation applied, the more of the original signal is preserved; conversely, the more attenuation applied, the less of the original signal is preserved. In effect, zero attenuation means that no attenuation is applied, and the original signal is passed unattenuated. Conversely, if the maximum range of attenuation is from 0 to 1, then an attenuation of 1 means the maximum attenuation is applied, and minimum, or zero, original signal is passed.
The variables p1 and p2 are powers to which the individual components can be raised to control the amount of attenuation that is applied to the output signals. The variables p1 and p2 are not necessarily integers and are not limited to real numbers, but are typically real numbers in the range from 1 to 10. In one embodiment, they are both selected to be 2 (p1=p2=2). Moreover, different values of p1 and p2 may be selected for the numerator terms and denominator terms in the above equation; that is, p1 need not equal p2. Selection of the powers p1 and p2 can be made with an eye to preserving the sign of the difference, in order to properly direct the attenuation to the appropriate channel. Alternatively, the sign can be separately determined independent of the difference determination, or, in the case where the power is applied to a difference value, the sign can be extracted and preserved prior to application of the power operation. Adding the constant k to the denominator before dividing, k typically being selected to be a very small number such as 10^-99, can be performed to avoid the difficulties associated with dividing by zero. The calculation of Equation 1 is performed separately on each frequency/time increment.
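A minimal Python sketch of the per-increment calculation follows. It is one reading of the description, not the equation as filed: it assumes the numerator is the difference of the channel magnitudes raised to p1 with the sign extracted beforehand, and the denominator is the magnitude of the summed signals raised to p2, plus k. The function name `wind_attenuation` and its argument names are hypothetical.

```python
import math


def wind_attenuation(left, right, p1=2.0, p2=2.0, k=1e-99):
    """Signed wind attenuation value ATW for one time/frequency increment.

    left, right: real samples or complex frequency bins for the same increment.
    Positive result: more energy in the left channel; negative: in the right.
    """
    level_gap = abs(left) - abs(right)           # difference of magnitudes
    # Extract the sign first so an even power cannot erase it.
    sign = math.copysign(1.0, level_gap) if level_gap else 0.0
    numerator = sign * abs(level_gap) ** p1
    denominator = abs(left + right) ** p2 + k    # magnitude of the sum, plus k
    return numerator / denominator
```

Under this assumed form with p1 equal to p2, the reverse triangle inequality (the gap between two magnitudes never exceeds the magnitude of the sum) keeps the result within the range from -1 to +1, and k keeps an all-zero increment from dividing by zero.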
Using the example of the two-microphone array of system 100, if there is only wind noise energy in just one microphone signal at a time, then the magnitudes of the sum and of the difference signals will be identical and the value of the magnitude of ATW will be “1” since the numerator and denominator in Equation 1 will be identical. However, there will be a sign difference depending on whether the wind noise is in the left channel or in the right channel. In the above convention, if the wind noise is predominantly in the left channel signal (mic 102), the sign will be positive, while if the predominant energy is in the right channel signal (mic 104) it will be negative. The significance of this sign preservation will be discussed in detail below.
Alternatively, the desired signal, for example voice, will have the same magnitude, or nearly the same, in each incremental pair of the original signals, but not in the sum and difference pair. The magnitude of the difference (numerator) will be quite small while the magnitude of the sum (denominator) will be approximately twice that of either signal. In this case, Equation 1 above indicates that ATW will be close to or equal to “0”.
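The limiting cases described above can be checked by hand. The first line below is an assumed form of Equation 1, reconstructed from the surrounding description for illustration only (difference of magnitudes in the numerator, magnitude of the sum in the denominator, sign extracted before the power is applied), with p1 = p2 = p; w denotes a wind-only signal and s a matched voice signal, both symbols introduced here.

```latex
\begin{aligned}
ATW_i &= \operatorname{sgn}\bigl(|L_i|-|R_i|\bigr)\,
         \frac{\bigl|\,|L_i|-|R_i|\,\bigr|^{p}}{|L_i+R_i|^{p}+k} \\[4pt]
\text{wind in left only } (L_i=w,\ R_i=0):\quad
ATW_i &= +\frac{|w|^{p}}{|w|^{p}+k}\approx +1 \\[4pt]
\text{wind in right only } (L_i=0,\ R_i=w):\quad
ATW_i &= -\frac{|w|^{p}}{|w|^{p}+k}\approx -1 \\[4pt]
\text{matched voice } (L_i=R_i=s):\quad
ATW_i &= \frac{0}{|2s|^{p}+k}=0
\end{aligned}
```

The approximations hold because k is chosen to be negligible next to any signal of practical level.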
It is intended that well known principles for signal matching are within the scope of this application, and that the signals input to this multi-channel wind noise suppression system/method may be modified versions of the signals directly available from the microphones themselves. For example, the microphone signals may be amplified to overcome additive noise in an electronic system incorporating this wind noise suppression technology. Also, the microphone signals preferentially may be matched in amplitude and/or phase and/or time delay for the desired signal using well known preprocessing means, prior to the wind noise suppression technology of the present application. In many broadside microphone array applications, the desired signal is inherently well matched in the original signals. In other applications, such as in end-fire microphone array applications, the desired signal may need to be matched first before the wind-noise suppression is applied. All such system configurations are contemplated as included for this technology.
For example, assuming that the desired (voice) signal is the same in both original signals, or matched after reception, then the magnitude of the sum signal will be twice the magnitude of either original signal, but the magnitude of the difference signal will be zero. In other words, the magnitude difference between the sum and difference signals will be very large for the desired signal component, but very small for the wind noise component. In the high-wind case, that difference between the numerator and denominator will approach 0, and the ratio will approach unity, since the numerator and denominator will both be almost the same, whereas in the low-wind or high desired-signal case, that difference between the numerator and denominator will be non-negligible—that is, significantly greater than 0, and the ratio will approach 0. Applying the process described by Equation 1 above and illustrated in block diagram in
Next, the attenuation values, ATW, are applied to the individual microphone channel signals, to thereby suppress the windy portions of the signals as necessary and result in the generation of multiple wind-noise reduced, but separate, signals that can be used in any subsequent multi-channel process. One manner of applying the attenuation to the microphone signals is to weight the signals in the two channels differently, as a function of the attenuation values and their sign. In one embodiment, a right channel multiplier 116 and a left channel multiplier 118 are utilized to apply the attenuation weight values ATWR and ATWL to the respective right and left channels, each multiplier multiplying the channel signal by a factor that is a function of the attenuation signal ATW. For maximum attenuation—that is, ATW close to ±1—the factor by which the channel signal is multiplied can be a very small fraction, or even zero (to thereby completely suppress that channel's signal). For minimum attenuation—that is, ATW close to 0—the factor by which the channel signal is multiplied can be close to one, thereby passing the channel signal substantially or completely unaltered or unsuppressed. Which channel is treated in this manner can be determined by the sign of ATW, which will indicate which of the channels has the greater noise and warrants greater suppression.
To demonstrate this application of the attenuation values ATW, first a separate attenuation value for each channel is derived as follows:

$$ATWL_i = \begin{cases} 1, & ATW_i \le 0 \\ 1 - ATW_i, & 0 < ATW_i \le 1 \\ 0, & ATW_i > 1 \end{cases} \qquad (2)$$

As shown in Equation 2, the attenuation to be applied to the left channel in this two microphone example is “1” whenever ATW is less than or equal to zero, “0” whenever ATW is greater than one, and 1−ATW whenever ATW is between zero and one. As with Equation 1, this calculation is performed separately for each frequency/time increment.

$$ATWR_i = \begin{cases} 1, & ATW_i \ge 0 \\ 1 + ATW_i, & -1 \le ATW_i < 0 \\ 0, & ATW_i < -1 \end{cases} \qquad (3)$$
As shown in Equation 3, the attenuation to be applied to the right channel is “1” whenever ATW is greater than or equal to zero, “0” whenever ATW is less than minus one, and ATW+1 whenever ATW is between minus one and zero. In other words, the positive values of ATW are applied to the left channel signal and the negative values of ATW are applied to the right channel signal, in the manner explained above, to create two separate and independent channel attenuation signals ATWL and ATWR.
These separate channel attenuation weight value signals ATWL and ATWR are then used to suppress the wind noise in each channel's signal as necessary. It should be noted that for each time and/or frequency increment, at least one channel will be passed without attenuation, as evident from Equations 1, 2 and 3 above. In other words, for each time/frequency increment the attenuation is calculated on information from both channels, and used to attenuate only the channel with more wind noise, passing the other channel unattenuated.
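The piecewise rules described for Equations 2 and 3 translate directly into code. The following Python sketch uses a hypothetical function name, `channel_weights`, and returns the pair (ATWL, ATWR) as multiplicative weights, where 1 passes the channel unchanged and 0 suppresses it.

```python
def channel_weights(atw):
    """Map a signed ATW value to (ATWL, ATWR) for one time/frequency increment."""
    # Left channel: weighted down only for positive ATW (wind mostly in left).
    if atw <= 0.0:
        atwl = 1.0
    elif atw > 1.0:
        atwl = 0.0
    else:
        atwl = 1.0 - atw

    # Right channel: weighted down only for negative ATW (wind mostly in right).
    if atw >= 0.0:
        atwr = 1.0
    elif atw < -1.0:
        atwr = 0.0
    else:
        atwr = 1.0 + atw

    return atwl, atwr
```

Because the two branches act on opposite signs of ATW, at least one of the two returned weights is always 1, which is the property noted above: one channel is passed without attenuation in every increment.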
In one implementation, the suppression is implemented multiplicatively, using multipliers 116 and 118. In the two-channel example being used here,
$$LW_i = L_i \times ATWL_i, \qquad RW_i = R_i \times ATWR_i \qquad (4)$$
the wind-reduced left channel output signal, LW, is the product of the original left channel input signal, L, times the left channel attenuation value signal, ATWL. Similarly, the wind reduced right channel output signal, RW, is the product of the original right channel input signal, R, times the right channel attenuation value signal, ATWR. These calculations are shown in Equation 4. The outputs are then passed to the next process in the device, which can be implemented using a different processing module 120. Examples of such further processing include transmission via a wired or wireless network to a remote listening or recording device, recording at a local device, or the like. Additionally, further sound processing can be implemented, such as that for enhancement of noise discrimination, signal matching, beam forming or the like. By removing the wind noise component while still preserving the original channel signals, the system and method described herein allows for flexible application in virtually any multi-channel microphone array system. For example, it is compatible with many beam formers because it only affects the magnitudes of the signals while the phase is preserved. In some applications, use can be made of the magnitude of the unattenuated signal, which can be applied to the opposite channel signal or vector fractions of the unattenuated channel signal can be mixed into the attenuated signal to recreate magnitude information to preserve good desired signal output.
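The multiplicative suppression and its effect on phase can be illustrated with a short Python sketch operating on complex frequency bins. The helper name `apply_weights` and the sample values are hypothetical; the weights would in practice come from the per-increment attenuation calculation.

```python
import cmath


def apply_weights(left_bins, right_bins, atwl, atwr):
    """Multiply each channel's bins by its weight, increment by increment."""
    lw = [l * wl for l, wl in zip(left_bins, atwl)]
    rw = [r * wr for r, wr in zip(right_bins, atwr)]
    return lw, rw


# Illustrative values: the first bin has excess energy in the left channel,
# the second bin is matched between the channels.
left = [1 + 1j, 0.2 - 0.1j]
right = [0.1 + 0.1j, 0.2 - 0.1j]
lw, rw = apply_weights(left, right, atwl=[0.25, 1.0], atwr=[1.0, 1.0])

# A non-negative real weight scales the magnitude and leaves the phase alone.
phase_before = cmath.phase(left[0])
phase_after = cmath.phase(lw[0])
```

Since each weight is a real number between 0 and 1, only the magnitude of a bin changes, which is why the outputs remain usable by phase-sensitive stages such as a beam former. A weight of exactly 0 removes the bin, so no phase remains to preserve in that increment.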
The above calculations are performed in a process 200 illustrated in
Alternatively, the sign can be determined separately, as shown in
There are several methods to utilize this technology for multi-channel systems of three or more channels. In a first method, pairs of channels can be selected, and the above described process is applied to those pairs. For example, in a four channel system, channel signals #1 and #2 are processed as disclosed above, then channel signals #3 and #4 are similarly processed, resulting in four channels of wind noise reduced signals. In a second approach, instead of processing pairs of channel signals, the wind attenuations from multiple pairs can be combined to create the ATWx signals. For example, in a three channel system, first channel signals #1 and #2 are processed to create the ATW1-1 and ATW2-1 attenuations, second channel signals #2 and #3 are processed to create the ATW2-2 and ATW3-2 attenuations, and lastly channel signals #3 and #1 are processed to create the ATW3-3 and ATW1-3 attenuations. Subsequently, the two ATW1 attenuations, ATW1-1 and ATW1-3, are combined by multiplication and applied to the #1 channel signal to remove its wind component. Similarly, the attenuations ATW2-1 and ATW2-2 are combined and applied to the #2 channel signal, while the attenuations ATW3-2 and ATW3-3 are combined and applied to the #3 channel signal. Thereby all three channel signals are wind noise reduced. The wind noise reduced multi-channel signals can be used as the input signals for virtually any multi-channel system, for example a beam former.
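The second approach can be sketched in Python as follows. The sketch assumes the same illustrative form of the pairwise calculation used for the two-channel case (difference of magnitudes over magnitude of the sum, with p1 = p2 = p), and the names `pair_weights` and `multichannel_weights` are hypothetical. It visits every pair of channels, which for three channels yields the three pairs described above.

```python
from itertools import combinations


def pair_weights(a, b, p=2.0, k=1e-99):
    """Weights (for a, for b) derived from one pair of channel increments."""
    gap = abs(a) - abs(b)
    ratio = abs(gap) ** p / (abs(a + b) ** p + k)
    ratio = min(ratio, 1.0)            # clamp so the weight never goes negative
    if gap > 0:
        return 1.0 - ratio, 1.0        # a holds more energy: weight a down
    if gap < 0:
        return 1.0, 1.0 - ratio        # b holds more energy: weight b down
    return 1.0, 1.0                    # matched levels: pass both


def multichannel_weights(bins):
    """Combine pairwise weights by multiplication, one weight per channel."""
    weights = [1.0] * len(bins)
    for i, j in combinations(range(len(bins)), 2):
        wi, wj = pair_weights(bins[i], bins[j])
        weights[i] *= wi
        weights[j] *= wj
    return weights
```

With three matched channels every weight stays at 1. If only the first channel carries energy in an increment, both pairs involving it drive its weight to 0 while the other two channels pass unchanged.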
While embodiments and applications have been shown and described, it would be apparent to those skilled in the art having the benefit of this disclosure that many more modifications than mentioned above are possible without departing from the inventive concepts disclosed herein. The invention, therefore, is not to be restricted except in the spirit of the appended claims.
This application claims priority to related U.S. Provisional Patent Application No. 61/441,528, filed on Feb. 10, 2011, hereby incorporated by reference in its entirety. This application is related to U.S. Provisional Pat. Appl. No. 61/441,396 filed Feb. 10, 2011; U.S. Provisional Pat. Appl. No. 61/441,397 filed Feb. 10, 2011; U.S. Provisional Pat. Appl. No. 61/441,611 filed Feb. 10, 2011; U.S. Provisional Pat. Appl. No. 61/441,511 filed Feb. 10, 2011; and U.S. Provisional Pat. Appl. No. 61/441,633 filed Feb. 10, 2011.
Number | Date | Country | |
---|---|---|---|
20120207325 A1 | Aug 2012 | US |
Number | Date | Country | |
---|---|---|---|
61441528 | Feb 2011 | US | |
61441396 | Feb 2011 | US | |
61441397 | Feb 2011 | US | |
61441611 | Feb 2011 | US | |
61441511 | Feb 2011 | US | |
61441633 | Feb 2011 | US |