METHOD AND APPARATUS FOR MASKING WIND NOISE

Description

BACKGROUND

Wind noise is a serious problem during telephone conversations that take place outside, in a moving vehicle, or in an otherwise windy environment. Wind noise can leave the listener on the far end of a telephone conversation unable to hear or understand the caller's voice.

Wind speed and direction change constantly, and as a result wind noise is very difficult to eliminate from telephone conversations. Conventional wind and/or noise cancelling methods and apparatuses are ineffective. The invention provides an effective method and/or apparatus for masking or eliminating wind noise from a telephone conversation while maintaining audible speech. A method and apparatus for masking, removing or suppressing wind noise would be an improvement over the prior art.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a block diagram for an adaptive wind noise masking filter;

FIG. 2 depicts a block diagram of an implementation of the adaptive wind noise masking filter using a computer;

FIG. 3 shows several example frequency responses of primary noise masking filters;

FIG. 4 shows frequency response changes for linear primary noise masking filter gain (W) variation from 0.1 to 0.9 at a fixed cogent frequency (CF);

FIG. 5 shows frequency response changes for linear primary noise masking filter CF variation from 50 Hz to 550 Hz at a fixed gain W;

FIG. 6 shows frequency response changes for a linear reference filter based on different W and CF;

FIGS. 7A and 7B show oscilloscope traces of an input signal before and after filtering the audio signal using the adaptive wind noise masking filter; and

FIG. 8 is a depiction of how characteristics of the adaptive wind noise masking filter change over time, to provide the output signal shown in FIG. 7B from the input signal shown in FIG. 7A.

DETAILED DESCRIPTION

FIG. 1 is a functional block diagram of a method and apparatus 10 for masking wind noise. An embodiment is implemented by a computer executing program instructions stored in a memory device coupled to the computer. The instructions cause the computer to perform functions identified by the various functional blocks. FIG. 1 thus illustrates a methodology; however, those of ordinary skill in the art will recognize that the methodology depicted in FIG. 1 can also be implemented using a digital signal processor (DSP), a field programmable gate array (FPGA), or discrete devices. FIG. 1 is thus considered to also illustrate an apparatus.

An embodiment is comprised of a low-pass filter 15, which receives audio signals 30, such as those output from a conventional microphone 25. In the preferred embodiment, the low-pass filter 15 is a digital filter, embodied as various computer program routines that process digital representations of the audio signal 30 from the microphone 25.

As shown in the figure, the analog audio signals 30 are input to a Fast Fourier Transform (FFT) calculator 35, implemented using program instructions. The output of the FFT calculator is input to a multiplier 40, also implemented using program instructions. The multiplier 40 multiplies the output of the Fast Fourier Transform calculator 35 by the output of an adaptive wind noise masking filter 45.

The adaptive wind noise masking filter 45 receives information from a wind noise probability classification block 50 and processes appropriate reference filters 60 to generate a target filter to apply on the output of the FFT 35. The wind noise probability classification 50 generates an output that is indicative of whether the signal 30 from the microphone 25 is likely to have noise, speech, or a combination of speech and noise. The wind noise probability classification is derived from information obtained from a wind noise detector 65.

Digital signals representing a wind noise-suppressed version of the audio from the microphone 25 are output from the multiplier 40 when a decision is made that the audio 30 from the microphone 25 is likely to have wind noise. The output of the adaptive wind noise masking filter 45 is therefore a set of frequency domain wind noise masking filter coefficients 58, which are input to the multiplier 40. The output of the multiplier 40 is input to an inverse Fast Fourier Transform (IFFT) circuit 70, the output 75 of which is a noise-reduced copy of the speech input into the microphone 25.

In an embodiment, wind noise detection is performed by a comparison of the low-pass filtered signal to the audio input signal 30. The comparison is computed as a ratio of the power level in each of the signals. In the embodiment, which uses the ratio of the low-pass filtered signal power P_t to the total power of the input signal P_T, the comparison is a ratio expressed in equation (1) below. In the embodiment, the low-pass filter 15 has a cutoff frequency of 150 Hz.

$\begin{matrix} ρ (n) = \frac{P_{t} (n)}{P_{T} (n)} & (1) \end{matrix}$

Where, ρ is the power ratio for a given input frame, n. In an embodiment a frame is 10 ms long.
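Equation (1) can be illustrated with a minimal Python sketch. The filter design is an assumption: the text states only a 150 Hz cutoff, so a one-pole IIR low-pass stands in for the low-pass filter 15 here, and the function name `power_ratio` and its parameters are hypothetical.

```python
import math

def power_ratio(frame, fs, cutoff_hz=150.0):
    """Return rho = P_t / P_T for one frame of samples, per equation (1).

    Assumption: a one-pole IIR low-pass approximates the low-pass filter 15;
    the text does not specify the filter order or design.
    """
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = dt / (rc + dt)          # smoothing factor of the one-pole filter
    y = 0.0                     # filter state (reset per call in this sketch)
    p_low = 0.0                 # accumulated power of the low-pass output, P_t
    p_total = 0.0               # accumulated power of the raw input, P_T
    for x in frame:
        y += a * (x - y)
        p_low += y * y
        p_total += x * x
    if p_total == 0.0:
        return 0.0              # silent frame: report no low-frequency dominance
    return p_low / p_total
```

A frame dominated by energy below the cutoff yields a ratio near 1, while a frame dominated by higher frequencies yields a ratio near 0. A streaming implementation would likely carry the filter state across 10 ms frames rather than resetting it.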

A wind noise probability classification (50) is calculated by using a “smoothened power ratio.” The smoothened power ratio is expressed by equation (2) below:

ξ(n)=α·ξ(n−1)+(1−α)·ρ(n) (2)

where α is a smoothing coefficient, the value of which is a design choice selected to determine the emphasis to put on one or more historical values of ξ. The value of α is between 0 and 1. In an embodiment, α is set in the range of [0.75, 1), where the bracket “[” indicates that the adjacent value is included within the range and the parenthesis means up to but not including the adjacent value, i.e., the value “1” is not included in the range but all lesser values are.

In Equation (2), the value of ξ(n) defines the probability of speech or noise in the input signal. It can be seen in Equation (2) that the speech or noise probability determination uses a current sample, represented by the term (1−α)·ρ(n), and at least one previously-obtained sample or “history” of the signal, which is represented by the term α·ξ(n−1). In the embodiment, the following speech and noise classifications are obtained by comparing numeric values of ξ obtained from Equation (2) with user defined numeric thresholds:
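The recursion in Equation (2) is a first-order exponential average and can be sketched as follows. The function name `smooth_power_ratio` and the default α of 0.9 are illustrative assumptions; the text only gives the range [0.75, 1) for an embodiment.

```python
def smooth_power_ratio(xi_prev, rho, alpha=0.9):
    """Return xi(n) = alpha * xi(n-1) + (1 - alpha) * rho(n), per equation (2)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * xi_prev + (1.0 - alpha) * rho
```

A larger α weights the history more heavily, so ξ reacts more slowly to a sudden change in the per-frame ratio ρ.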

$\begin{matrix} \begin{matrix} Ψ = \text{“Speech only”} : & \text{if } ξ < \text{SP\_ONLY\_THR} \\ Ψ = \text{“Mostly speech”} : & \text{if } \text{SP\_ONLY\_THR} < ξ < \text{NS\_SP\_THR} \\ Ψ = \text{“Mostly wind noise”} : & \text{if } \text{NS\_SP\_THR} < ξ < \text{NS\_THR} \\ Ψ = \text{“Mostly noise only”} : & \text{if } \text{NS\_THR} < ξ \end{matrix} & (3) \end{matrix}$

where,

SP_ONLY_THR is a threshold for speech classification;

NS_SP_THR is an intermediate threshold for identifying high probability of speech or wind noise;

NS_THR is a high threshold for wind noise classification; and

Ψ is a wind noise probability classification.

There could be more classifications of Ψ than are shown in the family of Equations (3), e.g., “More speech”, “More wind noise”, “Equal speech and wind noise” etc., in order to maintain smoother transition between wind noise and speech.
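The threshold comparison in the family of Equations (3) can be sketched as below, using the threshold values reported for one embodiment (0.3, 0.5 and 0.7). The function name `classify` is hypothetical, and the treatment of values exactly equal to a threshold is an assumption, since the strict inequalities in Equations (3) leave that case open.

```python
# Threshold values reported for one embodiment.
SP_ONLY_THR = 0.3
NS_SP_THR = 0.5
NS_THR = 0.7

def classify(xi):
    """Map the smoothened power ratio xi to a classification label (equations (3)).

    Assumption: a value exactly equal to a threshold falls into the higher class.
    """
    if xi < SP_ONLY_THR:
        return "Speech only"
    if xi < NS_SP_THR:
        return "Mostly speech"
    if xi < NS_THR:
        return "Mostly wind noise"
    return "Mostly noise only"
```

Additional intermediate classes could be added by inserting further thresholds into the same chain of comparisons.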

The thresholds defined in the family of equations (3) are used to determine characteristics of a primary adaptive masking filter 45. The characteristics of a primary adaptive masking filter 45 are compared to at least one reference filter 60 and thereafter selected to allow appropriate suppression and/or amplification of noise and/or speech in the audio signal. Example frequency responses of reference filters 60 are shown in FIG. 3, where the filter represented with ‘−’ performs less aggressive attenuation and the filter represented with ‘+’ performs more aggressive attenuation. The curves in FIG. 3 depict examples of different attenuation characteristics of different reference filters. The solid line in FIG. 3 shows that one reference filter attenuates signals linearly from six hundred Hz down to zero Hz. Stated another way, the solid line shows that one reference filter decreasingly attenuates input signals linearly from zero Hz up to about six hundred Hz. The other curves show that other reference filters can have attenuation characteristics that are more or less aggressive in different frequency ranges.

The adaptive wind noise masking filter 45 derives a cogent (i.e., pertinent or relevant) frequency (CF) and a gain W for the CF determined by the evaluation of the wind noise probability classification Ψ received from the wind noise probability classification 50. In an embodiment, the CF and W of the filter 45 for the frame n are determined by the following family of equations (4):

$\begin{matrix} W (n), CF (n) = \left\{ \begin{matrix} G_{\max}, & \text{NsFreq} & \text{if } Ψ = \text{Noise} \\ a \cdot G_{\max}, & \text{NsSpFreq} & \text{if } Ψ = \text{MostlyNoise} \\ b \cdot G_{\max}, & \text{SpNsFreq} & \text{if } Ψ = \text{MostlySpeech} \\ G_{\min}, & \text{SpFreq} & \text{if } Ψ = \text{Speech} \end{matrix} \right. & (4) \end{matrix}$

where,

a and b are scaling parameters, with 0 ≤ b ≤ a ≤ 1;

G_max and G_min are the maximum attenuation and minimum attenuation applied to the signal, respectively;

NsFreq, NsSpFreq, SpNsFreq and SpFreq are predetermined CFs for “Noise”, “Mostly noise”, “Mostly speech”, and “Speech” classifications respectively from the families of equations 3 set forth above.
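The selection in the family of equations (4) amounts to a table lookup keyed by the classification. The sketch below uses the parameter values reported for one embodiment (a=0.6, b=0.3, G_max=−30 dB, G_min=0 dB, and cogent frequencies of 300, 250, 200 and 150 Hz). The names `TARGETS` and `target_gain_and_cf` are hypothetical, and returning the gain in dB is an assumption, since the text does not state how the dB values relate to the linear gains used elsewhere.

```python
# Parameter values reported for one embodiment.
A, B = 0.6, 0.3
G_MAX_DB, G_MIN_DB = -30.0, 0.0

# Classification label -> (target gain W in dB, cogent frequency CF in Hz).
TARGETS = {
    "Noise": (G_MAX_DB, 300.0),              # NsFreq
    "Mostly noise": (A * G_MAX_DB, 250.0),   # NsSpFreq
    "Mostly speech": (B * G_MAX_DB, 200.0),  # SpNsFreq
    "Speech": (G_MIN_DB, 150.0),             # SpFreq
}

def target_gain_and_cf(label):
    """Return (W, CF) for a classification label, per equations (4)."""
    return TARGETS[label]
```

Because 0 ≤ b ≤ a ≤ 1, the attenuation grows monotonically from "Speech" to "Noise", and the cogent frequency rises with it.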

Values of a, b, G_max, G_min, NsFreq, NsSpFreq, SpNsFreq and SpFreq are determined experimentally a priori, in order to optimize noise suppression from the input signal. After the cogent frequency (CF) and target gain (W) are determined from the family of equations (4) set forth above, amplification or attenuation factors G_low and G_high are calculated as shown in equations (5a) and (6a) respectively. The amplification or attenuation factor G_low is applied to the frequencies below CF as shown in equation (5b), and G_high is applied to the frequencies above CF as shown in equation (6b).

$\begin{matrix} G_{low} = \frac{W (n)}{G (filt (CF))} & (5 a) \\ filt (0 : CF - 1) = filt (0 : CF - 1) \cdot G_{low} & (5 b) \end{matrix}$

Where, filt is a filter chosen from the reference filters 60;

filt(0:CF-1) are the filter coefficients of the chosen reference filter up to CF-1;

G(filt(CF)) is the current gain value on the chosen reference filter at CF;

G_low is the calculated gain applied to the reference filter coefficients below CF as shown in equation (5b).

And,

$\begin{matrix} G_{high} = \frac{1 - W (n)}{G (filt (FiltLen)) - G (filt (CF))} & (6 a) \\ filt (CF : FiltLen) = (filt (CF : FiltLen) - G (filt (CF))) \cdot G_{high} & (6 b) \end{matrix}$

Where, filt(CF:FiltLen) are the filter coefficients of the reference filter from CF to the last frequency (FiltLen) of the filter;

G(filt(FiltLen)) and G(filt(CF)) are the current gains of the reference filter coefficients at the last frequency (FiltLen) of the reference filter and at the CF respectively;

G_high is the calculated new gain applied to the normalized filter coefficients of the reference filter (filt) above CF as shown in equation (6b).

Adjusting the CF of the filter 45 based on G_low and G_high in response to historical characteristics of noise in a signal effectively changes the shape of the pass band of the filter 45, in real time, in response to changing noise levels in the signal 30 from the microphone 25 audio source. The shape of the band pass characteristic of the filter 45 is therefore adjusted empirically in real time, i.e., based on observations of noise characteristics, such that the filter 45 attenuates noise signals on the input signal 30 by reducing the amplitude of the signals in a particular frequency spectrum range that are received from the Fast Fourier Transform calculator 35. Stated another way, the adaptive wind noise masking filter 45 generates filter coefficients to selectively attenuate different frequency ranges to suppress wind noise content in signals received from the Fast Fourier Transform calculator 35. The adaptive wind noise masking filter 45 therefore effectively extracts speech signals from the input signal 30. Different frequency ranges are attenuated by determining coefficients of the FFT calculator output.
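Equations (5a) through (6b) can be sketched on a list of linear reference filter gains, with CF expressed as a bin index. The sketch follows the equations as printed, under which the upper segment starts at zero at CF. Whether W(n) is then added back so that the two segments meet is not stated in the text, so that step is exposed only as a hypothetical `continuous` flag. The function name and the use of linear (not dB) gains are likewise assumptions.

```python
def reshape_reference_filter(filt, cf, w, continuous=False):
    """Rescale reference filter gains around bin index cf, per (5a)-(6b).

    filt: linear gains of the chosen reference filter, one per frequency bin.
    cf:   bin index of the cogent frequency.
    w:    target gain W(n) at the cogent frequency, in linear units.
    continuous: hypothetical option (not in the text) that adds w to the
                upper segment so it joins the lower segment at cf.
    """
    if not 0 < cf < len(filt) - 1:
        raise ValueError("cf must be an interior bin index")
    g_cf = filt[cf]        # G(filt(CF))
    g_last = filt[-1]      # G(filt(FiltLen))
    if g_cf == 0.0 or g_last == g_cf:
        raise ValueError("reference filter gives a zero denominator")
    g_low = w / g_cf                          # equation (5a)
    g_high = (1.0 - w) / (g_last - g_cf)      # equation (6a)
    low = [c * g_low for c in filt[:cf]]      # equation (5b)
    offset = w if continuous else 0.0
    high = [(c - g_cf) * g_high + offset for c in filt[cf:]]  # equation (6b)
    return low + high
```

For a linear reference filter `[0.0, 0.25, 0.5, 0.75, 1.0]` with `cf=2` and `w=0.2`, the bins below CF are scaled by 0.4 and the normalized bins from CF upward are scaled by 1.6.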

A slow moving average based on a history of both W and CF is calculated for smoother transition between speech and noise part of the input signal. For W, the slow moving average can be expressed as:

Ŵ(n)=β·W(n−1)+(1−β)·W(n) (7)

Where, β is a smoothing coefficient between 0 and 1. In an embodiment, the value of β is set in the range of [0.75, 1). Smoothening of the filter coefficients for CF is calculated as shown in Equation (9) below.

FIG. 4 shows examples of different filter coefficients where CF remains constant at 300 Hz and the gain W changes linearly from 0.1 to 0.9. FIG. 5 shows different values of CF, from 50 Hz to 550 Hz, with W fixed at 0.5. Together, FIG. 4 and FIG. 5 show the changes in W and CF based on a linear reference filter; however, an actual reference filter could be of any shape and length. FIG. 6 shows the linear reference filter change based on different W and CF.

Significantly, the reference filter 60 can be of different frequency ranges and different shapes for different values of Ψ. This helps adapt the adaptive wind noise masking filter 45 to different noise characteristics in real time, based on actual noise conditions in the actual environment where the filter 45 is being used. There can also be more than one gain W as well as more than one CF in order to be able to achieve a smooth filter response, i.e., one with multiple filter steps.

Equation (8) below is a wind noise masking filter response to be applied on the input signal in the frequency domain. The function AdaptiveWin generates the wind noise masking filter based on the values of CF, Ŵ and the reference filter filt, as shown in Equations (5) and (6) above.

Wnm(ω)=AdaptiveWin(CF,Ŵ,filt) (8)

where, Wnm represents wind noise masking filter.

Once the wind noise masking filter coefficients are determined, averaging is performed on each coefficient of the new filter shaped for smooth changes in CF. This helps improve the sound quality and makes it pleasant to hear when transitioning between speech and noise.

Ŵnm(n)=δ·Wnm(n−1)+(1−δ)·Wnm(n) (9)

Where δ is a smoothing coefficient between 0 and 1. In an embodiment, the value of δ is set in the range of [0.75, 1).
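Equation (9) applies the same kind of weighted average to every coefficient of the masking filter. A minimal sketch, with the hypothetical function name `smooth_filter` and an illustrative default δ of 0.9:

```python
def smooth_filter(prev_coeffs, new_coeffs, delta=0.9):
    """Blend previous and current filter coefficients bin by bin (equation (9))."""
    if len(prev_coeffs) != len(new_coeffs):
        raise ValueError("filters must have the same number of coefficients")
    return [delta * p + (1.0 - delta) * c
            for p, c in zip(prev_coeffs, new_coeffs)]
```

With δ close to 1 the filter shape changes gradually from frame to frame, which is what avoids abrupt audible steps when the classification switches.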

In Equation (9), the value of δ is selected to provide different ramp rates between speech-to-noise and noise-to-speech transitions and to be able to adapt more quickly or less quickly from one condition to the other. δ can thus be considered to be a ramp rate, which is a rate at which a speech-to-noise and noise-to-speech transition is made. Masking the noise in the adaptive wind noise masking filter 45 is a simple multiplication 40 of the filter coefficients 58 and input samples received from the FFT calculator 35. That multiplication can be expressed as:

X̂(ω)=Wnm(ω)·X(ω) (10)

where

X(ω)=FFT(x(n)) (11)

and where X̂ is a wind noise suppressed signal in the frequency domain, and ω represents a specific frequency.

A noise-suppressed audio output signal 75 is obtained by computing an inverse Fourier Transform (IFFT) 70 on signals output from the adaptive wind noise masking filter 45, via the multiplier 40. The IFFT output 75 can be expressed as:

x̂(n)=IFFT(X̂(ω)) (12)

Where, x̂ is the wind noise suppressed final output 75 for frame n in the time domain.
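Equations (10) through (12) describe a transform, a per-bin multiplication, and an inverse transform. The sketch below illustrates that chain on one frame using a naive discrete Fourier transform from the standard library in place of an FFT, which is a substitution made only to keep the example self-contained. The function names and the convention of supplying the mask for bins 0 to N/2 and mirroring it (so the output stays real) are assumptions.

```python
import cmath
import math

def dft(x):
    """Naive DFT, standing in for the FFT calculator 35 (equation (11))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, standing in for the IFFT 70 (equation (12))."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def apply_mask(frame, half_mask):
    """Multiply each frequency bin by its mask gain (equation (10)).

    half_mask holds real gains for bins 0..N//2; each negative-frequency bin
    reuses the gain of its mirror bin so the filtered frame remains real.
    """
    n = len(frame)
    if len(half_mask) != n // 2 + 1:
        raise ValueError("half_mask must have N//2 + 1 entries")
    spectrum = dft(frame)
    masked = [spectrum[k] * half_mask[min(k, n - k)] for k in range(n)]
    return [value.real for value in idft(masked)]
```

For an 80-sample frame at 8 kHz the bins are 100 Hz apart, so a mask that is zero below 300 Hz and one elsewhere removes a 200 Hz component while leaving a 1000 Hz component unchanged. A practical implementation would use an FFT library for speed.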

The system depicted in FIG. 1 effectively masks wind noise in audio signals by classifying certain low frequency signals as being wind noise and signals above a particular frequency as being speech and using a recent history of noise characteristics in the signal. The system 10 adapts the noise filtering based on a recent history of input signals 30 (at least one previous sample) to keep the characteristics of the filter 45 changing over time. Tracking the noise characteristics over time helps mask wind noise bursts known as buffeting and enables the system 10 to adapt to different acoustic environments that include, but are not limited to, hands-free microphones, conference rooms or other environments where background noise would otherwise be detectable in an audio signal detected by a microphone.

FIG. 2 is a block diagram of an audio system 100 that forms part of a radio. An embodiment includes a computer, i.e., a central processing unit (CPU) 70 having associated memory 75 that stores program instructions for the CPU 70. Analog output signals from the microphone 25 are converted to a digital form by an analog to digital (A/D) converter 80. The digital signal from the A/D converter 80 is input to and processed by the CPU 70 using the methodology described above. The memory device 75 stores program instructions, which when executed by the CPU 70, cause the CPU 70 to perform the steps described above, including changing characteristics of the adaptive wind noise masking filter according to the detected noise content in an input signal 30. The CPU 70 outputs a digital representation of the corrected digital sound signal to a digital to analog (D/A) converter 90. The analog signal from the D/A converter 90 is input to a loudspeaker 95. An example of the output signal quality improvement is shown in FIGS. 7A and 7B.

FIG. 7A is an oscilloscope trace of an actual audio signal that is input to the adaptive wind noise filter described above. FIG. 7B is an oscilloscope trace of the same signal after it has passed through, i.e., after it has been processed by, the adaptive wind noise filter. Short-duration noise bursts in the input signal shown in FIG. 7A are removed from the output signal shown in FIG. 7B. The output signal is otherwise the same or substantially the same as the input signal.

FIG. 8 shows how characteristics of the adaptive wind noise masking filter change over time, to provide the output signal shown in FIG. 7B from the input signal shown in FIG. 7A. The filter's gain or attenuation is depicted as a vertically-oriented axis, which is orthogonal to two other, mutually orthogonal axes that are labeled “Frequency” and “Seconds.”

In FIG. 7A, the first or left-most noise burst is missing from the output signal shown in FIG. 7B. That first noise burst is suppressed, by adjusting the gain of the filter to suppress the burst.

As shown in FIG. 8, input signal frequencies below about 300 Hz. are attenuated, i.e., have zero gain, just after the initial or starting time shown in the figure. The gain provided to input signals above 300 Hz. however increases linearly.

In FIG. 7A, there is a second noise burst at t=4 seconds. That second noise burst is missing from the output signal shown in FIG. 7B. The second noise burst at t=4 seconds is suppressed by adjusting the gain of the filter to suppress the second noise burst.

In FIG. 8, at t=4 seconds, input signal frequencies below about 300 Hz. are attenuated, i.e., have little or no gain provided to them whereas the low frequency filter gain just prior to and just after t=4 seconds is greater. Reducing or eliminating the amplification of low frequency signals around 4 seconds thus suppresses the noise burst as shown in FIG. 7B.

The last or right-most noise burst shown in FIG. 7A is also missing from the output as shown in FIG. 7B. In FIG. 8, the filter's gain at t=12 seconds is shown as being reduced. The reduced gain at t=12 seconds suppresses the noise burst from the output signal shown in FIG. 7B.

In a preferred embodiment, filter characteristics were chosen to suppress relatively low-frequency signals, i.e., below about 300 Hz, and having relatively short durations, i.e., less than a few hundred milliseconds. Such signals are typically produced by wind gusts passing a microphone. Different filter characteristics can be chosen to suppress signals with different frequencies and different durations. The method and apparatus disclosed herein should therefore not be considered to be limited to filtering only wind noise. By appropriately selecting operating characteristics, the adaptive filter can suppress or amplify high-frequency electrical noise caused by electric arcing, such as spark plug ignition noise. The filter can also be used to suppress or amplify signals within a frequency band.

While a preferred embodiment of the filter attenuates signals, the filter disclosed herein can also apply selective amplification to signals at different frequencies or within user-specified pass bands. Selectively amplifying signals in pass bands can be applied to radar, sonar and two-way radio communications systems.

Those of ordinary skill in the art will appreciate that in an alternate embodiment, the low-pass filtering can instead be a band-pass filter whereby frequency spectrum segments are selectively filtered with the result being a determination of whether noise is present. An example of a band-pass filter would be one that selectively filters audio signals between approximately 100 Hz up to about 300 to 400 Hz.

In an embodiment, the following threshold values were used:

- a. From families of equation (3) SP_ONLY_THR=0.3; NS_SP_THR=0.5 and NS_THR=0.7.
- b. From families of equation (4) a=0.6, b=0.3, G_max=−30 dB, G_min=0 dB, NsFreq=300 Hz, NsSpFreq=250 Hz, SpNsFreq=200 Hz and SpFreq=150 Hz.

In an alternate embodiment, the filtering performed by the low-pass filter 15 or some other filter device is performed by analog circuitry, well-known to those of ordinary skill in the electronic arts. Such filters can be either passive or active.

The wind noise detection circuit 65 can alternatively be implemented using operational amplifiers to compute either a difference or ratio between the power levels of the signal from the filter 15 and the input signal 30. Similarly, the wind noise probability classification 50 can also be implemented using analog operational amplifiers to output signals to an array of active filters that make up an analog version of the adaptive wind noise masking filter 45.

In an analog device environment, the Fast Fourier Transform calculator 35 can be replaced by an array of frequency-selective active filters each of which is configured to selectively amplify segments of the spectrum of the input signal 30.

The foregoing description is for purposes of illustration only. The true scope of the invention is set forth in the appurtenant claims.

Claims

1. A method of suppressing noise in an audio signal comprising: generating a noise probability classification by calculating a smoothened power ratio based on an input signal, wherein the smoothened power ratio represents a noise probability; and adaptively masking noise by: selecting a reference filter based on the noise probability classification, and applying the selected reference filter to the input signal to generate an output signal.
2. The method of claim 1, further comprising the step of using different smoothing coefficients to calculate the smoothened power ratio based on at least one type of a noise-probability-classification transition selected from: a speech-to-noise transition; and a noise-to-speech transition.
3. The method of claim 1, wherein the step of generating a noise probability classification is further comprised of: filtering the input signal to provide a filtered portion of the signal; comparing a relationship between the input signal and the filtered portion of the signal to a plurality of threshold values.
4. The method of claim 1, wherein the wind noise probability classification is identified based on comparison with thresholds.
5. The method of claim 4, wherein the value of each threshold of the plurality of thresholds is predetermined.
6. The method of claim 1, wherein the reference filter is selected from a plurality of reference filters.
7. The method of claim 6, wherein each filter of the plurality of reference filters optionally has one or more corresponding cogent frequencies and one or more corresponding gains.
8. The method of claim 6, further comprising the step of smoothening the one or more corresponding gains of each reference filter.
9. The method of claim 8, further comprising the step of using different values for a smoothing coefficient to calculate a smoothened gain of each filter of the plurality of filters based on at least one type of a noise-probability-classification transition selected from: a speech-to-noise transition; and a noise-to-speech transition.
10. The method of claim 7, further comprising the step of smoothening a frequency response of each reference filter.
11. The method of claim 7, further comprising the step of using different smoothing coefficients to calculate a smoothened cogent frequency of each filter of the plurality of filters based on at least one type of a noise-probability-classification transition selected from: a speech-to-noise transition; and a noise-to-speech transition.
12. The method of claim 11, further comprising the step of selecting a plurality of cogent frequencies for a corresponding number of reference filters.
13. The method of claim 6, further comprising the step of modifying a frequency response of the reference filters based on a selected gain.
14. The method of claim 6, further comprising the step of modifying a frequency response of the reference filters based on one or more gains and one or more cogent frequencies.
15. The method of claim 6, further comprising the step of: separately modifying a frequency response above and below a cogent frequency based on a selected gain.
16. An apparatus comprising: a processor; and a memory device coupled to the processor, the memory device storing program instructions, which when executed by the processor cause the processor to: generate a noise probability classification by calculating a smoothened power ratio based on an input signal, wherein the smoothened power ratio represents a noise probability; and adaptively mask noise by: selecting a reference filter based on the noise probability classification, and applying the selected reference filter to the input signal to generate an output signal.
17. The apparatus of claim 16, wherein the selected reference filter is configured to selectively attenuate signals in a range of frequencies, attenuated signals comprising noise signals.
18. The apparatus of claim 16, further comprising a memory device having program instructions, which when executed by the processor cause the processor to: use different smoothing coefficients to calculate the smoothened power ratio based on at least one type of a noise-probability-classification transition selected from: a speech-to-noise transition; and a noise-to-speech transition.
19. The apparatus of claim 16, further comprising a memory device having program instructions, which when executed, cause the processor to generate a noise probability classification, which is further comprised of: filtering the input signal to provide a filtered portion of the signal; comparing a relationship between the input signal and the filtered portion of the signal to a plurality of threshold values.
20. The apparatus of claim 16, further comprising a memory device having program instructions, which when executed, cause the processor to classify a noise probability based on a comparison with thresholds.
21. The apparatus of claim 16, further comprising a memory device having program instructions, which when executed cause the processor to classify a noise probability based on a comparison of predetermined thresholds.
22. The apparatus of claim 16, further comprising a memory device having program instructions, which when executed cause the processor to select the reference filter from a plurality of reference filters, each filter attenuating signals in a range of frequencies differently.
23. The apparatus of claim 22, wherein each reference filter has at least one cogent frequency.
24. The apparatus of claim 23, wherein each reference filter has a predetermined frequency response above and below the reference filter's cogent frequency based on a selected gain.
25. A signal filter apparatus comprising: a filter receiving an input signal; a noise detector configured to receive a filter input signal from the filter; a noise classifier receiving indications of noise in the input signal from the noise detector; an adaptive wind noise masking filter, receiving noise classifications from the noise classifier; a fast Fourier transform (FFT) calculator configured to provide FFT representations of the input signal; and a multiplier configured to provide a multiplication of FFT representations of the input signal by signals output from the adaptive wind noise masking filter.
26. The signal filter apparatus of claim 23, further comprising a plurality of reference filters coupled to the adaptive wind noise masking filter, each reference filter of the plurality of reference filters having different signal filtering characteristics, the signal filtering characteristics of a selected one of the reference filters being provided to the adaptive wind noise masking filter.
27. The signal filter apparatus of claim 24, wherein said apparatus is configured to suppress wind noise from an audio signal.
