Wind noise is a serious problem during telephone conversations that take place outside, in a moving vehicle, or in an otherwise windy environment. Wind noise can make the listener on the far end of a telephone conversation unable to hear or understand the caller's voice.
Wind speed and direction change constantly, and as a result wind noise is very difficult to eliminate from telephone conversations. Conventional wind and/or noise cancelling methods and apparatuses are ineffective. The invention provides an effective method and/or apparatus for masking or eliminating wind noise from a telephone conversation while maintaining audible speech. A method and apparatus for masking, removing or suppressing wind noise would be an improvement over the prior art.
An embodiment comprises a low-pass filter 15, which receives audio signals 30, such as those output from a conventional microphone 25. In the preferred embodiment, the low-pass filter 15 is a digital filter, embodied as various computer program routines that process digital representations of the audio signal 30 from the microphone 25.
As shown in the figure, the analog audio signals 30 are input to a Fast Fourier Transform (FFT) calculator 35, implemented using program instructions. The output of the FFT calculator is input to a multiplier 40, also implemented using program instructions. The multiplier 40 multiplies the output of the Fast Fourier Transform calculator 35 by the output of an adaptive wind noise masking filter 45.
The adaptive wind noise masking filter 45 receives information from a wind noise probability classification block 50 and processes appropriate reference filters 60 to generate a target filter to apply to the output of the FFT calculator 35. The wind noise probability classification block 50 generates an output that indicates whether the signal 30 from the microphone 25 is likely to contain noise, speech, or a combination of speech and noise. The wind noise probability classification is derived from information obtained from a wind noise detector 65.
Digital signals representing a wind noise-suppressed version of the audio from the microphone 25 are output from the multiplier 40 when a decision is made that the audio 30 from the microphone 25 is likely to have wind noise. The output of the adaptive wind noise masking filter 45 is therefore a set of frequency-domain wind noise masking filter coefficients 58, which is input to the multiplier 40. The output of the multiplier 40 is input to an inverse Fast Fourier Transform (IFFT) circuit 70, the output 75 of which is a noise-reduced copy of the speech input into the microphone 25.
In an embodiment, wind noise detection is performed by comparing the low-pass filtered signal to the audio input signal 30. In the embodiment, the low-pass filter has a cutoff frequency of 150 Hz. The comparison is computed as a ratio of the power levels of the two signals. In the embodiment, which uses the ratio of the low-pass filtered signal power Pt to the total power of the input signal PT, the comparison is the ratio expressed in equation (1) below.
ρ(n)=Pt(n)/PT(n) (1)
where ρ(n) is the power ratio for a given input frame n. In an embodiment, a frame is 10 ms long.
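The power ratio of equation (1) can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the one-pole filter standing in for the low-pass filter 15, the 8 kHz sample rate, and all function names are assumptions made only for this example. The 150 Hz cutoff and 10 ms frame come from the description above.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed sample rate; not stated in the description
FRAME_LEN = 80          # 10 ms frame at the assumed 8 kHz rate
CUTOFF_HZ = 150.0       # cutoff frequency given for the embodiment


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """One-pole IIR low-pass filter (an assumed stand-in for filter 15)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)          # y[k] = (1 - a) * y[k-1] + a * x[k]
        out.append(y)
    return out


def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def power_ratio(lowpassed_frame, input_frame):
    """rho(n): low-pass filtered power divided by total input power."""
    total = frame_power(input_frame)
    if total == 0.0:
        return 0.0                # silent frame: avoid division by zero
    return frame_power(lowpassed_frame) / total


def tone(freq_hz, n):
    """Test signal: a unit-amplitude sine wave of n samples."""
    return [math.sin(2.0 * math.pi * freq_hz * k / SAMPLE_RATE_HZ)
            for k in range(n)]


def last_frame_ratio(signal):
    """Power ratio of the final frame, after the filter has settled."""
    low = one_pole_lowpass(signal, CUTOFF_HZ, SAMPLE_RATE_HZ)
    return power_ratio(low[-FRAME_LEN:], signal[-FRAME_LEN:])
```

With these assumptions, a 50 Hz tone (wind-like, below the cutoff) yields a ratio close to 1, while a 2 kHz tone yields a ratio close to 0.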
A wind noise probability classification 50 is calculated by using a “smoothened power ratio.” The smoothened power ratio is expressed by equation (2) below:
ξ(n)=α·ξ(n−1)+(1−α)·ρ(n) (2)
where α is a smoothing coefficient between 0 and 1. Its value is a design choice that determines how much emphasis is placed on one or more historical values of ξ. In an embodiment, α is set in the range [0.75, 1), where the bracket “[” indicates that the adjacent value is included in the range and the parenthesis “)” indicates that the adjacent value is excluded, i.e., the value 1 is not in the range but all lesser values down to and including 0.75 are.
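As a brief illustration of equation (2), the recursion can be sketched in Python as follows. The function names, the default α of 0.9, and the initial value ξ=0 before the first frame are assumptions for the example; the description only constrains α to the range [0.75, 1) in an embodiment.

```python
def smooth_power_ratio(prev_xi, rho, alpha=0.9):
    """One step of equation (2): xi(n) = alpha*xi(n-1) + (1-alpha)*rho(n)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return alpha * prev_xi + (1.0 - alpha) * rho


def smooth_sequence(ratios, alpha=0.9, initial_xi=0.0):
    """Apply equation (2) frame by frame; initial_xi is an assumed start value."""
    xi = initial_xi
    smoothed = []
    for rho in ratios:
        xi = smooth_power_ratio(xi, rho, alpha)
        smoothed.append(xi)
    return smoothed
```

A larger α weights the history more heavily, so ξ responds more slowly to a sudden change in ρ.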
In Equation (2), the value of ξ(n) defines the probability of speech or noise in the input signal. It can be seen in Equation (2) that the speech or noise probability determination uses a current sample, represented by the term (1−α)·ρ(n), and at least one previously obtained sample or “history” of the signal, represented by the term α·ξ(n−1). In the embodiment, the following speech and noise classifications are obtained by comparing numeric values of ξ obtained from Equation (2) with user-defined numeric thresholds:
where,
SP_ONLY_THR is a threshold for speech classification;
NS_SP_THR is an intermediate threshold for identifying high probability of speech or wind noise;
NS_THR is a high threshold for wind noise classification; and
Ψ is a wind noise probability classification.
There could be more classifications of Ψ than are shown in the family of Equations (3), e.g., “More speech”, “More wind noise”, “Equal speech and wind noise” etc., in order to maintain a smoother transition between wind noise and speech.
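The threshold comparison can be sketched in Python under stated assumptions. The family of equations (3) is not reproduced here, so the direction of each comparison, the handling of the boundaries, and the numeric threshold values below are all hypothetical. The sketch assumes only that a higher smoothed low-band power ratio ξ indicates more wind noise, and it uses the four class names that appear elsewhere in this description.

```python
# Hypothetical threshold values, chosen only for illustration.
SP_ONLY_THR = 0.2   # below this, classify as speech only
NS_SP_THR = 0.5     # intermediate threshold between mostly speech and mostly noise
NS_THR = 0.8        # at or above this, classify as wind noise


def classify(xi, sp_only_thr=SP_ONLY_THR, ns_sp_thr=NS_SP_THR, ns_thr=NS_THR):
    """Map a smoothed power ratio xi to an assumed classification Psi."""
    if not sp_only_thr <= ns_sp_thr <= ns_thr:
        raise ValueError("thresholds must be in ascending order")
    if xi >= ns_thr:
        return "Noise"
    if xi >= ns_sp_thr:
        return "Mostly noise"
    if xi >= sp_only_thr:
        return "Mostly speech"
    return "Speech"
```

Additional intermediate classes could be added in the same way by inserting further thresholds between the existing ones.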
The thresholds defined in the family of equations (3) are used to determine characteristics of a primary adaptive masking filter 45. The characteristics of a primary adaptive masking filter 45 are compared to at least one reference filter 60 and thereafter selected to allow appropriate suppression and/or amplification of noise and/or speech in the audio signal. Example frequency responses of reference filters 60 are shown in
The adaptive wind noise masking filter 45 derives a cogent (i.e., pertinent or relevant) frequency (CF) and a gain W for the CF by evaluating the wind noise probability classification Ψ received from the wind noise probability classification block 50. In an embodiment, the CF and W of the filter 45 for the frame n are determined by the following family of equations (4):
where,
a and b are scaling parameters; and 0≦b≦a≦1.
Gmax and Gmin are maximum attenuation and minimum attenuation applied to the signal respectively;
NsFreq, NsSpFreq, SpNsFreq and SpFreq are predetermined CFs for the “Noise”, “Mostly noise”, “Mostly speech”, and “Speech” classifications, respectively, from the family of equations (3) set forth above.
Values of a, b, Gmax, Gmin, NsFreq, NsSpFreq, SpNsFreq and SpFreq are determined experimentally a priori, in order to optimize noise suppression from the input signal. After the cogent frequency (CF) and target gain (W) are determined from the family of equations (4) set forth above, amplification or attenuation factors Glow and Ghigh are calculated as shown in equations (5a) and (6a) respectively. The factor Glow is applied to the frequencies below CF as shown in equation (5b), and Ghigh is applied to the frequencies above CF as shown in equation (6b).
where filt is a filter chosen from the reference filters 60;
filt(0:CF-1) are the filter coefficients of the chosen reference filter up to CF-1;
G(filt(CF)) is the current gain value on the chosen reference filter at CF;
Glow is the calculated gain applied to the reference filter coefficients below CF as shown in equation (5b).
Where, filt(CF:FiltLen) are the filter coefficients of the reference filter from CF to the last frequency (FiltLen) of the filter;
G(filt(FiltLen)) and G(filt(CF)) are the current gains of the reference filter coefficients at the last frequency (FiltLen) of the reference filter and at the CF respectively;
Ghigh is the calculated new gain applied to the normalized filter coefficients of the reference filter (filt) above CF as shown in equation (6b).
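The way Glow and Ghigh are applied on either side of CF can be sketched as follows. This sketch takes both gains as inputs because equations (5a) and (6a) are not reproduced here. It also assumes that CF is expressed as a coefficient index and that the reference filter is a plain list of real coefficients, and it omits the normalization mentioned for the coefficients above CF. The function name is hypothetical.

```python
def apply_band_gains(filt, cf, g_low, g_high):
    """Scale reference filter coefficients below and from CF onward.

    filt   -- reference filter coefficients, indices 0 .. FiltLen
    cf     -- cogent frequency as a coefficient index (assumption)
    g_low  -- gain applied to filt[0 : cf], i.e. indices 0 .. CF-1
    g_high -- gain applied to filt[cf :], i.e. indices CF .. FiltLen
    """
    if not 0 <= cf <= len(filt):
        raise ValueError("cf must be a valid index into filt")
    below = [c * g_low for c in filt[:cf]]
    above = [c * g_high for c in filt[cf:]]
    return below + above
```

For example, a flat four-coefficient reference filter with CF at index 2, g_low of 0.5 and g_high of 2.0 becomes a two-step response that attenuates the lower half and amplifies the upper half.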
Adjusting the CF of the filter 45 based on Glow and Ghigh in response to historical characteristics of noise in a signal effectively changes the shape of the pass band of the filter 45, in real time, in response to changing noise levels in the signal 30 from the microphone 25 audio source. The shape of the band pass characteristic of the filter 45 is therefore adjusted empirically in real time, i.e., based on observations of noise characteristics, such that the filter 45 attenuates noise signals on the input signal 30 by reducing the amplitude of the signals in a particular frequency spectrum range that are received from the Fast Fourier Transform calculator 35. Stated another way, the adaptive wind noise masking filter 45 generates filter coefficients to selectively attenuate different frequency ranges to suppress wind noise content in signals received from the Fast Fourier Transform calculator 35. The adaptive wind noise-masking filter 45 therefore effectively extracts speech signals from the input signal 30. Different frequency ranges are attenuated by determining coefficients of the FFT calculator output.
A slow moving average based on a history of both W and CF is calculated for a smoother transition between the speech and noise parts of the input signal. For W, the slow moving average can be expressed as:
Ŵ(n)=β·W(n−1)+(1−β)·W(n) (7)
where β is a smoothing coefficient between 0 and 1. In an embodiment, the value of β is set in the range [0.75, 1). Smoothing of the filter coefficients for CF is calculated as shown in Equation (9) below.
Significantly, the reference filter 60 can be of different frequency ranges and different shapes for different values of Ψ. This helps adapt the adaptive wind noise masking filter 45 to different noise characteristics in real time, based on actual noise conditions in the actual environment where the filter 45 is being used. There can also be more than one gain W as well as more than one CF in order to achieve a smooth filter response, i.e., one with multiple filter steps.
Equation (8) below is the wind noise masking filter response to be applied to the input signal in the frequency domain. The function AdaptiveWin generates the wind noise masking filter based on the values of CF, Ŵ and the reference filter filt, as shown in Equations (5) and (6) above.
Wnm(ω)=AdaptiveWin(CF,Ŵ,filt) (8)
where Wnm represents the wind noise masking filter.
Once the wind noise masking filter coefficients are determined, averaging is performed on each coefficient of the new filter shaped for smooth changes in CF. This helps improve the sound quality and makes it pleasant to hear when transitioning between speech and noise.
Ŵnm(n)=δ·Wnm(n−1)+(1−δ)·Wnm(n) (9)
where δ is a smoothing coefficient between 0 and 1. In an embodiment, the value of δ is set in the range [0.75, 1).
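The per-coefficient averaging of equation (9) can be sketched in Python as below. The sketch follows the equation literally, blending the previous frame's masking filter with the current one. The function name and the default δ of 0.9 are assumptions for illustration.

```python
def smooth_filter(prev_wnm, curr_wnm, delta=0.9):
    """Equation (9) applied to every coefficient of the masking filter.

    prev_wnm -- masking filter coefficients Wnm(n-1) of the previous frame
    curr_wnm -- masking filter coefficients Wnm(n) of the current frame
    delta    -- smoothing coefficient in [0, 1); 0.9 is an assumed default
    """
    if len(prev_wnm) != len(curr_wnm):
        raise ValueError("filters must have the same length")
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    return [delta * p + (1.0 - delta) * c
            for p, c in zip(prev_wnm, curr_wnm)]
```

With δ=0.75, a coefficient that drops from 1.0 to 0.0 between frames is applied as 0.75 in the current frame, so the change is heard as a ramp and not as an abrupt step.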
In Equation (9), the value of δ is selected to provide different ramp rates between speech-to-noise and noise-to-speech transitions and to be able to adapt more quickly or less quickly from one condition to the other. δ can thus be considered to be a ramp rate, which is a rate at which a speech-to-noise and noise-to-speech transition is made. Masking the noise in the adaptive wind noise masking filter 45 is a simple multiplication 40 of the filter coefficients 58 and input samples received from the FFT calculator 35. That multiplication can be expressed as:
X̂(ω)=Wnm(ω)·X(ω) (10)
where
X(ω)=FFT(x(n)) (11)
and where X̂ is a wind noise-suppressed signal in the frequency domain, and ω represents a specific frequency.
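Equations (10) and (11) amount to a bin-by-bin multiplication in the frequency domain. The following Python sketch illustrates this with a naive discrete Fourier transform standing in for the FFT calculator 35, which is an assumption made to keep the example dependency-free. An inverse transform is included only so the effect of the mask can be checked in the time domain. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for the FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n)
                for k in range(n))
            for m in range(n)]


def idft(spectrum):
    """Naive inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n)
                for m in range(n)).real / n
            for k in range(n)]


def mask_frame(x, wnm):
    """Multiply the spectrum of frame x by the masking filter wnm.

    wnm must hold one real gain per frequency bin and should be symmetric
    (wnm[m] == wnm[N - m]) so that the output stays real-valued.
    """
    if len(x) != len(wnm):
        raise ValueError("mask must have one gain per frequency bin")
    spectrum = dft(x)                                    # X = FFT(x(n))
    masked = [w * v for w, v in zip(wnm, spectrum)]      # X_hat = Wnm * X
    return idft(masked)
```

A mask of all ones returns the frame unchanged, while a mask that zeroes only bin 0 removes the frame's mean value, which is the simplest case of attenuating low-frequency content.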
A noise-suppressed audio output signal 75 is obtained by computing an inverse Fast Fourier Transform (IFFT) 70 on signals output from the adaptive wind noise masking filter 45, via the multiplier 40. The IFFT output 75 can be expressed as:
x̂(n)=IFFT(X̂(ω)) (12)
where x̂(n) is the wind noise-suppressed signal in the time domain.
In a preferred embodiment, filter characteristics were chosen to suppress relatively low-frequency signals, i.e., below about 300 Hz, and having relatively short durations, i.e., less than a few hundred milliseconds. Such signals are typically produced by wind gusts passing a microphone. Different filter characteristics can be chosen to suppress signals with different frequencies and different durations. The method and apparatus disclosed herein should therefore not be considered to be limited to filtering only wind noise. By appropriately selecting operating characteristics, the adaptive filter can suppress or amplify high-frequency electrical noise caused by electric arcing, such as spark plug ignition noise. The filter can also be used to suppress or amplify signals within a frequency band.
While a preferred embodiment of the filter attenuates signals, the filter disclosed herein can also apply selective amplification to signals at different frequencies or within user-specified pass bands. Selectively amplifying signals in pass bands can be applied to radar, sonar and two-way radio communications systems.
Those of ordinary skill in the art will appreciate that in an alternate embodiment, the low-pass filter can instead be a band-pass filter, whereby frequency spectrum segments are selectively filtered to determine whether noise is present. An example of a band-pass filter would be one that selectively passes audio signals between approximately 100 Hz and about 300 to 400 Hz.
In an embodiment, the following threshold values were used:
In an alternate embodiment, the filtering performed by the low-pass filter 15 or some other filter device is performed by analog circuitry, well-known to those of ordinary skill in the electronic arts. Such filters can be either passive or active.
The wind noise detection circuit 65 can alternatively be implemented using operational amplifiers to compute either a difference or a ratio between the power level of the signal from the filter 15 and that of the input signal 30. Similarly, the wind noise probability classification 50 can also be implemented using analog operational amplifiers to output signals to an array of active filters that make up an analog version of the adaptive wind noise masking filter 45.
In an analog device environment, the Fast Fourier Transform calculator 35 can be replaced by an array of frequency-selective active filters each of which is configured to selectively amplify segments of the spectrum of the input signal 30.
The foregoing description is for purposes of illustration only. The true scope of the invention is set forth in the appurtenant claims.