This application claims the benefit of Australian Provisional Patent Application No. 2014902057 filed 29 May 2014, which is incorporated herein by reference.
The present invention relates to the digital processing of signals from microphones or other such transducers, and in particular relates to a device and method for mixing multiple such signals in order to reduce wind noise.
Processing signals from microphones in consumer electronic devices such as smartphones, hearing aids, headsets and the like presents a range of design problems. There are usually multiple microphones to consider, including one or more microphones on the body of the device and one or more external microphones such as headset or hands-free car kit microphones. In smartphones these microphones can be used not only to capture speech for phone calls, but also for recording voice notes. In the case of devices with a camera, one or more microphones may be used to enable recording of an audio track to accompany video captured by the camera. Increasingly, more than one microphone is being provided on the body of the device, for example to improve noise cancellation as is addressed in GB2484722 (Wolfson Microelectronics).
The device hardware associated with the microphones should provide for sufficient microphone inputs, preferably with individually adjustable gains, and flexible internal routing to cover all usage scenarios, which can be numerous in the case of a smartphone with an applications processor. Telephony functions should include a “side tone” so that the user can hear their own voice, and acoustic echo cancellation. Jack insertion detection should be provided to enable seamless switching between internal and external microphones when a headset or external microphone is plugged in or disconnected.
Consequently, a range of digital signal processing applications involve the mixing of signals from multiple microphones, whether across the full audio band or in selected frequency subbands. Adaptive directional beamforming is one such application, and involves the signals from two or more microphones being mixed in a manner to maintain gain in a direction of interest (typically being the forward direction of the listener), while adaptively nulling background noise from other directions, such as conversations happening behind the listener. Adaptive directional beamforming works to null signals coming from a particular direction such as background speech, and in particular this approach only works on such correlated signals.
However wind noise detection and reduction is a particularly difficult problem in such devices. Wind noise is defined herein as a microphone signal generated from turbulence in an air stream flowing past microphone ports, as opposed to the sound of wind blowing past other objects such as the sound of rustling leaves as wind blows past a tree in the far field. Wind noise can be objectionable to the user, can mask other signals of interest, and can corrupt the device's ability to suppress background noise sources by beamforming. It is desirable that digital signal processing devices are configured to take steps to ameliorate the deleterious effects of wind noise upon signal quality. However, when wind noise is present, existing devices simply revert adaptive directional beamforming to an omnidirectional state by use of a primary microphone only. This is because the beamforming function cannot identify and thus cannot null a direction of origin of wind noise because wind noise is uncorrelated between microphones. Instead, disadvantageously, beamforming functions are usually corrupted by wind noise and respond inappropriately by actually amplifying uncorrelated noise such as wind noise. It is for this reason that existing devices tend to simply disable beamforming in the presence of wind noise and revert to a primary microphone and omnidirectional operation.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
In this specification, a statement that an element may be “at least one of” a list of options is to be understood that the element may be any one of the listed options, or may be any combination of two or more of the listed options.
According to a first aspect the present invention provides a method of wind noise reduction, the method comprising:
obtaining a first microphone signal from a first omnidirectional microphone;
contemporaneously obtaining a second microphone signal from a second omnidirectional microphone; and
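mixing the first and second microphone signals to produce an output signal, by:
applying a first signal weight to the first microphone signal;
applying a second signal weight to the second microphone signal; and
summing the weighted first and second microphone signals together,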
wherein the first and second signal weights are calculated to minimize the power of the output signal.
According to a second aspect the present invention provides a device for wind noise reduction, the device comprising:
a first omnidirectional microphone and a second omnidirectional microphone;
a processor for calculating first and second signal weights in a manner to minimize the power of an output signal;
a first multiplication block configured to apply the first signal weight to a first microphone signal from the first omnidirectional microphone, and a second multiplication block configured to apply the second signal weight to a second microphone signal from the second omnidirectional microphone; and
a summation block configured to sum the weighted first and second microphone signals together to produce the output signal.
In some embodiments, the first signal weight may be denoted by a, wherein a takes a value in the range of 0 to 1, inclusive. In such embodiments, the second signal weight may be defined to be (1−a). The first signal weight may be calculated by the processor as follows:
a=(Σy²−Σxy)/(Σx²−2Σxy+Σy²)  (1)
where:
x=signal sample of the first microphone signal, and
y=signal sample of the second microphone signal.
Alternative embodiments may apply equation (1) in a modified form for example with scalar coefficients not equal to 1 applied to any one or more of the terms.
A weight may be calculated for a frame of predetermined length consisting of N first signal samples and N second signal samples. The length of the frame (N) generally depends upon the environment of application of the method, however a suitable frame length for audio frequency signals is 32 or 64 samples long. The weighting factor calculated by use of equation (1) alone may change significantly from frame to frame, so in some preferred embodiments the series of weight values determined for a may be filtered or smoothed to minimize frame to frame variation in the weight which may otherwise be heard as audible artifacts.
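By way of illustration only, the per-frame calculation may be sketched as below. The sketch assumes the closed form that follows from minimizing Σ(a·x+(1−a)·y)² over one frame. The function name `min_power_weight`, the `eps` guard and the fallback value of 0.5 for identical frames are hypothetical choices made for the sketch and are not taken from the specification.

```python
def min_power_weight(x, y, eps=1e-12):
    """Weight a in [0, 1] minimizing sum((a*x + (1-a)*y)**2) over one frame."""
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(p * q for p, q in zip(x, y))
    denom = sxx - 2.0 * sxy + syy  # equals sum((x - y)**2), so it is never negative
    if denom < eps:
        # The two frames are (nearly) identical, so every mix has the same power.
        return 0.5
    a = (syy - sxy) / denom
    return min(1.0, max(0.0, a))  # constrain to the range 0 to 1, inclusive


def frame_weights(first, second, frame_len=32):
    """One weight per frame of frame_len samples (32 or 64 are suggested lengths)."""
    return [
        min_power_weight(first[i:i + frame_len], second[i:i + frame_len])
        for i in range(0, len(first), frame_len)
    ]
```

When the second microphone carries no energy in a frame the weight is 0, so the output is taken wholly from the second microphone; when both microphones carry uncorrelated noise of equal power the weight is 0.5.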
In another embodiment weights are calculated continuously for each first signal sample and second signal sample. This is achieved by calculating x², y² and xy for each sample and adding them to a respective appropriate running sum. A leaky integrator (an integrator having a feedback coefficient slightly less than one) can be used to perform the running sum in order to prevent overflows and to ensure that the system's ‘memory’ is not too long. Such embodiments allow a new weighting factor to be calculated every time that a new sample is available, rather than having to wait for a whole frame of samples.
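The sample-by-sample variant may be sketched as follows, purely as an illustration. The class name `RunningWeight`, the feedback coefficient of 0.99 and the fallback value of 0.5 are assumptions made for the sketch; the specification states only that the coefficient is slightly less than one.

```python
class RunningWeight:
    """Per-sample weight estimate using leaky running sums of x*x, y*y and x*y."""

    def __init__(self, leak=0.99):
        self.leak = leak  # feedback coefficient, slightly less than one (assumed value)
        self.sxx = 0.0
        self.syy = 0.0
        self.sxy = 0.0

    def update(self, x, y):
        """Add one sample pair to the running sums and return the current weight."""
        self.sxx = self.leak * self.sxx + x * x
        self.syy = self.leak * self.syy + y * y
        self.sxy = self.leak * self.sxy + x * y
        denom = self.sxx - 2.0 * self.sxy + self.syy
        if denom <= 1e-12:
            return 0.5  # no usable difference between the signals yet
        a = (self.syy - self.sxy) / denom
        return min(1.0, max(0.0, a))
```

A smaller feedback coefficient shortens the memory of the running sums, so the weight tracks changes in the wind more quickly at the cost of more sample-to-sample variation.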
In another embodiment, the first and second signals (i.e. the variables x and y in the form described above) can be frequency domain samples rather than time domain samples. In this case the optimization of the weighting factor a_i can be calculated as above for each subband i, but with the added advantage that the weighting factor can be calculated and applied on a subband-by-subband basis, giving different mixing ratios at different frequencies. Also, if some frequencies are deemed to be more important for wind noise suppression than other frequencies, they can be given a higher weighting, for example by calculating the weighting factor a in respect of such frequencies before applying a for mixing across the entire audio band, and/or by performing mixing only in the important subbands. In embodiments utilizing complex inputs such as those in the DFT domain, the weighting factor may be calculated as being:
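One possible per-subband calculation for complex inputs is sketched below. It is an assumption of this sketch, not a statement of the specification, that the weight in each subband is real-valued and minimizes Σ|a·X+(1−a)·Y|² over a short run of DFT frames, which gives a=(Σ|Y|²−Re ΣX*Y)/Σ|X−Y|². The function names and the list-of-frames input layout are hypothetical.

```python
def subband_weights(x_frames, y_frames, eps=1e-12):
    """Real weight per subband from complex DFT frames (lists of lists of complex)."""
    n_bins = len(x_frames[0])
    weights = []
    for i in range(n_bins):
        num = 0.0
        den = 0.0
        for xf, yf in zip(x_frames, y_frames):
            diff = xf[i] - yf[i]
            # Re(Y * conj(Y - X)) = |Y|^2 - Re(conj(X) * Y)
            num += (yf[i] * (yf[i] - xf[i]).conjugate()).real
            den += abs(diff) ** 2
        a = 0.5 if den < eps else num / den
        weights.append(min(1.0, max(0.0, a)))
    return weights


def mix_subbands(x_frame, y_frame, weights):
    """Apply a different mixing ratio in each subband of one DFT frame."""
    return [a * x + (1.0 - a) * y for a, x, y in zip(weights, x_frame, y_frame)]
```

Each subband is mixed independently, so a subband in which only one microphone is affected by wind can be taken almost entirely from the other microphone while the remaining subbands are left close to an even mix.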
The present invention is also applicable to signals produced from more than two microphones. In such embodiments, the processor is configured to calculate the required number of signal weights in a manner to minimize the power of the output signal. For example, when a signal z from a third omnidirectional microphone is obtained, the output signal Y may be calculated as follows:
Y=a*primary_mic+b*secondary_mic+(1−a−b)*tertiary_mic
where
Other embodiments of the present invention may mix four or more microphone signals in a corresponding manner.
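As an illustrative sketch only, the two weights for three microphones can be found by writing the output as t+a·(p−t)+b·(s−t), where p, s and t are the primary, secondary and tertiary samples, and solving the resulting pair of linear equations. The function name, the use of Cramer's rule and the fallback to equal weights are assumptions of the sketch. The weights are left unconstrained here; whether and how they are limited in practice is not addressed by the sketch.

```python
def three_mic_weights(p, s, t, eps=1e-12):
    """Weights (a, b) minimizing sum((a*p + b*s + (1-a-b)*t)**2) over one frame."""
    u = [pi - ti for pi, ti in zip(p, t)]  # primary minus tertiary
    v = [si - ti for si, ti in zip(s, t)]  # secondary minus tertiary
    suu = sum(ui * ui for ui in u)
    svv = sum(vi * vi for vi in v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    stu = sum(ti * ui for ti, ui in zip(t, u))
    stv = sum(ti * vi for ti, vi in zip(t, v))
    # Normal equations: a*suu + b*suv = -stu and a*suv + b*svv = -stv
    det = suu * svv - suv * suv
    if abs(det) < eps:
        return (1.0 / 3.0, 1.0 / 3.0)  # degenerate frame: fall back to equal weights
    a = (-stu * svv + stv * suv) / det
    b = (-stv * suu + stu * suv) / det
    return (a, b)


def mix_three(p, s, t, a, b):
    """Y = a*primary_mic + b*secondary_mic + (1-a-b)*tertiary_mic."""
    c = 1.0 - a - b
    return [a * pi + b * si + c * ti for pi, si, ti in zip(p, s, t)]
```

For three uncorrelated signals of equal power the solution is a=b=1/3, that is, an even mix of all three microphones.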
In some embodiments, prior to mixing, the first and second microphone signals are matched for a level of a signal of interest, such as speech. In some embodiments, prior to mixing, the first and second microphone signals may be matched for phase.
In some embodiments the method of the present invention may be activated only at times when a wind noise detector indicates that wind noise is present. The wind noise detector may be implemented in the manner set out in International Patent Application No. PCT/AU2012/001596 by Wolfson Dynamic Hearing Pty Ltd, published as WO2013/091021, the content of which is incorporated herein by reference. The method of the present invention may in some embodiments be discontinued at times when a wind noise detector indicates that wind noise is not present.
In some embodiments involving stereo audio channels, the method of the present invention may be utilized to produce from a plurality of left-side microphones a wind-noise-reduced left side output signal, and may further be utilized to produce from a plurality of right-side microphones a wind-noise-reduced right side output signal. The wind-noise-reduced left and right side signals may then be used for further stereo processing. The present invention may similarly be applied in multi-channel environments such as 5.1 surround sound environments to produce a wind-noise reduced signal for each channel.
An example of the invention will now be described with reference to the accompanying drawings, in which:
In the present embodiment, the weight a is calculated by the processor 220 as follows:
a=(Σy²−Σxy)/(Σx²−2Σxy+Σy²)  (1)
where:
x=signal sample of the first microphone signal, and
y=signal sample of the second microphone signal.
The derivation of the above formula is found by using the constraint that the total power of the output wind-noise-reduced signal is to be minimized. It is noted that:
Energy=Σ(ax(t)+(1−a)y(t))²
Thus, differentiating with respect to a to find the point of minimum energy gives:
2Σ(ax(t)+(1−a)y(t))(x(t)−y(t))=0
Solving for a gives:
a=(Σy(t)²−Σx(t)y(t))/(Σx(t)²−2Σx(t)y(t)+Σy(t)²)
To implement this requirement, the primary mic and secondary mic signals are buffered and the buffer signals are used as the inputs to the optimization algorithm. The algorithm outputs the mixing coefficient ‘a’ within a range of 0 and 1, inclusive. The value of a is then smoothed with a leaky integrator and constrained to the range between 0 and 1, inclusive.
The output signal produced at 240 is thus:
output=a*primary_mic+(1−a)*secondary_mic
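The buffering, optimization, smoothing and mixing steps described above may be sketched end to end as follows. This is a minimal illustration under stated assumptions: the frame length of 32, the smoothing coefficient of 0.8, the initial weight of 0.5 and the normalized form of the leaky integrator are hypothetical choices, and the function name `mix_frames` does not appear in the specification.

```python
def mix_frames(primary, secondary, frame_len=32, leak=0.8):
    """Frame-wise minimum-power mix of two microphone signals."""
    out = []
    a_smooth = 0.5  # assumed starting point: an even mix
    for start in range(0, len(primary), frame_len):
        x = primary[start:start + frame_len]    # buffered primary mic frame
        y = secondary[start:start + frame_len]  # buffered secondary mic frame
        sxx = sum(v * v for v in x)
        syy = sum(v * v for v in y)
        sxy = sum(p * q for p, q in zip(x, y))
        denom = sxx - 2.0 * sxy + syy
        # Hold the previous weight when the frames are indistinguishable.
        a_raw = a_smooth if denom < 1e-12 else (syy - sxy) / denom
        a_raw = min(1.0, max(0.0, a_raw))
        # Leaky-integrator smoothing, then constrain to 0..1 again.
        a_smooth = leak * a_smooth + (1.0 - leak) * a_raw
        a_smooth = min(1.0, max(0.0, a_smooth))
        # output = a*primary_mic + (1-a)*secondary_mic
        out.extend(a_smooth * p + (1.0 - a_smooth) * q for p, q in zip(x, y))
    return out
```

With a steady disturbance on the primary microphone only, the smoothed weight decays from frame to frame towards 0, so the output moves gradually rather than abruptly onto the secondary microphone.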
If we assume the microphone signals are not correlated in wind, so that Σxy is approximately zero, the equation can be simplified as
a=Σy²/(Σx²+Σy²)
However this simplified equation is less optimal if speech is present during wind.
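The effect can be illustrated numerically with a sketch in which both microphones carry the same speech component plus uncorrelated wind components of different power. The test signals are synthetic sinusoids chosen only so that the components are mutually orthogonal over the frame; they are assumptions of the sketch and do not represent measured data.

```python
import math

N = 64
speech = [math.sin(2 * math.pi * 3 * n / N) for n in range(N)]       # common to both mics
wind1 = [2.0 * math.sin(2 * math.pi * 7 * n / N) for n in range(N)]  # strong wind on mic 1
wind2 = [0.5 * math.cos(2 * math.pi * 11 * n / N) for n in range(N)]  # weak wind on mic 2
x = [s + w for s, w in zip(speech, wind1)]
y = [s + w for s, w in zip(speech, wind2)]

sxx = sum(v * v for v in x)
syy = sum(v * v for v in y)
sxy = sum(p * q for p, q in zip(x, y))

a_full = (syy - sxy) / (sxx - 2.0 * sxy + syy)  # keeps the cross term
a_simple = syy / (sxx + syy)                    # assumes the cross term is zero


def output_power(a):
    """Power of a*x + (1-a)*y over the frame."""
    return sum((a * p + (1.0 - a) * q) ** 2 for p, q in zip(x, y))


power_full = output_power(a_full)
power_simple = output_power(a_simple)
```

Because the speech is common to both microphones, the cross term Σxy is not zero, and the simplified weight is pulled towards an even mix. The full form leans further onto the quieter microphone and yields the lower output power, while the speech component itself is unchanged by either weight.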
The present invention can in other embodiments be extended to producing a wind-noise-reduced output from 3 or more microphone inputs. For three microphones, where z is the input from the tertiary microphone:
Y=a*primary_mic+b*secondary_mic+(1−a−b)*tertiary_mic
In one embodiment for reducing wind noise, involving the use of three input microphone signals:
In another embodiment for reducing wind noise, involving the use of three input microphone signals, the primary mic input and secondary mic input are mixed using equation (1) to determine a mixing factor A. Next, the mixed result produced by applying A and (1−A) weights to the primary and secondary signals is processed together with the tertiary input, to determine a mixing factor B. The mixing coefficients are then calculated as a=A*B and b=(1−A)*B, so that the tertiary weight (1−a−b) is equal to (1−B).
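The two-stage approach may be sketched as follows, for illustration only. The helper `pair_weight` applies the two-microphone minimum-power form with the weight constrained to the range 0 to 1; the function names and the fallback value of 0.5 are assumptions of the sketch.

```python
def pair_weight(x, y, eps=1e-12):
    """Two-signal minimum-power weight, constrained to the range 0 to 1."""
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(p * q for p, q in zip(x, y))
    denom = sxx - 2.0 * sxy + syy
    if denom < eps:
        return 0.5
    return min(1.0, max(0.0, (syy - sxy) / denom))


def cascaded_three_mic(primary, secondary, tertiary):
    """Return (a, b) for Y = a*primary + b*secondary + (1-a-b)*tertiary."""
    A = pair_weight(primary, secondary)                  # stage 1: primary vs secondary
    mixed = [A * p + (1.0 - A) * s for p, s in zip(primary, secondary)]
    B = pair_weight(mixed, tertiary)                     # stage 2: stage-1 mix vs tertiary
    return (A * B, (1.0 - A) * B)
```

Because A and B each lie between 0 and 1, the three resulting weights a, b and (1−a−b) are all non-negative and sum to one, which the joint three-microphone solution does not by itself guarantee.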
The FIR filter 360 can be built from an inverse DFT of the array of the ‘a_i’ values.
While the preceding describes the mixing of the signals from microphones 132 and 134 in order to produce a first wind-noise-reduced signal, it is to be noted that the signals from microphones 136 and 138 may also be similarly mixed in accordance with the present invention in order to produce a second wind-noise-reduced signal. Microphone 136 captures a first (primary) right signal R1, and microphone 138 captures a second (secondary) right signal R2. The first and second wind-noise-reduced signals may then be processed by subsequent stages as desired, and for example could be input to an adaptive directional microphone stage, or could be used for stereo processing to retain binaural cues, or could be used for other multi-channel audio functions as appropriate.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not limiting or restrictive.
Number | Date | Country | Kind |
---|---|---|---|
2014902057 | May 2014 | AU | national |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15312874 | Nov 2016 | US |
Child | 16112365 | | US |