The present disclosure relates to methods, systems and apparatus for improved feedback control in acoustic systems. Some embodiments relate to methods and apparatus for reducing feedback noise in acoustic systems. Some embodiments relate to methods and apparatus for improving feedback cancellation in acoustic systems. Some embodiments relate to methods and apparatus for improving detection of feedback in acoustic systems.
In audio systems comprising a microphone and speaker in close proximity, such as the audio system shown in
In audio systems which implement active noise cancellation (ANC), a feedback path is purposefully created to reduce environmental noise. However, when the loop gain of such a feedback path is greater than 1, feedback will build up leading to howling at the speaker.
Known passive feedback management techniques used to address such feedback include modifying acoustics (attenuating the acoustic feedback path) or reducing gain (attenuating the electrical feedback path). In current generation ANC headsets with talk-through, low-pass filters are typically applied so that no gain is applied above 2 kHz.
Known active feedback management techniques for hearing augmentation include feedback suppression and feedback cancellation. However, both of these techniques have drawbacks. For example, active feedback suppression may allow short bursts of feedback before suppression is applied. Additionally, active feedback suppression leads to a reduction in gain in the hearing augmentation path. Active feedback cancellation may only model a linear feedback path and is limited in its performance by reverberation.
Other feedback management techniques include techniques for reducing feedback noise, for example, by microphone signal mixing. However, microphone signal mixing may corrupt binaural or stereo cues being delivered to a user.
It is desired to address or ameliorate one or more shortcomings of known feedback management techniques, or to at least provide a useful alternative thereto.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims.
An apparatus for reducing feedback noise in an acoustic system, the apparatus comprising: a first input for receiving a first signal derived from a first microphone associated with a first channel, the first signal comprising a first set of frequency sub-bands; a second input for receiving a second signal derived from a second microphone associated with a second channel, the second signal comprising a second set of frequency sub-bands, the first and second sets of frequency sub-bands having matching frequency ranges, each frequency sub-band of the first and second sets of frequency sub-bands having a frequency of greater than a threshold frequency; and one or more processors configured to: determine feedback at a first speaker associated with the first channel; and responsive to determining feedback, mix each of the first set of frequency sub-bands with a corresponding one of the second set of frequency sub-bands to generate a mixed output signal comprising a mixed set of frequency sub-bands; wherein the mixing is performed so as to minimize the output power in each of the mixed set of frequency sub-bands whilst maintaining a stereo effect level difference in the mixed signal between the first and second signals within a level difference threshold range.
The mixing may comprise: determining first mixing coefficients Ai for each of the first set of frequency sub-bands, where Ai is equal to or less than 1; determining second mixing coefficients 1-Ai for each of the second set of frequency sub-bands; weighting each of the one or more frequency sub-bands of the first set with respective first mixing coefficients Ai and weighting each of the corresponding frequency sub-bands of the second set with respective second mixing coefficients, 1-Ai; and summing each of the weighted one or more frequency sub-bands of the first set with corresponding weighted frequency sub-bands of the second set together to produce the mixed set of one or more frequency sub-bands.
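The weighting and summing steps above can be sketched as follows. This is a minimal illustration only, assuming that each sub-band is represented by a single complex value and that the coefficients Ai have already been determined; the function name `mix_subbands` is hypothetical and does not appear in the disclosure.

```python
def mix_subbands(m1, m2, a):
    """Mix corresponding sub-bands: out[i] = a[i]*m1[i] + (1 - a[i])*m2[i]."""
    if not (len(m1) == len(m2) == len(a)):
        raise ValueError("m1, m2 and a must have the same length")
    mixed = []
    for x1, x2, ai in zip(m1, m2, a):
        if ai > 1:
            raise ValueError("each mixing coefficient must be <= 1")
        # Weight the first-channel sub-band by Ai and the second by 1 - Ai.
        mixed.append(ai * x1 + (1 - ai) * x2)
    return mixed


# Illustrative values only: the first sub-band is passed through (Ai = 1),
# the second is taken mostly from the second channel (Ai = 0.25).
m1 = [1 + 0j, 8 + 0j]
m2 = [0.5 + 0j, 1 + 0j]
print(mix_subbands(m1, m2, [1.0, 0.25]))
```

With Ai equal to 1 a sub-band of the first signal is unchanged, and as Ai decreases the corresponding sub-band of the second signal dominates the mixed sub-band.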
The one or more processors may be further configured to determine the first set of frequency sub-bands and the second set of frequency sub-bands.
The threshold frequency may be about 2000 Hz.
The level difference threshold range may be between about 6 dB and about 12 dB.
The one or more processors may be further configured to determine the first mixing coefficient Ai and the second mixing coefficient, 1-Ai. The first mixing coefficient Ai may be defined as:
where m1i is the first set of frequency sub-bands, m2i is the second set of frequency sub-bands, eps is a constant defining the minimum subband power for which mixing occurs, and skew is a skew factor for maintaining the stereo effect level difference in the mixed signal between the first and second signals within the level difference threshold range.
Determining feedback at the first speaker may comprise determining a first probability, p1, of feedback at the first speaker; and the one or more processors are further configured to: determine a second probability of feedback at a second speaker associated with the second channel.
The one or more processors may be further configured to determine the first mixing coefficient Ai and the second mixing coefficient, 1-Ai. The first mixing coefficient Ai may be defined as:
wherein p1 is the first probability, p2 is the second probability, m1i is the first set of frequency sub-bands, m2i is the second set of frequency sub-bands, and eps is a constant defining the minimum subband power for which mixing occurs.
Determining feedback at the first speaker may comprise: determining a first level difference between a level of at least one high frequency sub-band of the first signal and a corresponding high frequency sub-band of a second signal; and determining the first probability based on the first level difference.
Determining feedback at the first speaker may comprise: determining a second level difference between a level of at least one high frequency sub-band of a first signal and a corresponding high frequency sub-band of a third signal derived from a third microphone, the third microphone in close proximity to the first speaker.
The at least one high frequency sub-band of the first signal may comprise a first plurality of sub-bands and the at least one high frequency sub-band of the second channel comprises a second plurality of sub-bands. Determining feedback at the first speaker may further comprise: determining a set of level differences between each of the first plurality of sub-bands and a corresponding one of the second plurality of sub-bands; and determining the first probability based on the first set of level differences.
Determining the first probability may comprise: determining a mean of the determined set of level differences; determining a minimum value of the determined set of level differences; determining a level difference feature based on the mean of the determined set of level differences subtracted by the minimum value of the determined set of level differences; and determining the first probability based on the level difference feature.
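The level difference feature described above can be sketched as follows. This is a hedged illustration, assuming the level differences are supplied as one value per sub-band (for example in dB); the function name is hypothetical, and the mapping from the feature to the first probability is not specified here and is therefore omitted.

```python
def level_difference_feature(level_diffs):
    """Mean of the sub-band level differences minus their minimum value."""
    if not level_diffs:
        raise ValueError("at least one level difference is required")
    mean = sum(level_diffs) / len(level_diffs)
    return mean - min(level_diffs)


# Illustrative values only (assumed to be in dB).
print(level_difference_feature([6.0, 6.0, 6.0, 6.0]))   # uniform offset
print(level_difference_feature([0.0, 0.0, 0.0, 24.0]))  # one dominant sub-band
```

Because the minimum is subtracted from the mean, a level difference that is the same in every sub-band produces a feature of zero, whereas a difference concentrated in a few sub-bands produces a positive feature.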
The one or more processors may be further configured to: determine a first level difference between a level of at least one high frequency sub-band of the first signal and a corresponding high frequency sub-band of the second signal; determine a second level difference between a level of at least one low frequency sub-band of the first signal and a corresponding relatively low frequency sub-band of the second signal; determine a modified level difference by subtracting the second level difference from the first level difference; and determine the first probability based on the modified level difference.
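The modified level difference can be sketched as follows. This is a minimal sketch, assuming that levels are expressed in dB and computed from sub-band powers; the helper names and the power floor are hypothetical choices made for illustration.

```python
import math


def level_db(power, floor=1e-12):
    """Level in dB of a sub-band power (floor avoids log of zero; an assumption)."""
    return 10.0 * math.log10(max(power, floor))


def modified_level_difference(p1_high, p2_high, p1_low, p2_low):
    """High-band level difference minus low-band level difference, in dB."""
    first = level_db(p1_high) - level_db(p2_high)   # first level difference
    second = level_db(p1_low) - level_db(p2_low)    # second level difference
    return first - second


# Illustrative powers only: 20 dB difference at high frequency,
# 10 dB at low frequency, leaving a modified difference of about 10 dB.
print(modified_level_difference(100.0, 1.0, 10.0, 1.0))
```

Subtracting the low frequency level difference removes any inter-channel offset that is common to both frequency regions before the probability is determined.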
The one or more processors may be further configured to: combine the mixed set of one or more frequency sub-bands with a third set of frequency sub-bands of the first signal to provide a combined set of frequency sub-bands, wherein each frequency sub-band of the third set of frequency sub-bands has a frequency of less than or equal to the threshold frequency; and transform the combined set of frequency sub-bands into a time domain output signal.
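The combining and transforming steps can be sketched as follows. This is an illustrative sketch only, assuming that the sub-bands are the bins of a one-sided discrete Fourier transform of a real frame; the naive transforms and the names `rdft`, `irdft` and `combine_and_synthesise` are hypothetical and are used so that the example needs no external library. A practical implementation would normally use an FFT.

```python
import cmath
import math


def rdft(x):
    """Naive forward DFT of real samples, returning bins 0..len(x)//2."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def irdft(half, n):
    """Naive inverse of rdft: one-sided spectrum back to n real samples."""
    out = []
    for t in range(n):
        acc = half[0].real
        for k in range(1, len(half)):
            # Bins between DC and Nyquist stand for a conjugate pair, so count twice.
            weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
            acc += weight * (half[k] * cmath.exp(2j * math.pi * k * t / n)).real
        out.append(acc / n)
    return out


def combine_and_synthesise(s1f, mixed_high, first_high_bin, n):
    """Keep bins below first_high_bin from s1f and take the rest from mixed_high."""
    combined = list(s1f[:first_high_bin]) + list(mixed_high)
    if len(combined) != len(s1f):
        raise ValueError("mixed_high must cover every bin from first_high_bin up")
    return irdft(combined, n)
```

Here `first_high_bin` is assumed to be the index of the first bin above the threshold frequency, so that bins at or below the threshold are taken unchanged from the first signal.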
The first and second microphones may be either (i) reference microphones configured to capture ambient sounds or (ii) error microphones configured to capture sound in respective first and second channels.
Determining feedback at the first speaker may comprise receiving a feedback flag indicative of feedback detected at the first speaker.
One of the first and second microphones may be a first reference microphone associated with the first speaker and configured to capture ambient sound in proximity to the first speaker. The other of the first and second microphones may be a reference microphone associated with a second speaker and configured to capture sound in proximity to the respective second speaker.
According to another aspect of the disclosure, there is provided a system comprising: the apparatus as described above; the first microphone; the second microphone; and the first speaker.
According to another aspect of the disclosure, there is provided an electronic device comprising the apparatus or system as described above. The electronic device may be: a mobile phone, for example a smartphone; a media playback device, for example an audio player; or a mobile computing platform, for example a laptop or tablet computer.
According to another aspect of the disclosure, there is provided a method of reducing feedback noise in an acoustic system, the method comprising: receiving a first signal derived from a first microphone associated with a first channel, the first signal comprising a first set of frequency sub-bands; receiving a second signal derived from a second microphone associated with a second channel, the second signal comprising a second set of frequency sub-bands, the first and second sets of frequency sub-bands having matching frequency ranges, each frequency sub-band of the first and second sets of frequency sub-bands having a frequency of greater than a threshold frequency; and, responsive to determining feedback at a first speaker associated with the first channel, mixing each of the first set of frequency sub-bands with a corresponding one of the second set of frequency sub-bands to generate a mixed output signal comprising a mixed set of frequency sub-bands; wherein the mixing is performed so as to minimize the output power in each of the mixed set of frequency sub-bands whilst maintaining a stereo effect level difference in the mixed signal between the first and second signals within a level difference threshold range.
The mixing may comprise: determining first mixing coefficients Ai for each of the first set of frequency sub-bands, where Ai is equal to or less than 1; determining second mixing coefficients 1-Ai for each of the second set of frequency sub-bands; weighting each of the one or more frequency sub-bands of the first set with respective first mixing coefficients Ai and weighting each of the corresponding frequency sub-bands of the second set with respective second mixing coefficients, 1-Ai; and summing each of the weighted one or more frequency sub-bands of the first set with corresponding weighted frequency sub-bands of the second set together to produce the mixed set of one or more frequency sub-bands.
The method may further comprise determining the first set of frequency sub-bands and the second set of frequency sub-bands.
The threshold frequency may be about 2000 Hz. The level difference threshold range may be between about 6 dB and about 12 dB.
The first mixing coefficient Ai for each of the frequency sub-bands, i, of the first set may be defined as:
where m1i is the first set of frequency sub-bands, m2i is the second set of frequency sub-bands, eps is a constant defining the minimum subband power for which mixing occurs, and skew is a skew factor for maintaining the stereo effect level difference in the mixed signal between the first and second signals within the level difference threshold range.
Determining feedback at the first speaker may comprise determining a first probability, p1, of feedback at the first speaker; and the method further comprises: determining a second probability of feedback at a second speaker associated with the second channel.
The first mixing coefficient Ai for each of the frequency sub-bands of the first set may be defined as:
wherein p1 is the first probability, p2 is the second probability, m1i is the first set of frequency sub-bands, m2i is the second set of frequency sub-bands, and eps is a constant defining the minimum subband power for which mixing occurs.
Determining feedback at the first speaker may comprise determining a first level difference between a level of at least one high frequency sub-band of the first signal and a corresponding high frequency sub-band of a second signal; and determining the first probability based on the first level difference.
Determining feedback at the first speaker may comprise: determining a second level difference between a level of at least one high frequency sub-band of a first signal and a corresponding high frequency sub-band of a third signal derived from a third microphone, the third microphone in close proximity to the first speaker.
The at least one high frequency sub-band of the first signal may comprise a first plurality of sub-bands. The at least one high frequency sub-band of the second channel may comprise a second plurality of sub-bands. Determining feedback at the first speaker may further comprise: determining a set of level differences between each of the first plurality of sub-bands and a corresponding one of the second plurality of sub-bands; and determining the first probability based on the first set of level differences.
Determining the first probability may comprise: determining a mean of the determined set of level differences; determining a minimum value of the determined set of level differences; determining a level difference feature based on the mean of the determined set of level differences subtracted by the minimum value of the determined set of level differences; and determining the first probability based on the level difference feature.
The method may further comprise determining a first level difference between a level of at least one high frequency sub-band of the first signal and a corresponding high frequency sub-band of the second signal; determining a second level difference between a level of at least one low frequency sub-band of the first signal and a corresponding relatively low frequency sub-band of the second signal; determining a modified level difference by subtracting the second level difference from the first level difference; and determining the first probability based on the modified level difference.
The method may further comprise: combining the mixed set of one or more frequency sub-bands with a third set of frequency sub-bands of the first signal to provide a combined set of frequency sub-bands, wherein each frequency sub-band of the third set of frequency sub-bands has a frequency of less than or equal to the threshold frequency; and transforming the combined set of frequency sub-bands into a time domain output signal.
The first and second microphones may be (i) reference microphones configured to capture ambient sounds or (ii) error microphones configured to capture sound in respective first and second channels.
Determining feedback at the first speaker may comprise receiving a feedback flag indicative of feedback detected at the first speaker.
One of the first and second microphones may be a first reference microphone associated with the first speaker and configured to capture ambient sound in proximity to the first speaker. The other of the first and second microphones may be a reference microphone associated with a second speaker and configured to capture sound in proximity to the respective second speaker.
According to another aspect of the disclosure, there is provided a non-transitory computer-readable storage medium comprising instructions which, when executed by a computer, cause the computer to carry out the method described above.
According to another aspect of the disclosure, there is provided a feedback canceller, comprising: a first input for receiving a first signal derived from a first microphone associated with a first channel; a second input for receiving a first probability of feedback between the first microphone and a first speaker; a normalised least mean squares (NLMS) filter or least mean squares (LMS) filter configured to filter the first signal and output a filtered first signal; a controller configured to control an adaptation rate of the NLMS filter or the LMS filter in dependence of the first probability of feedback.
The controller may be configured to increase the adaptation rate of the NLMS filter or the LMS filter as the first probability of feedback increases.
The controller may be configured to control the adaptation rate, μ, using the following equation:
μ = Max(fbc_slow_rate, (fbc_fast_rate + logProb))
where fbc_slow_rate is a lower bound of the adaptation rate, fbc_fast_rate is an upper bound of the adaptation rate, and logProb is the log of the first probability.
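The equation above can be sketched as follows. This is a minimal sketch under stated assumptions: the base of the logarithm is not given, so base 10 is assumed; the rates are assumed to be expressed on the same logarithmic scale as logProb; and a probability of zero, for which the logarithm is undefined, is assumed to select the lower bound. The function name and the example values are hypothetical.

```python
import math


def adaptation_rate(prob, fbc_slow_rate, fbc_fast_rate):
    """mu = Max(fbc_slow_rate, fbc_fast_rate + logProb), with logProb = log10(prob)."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie between 0 and 1")
    if prob == 0.0:
        # log(0) is undefined; fall back to the lower bound (an assumption).
        return fbc_slow_rate
    log_prob = math.log10(prob)  # assumption: base-10 logarithm
    return max(fbc_slow_rate, fbc_fast_rate + log_prob)


# Illustrative bounds only.
for p in (1.0, 0.01, 1e-6):
    print(p, adaptation_rate(p, fbc_slow_rate=-4.0, fbc_fast_rate=-1.0))
```

Because logProb is zero when the probability is 1 and negative otherwise, the rate equals fbc_fast_rate when feedback is certain and falls towards fbc_slow_rate as the probability decreases.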
According to another aspect of the disclosure, there is provided a method of cancelling feedback, comprising: receiving a first signal derived from a first microphone associated with a first channel; receiving a first probability of feedback between the first microphone and a first speaker; filtering the first signal with a normalised least mean squares (NLMS) filter or least mean squares (LMS) filter and outputting a filtered first signal; wherein an adaptation rate of the NLMS filter or LMS filter is controlled in dependence of the first probability of feedback.
The adaptation rate of the NLMS filter or LMS filter may be increased as the first probability of feedback increases.
The adaptation rate, μ, may be controlled based on the following equation:
μ = Max(fbc_slow_rate, (fbc_fast_rate + logProb))
where fbc_slow_rate is a lower bound of the adaptation rate, fbc_fast_rate is an upper bound of the adaptation rate, and logProb is the log of the first probability.
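A possible arrangement of an NLMS filter whose adaptation rate follows the probability of feedback is sketched below. It is an illustration under several assumptions that are not stated in the disclosure: the speaker drive signal is used as the NLMS reference input, the rates are base-10 exponents so that the linear step size is 10 raised to μ, and the filter length, default rates and regularisation constant are arbitrary. All function and parameter names are hypothetical.

```python
import math


def step_size(prob, slow_log10, fast_log10):
    """Linear NLMS step: 10 ** Max(slow, fast + log10(prob)); log-domain rates assumed."""
    if prob <= 0.0:
        return 10.0 ** slow_log10
    return 10.0 ** max(slow_log10, fast_log10 + math.log10(prob))


def nlms_cancel(mic, ref, probs, taps=4, slow_log10=-3.0, fast_log10=-0.3, reg=1e-8):
    """Subtract an adaptive estimate of the feedback (ref filtered by w) from mic."""
    w = [0.0] * taps
    history = [0.0] * taps              # most recent reference samples, newest first
    out = []
    for d, x, p in zip(mic, ref, probs):
        history = [x] + history[:-1]
        estimate = sum(wi * hi for wi, hi in zip(w, history))
        err = d - estimate              # filtered first-signal sample
        norm = sum(h * h for h in history) + reg
        mu = step_size(p, slow_log10, fast_log10)
        # Normalised LMS update, scaled by the probability-dependent step.
        w = [wi + mu * err * hi / norm for wi, hi in zip(w, history)]
        out.append(err)
    return out, w
```

In this sketch a high probability of feedback gives a step size close to the upper bound, so the filter tracks the feedback path quickly, while a low probability holds the step at the lower bound so that the filter adapts slowly.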
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
By way of example only, embodiments are now described with reference to the accompanying drawings, in which:
Described embodiments relate to methods, systems and apparatus for improved feedback control in an acoustic system. Described embodiments may reduce or eliminate incidences of feedback noise, such as howling when the acoustic path changes, and/or improve added stable gain in ANC headset form factors.
Some embodiments relate to methods and apparatus for improved detection of feedback in acoustic systems. For example, some embodiments relate to determining an improved estimation of the likelihood or probability of feedback. By improving the detection of feedback in acoustic systems, feedback management techniques, such as feedback cancellation or suppression techniques, may be improved to thereby enhance the sound quality of the system. Similarly, by improving the detection of feedback in acoustic systems, feedback noise reduction techniques, such as microphone signal mixing techniques, may be improved to thereby enhance the sound quality of the system.
Some embodiments relate to methods and apparatus for reducing feedback noise in an acoustic system. For example, feedback reduction mechanisms, according to described embodiments, may instigate sub-band mixing in response to determining that feedback is present at a first speaker associated with a first microphone.
Some embodiments relate to methods and apparatus for improving feedback cancellation in an acoustic system. For example, feedback control mechanisms, according to described embodiments, may be used to perform improved feedback cancellation by adjusting an adaptation rate of the feedback cancellation being or to be performed in response to determining that feedback is present at a first speaker associated with a first microphone.
The system 200 shown in
The first module 202 may comprise a digital signal processor (DSP) 212 configured to receive microphone signals from error and reference microphones 205, 208. The module 202 may further comprise a memory 214, which may be provided as a single component or as multiple components. The memory 214 is provided for storing data and program instructions. The module 202 may further comprise a transceiver 216 to enable the module 202 to communicate wirelessly with external devices, such as the second module 204. Such communications between the modules 202, 204 may in alternative embodiments comprise wired communications where suitable wires are provided between left and right sides of a headset, either directly such as within an overhead band, or via an intermediate device such as a smartphone. The module 202 may be powered by a battery and may comprise other sensors (not shown).
The feedback reduction system 300 will be described with reference to the first module 202 shown in
The feedback reduction system 300 comprises a feedback detection module 302 and a cross-ear mixing module 304. Optionally, the feedback reduction system 300 also comprises a digital feedback cancellation (DFBC) module 306, an equalisation (EQ) module 308, an active feedback suppression (AFS) module 310, a subband loop gain estimation module 312 and a gains filter 314.
The feedback detection module 302 is configured to detect a feedback condition and to provide a feedback detection output to the cross-ear mixing module 304. In some embodiments such an output may be an indicator of the likelihood or probability of feedback. Additionally or alternatively, the output may be a binary flag indicative of the presence or absence of feedback noise, such as howling at the speaker 209.
The feedback detection module 302 may also be configured to provide a feedback detection output to the DFBC module 306 (if present) to improve control of feedback cancellation. In some embodiments, the DFBC module 306 may be configured to perform feedback cancellation and the feedback detection output from the feedback detection module 302 may be used by the DFBC module 306 to adjust an adaptation rate of the feedback cancellation being or to be performed. For example, the DFBC module 306 may control the adaptation rate based on the probability of feedback calculated by the feedback detection module 302. Further details of the DFBC module 306 and the feedback detection module 302 are provided below.
The cross-ear mixing module 304 may be configured to generate a modified output signal by mixing components of the left and right reference microphone signals 205, 210 in dependence of those signals. Such mixing may reduce unwanted feedback, such as howling.
The modified output signal from the cross-ear mixing module 304 may optionally then be equalised by the EQ module 308 and gain adjusted by the gains filter 314 (again optionally) before being output to the first speaker 209. If implemented, the gains filter 314 receives inputs from the AFS module 310 and/or the subband loop gain estimation module 312. The AFS module 310 may generate sub-band gains suitable for feedback suppression in accordance with known techniques, such as those described in US patent application publication number US 2004/0252853 A1, the content of which is hereby incorporated by reference in its entirety. Equally, the subband loop gain estimation module 312 may generate sub-band gains to maintain subband loop gain below 1 in order to minimize howling. Having a feedback loop gain greater than 1 can cause the system 200 to become unstable, leading to howling. Sub-band gains from each of the AFS module 310 and the subband loop gain estimation module 312 may then be combined (e.g. summed) to generate a combined gain to be applied by the gains filter 314. In the example shown in
In the embodiment shown in
The cross-ear mixing module 304 comprises a mixing module 400. In some embodiments, for example where the first and second signals S1, S2 are provided to the cross-ear mixing module 304 in the time domain, the cross-ear mixing module 304 may further comprise a DFT module 402. The DFT module 402 is configured to convert the first and second signals, S1, S2, from the time domain into the frequency domain, generating frequency domain representations S1F, S2F of the first and second signals S1, S2 respectively, each comprising a plurality of frequency sub-bands. The frequency ranges of sub-bands of the converted first signal S1F are chosen to correspond to the frequency ranges of the sub-bands of the converted second signal S2F. The DFT module 402 may employ Discrete Fourier Transform (DFT), such as Fast Fourier Transform (FFT), or any other suitable method of conversion between time and frequency domains.
In other embodiments, the first and second signals S1, S2 may be provided in the frequency domain. In which case, the DFT module 402 may be omitted.
In some embodiments, the cross-ear mixing module 304 further comprises a filter module 404 configured to receive the converted frequency domain versions of the first and second signals S1F, S2F and determine a first filtered subset of frequency sub-bands m1, wherein each of the sub-bands has a frequency greater than a threshold frequency.
The threshold frequency may be selected to identify frequency sub-bands of the first signal S1 and/or the second signal S2 that may be affected by feedback, such as howling, i.e., candidate feedback affected sub-bands. In some embodiments, the threshold frequency is about 2 kHz or about 3 kHz.
In some embodiments, the filter 404 is a 64 tap linear phase FIR filter. In other embodiments, the filter is an asymmetric window function filter, which is generally associated with a reduced delay compared to a 64 tap linear phase FIR filter. For example, a 64 tap linear phase FIR filter may introduce a 4 ms delay to the system, whereas an asymmetric window function filter may introduce about a 1.5 ms delay to the system. In some embodiments, the filter 404 may be implemented by the DFT module 402. In which case, the DFT module 402 may only convert sub-bands having frequency ranges above the threshold frequency, discarding components of the frequency domain signal having a frequency less than the threshold frequency.
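Selection of the candidate feedback affected sub-bands can be sketched as follows. This is a hedged example, assuming that each sub-band is a bin of a one-sided DFT and is identified by its centre frequency; the sample rate and transform size used in the example are illustrative values that are not taken from the disclosure, and the function name is hypothetical.

```python
def subbands_above_threshold(sample_rate_hz, fft_size, threshold_hz=2000.0):
    """Indices of one-sided DFT bins whose centre frequency exceeds threshold_hz."""
    bin_width = sample_rate_hz / fft_size
    num_bins = fft_size // 2 + 1
    return [k for k in range(num_bins) if k * bin_width > threshold_hz]


# Illustrative parameters only: 16 kHz sampling with a 64-point transform
# gives 250 Hz bins, so bins 9 to 32 lie above a 2 kHz threshold.
print(subbands_above_threshold(16000.0, 64))
```

The same index list would be applied to both S1F and S2F so that the filtered sub-bands of the two channels have matching frequency ranges.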
The mixing module 400 is configured to determine a modified output signal SM in which feedback affected sub-bands of the first filtered subset of frequency sub-bands m1 have been mixed with corresponding sub-bands of the second filtered subset of frequency sub-bands m2. The result of the mixing is a modified output signal SM having a reduced power compared with the first signal S1 and a stereo effect level difference between the modified output signal SM and second signal S2 that is within a predetermined level difference threshold.
By reducing the output power in the modified output signal SM, the feedback path gain is reduced. Additionally, when implemented in a stereo system such as the system 200 shown in
The mixing module 400 comprises a mixing ratio module 408 configured to determine a mixing coefficient Ai for each frequency sub-band of the first set of frequency sub-bands, m1. For each channel, the mixing coefficient Ai defines how much of the corresponding subband of the other channel is substituted (mixed) into the output signal. The mixing ratio module 408 is configured to determine mixing coefficients Ai for each sub-band i of the first set of frequency sub-bands m1 using minimum power criteria, while substantially maintaining, or mitigating the loss of, stereo cues between the first signal S1 and the second signal S2 in the modified output signal SM.
For example, in some embodiments, the mixing coefficients Ai for each sub-band i are selected such that when a sub-band of the first signal S1 is much louder than the corresponding sub-band of the second signal S2 the corresponding sub-band of the second signal S2, which has less power, will be mostly used as the corresponding sub-band of the modified output signal SM. Conversely, when a signal level of a sub-band of the first signal S1 is relatively low, the mixing coefficient Ai may be selected to be equal to or approach 1, meaning that that sub-band of the first signal S1 will be mostly used as the corresponding sub-band of the modified output signal, SM.
It will be appreciated that mixing of the first and second signals S1, S2 may cause a reduction in stereo cues in the modified output signal SM. To address this, in some embodiments, stereo cues between the modified output signal and the second signal are provided for or maintained by incorporating a skew factor, skew, into the mixing coefficient Ai. For example, the skew factor may be selected to ensure that any change to the stereo effect level difference between the first signal and the second signal in the modified output signal SM is within a threshold level, or that a stereo effect level difference between sub-bands of the modified output signal SM and corresponding sub-bands of the second signal is within a level difference threshold range.
In some embodiments, the mixing coefficient Ai for each sub-band i is defined as follows:
where skew is the skew factor, m1i is the first filtered subset of frequency sub-bands, m2i is the second filtered subset of frequency sub-bands, and eps is a constant defining the minimum subband power for which mixing occurs, the threshold power level at which mixing occurs increasing with eps.
However, although a relatively high skew factor will cause less attenuation of the modified output signal SM particularly for relatively high level differences, the higher the skew factor, the greater the value of the mixing coefficient Ai which in turn causes a greater portion of the first signal m1i to be mixed with the second signal, m2i, in determining the modified output signal SM. Accordingly, the modified output signal, SM, may retain a greater amount of howling or feedback noise than if a lower skew factor value were used.
The value for the skew factor is selected to counteract feedback noise while providing for or retaining stereo cues; selecting a suitable skew factor is effectively a matter of balancing tolerable noise against sufficient stereo cue maintenance. The skew factor may be predefined and/or adjustable to suit a user's needs depending on the user's tolerance to feedback noise. In some embodiments, an input may be provided for the user to adjust the skew factor (directly or indirectly) to their specific requirements.
In some embodiments, the skew factor may be selected to maintain a level difference of between about 6 dB and about 12 dB between the modified output signal and the second signal. In some embodiments, to determine a suitable skew factor, the level difference between the first and second signals in a sub-band that is not affected by noise is measured.
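One way of checking whether a candidate skew factor keeps the level difference within the stated range is sketched below. This is an illustration only, assuming that the level difference is measured per sub-band in dB from the sub-band powers and that its magnitude is compared against the range; the function names and the power floor are hypothetical.

```python
import math


def level_difference_db(mixed, second, floor=1e-12):
    """Per-sub-band level difference, in dB, between mixed and second-channel sub-bands."""
    return [10.0 * math.log10(max(abs(a) ** 2, floor) / max(abs(b) ** 2, floor))
            for a, b in zip(mixed, second)]


def within_range(diffs_db, low_db=6.0, high_db=12.0):
    """True for each sub-band whose level difference magnitude lies in the range."""
    return [low_db <= abs(d) <= high_db for d in diffs_db]


# Illustrative sub-band values only.
mixed = [2 + 0j, 1 + 0j, 4 + 0j]
second = [1 + 0j, 1 + 0j, 1 + 0j]
print(within_range(level_difference_db(mixed, second)))
```

A skew factor could then be adjusted until the sub-bands of interest fall within the range, although the adjustment procedure itself is not specified here.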
An alternative method of determining the mixing coefficient Ai will now be described. In this embodiment, the microphone signals are dynamically mixed in a way that the output power is minimised during feedback. Feedback howling in headsets tends to occur only on one side of the head, such that left side howling and right side howling are largely uncorrelated. As mentioned above, the feedback detection module 302 may determine a probability of feedback at each of the left and right reference microphones 205, 210. The probability of feedback in the left and right channels may be used to determine the mixing coefficient Ai used by the mixing module 400 as described in more detail below. In some embodiments, the mixing coefficient Ai for the left channel is determined, using the following equation.
where p1 and p2 are the probability of feedback on left and right channels respectively, and eps is a constant defining the minimum subband power for which mixing occurs, the threshold power level at which mixing occurs increasing with eps. m1i is the first filtered subset of frequency sub-bands and m2i is the second filtered subset of frequency sub-bands. When both p1 and p2 are low, e.g. close to or equal to zero, the above equation simplifies as follows:
So, for the left channel, instead of mixing out subbands of the left channel for corresponding subbands of the right channel, the left channel subband will be passed straight through to the speaker with no change when p1 and p2 are both low (i.e. a low probability of feedback in either channel at the subband of interest). Indeed, the mixing coefficient becomes equal to 1 whenever p1 falls to zero such that the subband of interest in the left channel is always passed through when the estimated probability of feedback is zero.
When a level difference between the left and right channels is large (due to the presence of feedback in one channel or the other) the feedback detection module 302 may determine a high probability of feedback in one channel or the other. This probability may be increased if a large level difference is detected between error and reference microphones in one of the left and right channels. For example, when the feedback detection module 302 determines a high probability of feedback in a subband of the left channel, p1 may be close to 1 and p2 may be close to zero. In any case, where feedback probability in the left channel is high, i.e. p1 is close to 1 and the feedback probability in the right channel is low, the above equation simplifies to:
The mixing coefficient is then determined by the level of the left channel. The greater the level of the left channel, the smaller the mixing coefficient and the more of the corresponding subband of the right channel is mixed. When a level difference between the left and right channels is present due to environmental sound coming from a particular angle relative to the user, the level of the affected subband in the left channel may be low. In that case, more of the affected subband in the left channel will be maintained in the output signal and, as such, the mixing coefficient Ai is less likely to reduce stereo perception, i.e. less likely to remove the user's perception of the sound coming from the left side of the head.
In addition to level difference, the feedback detection module 302 may also take into account the signal level in the left and right channels for the sub-band of interest. For example, in some instances, when the level difference is caused by head shadowing, the level difference may be high but the signal level itself may be low (relative to the signal level in the presence of feedback). This is in contrast to feedback howling where the signal level in the affected channel is always relatively high.
The mixing module 400 is configured to weight each of the one or more frequency sub-bands i of the first set m1i with a respective mixing coefficient Ai and weight each of the corresponding frequency sub-bands i, of the second set m2i with a respective mixing coefficient (1-Ai). The mixing module 400 is further configured to sum each of the weighted one or more frequency sub-bands i of the first set m1i with corresponding weighted frequency sub-bands i of the second set m2i together to produce a set mmi of one or more mixed frequency sub-bands i.
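By way of illustration only, the weighting and summing performed by the mixing module 400 may be sketched as follows. The function name mix_subbands, the list-based representation of the sub-bands and the restriction of each coefficient to the range 0 to 1 are assumptions made for this sketch and are not taken from the disclosure.

```python
def mix_subbands(m1, m2, coeffs):
    """Return mm_i = A_i * m1_i + (1 - A_i) * m2_i for each sub-band i.

    m1, m2 : sequences of sub-band values (real or complex DFT bins)
    coeffs : mixing coefficients A_i, assumed here to lie in [0, 1]
    """
    if not (len(m1) == len(m2) == len(coeffs)):
        raise ValueError("m1, m2 and coeffs must have the same length")
    mixed = []
    for x1, x2, a in zip(m1, m2, coeffs):
        if not 0.0 <= a <= 1.0:
            raise ValueError("mixing coefficients are assumed to lie in [0, 1]")
        # Weight the first-set sub-band by A_i and the second-set sub-band by (1 - A_i).
        mixed.append(a * x1 + (1.0 - a) * x2)
    return mixed


# Hypothetical values for three sub-bands of each channel.
m1 = [1.0, 4.0, 0.5]
m2 = [0.5, 1.0, 0.25]
A = [1.0, 0.25, 0.0]  # pass through, partial mix, full replacement
print(mix_subbands(m1, m2, A))
```

With a coefficient of 1 the first-set sub-band is passed through unchanged, and with a coefficient of 0 it is replaced entirely by the corresponding second-set sub-band.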
The mixing module 400 may be further configured to combine the mixed set, mmi, with the remaining one or more frequency sub-bands of the first signal S1, for example, those frequency sub-bands of the first signal S1 which were blocked by the filter 404, to produce the modified output signal SM.
The mixing module 400 may further comprise an inverse DFT module 410 to convert the modified output signal, SM, into a time domain modified output signal, Sm. The inverse DFT module 410 may implement any known conversion algorithm, for example, an IFFT.
The mixing module 400 may further comprise a cross fader 412 to mix or blend the modified output signal, Sm, with the first signal, S1, to produce the modified output signal, Sm1. For example, the cross fader 412 may be configured to gradually blend the modified output signal, Sm, with the first signal, S1, to minimise an abrupt change in sound distinctly audible to the user.
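As a non-limiting sketch, one possible behaviour of the cross fader 412 is a linear ramp from the first signal to the modified signal over a fixed number of samples. The linear ramp shape, the function name crossfade and the parameter fade_len are assumptions chosen for illustration; the disclosure does not specify the fade profile.

```python
def crossfade(s1, sm, fade_len):
    """Blend gradually from s1 to sm over the first fade_len samples.

    A linear gain ramp is assumed; after fade_len samples the output
    equals sm.
    """
    if len(s1) != len(sm):
        raise ValueError("s1 and sm must have the same length")
    if fade_len <= 0:
        raise ValueError("fade_len must be positive")
    out = []
    for n, (a, b) in enumerate(zip(s1, sm)):
        g = min(1.0, (n + 1) / fade_len)  # gain applied to the modified signal
        out.append((1.0 - g) * a + g * b)
    return out


# Hypothetical time domain samples.
print(crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], 2))
```

A longer fade_len gives a slower, less audible transition at the cost of a slower response to the onset of feedback.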
In the embodiment shown in
In some embodiments, the mixing module 400 is activated or instigated in response to determining that feedback is present at a first speaker associated with the first microphone. For example, in some embodiments, the mixing module 400 is configured to receive an indication of the determination of feedback from the feedback detection module 302. As mentioned above, in some embodiments, the indication may comprise a binary flag indicative of the presence or otherwise of feedback such as howling at the first or second microphones. In some embodiments, the feedback reduction system 300 further comprises the feedback detection module 302.
Referring to
At 502, feedback at a first speaker associated with a first microphone is determined.
Optionally, at 504, a first set of frequency sub-bands of a first signal derived from the first microphone having a frequency of greater than a threshold frequency is determined. Alternatively, the first set of frequency sub-bands of the first signal are received (in the frequency domain) and no conversion is necessary.
Optionally, at 506, a second set of frequency sub-bands of a second signal derived from the second microphone having a frequency of greater than a threshold frequency is determined. Alternatively, the second set of frequency sub-bands of the second signal are received (in the frequency domain) and no conversion is necessary.
At 508, first mixing coefficients Ai for each of the one or more frequency sub-bands i of the first set are determined such that the power of the modified output signal is reduced and a stereo effect level difference between the modified output signal and the second signal is within a level difference threshold range. For example, the level difference threshold range may be about 6 to 12 dB.
At 510, each of the one or more frequency sub-bands of the first and second sets is weighted with respective first and second mixing coefficients.
At 512, each of the weighted frequency sub-bands of the first set is summed with corresponding weighted frequency sub-bands of the second set together to produce a mixed set of one or more frequency sub-bands.
At 514, the mixed set of one or more frequency sub-bands is combined with the first signal to produce a modified output signal.
Referring now to
As stated above, in some embodiments, the output of the feedback detection module 302 may be provided to the DFBC module 304 shown in
In some embodiments, the adaptation rate μ is determined by the following equation:
μ=Max(fbc_slow_rate,(fbc_fast_rate+logProb)),
where logProb is the log probability of feedback occurring (in this case in the left channel), fbc_slow_rate is the lower bound of the adaptation rate μ, and fbc_fast_rate is the upper bound of the adaptation rate μ. In other words, the adaptation rate μ is calculated as the greater of the lower bound of the adaptation rate μ on the one hand and the sum of the upper bound of the adaptation rate μ and the log probability of feedback occurring on the other hand. Since the probability of feedback occurring is always less than or equal to 1, logProb will never be positive. As such, the adaptation rate μ is bounded between the lower and upper bounds of the adaptation rate μ.
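The adaptation rate calculation may be illustrated by the following sketch. The natural logarithm, the function name adaptation_rate, the handling of a zero probability and the example bound values are assumptions made for illustration; the disclosure does not state the logarithm base or the units of the bounds.

```python
import math


def adaptation_rate(prob, fbc_slow_rate, fbc_fast_rate):
    """Return mu = max(fbc_slow_rate, fbc_fast_rate + logProb).

    prob is the probability of feedback in the range [0, 1]. The natural
    logarithm is assumed. A probability of zero is treated as giving the
    lower bound, since its logarithm is unbounded below.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    if prob == 0.0:
        return fbc_slow_rate
    log_prob = math.log(prob)  # never positive because prob <= 1
    return max(fbc_slow_rate, fbc_fast_rate + log_prob)


# Hypothetical bounds expressed in the log domain.
for p in (1.0, 0.5, 0.01, 0.0):
    print(p, adaptation_rate(p, fbc_slow_rate=-10.0, fbc_fast_rate=-2.0))
```

When the probability of feedback is 1 the rate equals the upper bound, and as the probability falls the rate decreases until it is held at the lower bound.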
The above is described with reference to NLMS filters. However, the above could equally be implemented using a least mean squares (LMS) algorithm or other suitable algorithm. Both NLMS inputs, or both LMS inputs, are preferably decorrelated or whitened by suppression of the correlated signals.
In some embodiments, the output of the feedback detection module 302 may be provided to the cross-ear mixing module 304 to reduce feedback noise by instigating signal mixing to reduce deleterious feedback effects, such as howling, as discussed above.
The feedback detection module 302 comprises a level difference unit 602 for determining a level difference between at least first and second signals, S1, S2, derived from at least first and second microphones (not shown), respectively, associated with one or more speakers (not shown), and a decision function unit 604, such as a logistic regression unit, which may be configured to determine a likelihood or probability of the presence of feedback noise such as howling at a first speaker based on the level difference.
The at least first and second microphones may comprise one or more reference microphones configured to capture ambient sounds and/or one or more error microphones configured to capture sound at respective one or more speakers. In some embodiments, the at least first and second microphones comprise the first and second reference microphones 208, 210 and first and second error microphones 205, 206 of the system 200 shown in
In some embodiments, the feedback detection module 302 comprises one or more A/D converters (not shown) configured to convert analogue electrical signals received, for example, from analogue microphones into digital signals. In other embodiments, the feedback detection module 302 is configured to receive digital signals.
In some embodiments, the feedback detection module 302 is configured to transform the received first and second signals S1, S2 from the time domain (if received in the time domain) into the frequency domain. In other embodiments, the first and second signals S1, S2 may be received in the frequency domain. In either case, in some embodiments, full-band calibration gains may be applied on the frequency domain data.
During testing of headsets, earphones and earbuds, the inventors observed that feedback howling is most likely to be present at high frequencies and is further likely to be localised. In other words, howling is most likely to occur on one side of a stereo audio system. This is due to the fact that howling is commonly induced by a user touching one side or the other of the audio system (e.g. headset) at a time. The inventors have also discovered that when feedback reduction algorithms are used in the signal path, howling tends to be short lived. Additionally, due to the effect of head shadowing, howling is generally attenuated by over 20 dB when picked up by a microphone on the other side of the headset. In view of the above, exemplary embodiments of the disclosure are configured to monitor levels at microphones associated with audio systems such as the system 200 of
In some embodiments, the level difference unit 602 is configured to determine a level difference between the first signal S1, which may be derived from a first reference microphone, and a second signal S2, which may be derived from a second reference microphone. For example, the first and second reference microphones may be first and second (left and right) reference microphones of a headset, earphones or earbuds and the level difference unit 602 may be configured to determine a cross ear level difference. In some embodiments, the level difference unit 602 is configured to determine a level difference between a first signal derived from a first error microphone and a second signal derived from a second error microphone. The level difference unit 602 may equally be able to determine a cross ear level difference from left and right error microphones. In some embodiments, the first error microphone is the error microphone 205 of system 200 and the second error microphone is the error microphone 206 of system 200.
Referring to
At 702, a first level of at least one relatively high frequency sub-band of a first input signal derived from a first microphone associated with the first speaker is determined. In some embodiments, the first input signal S1 in the frequency domain is grouped into two frequency sub-bands: a high frequency sub-band and a low frequency sub-band, and the first level is the level of the high frequency sub-band. The high frequency sub-band may be chosen to be greater than 2 kHz or greater than 3 kHz. In other embodiments, the level difference unit 602 may identify a high frequency sub-band having a frequency range greater than a threshold, e.g. greater than 2 kHz or greater than 3 kHz.
At 704, a second level of at least one relatively high frequency sub-band of a second input signal derived from a second microphone of the acoustic system is determined. The at least one relatively high frequency sub-band of the first input signal corresponds with the at least one relatively high frequency sub-band of the second input signal.
At 706, a first level difference between the first level and the second level is determined. In some embodiments, the first level difference is indicative of the dB level difference between the at least one relatively high frequency sub-band of the first and second signals. In some embodiments, the first level difference is feature Xi.
In some embodiments, the method 700 further comprises determining a second level difference between a level of at least one relatively low frequency sub-band of the first input signal and a corresponding relatively low frequency sub-band of the second input signal and determining a modified level difference by subtracting the second level difference from the first level difference. In such an embodiment, the likelihood or probability of feedback at the first speaker is determined based on the modified level difference. In some embodiments, the modified level difference is feature Xi.
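A minimal sketch of the level difference determination of method 700, including the optional low frequency correction, is given below. It assumes that each input frame is available as a list of per-bin powers with matching bin centre frequencies and that the level of a sub-band is the summed bin power expressed in dB; the function names, the 2 kHz split and the power floor are illustrative assumptions.

```python
import math


def band_level_db(power, freqs, lo_hz, hi_hz, floor=1e-12):
    """Level in dB of the summed power of bins with lo_hz <= f < hi_hz."""
    total = sum(p for p, f in zip(power, freqs) if lo_hz <= f < hi_hz)
    return 10.0 * math.log10(max(total, floor))  # floor avoids log of zero


def level_differences(power1, power2, freqs, split_hz=2000.0):
    """Return (first level difference, modified level difference) in dB.

    The first level difference compares the high frequency sub-bands of
    the two signals. The modified level difference subtracts the
    corresponding low frequency level difference from it.
    """
    top = float("inf")
    high = (band_level_db(power1, freqs, split_hz, top)
            - band_level_db(power2, freqs, split_hz, top))
    low = (band_level_db(power1, freqs, 0.0, split_hz)
           - band_level_db(power2, freqs, 0.0, split_hz))
    return high, high - low


# Hypothetical frame: howling raises only the high bins of the first signal.
freqs = [500.0, 1000.0, 3000.0, 4000.0]
first, modified = level_differences([1.0, 1.0, 100.0, 100.0],
                                    [1.0, 1.0, 1.0, 1.0], freqs)
print(first, modified)
```

Subtracting the low frequency difference tends to cancel broadband level offsets between the two microphones, so that the remaining difference is attributable to the high frequency sub-band.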
In some embodiments, the method 700 is performed on a first frame of data from the first input signal and a second frame of data from the second input signal. In some embodiments, prior to determining the first and second levels, the first and second frames of data are converted into the frequency domain and full-band calibration gains may be applied to the first and second frames of frequency domain data, as described above.
In audio systems comprising an error microphone and a reference microphone associated with a single speaker, for example the module 202 of
Accordingly, in some embodiments, in addition to or as an alternative to determining a level difference between two reference microphones or between two error microphones, i.e. a stereo level difference, the level difference unit 602 is configured to determine a level difference between a first signal derived from a first reference microphone and a second signal derived from a first error microphone. The first reference microphone and the first error microphone may be both associated with the same speaker. For example, the first reference microphone and the first error microphone may be associated with the same speaker of a headset, earphones or earbuds, such as the system 200 of
Referring to
At 802, a first group of multiple channels of a first input signal derived from a first microphone associated with a first speaker is determined.
At 804, a second group of multiple channels of a second input signal derived from a second microphone associated with the first speaker is determined.
At 806, a set of level differences between the first input signal and the second input signal is determined by determining a difference between the levels of corresponding channels of the first and second groups.
In some embodiments, the method 800 is performed on a first frame of data from the first input signal and a second frame of data from the second input signal. In some embodiments, prior to determining the first and second groups of multiple channels, the first and second frames of data are converted into the frequency domain. In some embodiments, full-band calibration gains are applied to the first and second groups of multiple channels to determine calibrated first and second groups and the set of level differences between the first input signal and the second input signal is determined by determining a difference between the dB level of corresponding channels of the calibrated first and second groups.
The decision function unit 604 is configured to determine a likelihood or probability of feedback at the first speaker based on the first level difference determined by the level difference unit 602 using process 700 and/or based on the set of level differences determined by process 800.
In some embodiments, determining the likelihood of feedback based on the set of level differences comprises determining a level difference feature Xi as the mean of the determined set of level differences minus the minimum value of the determined set of level differences, and determining the likelihood of feedback based on the level difference feature Xi.
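For illustration, the set of level differences of method 800 and the feature derived from it may be sketched as follows. The sketch assumes that the per-channel levels are already expressed in dB; the function names are hypothetical.

```python
def channel_level_differences(levels1_db, levels2_db):
    """Per-channel dB level differences between two groups of channels."""
    if len(levels1_db) != len(levels2_db):
        raise ValueError("both groups must have the same number of channels")
    return [a - b for a, b in zip(levels1_db, levels2_db)]


def level_difference_feature(level_diffs_db):
    """Feature Xi: mean of the level differences minus their minimum."""
    if not level_diffs_db:
        raise ValueError("at least one level difference is required")
    mean = sum(level_diffs_db) / len(level_diffs_db)
    return mean - min(level_diffs_db)


# Hypothetical channel levels in dB for the two microphones.
diffs = channel_level_differences([10.0, 12.0, 20.0], [8.0, 8.0, 8.0])
print(diffs, level_difference_feature(diffs))
```

Because the minimum is subtracted from the mean, a constant offset across all channels yields a feature of zero, whereas a difference concentrated in a few channels yields a larger value.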
In some embodiments, the decision function unit 604 employs logistic regression to determine whether the level difference features, Xi, detected by the level difference unit 602 are indicative of the presence of feedback noise such as howling at a speaker of the system.
A predictor function F(X) of the decision function unit 604 may be a linear combination of features Xi, where
Where f(X)=Σcoefi*Xi+intercept
and where coefi and intercept are the linear coefficients.
By applying the logistic function to the predictor function, F(X) is interpreted as the probability of ‘1’ (i.e. feedback being present) given a certain combination of the feature values.
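The following sketch shows one way a probability may be obtained from the linear predictor using the standard logistic function 1/(1+e^(−f)). The function names, the coefficient values and the 0.5 decision threshold used to form a binary flag are assumptions for illustration and are not values from the disclosure.

```python
import math


def linear_predictor(features, coefs, intercept):
    """f(X) = sum(coef_i * X_i) + intercept."""
    if len(features) != len(coefs):
        raise ValueError("features and coefs must have the same length")
    return sum(c * x for c, x in zip(coefs, features)) + intercept


def feedback_probability(features, coefs, intercept):
    """Apply the logistic function to f(X), in a numerically stable form."""
    f = linear_predictor(features, coefs, intercept)
    if f >= 0.0:
        return 1.0 / (1.0 + math.exp(-f))
    z = math.exp(f)  # avoids overflow for large negative f
    return z / (1.0 + z)


def feedback_flag(prob, threshold=0.5):
    """Binary flag: 1 if the probability reaches the assumed threshold."""
    return 1 if prob >= threshold else 0


# Hypothetical coefficients and a single level difference feature in dB.
p = feedback_probability([20.0], coefs=[0.5], intercept=-5.0)
print(p, feedback_flag(p))
```

Lowering the intercept shifts the predictor downwards, so that a larger level difference feature is needed before the probability crosses the threshold.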
In some embodiments, the linear coefficients may be derived from training data. The training data may comprise two groups of data, namely, data with feedback and data without feedback. For example, the data with feedback may be created by holding a headset in hand so as to make it howl and labelling any data above about 60 dB SPL as feedback data. The data without feedback may be created by recording the feature data in common false alarm situations, such as own voice, directional environmental sound, hand clapping, etc. and labelling that data as data without feedback. In some embodiments, the ratio between feedback data and no feedback data of the training data is about 1:1.
In some embodiments, the linear coefficients may be derived using a machine learning algorithm, such as the logistic regression implementation of the Python scikit-learn library (sklearn.linear_model.LogisticRegression). Adjustment of the intercept allows for the sensitivity of the detection algorithm to be adjusted as required.
In some embodiments, the decision function unit 604 is configured to output a binary flag Ff indicative of feedback, e.g. to the cross-ear mixing module 304.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.