The present disclosure relates to methods of and apparatus for compensating for ear occlusion.
Many hearing devices, such as headsets, hearing aids, and hearing protectors, have tightly sealing earbuds or earcups that occlude the ears and isolate the user from environmental noise. This isolation has two side effects when users want to listen to their own-voice (OV), such as when making a phone call or talking to a person nearby without taking the devices off their ears. One of the side effects is the passive loss (PL) at high frequency, which makes the user's own voice sound muffled to them. The other effect is the amplification of the user's own voice at low frequency, which makes their voice sound boomy to them. The amplification of a user's own voice at low frequency is commonly referred to as the occlusion effect (OE).
The OE occurs primarily below 1 kHz and is dependent on the ear canal structure of the user, the fitting tightness of the hearing device, and the phoneme being pronounced by the user. For example, for open vowels such as [a:], the OE is usually only several decibels (dB), whereas for close vowels such as [i:], the OE can be over 30 dB.
Feedback active noise cancellation (ANC) is a common method used in noise cancelling headphones to compensate for OE. Feedback ANC uses an internal microphone, located near the eardrum, and a headset speaker to form a feedback loop to cancel the sound near the eardrum. Using feedback ANC to counteract OE is described in U.S. Pat. Nos. 4,985,925 and 5,267,321, the content of each of which is hereby incorporated by reference in its entirety. The methods described in these patents require all of the parameters of the feedback ANC to be preset based on an average OE of a user. U.S. Pat. No. 9,020,160, the content of which is hereby incorporated by reference in its entirety, describes updating feedback loop variables of a feedback ANC filter to account for changes in phonemes being pronounced by a user.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims.
The present disclosure provides methods for restoring the naturalness of a user's own voice using novel signal analysis and processing.
According to an aspect of the disclosure, there is provided a method of equalising sound in a headset comprising an internal microphone configured to generate a first audio signal, an external microphone configured to generate a second audio signal, a speaker, and one or more processors coupled between the speaker, the external microphone, and the internal microphone, the method comprising: while the headset is worn by a user: determining a first audio transfer function between the first audio signal and the second audio signal in the presence of sound at the external microphone; and determining a second audio transfer function between a speaker input signal and the first audio signal with the speaker being driven by the speaker input signal; determining an electrical transfer function of the one or more processors; determining a closed-ear transfer function based on the first audio transfer function, the second audio transfer function and the electrical transfer function; and equalising the first audio signal based on a comparison between the closed-ear transfer function and an open-ear transfer function to generate an equalised first audio signal.
The comparison may be a frequency domain ratio between the closed-ear transfer function and the open-ear transfer function. The comparison may be a time-domain difference between the closed-ear transfer function and the open-ear transfer function.
The open-ear transfer function may be a measured open-ear transfer function between an ear-entrance and an eardrum of the user. Alternatively, the open-ear transfer function may be a measured open-ear transfer function between an ear-entrance and an eardrum of a head simulator. Alternatively, the open-ear transfer function may be an average open-ear transfer function of a portion of the general population.
The method may further comprise a) measuring the open-ear transfer function between an ear-entrance and an eardrum of the user; or b) measuring the open-ear transfer function between an ear-entrance and an eardrum of a head simulator; or c) determining the open-ear transfer function based on an average open-ear transfer function for a portion of the general population.
The step of determining the first audio transfer function may be performed with the speaker muted.
The step of determining the second audio transfer function may be performed in the presence of little or no sound external to the headset.
Determining the electrical path transfer function may comprise determining a frequency response of a feedforward ANC filter implemented by the one or more processors and/or a frequency response of a feedback ANC filter implemented by the one or more processors.
Determining the frequency response may comprise determining a gain associated with the one or more processors.
The method may further comprise determining an open-ear transfer function between an ear-entrance and an eardrum of the user, wherein the determining comprises approximating the open-ear transfer function of the user.
The method may further comprise outputting the equalised first audio signal to the speaker.
The method may further comprise: determining a third audio transfer function between the first audio signal and the second audio signal while the headset is worn by the user and the user is speaking; and further equalising the equalised first audio signal based on the third audio transfer function to generate a voice equalised first audio signal.
The method may further comprise, on determining that the user is speaking, outputting the voice equalised first audio signal to the speaker.
The method may further comprise determining that the one or more processors is implementing active noise cancellation (ANC); and adjusting the further equalisation to account for the one or more processors implementing ANC.
The method may further comprise requesting that the user speak a phoneme balanced sentence or phrase. The third audio transfer function may be determined while the user is speaking the phoneme balanced sentence.
According to another aspect of the disclosure, there is provided an apparatus, comprising: a headset comprising: an internal microphone configured to generate a first audio signal; an external microphone configured to generate a second audio signal; a speaker; and one or more processors configured to: while the headset is worn by a user: determine a first audio transfer function between the first audio signal and the second audio signal in the presence of sound at the external microphone; and determine a second audio transfer function between a speaker input signal and the first audio signal with the speaker being driven by the speaker input signal; determine an electrical transfer function of the one or more processors; determine a closed-ear transfer function based on the first audio transfer function, the second audio transfer function and the electrical transfer function; and equalise the first audio signal based on a comparison between the closed-ear transfer function and an open-ear transfer function to generate an equalised first audio signal.
The comparison may be a frequency domain ratio between the closed-ear transfer function and the open-ear transfer function. The comparison may be a time-domain difference between the closed-ear transfer function and the open-ear transfer function.
The open-ear transfer function may be a measured open-ear transfer function between an ear-entrance and an eardrum of the user. Alternatively, the open-ear transfer function may be a measured open-ear transfer function between an ear-entrance and an eardrum of a head simulator. Alternatively, the open-ear transfer function may be an average open-ear transfer function of a portion of the general population.
The one or more processors may be further configured to: a) measure the open-ear transfer function between an ear-entrance and an eardrum of the user; or b) measure the open-ear transfer function between an ear-entrance and an eardrum of a head simulator; or c) determine the open-ear transfer function based on an average open-ear transfer function for a portion of the general population.
The step of determining the first audio transfer function may be performed with the speaker muted.
The step of determining the second audio transfer function may be performed in the presence of little or no sound external to the headset.
Determining the electrical path transfer function may comprise determining a frequency response of a feedforward ANC filter implemented by the one or more processors and/or a frequency response of a feedback ANC filter implemented by the one or more processors.
Determining the electrical path transfer function may comprise determining a gain associated with the one or more processors.
Determining an open-ear transfer function between an ear-entrance and an eardrum of the user may comprise approximating the open-ear transfer function.
The one or more processors may be further configured to, on determining that the user is not speaking, output the equalised first audio signal to the speaker.
The one or more processors may be further configured to determine a third audio transfer function between the first audio signal and the second audio signal while the headset is worn by the user and the user is speaking; and further equalise the equalised first audio signal based on the third audio transfer function to generate a voice equalised first audio signal.
The one or more processors may be further configured to, on determining that the user is speaking, output the voice equalised first audio signal to the speaker.
The one or more processors may be further configured to determine that the one or more processors is implementing active noise cancellation (ANC); and adjust the further equalisation to account for the one or more processors implementing ANC.
The one or more processors may be further configured to output a request to the user to speak a phoneme balanced sentence or phrase, wherein the third audio transfer function is determined while the user is speaking the phoneme balanced sentence.
According to another aspect of the disclosure, there is provided a method of equalising sound in a headset comprising an internal microphone configured to generate a first audio signal, an external microphone configured to generate a second audio signal, a speaker, and one or more processors coupled between the speaker, the external microphone, and the internal microphone, the method comprising: determining a first audio transfer function between the first audio signal and the second audio signal while the headset is worn by a user and the user is speaking; and equalising the first audio signal based on the first audio transfer function.
The method may further comprise, on determining that the user is speaking, outputting the equalised first audio signal to the speaker.
The method may further comprise determining that the one or more processors is implementing active noise cancellation (ANC); and adjusting the equalisation to account for the ANC.
The method may further comprise requesting that the user speak a phoneme balanced sentence or phrase. The first audio transfer function may then be determined while the user is speaking the phoneme balanced sentence.
According to another aspect of the disclosure, there is provided an apparatus, comprising: a headset comprising: an internal microphone configured to generate a first audio signal; an external microphone configured to generate a second audio signal; a speaker; and one or more processors configured to: determine a first audio transfer function between the first audio signal and the second audio signal while the headset is worn by a user and the user is speaking; and equalise the first audio signal based on the first audio transfer function to generate an equalised first audio signal.
The one or more processors may be further configured to: on determining that the user is speaking, output the equalised first audio signal to the speaker.
The one or more processors may be further configured to: determine that the one or more processors is implementing active noise cancellation (ANC); and adjust the equalisation to account for the ANC.
The one or more processors may be further configured to: request that the user speak a phoneme balanced sentence or phrase, wherein the first audio transfer function is determined while the user is speaking the phoneme balanced sentence.
The headset may comprise one or more of the one or more processors.
According to another aspect of the disclosure, there is provided an electronic device comprising the apparatus as described above.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Embodiments of the present disclosure will now be described by way of non-limiting example only with reference to the accompanying drawings.
Isolation of the user's 100 eardrums from the external environment has two side effects when the user wants to listen to their own-voice (OV). One of the side effects is the passive loss (PL) at high frequency, which leads to a relatively attenuated high frequency sound at the user's eardrum.
Embodiments of the present disclosure relate to methods for a) restoring attenuated high frequency sounds, and b) attenuating low frequency components introduced due to the occlusion effect, with an aim of restoring the user's 100 voice such that, when wearing a headset, their voice sounds substantially as if they were not wearing the headset.
The inventors have also realised that high frequency attenuation due to passive loss occurs regardless of whether the user of the headset 200 is speaking, whereas low frequency boom occurs only when the user is speaking. Accordingly, in embodiments of the present disclosure, methods are presented to change equalisation in response to detecting that the user is speaking.
With the above in mind, equalisation for restoring the attenuated high frequency sounds may be referred to herein as hearing augmentation equalisation (HAEQ). Equalisation for restoring the low frequency components of sound introduced due to the occlusion effect may be referred to herein as delta hearing augmentation equalisation (dHAEQ).
The headset 200 comprises a first module 202 and a second module 204 which may, for example, be provided for the left and right ears of the user 100 respectively.
The first module 202 may comprise a digital signal processor (DSP) 212 configured to receive microphone signals from error and reference microphones 205, 208. The module 202 may further comprise a memory 214, which may be provided as a single component or as multiple components. The memory 214 may be provided for storing data and program instructions. The module 202 may further comprise a transceiver 216 to enable the module 202 to communicate wirelessly with external devices, such as the second module 204, smartphones, computers and the like. Such communications between the modules 202, 204 may in alternative embodiments comprise wired communications where suitable wires are provided between left and right sides of a headset, either directly such as within an overhead band, or via an intermediate device such as a smartphone. The module 202 may further comprise a voice activity detector (VAD) 218 configured to detect when the user is speaking. The module 202 may be powered by a battery and may comprise other sensors (not shown).
At step 502 an open-ear transfer function (i.e. a transfer function of the open ear (TFOE)) may be determined. The open-ear transfer function may be measured on the user, for example, by an audiologist using microphones positioned at the ear-entrance and the eardrum. Alternatively, the open-ear transfer function may be estimated based on an average open-ear transfer function of the general population. Alternatively, the open-ear transfer function of the user may be estimated based on a transfer function measured on a head simulator, such as a KEMAR (Knowles Electronic Manikin For Acoustic Research). Various methods of determining the open-ear transfer function are known in the art and so will not be explained further here. Where the open-ear transfer function is estimated based on population data or the like, the step 502 of determining the open-ear transfer function may be omitted or may simply comprise reading a stored open-ear transfer function from memory.
At step 504, a closed-ear transfer function for the user is determined. The closed-ear transfer function may be representative of the air-conduction and electrical-conduction paths present with the user 100 wearing the headset 200.
At step 506, a hearing augmentation EQ (HAEQ) may be determined based on a comparison between the open-ear transfer function and the determined closed-ear transfer function for the user 100 wearing the headset 200. For example, the HAEQ may be determined based on a ratio between the open-ear transfer function and the closed-ear transfer function (in the frequency domain) or based on a dB spectral difference between the open-ear and closed-ear transfer functions. This EQ represents the difference in sound reaching the eardrum of the user 100 when the user is wearing the headset 200 versus when the user is not wearing the headset 200 (i.e. the open-ear state).
After the HAEQ has been determined at step 506, HAEQ may be applied at step 508 to the input signal for the speaker 209 so as to restore the high frequency sound attenuated due to passive loss in the headset 200.
Determining Open-Ear Transfer Function
The determination of the open-ear transfer function according to exemplary embodiments of the present disclosure will now be described.
Referring to the open-ear scenario, the sound signal at the eardrum may be defined as:
ZED_O(f)=ZEE(f)·HO(f) (1.1)
Where:
ZED_O(f): sound signal at eardrum in open ear;
ZEE(f): sound signal at ear-entrance (whether open or closed-ear); and
HO(f): open-ear transfer function from ear-entrance to eardrum in open ear.
As mentioned above, in some embodiments ZED_O(f) and ZEE(f) may be recorded using a pair of measurement microphones, a first measurement microphone 602 and a second measurement microphone 604. The first measurement microphone 602 may be placed at the eardrum and the second measurement microphone 604 may be placed at the ear-entrance of the user 100. Preferably, the first and second microphones 602, 604 are matched, i.e. they have the same properties (including frequency response and sensitivity). As mentioned above, this process may be performed specifically on the user or, alternatively, data from the general population pertaining to the open-ear transfer function may be used to approximate the open-ear transfer function of the user 100.
The recorded electrical signals from the first and second microphones 602, 604 may be defined as:
XED_O(f)=ZED_O(f)·GMM1(f) (1.2)
XEE(f)=ZEE(f)·GMM2(f) (1.3)
Where GMM1(f) and GMM2(f) are the frequency responses of the first and second measurement microphones 602, 604 respectively. For a typical measurement microphone, the frequency response is flat and equal to a fixed factor qMM (conversion factor from physical sound signal to electrical digital signal) for frequencies between 10 Hz and 20 kHz. XED_O(f) is the electrical signal of the first measurement microphone 602 at the eardrum in open ear. This may be approximated using an ear of a KEMAR by using its eardrum microphone. When measuring the open-ear transfer function of the specific user 100, the first measurement microphone 602 may be a probe-tube microphone which can be inserted into the ear canal until it touches the eardrum of the user 100. XEE(f) is the electrical signal of the second measurement microphone 604 at the ear-entrance.
Provided the first and second measurement microphones 602, 604 are matched:
XED_O(f)/XEE(f)=ZED_O(f)/ZEE(f)=HO(f) (1.4)
So, HO(f) can be estimated by XED_O(f) and XEE(f) as:
HOE(f)=XED_O(f)/XEE(f) (1.5)
Where HOE(f) is the estimated open-ear transfer function from ear-entrance to eardrum in open ear.
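By way of illustration only, a transfer function estimate such as HOE(f) in equation (1.5) may be prototyped offline using a standard cross-spectral (H1) estimator, for example in Python with SciPy. The following is a minimal sketch, not the on-device implementation; the signal names x_ee and x_ed_o are assumed recordings from the ear-entrance microphone 604 and eardrum microphone 602.

from scipy.signal import csd, welch

def estimate_tf(x_in, x_out, fs, nperseg=4096):
    # H1 estimator: cross-spectrum of input and output divided by the
    # auto-spectrum of the input; robust to uncorrelated noise at the output.
    f, p_xy = csd(x_in, x_out, fs=fs, nperseg=nperseg)
    _, p_xx = welch(x_in, fs=fs, nperseg=nperseg)
    return f, p_xy / p_xx

# Open-ear transfer function estimate, equation (1.5):
# f, h_oe = estimate_tf(x_ee, x_ed_o, fs=48000)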
Determining Closed-Ear Transfer Function
In the closed-ear configuration, i.e. when the user 100 is wearing the headset 200, there exist both an air-conduction path between the ear-entrance and the eardrum (as was the case in the open-ear scenario described above) and an electrical-conduction path from the reference microphone 208, through the module 202, to the speaker 209 and the error microphone 205.
The sound signal ZED_C(f) at the eardrum in the closed-ear scenario may be defined as:
ZED_C(f)=ZEM(f)·HC2(f) (1.6)
Where:
ZED_C(f): sound signal at eardrum in closed ear;
ZEM(f): sound signal at the error microphone 205; and
HC2(f): transfer function from the error microphone 205 to the eardrum in closed ear.
The sound signal ZEM(f) at the error microphone 205 may be defined as:
ZEM(f)=ZEMa(f)+ZEMe(f) (1.7)
Where:
ZEMa(f): contribution to the sound signal at the error microphone 205 via the air-conduction path; and
ZEMe(f): contribution to the sound signal at the error microphone 205 via the electrical-conduction path (i.e. output by the speaker 209).
Embodiments of the present disclosure aim to estimate the sound signal ZEM(f) present at the error microphone 205 by first estimating the component ZEMa(f) due to air-conduction and then estimating the contribution ZEMe(f) due to the electrical properties of the module 202 (i.e. the processed electrical signal output to the speaker 209). The inventors have realised that not only is the air-conduction component dependent on the fit of the headset 200 on the user 100, but the electrical-conduction component ZEMe(f) is also dependent on both the fit of the headset 200 and the geometry of the ear canal of the user 100.
Determining ZEMa(f)
The acoustic transfer function from the ear-entrance to the eardrum in the closed-ear state (with the headset 200 worn by the user 100) may be defined as:
HC(f)=HP(f)·HC2(f) (1.8)
Where HP(f) is the transfer function of the sound signal from the ear-entrance to the error microphone 205, which corresponds to the passive loss of sound caused by the headset 200, and HC2(f) is the transfer function between the error microphone 205 and the eardrum.
The above equation (1.8) may be simplified by assuming that error microphone 205 is very close to the ear drum such that HC2(f)≈1 and therefore HC(f)≈HP(f).
With the above in mind and assuming that the reference microphone 208 is positioned substantially at the ear-entrance, the acoustic path transfer function HC(f) can be estimated by comparing the sound signal received at the reference microphone 208 with that at the error microphone 205 in-situ while the user 100 is wearing the headset 200. The air-conduction contribution ZEMa(f) to the sound signal at the error microphone 205 may be defined as:
ZEMa(f)=ZEE(f)·HP(f) (1.9)
The electrical signal XEMa(f) captured by the error microphone 205 may be defined as:
XEMa(f)=ZEMa(f)·GEM(f)=ZEE(f)·HP(f)·GEM(f) (1.10)
Where GEM(f) is the frequency response of the error microphone 205, which is typically flat and equal to a fixed factor qEM (conversion factor from physical sound signal to electrical digital signal) for frequencies between 100 Hz and 8 kHz for a MEMS microphone.
At step 806, the electrical signal XRM(f) generated by the reference microphone 208 may be captured. The ear-entrance sound signal ZEE(f) can be recorded by the reference microphone 208 as:
XRM(f)=ZEE(f)·GRM(f) (1.11)
Where GRM(f) is the frequency response of the reference microphone 208, which is typically flat and equal to a fixed factor qRM (conversion factor from physical sound signal to electrical digital signal) for frequencies between 100 Hz and 8 kHz for a MEMS microphone.
Assuming the frequency responses of the reference and error microphones 208, 205 are matched, i.e. GRM(f)≈GEM(f), then at step 808, the user specific acoustic transfer function HC(f) from the ear-entrance to the eardrum in the closed-ear state can be determined based on the captured electrical signals XEMa(f), XRM(f) from the error and reference microphones 205, 208 as:
HPE(f)=XEMa(f)/XRM(f)=HP(f)≈HC(f) (1.12)
Where HPE(f) is the estimated passive loss transfer function.
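The passive-loss estimate HPE(f) of equation (1.12) may be sketched in the same way, reusing the illustrative estimate_tf helper above (x_rm and x_em being assumed reference and error microphone captures, recorded with the speaker muted and external sound present):

def estimate_hpe(x_rm, x_em, fs):
    # Equation (1.12): HPE(f) = XEMa(f)/XRM(f), valid while the speaker is
    # muted so the error microphone picks up the air-conduction path only.
    return estimate_tf(x_rm, x_em, fs)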
Determining ZEMe(f)
The inventors have realised that with knowledge of the electrical characteristics of the processing between the reference microphone 208, the error microphone 205 and the speaker 209, the transfer function between the eardrum and ear entrance due to the electrical-conduction path may be determined by comparing the sound output at the speaker 209 and the same sound received at the error microphone 205.
At step 902, a signal is output to the speaker 209, preferably with any external sound muted so that there is no external sound contribution at the error microphone 205 due to the closed-ear acoustic-conduction path between the ear entrance and the eardrum. The speaker input signal XSI(f) is generated by processing electronics within the module 202.
With outside sound muted, the contribution to the sound signal ZEMe(f) at the error microphone 205 by the speaker 209 may be defined as:
ZEMe(f)=XSI(f)·GSK(f)·HS2(f) (1.13)
Where HS2(f) is the transfer function of the sound signal from the position at the output of the speaker 209 to the position of the error microphone 205, GSK(f) is the frequency response of the speaker 209, and XSI(f) is the speaker input signal.
The electrical signal output from the error microphone 205 may therefore be defined as:
XEMe(f)=ZEMe(f)·GEM(f)=XSI(f)·GSK(f)·HS2(f)·GEM(f) (1.14)
Where GEM(f) is the frequency response of the error microphone 205.
The sound signal at the headset speaker position can be estimated based on the speaker input signal XSI(f) and the frequency response of the speaker 209. The transfer function between the input signal at the speaker 209 and the error microphone 205 output signal may be defined as:
HSE(f)=XEMe(f)/XSI(f)=GSK(f)·HS2(f)·GEM(f) (1.15)
From the above equation, since GSK(f) and GEM(f) are fixed, HSE(f) will be directly proportional to HS2(f) for different ear canal geometries and different headset fits.
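Similarly, HSE(f) of equation (1.15) may be estimated by driving the speaker 209 with a known wide-band stimulus in quiet. A sketch under the same assumptions (the stimulus and capture names are illustrative):

import numpy as np

def estimate_hse(x_si, x_em, fs):
    # Equation (1.15): HSE(f) = XEMe(f)/XSI(f), measured with external sound
    # muted so the error microphone picks up the electrical path only.
    return estimate_tf(x_si, x_em, fs)

# e.g. a five second white-noise stimulus played through the speaker:
# x_si = np.random.default_rng(0).standard_normal(5 * 48000)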
The speaker input signal XSI(f) is defined by the back end processing implemented by the module 202. Accordingly, at step 906, the electrical characteristics of the module 202 used to generate the speaker input signal may be determined. In some embodiments, where the headset 200 is noise isolating only (i.e. no active noise cancellation (ANC)), the speaker input signal may be substantially unaffected by processing in the module 202. In some embodiments, however, the headset 200 may implement active noise cancellation, in which case the speaker input signal XSI(f) will be affected by feedforward and feedback filters as well as by hearing augmentation equalisation of the speaker input signal. In such cases, the speaker input signal XSI(f) may be defined as:
XSI(f)=XRM(f)HHA(f)−XRM(f)HW1(f)−XCE(f)HFB(f) (1.16)
XCE(f)=XEMe(f)−XRM(f)HHA(f)HSE(f)−XPB(f)HSE(f) (1.17)
Where:
HHA(f): frequency response of the hearing augmentation (HA) processing;
HW1(f): frequency response of the feedforward ANC filter;
HFB(f): frequency response of the feedback ANC filter;
XCE(f): cancellation error signal; and
XPB(f): playback signal.
Thus, at step 908, a transfer function is determined between the error microphone 205 signal, the reference microphone 208 signal and the speaker input signal based on the determined electrical characteristics of the module 202 and the acoustic coupling of the speaker 209 to the error microphone 205.
It is noted that if ANC is not being implemented by the headset, then there will be no feedback or feedforward filtering such that XSI(f)=XRM(f)HHA(f).
When HA is enabled, playback XPB(f) will usually be muted so that the user can hear the sound being restored to their eardrum from outside of the headset. Provided playback is muted, and therefore equal to zero, when the HA function is enabled, equation (1.17) becomes:
XCE(f)=XEMe(f)−XRM(f)HHA(f)HSE(f) (1.18)
Combining Acoustic-Conduction Path with Electrical-Conduction Path
The air-conduction and electrical-conduction components can be combined as follows:
XEM(f)=XEMa(f)+XEMe(f)=XRM(f)·HPE(f)+XSI(f)·HSE(f) (1.19)
Solving the feedback loop (equations (1.16) and (1.18)) for XEM(f) gives:
XEM(f)=XRM(f)·[(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))+HHA(f)HSE(f)] (1.20)
When ANC is perfect, equation (1.20) can be simplified as:
XEM_ANCperfect(f)=XRM(f)HHA(f)HSE(f) (1.21)
This means that the air-conduction contribution of outer-sound at the eardrum has been totally cancelled and only the electrical-conduction contribution (at the speaker 209) is left.
When ANC is muted, equation (1.20) can be simplified as:
XEM_ANCoff(f)=XRM(f)·[HPE(f)+HHA(f)HSE(f)] (1.22)
It is noted that when HPE(f) and HHA(f)HSE(f) have similar magnitude but different phase, their summation will produce a comb-filter effect. To reduce the comb-filter effect, it is preferable to ensure that the latency between the electrical-conduction path and air-conduction path is minimized.
Thus, methods described herein can be used to derive an EQ which takes into account the air-conduction path between the ear-entrance and the eardrum (using the reference-to-error microphone ratio), the electrical-conduction path within the headset module 202, and the air-conduction path between the speaker 209 and the error microphone 205. Since both air-conduction paths are dependent on headset fit and ear canal geometry, the present embodiments thus provide a technique for in-situ determination of a bespoke EQ for the user 100 of the headset 200.
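By way of a worked check on equation (1.20) as reconstructed above, the combined closed-ear response may be evaluated offline as follows. This is a minimal sketch (array names illustrative, one complex value per frequency bin), not the on-device implementation:

import numpy as np

def closed_ear_response(h_pe, h_ha, h_se, h_w1=None, h_fb=None):
    # XEM(f)/XRM(f) per equation (1.20). Pass h_w1=h_fb=None for ANC muted,
    # which reduces to equation (1.22); with ideal ANC the first term
    # vanishes, leaving equation (1.21).
    h_w1 = np.zeros_like(h_pe) if h_w1 is None else h_w1
    h_fb = np.zeros_like(h_pe) if h_fb is None else h_fb
    return (h_pe - h_w1 * h_se) / (1.0 + h_fb * h_se) + h_ha * h_se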
Derivation of HAEQ
Referring to step 506 of the process 500, the HA response HHA(f) is chosen such that the sound signal at the eardrum in the closed-ear state is equal to that in the open-ear state:
ZED_C(f)=ZED_O(f) (1.23)
So:
(XRM(f)/GEM(f))·[(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))+HHA(f)HSE(f)]·HC2(f)=ZEE(f)·HO(f) (1.24)
Assuming the error microphone is close to the eardrum, we have HC2(f)≈1. Provided the reference and error microphones 205, 208 have similar properties, XRM(f)/GEM(f)≈ZEE(f).
So, equation (1.24) can be simplified as:
HHA(f)=[HOE(f)−(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))]/HSE(f) (1.25)
If ANC is operating well,
(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))≈0 (1.26)
so equation (1.25) can be further simplified as:
HHA(f)≈HOE(f)/HSE(f) (1.27)
Thus, when ANC is operating efficiently, the reference and error microphones 208, 205 are matched, and the error microphone 205 is close to the eardrum of the user 100, HHA(f) will be determined only by HOE(f) and HSE(f).
Thus an HAEQ is determined which restores the sound signal ZED_C(f) at the eardrum of the user to the open-ear state.
It is noted that the frequency response HHA(f) applied at the speaker input can be further decomposed into a default fixed electrical frequency response HHAEE(f) and a tuneable frequency response (or equalizer) HHAEQ(f):
HHA(f)=HHAEE(f)·HHAEQ(f) (1.28)
Where HHAEE(f) is the default transfer function from the input to the output of HHA(f) when all filters (such as equalisation and noise cancellation) are disabled, and HHAEQ(f) is the equalisation for restoration of the open-ear condition at the eardrum of the user 100. Then, from equation (1.25):
HHAEQ(f)=[HOE(f)−(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))]/[HSE(f)·HHAEE(f)] (1.29)
Equation (1.29) above shows that HHAEQ(f) can be calculated directly after the measurement of HOE(f), HPE(f), HSE(f), and HHAEE(f) with the user 100 wearing the headset 200 (i.e. in-situ measurement), and the knowledge of current values of feedback and feedforward filters HW1(f) and HFB(f) from the headset 200.
The inventors have further realised that the effect of the EQ is substantially unaffected when phase is ignored. As such, the above equation (1.29) can be simplified as follows:
|HHAEQ(f)|=[|HOE(f)|−|HPE(f)−HW1(f)HSE(f)|/|1+HFB(f)HSE(f)|]/[|HSE(f)|·|HHAEE(f)|] (1.30)
It is noted that HHA(f) is preferably designed to restore/compensate but not to cancel sound signal at eardrum. So |HHAEQ(f)| should preferably not be negative. In equation (1.30), |HOE(f)| is always larger than or equal to |HPE(f)| (no matter whether ANC is switched on or off), so |HHAEQ(f)| should always be positive.
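A minimal sketch of the magnitude-only computation, assuming the reconstructed form of equation (1.30) above (array names illustrative; eps is a small constant guarding against division by zero):

import numpy as np

def compute_haeq_mag(h_oe, h_pe, h_se, h_haee, h_w1, h_fb, eps=1e-12):
    # |HHAEQ(f)| per equation (1.30); the subtracted term is the
    # ANC-attenuated passive path, which never exceeds |HOE(f)| (see above).
    residual = np.abs(h_pe - h_w1 * h_se) / np.abs(1.0 + h_fb * h_se)
    haeq = (np.abs(h_oe) - residual) / (np.abs(h_se) * np.abs(h_haee) + eps)
    return np.maximum(haeq, 0.0)  # restoration only: never negative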
In addition to the transfer functions referred to in equation (1.30), two additional transfer functions may be considered. The first may take into account a leakage path HLE(f) between the error microphone 205 and the reference microphone 208. The second may take into account the potential for feedback howling by estimating an open-loop transfer function of the module during feedback howling.
When the above referenced paths are considered:
Where HLE(f) is an estimation of the leakage path when outer-sound is muted, ANC is disabled, and the playback signal is output to the speaker 209.
The open-loop transfer function of the feedback howling system should be smaller than 1 to avoid the generation of feedback howling.
Application of HAEQ
Finally, referring back to the process 500, at step 508 the determined HAEQ may be applied to the input signal for the speaker 209 so as to restore the high frequency sound attenuated due to passive loss in the headset 200.
Derivation of dHAEQ for Own Voice
As mentioned above, the effect of blocking the ear with a headset such as the headset 200 described herein is the amplification of the user's 100 own voice at low frequency, which makes their voice sound boomy to them. This amplification is due to the transmission of the user's voice through the bone and muscle of their head, the so-called bone-conduction path. A determination of dHAEQ may be made in a similar manner to that described above with reference to the process 500.
An added complication in addressing low frequency amplification of own voice due to bone conduction is that bone conduction varies with the phoneme that the user 100 is speaking, since the location of resonance in the mouth changes for different phonemes being spoken. This means that the bone-conduction path is time-varying.
At step 1202 an open-ear transfer function of the user (i.e. a transfer function of the open ear (TFOE) of the user) may be determined. The open-ear transfer function of the user may be measured, estimated or otherwise determined in the same manner as described above.
At step 1204, a closed-ear transfer function for the user is determined. The closed-ear transfer function may be representative of the air-conduction, bone-conduction and electrical-conduction paths present with the user 100 wearing the headset 200 and speaking.
At step 1206, a hearing augmentation EQ, HHA(f), may be determined based on a comparison between the open-ear transfer function and the determined closed-ear transfer function for the user 100 wearing the headset 200. For example, the EQ may be determined based on a ratio between the open-ear transfer function and the closed-ear transfer function (in the frequency domain) or based on a dB spectral difference between the open-ear and closed-ear transfer functions. This EQ represents the difference in sound reaching the eardrum of the user 100 when the user is wearing the headset 200 and speaking versus when the user is not wearing the headset 200 (i.e. the open-ear state).
After the dHAEQ has been determined at step 1206, dHAEQ may be applied at step 1208 to the input signal for the speaker 209 so as to attenuate the low frequency sound reaching the eardrum due to own voice occlusion.
Determining Own-Voice Open-Ear Transfer Function
The determination of the open-ear transfer function for own-voice according to exemplary embodiments of the present disclosure will now be described.
Referring to the own-voice measurement arrangement, a first measurement microphone 1302 may be placed at the eardrum, a second measurement microphone 1304 at the ear-entrance, and a third measurement microphone 1306 at the mouth point of the user 100.
The acoustic-conduction (AC) path between the mouth and ear entrance of the user can be assumed to be approximately time-invariant. The sound signal at the ear-entrance can thus be defined as:
ZEE(f)=ZMP(f)HA(f) (2.1)
Where ZEE(f) is the sound signal at the ear-entrance, ZMP(f) is the sound signal of own-voice at the mouth point and HA(f) is the transfer function of the AC path between the mouth point and the ear-entrance while the user 100 is speaking.
HA(f) can be estimated using the second and third measurement microphones 1304, 1306 (one at the ear-entrance and the other at the mouth point of the user 100), giving:
HA(f)≈XEE(f)/XMP(f) (2.2)
Where XEE(f) and XMP(f) represent the electrical output signals at microphones 1304 and 1306 representing ZEE(f) and ZMP(f), respectively.
The AC and BC contributions ZED_Oa(f) and ZED_Ob(f,k) at the eardrum may be defined as:
ZED_Oa(f)=ZEE(f)HO(f) (2.3)
ZED_Ob(f,k)=ZMP(f)HB_O(f,k) (2.4)
Where:
ZED_Oa(f): AC contribution of the user's own-voice at the eardrum;
ZED_Ob(f,k): BC contribution of the user's own-voice at the eardrum;
HB_O(f,k): transfer function of the BC path from the mouth to the eardrum in open ear; and
k: time-varying index of the transfer function, which may change as different phonemes are pronounced by the user.
The transfer function of own-voice from ear-entrance to eardrum through the inverse of the AC path and then through the BC path in open ear may be defined as:
HAB_O(f,k)=HB_O(f,k)/HA(f) (2.5)
So, equation (2.4) becomes:
ZED_Ob(f,k)=ZEE(f)HAB_O(f,k) (2.6)
The summation of the AC and BC contributions to sound at the eardrum may then be defined as:
ZED_O(f,k)=ZED_Oa(f)+ZED_Ob(f,k)=ZEE(f)[HO(f)+HAB_O(f,k)] (2.7)
When ZED_O(f,k) and ZEE(f) are recorded by the first and second measurement microphones 1302, 1304 as XED_O(f,k) and XEE(f), and HO(f) has been estimated as with equation (1.4) above, HAB_O(f,k) can be estimated as:
HAB_O(f,k)≈XED_O(f,k)/XEE(f)−HOE(f) (2.8)
The ratio between the sound signal at the eardrum and the sound signal at the ear-entrance while the user 100 is speaking may be defined as:
RX_ED_O(f,k)=XED_O(f,k)/XEE(f) (2.9)
We can also define the ratio between the AC and BC contributions of the user's own-voice at the eardrum, RZ_ED_O(f,k), as:
RZ_ED_O(f,k)=ZED_Oa(f)/ZED_Ob(f,k)=HO(f)/HAB_O(f,k) (2.10)
RZ_ED_O(f,k) for different phonemes has been measured and estimated for the general population by previous researchers. The details of an example experimental measurement and estimation are described in Reinfeldt, S., Östli, P., Hakansson, B., & Stenfelt, S. (2010) “Hearing one's own voice during phoneme vocalization—Transmission by air and bone conduction”. The Journal of the Acoustical Society of America, 128(2), 751-762, the contents of which is hereby incorporated by reference in its entirety.
Determining Own-Voice Closed-Ear Transfer Function
Referring again to the closed-ear configuration, an additional air-conduction path exists between the speaker 209 and the error microphone 205, denoted by HS2(f).
In the own-voice closed-ear configuration, i.e. when the user 100 is wearing the headset 200 and is speaking, in addition to the air-conduction and bone-conduction paths which were also present in the open-ear scenario, an electrical-conduction (EC) path exists through the module 202 and the speaker 209.
The analysis of the AC and EC path contributions for own-voice is the same as that described above. Including the BC contribution, the electrical signal at the error microphone 205 while the user 100 is speaking may be defined as:
XEM(f,k)=XRM(f)·[HAB_C1(f,k)+HPE(f)]+XSI(f)·HSE(f) (2.11)
Taking the feedforward and feedback ANC processing into account, as in equation (1.20):
XEM(f,k)=XRM(f)·[(HAB_C1(f,k)+HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))+HHA(f)HSE(f)] (2.12)
Where HAB_C1(f,k) is the transfer function of own-voice from ear-entrance to the position of the error microphone 205 through the inverse of the AC path (i.e. ear entrance to mouth point) and then the BC path in closed ear; k is the time-varying index of the transfer function, which may change as different phonemes are pronounced by the user—different phonemes result in different vocal tract and mouth shapes.
HAB_C1(f,k) may be defined, analogously to equation (2.5), as:
HAB_C1(f,k)=HB_C1(f,k)/HA(f)
Where HB_C1(f,k) is the transfer function of the BC path from the mouth to the position of the error microphone 205 for own-voice; k is the time-varying index of the transfer function, which may change as different phonemes are pronounced by the user. At frequencies of less than around 1 kHz, HB_C1(f,k) is usually much larger than HB_O(f,k) due to the occlusion effect.
When the output at the speaker 209 is muted, equation (2.11) becomes:
XEM_ANCoffHAoff(f,k)=XRM(f)·[HAB_C1(f,k)+HPE(f)] (2.13)
So HAB_C1(f,k) can be estimated as:
HAB_C1(f,k)≈XEM_ANCoffHAoff(f,k)/XRM(f)−HPE(f) (2.14)
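A corresponding sketch for equation (2.14), again reusing the illustrative estimate_tf helper above (x_rm and x_em captured while the user 100 speaks, with the speaker muted and ANC and HA off; h_pe is the previously estimated passive path):

def estimate_hab_c1(x_rm, x_em, h_pe, fs):
    # Equation (2.14): total own-voice transfer from reference to error
    # microphone minus the previously estimated passive path HPE(f).
    f, h_total = estimate_tf(x_rm, x_em, fs)
    return f, h_total - h_pe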
Assuming ANC in the module 202 is functioning well, equation (2.12) can be simplified as:
XEM_ANCperfect(f,k)≈XRM(f)HHA(f)HSE(f) (2.15)
This means that both AC and BC contributions of the user's 100 own-voice have been totally cancelled at the eardrum and only the EC contribution is left.
When ANC is muted, equation (2.12) can be simplified as:
XEM_ANCoff(f)=XRM(f)·[HAB_C1(f,k)+HPE(f)+HHA(f)HSE(f)] (2.16)
Because of the occlusion effect, for frequencies below 1 kHz, HAB_C1(f,k) is much larger than HPE(f) and HHA(f)HSE(f) in equation (2.16).
Derivation of dHAEQ for Own-Voice
Referring to step 1206 of the process 1200, the HA response HHA(f,k) is chosen such that the sound signal at the eardrum in the closed-ear state, with the user 100 speaking, matches that in the open-ear state.
We have:
ZED_C(f,k)=ZED_O(f,k) (2.17)
Substituting equations (2.7) and (2.12):
(XRM(f)/GEM(f))·[(HAB_C1(f,k)+HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))+HHA(f,k)HSE(f)]·HC2(f)=ZEE(f)·[HO(f)+HAB_O(f,k)] (2.18)
Assuming the error microphone 205 is positioned close to the eardrum, HC2(f)≈1. Then, provided the error and reference microphones 205, 208 are substantially matched, XRM(f)/GEM(f)≈ZEE(f).
So, equation (2.18) can be simplified as:
HHA(f,k)=[HOE(f)+HAB_O(f,k)−(HAB_C1(f,k)+HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))]/HSE(f) (2.19)
As discussed previously with reference to equation (1.25), HHA(f) for outer sound (i.e. external sound not from the user's voice) is always positive. However, HHA(f) for own-voice calculated by equation (2.19) may be negative in some circumstances. This is because HAB_C1(f,k) can be 30 dB larger than HAB_O(f,k). Even when ANC is on in the headset 200, the attenuation [1+HFB(f)HSE(f)] on HAB_C1(f,k) is usually less than 30 dB.
Equation (2.19) can be further rewritten as the product of one term which is the same as equation (1.25) above and another term:
HHA(f,k)=HHAforOS(f)·HdHAEQ(f,k) (2.20)
Where HHAforOS(f) is HHA(f) for outer-sound as described in equation (1.25).
The product term HdHAEQ(f,k) in equation (2.20) may be defined as:
HdHAEQ(f,k)=[HOE(f)+HAB_O(f,k)−(HAB_C1(f,k)+HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))]/[HOE(f)−(HPE(f)−HW1(f)HSE(f))/(1+HFB(f)HSE(f))] (2.21)
From equation (2.21) we can see that when there is no own-voice, HdHAEQ(f,k) becomes 1, and HHA(f,k) will become HHAforOS(f). Thus, HdHAEQ(f,k) represents the additional equalisation required to account for own-voice low frequency boost at the user's eardrum. As the occlusion effect mainly occurs at low frequencies, HdHAEQ(f,k) may only be applied at frequencies below a low frequency threshold. In some embodiments, HdHAEQ(f,k) may be applied at frequencies below 2000 Hz, or below 1500 Hz, or below 1000 Hz or below 500 Hz.
When ANC is functioning well, equation (2.21) can be simplified as:
HdHAEQ(f,k)≈[HOE(f)+HAB_O(f,k)]/HOE(f)=RX_ED_O(f,k)/HOE(f) (2.22)
RX_ED_O(f,k) (as defined in equation (2.9)) is the ratio between the output of the error microphone 205 (i.e. the microphone recording at the eardrum) and the output of the reference microphone (i.e. approximately at the ear-entrance of own-voice in open ear).
When ANC is performing well enough to cancel the AC path but not the BC path (the most likely case), equation (2.21) can be simplified as:
HdHAEQ(f,k)≈[HOE(f)+HAB_O(f,k)−HAB_C1(f,k)/(1+HFB(f)HSE(f))]/HOE(f) (2.23)
When ANC and HA are on, and HHA(f,k) is set as HHAforOS(f,k), we have:
XEM_ANConHAon(f,k)≈XRM(f)·[HAB_C1(f,k)/(1+HFB(f)HSE(f))+HHAforOS(f)HSE(f)] (2.24)
We can define:
RX_EM_ANConHAon(f,k)=XEM_ANConHAon(f,k)/XRM(f) (2.25)
So, equation (2.23) can be rewritten as:
HdHAEQ(f,k)≈RX_ED_O(f,k)−RX_EM_ANConHAon(f,k)+1 (2.26)
It is noted that RX_ED_O(f,k) and RX_EM_ANConHAon(f,k) in equation (2.26) will always be larger than 1. Additionally, both RX_ED_O(f,k) and RX_EM_ANConHAon(f,k) are time-varying for different phonemes. Because RX_ED_O(f,k) needs to be recorded in open ear but RX_EM_ANConHAon(f,k) needs to be recorded in closed ear with the user 100 wearing the headset 200, it is difficult to record both in-situ at the same time. Accordingly, in some embodiments, to approximate RX_ED_O(f,k) and RX_EM_ANConHAon(f,k), during calibration the user 100 may be asked to read a sentence, preferably a phoneme-balanced sentence, both in the open-ear configuration and in the closed-ear configuration (i.e. whilst wearing the headset 200 with ANC and HA enabled). An average of the ratios R̂X_ED_O(f) and R̂X_EM_ANConHAon(f) may then be determined across the phoneme-balanced sentence.
Accordingly, HdHAEQ(f,k) may be fixed as:
ĤdHAEQ(f)=R̂X_ED_O(f)−R̂X_EM_ANConHAon(f)+1 (2.27)
It is further noted that the HA block is designed to compensate for, but not to cancel, the sound signal at the eardrum, so ĤdHAEQ(f) should be limited to values larger than zero, for example at least 0.01 as shown below:
ĤdHAEQ(f)=max{0.01,[R̂X_ED_O(f)−R̂X_EM_ANConHAon(f)+1]} (2.28)
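By way of illustration, the fixed ĤdHAEQ(f) of equation (2.28) may be computed from frame-wise magnitude ratios averaged across the phoneme-balanced sentence. A sketch only; r_ed_o_frames and r_em_frames are assumed two-dimensional arrays (frames by frequency bins):

import numpy as np

def fixed_dhaeq(r_ed_o_frames, r_em_frames):
    # Average the open-ear and closed-ear (ANC and HA on) ratios over the
    # phoneme-balanced sentence, then apply equation (2.28) with a 0.01
    # floor so that the EQ compensates but never cancels.
    r_ed_o = np.mean(r_ed_o_frames, axis=0)
    r_em = np.mean(r_em_frames, axis=0)
    return np.maximum(0.01, r_ed_o - r_em + 1.0)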
The inventors have further discovered that the following equation provides a good approximation for HdHAEQ(f,k) and ĤdHAEQ(f):
ĤdHAEQ(f)≈XRM(f)/XEM_ANConHAon(f) (2.29)
In other words, ĤdHAEQ(f) can be approximated as the ratio between the electrical output of the reference microphone and the electrical output at the error microphone when ANC and HA are switched on.
Application of dHAEQ
Finally, referring back to the process 1200, at step 1208 the determined dHAEQ may be applied to the input signal for the speaker 209 so as to attenuate the low frequency boost at the eardrum due to the occlusion effect.
As mentioned above, whether using HdHAEQ(f,k), ĤdHAEQ(f) or an approximation thereof, this equalisation is only required when the user is speaking. Preferably, therefore, the headset 200 may be configured to determine when the user 100 is speaking so that the total EQ applied by the HA block, i.e. HHA(f) or HHA(f,k), can be switched between HHAEQ(f) (i.e. EQ for restoring HF attenuation due to passive loss) and HHAEQ(f)+HdHAEQ(f) (i.e. the combination of EQ for restoring HF attenuation and EQ for removing LF boom due to the occlusion effect). To do so, the voice activity detector (VAD) 218 may be configured to provide the module 202 with a determination (e.g. flag or probability) of voice activity so that dHAEQ can be switched on and off.
At step 1602, the HAEQ may be determined as described above.
At step 1604, the dHAEQ may be determined as described above.
At step 1606, the DSP 212 may be configured to make a determination as to whether the user 100 is speaking based on an output received from the VAD 218.
If it is determined that the user 100 is not speaking, then the process 1600 continues to step 1608 and the DSP 212 implements the HA block HHA to include HHAEQ only, so as to restore the attenuated high frequency sound lost due to passive loss in the closed-ear state. The process then returns to step 1606, where the determination of whether the user 100 is speaking is repeated.
If, however, it is determined that the user 100 is speaking, then the process 1600 continues to step 1610 and the DSP 212 implements the HA block HHA to include HHAEQ and HdHAEQ so as to both restore the attenuated high frequency sound lost due to passive loss in the closed-ear state and suppress the low frequency boost due to the occlusion effect while the user is speaking.
It is noted that since the occlusion effect occurs only at low frequencies, e.g. lower than around 1 kHz, the dHAEQ is preferably only applied at frequencies at which it is required, so as to minimize distortion in the signal output to the speaker 209.
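A sketch of the switching logic of process 1600, assuming the two EQ stages cascade multiplicatively in the linear domain (equivalently, add in dB) per the product form of equation (2.20), and that dHAEQ is applied only below a low-frequency cutoff (names illustrative):

import numpy as np

def total_eq(haeq, dhaeq, freqs, voice_active, lf_cutoff_hz=1000.0):
    # HAEQ alone when the VAD reports silence; HAEQ combined with dHAEQ,
    # below the low-frequency cutoff only, when own-voice is detected.
    eq = np.array(haeq, copy=True)
    if voice_active:
        lf = freqs < lf_cutoff_hz
        eq[lf] *= dhaeq[lf]
    return eq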
It is noted that whilst it may be preferable to account for both high frequency attenuation and low frequency boost (due to bone conduction), embodiments of the present disclosure are not limited to doing so. For example, in some embodiments, the headset 200 may be configured to implement the HA block so as to equalise for high frequency attenuation and not low frequency (occlusion effect) boost. Equally, in some embodiments, the headset 200 may be configured to implement the HA block so as to equalise for low frequency (occlusion effect) boost and not high frequency attenuation.
Embodiments described herein may be implemented in an electronic, portable and/or battery powered host device such as a smartphone, an audio player, a mobile or cellular phone, or a handset. Embodiments may be implemented on one or more integrated circuits provided within such a host device. Alternatively, embodiments may be implemented in a personal audio device configurable to provide audio playback to a single person, such as a smartphone, a mobile or cellular phone, headphones, earphones, etc.
Again, embodiments may be implemented on one or more integrated circuits provided within such a personal audio device. In yet further alternatives, embodiments may be implemented in a combination of a host device and a personal audio device. For example, embodiments may be implemented in one or more integrated circuits provided within the personal audio device, and one or more integrated circuits provided within the host device.
It should be understood—especially by those having ordinary skill in the art with the benefit of this disclosure—that the various operations described herein, particularly in connection with the figures, may be implemented by other circuitry or other hardware components. The order in which each operation of a given method is performed may be changed, and various elements of the systems illustrated herein may be added, reordered, combined, omitted, modified, etc. It is intended that this disclosure embrace all such modifications and changes and, accordingly, the above description should be regarded in an illustrative rather than a restrictive sense.
Similarly, although this disclosure makes reference to specific embodiments, certain modifications and changes can be made to those embodiments without departing from the scope and coverage of this disclosure. Moreover, any benefits, advantages, or solutions to problems that are described herein with regard to specific embodiments are not intended to be construed as a critical, required, or essential feature or element.
Further embodiments and implementations likewise, with the benefit of this disclosure, will be apparent to those having ordinary skill in the art, and such embodiments should be deemed as being encompassed herein. Further, those having ordinary skill in the art will recognize that various equivalent techniques may be applied in lieu of, or in conjunction with, the discussed embodiments, and all such equivalents should be deemed as being encompassed by the present disclosure.
The skilled person will recognise that some aspects of the above-described apparatus and methods, for example the discovery and configuration methods may be embodied as processor control code, for example on a non-volatile carrier medium such as a disk, CD- or DVD-ROM, programmed memory such as read only memory (Firmware), or on a data carrier such as an optical or electrical signal carrier. For many applications embodiments of the disclosure will be implemented on a DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array). Thus the code may comprise conventional program code or microcode or, for example code for setting up or controlling an ASIC or FPGA. The code may also comprise code for dynamically configuring re-configurable apparatus such as re-programmable logic gate arrays. Similarly the code may comprise code for a hardware description language such as Verilog™ or VHDL (Very high speed integrated circuit Hardware Description Language). As the skilled person will appreciate, the code may be distributed between a plurality of coupled components in communication with one another. Where appropriate, the embodiments may also be implemented using code running on a field-(re)programmable analogue array or similar device in order to configure analogue hardware.
Note that as used herein the term module shall be used to refer to a functional unit or block which may be implemented at least partly by dedicated hardware components such as custom defined circuitry and/or at least partly be implemented by one or more software processors or appropriate code running on a suitable general purpose processor or the like. A module may itself comprise other modules or functional units. A module may be provided by multiple components or sub-modules which need not be co-located and could be provided on different integrated circuits and/or running on different processors.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims or embodiments. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim or embodiment, “a” or “an” does not exclude a plurality, and a single feature or other unit may fulfil the functions of several units recited in the claims or embodiments. Any reference numerals or labels in the claims or embodiments shall not be construed so as to limit their scope.
Although the present disclosure and certain representative advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope of the disclosure as defined by the appended claims or embodiments. Moreover, the scope of the present disclosure is not intended to be limited to the particular embodiments of the process, machine, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding embodiments herein may be utilized. Accordingly, the appended claims or embodiments are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
Number | Name | Date | Kind |
---|---|---|---
4985925 | Langberg et al. | Jan 1991 | A |
5267321 | Langberg | Nov 1993 | A |
9020160 | Gauger, Jr. | Apr 2015 | B2 |
20120170766 | Alves | Jul 2012 | A1 |
20170148428 | Thuy | May 2017 | A1 |
20190043518 | Li | Feb 2019 | A1 |