The present invention relates to headsets, and in particular to a headset configured to determine whether or not the headset is in place on or in the ear of a user, and a method for making such a determination.
Headsets are popular devices for delivering sound to one or both ears of a user, such as playback of music or audio files or telephony signals. Headsets typically also capture sound from the surrounding environment, such as the user's voice for voice recording or telephony, or background noise signals to be used to enhance signal processing by the device. Headsets can provide a wide range of signal processing functions.
For example, one such function is Active Noise Cancellation (ANC, also known as active noise control) which combines a noise cancelling signal with a playback signal and outputs the combined signal via a speaker, so that the noise cancelling signal component acoustically cancels ambient noise and the user only or primarily hears the playback signal of interest. ANC processing typically takes as inputs an ambient noise signal provided by a reference (feed-forward) microphone, and an error signal provided by an error (feed-back) microphone. ANC processing consumes appreciable power continuously, even if the headset is taken off.
Thus in ANC, and similarly in many other signal processing functions of a headset, it is desirable to have knowledge of whether the headset is being worn at any particular time. For example, it is desirable to know whether on-ear headsets are placed on or over the pinna(e) of the user, and whether earbud headsets have been placed within the ear canal(s) or concha(e) of the user. Both such use cases are referred to herein as the respective headset being “on ear”. The unused state, such as when a headset is carried around the user's neck or removed entirely, is referred to herein as being “off ear”.
Previous approaches to on ear detection include the use of dedicated sensors such as capacitive, optical or infrared sensors, which can detect when the headset is brought onto or close to the ear. However, to provide such non-acoustic sensors adds hardware cost and adds to power consumption. Another previous approach to on ear detection is to provide a sense microphone positioned to detect acoustic sound inside the headset when worn, on the basis that acoustic reverberation inside the ear canal and/or pinna will cause a detectable rise in power of the sense microphone signal as compared to when the headset is not on ear. However, the sense microphone signal power can be affected by noise sources such as wind noise, and so this approach can output a false positive that the headset is on ear when in fact the headset is off ear and affected by noise. These and other approaches to on ear detection can also output false positives when the headset is held in the user's hand, placed in a box, or the like.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
In this specification, a statement that an element may be “at least one of” a list of options is to be understood that the element may be any one of the listed options, or may be any combination of two or more of the listed options.
A signal processing device for on ear detection for a headset, the device comprising:
a plurality of inputs for receiving respective microphone signals from a plurality of microphones; and
a processor configured to derive from the microphone signals a plurality of signal feature measures, the processor further configured to normalise the signal feature measures; the processor further configured to variably weight the signal feature measures in response to detected signal conditions in the microphone signals; the processor further configured to combine the variably weighted normalized signal feature measures to produce an output indication of whether a headset is on ear.
A method for on ear detection for a headset, the method comprising:
receiving respective microphone signals from a plurality of microphones;
deriving from the microphone signals a plurality of signal feature measures;
normalising the signal feature measures;
variably weighting the signal feature measures in response to detected signal conditions in the microphone signals; and
combining the variably weighted normalized signal feature measures to produce an output indication of whether a headset is on ear.
A non-transitory computer readable medium for on ear detection for a headset, comprising instructions which, when executed by one or more processors, cause performance of the following:
receiving respective microphone signals from a plurality of microphones;
deriving from the microphone signals a plurality of signal feature measures;
normalising the signal feature measures;
variably weighting the signal feature measures in response to detected signal conditions in the microphone signals; and
combining the variably weighted normalized signal feature measures to produce an output indication of whether a headset is on ear.
A system for on ear detection for a headset, the system comprising a processor and a memory, the memory containing instructions executable by the processor and wherein the system is operative to:
receive respective microphone signals from a plurality of microphones;
derive from the microphone signals a plurality of signal feature measures;
normalise the signal feature measures;
variably weight the signal feature measures in response to detected signal conditions in the microphone signals; and
combine the variably weighted normalized signal feature measures to produce an output indication of whether a headset is on ear.
In some embodiments of the invention, the detected signal conditions comprise signal presence indicators respectively indicating whether a signal is present on the microphone signals.
In some embodiments of the invention the processor is configured to normalise the signal feature measures by applying a non-linear mapping of each signal feature measure to a unitless reference scale. The non-linear mapping could for example comprise a sigmoid function or a piecewise linear function. The unitless reference scale in some embodiments outputs a value between 0 and 1, inclusive, while in other embodiments may output a value between −1 and 1, inclusive.
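The piecewise linear alternative mentioned above can be sketched as follows. This is a minimal illustration only: the function names and the breakpoints `lo` and `hi` are assumptions introduced here, not values taken from the specification.

```python
def piecewise_linear(x, lo, hi):
    """Map a raw feature x onto the unitless scale [0, 1].

    Values at or below `lo` map to 0, values at or above `hi` map to 1,
    and values in between are interpolated linearly. The breakpoints are
    hypothetical tuning parameters, one pair per feature.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def to_bipolar(m):
    """Rescale a [0, 1] value onto the alternative [-1, 1] reference scale."""
    return 2.0 * m - 1.0


if __name__ == "__main__":
    # Example: a feature measured in dB with assumed breakpoints 0 dB and 20 dB.
    for raw_db in (-5.0, 10.0, 40.0):
        m = piecewise_linear(raw_db, lo=0.0, hi=20.0)
        print(raw_db, m, to_bipolar(m))
```

Clamping at the breakpoints is what limits the influence of outliers: a feature value far outside the expected range contributes no more than a value at the edge of the range.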
The plurality of signal feature measures in some embodiments may comprise a signal feature reflecting passive loss, being the attenuation in an external sound level. In some embodiments of the invention, greater weight is given to the normalized passive loss signal feature measure when playback is quiet and ambient noise is not quiet.
Additionally or alternatively, the plurality of signal feature measures in some embodiments may comprise a signal feature reflecting occlusion gain, being the increase in sound level which occurs when the earbud is on ear. In some embodiments of the invention, greater weight is given to the normalized occlusion gain signal feature measure when playback is not quiet and ambient noise is quiet.
In some embodiments of the invention, the processor is configured to create an inaudible acoustic probe signal for playback. For example, a memory storage may be provided, storing data which defines a plurality of distinct probe signals, each probe signal corresponding to a respective detected signal condition. The plurality of signal feature measures may comprise a signal feature reflecting probe amplitude, being the observed amplitude of the inaudible probe signal when played back. An amplitude of the probe signal may be estimated by state estimation. In some embodiments of the invention, greater weight is given to the normalized probe amplitude signal feature measure when playback and ambient noise are quiet.
In some embodiments of the invention a control module is configured to select a weighting to be applied to the signal feature measures based on the detected signal conditions in the microphone signals. In some embodiments of the invention a memory storage is provided, storing predefined signal feature weightings to be applied to the signal feature measures, each predefined signal feature weighting corresponding to a respective detected signal condition.
In some embodiments of the invention a linear combiner is provided, for multiplying the signal feature measures by respective variable weights. In some embodiments the linear combiner is further configured to produce a soft decision whether a headset is on ear by summing the products of the signal feature measures with the respective variable weights.
In some embodiments of the invention, at least one signal processing function of the device is altered in response to a determination that the headset is not on ear. For example the signal processing function might be active noise cancellation (ANC), and the ANC might be disabled when the headset is not on ear. The plurality of microphones in some embodiments might comprise an error microphone and a reference microphone, wherein the respective microphone signals from the error microphone and the reference microphone are further used to implement the active noise cancellation.
The output indication of whether a headset is on ear in some embodiments is a soft decision representing a probability that the headset is on ear. The output indication of whether a headset is on ear in some embodiments is a hard binary decision.
In some embodiments of the invention the processor is configured to normalise the signal features before variably weighting the signal feature measures. In some embodiments of the invention the processor is configured to normalise the signal features simultaneously with or after variably weighting the signal feature measures.
In some embodiments of the invention the processor is further configured to statically weight at least one signal feature measure, and the statically weighted signal feature measure is also combined with the variably weighted normalized signal feature measures to produce the output indication of whether a headset is on ear. In some embodiments of the invention the processor is configured to statically weight at least one signal feature measure in accordance with a user input. In some embodiments of the invention the processor is configured to statically weight at least one signal feature measure by a fixed proportion, or by an averaging step.
An example of the invention will now be described with reference to the accompanying drawings, in which:
Corresponding reference characters indicate corresponding components throughout the drawings.
Processor 124 is further configured to adapt the handling of such audio processing functions in response to one or both earbuds being positioned on the ear, or being removed from the ear. Earbud 120 further comprises a memory 125, which may in practice be provided as a single component or as multiple components. The memory 125 is provided for storing data and program instructions. Earbud 120 further comprises a transceiver 126, which is provided for allowing the earbud 120 to communicate wirelessly with external devices, including earbud 150. Such communications between the earbuds may alternatively comprise wired communications in alternative embodiments where suitable wires are provided between left and right sides of a headset, either directly such as within an overhead band, or via an intermediate device such as a smartphone. Earbud 120 further comprises a speaker 128 to deliver sound to the ear canal of the user. Earbud 120 is powered by a battery and may comprise other sensors (not shown).
In accordance with the present embodiment of the invention, processor 124 of earbud 120 executes an on ear detector 130, or OEDL, in order to acoustically detect whether the earbud 120 is on or in the ear of the user. Earbud 150 executes an equivalent OEDR 160. In this embodiment, the output of the respective on ear detector 130, 160 is passed as an enable or disable signal to a respective acoustic probe generator GENL, GENR. When enabled, the acoustic probe generator creates an inaudible acoustic probe signal UIL, UIR, to be summed with the respective playback audio signal. The output of the respective on ear detector 130, 160 is also passed as a signal DL, DR to a Decision Combiner 180 which produces an overall on ear decision DΣ.
In the following passages, i=L [left] or R [right]. As shown in
If enabled, the inaudible probe generator GENi generates an inaudible probe signal, which is used for OED when other features are found to be unreliable by the Control module 300. The inaudible probe signal is made to be inaudible by ensuring that its spectral content, BIPS, is situated below a suitable threshold considered to be the lower limit of the human audible frequency range. In this case 20 Hz>BIPS.
The inaudible probe may be a continuous stationary signal or its parameters may vary with time. The properties of the probe signal (e.g. frequency, phase, amplitude, spectral shape) may be varied depending on a preconfigured sequence or in response to the signals on the other sensors. For example, if the Control module 300 determines that there is a large amount of ambient activity at the same frequencies as the probe, the probe may be correspondingly adjusted to occur at quieter frequencies in order to improve on ear detection.
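A probe generator of this kind might be sketched as below. The sketch assumes a single sinusoidal tone and a hypothetical table of ambient power per candidate frequency; the function names, sample rate and candidate frequencies are illustrative and do not come from the specification.

```python
import math

AUDIBLE_LIMIT_HZ = 20.0  # lower limit of the audible range used in the text


def make_probe(freq_hz, amplitude, sample_rate_hz, duration_s, phase_rad=0.0):
    """Generate a single-tone probe whose frequency lies below 20 Hz."""
    if not 0.0 < freq_hz < AUDIBLE_LIMIT_HZ:
        raise ValueError("probe frequency must lie below the audible limit")
    n = int(round(sample_rate_hz * duration_s))
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [amplitude * math.sin(w * i + phase_rad) for i in range(n)]


def pick_quietest(candidates_hz, ambient_power_by_freq):
    """Pick the candidate probe frequency with the least ambient activity.

    `ambient_power_by_freq` is a hypothetical mapping from candidate
    frequency to measured ambient power at that frequency.
    """
    return min(candidates_hz, key=lambda f: ambient_power_by_freq.get(f, 0.0))


if __name__ == "__main__":
    ambient = {10.0: 3.0, 15.0: 0.2, 18.0: 1.0}
    f = pick_quietest([10.0, 15.0, 18.0], ambient)
    probe = make_probe(f, amplitude=0.1, sample_rate_hz=1000, duration_s=1.0)
    print(f, len(probe))
```

Moving the probe to the quietest candidate frequency is one simple way to realise the adjustment described above; varying phase, amplitude or spectral shape would follow the same pattern.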
As shown in
Returning to
The features Fk may be of different nature, may be measured in different units, and some or all may also contain significant outliers. Reflecting the aim of the present invention of dynamically considering a plurality of such features, the Feature Mapping module, FM 330, being the second stage of the FP 310, is used to “squash” or compress the features Fk into normalized unit-less features, Mk. In this embodiment, Mk∈[0, 1], k=1, . . . , L. It is to be noted that each feature, Fk, is mapped to Mk using a corresponding set of parameters which pertain to that respective feature only.
The normalized unit-less features, Mk, output by FM 330 are input into Decision Device, DD 340, where a “soft” non-binary decision, pD, is made as to whether the headset is on ear or off ear. The “soft” (unsliced) decision, pD represents the probability of headphones being on ear. The soft decision, pD, may be sliced or thresholded to obtain a “hard” binary decision, D, as to whether the headset is on ear or off ear. Decision combiner 180 may receive the soft decision, pD, and/or “hard” binary decision, D, from both ears. Decision Combiner 180 may be a module executed in either of the earbuds 120, 150, and/or in an associated device such as a smartphone.
Referring again to
The Passive Loss feature, which can also be considered as an insertion loss feature, is defined as the attenuation in an external sound level. The external sound level is experienced by the reference mic 121 regardless of whether the headset is on ear or off ear, whereas less ambient sound leaks into the error mic 122 when the earbud 120 is on ear and is blocking or occluding the ear canal. This feature can thus provide one means for on ear detection. The passive loss signal feature FPL in this embodiment is defined as follows:
where PEB1 is the power of the signal from the error microphone 122, and PRB1 is the power of the signal from the reference microphone 121, calculated over a band B1=[f1PL, f2PL]. Corner frequencies f1PL, f2PL are likely to differ for various headphone designs. Typical corner frequencies are f1PL=1.4 kHz and f2PL=3.7 kHz which may be extended in real time based on the current state of the system (e.g. if ANC is on, f1PL=20 Hz in order to include active attenuation). The Passive Loss feature FPL produced by module 422 is most useful as an on-ear indication when the ambient noise is loud and the headphone playback is quiet or absent. Accordingly, in this embodiment the ambient noise level and playback level are determined in the control module 300, and are used to weight the Passive Loss feature FPL accordingly.
Feature extraction module F2 424 produces a feature FOG, being an occlusion gain signal feature. This feature seeks to exploit the increase in sound level which occurs when the earbud is on ear, due to the fact that less of the played back sound from the speaker escapes from the blocked ear. Feature FOG is defined as follows:
where PEB2 is the power of the signal from the error microphone, and PPBB2 is the power of the playback signal, each calculated over a band B2=[f1OG, f2OG]. Again, corner frequencies f1OG, f2OG are likely to differ for various headphone designs. Typical corner frequencies are f1OG=0.1 kHz and f2OG=2.5 kHz. The Occlusion Gain feature FOG is most useful as an on-ear indication when the ambient noise is quiet and headphone playback is present. Accordingly, in this embodiment the ambient noise level and playback level are determined in the control module 300, and are used to weight the Occlusion Gain feature FOG accordingly.
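The two band-power features can be illustrated with the following sketch. The exact forms of the feature definitions are not reproduced in this text, so the sketch assumes each feature is a band-power ratio expressed in dB: reference over error for passive loss (so that attenuation is positive) and error over playback for occlusion gain. The function names and the naive DFT are likewise assumptions chosen for clarity rather than efficiency.

```python
import cmath
import math


def band_power(samples, sample_rate_hz, f_lo_hz, f_hi_hz):
    """Mean-square power of `samples` within [f_lo_hz, f_hi_hz] (naive DFT)."""
    n = len(samples)
    k_lo = max(1, math.ceil(f_lo_hz * n / sample_rate_hz))
    k_hi = min(math.floor(f_hi_hz * n / sample_rate_hz), (n - 1) // 2)
    power = 0.0
    for k in range(k_lo, k_hi + 1):
        xk = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                 for i, x in enumerate(samples))
        # Factor 2 accounts for the mirrored negative-frequency bin.
        power += 2.0 * abs(xk) ** 2 / n ** 2
    return power


def ratio_db(p_num, p_den, floor=1e-12):
    """Power ratio in dB, floored to avoid log of zero."""
    return 10.0 * math.log10(max(p_num, floor) / max(p_den, floor))


def passive_loss_db(error_mic, reference_mic, fs, band=(1400.0, 3700.0)):
    # ASSUMPTION: reference / error, so more attenuation gives a larger value.
    return ratio_db(band_power(reference_mic, fs, *band),
                    band_power(error_mic, fs, *band))


def occlusion_gain_db(error_mic, playback, fs, band=(100.0, 2500.0)):
    # ASSUMPTION: error / playback, so more occlusion gives a larger value.
    return ratio_db(band_power(error_mic, fs, *band),
                    band_power(playback, fs, *band))


if __name__ == "__main__":
    fs, n = 8000, 160
    ref = [math.sin(2 * math.pi * 2000 * i / fs) for i in range(n)]
    err = [0.1 * v for v in ref]
    print(passive_loss_db(err, ref, fs))
```

The default bands are the typical corner frequencies quoted above; in practice they would be tuned per headphone design.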
Feature extraction module F3 426 produces a feature FI, being a probe amplitude signal feature. In this embodiment the observed amplitude of the inaudible probe signal XI is defined to be the maximum of the absolute value of XI. A harmonic tone or multi-tone signal UI of a pre-defined amplitude, AI, is used as the inaudible probe, for example an amplitude which produces ~60 dB SPL at the speaker output. In other embodiments, any suitable method may be used to estimate the amplitude of the probe signal and/or components thereof, ÂI, such alternative methods including spectral analysis, state estimation such as Kalman filtering, and the like. In particular it is to be noted that state estimation such as Kalman filtering will only track parameters of a signal that is intended to be followed, based on the filter's internal state-space model, and is thus advantageously robust to wind noise or any low frequency sound that is different from the filter's internal signal. This feature FI seeks to exploit the increase in sound level which occurs when the earbud is on ear, due to the fact that less sound escapes from the blocked ear. Using inaudible probe UI is advantageous because the probe amplitude can be monitored continuously even when the playback signal UPB is zero or quiet. Additionally, using an inaudible probe is particularly suitable for headsets having a close fit design to the user's anatomy, providing effective occlusion of external sounds as observed within the headset.
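Two of the amplitude estimates mentioned above can be sketched as follows: the peak-absolute-value measure of this embodiment, and a single-frequency correlation estimate standing in for the "spectral analysis" alternative. The latter is an assumption chosen for brevity, is not a Kalman filter, and is exact only when the analysis window spans a whole number of probe cycles. Function names are illustrative.

```python
import cmath
import math


def peak_amplitude(x_i):
    """Observed probe amplitude as the maximum absolute sample value."""
    return max(abs(v) for v in x_i)


def tone_amplitude(x, probe_hz, sample_rate_hz):
    """Amplitude of the component of `x` at `probe_hz` by correlation.

    Components at other frequencies that complete whole cycles within the
    window, including any DC offset, do not contribute to the estimate.
    """
    n = len(x)
    w = 2.0 * math.pi * probe_hz / sample_rate_hz
    acc = sum(v * cmath.exp(-1j * w * i) for i, v in enumerate(x))
    return 2.0 * abs(acc) / n


if __name__ == "__main__":
    fs, n = 1000, 1000  # 1 s window holds 15 whole cycles of a 15 Hz probe
    probe = [0.3 * math.sin(2 * math.pi * 15 * i / fs) for i in range(n)]
    rumble = [1.0 * math.sin(2 * math.pi * 5 * i / fs) for i in range(n)]
    mixed = [p + r + 0.5 for p, r in zip(probe, rumble)]
    print(peak_amplitude(mixed))          # inflated by rumble and offset
    print(tone_amplitude(mixed, 15, fs))  # close to the probe amplitude
```

The contrast between the two printed values illustrates why an estimator that follows only the intended signal is more robust to unrelated low frequency sound than a raw peak measure.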
It is to be noted that alternative embodiments of the invention may select a partly or entirely different set of signal features for on ear detection. In accordance with the present invention, it is the normalisation and weighting of two or more such features which is of primary note, as discussed further below. In this regard, referring again to
MPL=S(FPL,kPL,FPL0) (3)
where kPL is the slope, and FPL0 is the midpoint of the logistic sigmoid. Both kPL and FPL0 are chosen empirically.
In (3) S(⋅) is a logistic sigmoid function with slope (steepness) k and midpoint x0 such that:
Similarly, feature mapping module 330 maps FOG to a normalized unit-less feature MOG as follows:
MOG=S(FOG,kOG,FOG0) (5)
where kOG is the slope, and FOG0 is the centre of the respective logistic sigmoid. Both kOG and FOG0 are chosen empirically.
And, feature mapping module 330 maps FI to a normalized unit-less feature MI as follows.
MI=S(FI,kI,FI0) (6)
where kI is the slope, and FI0 is the centre of the logistic sigmoid, S(⋅). Both kI and FI0 are chosen empirically.
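The mappings (3), (5) and (6) can be sketched as below, assuming the standard logistic form S(x, k, x0) = 1/(1 + exp(−k(x − x0))) for a sigmoid with slope k and midpoint x0. The slope and midpoint values in the table are placeholders only; the text states that they are chosen empirically and gives no values.

```python
import math


def logistic(x, k, x0):
    """Logistic sigmoid with slope k and midpoint x0 (numerically stable)."""
    t = k * (x - x0)
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


# HYPOTHETICAL (slope, midpoint) pairs, one per feature; not from the text.
FEATURE_PARAMS = {
    "PL": (0.5, 10.0),   # passive loss, assumed to be in dB
    "OG": (0.5, 6.0),    # occlusion gain, assumed to be in dB
    "I": (40.0, 0.05),   # probe amplitude, assumed linear units
}


def map_features(features):
    """Map raw features F_k to normalized unit-less features M_k in [0, 1]."""
    return {name: logistic(value, *FEATURE_PARAMS[name])
            for name, value in features.items()}


if __name__ == "__main__":
    print(map_features({"PL": 20.0, "OG": 6.0, "I": 0.0}))
```

Because each feature has its own parameter pair, features measured in different units end up on the same scale, and extreme raw values saturate towards 0 or 1 instead of dominating the later combination.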
A key issue to note in relation to the non-linear mapping adopted by the present embodiment of the invention is that the various signal features are at first measured on different scales, in different units. To normalise such measures from varied scales to a common normalized scale is a key enabler of the decision device 340.
Returning again to
CXi=S(PXi,kXi,vXi), (7)
where X={E, R, or PB}, i={L[eft] or R[ight]}, and S(⋅) is a logistic sigmoid function with slope kXi and midpoint vXi as per (4). The sigmoid parameters, slope kXi and midpoint vXi, are empirically chosen such that CXi is close to zero when PXi is low, and CXi is close to 1 when PXi is high.
The choice of detection metric, and the enabling or disabling of generation and injection of the inaudible probe signal by the control module 300, are based on the SPI, CEi, CRi, and CPBi (0 = low, 1 = high), as summarised in Table 1.
Note that states 1 and 3 in Table 1 represent abnormal headset behaviour: playback is present but no signal is registered on the error microphone. This may indicate a faulty error microphone or speaker. Thus these states are excluded from the list of “allowed” states.
The control signal (0-7) and the signal presence indicators, CEi, CRi, and CPBi, comprise the output CX of the Control module.
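The signal presence indicators of (7) and the derivation of a control state from them might be sketched as follows. The bit ordering (CE as the most significant bit, then CR, then CPB) is an assumption inferred from the statement that states 1 and 3 correspond to playback being present with no error microphone signal. The 0.5 slicing threshold and the sigmoid parameters are likewise hypothetical. The mapping of each state to a detection metric is not sketched here.

```python
import math


def logistic(x, k, x0):
    """Logistic sigmoid with slope k and midpoint x0 (assumed standard form)."""
    t = k * (x - x0)
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def presence_indicator(power, k, v):
    """Signal presence indicator C in [0, 1] for a measured power, as in (7)."""
    return logistic(power, k, v)


def control_state(c_e, c_r, c_pb, threshold=0.5):
    """Pack the three indicators into a state number 0-7.

    ASSUMPTION: bit order (C_E, C_R, C_PB), most significant bit first.
    """
    b_e, b_r, b_pb = (int(c >= threshold) for c in (c_e, c_r, c_pb))
    return (b_e << 2) | (b_r << 1) | b_pb


ABNORMAL_STATES = frozenset({1, 3})  # playback present, error mic silent


def is_allowed(state):
    return state not in ABNORMAL_STATES


if __name__ == "__main__":
    # Hypothetical powers in dB with assumed slope 0.5 and midpoint 40 dB.
    c_e = presence_indicator(20.0, 0.5, 40.0)
    c_r = presence_indicator(60.0, 0.5, 40.0)
    c_pb = presence_indicator(60.0, 0.5, 40.0)
    state = control_state(c_e, c_r, c_pb)
    print(state, is_allowed(state))
```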
where TD is the (hard) decision threshold of the Slicer.
The weights are applied to the normalized unit-less metrics as per (8) in order to produce a probability of the respective earbud 120 being on-ear, pD. If a binary decision is required, the probability pD may be sliced as per (9).
The weights, represented by a weight vector w={wk}, k=1, . . . , L, may either be calculated automatically in the Weight Calculator 530 based on the Control module outputs, CEi, CRi, and CPBi, or the weights may be manually set based on preference. For example, if only the amplitude of the inaudible probe is to be used for on ear detection then the weights may be manually set to w={0, 0, 1}.
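The weighted combination and slicing can be sketched as below. The sketch assumes that the soft decision is the sum of the products of weights and normalized features, divided by the sum of the weights so that pD stays within [0, 1]; that normalization step and the default threshold of 0.5 are assumptions made for illustration.

```python
def soft_decision(m, w):
    """Combine normalized features M_k with weights w_k into p_D in [0, 1]."""
    if len(m) != len(w):
        raise ValueError("feature and weight vectors must have equal length")
    total = sum(w)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    # ASSUMPTION: dividing by the weight sum keeps p_D on the [0, 1] scale.
    return sum(wk * mk for wk, mk in zip(w, m)) / total


def hard_decision(p_d, t_d=0.5):
    """Slice the soft decision against the threshold T_D."""
    return p_d > t_d


if __name__ == "__main__":
    m = [0.2, 0.9, 1.0]  # M_PL, M_OG, M_I
    print(soft_decision(m, [0.0, 0.0, 1.0]))  # probe amplitude only
    print(hard_decision(soft_decision(m, [1.0, 1.0, 0.0])))
```

With the manual weights {0, 0, 1} the soft decision reduces to the normalized probe amplitude feature alone, matching the example in the text.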
An example of weight calculations by Weight Calculator 530 in accordance with the present embodiment is given below. Weight w1 is calculated as follows:
Weight w2 is calculated as follows:
As MI=1 when the estimated probe signal amplitude reaches its expected level, and MI=0 when the estimated probe signal amplitude approaches zero, the dedicated weight, w3, is not required for further control of the contribution of MI into the overall decision by decision device 340. However, the weight w3 is useful for system-level control. To this end, weight w3 is calculated as follows:
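One hypothetical way to derive such weights from the signal presence indicators is sketched below. It is not the calculation of this embodiment; it merely follows the qualitative rules stated earlier: passive loss is favoured when ambient noise is present and playback is quiet, occlusion gain when playback is present and ambient noise is quiet, and the probe when both are quiet. The product forms and the `probe_enabled` flag, standing in for system-level control of w3, are assumptions.

```python
def calc_weights(c_r, c_pb, probe_enabled=True):
    """Return (w1, w2, w3) for passive loss, occlusion gain and probe.

    c_r  : presence indicator of the reference (ambient) signal, 0..1
    c_pb : presence indicator of the playback signal, 0..1
    HYPOTHETICAL rule set; the product forms are illustrative only.
    """
    w1 = c_r * (1.0 - c_pb)            # ambient present, playback quiet
    w2 = c_pb * (1.0 - c_r)            # playback present, ambient quiet
    w3 = (1.0 - c_r) * (1.0 - c_pb)    # both quiet: rely on the probe
    if not probe_enabled:
        w3 = 0.0                       # system-level switch for the probe
    return (w1, w2, w3)


if __name__ == "__main__":
    print(calc_weights(1.0, 0.0))  # loud ambient, no playback
    print(calc_weights(0.0, 1.0))  # quiet ambient, playback present
    print(calc_weights(0.0, 0.0))  # everything quiet
```

When both ambient noise and playback are present, all three weights are small under this rule, which reflects that none of the three features is then a clean indicator; a practical design would need an explicit policy for that case.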
The present embodiment of the invention further provides for averaged or smoothed hysteresis in changing the decision of whether the headset is on ear or off ear. In particular, only after the decision device indicates that the headset is on ear for more than 1 second is the state indication changed from off ear to on ear. Similarly, only after the decision device indicates that the headset is off ear for more than 3 seconds is the state indication changed from on ear to off ear.
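The asymmetric hold times can be illustrated with a simple time-based debouncer. This is a sketch under the assumption that a contrary decision must persist without interruption for longer than the hold time before the reported state changes; the class and parameter names are illustrative, and any averaging or smoothing of the decision is not modelled.

```python
class OnEarDebouncer:
    """Report a state change only after the raw decision has persisted."""

    def __init__(self, on_hold_s=1.0, off_hold_s=3.0, initial_on_ear=False):
        self.on_hold_s = on_hold_s      # time required to switch to on ear
        self.off_hold_s = off_hold_s    # time required to switch to off ear
        self.state = initial_on_ear
        self._elapsed = 0.0             # time the raw decision has disagreed

    def update(self, raw_on_ear, dt_s):
        """Feed one raw decision covering dt_s seconds; return the state."""
        if raw_on_ear == self.state:
            self._elapsed = 0.0         # agreement resets the timer
            return self.state
        self._elapsed += dt_s
        hold = self.on_hold_s if raw_on_ear else self.off_hold_s
        if self._elapsed > hold:
            self.state = raw_on_ear
            self._elapsed = 0.0
        return self.state


if __name__ == "__main__":
    deb = OnEarDebouncer()
    print([deb.update(True, 0.5) for _ in range(3)])
    print([deb.update(False, 1.0) for _ in range(4)])
```

The longer hold time for the transition to off ear means that a brief adjustment of the earbud does not interrupt functions such as ANC, while a genuine insertion is recognised quickly.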
Preferred embodiments also provide for automatic turn off of the OED 130 once the headset has been off ear for more than 5 minutes (or any suitable comparable period of time). This allows OED to provide a useful role when the headsets are in regular use and regularly being moved on ear, but also allows the headset to conserve power when off ear for long periods, after which the OED 130 can be reactivated when the device is next powered up or activated for playback.
The present embodiment thus provides for automatic or manual application-specific relative weighting of selected detection features. The variable weighting is made in response to detected signal conditions, so that the system responds to the use context of the headset, environmental conditions and/or demonstrates a level of situational awareness. Dynamic adjustment of the parameters (e.g. amplitude and frequency, spectral shape etc.) of the inaudible probe signals is also provided in response to the changing environment, headset design, and the like.
Embodiments of the invention may comprise a USB headset having a USB cable connection effecting a data connection with, and effecting a power supply from, a master device. The present invention, in providing for in ear detection which requires only acoustic microphone(s) and acoustic speaker(s), may be particularly advantageous in such embodiments, as USB earbuds typically require very small componentry and have a very low price point, motivating the omission of non-acoustic sensors such as capacitive sensors, infrared sensors, or optical sensors. Another benefit of omitting non-acoustic sensors is to avoid the requirement to provide additional data and/or power wires in the cable connection which must otherwise be dedicated to such non-acoustic sensors. Providing a method for in-ear detection which does not require non-acoustic components is thus particularly beneficial in this case.
Other embodiments of the invention may comprise a wireless headset such as a Bluetooth headset having a wireless data connection with a master device, and having an onboard power supply such as a battery. The present invention may also offer particular advantages in such embodiments, in avoiding the need for the limited battery supply to be consumed by non-acoustic in ear sensor componentry.
The present invention thus seeks to address on ear detection by acoustic means only, that is by using the extant speaker/driver, error microphone(s) and reference microphone(s) of a headset.
Knowledge of whether the headset is on ear can in a simple case be used to disable or enable one or more signal processing functions of the headset. This can save power. This can also avoid the undesirable scenario of a signal processing function adversely affecting device performance when the headset is not in an expected position, whether on ear or off ear. In other embodiments, knowledge of whether the headset is on ear can be used to revise the operation of one or more signal processing or playback functions of the headset, so that such functions respond adaptively to whether the headset is on ear.
The skilled person will thus recognise that some aspects of the above-described apparatus and methods, for example the calculations performed by the processor, may be embodied as processor control code, for example on a non-volatile carrier medium such as a disk, CD- or DVD-ROM, programmed memory such as read only memory (firmware), or on a data carrier such as an optical or electrical signal carrier. The logic of Table 1 may be implemented in general purpose memory 125 of the earbuds, or by way of a look up table, or by any such suitable means. For many applications, embodiments of the invention will be implemented on a DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array). Thus the code may comprise conventional program code or microcode or, for example, code for setting up or controlling an ASIC or FPGA. The code may also comprise code for dynamically configuring re-configurable apparatus such as re-programmable logic gate arrays. Similarly the code may comprise code for a hardware description language such as Verilog™ or VHDL (Very high speed integrated circuit Hardware Description Language). As the skilled person will appreciate, the code may be distributed between a plurality of coupled components in communication with one another. Where appropriate, the embodiments may also be implemented using code running on a field-(re)programmable analogue array or similar device in order to configure analogue hardware.
Embodiments of the invention may be arranged as part of an audio processing circuit, for instance an audio circuit which may be provided in a host device. A circuit according to an embodiment of the present invention may be implemented as an integrated circuit.
Embodiments may be implemented in a host device, especially a portable and/or battery powered host device such as a mobile telephone, an audio player, a video player, a PDA, a mobile computing platform such as a laptop computer or tablet and/or a games device for example. Embodiments of the invention may also be implemented wholly or partially in accessories attachable to a host device, for example in active speakers or headsets or the like. Embodiments may be implemented in other forms of device such as a remote controller device, a toy, a machine such as a robot, a home automation controller or the like.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. The use of “a” or “an” herein does not exclude a plurality, and a single feature or other unit may fulfil the functions of several units recited in the claims. Any reference signs in the claims shall not be construed so as to limit their scope.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
This application claims priority to U.S. Provisional Patent Application Ser. No. 62/570,352, filed Oct. 10, 2017, which is incorporated by reference herein in its entirety.