An aspect of the disclosure here relates to digital audio signal processing techniques for improving quality of headphone playback during feedback-type acoustic noise cancellation. Other aspects are also described.
Headphones let their users listen to music and participate in phone calls without disturbing others who are nearby. They are used in both loud and quiet ambient environments. Headphones can have various amounts of passive sound isolation against ambient noise. There may be in-ear rubber tips, on-ear cushions, or around-the-ear cushions, or the sound isolation may be simply due to the fact that the headphone housing rests against the ear and therefore loosely blocks the entrance to the ear canal. An electronic technique known as acoustic noise cancellation, ANC, is used to further reduce the ambient environment noise that has leaked past the passive isolation. ANC drives a headphone speaker to produce an anti-noise sound wave that is electronically designed to cancel the ambient noise that gets past the passive isolation and into the user's ear canal. But the performance of ANC varies greatly, depending on how the headphone fits against (or is being worn on) the wearer's ear.
In headphone technology, there is the so-called S-path, which is an audio signal path from the input of a headphone speaker to an output of an internal microphone. Due to the unique structure of the ear, the S-path is different for every wearer, and affects how each wearer hears the same playback audio and how ANC performs (despite the same mechanical and acoustical headphone design.) In a headphone with feedback type ANC, there is an audio signal feedback path from the internal microphone (also referred to sometimes as the error microphone) to the input of the headphone speaker. There is an electronic filter in this feedback path that applies a frequency-dependent gain (sometimes referred to as equalization) that is designed to electronically correct for the differences between the ears of different users. In this manner, the playback sound and ANC are heard more consistently (despite the different ears of the users.) The equalization filter may be designed in the laboratory to conform with what an expert listener specifies as being good sound (for the particular headphone design.)
It has been determined, however, that if the headphone fits the ear of its wearer too loosely or improperly, due to, for instance, being bumped out of position slightly, an ear cup being raised slightly off the ear briefly, the wearer putting on a pair of eyeglasses, or the user's hair preventing the headphone from making contact with the user's skin, the aforementioned filter has to apply a large gain to compensate for the acoustic energy leaking out of the user's ear in order to maintain the desired timbral characteristics of music as heard by the user. Under certain circumstances such as high playback volumes, where the audio signal is already testing the limits of the acoustic/electrical system, the large gain applied by the fit correcting filter can drive the amplifier, speaker or the acoustic system as a whole beyond its physical limits, e.g., the amplifier that is driving the headphone speaker becomes overloaded, resulting in the playback being distorted.
An aspect of the disclosure here is a method for headphone audio signal processing in which, during playback, an audio feedback signal from an internal microphone of the headphone is filtered to produce a filtered feedback signal. The filtering may be designed to equalize how the playback should sound despite different users wearing the same headphone design, so that the playback sound is consistently good for the different users' ears. The filtered feedback signal is then compressed. The speaker of the headphone is driven with a combined signal in which the compressed feedback signal has been combined with the playback audio signal. One or more of the compressor parameters are then changed, based on one or more context inference signals that include a user volume setting. In this manner, when the user volume is high, e.g., at maximum, the likelihood of clipping by the headphone amplifier (that is driving the speaker with the combined signal) is reduced, or even eliminated, resulting in undistorted playback. In one aspect, based on system design and tuning, there may be two or more sets of compressor parameters, and an algorithm chooses from these available sets of compressor parameters and interpolates between them based on one or more of the context inference signals, using linear interpolation or another interpolation scheme, to produce the final set of compressor parameters that are applied to compress the filtered feedback signal.
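As a rough illustration of the interpolation between sets of compressor parameters summarized above, the following Python sketch blends two parameter sets according to the user volume setting. The names CompressorParams, lerp and interpolate_params, the normalized 0-to-1 volume scale, the 0.7 lower blend point, and every numeric parameter value are assumptions made for illustration only; none of them is taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressorParams:
    """Hypothetical compressor parameter set (field names are assumptions)."""
    threshold_db: float
    ratio: float
    attack_ms: float
    release_ms: float


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def interpolate_params(mild: CompressorParams,
                       aggressive: CompressorParams,
                       volume: float,
                       vol_lo: float = 0.7,
                       vol_hi: float = 1.0) -> CompressorParams:
    """Blend two parameter sets using the user volume as the context signal.

    Volume at or below vol_lo yields the mild set, volume at or above
    vol_hi yields the aggressive set, and values in between are blended
    linearly. The blend points are assumed, not specified by the source.
    """
    t = (volume - vol_lo) / (vol_hi - vol_lo)
    t = min(1.0, max(0.0, t))  # clamp the blend factor to [0, 1]
    return CompressorParams(
        threshold_db=lerp(mild.threshold_db, aggressive.threshold_db, t),
        ratio=lerp(mild.ratio, aggressive.ratio, t),
        attack_ms=lerp(mild.attack_ms, aggressive.attack_ms, t),
        release_ms=lerp(mild.release_ms, aggressive.release_ms, t),
    )


# Example parameter sets (values are placeholders, not tuned settings).
mild = CompressorParams(threshold_db=0.0, ratio=1.0,
                        attack_ms=50.0, release_ms=500.0)
aggressive = CompressorParams(threshold_db=-12.0, ratio=4.0,
                              attack_ms=5.0, release_ms=50.0)
final_params = interpolate_params(mild, aggressive, volume=0.85)
```

Other interpolation schemes, or other context inference signals, could replace the linear blend on user volume used in this sketch.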
The above summary does not include an exhaustive list of all aspects of the present disclosure. It is contemplated that the disclosure includes all systems and methods that can be practiced from all suitable combinations of the various aspects summarized above, as well as those disclosed in the Detailed Description below and particularly pointed out in the Claims section. Such combinations may have particular advantages not specifically recited in the above summary.
Several aspects of the disclosure here are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” aspect in this disclosure are not necessarily to the same aspect, and they mean at least one. Also, in the interest of conciseness and reducing the total number of figures, a given figure may be used to illustrate the features of more than one aspect of the disclosure, and not all elements in the figure may be required for a given aspect.
Several aspects of the disclosure with reference to the appended drawings are now explained. Whenever the shapes, relative positions and other aspects of the parts described are not explicitly defined, the scope of the invention is not limited only to the parts shown, which are meant merely for the purpose of illustration. Also, while numerous details are set forth, it is understood that some aspects of the disclosure may be practiced without these details. In other instances, well-known circuits, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
The headphone 1 has an against-the-ear acoustic transducer or speaker 7 arranged and configured to reproduce sound (that is represented in an audio signal that is said to drive the speaker) into the ear of the user, an external microphone 5 (arranged and configured to receive ambient environment sound directly), and an internal microphone 3 (arranged and configured to directly receive the sound reproduced by the speaker 7.) The headphone 1 is configured to acoustically couple the external microphone to the ambient environment of the headphone, in contrast to the internal microphone being acoustically coupled to a volume of air within the ear that is being blocked by the headphone. As integrated in the headphone 1 and worn by its user, the external microphone 5 may be more sensitive than the internal microphone 3 to a far field sound source outside of the headphone 1. Viewed another way, as integrated in the headphone and worn by its user, the external microphone 5 may be less sensitive than the internal microphone 3 to sound within the user's ear. Here it should be noted that while the figures show a single microphone symbol in each instance (external microphone 5 and internal microphone 3), as producing a sound pickup channel, this does not mean that the sound pickup channel must be produced by only one microphone. In some instances, the sound pickup channel may be the result of combining multiple microphone signals, e.g., by a beamforming process performed on a multi-channel output from a microphone array.
In one aspect, along with the transducers, the electronics that process and produce the transducer signals (output microphone signals and an input audio signal to drive the speaker) are also integrated in the headphone housing. Such electronics may include an audio amplifier to drive the speaker with an audio signal (that may include program audio, also referred to here as playback audio), a microphone sensing circuit or amplifier that receives the microphone signals and converts them into a desired format for digital signal processing, and a digital processor 2 and associated memory (not shown.) The memory stores instructions for configuring or programming the processor (e.g., instructions to be executed by the processor) to perform digital signal processing methods as described below in detail. A playback audio signal (program audio) that may contain user content such as music, a podcast, or the voice of a far end user during a voice communication session, can also be provided to drive the speaker in some modes of operation, e.g., during noise cancellation mode. The playback audio signal may be provided to the processor from an external audio source device (not shown) such as a smartphone or tablet computer. Alternatively, the playback audio signal could be provided to the processor by a cellular phone network communications interface that is within the housing of the headphone 1.
Referring now
It has been determined however that if a particular instance of the headphone 1 fits the ear of its wearer too loosely or improperly due to for instance being bumped out of position slightly or if the wearer puts on a pair of glasses, thereby making the S-path leaky or less sealed (in the acoustic sense), then under certain conditions, such as high user volume, the amplifier (not shown) that is driving the headphone speaker 7 becomes overloaded by the feedback path, resulting in the playback being distorted. This effect is illustrated by a graph in
To mitigate the overloading or overdriving of the headphone amplifier, the following method is performed by the processor 2 (see
For downward compression (the magnitude of the full band signal or a sub-band component is reduced), the compressor parameters may include two or more of the following: attack time, release time, threshold, compression ratio, and cutoff frequencies (for narrow band compressor blocks.) For instance, when changing the compressor parameters (or the compressor setting), a first compression ratio is selected when the user volume setting is above a threshold, and a second compression ratio is selected when the user volume setting is below the threshold, wherein the first compression ratio is greater than the second compression ratio. For example, a first compressor setting is selected when user volume is maximum, and a second compressor setting is selected when the user volume is below a threshold that is less than the maximum. The first compressor setting can be said to be more aggressive than the second compressor setting. As a result, the headroom limit of the amplifier is not violated.
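The downward compression and the volume-dependent choice of compression ratio described above can be sketched in Python as follows. This is a minimal, hard-knee, static-curve illustration only: the function names, the example ratios of 8 and 2, and the treatment of a volume exactly equal to the threshold are assumptions, and attack and release smoothing are omitted.

```python
def downward_gain_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Static gain in dB (always <= 0) of a hard-knee downward compressor.

    Below the threshold the signal is left unchanged. Above it, the
    output level rises by only 1/ratio dB per dB of input level.
    """
    if level_db <= threshold_db:
        return 0.0
    compressed_level_db = threshold_db + (level_db - threshold_db) / ratio
    return compressed_level_db - level_db


def select_ratio(user_volume: float,
                 volume_threshold: float,
                 first_ratio: float = 8.0,
                 second_ratio: float = 2.0) -> float:
    """Pick the larger (more aggressive) ratio when volume exceeds the threshold.

    The ratio values are placeholders. A volume exactly at the threshold
    is treated as "not above" here, which is an assumption.
    """
    return first_ratio if user_volume > volume_threshold else second_ratio


# Example: a -6 dB signal level against a -12 dB compressor threshold.
ratio = select_ratio(user_volume=1.0, volume_threshold=0.8)
gain_db = downward_gain_db(level_db=-6.0, threshold_db=-12.0, ratio=ratio)
```

With a larger ratio, the same overshoot above the compressor threshold produces a larger gain reduction, which is the sense in which the first setting is more aggressive than the second.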
The variable compressor may be implemented as a digital, low frequency shelf filter, as a direct form, parametric biquad with atomic coefficient update capability. The cut frequency, Q, and gain of such a parametric biquad are variable and are set according to the compression parameters provided by a compressor parameter interpolation block (compressor parameter interpolation 6.) The digital filter however is not limited to being a direct form biquad. The variable compressor may also include a variable broadband gain stage following the low frequency shelf filter. The gain of that stage may be reduced by the compressor parameter interpolation 6 algorithm in situations where for example the expected or predicted output of the low frequency shelf filter is still too strong (or in other words too likely to induce clipping of the headphone amplifier.)
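One way to picture such a parametric low frequency shelf biquad is the Python sketch below. It assumes the widely used Audio EQ Cookbook low-shelf formulas for the coefficients, which is only one possible design and is not stated in the disclosure. The sample rate, cut frequency, Q and gain values are placeholders, and the "atomic" update is modeled simply as replacing the whole coefficient tuple in a single assignment.

```python
import math


def low_shelf_coeffs(fs: float, f0: float, q: float, gain_db: float):
    """Normalized low-shelf biquad coefficients (b0, b1, b2, a1, a2).

    Uses the Audio EQ Cookbook low-shelf form (an assumed design choice).
    """
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    k = 2.0 * math.sqrt(a_lin) * alpha
    b0 = a_lin * ((a_lin + 1.0) - (a_lin - 1.0) * cos_w0 + k)
    b1 = 2.0 * a_lin * ((a_lin - 1.0) - (a_lin + 1.0) * cos_w0)
    b2 = a_lin * ((a_lin + 1.0) - (a_lin - 1.0) * cos_w0 - k)
    a0 = (a_lin + 1.0) + (a_lin - 1.0) * cos_w0 + k
    a1 = -2.0 * ((a_lin - 1.0) + (a_lin + 1.0) * cos_w0)
    a2 = (a_lin + 1.0) + (a_lin - 1.0) * cos_w0 - k
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)


class Biquad:
    """Direct form I biquad whose coefficients are swapped as one tuple."""

    def __init__(self, coeffs):
        self.coeffs = coeffs
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def set_coeffs(self, coeffs):
        # All five coefficients change together in one assignment, so a
        # sample is never processed with a mix of old and new values.
        self.coeffs = coeffs

    def process(self, x: float) -> float:
        b0, b1, b2, a1, a2 = self.coeffs
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


# Example: start flat, then cut low frequencies by 12 dB below about 200 Hz.
shelf = Biquad(low_shelf_coeffs(fs=48000.0, f0=200.0, q=0.707, gain_db=0.0))
shelf.set_coeffs(low_shelf_coeffs(fs=48000.0, f0=200.0, q=0.707, gain_db=-12.0))
```

A variable broadband gain stage following the shelf could be modeled by multiplying each output sample of process by a scalar gain.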
Still referring to
The feedback audio signal is then filtered by the filter G, which is designed to in effect produce a feedback anti-noise signal that is intended to acoustically cancel certain undesired sounds in the S-path. In addition to the feedback anti-noise signal, the case in
The combination of the feedforward and feedback anti-noise signals as described above works well to create a quiet hearing experience for the wearer (during playback for example), so long as the headphone 1 is being worn “properly” in that the acoustic leakage is not severe and the acoustic seal is not poor. But severe leakage or poor acoustic seal could occur while the user volume is above a certain threshold, e.g., at maximum. Note here that the removal of the Spbc-filtered version of the playback is far from perfect under those acoustic leakage conditions, such that there is residual playback audio into the filter G. For instance, if the headphone fit or seal is poor so that acoustic leakage is high, then the error microphone signal contains very little of the S-path version of the playback audio, while the Spbc version of the playback audio is being subtracted from the error microphone signal. This residual playback is then undesirably subjected to a high feedback path gain of the filter G. The variable compressor in the feedback path suppresses peaks of this undesirable version of the (residual) playback. Also, by changing the setting of the variable compressor to be more aggressive only in certain contexts, and particularly in response to the user volume being above a certain threshold, the risk of the filtered feedback signal overdriving the headphone amplifier in case the headphone 1 is bumped out of position or its ear cup is briefly lifted off the ear is reduced or even eliminated. At the same time, the variable compressor will be automatically re-configured into a less aggressive setting in other contexts (including one where the user volume is below the threshold), so that dynamic range of the sound produced by the headphone speaker 7 remains high (thereby maintaining improved user experience.)
Continuing with the description of
Another aspect of the disclosure here, which is also illustrated in
Turning to
In yet another aspect, illustrated in
The aspects of the disclosure described above refer to a variable compressor (in the feedback anti-noise signal path from the internal microphone) that is controlled according to either user volume, ambient environment sound level, playback content level, or clipping events derived from audio signals such as the external microphone audio signal or the playback audio signal.
The wearer walking or jogging could be determined by the processor (context detector 10) receiving an indication from a companion device that is paired or otherwise communicatively coupled to the headphone 1 (e.g., a smartphone, a tablet computer, a laptop computer, or a smartwatch), or it could be determined by processing an inertial measurement unit, IMU, output signal.
Critical listening refers to situations where sound is reproduced with high fidelity and without the non-linear effects that compression introduces. In such situations the wearer is typically sitting or lying down in a quiet ambient environment like a studio (not riding in a bus or a car, not inside a restaurant); the processor (context detector 10) may determine the context of usage as being critical listening by receiving an indication from the companion device, or by processing an inertial measurement unit output signal to determine that the companion device or the headphone 1 is motionless.
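The disclosure does not say how the inertial measurement unit output is processed to decide that a device is motionless. Purely as an assumed example, the Python sketch below declares a device motionless when the variance of the accelerometer magnitude over a short window is below a threshold. The function name, the window contents, and the threshold value are all hypothetical.

```python
import math
from statistics import pvariance


def is_motionless(accel_samples, variance_threshold: float = 1e-3) -> bool:
    """Return True if the accelerometer magnitude barely changes over the window.

    accel_samples is a sequence of (x, y, z) readings in m/s^2.
    The variance threshold is a placeholder, not a tuned value.
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_samples]
    return pvariance(magnitudes) < variance_threshold


# Example windows: a device at rest, and one being shaken along the z axis.
at_rest = [(0.0, 0.0, 9.81)] * 50
shaken = [(0.0, 0.0, 9.81 + (1.0 if i % 2 else -1.0)) for i in range(50)]
resting = is_motionless(at_rest)
moving = not is_motionless(shaken)
```

A practical detector would likely combine such a test with the other indications mentioned above, such as a report from the companion device.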
Riding in a car, a bus, or an airplane may be determined by the processor receiving an indication from the companion device, or by processing a global positioning system location signal, a compass/magnetometer signal, or a communication network connection.
In another aspect of the disclosure here, the context detector 10 can signal the compressor parameter interpolation 6 that it has detected a user context as being severe acoustic leak at the headphone 1, based on having processed (as a context inference signal) the Sest estimate of the S-path transfer function. Yet another user context that may be detected by the context detector 10 is whether ANC mode is active or whether ambient sound reproduction mode is active. In response to each of these detected user contexts, the compressor parameter interpolation 6 would change the compressor setting to better suit the particular user context.
In another aspect of the disclosure here, the processor changes the variable compressor to a more aggressive setting, or activates the variable compressor (by interpolating between a first setting and a second setting) only if the user volume is above a threshold. In other words, the compressor is activated or made more aggressive only if the user volume is above the threshold; if the user volume is below the threshold, then regardless of another user context being detected (by the context detector 10), the compressor setting is not changed (by the compressor parameter interpolation 6.) That may be because the tuning process performed in the laboratory has concluded that a default compressor setting (e.g., no compression) is acceptable for all of the available user contexts.
In yet another aspect, the processor is configured to change the variable compressor setting to its most aggressive setting regardless of the detected context of usage, whenever user volume is set to maximum, e.g., during music playback. In one aspect, the context aware compressor settings may include the following: no compression during critical listening; slow attack and slow release during transportation such as bus or airplane; and fast attack and fast release during maximum user volume with music playback. Note that the terms slow and fast are relative to each other, meaning that a fast attack time is shorter than a slow attack time, and similarly for the release times.
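The context aware settings listed above can be pictured as a small selection routine, sketched here in Python. The names Context, Timing and choose_setting, the millisecond values, and the behavior for contexts other than the ones listed are assumptions for illustration; the only properties drawn from the text are the ordering of fast versus slow times and the override at maximum user volume.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Context(Enum):
    CRITICAL_LISTENING = 1
    TRANSPORTATION = 2
    OTHER = 3  # any context not listed in the text (assumed catch-all)


class Timing(NamedTuple):
    attack_ms: float
    release_ms: float


NO_COMPRESSION = None            # compressor bypassed
SLOW = Timing(50.0, 500.0)       # placeholder values
FAST = Timing(1.0, 50.0)         # placeholder values, shorter than SLOW


def choose_setting(context: Context,
                   user_volume: float,
                   max_volume: float = 1.0,
                   music_playing: bool = True) -> Optional[Timing]:
    """Pick a compressor timing setting from the detected context.

    Maximum user volume with music playback overrides the context and
    selects the fast (most aggressive) setting.
    """
    if music_playing and user_volume >= max_volume:
        return FAST
    if context is Context.TRANSPORTATION:
        return SLOW
    # Critical listening, and (by assumption) any other context,
    # falls back to no compression.
    return NO_COMPRESSION


# Example: critical listening at moderate volume versus at maximum volume.
moderate = choose_setting(Context.CRITICAL_LISTENING, user_volume=0.5)
at_max = choose_setting(Context.CRITICAL_LISTENING, user_volume=1.0)
```

Checking the volume override first is what makes the most aggressive setting apply regardless of the detected context.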
It is well understood that the use of personally identifiable information should follow privacy policies and practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining the privacy of users. In particular, personally identifiable information or data should be managed and handled so as to minimize risks of unintentional or unauthorized access or use, and the nature of authorized use should be clearly indicated to users.
While certain aspects have been described above and shown in the accompanying drawings, it is to be understood that such are merely illustrative of and not restrictive on the broad invention, and that the invention is not limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those of ordinary skill in the art. The description is thus to be regarded as illustrative instead of limiting.