Spatial hearing refers to the fact that when a sound emanates from a discrete position, the acoustic signals arriving at a listener's ears not only travel on a direct path from the sound source to the ear-canal entrance, but also arrive after reflecting off and diffracting around the listener's anatomy, which introduces acoustic artefacts. These artefacts, which often differ between the left and right ears, give the listener cues to localize the sound. These listener-related features of sound transmission can be encapsulated in a digital electronic data structure or dataset, referred to as a head-related transfer function (HRTF). A single HRTF is a pair of acoustic filters (one for each ear) that characterize the acoustic transmission from one position in a reflection-free environment to respective microphones placed in the ears of a listener, at a given position or pose of the listener. A binaural simulation digital signal processing algorithm uses an HRTF to reproduce an audio recording as binaural sound by driving a pair of headphones worn by a listener. The process uses the HRTF to create the illusion of a sound source somewhere in the environment. HRTFs thus encapsulate the fundamentals of spatial hearing.
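As a rough illustration of the binaural simulation step, the sketch below filters a mono signal with a left-ear filter and a right-ear filter and returns the two headphone drive signals. It assumes the HRTF is stored as a pair of time-domain impulse responses, which the text above does not specify. The names `convolve` and `render_binaural` and the toy filter values are hypothetical.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution; output length is len(signal) + len(kernel) - 1."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(
    mono: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Apply one filter per ear to a mono signal (assumed impulse-response form of an HRTF)."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy filters for a source on the listener's left: the right-ear signal
# arrives one sample later and at half the level (illustrative values only).
hrir_left = [1.0, 0.0]
hrir_right = [0.0, 0.5]

left, right = render_binaural([1.0, 2.0, 3.0], hrir_left, hrir_right)
print(left)
print(right)
```

A practical renderer would more likely use block-based frequency-domain filtering for efficiency; direct convolution is used here only because it is easy to follow.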
Due to physiological differences between humans' ears, head and body, an HRTF is highly individualized. Binaural simulation using non-individualized HRTFs (for example, a listener auditioning a simulation using the HRTF dataset of another person) can cause audible problems in both the perceived position and quality (timbre) of the virtual sound.
There are a number of methods to achieve individualized HRTFs, but these are often time-consuming or impractical when implemented in a consumer electronic device setting. When HRTF individualization is not possible, a generic HRTF is often used, which aims to represent the ‘average’ HRTF. An HRTF dataset can be broken down into a set of underlying parameters such as inter-aural time difference (ITD), inter-aural level difference (ILD) and diffuse field HRTF (DF-HRTF). This information is useful in the individualization of an HRTF dataset. For example, an average HRTF could be created as a composite HRTF dataset that contains the ITDs from one person and the ILDs of another person. If enough of the features are personalized, the composite HRTF dataset should be indistinguishable from a measurement of the listener's own HRTF dataset.
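The composite idea can be sketched as a small data structure, under the assumption that the ITD, ILD and DF-HRTF parameters of a dataset can be stored and swapped independently. The class `HrtfParameters`, the function `make_composite`, the direction key and all numeric values below are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical direction key: (azimuth, elevation) in degrees.
Direction = Tuple[int, int]


@dataclass(frozen=True)
class HrtfParameters:
    itd_us: Dict[Direction, float]  # inter-aural time difference, microseconds
    ild_db: Dict[Direction, float]  # inter-aural level difference, decibels
    df_hrtf_id: str                 # label standing in for diffuse-field HRTF data


def make_composite(
    itd_source: HrtfParameters,
    ild_source: HrtfParameters,
    df_source: HrtfParameters,
) -> HrtfParameters:
    """Build a composite dataset that takes each parameter from a different source."""
    return HrtfParameters(
        itd_us=dict(itd_source.itd_us),
        ild_db=dict(ild_source.ild_db),
        df_hrtf_id=df_source.df_hrtf_id,
    )


# Illustrative values only; they are not measurements.
person_a = HrtfParameters({(90, 0): 650.0}, {(90, 0): 9.0}, "df-a")
person_b = HrtfParameters({(90, 0): 700.0}, {(90, 0): 11.5}, "df-b")

# ITDs from person A, ILDs and diffuse-field data from person B.
composite = make_composite(person_a, person_b, person_b)
print(composite)
```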
An aspect of the disclosure here is an iterative method for finding which user characteristic data should be used for pruning a database of available HRTFs before selecting an HRTF from the database for a target user (target listener). The selected HRTF is expected to be more suitable for the target user. This is also referred to as having “personalized” the HRTF selection process for the target user. The method is not about computing a suitable HRTF, but rather about how to use the user characteristic data to improve the chances of selecting the most appropriate one from a database of available HRTFs.
Another aspect of the disclosure is part of a method for producing binaural sound through headphones (while worn by a target user). First and second user characteristics of a target user, for whom a selection of an HRTF is to be made from a database of available HRTFs, are obtained. First and second subsets of the available HRTFs are then removed from consideration for the selection, based on the first and second user characteristics, respectively. Advantageously, such a pruning process increases the likelihood that the selected HRTF will be a good one, because fewer bad HRTFs remain in the database. An HRTF for the target user is then selected from the remaining members of the database of available HRTFs, and audio signals (user program audio) are then digitally processed by a binaural processor, according to the selected HRTF, to simulate binaural hearing by generating left and right transducer drive signals of the headphones.
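A minimal sketch of the two removal steps follows. The summary does not state how each subset is chosen, so the sketch assumes one plausible criterion: an HRTF is removed when listeners who share the target user's characteristic gave it a mean rating below a threshold. The function `prune_by_characteristic`, the 1-to-5 rating scale, the threshold and the sample data are all hypothetical.

```python
from statistics import mean
from typing import Dict, List, Set, Tuple

# One rating = (traits of the listener who gave it, score on an assumed 1-5 scale).
Rating = Tuple[Dict[str, str], float]


def prune_by_characteristic(
    candidates: Set[str],
    ratings: Dict[str, List[Rating]],
    trait: str,
    value: str,
    min_mean: float,
) -> Set[str]:
    """Keep HRTFs that listeners with trait == value rated at least min_mean on average."""
    kept = set()
    for hrtf_id in candidates:
        scores = [s for traits, s in ratings.get(hrtf_id, []) if traits.get(trait) == value]
        # Assumption: an HRTF with no ratings from matching listeners is kept,
        # since there is no evidence against it.
        if not scores or mean(scores) >= min_mean:
            kept.add(hrtf_id)
    return kept


young_tall = {"age_range": "18-30", "height_range": "tall"}
young_short = {"age_range": "18-30", "height_range": "short"}
older_tall = {"age_range": "31-50", "height_range": "tall"}

ratings = {
    "A": [(young_tall, 4.0), (young_short, 5.0)],
    "B": [(young_tall, 1.0), (older_tall, 5.0)],
    "C": [(young_short, 5.0), (young_tall, 2.0)],
}

# First characteristic: age range. Second characteristic: height range.
after_first = prune_by_characteristic({"A", "B", "C"}, ratings, "age_range", "18-30", 3.0)
after_second = prune_by_characteristic(after_first, ratings, "height_range", "tall", 3.0)
print(sorted(after_first), sorted(after_second))
```

With this sample data the first step removes "B" and the second removes "C", leaving "A" as the only candidate for selection.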
In some instances, an initial pruning operation may be performed (before the first and second subsets are removed) in which a subset of the available HRTFs that is determined to be less generalizable or less generic than the rest of the available HRTFs is removed. This has also been shown to be effective in reducing the proportion of poor HRTFs in the database, which helps improve the odds of finally selecting a good HRTF when combined with the subsequent first subset and second subset removals.
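The initial pruning can be sketched as follows. The passage above does not say how "less generic" is determined, so this sketch assumes a simple proxy: the mean rating an HRTF received across all listeners, with the lowest-ranked fraction dropped. The function `prune_least_generic`, the `drop_fraction` parameter and the sample ratings are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def prune_least_generic(ratings: Dict[str, List[float]], drop_fraction: float) -> List[str]:
    """Return HRTF ids ranked by mean rating, with the lowest-ranked fraction dropped.

    Mean rating across all listeners is an assumed stand-in for genericness.
    """
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    ranked = sorted(ratings, key=lambda hrtf_id: mean(ratings[hrtf_id]), reverse=True)
    n_drop = int(len(ranked) * drop_fraction)
    return ranked[: len(ranked) - n_drop]


# Illustrative ratings from three listeners per HRTF (assumed 1-5 scale).
all_ratings = {
    "A": [4.0, 4.0, 5.0],
    "B": [5.0, 1.0, 1.0],
    "C": [3.0, 4.0, 3.0],
    "D": [2.0, 2.0, 1.0],
}

generic_enough = prune_least_generic(all_ratings, drop_fraction=0.5)
print(generic_enough)
```

Note that "B" is dropped even though one listener rated it highly: an HRTF that suits one person well but most people poorly is exactly the kind of member this assumed criterion treats as less generic.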
The above summary does not include an exhaustive list of all aspects of the present disclosure. It is contemplated that the disclosure includes all systems and methods that can be practiced from all suitable combinations of the various aspects summarized above, as well as those disclosed in the Detailed Description below and particularly pointed out in the Claims section. Such combinations may have particular advantages not specifically recited in the above summary.
Several aspects of the disclosure here are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” aspect in this disclosure are not necessarily to the same aspect, and they mean at least one. Also, in the interest of conciseness and reducing the total number of figures, a given figure may be used to illustrate the features of more than one aspect of the disclosure, and not all elements in the figure may be required for a given aspect.
Several aspects of the disclosure with reference to the appended drawings are now explained. Whenever the shapes, relative positions and other aspects of the parts described are not explicitly defined, the scope of the invention is not limited only to the parts shown, which are meant merely for the purpose of illustration. Also, while numerous details are set forth, it is understood that some aspects of the disclosure may be practiced without these details. In other instances, well-known circuits, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
General Concepts
To help explain or illustrate the iterative method for finding which user characteristic data is more likely than others to predict the correct selection of an HRTF, let us first consider the ‘ideal data’ collected and presented as a matrix shown in
Ideally, if the matrix in
Finding the Characteristic Data
A goal here is to ensure that an HRTF that has been selected for listener #3 is closer to the ‘good’ rating region (see
A goal here is to produce pleasant binaural sound for a target user (target listener), for whom a selection of an HRTF is to be made from a database of available HRTFs. The sound production may be through, for example, headphones (transducers placed at the ears of the target user). This will be done by improving the chances of selecting a “good” HRTF, from a database of available HRTFs, which results in better binaural sound simulation for the target user in particular. The method may proceed as follows, with reference to the diagrams in
The method may begin with operation 18 in
This initial refinement or pruning of the database resulted in the histogram of measured subjective preference ratings, by a single listener, for the remaining database members, changing as shown in the plot of
The method may continue with operation 20 in
More generally, the first user characteristic in operation 20 could be selected from the group consisting of: gender, race, age range, and height range. The user characteristic could be obtained by retrieving a predetermined characteristic from a data storage that is i) remotely accessed or ii) local memory of the audio device that is generating the transducer drive signals, wherein the predetermined characteristic is part of personal information data of the target user (e.g., health information of the target user).
The method may continue with operation 22 in
The experimental results for such an operation (as performed in a laboratory setting) are plotted in the graph of
In one embodiment, the process may continue with additional pruning operations (after removing the first subset and the second subset as described above) until a decision is made that the remaining group of HRTFs in the database is small enough. An HRTF is then selected for the target user from the remaining members of the database of available HRTFs. In one embodiment, the HRTF is selected by determining which one of the remaining members has the highest approval rating by a listener group of one or more listeners that has the first user characteristic and the second user characteristic. Other ways of selecting the HRTF from the remaining database are possible. Next, the digital audio signals of the target user's program audio are processed by a binaural processor according to the selected HRTF, to generate the transducer drive signals that drive the transducers placed at the ears of the target user.
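The selection step in this embodiment can be sketched as below. The sketch assumes that the "approval rating" of an HRTF is the mean score given by listeners who have both user characteristics; the text does not define it. The function `select_hrtf`, the rating scale and the sample data are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Set, Tuple

# One rating = (traits of the listener who gave it, score on an assumed 1-5 scale).
Rating = Tuple[Dict[str, str], float]


def select_hrtf(
    remaining: Set[str],
    ratings: Dict[str, List[Rating]],
    required_traits: Dict[str, str],
) -> str:
    """Pick the remaining HRTF with the highest mean score from the matching listener group."""
    if not remaining:
        raise ValueError("no HRTFs remain after pruning")

    def group_mean(hrtf_id: str) -> float:
        scores = [
            score
            for traits, score in ratings.get(hrtf_id, [])
            if all(traits.get(k) == v for k, v in required_traits.items())
        ]
        # An HRTF with no ratings from the listener group ranks last.
        return mean(scores) if scores else float("-inf")

    # Sorting first makes tie-breaking deterministic (alphabetical by id).
    return max(sorted(remaining), key=group_mean)


target_traits = {"age_range": "18-30", "height_range": "tall"}
other_traits = {"age_range": "31-50", "height_range": "tall"}

ratings = {
    "A": [(target_traits, 4.0), (target_traits, 5.0), (other_traits, 1.0)],
    "C": [(target_traits, 3.0), (other_traits, 5.0)],
}

selected = select_hrtf({"A", "C"}, ratings, target_traits)
print(selected)
```

The selected identifier would then be used to look up the filter pair handed to the binaural processor.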
While certain aspects have been described and shown in the accompanying drawings, it is to be understood that such are merely illustrative of and not restrictive on the broad invention, and that the invention is not limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those of ordinary skill in the art. The description is thus to be regarded as illustrative instead of limiting.
This non-provisional patent application claims the benefit of the earlier filing date of provisional application No. 62/736,409 filed Sep. 25, 2018. An aspect of the disclosure here relates to digital audio systems that have 3D audio signal processing capability for binaural sound reproduction through headphones. Other aspects are also described.
Number | Date | Country |
---|---|---|
62736409 | Sep 2018 | US |