Crosstalk cancellation (also known as “CTC”) is an acoustic display technique where loudspeakers are used in place of headphones to deliver binaural signals to human ears. Crosstalk cancellation is one instance of a class of acoustic display techniques called sound field control (also known as “SFC”).
In certain instances, crosstalk cancellation performance can be improved by improving the accuracy of the sound field model. In particular, the free-field assumption is violated by the scattering and occlusion effects of the human head and body. These physical effects diminish the quality of binaural localization, since they combine with the virtual effects of scattering and occlusion already present in binaural audio. Improving the accuracy of the sound field model establishes a means to attenuate the presence of physical effects, thus improving the perception of virtual effects.
It would be advantageous if crosstalk cancellation techniques could be improved in terms of any, some, or all sound field control metrics.
It should be appreciated that this Summary is provided to introduce a selection of concepts in a simplified form, the concepts being further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of this disclosure, nor is it intended to limit the scope of the optimal crosstalk cancellation filter sets using an obstructed field model and methods of use.
The above objects as well as other objects not specifically enumerated are achieved by a crosstalk cancellation filter set configured for use in delivering binaural signals to human ears. The crosstalk cancellation filter set includes a pressure matching system configured to perform spatial filtering or sound field control and an obstructed field model in communication with the pressure matching system. The crosstalk cancellation filter set is configured to take acoustic advantage of scattering effects and occlusional effects caused by violations to a free-field assumption, thereby delivering improved crosstalk cancellation acoustic displays to a listener without the use of headphones.
The above objects as well as other objects not specifically enumerated are also achieved by a method of providing a crosstalk cancellation filter set configured for use in delivering binaural signals to human ears. The method includes the steps of configuring a pressure matching system to perform spatial filtering or sound field control and configuring a spherical head model for communication with the pressure matching system. The crosstalk cancellation filter set is configured to take acoustic advantage of scattering effects and occlusional effects caused by a human head, thereby delivering improved crosstalk cancellation acoustic displays without the use of headphones.
Various objects and advantages of the optimal crosstalk cancellation filter sets using an obstructed field model and methods of use will become apparent to those skilled in the art from the following detailed description, when read in light of the accompanying drawing.
The optimal crosstalk cancellation filter sets generated by using an obstructed field model and methods of use (hereafter “crosstalk cancellation filter sets”) will now be described with occasional reference to specific embodiments. The crosstalk cancellation filter sets may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the crosstalk cancellation filter sets to those skilled in the art.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the crosstalk cancellation filter sets belong. The terminology used in the description of the crosstalk cancellation filter sets herein is for describing particular embodiments only and is not intended to be limiting of the crosstalk cancellation filter sets. As used in the description of the crosstalk cancellation filter sets and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Unless otherwise indicated, all numbers expressing quantities of dimensions such as length, width, height, and so forth as used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless otherwise indicated, the numerical properties set forth in the specification and claims are approximations that may vary depending on the desired properties sought to be obtained in embodiments of the crosstalk cancellation filter sets. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the crosstalk cancellation filter sets are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical values, however, inherently contain certain errors necessarily resulting from error found in their respective measurements.
The term “binaural”, as used herein, is defined to mean any stereo (two-channel) audio signal that contains complete, partial, or approximate head-related transfer function (also known as “HRTF”, “anatomical transfer function”, or “ATF”) components, whether recorded, synthesized, or imparted on an audio signal in another way, so as to reproduce localization cues and, in turn, a virtual auditory environment for a listener. The term “head-related transfer function”, as used herein, is defined to mean a response that characterizes how an ear receives a sound from a point in space. As sound strikes the listener, the size and shape of the head, ears, and ear canal, the density of the head, and the size and shape of the nasal and oral cavities all transform the sound and affect how it is perceived, boosting some frequencies, attenuating other frequencies, and possibly causing frequency-dependent delays. The terms “crosstalk cancellation” and “CTC”, as used herein, are defined to mean any system for two-dimensional or three-dimensional audio reproduction that is configured to play binaural stereo signals from loudspeakers.
Physical properties of an array in an obstructed free field produce additional head-related transfer function components that are neither intended nor compensated for by the binaural audio signal itself. In all crosstalk cancellation applications, physical head-related transfer functions are known to sum with virtual head-related transfer functions. This summation decreases the fidelity of the virtual auditory environment intended by a binaural signal, as measured at control points, which may or may not include a listener's ears. As a result, the spatial image intended by the binaural signal is degraded.
The description and figures disclose crosstalk cancellation filter sets for use in delivering binaural signals to human ears. Generally, the crosstalk cancellation filter sets are configured to optimize and take advantage of the acoustic scattering and occlusion effects of the human head, thereby delivering improved crosstalk cancellation acoustic displays.
Without being held to the theory, it is believed the crosstalk cancellation filter sets cancel out physical head-related transfer functions while ideally leaving the virtual head-related transfer functions intact. More formally, it is believed the crosstalk cancellation filter sets partially or completely “undo” unintentional acoustic transformations in crosstalk cancellation contexts. The intended result of applying the crosstalk cancellation filter sets is to increase the fidelity of the spatial image and/or virtual auditory environment in crosstalk cancellation contexts.
Referring now to
In contrast to the conventional loudspeaker assembly 10 illustrated in
Referring now to
Referring now to
Referring again to
Conventional pressure matching methods can use an array, which is an ensemble of loudspeakers, each combined with filter sets, to perform spatial filtering or sound field control. Control points are sometimes referred to as either “bright spots” (where acoustic pressure is present) or “dark spots” (where no acoustic pressure is present). The free-field transfer function is estimated between the L loudspeakers and the M control points. At a given frequency, a column vector of complex filter weights (which can be converted into magnitude and phase) is determined to optimize the pressure at the control points. This can be described using the matrix notation:
p=Zq
where p is defined as a column vector of acoustic pressure at M control points, q is defined as a column vector of L complex weights (one per loudspeaker), and Z is defined as the transfer function matrix with dimensions M×L describing the acoustic transfer function between each driver and each control point. An inverse matrix Z⁻¹ (where the inverse A⁻¹ of a generic matrix A is defined as the exact solution to I=AA⁻¹), or a pseudoinverse matrix Z⁺ (where the pseudoinverse A⁺ is defined as the approximation of an inverse that allows error, I≈AA⁺), is used to either solve for or approximate the spatial filter sets q at one or more arbitrary frequencies. For each driver, this sequence of spatial filter sets at one or more frequencies is transformed into a time domain filter. The ensemble of acoustic drivers and filter sets works together to form the desired sound field response(s).
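The pressure matching relation p=Zq described above can be illustrated at a single frequency with a short numerical sketch. The sketch is not taken from the disclosure: it assumes ideal point-source loudspeakers (so each element of Z is a free-space Green's function), a hypothetical geometry of two loudspeakers and two control points, and the helper name `free_field_matrix`, all chosen only for illustration.

```python
import numpy as np

def free_field_matrix(speakers, points, freq_hz, c=343.0):
    """Return the M x L free-field transfer matrix Z at one frequency.

    Assumption: each loudspeaker is an ideal point source, so each element
    is the free-space Green's function exp(-j*k*r) / (4*pi*r).
    """
    k = 2.0 * np.pi * freq_hz / c  # wavenumber in rad/m
    # r[m, l] is the distance from loudspeaker l to control point m
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

# Hypothetical geometry in metres (not from the disclosure)
speakers = np.array([[-0.2, 1.0, 0.0], [0.2, 1.0, 0.0]])  # L = 2
points = np.array([[-0.09, 0.0, 0.0], [0.09, 0.0, 0.0]])  # M = 2

Z = free_field_matrix(speakers, points, freq_hz=1000.0)
p_target = np.array([1.0, 0.0], dtype=complex)  # bright spot, dark spot
q = np.linalg.pinv(Z) @ p_target                # q = Z+ p
p_actual = Z @ q                                # pressure the weights produce
```

Repeating the solve over a grid of frequencies would give, for each loudspeaker, the sequence of complex weights that is subsequently transformed into a time domain filter.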
The term “pseudoinverse”, as used herein, is defined to mean either or both inverse and pseudoinverse. The statement I≈AA⁺ includes the possibility that I=AA⁻¹. The term “inverse problem”, as used herein, is defined to mean a problem that can be solved via this general definition of pseudoinverse.
This method is well suited to the general problem of beam forming. Crosstalk cancellation is an acoustic display technique where loudspeakers are used in place of headphones to deliver binaural signals to human ears. In certain instances, the crosstalk cancellation technique can be improved with the use of beam forming techniques. However, the free-field assumption can be violated when the scattering and occlusion effects of the human head are considered, as shown in
Novel and innovative crosstalk cancellation filter sets are provided. The crosstalk cancellation filter sets are configured to optimize and take acoustic advantage of the scattering and occlusion effects of the human head, thereby delivering improved crosstalk cancellation acoustic displays without the use of headphones. Referring now to
Referring now to
p=Xq
Referring again to
Referring now to
While the crosstalk cancellation filter sets 10, 110 illustrated in
Referring again to
The term “matrix condition”, as used herein, refers to a metric invoked when sound field control is cast as an inverse problem. The matrix condition can be improved by substituting matrix elements, for example by replacing free-field transfer functions with head-related transfer functions or other suitable types of transfer function. Improved matrix condition in sound field control is often, but not always, a beneficial side effect of acoustic shadowing. Improving the matrix condition causes other important sound field control metrics to improve as well.
The term “numerical stability”, as used herein, refers to a metric related more directly to inverse problems (mathematical situations where the solution requires the calculation of a matrix pseudoinverse) than to sound field control problems. When a matrix is ill-conditioned, that is, when it has a high condition number, its pseudoinverse can become unstable. Numerically, instability causes small changes to an ill-conditioned matrix to produce disproportionately large changes in its pseudoinverse. Ideally, a relative change to a matrix would produce a change of the same relative size in its pseudoinverse (a ratio of 1:1). Acoustically, instability causes small errors in physical parameters and models, such as the non-limiting examples of speaker or control point positions or errors in the transfer function matrix, to result in disproportionately large errors in the sound field.
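The relationship between condition number and instability described above can be checked numerically. The following is a minimal sketch, not part of the disclosure: the function name `sensitivity`, the perturbation size, and the two example matrices are hypothetical and were chosen only to contrast a well-conditioned case with an ill-conditioned one.

```python
import numpy as np

def sensitivity(Z, rel_perturbation=1e-6, seed=0):
    """Return (condition number, amplification of a small perturbation).

    Amplification is the relative change in the pseudoinverse divided by
    the relative change in the matrix; a value near 1 is the ideal case.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(Z.shape) + 1j * rng.standard_normal(Z.shape)
    # Scale the noise so its spectral norm is rel_perturbation * ||Z||
    dZ = rel_perturbation * np.linalg.norm(Z, 2) * noise / np.linalg.norm(noise, 2)
    Zp = np.linalg.pinv(Z)
    Zp_perturbed = np.linalg.pinv(Z + dZ)
    rel_in = np.linalg.norm(dZ, 2) / np.linalg.norm(Z, 2)
    rel_out = np.linalg.norm(Zp_perturbed - Zp, 2) / np.linalg.norm(Zp, 2)
    return np.linalg.cond(Z), rel_out / rel_in

# Hypothetical matrices: orthogonal columns versus nearly identical columns
well = np.eye(2)
ill = np.array([[1.0, 1.0], [1.0, 1.001]])

cond_well, amp_well = sensitivity(well)
cond_ill, amp_ill = sensitivity(ill)
```

In a sound field control setting, nearly identical columns correspond to two loudspeakers whose transfer functions to the control points are almost indistinguishable, which is the situation in which small modelling errors are magnified.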
The term “error”, as used herein, refers to a sound field control metric that shows the difference between intended and actual control point responses. When the ratio of intended responses is high, such as in the non-limiting example of a crosstalk cancellation method with intended responses of 1:0 (an infinite ratio), error can also be described by the term “ear separation”. Optimal obstructed field filter sets can be built in order to minimize error in terms of either ear separation or more general numerical error. Some instances of sound field control may focus on minimizing one type of error over another type of error.
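Both notions of error described above can be expressed as simple functions of the intended and actual control point pressures. The sketch below is illustrative only: the normalisation of the numerical error, the decibel form of ear separation, and the function names are assumptions rather than definitions from the disclosure.

```python
import numpy as np

def control_point_error(p_target, p_actual):
    """Squared error between actual and intended pressures, normalised
    by the energy of the intended response (assumed normalisation)."""
    p_target = np.asarray(p_target, dtype=complex)
    p_actual = np.asarray(p_actual, dtype=complex)
    return float(np.linalg.norm(p_actual - p_target) ** 2
                 / np.linalg.norm(p_target) ** 2)

def ear_separation_db(p_actual, bright=0, dark=1, floor=1e-12):
    """Level difference in dB between the bright and dark control points.

    The floor avoids a division by zero when cancellation is perfect.
    """
    bright_level = max(abs(p_actual[bright]), floor)
    dark_level = max(abs(p_actual[dark]), floor)
    return float(20.0 * np.log10(bright_level / dark_level))

# Hypothetical responses: intended 1:0, achieved 0.9:0.09
p_target = np.array([1.0, 0.0])
p_actual = np.array([0.9, 0.09])

err = control_point_error(p_target, p_actual)
sep = ear_separation_db(p_actual)
```

With these hypothetical values the achieved pressures differ by a factor of ten, so the ear separation is about 20 dB, while the normalised numerical error is about 0.018.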
The term “effort”, as used herein, refers to a distribution of gain across an array. In all cases, low effort is better than high effort. However, because low-frequency wavelengths are much longer than human interaural distances, low effort is more difficult to achieve at low frequencies.
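The frequency dependence of effort noted above can be illustrated with a free-field example. This is a sketch under stated assumptions, not the disclosed method: effort is summarised here as the sum of squared weight magnitudes (one common scalar summary of gain across an array), the loudspeakers are ideal point sources, and the geometry and function names are hypothetical.

```python
import numpy as np

def free_field_matrix(speakers, points, freq_hz, c=343.0):
    """M x L free-field transfer matrix, assuming ideal point sources."""
    k = 2.0 * np.pi * freq_hz / c
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

def array_effort(q):
    """Assumed scalar summary of effort: sum of squared weight magnitudes."""
    return float(np.sum(np.abs(q) ** 2))

# Hypothetical geometry in metres: control points spaced 0.18 m apart
speakers = np.array([[-0.2, 1.0, 0.0], [0.2, 1.0, 0.0]])
points = np.array([[-0.09, 0.0, 0.0], [0.09, 0.0, 0.0]])
p_target = np.array([1.0, 0.0], dtype=complex)  # intended responses of 1:0

efforts = {}
for freq_hz in (200.0, 2000.0):
    Z = free_field_matrix(speakers, points, freq_hz)
    q = np.linalg.pinv(Z) @ p_target
    efforts[freq_hz] = array_effort(q)
```

For this geometry the effort at 200 Hz is far larger than at 2000 Hz, because at long wavelengths the two loudspeakers produce nearly identical pressures at both control points and large, nearly opposing weights are needed to cancel one of them.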
The term “spectral flatness”, as used herein, refers to the distribution of gain across frequencies. Since filter sets can be built from independent solutions at multiple frequencies, the acoustic properties of filter sets depend on frequency. In practice, transaural systems require more effort as frequency decreases. The result is that filter sets have significantly varied spectra, to which listeners are sensitive even when they are in the ideal listening location.
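One way to put a number on how evenly gain is distributed across frequencies is sketched below. The disclosure does not specify a formula, so this example assumes the common signal-processing measure in which spectral flatness is the ratio of the geometric mean to the arithmetic mean of the power spectrum; the function name and the sample spectra are hypothetical.

```python
import numpy as np

def spectral_flatness(power_spectrum, floor=1e-20):
    """Ratio of geometric mean to arithmetic mean of a power spectrum.

    Returns 1.0 for a perfectly flat spectrum and approaches 0.0 as the
    gain becomes concentrated in a few frequency bins.
    """
    power = np.maximum(np.asarray(power_spectrum, dtype=float), floor)
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return float(geometric_mean / arithmetic_mean)

# Hypothetical per-frequency filter power, lowest frequency first
flat = np.array([1.0, 1.0, 1.0, 1.0])
bass_heavy = np.array([16.0, 4.0, 1.0, 1.0])  # more gain at low frequencies

flatness_flat = spectral_flatness(flat)
flatness_bass_heavy = spectral_flatness(bass_heavy)
```

A filter set whose gain rises toward low frequencies, as described above for transaural systems, scores lower on this measure than one with uniform gain.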
In accordance with the provisions of the patent statutes, the principle and mode of operation of the crosstalk cancellation filter sets and method of use have been explained and illustrated in a certain embodiment. However, it must be understood that the crosstalk cancellation filter sets and method of use may be practiced otherwise than as specifically explained and illustrated without departing from its spirit or scope.
This application is a national phase entry under 35 U.S.C. § 371 of International Application No. PCT/US2019/062381, filed Nov. 20, 2019, which claims the benefit of U.S. Provisional Patent Application No. 62/770,373, filed Nov. 21, 2018, the entire disclosures of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/062381 | 11/20/2019 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2020/106821 | 5/28/2020 | WO | A |
Number | Date | Country | |
---|---|---|---|
20220021975 A1 | Jan 2022 | US |
Number | Date | Country | |
---|---|---|---|
62770373 | Nov 2018 | US |