The subject matter described herein relates to sound propagation. More specifically, the subject matter relates to methods, systems, and computer readable media for supporting source or listener directivity in a wave-based sound propagation model.
Sound propagation techniques determine how sound waves travel in a space and interact with the environment. In many applications, it is important to be able to simulate sound propagation in large scenes, such as games and training in both indoor and outdoor scenes, noise prediction in urban areas, and architectural acoustics design for buildings and interior spaces such as concert halls. Realistic acoustic effects can also improve the realism of a virtual environment. Acoustic phenomena such as interference, diffraction, and scattering for general scenes can only be captured by accurately solving the acoustic wave equation. The large spatial and frequency domain and accuracy requirement pose a significant challenge for acoustic techniques to be used in these diverse domains with different requirements.
Sound sources can be omnidirectional (e.g., radiating sound isotropically) or directional (e.g., radiating sound anisotropically). Most sound sources we come across in real life (e.g., human voices, speaker systems in televisions, radios, and smartphones, machine noises in cars, aircraft, and helicopters, and musical instruments) are directional sources that have a specific directivity pattern. This directivity depends on the shape, size, and material properties of the sound source, as well as a complex interaction of the processes of vibration and sound radiation, resulting in varying directivity at different frequencies. Due to the non-uniform radiation of sound, directional sources have a significant impact on sound propagation and the corresponding acoustic response of the environment. Acoustic effects generated from directional sources are noticeable in everyday life: a person talking towards or away from a listener, the positioning of different types of musical instruments in an orchestra, good-sounding places (sweet spots) in front of a television in a living room, and aircraft, helicopters, or fire trucks in an urban environment.
Analogous to source directivities, listeners also have directivities. In other words, listeners do not receive sound in the same way from all directions. The human auditory system obtains significant directional cues from the subtle differences in sound received by each ear which are caused by the scattering of sound around the head. Listener directivity can be used to enhance a user's immersion in a virtual environment by providing the listener with cues corresponding to the directions the sound is coming from, and thereby enriching the experience.
Various techniques may be used in predicting or modeling sound propagation. Some techniques may involve assuming sound travels like rays (e.g., beams of light). Other techniques may involve assuming sound travels like waves. However, current techniques are unable to efficiently support source or listener directivity in an interactive wave-based sound propagation model.
Accordingly, there exists a need for methods, systems, and computer readable media for supporting source or listener directivity in an interactive wave-based sound propagation model.
Methods, systems, and computer readable media for supporting source or listener directivity in a wave-based sound propagation model are disclosed. According to one method, the method includes computing, prior to run-time, one or more sound fields associated with a source or listener position and modeling, at run-time and using the one or more sound fields and a wave-based sound propagation model, source or listener directivity in an environment.
A system for supporting source or listener directivity in a wave-based sound propagation model is also disclosed. The system includes a processor and a sound propagation model (SPM) module executable by the processor. The SPM module is configured to compute, prior to run-time, one or more sound fields associated with a source or listener position and to model, at run-time and using the one or more sound fields and a wave-based sound propagation model, source or listener directivity in an environment.
The subject matter described herein can be implemented in software in combination with hardware and/or firmware. For example, the subject matter described herein can be implemented in software executed by a processor. In one exemplary implementation, the subject matter described herein may be implemented using a computer readable medium having stored thereon computer executable instructions that when executed by the processor of a computer control the computer to perform steps. Exemplary computer readable media suitable for implementing the subject matter described herein include non-transitory devices, such as disk memory devices, chip memory devices, programmable logic devices, and application specific integrated circuits. In addition, a computer readable medium that implements the subject matter described herein may be located on a single device or computing platform or may be distributed across multiple devices or computing platforms.
As used herein, the terms “node” and “host” refer to a physical computing platform or device including one or more processors and memory.
As used herein, the term “module” refers to hardware, firmware, or software in combination with hardware and/or firmware for implementing features described herein.
Preferred embodiments of the subject matter described herein will now be explained with reference to the accompanying drawing, wherein like reference numerals represent like parts, of which:
The subject matter described herein discloses methods, systems, and computer readable media for supporting source or listener directivity in a wave-based sound propagation model. In accordance with some aspects of the present subject matter described herein, exemplary mechanisms, processes, or systems may be provided or supported for modeling time-varying, data-driven, or arbitrary source directivity and head-related transfer function (HRTF)-based listener directivity for interactive wave-based sound propagation in frequency domain. For example, a sound propagation modeling or simulation application in accordance with some aspects of the present subject matter described herein may support time-varying, data-driven source directivity based on spherical harmonic (SH) basis, where the source directivity can be dynamically modified at run-time. In some embodiments, the propagated sound fields due to SH sources may be precomputed, stored, and used to compute the total sound field at run-time.
In accordance with some aspects of the present subject matter described herein, an exemplary sound propagation modeling or simulation application may support a plane-wave decomposition based on pressure derivatives for modeling HRTF-based listener directivity to generate spatial sound in interactive applications, where the plane-wave decomposition may be performed in real-time compared to prior plane-wave decomposition methods which are offline.
In accordance with some aspects of the present subject matter described herein, an exemplary sound propagation modeling or simulation application may accurately simulate sound fields that include wave-phenomena such as scattering and diffraction and can also produce spatial sound cues that provide localization and immersion for interactive applications.
In accordance with some aspects of the present subject matter described herein, a general framework may be provided for integrating source and/or listener directivities in any offline or online frequency domain wave-based propagation algorithm (e.g., a boundary element method or an interactive equivalent sources method).
In accordance with some aspects of the present subject matter described herein, a real-time and/or memory efficient sound rendering system may use aspects described herein to provide realistic acoustic effects from directional sources and spatial sound in interactive applications.
Reference will now be made in detail to exemplary embodiments of the subject matter described herein, examples of which are illustrated in the accompanying drawing. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
Node 102 may include a communications interface 104, a shared memory 106, and one or more processor cores 108. Communications interface 104 may be any suitable entity (e.g., a communications interface and/or a data acquisition and generation (DAG) card) for receiving and/or sending messages. For example, communications interface 104 may be an interface between various nodes 102 in a computing cluster. In another example, communications interface 104 may be associated with a user interface or other entity and may receive configuration settings and/or source data, such as audio information, for processing during a sound propagation model application.
In some embodiments, communications interface 104 or another component may be configured to identify or select a processor core 108 for processing or analysis and/or information for storage. For example, communications interface 104 may receive information from another node in a cluster and may determine that a particular processor core 108 should process the received information. In another example, communications interface 104 may store information in shared memory 106 and the stored information may be retrieved later by an available processor core 108. Shared memory 106 may be any suitable entity (e.g., random access memory or flash memory) for storing sound propagation information, such as sound field data, pressure derivatives, plane-wave decomposition information, SH information, source audio information, and/or other information. Various components, such as communications interface 104 and software executing on processor cores 108, may access (e.g., read from and/or write to) shared memory 106.
Each of processor cores 108 represents any suitable entity (e.g., a physical processor, a field-programmable gate array (FPGA), and/or an application-specific integrated circuit (ASIC)) for performing one or more functions associated with sound propagation modeling. Processor cores 108 may be associated with a sound propagation model (SPM) module 110. For example, SPM module 110 or software therein may be executable by one or more processor cores 108.
SPM module 110 may be configured to use one or more techniques (e.g., geometric acoustic techniques and/or numerical acoustic techniques) for modeling sound propagation in one or more environments. Existing techniques for modeling sound propagation can be broadly classified into geometric and wave-based techniques. Geometric techniques can handle arbitrary source and listener directivity for offline and interactive applications. However, due to their inherent assumption of rectilinear propagation of sound, modeling wave effects such as diffraction and interference using geometric techniques remains a significant challenge, especially at low frequencies. This becomes a limiting factor for simulating sources that have prominent directivity patterns at low frequencies (e.g., human voices).
In contrast with geometric techniques, wave-based techniques can accurately perform sound propagation at low frequencies, but their computational complexity increases significantly for high frequencies. There has been considerable interest in recent years in developing interactive wave-based sound propagation techniques for computer graphics. Current techniques for interactive applications have a high precomputation overhead and can only model source directivity during the offline computation. As a result, the source directivity is hard-coded into the final solution, and it is not possible to modify the directivity pattern at run-time for interactive applications (e.g., for a rotating siren, a person covering his/her mouth, etc.). Additionally, integrating listener directivity into wave-based techniques requires a plane-wave decomposition of the sound field, which is computationally expensive for interactive applications.
In some embodiments, SPM module 110 may be configured to support source and/or listener directivity for interactive wave-based sound propagation modeling. For example, SPM module 110 may be configured to model, using precomputed sound fields, sound propagation where a sound source's directivity can be dynamically modified during run-time, e.g., a moving ambulance with siren controlled by a user during a video game. In another example, SPM module 110 may be configured to model, using precomputed sound fields and related derivative values, sound propagation where a listener's directivity can be dynamically modified during run-time, e.g., a character controlled by a user moving through a virtual environment.
In some embodiments, SPM module 110 may be configured to precompute (e.g., prior to run-time or executing of a sound propagation model application) and store propagated sound fields or pressure fields due to spherical harmonic (SH) sources. Using these precomputed sound fields, SPM module 110 may compute a total sound field using a wave-based sound propagation technique at run-time, thereby allowing modeling of a source with a directivity that can be dynamically modified at run-time.
In some embodiments, a pressure field of an entire domain may be expressed to any order of accuracy by using pressure and its higher order derivatives at a single point.
In some embodiments, plane wave decomposition of a pressure field may be computed to any order of accuracy using pressure derivatives and SH decomposition. Computing plane wave decomposition of a pressure field using pressure derivatives and SH decomposition may be significantly faster and more memory efficient than conventional methods, thereby allowing computation of spatial sound at run-time.
In some embodiments, SPM module 110 may be configured to compute listener directivity for any frequency-domain, wave-based sound propagation technique. For example, an exemplary method for computing listener directivity may include computing a plane-wave decomposition of the sound field around the listener, expressing plane-wave coefficients associated with the plane-wave decomposition in terms of an SH basis, and using pressure and its derivatives at a listener position to compute corresponding SH coefficients. The HRTF may also be expressed in the SH basis. At run-time, spatial sound may be computed as a dot product between the SH coefficients of the sound field and the HRTF. By computing a plane-wave decomposition using pressure and pressure derivatives, spatial sound may be modeled or simulated in real-time. In contrast, current plane-wave decomposition methods are performed offline.
In some embodiments, SPM module 110 may be configured to work in parallel with a plurality of processor cores 108 and/or nodes 102. For example, a plurality of processor cores 108 may each be associated with a SPM module 110. In this example, each processor core 108 may perform processing associated with modeling sound propagation for a particular environment. In another example, some nodes 102 and/or processor cores 108 may be utilized for precomputing (e.g., computing pressure or sound fields due to SH sources) and other nodes 102 and/or processor cores 108 may be utilized during run-time, e.g., to execute a sound propagation model application that utilizes precomputed values or functions.
In some embodiments, SPM module 110 may be configured to perform sound propagation modeling using any wave-based sound propagation technique, such as any technique that numerically solves or attempts to solve the acoustic wave equation. Exemplary wave-based sound propagation techniques usable by SPM module 110 may include a finite element method, a boundary element method, an equivalent source method, various spectral methods, or another frequency-domain wave-based technique.
It will be appreciated that
At step 204, these pressure fields are encoded in basis functions (e.g. multipoles) and stored for runtime use.
At runtime, given a dynamic source directivity, a SH decomposition of its directivity is performed to compute the corresponding SH coefficients.
At step 206, the final pressure field is computed as a summation of the pressure fields due to SH sources evaluated at the listener position weighted by the appropriate SH coefficients.
At step 208, in order to incorporate dynamic listener directivity in wave-based techniques, an interactive plane-wave decomposition approach based on derivatives of the pressure field is utilized.
At step 210, acoustic responses for both ears are computed at runtime by using this efficient plane-wave decomposition of the pressure field and the HRTF-based listener directivity. In some embodiments, these binaural acoustic responses are convolved with the (dry) audio to compute the propagated spatial sound at the listener position.
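The final rendering step can be illustrated with a minimal Python sketch. It assumes that the binaural acoustic responses have already been converted to short time-domain impulse responses (that conversion is not shown and is an assumption here), and the function names `convolve` and `render_binaural` are hypothetical. A direct convolution is used for clarity; an interactive system would more likely use a faster block-based scheme.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(dry_audio, ir_left, ir_right):
    """Convolve the dry audio with each ear's impulse response (hypothetical helper)."""
    return convolve(dry_audio, ir_left), convolve(dry_audio, ir_right)


# Hypothetical three-sample dry signal and per-ear impulse responses.
left, right = render_binaural([1.0, 0.0, 0.5], [1.0], [0.0, 0.5])
```

In this toy case the left ear hears the dry signal unchanged, while the right ear hears it delayed by one sample and attenuated by half.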
In some embodiments, exemplary process 200 may be incorporated into the state-of-the-art boundary element method [Liu 2009] and the interactive equivalent source technique [Mehra et al. 2013]. In such embodiments, existing implementations of the state-of-the-art boundary element method and the interactive equivalent source technique may be computationally efficient up to a few kHz.
In some embodiments, exemplary process 200 or a runtime portion therein may be integrated with Valve's Source™ game engine. Using this game engine, acoustic effects from both source and listener directivity on a variety of scenarios are demonstrable, such as people talking on the street, loudspeakers between buildings, a television in a living room, a helicopter in a rocky outdoor terrain, a bell tower in a snow-covered town, a rotating siren, a helicopter behind obstacles, and musical instruments in an amphitheater.
In some embodiments, results associated with exemplary process 200 may be validated by comparing these results to results computed analytically using the offline Biot-Tolstoy-Medwin (BTM) technique [Svensson et al. 1999].
In some embodiments, exemplary process 200 enables accurate sound propagation for directional sources and can handle moving directional sources and a moving directional listener.
In contrast to conventional or known approaches, exemplary process 200 and/or other aspects described herein include an approach that allows modifiable source directivity and HRTF-based listener directivity at runtime for interactive wave-based sound propagation.
In this section, aspects related to handling directional sources in a general frequency-domain wave-based sound propagation technique are presented as an overview. In one exemplary embodiment, a source formulation is presented that incorporates the complete radial and directional sound radiation from sources. Next, a far-field representation of this source formulation is discussed that can be used to efficiently handle data-driven, rotating, or time-varying source directivities. Based on the linearity of the Helmholtz equation, aspects of the present subject matter are provided that incorporate an exemplary source representation as described herein into a general frequency-domain wave-based propagation technique. In some embodiments, all the variables used in an exemplary approach, except the SH basis functions, positions, and speed of sound, are frequency dependent. For the sake of brevity, these dependencies are not mentioned explicitly.
The radiation pattern of a directional source can be expressed using the one-point multipole expansion [Ochmann 1995] as:
$$s(x, y) = \sum_{l=0}^{L-1} \sum_{m=-l}^{l} b_{lm}\, h_l^{2}\!\left(\frac{2\pi\nu r}{c}\right) Y_{lm}(\theta, \phi), \quad (1)$$
where $s(x, y)$ is the pressure field (or sound field) at point $x$ of the directional source centered at point $y$, $h_l^{2}(\cdot)$ are the spherical Hankel functions of the second kind, $Y_{lm}(\cdot)$ are complex-valued SH basis functions [Hobson 1955], $(r, \theta, \phi)$ is the vector $(x - y)$ expressed in spherical coordinates, $b_{lm}$ are the weights of the expansion, $L$ is the order of the expansion, $c$ is the speed of sound in the medium (343 m/s for air at standard temperature and pressure), and $\nu$ is the frequency. This source formulation is valid in both the near and far fields.
The selection of source representation for directional sources may be motivated by the measured directivity data that is currently available for real-world sources. Most available measurements have been collected by placing sources in an anechoic chamber and recording their directivity by rotating microphones every few degrees at a fixed distance from the source. Typically, these measurements are carried out at a distance of a few meters, which corresponds to the far-field for the frequencies emitted by these sources. Keeping this in mind, a source representation that corresponds to the far-field radiation pattern of a directional source is selected. Under the far-field approximation, $h_l^{2}(z) \approx i^{l+1} e^{-iz}/z$, where $i = \sqrt{-1}$, resulting in the following source representation [Menzies 2007]:
$$s(x, y) = \frac{e^{-i 2\pi\nu r/c}}{r} \sum_{l=0}^{L-1} \sum_{m=-l}^{l} \alpha_{lm} Y_{lm}(\theta, \phi) = \frac{e^{-i 2\pi\nu r/c}}{r}\, D(\theta, \phi), \quad (2)$$
where $\alpha_{lm} = b_{lm}\, i^{l+1} c/(2\pi\nu)$ are the SH coefficients and $D(\theta, \phi) = \sum_{l}\sum_{m} \alpha_{lm} Y_{lm}(\theta, \phi)$ is the directivity function at frequency $\nu$. This directivity function can be either specified for each frequency analytically or measured or simulated at discrete sample directions. Depending on the data, the directivity function can be complex-valued (both magnitude and phase) or real-valued (magnitude-only). Typically, the measured data is magnitude-only and available as directivities averaged over octave-wide frequency bands [PTB 1978].
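The quality of the far-field approximation can be checked numerically. The following Python sketch compares the closed forms of the spherical Hankel functions of the second kind for orders 0 and 1 against the approximation $i^{l+1} e^{-iz}/z$. Only these two orders are implemented, and the function names and the chosen frequency and distances are illustrative assumptions.

```python
import cmath
import math


def hankel2(l, z):
    """Spherical Hankel function of the second kind (closed forms, l = 0 or 1 only)."""
    if l == 0:
        return 1j * cmath.exp(-1j * z) / z
    if l == 1:
        return -cmath.exp(-1j * z) * (z - 1j) / z**2
    raise ValueError("this sketch only implements l = 0 and l = 1")


def hankel2_far_field(l, z):
    """Far-field approximation i^(l+1) * exp(-i z) / z."""
    return (1j ** (l + 1)) * cmath.exp(-1j * z) / z


def relative_error(l, z):
    exact = hankel2(l, z)
    return abs(exact - hankel2_far_field(l, z)) / abs(exact)


c = 343.0      # speed of sound in m/s
nu = 1000.0    # hypothetical frequency in Hz
for r in (0.1, 1.0, 3.0):  # hypothetical distances in metres
    z = 2.0 * math.pi * nu * r / c
    print(r, relative_error(0, z), relative_error(1, z))
```

For order 0 the approximation is exact, and for order 1 the relative error works out to $1/\sqrt{z^2 + 1}$, so it shrinks as the distance or the frequency grows.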
The coefficients $\alpha_{lm}$ of the source representation are computed from $D(\theta, \phi)$ as follows:
Analytical: Given an analytical expression for the directivity function D(θ, φ), the coefficients αim can be computed by SH projection as follows:
$$\alpha_{lm} = \int_0^{2\pi}\!\int_0^{\pi} D(\theta, \phi)\, Y_{lm}(\theta, \phi)\, \sin\theta \, d\theta \, d\phi. \quad (4)$$
This expression can be evaluated either symbolically or numerically, depending on D(θ, φ) [Green 2003].
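Data-driven: Given the directivity function $D(\theta, \phi)$ at sampled locations $\{(\theta_1, \phi_1), (\theta_2, \phi_2), \ldots, (\theta_n, \phi_n)\}$, the spherical harmonic expansion can be fitted to this function in the least-squares sense by solving an over-determined linear system ($n > L^2$) for the coefficients $\alpha_{lm}$:
$$\sum_{l=0}^{L-1} \sum_{m=-l}^{l} \alpha_{lm} Y_{lm}(\theta_k, \phi_k) = D(\theta_k, \phi_k), \quad k = 1, \ldots, n. \quad (5)$$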
An exemplary source formulation described herein and the corresponding spherical harmonic representation can handle both complex-valued and real-valued directivities. In case the directivity function is real-valued (as the widely available measured data is magnitude only), the real-valued SH basis [Green 2003] is used in the aforementioned expressions.
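Both ways of obtaining the coefficients can be sketched in a few lines of Python for a real-valued directivity and a real-valued SH basis truncated at $L = 2$ (four basis functions). The midpoint quadrature, the normal-equations solver, the cardioid test pattern, and all function names are assumptions made for illustration; they are not taken from the described implementation, which may use different basis orders and solvers.

```python
import math


def real_sh(theta, phi):
    """Real-valued SH basis for l = 0 and l = 1, ordered (0,0), (1,-1), (1,0), (1,1)."""
    k0 = 0.5 * math.sqrt(1.0 / math.pi)
    k1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [k0,
            k1 * math.sin(theta) * math.sin(phi),
            k1 * math.cos(theta),
            k1 * math.sin(theta) * math.cos(phi)]


def project_directivity(D, n_theta=64, n_phi=128):
    """Analytical case: approximate the SH projection integral with a midpoint rule."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    coeffs = [0.0] * 4
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        weight = math.sin(theta) * d_theta * d_phi  # area element on the sphere
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            value = D(theta, phi) * weight
            for idx, y in enumerate(real_sh(theta, phi)):
                coeffs[idx] += value * y
    return coeffs


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_directivity(samples):
    """Data-driven case: least-squares fit via the normal equations.

    samples is a list of (theta, phi, D) triples with more triples than coefficients.
    """
    rows = [real_sh(t, p) for t, p, _ in samples]
    m = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Atd = [sum(r[i] * s[2] for r, s in zip(rows, samples)) for i in range(m)]
    return solve_linear(AtA, Atd)


def cardioid(theta, phi):
    """Hypothetical test pattern: loudest along theta = 0, silent along theta = pi."""
    return 0.5 * (1.0 + math.cos(theta))


projected = project_directivity(cardioid)
directions = [((i + 0.5) * math.pi / 8, 2.0 * math.pi * j / 12)
              for i in range(8) for j in range(12)]
fitted = fit_directivity([(t, p, cardioid(t, p)) for t, p in directions])
```

Because the cardioid lies exactly in the span of the four basis functions, both routes recover the same coefficients, $\sqrt{\pi}$ for $(0,0)$ and $\sqrt{\pi/3}$ for $(1,0)$, up to quadrature and rounding error.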
In this section, aspects related to an exemplary approach for incorporating the directional source representation in a general frequency-domain, wave-based sound propagation technique are disclosed. The steps outlined below are repeated for frequency samples in the range $[0, \nu_{max}]$, where $\nu_{max}$ is the maximum frequency simulated.
Helmholtz equation: Sound wave propagation in the frequency-domain can be expressed as a boundary value problem using the Helmholtz equation:
$$\nabla^2 p + \frac{\omega^2}{c^2}\, p = 0 \quad \text{in } \Omega,$$
where $p(x)$ is the complex-valued pressure field at frequency $\nu$, $\omega = 2\pi\nu$ is the angular frequency, $c$ is the speed of sound, $\Omega$ is the propagation domain, and $\nabla^2$ is the Laplacian operator. The behavior of $p$ at infinity must be specified using the Sommerfeld radiation condition [Pierce 1989]. In order to complete the problem specification, either a Dirichlet, Neumann, or mixed boundary condition is specified at the boundary of the domain. This equation can be solved using any frequency-domain wave-based numerical technique, including the boundary element method, the finite element method, or the equivalent source method.
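As a small sanity check of the homogeneous Helmholtz equation, the Python sketch below evaluates a finite-difference Laplacian of the free-field monopole $e^{-ikr}/r$ (with $k = \omega/c$) away from the source and confirms that the residual $\nabla^2 p + k^2 p$ is close to zero. The 7-point stencil, the step size, the test point, and the function names are illustrative assumptions and are unrelated to the solvers named above.

```python
import cmath
import math


def monopole(x, y, z, k):
    """Free-field pressure of a unit monopole at the origin: exp(-i k r) / r."""
    r = math.sqrt(x * x + y * y + z * z)
    return cmath.exp(-1j * k * r) / r


def helmholtz_residual(p, point, k, h=1e-3):
    """Return laplacian(p) + k^2 p at a point, using a 7-point finite-difference stencil."""
    x, y, z = point
    center = p(x, y, z, k)
    neighbours = (p(x + h, y, z, k) + p(x - h, y, z, k) +
                  p(x, y + h, z, k) + p(x, y - h, z, k) +
                  p(x, y, z + h, k) + p(x, y, z - h, k))
    laplacian = (neighbours - 6.0 * center) / (h * h)
    return laplacian + k * k * center


c = 343.0                      # speed of sound in m/s
nu = 200.0                     # hypothetical frequency in Hz
k = 2.0 * math.pi * nu / c     # wave number
point = (1.0, 0.5, -0.7)       # hypothetical evaluation point away from the source
residual = helmholtz_residual(monopole, point, k)
relative = abs(residual) / abs(k * k * monopole(*point, k))
```

The relative residual is limited only by the discretization error of the stencil, which is what one would expect for a field that satisfies the equation exactly.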
Directional sources: The linearity of the Helmholtz equation implies that the pressure field of a linear combination of sources is a linear combination of their respective pressure fields [Pierce 1989]. In one exemplary embodiment, a source representation for directional sources described herein is a linear combination of elementary spherical harmonic sources $s_{lm}(x, y) = \frac{e^{-i 2\pi\nu r/c}}{r}\, Y_{lm}(\theta, \phi)$ with different weights $\alpha_{lm}$ (equation 2). Therefore, if, for a given scene, the pressure field $p_{lm}(x)$ corresponding to each of the elementary sources $s_{lm}(x, y)$ is computed, then the pressure field $p(x)$ due to any arbitrary directional source $s(x, y)$ can be expressed as the linear combination of the precomputed pressure fields of the elementary sources with the same weights $\alpha_{lm}$:
$$p(x) = \sum_{l=0}^{L-1} \sum_{m=-l}^{l} \alpha_{lm}\, p_{lm}(x).$$
The pressure fields for elementary sources can be computed using any underlying numerical solver. In the case of interactive applications, this computation is performed during the preprocessing stage, and the resulting pressure field data is efficiently encoded and stored. This pressure field data completely defines the acoustic response to any directional source at the given position up to the SH approximation order. At runtime, the specified source directivity $D(\theta, \phi)$ is decomposed into a spherical harmonic-based representation, using Equations (4) or (5), and the resulting weights $\alpha_{lm}$ are used to compute the final acoustic response $p(x_0)$ at listener position $x_0$, as described above.
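The runtime step reduces to a weighted sum, as the following Python sketch shows for a single frequency. The dictionary layout keyed by $(l, m)$, the function name, and all numeric values are hypothetical; a real system would read the elementary pressures from its own precomputed encoding.

```python
def acoustic_response(sh_weights, elementary_pressures):
    """Sum over (l, m) of the SH weight times the precomputed elementary pressure at x0."""
    return sum(sh_weights[lm] * elementary_pressures[lm] for lm in sh_weights)


# Hypothetical precomputed pressures of the elementary SH sources at the listener.
precomputed = {
    (0, 0): 0.20 - 0.10j,
    (1, -1): 0.05 + 0.02j,
    (1, 0): -0.03 + 0.04j,
    (1, 1): 0.01 - 0.06j,
}

# Hypothetical SH weights of the current source directivity.
weights = {(0, 0): 1.0, (1, -1): 0.0, (1, 0): 0.5, (1, 1): 0.0}
p_x0 = acoustic_response(weights, precomputed)

# Changing the directivity at run-time only changes the weights, not the precomputed data.
new_weights = {(0, 0): 1.0, (1, -1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
p_x0_new = acoustic_response(new_weights, precomputed)
```

The second call illustrates why the directivity can be modified interactively: no propagation solve is repeated, only the cheap summation.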
Time-varying and rotating directivity: For a source with time-varying directivity, the spherical harmonic decomposition (5) of the source directivity function D(θ, φ) is computed at runtime at interactive rates using fast linear solvers as discussed in detail in Section 6. For the special case of a rotating directional source, the new SH coefficients after rotation can be computed by applying the SH rotation matrix to the original SH coefficients [Green 2003]. Section 6 describes an exemplary method for handling directivity for dynamic sources in further detail.
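For intuition, the simplest rotation case can be written out directly. When a source rotates about the vertical ($z$) axis by an angle $\gamma$, the SH rotation matrix for the complex-valued basis is diagonal, and each coefficient is multiplied by $e^{-im\gamma}$. The Python sketch below covers only this special case for orders 0 and 1; general rotations need the full SH rotation matrices, which are not shown. The function names and coefficient values are hypothetical.

```python
import cmath
import math


def complex_sh(theta, phi):
    """Complex-valued SH basis for l = 0 and l = 1, keyed by (l, m)."""
    k1 = math.sqrt(3.0 / (8.0 * math.pi))
    return {
        (0, 0): 0.5 * math.sqrt(1.0 / math.pi) + 0j,
        (1, -1): k1 * math.sin(theta) * cmath.exp(-1j * phi),
        (1, 0): math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta) + 0j,
        (1, 1): -k1 * math.sin(theta) * cmath.exp(1j * phi),
    }


def evaluate(coeffs, theta, phi):
    """Evaluate the directivity sum of coeff * Y_lm at one direction."""
    basis = complex_sh(theta, phi)
    return sum(coeffs[lm] * basis[lm] for lm in coeffs)


def rotate_about_z(coeffs, gamma):
    """Rotate the directivity pattern by gamma radians about the z axis."""
    return {(l, m): a * cmath.exp(-1j * m * gamma) for (l, m), a in coeffs.items()}


# Hypothetical SH coefficients of a source directivity.
coeffs = {(0, 0): 1.0 + 0j, (1, -1): 0.3 - 0.2j, (1, 0): 0.5 + 0j, (1, 1): -0.1 + 0.4j}
gamma = math.radians(40.0)
rotated = rotate_about_z(coeffs, gamma)
```

Evaluating the rotated coefficients at azimuth $\phi$ gives the same value as evaluating the original coefficients at $\phi - \gamma$, which is exactly what rotating the pattern by $\gamma$ means.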
Aspects related to a fast and efficient method of computing listener directivity for any frequency-domain, wave-based sound propagation technique are disclosed herein. In one exemplary embodiment, the plane-wave decomposition of the sound field is computed around the listener, the plane-wave coefficients are expressed in terms of a SH basis, and the pressure and its derivatives at the listener position are used to compute the SH coefficients. The head-related transfer function (HRTF) is also expressed in the SH basis. At runtime, spatial sound can be computed as a dot product between the SH coefficients of the sound field and the HRTF.
In the frequency domain, the global sound field can be expressed as a superposition of pressure due to plane waves [Duraiswami et al. 2005]. This basis is also known as the Herglotz wave basis, with the basis functions centered at listener position x0:
$$\psi_s(x) = e^{i k\, s \cdot (x - x_0)},$$
where $s = (s_x, s_y, s_z)$ is the unit vector in the direction of plane wave propagation, $k = 2\pi\nu/c$ is the wave number, $\nu$ is the frequency, and $c$ is the speed of sound. In terms of these basis functions, the total pressure at $x$ can be determined by integrating over all directions:
where μ(s), also known as the signature function [Zotkin et al. 2010], specifies the complex-valued amplitude of the plane wave traveling along direction s, for a given frequency ν. The signature function can be further decomposed using complex-valued SH basis functions, yielding:
where $Y^*_{lm}(s)$ is the complex conjugate of the SH basis function $Y_{lm}(s)$ and $\alpha_{lm}$ are the corresponding coefficients. The plane-wave basis functions, shown in equation (7), have an important property: all of the basis functions have zero phase at the listener position $x_0$. In other words, $\psi_s(x_0) = 1$ for all $s$. Based on this property and the orthonormality of the SH basis, it may be proven that the SH-based plane-wave decomposition of the pressure field can be computed up to any order by computing the derivatives of the pressure field up to the same order.
Theorem: The sound field of the entire domain can be expressed in terms of pressure and its derivatives at a single point. Given the polynomial expression of the nth order and qth degree SH
where $a \geq 0$, $b \geq 0$, $n \geq 0$, and $a + b + c = n$, the corresponding SH coefficient in the plane-wave decomposition of the pressure field is
where
The above expression and equation 9 give the plane-wave decomposition of the sound field in terms of pressure and its high-order derivatives at point $x_0$.
See Appendix A in supplementary text for proof, and Section 5 for details on calculating the derivatives p(a,b,c). The above result is used in Section 4.2 to compute the spatial sound at both ears as a dot product of SH coefficients of plane-wave decomposition of the sound field and HRTFs.
The HRTF relates the pressure received at each ear of the listener to the pressure at a point positioned at the center of the listener's head, and accounts for scattering from the head. A SH decomposition of the left and right ear HRTFs is performed, $H^L(s) = \sum_{l'}\sum_{m'} \beta^L_{l'm'} Y_{l'm'}(s)$ and $H^R(s) = \sum_{l'}\sum_{m'} \beta^R_{l'm'} Y_{l'm'}(s)$. Similar to Rafaely et al. [2010], the pressure received at the left and right ears ($p^L$ and $p^R$) can be expressed as a dot product of the SH coefficients of the sound field and the HRTFs (see Appendix B in supplementary text for details).
As shown in Section 5, pressure derivatives (equation 11) may be computed by differentiating the pressure basis functions analytically rather than using a finite difference stencil. Therefore, higher-order derivatives do not suffer from numerical instabilities, allowing the SH coefficients of the sound field to be computed to any desired order. Rafaely et al. [2010] conducted a study on the effect of SH order on spatial perception. The results indicate that a SH order of 1-2 is sufficient for spatial perception of frequencies up to 1 kHz, a SH order of 3 suffices up to 2 kHz, and a SH order of 3-6 suffices up to 8 kHz. Higher SH order results in better spatial resolution, but computing the higher-order derivatives also increases the computational cost. Therefore, the SH order may be determined based on a performance-accuracy trade-off.
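The performance-accuracy trade-off can be expressed as a simple lookup over the frequency bands reported in the cited study. The Python sketch below is only one possible policy: the function name, the `prefer_accuracy` flag, and the choice to raise an error above 8 kHz (where the summary above gives no guidance) are assumptions.

```python
def sh_order_for_frequency(freq_hz, prefer_accuracy=True):
    """Pick a SH order for a frequency, using the band limits summarized above."""
    # Each entry: (upper frequency limit in Hz, lowest order, highest order).
    bands = [(1000.0, 1, 2), (2000.0, 3, 3), (8000.0, 3, 6)]
    for limit, low, high in bands:
        if freq_hz <= limit:
            return high if prefer_accuracy else low
    raise ValueError("no SH order guidance is given above 8 kHz")


order_speech = sh_order_for_frequency(500.0)                       # favours accuracy
order_cheap = sh_order_for_frequency(4000.0, prefer_accuracy=False)  # favours speed
```

A lower order means fewer pressure derivatives to evaluate per frequency, so an application with a tight frame budget might pass `prefer_accuracy=False`.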
Some aspects of the present subject matter allow spatial audio to be computed efficiently using two dot products (left and right ear) of complex-valued vectors. This enables the use of HRTF-based listener directivity for interactive wave-based sound propagation techniques as compared to previous approaches which are offline. Some aspects of the present subject matter also provide the flexibility of using individualized (user-specific) HRTFs without recomputing the simulation results. In one exemplary approach, only the SH coefficients of the individualized HRTFs need to be updated; the SH coefficients of the sound field remain the same. Additionally, some aspects of the present subject matter enable head-tracking for a moving and rotating listener at interactive rates by simplifying the head rotation computation using SH rotation matrices.
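The two dot products can be sketched in Python as follows. The exact expression used for the ear pressures (including any conjugation or normalization) is not reproduced here, so the plain unconjugated sum is an assumption, as are the function name and the coefficient values. The point of the sketch is only that swapping in a different HRTF changes one input vector and leaves the sound-field coefficients untouched.

```python
def ear_pressures(field_coeffs, hrtf_left, hrtf_right):
    """Dot products of the sound-field SH coefficients with each ear's HRTF coefficients.

    Assumes all three lists use the same (l, m) ordering and the same SH order.
    """
    p_left = sum(a * b for a, b in zip(field_coeffs, hrtf_left))
    p_right = sum(a * b for a, b in zip(field_coeffs, hrtf_right))
    return p_left, p_right


# Hypothetical SH coefficients at one frequency.
field = [1.0 + 1.0j, 0.5j]
generic_left, generic_right = [2.0 + 0j, 1.0j], [1.0j, 1.0 + 0j]
p_left, p_right = ear_pressures(field, generic_left, generic_right)

# An individualized HRTF replaces only the HRTF coefficients; `field` is reused as-is.
personal_left, personal_right = [1.0 + 0j, 0j], [0j, 2.0 + 0j]
q_left, q_right = ear_pressures(field, personal_left, personal_right)
```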
In this section, aspects associated with integration of an exemplary source and listener directivity representation described herein with frequency-domain wave-based sound propagation techniques are disclosed. In order to demonstrate that an exemplary directivity representation described herein can be incorporated in any frequency-domain wave-based technique, the exemplary directivity representation has been integrated with the state-of-the-art boundary element method [Liu 2009] and the interactive equivalent source method [Mehra et al. 2013].
The boundary element method is a standard numerical technique used for solving the 3D Helmholtz equation for accurate sound propagation in indoor and outdoor spaces. BEM transforms the Helmholtz equation into boundary integral equations and solves for pressure and velocity on the boundary, and thereby pressure at any point in the domain.
Simulation: Given an indoor or outdoor scene with source position $y$, the Helmholtz equation corresponding to the elementary spherical harmonic source $s_{lm}(x, y)$ is solved using BEM. This gives pressure $q_{lm}(z)$ and velocity $v_{lm}(z)$ on the domain boundary ($z$ is a point on the boundary). This step is repeated for all the elementary sources in the spherical harmonic expansion (Equation 2).
Pressure evaluation: The pressure field p_lm(x) for the elementary spherical harmonic source s_lm(x, y) is computed at the listener position x by using the boundary integral equation:

$$p_{lm}(x) = \int_{S} \left[ G(x, z, \omega)\, q_{lm}(z) - F(x, z)\, \nu_{lm}(z) \right] dS(z) + s_{lm}(x, y),$$

where G(x, z, ω) is the Green's function, F(x, z) = ∂G(x, z, ω)/∂n(z) is its normal derivative, and n(z) is the normal at a boundary point z.
For a source s(x, y) with directivity function D(θ, φ) placed at point y, the coefficients α_lm corresponding to its spherical harmonic expansion are computed (Equations (4) and (5)). The final acoustic response (pressure field) to the directional source s(x, y) is computed as a linear combination of the pressure fields p_lm(x) of the elementary sources s_lm(x, y):

$$p(x) = \sum_{l} \sum_{m=-l}^{l} \alpha_{lm}\, p_{lm}(x).$$
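Pressure derivative evaluation: In order to produce spatial sound at the listener position, the coefficients of the plane wave decomposition are determined using pressure and its first-order derivatives (Equations 9 and 11). The pressure is computed as shown above. The derivative of the pressure field due to an elementary source, ∂p_lm(x)/∂x, is computed by differentiating the functions involved analytically:

$$\frac{\partial p_{lm}(x)}{\partial x} = \int_{S} \left[ \frac{\partial G(x, z, \omega)}{\partial x}\, q_{lm}(z) - \frac{\partial F(x, z)}{\partial x}\, \nu_{lm}(z) \right] dS(z) + \frac{\partial s_{lm}(x, y)}{\partial x}.$$

The first-order derivative of the complete pressure field is computed as:

$$\frac{\partial p(x)}{\partial x} = \sum_{l} \sum_{m=-l}^{l} \alpha_{lm}\, \frac{\partial p_{lm}(x)}{\partial x}.$$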
Higher order derivatives can be computed in a similar manner.
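The linear combinations described above can be sketched in a few lines of Python. This is a minimal illustration, not the disclosed implementation: the function name, the flat ordering of the (l, m) pairs, and the sample values are assumptions. The elementary pressures and their gradients are taken as already evaluated at the listener position.

```python
import numpy as np

def directional_response(alpha, p_elem, grad_elem):
    """Combine elementary-source fields into the directional-source field.

    alpha:     (N,) SH coefficients of the source directivity
    p_elem:    (N,) pressures of the elementary sources at the listener
    grad_elem: (N, 3) spatial gradients of those pressures at the listener
    Returns the total pressure and its (3,) gradient.
    """
    alpha = np.asarray(alpha, dtype=complex)
    pressure = alpha @ np.asarray(p_elem, dtype=complex)
    gradient = alpha @ np.asarray(grad_elem, dtype=complex)
    return pressure, gradient

# Hypothetical values for two elementary sources.
alpha = [1.0, 0.5j]
p_elem = [2.0, 4.0]
grad_elem = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
p, grad = directional_response(alpha, p_elem, grad_elem)
```

Because the elementary fields do not depend on the directivity, changing the directivity at runtime only changes `alpha`; the precomputed `p_elem` and `grad_elem` are reused.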
The equivalent source technique [Mehra et al. 2013] can perform interactive wave-based sound propagation in large outdoor scenes. This technique decomposes the global pressure field of a scene into local per-object sound fields and inter-object interactions, which are precomputed offline. The pressure field is computed by solving a global linear system consisting of per-object and inter-object interactions, and is encoded efficiently as the strengths of equivalent sources. At runtime, the acoustic response at a moving listener is efficiently computed by performing a fast summation over all the equivalent sources.
Preprocessing: Assume a scene composed of K objects, A_1, A_2, ..., A_K, and a source position y. During the preprocessing stage, the elementary SH sources s_lm(x, y) are expressed in terms of the incoming equivalent sources of these objects. Next, the global solve step is performed to compute the strengths of outgoing equivalent sources for all the objects: (C_lm)_A1, (C_lm)_A2, ..., (C_lm)_AK. These outgoing equivalent source strengths represent an efficient encoding of the pressure field of the scene. This step is repeated for all the elementary sources in the spherical harmonic expansion (Equation 2), and the computed equivalent source strengths are stored for runtime use.
Runtime: The final acoustic response due to the directional source s(x, y) is computed using Equation 13. The stored equivalent source strengths are used to compute the pressure fields p_lm(x) for the elementary spherical harmonic sources s_lm(x, y) at the listener position x:
The derivatives of the pressure field are computed as before (Equation 14). The derivative of the pressure field due to an elementary source, ∂p_lm(x)/∂x, is computed by analytically differentiating the functions involved: the equivalent source basis functions Φ^out(x) (as defined in [Mehra et al. 2013]) and the source field s_lm(x, y).
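The runtime summation over equivalent sources can be illustrated with a deliberately simplified Python sketch. The basis functions Φ^out defined in [Mehra et al. 2013] are replaced here by plain monopoles of the form e^{ikr}/(4πr). That substitution, the sign convention of the exponent, and all function and parameter names are assumptions made only for illustration.

```python
import numpy as np

def monopole(x, center, k):
    """Free-field monopole basis function (assumed stand-in for the ESM basis)."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(center, dtype=float))
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

def esm_pressure(x, centers, strengths, k, source_field=0.0):
    """Sum the stored strengths times the basis functions, plus the source field.

    centers:      positions of the outgoing equivalent sources
    strengths:    their precomputed complex strengths
    source_field: direct field of the elementary source at x (hypothetical input)
    """
    total = complex(source_field)
    for center, strength in zip(centers, strengths):
        total += strength * monopole(x, center, k)
    return total

# One unit-strength equivalent source at the origin, listener 1 m away, k = pi.
p = esm_pressure([1.0, 0.0, 0.0], centers=[[0.0, 0.0, 0.0]], strengths=[1.0], k=np.pi)
```

The cost of this evaluation grows linearly with the number of equivalent sources, which is what makes a per-frame update at a moving listener feasible.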
In this section, details of an exemplary implementation of a listener directivity approach described herein are disclosed. In this exemplary implementation, a 64-bit FASTBEM implementation of the fast multipole boundary element method (www.fastbem.com) is used. For ESM, the preprocessing code is implemented in MATLAB. The runtime code is implemented in C++ and has been integrated with Valve's Source game engine. The timing results for the BEM and ESM precomputation are measured on a 64-node CPU cluster (Xeon X5560 processor nodes, 8 cores, 2.80 GHz, 48 GB memory). The precomputation for each frequency is performed in parallel over all the nodes of the cluster. The timing results of the exemplary runtime system are measured on a single core of a 4-core 2.80 GHz Xeon X5560 desktop with 4 GB of RAM and an NVIDIA GeForce GTX 480 GPU with 1.5 GB of memory. Acoustic responses are computed up to the maximum simulation frequency of 1 kHz.
Measurement data: The source directivity data used in the exemplary system is extracted from real-world measurement data provided by Meyers et al. [PTB 1978]. The data is magnitude only, is averaged over the frequencies in each octave band, and is provided for all the octave bands within the frequency range of the sound sources. SH order L=3 or 4 is used for the source representation, which gives less than 10-15% error between D, the measured directivity data, and D_SH, the SH-based source directivity representation (Equation 2). For spatial sound, the HRTF dataset provided by Algazi et al. is used, particularly for the KEMAR dummy. The SH coefficients of this HRTF are computed in a manner similar to Equation 5.
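Fitting SH coefficients to sampled directivity data can be sketched as a least-squares problem. The Python example below is a minimal illustration under several assumptions. It uses a hand-written real SH basis truncated at degree l = 1 rather than the order 3 or 4 mentioned above. It fits a synthetic cardioid instead of measured data. It reports a relative L2 error, which is not necessarily the error measure used in the text.

```python
import numpy as np

def real_sh_basis_degree1(theta, phi):
    """Real SH basis up to degree l = 1; theta is polar angle, phi is azimuth.

    Columns are ordered (l, m) = (0, 0), (1, -1), (1, 0), (1, 1).
    """
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.stack([
        c0 * np.ones_like(theta),
        c1 * np.sin(theta) * np.sin(phi),
        c1 * np.cos(theta),
        c1 * np.sin(theta) * np.cos(phi),
    ], axis=1)

def fit_sh(theta, phi, directivity):
    """Least-squares SH coefficients and the relative L2 error of the fit."""
    basis = real_sh_basis_degree1(theta, phi)
    coeffs, *_ = np.linalg.lstsq(basis, directivity, rcond=None)
    residual = basis @ coeffs - directivity
    rel_err = np.linalg.norm(residual) / np.linalg.norm(directivity)
    return coeffs, rel_err

# Synthetic "measurement": a cardioid sampled at 200 random directions.
rng = np.random.default_rng(0)
theta = np.arccos(rng.uniform(-1.0, 1.0, 200))
phi = rng.uniform(0.0, 2.0 * np.pi, 200)
cardioid = 0.5 * (1.0 + np.cos(theta))
coeffs, rel_err = fit_sh(theta, phi, cardioid)
```

A cardioid lies entirely within the degree-1 basis, so the fit is exact up to rounding. Measured directivities generally leave a nonzero residual that shrinks as the SH order grows.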
Rotating sources and time-varying directivity: The SH coefficients of the directivity after rotation by an angle φ about the z axis can be obtained by multiplying the vector of unrotated SH coefficients by the blockwise-sparse matrix given in [Green 2003, p. 23]. Matrices can also be derived for more general rotations about arbitrary axes [Green 2003, p. 21-26]. For handling time-varying directivity, Intel MKL is used to solve the linear system corresponding to the SH decomposition of the directivity function (Equation 5) at interactive rates during runtime.
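The z-axis rotation described above can be sketched for real SH coefficients as follows. This is an illustrative version, not the matrix from [Green 2003]. It assumes coefficients stored at index l² + l + m, with m > 0 terms varying as cos(mφ) and m < 0 terms as sin(|m|φ). Sign conventions differ between references, and the helper names are hypothetical.

```python
import numpy as np

def rotate_real_sh_about_z(coeffs, order, gamma):
    """Rotate a real-SH expansion by angle gamma about the z axis.

    Only the (l, m) and (l, -m) coefficients mix, so the full rotation
    matrix is block-sparse with 2x2 blocks; the blocks are applied directly.
    `order` is the maximum degree l, giving (order + 1)**2 coefficients.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    out = coeffs.copy()
    for l in range(order + 1):
        center = l * l + l                      # index of (l, m = 0)
        for m in range(1, l + 1):
            a_cos, a_sin = coeffs[center + m], coeffs[center - m]
            c, s = np.cos(m * gamma), np.sin(m * gamma)
            out[center + m] = c * a_cos - s * a_sin
            out[center - m] = s * a_cos + c * a_sin
    return out

def eval_degree1(coeffs, theta, phi):
    """Evaluate a degree <= 1 real-SH expansion (used only to check the rotation)."""
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    basis = np.array([c0,
                      c1 * np.sin(theta) * np.sin(phi),
                      c1 * np.cos(theta),
                      c1 * np.sin(theta) * np.cos(phi)])
    return float(basis @ coeffs)

coeffs = np.array([1.0, 0.3, -0.2, 0.7])        # hypothetical directivity
gamma = 0.4
rotated = rotate_real_sh_about_z(coeffs, order=1, gamma=gamma)
```

With this convention, the rotated pattern evaluated at azimuth φ equals the original pattern evaluated at φ − γ. The per-update cost is proportional to the number of coefficients rather than to its square.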
Dynamic sources: In order to enable dynamic (moving) sources, pressure fields are precomputed due to the elementary SH sources at sampled positions in the environment and, at runtime, spatial interpolation is used to perform sound propagation for dynamic sources as proposed in [Raghuvanshi et al. 2010]. The memory requirement scales linearly with the number of sampled positions.
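The idea of interpolating between precomputed source positions can be illustrated with a simple Python sketch. Inverse-distance weighting over the nearest samples is used here purely as a stand-in; it is not claimed to be the interpolation scheme of [Raghuvanshi et al. 2010]. The function name, array layout, and sample values are assumptions.

```python
import numpy as np

def interpolate_fields(sample_positions, sample_fields, source_pos, k=2, eps=1e-9):
    """Blend precomputed fields from the k nearest sampled source positions.

    sample_positions: (P, 3) sampled source positions
    sample_fields:    (P, N) precomputed complex responses, one row per position
    source_pos:       (3,) current position of the moving source
    """
    dist = np.linalg.norm(sample_positions - source_pos, axis=1)
    nearest = np.argsort(dist)[:k]
    if dist[nearest[0]] < eps:            # source sits on a sample: no blending
        return sample_fields[nearest[0]]
    weights = 1.0 / dist[nearest]
    weights /= weights.sum()
    return np.tensordot(weights, sample_fields[nearest], axes=1)

# Two sampled positions with one stored coefficient each (hypothetical values).
positions = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
fields = np.array([[1.0 + 0.0j], [3.0 + 0.0j]])
midway = interpolate_fields(positions, fields, np.array([1.0, 0.0, 0.0]))
on_sample = interpolate_fields(positions, fields, np.array([0.0, 0.0, 0.0]))
```

The stored array has one row per sampled position, so its size (`fields.nbytes`) grows in direct proportion to the number of samples, consistent with the linear memory scaling noted above.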
Auralization: Acoustic responses for elementary sources precomputed using the equivalent source technique and the SH coefficients of the HRTF are loaded into Valve's game engine upon startup. At runtime, the acoustic responses of the elementary sources are evaluated and extrapolated to the output frequency (22 kHz), similar to [Mehra et al. 2013]. SH coefficients are computed based on the specified source directivity and are used to generate the total pressure field and its derivatives at the listener position. As the listener (player) moves through the scene, the computed pressure and pressure derivatives are combined with the SH coefficients of the HRTF to compute binaural frequency responses to produce spatial sound at the listener. These frequency response computations are performed asynchronously from both the visual rendering pipeline and the audio processing pipeline. As shown in Table 2, an exemplary technique described herein can update binaural frequency responses at a rate of 10-15 Hz or more, which is sufficient for audio applications [IASIG 1999], while the game itself is able to perform visual rendering at 60 frames per second (or more). Audio processing is performed using FMOD (with Fourier transforms computed using Intel MKL), and rendered in frames of 1024 samples.
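Frame-based rendering of a source signal through an acoustic response can be sketched with overlap-add filtering. The Python example below is a hedged illustration only. NumPy's FFT stands in for the FMOD and Intel MKL pipeline named above. The response is assumed to be available as a time-domain impulse response. The function name and test signals are hypothetical.

```python
import numpy as np

def render_overlap_add(signal, impulse_response, frame=1024):
    """Filter `signal` with `impulse_response` one frame at a time.

    Each frame is transformed, multiplied by the response spectrum, and
    the filtered tail is overlapped and added into the following frames.
    """
    signal = np.asarray(signal, dtype=float)
    ir = np.asarray(impulse_response, dtype=float)
    n_fft = 1
    while n_fft < frame + len(ir) - 1:      # avoid circular wrap-around
        n_fft *= 2
    ir_spectrum = np.fft.rfft(ir, n_fft)
    out = np.zeros(len(signal) + len(ir) - 1)
    for start in range(0, len(signal), frame):
        block = signal[start:start + frame]
        filtered = np.fft.irfft(np.fft.rfft(block, n_fft) * ir_spectrum, n_fft)
        n_valid = len(block) + len(ir) - 1
        out[start:start + n_valid] += filtered[:n_valid]
    return out

rng = np.random.default_rng(0)
dry = rng.standard_normal(3000)             # hypothetical source audio
ir = rng.standard_normal(200)               # hypothetical impulse response
wet = render_overlap_add(dry, ir)
```

The output matches a direct convolution of the whole signal, but only one 1024-sample frame needs to be available at a time, which suits a streaming audio pipeline.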
The BEM error tolerance and the order of the multipole expansion were set to 1% and 6, respectively. ESM error thresholds for the scattering and interaction matrices are 15% and 1%, respectively. These error thresholds were chosen for interactive performance and can be reduced to achieve higher accuracy. The pressure fields produced by the two techniques agree to within an error of less than 5-10%. The offline BEM method has a much higher time and memory overhead, but the resulting pressure fields are more accurate.
In Table 1, the cost of precomputing the pressure fields using BEM and ESM is shown. Since the computations for all sound sources, elementary SH sources, and frequencies are independent, they can easily be performed in parallel. Table 2 shows the runtime performance and memory requirements of the ESM technique for computing the acoustic response for arbitrary source directivity at a moving listener. For the case of a dynamic source in the parallel buildings scene, 30 sampled positions in the scene were selected. FASTBEM computes the acoustic response (pressure evaluation) for the empty room and the furnished room at the listener position in 3 and 5 seconds, respectively. The storage cost for the BEM pressure fields is 716 MB and 1.4 GB for the empty and furnished room, respectively. BEM has a much higher memory requirement than ESM (Table 4 in [Mehra et al. 2013]) since it captures the sound-field interactions close to the surface of the object. ESM, on the other hand, captures the sound-field interactions outside the object's offset surface.
In a related video, auralizations are modeled for the various scenes listed in Table 2, with the demonstrated principles and/or effects described below.
Parallel buildings and an empty living room: A comparison between the sound propagation for an omnidirectional source and a directional source is shown.
Furnished living room: The sound propagation due to a television (a typical directional source) in the living room setting is shown, and low-passing of diffracted sound behind doors is demonstrated.
Wall: Using an exemplary listener directivity approach described herein enables a listener to localize the direction of the sound source. Also, time-varying source directivity caused by the motion of an obstacle in front of the source is shown.
Reservoir (from Half-Life 2): An exemplary listener directivity approach described herein has been integrated with an existing game engine (Valve Source™ game engine) to generate realistic acoustic effects from directional sources and listeners.
Christmas town: The source directivity of a bell tower and the diffraction low-passing of sound behind houses are demonstrated.
Dynamic directional sources: To demonstrate source and listener directivity for dynamic sources and listeners, a helicopter flying between two parallel buildings and a moving player are demonstrated.
Amphitheater: Propagated sound due to directional sources at various locations around a real-world environment is shown.
Source modeling: In the SH decomposition of the directivity function, no significant ringing was observed for the first four octave bands. For the last two octave bands, ringing was resolved by standard techniques (windowing the truncated SH coefficients using Lanczos sigma factors). The SH-based source representation described herein converges to the measured directivity with increasing SH order.
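Windowing truncated SH coefficients with Lanczos sigma factors can be sketched as below. Several definitions of the sigma factor are in use. This Python example assumes sigma_l = sinc(l / (L + 1)) for maximum degree L, with coefficients stored at index l² + l + m. It should be read as one possible convention, not the exact window used in the text.

```python
import numpy as np

def lanczos_sigma_window(coeffs, order):
    """Attenuate higher-degree SH coefficients to reduce ringing.

    coeffs: (order + 1)**2 coefficients, ordered by degree l then m.
    Every coefficient of degree l is scaled by sinc(l / (order + 1)),
    where np.sinc(x) = sin(pi * x) / (pi * x).
    """
    out = np.array(coeffs, dtype=complex)
    for l in range(order + 1):
        sigma = np.sinc(l / (order + 1))
        out[l * l:(l + 1) ** 2] *= sigma    # all m = -l..l for this degree
    return out

# Hypothetical all-ones coefficients for maximum degree 1 (four coefficients).
windowed = lanczos_sigma_window(np.ones(4), order=1)
```

The degree-0 term is left unchanged, so the mean level of the directivity is preserved while the sharpest angular detail is damped the most.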
Sound propagation: In order to validate that an exemplary listener directivity approach described herein can correctly capture the effect of source directivity on sound propagation, validation experiments were performed against the Biot-Tolstoy-Medwin (BTM) method [Svensson et al. 1999]. This is an offline technique that provides a wideband reference solution with accurate diffraction in simple scenes and can incorporate source directivities. A data-driven SH-based directivity formulation described herein was integrated with the BTM Toolbox for MATLAB (www.iet.ntnu.no/~svensson/software/index.html#EDGE). The BTM approach models edge diffraction by placing several secondary sources at positions sampled along the diffracting edges. Their intensities are determined only by the intensity of the sound source and the distance between the sound source and the secondary source. The strength of each secondary source was scaled by the SH-approximated directivity function D(θ, φ), where (θ, φ) is determined by the direction vector from the source to the secondary source. BTM integral calculations were observed to slow down significantly due to the need to evaluate source directivity for each integration sample along each edge. BTM simulations were performed for musical instruments over three receiver positions and a fixed source position in the wall scene. Simulations for receivers 1, 2, and 3 took 57, 27, and 18 minutes, respectively. A supplementary video included receiver positions and auralizations.
The present subject matter discloses aspects related to incorporating modifiable source directivity and HRTF-based listener directivity in general frequency-domain wave-based sound propagation techniques. Some aspects disclosed herein can automatically model wave effects from low-frequency directional sources. Some aspects disclosed herein can handle analytical, data-driven, rotating, or time-varying directivity at runtime. The present subject matter also discloses aspects related to an efficient method for performing spatial sound rendering at interactive rates.
In some embodiments, directional sources with sharp directivity patterns (delta functions) may be expensive to handle with our current approach since they would require a very high-order SH expansion. As such, other basis functions (such as wavelets) may be usable to handle sharp directivities.
The source formulation described herein can handle both near- and far-field sound radiation by directional sources (Equation 1). In some embodiments, near-field directivity may require a set of dense measurements of complex frequency responses very close to the source at twice the Nyquist rate, which is currently unavailable. In such embodiments, near-field directivity may be determined when such a dataset becomes available in the future.
In some embodiments, hybridization of wave-based techniques with geometric approaches may be utilized to handle directional sources over the complete audible frequency range.
In some embodiments, support for artist-controlled directivity may be added, thereby allowing real-time feedback of the effect of directivity on propagated sound.
It will be appreciated that while magnitude-only directivity data may be used in some aspects of the present subject matter, aspects of the present subject matter can easily support complex data (e.g., magnitude and phase), which we plan to test in the future.
At step 802, one or more sound fields associated with a source or listener position may be computed prior to run-time. For example, SPM module 110 may be configured to precompute and store propagated sound fields or pressure fields due to SH sources.
In some embodiments, computing, prior to run-time, one or more sound fields associated with a source position may include using a one-point multipole method to represent a sound field radiated by a directional source and capturing directivity using spherical harmonic (SH) expansion.
In some embodiments, one or more precomputed sound fields may be associated with one or more elemental spherical harmonics sources located at the source position.
In some embodiments, one or more precomputed sound fields may represent one or more acoustic responses of the environment to any directivity at the source position.
At step 804, at run-time, source or listener directivity in an environment may be modeled using the one or more pressure fields and a wave-based sound propagation model. For example, SPM module 110 may compute a total sound field using a wave-based sound propagation technique at run-time, thereby allowing modeling of a source with a directivity that can be dynamically modified at run-time.
In some embodiments, run-time may occur when executing a sound propagation model application for the environment and precomputing may occur prior to run-time.
In some embodiments, modeling, at run-time and using the one or more sound fields and the wave-based sound propagation model, source or listener directivity in an environment may include generating, using the one or more sound fields, an acoustic response to an arbitrary time-varying or rotating directional source.
In some embodiments, modeling, at run-time and using the one or more sound fields and the wave-based sound propagation model, source or listener directivity in an environment may include performing a spherical harmonic (SH) decomposition of the source directivity, computing one or more SH coefficients corresponding to the SH decomposition, and computing an acoustic response of the environment, wherein the acoustic response is computed as a summation of the one or more sound fields evaluated at the listener position weighted by the one or more SH coefficients.
In some embodiments, computing an acoustic response using precomputed sound fields may involve convolving the acoustic response with source audio information to render the sound.
In some embodiments, acoustic responses for both ears may be computed using a head-related transfer function (HRTF).
In some embodiments, modeling, at run-time and using the one or more sound fields and the wave-based sound propagation model, source or listener directivity in an environment may include computing spatial sound as an inner product of spherical harmonic (SH) coefficients associated with an SH decomposition of the sound field and a head-related transfer function for a moving listener.
The disclosures of all of the references listed herein are hereby incorporated herein by reference in their entireties.
It will be understood that various details of the subject matter described herein may be changed without departing from the scope of the subject matter described herein. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation, as the subject matter described herein is defined by the claims as set forth hereinafter.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/841,910, filed Jul. 1, 2013; the disclosure of which is incorporated herein by reference in its entirety.
This invention was made with government support under Grant Nos. IIS-0917040, CMMI-1000579, and 0904990 awarded by the National Science Foundation and W911NF-10-1-0506 awarded by the Army Research Office. The government has certain rights in the invention.