The present invention relates to the field of room impulse response simulation and, more particularly, to methods and systems for generating simulated room impulse responses including spatially enveloping reverberation.
Sound characteristics of an enclosure are generally due to a combination of direct sound received from a sound source, as well as indirectly received sound due to multiple reflections of the sound from the boundaries and other surfaces within the enclosure. In general, the transmitted sound may be reflected, absorbed and/or diffused by various surfaces within the enclosure prior to reaching the receiver. The absorption, reflectivity and diffusion characteristics of each surface may also vary as a function of frequency. The sound characteristics of an enclosure may be described with respect to a room impulse response (also referred to herein as an impulse response) between the sound source and the receiver.
Room impulse responses for an enclosure may also be used to determine various psychoacoustic parameters. The psychoacoustic parameters are related to acoustical attributes of an enclosure and are generally correlated with acoustical qualities of the enclosure. For example, the psychoacoustic parameters may be used to characterize an enclosure in terms of its spaciousness, envelopment, clarity, reverberance and warmth of sound.
Room impulse responses may be measured or simulated. Room impulse responses, as well as psychoacoustic parameters, may be used to design acoustically desirable enclosures. Room impulse responses may also be combined with a desired sound signal, to create a virtual listening environment for the sound signal.
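By way of a non-limiting illustration, combining a room impulse response with a desired sound signal may be sketched as a convolution. The function name `render`, the toy signals and the two-channel (left/right) layout are assumptions made for this sketch only and are not taken from the description.

```python
import numpy as np

def render(dry, rir_left, rir_right):
    """Convolve a mono dry signal with a left/right impulse-response pair.

    Both impulse responses are assumed to have the same length so that the
    two output channels can be stacked into one array.
    """
    left = np.convolve(dry, rir_left)
    right = np.convolve(dry, rir_right)
    return np.stack([left, right])

# Toy example: a dry signal made of two clicks and two short impulse responses.
dry = np.zeros(16)
dry[0] = 1.0
dry[4] = 0.5
rir_left = np.array([1.0, 0.0, 0.3])   # direct sound plus one reflection
rir_right = np.array([0.0, 0.8, 0.0])  # delayed, attenuated arrival
out = render(dry, rir_left, rir_right)
```

Each click of the dry signal reappears in each output channel, scaled and delayed by the corresponding impulse response, which is how a virtual listening environment may be imposed on the signal.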
The present invention is embodied in methods and systems for simulating at least one room impulse response between two or more sound sources and two or more receivers positioned in an enclosure. At least one early impulse response is generated that includes early reflections from the two or more sound sources to at least one of the receivers. At least one late impulse response is generated which includes a reverberation portion. The late impulse response is generated to spatially shape the reverberation portion corresponding to a spatial parameter of the enclosure. The at least one early impulse response is combined with the at least one late impulse response to form the at least one simulated room impulse response.
The invention may be understood from the following detailed description when read in connection with the accompanying drawings. It is emphasized that, according to common practice, various features/elements of the drawings may not be drawn to scale. On the contrary, the dimensions of the various features/elements may be arbitrarily expanded or reduced for clarity. Moreover, in the drawings, common numerical references are used to represent like features/elements. Included in the drawing are the following figures:
In conventional room impulse response simulation, including simulation of binaural room impulse responses, there may be a substantial computational load to simulate a full scope of the room impulse response. To reduce the computational load, conventional simulators often simulate an early part of the room reflections, while appending an artificially produced late part (typically referred to as the reverberation tail) to the binaural simulation of the early part of the room impulse response.
Conventional room impulse response simulators, however, do not take into account the psychoacoustic qualities of the enclosure when generating the reverberation tail. For example, one acoustical quality of an enclosure is its perceived spaciousness. In general, spaciousness includes an apparent source width (ASW) of the early part of the room impulse response and a listener envelopment (LEV) of the reverberant tail. Both the ASW and the LEV may be determined for enclosures from the interaural cross correlation coefficient (IACC). The IACC is a measure of the difference in sounds arriving at each of the ears at any instant in time. For example, a sound wave that arrives laterally to a listener may be received by one ear earlier than the other, and the character of the sound may be different (due to the intervening head). Accordingly, the IACC may provide a measure of spatial impression of the enclosure. Typically a measure of the IACC from the direct sound to about 80 msec is used to determine the ASW and a measure of the IACC after 80 msec is used to determine the LEV.
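As a hedged illustration of the early and late IACC measures mentioned above, the following sketch computes the maximum of the normalized interaural cross-correlation over lags of about ±1 msec, which is the commonly used definition and is assumed here rather than taken from the description. The function name `iacc`, its parameters and the test signals are hypothetical.

```python
import numpy as np

def iacc(left, right, fs, t_start=0.0, t_end=None, max_lag_ms=1.0):
    """Peak of the normalized cross-correlation of two ear signals.

    The signals are windowed to [t_start, t_end) seconds and the search is
    limited to lags within +/- max_lag_ms (assumed value: 1 msec).
    """
    a = int(round(t_start * fs))
    b = len(left) if t_end is None else int(round(t_end * fs))
    l = np.asarray(left, dtype=float)[a:b]
    r = np.asarray(right, dtype=float)[a:b]
    norm = np.sqrt(np.sum(l ** 2) * np.sum(r ** 2))
    if norm == 0.0:
        return 0.0
    full = np.correlate(l, r, mode="full")  # lags -(N-1) .. (N-1)
    mid = len(l) - 1                         # index of zero lag
    max_lag = min(int(round(max_lag_ms * 1e-3 * fs)), mid)
    window = full[mid - max_lag: mid + max_lag + 1]
    return float(np.max(np.abs(window)) / norm)

fs = 8000
rng = np.random.default_rng(0)
left = rng.standard_normal(2000)
independent = rng.standard_normal(2000)
delayed = np.concatenate([np.zeros(4), left[:-4]])  # 0.5 msec interaural delay

iacc_same = iacc(left, left, fs)
iacc_delayed = iacc(left, delayed, fs)
iacc_independent = iacc(left, independent, fs)
iacc_early = iacc(left, delayed, fs, 0.0, 0.08)  # direct sound to 80 msec
iacc_late = iacc(left, delayed, fs, 0.08)        # after 80 msec
```

In this sketch, identical or merely delayed ear signals give a value near 1, whereas independent ear signals give a value near 0. A lower late IACC is generally associated with a greater perceived envelopment.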
According to aspects of the present invention, the late impulse response is generated to include a perceived listener envelopment. The present invention uses deterministic coded signals to generate the late reverberation tail using a spatial shaping matrix. The spatial shaping matrix may be selected to provide a reverberance that is perceived as spatially enveloping. The reverberation tail may be appended to an early impulse response, which may also include a measure of perceived spaciousness. By including the spaciousness in the early part of the impulse response, as well as in the reverberation tail, the simulated room impulse response may have a more natural perceived spaciousness. The simulated room impulse responses may be used for filtering music and/or speech signals. The signals may be rendered binaurally through headphones or transaural systems. The present invention may also be extended to multiple channels of spatial reverberation. The present invention may be used for artificial reverberators, active field synthesizers, for producing digital sound and for audio mixing devices. The present invention may also be used in virtual reality systems.
Referring to
As shown in
where n represents the number of sources, p represents the number of receivers, X(t) represents the source signal for the ith source, t represents time and * represents the convolution operation. As can be seen by
Accordingly, all of the impulse responses between n sources 102 and p receivers 104 may be represented in vector form,
In addition, the source signals may be represented in vector form as:
and the received signals may be represented in vector form as:
Referring next to
Memory 210 may store a plurality of predetermined enclosure parameters for use in generating the simulated room impulse response. For example, the predetermined enclosure parameters may include at least one of enclosure dimensions (e.g., length, width and height), acoustic properties (e.g., absorption characteristics or diffusion characteristics over a plurality of frequency bands for one or more surfaces of the enclosure) and psychoacoustic properties (e.g., an interaural cross correlation (IACC)) for a plurality of predetermined enclosures. Memory 210 may also store one or more simulated room impulse responses,
Controller 202 may be a conventional digital signal processor that controls generation of simulated room impulse responses in accordance with the subject invention. System 200 may include other electronic components and software suitable for performing at least part of the functions of generating the simulated room impulse response.
Referring to
In general, room impulse response 302 is determined as a function of time. Room impulse response 302 includes direct sound component 304, early reflections 306 and reverberation tail 308. The early impulse response
In
Controller 202 may be configured to select predetermined enclosure parameters from memory 210 for generating a simulated room impulse response. Controller 202 may configure early IR generator 204 with the selected enclosure parameters retrieved from memory 210. Thus, early IR generator 204, as configured by the controller 202, may generate an early impulse response,
Early IR generator 204 may generate the early impulse response based on the enclosure parameters (e.g., the enclosure dimensions and acoustical parameters of the enclosure) and the location of each source and each receiver in the room. According to the present invention, the early impulse response components 304 and 306 may be determined based on the propagation path lengths of the respective component in the enclosure from the source to the receiver. Early reflections 306 may include, for example, first and second order reflections of sound from surfaces of the enclosure. The time delay may be determined from the speed of sound of the fluid (e.g. 341 m/s for air under ambient conditions). For example, ray tracing techniques or image source modeling may be used to estimate the delay time and the amplitude of each reflection. Examples of simulating the early impulse response may be found, for example, in U.S. 2008/0273708 to Sandgren et al., entitled “Early Reflection Method for Enhanced Externalization,” the contents of which are incorporated herein by reference.
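The relationship between propagation path length, delay time and amplitude may be illustrated with a minimal image source sketch for a rectangular enclosure. The sketch is limited to the direct sound and first order reflections. The single broadband reflection coefficient, the 1/distance amplitude law and the names `first_order_reflections` and `early_ir` are assumptions made for illustration and do not represent the method of the incorporated reference.

```python
import numpy as np

SPEED_OF_SOUND = 341.0  # m/s, the value quoted in the description for air

def first_order_reflections(room, src, rcv, reflection=0.8):
    """Return (delay in seconds, amplitude) for the direct sound and the six
    first order image sources of a box with walls at 0 and room[axis]."""
    room, src, rcv = (np.asarray(v, dtype=float) for v in (room, src, rcv))
    paths = [(np.linalg.norm(src - rcv), 1.0)]  # direct sound
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = src.copy()
            image[axis] = 2.0 * wall - src[axis]  # mirror source in the wall
            paths.append((np.linalg.norm(image - rcv), reflection))
    # delay from path length; amplitude from spherical spreading (assumed)
    return [(d / SPEED_OF_SOUND, g / d) for d, g in paths]

def early_ir(paths, fs, length_s=0.1):
    """Place each arrival at its nearest sample in an early impulse response."""
    ir = np.zeros(int(round(length_s * fs)))
    for delay, amp in paths:
        idx = int(round(delay * fs))
        if idx < len(ir):
            ir[idx] += amp
    return ir

paths = first_order_reflections(room=(6.82, 5.0, 3.0),
                                src=(1.0, 2.0, 1.5),
                                rcv=(4.41, 2.0, 1.5))
ir = early_ir(paths, fs=8000)
```

Here the source and receiver are 3.41 m apart, so the direct sound arrives after about 10 msec, and each reflection arrives later because its image source is farther from the receiver. Frequency-dependent absorption and higher order reflections are omitted from the sketch.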
Controller 202 may also configure late IR generator 206 with the selected enclosure parameters retrieved from memory 210. Thus, late IR generator 206, as configured by the controller 202, may generate a late impulse response,
In conventional room impulse response simulators, reverberation tail 308 is typically simulated using statistical methods. For example, a pseudorandom sequence may be used with an exponential decay to simulate reverberation tail 308. The conventional methods, however, do not take into account the psychoacoustic properties of the enclosure, such as the spaciousness of the enclosure. According to an exemplary embodiment, late IR generator 206 applies a spatial shaping matrix to reverberation tail 308, based on the psychoacoustic parameters of the enclosure. Accordingly, any spatial envelopment present in the early impulse response may be matched by reverberation tail 308, thus providing a more natural sounding listening experience. Late IR generator 206 is described further below with respect to
Controller 202 may also configure room impulse response generator 208 to combine the early impulse responses
Room impulse response generator 208 may, in general, concatenate the early impulse responses generated by early IR generator 204 with the late impulse responses generated by late IR generator 206. For example,
According to an exemplary embodiment, the early impulse response may be determined by early IR generator 204 for about the first 80 to 100 ms of the room impulse response. The late impulse response may be generated by late IR generator 206 for the remaining portion of the impulse response. The duration of the late impulse response generally depends on the reverberation time for the enclosure.
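The splitting of the room impulse response at about 80 ms, and the dependence of the tail on the reverberation time, may be sketched as follows. The tail used here is a conventional exponentially decaying noise sequence, serving only as a stand-in for the late impulse response. The 60 dB decay envelope is the standard definition of reverberation time and is assumed rather than taken from the description, and the names `decay_envelope`, `noise_tail` and `splice` are hypothetical.

```python
import numpy as np

def decay_envelope(fs, rt60, duration):
    """Exponential envelope that has fallen by 60 dB at t = rt60 seconds."""
    t = np.arange(int(round(duration * fs))) / fs
    return 10.0 ** (-3.0 * t / rt60)

def noise_tail(fs, rt60, duration, seed=0):
    """Stand-in late tail: pseudorandom noise shaped by the decay envelope."""
    env = decay_envelope(fs, rt60, duration)
    return np.random.default_rng(seed).standard_normal(env.size) * env

def splice(early, late_tail, fs, t_split=0.08):
    """Keep the first t_split seconds of the early part, then append the tail."""
    n_split = int(round(t_split * fs))
    early = np.asarray(early, dtype=float)[:n_split]
    early = np.pad(early, (0, n_split - early.size))  # zero-pad a short early part
    return np.concatenate([early, late_tail])

fs = 8000
early = np.zeros(740)
early[0] = 1.0                                   # direct sound only, for brevity
env = decay_envelope(fs, rt60=0.25, duration=0.5)
tail = noise_tail(fs, rt60=0.25, duration=0.5)
rir = splice(early, tail, fs)
```

In practice, the level of the tail would also be matched to the level of the early part at the splice point; that scaling is omitted from the sketch.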
System 200 may optionally include display 216 configured to display at least one of early impulse responses
System 200 may optionally include user interface 218, e.g., for use in selecting the enclosure parameters to simulate the room impulse response. User interface 218 may further be used to select enclosure parameters, impulse responses and other sound signals to be displayed and/or stored. User interface 218 may include a pointing device type interface for selecting control parameters using display 216. User interface 218 may further include a text interface for entering information, for example, a filename for storing the simulated room impulse response, such as in memory 210 or in a remote device (not shown).
System 200 may optionally include loudspeaker 214 for playing back the simulated room impulse responses. Loudspeaker 214 may include any loudspeaker capable of playing back the simulated room impulse responses.
System 200 may optionally include virtual room convolver 212 for convolving source sound signals
It is contemplated that system 200 may be configured to connect to a global information network, e.g. the Internet, (not shown) such that simulated room impulse response may also be transmitted to a remote location for further processing and/or storage.
A suitable controller 202, early IR generator 204, late IR generator 206, room impulse response generator 208, memory 210, virtual room convolver 212, loudspeaker 214, display 216 and user interface 218 for use with the present invention will be understood by one of skill in the art from the description herein.
Referring next to
Coded sequence generator 402 generates a coded pseudorandom sequence, referred to as
where m(t) represents an MLS and mR(t) represents a reciprocal MLS. In general, any number of sources mv(t)=m(t) mR(t+v) may be used, where v is an integer greater than or equal to 1. A reciprocal MLS may be obtained as the time-reversed version of m(t), such that mR(t)=m(−t). Reciprocal pairs of MLS sequences may therefore be easily generated via time reversal. In addition, the cross-correlation values of reciprocal MLS sequences are sufficiently low to allow for creation of a maximum desired perceived spaciousness.
Any suitable MLS-related sequence may be used, where the sequence possesses a pulse-like periodic autocorrelation function and where the periodic cross-correlation function between any pair of sequences selected from the set includes a peak value that is significantly lower than the peak value of the autocorrelation function. Other suitable sequences include, for example, Gold sequences and Kasami sequences. In this manner, a large number of sequences may be generated, from among which any pair possesses a low-valued cross-correlation. Examples of generating reciprocal MLS-related sequences may be found, for example, in Xiang et al., entitled “Simultaneous acoustic channel measurement via maximal-length-related sequences,” JASA vol. 117 no. 4, April 2005, pp. 1889-1894 and Xiang et al., entitled “Reciprocal maximum-length sequence pairs for acoustical dual source measurements,” JASA vol. 113 no. 5, May 2003, pp. 2754-2761, the contents of which are incorporated herein by reference.
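The correlation properties described above may be checked numerically with a short sketch. The sketch generates an MLS of degree 10 from a linear feedback shift register, forms the reciprocal sequence by time reversal, and computes the periodic correlations with an FFT. The degree, the feedback recurrence (corresponding to the primitive polynomial x^10 + x^3 + 1) and the function names are assumptions chosen for illustration and are not values taken from the description or the cited references.

```python
import numpy as np

def mls(nbits=10, tap=7):
    """One period (2**nbits - 1 samples) of a +/-1 valued MLS.

    Uses the recurrence a[k] = a[k - nbits] XOR a[k - tap]; the default
    values are an assumed choice that yields a maximum-length sequence.
    """
    n = 2 ** nbits - 1
    bits = np.ones(n + nbits, dtype=np.int64)  # non-zero initial register state
    for k in range(nbits, n + nbits):
        bits[k] = bits[k - nbits] ^ bits[k - tap]
    return 1.0 - 2.0 * bits[:n]                # map bits {0, 1} to {+1, -1}

def periodic_xcorr(a, b):
    """Periodic (circular) cross-correlation computed via the FFT."""
    return np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real

m = mls()
m_r = m[::-1].copy()             # reciprocal MLS by time reversal
auto = periodic_xcorr(m, m)      # pulse-like: one peak, flat elsewhere
cross = periodic_xcorr(m, m_r)   # peak well below the autocorrelation peak
peak_ratio = np.max(np.abs(cross)) / auto[0]
```

The periodic autocorrelation equals the sequence length at zero lag and −1 at every other lag, while `peak_ratio` stays small, which is the property that makes reciprocal pairs usable as mutually low-correlated source signals.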
Spatial shaping block 404 receives coded sequence
Referring to
$$S_1(t) = k_1\, m_R(t) + m(t)$$
$$S_2(t) = k_2\, m(t) + m_R(t) \qquad (6)$$
where
As shown in
As shown in
Referring back to
Decay shape generator 408 receives the filtered signals
and where RTband represents the reverberation time for the enclosure for the respective octave or third octave band. The reverberation time represents an acoustic parameter that may be stored in memory 210 (
Referring to
Late IR generator 206′ is similar to late IR generator 206 (
Spatial shaping generator 404′ applies a mixing matrix to the filtered signals, as described further below. For a two channel system, the spatially shaped signals
$$S_1^{m}(t) = k_1\, B_2^{m}(t) + B_1^{m}(t)$$
$$S_2^{m}(t) = k_2\, B_1^{m}(t) + B_2^{m}(t) \qquad (8)$$
where
Equation (8) may be rewritten in matrix form as:
where the attenuation coefficients may be formulated as a mixing matrix. In eq. (9) the individual attenuation coefficient subscripts have been dropped.
Referring to
The mixing matrix may be selected to match a predetermined spatial index for a particular enclosure. The spatial index may be stored as one of the enclosure parameters in memory 210 (
For each frequency band m, the attenuation coefficients may be selected for each channel to control the amount of perceived spaciousness for the shaped response. In general, combining two channels together (i.e., combining B1(t) and B2(t)) tends to decrease a perceived spaciousness. Accordingly, if the attenuation coefficient k is set to 1, B1(t) is maximally combined with B2(t), and there is no perceived spaciousness for the channel. In contrast, if the attenuation coefficient k is set to 0, only one filtered signal (i.e., B1(t) or B2(t), depending on the channel in eq. (8)) is passed to the channel, and there is high perceived spaciousness for the channel.
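The effect of the attenuation coefficient on the similarity of the two channels may be illustrated with a short sketch that applies the two-channel mixing of eq. (8) to a pair of signals for a single frequency band. Independent Gaussian noise is used here as a stand-in for the filtered coded sequences, and the names `spatially_shape` and `correlation` are hypothetical.

```python
import numpy as np

def spatially_shape(b1, b2, k1, k2):
    """Apply the two-channel mixing of eq. (8) for one frequency band:
    S1 = B1 + k1*B2 and S2 = k2*B1 + B2."""
    mix = np.array([[1.0, k1],
                    [k2, 1.0]])
    return mix @ np.vstack([b1, b2])

def correlation(a, b):
    """Normalized zero-lag correlation coefficient of two signals."""
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

rng = np.random.default_rng(0)
b1 = rng.standard_normal(100_000)  # stand-in for filtered sequence B1(t)
b2 = rng.standard_normal(100_000)  # stand-in for filtered sequence B2(t)

corr_by_k = {}
for k in (0.0, 0.5, 1.0):
    s1, s2 = spatially_shape(b1, b2, k, k)
    corr_by_k[k] = correlation(s1, s2)
```

For uncorrelated inputs of equal power and k1 = k2 = k, the correlation between the shaped channels is approximately 2k/(1 + k²). The sketch thus gives a value near 0 for k = 0 (high perceived spaciousness), about 0.8 for k = 0.5, and 1 for k = 1 (no perceived spaciousness), consistent with the behavior described above.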
In general, the spatial index for the reverberant tail relates to the late IACC, as described above. A spatial index may be determined for a number of predetermined enclosures, over a number of frequency bands m. The mixing matrix may be determined to substantially match the spatial index, for each of the predetermined enclosures.
Referring to
Although
The mixing matrix may be selected to substantially match a spatial index for a predetermined enclosure, as described above.
Referring to
In
Referring back to
Equation (11) may also be represented in matrix form as:
The IACC may be stored in memory 210 (
According to an exemplary embodiment of the present invention, equation (12) may also be expanded for multiple channels as:
By using eqs. (11-13), the summed broadband late-impulse response may be further controlled for a desired overall spatial index profile. For example, referring to
Referring to
At step 1102, spatial coefficients corresponding to the predetermined enclosure are selected. The spatial coefficients may include spatial coefficients to be applied to the early impulse response and attenuation coefficients to be applied to the late impulse response. For example, controller 202 (
At step 1104, early impulse responses are generated for two or more sources and receivers, for example, by early IR generator 204 (
At optional step 1110, the simulated room impulse response may be stored, for example, by memory 210 (
Referring to
At step 1204, the spatially shaped signals are band-pass filtered over a plurality of frequency bands, for example, by bandpass filter 406 (
Referring to
At step 1214, spatial shaping is applied to the filtered signals, over each frequency band, for example, by spatial shaping generator 404′ (
At optional step 1218, an IACC shaping is applied to the late impulse responses, for each frequency band, for example, by IACC shaping applicator 410 (
Referring next to
The test results are shown in
A second test compared spatially shaped and spatially unshaped spatial profiles in the late room impulse response. In a spatially unshaped profile, the spatial index over each frequency band is substantially the same, without including a shape of the naturally measured room characteristics. The second test provides a comparison for sources directed to a side of a binaural receiver (i.e., such that there is a delay in the received sound to each ear) and for sources directly in front of a binaural receiver (i.e., so that each ear receives the sound at the same time). For sources directly in front of the binaural receiver, 55.56 percent of the subjects (18 total subjects) selected the spatially shaped profile, 22.22 percent of the subjects selected the unshaped profile and 22.22 percent did not perceive a difference. For sources located to the side of the binaural receiver, 72.22 percent of the subjects (18 total subjects) selected the spatially shaped profile, 16.67 percent of the subjects selected the unshaped profile and 11.11 percent did not perceive a difference. These test results indicate that, for both source positions, a majority of the subjects preferred the spatially shaped profile, supporting spatial shaping according to embodiments of the present invention over conventional reverberation tail simulation.
A third test compared measured and simulated room impulse responses which were spatially shaped according to embodiments of the present invention. 44.44 percent of the subjects (18 total subjects) selected the measured shaped profile, 33.33 percent of the subjects selected the unshaped profile and 22.22 percent did not perceive a difference, indicating that reverberation tails simulated with an exemplary spatial shaping generator according to the present invention produce a perceived listening experience similar to that of measured room impulse responses.
Although the invention has been described in terms of systems and methods for generating simulated room impulse responses including spatially enveloping reverberation, it is contemplated that one or more components may be implemented in software on microprocessors/general purpose computers (not shown). In this embodiment, one or more of the functions of the various components may be implemented in software that controls a general purpose computer. This software may be embodied in a computer readable medium, for example, a magnetic or optical disk, or a memory-card.
Although the invention is illustrated and described herein with reference to specific embodiments, the invention is not intended to be limited to the details shown. Rather, various modifications may be made in the details within the scope and range of equivalents of the claims and without departing from the invention.
This application is related to and claims the benefit of U.S. Provisional Application No. 61/198,826 entitled SPATIALLY ENVELOPING REVERBERATION IN SOUND FIXING, PROCESSING, AND ROOM-ACOUSTIC SIMULATIONS USING CODED SEQUENCES filed on Nov. 10, 2008 and U.S. Provisional Application No. 61/253,971 entitled SPATIALLY ENVELOPING REVERBERATION IN SOUND FIXING, PROCESSING, AND ROOM-ACOUSTIC SIMULATIONS USING CODED SEQUENCES filed on Oct. 22, 2009, the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61198826 | Nov 2008 | US
61253971 | Oct 2009 | US