Virtual auditory displays (including computer games, virtual reality systems or computer music workstations) create virtual worlds in which a virtual listener can hear sounds generated from sound sources within these worlds. In addition to reproducing sound as generated by the source, the computer also processes the source signal to simulate the effects of the virtual environment on the sound emitted by the source. In a computer game, the player hears the sound that he/she would hear if he/she were located in the position of the virtual listener in the virtual world.
One important environmental factor is reverberation, which refers to the reflections of the generated sound which bounce off objects in the environment. Reverberation can be characterized by measurable criteria, such as the reverberation time, which is a measure of the time it takes for the reflections to become imperceptible. Computer generated sounds without reverberation sound dead or dry.
Reverberation processing is well-known in the art and is described in an article by Jot et al. entitled “Analysis and Synthesis of Room Reverberation Based on a Statistical Time-Frequency Model”, presented at the 103rd Convention of the Audio Engineering Society, 60 East 42nd St. N.Y., N.Y., 10165-2520.
As depicted in
FIG. 14 of Jot et al. depicts a reverberation model (Room) that breaks the reverberation process into “early”, “cluster”, and “reverb” phases. In this model, a single feed from the sound source is provided to the Room module. The early module is a delay unit producing several delayed copies of the mono input signal which are used to render the early reflections and feed subsequent stages of the reverberator. A Pan module can be used for directional distribution of the direct sound and the early reflections and for diffuse rendering of the late reverberation decay.
In the system of FIG. 14 of Jot et al. the source signal is fed to early block R1 and a reverb block R3 for reverberation processing and then fed to a pan block to add directionality. Thus, processing multiple source feeds requires implementing blocks R1 and R3 for each source. The implementation of these blocks is computationally costly and thus the total cost can become prohibitive on available processors for more than a few sound sources.
Other systems utilize angular panning of the direct sound and a fraction of the reverberation or sophisticated reverberation algorithms providing individual control of each early reflection in time, intensity, and direction, according to the geometry and physical characteristics of the room boundaries, the position and directivity patterns of the source, and the listening setup.
Research continues in methods to create realistic sounds in virtual reality and gaming environments.
According to one aspect of the invention, a method and system process individual sounds to realistically render, over headphones or two or more loudspeakers, a sound scene representing multiple sound sources at different positions relative to a listener located in a room. Each sound source is processed by an associated source channel block to generate processed signals, which are combined and processed by a single reverberation block to reduce computational complexity.
According to another aspect, each sound source provides several feeds which are sent separately to an early reflection block and a late reverberation block.
According to another aspect of the invention, the early reflection feed is encoded in multi-channel format to allow a different distribution of reflections for each individual source channel characterized by a different intensity and spectrum, different time delay and different direction of arrival relative to the listener.
According to another aspect of the invention, the late reverberation block provides a different reverberation intensity and spectrum for each source.
According to another aspect of the invention, the intensity and direction of the reflections and late reverberation are automatically adjusted according to the position and directivity of the sound sources, relative to the position and orientation of the listener.
According to another aspect of the invention, the intensity and direction of the reflections and late reverberation are automatically adjusted to simulate muffling effects due to occlusion by walls located between the source and listener and obstruction due to diffraction around obstacles located between the source and the listener.
Additional features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.
The present invention is a system for processing sounds from multiple sources to render a sound scene representing the multiple sounds at different positions in a room.
In
The control parameters for controlling the magnitudes of the delay, the transfer function of the low-pass filter, and the level of attenuation are indicated in FIG. 4. These control parameters are passed from an application to the reverberation processing model 20.
The delay elements 40 implement the temporal division between the reverberation sections labeled Direct (Direct path 32), Reflections (early reflection path 34), and Reverb (late reverberation path 36) depicted in FIG. 1.
The processing model for each sound source comprises an attenuation 44 and a low-pass filter 42 that are applied independently to the direct path 32 and the reflected sound 34 as depicted in
In one embodiment of the invention, all spectral effects are controlled by specifying an attenuation at a reference high frequency of 5 kHz. All low-pass effects are specified as high-frequency attenuations in dB relative to low frequencies. This manner of controlling low-pass effects is similar to using a graphic equalizer (controlling levels in fixed frequency bands). It allows the sound designer to predict the overall effect of combined (cascaded) low-pass filtering effects by adding together the resulting attenuations at 5 kHz. This method of specifying low-pass filters is also used in the definition of the Occlusion and Obstruction properties and in the source directivity model as described below.
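The additive rule for cascaded low-pass effects can be illustrated with a minimal sketch. The function names and the three attenuation values are hypothetical and chosen only for illustration; the sketch shows that adding attenuations in dB at 5 kHz is equivalent to multiplying the corresponding linear gains.

```python
def combined_hf_attenuation_db(*attenuations_db):
    """Sum the 5 kHz attenuations (in dB) of cascaded low-pass effects."""
    return sum(attenuations_db)


def db_to_linear_gain(db):
    """Convert a level in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)


# Hypothetical example: three cascaded effects, each specified at 5 kHz.
stage_db = (-6.0, -3.0, -1.5)
total_db = combined_hf_attenuation_db(*stage_db)
total_gain = db_to_linear_gain(total_db)
```

Under these assumptions the designer can predict the overall 5 kHz attenuation from the individual settings without evaluating any filter response.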
The “Direct filter” 42d is a low-pass filter that affects the Direct component by reducing its energy at high frequencies. The “Room filter” 42e in
As is well known in the art, multi-channel signals are fed to loudspeaker arrays to simulate 3-dimensional audio effects. These 3-dimensional effects can also be encoded into stereo signals for headphones. In
In the late reverberation block 70, the filtered W channel of the source signal is input through an all-pass cascade (diffusion) filter 72 to a tapped delay line 74 inputting delayed feeds as a 4-channel input signal into a feedback matrix 76 including absorptive delay elements 78. The 4-channel output of the feedback matrix is input to a shuffling matrix 80 which outputs a 4-channel signal which is added to the (L, R, SR, SL) outputs of the early reflection block.
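The signal flow of the late reverberation block can be sketched as a toy feedback delay network. This is a simplified illustration, not the described implementation: the all-pass diffusion filter 72 is omitted, the absorptive delay elements are approximated by a single frequency-independent loop gain, and the delay lengths, the gain, and the choice of a scaled Hadamard matrix for both the feedback and the shuffling matrix are all assumptions.

```python
def late_reverb(w, tap_delays=(3, 5, 7, 11), loop_delays=(13, 17, 19, 23),
                loop_gain=0.7):
    """Toy 4-channel late reverberator fed by a mono W signal.

    Returns one 4-element frame per input sample. All delays (in samples)
    and the loop gain are hypothetical values chosen for illustration.
    """
    # Orthogonal 4x4 matrix (Hadamard scaled by 1/2); assumed here for both
    # the feedback matrix and the output shuffling matrix.
    h = [[0.5, 0.5, 0.5, 0.5],
         [0.5, -0.5, 0.5, -0.5],
         [0.5, 0.5, -0.5, -0.5],
         [0.5, -0.5, -0.5, 0.5]]
    lines = [[0.0] * d for d in loop_delays]  # circular delay buffers
    pos = [0] * 4
    out = []
    for n in range(len(w)):
        # Tapped delay line: four delayed copies of the mono input.
        taps = [w[n - d] if 0 <= n - d < len(w) else 0.0 for d in tap_delays]
        # Read the current outputs of the four loop delays.
        delayed = [lines[i][pos[i]] for i in range(4)]
        # Mix the delay outputs through the feedback matrix.
        fb = [sum(h[i][j] * delayed[j] for j in range(4)) for i in range(4)]
        for i in range(4):
            # A scalar loop gain stands in for the absorptive filtering.
            lines[i][pos[i]] = taps[i] + loop_gain * fb[i]
            pos[i] = (pos[i] + 1) % loop_delays[i]
        # Shuffle the delay outputs into the 4-channel output frame.
        out.append([sum(h[i][j] * delayed[j] for j in range(4))
                    for i in range(4)])
    return out


# Impulse response of the toy network over 200 samples.
impulse_response = late_reverb([1.0] + [0.0] * 199)
```

Because the mixing matrix is orthogonal and the loop gain is below 1, the recirculating energy decays. In the described system the resulting four channels would be added to the (L, R, SR, SL) outputs of the early reflection block.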
The magnitude of each signal is adjusted according to whether it propagates through walls or diffracts around obstacles.
Occlusion occurs when a wall that separates two environments comes between source and listener, e.g., the wall separating S1 from the listener 10 in FIG. 2. Occlusion of sound is caused by a partition or wall separating two environments (rooms). There's no open-air sound path for sound to go from source to listener, so the sound source is completely muffled because it's transmitted through the wall. Sounds that are in a different room or environment can reach the listener's environment by transmission through walls or by traveling through any openings between the sound source's and the listener's environments. Before these sounds reach the listener's environment they have been affected by the transmission or diffraction effects, therefore both the direct sound and the contribution by the sound to the reflected sound in the listener's environment are muffled. In addition to this, the element which actually radiates sound in the listener's environment is not the original sound source but the wall or the aperture through which the sound is transmitted. As a result, the reverberation generated by the source in the listener's room is usually more attenuated by occlusion than the direct component because the actual radiating element is more directive than the original source.
Obstruction occurs when source and listener are in the same room but there is an object directly between them. There is no direct sound path from source to listener, but the reverberation comes to the listener essentially unaffected. The result is altered direct-path sound with unaltered reverberation. The Direct path can reach the listener via diffraction around the obstacle and/or via transmission through the obstacle. In both cases, the direct path is muffled (low-pass filtered) but the reflected sound from that source is unaffected (because the source radiates in the listener's environment and the reverberation is not blocked by the obstacle). Most often the transmitted sound is negligible and the low-pass effect only depends on the position of the source and listener relative to the obstacle, not on the transmission coefficient of the material. In the case of a highly transmissive obstacle (such as a curtain), however, the sound that goes through the obstacle may not be negligible compared to the sound that goes around it.
Additionally, different adjustments are made at different frequencies to model the frequency-dependent effects of occlusion and obstruction on the signals.
In a preferred embodiment, the reverberation block of
The values of these parameters may be grouped in presets to implement a particular Environment, e.g., a padded cell, a cave, or a stone corridor. In addition to these properties, toggle flags may be set to TRUE or FALSE by the program to implement certain effects when the value of the Environment_size property is modified. The following is a list of the flags utilized in a preferred embodiment.
If one of these flags is set to TRUE, the value of the corresponding property is affected by adjustments of the Environment_size property. Changing Environment_size causes a proportional change in all Times or Delays and an adjustment of the Reflections and Reverb levels. Whenever Environment_size is multiplied by a certain factor, the other Environment properties are modified as follows:
The following list describes the sound source properties, which, in a preferred embodiment of the present invention, control the filtering and attenuation parameters in the source channel block for each individual sound source:
The directivity of a sound source is modeled by considering inside and outside sound cones as depicted in
Within the inside cone, defined by Inside_angle, the volume of the sound is the same as it would be if there were no cone; that is, Inside_volume_dB is equal to the volume of an omnidirectional source. In the outside cone, defined by Outside_angle, the volume is attenuated by Outside_volume_dB. The volume of the sound between Inside_angle and Outside_angle transitions from the inside volume to the outside volume. A source radiates its maximum intensity within the Inside Cone (in front of the source) and its minimum intensity in the Outside Cone (in back of the source). A sound source can be made more directive by making the Outside_angle wider or by reducing the Outside_volume_dB.
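The cone model can be sketched as a function of the angle between the source's front axis and the direction to the listener. Two points are assumptions not stated in the text: the cone angles are taken as measured from the front axis, and the transition between the cones is taken to be linear in dB. The function name is hypothetical.

```python
def cone_volume_db(angle, inside_angle, outside_angle, outside_volume_db):
    """Attenuation (dB) of a directive source toward a given direction.

    angle, inside_angle and outside_angle are measured from the source's
    front axis, all in the same unit (assumption). The transition between
    the cones is linear in dB (assumption).
    """
    if angle <= inside_angle:
        return 0.0  # same volume as an omnidirectional source
    if angle >= outside_angle:
        return outside_volume_db
    t = (angle - inside_angle) / (outside_angle - inside_angle)
    return t * outside_volume_db


# Hypothetical source: 30 degree inside cone, 90 degree outside cone, -12 dB outside.
front_db = cone_volume_db(10.0, 30.0, 90.0, -12.0)
side_db = cone_volume_db(60.0, 30.0, 90.0, -12.0)
back_db = cone_volume_db(120.0, 30.0, 90.0, -12.0)
```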
The following equations control the filtering and attenuation parameters in the source channel block for each individual sound source, according to the values of the Source and Environment properties, in a preferred embodiment depicted in FIG. 4.
The direct-path filter and attenuation 42d and 44d in
direct_0Hz_dB = −20*log10((min_dist + ROF*(dist − min_dist))/min_dist) + Occl_dB*Occl_LF_ratio + Obst_dB*Obst_LF_ratio + direct_0Hz_radiation_dB; and
direct_5kHz_dB = −20*log10((min_dist + ROF*(dist − min_dist))/min_dist) + Air_abs_HF_dB*Air_abs_factor*ROF*(dist − min_dist) + Occl_dB + Obst_dB + direct_5kHz_radiation_dB.
In the above expression of direct_0Hz_dB, direct_0Hz_radiation_dB is a function of the source position and orientation, listener position, source inside and outside cone angles and Outside_volume_dB. Direct_0Hz_radiation_dB is equal to 0 dB for an omnidirectional source. In the expression of direct_5kHz_dB, direct_5kHz_radiation_dB is computed in the same way, except that Outside_volume_dB is replaced by (Outside_volume_dB + Outside_volume_HF_dB).
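The two direct-path expressions above translate directly into code. The following is a minimal sketch: the function and parameter names are hypothetical lower-case renderings of the properties, the radiation terms are passed in precomputed, and dist is assumed to be at least min_dist (the text does not say how shorter distances are handled).

```python
import math


def direct_levels_db(dist, min_dist, rof, occl_db, occl_lf_ratio,
                     obst_db, obst_lf_ratio, air_abs_hf_db, air_abs_factor,
                     radiation_0hz_db=0.0, radiation_5khz_db=0.0):
    """Return (direct_0Hz_dB, direct_5kHz_dB) for one source.

    Assumes dist >= min_dist; radiation terms default to an
    omnidirectional source (0 dB).
    """
    path = dist - min_dist
    # Distance roll-off term shared by both expressions.
    rolloff_db = -20.0 * math.log10((min_dist + rof * path) / min_dist)
    low = (rolloff_db + occl_db * occl_lf_ratio + obst_db * obst_lf_ratio
           + radiation_0hz_db)
    high = (rolloff_db + air_abs_hf_db * air_abs_factor * rof * path
            + occl_db + obst_db + radiation_5khz_db)
    return low, high


# Hypothetical values: unobstructed omnidirectional source 10 m away.
d0, d5 = direct_levels_db(dist=10.0, min_dist=1.0, rof=1.0,
                          occl_db=0.0, occl_lf_ratio=0.25,
                          obst_db=0.0, obst_lf_ratio=0.0,
                          air_abs_hf_db=-0.05, air_abs_factor=1.0)
```

With these inputs the low-frequency level is set by distance roll-off alone, while the 5 kHz level additionally includes the air absorption accumulated over the 9 m beyond min_dist.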
The reverberation filter and attenuation 42e and 44r in
room_0Hz_dB = −20*log10((min_dist + Room_ROF*(dist − min_dist))/min_dist) − 60*ROF*(dist − min_dist)/(c0*Decay_time) + min(Occl_dB*(Occl_LF_ratio + Occl_Room_ratio), room_0Hz_radiation_dB); and
room_5kHz_dB = −20*log10((min_dist + Room_ROF*(dist − min_dist))/min_dist) + Air_abs_HF_dB*ROF*(dist − min_dist) − 60*ROF*(dist − min_dist)/(c0*Decay_time_5kHz) + min(Occl_dB*(1 + Occl_Room_ratio), room_5kHz_radiation_dB);
where c0 is the speed of sound (340 m/s).
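The reverberation-level expressions can be sketched in the same way. Function and parameter names are hypothetical, the radiation terms are passed in precomputed, and dist is assumed to be at least min_dist. The terms are transcribed as written; in particular, the 5 kHz air absorption term carries no Air_abs_factor here because none appears in the room expression.

```python
import math

C0 = 340.0  # speed of sound in m/s, as given in the text


def room_levels_db(dist, min_dist, rof, room_rof, decay_time,
                   decay_time_5khz, occl_db, occl_lf_ratio, occl_room_ratio,
                   air_abs_hf_db, radiation_0hz_db=0.0,
                   radiation_5khz_db=0.0):
    """Return (room_0Hz_dB, room_5kHz_dB) for one source.

    Assumes dist >= min_dist; radiation terms default to an
    omnidirectional source (0 dB).
    """
    path = dist - min_dist
    rolloff_db = -20.0 * math.log10((min_dist + room_rof * path) / min_dist)
    # 60 dB of decay per Decay_time, applied over the propagation delay.
    low = (rolloff_db - 60.0 * rof * path / (C0 * decay_time)
           + min(occl_db * (occl_lf_ratio + occl_room_ratio),
                 radiation_0hz_db))
    high = (rolloff_db + air_abs_hf_db * rof * path
            - 60.0 * rof * path / (C0 * decay_time_5khz)
            + min(occl_db * (1.0 + occl_room_ratio), radiation_5khz_db))
    return low, high


# Hypothetical values: source 34 m beyond min_dist, no occlusion.
r0, r5 = room_levels_db(dist=35.0, min_dist=1.0, rof=1.0, room_rof=0.0,
                        decay_time=1.0, decay_time_5khz=0.5,
                        occl_db=0.0, occl_lf_ratio=0.25,
                        occl_room_ratio=0.5, air_abs_hf_db=0.0)
```

The min() term selects whichever of the occlusion attenuation and the source radiation attenuation is stronger (more negative).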
In the expression of room_0Hz_dB, room_0Hz_radiation_dB is obtained by integrating source power over all directions around the source. It is equal to 0 dB for an omnidirectional source. An approximation of room_0Hz_radiation_dB is obtained by defining a “median angle” (Mang) as shown in the equations below, where angles are measured from the front axis direction of the source:
room_0Hz_radiation_dB = 10*log10([1 − cos(Mang) + Opow*(1 + cos(Mang))]/2);
where:
Mang = [Iang + Opow*Oang]/[1 + Opow];
Iang, Oang: inside and outside cone angles expressed in radians;
Opow = 10^(Outside_volume/10).
In the expression of room_5kHz_dB, room_5kHz_radiation_dB is computed in the same way as room_0Hz_radiation_dB, with:
Opow = 10^([Outside_volume + Outside_volume_HF]/10).
The more directive the source, the more the reverberation is attenuated. When Occlusion is set strong enough, the directivity of the source no longer affects the reverberation level and spectrum. As Occlusion is increased, the directivity of the source is progressively replaced by the directivity of the wall (which we assume to be frequency independent).
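The median-angle approximation can be checked numerically with a short sketch. The function name is hypothetical, and the example cone values are illustrative only. For the 5 kHz variant, the caller would pass the sum of the outside volume and its high-frequency offset as the last argument.

```python
import math


def room_radiation_db(inside_angle, outside_angle, outside_volume_db):
    """Approximate total radiated power (dB) relative to an omni source.

    Angles are in radians, measured from the source's front axis.
    """
    opow = 10.0 ** (outside_volume_db / 10.0)
    mang = (inside_angle + opow * outside_angle) / (1.0 + opow)
    return 10.0 * math.log10(
        (1.0 - math.cos(mang) + opow * (1.0 + math.cos(mang))) / 2.0)


# Omnidirectional source: outside volume of 0 dB.
omni_db = room_radiation_db(math.pi / 4, math.pi / 2, 0.0)
# Two hypothetical directive sources, the second more directive.
mild_db = room_radiation_db(math.pi / 4, math.pi / 2, -6.0)
strong_db = room_radiation_db(math.pi / 4, math.pi / 2, -20.0)
```

With an outside volume of 0 dB, Opow equals 1 and the cosine terms cancel, so the result is 0 dB regardless of the cone angles; lowering the outside volume makes the result more negative.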
The early reflection attenuation 44e in
early_0Hz_dB = room_0Hz_dB − 20*log10((min_dist + ROF*(dist − min_dist))/min_dist).
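As a brief sketch, the early reflection level is the reverberation level with the direct-path distance roll-off applied on top. The function name is hypothetical, room_0hz_db is assumed to have been computed beforehand, and dist is assumed to be at least min_dist.

```python
import math


def early_0hz_db(room_0hz_db, dist, min_dist, rof):
    """Early reflection level (dB) derived from the room level."""
    return room_0hz_db - 20.0 * math.log10(
        (min_dist + rof * (dist - min_dist)) / min_dist)


# Hypothetical values: room level of -6 dB, source 10 m away.
early_db = early_0hz_db(room_0hz_db=-6.0, dist=10.0, min_dist=1.0, rof=1.0)
```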
The invention has now been described with reference to the preferred embodiments. In a preferred embodiment the invention is implemented in software for controlling hardware of a sound card utilized in a computer. As is well-known in the art the invention can be implemented utilizing various mixes of software and hardware. Further, the particular parameters and formulas are provided as examples and are not limiting. The techniques of the invention can be extended to model other environmental features. Accordingly, it is not intended to limit the invention except as provided by the appended claims.
This application is a continuation of and claims priority from application Ser. No. 09/441,141, filed Nov. 12, 1999 now U.S. Pat. No. 6,188,769, which is a continuation of and claims priority from provisional application No. 60/108,244 filed Nov. 13, 1998, the disclosures of which are both incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20010024504 A1 | Sep 2001 | US |
Number | Date | Country | |
---|---|---|---|
60108244 | Nov 1998 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09441141 | Nov 1999 | US |
Child | 09782908 | | US |