FIELD OF THE INVENTION
The present invention relates generally to a method of analyzing and simulating a 3D sound environment in an audio system, using an at least two-channel reproduction device.
BACKGROUND OF THE INVENTION
It is a fact that surround and multi-channel sound tracks are gradually replacing stereo as the preferred standard of sound recording. Many new audio devices are equipped with surround capabilities, and most new sound systems sold today are multi-channel systems equipped with multiple speakers and surround sound decoders. In fact, many companies have devised algorithms that modify old stereo recordings so that they will sound as if they were recorded in surround. Other companies have developed algorithms that upgrade older stereo systems to produce surround-like sound using only two speakers. Stereo-expansion algorithms enlarge perceived ambiance, and many sound boards and speaker systems contain the circuitry necessary for delivering expanded stereo sound.
3-D positioning algorithms take matters a step further by seeking to place sounds in particular locations around the listener (to his left or right, above him or below), all with respect to the image displayed. These algorithms are based upon simulating psycho-acoustic cues, replicating the way sounds are actually heard in a 360-degree space. They often use a head-related transfer function (HRTF) to calculate the sound heard at the listener's ears relative to the spatial coordinates of the sound's origin. For example, a sound emitted by a source located to one's left is first received by the left ear and only a split second later by the right one. The relative amplitude of different frequencies also varies, due to the directionality of the pinna and the obstruction of the listener's own head.
As stated above, an HRTF is the measured transformation of sound from a point in space to a specific eardrum. Reproducing the same acoustic information at the eardrums as found in natural free-field listening can create a virtual sound source.
Therefore it is clear that attempts are being made to improve the methods for acquiring HRTF data in order to improve, in turn, the capability to simulate virtual sound sources, using a headphone or speakers. Two of these prior art methods are:
- 1) using a dummy head with a microphone placed in the location of the ear drums, the dummy head simulating the human head and ears, and
- 2) placing small microphones inside a subject's ear canal. Due to physical limitations, microphones are placed only halfway into the ear canal.
The measured microphone output represents the individual or dummy head's specific HRTF information. In order to simulate a virtual sound source, the sound signal is convolved with the measured HRTF information.
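The convolution step just described can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code of any actual system; the function name render_virtual_source and the toy impulse responses are hypothetical and merely stand in for measured HRTF data.

```python
import numpy as np

def render_virtual_source(signal, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right impulse-response pair.

    Returns a 2 x N array: row 0 feeds the left ear, row 1 the right ear.
    """
    left = np.convolve(signal, hrir_left)
    right = np.convolve(signal, hrir_right)
    return np.stack([left, right])

# Toy data (hypothetical, not measured): a source on the listener's left,
# so the right ear receives a delayed and attenuated copy.
hrir_left = np.array([1.0, 0.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.6, 0.0])
signal = np.array([1.0, 0.5, -0.25])

binaural = render_virtual_source(signal, hrir_left, hrir_right)
```

In this toy case the right-ear row is the left-ear row delayed by two samples and scaled by 0.6, which mimics the arrival-time and level differences mentioned in the background.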
The above-mentioned prior art methods suffer from the following drawbacks:
- 1) Since each person has unique HRTF data which represents his unique ears and head sound transformation, the result of using non-individualized HRTF data which was measured using a dummy head or a specific subject, causes a non-satisfactory 3D sensation. This problem affects mostly the higher frequencies, thus causing front-back confusion and an “inside the head” sensation.
- 2) Another drawback is that the measurements were done near the eardrum, yet the reproduction is done outside the ear, causing the sound to be convolved twice, once using the reference HRTF of the dummy head or specific subject and once using the individual HRTF of the person listening to the headphones. This, of course, causes an inaccurate reproduction of the sound, resulting in an unsatisfactory 3D audio sensation.
- 3) In order to conduct such an experiment, additional stimulation and measurement equipment must be used. Such equipment (speakers, amplifiers, microphones, etc.) would inevitably influence the measurement by distorting the stimuli and the measured signals. Some components have a linear transfer function, such as the room, the air, the head, the pinnas and ear canals; some have a non-linear transfer function, such as amplifiers, speakers and microphones. A skillful conductor of such an experiment would be able to eliminate the linear influence of the measurement equipment by pre-measuring its frequency response and taking that into account during the analysis. However, current signal-processing techniques are usually unable to eliminate the non-linear portions of equipment distortion.
- 4) In prior art two-speaker surround systems, the listener must be located exactly between the speakers. Any deviation from that spot results in a distorted sound image.
- 5) Prior art two-speaker surround systems perform well only in symmetrical environments. The speakers must be matched and the room's acoustics must be symmetrical. This restriction prevents many users from enjoying surround sound over two speakers.
- 6) Prior art 3D headphone systems provide non-satisfactory 3D sound, mainly causing front-back confusion and an “inside the head” sensation.
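Drawback 3 above mentions eliminating the linear influence of the measurement equipment by pre-measuring its frequency response. One way this might be done is sketched below in Python with NumPy, assuming the equipment can be described by a single impulse response; the function name, the regularization constant eps and the toy data are assumptions made for illustration only. As the text notes, such a correction does not remove non-linear distortion.

```python
import numpy as np

def remove_linear_equipment_response(measured, equipment_ir, n_fft, eps=1e-8):
    """Divide out a pre-measured linear equipment response in the frequency domain.

    eps is a small regularization term (an assumption of this sketch) that
    avoids division by near-zero bins.
    """
    M = np.fft.rfft(measured, n_fft)
    E = np.fft.rfft(equipment_ir, n_fft)
    H = M * np.conj(E) / (np.abs(E) ** 2 + eps)
    return np.fft.irfft(H, n_fft)

# Toy data (hypothetical): the acoustic path we want, and the equipment response.
true_path = np.array([1.0, 0.3, 0.1])
equipment_ir = np.array([1.0, 0.5])
measured = np.convolve(true_path, equipment_ir)  # what the analyzer records

recovered = remove_linear_equipment_response(measured, equipment_ir, n_fft=64)
```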
SUMMARY OF THE INVENTION
It is therefore a broad objective of the present invention to provide a measurement and reproduction method and a system which overcome the disadvantages of the prior art technology, in that it adapts itself to the listener's HRTF data, thus achieving the most accurate 3D sound reproduction; is adapted for reproduction of sound outside the ear canal; cancels out distortion and the influence of both the linear and non-linear portions of the measurement equipment; creates a virtual surround sound environment while using fewer speakers (two or more), without requiring the user to sit in the center or to change his room's acoustic behavior; and provides significantly better 3D simulation using headphones, in which simulated sound sources are perceived “out of the head” and without any tonal change whatsoever.
In accordance with the present invention, the above objective is achieved by providing a method for simulating a 3D sound environment in an audio system using an at least two-channel reproduction device, said method comprising generating first and second pseudo head-related transfer function (HRTF) data, first using at least one speaker and then using headphones; dividing said first and second frequency representation of said data or using a deconvolution operator on the time domain representation of said first and second data, or subtracting the cepstrum representation of said first and second data, and using the results of said division or subtraction to prepare filters having an impulse response operable to imitate natural sounds of a remote speaker for preparing at least two filters connectable to said system in the audio path from an audio source to sound reproduction devices to be used by a listener.
The invention also provides a method for simulating a 3D sound environment using at least one speaker, said method comprising placing a dummy head having dummy left and right ears, pinnas and ear canals, in a selected acoustic environment; recording first and second head-related transfer functions (HRTF) sound data transmitted via said speaker and received at said dummy head by first and second microphones; recording third and fourth HRTF sound data transmitted to said dummy head via a pair of headphones; preparing transfer functions for left and right ear filters for each audio source channel by dividing, deconvolving or subtracting, respectively, said first and second frequency representation of said sound data and said third and fourth sound data of each speaker, and introducing said left and right filters in a sound reproduction system between each audio source channel and two sound transducers connected to said system.
The invention further provides a method for simulating a 3D sound environment using at least one speaker, said method comprising locating a listener's head, fitted with a miniature microphone in each ear canal, in a selected acoustic environment; recording first and second head-related transfer functions (HRTF) sound data transmitted via said speaker and received by said microphones; recording third and fourth HRTF sound data transmitted to said listener's head via said microphones; preparing transfer functions for left and right ear filters for each audio source channel by dividing, deconvolving or subtracting, respectively, said first and second frequency representation of said sound data and said third and fourth sound data of each speaker, and introducing said left and right filters in a sound reproduction system between each audio source channel and two sound transducers connected to said system.
The invention still further provides an audio system for simulating a 3D sound environment having an audio source, audio reproducing and processing means and at least two speakers or headphones, said system comprising at least two filters, each filter being connected between said audio source and one of said speakers or headphones; each of said filters being characterized by an impulse response obtained by generating pseudo head-related transfer functions prepared by the method described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will now be described in connection with certain preferred embodiments with reference to the following illustrative figures so that it may be more fully understood.
With specific reference now to the figures in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.
In the drawings:
FIGS. 1A and 1B illustrate a system for recording an audio signal originating in an open field transducer inside a human ear canal, using a dummy head;
FIGS. 2A and 2B illustrate a system for headphone measurement, using the same equipment used for the open air experiment illustrated in FIG. 1;
FIGS. 3A and 3B are schematic illustrations of a subject listening to an audio track, using one speaker;
FIGS. 4A and 4B are schematic illustrations of a subject listening to an audio track, using headphones;
FIGS. 5A and 5B are schematic illustrations of a 3D sound environment virtualizing system for headphones;
FIG. 6 is a schematic illustration of an on-site measurement system for a speaker-based 3D sound environment virtualizing system;
FIG. 7 illustrates a speaker-based 3D sound environment virtualizing system, and
FIG. 8 illustrates a two-speaker virtualizing system, simulating three virtual speakers.
DETAILED DESCRIPTION
FIGS. 1A and 1B depict a system 2 for recording an audio signal originating in an open field, using a dummy head and a transducer located in place of each ear drum. The signal is recorded in order to obtain the HRTF parameters for one specific angle α, representing, in this case, a front left speaker. The signal generator 4 generates the test signal used for the measurement. The signal is amplified by a power amplifier 6 and reproduced by a speaker 8. The acoustic characteristics of the room 10 affect the sound, adding early reflections and reverberations to the direct sound. The influence of the room is different in each location, hence, the sound arriving at the left ear of head 12 is different from the sound arriving at the right one. The head 12 affects the sound, reflecting it into one ear 14 and obstructing it from the other ear 14′. The sound passes through the pinnas 16, 16′ and ear canals 18, 18′ of head 12 before it is recorded by microphones 20, 20′. The output signals of microphones 20, 20′ are amplified by microphone pre-amplifiers 22, 22′ and analyzed by signal analyzer 24. Signal analyzer 24 analyzes two separate signals: one of the left ear 14, and one of the right ear 14′.
By comparing the original signals and the measured signal, the conductor of the test can obtain the transfer function of the entire system.
In practice, the obtained transfer function is comprised of a series of transfer functions of each and every component in the signal path. The measured transfer functions DSα(Left) and DSα(Right) can be represented as a multiplication of several transfer functions (refer to blocks 6 through 22 in FIG. 1B):
wherein:
- the transfer functions mentioned above depend on the speaker angle α and are marked Hα;
- DS is Dummy-Speaker constellation, and
- ⊗ is Convolution operator (in time domain environment).
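The statement that the measured response is a chain of component transfer functions, combined by convolution in the time domain or by multiplication in the frequency domain, can be checked numerically. The sketch below uses Python with NumPy; the component names and their short impulse responses are hypothetical placeholders for the blocks of FIG. 1B, not measured values.

```python
import numpy as np
from functools import reduce

N_FFT = 16  # hypothetical analysis length, long enough for the full chain

def cascade_time(impulse_responses):
    """Chain components by convolving their impulse responses."""
    return reduce(np.convolve, impulse_responses)

def cascade_freq(impulse_responses, n_fft=N_FFT):
    """Chain components by multiplying their frequency responses."""
    return np.prod([np.fft.rfft(h, n_fft) for h in impulse_responses], axis=0)

# Toy impulse responses (hypothetical) for some blocks in the signal path.
chain = [
    np.array([2.0]),            # power amplifier
    np.array([1.0, 0.2]),       # speaker
    np.array([1.0, 0.0, 0.3]),  # room, with one early reflection
    np.array([0.8, 0.1]),       # head, pinna and ear canal lumped together
]

ds_time = cascade_time(chain)
ds_freq = cascade_freq(chain)
```

Taking the FFT of ds_time gives the same values as ds_freq, which is why the description can speak of either a convolution or a multiplication of the component transfer functions.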
Referring now to FIGS. 2A and 2B, there is illustrated a setup for headphone measurement, using the same equipment used in the open field system and method described above with regard to FIGS. 1A and 1B. This time, the audio source is headphones 26, 26′, which are placed on the dummy head 12. The angle β between the head's median plane and the axis of each ear-piece of headphones 26, 26′ is fixed, and depends on the mechanical structure of the headphones. The signal generator 4 generates the test signal, which is amplified by power amplifiers 6, 6′ and reproduced by the headphones 26, 26′. The sound passes through the pinnas 16, 16′ and ear canals 18, 18′ of dummy head 12 before it is recorded by microphones 20, 20′. The output signal of the microphones 20, 20′ is amplified by microphone pre-amplifiers 22, 22′ and analyzed by signal analyzer 24. Signal analyzer 24 analyzes two separate signals: one from the left ear 14 and one from the right 14′.
By comparing the original signals with the measured signal, the conductor of the test can obtain the transfer function of this system.
The measured transfer functions DPβ(Left) and DPβ(Right) can be represented as a multiplication of several transfer functions (refer to blocks 6, 6′ to 22, 22′ in FIG. 2B):
wherein:
- DP is Dummy-Headphones constellation.
FIGS. 3A and 3B describe the situation of a person listening to audio source 25 via a single speaker 8. The audio source 25 generates the audio signal, which is amplified by power amplifier 6 and reproduced by speaker 8. The acoustic characteristics of the room 10 affect the sound, adding early reflections and reverberations. The influence of the room is different in each location, hence, the sound arriving at the left ear 28 is different from that arriving at the right one 28′. The person's head 12′ affects the sound by reflecting it into one ear 28 and obstructing it from the other 28′. The sound passes through the pinnas 30, 30′ and ear canals 32, 32′, causing the left and right eardrums 34, 34′ to vibrate. The vibrations are translated into nerve impulses by the inner ears; these impulses finally arrive at the user's brain. While traveling to the brain, the original audio track is modified. The overall modification can be described as a series of blocks, each of which has a different transfer function (refer to blocks 6, 6′ to 34, 34′ in FIG. 3B).
Provided that the transfer function of the entire system is HSα(Left) and HSα(Right):
wherein:
- HS is Human-Speaker constellation.
FIGS. 4A and 4B, corresponding to FIGS. 2A and 2B, illustrate a person listening to audio material via headphones. The audio source 25 generates the audio signal, which is amplified by power amplifiers 6, 6′ and reproduced by headphones 26, 26′. The sound passes through the person's pinnas 30, 30′ and ear canals 32, 32′, causing the left and right eardrums 34, 34′ to vibrate. The inner ear translates the vibrations into nerve impulses and those impulses finally arrive at the brain. The original audio track is modified during its path to the brain. The overall modification can be described as a series of blocks, each of which has a different transfer function (refer to blocks 6, 6′ to 34, 34′ in FIG. 4B).
Provided that the transfer function of the entire system is HPα(Left ear) and HPα(Right ear):
wherein:
- HP is Human-Headphones constellation.
A headphones virtualizing system is shown in FIGS. 5A and 5B. In this system, two filters 36, 36′ are placed in the path of the audio material. The rest of the audio path is similar to that described above with regard to FIGS. 4A and 4B.
The transfer function of the left filter 36 in prior art surround headphones, is:
F(Left)α=DS(Left)α (9)
The transfer function of the right ear filter 36′ is:
F(Right)α=DS(Right)α (10)
According to the present invention, different filters are used. The transfer function of the left ear filter 36 is:
The transfer function of the right ear filter 36′ is:
The overall transfer function of that system would be:
wherein:
- HV is Human-Virtualized constellation.
Alternatively, instead of dividing the right and left data, the filters can be calculated by using a deconvolution operator on the time domain representation of the right and left data, or subtracting the cepstrum representation of the right and left data.
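The three routes named in this description (division of the frequency representations, time-domain deconvolution, and cepstrum subtraction) can be compared on toy data. The following Python/NumPy sketch assumes that the filter is the speaker measurement divided by the headphone measurement, as the summary describes; the function names, the analysis length and the example sequences are hypothetical, and np.polydiv is used here simply as one available deconvolution operator.

```python
import numpy as np

N_FFT = 32  # hypothetical analysis length

def filter_by_division(ds, dp, n_fft=N_FFT):
    """Divide the frequency representation of ds by that of dp."""
    return np.fft.irfft(np.fft.rfft(ds, n_fft) / np.fft.rfft(dp, n_fft), n_fft)

def filter_by_deconvolution(ds, dp):
    """Time-domain deconvolution via polynomial long division."""
    quotient, _remainder = np.polydiv(ds, dp)
    return quotient

def filter_by_cepstrum(ds, dp, n_fft=N_FFT):
    """Subtract complex cepstra, then map the difference back to an impulse response."""
    cep_ds = np.fft.ifft(np.log(np.fft.fft(ds, n_fft)))
    cep_dp = np.fft.ifft(np.log(np.fft.fft(dp, n_fft)))
    spectrum = np.exp(np.fft.fft(cep_ds - cep_dp))
    return np.fft.ifft(spectrum).real

# Toy data (hypothetical): dp is the headphone measurement, f_true the filter
# we expect to recover, and ds the speaker measurement built from both.
dp = np.array([1.0, 0.4])
f_true = np.array([1.0, -0.3, 0.2])
ds = np.convolve(f_true, dp)

f_div = filter_by_division(ds, dp)
f_dec = filter_by_deconvolution(ds, dp)
f_cep = filter_by_cepstrum(ds, dp)
```

On this noise-free example all three routes return the same filter. With real measurements, bins where the headphone response is close to zero would probably need some form of regularization, which this sketch omits.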
An on-site measurement system for a speaker based virtualizer system according to the present invention, is illustrated in FIG. 6. The purpose of this measurement is to obtain information about the real playback conditions in the listener's playback room. The measurement is based on miniature microphones placed close to, or inside, the listener's ear canal. The speaker quality, speaker placement and room acoustics affect the measurement. In contrast to prior art speaker virtualizing systems, speaker placement is not important; the system will perform well even in non-symmetrical environments. The signal generator 4 generates the test signal used for the measurement. The signal is amplified by power amplifier 6, 6′ and reproduced by the left speaker 8 or right speaker 8′. The acoustic characteristics of the playback room 10 affect the sound, adding early reflections and reverberations. The influence of the room is different in each location; hence the sound arriving at the left ear is different from the sound arriving at the right one. The subject's head 12′ affects the sound by reflecting it into one ear 28 and obstructing it from the other 28′. The sound passes through the pinnas 30, 30′ before being recorded by left and right microphones 38, 38′ which are placed inside the ear canals 32, 32′. The output signals of microphones 38, 38′ are amplified by microphone preamplifiers 22, 22′ and analyzed by signal analyzer 24. Signal analyzer 24 analyzes two separate signals: one from the left ear and one from the right.
A total of four different measurements are taken during this phase: two measurements (left and right ear) from left speaker 8 and two from right speaker 8′. In a case where the user has more than two speakers, two measurements are taken from each additional speaker.
FIG. 7 illustrates a speaker virtualizing system. Two filters 36, 36′ are placed between audio source 25 and power amplifiers 6, 6′. The left and right speakers 8, 8′, respectively, reproduce the audio.
As long as the listener 12 and speakers 8, 8′ are located in the same spot used for the measurement (see FIG. 6), and the acoustic characteristics of the room are not significantly changed, the user will hear the sound as if it originated from a virtual speaker 8″, positioned at angle α. The sound of virtual speaker 8″ will be similar to that of the real speaker 8 that was used for the dummy head measurement, which was placed in the room 10 (see FIG. 1).
The overall transfer function of the system of FIG. 7 would be:
wherein:
- HVS is Human-Virtualized-Speakers constellation, and
- H(P—room . . . ) is the transfer function of the playback room.
FIG. 8 illustrates a two-speaker virtualizing system simulating three virtual speakers 8II, 8III, 8IV. Two filters 46, 48 are placed between a first audio source 40 and adders 42, 44. Filters 50, 52 filter a second source 54 and filters 56, 58 filter a third source 60. The left adder 42 sums the results of all the left filters (46, 50 and 56), and right adder 44 sums the results of all the right filters (48, 52 and 58). The output of adders 42, 44 is amplified by power amplifiers 62, 64 and reproduced by the left and right speakers 8, 8′, respectively. The transfer function of each pair of filters determines the position of the respective virtual speaker.
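The signal flow of FIG. 8 (one filter pair per source, followed by a left adder and a right adder) can be sketched as below in Python with NumPy. The function name virtualize and the toy sources and filters are hypothetical; real filters would be derived from measurements.

```python
import numpy as np

def virtualize(sources, filter_pairs):
    """Filter each source with its (left, right) filter pair and sum per side."""
    lefts = [np.convolve(s, fl) for s, (fl, _) in zip(sources, filter_pairs)]
    rights = [np.convolve(s, fr) for s, (_, fr) in zip(sources, filter_pairs)]
    n_out = max(len(y) for y in lefts + rights)

    def pad(y):
        return np.pad(y, (0, n_out - len(y)))

    left = np.sum([pad(y) for y in lefts], axis=0)    # left adder
    right = np.sum([pad(y) for y in rights], axis=0)  # right adder
    return left, right

# Toy data (hypothetical): three sources and three filter pairs.
sources = [
    np.array([1.0, 0.0, 0.0]),
    np.array([0.0, 1.0, 0.0]),
    np.array([0.0, 0.0, 1.0]),
]
filter_pairs = [
    (np.array([1.0, 0.0]), np.array([0.0, 0.5])),  # virtual speaker on the left
    (np.array([0.7, 0.0]), np.array([0.7, 0.0])),  # virtual speaker in the center
    (np.array([0.0, 0.5]), np.array([1.0, 0.0])),  # virtual speaker on the right
]

left_out, right_out = virtualize(sources, filter_pairs)
```

Adding a further virtual speaker in this sketch amounts to appending one more source and one more filter pair; the two output channels stay the same.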
The above-described method is suitable for the reproduction of any number of virtual speakers, and is not limited to specific azimuth, elevation and distance range. It is also possible to simulate different acoustic environments by changing the room used for the original measurement. Adding more real speakers to the system will enable control of additional aspects of the listening experience, as described in the mathematical section below.
The physical and mathematical development of the prior art systems and the system of the present invention are as follows:
In the prior art systems, development of Eq. 13, while using Eq. 9 for the left filter, provides:
Evidently, the sound of the virtualized system is very different from that of a speaker system. It is possible to pre-measure and eliminate the linear part from the transfer function of the power amplifier, the speaker, the microphone and the microphone pre-amplifier; however, the nonlinear parts of those devices will remain active.
It is impossible to isolate the transfer functions of the dummy head's pinna and ear canal from that of the system. Therefore, a person listening to such a system will hear the sound filtered through the dummy head's ears, as well as his own.
Hence, prior art virtualized systems sound different from real speakers.
In contradistinction to the prior art systems, the development of Eq. 13 while using Eq. 11 for the filter description according to the present invention, yields:
In a similar way, it can be shown that development of Eq. 14 would result in:
From these equations, it can be seen that the difference between the virtualized system and the real-speaker system is the difference between the obstruction characteristics of the dummy head and the listener's head. The most significant difference between the obstruction characteristics is caused by the differences in head size, which result in different delays between the arrival times at the two ears. It is possible to provide a calibration feature to the system that would change the delay manually or automatically and cause the virtualized system to sound like a real one.
As long as the headphones used for playback are similar to those used for the experiment, the virtualized system will sound just like a real speaker system with a speaker positioned at angle α.
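The cancellation argued above can be illustrated numerically. The sketch below, in Python with NumPy, models each block as a frequency response and assumes that the filter is the dummy-speaker measurement divided by the dummy-headphone measurement. All component names and values are hypothetical, and lumping pinna and ear canal into a single, angle-independent "ear" term is a simplification made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 8

def random_response():
    """Hypothetical component response, kept away from zero magnitude."""
    return 1.0 + 0.3 * (rng.uniform(-1, 1, n_bins) + 1j * rng.uniform(-1, 1, n_bins))

amp, speaker, room, phones = (random_response() for _ in range(4))
mic, preamp = random_response(), random_response()
dummy_head, dummy_ear = random_response(), random_response()
listener_head, listener_ear = random_response(), random_response()

# Measurement chains on the dummy head (speaker and headphone cases).
ds = amp * speaker * room * dummy_head * dummy_ear * mic * preamp
dp = amp * phones * dummy_ear * mic * preamp

# Listening chains for the human listener (real speaker and headphone cases).
hs = amp * speaker * room * listener_head * listener_ear
hp = amp * phones * listener_ear

hv_prior_art = ds * hp         # filter equal to the raw dummy-speaker response
hv_invention = (ds / dp) * hp  # filter equal to the ratio of the two measurements

# Real-speaker response with the listener's head term swapped for the dummy's.
hs_with_dummy_head = hs * dummy_head / listener_head
```

Under these assumptions hv_invention equals hs_with_dummy_head, so only the head obstruction term differs from the real-speaker case, whereas hv_prior_art still contains the dummy ear, microphone, pre-amplifier and headphone terms.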
It is desirable to use the best equipment and the best recording room possible for the experiment. The virtualized system will sound like the very speaker used for the experiment, placed in the very room used for the experiment. Thus, it is possible to simulate excellent speakers and excellent playback rooms, while in fact the listener is using relatively simple and inexpensive equipment.
The two equations describing the transfer function of the two-speaker surround system (Eq. 15 and Eq. 16, FIG. 7) are:
In order to equalize these transfer functions with those of a real speaker placed in a real room (described in FIG. 3):
HVS(Left)β,γ=HS(Left)α
and
HVS(Right)β,γ=HS(Right)α
It can now be written:
The only unknowns in these equations are the transfer functions of the left and right filters. Since there are two unknowns and two equations, it is possible to find a single solution to those equations and calculate the filters' transfer functions.
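Solving two equations in two unknowns can be sketched per frequency bin as follows, in Python with NumPy. The sketch assumes that each ear receives the sum of both filtered speaker signals, each shaped by the corresponding on-site measurement of FIG. 6; the names h_ll, h_rl, h_lr and h_rr (speaker-to-ear paths), the function name and the random toy data are hypothetical.

```python
import numpy as np

def solve_speaker_filters(h_ll, h_rl, h_lr, h_rr, target_left, target_right):
    """Solve, for every frequency bin, the 2x2 system
         F_left * h_ll + F_right * h_rl = target_left   (left ear)
         F_left * h_lr + F_right * h_rr = target_right  (right ear)
    where h_xy is the measured path from speaker x to ear y.
    """
    paths = np.stack(
        [np.stack([h_ll, h_rl], axis=-1), np.stack([h_lr, h_rr], axis=-1)],
        axis=-2,
    )  # shape (n_bins, 2, 2)
    targets = np.stack([target_left, target_right], axis=-1)[..., None]
    filters = np.linalg.solve(paths, targets)[..., 0]
    return filters[:, 0], filters[:, 1]

# Toy data (hypothetical): strong direct paths, weaker cross paths.
rng = np.random.default_rng(1)
n_bins = 16

def rand(scale):
    return scale * (rng.uniform(-1, 1, n_bins) + 1j * rng.uniform(-1, 1, n_bins))

h_ll, h_rr = 1.0 + rand(0.2), 1.0 + rand(0.2)
h_rl, h_lr = rand(0.3), rand(0.3)
target_left, target_right = rand(1.0), rand(1.0)  # stand-ins for HS(Left), HS(Right)

f_left, f_right = solve_speaker_filters(h_ll, h_rl, h_lr, h_rr, target_left, target_right)
```

Substituting f_left and f_right back into the two equations reproduces the targets, provided the 2x2 matrix of paths is invertible in every bin.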
It is possible to use more than two real speakers in order to enhance the experience and add features to the system.
Adding a third real speaker, positioned at angle θ, and a third filter F3 behind it, would change the equations to:
Now, there are two equations to solve and three unknowns: F(Left), F(Right) and F(3). In order to solve the equations, a restriction must be added. This restriction may be arbitrary and can be used to change the behavior of the system. It is possible, for instance, to control the size and shape of the “sweet spot” (the sitting position in which the surround experience is optimal).
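With three filters and two equations, some additional restriction is needed to select one solution. The Python/NumPy sketch below picks, purely as an example, the solution with the smallest total filter energy in each bin (the pseudo-inverse solution). This particular restriction is an assumption of the sketch and is not taken from the description; the function name and the toy path matrix are likewise hypothetical.

```python
import numpy as np

def solve_min_energy_filters(paths, targets):
    """Minimum-norm filters for an underdetermined system.

    paths:   (n_bins, 2, 3) array, ear x speaker responses per bin
    targets: (n_bins, 2) array, desired left/right ear responses per bin
    Returns an (n_bins, 3) array of filter responses.
    """
    pinv = np.linalg.pinv(paths)            # (n_bins, 3, 2)
    return (pinv @ targets[..., None])[..., 0]

# Toy data (hypothetical): left, right and a third speaker.
rng = np.random.default_rng(2)
n_bins = 8
base = np.array([[1.0, 0.3, 0.5],
                 [0.3, 1.0, 0.5]])
noise = rng.uniform(-1, 1, (n_bins, 2, 3)) + 1j * rng.uniform(-1, 1, (n_bins, 2, 3))
paths = base + 0.1 * noise
targets = rng.uniform(-1, 1, (n_bins, 2)) + 1j * rng.uniform(-1, 1, (n_bins, 2))

filters = solve_min_energy_filters(paths, targets)
reproduced = (paths @ filters[..., None])[..., 0]
```

Other restrictions, such as ones shaping the "sweet spot", would replace the minimum-energy criterion with a different constraint while the two ear equations stay the same.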
Adding more speakers would require more restrictions and more filters. It can be shown that more speakers can add more “sweet spots” (actually, each pair of additional speakers can add one new “sweet spot”), create “dark spots” (areas in which the acoustic energy is reduced) or control the size and shape of the “sweet spot”.
Different restrictions, controlling other features of the surround sensation, can be similarly developed.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrated embodiments and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.