This application claims the priority under 35 U.S.C. §119 of European patent application no. 09179748.0, filed on Dec. 17, 2009, the contents of which are incorporated by reference herein.
The invention relates to a system which extracts a measure of the acoustic response of the environment, and a method of extracting the acoustic response.
An auditory display is a human-machine interface that provides information to a user by means of sounds. Such displays are particularly suitable in applications where the user is not permitted or not able to look at a visual display. An example is a headphone-based navigation system which delivers audible navigation instructions. The instructions can appear to come from the appropriate physical location or direction; for example, a commercial may appear to come from a particular shop. Such systems are suitable for assisting blind people.
Headphone systems are well known. In typical systems a pair of loudspeakers is mounted on a band so as to be worn with the loudspeakers adjacent to a user's ears. Closed headphone systems seek to reduce environmental noise by providing a closed enclosure around each of the user's ears, and are often used in noisy environments or in noise cancellation systems. Open headphone systems have no such enclosure. The term “headphone” is used in this application to include earphone systems where the loudspeakers are closely associated with the user's ears, for example mounted on or in the user's ears.
It has been proposed to use headphones to create virtual or synthesized acoustic environments. In the case where the sounds are virtualized so that listeners perceive them as coming from the real environment, the systems may be referred to as augmented reality audio (ARA) systems.
In systems creating such virtual or synthesized environments, the headphones do not simply reproduce the sound of a sound source, but create a synthesized environment, with for example reverberation, echoes and other features of natural environments. This can cause the user's perception of sound to be externalized, so the user perceives the sound in a natural way and does not perceive the sound to originate from within the user's head. Reverberation in particular is known to play a significant role in the externalization of virtual sound sources played back on headphones. Accurate rendering of the environment is particularly important in ARA systems where the acoustic properties of the real and virtual sources must be very similar.
A development of this concept is provided in Härmä et al, “Techniques and applications of wearable augmented reality audio”, presented at the AES 114th convention, Amsterdam, Mar. 22 to 25, 2003. This presents a useful overview of a number of options. In particular, the paper proposes generating an environment corresponding to the environment the user is actually present in. This can increase realism during playback.
However, there remains a need for convenient, practical portable systems that can deliver such an audio environment.
Further, such systems need data regarding the audio environment to be generated. The conventional way to obtain data about room acoustics is to play back a known signal on a loudspeaker and measure the received signal. The room impulse response is given by the deconvolution of the measured signal by the reference signal.
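The conventional deconvolution step can be sketched as follows. This is a minimal illustration only, assuming a regularised frequency-domain division; the function name `deconvolve`, the regularisation constant `eps` and the toy three-tap response are hypothetical choices made for the example and are not taken from this application.

```python
import numpy as np

def deconvolve(measured, reference, n_taps, eps=1e-8):
    """Estimate an impulse response by regularised frequency-domain division."""
    n_fft = len(measured) + len(reference)   # zero-pad so the circular product is linear
    M = np.fft.rfft(measured, n_fft)
    X = np.fft.rfft(reference, n_fft)
    H = M * np.conj(X) / (np.abs(X) ** 2 + eps)   # eps guards against near-zero bins
    return np.fft.irfft(H, n_fft)[:n_taps]

rng = np.random.default_rng(0)
reference = rng.standard_normal(4096)        # known test signal (white noise here)
true_rir = np.zeros(64)
true_rir[[0, 10, 25]] = [1.0, 0.5, 0.25]     # toy response: direct path plus two echoes
measured = np.convolve(reference, true_rir)  # what the microphone would record
estimated_rir = deconvolve(measured, reference, n_taps=64)
max_error = float(np.max(np.abs(estimated_rir - true_rir)))
```

In this noise-free toy case the estimate should match the toy response closely; with a real recording, measurement noise would limit the accuracy.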
Attempts have been made to estimate the reverberation time from recorded data without generating a sound, but these are not particularly accurate and do not generate additional data such as the room impulse response.
According to the invention, there is provided a headphone system according to claim 1 and a method according to claim 9.
The inventor has realised that a particular difficulty in providing realistic audio environments is in obtaining the data regarding the audio environment occupied by a user. Headphone systems can be used in a very wide variety of audio environments.
The system according to the invention avoids the need for a loudspeaker driven by a test signal to generate suitable sounds for determining the impulse response of the environment. Instead, the speech of the user is used as the reference signal. The signals from the pair of microphones, one external and one internal, can then be used to calculate the room impulse response.
The calculation may be done using a normalised least mean squares adaptive filter.
The system may have a binaural positioning unit having a sound input for accepting an input sound signal and arranged to drive the loudspeakers with a processed sound signal, wherein the processed sound signal is derived from the input sound signal and the acoustic response of the environment.
The binaural positioning unit may be arranged to generate the processed sound signal by convolving the input sound signal with the room impulse response.
In embodiments, the input sound signal is a stereo sound signal and the processed sound signal is also a stereo sound signal.
The processing may be carried out by convolving the input sound signal with the room impulse response to calculate the processed sound signal. In this way, the input sound is processed to match the auditory properties of the environment of the user.
For a better understanding of the invention, embodiments of the invention will now be described, purely by way of example, with reference to the accompanying drawings, in which:
Referring to
A sound processor 20 is provided, including reverberation extraction units 22,24 and a binaural positioning unit 26.
Each ear unit 6,8 is connected to a respective reverberation extraction unit 22,24. Each takes signals from both the internal microphone 12 and the external microphone 14 of the respective ear unit, and is arranged to output a measure of the environment response to the binaural positioning unit 26 as will be explained in more detail below.
The binaural positioning unit 26 is arranged to take an input sound signal 28 and information 30 together with the information regarding the environment response from the reverberation extraction units 22,24. Then, the binaural positioning unit creates an output sound signal 32 based on the measures of the environment response to modify the input sound signal and outputs the output sound signal to the loudspeakers 16.
In the particular embodiment described, the reverberation extraction units 22,24 extract the environment impulse response as the measure of the environment response. This requires an input or test signal. In the present case, the user's speech is used as the test signal, which avoids the need for a dedicated test signal.
The extraction is carried out on the microphone inputs using a normalised least mean squares adaptive filter. The signal from the internal microphone 12 is used as the input signal and the signal from the external microphone 14 is used as the desired signal.
The techniques used to calculate the room impulse response will now be described in considerably more detail.
Consider the reference speech signal produced by the user, which will be referred to as x. In a reverberant environment, the speech signal is filtered by the room impulse response and reaches the external microphone (signal Mice). Simultaneously, the speech signal is captured by the internal microphone (signal Mici) through skin and bone conduction. He and Hi are the transfer functions between the reference speech signal and the signals recorded with the external and internal microphones respectively. He is the desired room impulse response, while Hi is the result of the bone and skin conduction from the throat to the ear canal. Hi is typically independent of the environment the user is in. It can thus be measured off-line and used as an optional equalization filter.
One of the many possible techniques to identify the room impulse response He based on the microphone inputs Mici and Mice is an adaptive filter, using a Least Mean Square (LMS) algorithm.
In the present invention, illustrated in
Ŵ=He/Hi.
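The adaptive identification can be sketched as follows, with the internal microphone signal as the filter input and the external microphone signal as the desired signal. This is a minimal sketch under stated assumptions: white noise stands in for the user's speech, the ratio He/Hi is modelled as a short hypothetical FIR filter `true_w`, and the function name `nlms`, the step size `mu` and the regularisation `eps` are illustrative choices rather than values from this application.

```python
import numpy as np

def nlms(x, d, n_taps, mu=0.5, eps=1e-6):
    """Adapt FIR weights w so that x filtered by w approximates d."""
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)                  # most recent input samples, newest first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf                  # a-priori estimation error
        w += mu * e * buf / (buf @ buf + eps)   # step normalised by input power
    return w

rng = np.random.default_rng(1)
mic_i = rng.standard_normal(5000)           # stand-in for the internal microphone signal
true_w = np.array([0.8, 0.0, -0.3, 0.0, 0.1])     # hypothetical He/Hi as a short FIR
mic_e = np.convolve(mic_i, true_w)[:len(mic_i)]   # stand-in for the external microphone
w_hat = nlms(mic_i, mic_e, n_taps=8)
nlms_error = float(np.max(np.abs(w_hat[:5] - true_w)))
```

Real speech is far less spectrally flat than white noise, so convergence in practice would be expected to be slower than in this toy example.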
In a further embodiment, the system could be calibrated in an anechoic environment using the same procedure as described above. In this case the resulting filter ŵanechoic[n], expressed in the frequency domain, is
Ŵanechoic=He-anechoic/Hi (1)
Hi is the room-independent path to the internal microphone, and He-anechoic is the path from the mouth to the external microphone in anechoic conditions. He-anechoic includes the filtering effect due to the placement of the microphone behind the mouth instead of in front of it. This effect is neglected in the first embodiment, but can be compensated for when a calibration in anechoic conditions is possible. In the remainder of this document, He, the path from the mouth to the external microphone, will hence be split into two parts: He-anechoic and He-room, where He-room is the desired room response, such that
He=He-anechoic·He-room. (2)
Ŵanechoic can be used as a correction filter
Hc=Ŵanechoic, (3)
illustrated in
Indeed, the filter ŵ[n] obtained according to
Ŵ=He/(Hi·Hc). (4)
Substituting (1) and (3), we obtain
Ŵ=(He·Hi)/(Hi·He-anechoic). (5)
If we split He according to (2), we finally obtain
Ŵ=He-room.
Using the anechoic measurement as a correction filter thus suppresses all contributions that are not related to the room transfer function to be identified.
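The cancellation in equations (1) to (5) can be checked numerically. The following is only a sanity check under stated assumptions: the three frequency responses are random toy values generated by a hypothetical helper `toy_response`, not measured data, and the variable names simply mirror the symbols used above.

```python
import numpy as np

rng = np.random.default_rng(2)
n_bins = 129

def toy_response():
    """Random frequency response with magnitude in [0.5, 1.5), so division is safe."""
    magnitude = 0.5 + rng.random(n_bins)
    phase = rng.uniform(-np.pi, np.pi, n_bins)
    return magnitude * np.exp(1j * phase)

H_i = toy_response()            # path to the internal microphone
H_e_anechoic = toy_response()   # mouth to external microphone, anechoic conditions
H_e_room = toy_response()       # room contribution to be recovered

H_e = H_e_anechoic * H_e_room   # equation (2)
W_anechoic = H_e_anechoic / H_i # equation (1)
H_c = W_anechoic                # equation (3)
W = H_e / (H_i * H_c)           # equation (4)
recovered = bool(np.allclose(W, H_e_room))
```

With these toy values, `W` coincides with `H_e_room` in every frequency bin, as the derivation predicts.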
The environment impulse response is then used to process the input sound signal 28 by performing a direct convolution of the input sound signal with the room impulse response.
The input sound signal 28 is preferably a dry, anechoic sound signal and may in particular be a stereo signal.
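The direct convolution can be sketched as follows for a stereo input. This is an illustrative sketch that assumes one impulse response per ear, of equal length, as might be supplied by the two reverberation extraction units; the function name `apply_environment` and the toy impulse responses are hypothetical.

```python
import numpy as np

def apply_environment(dry, rir_left, rir_right):
    """Convolve a stereo signal of shape (n_samples, 2) with one response per ear.

    Assumes rir_left and rir_right have the same length.
    """
    left = np.convolve(dry[:, 0], rir_left)
    right = np.convolve(dry[:, 1], rir_right)
    return np.stack([left, right], axis=1)

# Toy check: a unit impulse on both channels reproduces the two responses.
rir_left = np.array([1.0, 0.0, 0.4, 0.0, 0.1])
rir_right = np.array([0.9, 0.0, 0.0, 0.3, 0.2])
dry = np.zeros((100, 2))
dry[0, :] = 1.0
wet = apply_environment(dry, rir_left, rir_right)
```

For long impulse responses, an FFT-based convolution would typically be faster than the direct form shown here.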
As an alternative to convolution, the environment impulse response can be used to identify the properties of the environment, and these properties can then be used to select suitable processing.
When used in a room, the environment impulse response will be a room impulse response. However, the invention is not limited to use in rooms and other environments, for example outside, may also be modelled. For this reason, the term environment impulse response has been used.
Those skilled in the art will realise that alternatives to the above approach exist. For example, the environment impulse response is not the only measure of the auditory environment, and other measures, such as reverberation time, may alternatively or additionally be calculated.
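One way a reverberation time could be derived from an impulse response is Schroeder backward integration. That technique is not named in this application and is used here purely as an assumed example. The function name `rt60_from_impulse_response`, the fit range of -5 dB to -25 dB and the toy exponentially decaying response are likewise illustrative assumptions.

```python
import numpy as np

def rt60_from_impulse_response(h, fs, fit_range_db=(-5.0, -25.0)):
    """Estimate the 60 dB decay time from the slope of the energy decay curve."""
    edc = np.cumsum(h[::-1] ** 2)[::-1]       # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])    # normalised decay curve in dB
    upper, lower = fit_range_db
    mask = (edc_db <= upper) & (edc_db >= lower)
    t = np.arange(len(h)) / fs
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)   # decay rate in dB per second
    return -60.0 / slope

fs = 8000                                     # sample rate in Hz (assumed)
target_rt60 = 0.5                             # seconds, for the toy response
decay = 3.0 * np.log(10.0) / (target_rt60 * fs)   # amplitude decay per sample
h = np.exp(-decay * np.arange(fs))            # one second of exponential decay
rt60_estimate = float(rt60_from_impulse_response(h, fs))
```

For this idealised decay the estimate should be close to the 0.5 s target; measured responses contain a noise floor that would need to be handled before fitting.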
The invention is also applicable to other forms of headphones, including earphones, such as intra-concha or in-ear canal earpieces. In this case, the internal microphone may be provided on the inside of the ear unit facing the user's inner ear and the external microphone is on the outside of the ear unit facing the outside.
It should also be noted that the sound processor 20 may be implemented in either hardware or software. However, in view of the complexity and necessary speed of calculation in the reverberation extraction units 22,24, these may in particular be implemented in a digital signal processor (DSP).
Applications include noise cancellation headphones and auditory display apparatus.
Number | Date | Country | Kind |
---|---|---|---|
09179748.0 | Dec 2009 | EP | regional |