Speaker and room virtualization using headphones

Description

TECHNICAL FIELD

The present disclosure relates generally to audio processing, and more specifically to speaker and room virtualization for an audio signal that is to be provided to headphones.

BACKGROUND OF THE INVENTION

When a user listens to music with headphones, audio signals that are mixed to come from the left or right side sound to the user as if they are located adjacent to the left and right ears. Audio signals that are mixed to come from the center sound to the listener as if they are located in the middle of the listener's head. This placement effect is due to the recording process, which assumes that audio signals will be played through speakers that will create a natural dispersion of the reproduced audio signals within a room, where the room provides a sound path to both ears. Playing audio signals through headphones sounds unnatural because there is no sound path to both ears. Also, the lack of room reflections concentrates the audio signals in the listener's head.

SUMMARY OF THE INVENTION

In accordance with the present disclosure, a system for audio processing for headphones is disclosed. The system includes a room reflection emulation system for emulating sound reflections in a room, and a room acoustics emulation system for emulating acoustic properties of the room. A head, shoulder and ear emulation system for emulating sound reflections near the head is also provided.

Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views, and in which:

FIG. 1 is a diagram of a system in accordance with an exemplary embodiment of the present disclosure;

FIG. 2 is a diagram of an exemplary HRTF engine in accordance with an exemplary embodiment of the present disclosure;

FIG. 3 is a diagram of a stereo reverberation generator in accordance with an exemplary embodiment of the present disclosure;

FIG. 4 is a diagram of an exemplary shoulder reflection generator in accordance with an exemplary embodiment of the present disclosure;

FIG. 5 is a diagram of an exemplary pinnae reflection generator in accordance with an exemplary embodiment of the present disclosure;

FIG. 6 is a diagram of an exemplary all-pass filter in accordance with an exemplary embodiment of the present disclosure; and

FIG. 7 is a diagram of an exemplary nested delay structure timeline.

DETAILED DESCRIPTION OF THE INVENTION

In the description that follows, like parts are marked throughout the specification and drawings with the same reference numerals. The drawing figures might not be to scale and certain components can be shown in generalized or schematic form and identified by commercial designations in the interest of clarity and conciseness.

The present disclosure implements an algorithm that emulates speakers placed in a room for use with stereo headphones, to simulate the existence of sound paths to both ears, and also to add stereo reverberation for a realistic room effect. The location of the virtual speakers and the associated room size (which is reflected in the reverberation effect) are user selectable. This disclosure uses delay and cross-mixing of the left and right channel audio signals to the headphone speakers, but extensions to N-channel sound with additional audio signals (such as left front, left rear, right front and right rear) are also possible. The delay and mixing amplitude are based on a physical environment.

The present disclosure includes a tuned stereo reverb algorithm that emulates room reflections. The reverb introduces very little coloration of the sound, so the coloration is essentially unnoticeable.

Some previous simple reverb solutions produce a metallic sound. The reflection density of the disclosed reverb is high enough to avoid an unnatural sound. Likewise, some previous reverb solutions use identical reverb on both sound channels, but such applications do not emulate the reflections that would normally be heard by a listener. In contrast, the disclosed system uses tuned non-identical reverb to generate a stereo room effect.

The disclosed cross-mixing, delay and reverb processing is efficiently configured so as to be within the processing capability of a general purpose processor, such as a personal computer or tablet computer, or of other embedded systems, such as those used in personal electronic devices, cellular telephones or other common devices.

The present disclosure can be used to emulate a room environment with virtual speakers for use with headphones. The user can select the angle, relative to center, at which the virtual speakers should be located. A head-related transfer function (HRTF) algorithm is applied to each audio channel so as to cause the sound to appear to the user to come from that angle. The user can also select the room size, which can be used by the reverb engine to set the intensity and duration of the reverberation effect.

FIG. 1 is a diagram of a system 100 in accordance with an exemplary embodiment of the present disclosure. System 100 can be implemented in hardware or a suitable combination of hardware and software.

As used herein, “hardware” can include a combination of discrete components, an integrated circuit, an application-specific integrated circuit, a field programmable gate array, or other suitable hardware. As used herein, “software” can include one or more objects, agents, threads, lines of code, subroutines, separate software applications, two or more lines of code or other suitable software structures operating in two or more software applications or on two or more processors, or other suitable software structures. In one exemplary embodiment, software can include one or more lines of code or other suitable software structures operating in a general purpose software application, such as an operating system, and one or more lines of code or other suitable software structures operating in a specific purpose software application.

The first stage of system 100 includes HRTF emulation, which emulates sound reflections that would normally occur when the audio signals travel around the head to the ears, such as to model reflection of audio signals by the listener's shoulders. Each channel of audio pulse code modulated (PCM) signals passes through a pair of HRTF emulation engines. Each HRTF engine emulates the sound coming in as having a predetermined azimuth and elevation angle with respect to the user. The second stage of system 100 includes a stereo reverberation generator, which is discussed in greater detail herein.

FIG. 2 is a diagram of an exemplary HRTF engine in accordance with an exemplary embodiment of the present disclosure. The HRTF engine includes the following components:

1. Head shadow filter: the head shadow filter attenuates higher frequency audio components when the source is within the shadow of the head, i.e., on the opposite side from the channel being processed.

2. Head delay filter: the head delay filter emulates the delay for sound to pass around the head to the ear.

3. Shoulder reflection processor: the shoulder reflection processor emulates reflections that occur when sound is reflected from the shoulder to the ear.

4. Pinnae reflection processor: the pinnae reflection processor emulates reflections that occur within the pinnae.

For the head shadow filter, the azimuth angle θ of sound is used to generate a variable α, where:

$α = 1.05 + 0.95 \cos \left(\frac{θ}{150°} \cdot 180°\right)$

The transfer function of the 1-tap infinite impulse response (IIR) filter that emulates head shadowing can then be calculated by:

$H_{hs} = \frac{(ω_{0} + α F_{s}) + (ω_{0} - α F_{s}) z^{- 1}}{(ω_{0} + F_{s}) + (ω_{0} - F_{s}) z^{- 1}}$

where

ω_0 = speed of sound / radius of head, and

F_s = sampling rate.

The head shadow filter can be implemented using this algorithm in conjunction with a first order IIR digital filter.
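By way of illustration only, the head shadow computation described above can be sketched in Python as follows. The sketch assumes the azimuth angle is given in degrees; the head radius (0.0875 m) and speed of sound (343 m/s) defaults, and all function names, are hypothetical values chosen for the example and are not taken from the disclosure.

```python
import math

def head_shadow_alpha(theta_deg):
    """alpha = 1.05 + 0.95 * cos(theta / 150 deg * 180 deg), theta in degrees."""
    return 1.05 + 0.95 * math.cos(math.radians(theta_deg / 150.0 * 180.0))

def head_shadow_coeffs(theta_deg, fs, head_radius=0.0875, speed_of_sound=343.0):
    """Return (b0, b1, a1) of the first order IIR filter, normalized so a0 = 1.

    head_radius and speed_of_sound are assumed example values.
    """
    alpha = head_shadow_alpha(theta_deg)
    w0 = speed_of_sound / head_radius      # omega_0 = speed of sound / head radius
    a0 = w0 + fs                           # denominator constant term
    b0 = (w0 + alpha * fs) / a0
    b1 = (w0 - alpha * fs) / a0
    a1 = (w0 - fs) / a0
    return b0, b1, a1

def head_shadow_filter(x, theta_deg, fs):
    """Apply y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1] to a list of samples."""
    b0, b1, a1 = head_shadow_coeffs(theta_deg, fs)
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in x:
        out = b0 * sample + b1 * x_prev - a1 * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y
```

With these coefficients the gain at zero frequency is 1 and the gain at half the sampling rate equals α, so α below 1 (source in the head shadow) attenuates the higher frequencies while leaving low frequencies unchanged.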

The head delay filter can be implemented using a first order all-pass digital filter. The group delay for the azimuth angle θ can be defined as:

$τ_{h}(θ) = \begin{cases} -\frac{α}{c} \cos θ, & 0 \leq |θ| < π/2 \\ \frac{α}{c} \left(|θ| - \frac{π}{2}\right), & \frac{π}{2} \leq |θ| < π \end{cases}$

$a = \frac{1 - τ_{h}}{1 + τ_{h}}$

$H_{sh} = \frac{a + z^{-1}}{1 + a z^{-1}}$
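By way of illustration only, the first order all-pass stage can be sketched as follows. The sketch assumes that the delay τ_h has already been computed and is expressed as a non-negative number of samples (an assumption; the disclosure does not state the units), and it covers only the mapping from τ_h to the coefficient a and the resulting filter. The function names are hypothetical.

```python
import cmath

def allpass_coeff(tau):
    """a = (1 - tau) / (1 + tau), with tau assumed to be in samples."""
    return (1.0 - tau) / (1.0 + tau)

def head_delay_filter(x, tau):
    """Apply H(z) = (a + z^-1) / (1 + a z^-1) to a list of samples.

    Difference equation: y[n] = a*x[n] + x[n-1] - a*y[n-1].
    """
    a = allpass_coeff(tau)
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in x:
        out = a * sample + x_prev - a * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y

def frequency_response(a, w):
    """Evaluate H(e^{jw}) for normalized angular frequency w (radians/sample)."""
    z_inv = cmath.exp(-1j * w)
    return (a + z_inv) / (1.0 + a * z_inv)
```

The magnitude of this response is 1 at every frequency, and at low frequencies the phase delay is (1 - a)/(1 + a), which equals τ_h under the stated assumption; the filter therefore delays the signal without altering its spectrum.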

FIG. 4 is a diagram of an exemplary shoulder reflection generator in accordance with an exemplary embodiment of the present disclosure. The shoulder reflection generator can be implemented with a digital tap delay. An approximation of the time delay can be defined as:

$τ_{sh}(θ, ϕ) = 1.2 \cdot \frac{180° - θ}{180°} \left(1 - 0.00004 \left((ϕ - 80°) \frac{180°}{180° + ϕ}\right)^{2}\right)$

where the gain can be defined as:

g_sh=cos(+90)*0.15
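By way of illustration only, the shoulder reflection delay approximation can be sketched as follows. The sketch assumes that both angles are in degrees and reads the square as applying to the elevation-dependent term; the units of the result are not stated in the disclosure, so the sketch returns the raw value and leaves conversion to a sample count to the caller. The gain is a caller-supplied parameter, and the function names are hypothetical.

```python
def shoulder_delay(theta_deg, phi_deg):
    """Approximate shoulder reflection delay for azimuth theta and elevation phi.

    tau = 1.2 * (180 - theta)/180 * (1 - 0.00004 * ((phi - 80) * 180/(180 + phi))**2)
    """
    inner = (phi_deg - 80.0) * 180.0 / (180.0 + phi_deg)
    return 1.2 * (180.0 - theta_deg) / 180.0 * (1.0 - 0.00004 * inner ** 2)

def tap_delay(x, delay_samples, gain):
    """Single digital tap: y[n] = gain * x[n - delay_samples], zero before that."""
    return [
        gain * x[n - delay_samples] if n >= delay_samples else 0.0
        for n in range(len(x))
    ]
```

For example, `shoulder_delay(0.0, 80.0)` evaluates to 1.2, the largest value of the approximation, and the delay shrinks linearly as the azimuth approaches 180 degrees.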

FIG. 5 is a diagram of a pinnae reflection generator in accordance with an exemplary embodiment of the present disclosure. The pinnae reflection generator can be implemented using 5 stages of a digital tap delay, with the following per-stage parameters:

A_n={1,5,5,5,5}
B_n={2,4,7,11,13}
D_n={1,0.5,0.5,0.5,0.5}

Delay can be defined as:

$τ_{pn} = A_{n} \cos \left(\frac{θ}{2}\right) \sin \left(D_{n} (90° - ϕ)\right) + B_{n}$

where

θ is the azimuth angle and ϕ is the elevation angle.

In one exemplary embodiment, the gain for the 5 stages can be G={0.5, −0.4, 0.5, −0.25, 0.25}.
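By way of illustration only, the five-stage pinnae reflection computation can be sketched as follows, using the A_n, B_n, D_n and G values listed above. The sketch assumes that angles are in degrees, that the resulting delays are in samples and can be rounded to integers, and that the reflections are added to the unmodified input; none of these assumptions is stated in the disclosure, and the function names are hypothetical.

```python
import math

A = [1, 5, 5, 5, 5]
B = [2, 4, 7, 11, 13]
D = [1, 0.5, 0.5, 0.5, 0.5]
G = [0.5, -0.4, 0.5, -0.25, 0.25]

def pinna_delays(theta_deg, phi_deg):
    """tau_pn = A_n * cos(theta/2) * sin(D_n * (90 - phi)) + B_n for each stage."""
    cos_term = math.cos(math.radians(theta_deg / 2.0))
    return [
        a * cos_term * math.sin(math.radians(d * (90.0 - phi_deg))) + b
        for a, b, d in zip(A, B, D)
    ]

def pinna_reflections(x, theta_deg, phi_deg):
    """Add five delayed, scaled copies of x to x (direct path is an assumption)."""
    delays = [round(t) for t in pinna_delays(theta_deg, phi_deg)]
    y = list(x)
    for gain, k in zip(G, delays):
        for n in range(k, len(x)):
            y[n] += gain * x[n - k]
    return y
```

At an elevation of 90 degrees the sine term vanishes, so the delays reduce to the B_n values {2, 4, 7, 11, 13}; at lower elevations each stage is delayed further by up to A_n.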

FIG. 3 is a diagram of a stereo reverberation generator in accordance with an exemplary embodiment of the present disclosure. The stereo reverberation generator is the second stage of system 100, and can be used to provide reverberation for the purpose of simulating room acoustics. Reverberation can be approximated by using a tapped-delay all-pass digital filter as shown. The nested architecture provides dense reflections. The left and right parameters are slightly different (for example, the gain and delay vary by 10%) to generate a stereo diffused acoustic effect.

FIG. 6 is a diagram of an exemplary all-pass filter in accordance with an exemplary embodiment of the present disclosure. The all-pass filter transfer function can be provided by:

$H(z) = \frac{z^{-M} - g}{1 - g z^{-M}}$

In one exemplary embodiment, 5 stages of nested all-pass filters can be used to create reverb. An exemplary nested delay structure timeline is shown in FIG. 7.
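By way of illustration only, a single all-pass stage with the transfer function given above can be sketched as follows. The helper that chains stages connects them in series, which is a simplification and not the nested structure of FIG. 7; the helper that derives right channel parameters applies the 10% variation mentioned above by lengthening delays and reducing gains, which is only one possible reading. All function names are hypothetical.

```python
def allpass(x, delay, gain):
    """H(z) = (z^-M - g) / (1 - g z^-M), with M = delay and g = gain.

    Difference equation: y[n] = -g*x[n] + x[n-M] + g*y[n-M].
    """
    y = []
    for n, sample in enumerate(x):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y.append(-gain * sample + x_delayed + gain * y_delayed)
    return y

def series_allpass(x, stages):
    """Run x through each (delay, gain) stage in turn (series, not nested)."""
    for delay, gain in stages:
        x = allpass(x, delay, gain)
    return x

def right_channel_stages(left_stages, spread=0.10):
    """Derive right channel stages by varying delay and gain by `spread`."""
    return [
        (max(1, round(delay * (1.0 + spread))), gain * (1.0 - spread))
        for delay, gain in left_stages
    ]
```

Because each stage is all-pass, the energy of its impulse response sums to 1, so chaining stages increases the density of reflections without changing the overall spectral balance.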

It should be emphasized that the above-described embodiments are merely examples of possible implementations. Many variations and modifications may be made to the above-described embodiments without departing from the principles of the present disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

Claims

1. A system for processing an audio signal for output to headphones comprising: a room reflection emulation system configured to emulate sound reflections in a room and apply the emulated sound reflections to the audio signal; a room acoustics emulation system configured to emulate acoustic properties of the room and apply the emulated acoustic properties to the audio signal, the room acoustic emulation system comprising a stereo reverberation generator; and a channel output configured to provide the audio signal with the applied emulated sound reflections and the applied emulated acoustic properties to the headphones; wherein the room reflection emulation system further comprises a head shadow filter comprising a 1 tap IIR filter, the head shadow transfer filter receiving an input audio signal and generating an output, wherein the head shadow filter applies the transfer function
2. The system of claim 1 wherein the room acoustics emulation system further comprises a plurality of nested all-pass filters having a nested delay structure timeline in accordance with FIG. 7.
3. The system of claim 1 wherein the room reflection emulation system further comprises a head delay filter comprising a first order all-pass digital filter, the head delay filter receiving the output of the head shadow filter as an input and generating an output by applying a head delay transfer function.
4. The system of claim 1 wherein the room reflection emulation system further comprises a shoulder reflection system comprising a digital tap delay, the shoulder reflection system receiving the input audio signal and generating an output; and wherein the room reflection emulation system further comprises a pinnae reflection system comprising a plurality of stages of digital tap delays, the pinnae reflection system receiving an output of an adder as an input and generating an output.
5. A system for processing an audio signal for output to headphones comprising: a room reflection emulation system configured to emulate sound reflections in a room and apply the emulated sound reflections to the audio signal; a room acoustics emulation system configured to emulate acoustic properties of the room and apply the emulated acoustic properties to the audio signal, the room acoustic emulation system comprising a stereo reverberation generator; and a channel output configured to provide the audio signal with the applied emulated sound reflections and the applied emulated acoustic properties to the headphones; wherein the room reflection emulation system further comprises a head delay filter comprising a first order all-pass digital filter, the head delay filter receiving an output of a head shadow filter as an input and generating an output, wherein the head delay filter applies the transfer function
6. The system of claim 5 wherein the room reflection emulation system further comprises an adder receiving an output of the head delay filter and a shoulder reflection system and generating an output.
7. The system of claim 5 wherein the room reflection emulation system further comprises a shoulder reflection system comprising a digital tap delay, the shoulder reflection system receiving the input audio signal and generating an output; and wherein the room reflection emulation system further comprises a pinnae reflection system comprising a plurality of stages of digital tap delays, the pinnae reflection system receiving an output of an adder as an input and generating an output.
8. A system for audio processing an audio signal for output to headphones comprising: a room reflection emulation system configured to emulate sound reflections in a room and apply the emulated sound reflections to the audio signal; a room acoustics emulation system configured to emulate acoustic properties of the room and apply the emulated acoustic properties to the audio signal, the room acoustic emulation system comprising a stereo reverberation generator; and a channel output configured to provide the audio signal with the applied emulated sound reflections and the applied emulated acoustic properties to the headphones; wherein the room reflection emulation system further comprises a shoulder reflection system comprising a digital tap delay, the shoulder reflection system receiving the input audio signal and generating an output, wherein a time delay of the shoulder reflection system is generated in accordance with
9. The system of claim 8 wherein the room reflection emulation system further comprises a head shadow filter comprising a 1 tap IIR filter, the head shadow transfer filter receiving an input audio signal and generating an output by applying a head shadow transfer function; and a head delay filter comprising a first order all-pass digital filter, the head delay filter receiving the output of the head shadow filter as an input and generating an output by applying a head delay transfer function.
10. The system of claim 8 wherein the room reflection emulation system further comprises a pinnae reflection system comprising a plurality of stages of digital tap delays, the pinnae reflection system receiving an output of an adder as an input and generating an output.
11. A system for audio processing an audio signal for output to headphones comprising: a room reflection emulation system configured to emulate sound reflections in a room and apply the emulated sound reflections to the audio signal; a room acoustics emulation system configured to emulate acoustic properties of the room and apply the emulated acoustic properties to the audio signal, the room acoustic emulation system comprising a stereo reverberation generator; and a channel output configured to provide the audio signal with the applied emulated sound reflections and the applied emulated acoustic properties to the headphones; wherein the room reflection emulation system further comprises a pinnae reflection system comprising five stages of digital tap delays, the pinnae reflection system receiving the output of an adder as an input and generating an output in accordance with An={1, 5, 5, 5, 5}, Bn={2, 4, 7, 11, 13}, Dn={1, 0.5, 0.5, 0.5, 0.5},
12. The system of claim 11 wherein the room reflection emulation system further comprises a head shadow filter comprising a 1 tap IIR filter, the head shadow transfer filter receiving an input audio signal and generating an output by applying a head shadow transfer function; and a head delay filter comprising a first order all-pass digital filter, the head delay filter receiving the output of the head shadow filter as an input and generating an output by applying a head delay transfer function.
13. A system for audio processing comprising: a room reflection emulation system for emulating sound reflections in a room, the room reflection emulation system further comprising: a head shadow filter comprising a 1 tap IIR filter, the head shadow transfer filter receiving an input audio signal and generating an output, wherein the head shadow filter applies the transfer function
14. A method for audio processing comprising: receiving a left channel audio signal and a right channel audio signal; applying head-related transfer function (HRTF) processing to the left channel audio signal and the right channel audio signal; adding the HRTF-processed left channel audio signal to the HRTF-processed right channel audio signal to generate an HRTF-processed output; and applying stereo reverb processing to the HRTF-processed output to generate an audio output; wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises applying head shadow filter (HSF) processing to the left channel audio signal and the right channel audio signal to generate an HSF output; and wherein applying the HSF processing comprises applying a 1-tap infinite impulse response (IIR) filter that can be represented by:
15. The method of claim 14 wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises applying head delay filter (HDF) processing to the HSF output to generate an HDF output.
16. The method of claim 15 wherein the HDF processing comprises applying a first order all-pass digital filter.
17. The method of claim 14 wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises applying shoulder reflection (SR) processing to the left channel audio signal and the right channel audio signal to generate an SR output.
18. The method of claim 17 wherein the SR processing comprises applying a digital tap delay in accordance with
19. The method of claim 15 wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises adding the HDF output and the SR output and performing pinnae reflection processing on the sum.
20. The method of claim 14, wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises applying head delay filter (HDF) processing to the HSF output to generate an HDF output using a first order all-pass digital filter; wherein applying HRTF processing to the left channel audio signal and the right channel audio signal comprises applying shoulder reflection (SR) processing to the left channel audio signal and the right channel audio signal to generate an SR output by applying a digital tap delay in accordance with

RELATED APPLICATIONS

The present application claims benefit of U.S. Provisional patent application 61/598,267, entitled “Speaker and Room Virtualization Using Headphones,” filed Feb. 13, 2012, which is hereby incorporated by reference for all purposes.

Related Publications (1)

	Number	Date	Country
	20130216073 A1	Aug 2013	US

Provisional Applications (1)

	Number	Date	Country
	61598267	Feb 2012	US
