The technology described in this patent document relates generally to signal processing and more particularly to audio management.
Mobile devices (e.g., smart phones, tablets) often perform audio signal processing. Various audio signals (e.g., phone calls, music, radio, video, games, system notifications, etc.) may need to be mixed or routed in mobile devices. Different strategies may be implemented to control the mixing or routing of audio streams. For example, music playback may be muted during a phone call and then resume when the phone call is finished.
Information about the spatial location of a simulated audio source, as presented to a listener over audio equipment (e.g., headphones, speakers, etc.), is often determined using head-related transfer function (HRTF) parameters. HRTF parameters are associated with digital audio filters that reproduce the direction-dependent changes that occur in the magnitude and phase spectra of audio signals reaching the left and right ears of the listener when the location of the audio source changes relative to the listener. HRTF parameters can be used for adding realistic spatial attributes to arbitrary sounds presented over headphones or speakers.
In accordance with the teachings described herein, systems and methods are provided for audio management. Initial head-related transfer function (HRTF) parameters indicating an initial virtual configuration of a plurality of audio sources are determined. A first user operation is detected through a user interface. Target HRTF parameters are generated in response to the first user operation. A target virtual configuration of the plurality of audio sources is determined based at least in part on the target HRTF parameters.
In one embodiment, a system for audio management includes: one or more data processors; and a computer-readable storage medium encoded with instructions for commanding the one or more data processors to execute certain operations. Initial head-related transfer function (HRTF) parameters indicating an initial virtual configuration of a plurality of audio sources are determined. A first user operation is detected through a user interface. Target HRTF parameters are generated in response to the first user operation. A target virtual configuration of the plurality of audio sources is determined based at least in part on the target HRTF parameters.
In another embodiment, a system for audio management includes: a computer-readable medium, a user interface, and one or more data processors. The computer-readable medium is configured to store an initial virtual configuration of a plurality of audio sources and initial head-related transfer function (HRTF) parameters associated with the initial virtual configuration of the plurality of audio sources. The user interface is configured to receive a user operation for audio management. The one or more data processors are configured to: detect the user operation through the user interface; generate target HRTF parameters in response to the user operation; store the target HRTF parameters in the computer-readable medium; determine a target virtual configuration of the plurality of audio sources based at least in part on the target HRTF parameters; and store the target virtual configuration in the computer-readable medium.
During audio signal processing for mobile devices, if multiple audio streams are rendered at the same time, the result is usually chaotic because the different audio signals may interfere with each other. In addition, a listener may not be able to conveniently adjust the volumes of these audio signals. A common audio management strategy involves rendering only one audio stream at a time. However, this strategy has some disadvantages. For example, if a listener wants to listen to music during a phone call, the listener may have to switch the phone application to the background and then open a music player to play music, while the phone call may be unnecessarily interrupted or put on hold.
Specifically, the regions “1,” “2,” . . . , “N” indicate different audio sources that provide audio streams to a listener currently. In one embodiment, if a listener is in a phone call while listening to music, N is equal to 2. As shown in
In another embodiment, if there are three audio sources, such as a phone call, music, and game sounds, N is equal to 3. The virtual configuration of the three audio sources is shown in
In yet another embodiment, if there are four audio sources, N is equal to 4. The virtual configuration of the four audio sources is shown in
The HRTF parameters are determined based at least in part on one or more azimuth parameters associated with the plurality of audio sources. For example, an azimuth parameter includes a direction angle in a horizontal plane, as shown in
If the virtual configuration of the plurality of audio sources is not to be changed (e.g., no user operation being detected, the user operation not including dragging or rolling, etc.), at 610, it is determined whether volumes for one or more audio sources are to be changed. If the volumes for one or more audio sources are to be changed, at 612, the volumes are adjusted accordingly. Then, at 616, it is determined whether the software application (or the hardware implementation) is to be ended.
If it is determined that the volumes for one or more audio sources are not to be changed, at 620, it is determined whether any previous user operations have been detected. If no previous user operations have been detected, at 614, one or more default volume curves are applied for the plurality of audio sources. Then, at 616, it is determined whether the software application (or the hardware implementation) is to be ended. If the software application (or the hardware implementation) is not to be ended, the process continues, at 604. If the software application (or the hardware implementation) is to be ended, at 618, the software application (or the hardware implementation) ends. Furthermore, if previous user operations have been detected, then the process proceeds directly to determine whether the software application (or the hardware implementation) is to be ended. In certain embodiments, if it is determined that the volumes for one or more audio sources are not to be changed, one or more predetermined volume curves (e.g., the default volume curves) are applied for the plurality of audio sources.
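The volume branch of the process (steps 610 through 620) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any embodiment: the names `AudioState`, `volume_step`, and `DEFAULT_VOLUME` are hypothetical, a volume curve is reduced to a single level between 0 and 1, and treating a volume change as a "previous user operation" is an assumption, since the description does not say how earlier operations are recorded.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in for a default volume curve; the description
# does not specify its shape or value.
DEFAULT_VOLUME = 0.5


@dataclass
class AudioState:
    volumes: Dict[str, float]              # per-source volume level, 0.0 to 1.0
    previous_operation_seen: bool = False  # assumed bookkeeping for step 620


def volume_step(state: AudioState,
                requested: Optional[Dict[str, float]]) -> str:
    """Run one pass through steps 610-620 and return the branch taken."""
    if requested:                                   # 610: volumes to be changed?
        for name, level in requested.items():       # 612: adjust the volumes
            state.volumes[name] = max(0.0, min(1.0, level))
        state.previous_operation_seen = True        # assumption, see lead-in
        return "adjusted"
    if not state.previous_operation_seen:           # 620: any previous operations?
        for name in state.volumes:                  # 614: apply default curves
            state.volumes[name] = DEFAULT_VOLUME
        return "defaults"
    return "unchanged"                              # go straight to the check at 616
```

In each case the caller would then perform the end check at 616 and either loop back to 604 or stop at 618.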
In some embodiments, the HRTF parameters for the plurality of audio sources are stored in a data structure, hrtf[azimuth]. For example, the HRTF parameters for the plurality of audio sources are associated with a spatial representation of the plurality of audio sources in the three-dimensional space 200 as shown in
y(n)=x(n)*hrtf(n) (1)
where hrtf(n) represents HRTF parameters, x(n) represents an initial position of an audio source, and y(n) represents an updated position of the audio source.
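One possible reading of Equation (1) is sketched below in Python, under the assumption that "*" denotes discrete convolution and that x(n) and y(n) are treated as sample sequences. The description itself characterizes x(n) and y(n) as positions, so this interpretation is an assumption. The table `hrtf` follows the hrtf[azimuth] data structure mentioned above, but its filter taps are made-up placeholder values, and the helper names `convolve` and `render` are hypothetical.

```python
from typing import Dict, List

# Hypothetical filter taps keyed by azimuth in degrees. Real HRTF data
# would come from measurements and would differ for each ear.
hrtf: Dict[int, List[float]] = {
    -90: [1.0, 0.5],
    0: [1.0],
    90: [0.5, 0.25],
}


def convolve(x: List[float], h: List[float]) -> List[float]:
    """Full discrete convolution: y(n) = sum over k of x(k) * h(n - k)."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render(x: List[float], azimuth: int) -> List[float]:
    """Apply Equation (1) using the parameters stored for one azimuth."""
    return convolve(x, hrtf[azimuth])
```

Changing the azimuth passed to `render` selects a different entry of the table, which is how a change in virtual position would translate into a different output under this reading.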
For example, the bar panel is used for a speaker of a mobile device (e.g., a smart phone, a tablet). The virtual configuration of the plurality of audio sources includes a line (or a plane) in front of the listener. The HRTF parameters include [−90°, 90°], where −90° represents a leftmost direction, and 90° represents a rightmost direction.
In some embodiments, when a new audio source is detected, the positions of all audio sources may be adjusted automatically (e.g., using a default setting) or adjusted by user operations in real time. For example, when the new audio source is detected, new HRTF parameters may be determined for all audio sources, and a new virtual configuration of all audio sources is determined based at least in part on the new HRTF parameters.
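As an illustration of an automatic adjustment, the Python sketch below reassigns azimuths to all audio sources whenever the number of sources changes. Even spacing over the range from −90° to 90° is an assumed default policy chosen for the example; the description states only that a default setting may be used. The function name `default_azimuths` is hypothetical.

```python
from typing import List


def default_azimuths(n_sources: int,
                     left: float = -90.0,
                     right: float = 90.0) -> List[float]:
    """Spread n_sources evenly over [left, right], in degrees.

    A single source is placed at the midpoint of the range.
    """
    if n_sources < 1:
        return []
    if n_sources == 1:
        return [(left + right) / 2.0]
    step = (right - left) / (n_sources - 1)
    return [left + i * step for i in range(n_sources)]
```

For instance, going from two sources to three moves the layout from the two extremes to the two extremes plus the center; the resulting azimuths could then be used to look up new HRTF parameters for every source.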
As shown in
This written description uses examples to disclose the invention, including the best mode, and also to enable a person skilled in the art to make and use the invention. The patentable scope of the invention may include other examples that occur to those skilled in the art. Other implementations may also be used, however, such as firmware or appropriately designed hardware configured to carry out the methods and systems described herein. For example, the systems and methods described herein may be implemented in an independent processing engine, as a co-processor, or as a hardware accelerator. In yet another example, the systems and methods described herein may be provided on many different types of computer-readable media including computer storage mechanisms (e.g., CD-ROM, diskette, RAM, flash memory, computer's hard drive, etc.) that contain instructions (e.g., software) for use in execution by one or more processors to perform the methods' operations and implement the systems described herein.
This disclosure claims priority to and benefit from U.S. Provisional Patent Application No. 61/925,504, filed on Jan. 9, 2014, the entirety of which is incorporated herein by reference.