An audio output device receives an audio stream and generates an output that can be heard by a user. Examples of audio output devices include a speaker and a headphone jack for use with headphones or earbuds. A user may listen to various types of audio from the audio output device such as music, sound associated with a video, and the voice of another person (e.g., a voice transmitted in real time over a network). In some examples, the audio output device may be implemented in a computing device such as a desktop computer, an all-in-one computer, or a mobile device (e.g., a notebook, a tablet, a mobile phone, etc.).
The accompanying drawings illustrate various examples of the principles described herein and are part of the specification. The illustrated examples are given merely for illustration, and do not limit the scope of the claims.
Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements. The figures are not necessarily to scale, and the size of some parts may be exaggerated to more clearly illustrate the example shown. Moreover, the drawings provide examples and/or implementations consistent with the description; however, the description is not limited to the examples and/or implementations provided in the drawings.
Audio output devices generate audio signals which can be heard by a user. Audio output devices may include speakers, headphone jacks, or other devices and may be implemented in, or coupled to, any number of electronic devices. For example, audio output devices may be placed in or coupled to electronic devices such as mobile phones, tablets, desktop computers, laptop computers, televisions, and audio receivers, among others. However, such audio output devices may not accurately replicate the characteristics of recorded audio. That is, in a natural environment, a user may hear sounds from a variety of different directions, such as in front of the user, behind the user, or to the side of the user. However, certain audio streams do not capture the directionality or movement of audio signals. Spatial audio processing refers to the processing of an audio signal to replicate or mimic the directionality of sound. For example, an incoming audio stream may be processed such that a user, upon listening to the audio, may perceive the audio as coming from a particular direction. As a specific example, it may be the case that an audio track of a movie includes sound effects, such as a car engine, that are intended to be behind the subject. When watching the movie using headphones where spatial audio processing does not occur, the position of the car engine behind the subject may be lost. However, with spatial audio processing, the audio track may be processed such that a listener watching the movie perceives the car engine noise as being behind them. As such, the spatial audio processing provides an immersive experience where a listener has a 360-degree soundscape.
However, while spatial audio processing may generate a more immersive experience for a user, some characteristics may negatively impact the immersive experience. For example, it may be the case that the audio output device and the computing device to which the audio output device is connected both perform spatial audio processing on a particular audio stream. This can lead to interference which creates undesirable artifacts in the audio output. Put another way, a spatial audio processor on a computing device such as a personal computer may perform spatial audio processing to provide a surround sound experience for a user. However, an audio output device, such as headphones, may also have an embedded signal processor that also performs spatial audio processing to create a 3D sound environment.
The spatial audio processing of the audio track by both devices may result in artifacts in the audio and may otherwise negatively impact the output audio. For example, the processing by the spatial audio processor of the audio output device may interfere with the processing by the spatial audio processor of the computing device, because the pre-processing on the computing device is intended to deliver a specific desired experience on the headphones. If the headphones then apply additional processing, the interaction of the two processing stages may generate artifacts. That is, while spatial audio processing has the objective of providing directionality to output audio signals, cascaded processing where multiple devices are executing spatial audio processing operations may destroy the directionality of the audio.
As a specific example, consider an audio track that is intended to be spatially processed so that its perceived origin is 30 degrees to the front-left of the listener. Spatial audio processing by both the computing device and the audio output device may destroy this 30-degree front-left perception and make the audio sound as if it came from directly behind the user, or all directionality may be lost such that the audio has no perceived direction. Such cascaded processing may also introduce auditory artifacts such as echoes and vibrations into the audio stream.
Accordingly, the present specification describes a system to prevent such a cascaded signal processing scenario. Specifically, the present specification describes systems and methods for detecting and disabling cascaded signal processing on audio output devices such as headphones. In one example, the system disables spatial audio processing occurring on the audio output device by 1) instructing the headphone to disable the spatial audio processing or 2) generating an inverse filter that accounts for and cancels any spatial audio processing performed by the audio output device. In another example, if the computing device is not able to disable the headphone’s spatial audio processing operations, the computing device may disable its own spatial audio processing.
In some examples, the system may include a database of audio output devices and their respective spatial audio processing capabilities. The database may also include commands to enable/disable a particular audio output device’s spatial audio processor and/or inverse filters to cancel the effects of an audio output device’s spatial processing. The database may be updated periodically using a retrieval system and natural language processing with machine learning techniques to identify audio output devices with spatial audio processing technology.
Specifically, the present specification describes a system. The system includes a processor to perform spatial audio processing on a received audio signal and an audio interface to connect an audio output device to a computing device. The system also includes a controller. The controller determines a spatial audio processing capability of the audio output device and disables spatial audio processing on the audio output device or the processor based on a determination of the spatial audio processing capability of the audio output device.
The present specification also describes a method. According to the method, an audio output device connected to a computing device is identified. Based on an identity of the audio output device, a spatial audio processing capability of the audio output device is determined. Spatial audio processing of the computing device or the audio output device is disabled responsive to a determination of the spatial audio processing capability of the audio output device.
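The choice among the disabling options described above can be sketched as a small decision function. This is a minimal illustration, not the claimed method: the names `Capability`, `Action`, and `choose_action`, the `disable_command` field, and the priority order (device command first, inverse filter second, host bypass last) are assumptions made for this sketch. Treating an unidentified device as needing no change is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "no change"
    DISABLE_DEVICE = "disable spatial processing on the audio output device"
    APPLY_INVERSE_FILTER = "cancel device processing with an inverse filter"
    BYPASS_HOST = "disable spatial processing on the computing device"


@dataclass
class Capability:
    """Hypothetical record of what is known about a connected device."""
    has_spatial_processing: bool
    disable_command: Optional[str] = None   # hypothetical device-specific command
    inverse_filter_available: bool = False


def choose_action(capability: Optional[Capability]) -> Action:
    """Pick one way to avoid cascaded spatial processing (assumed priority order)."""
    # Assumption: an unknown device is treated as having no spatial processor.
    if capability is None or not capability.has_spatial_processing:
        return Action.NONE
    # Prefer switching off the device's own processing when a command is known.
    if capability.disable_command is not None:
        return Action.DISABLE_DEVICE
    # Otherwise cancel the device's processing with a stored inverse filter.
    if capability.inverse_filter_available:
        return Action.APPLY_INVERSE_FILTER
    # Last resort: the computing device disables its own spatial processing.
    return Action.BYPASS_HOST
```

For instance, `choose_action(Capability(True))` falls through to `Action.BYPASS_HOST`, which corresponds to the example in which the computing device cannot disable the headphone's processing and disables its own instead.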
The present specification also describes a non-transitory machine-readable storage medium encoded with instructions executable by a processor. The machine-readable storage medium includes instructions to fetch, from a network, data indicating spatial audio processing capabilities of multiple audio output devices. The machine-readable storage medium also includes instructions to populate a database with a mapping between 1) fetched information regarding spatial audio processing capabilities of multiple audio output devices and 2) device-specific instructions for disabling spatial audio processors of the multiple audio output devices. The machine-readable storage medium also includes instructions to identify an audio output device connected to a computing device and, based on an identity of the audio output device and a database entry associated with the audio output device, disable spatial audio processing on the computing device or the audio output device.
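The mapping described above might be held in a simple in-memory dictionary. The following is a sketch under stated assumptions: the record fields (`make`, `model`, `spatial`, `disable_command`), the `DeviceEntry` type, and the helper names are hypothetical, and the fetching of records from a network is assumed to have happened already.

```python
from typing import Dict, Iterable, Mapping, NamedTuple, Optional, Tuple

DeviceKey = Tuple[str, str]  # (make, model), normalised to lower case


class DeviceEntry(NamedTuple):
    has_spatial_processing: bool
    disable_command: Optional[str]  # hypothetical device-specific instruction


def _key(make: str, model: str) -> DeviceKey:
    """Normalise make and model so lookups ignore case and stray spaces."""
    return (make.strip().lower(), model.strip().lower())


def populate_database(fetched: Iterable[Mapping[str, object]]) -> Dict[DeviceKey, DeviceEntry]:
    """Build the device -> (capability, disable instruction) mapping."""
    database: Dict[DeviceKey, DeviceEntry] = {}
    for record in fetched:
        key = _key(str(record["make"]), str(record["model"]))
        command = record.get("disable_command")
        database[key] = DeviceEntry(
            has_spatial_processing=bool(record.get("spatial", False)),
            disable_command=None if command is None else str(command),
        )
    return database


def lookup(database: Mapping[DeviceKey, DeviceEntry], make: str, model: str) -> Optional[DeviceEntry]:
    """Return the entry for a connected device, or None if it is unknown."""
    return database.get(_key(make, model))
```

A lookup that returns `None` signals that the device is not yet in the database, which is the case a periodic refresh of the fetched data is meant to reduce.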
Such systems and methods 1) avoid interference from two spatial audio processors of a single audio signal; 2) provide directionality to audio tracks of an audio signal; and 3) prevent cascaded signal processing without user input.
As used in the present specification and in the appended claims, the term “spatial audio processing” refers to an operation wherein directionality is provided to audio tracks of a received audio signal.
Also as used in the present specification and in the appended claims, the term “audio output device” refers to any device that converts an electronic representation of an audio stream to an audio output that is perceptible by humans. Examples of such devices include speakers, ear buds, and headphones.
As used in the present specification and in the appended claims, the terms “controller,” “retrieval system,” and “switch” may refer to electronic components which may include a processor and memory. The processor may include the hardware architecture to retrieve executable code from the memory and execute the executable code. As specific examples, the controller as described herein may include a computer readable storage medium, a computer readable storage medium and a processor, an application specific integrated circuit (ASIC), a semiconductor-based microprocessor, a central processing unit (CPU), a field-programmable gate array (FPGA), and/or another hardware device.
As used in the present specification and in the appended claims, the term “machine-readable storage medium” refers to a tangible device that can retain and store instructions for use by an instruction execution device. The machine-readable storage medium may be an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), or a memory stick.
Turning now to the figures,
The system (100) may include a processor (102) to perform spatial audio processing on a received audio signal. That is, the processor (102) may take a stereo audio signal and provide directionality, such as a point of origin, for the audio. For example, it may be the case that a movie or immersive gaming experience has sound information that is intended to be reproduced as if it originated in a 3D space around the listener. Accordingly, the processor (102) takes an audio track that includes this sound information and processes it, for example using head-related transfer functions (HRTFs). An HRTF may be measured using loudspeakers in an anechoic chamber with a microphone placed at the entrance of the ear canal. This processing is done such that the sounds are in fact perceived as originating around the user. That is, the audio signals may be processed such that a user’s brain perceives the sound effects as originating behind them. By adding perceived directionality to an audio signal, such spatial audio processing provides a more immersive experience. That is, while a user may be watching a 3D or 2D video, the spatial audio processing which processes audio signals to generate a three-dimensional soundscape gives the perception that the user is immersed in the environment.
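As a rough illustration of HRTF-based processing, a source signal can be convolved with a pair of head-related impulse responses (the time-domain form of an HRTF), one per ear. The sketch below is a simplification: it renders a single mono source rather than a stereo signal, the function names are hypothetical, and the impulse responses are made-up toy values, not measured data.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(
    mono: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Filter one source with a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (illustrative values only): a source on the listener's
# left reaches the left ear first and louder, the right ear later and quieter.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.6]

left, right = render_binaural([1.0, 0.5], HRIR_LEFT, HRIR_RIGHT)
```

The interaural delay and level difference encoded in the two responses are the cues that give the listener a sense of direction. Applying a second, unrelated pair of responses to `left` and `right` downstream would alter those cues, which is the cascading problem the specification addresses.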
The system (100) also includes an audio interface (104) through which an audio output device is connected to the computing device in which the system (100) is disposed. For example, in the example where the audio output device is a set of headphones, the audio interface (104) may be an audio jack by which the headphones are physically coupled to the computing device. In another example, the audio interface (104) may be a wireless interface such that audio data is transmitted wirelessly.
The system (100) also includes a controller (106) to alter the spatial audio processing of the audio signal. Specifically, the controller (106) may determine a spatial audio processing capability of the audio output device. This may be done in any number of ways. For example, the controller (106) may include an application programming interface (API) that can detect when an audio output device is connected via the audio interface (104). Via this same API, metadata that identifies the make and model of the audio output device may be determined. For example, the metadata may be embedded in a bitstream or in a digital bitstream for universal serial bus (USB) based headsets or headphones. The API parses the metadata for identification of a type and make of the headphone.
From this make and model information, it may be determined, for example relying on information from a database or retrieved from a network search, whether that make and model are capable of spatial audio processing. That is, upon connection of the audio output device to the computing device, the audio output device may transmit a data packet that includes certain identifying information such as a make and model of an audio output device. Using this information, and other data which may be stored on a database of the system (100) or which may be retrieved remotely, the system (100) may determine whether a particular audio output device has a spatial audio processor and may provide characteristics, protocols, etc. for the spatial audio processing that is performed.
In another example, the metadata itself may identify whether the particular audio output device performs spatial audio processing and may identify the particular spatial audio processing operations carried out by that audio output device. That is, in addition to including the make and model of the audio output device, the metadata may indicate the make and model of a spatial audio processor of the audio output device and/or operating characteristics of a spatial audio processor of the audio output device. Accordingly, from this information, and potentially other information, the controller (106) may determine the full spatial audio processing capabilities of a particular audio output device.
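The two determination paths described above (the metadata declaring the capability directly, or a lookup keyed on make and model) can be sketched as follows. The metadata is modelled as a plain key/value mapping. The field names `make`, `model`, and `spatial_processor` are assumptions for illustration and do not correspond to any real device descriptor format.

```python
from typing import Mapping, Set, Tuple


def identify(metadata: Mapping[str, str]) -> Tuple[str, str]:
    """Extract a normalised (make, model) pair from hypothetical metadata."""
    make = metadata.get("make", "").strip().lower()
    model = metadata.get("model", "").strip().lower()
    return make, model


def has_spatial_processing(
    metadata: Mapping[str, str],
    known_spatial_devices: Set[Tuple[str, str]],
) -> bool:
    """Decide whether the connected device performs spatial audio processing."""
    declared = metadata.get("spatial_processor")
    if declared is not None:
        # Path 1: the metadata itself names (or denies) a spatial processor.
        return declared.strip().lower() not in ("", "none")
    # Path 2: fall back to a database keyed on make and model.
    return identify(metadata) in known_spatial_devices
```

Checking the self-declared field first is a design choice of this sketch; a system could equally prefer the database entry when the two sources disagree.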
The controller (106) may also disable spatial audio processing on the audio output device or the processor (102) based on a determination of the spatial audio processing capability of the audio output device. That is, once it is determined that an audio output device performs spatial audio processing, the spatial audio processing of the audio output device may be disabled or the spatial audio processing of the processor (102) may be disabled. As will be described below in connection with
In the case that the spatial audio processing of the audio output device is disabled, this may include disabling all audio signal processing performed on the audio output device. That is, in addition to performing spatial audio processing, the audio output device may perform other types of signal processing such as equalization which is a function of frequency and gain and pre-compensates the audio output device to generate a flat frequency response. In other examples, disabling the spatial audio processing of the audio output device includes disabling spatial audio processing performed on the audio output device without disabling other audio signal processing performed by the audio output device. For example, such equalization, and other, signal processing operations may be permitted to continue.
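The difference between disabling all signal processing and disabling only the spatial stage can be illustrated with a set of flags. The `Stage` enumeration and both helper functions are hypothetical. Real devices expose such controls, if at all, through their own device-specific commands.

```python
from enum import Flag, auto


class Stage(Flag):
    """Hypothetical processing stages running on the audio output device."""
    NONE = 0
    SPATIAL = auto()
    EQUALIZATION = auto()
    ALL = SPATIAL | EQUALIZATION


def disable_all(active: Stage) -> Stage:
    """First variant: every on-device processing stage is switched off."""
    return Stage.NONE


def disable_spatial_only(active: Stage) -> Stage:
    """Second variant: only the spatial stage is cleared; equalization continues."""
    return active & ~Stage.SPATIAL
```

With `active = Stage.ALL`, the second variant leaves `Stage.EQUALIZATION` running, so the flat-frequency-response pre-compensation is preserved while the cascaded spatial stage is removed.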
In some examples, the processor (102) and the controller (106) are separate components. For example, the processor (102) may be a central processing unit (CPU) and the controller (106) may be a digital signal processor (DSP). In other examples, the processor (102) and the controller (106) may be the same component, which may be the CPU or the DSP.
Accordingly, the present system (100) reduces the effects of cascading signal processing by deactivating either a spatial audio processor of the audio output device or the processor (102) of the system (100). As a result, the directionality of the audio tracks is preserved and the aforementioned audio artifacts that result from signal cascading are eliminated.
That is, the controller (
With the connected audio output device (
In some examples, determining (block 302) the spatial audio processing capability of the audio output device (
In another example, the determination (block 302) is made based on the identity of the audio output device (
Responsive to a determination (block 302) of the spatial audio processing capability of the audio output device (
This may take many forms. For example, as described above, the spatial audio processing of the audio output device (
That is, the computing device (
In another example, rather than disabling the spatial audio processor (
During operation, the inverse filters are used in addition to the spatial audio processing of the processor (
To generate the inverse filter, the output of an audio output device (
In some examples this may be done by supplying a log-sweep signal to each of the two input channels of the spatial audio processor (
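To illustrate the inverse-filter idea in general terms, the sketch below computes a truncated inverse of a known impulse response by recursive deconvolution. It is a simplified stand-in, not the procedure of the specification: it assumes the device's processing on one channel can be modelled as a single linear FIR response `h` with a nonzero first tap, and that this response has already been measured, for example from a sweep recording. The function names are hypothetical, and practical designs typically use regularised, per-channel inversion.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def inverse_filter(h: Sequence[float], length: int) -> List[float]:
    """Return g (with `length` taps) such that the first `length` samples of h*g form a unit impulse."""
    if not h or h[0] == 0.0:
        raise ValueError("h must be non-empty with a nonzero first tap")
    if length < 1:
        raise ValueError("length must be at least 1")
    g = [0.0] * length
    g[0] = 1.0 / h[0]
    for n in range(1, length):
        # Force sample n of (h * g) to zero: h[0]*g[n] = -sum_{k>=1} h[k]*g[n-k].
        acc = 0.0
        for k in range(1, min(n, len(h) - 1) + 1):
            acc += h[k] * g[n - k]
        g[n] = -acc / h[0]
    return g


# Illustrative response: a direct path plus one half-amplitude echo.
h = [1.0, 0.5]
g = inverse_filter(h, 8)
cancelled = convolve(h, g)
```

In this toy case the first eight samples of `cancelled` form a unit impulse, and only a small truncation residue remains in the final sample. The truncated inverse converges like this only when `h` is minimum phase, which is an additional assumption of the sketch.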
In yet another example, disabling (block 303) spatial audio processing includes bypassing a spatial audio processing of the computing device (
That is, if the system (
The database (414) may include other mappings. For example, the database (414) may include a mapping between 1) each identified audio output device (
Accordingly, when the system (100), and more specifically the controller (106), determines the identity of the audio output device (
The system (100) also includes a retrieval system (416) to fetch data from a network regarding spatial audio processing capabilities of multiple audio output devices (
In some examples, a host computing device (
In another example, instead of periodically initiating the retrieval system (416), a hosted service may continuously crawl webpages updating the database (414) with the most up-to-date information about audio output devices (
The system (100) may also include a switch (418) to bypass the processor (102) of the system (100). That is, as described above in some examples the spatial audio processor (
Such a bypass of the processor (102) may occur when disabling of the spatial audio processor (
Referring to
Such systems and methods 1) avoid interference from two spatial audio processors of a single audio signal; 2) provide directionality to audio tracks of an audio signal; and 3) prevent cascaded signal processing without user input.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/022590 | 3/13/2020 | WO | |