Examples of the present disclosure relate to apparatus, methods and computer programs for controlling band limited audio objects. Some relate to apparatus, methods and computer programs for providing directional control of band limited audio objects.
Band limited audio objects such as low frequency effect audio objects may require specific speakers to enable the audio within the frequency bands to be rendered. This may need to be taken into account when spatial audio is being rendered to a user. A sound system may contain fewer speakers for rendering low frequency effect audio objects than for rendering other types of audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided an apparatus comprising means for: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; determining a position of a user; and using the determined position of the user and the spatial metadata associated with the band limited audio object to control at least one of the parameters of the band limited audio object.
The spatial metadata may be obtained with the band limited audio signal.
The spatial metadata may be stored with a non-band limited audio object and the band limited audio object may be obtained with metadata indicative of the non-band limited audio object. The metadata obtained with the band limited audio object may be indicative of a connection between the band limited audio object and the non-band limited audio object. The band limited audio object and the non-band limited audio object may be configured to be played back at the same time.
The band limited audio object may comprise a low frequency effect audio object.
The band limited audio object may comprise a band limited audio object playback volume and/or a band limited audio object playback signal.
The band limited audio object may be configured to be played back via at least one band limited speaker.
The one or more parameters may comprise at least one of volume, delay, reverberation, or diffusivity.
The means may be configured to determine the position of the user while the band limited audio object is being played back.
The position of the user may be determined relative to one or more speakers configured to play back the band limited audio object.
The position of the user may comprise the distance between the user and one or more speakers configured to play back the band limited audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided an apparatus comprising: processing circuitry; and memory circuitry including computer program code, the memory circuitry and the computer program code configured to, with the processing circuitry, cause the apparatus to: obtain a band limited audio object comprising one or more parameters; obtain spatial metadata associated with the band limited audio object; determine a position of a user; and use the determined position of the user and the spatial metadata associated with the band limited audio object to control at least one of the parameters of the band limited audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided an audio rendering device comprising an apparatus as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided a method comprising: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object;
determining a position of a user; and using the determined position of the user and the spatial metadata associated with the band limited audio object to control at least one of the parameters of the band limited audio object.
The spatial metadata may be obtained with the band limited audio signal.
The spatial metadata may be stored with a non-band limited audio object and the band limited audio object may be obtained with metadata indicative of the non-band limited audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided a computer program comprising computer program instructions that, when executed by processing circuitry, cause: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; determining a position of a user; and using the determined position of the user and the spatial metadata associated with the band limited audio object to control at least one of the parameters of the band limited audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided a physical entity embodying the computer program as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided an electromagnetic carrier signal carrying the computer program as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided an apparatus comprising means for: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; determining a direction of a display associated with the band limited audio object; and using the spatial metadata associated with the band limited audio object to control one or more parameters of the band limited audio object in accordance with the determined direction of the display.
The one or more parameters may comprise a volume of the band limited audio object.
The determining a direction of a display may comprise determining whether the display is oriented within a threshold angular range, wherein the threshold angular range is defined by the spatial metadata.
The means may be configured to control the one or more parameters of the band limited audio object in a first way if the display is oriented within the threshold angular range and control the one or more parameters of the band limited audio object in a second way if the display is not oriented within the threshold angular range.
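The two-way control described above can be sketched in code. The following is an illustrative sketch only, not an implementation from the disclosure: all names, the metadata fields, and the use of a gain value as the controlled parameter are assumptions.

```python
# Hypothetical sketch: choose between a first and a second way of
# controlling a band limited audio object's volume, depending on
# whether a display's orientation falls within a threshold angular
# range carried in the spatial metadata. All field names are assumed.

def control_volume(display_angle_deg, metadata):
    """Return a playback gain for the band limited audio object.

    metadata is assumed to carry the centre of the angular range, the
    threshold half-width, and the gains used by the two control modes.
    """
    # Smallest absolute angular difference, handling wrap-around at 360.
    diff = abs(display_angle_deg - metadata["centre_deg"]) % 360.0
    diff = min(diff, 360.0 - diff)
    if diff <= metadata["threshold_deg"]:
        return metadata["in_range_gain"]    # first way of controlling
    return metadata["out_of_range_gain"]    # second way of controlling

meta = {"centre_deg": 0.0, "threshold_deg": 30.0,
        "in_range_gain": 1.0, "out_of_range_gain": 0.4}
```

For example, with the metadata above a display facing within 30 degrees of the reference direction would yield the full gain, and any other orientation the reduced gain.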
According to various, but not necessarily all, examples of the disclosure there may be provided an apparatus comprising: processing circuitry; and memory circuitry including computer program code, the memory circuitry and the computer program code configured to, with the processing circuitry, cause the apparatus to: obtain a band limited audio object comprising one or more parameters; obtain spatial metadata associated with the band limited audio object; determine a direction of a display associated with the band limited audio object; and use the spatial metadata associated with the band limited audio object to control one or more parameters of the band limited audio object in accordance with the determined direction of the display.
According to various, but not necessarily all, examples of the disclosure there may be provided an audio rendering device comprising an apparatus as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided a method comprising: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; determining a direction of a display associated with the band limited audio object; and using the spatial metadata associated with the band limited audio object to control one or more parameters of the band limited audio object in accordance with the determined direction of the display.
The one or more parameters may comprise a volume of the band limited audio object.
According to various, but not necessarily all, examples of the disclosure there may be provided a computer program comprising computer program instructions that, when executed by processing circuitry, cause: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; determining a direction of a display associated with the band limited audio object; and using the spatial metadata associated with the band limited audio object to control one or more parameters of the band limited audio object in accordance with the determined direction of the display.
According to various, but not necessarily all, examples of the disclosure there may be provided a physical entity embodying the computer program as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided an electromagnetic carrier signal carrying the computer program as described above.
According to various, but not necessarily all, examples of the disclosure there may be provided an apparatus comprising means for: obtaining a band limited audio object comprising one or more parameters; obtaining spatial metadata associated with the band limited audio object; and using metadata associated with the band limited audio object to control one or more parameters of the band limited audio object.
Some example embodiments will now be described with reference to the accompanying drawings in which:
The Figures illustrate an apparatus 101 comprising means for: obtaining 401 a band limited audio object 211 comprising one or more parameters and obtaining spatial metadata associated with the band limited audio object 211; determining 403 a position of a user 605; and using 405 the determined position of the user 605 and the spatial metadata to control at least one of the parameters of the band limited audio object 211. The parameters could comprise at least one of volume, delay, reverberation, diffusivity or any other suitable parameter. This provides the technical effect of enabling spatial control of band limited audio objects 211. This may provide for an improved spatial audio experience for a user 605. For example, it may enable the band limited audio objects 211 to be controlled as the user 605 moves within an audio space so that a more realistic audio signal can be provided to the user 605.
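The overall control loop can be illustrated as follows. This is a minimal sketch, not the disclosed implementation: the data structures and the simple inverse-distance gain rule are assumptions made for illustration.

```python
# Hypothetical sketch of the control flow described above: obtain a
# band limited audio object with parameters and spatial metadata,
# determine the user's position, and adjust a parameter (here volume)
# accordingly. The inverse-distance rule is an assumed example.

def update_parameters(audio_object, spatial_metadata, user_position):
    """Adjust the object's volume based on distance to its virtual source."""
    src = spatial_metadata["source_position"]
    dx = user_position[0] - src[0]
    dy = user_position[1] - src[1]
    distance = (dx * dx + dy * dy) ** 0.5
    # Illustrative rule: gain falls off with distance, clamped so the
    # volume never exceeds the authored level.
    audio_object["volume"] = min(
        1.0, audio_object["base_volume"] / max(distance, 1.0))
    return audio_object

obj = {"base_volume": 1.0, "volume": 1.0}
meta = {"source_position": (0.0, 0.0)}
```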
As illustrated in
The processor 105 is configured to read from and write to the memory 107. The processor 105 may also comprise an output interface via which data and/or commands are output by the processor 105 and an input interface via which data and/or commands are input to the processor 105.
The memory 107 is configured to store a computer program 109 comprising computer program instructions (computer program code 111) that controls the operation of the apparatus 101 when loaded into the processor 105. The computer program instructions, of the computer program 109, provide the logic and routines that enable the apparatus 101 to perform the methods illustrated in
The apparatus 101 therefore comprises: at least one processor 105; and at least one memory 107 including computer program code 111, the at least one memory 107 and the computer program code 111 configured to, with the at least one processor 105, cause the apparatus 101 at least to perform: obtaining 401 a band limited audio object 211 comprising one or more parameters and also obtaining spatial metadata associated with the band limited audio object 211; determining 403 a position of a user 605; and using 405 the determined position of the user 605 and the obtained spatial metadata to control at least one of the parameters of the band limited audio object 211.
In some examples the apparatus 101 may comprise: at least one processor 105; and at least one memory 107 including computer program code 111, the at least one memory 107 and the computer program code 111 configured to, with the at least one processor 105, cause the apparatus 101 at least to perform: obtaining a band limited audio object comprising one or more parameters and also obtaining spatial metadata associated with the band limited audio object 211; determining a direction of a display associated with the band limited audio object 211; and using the spatial metadata associated with the band limited audio object 211 to control one or more parameters of the band limited audio object 211 in accordance with the determined direction of the display.
As illustrated in
The computer program 109 comprises computer program instructions for causing an apparatus 101 to perform at least the: obtaining 401 a band limited audio object 211 comprising one or more parameters wherein the band limited audio object 211 is configured to be played back via at least one band limited speaker and also obtaining spatial metadata associated with the band limited audio object 211; determining 403 a position of a user 605; and using 405 the determined position of the user 605 and the obtained spatial metadata to control at least one of the parameters of the band limited audio object 211.
In some examples the computer program 109 may comprise computer program instructions for causing an apparatus 101 to perform at least: obtaining a band limited audio object comprising one or more parameters wherein the band limited audio object 211 is configured to be played back via at least one band limited speaker and also obtaining spatial metadata associated with the band limited audio object 211; determining a direction of a display associated with the band limited audio object 211; and using the spatial metadata associated with the band limited audio object 211 to control one or more parameters of the band limited audio object 211 in accordance with the determined direction of the display.
The computer program instructions may be comprised in a computer program 109, a non-transitory computer readable medium, a computer program product, or a machine readable medium. In some but not necessarily all examples, the computer program instructions may be distributed over more than one computer program 109.
Although the memory 107 is illustrated as a single component/circuitry it may be implemented as one or more separate components/circuitry some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/dynamic/cached storage.
Although the processor 105 is illustrated as a single component/circuitry it may be implemented as one or more separate components/circuitry some or all of which may be integrated/removable. The processor 105 may be a single core or multi-core processor.
References to “computer-readable storage medium”, “computer program product”, “tangibly embodied computer program” etc. or a “controller”, “computer”, “processor” etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other processing circuitry. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.
As used in this application, the term “circuitry” may refer to one or more or all of the following:
This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.
The apparatus 101 could be an apparatus 101 as shown in
The memory 107 may be configured to store information representing one or more band limited audio objects 211. A band limited audio object 211 could be an object whose bandwidth is substantially narrower than the normal human hearing range. A band limited audio object 211 could comprise a low frequency effect audio object. The low frequency effect audio object could comprise frequencies at the lower end of the human hearing range. A band limited audio object 211 may comprise only low frequency sounds. In some examples the band limited audio object 211 could be limited to a frequency range of 20-120 Hz. In some examples the lowest frequencies of the band limited audio object 211 could be between 10-50 Hz and in some examples the highest frequencies of the band limited audio object 211 could be between 50-120 Hz.
A band limited audio object 211 may be different from a non-band limited audio object in that a non-band limited audio object may cover all of, or almost all of, normal human hearing frequencies while the band limited audio object only covers a small range of these frequencies. The band limited audio object may be configured to be played back via at least one band limited speaker whereas the non-band limited audio object could be played back via at least one normal speaker.
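The distinction between band limited and non-band limited objects can be sketched as a simple check against the example frequency range given above. The function and its defaults are assumptions for illustration, not a definition from the disclosure.

```python
# Illustrative check: treat an audio object as a band limited / low
# frequency effect object when its frequency band lies within roughly
# 20-120 Hz, the example range mentioned above. The default bounds
# are assumed values, not normative limits.

def is_low_frequency_effect(low_hz, high_hz,
                            band_floor=20.0, band_ceiling=120.0):
    """Return True when the object's band lies within the LFE range."""
    return band_floor <= low_hz and high_hz <= band_ceiling
```

Under this check, an object spanning 30-100 Hz would be routed to a band limited speaker, whereas a full-range object spanning 20 Hz to 20 kHz would not.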
The band limited audio object 211 may be associated with a spatial position. The spatial position could be the location of a sound source which generates the band limited audio object 211. In some examples the spatial position could be the direction from which the band limited audio object 211 is perceived to arrive. This could be affected by walls or other physical objects which could reflect or otherwise direct the sound.
The band limited audio object 211 may comprise a band limited audio object playback volume and/or a band limited audio object playback signal.
The band limited audio object 211 may comprise one or more different parameters. The parameters may be controlled to enable the spatial properties of the band limited audio object 211 to be recreated and perceived by a user 605. The different parameters could comprise any one or more of volume, delay, diffusivity, reverberation or any other parameter which is determined by the spatial properties of the band limited audio object 211.
The memory 107 may also be configured to store metadata 213. The metadata 213 may be stored with the band limited audio object 211. The metadata 213 may be stored with the band limited audio object 211 so that when the band limited audio object 211 is retrieved the metadata 213 can also be retrieved.
In some examples the metadata 213 could comprise spatial metadata. The spatial metadata may comprise information which enables spatial effects of the band limited audio object 211 to be recreated. For instance, it may comprise information indicative of how the volume, or other parameters, of the band limited audio object 211 should be controlled in dependence upon the user's position. The volume could be the loudness of the band limited audio object 211. The volume could be the gain applied to the band limited audio object 211. The position of the user 605 could be the angular orientation of the user 605 and/or the distance between the user 605 and a reference point. The reference point could be the rendering device 201 or any other suitable reference point. The spatial metadata could be obtained using a spatial audio capture system 301 such as the system shown in
In some examples the metadata 213 may comprise information indicative of another audio object which is associated with the band limited audio object 211. This other audio object could be a non-band limited audio object. For example the non-band limited audio object may comprise high frequency sounds. The non-band limited audio object may comprise sounds that cover a normal range of hearing. The non-band limited audio object may comprise sounds that cover a frequency range of 20 Hz to 20 kHz. The ranges of frequencies covered by the non-band limited audio object could overlap with the ranges of frequencies covered by a band limited audio object 211. The other audio object could be stored in the memory 107 of the apparatus 101 or in the memory of a different device.
In some examples the band limited audio object 211 and the non-band limited audio object could be associated in that they may originate from the same sound source. For example a sound source could produce both low frequency sounds and higher frequency sounds. The low frequency sounds could be comprised within the band limited audio object 211 and the higher frequency sounds could be comprised within the non-band limited audio object.
In some examples the band limited audio object 211 and the non-band limited audio object could be associated in that they may originate from the same direction or a similar direction but could be generated by different sources. For example if the audio is used to recreate the sound of a battle scene the band limited audio object 211 could correspond to cannon fire while the non-band limited audio object could correspond to gun fire. These sounds could be generated by different sources but the sources may be located in the same or similar positions.
When the audio is being rendered both the band limited audio object 211 and the non-band limited audio object could be played back at the same time. The band limited audio object 211 and the non-band limited audio object could be played back via different speakers.
In these examples a single set of spatial metadata 213 could be stored. This could be stored with the non-band limited audio object. The metadata that is stored with the band limited audio object 211 could provide an indication of the non-band limited audio object which is associated with the band limited audio object 211 and could enable the spatial metadata 213 to be retrieved. This enables the same spatial metadata to be shared between two or more different audio objects. It is to be appreciated that the spatial metadata 213 could be stored with any one or more of the associated audio objects. This may reduce the amount of data that needs to be transmitted and/or stored.
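The shared-metadata arrangement described above can be sketched as follows. All structures and names here are hypothetical; the battle-scene objects echo the earlier cannon fire and gun fire example.

```python
# Hypothetical sketch: the spatial metadata 213 is stored once, with
# the non-band limited audio object, and the band limited audio object
# carries only a reference that lets the shared metadata be retrieved.

objects = {
    "cannon_lfe": {                    # band limited audio object
        "band": (20, 120),
        "linked_object": "gunfire",    # metadata indicating the link
    },
    "gunfire": {                       # non-band limited audio object
        "band": (20, 20000),
        "spatial_metadata": {"azimuth_deg": 45.0, "distance_m": 10.0},
    },
}

def spatial_metadata_for(name, store):
    """Resolve an object's spatial metadata, following the link if needed."""
    obj = store[name]
    if "spatial_metadata" in obj:
        return obj["spatial_metadata"]
    return store[obj["linked_object"]]["spatial_metadata"]
```

Both objects then resolve to the same metadata, which is what allows the single stored copy to reduce the amount of data transmitted or stored.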
The band limited audio object 211 may be obtained by the apparatus 101 by any suitable means. In some examples the apparatus may form part of a spatial audio capture system which may be configured to record and capture the band limited audio object 211 and other audio objects. In some examples the band limited audio object 211 may be received via a communication link and stored in the memory 107 of the apparatus 101.
The at least one loudspeaker 203 may comprise any means which enables an electrical input signal to be rendered into an audible output signal. In some examples the at least one loudspeaker 203 may comprise a band limited speaker which may be configured to provide a low frequency effect audible output signal. This may enable the band limited audio object 211 to be rendered to a user 605. The at least one loudspeaker 203 may be coupled to the memory 107 to enable the band limited audio object 211 to be retrieved from the memory 107 and provided to the loudspeaker 203.
The positioning means 205 may comprise any means which may enable a position of a user 605 to be determined. In some examples the position of the user 605 may comprise the distance between the user 605 and one or more reference points. The reference points could be the position of the loudspeaker 203 or any other suitable point.
In some examples the position of the user 605 may comprise the angular orientation of the user 605. The angular orientation of the user 605 may be determined compared to a given reference point such as the loudspeaker 203 or a display or any other suitable point.
In some examples the positioning means 205 could comprise one or more electromagnetic sensors. The electromagnetic sensors could comprise infrared sensors or any other suitable type of sensors. The electromagnetic sensors may be used to determine the distance between a user 605 and a reference point and/or determine an angular orientation of the user 605. Other types of sensors may be used in other examples of the disclosure.
In some examples the positioning means 205 may be configured to infer the angular orientation of the user 605 from the position of a display or other device within a system. For instance, if the position of a display is known, or determined by the positioning means 205, it may be assumed that the user 605 is facing towards the display. The position of the display could be determined using any suitable means such as accelerometers, magnetometers or any other suitable devices. The display could be a head mounted display or any other suitable type of display.
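The inference described above (assuming the user faces the display) can be sketched geometrically. This is a minimal illustration under assumed 2D geometry; the function name and coordinate convention are not from the disclosure.

```python
# Hypothetical sketch: when the display position is known, assume the
# user 605 faces it and derive the user's angular orientation as the
# bearing from the user's position towards the display's position.

import math

def inferred_user_orientation(user_pos, display_pos):
    """Bearing in degrees (0-360) from the user towards the display."""
    dx = display_pos[0] - user_pos[0]
    dy = display_pos[1] - user_pos[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0
```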
It is to be appreciated that the device shown in
The system 301 comprises a plurality of microphones 303 which are configured to capture spatial audio signals. The microphones 303 could be provided in any suitable devices. For instance the microphones 303 may be provided in a mobile phone, a microphone array, a computing device or any other suitable type of microphone device. The microphones 303 may be configured to capture low frequency sounds so as to enable a band limited audio object 211 to be obtained.
The band limited audio object 211 can be obtained from the microphones 303 using any suitable means. In some examples the band limited audio object 211 can be obtained by an audio engineer using a digital audio workstation or by any other suitable means.
The system 301 comprises a microphone capture processing module 305. The microphone capture processing module 305 is configured to process the signals captured by the plurality of microphones 303. The microphone capture processing module 305 may comprise any means which may be configured to process the signals captured by the plurality of microphones 303 so as to provide a spatial audio output signal 307. The spatial audio output signal 307 may comprise any suitable type of spatial format such as Ambisonics, multichannel formats, a plurality of channels with spatial metadata or any other suitable format.
In some examples the microphone capture processing module 305 may be configured to process the captured audio signals to create a band limited audio object 211. The microphone capture processing module 305 may also be configured to generate spatial metadata associated with the band limited audio object 211 so as to enable the spatial properties of the band limited audio object 211 to be recreated.
The spatial audio output signal 307 is provided to an encoder module 309. The encoder module 309 may comprise any means which may be configured to process the spatial audio output signal 307 and any spatial metadata into a format suitable for transmission. The encoder module 309 is configured to encode and multiplex the spatial audio output signal 307 and the spatial metadata into a bitstream 311. The encoder module 309 provides the bitstream 311 as an output. Any suitable type of encoder module 309 could be used in examples of the disclosure. In some examples the encoder module 309 could be an immersive voice and audio services (IVAS) encoder.
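The multiplexing step can be illustrated with a greatly simplified sketch. A real encoder module 309 (for example an IVAS encoder) performs psychoacoustic coding; this toy version merely packs the spatial metadata and one frame of samples into a single byte stream, and every format choice here is an assumption.

```python
# Greatly simplified, hypothetical sketch of multiplexing a spatial
# audio signal and its spatial metadata into one bitstream: a
# length-prefixed JSON metadata header followed by raw float samples.

import json
import struct

def encode_frame(samples, spatial_metadata):
    """Pack metadata (JSON, length-prefixed) followed by float32 samples."""
    meta_bytes = json.dumps(spatial_metadata).encode("utf-8")
    header = struct.pack("<I", len(meta_bytes))
    payload = struct.pack(f"<{len(samples)}f", *samples)
    return header + meta_bytes + payload

def decode_frame(bitstream):
    """Inverse of encode_frame; returns (samples, spatial_metadata)."""
    meta_len = struct.unpack_from("<I", bitstream, 0)[0]
    meta = json.loads(bitstream[4:4 + meta_len].decode("utf-8"))
    body = bitstream[4 + meta_len:]
    samples = list(struct.unpack(f"<{len(body) // 4}f", body))
    return samples, meta
```

A decoder at the rendering device 201 could then recover both the signal and the metadata from the received bitstream.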
In some examples the bitstream 311 could be provided to a transmitter to enable the bitstream 311 to be transmitted to a device such as the rendering device 201 shown in
The method comprises, at block 401, obtaining a band limited audio object 211 comprising one or more parameters. The band limited audio object 211 could comprise a low frequency effect audio object or any other suitable type of object. The band limited audio object 211 could be configured to be played back or rendered via at least one band limited speaker 203.
The band limited audio object 211 may comprise one or more different parameters. The parameters may enable the spatial properties of the band limited audio object 211 to be recreated. The different parameters could comprise any one or more of volume, delay, diffusivity, reverberation, position or any other parameter which affects the spatial properties of the band limited audio object 211.
The band limited audio object 211 could be obtained by any suitable means. In some examples the obtaining of the band limited audio object 211 could comprise retrieving the band limited audio object 211 from a memory 107. The memory 107 could be the memory of the rendering device 201 or the memory of a different device such as a storage device. In some examples the obtaining of the band limited audio object 211 could comprise receiving the band limited audio object 211 from a spatial audio capture system 301 such as the system shown in
In some examples the method may also comprise obtaining spatial metadata. The spatial metadata may be obtained with the band limited audio object 211 or may be obtained separately from the band limited audio object 211.
At block 403 the method comprises determining a position of a user 605. Any suitable process may be used to determine the position of the user 605. In some examples the position of the user 605 could be determined by positioning means 205 which comprise part of the rendering device 201. In other examples the position of the user 605 could be determined by a remote device which then provides information indicative of the determined position to the rendering device 201.
In some examples the position of the user 605 may comprise the distance between the user 605 and one or more reference points. The reference points could be the position of the loudspeaker 203, the position of part of the rendering device 201 or any other suitable point. In some examples the position of the user 605 may comprise the angular orientation of the user 605. The angular orientation of the user 605 may be determined compared to a given reference point such as the loudspeaker 203 or a display or any other suitable point.
In some examples the position of the user 605 could be inferred from the position of a display or other part of an audio rendering system. The display could be configured to display visual images which are associated with the band limited audio object 211 and/or other audio that is being rendered. In some examples the display could be a near eye display which may be provided in a headset. The display could be used for augmented reality purposes or for any other suitable purpose.
At block 405 the method comprises using the determined position of the user 605 and the obtained spatial metadata to control at least one of the parameters of the band limited audio object 211. The control of the parameters may enable the spatial effects of the band limited audio object 211 to be recreated to correspond to the position of the user 605.
If the user 605 is moving then the way in which the parameters are controlled may be changed so as to enable the spatial effects to correspond to the movement of the user 605. The positioning means 205 may enable both the distance and the orientation of a user 605 to be determined, so that movement of the user 605 can be tracked with six degrees of freedom. This enables translational movement of the user 605 as well as rotational movement of the user 605 to be accounted for by the control of the parameters of the band limited audio object 211. The translational movement could comprise movement along any of three perpendicular axes. The rotational movement may comprise rotation about any of three perpendicular axes.
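The six-degrees-of-freedom tracking can be sketched as follows. The geometry is an assumption made for illustration: only yaw is modelled for the rotational part, for brevity, and the pose fields are hypothetical.

```python
# Illustrative sketch: a six-degrees-of-freedom pose combines
# translation (x, y, z) with rotation (only yaw shown here). From the
# pose the renderer can derive both the distance to the virtual source
# (translational movement) and the source direction relative to the
# user's facing (rotational movement).

import math

def relative_source(pose, source_pos):
    """Return (distance, relative_azimuth_deg) of the source from the pose."""
    dx = source_pos[0] - pose["x"]
    dy = source_pos[1] - pose["y"]
    dz = source_pos[2] - pose["z"]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    bearing = math.degrees(math.atan2(dy, dx))
    relative = (bearing - pose["yaw_deg"]) % 360.0
    return distance, relative
```

The distance could then drive a volume or delay parameter, while the relative azimuth could drive a diffuseness or panning parameter.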
In some examples spatial metadata may be used to control the parameters of the band limited audio object 211. The spatial metadata may be obtained with the band limited audio object 211. The spatial metadata may comprise information indicating how the parameters should be varied in dependence upon the position of the user 605.
In examples of the disclosure the position of the user 605 may be determined while the band limited audio object 211 is being played back. That is, the position of the user 605 may be determined while the band limited audio object 211 is being rendered by the one or more loudspeakers 203. This may enable the current position of the user 605 to be determined and may enable the parameters of the band limited audio object 211 to be controlled so as to account for movement of the user 605.
At block 501 the location of a user 605 is determined. The location of the user 605 may be determined relative to components of an audio rendering system. For example the location of one or more speakers 203 may be known. The location of the speakers 203 may be known from calibration data of the audio rendering system or from measurements made by a suitable positioning means or by any other suitable process.
The position of the band limited audio object 211 may also be determined. The position of the band limited audio object 211 may be determined from spatial metadata which may be stored with, or otherwise associated with, the band limited audio object 211. The position of the band limited audio object 211 could be a virtual position which represents the position of the band limited audio object 211 in a virtual audio space. This may determine the position in which a user 605 perceives the band limited audio object 211 to be located or to originate from. The position may be determined by the location of a sound source when the sound was being captured. The position may also be affected by other factors which affect the directionality of sound such as the presence of walls and other features which may reflect or divert sound.
At block 503 the volume of the band limited audio object 211 is controlled based on the determined location of the user 605. The positions of the speakers 203 and the virtual position of the band limited audio object 211 may also be used to control the volume of the band limited audio object 211. For instance, if the user 605 moves closer to the virtual position of the band limited audio object 211 then the volume of the band limited audio object 211 may be increased, while if the user 605 moves further away from the virtual position of the band limited audio object 211 then the volume of the band limited audio object 211 may be decreased.
The positions of the speakers within the audio rendering system may also be taken into account while the volume is being controlled. For example it may be determined if the user 605 is moving towards a speaker 203 rendering the band limited audio object 211 or away from a speaker 203 rendering the band limited audio object 211 and the volume can be controlled as needed.
The audio rendering system 601 comprises a plurality of loudspeakers 203, 603. The audio rendering system 601 comprises a plurality of non-band limited speakers 603 and a band limited speaker 203. The non-band limited speakers 603 may be configured to render non-band limited audio objects. The band limited speaker 203 may be configured to render band limited audio objects 211. In the example of
In the example of
In the example of
The apparatus 101 for controlling the parameters of the band limited audio object 211 could be provided at any suitable position within the audio rendering system. In some examples the apparatus 101 could be provided within the band limited speaker 203.
The band limited audio object 211 is located in a first location 611 within the virtual audio space. In the example of
The band limited audio object 211 could comprise a mono signal or a stereo signal. In the example of
The band limited speaker 203 is positioned at a second location 613 within the audio rendering system 601. The second location 613 may be different to the first location 611. In the example audio rendering system 601 of
In the example of
While the user 605 is moving the user 605 could also be consuming additional content related to the audio. For example the user 605 could be using augmented reality content or virtual reality content which may comprise images as well as audio. The images could be rendered on a near eye display which could be provided in a headset or any other suitable type of display.
The movement of the user 605 could be tracked using any suitable means. In some examples one or more of the speakers 203, 603 could comprise positioning means 205 which enables the distance between the user 605 and one or more of the speakers 203, 603 to be determined. In some examples a device such as a headset which can be worn by the user 605 may comprise positioning means which may be configured to track the movements of the user 605.
In the example of
The calibrated location 615 may be a central position within the audio rendering system 601. The speakers 603, 203 may be calibrated so that when the user 605 is in this calibrated location 615 the level of sound coming from each of the speakers 203, 603 is the same. The calibrated location 615 may be an optimal position for listening to the sound rendered by the audio rendering system 601.
When the user 605 is in the calibrated location 615 the gain applied to the band limited audio object 211 is indicated by the metadata 213 associated with the band limited audio object 211. In the example of
When the user 605 changes location the gain that is applied to the band limited audio object 211 may be determined based both on the distance between the user 605 and the band limited audio object 211 and also the distance between the user 605 and the band limited audio speaker 203. For instance, in the example of
Where A is the distance between the band limited speaker 203 and the current location 617 of the user, B is the distance between the band limited audio object 211 and the calibrated location 615, C is the distance between the band limited speaker 203 and the calibrated location 615 and D is the distance between the band limited audio object 211 and the current location 617 of the user 605.
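The gain expression referred to above is not reproduced in this text. Purely as an illustration, the following sketch assumes a simple inverse-distance rule, gain = (A/C) × (B/D), chosen so that the gain is unity at the calibrated location, rises as the user approaches the virtual position of the band limited audio object, and falls as the user approaches the band limited speaker. The function name and the specific formula are assumptions, not part of the disclosure.

```python
def band_limited_gain(a, b, c, d):
    """Illustrative distance-compensated gain for a band limited audio object.

    a: distance from the band limited speaker to the user's current location
    b: distance from the band limited audio object to the calibrated location
    c: distance from the band limited speaker to the calibrated location
    d: distance from the band limited audio object to the user's current location

    When the user is at the calibrated location (a == c and b == d) the
    correction is unity, i.e. the gain indicated by the metadata is left
    unchanged.
    """
    return (a / c) * (b / d)


# User at the calibrated location: no correction is applied.
print(band_limited_gain(2.0, 3.0, 2.0, 3.0))  # unity gain

# User halves their distance to the virtual object: the gain rises.
print(band_limited_gain(2.0, 3.0, 2.0, 1.5))
```

Under this assumed rule, moving closer to the virtual object (d < b) raises the gain, while moving closer to the band limited speaker (a < c) lowers it, compensating so that the perceived loudness tracks the object rather than the speaker.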
In the example audio rendering system 701 of
In the example shown in
Where A is the distance between the band limited speaker 203 and the current location 617 of the user, B is the distance between the location of the nearest speaker 603 and the calibrated location 615, C is the distance between the band limited speaker 203 and the calibrated location 615 and D is the distance between the nearest speaker 603 and the current location 617 of the user 605.
In the example audio rendering system 801 of
In the example of
In the example of
In the examples of
At block 1111 the location of a user 605 is determined. The location of the user 605 may be determined relative to components of an audio rendering system. For example the location of one or more speakers 203, 603 may be known. The one or more speakers could be band limited speakers 203. In some examples the speakers 203 could also include non-band limited speakers 603. The location of the speakers 203, 603 may be known from calibration data of the audio rendering system or from measurements made by a suitable positioning means or by any other suitable process.
The position of the band limited audio object 211 may also be determined. The position of the band limited audio object 211 may be determined from spatial metadata which may be stored with, or otherwise associated with, the band limited audio object 211. The position of the band limited audio object 211 could be a virtual position which represents the position of the band limited audio object 211 in a virtual audio space. This may determine the position in which a user 605 perceives the band limited audio object 211 to be located or to originate from. The position may be determined by the location of a sound source when the sound was being captured. The position may also be affected by other factors which affect the directionality of sound such as the presence of walls and other features which may reflect or divert sound.
At block 1113 the delay of the band limited audio object 211 is controlled based on the determined location of the user 605. The positions of the speakers 203, 603 and the virtual position of the band limited audio object 211 may also be used to control the delay of the band limited audio object 211. The delay of the band limited audio object 211 could be the delay as compared to other non-band limited audio objects which are being rendered at the same time as the band limited audio object 211.
The audio rendering system 1201 of
In the example audio rendering system 1201 of
In the example shown in
In some examples the delay that is applied to the band limited audio object 211 may be given by
Where A is the distance between the user 605 and the band limited speaker 203, B is the distance between the user 605 and the non-band limited speaker 603 and c is the speed of sound. The speed of sound may be estimated as 340 m/s.
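The delay expression referred to above is not reproduced in this text. A natural choice consistent with the stated definitions is delay = (B − A)/c, which synchronises the arrival at the user of the band limited audio with the audio from the non-band limited speaker. The clamping at zero (a signal cannot be advanced in time) and the function name are assumptions, not part of the disclosure.

```python
SPEED_OF_SOUND = 340.0  # m/s, as estimated in the description


def band_limited_delay(a, b, c=SPEED_OF_SOUND):
    """Illustrative delay for synchronising a band limited audio object.

    a: distance from the user to the band limited speaker
    b: distance from the user to the non-band limited speaker
    Returns the extra delay, in seconds, applied to the band limited audio
    object so that both signals arrive at the user at the same time.
    """
    return max(0.0, (b - a) / c)


# User is 1 m from the band limited speaker and 4.4 m from the
# non-band limited speaker: the band limited audio is held back
# by the 3.4 m path difference divided by the speed of sound.
print(band_limited_delay(1.0, 4.4))
```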
In some examples the audio rendering system 1201 may be calibrated so that the audio from each of the speakers 203, 603 is synchronised to arrive at the same time to a calibration location 615. The calibration location 615 could be a central location within the audio rendering system 1201. The delay that is added in the examples of the disclosure could be added in addition to this calibration delay.
The delay that is added in the examples shown in
At block 1301 spatial metadata stored with a first audio object is accessed. In some examples the spatial metadata may be stored with a non-band limited audio object but may be needed in order to enable a band limited audio object 211 to be rendered. The non-band limited audio object and the band limited audio object 211 could be associated with each other in that they represent the same or similar sound sources, they may come from the same or similar direction, they may be played back simultaneously to create a spatial audio space or there may be any other suitable connection.
In such cases metadata 213 indicative of the non-band limited audio object may be stored with a band limited audio object 211. The metadata 213 obtained with the band limited audio object 211 may be indicative of the connection between the band limited audio object 211 and the non-band limited audio object. The metadata 213 indicative of the audio object or the connection between the audio objects could be an integer which represents the audio object. In such examples each of the audio objects could be assigned a reference integer.
The metadata 213 obtained with the band limited audio object 211 may therefore enable the spatial metadata to be accessed even though the spatial metadata may be stored with a different audio object.
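The reference-integer linkage described above can be sketched as a simple lookup: a band limited object stores the integer of its associated non-band limited object, and the spatial metadata is retrieved by following that link. All of the names and field layouts below are hypothetical, introduced only to illustrate the mechanism.

```python
# Hypothetical registry mapping reference integers to audio objects.
# Object 1 is a non-band limited object carrying the spatial metadata;
# object 2 is a band limited (e.g. LFE) object that links to it.
audio_objects = {
    1: {"name": "explosion_full_band",
        "spatial_metadata": {"azimuth_deg": 90.0, "distance_m": 3.0}},
    2: {"name": "explosion_lfe", "linked_object": 1},
}


def spatial_metadata_for(object_id, registry):
    """Return the spatial metadata for an audio object, following the
    reference-integer link when the metadata is stored with a different
    (e.g. non-band limited) audio object."""
    obj = registry[object_id]
    if "spatial_metadata" in obj:
        return obj["spatial_metadata"]
    return spatial_metadata_for(obj["linked_object"], registry)
```

Because both objects resolve to the same metadata record, the same spatial control can be applied to the band limited and non-band limited objects simultaneously.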
At block 1303 the spatial metadata is used to control parameters of both the first audio object and a second audio object. The first audio object could be the non-band limited audio object and the second audio object could be the band limited audio object 211. In some examples the spatial metadata could be used to control the parameters of more than two audio objects. The spatial metadata can be used to control the parameters of the different audio objects simultaneously.
In the example described above the spatial metadata is stored with the non-band limited audio object and the band limited audio object 211 is stored with metadata 213 which indicates how to access the stored spatial metadata. In other examples the spatial metadata could be stored with the band limited audio object 211 while the metadata 213 stored with the non-band limited audio object could be used to retrieve the spatial metadata as needed.
The method of
The method comprises, at block 1401, obtaining a band limited audio object 211 comprising one or more parameters. The band limited audio object 211 could comprise a low frequency effect audio object or any other suitable type of object. The band limited audio object 211 could be configured to be played back or rendered via at least one band limited speaker 203.
The band limited audio object 211 may comprise one or more different parameters. The parameters may enable the spatial properties of the band limited audio object 211 to be recreated. The different parameters could comprise any one or more of volume, delay, diffusivity, reverberation or any other parameter which affects the spatial properties of the band limited audio object 211. The volume could be the loudness or the gain applied to the band limited audio object 211.
The band limited audio object 211 could be obtained by any suitable means. In some examples the obtaining of the band limited audio object 211 could comprise retrieving the band limited audio object 211 from a memory 107. The memory 107 could be the memory of the rendering device 201 or the memory of a different device such as a storage device. In some examples the obtaining of the band limited audio object 211 could comprise receiving the band limited audio object 211 from a spatial audio capture system 301 such as the system shown in
At block 1403 the method comprises determining a direction of a display associated with the band limited audio object 211. The display could be a display on which visual content associated with the band limited audio object 211 is being displayed. It may be assumed that the user 605 viewing the content on the display is positioned so that they are facing towards the display.
The display could be a near eye display which could be provided within a headset or other similar device. In such examples the direction of the display may change as the user 605 rotates their head and/or body.
In some examples the display could be provided within a handheld device such as a mobile telephone. In such examples the user 605 could tilt or otherwise change the direction of the handheld device while they are viewing content displayed on the display.
The direction of the display could be determined using positioning means 205. The positioning means could comprise accelerometers, magnetometers or any other suitable means which could be configured to determine the direction of the display.
In some examples the method may also comprise obtaining spatial metadata. The spatial metadata may be obtained with the band limited audio object 211 or may be obtained separately from the band limited audio object 211.
At block 1405 the method comprises using spatial metadata associated with the band limited audio object 211 to control one or more parameters of the band limited audio object 211 in accordance with the determined direction of the display. The one or more parameters that are controlled could comprise the volume or any other suitable parameter.
As an example the band limited speaker 203 in an audio rendering system may only cover a limited angular range. The angular range of the band limited speaker 203 may be limited compared to non-band limited speakers 603. The angular range of the band limited speaker 203 may be limited in that it does not cover the entire angular range within which the display could be directed.
If it is determined that the direction of the display is within a threshold range then the band limited audio object 211 may be controlled so that no change is made to the parameters of the band limited audio object 211. The threshold range could comprise an angular range that corresponds to the user 605 being positioned within the angular range covered by the band limited speaker 203.
If it is determined that the direction of the display is outside of the threshold range then the band limited audio object 211 may be controlled as indicated by the spatial metadata. If the direction is determined to be outside of the threshold range then this could correspond to the user being positioned outside of the angular range covered by the band limited speaker 203. In that case the parameters of the band limited audio object 211 may be controlled so that the spatial effects of the band limited audio object 211 can be recreated for the user 605. For instance the volume of the band limited audio object 211 may be decreased if it is determined that the display is outside of the threshold range. This could recreate the spatial effect of the band limited audio object 211 being positioned behind or towards the back of the user 605.
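The threshold test described above can be sketched as follows. The angular range covered by the band limited speaker and the attenuation applied outside it are arbitrary illustrative values, and the function name is an assumption.

```python
def control_volume_for_display_direction(base_volume, display_azimuth_deg,
                                         covered_range_deg=(-60.0, 60.0),
                                         attenuation=0.5):
    """Illustrative control of a band limited audio object's volume based
    on the direction of an associated display.

    If the display direction falls inside the angular range covered by the
    band limited speaker, the volume is left unchanged; outside that range
    the volume is reduced, recreating the effect of the object being behind
    or to the side of the user.
    """
    lo, hi = covered_range_deg
    if lo <= display_azimuth_deg <= hi:
        return base_volume  # within threshold: no change to parameters
    return base_volume * attenuation  # outside threshold: reduce volume


# Display facing the band limited speaker: volume unchanged.
print(control_volume_for_display_direction(1.0, 0.0))

# Display rotated well outside the covered range: volume reduced.
print(control_volume_for_display_direction(1.0, 120.0))
```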
Examples of the disclosure provide the technical effect of enabling the spatial aspect of band limited audio objects 211 to be recreated. As a user's spatial awareness of the band limited audio objects 211 may be lower than their awareness of the non-band limited audio objects this enables different methods of providing the spatial effects to be used.
The blocks illustrated in the
In this application the term coupled means operationally coupled. Any number or combination of intervening elements can exist between coupled elements, including no intervening elements.
The term ‘comprise’ is used in this document with an inclusive not an exclusive meaning. That is any reference to X comprising Y indicates that X may comprise only one Y or may comprise more than one Y. If it is intended to use ‘comprise’ with an exclusive meaning then it will be made clear in the context by referring to ‘comprising only one . . . ’ or by using ‘consisting’.
In this description, reference has been made to various examples. The description of features or functions in relation to an example indicates that those features or functions are present in that example. The use of the term ‘example’ or ‘for example’ or ‘can’ or ‘may’ in the text denotes, whether explicitly stated or not, that such features or functions are present in at least the described example, whether described as an example or not, and that they can be, but are not necessarily, present in some of or all other examples. Thus ‘example’, ‘for example’, ‘can’ or ‘may’ refers to a particular instance in a class of examples. A property of the instance can be a property of only that instance or a property of the class or a property of a sub-class of the class that includes some but not all of the instances in the class. It is therefore implicitly disclosed that a feature described with reference to one example but not with reference to another example, can where possible be used in that other example as part of a working combination but does not necessarily have to be used in that other example.
Although embodiments have been described in the preceding paragraphs with reference to various examples, it should be appreciated that modifications to the examples given can be made without departing from the scope of the claims.
Features described in the preceding description may be used in combinations other than the combinations explicitly described above.
Although functions have been described with reference to certain features, those functions may be performable by other features whether described or not.
Although features have been described with reference to certain embodiments, those features may also be present in other embodiments whether described or not.
The term ‘a’ or ‘the’ is used in this document with an inclusive not an exclusive meaning.
That is, any reference to X comprising a/the Y indicates that X may comprise only one Y or may comprise more than one Y unless the context clearly indicates the contrary. If it is intended to use ‘a’ or ‘the’ with an exclusive meaning then it will be made clear in the context. In some circumstances the use of ‘at least one’ or ‘one or more’ may be used to emphasise an inclusive meaning but the absence of these terms should not be taken to infer an exclusive meaning.
The presence of a feature (or combination of features) in a claim is a reference to that feature (or combination of features) itself and also to features that achieve substantially the same technical effect (equivalent features). The equivalent features include, for example, features that are variants and achieve substantially the same result in substantially the same way. The equivalent features include, for example, features that perform substantially the same function, in substantially the same way to achieve substantially the same result.
In this description, reference has been made to various examples using adjectives or adjectival phrases to describe characteristics of the examples. Such a description of a characteristic in relation to an example indicates that the characteristic is present in some examples exactly as described and is present in other examples substantially as described.
Whilst endeavoring in the foregoing specification to draw attention to those features believed to be of importance it should be understood that the Applicant may seek protection via the claims in respect of any patentable feature or combination of features hereinbefore referred to and/or shown in the drawings whether or not emphasis has been placed thereon.
Number | Date | Country | Kind
--- | --- | --- | ---
1812038.6 | Jul 2018 | GB | national
Filing Document | Filing Date | Country | Kind
--- | --- | --- | ---
PCT/FI2019/050554 | 7/19/2019 | WO | 00