This application claims the benefit of Korean Patent Application No. 10-2020-0186524 filed on Dec. 29, 2020, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.
One or more example embodiments relate to a method and apparatus for processing an audio signal based on an extent sound source, and more particularly, to a technique for rendering an audio signal by setting a reference area of an extent sound source and performing sound localization on a virtual sound source according to a positional relationship between the reference area and a listener.
With the recent increase in the demand for virtual reality (VR) technology or games, research on audio technology for implementing realistic spatial sound is being actively conducted. An object-based audio signal for implementing spatial sound refers to an audio signal rendered in consideration of a relationship between a position of an object and a listener while regarding a sound source as the object.
Existing object-based audio signal processing treats a sound source as a point in space. However, in a real environment, sound sources may exist in various forms in space. For example, natural phenomena such as a fountain, a waterfall, a river, or breaking waves may produce sound over the whole of a predetermined area.
A sound source that produces a sound in the whole of a predetermined area such as a line or a plane is referred to as an extent sound source. Accordingly, in order to implement realistic spatial sound, a technique for processing an audio signal in consideration of an extent sound source is needed.
Example embodiments provide a method and apparatus for processing an extent sound source with a small amount of computation by setting a reference area of the extent sound source and performing sound localization on a virtual sound source according to a positional relationship between the reference area and a listener.
Example embodiments provide a method and apparatus for providing realistic spatial sound by rendering an audio signal for an extent sound source, without performing individual sound localization on a virtual sound source in all areas of the extent sound source.
According to an aspect, there is provided a method of processing an audio signal based on an extent sound source, the method including identifying information on a reference area of the extent sound source and information on a position of a listener, determining a position of a virtual sound source within the extent sound source based on a relationship between the position of the listener and the reference area of the extent sound source, and rendering an audio signal based on the determined position of the virtual sound source, wherein the reference area may be determined based on a position and a size of the extent sound source.
The determining of the position of the virtual sound source may include determining the position of the virtual sound source corresponding to the position of the listener, when the position of the listener is included in the reference area of the extent sound source.
The determining of the position of the virtual sound source may include determining the position of the virtual sound source in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
The rendering may include rendering the audio signal based on a frequency response of the listener to a virtual sound source positioned in front of the listener, when the position of the listener is included in the reference area of the extent sound source.
The rendering may include rendering the audio signal based on a frequency response of the listener to a virtual sound source positioned in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
According to an aspect, there is provided a method of processing an audio signal based on an extent sound source, the method including identifying information on a reference area of the extent sound source and information on a position of a listener, determining whether the position of the listener is included in the reference area of the extent sound source, determining a sound localization point of a virtual sound source corresponding to the position of the listener, when the position of the listener is included in the reference area of the extent sound source, determining the sound localization point of the virtual sound source in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source, and rendering the audio signal based on the sound localization point.
The rendering may include rendering the audio signal based on a frequency response of the listener to a sound localization point positioned in front of the listener, when the position of the listener is included in the reference area of the extent sound source.
The rendering may include rendering the audio signal based on a frequency response of the listener to a sound localization point positioned in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
According to an aspect, there is provided a processing apparatus to perform a method of processing an audio signal based on an extent sound source, the processing apparatus including a processor, wherein the processor may be configured to identify information on a reference area of the extent sound source and information on a position of a listener, determine a position of a virtual sound source within the extent sound source based on a relationship between the position of the listener and the reference area of the extent sound source, and render an audio signal based on the determined position of the virtual sound source, wherein the reference area may be determined based on a position and a size of the extent sound source.
The processor may be further configured to determine the position of the virtual sound source corresponding to the position of the listener, when the position of the listener is included in the reference area of the extent sound source.
The processor may be further configured to determine the position of the virtual sound source in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
The processor may be further configured to render the audio signal based on a frequency response of the listener to a virtual sound source positioned in front of the listener, when the position of the listener is included in the reference area of the extent sound source.
The processor may be further configured to render the audio signal based on a frequency response of the listener to a virtual sound source positioned in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
According to an aspect, there is provided a processing apparatus to perform a method of processing an audio signal based on an extent sound source, the processing apparatus including a processor, wherein the processor may be configured to identify information on spatial coordinates of the extent sound source and spatial coordinates of a position of a listener, determine whether the position of the listener is included in a reference area of the extent sound source, determine a sound localization point of a virtual sound source corresponding to the position of the listener, when the position of the listener is included in the reference area of the extent sound source, determine the sound localization point of the virtual sound source in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source, and render the audio signal based on the sound localization point.
The processor may be further configured to render the audio signal based on a frequency response of the listener to a sound localization point positioned in front of the listener, when the position of the listener is included in the reference area of the extent sound source.
The processor may be further configured to render the audio signal based on a frequency response of the listener to a sound localization point positioned in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
Additional aspects of example embodiments will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the disclosure.
According to example embodiments, it is possible to process an extent sound source with a small amount of computation by setting a reference area of the extent sound source and performing sound localization on a virtual sound source according to a positional relationship between the reference area and a listener.
According to example embodiments, it is possible to provide realistic spatial sound by rendering an audio signal for an extent sound source, without performing individual sound localization on a virtual sound source in all areas of the extent sound source.
These and/or other aspects, features, and advantages of the invention will become apparent and more readily appreciated from the following description of example embodiments, taken in conjunction with the accompanying drawings of which:
Hereinafter, example embodiments will be described in detail with reference to the accompanying drawings. However, various alterations and modifications may be made to the example embodiments. The example embodiments are not to be construed as limited to the disclosure. The example embodiments should be understood to include all changes, equivalents, and replacements within the idea and the technical scope of the disclosure.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to limit the example embodiments. The singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises/comprising” and/or “includes/including” when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Unless otherwise defined, all terms including technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, such as those defined in commonly-used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
When describing the example embodiments with reference to the accompanying drawings, like reference numerals refer to like constituent elements and a repeated description related thereto will be omitted. In the description of example embodiments, detailed description of well-known related structures or functions will be omitted when it is deemed that such description will cause ambiguous interpretation of the present disclosure.
The present disclosure relates to a technique for processing an audio signal 102 by setting a reference area of an extent sound source and performing sound localization on a virtual sound source according to a positional relationship between the reference area and a listener, for rendering the audio signal 102 for the extent sound source with a small amount of computation.
A method of processing the audio signal 102 based on an extent sound source may be performed by a processing apparatus 101. The processing apparatus 101 may include a processor of an electronic device such as a smartphone, a PC, or a tablet.
Referring to
The processing apparatus 101 may determine whether a position of a listener is included in a reference area of the extent sound source, determine a position of a virtual sound source according to a determination result, and render the audio signal 102 based on the determined position of the virtual sound source.
Herein, the extent sound source may be a line or a plane, and the type of the line or the plane is not limited to examples set forth herein. That is, when the extent sound source is a line, the extent sound source may be in various shapes such as a straight line, a curve, and the like. When the extent sound source is a plane, the extent sound source may be in various shapes such as a triangle, a rectangle, a pentagon, and the like.
The reference area may be set in order to determine the position of the virtual sound source within the extent sound source. The reference area may be an area in three-dimensional space that is determined according to a position and a size of the extent sound source. The reference area may be determined based on spatial coordinates of the extent sound source. The reference area will be described further with reference to
Specifically, the processing apparatus 101 may identify spatial coordinates of the position of the extent sound source and spatial coordinates of the position of the listener. The processing apparatus 101 may determine whether the position of the listener is included in the reference area of the extent sound source based on the spatial coordinates of the position of the extent sound source and the spatial coordinates of the position of the listener.
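The inclusion test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the extent sound source is assumed to be an axis-aligned rectangle lying in the plane z = 0, and the reference area is assumed to be the set of positions located on a normal of that plane passing through the rectangle. The names `RectSource` and `in_reference_area` and the example dimensions are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class RectSource:
    """Hypothetical planar extent sound source lying in the plane z = 0."""
    cx: float      # x coordinate of the center
    cy: float      # y coordinate of the center
    width: float   # size along the x axis
    height: float  # size along the y axis


def in_reference_area(src: RectSource, listener: Vec3) -> bool:
    """Return True if the listener is on a normal of the plane that
    passes through the rectangle (assumed definition of the reference area)."""
    x, y, _z = listener  # z is ignored: the area extends along the normal
    return (abs(x - src.cx) <= src.width / 2
            and abs(y - src.cy) <= src.height / 2)


# Assumed example: a 6 x 2 rectangle centered at the origin.
src = RectSource(cx=0.0, cy=0.0, width=6.0, height=2.0)
print(in_reference_area(src, (-2.0, 0.0, 2.0)))  # True
print(in_reference_area(src, (5.0, 0.0, 2.0)))   # False
```

Under these assumptions, only the position and size of the extent sound source and the coordinates of the listener are needed for the test, which keeps the check to a few comparisons.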
An extent sound source 201 of
To generate an audio signal for the extent sound source 201, all points included in the area of the extent sound source 201 may be determined as virtual sound sources 202, as shown in
Therefore, in terms of computational efficiency or data size, it may be advantageous to determine virtual sound sources 202 using the position of the listener and the position and size of the extent sound source 201 based on the spatial coordinates of the extent sound source 201.
An extent sound source 300 of
For example, referring to
When the position 301 of the listener is included in the reference area of the extent sound source 300, the processing apparatus may determine a position of a virtual sound source within the extent sound source 300 corresponding to the position 301 of the listener. That is, the processing apparatus may determine a sound localization point of the virtual sound source within the extent sound source 300 corresponding to the position 301 of the listener.
Specifically, when the position 301 of the listener is included in the reference area of the extent sound source 300, the processing apparatus may determine a position closest to the position 301 of the listener within the extent sound source 300 as the position of the virtual sound source. That is, when the position 301 of the listener is included in the reference area of the extent sound source 300, the processing apparatus may determine the point closest to the position 301 of the listener on the plane corresponding to the extent sound source 300 as the sound localization point of the virtual sound source.
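One way to compute such a closest point is to project the listener position onto the plane of the extent sound source and clamp the result to the bounds of the source. The sketch below assumes a rectangular source described by a corner and two perpendicular edge vectors; the function name `closest_point_on_rect` and the example coordinates are hypothetical and are not taken from the drawings.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def closest_point_on_rect(corner: Vec3, u: Vec3, v: Vec3,
                          listener: Vec3) -> Tuple[Vec3, bool]:
    """Return (closest point on the rectangle, inside flag).

    The rectangle is assumed to be corner + s*u + t*v with s, t in [0, 1]
    and u perpendicular to v. The inside flag is True when the orthogonal
    projection of the listener falls within the rectangle.
    """
    d = sub(listener, corner)
    s = dot(d, u) / dot(u, u)  # normalized coordinate along edge u
    t = dot(d, v) / dot(v, v)  # normalized coordinate along edge v
    inside = 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0
    s_c = min(max(s, 0.0), 1.0)  # clamping moves outside points to the edge
    t_c = min(max(t, 0.0), 1.0)
    point = (corner[0] + s_c * u[0] + t_c * v[0],
             corner[1] + s_c * u[1] + t_c * v[1],
             corner[2] + s_c * u[2] + t_c * v[2])
    return point, inside


# Assumed rectangle spanning x in [-3, 3], y in [-1, 1] in the plane z = 0.
corner, u, v = (-3.0, -1.0, 0.0), (6.0, 0.0, 0.0), (0.0, 2.0, 0.0)
print(closest_point_on_rect(corner, u, v, (-2.0, 0.0, 2.0)))
print(closest_point_on_rect(corner, u, v, (5.0, 0.0, 2.0)))
```

When the inside flag is False, the clamped point lies on the boundary of the rectangle, which is one plausible reading of localizing the virtual sound source in an edge area.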
For example, referring to
When the positions 302 and 303 of the listeners are not included in the reference area of the extent sound source 300, the processing apparatus may determine the position of the virtual sound source in an edge area of the extent sound source 300. That is, the processing apparatus may determine the sound localization point of the virtual sound source in the edge area of the extent sound source 300. The edge area will be described further with reference to
An extent sound source 400 of
In
Specifically, referring to
The processing apparatus may render an audio signal based on a frequency response of the listener to the virtual sound source positioned in the edge area. For example, in
In
Specifically, referring to
The processing apparatus may render an audio signal based on a frequency response of the listener to the virtual sound source positioned in the edge area. For example, in
In
The processing apparatus may determine positions of virtual sound sources within the extent sound source 400 corresponding to the positions 402 and 403 of the listeners. That is, the processing apparatus may determine sound localization points of the virtual sound sources within the extent sound source 400 corresponding to the positions 402 and 403 of the listeners.
Specifically, when the positions 402 and 403 of the listeners are included in the reference area of the extent sound source 400, the processing apparatus may determine positions (e.g., (−2, 0, 0) when the position of the listener is (−2, 0, 2), and (2, 0, 0) when the position of the listener is (2, 0, 2)) closest to the positions 402 and 403 of the listeners within the extent sound source 400 as the positions of the virtual sound sources.
That is, when the positions 402 and 403 of the listeners are included in the reference area of the extent sound source 400, the processing apparatus may determine the points closest to the positions 402 and 403 of the listeners on the plane corresponding to the extent sound source 400 as the sound localization points of the virtual sound sources.
The processing apparatus may render the audio signal based on frequency responses of the listeners to the virtual sound sources positioned in front of the listeners. For example, in
Referring to
When the position of the listener is (a) of
For example, when a position of a listener is (d), (e), (f) or (g) of
For example, when a position of a listener is (h), (i) or (j) of
When the position of the listener is (h) of
In operation 601, a processing apparatus may identify information on a reference area of an extent sound source and information on a position of a listener. The information on the reference area of the extent sound source and the information on the position of the listener may be identified by spatial coordinates.
In operation 602, the processing apparatus may determine whether the position of the listener is included in the reference area of the extent sound source. When the position 301, 302, 303 of the listener is located on a normal of a plane corresponding to the extent sound source, the processing apparatus may determine that the position of the listener is included in the reference area of the extent sound source.
In operation 603, the processing apparatus may determine a position of a virtual sound source corresponding to the position of the listener, when the position of the listener is included in the reference area of the extent sound source. That is, when the position of the listener is included in the reference area of the extent sound source, the processing apparatus may determine a sound localization point within the extent sound source corresponding to the position of the listener.
In operation 604, the processing apparatus may determine a sound localization point of the virtual sound source in an edge area of the extent sound source, when the position of the listener is not included in the reference area of the extent sound source.
The processing apparatus may determine a position closest to the position of the listener within the extent sound source as the position of the virtual sound source. That is, the processing apparatus may determine a point closest to the position of the listener on a plane or a line corresponding to the extent sound source as the position of the virtual sound source.
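For an extent sound source in the form of a line, the same closest-point rule might be sketched as below. The sketch assumes a straight segment between two hypothetical end points `a` and `b`; curved line sources are not covered, and the function name `closest_point_on_segment` is an assumption made for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def closest_point_on_segment(a: Vec3, b: Vec3, listener: Vec3) -> Vec3:
    """Return the point on segment a-b closest to the listener."""
    ab = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    al = (listener[0] - a[0], listener[1] - a[1], listener[2] - a[2])
    length_sq = ab[0] ** 2 + ab[1] ** 2 + ab[2] ** 2
    if length_sq == 0.0:
        return a  # degenerate segment: both end points coincide
    # Normalized position of the projection along the segment.
    t = (al[0] * ab[0] + al[1] * ab[1] + al[2] * ab[2]) / length_sq
    t = min(max(t, 0.0), 1.0)  # clamp to the end points of the segment
    return (a[0] + t * ab[0], a[1] + t * ab[1], a[2] + t * ab[2])


# Assumed line source from (-3, 0, 0) to (3, 0, 0).
a, b = (-3.0, 0.0, 0.0), (3.0, 0.0, 0.0)
print(closest_point_on_segment(a, b, (-2.0, 0.0, 2.0)))
print(closest_point_on_segment(a, b, (5.0, 1.0, 0.0)))
```

A listener beyond either end of the segment is mapped to the nearer end point, which corresponds to placing the virtual sound source at the edge of the line source.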
In operation 605, the processing apparatus may render an audio signal based on the position of the virtual sound source. The processing apparatus may render the audio signal based on a frequency response of the listener to the determined position of the virtual sound source.
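Operations 601 through 605 can be tied together in a short sketch. The following is an illustration under several assumptions rather than the disclosed renderer: the extent sound source is an axis-aligned rectangle centered at the origin in the plane z = 0, the direction is computed in world coordinates without regard to the orientation of the listener, and a simple distance-based gain stands in for the frequency response of the listener, which an actual renderer would apply as a direction-dependent filter. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def localize(listener: Vec3, half_w: float, half_h: float) -> Tuple[Vec3, bool]:
    """Operations 602 to 604: inclusion test and sound localization point."""
    x, y, _z = listener
    inside = abs(x) <= half_w and abs(y) <= half_h
    # Clamping keeps inside positions unchanged and moves outside ones to the edge.
    px = min(max(x, -half_w), half_w)
    py = min(max(y, -half_h), half_h)
    return (px, py, 0.0), inside


def direction_and_distance(listener: Vec3, point: Vec3) -> Tuple[float, float, float]:
    """Azimuth and elevation (degrees) and distance from listener to point."""
    dx, dy, dz = (point[0] - listener[0],
                  point[1] - listener[1],
                  point[2] - listener[2])
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    if dist == 0.0:
        return 0.0, 0.0, 0.0  # listener is on the source itself
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.asin(dz / dist))
    return azimuth, elevation, dist


def render(samples: List[float], listener: Vec3,
           half_w: float, half_h: float) -> Tuple[List[float], Vec3, bool]:
    """Operation 605 with a placeholder gain instead of a frequency response."""
    point, inside = localize(listener, half_w, half_h)
    _az, _el, dist = direction_and_distance(listener, point)
    gain = 1.0 / max(dist, 1.0)  # placeholder: inverse-distance attenuation
    return [gain * s for s in samples], point, inside


out, point, inside = render([1.0, 0.5], (-2.0, 0.0, 2.0), 3.0, 1.0)
print(out, point, inside)  # [0.5, 0.25] (-2.0, 0.0, 0.0) True
```

Only one sound localization point is evaluated per listener position in this sketch, rather than one per point of the extent sound source.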
The components described in the example embodiments may be implemented by hardware components including, for example, at least one digital signal processor (DSP), a processor, a controller, an application-specific integrated circuit (ASIC), a programmable logic element, such as a field programmable gate array (FPGA), other electronic devices, or combinations thereof. At least some of the functions or the processes described in the example embodiments may be implemented by software, and the software may be recorded on a recording medium. The components, the functions, and the processes described in the example embodiments may be implemented by a combination of hardware and software.
The method according to example embodiments may be written as a computer-executable program and may be recorded on various recording media such as magnetic storage media, optical reading media, or digital storage media.
Various techniques described herein may be implemented in digital electronic circuitry, computer hardware, firmware, software, or combinations thereof. The implementations may be achieved as a computer program product, for example, a computer program tangibly embodied in a machine readable storage device (a computer-readable medium) to process the operations of a data processing device, for example, a programmable processor, a computer, or a plurality of computers or to control the operations. A computer program, such as the computer program(s) described above, may be written in any form of a programming language, including compiled or interpreted languages, and may be deployed in any form, including as a stand-alone program or as a module, a component, a subroutine, or other units suitable for use in a computing environment. A computer program may be deployed to be processed on one computer or multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
Processors suitable for processing of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory, or both. Elements of a computer may include at least one processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also may include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. Examples of information carriers suitable for embodying computer program instructions and data include semiconductor memory devices; magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as compact disk read only memory (CD-ROM) or digital video disks (DVDs); magneto-optical media such as floptical disks; and read-only memory (ROM), random-access memory (RAM), flash memory, erasable programmable ROM (EPROM), or electrically erasable programmable ROM (EEPROM). The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.
In addition, non-transitory computer-readable media may be any available media that may be accessed by a computer and may include both computer storage media and transmission media.
Although the present specification includes details of a plurality of specific example embodiments, the details should not be construed as limiting any invention or a scope that can be claimed, but rather should be construed as being descriptions of features that may be peculiar to specific example embodiments of specific inventions. Specific features described in the present specification in the context of individual example embodiments may be combined and implemented in a single example embodiment. On the contrary, various features described in the context of a single embodiment may be implemented in a plurality of example embodiments individually or in any appropriate sub-combination. Furthermore, although features may operate in a specific combination and may be initially depicted as being claimed, one or more features of a claimed combination may be excluded from the combination in some cases, and the claimed combination may be changed into a sub-combination or a modification of the sub-combination.
Likewise, although operations are depicted in a specific order in the drawings, it should not be understood that the operations must be performed in the depicted specific order or sequential order or all the shown operations must be performed in order to obtain a preferred result. In specific cases, multitasking and parallel processing may be advantageous. In addition, it should not be understood that the separation of various device components of the aforementioned example embodiments is required for all the example embodiments, and it should be understood that the aforementioned program components and apparatuses may be integrated into a single software product or packaged into multiple software products.
The example embodiments disclosed in the present specification and the drawings are intended merely to present specific examples in order to aid in understanding of the present disclosure, but are not intended to limit the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications based on the technical spirit of the present disclosure, as well as the disclosed example embodiments, can be made.
Number | Date | Country | Kind |
---|---|---|---|
10-2020-0186524 | Dec 2020 | KR | national |