This application is based on and claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2017-0161566, filed on Nov. 29, 2017 in the Korean Intellectual Property Office, the disclosure of which is incorporated by reference herein in its entirety.
The disclosure relates to technology for providing a realistic sound to a user through an audio signal output apparatus or display apparatus with one or more directional or omnidirectional speakers.
As an acoustic system for playing a three-dimensional (3D) sound, a home-theater system has become widespread. In general, such a system with 5.1 or more channels includes loudspeakers for center (C), front left (FL), front right (FR), surround left (SL), surround right (SR), and similar channels, as well as a subwoofer for a low-frequency effects channel.
However, various factors have made it difficult to provide a home-theater system in the home. These factors include space limitations, inconvenience or complexity in cable connection, etc. Further, realistic sound effects are limited without a sound system of home-theater quality.
Taking these problems into account, a sound bar having a combination of speaker units covering one or more frequency ranges, and headphones providing a personalized sound experience, have been developed as alternatives to the home-theater system. To change an auditory image, signals have to be processed in their respective ways, and then output through corresponding loudspeakers. However, it is difficult to comprehensively consider the number of speaker units, the characteristics of each speaker unit, a listening environment, etc., while processing and distributing the signals.
Such an overall procedure of receiving an audio signal, processing the received audio signal, and distributing processed audio signals to the speaker units is referred to as sound rendering. The foregoing alternatives to the home-theater system lack a sufficient number of output channels and thus are subjected to a virtualization technique during the sound rendering. Even when the virtualization technique is applied, its effects may be limited since body information and listening environments vary from one individual user to another.
For example, in a related art display apparatus that provides a multi-channel audio platform, multi-channel loudspeakers are mounted along a front bezel of a display panel, and the loudspeakers distributed in this manner are subjected to gain control to achieve the virtualization. However, the loudspeakers mounted on the front side of the display apparatus restrict the position of an auditory image to the front of the display. Therefore, there is a limit to providing proper acoustic effects due to changes in a listening space, a user's posture, etc.
Furthermore, a customizing technique such as a head-related transfer function (HRTF) may be employed. However, this technique also has a physical limit in providing consistent acoustic effects, a limit caused by various factors such as system specifications, the need for additional customization, etc.
Accordingly, there is a need for technology that processes an audio signal so that the loudspeakers arranged in the audio signal output apparatus or the display apparatus can, on their own, sufficiently provide a realistic sound and a sound field even in an environment in which a home-theater system is difficult to provide.
Provided is a display apparatus that uses one or more omnidirectional loudspeakers mounted to one side and one or more directional loudspeakers mounted to a back side of the display apparatus so as to provide a surround sound and height acoustic effects to a user, thereby providing a realistic sound to the user.
In accordance with an aspect of the disclosure, a separation phenomenon of an auditory image, which is caused by sound waves emanating from directional loudspeakers being reflected in various indoor environments, is decreased, thereby providing a more natural sound to a user.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description or may be learned by practice of the presented embodiments.
In accordance with an aspect of the disclosure, there is provided an apparatus for outputting an audio signal, the apparatus including: a channel processor configured to generate two or more channel signals from audio data; a signal processor configured to render the generated two or more channel signals; and a directional speaker configured to reproduce a rendered channel signal, among the rendered two or more channel signals, as audible sound, wherein the signal processor includes: a frequency converter configured to generate channel signals of a frequency domain by converting the generated two or more channel signals through frequency conversion; and a re-panner configured to change, by as much as an adjustment value for a channel gain, the channel gain of at least one channel signal of the generated channel signals of the frequency domain, and wherein the adjustment value monotonically varies as a frequency of the at least one channel signal of the generated channel signals of the frequency domain increases.
In accordance with an aspect of the disclosure, there is provided a display apparatus including: an external housing including a front side on which a display panel is provided; an audio signal processing device accommodated in the external housing and configured to process and render, for output, two or more channel signals generated from audio data; and directional speakers of two or more channels, provided on at least one of a back side opposite to the front side of the external housing, a top side of the external housing, or a lateral side of the external housing, and configured to convert the rendered two or more channel signals into audible sound and to output the audible sound in predetermined directions, wherein the audio signal processing device includes: a frequency converter configured to generate channel signals of a frequency domain by converting the generated two or more channel signals through frequency conversion; and a re-panner configured to change, by as much as an adjustment value for a channel gain, the channel gain of at least one channel signal of the generated channel signals of the frequency domain, and wherein the adjustment value is at least partially varied based on a frequency of the at least one channel signal of the generated channel signals of the frequency domain.
In accordance with an aspect of the disclosure, there is provided a method of outputting an audio signal, which is performed by at least one processor to reproduce and output an audible sound from audio data, the method including: generating two or more channel signals from the audio data; generating channel signals of a frequency domain by converting the generated two or more channel signals through frequency conversion; changing, by as much as an adjustment value for a channel gain, the channel gain of at least one channel signal of the generated channel signals of the frequency domain; and reproducing, as audible sound, the at least one channel signal having the changed channel gain, wherein the adjustment value monotonically varies as a frequency of the at least one channel signal of the generated channel signals of the frequency domain increases.
In accordance with an aspect of the disclosure, there is provided a non-transitory computer-readable recording medium having recorded thereon a program executable by a computer for performing the method.
In accordance with an aspect of the disclosure, there is provided a signal processor for rendering channel signals of audio data for output by directional speakers, the signal processor including: a frequency converter configured to generate channel signals of a frequency domain by converting two or more channel signals, generated from the audio data, through frequency conversion; and a re-panner configured to change, by as much as an adjustment value for a channel gain, the channel gain of at least one channel signal of the generated channel signals of the frequency domain, wherein the adjustment value monotonically varies as a frequency of the at least one channel signal of the generated channel signals of the frequency domain increases.
The above and other aspects, features, and advantages of certain embodiments of the present disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Below, exemplary embodiments will be described in detail and clearly to such an extent that one of ordinary skill in the art can implement an inventive concept without undue burden or experimentation. Further, it is understood that expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Like numerals refer to like elements throughout.
Below, one or more embodiments will be described with reference to the accompanying drawings.
Further, the media players 7a, 7b, 9a and 9b comprehensively include display apparatuses 7a and 7b capable of reproducing both video content and audio content and audio signal output apparatuses 9a and 9b capable of reproducing audio content but not video content. The display apparatuses 7a and 7b may include a television, but are not limited thereto. For example, the display apparatuses 7a and 7b may include a monitor, a smartphone, a desktop computer, a laptop computer, a tablet computer, a navigation system, a digital signage, and the like that includes a display and a loudspeaker and reproduces video and audio content through the display and the loudspeaker, respectively.
Further, the audio signal output apparatuses 9a and 9b include at least a speaker or an audio output interface (e.g., a 3.5 mm audio terminal, a Bluetooth interface, etc.) for reproducing and outputting the audio content. For example, the audio signal output apparatuses 9a and 9b may include a radio device, an audio device, a phonograph, a voice recognition loudspeaker, a compact disc (CD) player with a loudspeaker, a digital audio player (DAP), an audio system for a vehicle, home appliances with a loudspeaker, and various other devices for outputting audio.
Accordingly, the display apparatus and the audio signal output apparatus according to an embodiment include at least an audio signal processing device for reproducing and rendering an audio signal from a sound source, and a speaker or audio output interface for outputting the rendered audio signal. Further, the display apparatus includes a display and a video player (e.g., image processor, video decoder, etc.) in addition to the audio signal output apparatus. In this regard, it is understood that the audio signal output apparatus according to an embodiment may be not limited to a standalone audio output device, but may include a component mounted to the display apparatus as a part of the display apparatus.
Further, in
Referring to
Meanwhile, the audio signal processing device 50 may further include a channel processor 110 for generating two or more channel signals from a sound source, a signal processor 130 for rendering the two or more generated channel signals for output, and a signal distributor 150 for outputting the rendered signal.
The processor 10 may be dedicated to control of the channel processor 110, the signal processor 130, and the signal distributor 150, or may be provided to control a general operation of the audio signal output apparatus 100 including the memory 11, the wireless communicator 12, the wired communicator 13, and the input interface 14. According to another embodiment, the processor 10 may be integrated into at least one or a part of the channel processor 110, the signal processor 130, and the signal distributor 150.
Moreover, the channel processor 110, the signal processor 130, and the signal distributor 150 may be integrated into one or more functional modules in various other embodiments. For example, the channel processor 110 and the signal processor 130 may be integrated into one signal processing module, or the signal processor 130 and the signal distributor 150 may be integrated into one signal processing module. Further, the channel processor 110, the signal processor 130 and the signal distributor 150 may be all integrated into one signal processing module.
The processor 10 may, for example, include a central processing unit (CPU), a micro controller unit (MCU), a micro processor (MICOM), an electronic control unit (ECU), an application processor (AP), and/or other electronic units capable of performing various calculations and generating various control signals. The processor 10 may be designed to drive or execute a previously defined application (e.g., program, programming instructions, code, application, or “App”), and perform various control operations in response to a user's input to an input interface 14 and/or according to settings.
Further, the sound source may have various formats such as voice, music and sound effects, which can propagate in the form of waves when reproduced. Here, the sound source includes audio data of at least one channel, and may further include metadata containing information about the audio data. For example, the audio data of at least one channel may include audio data of 2 channels, 3 channels, 5 channels, etc., or may further include audio data of 2.1 channels, 5.1 channels, 7.1 channels, etc., with additional audio data to be reproduced by the subwoofer. In addition, the audio data of at least one channel may further include audio data of 5.1.2 channels, 7.1.4 channels, etc., with an additional height loudspeaker channel for height effects. It is understood that the sound source may include audio data defined in various formats that can be taken into account by a designer.
An analog signal output from the signal distributor 150 is emanated by the plurality of sound output devices 30a, 30b and 30n corresponding to the number of supported channels as an audible sound (i.e., a sound wave) that a user can listen to. The plurality of sound output devices 30a, 30b and 30n may output different sounds or one sound under control of the processor 10. The plurality of sound output devices 30a, 30b and 30n may be provided inside the audio signal output apparatus 100, or may independently communicate with the audio signal output apparatus 100. The plurality of sound output devices 30a, 30b and 30n may include a directional loudspeaker that restores the audible sound from the rendered signal and emanates the audible sound in a specific direction, and/or may include an omnidirectional loudspeaker that outputs a sound of a channel signal different from that of the directional loudspeaker. For example, the directional loudspeaker may output surround signals Ls and Rs, and the omnidirectional loudspeaker may be configured to include loudspeakers for outputting front signals L and R. Further, the omnidirectional loudspeaker may also include a loudspeaker and a subwoofer for respectively outputting a center signal C and a woofer signal LFE, which have low directionality like a voice.
According to an embodiment, the processor 10 receives audio data (i.e., a sound source) through a memory 11, a wired/wireless communicator 12/13, and/or the input interface 14, and decodes and converts the audio data into audio data of an uncompressed format. Here, the decoding refers to restoring audio data compressed or encoded by an audio compression format such as MPEG layer-3 (MP3), advanced audio coding (AAC), audio codec-3 (AC-3), digital theater system (DTS), free lossless audio codec (FLAC), Windows Media Audio (WMA), etc., into audio data of an uncompressed or decoded format. Of course, when the sound source has not been compressed or encoded, such a decoding process may be omitted. The restored audio data may include one or more channels. For example, when the sound source is audio data of 5.1 channels, the one or more channels of the restored audio data include six channels L, R, C, LFE, Ls and Rs, where LFE is the additional subwoofer channel. In this case, the processor 10 provides the restored audio data to the channel processor 110, and generates and transmits a control signal for controlling the operations of the channel processor 110, the signal processor 130, and the signal distributor 150.
The channel processor 110 determines whether the provided audio data corresponds to or matches with the number of sound output devices or loudspeaker devices 30a, 30b and 30n, and may perform channel mapping as needed. For example, when the sound source includes audio data having fewer channels than the number of input channels of the channel processor 110, the channel processor 110 performs up-mixing to increase the number of channels of the audio data (i.e., source audio data) and provides the audio data with the increased number of channels to the signal processor 130. On the other hand, when the sound source includes audio data having more channels than the number of loudspeaker devices 30a, 30b and 30n, the channel processor 110 performs down-mixing to decrease the number of channels of the audio data to match with the number of loudspeaker devices 30a, 30b and 30n. Of course, when the number of channels of the sound source is equal to the number of loudspeaker devices 30a, 30b and 30n, the channel processor 110 may not perform any separate up-mixing or down-mixing process.
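The channel-mapping decision described above can be sketched as follows. This is a hedged illustration only: the cyclic duplication used for up-mixing and the folding-plus-normalization used for down-mixing are simplistic placeholders, not the actual mixing matrices of the channel processor 110.

```python
import numpy as np

def map_channels(audio: np.ndarray, n_out: int) -> np.ndarray:
    """audio has shape (n_in, samples). Up-mix, down-mix, or pass the
    buffer through so that the result has n_out channels, mirroring the
    decision made by the channel processor 110."""
    n_in = audio.shape[0]
    if n_in < n_out:
        # Up-mix: reuse the existing channels cyclically for extra outputs.
        return audio[np.arange(n_out) % n_in]
    if n_in > n_out:
        # Down-mix: fold surplus channels onto the available outputs.
        out = np.zeros((n_out, audio.shape[1]))
        for ch in range(n_in):
            out[ch % n_out] += audio[ch]
        return out / np.ceil(n_in / n_out)  # crude level normalization
    return audio  # channel counts already match; no mapping needed
```

For example, a 2-channel buffer mapped to 6 outputs repeats the stereo pair, while a 6-channel buffer mapped to 2 outputs folds the surround and center channels into the stereo pair.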
The signal processor 130 performs a signal process to render the plurality of channel signals, which are received from the channel processor 110, for output, and provides the rendered signal to the signal distributor 150. In particular, the signal processor 130 subjects the plurality of generated channel signals to frequency conversion to thereby generate channel signals of a frequency domain. Then, the signal processor 130 adjusts a channel gain of the channel signals of the frequency domain that belong to an adjustment frequency range, among the generated channel signals of the frequency domain. Here, the signal processor 130 changes the channel gain by as much as an adjustment value. Since the signal processor 130 performs the signal process by considering reflective properties in an indoor space and/or the directionality of the directional loudspeakers 30-1 and 30-2 included in the loudspeaker devices 30a, 30b and 30n, a user may hear a more realistic sound from the audio signal output apparatus 100. More detailed operations performed in the signal processor 130 will be described below with reference to
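A minimal sketch of this frequency-domain gain adjustment is given below, assuming a single channel signal, a real FFT as the frequency conversion, and a linear ramp as the monotonically increasing adjustment value. The band edges `f_lo`/`f_hi` and the ramp shape are illustrative assumptions, not the disclosed parameters of the re-panner.

```python
import numpy as np

def repan(channel: np.ndarray, sample_rate: int,
          f_lo: float = 2200.0, f_hi: float = 10000.0,
          max_boost: float = 0.5) -> np.ndarray:
    """Convert one channel signal to the frequency domain, then change
    the gain of the bins inside the adjustment frequency range
    [f_lo, f_hi] by an adjustment value that increases monotonically
    with frequency, and convert back to the time domain."""
    spectrum = np.fft.rfft(channel)  # frequency conversion
    freqs = np.fft.rfftfreq(channel.size, d=1.0 / sample_rate)
    gain = np.ones_like(freqs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    # Monotonic adjustment value: 0 at f_lo, rising linearly to max_boost at f_hi.
    gain[band] += max_boost * (freqs[band] - f_lo) / (f_hi - f_lo)
    return np.fft.irfft(spectrum * gain, n=channel.size)
```

A tone inside the adjustment band comes out louder than it went in, while a tone below the band is returned unchanged, which is the qualitative behavior the re-panner relies on.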
The channel processor 110 and the signal processor 130 may be physically and/or logically separable from each other. In the case of being physically separated, the channel processor 110 and the signal processor 130 may be materialized or embodied by individual circuits or semiconductor chips, respectively.
The signal distributor 150 may perform the channel mapping on the audio signal rendered in the signal processor 130. Specifically, the signal distributor 150 may distribute the channels of the audio data to the plurality of loudspeaker devices 30a, 30b and 30n and thereby determine the audio data to be output. In this case, the signal distributor 150 may distribute the channels to the plurality of loudspeaker devices 30a, 30b and 30n on the basis of additionally given metadata. By this process, the audio data that each of the plurality of loudspeaker devices 30a, 30b and 30n outputs is determined.
Meanwhile, the signal distributor 150 may further include a digital-to-analog converter (DAC) for converting a digital signal output by the channel mapping into an analog signal, and/or a signal amplifier for amplifying the analog signal. Thus, the signal converted into the analog signal and then subjected to the amplification is transmitted to typical passive loudspeakers and changed into an audible sound. On the other hand, when the loudspeaker devices 30a, 30b and 30n are materialized or embodied by an active loudspeaker with a signal amplifier, when the loudspeakers with the DAC are present, or when a separate audio receiver or amplifier is present, the signal distributor may be provided without the DAC or the amplifier.
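As a hedged sketch of the hand-off from the channel-mapped digital signal to such a converter, the snippet below quantizes a floating-point buffer to 16-bit PCM, a common integer format consumed by a DAC front-end; the bit depth and clipping policy are illustrative choices, not those of the signal distributor 150.

```python
import numpy as np

def to_pcm16(signal: np.ndarray) -> np.ndarray:
    """Clip a float signal to [-1, 1] and quantize it to 16-bit PCM,
    a minimal stand-in for preparing samples for the DAC described
    above (bit depth chosen for illustration)."""
    return (np.clip(signal, -1.0, 1.0) * 32767.0).astype(np.int16)
```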
Referring back to
The memory 11 is configured to temporarily or non-temporarily store the audio data, and transmits the audio data to the processor 10 in response to a call or instruction from the processor 10. Further, the memory 11 may be configured to store various pieces of information for the calculation, process or control operations of the processor 10 in an electronic format. For example, the memory 11 may be configured to store all or a part of various pieces of data, applications, filters, algorithms, instructions, code, etc., for the operations of the processor 10, and provide the same to the processor 10 as needed or instructed. Here, the application may be obtained through an electronic software distribution network accessible by the wireless communicator 12 or the wired communicator 13.
The memory 11 may for example include at least one of a main memory unit and an auxiliary memory unit. The main memory unit may be materialized or embodied by a semiconductor storage medium such as a read-only memory (ROM) and/or a random-access memory (RAM). The ROM may for example include a typical ROM, an erasable and programmable read only memory (EPROM), an electrically erasable and programmable read only memory (EEPROM), a mask ROM, etc. The RAM may for example include a dynamic RAM (DRAM), a static RAM (SRAM), and/or the like. The auxiliary memory unit may be materialized or embodied by at least one of a flash memory unit, a secure digital (SD) card, a solid state drive (SSD), a hard disk drive (HDD), a magnetic drum, optical recording media such as a compact disc (CD), a digital versatile disc (DVD), a laser disc (LD), etc., a magnetic tape, a magnetooptical disc, a floppy disk, and/or a similar storage medium capable of permanently or semi-permanently storing data.
The wireless communicator 12 is provided to communicate with at least one of external server devices 1, 2 and 3 on the basis of a wireless communication network, receives audio data from another terminal device or server device, and transmits the received audio data to the processor 10. The wireless communicator 12 may be materialized or embodied with an antenna, a communication chip, a substrate, and the like for transmitting an electromagnetic wave externally or receiving an electromagnetic wave from an external source.
Further, the wireless communicator 12 may be provided to communicate with at least one of the external server devices 1, 2 and 3 through wireless communication technology, or with at least one of the server devices 1, 2 and 3 through long-distance communication technology, e.g., mobile communication technology.
The wireless communication technology may for example include Bluetooth, Bluetooth Low Energy, a controller area network (CAN), Wi-Fi, Wi-Fi Direct, ultra-wide band (UWB), ZigBee, infrared data association (IrDA), near field communication (NFC), etc. The mobile communication technology may for example include 3GPP-based technologies, WiMAX, long term evolution (LTE), etc.
The wired communicator 13 is provided to communicate with at least one of the external server devices 1, 2 and 3 through a wired communication network, to receive audio data from another terminal device or server device, and to transmit or provide the received audio data to the processor 10. Here, the wired communication network may for example be materialized or embodied by a pair cable, a coaxial cable, an optical fiber cable, an Ethernet cable or the like physical cable.
However, either of the wireless communicator 12 or the wired communicator 13 may be omitted in one or more embodiments. Therefore, the audio signal output apparatus 100 may include the wireless communicator 12 without the wired communicator 13, or may include the wired communicator 13 without the wireless communicator 12. Further, the audio signal output apparatus 100 may include an integrated communicator that supports both the wireless connection using the wireless communicator 12 and the wired connection using the wired communicator 13.
The input interface 14 is connectable to a device provided separately from the audio signal output apparatus 100, for example, an external storage device, receives audio data from another device, and transmits the received audio data to the processor 10. For example, the input interface 14 may be a universal serial bus (USB) terminal, and may also include at least one of various interface terminals such as a high definition multimedia interface (HDMI) terminal, a Thunderbolt terminal, etc.
As shown in
Further, the display apparatus 200 may further include a back-light unit (BLU) for illuminating the display panel 201 as needed or instructed, and the BLU may be provided inside the housing 210. The display panel 201 may include a rigid display panel or a flexible display panel according to various embodiments.
The housing 210 is provided with the display panel 201 exposed at a front side, and directional speakers 30-1 and 30-2 installed at a back side 210h. However, it is understood that the directional speakers 30-1 and 30-2 are not necessarily installed on the rear side of the display panel 201 in one or more other embodiments. Alternatively, the directional loudspeakers may be installed or provided at any position, including at a top side, a lateral side, a bottom side, etc., of the display panel 201, so long as there are some paths in which emanated sound waves are reflected without being directly transferred to a user.
According to one or more embodiments, the housing 210 may be additionally provided with a stand 203 for supporting the display apparatus 200. The stand 203 may be installed or provided at a suitable position to support the display apparatus 200, such as the bottom side, the back side 210h, etc., of the display apparatus 200. When the display apparatus 200 is mounted to a wall, the stand 203 may be omitted.
The directional speakers 30-1 and 30-2 may be installed at certain positions on the back side 210h of the housing 210, and additional speakers 30-3 and 30-4 may be additionally provided at different positions. To install the directional speakers 30-1 and 30-2, accommodating brackets 204-1 and 204-2 may be further provided on the back side 210h of the housing. Furthermore, the additional speakers 30-3 and 30-4 may include directional and/or omnidirectional speakers according to various embodiments. In the following description, the omnidirectional speaker will be described by way of example.
The omnidirectional speakers 30-3 and 30-4 may be materialized using typical speaker devices, which are installed within the housing 210 and emanate an audible sound via a through hole formed in the housing 210 in a frontward or downward direction.
The directional speakers 30-1 and 30-2 may be installed on the back side 210h of the housing 210, but are not limited thereto. Alternatively, the directional speakers may be installed in an upper portion of the back side 210h in order to decrease the thickness of the display apparatus 200. Further, the directional speakers 30-1 and 30-2 may be installed as close to the upper portion of the housing back side 210h as shown in
Further, the directional speakers 30-1 and 30-2 may be installed so that each sound maker 31 (see
As shown in
As shown in
As shown in
According to an embodiment, the emanation holes 32a may be formed or provided to increase in size from the first end of the guide pipe 32 positioned at the sound maker 31 (e.g., driver) to the second end opposite to the first end. This causes more sound to be emanated through the emanation holes 32a positioned close to the second end of the guide pipe 32, thereby increasing the directionality of the sound in a direction corresponding to the lengthwise direction of the guide pipe 32.
The hollow guide pipe 32 has an emanation surface 32b on which the emanation holes 32a are formed and through which a sound is emanated. As described above, when the emanation holes 32a are provided in a row on the emanation surface 32b of the guide pipe 32, a sound propagated through the throat pipe 33 is partially emanated outward through each of the emanation holes 32a while passing through the guide pipe 32.
Because a sound is a pressure wave that propagates through air as a medium, destructive and constructive interference may occur between the sounds emanated, with time lags, through the emanation holes 32a provided in a row in the guide pipe 32. As the sounds interfere with each other, they acquire directionality in a direction corresponding to the lengthwise direction of the guide pipe 32. Therefore, the speakers 30-1 and 30-2 can operate as directional speakers due to the structure of the guide pipe 32 formed with the emanation holes 32a.
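The interference mechanism described above can be illustrated numerically. The sketch below models the emanation holes 32a as a row of point sources fed with the pipe's propagation delay and sums their far-field phases (a delay-and-sum, endfire-array approximation). Equal hole amplitudes and a free-field sound speed inside the pipe are simplifying assumptions, not properties of the disclosed speaker.

```python
import numpy as np

def array_response(n_holes: int, spacing: float, freq: float,
                   angle_deg: float, c: float = 343.0) -> float:
    """Normalized far-field pressure magnitude of n_holes point sources
    in a row.  Each hole radiates later than the previous one by
    spacing / c (the in-pipe travel delay), and its path to a distant
    listener at angle_deg from the pipe axis differs by
    spacing * cos(angle); both phase terms accumulate per hole."""
    k = 2.0 * np.pi * freq / c            # wavenumber
    theta = np.radians(angle_deg)         # 0 deg = along the pipe axis
    phases = np.arange(n_holes) * k * spacing * (1.0 - np.cos(theta))
    return abs(np.exp(1j * phases).sum()) / n_holes
```

On the pipe axis the per-hole phases cancel exactly (feed delay compensates path difference), giving the maximum response, while off-axis the phases disperse and the summed pressure drops, which is the directionality the passage attributes to the guide pipe.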
The sound propagating in the guide pipe 32 emanates through the emanation holes 32a while passing through the guide pipe 32. Therefore, when the guide pipe 32 gradually tapers with a decreasing internal cross-section from the first end toward the second end, a sound emanates from the emanation hole 32a adjacent to the second end of the guide pipe 32 at the same level as those from the other emanation holes 32a, even though the sound pressure gradually decreases while passing through the guide pipe 32.
Further, when the internal cross-section of the guide pipe 32 gradually decreases from the first end toward the second end of the guide pipe 32, most of the sounds propagating in the guide pipe 32 emanate through the emanation holes 32a so that the sound made in the sound maker 31 can more efficiently emanate outward. As such sounds emanating outward through the emanation hole 32a increase, sounds reaching the cap 34 positioned at the second end of the guide pipe 32 decrease. In other words, noise caused when the sound reaching the cap 34 returns toward the sound maker 31 is reduced by decreasing the internal cross-section of the guide pipe 32.
As illustrated, the emanation surface 32b, on which the emanation holes 32a are provided, may be formed at a predetermined acute angle θ relative to the lengthwise direction of the guide pipe 32. Because the emanated sound is guided by the emanation surface 32b, the directionality of the directional speakers 30, 30-1 and 30-2 varies depending on the angle θ between the lengthwise direction of the guide pipe 32 and the emanation surface 32b. Specifically, the directionality of the directional speakers 30, 30-1 and 30-2 increases with the increasing angle θ.
The cap 34 is placed at the open second end of the guide pipe 32 and closes the second end of the guide pipe 32. Further, the inside of the cap 34 facing the second end of the guide pipe 32 is formed with upper and lower surfaces whose widths gradually decrease and intersect in an approximately V-shaped groove. Thus, destructive interference occurs as the sound reaching the cap 34 is reflected from the inside of the cap 34, thereby reducing noise caused when the sound reaching the second end of the guide pipe 32 is reflected back toward the sound maker 31.
In this manner, the emanating characteristics of the directional speakers 30-1 and 30-2 installed on the back of the display apparatus 200 show some physical properties. First, sounds emanating from the directional speakers 30-1 and 30-2 are not directly transmitted to a user due to the display panel 201. Further, the sounds emanating from the directional speakers 30-1 and 30-2 change in directionality as they are reflected from the display panel 201. Further, when general room environments of a user are taken into account, the sounds emanating from the directional speakers 30-1 and 30-2 are reflected from the ceiling and the left and right walls and thus transmitted to a user via multiple paths. With these physical properties in mind, the paths and characteristics of transmitting the sounds emanating from the directional speakers 30-1 and 30-2 to a user will be described in detail.
First, the acoustic characteristics of the omnidirectional speakers 30-3 and 30-4 are shown in
As illustrated in
On the other hand, the acoustic characteristics of the directional speakers 30-1 and 30-2 are shown in
The characteristics shown in
One sound wave CDS3 between the sound waves corresponding to the two paths is a sound wave transmitted with a delay time of about 17˜22 ms, and the other sound wave CDS1 is a sound wave transmitted via a different path with a delay time of about 7˜8 ms. Ultimately, the sound wave CDS2 having a frequency lower than or equal to about 2.2 kHz is transmitted to the microphone as reflected from the ceiling, and the sound wave having a frequency higher than or equal to about 2.3 kHz is transmitted to the microphone as a signal CDS1 reflected from the rear wall or a signal CDS3 reflected from the left and right walls. As such, when the directional speakers 30-1 and 30-2 according to an embodiment are arranged on the back side 210h of the display apparatus 200, the characteristics of transmitting the sound waves to a user are varied depending on the frequencies.
Further, a sound wave of 4˜9 kHz is transmitted to the user 20 via a path R2 as reflected not from the ceiling 21 but from a rear wall 23. In addition, a sound wave of 2.2˜10 kHz is transmitted to the user 20 via a path R3 as reflected from both the ceiling 21 and the lateral walls 22b or via a path R4 as reflected from the right wall 22b. The paths shown in
In this manner, the sound waves emanating from the directional speakers 30-1 and 30-2 are reflected and transmitted over different paths according to their frequencies because of the directionalities of the directional speakers 30-1 and 30-2, the placement of the directional speakers 30-1 and 30-2 on the back of the display apparatus 200, and a room structure such as a ceiling, rear wall, lateral walls, etc. Such environments go against the supposition of a point source, and therefore a realistic sound rendering method according to an embodiment is implemented in consideration of the sound characteristics based on the placement of the directional speakers 30-1 and 30-2 in the display apparatus 200 and the room environments.
Specifically, transmission characteristics (e.g., delay time) that vary according to the frequency bands shown in
Therefore, the emanating characteristics varied depending on the frequency bands are schematized as shown in
The reflection positions 24a and 24b on the lateral walls may differ according to room environments. For example, the reflection positions 24a and 24b may be given within an angle of about 30˜0 degrees toward the lateral directions. That is, an auditory image of a frequency lower than 2.2 kHz is reflected from the ceiling and becomes focused at a position near to the median plane, but an auditory image of a frequency higher than or equal to 2.2 kHz is reflected from the left and right lateral walls and becomes focused at a position rapidly distant from the median plane.
Meanwhile, the sound waves reflected from the rear wall are likely to mix with the sound waves of the omnidirectional speakers 30-3 and 30-4 since they emanate from the display apparatus 200 placed in front of the rear wall. Therefore, the effects of the sound waves emanating from the directional speakers 30-1 and 30-2 and reflected from the rear wall will be ignored in a re-panning process to be described below.
Eventually, an auditory image is not uniform but separated at a specific frequency band (e.g., 2.2 kHz), i.e., a frequency separation phenomenon occurs since propagation and reflection paths are different according to the frequencies. Such a non-uniform auditory image jumps up in some frequency ranges according to frequency changes. This may exert an adverse influence upon sound quality and a 3D-spatial audio effect, and also may increase user fatigue. For example, in a case of a scene where a frequency of a sound increases as time passes (e.g., as a vehicle passes by a user), the user 20 may perceive a very unnatural sound as if an auditory image suddenly and spatially jumps up at a certain frequency. Therefore, a signal process according to an embodiment is implemented to remove such a non-uniform auditory image and increase the size of a specific auditory image.
Referring to
As described above, the adjustment frequency range may be defined by a lower limit frequency and an upper limit frequency. It is understood, however, that one or more other embodiments are not limited thereto. For example, according to another embodiment, the adjustment frequency range may be defined without either the lower limit frequency or the upper limit frequency. In the most extreme case, the full audible frequency range of 0.02˜20 kHz may be set as the adjustment frequency range.
In general, a process of changing a certain position, at which an auditory image (i.e., a virtual source) is formed, by adjusting a channel gain of a plurality of speakers (e.g. left and right speakers for 2 channels) may be referred to as panning adjustment or re-panning. Below, a process of adjusting the channel gain to prevent the auditory image from being separated at a specific frequency as shown in
The signal processor 130 may include a frequency converter 131, a re-panner 140, a room gain controller 133, and an inverse frequency converter 135.
The frequency converter 131 converts two or more channel signals (i.e. multi-channel signals) generated in the channel processor 110 (see, e.g.,
For example, when the DFT is applied to the levels of two channels L and R with respect to an nth audio sample in a time domain, the levels of the two channels L and R may be represented by the following Expression 1.
L(w)=DFT(L[n]), R(w)=DFT(R[n]) [Expression 1]
where n is an audio sample number, w is a frequency band, L[n] is the level of the left channel in the time domain, R[n] is the level of the right channel in the time domain, L(w) is the level of the left channel in the frequency domain, and R(w) is the level of the right channel in the frequency domain.
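The conversion of Expression 1 can be sketched as follows with a naive DFT (in practice the FFT would be used for efficiency; the sample values here are illustrative only, not from the disclosure):

```python
import cmath

def dft(x):
    """Naive DFT: X(w) = sum over n of x[n] * e^(-j*2*pi*w*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * w * n / N) for n in range(N))
            for w in range(N)]

# Illustrative 4-sample stereo frame (hypothetical values).
L_n = [0.0, 1.0, 0.0, -1.0]   # left channel, time domain
R_n = [1.0, 0.0, -1.0, 0.0]   # right channel, time domain

L_w = dft(L_n)   # L(w): left channel, frequency domain
R_w = dft(R_n)   # R(w): right channel, frequency domain
```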
The re-panner 140 changes a channel gain by as much as a corresponding adjustment value with regard to a channel signal in the frequency domain, which belongs to the adjustment frequency range, among the generated channel signals in the frequency domain. In this case, the adjustment value may at least partially vary (or be variably determined) according to the frequencies that the channel signal of the frequency domain has. According to an embodiment, the adjustment value may be set (or determined) to decrease as the frequency that the channel signal of the frequency domain has becomes higher (see
Alternatively, without limitations, the adjustment value may be set to increase as the frequency of the channel signal of the frequency domain becomes higher. In
In this manner, the re-panner 140 may set the adjustment value for the channel signal of the frequency domain, which belongs to the adjustment frequency range, to be subjected to monotonic change as the frequency becomes higher. The monotonic change includes monotonic increase and monotonic decrease. Here, the monotonic increase of the adjustment value refers to a pattern where the adjustment value is constant or increases without a decreasing section as the frequency becomes higher. Likewise, the monotonic decrease of the adjustment value refers to a pattern where the adjustment value is constant or decreases without an increasing section as the frequency becomes higher. As an example pattern of the monotonic change, there is a linear pattern as shown in
As described above with reference to
The adjustment frequency range, to which the re-panning is applied, may be variously set between the lowest frequency (2.2 kHz) and the highest frequency (10 kHz) among the frequencies (2.2˜10 kHz) of the sound emanating at the high-frequency auditory image positions 24a and 24b. Alternatively, and without limitations, the adjustment frequency range may be set to be wider or narrower than the lowest frequency and the highest frequency in accordance with actual listening environments.
The adjustment value according to frequency bands used in the re-panning is applied to each of the left channel signal and the right channel signal among the channel signals of the frequency domain, so that the sum of the channel gain changed for the left channel signal and the channel gain changed for the right channel signal can be kept constant (linear panning), or the sum of their squares can be kept constant (pairwise constant power panning). More detailed operations of the re-panner 140 will be described below with reference to
Referring back to
For example, as shown in
The adjustment of the room gain utilizes a measurement device such as a free-field microphone or a dummy head, and varies depending on a user's position and room environments since the adjustment is based on real-time measurements. In one or more other exemplary embodiments, the adjustment of the room gain may be omitted from the whole signal process.
The levels Lo′[w] and Ro′[w] of two or more channels, which are adjusted by the room gain controller 133, or the levels Lo[w] and Ro[w] of two or more channels, which are output from the re-panner 140 without the room gain controller 133, are provided to the inverse frequency converter 135. The inverse frequency converter 135 applies the inverse frequency conversion to the provided channel signal or the levels of the channel, thereby restoring the channel signal of the time domain. The channel signal of the time domain may be two surround signals Lo[n] and Ro[n] to be output to the directional speakers 30-1 and 30-2. The channel signal to be converted by the inverse frequency converter 135 into that of the time domain may, for example, be the channel signal of the full frequency range including not only frequency components, of which the channel gain is changed by the re-panner 140, but also frequency components of which the channel gain is not changed. As a result, the channel signals Lo[n] and Ro[n] output from the inverse frequency converter 135 are provided to the signal distributor 150 (see
The panning index calculator 141 may calculate a panning index corresponding to a frequency band on the basis of a level ratio between a left channel signal and a right channel signal among channel signals of the frequency domain. According to one or more other embodiments, a coherence component ratio between the left and right channel signals, a cross-spectral density function, an auto-spectral density function, or the like may be employed in defining the panning index.
The panning index has values within a predetermined range, and refers to an index for indicating a position of a virtual sound source, i.e., a position of an auditory image in accordance with a level ratio between the left channel signal and the right channel signal. Conceptually, the panning index refers to an angle for indicating a position of an auditory image between a left channel and a right channel. For example, on the assumption that the panning index has a value ranging between −1 and 1, a sound is output from only the left channel when the panning index is −1, and a sound is output from only the right channel when the panning index is 1. Further, in the present example, the frequency band power of the left channel is equal to the frequency band power of the right channel when the panning index is 0.
According to an embodiment, the panning index calculator 141 calculates a panning index PI[w] based on a level ratio between a left channel signal L[w] and a right channel signal R[w] by the following Expression 2.
where w is a frequency band, r=R[w]/L[w], L[w]2 is a frequency band power of a left channel signal, and R[w]2 is a frequency band power of a right channel signal. Since PI[w] is normalized by dividing a difference between frequency band powers of both of the channels by the sum of frequency band powers, the panning index has a value between −1 and 1. In the Expression 2, the panning index increases as the frequency band power of the right channel signal becomes relatively great. However, this is a matter of notation. Thus, when R[w] and L[w] are exchanged, the panning index may increase as the frequency band power of the left channel signal becomes relatively great.
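A sketch of a panning index consistent with the description of Expression 2 follows; the normalized power-difference form below is an assumption drawn from the surrounding text (the published expression may equivalently be written in terms of r = R[w]/L[w]):

```python
def panning_index(L_w, R_w):
    """Per-band panning index: difference of the frequency band powers over
    their sum, normalized to [-1, 1] (assumed form of Expression 2)."""
    eps = 1e-12  # avoid division by zero on silent bands
    return [(abs(R) ** 2 - abs(L) ** 2) / (abs(L) ** 2 + abs(R) ** 2 + eps)
            for L, R in zip(L_w, R_w)]

# Left-only band -> -1, right-only band -> +1, equal band power -> 0.
PI = panning_index([1.0, 0.0, 1.0], [0.0, 1.0, 1.0])
```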
The mapping section 142 applies a mapping function (f(x)) to the panning index PI calculated in the panning index calculator 141 so that the panning index can be adjusted and then provided to the panning gain calculator 143. According to an embodiment, the mapping function may be omitted at times or in certain implementations. When applied, however, the mapping function has the effect of amplifying or reducing a difference between the left and right channel signals at a specific frequency band w.
Referring back to
The linear panning scheme will be described with reference to
As shown in
The following Table 1 shows an example in which the channel gains GL and GR are calculated by applying such a simple linear panning scheme to the right auditory images 27b and 28b under the condition that the auditory image is bisected as shown in
Here, it will be assumed that the adjustment frequency range is 2.2˜10 kHz as described above, and the gain of the left channel and the gain of the right channel before being subjected to the panning are respectively constant at 0.1 and 0.9 regardless of the frequency.
First, a frequency range lower than or equal to 2.0 kHz does not belong to the adjustment frequency range and the panning is not performed. Therefore, the left channel gain GL and the right channel gain GR are respectively constant at 0.1 and 0.9 at frequencies of 1.0, 1.5 and 2.0 kHz. On the other hand, at a frequency range higher than or equal to 3.0 kHz, the channel gain is controlled to be adjusted, i.e., increased or decreased by as much as the corresponding adjustment value JR by the foregoing linear panning. For example, the adjustment values JR are 0.3, 0.2, 0.1 and 0.0 at frequencies of 3.0, 4.0, 6.0, 8.0 kHz, respectively. At any frequency before and after the adjustment, the sum of the left channel gain GL and the right channel gain GR is constant at 1.
It will be understood that a higher adjustment value is applied as the frequency becomes lower within the adjustment frequency range. In light of the panning scheme, when the decreasing width of the channel gain of the right channel signal and the increasing width of the channel gain of the left channel signal are large, this means that the auditory image at the specific frequency moves from a right channel to a left channel. Therefore, as shown in
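The linear panning example above, with the stated pre-panning gains (GL=0.1, GR=0.9) and adjustment values JR, can be sketched as follows (the per-frequency JR values are those given in the example):

```python
def linear_repan(GL, GR, JR):
    """Simple linear panning: shift the gain JR from the right channel to the
    left channel, moving the auditory image leftward while GL + GR stays constant."""
    return GL + JR, GR - JR

# Adjustment values JR per frequency in kHz; frequencies at or below 2.0 kHz
# fall outside the 2.2-10 kHz adjustment range, so JR is 0 there.
JR_by_freq = {1.0: 0.0, 1.5: 0.0, 2.0: 0.0,
              3.0: 0.3, 4.0: 0.2, 6.0: 0.1, 8.0: 0.0}
gains = {f: linear_repan(0.1, 0.9, jr) for f, jr in JR_by_freq.items()}
# e.g., at 3.0 kHz the gains become GL=0.4, GR=0.6 and still sum to 1.
```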
Next, the pairwise constant power panning scheme will be described with reference to
In
Referring to
In accordance with the panning based on the trigonometric function, when a position of π/4, i.e., 45°, is set as a reference position, as shown in
where the sum of a square of GR[w] and a square of GL[w], which shows the power, is constant at 2. Further, m is a natural number greater than 2, which may be varied depending on the positions of the left and right speakers with respect to a user's position. For example, m is 4 when the left and right speakers are arranged to form an angle of 90° with respect to the user.
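A sketch of pairwise constant power panning satisfying the stated constraint (the sum of the squares of GL[w] and GR[w] constant at 2) follows; the sine/cosine form is a common one and an assumption here, and the speaker-layout parameter m of Expression 3 is not modeled:

```python
import math

def constant_power_gains(theta):
    """Pairwise constant-power panning: theta=0 is fully left, pi/2 fully
    right, pi/4 the center reference position. GL^2 + GR^2 == 2 everywhere."""
    return math.sqrt(2) * math.cos(theta), math.sqrt(2) * math.sin(theta)

GLc, GRc = constant_power_gains(math.pi / 4)   # center: equal gains of 1
```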
As another panning scheme, the VBAP may be used. The foregoing pairwise constant power panning employs the trigonometric function to keep the power constant. Although it is known that a virtual source panned along sine and cosine values generally matches psychological recognition, its theoretical basis has not been clearly provided. To provide this theoretical basis, the VBAP uses vectors to represent a position of a virtual source and positions of speakers, and makes the weighted sum of the speaker vectors equal to the position of the virtual source.
As shown in
In the present example, it is assumed that the head of the user 20 has coordinates (0,0), the vector A has coordinates (ax, ay), and the vector B has coordinates (bx, by). In this case, the coordinates (cx, cy) of the vector C, which represents the position of the virtual source (i.e., the position of the auditory image), are defined by the following Expression 4. Here, GL is a channel gain of a left channel, and GR is a channel gain of a right channel.
C(cx,cy)=GL*A(ax,ay)+GR*B(bx,by) [Expression 4]
Since the vectors A, B and C are all given, it is possible to obtain GL and GR from the Expression 4. GL and GR accurately represent a direction of a certain vector C but are varied in power according to directions. Therefore, normalization is additionally performed as shown in the following Expression 5.
GL′ and GR′ obtained as described above form the vector C moving along an active arc connecting two speakers. According to the VBAP scheme, the panning for the auditory image is achieved independently of the position of the speaker. Even when the positions of the speakers are changed, it is possible to obtain GL and GR by changing only the information about the vectors A and B in the Expression 4.
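The 2-D VBAP computation of Expression 4, followed by a power normalization, can be sketched as follows; since Expression 5 is not reproduced in the text, dividing by sqrt(GL^2 + GR^2) is assumed here as the normalization, and the speaker coordinates are hypothetical:

```python
import math

def vbap_2d(A, B, C):
    """Solve C = GL*A + GR*B (Expression 4) for the gains by inverting the
    2x2 matrix of speaker vectors, then normalize so GL'^2 + GR'^2 == 1."""
    ax, ay = A
    bx, by = B
    cx, cy = C
    det = ax * by - ay * bx          # nonzero when A and B are not collinear
    GL = (cx * by - cy * bx) / det
    GR = (ax * cy - ay * cx) / det
    norm = math.hypot(GL, GR)        # constant-power normalization (assumed)
    return GL / norm, GR / norm

# Hypothetical layout: speakers at +/-45 degrees, virtual source straight ahead.
GLv, GRv = vbap_2d((-1.0, 1.0), (1.0, 1.0), (0.0, 1.0))
```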
Referring back to
Meanwhile, the panning gain calculator 143 may additionally consider a frequency weight to more accurately calculate the panning gain. The frequency weighting section 145 applies the frequency weight to the panning index to reduce a panning effect in a frequency band higher than or equal to a specific frequency, and then provides the panning index, to which the frequency weight is applied, to the panning gain calculator 143. When the characteristics of the directional speaker are taken into account, it may not be suitable to apply the panning effect up to the frequency band higher than or equal to a specific frequency.
For example, a frequency weighting function FW[w] for such a frequency weight may be provided as shown in
In this manner, when the frequency weight FW[w] is provided to the panning gain calculator 143, the panning gain calculator 143 can reflect the frequency weight in obtaining the channel gain. While calculating and obtaining the panning gain, the panning index PI[w] may be replaced by PI′[w] by being multiplied with the frequency weight as shown in the following Expression 6.
PI′[w]=PI[w]*FW[w] [Expression 6]
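Expression 6 can be sketched with a hypothetical weighting function FW[w]; the actual shape of FW in the disclosure's figure is not reproduced, so the function below simply assumes a value of 1 below a cutoff with a linear roll-off to 0, matching the stated goal of reducing the panning effect above a specific frequency:

```python
def frequency_weight(f_khz, cutoff_khz=8.0, top_khz=10.0):
    """Hypothetical FW[w]: 1 below the cutoff, linear roll-off to 0 at the top."""
    if f_khz <= cutoff_khz:
        return 1.0
    if f_khz >= top_khz:
        return 0.0
    return (top_khz - f_khz) / (top_khz - cutoff_khz)

# Expression 6: PI'[w] = PI[w] * FW[w]
PI = [0.5, 0.5, 0.5]                               # panning index at 4, 9, 12 kHz
FW = [frequency_weight(f) for f in (4.0, 9.0, 12.0)]
PI_adj = [pi * fw for pi, fw in zip(PI, FW)]        # panning fades out at high bands
```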
As described above, the signal processor 130 shown in
For example, when a user listens to a sound while watching an image in front of a TV and the sound is a human voice, an auditory image of the voice should be formed in front of the TV. This is because a sound is more naturally provided when a direction of a TV image is matched with a direction of a voice component in the TV image. For this matching, about 70% of the voice component is typically distributed to each of the left channel and the right channel. In this case, components other than a common component, i.e., uncommon components La and Ra, are subjected to various audio effects (e.g., the sound field effect, the panning effect, etc.) and matched with the position of the TV image in order to achieve a realistic sound. In practice, a TV supports various sound modes as audio options to produce such audio effects.
However, when such common components are included in two channel signals and subjected to the panning, the result is unnatural since a human voice is spread leftward and rightward with respect to the median plane. Accordingly, according to another embodiment (or a modification to the embodiment of
Here, the signal processor 230 may include the frequency converter 131, an ambient signal splitter 232, the re-panner 140, the room gain controller 133, the inverse frequency converter 135, and a signal compensator 233. According to one or more other embodiments, at least one of the room gain controller 133, the inverse frequency converter 135, and the signal compensator 233 may be omitted. Here, the configuration and operations of the frequency converter 131, the re-panner 140, the room gain controller 133, and the inverse frequency converter 135 are the same as or similar to those described above with reference to
First, the frequency converter 131 converts signals of two or more channels from the channel processor 110 through frequency conversion, thereby generating a channel signal of a frequency domain.
The ambient signal splitter 232 extracts an ambient signal by removing the common components between the left channel signal and the right channel signal from the channel signal of the frequency domain. To remove the common components, the ambient signal splitter 232 calculates a correlation between the left channel signal and the right channel signal according to the frequency bands.
For example, the correlation is calculated by the following Expression 7.
where GLR[w] is a cross-spectral density between a left channel L and a right channel R, and GLL[w] and GRR[w] are auto-spectral densities of the left channel L and the right channel R, respectively. The correlation CohLR[w] has a value ranging from 0 to 1. The details of the correlation are described in “Random Data” published in 1971 by “J. S. Bendat” et al.
As an alternative method of extracting the common components, similarity may be used instead of the correlation or together with the correlation. The details of the similarity are described in "A Frequency-Domain Approach to Multichannel Upmix" published in 2004 by "C. Avendano" et al.
According to an embodiment, the signal processor 230 may calculate the common component M[w] by the following Expression 8.
where Coh[w] is a correlation in a specific frequency band, and Sim[w] is a similarity in the frequency band. By multiplying Coh[w] and Sim[w], unique components thereof may be involved in the common component M[w]. Alternatively, without limitations, only one of Coh[w] and Sim[w] in the Expression 8 may be employed in various other embodiments.
The ambient signal splitter 232 obtains the common component M[w] by multiplying the product of the correlation and the similarity with an average of the left channel signal L[w] and the right channel signal R[w]. In this manner, when the common component is obtained, the ambient signals La[w] and Ra[w] of the left and right channels may be defined by the following Expression 9.
La[w]=L[w]−M[w]
Ra[w]=R[w]−M[w] [Expression 9]
The ambient signals obtained as above, i.e., La[w] and Ra[w] are input to the re-panner 140. The re-panning performed in the re-panner 140 and the room gain control performed in the room gain controller 133 are the same as or similar to those described above except that the input signals L[w] and R[w] are replaced by the ambient signals La[w] and Ra[w]. Thus, redundant descriptions are omitted below.
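The split described by Expressions 7 to 9 can be sketched as follows. Because Expressions 7 and 8 are not reproduced in the text, the coherence and similarity forms below are the standard ones from the cited Bendat and Avendano references and are assumptions; real implementations would also smooth the spectral densities over time rather than using single bins:

```python
def split_ambient(L_w, R_w):
    """Per-band common/ambient split. Returns (La, Ra, M) lists."""
    eps = 1e-12
    La, Ra, M = [], [], []
    for L, R in zip(L_w, R_w):
        Gll = abs(L) ** 2                     # auto-spectral density, left
        Grr = abs(R) ** 2                     # auto-spectral density, right
        Glr = abs(L * R)                      # |cross-spectral density| (real inputs here)
        coh = Glr ** 2 / (Gll * Grr + eps)    # correlation (assumed Expression 7 form)
        sim = 2 * Glr / (Gll + Grr + eps)     # similarity (Avendano-style, assumed)
        m = coh * sim * (L + R) / 2           # common component (assumed Expression 8 form)
        M.append(m)
        La.append(L - m)                      # Expression 9
        Ra.append(R - m)
    return La, Ra, M

# Identical bands are treated as common; a one-sided band stays ambient.
La, Ra, M = split_ambient([1.0, 1.0], [1.0, 0.0])
```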
Meanwhile, the common component signal M[w] obtained in the ambient signal splitter 232 is input not to the re-panner 140 but to an additional signal compensator 233. The signal compensator 233 applies compensation and various types of filtering to the common component signal.
The inverse frequency converter 135 receives an output from the room gain controller 133, or an output from the re-panner 140 when the room gain control is omitted, and applies the inverse frequency conversion to the output, thereby providing result signals Lao[n] and Rao[n] to the signal distributor 150. The result signals Lao[n] and Rao[n] are converted into audible sounds by the directional speakers 30-1 and 30-2 via the signal distributor 150. Meanwhile, the common signal M′[w] compensated and filtered in the signal compensator 233 is also a signal of the frequency domain, and is therefore subjected to the inverse frequency conversion by the inverse frequency converter 135 and then provided as a signal M[n] of the time domain to the signal distributor 150. Ultimately, the common component signal M[n] is converted into an audible sound through the directional speakers 30-1 and 30-2 or the omnidirectional speakers 30-3 and 30-4.
The elements shown in
Further, each block may depict a part of a module, a segment, or a code, which includes one or more executable instructions for implementing a specific logic function(s). Further, according to one or more other embodiments, the functions mentioned in or described with reference to the blocks may be implemented in any sequence. For example, two blocks illustrated in succession may actually be performed at substantially the same time, or may be performed in reverse order according to their corresponding functions.
Referring to
The frequency converter 131 converts two or more channel signals (i.e., multi-channel signals) generated in the channel processor 110 by time-frequency conversion, thereby generating a channel signal of the frequency domain (operation S82). For such time-frequency conversion, the DFT, the FFT, the DCT, the DST, etc., may be used.
The ambient signal splitter 232 splits a common component between the left channel signal and the right channel signal from the converted channel signal of the frequency domain (operation S83). To extract the common component, the ambient signal splitter 232 calculates a correlation between the left channel signal and the right channel signal according to the frequency bands. The ambient signal splitter 232 generates the ambient signal of two channels by subtracting the common component from each converted channel signal.
The ambient signal is input to the panning index calculator 141. The panning index calculator 141 calculates the panning index according to the frequency bands on the basis of a level ratio between the left and right channel signals of the ambient signal (operation S84).
The mapping section 142 adjusts the panning index by applying the mapping function f(x) to the panning index PI calculated in the panning index calculator 141, and then provides the adjusted panning index to the panning gain calculator 143 (operation S85). Here, the mapping function may amplify or reduce a difference between the left and right channel signals in a specific frequency band (w). In one or more other embodiments, the mapping function may be omitted.
The panning gain calculator 143 calculates a channel gain changed or adjusted for the left channel signal and a channel gain changed or adjusted for the right channel signal by applying a specific panning scheme to the panning index, and provides the changed channel gains to the panning gain controller 144 (operation S86). In this case, the panning gain controller 144 multiplies two channel signals included in the ambient signal with the changed channel gains, and outputs the results (operation S86).
The room gain controller 133 controls the room gain by applying different room gains or parameter EQs according to the frequency bands before applying the inverse frequency conversion to the channel signals as a whole (operation S87). In one or more other embodiments, the room gain control may be omitted.
The inverse frequency converter 135 applies the inverse frequency conversion to the provided channel signal or channel level and thus restores a channel signal of a time domain (operation S88). The channel signal of the time domain is output to the directional speakers 30-1 and 30-2 via the signal distributor 150 (operation S89).
Meanwhile, the common component signal split by the ambient signal splitter 232 is input to the signal compensator 233, and the signal compensator 233 performs compensation and various kinds of filtering on the common component signal (operation S91). Such a compensated and filtered common component signal is subjected to the inverse frequency conversion, and then output to the omnidirectional speakers 30-3 and 30-4 (operation S92), and/or the directional speakers 30-1 and 30-2.
Here, a white noise signal, which has been subjected to bandpass filtering according to frequency bands, is used as a test signal. While changing the auditory image of the test signal from −90 degrees to +90 degrees in the present example, the power change was measured through a dummy head with regard to the left channel and the right channel.
First, referring to
Next, referring to
As described above, the audio signal processing device 50 according to an embodiment, the audio signal output apparatus 100 including the audio signal processing device 50, and the display apparatus 200 including the audio signal output apparatus 100 and the display panel have been described. Further, the directional speakers 30-1 and 30-2 according to an embodiment, to be mounted to the audio signal output apparatus 100 or the display apparatus 200, have been described.
It is understood that the re-panning process in the audio signal processing device 50 illustrated in
A directional speaker 60 of
Further, a directional speaker 70 of
As shown in
According to one or more embodiments, without establishing a traditional home-theater system, the directional speaker and the omnidirectional speaker are properly arranged in the audio signal output apparatus or the display apparatus, and a signal input to the speakers is rendered suitably for the arrangement, thereby sufficiently providing a realistic sound and a sound field within a restricted indoor environment.
Further, the separation phenomenon of the auditory image, which occurs when the directional speakers arranged on the back of the display apparatus are used, is eliminated by the re-panning process, thereby providing a more natural sound and enhanced sound quality to a user.
Although certain embodiments have been shown and described, it will be appreciated by a person having ordinary skill in the art, to which the present disclosure pertains, that alternative embodiments may be made without changing the technical concept or essential features. Therefore, it will be understood that the foregoing embodiments are illustrative and not restrictive in all aspects.
Number | Date | Country | Kind |
---|---|---|---|
10-2017-0161566 | Nov 2017 | KR | national |
Number | Name | Date | Kind |
---|---|---|---|
9172901 | Chabanne et al. | Oct 2015 | B2 |
20080205676 | Merimaa et al. | Aug 2008 | A1 |
20090080666 | Uhle et al. | Mar 2009 | A1 |
20090116652 | Kirkeby et al. | May 2009 | A1 |
20100290630 | Berardi | Nov 2010 | A1 |
20120101605 | Gaalaas | Apr 2012 | A1 |
20120183162 | Chabanne et al. | Jul 2012 | A1 |
20150117686 | Kim et al. | Apr 2015 | A1 |
20160219364 | Seefeldt et al. | Jul 2016 | A1 |
20160353205 | Munch | Dec 2016 | A1 |
20170272881 | Geiger | Sep 2017 | A1 |
20180184227 | Chon et al. | Jun 2018 | A1 |
20200128346 | Noh | Apr 2020 | A1 |
Number | Date | Country |
---|---|---|
3125240 | Feb 2017 | EP |
10-2005-0115801 | Dec 2005 | KR |
10-2007-0066820 | Jun 2007 | KR |
10-2008-0060640 | Jul 2008 | KR |
10-2016-0141765 | Dec 2016 | KR |
2012021713 | Feb 2012 | WO |
2014157975 | Oct 2014 | WO |
2015147619 | Oct 2015 | WO |
Entry |
---|
International Search Report (PCT/ISA/210) & Written Opinion (PCT/ISA/237) dated Mar. 20, 2019 issued by the International Searching Authority in International Application No. PCT/KR2018/014693. |
Communication dated Apr. 17, 2019, issued by the European Patent Office in counterpart European Application No. 18208701.5. |
Communication dated Mar. 18, 2021 by the Intellectual Property Office of India in counterpart Indian Patent Application No. 201947052118. |
Number | Date | Country | |
---|---|---|---|
20190166419 A1 | May 2019 | US |