These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. Embodiments are described below to explain the present invention by referring to the figures.
As shown in
The time/frequency transformer 202 may receive an input signal obtained by compressing multi-channel signals into a mono or a stereo signal through an input terminal IN 1, for example, and transform the input signal into a frequency-domain signal.
The channel level analyzer 204 analyzes information on channel level differences (CLDs), e.g., input through an input terminal IN 2, in order to obtain a full band channel level (FBCL) for each channel in the multi-channel system. Here, the FBCL of a channel is a representative energy level from among the energy levels of the bands within that channel, and is constant across the full band.
Referring to
Here, according to an embodiment of the present invention, CLDs having different values according to sub-bands of each channel in the multi-channel system in the frequency domain may be adjusted to a representative energy level across the full band, i.e., all bands. The channel level analyzer 204 filters the different CLDs according to bands in the frequency domain in order to obtain a constant energy level across the full band of each channel through a predetermined calculation as shown in
Equation 1: FBCL(i) = K(i,j) · A(j)
Here, A denotes a weighted value for a band, K denotes a channel level difference, i denotes a channel number, and j denotes a band number.
As shown in Equation 1, the FBCL can be calculated by multiplying a channel level difference by a weighted band level in the frequency domain.
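As an illustration only, the calculation of Equation 1 might be sketched in Python as follows. The sketch assumes that the product K(i,j)·A(j) is accumulated over the band index j, which Equation 1 as printed does not state explicitly; the function name and the numeric values are hypothetical.

```python
def full_band_channel_levels(cld, weights):
    """Collapse per-band levels K(i, j) into one level per channel.

    cld[i][j]  : level of channel i in band j (K in Equation 1)
    weights[j] : weighted value for band j (A in Equation 1)
    Assumption: the product in Equation 1 is summed over the band index j.
    """
    fbcl = []
    for channel_levels in cld:
        if len(channel_levels) != len(weights):
            raise ValueError("each channel needs one level per band weight")
        fbcl.append(sum(k * a for k, a in zip(channel_levels, weights)))
    return fbcl


# Hypothetical values: 2 channels, 3 bands, uniform weights that sum to 1.
cld = [[0.9, 0.6, 0.3], [0.2, 0.4, 0.6]]
weights = [1 / 3, 1 / 3, 1 / 3]
fbcl = full_band_channel_levels(cld, weights)
```

With uniform weights the result reduces to the mean level of each channel; other weightings would emphasize particular bands.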
The FBCL, e.g., calculated in the channel level analyzer 204, may be set as the gain value of the HRTF in the HRTF adjusting unit 206. More specifically, in order to set the FBCL as the gain value of the HRTF, the HRTF can be multiplied by the FBCL in order to adjust the HRTF. In this case, since the FBCLs have different values depending on the channel, the HRTFs are also adjusted to have different values according to the respective channels. As noted above, the HRTFs may model a sonic process of transferring a sound source localized in free space to a person's ears, and include important information for detecting the position of the sound source from the perspective of the person, including information representing the perceived direction of the received sound. The HRTFs may take into account inter-aural time differences, inter-aural level differences, and a shape of an auricle, for example, and may include a lot of information about the properties of a space in which the sound is transferred.
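The gain adjustment described above can be sketched, for example, as a scalar multiplication of a frequency-domain HRTF by the FBCL of the corresponding channel. The representation of the HRTF as a list of complex frequency bins, the function name, and the sample values are assumptions made for illustration.

```python
def adjust_hrtf_gain(hrtf, fbcl):
    """Scale every frequency bin of one channel's HRTF by that channel's FBCL.

    hrtf : list of complex frequency-domain samples (assumed representation)
    fbcl : real, non-negative full band channel level for the channel
    """
    return [fbcl * h for h in hrtf]


# Hypothetical 3-bin HRTF and an FBCL of 0.5.
hrtf_left = [1 + 0j, 0.5 + 0.5j, 0 - 0.25j]
adjusted = adjust_hrtf_gain(hrtf_left, 0.5)
```

Because the FBCL is a real scalar, only the magnitude of each bin is scaled; the phase of the HRTF, and hence its directional information, is left unchanged.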
The 2-channel synthesizer 208 may localize data of each channel included in the input signal, transformed into the frequency-domain signal by the time/frequency transformer 202, in directions corresponding to respective channels by using a first HRTF in which a gain value has been set and a second HRTF in which a gain value has been set. More specifically, according to an embodiment of the present invention, in order to localize the data of each channel included in the input signal in directions corresponding to the channel based on the FBCLs of the channels calculated in the channel level analyzer 204, the HRTFs are used.
As noted above, the FBCLs typically have different values depending on the respective channels. Therefore, the 2-channel synthesizer 208 may use HRTFs that have been adjusted to have different gain values for each respective channel. Accordingly, when the data of each channel included in the input signal is localized in directions corresponding to each respective channel, the localized data of each channel can be output in proportion to the defined gain values, so that the data of the channels can be heard separately. However, since a constant gain value is used across the full band, the separation cannot vary from band to band, and the separation effect according to bands may be limited.
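One possible reading of the 2-channel synthesis is sketched below in Python. The sketch assumes a mono downmix already transformed into frequency bins, and one left-ear and one right-ear HRTF per channel; these data layouts, the function name, and the numeric values are hypothetical and are not taken from the embodiment.

```python
def synthesize_two_channels(downmix, hrtf_pairs, gains):
    """Localize each channel with its gain-adjusted HRTF pair and sum the results.

    downmix    : list of complex frequency bins of the mono input signal
    hrtf_pairs : one (left_hrtf, right_hrtf) tuple per channel, same bin count
    gains      : one FBCL per channel, used as the HRTF gain value
    """
    n = len(downmix)
    left = [0j] * n
    right = [0j] * n
    for (h_left, h_right), gain in zip(hrtf_pairs, gains):
        for k in range(n):
            left[k] += downmix[k] * gain * h_left[k]
            right[k] += downmix[k] * gain * h_right[k]
    return left, right


# Hypothetical 2-bin example with two channels placed on opposite sides.
downmix = [1 + 0j, 2 + 0j]
hrtf_pairs = [([1, 1], [0.5, 0.5]), ([0.5, 0.5], [1, 1])]
gains = [0.8, 0.2]
left, right = synthesize_two_channels(downmix, hrtf_pairs, gains)
```

In this example the channel with the greater gain dominates the ear on its own side, and the same gain applies to every bin, which reflects the band-independent separation noted above.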
The first frequency/time transformer 210 may receive a left signal from among signals output from the 2-channel synthesizer 208, e.g., from the first head related transfer function, so it can transform the left signal into a time-domain signal, e.g., to be output through an output terminal OUT 1.
The second frequency/time transformer 212 may receive a right signal from among signals output from the 2-channel synthesizer 208, e.g., from the second head related transfer function, so it can transform the right signal into a time-domain signal, e.g., to be output through an output terminal OUT 2.
An input signal, obtained by compressing multi-channel signals into a mono or stereo signal, may be received, e.g., by the time/frequency transformer 202 through an input terminal IN 1, in operation 400.
The input signal may then be transformed into a frequency-domain signal, e.g., by the time/frequency transformer 202, in operation 402.
Information on channel level differences (CLDs) may further be received, e.g., by the channel level analyzer 204, from among spatial cues that are generated when the multi-channel signals were initially compressed into the mono or stereo signal and can be used to reconstruct the input signal.
The received CLDs may then be analyzed, e.g., by the channel level analyzer 204, in order to obtain an FBCL for each channel.
In one embodiment, the aforementioned Equation 1 may be used to obtain the FBCL.
In a further embodiment, the obtained FBCL has a constant energy level across the full band as shown in
The obtained FBCL may be set to a gain value of a HRTF, e.g., by the HRTF adjusting unit 206, in operation 408. In this case, since only the gain value is adjusted in a measured HRTF, only the output magnitude of the HRTF changes and the HRTF itself is not modified.
The FBCLs obtained in the channel level analyzer 204 have different values depending on the respective channels, so that a signal output from a channel having a greater gain value is louder than other signals. More specifically, data of the channels included in the input signal are localized in directions corresponding to the respective channels based on the FBCLs that are set to the gain values. Here, in effect, the FBCLs serve as a filter.
The HRTFs having different gain values depending on the respective channels may be used, e.g., by the 2-channel synthesizer 208, to localize the data of each channel in directions corresponding to the channel, to be synthesized as 2-channel signals. In this case, the synthesized signals are divided into a left signal component and a right signal component.
Thus, the left and right signal components, e.g., output from the 2-channel synthesizer 208, may be transformed into time-domain signals, e.g., by the first and second frequency/time transformers 210 and 212 to be output through the example output terminals OUT 1 and OUT 2, respectively, in operation 412.
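The overall flow of this embodiment might be sketched end to end as follows. This is a minimal sketch under several assumptions: a single frame of a mono input, a naive discrete Fourier transform standing in for the time/frequency transformer, FBCLs obtained by summing K(i,j)·A(j) over the bands, and frequency-domain HRTFs given per channel. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (stand-in for the time/frequency transformer)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse transform returning the real part (stand-in for the frequency/time transformers)."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def decode_binaural(frame, cld, weights, hrtf_pairs):
    spectrum = dft(frame)                                  # operation 402
    # One FBCL per channel (assumed sum over bands), used as the HRTF gain (operation 408).
    gains = [sum(k * a for k, a in zip(levels, weights)) for levels in cld]
    left = [0j] * len(spectrum)
    right = [0j] * len(spectrum)
    for (h_left, h_right), gain in zip(hrtf_pairs, gains):  # 2-channel synthesis
        for k, x in enumerate(spectrum):
            left[k] += x * gain * h_left[k]
            right[k] += x * gain * h_right[k]
    return idft(left), idft(right)                          # operation 412


# Hypothetical single-channel example with flat HRTFs.
frame = [1.0, 0.0, -1.0, 0.0]
out_left, out_right = decode_binaural(
    frame, cld=[[1.0, 1.0]], weights=[0.5, 0.5],
    hrtf_pairs=[([1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5])],
)
```

Multiplying spectra bin by bin corresponds to circular convolution in the time domain, so a practical implementation would typically add windowing and overlap handling, which this sketch omits.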
Here, the decoding device may include a time/frequency transformer 502, a sub-band channel level analyzer 504, an equalized head related transfer function (eHRTF) generator 506, a 2-channel synthesizer 508, a first frequency/time transformer 510, and a second frequency/time transformer 512, for example.
The time/frequency transformer 502 may receive an input signal, e.g., obtained by compressing multi-channel signals into a mono or stereo signal, through an example input terminal IN1 in order to transform the input signal into a frequency-domain signal.
The sub-band channel level analyzer 504 may then calculate a sub-band channel level (SBCL) for each channel in the multi-channel system by using information on channel level differences (CLDs) input through an example input terminal IN 2. More specifically, the sub-band channel level analyzer 504 may adjust the CLDs having different levels according to respective bands in a respective channel so as to calculate a SBCL based on the CLDs according to the sub-bands shown in
In this case, the below Equation 2 may be used to obtain the SBCLs, for example.
Equation 2: SBCL(i,k) = K(i,j) · B(j,k)
Here, K denotes a channel level difference (CLD) in the frequency domain, B denotes an interpolation coefficient of a respective band, i denotes a respective channel number, j denotes the respective band number, and k denotes the respective frequency number.
As shown in Equation 2, the SBCL may be calculated by multiplying a CLD by an interpolation coefficient of each band in the frequency domain, so that continuous energy levels across the full band are calculated.
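As an illustration of Equation 2, the following Python sketch interpolates per-band levels into a level for every frequency index k. It assumes that the product K(i,j)·B(j,k) is summed over the band index j, and it uses linear interpolation between band centers as one hypothetical choice of the interpolation coefficients B; the embodiment does not specify how B is constructed.

```python
def interpolation_matrix(band_centers, num_bins):
    """B[j][k]: linear-interpolation weight of band j at frequency index k.

    band_centers must be strictly increasing bin indices (hypothetical layout).
    """
    num_bands = len(band_centers)
    b = [[0.0] * num_bins for _ in range(num_bands)]
    for k in range(num_bins):
        if k <= band_centers[0]:
            b[0][k] = 1.0                      # hold the first band below its center
        elif k >= band_centers[-1]:
            b[-1][k] = 1.0                     # hold the last band above its center
        else:
            for j in range(num_bands - 1):
                lo, hi = band_centers[j], band_centers[j + 1]
                if lo <= k < hi:
                    t = (k - lo) / (hi - lo)
                    b[j][k] = 1.0 - t
                    b[j + 1][k] = t
                    break
    return b


def sub_band_channel_levels(cld, b):
    """SBCL(i, k) as the sum over j of K(i, j) * B(j, k) (assumed summation)."""
    num_bins = len(b[0])
    return [[sum(levels[j] * b[j][k] for j in range(len(b))) for k in range(num_bins)]
            for levels in cld]


# Hypothetical example: one channel, two bands centered at bins 0 and 4.
b = interpolation_matrix([0, 4], 5)
sbcl = sub_band_channel_levels([[1.0, 0.0]], b)
```

Here the level falls smoothly from the first band to the second instead of stepping at a band edge, which is the continuity across the full band that Equation 2 is described as providing.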
The eHRTF generator 506 may synthesize the SBCL, obtained in the sub-band channel level analyzer 504, and the HRTF, input through the input terminal IN3, for example, so as to generate an eHRTF. In this embodiment, the eHRTFs represent HRTFs using CLDs between the channels according to bands in the frequency domain. Equation 3 below may be used as a method of generating the eHRTF, for example.
Here, SBCL denotes a sub-band channel level, HRTFi(i) and HRTFc(i) denote a pair of HRTFs in a direction of a channel, HRTFi(i) denotes a HRTF in a direction close to a direction of a sound source, HRTFc(i) denotes a HRTF in a direction far from a direction of the sound source, i denotes a channel number, and j denotes a band number.
The 2-channel synthesizer 508 may use the eHRTFs to localize data of each channel included in the input signal in directions corresponding to the respective channels. The eHRTFs use the CLDs between the channels according to bands in the frequency domain. Therefore, when the data of each channel is localized in directions corresponding to the channels, the localized data of each channel can be generated based on energy levels of the respective channels according to the respective bands. Accordingly, the data of the respective channels can be listened to separately depending on the respective bands. Therefore, unlike the embodiment shown in
The first frequency/time transformer 510 may receive a left signal from among signals output from the 2-channel synthesizer 508, e.g., from the first equalized head related transfer function, in order to transform the left signal into a time-domain signal, e.g., that may be output through an output terminal OUT1.
The second frequency/time transformer 512 may receive a right signal from among signals output from the 2-channel synthesizer 508, e.g., from the second equalized head related transfer function, in order to transform the right signal into a time-domain signal, e.g., that may be output through an output terminal OUT2.
Here, an input signal, obtained by compressing multi-channel signals into a mono or stereo signal, may be received, e.g., by the time/frequency transformer 502 through an input terminal IN 1, in operation 600.
The input signal may further be transformed into a frequency-domain signal, e.g., by the time/frequency transformer 502, in operation 602.
Information on CLDs from spatial cues, which are generated when the multi-channel signals were initially compressed into the mono or stereo signal, may also be received, e.g., by the sub-band channel level analyzer 504 and input through an input terminal IN2, and used to reconstruct the signal, in operation 604.
The received CLDs may then be analyzed, e.g., by the sub-band channel level analyzer 504, to obtain a SBCL for each channel. For this, the aforementioned Equation 2 may be used to obtain the SBCLs.
As an example, the obtained SBCLs may be represented as continuous energy levels across the full band based on the CLDs according to the respective bands as shown in
HRTFs, e.g., input through an input terminal IN3, and the SBCLs may be synthesized, e.g., by the eHRTF generator 506, in operation 608. In this case, in another embodiment of the present invention, the aforementioned Equation 3 may be used to generate the eHRTFs using the CLDs according to the respective bands.
The eHRTFs may be used to localize the data of each channel in directions corresponding to the respective channels, e.g., by the 2-channel synthesizer 508. In this case, the synthesized signals may then be divided into a left signal component and a right signal component.
Thereafter, as an example, in operation 612, the first and second frequency/time transformers 510 and 512 may transform the left and right signal components into time-domain signals to be output through the aforementioned output terminals OUT1 and OUT2, respectively.
According to a decoding method, medium, and device outputting the multi-channel signals as 2-channel binaural signals, in one or more embodiments of the present invention, there are advantages in at least that, firstly, an operation of reconstructing an input signal, generated previously by compressing multi-channel signals into a mono or stereo signal, and a binaural processing operation of down-mixing the input signal to the 2-channel signals are performed simultaneously. Therefore, coding is simple. Secondly, the conventional operation of reconstructing the input signal in the QMF domain is not needed. Therefore, the number of operations is reduced.
Accordingly, a spatial audio signal can be reproduced by a mobile audio device having limited hardware resources without deterioration. In addition, a desktop video having greater hardware resources than the mobile audio device can also reproduce high-quality audio using a previously allocated hardware resource. Lastly, the multi-channel reconstructing operation and the binaural processing operation can also be performed simultaneously, so that an additional binaural processing dedicated processor is not required. Therefore, spatial audio can be reproduced by using a reduced amount of hardware resources.
Still further, according to an embodiment of the present invention, data of each channel included in an input signal can be localized based on input CLDs based on the respective bands, so that a loss of spatial cues can be minimized. Therefore, the data can be reproduced without sound quality degradation.
In addition to the above described embodiments, embodiments of the present invention can also be implemented through computer readable code/instructions in/on a medium, e.g., a computer readable medium, to control at least one processing element to implement any above described embodiment. The medium can correspond to any medium/media permitting the storing and/or transmission of the computer readable code.
The computer readable code can be recorded/transferred on a medium in a variety of ways, with examples of the medium including magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and storage/transmission media such as carrier waves, as well as through the Internet, for example. Here, the medium may further be a signal, such as a resultant signal or bitstream, according to embodiments of the present invention. The media may also be a distributed network, so that the computer readable code is stored/transferred and executed in a distributed fashion. Still further, as only an example, the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.
Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.
| Number | Date | Country | Kind |
|---|---|---|---|
| 10-2006-0073470 | Aug 2006 | KR | national |