These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. Embodiments are described below to explain the present invention by referring to the figures.
Here, the decoding system may include a quadrature mirror filter (QMF) 202, a multi-channel synthesizer 204, a binaural synthesizer 206, a filter transformer 208, a first inverse quadrature mirror filter (IQMF) 210, and a second IQMF 212, for example.
The QMF 202 may receive the compressed multi-channel signal, as the mono or stereo signal, e.g., from a multi-channel encoder (not shown), through an input terminal IN 1, and may then transform the mono or stereo signal into the QMF-domain.
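Purely for illustration, and not as part of the described embodiment, this analysis step may be sketched in Python (NumPy/SciPy) as a complex-modulated filterbank. The windowed-sinc prototype, the band count, and the function name qmf_analysis are assumptions of this sketch rather than the actual QMF prototype of the embodiment.

```python
import numpy as np
from scipy.signal import lfilter

def qmf_analysis(x, num_bands=64, proto_len=640):
    """Split a time-domain signal into complex subband (QMF-domain) signals.

    A stand-in complex-modulated analysis filterbank: each band filter is a
    modulated copy of a lowpass prototype, followed by decimation by the
    number of bands.  The windowed-sinc prototype is illustrative only.
    """
    n = np.arange(proto_len)
    proto = np.sinc((n - proto_len / 2) / num_bands) * np.hanning(proto_len)
    subbands = []
    for k in range(num_bands):
        # complex-exponential modulation shifts the lowpass prototype to band k
        h_k = proto * np.exp(1j * np.pi / num_bands * (k + 0.5) * (n - proto_len / 2))
        band = lfilter(h_k, [1.0], x)[::num_bands]   # filter, then decimate
        subbands.append(band)
    return np.array(subbands)                        # shape: (num_bands, time_slots)

# usage sketch: one second of a 48 kHz mono down-mix
mono = np.random.randn(48000)
qmf_mono = qmf_analysis(mono)
```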
The multi-channel synthesizer 204 may then receive spatial cues, e.g., generated during a down-mixing of the original multi-channel signals by staged down-mixing modules of a multi-channel encoder (not shown) into the mono or stereo signal, through an input terminal IN 2. The multi-channel synthesizer 204, thus, up-mixes the QMF domain mono or stereo signal using the spatial cues. Therefore, the multi-channel synthesizer 204 may output the up-mixed left front channel signal, right front channel signal, center front channel signal, left surround channel signal, right surround channel signal, and low frequency effect channel signal (not shown).
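As an illustrative sketch of the up-mixing principle only, one QMF subband of a down-mix may be split into two channel subbands from a channel level difference (CLD) and an inter-channel correlation (ICC) as follows. The function ott_upmix, its gain formulas, and the delay-based decorrelator are assumptions of this sketch, not the exact MPEG Surround processing.

```python
import numpy as np

def ott_upmix(mono_band, cld_db, icc, decorrelated_band=None):
    """Split one QMF subband of a down-mix into two channel subbands.

    A minimal sketch of an OTT-style up-mix driven by a channel level
    difference (CLD, in dB) and an inter-channel correlation (ICC); the
    exact mixing matrices and decorrelators of a real decoder differ.
    """
    mono_band = np.asarray(mono_band)
    if decorrelated_band is None:
        # stand-in decorrelator: a one-slot delay (real decoders use all-pass filters)
        decorrelated_band = np.roll(mono_band, 1)
    r = 10.0 ** (cld_db / 10.0)                 # power ratio between the two outputs
    g1 = np.sqrt(r / (1.0 + r))                 # gain of the first channel
    g2 = np.sqrt(1.0 / (1.0 + r))               # gain of the second channel
    alpha = 0.5 * np.arccos(np.clip(icc, -1.0, 1.0))   # mixing angle set by the ICC
    ch1 = g1 * (np.cos(alpha) * mono_band + np.sin(alpha) * decorrelated_band)
    ch2 = g2 * (np.cos(alpha) * mono_band - np.sin(alpha) * decorrelated_band)
    return ch1, ch2
```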
Here, the filter transformer 208 may receive head related transfer functions (HRTFs), e.g., through an input terminal IN 3 and an input terminal IN 4, and transform the received HRTFs into QMF domain spatial parameters usable by the binaural synthesizer 206 in the QMF domain.
Such operations for transforming the HRTF, represented as values in the time domain, into spatial parameters in the QMF domain by the filter transformer 208 will now be described in greater detail.
In general, the HRTFs used for localizing channel signals making up multi-channel signals are applied in the frequency domain. However, in an embodiment of the present invention, the HRTFs used for localizing channel signals making up the multi-channel signals are used in the QMF domain. Therefore, an operation of transforming the HRTFs for use in the QMF domain is needed.
The filter transformer 208 receives corresponding HRTFs in a direction close to a direction of a sound source (at an acute angle), represented as values in the time domain, e.g., through the input terminal IN 3, and receives corresponding HRTFs in a direction far from the direction of the sound source (at an obtuse angle), represented as values in the time domain, e.g., through the input terminal IN 4. Here, the HRTF is a transfer function used for localizing channel signals in the frequency domain. The HRTF is generated by performing a frequency transformation on a head-related impulse response (HRIR) measured from the sound source at the left or right eardrum in the time domain. Therefore, according to an embodiment of the present invention, the HRIRs representing the HRTFs in the time domain are input through the input terminal IN 3 and the input terminal IN 4. In addition to the HRIR, important information of the HRTF, which represents the acoustic process of transferring a sound source localized in free space to a person's ears, includes the inter-aural time difference (ITD) and the inter-aural level difference (ILD), which represent corresponding spatial properties. Thus, the ITD and the ILD, as parameters showing properties of the HRTF in the time domain, may also be input through the input terminal IN 3 and the input terminal IN 4.
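As a rough illustration of these two parameters, the ITD and ILD of a measured HRIR pair may be estimated as follows; the function itd_ild_from_hrir and its simple estimators are assumptions of this sketch, and measured HRTF databases often provide these values directly.

```python
import numpy as np

def itd_ild_from_hrir(hrir_left, hrir_right, sample_rate):
    """Estimate the ITD and ILD of an HRIR pair (a simple sketch only)."""
    hrir_left = np.asarray(hrir_left, dtype=float)
    hrir_right = np.asarray(hrir_right, dtype=float)
    # ILD: energy ratio between the two ears, expressed in dB
    ild_db = 10.0 * np.log10(np.sum(hrir_left ** 2) / np.sum(hrir_right ** 2))
    # ITD: lag of the cross-correlation peak, converted to seconds
    xcorr = np.correlate(hrir_left, hrir_right, mode="full")
    lag = int(np.argmax(np.abs(xcorr))) - (len(hrir_right) - 1)
    itd_seconds = lag / float(sample_rate)
    return itd_seconds, ild_db
```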
In an embodiment, the filter transformer 208 may be constructed with a one-to-two (OTT) module, for example. Thus, the filter transformer 208 may generate a signal synthesized by down-mixing input signals based on spatial parameters according to a general property of the OTT module. Such an OTT module may, thus, be used for performing binaural cue coding (BCC). Generally, during an encoding operation, when two signals in the time domain are received by an OTT module, the OTT module can output spatial parameters for subsequently reconstructing the two input signals, along with a synthesized time-domain signal. Alternatively, during the decoding operation, the OTT module may receive the corresponding compressed time-domain signal and spatial parameters for reconstructing the compressed time-domain signal in order to output two reconstructed signals in the time domain. More specifically, the filter transformer 208 may output HRTFs synthesized by down-mixing the received first and second parameters, e.g., through an output terminal OUT 1. Further, the filter transformer 208 may output corresponding channel level differences (CLDs) and inter-channel correlations (ICCs), which are spatial parameters used in the QMF domain, through an output terminal OUT 2. Here, the output CLDs and ICCs are values obtained when the filter transformer 208 receives the HRTFs used for localizing the channel signals, represented as values in the time domain, and transforms them into values that perform sound localization in the QMF domain. Therefore, the CLDs and the ICCs may be used as spatial parameters for localizing signals between channels in the QMF domain. Returning to
Here, operations for synthesizing channel signals input to the binaural synthesizer 206 into 2-channel binaural signals will now be described in greater detail.
The binaural synthesizer 206 may include first, second, third, fourth, and fifth decoders 402, 404, 406, 408, and 410, and first and second synthesizers 412 and 414, for example.
The first to fifth decoders 402 to 410 use the aforementioned OTT modules, with different multi-channel signals being input to the decoders 402 to 410. The first and second synthesizers 412 and 414 then each synthesize their respective input signals into a single signal.
First, operations for up-mixing an input signal of the first decoder 402 will be described.
Thus, the first decoder 402 receives the example left front channel signal through the input terminal IN 2 and spatial parameters, e.g., output from the output terminal OUT 2 of the filter transformer 208, through an input terminal IN 1. In this case, the spatial parameter refers to a corresponding CLD and ICC obtained in the filter transformer 208. In this embodiment, the first decoder 402 is thus a binaural cue coding decoder and uses the general property of the OTT module, so that the first decoder 402 up-mixes the left front signal for 2-channel binaural signals using the corresponding CLD and ICC. More specifically, after the first decoder 402 divides the input left front signal into a left component signal and a right component signal, the divided left component signal is output to the first synthesizer 412, and the divided right component signal is output to the second synthesizer 414. The second decoder 404 similarly receives the right front signal, e.g., through an input terminal IN 3, and by performing similar operations as those of the first decoder 402, a left component signal and a right component signal, obtained by up-mixing the input right front signal, are output to the first and second synthesizers 412 and 414, respectively. By performing similar operations as those of the first decoder 402, the third, fourth, and fifth decoders 406, 408, and 410 also similarly divide the input center front channel signal, the left surround channel signal, and the right surround channel signal into left component signals and right component signals so as to be output to the first and second synthesizers 412 and 414. In addition, as the low frequency effect channel signal (not shown) does not have directionality, the low frequency effect channel signal may be added to the first and second synthesizers 412 and 414 without performing decoding operations.
The first synthesizer 412 may then synthesize all input signals, e.g., so as to be output through an output terminal OUT 3. In other words, the generated left component channel signals are synthesized into a single signal and output through the output terminal OUT 3.
The second synthesizer 414 similarly synthesizes all input signals, e.g., so as to be output through an output terminal OUT 4. In other words, the generated right component channel signals are synthesized into a single signal and output through the output terminal OUT 4.
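As an illustrative sketch of the overall binaural synthesis, assuming per-channel, per-band CLDs and ICCs derived from the HRTFs and an OTT-style up-mix function such as the ott_upmix sketch shown earlier, the left and right components may be accumulated as follows; the interface and data layout are hypothetical.

```python
import numpy as np

def binaural_synthesize(channel_bands, clds_db, iccs, upmix):
    """Form the 2-channel binaural pair by summing per-channel components.

    channel_bands: dict mapping a channel name to its QMF subband array
                   (shape: bands x time slots)
    clds_db, iccs: dicts of HRTF-derived per-channel, per-band parameters
    upmix:         an OTT-style function such as the ott_upmix sketch above
    """
    reference = next(iter(channel_bands.values()))
    left = np.zeros(reference.shape, dtype=complex)    # role of the first synthesizer 412
    right = np.zeros(reference.shape, dtype=complex)   # role of the second synthesizer 414
    for name, bands in channel_bands.items():
        for b in range(bands.shape[0]):
            l_comp, r_comp = upmix(bands[b], clds_db[name][b], iccs[name][b])
            left[b] += l_comp
            right[b] += r_comp
    return left, right
```

Consistent with the description above, a low frequency effect channel having no directionality could simply be added to both sums without the up-mix step.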
Returning to
The second IQMF 212 may receive the synthesized right component channel signal, transform the received signal into a time-domain signal, and output the same through an output terminal OUT 6.
Operations for decoding an input compressed multi-channel signal, as a mono or stereo signal, into 2-channel binaural signals will now be described.
In operation 502, the input compressed signal may be received, e.g., by the QMF 202. In operation 504, the received input signal may be transformed into a QMF-domain signal, e.g., again by the QMF 202. Here, the example input compressed signal is a time-domain signal, but in order to output 2-channel binaural signals by synthesizing the corresponding encoded multi-channel signals, operations for transforming the input signal into the QMF-domain signal may, thus, be needed.
In operation 506, the transformed QMF-domain signal may be up-mixed, e.g., by the multi-channel synthesizer 204, to respective multi-channel signals. In this case, as an example, a left front channel signal, right front channel signal, center front channel signal, left surround channel signal, right surround channel signal, low frequency effect channel signal, or the like may be decoded.
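As a further illustration only, such a multi-channel up-mix may be organized as a cascade of OTT-style boxes. The tree topology, node names, and cue layout below are hypothetical and merely sketch the idea, with upmix standing for an OTT-style function such as the ott_upmix sketch shown earlier.

```python
import numpy as np

def upmix_mono_to_5_1(mono_bands, cues, upmix):
    """Up-mix a mono down-mix (QMF domain, shape bands x slots) into six
    channel signals with a cascade of OTT-style boxes.

    `cues[node]` is assumed to hold per-band (CLD, ICC) arrays; the tree
    below is a hypothetical topology for illustration only.
    """
    def split(node, bands):
        clds, iccs = cues[node]
        out1 = np.zeros_like(bands)
        out2 = np.zeros_like(bands)
        for b in range(bands.shape[0]):
            out1[b], out2[b] = upmix(bands[b], clds[b], iccs[b])
        return out1, out2

    front, rear = split("front_vs_rear", mono_bands)
    lr_front, c_lfe = split("lr_vs_center", front)
    left_front, right_front = split("left_vs_right_front", lr_front)
    center, lfe = split("center_vs_lfe", c_lfe)
    left_surround, right_surround = split("left_vs_right_surround", rear)
    return {"Lf": left_front, "Rf": right_front, "C": center,
            "LFE": lfe, "Ls": left_surround, "Rs": right_surround}
```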
In operation 508, in order to up-mix the respective multi-channel signals to 2-channel signals in the QMF domain, the needed spatial cues may be extracted from the time-domain HRTFs, e.g., by the filter transformer 208. As noted above, as the filter transformer 208 uses OTT modules, the input signal may have to be a signal transformed into the QMF domain. Therefore, an HRIR transformed into the QMF domain is used as an input HRTF. In this case, respective CLDs and ICCs may be extracted from the input HRIR, as illustrated in the sketch below.
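As a rough sketch of this extraction, per-band CLDs and ICCs may be derived from a pair of HRIRs as follows. A plain FFT grouped into uniform bands stands in for a true QMF analysis, and the function name hrir_to_qmf_parameters and the small regularization constant are assumptions of this sketch.

```python
import numpy as np

def hrir_to_qmf_parameters(hrir_near, hrir_far, num_bands=64):
    """Derive per-band CLDs and ICCs from a pair of HRIRs.

    Approximates the role of the filter transformer 208; the HRIRs are
    zero-padded or cropped to the FFT length, and FFT bins are grouped
    into uniform bands as a stand-in for a QMF analysis.
    """
    n_fft = 4 * num_bands
    h_near = np.fft.rfft(hrir_near, n_fft)
    h_far = np.fft.rfft(hrir_far, n_fft)
    bins_per_band = len(h_near) // num_bands
    cld_db = np.zeros(num_bands)
    icc = np.zeros(num_bands)
    for b in range(num_bands):
        sl = slice(b * bins_per_band, (b + 1) * bins_per_band)
        p_near = np.sum(np.abs(h_near[sl]) ** 2) + 1e-12
        p_far = np.sum(np.abs(h_far[sl]) ** 2) + 1e-12
        cld_db[b] = 10.0 * np.log10(p_near / p_far)       # per-band level difference
        icc[b] = np.real(np.sum(h_near[sl] * np.conj(h_far[sl]))) / np.sqrt(p_near * p_far)
    return cld_db, icc
```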
In operation 510, the respective multi-channel signals may be up-mixed to the 2-channel signals by using the respective CLDs and the ICCs, e.g., by the binaural synthesizer 206. More specifically, as an example, the binaural synthesizer 206 may up-mix the left front channel signal, the right front channel signal, the center front channel signal, the left surround channel signal, and the right surround channel signal to 2-channel signals, respectively, by using the respective CLDs and ICCs. In one embodiment, as the low frequency effect channel signal does not have directionality, such operations may not be performed on the low frequency effect channel signal.
In operation 512, the 2-channel binaural signals may be generated by synthesizing the respective channel signals into the 2-channel signals. More specifically, by performing operation 510, the respective channel signals are up-mixed as left and right component signals, with the left component signal being synthesized from the respective channels and the right component signal being synthesized from the respective channels, thereby generating the 2-channel binaural signals.
In operation 514, the generated signals are then transformed into time-domain signals. Here, as the resultant 2-channel binaural signals generated in operation 512 may be in the QMF-domain, operations for transforming the generated signals into time domain signals may then be implemented.
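As an illustrative counterpart of the analysis sketch shown earlier, this final step may be sketched as follows; the prototype is again a stand-in and the scaling is only approximate, rather than the actual IQMF of the embodiment.

```python
import numpy as np
from scipy.signal import lfilter

def qmf_synthesis(subbands, proto_len=640):
    """Transform complex subband signals back into a real time-domain signal.

    Counterpart of the qmf_analysis sketch: each subband is zero-stuffed,
    filtered with the matching modulated prototype, summed, and the real
    part is kept.  Gain compensation is approximate.
    """
    num_bands, num_slots = subbands.shape
    n = np.arange(proto_len)
    proto = np.sinc((n - proto_len / 2) / num_bands) * np.hanning(proto_len)
    out = np.zeros(num_slots * num_bands)
    for k in range(num_bands):
        up = np.zeros(num_slots * num_bands, dtype=complex)
        up[::num_bands] = subbands[k]                 # zero-stuff by the band count
        g_k = proto * np.exp(1j * np.pi / num_bands * (k + 0.5) * (n - proto_len / 2))
        out += np.real(lfilter(g_k, [1.0], up))       # band synthesis filter, then sum
    return out * (2.0 / num_bands)                    # rough scaling only
```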
According to a decoding method, medium, and system decoding an input compressed multi-channel signal, as a mono or stereo signal, into 2-channel binaural signals, of an embodiment of the present invention, an operation of reconstructing multi-channel signals from the input compressed signal and a binaural processing operation of outputting 2-channel binaural signals may be performed simultaneously. Therefore, decoding is simplified. Further, such a binaural processing operation can be performed in the QMF domain. Therefore, secondary operations of transforming decoded multi-channel signals into the frequency domain for application of HRTF parameters in the frequency domain, as in the conventional binaural process, are not needed. Lastly, the operation of reconstructing multi-channel signals from an input signal and the binaural processing operation can be performed by one device, such that an additional designated chip for such binaural processing is not required. Therefore, spatial audio can be reproduced by using a small amount of hardware resources.
Accordingly, as an example, spatial audio can be reproduced by a mobile audio system/device with limited hardware resources and without deterioration. In addition, a digital television (DTV) having a greater amount of hardware resources than the mobile audio device can still reproduce high-quality audio using previously allocated hardware resources, if selectively desired.
In addition to the above described embodiments, embodiments of the present invention can also be implemented through computer readable code/instructions in/on a medium, e.g., a computer readable medium, to control at least one processing element to implement any above described embodiment. The medium can correspond to any medium/media permitting the storing and/or transmission of the computer readable code.
The computer readable code can be recorded/transferred on a medium in a variety of ways, with examples of the medium including magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and storage/transmission media such as carrier waves, as well as through the Internet, for example. Here, the medium may further be a signal, such as a resultant signal or bitstream, according to embodiments of the present invention. The media may also be a distributed network, so that the computer readable code is stored/transferred and executed in a distributed fashion. Still further, as only an example, the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.
Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.