The present invention relates to quantization of spatial audio parameters and in particular to a concept to allow for a more efficient compression without significantly reducing the perceptual quality of an audio signal reconstructed using the quantized spatial audio parameters.
Multi-channel audio reproduction techniques have recently become increasingly important. In view of an efficient transmission of multi-channel audio signals having 5 or more separate audio channels, several ways of compressing a stereo or multi-channel signal have been developed. Recent approaches for the parametric coding of multi-channel audio signals (parametric stereo (PS), “Binaural Cue Coding” (BCC) etc.) represent a multi-channel audio signal by means of a down-mix signal (which could be monophonic or comprise several channels) and parametric side information, also referred to as “spatial cues”, characterizing its perceived spatial sound stage.
A multi-channel encoding device generally receives, as input, at least two channels, and outputs one or more carrier channels and parametric data. The parametric data is derived such that, in a decoder, an approximation of the original multi-channel signal can be calculated. Normally, the carrier channel (or channels) will include subband samples, spectral coefficients, time domain samples, etc., which provide a comparatively fine representation of the underlying signal, while the parametric data does not include such samples or spectral coefficients but includes control parameters for controlling a certain reconstruction algorithm instead. Such a reconstruction could comprise weighting by multiplication, time shifting, frequency shifting, phase shifting, etc. Thus, the parametric data includes only a comparatively coarse representation of the signal or the associated channel.
The binaural cue coding (BCC) technique is described in a number of publications, such as “Binaural Cue Coding applied to Stereo and Multi-Channel Audio Compression”, C. Faller, F. Baumgarte, AES convention paper 5574, May 2002, Munich, and in the two ICASSP publications “Estimation of auditory spatial cues for binaural cue coding” and “Binaural cue coding: a novel and efficient representation of spatial audio”, both authored by C. Faller and F. Baumgarte, Orlando, Fla., May 2002.
In BCC encoding, a number of audio input channels are converted to a spectral representation using a DFT (Discrete Fourier Transform) based transform with overlapping windows. The resulting uniform spectrum is then divided into non-overlapping partitions. Each partition has a bandwidth proportional to the equivalent rectangular bandwidth (ERB). Then, spatial parameters called ICLD (Inter-Channel Level Difference) and ICTD (Inter-Channel Time Difference) are estimated for each partition. The ICLD parameter describes a level difference between two channels and the ICTD parameter describes the time difference (phase shift) between two signals of different channels. The level differences and the time differences are normally given for each channel with respect to a reference channel. After the derivation of these parameters, the parameters are quantized and finally encoded for transmission.
Although ICLD and ICTD parameters represent the most important sound source localization parameters, a spatial representation using these parameters can be enhanced by introducing additional parameters.
A related technique, called “parametric stereo”, describes the parametric coding of a two-channel stereo signal based on a transmitted mono signal plus parameter side information. There, three types of spatial parameters, referred to as inter-channel intensity differences (IIDs), inter-channel phase differences (IPDs), and inter-channel coherence (IC), are introduced. The extension of the spatial parameter set with a coherence parameter (correlation parameter) enables a parametrization of the perceived spatial “diffuseness” or spatial “compactness” of the sound stage. Parametric stereo is described in more detail in: “Parametric Coding of stereo audio”, J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers (2005), Eurasip J. Applied Signal Proc. 9, pages 1305-1322, in “High-Quality Parametric Spatial Audio Coding at Low Bitrates”, J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers, AES 116th Convention, Preprint 6072, Berlin, May 2004, and in “Low Complexity Parametric Stereo Coding”, E. Schuijers, J. Breebaart, H. Purnhagen, J. Engdegard, AES 116th Convention, Preprint 6073, Berlin, May 2004.
The international publication WO 2004/008805 A1 teaches how a multi-channel audio signal can be advantageously compressed by combining several parametric stereo modules, thus realizing a hierarchical structure to derive a representation of the original multi-channel audio signal comprising a down-mix signal and parametric side information.
Within the BCC and parametric stereo (PS) approach, a representation of the level differences (also called intensity differences ICLD or energy differences IID) between audio channels is a vital part of a parametric representation of a stereophonic/multi-channel audio signal. Such information and other spatial parameters are transmitted from the encoder to the decoder for each time/frequency slot. In view of coding efficiency, it is therefore of high interest to represent these parameters as compactly as possible while preserving audio quality.
In BCC coding, the level differences are represented relative to a so-called “reference channel” and are quantized on a uniform scale in units of dB. This does not optimally exploit the fact that channels with a low level with respect to the reference channel are subject to a significant masking effect when listened to by human listeners. In the extreme case of a channel having no signal at all, the bandwidth used by parameters describing this particular channel is completely wasted. In the more common case, where one channel is much fainter than another channel, that is, where a listener can hardly hear the faint channel during playback, a less precise reproduction of the faint channel would lead to the same perceptual quality for the listener, as the faint signal is mainly masked by the stronger signal.
To explain the situation and the problems arising when encoding a multi-channel signal, reference is made to
When, for example, a simple monologue is recorded, most of the energy would be contained in the center channel 103. In this example, especially the back channels will contain only little (or zero) energy. Therefore, parameters describing the properties of the back channels are merely wasted in this example, since mainly the center channel 103 or the front channels will be active during playback.
Based on
a illustrates a multi channel parameterization for a five channel speaker set-up where the different audio channels are indicated by 101 to 105; a(t) 101 represents signal of the left surround channel, b(t) 102 represents the signal of the left front channel, c(t) 103 represents the signal of the center channel, d(t) 104 represents the signal of the right front channel, e(t) 105 represents the signal of the right surround channel. The speaker set-up is divided into a front part and a back part. The energy distribution between the entire front channel set-up (102, 103 and 104) and the back channels (101 and 105) are illustrated by the arrow in
$\mathrm{LocalEnergy}_{r4} = E[a^2(t)] + E[e^2(t)].$
Where E[.] is the expected value as defined by
b shows a multi-channel audio decoder built by hierarchically ordering parametric stereo modules, as for example described in WO 2004/008805 A1. Here, the audio channels 101 to 105, as introduced in
Thus, after the second step of the hierarchical decoding, the left back channel 101, the right back channel 105, the center channel 103, and a combined channel, being a combination of the front left channel 102 and the front right channel 104, are reconstructed using the transmitted spatial parameters, which comprise a level parameter for use by each of the two-channel decoders 122, 124, and 126.
In the third step of the hierarchical decoding, the fourth two-channel decoder 128 derives the front left channel 102 and the front right channel 104, using a level information transmitted as side information for the fourth two-channel decoder 128. Using a prior art hierarchical decoder as shown in
This is possible, since “leaf” modules are not aware of the global level distribution at a higher tree level (e.g. the “root” module). Each leaf has its own corresponding IID/ICLD parameter, which indicates the energy distribution from its input toward its output channels. For example, the IID/ICLD parameter of leaf “r3” (processed by the first two-channel decoder 122) may indicate that 90% of the incoming energy should be sent to leaf r2, while the remaining energy (10%) should be sent to leaf r4. This process is repeated for each leaf in the tree. Since each energy distribution parameter is represented with limited accuracy, the deviation between the desired and the actual energy of each output channel A to E depends on the quantization errors in the IID/ICLD parameters, as well as on the energy distribution (and hence on the propagation of quantization errors). In other words, as the same quantization table is used for a certain parameter type, e.g. ICC or IID, within all parameterization stages r1 to r4, the IID/ICLD quantization is optimal only locally. This means that for each parameterization stage r1 to r4, the error in output energy of the (local) output channels is maximum for the weakest output channel in prior art implementations.
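The local nature of this error can be illustrated numerically. The following is a minimal sketch, not taken from the cited prior art: it assumes that an IID value in dB splits the incoming energy into the fractions r/(1+r) and 1/(1+r) with r = 10^(IID/10), and that the IID is quantized on a uniform grid with a hypothetical step of 3 dB. The function names are illustrative only.

```python
def split_energy(iid_db, energy_in=1.0):
    """Split energy_in between a strong and a weak output for an IID in dB (assumed model)."""
    ratio = 10.0 ** (iid_db / 10.0)
    strong = energy_in * ratio / (1.0 + ratio)
    weak = energy_in / (1.0 + ratio)
    return strong, weak


def quantize_uniform(value, step):
    """Snap a value to the nearest multiple of step (hypothetical uniform rule)."""
    return step * round(value / step)


def relative_errors(iid_db, step_db=3.0):
    """Relative energy error of the strong and the weak output after IID quantization."""
    strong, weak = split_energy(iid_db)
    strong_q, weak_q = split_energy(quantize_uniform(iid_db, step_db))
    return abs(strong_q - strong) / strong, abs(weak_q - weak) / weak


# An IID of about 9.54 dB corresponds to roughly a 90 % / 10 % split.
err_strong, err_weak = relative_errors(9.54)
print(f"strong output: {err_strong:.2%}, weak output: {err_weak:.2%}")
```

Under these assumptions the absolute energy error is the same for both outputs, so the relative error is largest for the weaker output.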
As detailed in the previous paragraphs, the quantization of level parameters (IID or ICLD) or other parameters such as ICC, phase differences or time differences describing the spatial perception of a multi-channel audio signal is still sub-optimal, since bandwidth may be wasted for spatial parameters describing channels that are mainly masked due to low energy within the channel.
It is the object of the present invention to provide an improved concept for quantization of spatial parameters of a multi-channel audio signal.
According to a first aspect of the present invention this object is achieved by a parameter quantizer for quantizing an input parameter, wherein the input parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, comprising: a quantization rule generator for generating a quantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and a value quantizer for deriving a quantized parameter from the input parameter, using the generated quantization rule.
According to a second aspect of the present invention this object is achieved by a parameter dequantizer for dequantizing a quantized parameter to derive a parameter, wherein the parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, comprising: a dequantization rule generator for generating a dequantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and a value dequantizer for deriving the parameter from the quantized parameter, using the generated dequantization rule.
According to a third aspect of the present invention this object is achieved by a method of quantizing an input parameter, wherein the input parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, the method comprising: generating a quantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and deriving a quantized parameter from the input parameter using the generated quantization rule.
According to a fourth aspect of the present invention this object is achieved by a method of dequantizing a quantized parameter to derive a parameter, wherein the parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, the method comprising: generating a dequantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and deriving the parameter from the quantized parameter using the generated dequantization rule.
According to a fifth aspect of the present invention this object is achieved by a representation of a multi-channel signal having a quantized parameter being a quantized representation of a parameter being a measure for a characteristic of a single channel or a pair of channels, wherein the parameter is a measure for a characteristic of the single channel or the pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, wherein the quantized parameter is derived using a quantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal.
According to a sixth aspect of the present invention this object is achieved by a machine-readable storage medium having stored thereon a representation of a multi-channel signal as described above.
According to a seventh aspect of the present invention this object is achieved by a transmitter or audio recorder having a parameter quantizer for quantizing an input parameter, wherein the input parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, comprising: a quantization rule generator for generating a quantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and a value quantizer for deriving a quantized parameter from the input parameter, using the generated quantization rule.
According to an eighth aspect of the present invention this object is achieved by a receiver or audio player having a parameter dequantizer for dequantizing a quantized parameter to derive a parameter, wherein the parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, comprising: a dequantization rule generator for generating a dequantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and a value dequantizer for deriving the parameter from the quantized parameter, using the generated dequantization rule.
According to a ninth aspect of the present invention this object is achieved by a method of transmitting or audio recording, the method comprising a method of quantizing an input parameter, wherein the input parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, the method comprising: generating a quantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and deriving a quantized parameter from the input parameter using the generated quantization rule.
According to a tenth aspect of the present invention this object is achieved by a method of receiving or audio playing, the method having a method of dequantizing a quantized parameter to derive a parameter, wherein the parameter is a measure for a characteristic of a single channel or a pair of channels with respect to another single channel or a pair of channels of a multi-channel signal, the method comprising: generating a dequantization rule based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal; and deriving the parameter from the quantized parameter using the generated dequantization rule.
According to an eleventh aspect of the present invention this object is achieved by a transmission system having a transmitter and a receiver, the transmitter having a parameter quantizer for quantizing an input parameter; and the receiver having a parameter dequantizer for dequantizing a quantized parameter.
According to a twelfth aspect of the present invention this object is achieved by a method of transmitting and receiving, the method including a transmitting method having a method of quantizing an input parameter; and the method including a method of receiving including a method of dequantizing a quantized parameter.
According to a thirteenth aspect of the present invention this object is achieved by a computer program for performing, when running on a computer, one of the above methods.
The present invention is based on the finding that parameters being a measure for a characteristic of a single channel or of a pair of channels with respect to another single channel or of a pair of channels of a multi-channel signal can be quantized more efficiently using a quantization rule that is generated based on a relation of an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal.
The inventive concept has the major advantage that a quantization rule is either generated or an appropriate quantization rule is selected from a group of available quantization rules, depending on the energy of the signal to be described. Therefore, a psycho-acoustic model can be applied to a quantizer during encoding or a dequantizer during decoding, to use a quantization rule adapted to the needs of the actual signal. Especially, when a channel contains very little energy compared to other channels within the multi-channel signal, the quantization can be much more coarse than for signals having high energies. This is due to the fact that the high energy signals mask the low energy signals during playback, i.e. a listener will hardly recognize any details of the low energy signal and thus the low energy signal can be deteriorated more through coarse quantization without the listener being able to recognize the falsification because of the high masking of the low energy signal.
In one embodiment of the present invention, a parameter quantizer for quantizing parameters has a quantization rule generator for generating a quantization rule and a value quantizer for deriving quantized parameters from input parameters using the generated quantization rule. To generate an appropriate quantization rule, the quantizer selector receives as an input the total energy of the multi-channel audio signal to be coded and the local energy of the channel or the pair of channels whose spatial parameters are to be quantized. Knowing the total energy and the local energy, the quantizer selector can decide which quantization rule to use, i.e. select coarser quantization rules for channels or channel pairs having comparatively low local energy. Alternatively, the quantizer selector could also derive an algorithmic rule to modify an existing quantization rule or to calculate a completely new quantization rule depending on the local and the total energy. One possibility would for example be to calculate a general scale factor to be applied to a signal before a linear quantizer or a non-linear quantizer to achieve the goal of reducing the size of the side information to be transmitted.
In a further embodiment of the present invention a multi-channel signal is encoded in a pairwise manner, i.e. by using a hierarchical structure that has several 2-to-1 downmixers ordered in a tree-like structure, each downmixer generating a mono channel out of the two channels input into the downmixer. Following the inventive concept, energy dependent quantization can now be implemented not only locally, i.e. at each 2-to-1 downmixer using only the information available at the input of that downmixer, but based on global knowledge of the sum of the signal energies. This enhances the perceptual quality of the reconstructed signal significantly.
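A minimal sketch of such a selection is given below. It assumes that the decision is taken on the ratio of local to total energy expressed in dB, and it uses hypothetical thresholds and step sizes; neither the table values nor the function names are prescribed by the text.

```python
import math

# Hypothetical rule table: (minimum relative local energy in dB, quantizer step size).
# Entries are ordered from finest to coarsest rule.
RULE_TABLE = [(-10.0, 1.5), (-25.0, 3.0), (-math.inf, 6.0)]


def select_step(local_energy, total_energy):
    """Pick a quantizer step size from the relative local energy (assumed criterion)."""
    if local_energy <= 0.0:
        # A silent channel gets the coarsest available rule.
        return RULE_TABLE[-1][1]
    q_db = 10.0 * math.log10(local_energy / total_energy)
    for min_q_db, step in RULE_TABLE:
        if q_db >= min_q_db:
            return step
    return RULE_TABLE[-1][1]


def quantize(parameter, step):
    """Uniform value quantizer returning an integer index."""
    return round(parameter / step)


for local in (0.5, 0.01, 1e-6):
    step = select_step(local, 1.0)
    print(f"local/total = {local:g}: step {step}, index {quantize(7.0, step)}")
```

The lower the local energy relative to the total energy, the larger the selected step and the smaller the range of indices that has to be transmitted.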
It is evident that following the inventive concept, the side information size can be decreased while the quality of the encoded multi-channel audio signal is hardly affected.
In a further embodiment of the present invention, an inventive parameter quantizer is incorporated in a parameter encoder before a differential encoder and a Huffman encoder, both of which are used for further encoding the quantized parameters to derive a parameter bit stream. Such an inventive encoder has the great advantage that in addition to decreasing the size of code words needed to describe the quantized parameters, a coarser quantization will automatically increase the abundance of identical code words fed into the differential encoder and the Huffman encoder, which allows for a better compression of the quantized parameters, further reducing the size of the side information.
In a further embodiment of the present invention, an inventive parameter quantizer has a quantizer factor function generator and a parameter multiplier. The quantizer factor function generator receives the total and the local energy as input and derives a single scalar value from the input quantities. The parameter multiplier receives the parameters and the derived quantizer factor f and divides the parameters by the quantizer factor prior to transferring the modified parameters to the quantizer that applies a fixed quantization rule to the modified parameters.
A variation of this embodiment is to have a parameter multiplier after the quantizer and hence use the derived quantizer factor f to divide the resulting index out of the quantizer. The result of this then needs to be rounded into an integer index again.
Application of a scaling factor to the parameters has the same effect as choosing different quantization rules, since for example division by a big factor compresses the input parameter space such that effectively only a smaller part of an already existing quantization rule would be used. This solution has the advantage that on the decoder and the encoder side additional memory can be saved, because there is only one quantization rule to be stored or to be processed, since the scaling is done by a simple multiplication requiring only limited additional hardware or software. An additional advantage is that the quantizer factor can be derived using any possible functional dependence. Therefore, a quantizer or dequantizer sensitivity can be adjusted continuously within the whole possible input parameter space rather than by selecting predefined quantization rules out of a given set.
Preferred embodiments of the present invention are subsequently described by referring to the enclosed drawings, wherein:
a to c show several possible quantization rules to be applied;
a, 4b show an alternative embodiment of a parameter encoder having an inventive parameter quantizer;
a shows an embodiment of an inventive parameter dequantizer;
b shows a further embodiment of an inventive parameter dequantizer;
c shows an example for implementing energy dependent dequantization;
d shows a further example for implementing energy dependent dequantization;
e shows examples of quantization and dequantization of parameters;
a shows a representation of a 5-channel multi-channel audio signal; and
b shows a hierarchical parametric multi-channel decoder according to prior art.
The input parameters to the quantizer selector 202 are the total energy of the original multi-channel signal and the local energy for the channel described by the parameter to be quantized. In a preferred embodiment of the present invention the ratio between the local energy and the total energy gives a measure that can be used to decide which quantizer to use. As an example this ratio q (Relative Local energy) can be calculated in dB, using the following equation:
The selected quantizer is then used to quantize the parameter 206.
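As a minimal sketch of the relative local energy, the following assumes the conventional decibel form q = 10·log10(LocalEnergy/TotalEnergy), estimates the expected value E[x²(t)] by a time average over a block of samples, and takes the total energy as the sum of the energies of all five channels. The test signals and helper names are hypothetical.

```python
import math


def mean_square(samples):
    """Time-average estimate of E[x^2(t)] for one channel."""
    return sum(x * x for x in samples) / len(samples)


def relative_local_energy_db(local_channels, all_channels):
    """q in dB; assumes the local energy is non-zero."""
    local = sum(mean_square(ch) for ch in local_channels)
    total = sum(mean_square(ch) for ch in all_channels)
    return 10.0 * math.log10(local / total)


def sine(amplitude, periods=8, n=800):
    """Synthetic test channel covering a whole number of periods."""
    return [amplitude * math.sin(2.0 * math.pi * periods * i / n) for i in range(n)]


# Faint surround channels a(t), e(t) and full-scale front channels b(t), c(t), d(t).
a, e = sine(0.001), sine(0.001)
b, c, d = sine(1.0), sine(1.0), sine(1.0)

# Local energy of the surround pair relative to the energy of all five channels.
q_r4 = relative_local_energy_db([a, e], [a, b, c, d, e])
print(f"relative local energy of the surround pair: {q_r4:.1f} dB")
```

A strongly negative q, as obtained here, marks a parameterization stage whose parameters can tolerate a coarser quantization.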
The present invention teaches that a coarser quantization of IID/ICLD parameters (and the like) can be used if a parametrization stage is lower in energy compared to the total energy, i.e. when the relative Local energy q is small. The present invention utilizes the psycho-acoustic relation that it is more important to parameterize the dominant/high energy signals with high accuracy than the audio signal with less significance/low energy. To make this even clearer, reference is again made to
In the most extreme example, the surround channels A and E only have some faint noise and the front channels B, C, and D have full amplitude signals. In such a case, a 16 bit PCM original signal would indicate an energy difference of more than 80 dB. Therefore, parameter r4 could be quantized arbitrarily coarsely without introducing any audible differences due to the coarse quantization.
a to 2c show three possible quantization rules introducing different levels of quantization errors. All figures show the original parameter on their x-axis and the integer values assigned to the parameters on their y-axis. Furthermore the
The finest quantization is indicated in
These three quantization rules are examples of quantization rules that may be selected by the quantizer selector 202. In other words,
As an example, a possible quantization rule generation could be based on the relative Local energy q between the local energy and the total energy, as introduced above. A possible range of q-values with corresponding selections of quantization rules is summarized, as an example, within the following table:
The combination of an inventive parameter quantizer with a differential encoder and a Huffman encoder is particularly attractive since coarser quantization results in a higher abundance of equal symbols (quantized parameters). The combination of the differential encoder 220 and the Huffman encoder 222 will evidently provide an encoded representation of the quantized parameters (parameter bitstream element 224) that is more compact, when the maximum number of possible input symbols is decreased by a coarser quantization.
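The effect on the symbol statistics can be sketched as follows. This is an illustration under stated assumptions, not the encoder described above: the parameter values and step sizes are hypothetical, and the empirical entropy of the differential symbols is used as a proxy for the average Huffman code length instead of building an actual Huffman code.

```python
import math
from collections import Counter


def quantize(values, step):
    """Uniform quantization to integer indices."""
    return [round(v / step) for v in values]


def differential(indices):
    """Differential coding: each symbol is the difference to its predecessor."""
    previous, deltas = 0, []
    for index in indices:
        deltas.append(index - previous)
        previous = index
    return deltas


def entropy_bits(symbols):
    """Empirical entropy in bits per symbol (lower bound for a prefix code)."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Hypothetical parameter track over successive time slots.
params = [0.4, 1.1, 1.9, 3.2, 2.8, 2.1, 1.2, 0.3, -0.6, -1.4, -0.9, 0.2]

fine = entropy_bits(differential(quantize(params, 1.0)))
coarse = entropy_bits(differential(quantize(params, 4.0)))
print(f"fine: {fine:.2f} bits/symbol, coarse: {coarse:.2f} bits/symbol")
```

With the coarser step most differences become zero, so the symbol distribution is more skewed and an entropy coder needs fewer bits per symbol on average.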
a shows a further embodiment of an inventive parameter encoder using an inventive parameter quantizer 250, a differential encoder 252, and a Huffman encoder 254.
The parameter quantizer 250 has a quantizer factor generator 256, a parameter scaler 258, and a quantizer 260. In this case the quantizer factor generator 256 together with the parameter scaler 258 serve as a quantization rule generator.
The quantizer factor generator 256 receives as input the total energy of the multi-channel audio signal and the local energy of the channel or the channel pair for the parameter to be quantized. The quantizer factor generator 256 generates a scale factor 262 (f) based on the local energy and the total energy. In a preferred embodiment this is done on the basis of a ratio between the local energy and the total energy, resulting in a relative local energy q, as follows:
This ratio q can be used within the quantizer factor generator 256 to calculate the quantizer factor f (262) that is used as input for the parameter scaler 258 that additionally receives the parameter to be quantized.
The parameter scaler 258 applies a scaling to the input parameter that could for example be a division of the parameter by the quantizer factor 262. The scaling of the parameter is equivalent to selecting different quantization rules. The scaled parameter is then input into a quantizer 260 that applies a fixed quantization rule within this embodiment of the present invention. The further processing of the quantized parameter is equal to the processing of
Applying a scaling factor to the parameters has the advantage that the quantization rule could be adapted to the needs in a continuous way, since an analytical function deriving the quantization factor 262 can basically have any form.
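The equivalence between scaling and rule selection can be sketched as follows, assuming a fixed uniform quantizer with a hypothetical step of 1 and rounding to the nearest integer. Dividing the parameter by f before the fixed quantizer then behaves like a uniform quantizer with step f. All names and values are illustrative.

```python
import math

FIXED_STEP = 1.0  # hypothetical step size of the fixed quantization rule


def quantize_fixed(value):
    """Fixed uniform quantizer, rounding half up to an integer index."""
    return math.floor(value / FIXED_STEP + 0.5)


def quantize_scaled(parameter, factor):
    """Divide by the quantizer factor, then apply the fixed rule."""
    return quantize_fixed(parameter / factor)


def dequantize_scaled(index, factor):
    """Inverse operation: fixed dequantization followed by multiplication."""
    return index * FIXED_STEP * factor


parameters = range(-15, 16)
for factor in (1.0, 2.0, 4.0):
    indices = [quantize_scaled(p, factor) for p in parameters]
    worst = max(abs(dequantize_scaled(i, factor) - p) for i, p in zip(indices, parameters))
    print(f"f = {factor}: {len(set(indices))} distinct indices, max error {worst}")
```

A larger factor leaves fewer distinct indices to encode, at the cost of a reconstruction error that grows in proportion to the factor.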
b shows a further embodiment of an inventive parameter encoder 270 which is similar to the inventive parameter encoder 250 shown in
The inventive parameter encoder 270 does not have a parameter scaler (parameter scaler 258 of parameter encoder 250). To achieve an energy dependency of quantization, the parameter encoder 270 has a compression device 272 instead. That means the quantizer factor generator 256 together with the compression device 272 serve as a quantization rule generator in this case. The compression device 272 is connected to the quantizer 260 and to the quantizer factor generator 256. The compression device 272 receives as an input a quantized parameter that is quantized by the quantizer 260 using a fixed quantization scheme. To implement the energy dependence, the compression device 272 scales the quantized parameter using the scale factor 262. This saves bit rate by decreasing the number of possible quantized parameter values to be transmitted to the differential encoder 252. This compression can for example be achieved by a division of the quantized parameter index by the scale factor 262.
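A minimal sketch of this index compression is shown below. It assumes a fixed uniform quantizer with a hypothetical step of 1 and rounds the divided index back to an integer by rounding half up; the function names are illustrative and not part of the described encoder.

```python
import math


def quantize_fixed(parameter, step=1.0):
    """Fixed uniform quantization rule producing an integer index."""
    return math.floor(parameter / step + 0.5)


def compress_index(index, factor):
    """Divide the quantized index by the scale factor and round to an integer again."""
    return math.floor(index / factor + 0.5)


indices = [quantize_fixed(p) for p in range(-15, 16)]
compressed = [compress_index(i, 4.0) for i in indices]

print(f"indices before compression: {len(set(indices))} distinct values")
print(f"indices after compression:  {len(set(compressed))} distinct values")
```

Fewer distinct index values reach the subsequent differential and entropy coding stages, which is where the bit rate saving arises.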
Possible functions to derive the scale factor 262 from the relative Local energy ratio q are shown in
The factor functions 302 and 304 show two possibilities to implement factor functions, wherein the factor function 302 is the less aggressive one and would therefore increase the introduced quantization error less than factor function 304. On the other hand, factor function 302 would save less bit rate than factor function 304. Factor function 303 shows a fourth possibility to derive the quantizer factor from the energy quota q, where the factor function 303 is step-like in form and therefore assigns the same quantizer factor to whole intervals of the energy quota q.
It may be noted that the dequantizer selector 504 may operate in different ways. A first possibility is that the dequantizer selector 504 derives the dequantization rule directly and transfers the derived dequantization rule to the dequantizer 502. Another possibility is that the dequantizer selector 504 makes a dequantization rule decision, which is transferred to the dequantizer 502, which can use this decision to select the appropriate dequantization rule from a number of dequantization rules that are, for example, stored in the dequantizer 502.
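The kinds of factor functions discussed here can be sketched as follows. The shapes, slopes and thresholds are hypothetical and chosen only for illustration; the sketch assumes that q is given in dB and is at most 0 dB, and that a factor of 1 leaves the quantization unchanged.

```python
def factor_constant(q_db):
    """No energy dependence: the fixed quantization rule is used as is."""
    return 1.0


def factor_linear(q_db, slope):
    """Factor growing linearly as q falls below 0 dB; a larger slope is more aggressive."""
    return 1.0 + slope * max(0.0, -q_db)


def factor_step(q_db):
    """Step-like function: whole intervals of q map to the same factor."""
    if q_db >= -10.0:
        return 1.0
    if q_db >= -20.0:
        return 2.0
    return 4.0


for q_db in (0.0, -12.0, -18.0, -30.0):
    print(
        q_db,
        factor_constant(q_db),
        factor_linear(q_db, 0.05),  # less aggressive
        factor_linear(q_db, 0.10),  # more aggressive
        factor_step(q_db),
    )
```

A continuous function allows the quantizer sensitivity to follow q smoothly, whereas the step-like variant corresponds to choosing among a small number of predefined rules.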
The Huffman decoder 512 receives a parameter bit stream element 513 and in association therewith, the dequantizer selector 504 receives the local energy of a channel or a pair of channels described by the parameter bit stream element 513 and the total energy of the multi-channel audio signal. The parameter bit stream element 513 is produced by an inventive parameter encoder, as shown in
In other words,
a shows a further embodiment of an inventive parameter decoder, having an inventive energy dependent dequantizer 520, a Huffman decoder 512, and a differential decoder 510. The parameter dequantizer 520 comprises a quantizer factor generator 522, a dequantizer 524, and a parameter scaler 526. In this case the quantizer factor generator 522 together with the parameter scaler 526 serve as a dequantization rule generator.
After decoding the parameter bit stream element 513 by the Huffman decoder and the differential decoder, the quantized parameter is dequantized by the dequantizer 524, wherein the dequantizer 524 is using a dequantization rule matching a quantization rule used to generate the quantized parameter. The quantizer factor generator 522 derives a scale factor 528 (f) from a ratio of the local energy and the total energy of the multi-channel audio signal. The parameter scaler 526 then applies the scale factor 528 to the dequantized parameter by a multiplication of the scale factor with the dequantized parameter.
After the scaling by the parameter scaler 526, the decompressed dequantized parameters are available at an output of the inventive parameter decoder.
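The order of operations in this embodiment (fixed dequantization first, scaling second) may be sketched as follows. The uniform step size and the shape of the factor function are assumptions made for illustration; the specification only requires that the scale factor be derived from the ratio of the local energy and the total energy.

```python
import math

STEP_DB = 2.0  # assumed uniform step of the fixed dequantization rule


def scale_factor(local_energy: float, total_energy: float) -> float:
    """Hypothetical factor generator: 1 for dominant parts, larger for weak parts."""
    ratio_db = 10.0 * math.log10(local_energy / total_energy)
    return min(5.0, 1.0 + max(0.0, -ratio_db) / 10.0)


def dequantize(index: int) -> float:
    """Fixed dequantization rule (assumed uniform for this sketch)."""
    return index * STEP_DB


def decode_parameter(index: int, local_energy: float, total_energy: float) -> float:
    """Dequantize first, then multiply the dequantized value by the scale factor."""
    return dequantize(index) * scale_factor(local_energy, total_energy)
```

For a part of the signal carrying the full energy, the factor is 1 and the fixed rule is applied unchanged; for a part 20 dB below the total, this assumed factor function yields 3.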
b shows a further embodiment of an inventive parameter decoder 530, similar to the inventive parameter decoder 520. Therefore, only the differences to the parameter decoder 520 shall be elaborated on in the following paragraph.
The inventive parameter decoder 530 has a decompressor 532, which achieves the same functional result as the parameter scaler 526 in the inventive parameter decoder 520. The decompressor 532 receives the quantized parameters as an input and the scale factor 528 from the factor generator 522 as a further input. That means the factor generator 522 together with the decompressor 532 serve as a dequantization rule generator in this case. To implement the energy weighted dequantizing functionality, the quantized parameter is scaled by the decompressor 532 before the scaled quantized parameter so derived is input into the dequantizer 524. The dequantizer 524 then dequantizes the scaled quantized parameter to derive the dequantized parameter using a fixed dequantization rule. This decompression can for example be achieved by a multiplication of the quantized parameter index by the scale factor 528.
Although the scaling by the parameter scaler 258 and the parameter scaler 526 during the encoding and decoding is described to be a division during the encoding and a multiplication during the decoding, any other type of scaling that has the same effect as using a different quantization rule can be applied to the parameters during the encoding or decoding.
In the case of a stacked parameterization (hierarchical de- or encoding) as exemplified for example in
In other words, a decoder may decide autonomously which dequantization rule to use, based on the total energy and the local energy. Alternatively, additional side information could signal to the decoder which dequantization rule is the appropriate one to dequantize the parameters.
Although described within different embodiments of the present invention, the application of a scale factor and the selection of an appropriate dequantization rule can also be combined within one embodiment of an inventive encoder or decoder.
To give a more detailed example, two possible ways of implementing energy dependent dequantization for the reconstruction of a multi-channel signal from a transferred monophonic signal M using additionally transmitted spatial parameters (CLD, ICC) are shown in
c shows the situation where the parameters CLD are derived such that it is assumed that a parameter CLD0 describes the energy distribution between channels that are combined using a number of channels of the original signal.
In the first hierarchic up-mix position 1000, CLD0 describes the energy relation between two channels, wherein a first channel is a combination 1002 of a front-left, a front-right, a center and a low-frequency-enhancement channel. The second channel is a combination of a back-left and a back-right channel. In other words, the parameter CLD0 describes the energy distribution between all rear channels and all front channels.
It is therefore evident that, when CLD0 indicates that only little energy is contained in the rear channels, the parameters describing the spatial properties between the back-left and the back-right channel may be quantized more coarsely, since the additional distortion introduced by the coarse quantization is hardly audible when all channels are played back simultaneously.
An inventive parameter dequantizer, as shown in
In the following, the term “deq” describes the application of a fixed dequantization table to a parameter passed to the procedure deq. That means a transmitted parameter idxCLD(0,l,m) can be dequantized directly, as indicated by the following expression:
DCLDQ(0,l,m)=deq(idxCLD(0,l,m),CLD)
Since the CLD parameter describes an energy distribution between two channels and the channels are combinations of channels as indicated in
The relative local energy of the back channels is accordingly:
Given the above and the inventive concept, CLD1 can now be computed, taking into account the overall energy contained in the combination signal 1002:
idxCLDEdQ(1,l,m)=max(−15,min(15,round(idxCLD(1,l,m)·facFunc(RelativeLocalEnergyFC5151(l,m)))))
In the formula given above, the term “facFunc” describes a function giving a real value in dependency of the relative local energy FC. In other words, the formula describes that before dequantization, the transmitted parameter index idxCLD(1,l,m) is multiplied by a scale factor (facFunc) to derive an intermediate quantized parameter. Since the intermediate quantized parameter is not necessarily integer-valued, it must be rounded to derive idxCLDEdQ, which is then dequantized into the final parameter by the following operation:
DCLDQ(1,l,m)=deq(idxCLDEdQ(1,l,m),CLD)
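The two steps above (scale, round and clamp the index to the range −15 to 15, then apply the fixed table) may be sketched as follows. The table contents and the body of `fac_func` are placeholders chosen for illustration, not the table or factor function of the specification; only the structure of `energy_dependent_deq` follows the formulas.

```python
import math

# Placeholder fixed table with 31 entries for indices -15..15 (assumed 2 dB steps).
CLD_TABLE = [2.0 * i for i in range(-15, 16)]


def round_half_away(x: float) -> int:
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def fac_func(relative_local_energy_db: float) -> float:
    """Hypothetical factor function: 1 at 0 dB, growing for weaker parts, capped at 5."""
    return min(5.0, max(1.0, 1.0 - relative_local_energy_db / 10.0))


def deq(idx: int) -> float:
    """Fixed dequantization table lookup."""
    return CLD_TABLE[idx + 15]


def energy_dependent_deq(idx_cld: int, relative_local_energy_db: float) -> float:
    """idxCLDEdQ = max(-15, min(15, round(idxCLD * facFunc(...)))), then deq."""
    scaled = idx_cld * fac_func(relative_local_energy_db)
    idx_edq = max(-15, min(15, round_half_away(scaled)))
    return deq(idx_edq)
```

The clamp matters for weak signal parts: with a factor of 4.5, a transmitted index of 5 would scale to 22.5, which lies outside the table and is therefore limited to 15.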
Dequantization is performed by a standard dequantization table, such as, for example, the following:
The derived parameter CLD1 describes an energy relation between a channel being a combination of a front-left and a front-right channel and a channel being a combination of a center and a low-frequency-enhancement channel, as can be seen from the channel decomposition in the second hierarchical step 1004. Thus, a relative local energy F, describing the energy contained in the front channels, front-left and front-right, can be computed according to the following formula:
Previously, a relative local energy S describing the energy of the back channels has been derived, such that an intermediate quantized parameter idxCLDEdQ can be calculated for the hierarchical box 1006 according to the following formulas:
idxCLDEdQ(2,l,m)=max(−15,min(15,round(idxCLD(2,l,m)·facFunc(RelativeLocalEnergyS5151(l,m)))))
DCLDQ(2,l,m)=deq(idxCLDEdQ(2,l,m),CLD)
Since, as previously described, a relative local energy describing the energy of the front-channels only (F5151) is now available, parameter CLD3 describing an energy relation between the front-left and the front-right channel can now be derived in an energy-dependent way according to the following formulas:
idxCLDEdQ(3,l,m)=max(−15,min(15,round(idxCLD(3,l,m)·facFunc(RelativeLocalEnergyF5151(l,m)))))
DCLDQ(3,l,m)=deq(idxCLDEdQ(3,l,m),CLD)
In one possible implementation, parameter CLD4 describing an energy relation between the center and the low-frequency-enhancement channel can now be derived using no factor function:
DCLDQ(4,l,m)=deq(idxCLD(4,l,m),CLD)
In alternative embodiments, it is, of course, also feasible to implement energy-dependency also in the derivation of the parameter CLD4.
d shows another possibility of defining a hierarchy for the derivation of the spatial parameters.
In analogy to the description of
DCLDQ(0,l,m)=deq(idxCLD(0,l,m),CLD)
idxCLDEdQ(1,l,m)=max(−15,min(15,round(idxCLD(1,l,m)·facFunc(RelativeLocalEnergyLR5152(l,m)))))
DCLDQ(1,l,m)=deq(idxCLDEdQ(1,l,m),CLD)
DCLDQ(2,l,m)=deq(idxCLD(2,l,m),CLD)
idxCLDEdQ(3,l,m)=max(−15,min(15,round(idxCLD(3,l,m)·facFunc(RelativeLocalEnergyL5152(l,m)))))
DCLDQ(3,l,m)=deq(idxCLDEdQ(3,l,m),CLD)
idxCLDEdQ(4,l,m)=max(−15,min(15,round(idxCLD(4,l,m)·facFunc(RelativeLocalEnergyR5152(l,m)))))
DCLDQ(4,l,m)=deq(idxCLDEdQ(4,l,m),CLD)
It may be noted that different factor functions may be used to implement the inventive concept as, for example, one of the functions shown in
Generally, as already mentioned above, it is the inventive concept to apply an energy-dependent quantization in the sense that parameters (CLD) of parts of the signal that contain relatively low energy compared to other signal parts, are quantized in a coarser way. That is, the factor function has to be such that for low energy components, the factor applied is large.
To illustrate this in more detail, one example is given in
Table 9d shows the manipulation of the quantization index on the quantizer side in a left column 1100, and the reconstruction of the transmitted parameter on the dequantizer side in a column 1102. The transmitted parameter is given in column 1104. Two examples for a combination of channels having relatively low energy are shown. This is indicated by the common scale factor 4.5, which is significantly bigger than 1 (see
The dequantizer multiplies the transmitted index by the scale factor to derive a reconstructed index IDXrek used for dequantization. As can be seen in the first example of an index 10 on the quantizer side, an additional error of 1 arises due to the rounding of the divided index on the quantizer side. On the other hand, when, by chance, the division by the scale factor at the quantizer side yields an integer-valued index IDXtransm to be transmitted, no additional error is introduced.
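The round trip described here can be reproduced with a few lines of arithmetic. The sketch below uses the scale factor 4.5 and the index 10 from the text; the second index, 9, is chosen here merely as an illustration of a case in which the division yields an integer, and the function names are hypothetical.

```python
import math


def round_half_away(x: float) -> int:
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def round_trip(idx: int, f: float):
    """Return (IDXtransm, IDXrek, additional error in index units)."""
    idx_transm = round_half_away(idx / f)      # quantizer side: divide, then round
    idx_rek = round_half_away(idx_transm * f)  # dequantizer side: multiply, then round
    return idx_transm, idx_rek, abs(idx - idx_rek)


print(round_trip(10, 4.5))  # 10 / 4.5 = 2.22..., rounded to 2; 2 * 4.5 = 9
print(round_trip(9, 4.5))   # 9 / 4.5 = 2 exactly; 2 * 4.5 = 9
```

In both cases the transmitted index is 2 rather than 10 or 9, which is what brings the index values closer to zero for the subsequent entropy coding.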
Evidently, the danger of introducing additional errors rises with a rising scale factor f. This means that the probability of adding additional errors to low energy signals is rather high. When signals described by the CLD parameter in question have comparatively equal energy, the CLD value will be close to unity and so will the scale factor (see, for example
It is evidently an enormous advantage of the present invention that errors are only accepted for channels having comparatively low energy. For those channels, on the other hand, dividing the indices of the associated parameters by a large number brings the index values closer to zero on average. This can be exploited by the subsequent differential encoding and Huffman encoding procedure to efficiently decrease the bit rate consumed by the transmitted parameters of a multi-channel signal.
The relation of the local energy and the total energy, on which the decision about the de-/quantization rule is based, is described as a logarithmic measure in the previous paragraphs. This is of course not the only possible measure that can be used to realize the inventive concept. Any other measure describing an energy difference between the local energy and the total energy, as for example the plain difference, can be used to make the decision.
Another important feature of the present invention arises in combination with a two-channel decoder (PS) design that distributes the incoming energy into the two output channels, typically controlled by, e.g., a CLD-like parameter (meaning that the incoming energy equals the sum of the energies of the two output channels). In that case, the difference in energy between the total energy and the local energy (the relative local energy) for each of the two-channel decoders (122, 124, 126, and 128) is defined by the CLD parameters. This means that there is no need to actually measure the total energy and the local energy, since the difference in energy in dB that is typically used to calculate the scale factor is defined by the CLD parameters.
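How a CLD value alone fixes the relative energy of each output may be sketched as follows. The sketch assumes that the CLD is defined as 10·log10(E1/E2) in dB and that E1 + E2 equals the incoming energy, as stated above; the function name and the sign convention are assumptions, not taken from the specification.

```python
import math


def relative_energies_db(cld_db: float):
    """Return (E1/E_in, E2/E_in) in dB for a two-channel split with E1 + E2 = E_in.

    Assumes cld_db = 10 * log10(E1 / E2).
    """
    r = 10.0 ** (cld_db / 10.0)                 # linear energy ratio E1 / E2
    rel1_db = 10.0 * math.log10(r / (1.0 + r))  # share of output 1
    rel2_db = 10.0 * math.log10(1.0 / (1.0 + r))  # share of output 2
    return rel1_db, rel2_db


front_db, back_db = relative_energies_db(10.0)
```

In a cascade of such two-channel stages, the dB values of successive stages add, so the relative local energy at any stage follows from the already dequantized CLD values of the stages above it.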
Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disk, DVD or a CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is, therefore, a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.
While the foregoing has been particularly shown and described with reference to particular embodiments thereof, it will be understood by those skilled in the art that various other changes in the form and details may be made without departing from the spirit and scope thereof. It is to be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and comprehended by the claims that follow.
This is a continuing application, under 35 U.S.C. §120, of copending International application PCT/EP2006/003284, filed Apr. 10, 2006, which designated the United States; the application also claims the priority, under 35 U.S.C. §119(e), of U.S. application No. 60/672,943, filed Apr. 19, 2005; the prior applications are herewith incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20070016416 A1 | Jan 2007 | US |
Number | Date | Country | |
---|---|---|---|
60672943 | Apr 2005 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/EP2006/003284 | Apr 2006 | US |
Child | 11406631 | US |