The present invention relates to a technique to encode or decode signal sequences, such as audio and video signal sequences, by vector quantization.
In a coding apparatus described in Patent literature 1, an input signal is first normalized by division by a normalization value. The normalization value is quantized to generate a quantization index. The normalized input signal is vector-quantized to generate the index of a representative quantization vector. The generated indexes, which are the quantization index and the index of the representative quantization vector, are output to a decoding apparatus.
The decoding apparatus decodes the quantization index to generate a normalization value. The decoding apparatus also decodes the index of the representative quantization vector to generate a normalized decoded signal. The normalized decoded signal is multiplied by the normalization value to generate a decoded signal.
Patent literature 1: Japanese Patent Application Laid-Open No. 07-261800
High-performance vector quantization methods that produce low quantization noise, such as SVQ (Spherical Vector Quantization, see G.729.1), are well known. Such methods assign pulses within a preset quantization bit rate.
When such a vector quantization method is used in the coding and decoding apparatuses described in Patent literature 1 and the input signal is, for example, a frequency-domain signal, the available bit budget may be insufficient to quantize all frequency components, which can cause spectral holes. A spectral hole is the loss of a frequency component: the component is present in the input signal but absent from the output signal. If, because of spectral holes, a pulse is assigned to a certain frequency component in some consecutive frames but not in others, so-called musical noise can result.
An object of the present invention is to provide a coding method, a decoding method, an apparatus, a program and a recording medium for reducing musical noise which can occur when an input signal is a frequency-domain signal, for example.
In coding, a normalization value that is representative of a predetermined number of input samples is calculated. The normalization value is quantized to obtain a quantized normalization value, and a normalization-value quantization index corresponding to the quantized normalization value is obtained. A value corresponding to the quantized normalization value is subtracted from a value corresponding to the magnitude of the value of each sample to obtain a difference value. When the difference value is positive and the value of the sample is positive, the difference value is set as the quantization candidate corresponding to the sample; when the difference value is positive and the value of the sample is negative, the sign of the difference value is reversed and is set as the quantization candidate corresponding to the sample; and when the difference value is not positive, zero is set as the quantization candidate corresponding to the sample. A plurality of quantization candidates corresponding to a plurality of samples are jointly vector-quantized to obtain a vector quantization index.
In decoding, a decoded normalization value corresponding to an input normalization-value quantization index is obtained. A plurality of values corresponding to an input vector quantization index are obtained as a plurality of decoded values. Calculation is performed to obtain a recalculated normalization value that decreases with increasing sum of the absolute values of a predetermined number of decoded values. When a decoded value is positive, the decoded value and the decoded normalization value are added together and when a decoded value is negative, the absolute values of the decoded value and the decoded normalization value are added together and the sign of the resulting value is reversed; when a decoded value is zero, the recalculated normalization value is multiplied by a first constant.
In coding, by selecting some dominant components from all frequency components and by actively quantizing them, occurrence of spectral holes related to the dominant components can be prevented and the musical noise can be reduced.
In decoding, by assigning a non-zero value based on a recalculated normalization value when a decoded value is zero, a spectral hole which can occur if, for example, an input signal is a frequency-domain signal can be prevented and the musical noise can be reduced.
An embodiment of the present invention will be described below in detail.
A coding apparatus 1 includes a normalization value calculator 12, a normalization value quantizer 13, a quantization-candidate calculator 14, and a vector quantizer 15, for example, as illustrated in
The coding apparatus 1 executes the steps of a coding method illustrated in
An input signal X (k) is input into the normalization value calculator 12 and quantization-candidate calculator 14. The input signal X (k) in this example is a frequency-domain signal resulting from conversion into a frequency domain by the frequency-domain converter 11.
The frequency-domain converter 11 converts an input time-domain signal x (n) to a frequency-domain signal X (k) by MDCT (Modified Discrete Cosine Transform), etc., and outputs the frequency-domain signal X (k). Here, n is a number of a signal in a time domain (a discrete-time number) and k is a number of a signal in a frequency domain (a discrete-frequency number). Suppose that one frame includes L samples. The time-domain signal x (n) is converted to a frequency domain signal per each frame to generate frequency-domain signals X (k) (k=0, 1, . . . L−1) that constitute L frequency components. Here, L is a predetermined positive number, for example 64 or 80.
The normalization value calculator 12 calculates a normalization value X0− that is a representative value of a predetermined number C0 of input samples (step E1). Here, X0− is the character X0 with an overbar. The calculated X0− is sent to the normalization value quantizer 13.
Here, C0 is L or a divisor of L other than 1 and L. If C0 is a divisor of L other than 1 and L, the L frequency components are divided into sub-bands and a normalization value is calculated for each sub-band.
For example, if L=80 and one sub-band is composed of eight frequency components, 10 sub-bands are formed and a normalization value is calculated per each sub-band. The following describes using C0=L as an example.
The normalization value X0− is a representative value of the C0 samples, for example an average value of the powers of the C0 samples.
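The sub-band division described above can be sketched as follows. This is a minimal illustration, not the embodiment itself: the function name `subband_normalization_values` is hypothetical, and the representative value is taken literally as the average power named in the text (whether any further operation, such as a square root, is applied is not stated here and is not assumed).

```python
def subband_normalization_values(X, C0):
    """Return one representative value per sub-band of C0 samples.

    X  : list of the L sample values of one frame
    C0 : number of samples per sub-band (L itself or a divisor of L)
    The representative value used here is the average power of the
    sub-band, following the example given in the text.
    """
    L = len(X)
    if C0 <= 0 or L % C0 != 0:
        raise ValueError("C0 must be a positive divisor of L")
    values = []
    for start in range(0, L, C0):
        band = X[start:start + C0]
        values.append(sum(x * x for x in band) / C0)
    return values
```

With L=80 and C0=8 the function returns 10 values, one per sub-band; with C0=L it returns a single value for the whole frame.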
The normalization value quantizer 13 quantizes the normalization value X0− to obtain a quantized normalization value X− and obtains a normalization-value quantization index corresponding to the quantized normalization value X− (step E2). Here, X− is the character X with an overbar. The quantized normalization value X− is sent to the quantization-candidate calculator 14 and the normalization-value quantization index is sent to the decoding apparatus 2.
The quantization-candidate calculator 14 subtracts a value corresponding to the quantized normalization value from a value corresponding to the magnitude of each sample value X (k) of the input signal to obtain a difference value E− (k). If the difference value E− (k) is positive and the sample value X (k) is positive, the quantization-candidate calculator 14 sets the difference value E− (k) as the quantization candidate E (k) corresponding to the sample. If the difference value E− (k) is positive and the sample value X (k) is negative, the quantization-candidate calculator 14 reverses the sign of the difference value and sets the sign-reversed value as the quantization candidate E (k) corresponding to the sample. If the difference value E− (k) is not positive, the quantization-candidate calculator 14 sets 0 as the quantization candidate E (k) corresponding to the sample (step E3). The quantization candidate E (k) is sent to the vector quantizer 15.
In particular, the quantization-candidate calculator 14 performs the operations illustrated in
The quantization-candidate calculator 14 initializes character k as k=0 (step E31).
The quantization-candidate calculator 14 compares k with L (step E32). If k<L, the process proceeds to step E33; otherwise the process at step E3 exits.
The quantization-candidate calculator 14 calculates the difference value E− (k) between the absolute value of each sample value X (k) of the input signal and the quantized normalization value (step E33). Here, E− is the character E with an overbar. For example, the quantization-candidate calculator 14 calculates the value of E− (k) defined by the equation given below. Here, C1 is an adjustment constant for adjusting the normalization value and takes on a positive value. For example, C1=1.0.
[Equation 2]
$$\bar{E}(k) = |X(k)| - C_1 \cdot \bar{X}$$
Thus, the value corresponding to the magnitude of each sample value X (k) is for example the absolute value |X (k)| of the value X (k) of that sample. The value corresponding to the quantized normalization value X− is for example the product of the quantized normalization value X− and the adjustment constant C1.
The quantization-candidate calculator 14 compares the difference value E− (k) with zero (step E34). If the difference value E− (k) is not greater than zero, the quantization-candidate calculator 14 sets zero as the quantization candidate E (k) (step E35).
If the difference value E− (k) is greater than zero, the quantization-candidate calculator 14 compares X (k) with zero (step E36).
If X (k) is not less than zero, the quantization-candidate calculator 14 sets the difference value E− (k) as the quantization candidate E (k) (step E37).
If X (k) is less than zero, the quantization-candidate calculator 14 reverses the sign of the difference value E− (k) and sets the sign-reversed value −E− (k) as the quantization candidate E (k) (step E38).
The quantization-candidate calculator 14 increments k by 1 (step E39) and then proceeds to step E32.
In this way, the quantization-candidate calculator 14 subtracts the value corresponding to the quantized normalization value from the value corresponding to the magnitude of a sample value and selects the greater value of the difference value or 0, and sets the value obtained by multiplying the selected value by the sign of that sample value as the quantization candidate.
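The loop of steps E31 to E39 can be sketched as follows. This is an illustrative sketch only; the function name `quantization_candidates` and the parameter name `x_bar_q` (standing for the quantized normalization value X−) are hypothetical, and the default C1=1.0 is the example value given in the text.

```python
def quantization_candidates(X, x_bar_q, C1=1.0):
    """Compute the quantization candidates E(k) for the samples X(k).

    X       : list of sample values X(k)
    x_bar_q : quantized normalization value (X with an overbar)
    C1      : positive adjustment constant
    """
    candidates = []
    for x in X:                          # k = 0 .. L-1 (steps E32, E39)
        diff = abs(x) - C1 * x_bar_q     # difference value (step E33)
        if diff <= 0:                    # not positive (steps E34, E35)
            candidates.append(0.0)
        elif x < 0:                      # negative sample (steps E36, E38)
            candidates.append(-diff)
        else:                            # non-negative sample (step E37)
            candidates.append(diff)
    return candidates
```

For example, with X− = 1.0 and C1 = 1.0, the samples 3.0, −2.5, 0.5 and −1.0 give the candidates 2.0, −1.5, 0 and 0: only the components whose magnitude exceeds C1·X− survive as non-zero candidates.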
The vector quantizer 15 jointly vector-quantizes a plurality of quantization candidates E (k) corresponding to a plurality of samples to obtain a vector quantization index (step E4). The vector quantization index is sent to the decoding apparatus 2.
The vector quantization index represents a representative quantization vector. For example, the vector quantizer 15 selects, from among a plurality of representative quantization vectors stored in a vector codebook storage not shown in the figure, the representative quantization vector closest to a vector composed of a plurality of quantization candidates E (k) corresponding to a plurality of samples. The vector quantizer 15 then outputs a vector quantization index representing the selected representative quantization vector to accomplish vector quantization.
The vector quantizer 15 jointly vector-quantizes the quantization candidates E (k) corresponding to C0 samples, for example. The vector quantizer 15 uses a vector quantization method such as SVQ (Spherical Vector Quantization, see G.729.1) to perform the vector quantization. However, the vector quantizer 15 may use another vector quantization method.
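The codebook selection described above can be illustrated with an exhaustive nearest-vector search. This sketch is not SVQ; it assumes an explicit list of representative quantization vectors and assumes that "closest" means the smallest squared Euclidean distance, which the text does not specify. The function name `vector_quantize` is hypothetical.

```python
def vector_quantize(candidates, codebook):
    """Return the index of the codebook vector closest to `candidates`.

    candidates : list of quantization candidates E(k)
    codebook   : list of representative quantization vectors, each of the
                 same length as `candidates`
    Distance measure (an assumption): squared Euclidean distance.
    """
    best_index = 0
    best_dist = float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((c - w) ** 2 for c, w in zip(candidates, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index
```

The returned index plays the role of the vector quantization index; the decoding side would look up the same codebook entry by that index.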
In this way, if for example an input signal is a frequency-domain signal, dominant components are selected from among all frequencies and actively quantized. Thereby occurrence of a spectral hole in dominant components can be prevented and the musical noise can be reduced.
The normalization value decoder 21 calculates a decoded normalization value X− corresponding to a normalization-value quantization index which is input into the decoding apparatus 2 (step D1). The decoded normalization value X− is sent to the normalization value recalculator 23. It is assumed here that normalization values individually corresponding to a plurality of normalization-value quantization indices are stored in a codebook storage not shown in the figure. The normalization value decoder 21 searches the codebook storage using the input normalization-value quantization index as a key to obtain a normalization value corresponding to the normalization-value quantization index and sets the obtained value as a decoded normalization value X−.
The vector decoder 22 obtains a plurality of values corresponding to the vector quantization index, which is input into the decoding apparatus 2, and sets them as a plurality of decoded values Ê (k) (step D2). Here, Ê is the character E with a hat. The decoded values Ê (k) are sent to the synthesizer 24.
It is assumed here that the vector codebook storage not shown in the figure contains the representative quantization vectors individually corresponding to a plurality of vector quantization indices. The vector decoder 22 searches the vector codebook storage using the input vector quantization index as a key to obtain the representative quantization vector corresponding to the vector quantization index. The components of the representative quantization vector are a plurality of values corresponding to the input vector quantization index.
The normalization value recalculator 23 calculates a recalculated normalization value X= that takes on a value that decreases with increasing sum of the absolute values of a predetermined number of decoded values Ê (k) (step D3). The recalculated normalization value X= is sent to the synthesizer 24. The recalculated normalization value X= is the character X with a double overbar.
In particular, the normalization value recalculator 23 performs the operations illustrated in
The normalization value recalculator 23 initializes the characters k, m and tmp as k=0, m=0 and tmp=0 (step D31).
The normalization value recalculator 23 compares k with C0 (step D32).
If k≧C0, the value of X= defined by the following equation is calculated (step D37), then the process at step D3 exits.
If k<C0, the normalization value recalculator 23 compares the decoded value Ê (k) with zero (step D33). If the decoded value Ê (k) is zero, the normalization value recalculator 23 increments m by 1 (step D35), then proceeds to step D36. If the decoded value Ê (k) is not zero, the normalization value recalculator 23 proceeds to step D34.
The normalization value recalculator 23 calculates the power of the sample with number k and adds the power to tmp (step D34). That is, the sum of the calculated power and the value of tmp is set as a new value of tmp. The normalization value recalculator 23 then proceeds to step D36. The power is calculated according to the following equation, for example.
$$\left(C_1 \cdot \bar{X} + |\hat{E}(k)|\right)^2$$
The normalization value recalculator 23 increments k by 1 (step D36), then proceeds to step D32.
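The loop of steps D31 to D36 can be sketched as follows. Two points in the sketch are assumptions rather than statements from the text. First, the power added at step D34 is taken as (C1·X− + |Ê (k)|)², the square of the magnitude that the synthesizer assigns to a non-zero sample. Second, the closing formula of step D37 is not reproduced in the text; the sketch substitutes one plausible form, the square root of (C0·X−² − tmp)/m clipped at zero, chosen only because it decreases as the decoded magnitudes grow. The function and parameter names are hypothetical.

```python
import math


def recalculate_normalization(decoded, x_bar, C1=1.0):
    """Sketch of the normalization value recalculation (step D3).

    decoded : list of C0 decoded values E^(k)
    x_bar   : decoded normalization value (X with an overbar)
    C1      : positive adjustment constant
    """
    C0 = len(decoded)
    m = 0        # number of zero-valued decoded samples (step D35)
    tmp = 0.0    # accumulated power of non-zero samples (step D34)
    for e in decoded:                      # k = 0 .. C0-1 (steps D32, D36)
        if e == 0:
            m += 1
        else:
            # ASSUMPTION: power of the sample as it will be synthesized
            tmp += (C1 * x_bar + abs(e)) ** 2
    if m == 0:
        return 0.0   # ASSUMPTION: no zero-valued sample needs filling
    # ASSUMPTION: closing formula for step D37 (not given in the text)
    residual = C0 * x_bar ** 2 - tmp
    return math.sqrt(max(residual, 0.0) / m)
```

Under these assumptions, a frame whose decoded values are all zero returns X− itself, and each additional non-zero decoded value lowers the result, which matches the stated behavior that the recalculated value decreases with increasing sum of the absolute decoded values.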
When a decoded value Ê (k) is positive, the synthesizer 24 adds the decoded value Ê (k) to the decoded normalization value X−. When a decoded value Ê (k) is negative, the synthesizer 24 adds the absolute value of the decoded value Ê (k) to the decoded normalization value X− and reverses the sign of the sum. When a decoded value Ê (k) is zero, the synthesizer 24 multiplies the recalculated normalization value X= by a first constant C3 and randomly reverses the sign of the product. In each case the result is the decoded signal value X̂ (k) (step D4).
In particular, the synthesizer 24 performs the operations illustrated in
The synthesizer 24 initializes character k as k=0 (step D41).
The synthesizer 24 compares k with C0 (step D42). If k is not less than C0, the process at step D4 exits.
If k<C0, the synthesizer 24 compares the decoded value Ê (k) with zero (step D43). If the decoded value Ê (k) is zero, the synthesizer 24 multiplies the recalculated normalization value X= by the first constant C3 and randomly reverses the sign of the product to obtain the value X̂ (k) of the decoded signal (step D44). That is, the value defined by the equation given below is calculated as X̂ (k). Here, C3 is a constant for adjusting the magnitude of the frequency component and may be 0.9, for example, and rand (k) is a function that outputs 1 or −1, for example at random based on random numbers.
In this way, the synthesizer 24 obtains X̂ (k) whose absolute value is set to the value obtained by multiplying the recalculated normalization value X= by the first constant C3.
$$\hat{X}(k) = C_3 \cdot \bar{\bar{X}} \cdot \mathrm{rand}(k)$$
If the synthesizer 24 determines at step D43 that the decoded value Ê (k) is not zero, the synthesizer 24 compares the decoded value Ê (k) with zero (step D45).
If the decoded value Ê (k) is less than zero, the synthesizer 24 reverses the sign of the sum of the absolute value |Ê (k)| of the decoded value Ê (k) and the decoded normalization value X− to obtain a value X̂ (k) of the decoded signal (step D46). That is, the value defined by the following equation is calculated as X̂ (k).
$$\hat{X}(k) = -\left(C_1 \cdot \bar{X} + |\hat{E}(k)|\right)$$
If the decoded value Ê (k) is not less than zero, the synthesizer 24 adds the decoded value Ê (k) to the decoded normalization value X− and sets the sum as X̂ (k) (step D47).
$$\hat{X}(k) = C_1 \cdot \bar{X} + \hat{E}(k)$$
In this way, if Ê (k) is not zero, the synthesizer 24 calculates X̂ (k) as X̂ (k)=σ (Ê (k))·(C1·X̄+|Ê (k)|). Here, X̄ is the decoded normalization value X− and σ (·) denotes the sign of its argument.
After determining X̂ (k), the synthesizer 24 increments k by 1 (step D48), then proceeds to step D42.
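The synthesis loop of steps D41 to D48 can be sketched as follows. This is an illustrative sketch; the function name `synthesize`, the parameter names and the optional `rng` argument (used to make the random sign reproducible) are hypothetical, and the defaults C1=1.0 and C3=0.9 are the example values given in the text.

```python
import random


def synthesize(decoded, x_bar, x_recalc, C1=1.0, C3=0.9, rng=None):
    """Compute the decoded signal values X^(k) (step D4).

    decoded  : list of C0 decoded values E^(k)
    x_bar    : decoded normalization value (X with an overbar)
    x_recalc : recalculated normalization value (X with a double overbar)
    rng      : optional random.Random instance supplying the random sign
    """
    if rng is None:
        rng = random.Random()
    out = []
    for e in decoded:                        # k = 0 .. C0-1 (steps D42, D48)
        if e == 0:                           # steps D43, D44
            sign = rng.choice((1, -1))       # plays the role of rand(k)
            out.append(C3 * x_recalc * sign)
        elif e < 0:                          # steps D45, D46
            out.append(-(C1 * x_bar + abs(e)))
        else:                                # step D47
            out.append(C1 * x_bar + e)
    return out
```

Non-zero decoded values have the normalization offset restored with their original sign, while zero-valued positions receive a small value of magnitude C3·X= with a random sign instead of remaining empty.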
If X̂ (k) is a frequency-domain signal, the time-domain converter 25 converts X̂ (k) to a time-domain signal z (n) by the inverse Fourier transform, etc.
In this way, if the decoded value Ê (k) is zero, the recalculated normalization value X= is used to assign a non-zero value as appropriate. Accordingly, spectral holes that can occur when the input signal is, for example, a frequency-domain signal can be eliminated. As a result, musical noise can be reduced.
The value assigned when the decoded value Ê (k) is zero is neither always positive nor always negative. A more natural decoded signal can be produced by using the function rand (k) to randomly change the sign.
[Variations]
At step D3, if the previously calculated recalculated normalization value X′= is not zero, the normalization value recalculator 23 may obtain a weighted sum of the recalculated normalization value X= and the previously recalculated normalization value X′= as the recalculated normalization value X=. If the previously recalculated normalization value X′= is zero, the weighted summing of the recalculated normalization values does not need to be performed. That is, if the previously recalculated normalization value X′= is zero, smoothing of the recalculated normalization value does not need to be performed.
If C0=L and a recalculated normalization value X= is calculated per each frame, the previously recalculated normalization value X′= is a recalculated normalization value calculated by the normalization value recalculator 23 for the immediately preceding frame. If C0 is a divisor of L other than 1 and L and frequency components are divided into L/C0 sub-bands and a recalculated normalization value is calculated per each sub-band, the previously recalculated normalization value X′= may be a recalculated normalization value calculated for the same sub-band in the previous frame or may be a recalculated normalization value already calculated for the preceding or succeeding adjacent sub-band in the same frame.
The recalculated normalization value Xpost= newly calculated by considering the previously recalculated normalization value X′= can be expressed by the equation given below, where α and β are adjustment coefficients which are determined as appropriate according to the desired performance and specifications. For example, α=β=0.5.
By obtaining a recalculated normalization value considering the previously recalculated normalization value X′=, the newly recalculated normalization value will be closer to the previously recalculated normalization value X′=. As a result the continuity between these values will increase and therefore the musical noise caused when the input signal is the frequency-domain signal, etc., can be further reduced.
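The smoothing described in this variation can be sketched as follows. The equation itself is not reproduced in the text, so the sketch assumes that α weights the newly recalculated value X= and β weights the previously recalculated value X′=; with the example setting α=β=0.5 the assignment makes no difference. The function and parameter names are hypothetical.

```python
def smooth_recalculated_value(current, previous, alpha=0.5, beta=0.5):
    """Weighted sum of the current and previous recalculated values.

    current  : newly recalculated normalization value
    previous : previously recalculated normalization value
    alpha    : weight of the current value (ASSUMPTION on the assignment)
    beta     : weight of the previous value (ASSUMPTION on the assignment)
    When the previous value is zero, smoothing is skipped, as in the text.
    """
    if previous == 0:
        return current
    return alpha * current + beta * previous
```

For example, a current value of 2.0 and a previous value of 1.0 give 1.5, which lies between the two and so changes less abruptly from frame to frame or from sub-band to sub-band.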
As indicated by a dashed line in
The quantization-candidate normalization value calculator 16 uses the quantized normalization value X− to calculate the value defined by the equation given below, for example, as a quantization-candidate normalization value E# (step E3′). Here, C2 is a positive adjustment coefficient (also referred to as a second constant), which may be 0.3, for example.
$$E^{\#} = C_2 \cdot \bar{X}$$
In this way, a quantization-candidate normalization value E# can be calculated from only the quantized normalization value X−, even at the decoding side, without transmitting information about the quantization-candidate normalization value E#. The need for transmitting information about the quantization-candidate normalization value E# is thus eliminated, and so the communication traffic can be reduced.
In this case, the decoding-candidate normalization value calculator 26 is provided in the decoding apparatus 2 as indicated by a dashed line in
The input signal X (k) does not necessarily need to be a frequency-domain signal; it may be any signal such as a time-domain signal. That is, the present invention can be used in coding and decoding of any signals besides frequency-domain signals.
C0, C1, C2 and C3 may be changed as appropriate according to desired performance and specifications.
The steps of the coding and decoding method can be implemented by a computer. The operations of processes at the steps are described in a program. The program is executed on the computer to implement the steps on the computer.
The program describing the operations of the processes can be stored in a computer-readable recording medium. At least part of the operations of the processes may be implemented by hardware.
The present invention is not limited to the embodiment described. Modifications can be made as appropriate without departing from the spirit of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
2010-051820 | Mar 2010 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2011/052541 | 2/7/2011 | WO | 00 | 10/4/2012 |