The present invention relates to a technique for encoding or decoding a signal sequence, for example, an acoustic signal such as sound or music or a video signal, according to vector quantization.
In an encoding device described in Patent literature 1, an input signal is first divided by a normalization value, and the input signal is thus normalized. The normalization value is quantized, and a quantization index is generated. The normalized input signal is vector-quantized, and a representative quantization vector index is generated. The generated quantization index and the generated representative quantization vector index are output to a decoding device.
In the decoding device, the quantization index is decoded, and a normalization value is generated. In addition, the representative quantization vector index is decoded, and a sample sequence is generated. A sequence of values that are obtained by multiplying respective samples in the generated sample sequence by the normalization value corresponds to a decoded signal sample sequence.
For the vector quantization, for example, a vector quantization method such as algebraic vector quantization (AVQ) described in Non-patent literature 1 is applied to normalized values of a predetermined number of samples. In this vector quantization method, a representative quantization vector is obtained by giving pulses within a range of a quantization bit number set in advance. Then, in the obtained representative quantization vector, bits for expressing a sample value are assigned to only part of the predetermined number of samples, and a quantization value other than 0 is obtained. Further, bits for expressing a sample value are not assigned to the other samples, and the quantization value is 0.
In the case where an input signal is, for example, a frequency-domain signal obtained by transforming an acoustic signal into a frequency-domain and where the above-mentioned vector quantization method is applied to the encoding device and the decoding device described in Patent literature 1, a spectral hole may occur if a bit number necessary to quantize every frequency component is not enough. The spectral hole refers to a partial loss in frequency components that is caused when a frequency component that should exist in an input signal does not exist in an output signal. If the occurrence of such a spectral hole prevents regular pulse rising of a given frequency component in a continuous frame, a problem of so-called musical noise unfavorably occurs. Such a problem of musical noise is particularly remarkable in the case where the encoding target is a frequency-domain signal, but similarly occurs in the case where the encoding target is a time-domain signal. In addition, in the case where an input signal is a video signal, a problem of block noise, which corresponds to musical noise in an acoustic signal, unfavorably occurs.
The present invention has an object to provide an encoding method, a decoding method, a device, a program, and a recording medium for reducing musical noise in an acoustic signal and block noise in a video signal.
In encoding, a quantized normalization value and a normalization-value quantization index corresponding to the quantized normalization value are obtained, the quantized normalization value being obtained by quantizing a normalization value that is a value representative of a predetermined number of samples. If a difference value that is obtained by subtracting a value corresponding to the quantized normalization value from a value corresponding to a magnitude of a value of each sample is positive and if the value of each sample is positive, the difference value is set as a quantization candidate corresponding to each sample. If the difference value is positive and if the value of each sample is negative, a value obtained by inverting positive/negative of the difference value is set as the quantization candidate corresponding to each sample. The quantization candidate is vector-quantized, and a vector quantization index is thus obtained and output. In addition, sign information is output, the sign information expressing positive/negative of at least one sample that does not make the difference value positive, among the samples.
In decoding, a decoded normalization value corresponding to an input normalization-value quantization index is obtained, and a plurality of values corresponding to an input vector quantization index are obtained as a plurality of decoded values. With the use of a recalculated normalization value having a value that becomes smaller as a sum of absolute values of a predetermined number of the decoded values is larger, if each decoded value is 0 and if sign information corresponding to the decoded value is input, a value obtained by giving positive/negative expressed by the sign information to a product of the recalculated normalization value and a first constant is set as a decoded signal. If each decoded value is 0 and if the sign information corresponding to the decoded value is not input, a value having, as an absolute value, a value obtained by multiplying the recalculated normalization value by the first constant is set as the decoded signal. If each decoded value is not 0, a value obtained by reflecting positive/negative of each decoded value in a linear sum of each decoded value or the absolute value of each decoded value and the decoded normalization value is set as the decoded signal.
According to the present invention, main components containing samples that are not quantization targets according to vector quantization such as AVQ are selected from among every frequency, and the selected main components are aggressively quantized. Accordingly, a spectral hole can be prevented from occurring in main components of a decoded signal, and this can reduce musical noise in the case where an input signal is an acoustic signal and can reduce block noise in the case where an input signal is a video signal.
Hereinafter, embodiments of the present invention are described in detail.
First, a first embodiment of the present invention is described.
<Configuration>
As illustrated in
<Encoding Process>
The encoding device 11 executes steps in an encoding method illustrated in
An input signal X(k) is input to the normalization value calculator 112, the quantization-candidate calculator 114, and the sign information output unit 117. The input signal X(k) in this example is a frequency-domain signal obtained by transforming, into the frequency domain, a time-domain signal x(n) that is a time-series signal such as an acoustic signal. The input signal X(k) in the frequency domain may be directly input to the encoding device 11, or the frequency-domain converter 111 may transform the input time-domain signal x(n) into the frequency domain to generate the input signal X(k). In the latter case, the frequency-domain converter 111 transforms the input time-domain signal x(n) into the input signal X(k) in the frequency domain according to, for example, the modified discrete cosine transform (MDCT), and outputs the resultant signal. n denotes an identification number (discrete time number) of a signal in the time domain, and k denotes an identification number (discrete frequency number) of a signal (sample) in the frequency domain. A larger value of k corresponds to a higher frequency. Assuming that one frame is constituted by L samples, the time-domain signals x(n) are transformed into the frequency domain for each frame, and the input signals X(k) in the frequency domain (k=0, 1, . . . , L−1) constituting L frequency components are generated. L is a predetermined positive integer, for example, 64 or 80. Note that, in the case of using the MDCT, the input time-series signals are transformed into the frequency domain for each frame constituted by the L samples, with the frame shifted by ½ frame, that is, by L/2 samples, each time.
The normalization value calculator 112 calculates a normalization value τX0− for each frame (Step E1). The normalization value τX0− is a value representative of a predetermined number C0 of samples among the L samples of the input signals X(k). τX0− means τX0 with an overline. Here, τ is an integer that is equal to or more than 0 and is uniquely assigned to each subband constituted by the predetermined number C0 of samples among the L samples in one frame. C0 is L or a divisor of L other than 1 and L. Setting C0 to L means obtaining the normalization value on an L-sample basis. Setting C0 to a divisor of L other than 1 and L means dividing the L samples into subbands and obtaining the normalization value on a C0-sample basis, the C0 samples constituting each subband. For example, in the case where L is equal to 64 and where each subband is constituted by 8 frequency components, 8 subbands are formed, and the normalization value for each subband is calculated. In the case where C0 is L, τ is equal to 0, and the normalization value τX0− is a value representative of the L samples. That is, one normalization value τX0− is calculated for each frame. On the other hand, in the case where C0 is a divisor of L other than 1 and L, τ is an integer (τ=0, . . . , (L/C0)−1) corresponding to each subband in one frame, and the normalization value τX0− is a value representative of the C0 samples that belong to the subband corresponding to τ. That is, (L/C0) normalization values τX0− (τ=0, . . . , (L/C0)−1) are calculated for each frame. In either case, k is equal to τ·C0, . . . , (τ+1)·C0−1. τX0− calculated by the normalization value calculator 112 is sent to the normalization value quantizer 113.
[Specific Examples of Normalization Value τX0−]
The normalization value τX0− is a value representative of the C0 samples. In other words, the normalization value τX0− is a value corresponding to the C0 samples. An example of the normalization value τX0− is the square root of the average power of the C0 samples.
Another example of the normalization value τX0− is the value obtained by dividing the square root of the total power of the C0 samples by C0.
Still another example of the normalization value τX0− is the average amplitude of the C0 samples.
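The three example normalization values can be illustrated by the following minimal sketch. The function name normalization_values and the representation of one frame as a Python list X are assumptions made for illustration only; the sums run over the samples k=τ·C0, . . . , (τ+1)·C0−1 of the subband τ.

```python
import math


def normalization_values(X, tau, C0):
    """Return the three example normalization values for subband tau.

    X is one frame of frequency-domain samples (hypothetical list layout).
    """
    band = X[tau * C0:(tau + 1) * C0]      # samples k = tau*C0 .. (tau+1)*C0 - 1
    total_power = sum(x * x for x in band)

    # Example 1: square root of the average power of the C0 samples.
    root_mean_power = math.sqrt(total_power / C0)
    # Example 2: square root of the total power, divided by C0.
    root_total_over_c0 = math.sqrt(total_power) / C0
    # Example 3: average amplitude of the C0 samples.
    mean_amplitude = sum(abs(x) for x in band) / C0

    return root_mean_power, root_total_over_c0, mean_amplitude
```

Any one of the three values may serve as τX0−; the sketch returns all of them only so that they can be compared for the same subband.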
(End of Description of [Specific Examples of Normalization Value τX0−])
The normalization value quantizer 113 quantizes the normalization value τX0−, to thereby obtain a quantized normalization value τX−, and also obtains a normalization-value quantization index corresponding to the quantized normalization value τX− (Step E2). τX− means τX with an overline. The quantized normalization value τX− is sent to the quantization-candidate calculator 114, and a code (bit stream) corresponding to the normalization-value quantization index is sent to the decoding device 12.
The quantization-candidate calculator 114 subtracts a value corresponding to the quantized normalization value from a value corresponding to the magnitude of the value X(k) of each sample of the input signal, to thereby calculate a difference value E(k)′. In the case where the difference value E(k)′ is positive and where the value X(k) of each sample is positive, the quantization-candidate calculator 114 sets the difference value E(k)′ as a quantization candidate E(k) corresponding to each sample. In the case where the difference value E(k)′ is positive and where the value X(k) of each sample is negative, the quantization-candidate calculator 114 sets a value obtained by inverting the positive/negative of the difference value, as the quantization candidate E(k) corresponding to each sample. In the case where the difference value E(k)′ is not positive, the quantization-candidate calculator 114 sets 0 as the quantization candidate E(k) corresponding to each sample. Examples of the value corresponding to the magnitude of the value X(k) of each sample include: an absolute value of the value X(k) of each sample; a value proportional to an absolute value of the value X(k) of each sample; a value obtained by multiplying an absolute value of the value X(k) of each sample by a constant or a variable θ; and an absolute value of a value obtained by multiplying the value X(k) of each sample by a constant and/or a variable. Examples of the value corresponding to the quantized normalization value include: the quantized normalization value; a value proportional to the quantized normalization value; and a value obtained by multiplying the quantized normalization value by a constant and/or a variable (Step E3). The quantization candidate E(k) is sent to the vector quantizer 115.
[Specific Example 1 of Step E3]
For example, the quantization-candidate calculator 114 performs processing illustrated in
The quantization-candidate calculator 114 sets τ·C0 to k, to thereby initialize a value of k (Step E31).
The quantization-candidate calculator 114 compares k with (τ+1)·C0. If k is less than (τ+1)·C0, the quantization-candidate calculator 114 goes to Step E33. If k is not less than (τ+1)·C0, the quantization-candidate calculator 114 ends the processing of Step E3 (Step E32). Note that a comparison method for “comparing δ with η” is not limited, and any comparison method may be adopted as long as the adopted method can determine a magnitude relation between δ and η. For example, a process of comparing δ with η in order to know whether or not δ<η is satisfied may be a process of determining whether or not δ<η is satisfied, may be a process of determining whether or not 0<η−δ is satisfied, may be a process of determining whether or not δ≧η is satisfied, or may be a process of determining whether or not 0≧η−δ is satisfied.
In Step E33, the quantization-candidate calculator 114 subtracts a value corresponding to the quantized normalization value from a value corresponding to an absolute value of the value X(k) of each sample of the input signal, to thereby calculate the difference value E(k)′ (Step E33). For example, the quantization-candidate calculator 114 calculates a value of E(k)′ defined by the following Equation (1). C1 is an adjustment constant of the normalization value, and has a positive value. C1 is, for example, 1.0. |•| expresses an absolute value of •.
E(k)′=|X(k)|−C1·τX−  (1)
The quantization-candidate calculator 114 compares the difference value E(k)′ with 0 (Step E34). If E(k)′ is not equal to or more than 0, the quantization-candidate calculator 114 updates E(k)′ to 0 (Step E35), and goes to Step E36. If E(k)′ is equal to or more than 0, the quantization-candidate calculator 114 goes to Step E36 without updating E(k)′.
In Step E36, the quantization-candidate calculator 114 compares X(k) with 0 (Step E36). If X(k) is not less than 0, the quantization-candidate calculator 114 sets E(k)′ to the quantization candidate E(k) (Step E37). If X(k) is less than 0, the quantization-candidate calculator 114 sets −E(k)′, which is obtained by inverting the positive/negative of E(k)′, to the quantization candidate E(k) (Step E38).
The quantization-candidate calculator 114 increments k by 1 (updates a value of k by setting k+1 as a new value of k), and goes to Step E32 (Step E39).
[Specific Example 2 of Step E3]
The quantization-candidate calculator 114 may decide the quantization candidate E(k) corresponding to the value X(k) of each sample of the input signal, for example, in the following manner.
The quantization-candidate calculator 114 sets 0 to k, to thereby initialize a value of k (Step E31).
The quantization-candidate calculator 114 compares k with C0 (Step E32). If k is less than C0, the quantization-candidate calculator 114 goes to Step E33. If k is not less than C0, the quantization-candidate calculator 114 ends the processing of Step E3.
The quantization-candidate calculator 114 subtracts a value corresponding to the quantized normalization value from a value corresponding to an absolute value of the value X(k) of each sample of the input signal, to thereby calculate the difference value E(k)′ (Step E33).
The quantization-candidate calculator 114 compares the difference value E(k)′ with 0 (Step E34). If E(k)′ is not equal to or more than 0, the quantization-candidate calculator 114 sets 0 to E(k) (Step E35′), increments k by 1 (Step E39), and goes to Step E32. If E(k)′ is equal to or more than 0, the quantization-candidate calculator 114 compares X(k) with 0 (Step E36). If X(k) is not less than 0, the quantization-candidate calculator 114 sets E(k)′ to the quantization candidate E(k) (Step E37). If X(k) is less than 0, the quantization-candidate calculator 114 sets −E(k)′, which is obtained by inverting the positive/negative of E(k)′, to the quantization candidate E(k) (Step E38). The quantization-candidate calculator 114 increments k by 1, and goes to Step E32 (Step E39).
In this way, the quantization-candidate calculator 114 selects the larger of 0 and the difference value, which is obtained by subtracting a value corresponding to the quantized normalization value from a value corresponding to the magnitude of a sample value, and decides a value obtained by multiplying the selected value by the sign of the sample value, as the quantization candidate.
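The processing of Step E3 for one subband can be sketched as follows. This is a minimal illustration, assuming that the value corresponding to the magnitude of X(k) is its absolute value and that the value corresponding to the quantized normalization value is C1 times that value; the function name quantization_candidates and the argument name Xq_norm (standing for the quantized normalization value τX−) are hypothetical.

```python
def quantization_candidates(X, Xq_norm, tau, C0, C1=1.0):
    """Return the quantization candidates E(k) for subband tau."""
    E = []
    for k in range(tau * C0, (tau + 1) * C0):    # Steps E31, E32, E39
        diff = abs(X[k]) - C1 * Xq_norm          # Step E33: difference value E(k)'
        if diff < 0:                             # Steps E34, E35: clip at 0
            diff = 0.0
        if X[k] < 0 and diff > 0:                # Steps E36, E38: restore the sign
            E.append(-diff)
        else:                                    # Step E37
            E.append(diff)
    return E
```

For example, with a quantized normalization value of 1.0 and C1 equal to 1.0, the samples 5.0, −3.0, 0.5, and −0.2 give the candidates 4.0, −2.0, 0, and 0: only the samples whose magnitude exceeds the normalization value remain as quantization targets.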
[Specific Example 3 of Step E3]
According to Specific Examples 1 and 2 of Step E3, the processing is branched in Step E34 depending on whether or not E(k)′ is equal to or more than 0. Alternatively, the processing may be branched in Step E34 depending on whether or not E(k)′ is more than 0 (End of Description of [Specific Examples of Step E3]).
The vector quantizer 115 collectively vector-quantizes the plurality of quantization candidates E(k) respectively corresponding to a plurality of samples, to thereby generate a vector quantization index.
The vector quantization index is an index that expresses a representative quantization vector. For example, the vector quantizer 115 selects a representative quantization vector that is the closest to a vector having, as its components, the plurality of quantization candidates E(k) corresponding to the plurality of samples, from among a plurality of representative quantization vectors stored in a vector code book storing part (not illustrated), and outputs a vector quantization index that expresses the selected representative quantization vector, to thereby perform the vector quantization. For example, the vector quantizer 115 collectively vector-quantizes the quantization candidates E(k) corresponding to the C0 samples. In the case where the quantization candidate E(k) is 0, the vector quantizer 115 performs the vector quantization using such a quantizing method that always makes a quantization value Ê(k) 0, for example, a vector quantization method such as algebraic vector quantization (AVQ; see G.718). In this way, in the case where the input signal is, for example, a frequency-domain signal, main components containing samples that are not quantization targets according to vector quantization such as AVQ are selected from among every frequency, and the selected main components are aggressively quantized. Accordingly, a spectral hole can be prevented from occurring in main components of a decoded signal, and this can reduce musical noise and block noise (the musical noise and the block noise are hereinafter collectively referred to as “musical noise and the like”).
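The code book search described above can be sketched as follows. The squared Euclidean distance as the measure of closeness, the exhaustive search, and the small explicit code book are assumptions for illustration; an algebraic method such as AVQ does not store and search representative vectors in this way.

```python
def vector_quantize(E, codebook):
    """Return (index, representative vector) of the code book entry closest to E."""
    best_index = 0
    best_dist = float("inf")
    for index, vec in enumerate(codebook):
        # Squared Euclidean distance between the candidate vector and this entry.
        dist = sum((e - c) ** 2 for e, c in zip(E, vec))
        if dist < best_dist:
            best_index, best_dist = index, dist
    # The components of the selected vector are the quantization values.
    return best_index, list(codebook[best_index])
```

The returned index plays the role of the vector quantization index, and the returned components play the role of the locally decoded quantization values Ê(k).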
In addition, the bit number of a code obtained by the vector quantization varies depending on the input signal. For some input signals, the bit number of a code (the vector quantization index and the like) obtained by the vector quantization may be less than a bit number assigned for the vector quantization, and part of bits assigned for the vector quantization may remain unused. Note that the “bits assigned for the vector quantization” mean bits assigned for a code (a code corresponding to the vector quantization index) obtained by the vector quantization, among codes sent from the encoding device 11 to the decoding device 12. The “bit number assigned for the vector quantization” means the bit number of the bits assigned for the vector quantization. The “bit number assigned for the vector quantization” may be determined for each frame, or may be determined for each subband. In addition, the “bit number assigned for the vector quantization” may vary depending on the input signal, or may be constant irrespective of the input signal. The vector quantizer 115 calculates, as an unused bit number U, the bit number of bits that are not used in actual vector quantization, among the bits assigned for the vector quantization. In the present embodiment, the unused bit number U is calculated for each frame (on an L-sample basis). For example, the vector quantizer 115 subtracts, from the bit number assigned for the vector quantization in a given frame to be processed, the total bit number of the vector quantization index obtained by vector-quantizing the L samples that actually belong to the given frame, and sets the resultant value as the unused bit number U.
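The calculation of the unused bit number U for one frame amounts to a subtraction, as in the following minimal sketch. The function name unused_bit_number and the representation of the frame's code as a list of per-subband bit lengths are assumptions for illustration.

```python
def unused_bit_number(assigned_bits, index_bit_lengths):
    """Return U: bits assigned for the vector quantization minus bits actually used."""
    used = sum(index_bit_lengths)   # total bit number of the vector quantization index
    if used > assigned_bits:
        raise ValueError("vector quantization used more bits than were assigned")
    return assigned_bits - used
```

For example, if 40 bits are assigned to a frame and the vector quantization indexes of its subbands occupy 10, 12, and 9 bits, U is 9.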
Further, the vector quantizer 115 outputs the plurality of quantization values Ê(k), which are values obtained by locally decoding the vector quantization index. For example, the vector quantizer 115 outputs respective components of the representative quantization vector expressed by the vector quantization index, as the quantization values Ê(k). The quantization value Ê(k) in this example is equal to a decoded value Ê(k) obtained by the decoding device 12. Note that the quantization value Ê(k) does not necessarily need to be identical with the decoded value Ê(k), and a decoded value Ê(k)′, which is 0 in the case where the quantization value Ê(k) is 0 and is not 0 in the case where the quantization value Ê(k) is not 0, may be used in place of the decoded value Ê(k). Note that Ê means E with a circumflex.
The vector quantizer 115 sends the vector quantization index, the unused bit number U, and the quantization value Ê(k) to the sign information output unit 117 (Step E4).
The sign information output unit 117 writes the sign information of a sample that makes the quantization value Ê(k) 0, of the input signal X(k) in the frequency-domain, into a region of unused bits (referred to as “unused bit region”) among the bits assigned for the vector quantization. In other words, the sign information output unit 117 places the sign information that expresses the positive/negative of the value X(k) of each sample that does not make E(k)′ positive (makes E(k)′ equal to or less than 0), into the unused bit region of a code (bit stream) corresponding to the vector quantization index (Step E5). Note that the unused bit region can be identified by, for example, a reference position (for example, an initial address) of a given unused bit region and the input unused bit number U.
As a result, the unused bit region can be effectively utilized, and the quality of decoded signals can be enhanced. Note that the upper limit of the bit number of the sign information written into the unused bit region is the unused bit number U. Accordingly, not all pieces of the sign information are necessarily written into the unused bit region. Under the circumstances, it is preferable that the sign information output unit 117 extract the sign information in accordance with criteria defined by considering auditory perceptual characteristics and write the extracted sign information into the unused bit region. For example, the sign information output unit 117 preferentially extracts the sign information of the input signal X(k) in the frequency domain at frequencies easily perceived by human beings, and writes the extracted sign information into the unused bit region.
[Specific Example of Step E5]
A simple example for simplifying the processing is described. Assuming that auditory perceptual characteristics become lower in a higher frequency region, the sign information corresponding to the unused bit number U is written over the unused bit region in order from a lower frequency. In this example, the sign information output unit 117 performs processing illustrated in
The sign information output unit 117 sets τ·C0 to k, and sets 0 to m, to thereby initialize values of k and m, and goes to Step E52 (Step E51).
The sign information output unit 117 compares k with (τ+1)·C0 (Step E52). If k is less than (τ+1)·C0, the sign information output unit 117 goes to Step E53. If k is not less than (τ+1)·C0, the sign information output unit 117 sets a region obtained by subtracting a region in which bits b(m) are placed from the unused bit region, as a new unused bit region, sets U−m as a new value of U (Step E510), and ends the processing of Step E5. Note that, in the case where C0 is L, Step E510 does not necessarily need to be executed.
The sign information output unit 117 compares m with U (Step E53). If m is less than U, the sign information output unit 117 goes to Step E54. If m is not less than U, the sign information output unit 117 increments k by 1 (Step E55), and goes to Step E52.
In Step E54, the sign information output unit 117 determines whether or not Ê(k) is 0 (Step E54). If Ê(k) is not equal to 0, the sign information output unit 117 increments k by 1 (Step E55), and goes to Step E52. If Ê(k) is equal to 0, the sign information output unit 117 compares X(k) with 0 (Step E56). If X(k) is less than 0, the sign information output unit 117 writes 0 into the mth bit b(m) in the unused bit region (Step E57), and goes to Step E59. If X(k) is not less than 0, the sign information output unit 117 writes 1 into the mth bit b(m) in the unused bit region (Step E58), and goes to Step E59. Note that a determination method for “determining whether or not δ is 0” is not limited, and any determination method may be adopted as long as the adopted method can make a determination corresponding to whether or not δ is 0. For example, whether or not δ is 0 may be determined by determining whether or not δ is equal to 0, may be determined by determining whether or not δ is equal to γ (γ is not equal to 0), and may be determined by determining whether or not δ>0 and δ<0 are satisfied.
In Step E59, the sign information output unit 117 increments m by 1 (Step E59), increments k by 1 (Step E55), and goes to Step E52.
A code (bit stream) corresponding to a modified vector quantization index containing the vector quantization index and the sign information written into the unused bit region is sent to the decoding device 12.
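The processing of Step E5 for one subband can be sketched as follows. The function name write_sign_bits and the representation of the unused bit region as a Python list are assumptions for illustration; the bit convention (0 for a negative sample, 1 otherwise) follows Steps E57 and E58.

```python
def write_sign_bits(X, E_hat, tau, C0, U):
    """Return (bits, new_U): sign bits b(0), b(1), ... and the remaining unused bit number."""
    bits = []
    m = 0                                         # Step E51
    for k in range(tau * C0, (tau + 1) * C0):     # Steps E52, E55
        if m >= U:                                # Step E53: no unused bit is left
            break                                 # remaining k would write nothing
        if E_hat[k] != 0:                         # Step E54: sign is already coded
            continue
        bits.append(0 if X[k] < 0 else 1)         # Steps E56 to E58
        m += 1                                    # Step E59
    return bits, U - m                            # Step E510: new value of U
```

Because k ascends from the lowest frequency of the subband, the sign information of lower-frequency samples is written first when U is smaller than the number of samples whose quantization value is 0.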
Note that, in the case of adopting, for the vector quantizer 115, a quantizing method in which the quantization value Ê(k) may not be 0 even in the case where the quantization candidate E(k) is 0, the vector quantizer 115 may vector-quantize only the quantization candidates E(k) having a value other than 0, and the sign information output unit 117 may output the sign information of a sample that makes the quantization candidate E(k) 0, of the input signal X(k) in the frequency domain. In this case, however, it is necessary to output and report, to the decoding device, the sample identification number k of the sample that makes the quantization candidate E(k) 0 or the sample identification number k of a sample that does not make the quantization candidate E(k) 0. For this reason, it is preferable that a vector quantization method in which the quantization value Ê(k) is always 0 in the case where the quantization candidate E(k) is 0 be adopted for the vector quantizer 115.
<Decoding Process>
The decoding device 12 executes steps in a decoding method illustrated in
The normalization value decoder 121 obtains a decoded normalization value τX− corresponding to the normalization-value quantization index input to the decoding device 12 (Step D1). The decoded normalization value τX− is sent to the normalization value recalculator 123. It is assumed that a normalization value corresponding to each of the plurality of normalization-value quantization indexes is stored in a code book storing part (not illustrated). The normalization value decoder 121 refers to the code book storing part using the input normalization-value quantization index as a key, and acquires a normalization value corresponding to the input normalization-value quantization index, as the decoded normalization value τX−.
The vector decoder 122 obtains, as the plurality of decoded values Ê(k), a plurality of values corresponding to the vector quantization index contained in the modified vector quantization index input to the decoding device 12. In addition, the vector decoder 122 calculates the unused bit number U using the vector quantization index (Step D2). The decoded values Ê(k) and the unused bit number U are sent to the synthesizer 124.
For example, it is assumed that a representative quantization vector corresponding to each of the plurality of vector quantization indexes is stored in a vector code book storing part (not illustrated). The vector decoder 122 refers to the vector code book storing part using the input vector quantization index as a key, and acquires the representative quantization vector corresponding to the input vector quantization index. Components of the representative quantization vector are the plurality of values corresponding to the input vector quantization index.
In addition, the vector decoder 122 calculates, as the unused bit number U, the bit number of bits that are not used in actual vector quantization, among the bits assigned for the vector quantization. In the present embodiment, the unused bit number U is calculated for each frame (on an L-sample basis). For example, the vector decoder 122 subtracts, from the bit number assigned for the vector quantization in a given frame to be processed, the total bit number of the vector quantization index corresponding to the given frame, and sets the resultant value as the unused bit number U.
The normalization value recalculator 123 calculates a recalculated normalization value τX= having a value that becomes smaller as the sum of absolute values of a predetermined number of the decoded values Ê(k) is larger (Step D3). The calculated recalculated normalization value τX= is sent to the synthesizer 124. The recalculated normalization value τX= means τX with a double overline.
[Specific Example of Step D3]
For example, the normalization value recalculator 123 performs processing illustrated in
The normalization value recalculator 123 sets τ·C0 to k, sets 0 to m, and sets 0 to tmp, to thereby initialize these values k, m, and tmp (Step D31).
The normalization value recalculator 123 compares k with (τ+1)·C0 (Step D32). If k is equal to or more than (τ+1)·C0, the normalization value recalculator 123 calculates a value of τX= defined by the following equation (Step D37), and ends the processing of Step D3.
If k is less than (τ+1)·C0, the normalization value recalculator 123 determines whether or not the decoded value Ê(k) is 0 (Step D33). If the decoded value Ê(k) is 0, the normalization value recalculator 123 increments m by 1 (Step D35), and goes to Step D36. If the decoded value Ê(k) is not 0, the normalization value recalculator 123 goes to Step D34.
The normalization value recalculator 123 calculates the power of the sample of the identification number k, and adds the calculated power to tmp (Step D34). After that, the normalization value recalculator 123 goes to Step D36. That is, the normalization value recalculator 123 sets a value obtained by adding the calculated power to a value of tmp, as a new value of tmp. For example, the normalization value recalculator 123 calculates the power according to the following equation.
(C1·τX−+|Ê(k)|)²
The normalization value recalculator 123 increments k by 1 (Step D36), and goes to Step D32 (End of Description of [Specific Example of Step D3]).
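The loop of Steps D31 to D36 can be sketched as follows. Two points in this sketch are assumptions rather than statements of the text: the per-sample power in Step D34 is taken to be the square of C1 times the decoded normalization value plus |Ê(k)|, and the closing expression of Step D37 is replaced by a hypothetical energy-balance formula that merely has the stated property of becoming smaller as the decoded magnitudes become larger. The function names and the argument name Xq_norm (standing for the decoded normalization value τX−) are also hypothetical.

```python
import math


def step_d3_accumulate(E_hat, Xq_norm, tau, C0, C1=1.0):
    """Return (m, tmp): count of zero decoded values and accumulated power."""
    m = 0
    tmp = 0.0                                       # Step D31
    for k in range(tau * C0, (tau + 1) * C0):       # Steps D32, D36
        if E_hat[k] == 0:                           # Step D33
            m += 1                                  # Step D35
        else:
            # Step D34. ASSUMPTION: power of the sample rebuilt as C1*norm + |E_hat(k)|.
            tmp += (C1 * Xq_norm + abs(E_hat[k])) ** 2
    return m, tmp


def recalculated_norm_assumed(m, tmp, Xq_norm, C0):
    """ASSUMPTION: illustrative stand-in for the equation of Step D37.

    Spreads the subband energy left after the nonzero samples over the m zero samples.
    """
    if m == 0:
        return 0.0
    return math.sqrt(max(C0 * Xq_norm ** 2 - tmp, 0.0) / m)
```

With a decoded normalization value of 2.0, C0 equal to 4, and a single nonzero decoded value, raising that decoded value from 1.0 to 1.5 lowers the value returned by the assumed formula, which is the qualitative behavior required of the recalculated normalization value.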
In the case where each decoded value Ê(k) is positive, the synthesizer 124 calculates the linear sum of each decoded value Ê(k) and the decoded normalization value τX−. In the case where each decoded value Ê(k) is negative, the synthesizer 124 calculates the positive/negative inverted value of the linear sum of: an absolute value of each decoded value Ê(k); and the decoded normalization value τX−. In the case where each decoded value Ê(k) is 0, the synthesizer 124 calculates the multiplication value of the recalculated normalization value τX= and a first constant C3 or the positive/negative inverted value of the multiplication value thereof. In this way, the synthesizer 124 obtains a value X̂(k) of the decoded signal.
Here, for a sample that makes the decoded value Ê(k) 0 and whose positive/negative is expressed by the sign information contained in the modified vector quantization index, the positive/negative of X̂(k) of the sample is identified by the corresponding sign information. That is, for a sample that makes the decoded value Ê(k) 0 and is expressed as positive by the sign information, the multiplication value of the recalculated normalization value τX= and the first constant C3 is X̂(k). For a sample that makes the decoded value Ê(k) 0 and is expressed as negative by the sign information, the positive/negative inverted value of the multiplication value of the recalculated normalization value τX= and the first constant C3 is X̂(k). Further, for a sample that makes the decoded value Ê(k) 0 and whose positive/negative is not identified by the sign information, the positive/negative of X̂(k) is randomly determined. That is, a value obtained by randomly inverting the positive/negative of the multiplication value of the recalculated normalization value τX= and the first constant C3 is X̂(k) (Step D4). In the present embodiment, the positive/negative of X̂(k) can be identified by the sign information that is transmitted using the unused bit region, and hence the quality of X̂(k) can be enhanced.
[Specific Example 1 of Step D4]
The synthesizer 124 performs, for example, processing illustrated in
The synthesizer 124 sets τ·C0 to k, and sets 0 to m, to thereby initialize values of k and m (Step D41).
The synthesizer 124 compares k with (τ+1)·C0 (Step D42). If k is not less than (τ+1)·C0, the synthesizer 124 sets a region obtained by subtracting a region in which the bits b(m) are placed from the unused bit region, as a new unused bit region, sets U−m as a new value of U (Step D414), and ends the processing of Step D4. Note that, in the case where C0 is L, Step D414 does not necessarily need to be executed. If k is less than (τ+1)·C0, the synthesizer 124 determines whether or not the decoded value Ê(k) is 0 (Step D43). If the decoded value Ê(k) is 0, the synthesizer 124 compares m with the unused bit number U (Step D44). If m is not less than U, the synthesizer 124 sets a value obtained by randomly inverting the positive/negative of the multiplication value of the recalculated normalization value τX= and the first constant C3, as the value X̂(k) of the decoded signal (Step D45). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation. C3 is a constant that adjusts the magnitude of frequency components. C3 in this example is a positive constant, and is, for example, 0.9. rand(k) is a function that outputs 1 or −1, and randomly outputs 1 or −1 on the basis of, for example, a random number.
In this way, the synthesizer 124 sets, as X̂(k), a value having, as an absolute value, the value obtained by multiplying the recalculated normalization value τX= by the first constant C3.
X̂(k)=C3·τX=·rand(k)
After Step D45, the synthesizer 124 increments k by 1 (Step D413), and goes to Step D42.
If it is determined in Step D44 that m is less than U, the synthesizer 124 determines whether or not the mth bit b(m) in the input unused bit region is 0 (Step D46) (Note that the position of the mth bit b(m) in the unused bit region contained in the modified vector quantization index is determined by a start bit position of the unused bit region and the placement order of the bits b(m), and can be easily identified if the unused bit number U is obtained). If b(m) is 0, the synthesizer 124 sets a value obtained by inverting the positive/negative of the multiplication value of the recalculated normalization value τX= and the first constant C3, as the value X̂(k) of the decoded signal (Step D47). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation.
X̂(k)=−C3·τX=
After Step D47, the synthesizer 124 increments each of m and k by 1 (Steps D412 and D413), and goes to Step D42.
If it is determined in Step D46 that b(m) is not 0, the synthesizer 124 sets the multiplication value of the recalculated normalization value and the first constant C3, as the value X̂(k) of the decoded signal (Step D48). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation.
X̂(k)=C3·τX=
After Step D48, the synthesizer 124 increments each of m and k by 1 (Steps D412 and D413), and goes to Step D42.
On the other hand, if it is determined in Step D43 that the decoded value Ê(k) is not 0, the synthesizer 124 compares the decoded value Ê(k) with 0 (Step D49). If the decoded value Ê(k) is less than 0, the synthesizer 124 adds the absolute value |Ê(k)| of the decoded value Ê(k) to the decoded normalization value τX−, inverts the positive/negative of the resultant value, and sets the resultant value as the value X̂(k) of the decoded signal (Step D410). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation.
X̂(k)=−(C1·τX−+|Ê(k)|)
If the decoded value Ê(k) is not less than 0, the synthesizer 124 sets, as X̂(k), a value obtained by adding the decoded value Ê(k) to the decoded normalization value τX− (Step D411).
X̂(k)=C1·τX−+Ê(k)
As described above, in the case where Ê(k) is not equal to 0, the synthesizer 124 calculates X̂(k) determined by X̂(k)=σ(Ê(k))·(C1·τX−+|Ê(k)|). Here, σ(•) expresses the positive/negative sign of •.
After Steps D410 and D411, the synthesizer 124 increments k by 1 (Step D413), and goes to Step D42.
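The flow of Steps D41 to D414 can be sketched in Python as follows. This is only an illustrative sketch: the function name, the argument names, and the representation of the unused bit region as a list of 0/1 integers holding at least U entries are assumptions made for this example, and rand(k) is modeled with a random choice between 1 and −1.

```python
import random


def synthesize_example1(e_hat, bits, u, tau, c0, c1, c3,
                        tau_x_bar, tau_x_dbar, rng=random):
    """Return (x_hat for the subband, remaining unused bits, new U).

    tau_x_bar is the decoded normalization value (single overline) and
    tau_x_dbar the recalculated one (double overline). Names are hypothetical.
    """
    x_hat = []
    k = tau * c0                         # Step D41
    m = 0
    while k < (tau + 1) * c0:            # Step D42
        if e_hat[k] == 0:                # Step D43
            if m >= u:                   # Step D44: no sign bit left
                # Step D45: random sign, rand(k) is 1 or -1
                x_hat.append(c3 * tau_x_dbar * rng.choice((1, -1)))
            else:
                if bits[m] == 0:         # Step D46
                    x_hat.append(-c3 * tau_x_dbar)   # Step D47
                else:
                    x_hat.append(c3 * tau_x_dbar)    # Step D48
                m += 1                   # Step D412
        elif e_hat[k] < 0:               # Step D49
            x_hat.append(-(c1 * tau_x_bar + abs(e_hat[k])))  # Step D410
        else:
            x_hat.append(c1 * tau_x_bar + e_hat[k])          # Step D411
        k += 1                           # Step D413
    # Step D414: drop the consumed bits and reduce U accordingly
    return x_hat, bits[m:], u - m
```

In this sketch a bit value of 0 is read as negative and any other value as positive, matching Steps D46 to D48; samples for which no sign bit remains keep the same magnitude C3·τX= with a random sign.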
[Specific Example 2 of Step D4]
The synthesizer 124 may perform, for example, processing illustrated in
The synthesizer 124 sets τ·C0 to k, to thereby initialize a value of k (Step D421).
The synthesizer 124 compares k with (τ+1)·C0 (Step D422). If k is not less than (τ+1)·C0, the synthesizer 124 goes to Step D429. If k is less than (τ+1)·C0, the synthesizer 124 determines whether or not the decoded value Ê(k) is 0 (Step D423). If the decoded value Ê(k) is 0, the synthesizer 124 sets a value obtained by randomly inverting the positive/negative of the multiplication value of the recalculated normalization value τX= and the first constant C3, as the value X̂(k) of the decoded signal (Step D424). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation.
In this way, the synthesizer 124 sets, as X̂(k), a value having, as an absolute value, the value obtained by multiplying the recalculated normalization value τX= by the first constant C3.
X̂(k)=C3·τX=·rand(k)
If it is determined in Step D423 that the decoded value Ê(k) is not 0, the synthesizer 124 compares the decoded value Ê(k) with 0 (Step D425). If the decoded value Ê(k) is less than 0, the synthesizer 124 adds the absolute value |Ê(k)| of the decoded value Ê(k) to the decoded normalization value τX−, inverts the positive/negative of the resultant value, and sets the resultant value as the value X̂(k) of the decoded signal (Step D426). That is, the synthesizer 124 calculates, as X̂(k), a value defined by the following equation.
X̂(k)=−(C1·τX−+|Ê(k)|)
If the decoded value Ê(k) is not less than 0, the synthesizer 124 sets, as X̂(k), a value obtained by adding the decoded value Ê(k) to the decoded normalization value τX− (Step D427).
X̂(k)=C1·τX−+Ê(k)
The synthesizer 124 increments k by 1 (Step D428) after deciding X̂(k), and goes to Step D422.
The synthesizer 124 sets τ·C0 to k, and sets 0 to m, to thereby initialize values of k and m (Step D429).
The synthesizer 124 compares k with (τ+1)·C0 (Step D430). If k is not less than (τ+1)·C0, the synthesizer 124 sets a region obtained by subtracting a region in which the bits b(m) are placed from the unused bit region, as a new unused bit region, sets U−m as a new value of U (Step D438), and ends the processing of Step D4. Note that, in the case where C0 is L, Step D438 does not necessarily need to be executed. If k is less than (τ+1)·C0, the synthesizer 124 compares m with the unused bit number U (Step D431). If m is not less than U, the synthesizer 124 increments k by 1 (Step D437), and goes to Step D430. If m is less than U, the synthesizer 124 determines whether or not the decoded value Ê(k) is 0 (Step D432). If the decoded value Ê(k) is not 0, the synthesizer 124 increments k by 1 (Step D437), and goes to Step D430. If the decoded value Ê(k) is 0, the synthesizer 124 determines whether or not the mth bit b(m) in the input unused bit region is 0 (Step D433). If b(m) is 0, the synthesizer 124 sets a value obtained by inverting the positive/negative of the multiplication value of the recalculated normalization value τX= and a constant C3′, as the value X̂(k) of the decoded signal (Step D434). C3′ is a constant that adjusts the magnitude of frequency components, and C3′ is, for example, equal to C3 or ε·C3. ε is a variable determined in accordance with a constant or other processing. That is, the synthesizer 124 sets, as X̂(k), a value defined by the following equation.
X̂(k)=−C3′·τX=
Note that, in the processing in Step D434, only the positive/negative of X̂(k) obtained in Step D424 may be modified, only the positive/negative of a value obtained by changing the amplitude of X̂(k) obtained in Step D424 may be modified, or Equation (3) may be newly calculated. After Step D434, the synthesizer 124 increments each of m and k by 1 (Steps D436 and D437), and goes to Step D430.
If it is determined in Step D433 that b(m) is not 0, the synthesizer 124 sets the multiplication value of the recalculated normalization value τX= and the constant C3′, as the value X̂(k) of the decoded signal (Step D435). That is, the synthesizer 124 sets, as X̂(k), a value defined by the following equation.
X̂(k)=C3′·τX=
Note that, in the processing in Step D435, only the positive/negative of X̂(k) obtained in Step D424 may be modified, only the positive/negative of a value obtained by changing the amplitude of X̂(k) obtained in Step D424 may be modified, or Equation (4) may be newly calculated. After Step D435, the synthesizer 124 increments each of m and k by 1 (Steps D436 and D437), and goes to Step D430 (End of Description of [Specific Example of Step D4]).
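The two-pass flow of Specific Example 2 can be sketched in Python as follows. This is a hedged sketch, not the definitive implementation: the function and argument names are hypothetical, the unused bit region is assumed to be a list of 0/1 integers with at least U entries, and the second pass always recomputes the value from C3′·τX= (the third of the options noted for Steps D434 and D435).

```python
import random


def synthesize_example2(e_hat, bits, u, tau, c0, c1, c3, c3_prime,
                        tau_x_bar, tau_x_dbar, rng=random):
    """Return (x_hat for the subband, remaining unused bits, new U)."""
    start, stop = tau * c0, (tau + 1) * c0
    x_hat = [0.0] * c0

    # First pass (Steps D421 to D428): every zero-valued sample gets a random sign.
    for k in range(start, stop):
        if e_hat[k] == 0:                                     # Step D423
            x_hat[k - start] = c3 * tau_x_dbar * rng.choice((1, -1))  # Step D424
        elif e_hat[k] < 0:                                    # Step D425
            x_hat[k - start] = -(c1 * tau_x_bar + abs(e_hat[k]))      # Step D426
        else:
            x_hat[k - start] = c1 * tau_x_bar + e_hat[k]              # Step D427

    # Second pass (Steps D429 to D438): overwrite signs while sign bits remain.
    m = 0
    for k in range(start, stop):
        if m >= u:            # Step D431: the described flow only advances k from
            break             # here on, so leaving the loop early is equivalent
        if e_hat[k] != 0:     # Step D432
            continue
        sign = -1 if bits[m] == 0 else 1                      # Step D433
        x_hat[k - start] = sign * c3_prime * tau_x_dbar       # Steps D434 / D435
        m += 1                                                # Step D436
    return x_hat, bits[m:], u - m                             # Step D438
```

With C3′ equal to C3, this sketch yields the same deterministic values as Specific Example 1 for the samples covered by sign bits; the remaining zero-valued samples keep the random sign assigned in the first pass.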
In the case where a decoded signal in a time-domain is necessary, X̂(k) output from the synthesizer 124 is input to the time-domain converter 125. The time-domain converter 125 transforms X̂(k) into a time-domain signal z(n) according to, for example, inverse MDCT, and outputs the resultant signal.
As described above, in the present embodiment, in the case where the decoded value Ê(k) is 0, a value other than 0 is assigned to X̂(k) using the recalculated normalization value τX=, and hence a spectral hole can be prevented from occurring when the input signal is, for example, a frequency-domain signal. This can reduce the musical noise and the like.
Further, in the present embodiment, the sign information is transmitted to the decoding device 12 using the unused bit region that is not used for the vector quantization by the encoding device 11. Accordingly, the decoding device 12 can identify the positive/negative of X̂(k) using the sign information transmitted in the unused bit region, and hence the quality of X̂(k) can be enhanced.
Note that, because the upper limit of the bit number of the sign information written into the unused bit region is the unused bit number U, the sign information corresponding to every frequency is not necessarily written into the unused bit region. In this case, the sign information is extracted in accordance with criteria defined by considering auditory perceptual characteristics, and the extracted sign information is written into the unused bit region, whereby the decoding device 12 can correctly identify the positive/negative of X̂(k) at frequencies that are important in terms of, for example, the auditory perceptual characteristics. As a result, the quality of X̂(k) at the frequencies that are important in terms of the auditory perceptual characteristics can be preferentially enhanced.
In addition, the positive/negative of X̂(k) at frequencies at which the sign information cannot be transmitted is randomly determined using the function rand(k), and thus is not constant. Accordingly, a natural decoded signal can be made even for the frequencies at which the sign information cannot be transmitted.
As indicated by a broken line in
The quantization-candidate normalization value calculator 116 uses, for example, the quantized normalization value τX− to calculate a value defined by the following equation, as the quantization-candidate normalization value τE− (FIG. 2/Step E3′). C2 is a positive adjustment factor (which may be referred to as a second constant), and is, for example, 0.3.
τE−=C2·τX−
In this way, because the quantization-candidate normalization value τE− is calculated from the quantized normalization value τX−, the decoding device can calculate the quantization-candidate normalization value τE− from the quantized normalization value τX− without transmission of information on the quantization-candidate normalization value τE−. Accordingly, the need to transmit the information on the quantization-candidate normalization value τE− is eliminated, and the volume of communication can be reduced.
In this case, as indicated by a broken line in
In a second embodiment, the decoding device 12 of the first embodiment or the modification thereof is replaced with a decoding device 22 (
In Step D3 (
In the case where C0 is equal to L and where the recalculated normalization value τX= is calculated for each frame, the recalculated normalization value τX′= calculated last time is a recalculated normalization value that is calculated for one frame before by the normalization value recalculator 223. In the case where C0 is a divisor of L other than 1 and L and where the frequency components are divided into L/C0 subbands and the recalculated normalization value is calculated for each subband, the recalculated normalization value τX′= calculated last time may be a recalculated normalization value that is calculated for the same subband in one frame before, and may be a recalculated normalization value of the previous or subsequent continuous subband in the same frame for which the recalculated normalization value has already been calculated.
Assuming that a recalculated normalization value that is newly calculated this time in consideration of the recalculated normalization value τX′= calculated last time is τXpost=, τXpost= is expressed as in the following equation. α1 and β1 are adjustment factors, and are decided as appropriate in accordance with desired performance and specifications. α1 and β1 are equal to, for example, 0.5.
In this way, because the recalculated normalization value is calculated in consideration of the recalculated normalization value τX′= calculated last time, values of the recalculated normalization value τX′= calculated last time and the recalculated normalization value calculated this time become close to each other, and the continuity between these values is improved. Accordingly, the musical noise and the like occurring when the input signal is, for example, a frequency-domain signal can be further reduced.
In a third embodiment, the decoding device 12 of the first embodiment or the modification thereof or the decoding device 22 of the second embodiment is replaced with a decoding device 32 (
The smoothing unit 326 receives, as its input, the value X̂(k) of the decoded signal obtained in Step D4 (
X̂POST(k) is expressed as in the following equation. α2 and β2 are adjustment factors, and are decided as appropriate in accordance with desired performance and specifications. α2 is equal to, for example, 0.85, and β2 is equal to, for example, 0.15. φ(•) expresses the positive/negative sign of •.
This can reduce the musical noise and the like caused by the discontinuity in the time axis direction of amplitude characteristics of X̂(k). In the case where a decoded signal in a time-domain is necessary, X̂POST(k) output from the smoothing unit 326 is input to the time-domain converter 125. The time-domain converter 125 transforms X̂POST(k) into the time-domain signal z(n) according to, for example, inverse MDCT, and outputs the resultant signal.
In a fourth embodiment, the decoding device 12, 22, or 32 of each embodiment described above or the modification thereof is replaced with a decoding device 42 (
In the present embodiment, the synthesizer 424 receives, as its inputs, τX−, b(m), Ê(k), and U, and performs processing illustrated in
The synthesizer 424 sets τ·C0 to k, sets 0 to m, and sets 0 to tmp, to thereby initialize these values k, m, and tmp (Step D311).
The synthesizer 424 compares k with (τ+1)·C0 (Step D312). If k is less than (τ+1)·C0, the synthesizer 424 determines whether or not the decoded value Ê(k) is 0 (Step D313). If the decoded value Ê(k) is 0, the synthesizer 424 increments k by 1 (Step D317), and goes to Step D312. If the decoded value Ê(k) is not 0, the synthesizer 424 calculates the power of the sample of the identification number k, and adds the calculated power to tmp (Step D314). That is, the synthesizer 424 sets a value obtained by adding the calculated power to a value of tmp, as a new value of tmp. For example, the synthesizer 424 calculates the power according to the following equation.
(C1·τX−+|Ê(k)|)²
Further, the synthesizer 424 increments m by 1 (Step D315), and calculates the following equation (Step D316).
X̂(k)=SIGN(Ê(k))·(C1·τX−+|Ê(k)|)
Note that SIGN(Ê(k)) is a function that is 1 when Ê(k) is positive and is −1 when Ê(k) is negative. After that, the synthesizer 424 increments k by 1 (Step D317), and goes to Step D312.
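The first loop of the fourth embodiment (Steps D311 to D317) can be sketched in Python as follows. The sketch rests on several assumptions that are not confirmed by the text: the names are hypothetical, Step D317 is taken to advance k, the per-sample power is taken to be the square of C1·τX−+|Ê(k)|, and zero-valued samples are left as None because they are filled in by the later steps.

```python
def first_pass_fourth_embodiment(e_hat, tau, c0, c1, tau_x_bar):
    """Return (x_hat with None for zero-valued samples, m, tmp).

    Here m counts the non-zero decoded samples, as in Step D315.
    """
    start = tau * c0
    x_hat = [None] * c0
    m = 0                  # Step D311: initialize m and tmp (k is the loop variable)
    tmp = 0.0
    for k in range(start, start + c0):        # Steps D312 and D317
        if e_hat[k] == 0:                     # Step D313
            continue
        magnitude = c1 * tau_x_bar + abs(e_hat[k])
        tmp += magnitude ** 2                 # Step D314 (assumed power)
        m += 1                                # Step D315
        sign = 1 if e_hat[k] > 0 else -1      # SIGN(E^(k))
        x_hat[k - start] = sign * magnitude   # Step D316
    return x_hat, m, tmp
```

Merging the power accumulation and the synthesis of non-zero samples into one loop is what distinguishes this flow from the first embodiment, where the two are carried out by separate units.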
If it is determined in Step D312 that k is not less than (τ+1)·C0, the synthesizer 424 calculates a value of τX= defined by the following equation (Step D318).
Further, the synthesizer 424 sets τ·C0 to k, to thereby initialize a value of k (Step D321).
The synthesizer 424 compares k with (τ+1)·C0 (Step D322). If k is not less than (τ+1)·C0, the synthesizer 424 goes to Step D429 in
X̂(k)=C3·τX=·rand(k)
After that, the synthesizer 424 increments k by 1 (Step D328), and goes to Step D322.
On the other hand, if it is determined in Step D323 that the decoded value Ê(k) is not 0, the synthesizer 424 increments k by 1 (Step D328) without updating X̂(k), and goes to Step D322.
Note that the present invention is not limited to the above-mentioned embodiments. For example, C0, C1, C2, C3, α1, β1, α2, and β2 may be changed as appropriate in accordance with desired performance and specifications.
In addition, the input signal X(k) does not necessarily need to be a frequency-domain signal, and may be a given signal such as a time-domain signal. That is, the present invention can be applied to encoding and decoding of a given signal other than a frequency-domain signal.
In addition, a normalization value FGAIN for the input signal X(k) may be determined for each frame, and the quantization-candidate calculator 114 may use a value obtained by normalizing X(k) using the normalization value FGAIN in place of the value X(k) of each sample of the input signal, and may use a value obtained by normalizing τX− using the normalization value FGAIN in place of the quantized normalization value τX−, to thereby execute the processing of Step E3. For example, the processing of Step E3 may be executed in the state where X(k) is replaced with X(k)/FGAIN and where τX− is replaced with τX−/FGAIN. In addition, in this case, the normalization value calculator 112 does not exist, and the value obtained by normalizing X(k) using the normalization value FGAIN may be input to the normalization value quantizer 113, in place of the quantized normalization value τX−. In this case, the quantization-candidate calculator 114 may use a quantization value of the value obtained by normalizing X(k) using the normalization value FGAIN in place of the quantized normalization value τX−, to thereby perform the processing of Step E3. The normalization-value quantization index may correspond to the quantization value of the value obtained by normalizing X(k) using the normalization value FGAIN.
In addition, the various processes described above may be chronologically executed in the described order, and may be executed in parallel or individually as needed or in accordance with the processing capacity of an apparatus that executes the processes. Moreover, it goes without saying that the present invention can be changed as appropriate within the range not departing from the gist thereof.
[Hardware, Program, and Recording Medium]
The above-mentioned encoding device 11 and decoding devices 12, 22, and 32 each include, for example: a publicly known or dedicated computer formed of a central processing unit (CPU), a random-access memory (RAM), and the like; and a special program in which the above-mentioned processing contents are written. In this case, the special program is read by the CPU, and the CPU executes the special program to thereby implement each function. In addition, the special program may be formed of a single program sequence, or may implement a desired function by reading another program or a library.
Such a program can be recorded in a computer-readable recording medium. Examples of the computer-readable recording medium include a magnetic recording apparatus, an optical disk, a magneto-optical recording medium, and a semiconductor memory. The program is distributed by, for example, selling, assigning, leasing, and the like a portable recording medium such as a DVD or a CD-ROM that records therein the program. Alternatively, the program may be stored in a storage device of a server computer and be transferred for distribution from the server computer to another computer via a network.
A computer that executes such a program, for example, first temporarily stores the program recorded in the portable recording medium or the program transferred from the server computer, into its own storage device. Then, at the time of execution of processing, the computer reads the program stored in its own recording medium, and executes the processing according to the read program. Alternatively, according to another execution mode of the program, the computer may read the program directly from the portable recording medium to execute the processing according to the program. Still alternatively, each time the program is transferred from the server computer to the computer, the computer may sequentially execute the processing according to the received program.
In addition, at least part of the processing parts of the encoding device 11 and the decoding device 12, 22, or 32 may be formed of a special integrated circuit.
Number | Date | Country | Kind |
---|---|---|---|
2010-152950 | Jul 2010 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2011/065273 | 7/4/2011 | WO | 00 | 12/26/2012 |