The present invention relates to an audio signal encoder, and more particularly to an audio signal encoder that includes: a mapping transform unit for subjecting input audio signals to mapping transform to generate frequency region signals that vary in response to frequency variation (also referred to as frequency domain signals, which are expressed as a function defined with respect to a frequency domain); a code amount designation unit that supplies, as a code amount, a coding bit rate set or designated by a user; and a frequency region signal compression encoder that, based on the code amount designated by the code amount designation unit, subjects frequency region signals to compression encoding processing to generate a bitstream.
One example of an audio signal encoder of the prior art is described in Digital Audio Compression Standard AC-3 issued by the Advanced Television Systems Committee (referred to hereinbelow as Reference 1).
Bandwidth-limiting filter 20 eliminates, from the input audio signals, frequency components that are not intended to be encoded. Mapping transform unit 11 executes a mapping transform process on the bandwidth-limited audio signals to generate frequency region signals. Code amount designation unit 12 transfers a coding bit rate that has been designated by the user to frequency region signal compression encoder 13. Based on the coding bit rate supplied by code amount designation unit 12, frequency region signal compression encoder 13 executes compression-coding processing on the frequency region signals to generate a bitstream.
In the above-described audio signal encoder of the prior art, the frequency components that are included in the input audio signals but are not intended to be encoded are removed through bandwidth-limiting filter processing in bandwidth-limiting filter 20. As an example, the use of a 3-Hz high-pass filter is recommended in the section on Input Filtering in Chapter 8.2.1.3 of the above-described Reference 1.
However, this bandwidth-limiting filtering typically requires a large number of product-sum operations, and thus has the problem of entailing a large amount of operations.
The bandwidth-limited audio signals are subject to a mapping transform in mapping transform unit 11 and converted to frequency region signals. In Reference 1, a Modified Discrete Cosine Transform (MDCT) is used as the mapping transform to generate MDCT coefficients. The MDCT coefficients are frequency region signals that specify the behavior of the input audio signals through the use of frequency as a variable. The Modified Discrete Cosine Transform is widely used as a mapping transform means in audio encoding, and since the details regarding such aspects as calculation formulas of this means are widely known from documents such as Reference 1, explanation is here omitted. In Reference 1, a single Modified Discrete Cosine Transform normally generates 256 MDCT coefficients.
The MDCT coefficient represents spectrum intensity of an input audio signal with respect to frequency.
Code amount designation unit 12 supplies a coding bit rate that has been predetermined or that has been designated by a user to frequency region signal compression encoder 13.
Frequency region signal compression encoder 13 subjects the MDCT coefficients that have been generated by mapping transform unit 11 to information compression so as to meet the coding bit rate designated by code amount designation unit 12 and generates a bitstream. The information compression in this case includes entropy coding of quantized values, suppression of signal redundancy among a plurality of channels, and quantization based on auditory characteristics that are generally widely used in audio encoding. These techniques are generally widely known from documents such as Reference 1, and because these techniques have no relation to the novelty of the present invention, explanation regarding the details of these techniques is here omitted.
As previously described, the problem with the audio signal encoder of the above-described prior art is that the filter processing of the bandwidth-limiting filter requires a large number of product-sum operations, which results in a large amount of operations.
It is an object of the present invention to eliminate the signals of a frequency zone which are not the object of coding by means of a small amount of operations and thereby improve the performance of an audio signal encoder, and further, to increase the speed of the encoding process, reduce power consumption, improve integration, and finally, simplify the circuits and the device construction.
To achieve the above-described object, the audio signal encoder of the present invention includes a bandwidth-limiting unit for executing bandwidth-limiting processing in accordance with attenuation characteristics that have been set corresponding to the code amount designated by said code amount designation unit. The bandwidth-limiting processing includes steps of allocating a part of the frequency zone covered by the frequency region signals to an attenuation frequency zone, and multiplying the values of frequency region signals in the attenuation frequency zone by attenuation coefficients each having a value less than 1 to attenuate the frequency region signals in the attenuation frequency zone; and supplying frequency region signals that have undergone the bandwidth-limiting processing to the frequency region signal compression encoder.
As one embodiment of the bandwidth-limiting unit, the bandwidth-limiting unit executes a bandwidth-limiting processing of: attenuating the frequency region signal in an attenuation frequency zone by multiplying the frequency region signal by an attenuation coefficient defined so as to vary or monotonically decrease as the frequency varies from an attenuation start frequency to an attenuation end frequency; and making the value of the frequency region signal zero in a frequency zone beyond the attenuation end frequency. Here, the attenuation frequency zone is a frequency interval defined by the attenuation start frequency and the attenuation end frequency and is set based on the code amount designated by the code amount designation unit.
The relation between the attenuation start frequency and the attenuation end frequency can be variously set according to the object. When the bandwidth-limiting unit is intended to attenuate frequency region signals in a high-frequency zone, the attenuation end frequency is set equal to the attenuation start frequency, or the attenuation end frequency is set higher than the attenuation start frequency. Setting the attenuation end frequency equal to the attenuation start frequency enables a stepped attenuation of the frequency region signals in the zone of higher frequencies than the attenuation start frequency. Alternatively, setting the attenuation end frequency higher than the attenuation start frequency enables a gradual attenuation of the frequency region signals in the zone of higher frequencies than the attenuation start frequency.
When the bandwidth-limiting unit attenuates a frequency region signal of a low-frequency region, the attenuation end frequency is set equal to the attenuation start frequency, or the attenuation end frequency is set lower than the attenuation start frequency. In this case, setting the attenuation end frequency equal to the attenuation start frequency enables a stepped attenuation of the frequency region signals in the zone of lower frequencies than the attenuation start frequency. Alternatively, setting the attenuation end frequency lower than the attenuation start frequency enables gradual attenuation of the frequency region signals in a region of lower frequencies than the attenuation start frequency.
The attenuation coefficients can be set to have an attenuation characteristic represented as a linear function which decreases linearly as the frequency varies in the attenuation frequency zone from the attenuation start frequency to the attenuation end frequency with an initial value set to 1.
Alternatively, the attenuation coefficients can be set to have an attenuation characteristic represented as a trigonometric function which decreases trigonometrically as the frequency varies in the attenuation frequency zone from the attenuation start frequency to the attenuation end frequency with an initial value set to 1.
The attenuation frequency zone is a frequency interval defined by the attenuation start frequency and the attenuation end frequency; the frequency zone can be a frequency interval defined by frequency 0 and the inverse of the product of 2 and the sampling period of audio signals (that is, half the sampling frequency); and the attenuation coefficients are 1 in a range of the frequency zone other than the attenuation frequency zone.
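For the case of attenuating a high-frequency zone, the attenuation coefficient described above can be summarized piecewise as follows. This is only a restatement under assumed notation: f_A and f_B denote the attenuation start and end frequencies, F_S denotes the sampling frequency, and g stands for whichever decreasing characteristic (linear or trigonometric) is chosen, with g(f_A) = 1.

```latex
AT(f) =
\begin{cases}
1,    & 0 \le f < f_A \\
g(f), & f_A \le f < f_B \\
0,    & f_B \le f \le F_S/2
\end{cases}
```

When f_B is set equal to f_A, the middle branch vanishes and the characteristic becomes the stepped attenuation described above.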
The bandwidth-limiting unit attenuates frequency region signals by multiplying the frequency region signal by an attenuation coefficient determined for each frequency in advance in accordance with a coding bit rate designated by the code amount designation unit. Signals of a frequency zone that is not the object of encoding can thus be eliminated with a smaller amount of operations, thereby realizing higher-quality audio signal encoding.
We next refer to the accompanying figures to provide a detailed explanation of an embodiment of the present invention.
We first refer to
The audio signal encoder of the present embodiment includes mapping transform unit 11, bandwidth-limiting unit 10, code amount designation unit 12, and frequency region signal compression encoder 13.
Mapping transform unit 11 transforms input audio signals to frequency region signals. Bandwidth-limiting unit 10 attenuates a part of the frequency region signals. Frequency region signal compression encoder 13 compression-encodes the bandwidth-limited frequency region signals to generate a bitstream. Code amount designation unit 12 supplies a coding bit rate, which has been designated by a user, to both bandwidth-limiting unit 10 and frequency region signal compression encoder 13.
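The data flow among the four units can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name encode_block and the three toy stand-in functions are hypothetical, and the stand-ins exist only so that the sketch runs. The point shown is that the coding bit rate is passed to both the bandwidth-limiting step and the compression-encoding step.

```python
from typing import Callable, List, Sequence

Coeffs = List[float]


def encode_block(
    samples: Sequence[float],
    bit_rate: int,
    mapping_transform: Callable[[Sequence[float]], Coeffs],
    bandwidth_limit: Callable[[Coeffs, int], Coeffs],
    compress_encode: Callable[[Coeffs, int], bytes],
) -> bytes:
    """Pass one block of samples through units 11 -> 10 -> 13."""
    coeffs = mapping_transform(samples)          # mapping transform unit 11
    limited = bandwidth_limit(coeffs, bit_rate)  # bandwidth-limiting unit 10
    return compress_encode(limited, bit_rate)    # compression encoder 13


# Toy stand-ins (hypothetical, NOT real transforms or codecs).
def toy_transform(samples: Sequence[float]) -> Coeffs:
    return [float(s) for s in samples]


def toy_limit(coeffs: Coeffs, bit_rate: int) -> Coeffs:
    # Keep only the lower half of the coefficients at low bit rates.
    keep = len(coeffs) // 2 if bit_rate < 128_000 else len(coeffs)
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)


def toy_encode(coeffs: Coeffs, bit_rate: int) -> bytes:
    return bytes(int(abs(c)) % 256 for c in coeffs)


bitstream = encode_block([1, 2, 3, 4], 64_000, toy_transform, toy_limit, toy_encode)
```

Passing the units as callables keeps the sketch independent of any particular transform or entropy coder, which the present description leaves to Reference 1.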
Explanation is next presented regarding the operation of the present embodiment.
Input audio signals are supplied to mapping transform unit 11. Mapping transform unit 11 effects a mapping transform on the input audio signals as in the prior art and generates frequency region signals. Explanation here involves a case in which a Modified Discrete Cosine Transform (MDCT) is employed as the mapping transform. In Reference 1, a single Modified Discrete Cosine Transform normally produces 256 MDCT coefficients. These MDCT coefficients express the spectrum intensity for each of the frequencies of the input audio signals. An arrangement of these MDCT coefficients in order starting from the lowest frequency can be expressed as:
MDCT (0), MDCT (1), . . . , MDCT (255) (1)
The detailed operation of mapping transform unit 11 is identical to that of the prior art, and since this operation has no relation to the characteristic part of the present invention, explanation of this operation is here omitted.
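For orientation only, a generic textbook form of the MDCT can be sketched as below. It is an assumption-laden illustration: it uses the common definition X(k) = Σ x(n)·cos[(π/N)(n + 1/2 + N/2)(k + 1/2)] with no window function, which is not necessarily the exact transform of Reference 1. It shows that a block of 512 input samples yields 256 coefficients ordered from the lowest frequency.

```python
import math
from typing import List, Sequence


def mdct(block: Sequence[float]) -> List[float]:
    """Direct O(N^2) textbook MDCT: 2N input samples -> N coefficients."""
    two_n = len(block)
    if two_n == 0 or two_n % 2:
        raise ValueError("block length must be a positive even number")
    n = two_n // 2
    phase = 0.5 + n / 2.0
    return [
        sum(
            x * math.cos(math.pi / n * (i + phase) * (k + 0.5))
            for i, x in enumerate(block)
        )
        for k in range(n)
    ]


# 512 samples of a test sine wave -> MDCT(0) ... MDCT(255)
block = [math.sin(2 * math.pi * 5 * i / 512) for i in range(512)]
coeffs = mdct(block)
```

A practical encoder would use a windowed, FFT-based transform; the direct sum is used here only because it is short and easy to check.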
Code amount designation unit 12 supplies a coding bit rate designated by a user or a coding bit rate that has been determined in advance to bandwidth-limiting unit 10 and frequency region signal compression encoder 13. Except for the increase in the output destinations of the coding bit rate, the operation of code amount designation unit 12 is identical to that of the prior art.
Bandwidth-limiting unit 10, which is a characteristic part of the present invention, attenuates some of the received MDCT coefficients. The attenuation coefficients by which the MDCT coefficients are multiplied are determined so as to provide the preset attenuation characteristic, based on the coding bit rate designated by code amount designation unit 12.
Explanation next regards the method of attenuating the high-frequency component.
According to Nyquist's sampling theorem, if the highest frequency included in a signal is fMAX, then the original waveform can be reproduced by sampling at time intervals of T≦1/(2fMAX). Accordingly, provided that Nyquist's sampling theorem is properly applied and that the sampling frequency of the input audio signal is FS Hertz, it follows that this audio signal has frequency components up to (FS/2) Hertz. When this input audio signal is subjected to a mapping transform to generate the above-described 256 MDCT coefficients, the frequency fA of the Ath MDCT coefficient is approximately:
fA=[(FS/2)÷256]×A(Hertz) (2)
Accordingly, the Ath MDCT coefficient MDCT(A) expresses the spectrum intensity for frequency fA. In this case, the high-frequency component at frequencies equal to or higher than fA Hertz can be eliminated by setting the values of the Ath and succeeding MDCT coefficients (those numbered A or higher in order of increasing frequency) to 0. In the present invention, the value of fA is referred to as the attenuation start frequency.
This attenuation start frequency is set so as to attenuate the frequency zone that has been determined in advance in accordance with a compression rate (coding bit rate) designated by the user. Generally, the bandwidth must be narrowed when the compression rate is high, because a high compression rate makes it difficult to encode a wideband signal with high quality. The unnecessary zone is therefore preferably attenuated.
The foregoing explanation describes a case in which a high-frequency zone is selected as the unnecessary zone. In this embodiment, the correspondence between the coding bit rate and the attenuation start frequency is preferably determined in advance such that the attenuation start frequency is lowered as the compression rate designated by code amount designation unit 12 increases.
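Equation (2) and the zeroing step can be illustrated with a short sketch. The function names bin_frequency, cutoff_index, and zero_from are hypothetical, as is the choice of rounding the index upward so that no coefficient below the requested frequency is removed.

```python
import math
from typing import List, Sequence


def bin_frequency(a: int, fs: float, n: int = 256) -> float:
    """Equation (2): approximate frequency fA of the A-th coefficient."""
    return (fs / 2.0) / n * a


def cutoff_index(f_start: float, fs: float, n: int = 256) -> int:
    """Smallest index A whose frequency fA is at or above f_start."""
    a = math.ceil(f_start * n / (fs / 2.0))
    return max(0, min(n, a))


def zero_from(coeffs: Sequence[float], a: int) -> List[float]:
    """Set the A-th and all succeeding coefficients to 0."""
    a = max(0, min(len(coeffs), a))
    return list(coeffs[:a]) + [0.0] * (len(coeffs) - a)


# Example: 48 kHz sampling, attenuation start frequency of 12 kHz.
a_index = cutoff_index(12_000.0, 48_000.0)
limited = zero_from([1.0] * 256, a_index)
```

With FS = 48000 Hz each coefficient spans 93.75 Hz, so a 12 kHz attenuation start frequency corresponds to A = 128 and the upper 128 coefficients are set to zero without any multiplication.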
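A predetermined correspondence between coding bit rate and attenuation start frequency could be held as a simple lookup table, as sketched below. Every number in the table is hypothetical and chosen only for illustration; the description does not specify actual values. The only property taken from the text is that a lower bit rate (higher compression rate) maps to a lower attenuation start frequency.

```python
from bisect import bisect_right
from typing import List, Tuple

# HYPOTHETICAL table: (minimum bit rate in bit/s, attenuation start frequency in Hz).
# Rows must be sorted by bit rate; frequencies must not decrease down the table.
BIT_RATE_TO_START_HZ: List[Tuple[int, float]] = [
    (64_000, 8_000.0),
    (128_000, 14_000.0),
    (192_000, 18_000.0),
    (384_000, 20_000.0),
]


def attenuation_start_frequency(
    bit_rate: int, table: List[Tuple[int, float]] = BIT_RATE_TO_START_HZ
) -> float:
    """Return the start frequency of the highest row whose rate <= bit_rate."""
    thresholds = [rate for rate, _ in table]
    i = bisect_right(thresholds, bit_rate) - 1
    return table[max(i, 0)][1]  # rates below the first row use the first row


f_a = attenuation_start_frequency(128_000)
```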
Explanation is next presented regarding the second working example of the present invention.
For frequency fF, for example, the attenuation coefficient can be used which is represented as a linear function of frequency as follows:
AT(F)=1−k[(fF−fA)/(fB−fA)] (3)
where fF stands for the Fth frequency that satisfies the expression F≧A. In expression (3), k is a proportionality constant and can be set arbitrarily.
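Expression (3) can be sketched in code as follows. Several points are assumptions made for the illustration: fB is taken to be the attenuation end frequency of the Bth coefficient, coefficients numbered B and above are set to zero, frequencies are computed from equation (2), and the function name linear_high_cut is hypothetical.

```python
from typing import List, Sequence


def linear_high_cut(
    coeffs: Sequence[float], a: int, b: int, fs: float, k: float = 1.0
) -> List[float]:
    """Apply AT(F) = 1 - k[(fF - fA)/(fB - fA)] for A <= F < B, zero for F >= B."""
    n = len(coeffs)
    if not 0 <= a <= b <= n:
        raise ValueError("require 0 <= a <= b <= len(coeffs)")
    width = (fs / 2.0) / n            # frequency spacing per equation (2)
    f_a, f_b = width * a, width * b
    out = list(coeffs)                # coefficients below A pass unchanged
    for f_idx in range(a, n):
        if f_idx >= b:
            out[f_idx] = 0.0          # beyond the attenuation end frequency
        else:
            f_f = width * f_idx
            out[f_idx] = coeffs[f_idx] * (1.0 - k * (f_f - f_a) / (f_b - f_a))
    return out


# Small example: 8 coefficients, ramp from index 2 down to index 6.
ramp = linear_high_cut([1.0] * 8, 2, 6, 16.0)
```

With k = 1 the coefficient falls from 1 at fA to 0 at fB; setting b equal to a gives the stepped attenuation with no ramp.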
As shown in
AT(F)=cos[{(fF−fA)/(fB−fA)}(π/2)] (4)
can be used. In addition, high-frequency components can be completely eliminated by making the Bth and succeeding MDCT coefficients zero.
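Because the attenuation coefficients depend only on the coefficient index, they can be computed once and then applied by a single multiplication per coefficient. The sketch below builds such a table for expression (4). It is an illustration under assumptions: the function names are hypothetical, and the ratio (fF−fA)/(fB−fA) is replaced by (F−A)/(B−A), which is equivalent when frequencies follow equation (2).

```python
import math
from typing import List, Sequence


def cosine_attenuation_table(n: int, a: int, b: int) -> List[float]:
    """Table of AT(F): 1 below A, cosine roll-off on [A, B), 0 from B upward."""
    if not 0 <= a <= b <= n:
        raise ValueError("require 0 <= a <= b <= n")
    table = [1.0] * n
    for f_idx in range(a, n):
        if f_idx >= b:
            table[f_idx] = 0.0
        else:
            table[f_idx] = math.cos((f_idx - a) / (b - a) * (math.pi / 2.0))
    return table


def apply_table(coeffs: Sequence[float], table: Sequence[float]) -> List[float]:
    """Multiply each MDCT coefficient by its attenuation coefficient."""
    if len(coeffs) != len(table):
        raise ValueError("coefficient and table lengths differ")
    return [c * t for c, t in zip(coeffs, table)]


table = cosine_attenuation_table(8, 2, 6)
shaped = apply_table([2.0] * 8, table)
```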
Explanation is next presented regarding a fourth working example of the present invention. The present example is intended to eliminate low frequency components.
Explanation is next presented regarding a fifth working example of the present invention.
Although this working example is a method of eliminating the low-frequency components, it offers a different approach from the fourth working example. While, in the fourth working example, the Cth and lower MDCT coefficients were made zero, in the fifth working example in contrast not only attenuation start frequency fC, expressive of the frequency of the Cth MDCT coefficient, but also attenuation end frequency fD that corresponds to the Dth MDCT coefficient is determined in accordance with the coding bit rate. In this case, the value of D is D<C, and consequently, fC>fD. Generally, it is preferred that the values of D and fD are zero and that the attenuation coefficient AT is set so that the MDCT coefficient gradually decreases starting from MDCT(C) to MDCT(D). In other words, MDCT(F) for F, where C≧F≧D, is multiplied by an attenuation coefficient AT(F) of a predetermined attenuation characteristic. The attenuation coefficient AT(F) can be stored in advance in bandwidth-limiting unit 10. The attenuation coefficient used can be represented as a linear function of frequency in the frequency range fC≧fF≧fD corresponding to C≧F≧D, as represented below:
AT(F)=k[(fF−fD)/(fC−fD)]
In the present working example, the attenuation coefficient expressed by a trigonometric function of a frequency variable, as described below, can be employed wherein the frequency variable is in the same frequency range fC≧fF≧fD as that of the fifth working example.
AT(F)=sin [{(fF−fD)/(fC−fD)}(π/2)] (5)
In addition, making the Dth and lower-numbered (numbered lower than D) MDCT coefficients zero enables complete elimination of low frequency components. In
In the figure, MDCT coefficients supplied from mapping transform unit 11 are faithfully provided as output by bandwidth-limiting unit 10 for the frequency zone higher than fC. In the frequency zone lower than attenuation start frequency fC, MDCT coefficients produced by multiplying the output of mapping transform unit 11 by the attenuation coefficients are provided as output by bandwidth-limiting unit 10. No output is provided from bandwidth-limiting unit 10 for MDCT coefficients in the frequency zone lower than attenuation end frequency fD.
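The low-frequency attenuation described above can be sketched for both the linear and the sine characteristics. This is an illustration under assumptions: the function name low_cut and its shape parameter are hypothetical, the Dth and lower-numbered coefficients are set to zero, and the frequency ratio is written with indices, which is equivalent under equation (2).

```python
import math
from typing import List, Sequence


def low_cut(
    coeffs: Sequence[float], c: int, d: int, shape: str = "linear", k: float = 1.0
) -> List[float]:
    """Attenuate coefficients D < F <= C, zero those with F <= D, keep F > C."""
    n = len(coeffs)
    if not 0 <= d <= c < n:
        raise ValueError("require 0 <= d <= c < len(coeffs)")
    if shape not in ("linear", "sine"):
        raise ValueError("shape must be 'linear' or 'sine'")
    out = list(coeffs)                       # zone above fC passes unchanged
    for f_idx in range(c + 1):
        if f_idx <= d:
            out[f_idx] = 0.0                 # at or below attenuation end frequency fD
            continue
        ratio = (f_idx - d) / (c - d)        # (fF - fD) / (fC - fD)
        if shape == "linear":
            at = k * ratio                   # linear characteristic
        else:
            at = math.sin(ratio * (math.pi / 2.0))  # expression (5)
        out[f_idx] = coeffs[f_idx] * at
    return out


linear_out = low_cut([1.0] * 8, 6, 2, "linear")
sine_out = low_cut([1.0] * 8, 6, 2, "sine")
```

Both characteristics rise from 0 at fD to 1 at fC (with k = 1), so the output joins the unattenuated zone above fC without a step.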
Frequency region signal compression encoder 13 subjects the MDCT coefficients that have been generated by bandwidth-limiting unit 10 to information compression to satisfy the coding bit rate designated by code amount designation unit 12, thereby generating a bitstream. Here, information compression includes entropy encoding of quantized values, suppression of signal redundancy among a plurality of channels, and quantization based on auditory characteristics widely used in audio encoding. These techniques are identical to techniques of the prior art such as Reference 1, are generally widely known, and further, have no relation to the novelty of the present invention, and detailed explanation of these techniques is therefore here omitted.
As described in the foregoing explanation, the present invention attenuates the spectrum component in the unnecessary frequency zone of an input audio signal by multiplying that spectrum component by an attenuation coefficient so as to limit the bandwidth of the audio signal, whereby the present invention has the following merits:
1) Unlike the prior art, a bandwidth-limiting filter is not required, and the product-sum operations of that filter are therefore not required. The amount of operations required for limiting bandwidth is therefore reduced.
2) The present invention therefore not only enables an acceleration of operations and a reduction of power consumption, but also contributes to a simplification of circuits and device construction, contributes to an improvement in characteristics and performance, and further, contributes to higher integration.
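The reduction in operations stated in merit 1) can be illustrated with a rough count. The figures are assumptions for illustration only: the prior-art filter is modeled as an FIR filter with a hypothetical 64 taps, each transform is assumed to consume 256 new input samples, and the function names are hypothetical. The description itself does not give these numbers.

```python
def fir_macs_per_block(taps: int, block_len: int = 256) -> int:
    """Product-sum operations for a time-domain FIR filter over one block."""
    return taps * block_len


def attenuation_multiplies_per_block(a: int, b: int) -> int:
    """Multiplications needed when only coefficients A..B-1 are scaled.

    Coefficients below A are passed through and those from B upward are
    simply set to zero, so neither group needs a multiplication.
    """
    return max(0, b - a)


filter_cost = fir_macs_per_block(64)                       # hypothetical 64-tap filter
attenuation_cost = attenuation_multiplies_per_block(200, 230)
```

Under these assumed figures the time-domain filter needs 16384 product-sum operations per block, whereas scaling 30 coefficients in the frequency domain needs 30 multiplications.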
Number | Date | Country | Kind |
---|---|---|---|
2000-319699 | Oct 2000 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP01/08920 | 10/11/2001 | WO | 00 | 4/14/2003 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO02/33831 | 4/25/2002 | WO | A |
Number | Date | Country |
---|---|---|
20040049378 A1 | Mar 2004 | US |