This application claims the benefit of Korean Patent Application No. 10-2005-0024567, filed on Mar. 24, 2005, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to audio coding and decoding apparatuses and methods, and recording media storing the methods, and more particularly, to audio coding and decoding apparatuses and methods which support fine granularity scalability (FGS) using harmonic information of a high-band audio signal or wideband error audio signal when performing wideband audio coding and decoding, and recording media storing the methods.
2. Description of Related Art
As the range of applications of audio communications and the transmission speed of networks have increased, the demand for high-quality audio communications has also increased. Accordingly, whereas the conventional audio communication band is 0.3-3.4 kHz, there is a need to transmit a wideband audio signal having a bandwidth of 0.3-7 kHz, which offers better performance in aspects such as naturalness and clarity.
In addition, a packet switching network, in which data is transmitted in packet units, may suffer channel congestion, resulting in packet loss and audio degradation. To address this problem, methods of concealing a damaged packet have been used, but they are not a fundamental solution.
Thus, a wideband audio coding and decoding method in which congestion of a channel is prevented by effectively compressing the wideband audio signal has been proposed.
Three examples of wideband audio coding and decoding methods include a first wideband audio coding and decoding method in which an audio signal having a bandwidth of 0.3-7 kHz is compressed at one time and restored, a second wideband audio coding and decoding method in which an audio signal having a bandwidth of 0.3-4 kHz and an audio signal having a bandwidth of 4-7 kHz are compressed hierarchically and restored, and a third wideband audio coding and decoding method in which an audio signal having a bandwidth of 0.3-3.4 kHz is compressed, restored and up-sampled to a wideband signal and a wideband error signal between an original wideband audio signal and the up-sampled wideband signal is obtained and compressed.
The second and third wideband audio coding and decoding methods use bandwidth scalability that enables optimum communication in a channel environment obtained by adjusting the amount of data of a layer to be transmitted according to the degree of congestion.
In the second and third wideband audio coding and decoding methods using the bandwidth scalability, a high-band audio signal having a frequency band of 4-7 kHz is coded using a modulated lapped transform (MLT). A high-band audio signal coding apparatus using an MLT is as shown in
Referring to
The 2D-DCT module 102 extracts a 2D-DCT coefficient from the magnitude of an inputted MLT coefficient and outputs the extracted 2D-DCT coefficient to a DCT coefficient quantizer 104. The DCT coefficient quantizer 104 arranges 2D-DCT vector coefficients in an ascending series statistically, quantizes the arranged vectors and then outputs codebook indices of the arranged vectors. The sign quantizer 103 quantizes a sign of a large MLT coefficient and outputs the quantized sign. The outputted codebook indices and the quantized sign are provided to a high-band audio decoding apparatus (not shown).
However, in high-band audio signal coding using the MLT, it is difficult to restore a high-quality audio signal when an audio signal is transmitted at a low bit rate.
In order to solve this problem, a high-band audio coding apparatus using a harmonic coder shown in
Referring to
An amplitude quantizer 202 quantizes the amplitude of the inputted high-band audio signal and outputs a high-band audio signal having the quantized amplitude. A phase quantizer 203 quantizes the phase of the inputted high-band audio signal and outputs a high-band audio signal having the quantized phase. The quantized amplitude and the quantized phase are provided to a high-band audio decoding apparatus (not shown).
A high-quality signal can be reproduced at a low bit rate with low complexity through high-band audio signal coding using the harmonic coder shown in
In addition, when performing wideband error audio coding using the third method having the bandwidth scalability function, a wideband error audio signal having a bandwidth of 0.05-7 kHz is coded using a modified discrete cosine transform (MDCT). A wideband error audio signal coding apparatus using an MDCT shown in
Referring to
However, when an audio signal is transmitted at a low bit rate using the wideband error audio signal coding method with the MDCT, it is difficult to restore a high-quality audio signal.
An aspect of the present invention provides audio coding and decoding apparatuses and methods which support fine granularity scalability (FGS) using harmonic information of a high-band audio signal or wideband error audio signal during wideband audio coding and decoding, and recording media storing the methods.
An aspect of the present invention also provides audio coding and decoding apparatuses and methods in which a high-band audio signal or wideband error audio signal is coded and decoded in harmonic units during wideband audio coding and decoding and which support sufficient scalability for an audio signal, and recording media storing the methods.
According to an aspect of the present invention, there is provided an audio coding method including: detecting harmonics of a high-band audio signal or wideband error audio signal of an inputted audio signal; determining an order of the detected harmonics; and coding the harmonics based on the determined order of the harmonics.
According to another aspect of the present invention, there is provided an audio coding apparatus including: a harmonic detecting unit detecting harmonics of a high-band audio signal or wideband error audio signal of an inputted audio signal; a harmonic order determining unit determining an order of the detected harmonics; and a harmonic coding unit coding the harmonics based on the determined order of the harmonics.
According to another aspect of the present invention, there is provided an audio decoding method including: decoding a received bitstream corresponding to a coded high-band audio signal or wideband error audio signal for each layer; and outputting the decoded result for each layer as a high-band audio signal or wideband error audio signal restored in each layer.
According to another aspect of the present invention, there is provided an audio decoding apparatus including: a bit unpacking unit which, if a bitstream corresponding to a coded high-band audio signal or wideband error audio signal is received, unpacks and outputs the received bitstream; and a harmonic decoding unit which decodes, in layer units, the bitstream outputted in each layer from the bit unpacking unit.
According to another aspect of the present invention, there is provided a recording medium on which a program for performing an audio coding method is recorded, the audio coding method including: detecting harmonics of a high-band audio signal or wideband error audio signal of an inputted audio signal; determining an order of the detected harmonics; and coding the harmonics based on the determined order of the harmonics.
According to another aspect of the present invention, there is provided a recording medium on which a program for performing an audio decoding method is recorded, the audio decoding method including: decoding a received bitstream corresponding to a coded high-band audio signal or wideband error audio signal for each layer; and outputting the decoded result for each layer as a high-band audio signal or wideband error audio signal restored in each layer.
Additional and/or other aspects and advantages of the present invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
The above and/or other aspects and advantages of the present invention will become apparent and more readily appreciated from the following detailed description, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
The audio coding apparatus 400 includes a band divider 401, the high-band or wideband error audio coding unit 402, and a low-band audio coding unit 403.
If an audio signal is inputted to the audio coding apparatus 400, the band divider 401 operates in one of two ways. It either divides the inputted audio signal into a low-band audio signal and a high-band audio signal and outputs both, or it divides the inputted audio signal into a low-band audio signal and a wideband error audio signal and outputs both. The wideband error audio signal is obtained by subtracting, from the inputted audio signal, a signal obtained by decoding the low-band audio signal outputted from the low-band audio coding unit 403.
The high-band or wideband error audio coding unit 402 codes a high-band audio signal or wideband error audio signal so as to support fine granularity scalability (FGS) using harmonic information of the high-band audio signal or wideband error audio signal outputted from the band divider 401.
The harmonic detector 501 detects harmonics of the inputted high-band audio signal or wideband error audio signal. That is, the harmonic detector 501 detects all of the harmonics of the inputted high-band or wideband error audio signal using matching pursuit (MP) or fast Fourier transform (FFT). In this case, the number of detectable harmonics may be set in consideration of a transmission rate of a codec, sound quality, complexity, etc. For example, in the case of a high-band audio signal, the number of detectable harmonics can be set to 60, and in the case of a wideband error audio signal, the number of detectable harmonics can be set to 120, and the number of detectable harmonics can be variably set according to a sampling method of an inputted signal.
In a harmonic-detecting method using FFT, an FFT is applied to an inputted high-band audio signal or wideband error audio signal, a peak corresponding to each harmonic is searched for, and the magnitude and phase of each harmonic are detected. In a harmonic-detecting method using MP, harmonics of an inputted high-band audio signal or wideband error audio signal are analyzed using a pitch lag (or a pitch delay) obtained from the high-band audio signal or wideband error audio signal. That is, a fundamental frequency ω0 is searched for using the pitch lag and harmonic parameters are searched for using a sine dictionary. The harmonic parameters include an amplitude A and a phase φ.
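By way of a non-limiting illustration, the FFT-based detection described above may be sketched in Python as follows. The function names, the use of a naive discrete Fourier transform in place of an FFT routine, the local-maximum peak test, and the `min_amplitude` floor are all assumptions made for this sketch and are not taken from the embodiment.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) DFT of a real frame (bins 0..N/2); a stand-in for an FFT."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def detect_harmonics(frame, max_harmonics, min_amplitude=1e-6):
    """Return up to max_harmonics peaks as (bin, amplitude, phase), sorted by bin."""
    spectrum = dft(frame)
    scale = 2.0 / len(frame)  # converts |X[k]| to a sinusoid amplitude
    amps = [abs(c) * scale for c in spectrum]
    peaks = []
    # Skip the DC and Nyquist bins; keep bins that are local maxima.
    for k in range(1, len(amps) - 1):
        is_peak = amps[k] > amps[k - 1] and amps[k] >= amps[k + 1]
        if is_peak and amps[k] >= min_amplitude:
            peaks.append((k, amps[k], cmath.phase(spectrum[k])))
    # Keep the strongest peaks, then report them in frequency order.
    peaks.sort(key=lambda p: p[1], reverse=True)
    return sorted(peaks[:max_harmonics])
```

The `max_harmonics` argument plays the role of the configurable number of detectable harmonics (for example, 60 or 120) mentioned above.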
The amplitude A and phase φ of the sine dictionary are searched for using a matching pursuit (MP) algorithm in which an audio signal s(n) is used as a target signal. An audio signal SH(n) indicated by the sine dictionary can be defined using Equation 1.
where Ak is the amplitude of a k-th sine wave, ωk is the angular frequency of the k-th sine wave, φk is the phase of the k-th sine wave, wham(n) is a Hamming window, and K is the number of sine dictionaries.
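By way of a non-limiting illustration, a sine dictionary of this kind and a simplified pursuit over it may be sketched in Python as follows. The sketch assumes a cosine form for each windowed sinusoid, visits the harmonics of ω0 in order rather than greedily selecting the best atom at each step, and fits each atom by least squares; these choices, and all function names, are assumptions made for this sketch.

```python
import math


def hamming(n_samples):
    """Hamming window w_ham(n)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def synthesize(params, n_samples):
    """Sum of windowed sinusoids; params is a list of (A, omega, phi)."""
    w = hamming(n_samples)
    return [w[n] * sum(a * math.cos(om * n + ph) for a, om, ph in params)
            for n in range(n_samples)]


def fit_atom(target, omega):
    """Least-squares amplitude and phase of one windowed sinusoid at omega."""
    w = hamming(len(target))
    c = [w[n] * math.cos(omega * n) for n in range(len(target))]
    s = [w[n] * math.sin(omega * n) for n in range(len(target))]
    cc = sum(x * x for x in c)
    ss = sum(x * x for x in s)
    cs = sum(x * y for x, y in zip(c, s))
    tc = sum(t * x for t, x in zip(target, c))
    ts = sum(t * x for t, x in zip(target, s))
    det = cc * ss - cs * cs
    a = (tc * ss - ts * cs) / det  # equals A * cos(phi)
    b = (ts * cc - tc * cs) / det  # equals -A * sin(phi)
    return math.hypot(a, b), math.atan2(-b, a)


def matching_pursuit(target, omega0, n_harmonics):
    """Fit harmonics k * omega0 one at a time, subtracting each from the residual."""
    residual = list(target)
    found = []
    for k in range(1, n_harmonics + 1):
        amp, phase = fit_atom(residual, k * omega0)
        found.append((amp, k * omega0, phase))
        atom = synthesize([(amp, k * omega0, phase)], len(residual))
        residual = [r - x for r, x in zip(residual, atom)]
    return found
```

In this sketch, `omega0` would be derived from the pitch lag, and `target` corresponds to the audio signal s(n) used as the target signal.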
Once all of the detectable harmonics have been detected in frame units, the harmonic detector 501 can restrict the number of detected harmonics using a smoothing method that removes weak harmonics, that is, detected harmonics having values less than or equal to a predetermined value. In the smoothing method, a harmonic is removed if the ratio of the magnitudes of adjacent harmonics is smaller than or equal to a predetermined value. The predetermined value is set according to factors such as the transmission rate of the codec and the sound quality. The ratio is computed with the smaller of the two magnitudes as the numerator and the larger as the denominator.
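By way of a non-limiting illustration, the smoothing method may be sketched in Python as follows. The sketch assumes that the weaker harmonic of an offending adjacent pair is the one removed and that all pairs are evaluated in a single pass over the originally detected harmonics; the function and parameter names are hypothetical.

```python
def remove_weak_harmonics(magnitudes, min_ratio):
    """Return the indices of harmonics kept after smoothing.

    For each adjacent pair, ratio = smaller magnitude / larger magnitude.
    If ratio <= min_ratio, the weaker harmonic of the pair is dropped.
    """
    keep = [True] * len(magnitudes)
    for i in range(len(magnitudes) - 1):
        left, right = magnitudes[i], magnitudes[i + 1]
        larger = max(left, right)
        smaller = min(left, right)
        if larger > 0 and smaller / larger <= min_ratio:
            weaker = i if left <= right else i + 1
            keep[weaker] = False
    return [i for i, kept in enumerate(keep) if kept]
```

For example, with magnitudes `[1.0, 0.05, 0.8, 0.7]` and `min_ratio` set to 0.1, only the second harmonic is removed.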
The harmonic detector 501 obtains information required for noise filling. The information required for noise filling includes a root mean square (RMS) of magnitudes of harmonics detected in a frame where harmonics detection is performed and tilt information of a spectrum. The tilt information is gradient information as indicated in
The harmonic order determining unit 502 determines the ordering of harmonics detected by the harmonic detector 501. To this end, the harmonic order determining unit 502 uses perceptual weighting for the detected harmonics. That is, the harmonic order determining unit 502 detects the magnitude, the phase, and band information for each harmonic. The harmonic order determining unit 502 normalizes the detected magnitude, phase, and band information.
The magnitudes of the harmonics are normalized by the largest amplitude. The bands of the harmonics are normalized by setting the lowest band of the inputted audio signal to 1 and the highest band to 0 and interpolating the other bands within this range. The phases of the harmonics, which lie in the range from −π to π, are normalized by their absolute values so that a phase of −π or π maps to 1 and the other values are interpolated between 0 and 1.
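By way of a non-limiting illustration, the three normalizations may be sketched in Python as follows. Linear interpolation between the lowest and highest bands, the representation of a band by a single number (such as a frequency in Hz), and the function name are assumptions made for this sketch.

```python
import math


def normalize_harmonics(harmonics, lowest_band, highest_band):
    """Map (magnitude, phase, band) tuples to (M, P, B), each in [0, 1]."""
    peak = max(mag for mag, _, _ in harmonics)
    span = highest_band - lowest_band
    normalized = []
    for mag, phase, band in harmonics:
        m = mag / peak if peak > 0 else 0.0  # largest amplitude maps to 1
        p = abs(phase) / math.pi             # -pi or pi maps to 1, zero maps to 0
        # Lowest band maps to 1 and highest band maps to 0.
        b = (highest_band - band) / span if span > 0 else 1.0
        normalized.append((m, p, b))
    return normalized
```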
The harmonic order determining unit 502 obtains an ordering criterion C by multiplying a normalized amplitude M, a normalized phase P, and normalized band information B by predetermined weighting values Wm, Wp, and Wb, respectively, as shown in Equation 2
C=MWm+PWp+BWb (2)
The weighting values Wm, Wp, and Wb can be obtained using
Wm>2Wb>4Wp (3)
The harmonic order determining unit 502 determines an order for the harmonics detected in each frame based on the obtained ordering criterion C of each harmonic. That is, the order of the detected harmonics can be determined as shown in
The harmonic coding unit 503 codes the magnitudes and phases of the harmonics sequentially from the harmonics having the highest priorities based on the order determined by the harmonic order determining unit 502. In this case, the harmonic coding unit 503 also codes information required for noise filling.
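By way of a non-limiting illustration, the ordering criterion of Equation 2 and the resulting priority order may be sketched in Python as follows. The function names are hypothetical, and the default weights are merely example values chosen so that the magnitude weight exceeds twice the band weight, which in turn exceeds four times the phase weight (one reading of Equation 3). The convention that a larger value of C denotes a higher priority is also an assumption of this sketch.

```python
def ordering_criterion(m, p, b, w_m=1.0, w_p=0.1, w_b=0.3):
    """C = M*Wm + P*Wp + B*Wb for one harmonic (Equation 2)."""
    return m * w_m + p * w_p + b * w_b


def order_harmonics(normalized):
    """Return harmonic indices from highest to lowest ordering criterion.

    normalized is a list of (M, P, B) tuples, one per detected harmonic.
    """
    scores = [ordering_criterion(m, p, b) for m, p, b in normalized]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A coder following this sketch would then code the magnitude and phase of each harmonic in the returned index order, so that truncating the bitstream drops the lowest-priority harmonics first.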
The bit packing unit 504 bit-packs the result of coding obtained by the harmonic coding unit 503 and generates and outputs a bitstream having a data structure shown in
Returning to
The channel 410 transmits the bit-packed and coded bitstreams outputted from the high-band or wideband error audio coding unit 402 and the low-band audio coding unit 403 to the audio decoding apparatus 420.
The audio decoding apparatus 420 receives a bitstream packet of the coded high-band or wideband error audio signal transmitted from the channel 410 and a bitstream packet of the coded low-band audio signal, respectively, and generates a restored audio signal.
To this end, the audio decoding apparatus 420 includes the high-band or wideband error audio decoding unit 421, a low-band audio decoding unit 422, and a band combining unit 423.
The high-band or wideband error audio decoding unit 421 unpacks a received bitstream packet corresponding to the coded high-band audio signal or wideband error audio signal and generates an audio signal restored in layer units and outputs the generated audio signal.
The bit unpacking unit 810 unpacks a received bitstream including a core layer composed of other data field and an enhancement layer, as shown in
The harmonic decoding unit 820 includes a core layer decoder 821 and first through n-th layer decoders 822_1 to 822_n and decodes each layer of the bitstream. That is, the core layer decoder 821 decodes the other data field of the bitstream, the first layer decoder 822_1 decodes a data field Data 0, and the n-th layer decoder 822_n decodes a data field Data N-1.
However, whether or not each of the decoders 821 and 822_1 through 822_n included in the harmonic decoding unit 820 performs decoding can be determined according to operating conditions of the audio decoding apparatus 420, a user's choice or the environment of the channel 410. If harmonic information defined in the data field Data 0 in the enhancement layer of a frame is received, an audio signal of the frame can be restored using information required for noise filling defined in the core layer.
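By way of a non-limiting illustration, selective decoding of the layers may be sketched in Python as follows. The representation of the core layer as a dictionary and of each enhancement-layer data field as a list of (magnitude, phase) pairs, the use of `None` for a data field that was not received, and the rule that decoding stops at the first missing layer are all assumptions made for this sketch.

```python
def decode_layers(core, enhancement, max_layers):
    """Decode the core layer and up to max_layers enhancement layers.

    core: noise-filling information carried in the core layer.
    enhancement: list of data fields Data 0 .. Data N-1; each entry is a
        list of (magnitude, phase) pairs, or None if it was not received.
    """
    harmonics = []
    for layer in enhancement[:max_layers]:
        if layer is None:  # layer unavailable; stop refining the signal here
            break
        harmonics.extend(layer)
    return {"noise_info": core, "harmonics": harmonics}
```

In this sketch, `max_layers` stands for the operating conditions, user choice, or channel environment that determine how many of the layer decoders run.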
In other words, when the number of harmonics of the corresponding frame is small, the harmonic decoding unit 820 performs noise filling. Whether or not the harmonic decoding unit 820 performs noise filling is determined using a threshold value. The threshold value may be set based on the ratio of the sum of the magnitudes of all of the decoded harmonics to the total RMS. When the ratio is smaller than or equal to the threshold value, the harmonic decoding unit 820 performs the noise filling. In the noise filling, the restored harmonics are obtained and magnitude information about the entire band is obtained using the transmitted RMS and gradient. Next, random noise is generated for the undecoded portions and filled into them. In this case, the magnitude information corresponding to each band is the amplitude of the random noise to be generated.
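By way of a non-limiting illustration, the noise-filling decision and the filling itself may be sketched in Python as follows. The linear magnitude envelope built from the RMS and gradient, the use of uniformly distributed noise, the representation of undecoded portions as `None`, and all names are assumptions made for this sketch.

```python
import random


def needs_noise_filling(decoded_magnitudes, total_rms, threshold):
    """True when sum(decoded magnitudes) / total RMS <= threshold."""
    if total_rms <= 0:
        return False
    return sum(decoded_magnitudes) / total_rms <= threshold


def noise_fill(spectrum, rms, gradient, rng):
    """Fill undecoded bins (None) with random noise.

    The noise amplitude for bin k follows an assumed linear envelope
    rms + gradient * (k - centre), clipped at zero.
    """
    centre = (len(spectrum) - 1) / 2
    filled = []
    for k, value in enumerate(spectrum):
        if value is None:
            level = max(0.0, rms + gradient * (k - centre))
            value = level * rng.uniform(-1.0, 1.0)
        filled.append(value)
    return filled
```

Passing a seeded generator such as `random.Random(0)` makes the filled output reproducible, which can be convenient when comparing decoder outputs.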
Returning to
The low-band audio decoding unit 422 decodes a received bitstream corresponding to the coded low-band audio signal and outputs the restored low-band audio signal. The restored low-band audio signal is transmitted to the band combining unit 423.
The band combining unit 423 combines the audio signal restored in each layer and outputted from the high-band or wideband error audio decoding unit 421 with the restored low-band audio signal outputted from the low-band audio decoding unit 422 and outputs the restored audio signal.
First, in operation 901, if the inputted audio signal is divided into a high-band audio signal or wideband error audio signal and a low-band audio signal using the band divider 401 shown in
In operation 902, the magnitude, phase, and band information of each of the detected harmonics are obtained and normalized. In operation 903, an ordering criterion C of each harmonic is obtained using weighting values, the normalized magnitude, the normalized phase, and the normalized band information corresponding to the magnitude, phase, and band information of each of the detected harmonics.
In operation 904, the order of the harmonics detected in each frame is determined based on the ordering criterion C. In operation 905, harmonic coding is performed based on the determined order of the harmonics. The harmonic coding is performed on the harmonics sequentially in order of ordering criterion.
In operation 906, information required for noise filling is coded.
In operation 907, bit packing is performed on the high-band audio signal or wideband error audio signal using the harmonic coding result and the coded information for noise filling, and a bitstream shown in
In operation 908, the generated bitstream is transmitted to the channel 410 as a bitstream of the coded high-band audio signal or wideband error audio signal.
A bitstream corresponding to a coded high-band audio signal or wideband error audio signal is received in operation 1001, and the received bitstream is unpacked and divided according to layers and harmonics in operation 1002. In operation 1003, the bitstream divided according to layers and harmonics is decoded as described above with reference to
The methods according to the above-described embodiments of the present invention can also be embodied as computer readable code on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet). The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
According to the above-described embodiments of the present invention, fine granularity scalability is supported using harmonic information of a high-band audio signal or wideband error audio signal such that scalability of the audio signal is maximized, decoding is performed in harmonic units and very fine granularity scalability is supported.
In addition, a low-band audio signal is maintained and harmonic information regarding the high-band audio signal or wideband error audio signal is used such that the quality of a basic audio signal is maintained.
Since an audio signal can be restored through noise filling even in harmonics of the high-band or wideband error audio signal having very small amplitudes, the quality of the audio signal can be improved.
Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
10-2005-0024567 | Mar 2005 | KR | national |