The present invention relates to wideband signal encoding and decoding systems and, more particularly, to a highband encoding apparatus and method that encode a highband signal (speech or audio) by using the encoding information of a lowband encoder in a wideband signal encoding system that uses a conventional narrowband encoder as a core encoder, and to a corresponding highband decoding apparatus and method.
Generally, wideband signal (speech or audio) encoding methods are categorized into three types. The first is a wideband encoding method that encodes wideband signals ranging from 50 to 7,000 Hz at one time. The second is a band-splitting encoding method that divides the wideband signal into a lowband signal ranging from 50 to 4,000 Hz and a highband signal ranging from 4,000 to 7,000 Hz and then encodes the lowband and highband signals independently. The third is a step-based encoding method. In this method, the input signal is first lowpass filtered, down-sampled, and encoded by a narrowband encoder, and then the difference between the wideband input signal and the up-sampled lowband signal is encoded. Since the difference between the wideband input signal and the up-sampled lowband signal is mostly concentrated in the highband region, the encoding of the highband signal is significant for quality improvement.
The band-splitting or step-based wideband signal encoding system usually utilizes a standardized narrowband encoder for lowband signal encoding and utilizes a noise modulation technique and a frequency domain encoding technique for highband signal encoding. Herein, the bandwidth of the narrowband (the telephone band) is between 0 and 4 kHz, and typical narrowband encoders include ITU-T G.723.1, G.729.1, EVRC and the like. Thus, the band-splitting or step-based wideband signal encoding system is compatible with the narrowband encoders applied to conventional communication systems.
Meanwhile, the noise modulation technique used for encoding the highband signal in the conventional wideband signal encoding system models the highband signal by modulating random noise signals based on the energy distribution of the highband signal. The noise modulation technique has very low complexity, but it conveys only the impression of a wideband signal. Also, it is not appropriate for encoding various types of signals.
In the frequency domain encoding technique, the input signal is transformed by using a transform algorithm such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT), and the frequency coefficients are quantized and transmitted. Since the waveform of the input signal is encoded directly, the frequency domain encoding technique is appropriate for encoding various input signals. However, the frequency domain encoding technique has a pre-echo problem because onset pulses frequently occur in the highband. If an onset segment is encoded in the frequency domain, the quantization noise is spread over the entire time span of the frame. In other words, the quantization error occurring in a pitch pulse segment or an onset pulse segment due to a limited transmission rate causes a pre-echo in the synthesized signal.
It is, therefore, an object of the present invention to provide a highband encoding apparatus and method that can reduce a pre-echo phenomenon by using a Temporal Noise Shaping (TNS) technique and the encoding information of a lowband signal in a wideband encoding system.
It is another object of the present invention to provide a highband decoding apparatus and method for decoding highband signals which are encoded by using the highband encoding apparatus and method in a wideband decoding system.
The objects and other advantages can be understood with reference to the following description and become apparent by preferred embodiments of the present invention. Also, it is obvious that the objects and advantages can be embodied by the means as claimed and combinations thereof.
In accordance with an aspect of the present invention, there is provided a highband encoding apparatus for encoding a highband signal based on lowband encoding information in a wideband encoding system, including: a domain converter for converting a domain of an input highband signal into a frequency domain; a linear prediction order determiner for determining a linear prediction order based on the lowband encoding information; a linear prediction analyzer for analyzing a highband signal whose domain is converted into the frequency domain based on the determined linear prediction order to thereby generate a linear prediction coefficient; a linear prediction coefficient quantizer for quantizing the linear prediction coefficient based on the lowband encoding information; and a residual signal quantizer for obtaining a residual signal by dequantizing the quantized linear prediction coefficient and quantizing the residual signal.
In accordance with another aspect of the present invention, there is provided a highband decoding apparatus for decoding a highband signal based on lowband encoding information in a wideband decoding system, including: a residual signal decoder for decoding a residual signal from a received bit stream; a linear prediction order determiner for determining a linear prediction order based on the lowband encoding information; a linear prediction coefficient dequantizer for dequantizing a linear prediction coefficient from the received linear prediction coefficient information by using the determined linear prediction order and the lowband encoding information; a linear prediction synthesizer for performing linear prediction synthesis on the decoded residual signal by using the dequantized linear prediction coefficient; and a domain converter for converting the highband signal obtained by the linear prediction synthesis into a highband signal of a time domain.
In accordance with another aspect of the present invention, there is provided a highband encoding method for encoding a highband signal based on lowband encoding information in a wideband encoding system, including the steps of: a) converting a domain of an input highband signal into a frequency domain; b) determining a linear prediction order based on the lowband encoding information; c) analyzing the highband signal whose domain is converted into the frequency domain based on the determined linear prediction order and generating a highband linear prediction coefficient; d) quantizing the linear prediction coefficient based on the lowband encoding information; and e) obtaining a residual signal by dequantizing the quantized linear prediction coefficient, and quantizing the obtained residual signal.
In accordance with another aspect of the present invention, there is provided a highband decoding method for decoding a highband signal based on lowband encoding information in a wideband decoding system, including the steps of: a) decoding a residual signal from a received bit stream; b) determining a linear prediction order based on the lowband encoding information; c) dequantizing a linear prediction coefficient from the received linear prediction coefficient information based on the determined linear prediction order and the lowband encoding information; d) performing linear prediction synthesis on the decoded residual signal based on the dequantized linear prediction coefficient; and e) converting the highband signal obtained by the linear prediction synthesis into a highband signal of a time domain.
The above and other objects and features of the present invention will become apparent from the following description of the preferred embodiments given in conjunction with the accompanying drawings, in which:
Other objects and aspects of the invention will become apparent from the following description of the embodiments with reference to the accompanying drawings, which is set forth hereinafter. When it is determined that a detailed description of prior art related to the present invention may obscure the points of the present invention, that description will not be provided. Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
In the Temporal Noise Shaping (TNS) technology, the linear prediction (LPC) residual coefficients of the frequency domain are quantized and transmitted. The input signal is transformed into the frequency domain based on the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT), and linear prediction coefficients are calculated on the transformed signal. The present invention provides a solution for determining an optimal linear prediction order and for quantizing the linear prediction coefficient (LPC).
When an exact frequency band split is impossible, lowband encoding information can be used for highband linear prediction analysis because the highband signal includes part of the lowband signal and the energy distribution of the highband signal on the time axis is similar to that of the lowband signal.
For example, in the encoding of the highband signal, it is possible to use the pitch information of the lowband encoder to determine a linear prediction order and to use the lowband synthesized signal to quantize a linear prediction coefficient.
As shown in
Meanwhile, a band-splitting wideband decoding system 120 decodes the received encoded parameters of the lowband and highband, and interpolates the decoded lowband and highband signals by a factor of two by using interpolators 123 and 124. The interpolated lowband and highband signals pass through a lowband band-pass filter 125 and a highband band-pass filter 126, respectively, and are then synthesized into a wideband signal.
The encoding apparatus and method of the present invention can be applied to a highband encoder 116 of the wideband encoding system 110, whereas the decoding apparatus and method of the present invention can be applied to a highband decoder 122 of the wideband decoding system 120. However, it is obvious to those skilled in the art that the scope of the present invention is not limited thereto.
The frequency domain converter 201 transforms the time-domain highband signal into a frequency domain. In the present embodiment, the highband signal is converted into the frequency domain through the Modified Discrete Cosine Transform (MDCT), and MDCT coefficients are generated through the frequency domain transform.
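By way of illustration only, the frequency domain conversion may be sketched in Python as follows. The function name `mdct`, the absence of an analysis window, and the unnormalized scaling are assumptions made for this sketch; the present description does not specify them.

```python
import math


def mdct(frame):
    """Return N MDCT coefficients for a frame of 2N time-domain samples.

    Direct O(N^2) evaluation of the textbook MDCT definition.
    No analysis window is applied (an assumption of this sketch).
    """
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n_half = two_n // 2
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n, x in enumerate(frame):
            phase = math.pi / n_half * (n + 0.5 + n_half / 2.0) * (k + 0.5)
            acc += x * math.cos(phase)
        coeffs.append(acc)
    return coeffs
```

A practical encoder would normally apply a window to overlapping frames before the transform and use a fast algorithm; the direct form is shown here only because it is easy to check against the definition.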
The linear prediction order determiner 202 determines a linear prediction order based on lowband encoding information such as pitch. The linear prediction order (p) can be expressed as Equation 1.
$$p = 2 \cdot \frac{N_W}{T} \qquad \text{(Equation 1)}$$
where $N_W$ denotes the frame length of the wideband encoding system; $T$ denotes the pitch value obtained in the lowband encoding system; and $N_W/T$ denotes the number of pitch pulses per frame. Since a second-order linear predictor is needed to express one pitch pulse, the linear prediction order (p) is expressed as Equation 1.
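The rule described above (two prediction coefficients for each pitch pulse in the frame) may be sketched as follows. The function name `lp_order_from_pitch`, the rounding up of the pulse count to an integer, and the optional `max_order` cap are assumptions introduced for this sketch; the frame length and the pitch value are both assumed to be expressed in samples.

```python
import math


def lp_order_from_pitch(frame_length, pitch_lag, max_order=None):
    """Estimate a linear prediction order from the lowband pitch lag.

    frame_length: frame length of the wideband system, in samples.
    pitch_lag:    pitch value from the lowband encoder, in samples.
    max_order:    optional upper bound (hypothetical safeguard).
    """
    if frame_length <= 0 or pitch_lag <= 0:
        raise ValueError("frame_length and pitch_lag must be positive")
    # Number of pitch pulses per frame, rounded up (assumption).
    pulses = math.ceil(frame_length / pitch_lag)
    # Two coefficients per pitch pulse.
    order = 2 * pulses
    if max_order is not None:
        order = min(order, max_order)
    return order
```

For a 320-sample frame and a pitch lag of 80 samples, this sketch yields an order of 8. Because the decoder receives the same pitch value from the lowband decoder, it can repeat the calculation without any side information.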
The linear prediction analyzer 203 calculates a linear prediction coefficient by analyzing the frequency-domain highband signal based on the linear prediction order determined in the linear prediction order determiner 202. In short, an auto-correlation coefficient of the frequency-domain highband signal is obtained, and a linear prediction coefficient is obtained based on the Levinson-Durbin algorithm.
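A minimal sketch of this analysis is given below, assuming that the MDCT coefficients of one frame are available as a list of floats. The function names, the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, and the omission of any windowing of the auto-correlation are assumptions of the sketch.

```python
def autocorrelation(x, max_lag):
    """Return auto-correlation values r[0..max_lag] of the sequence x."""
    return [
        sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations by the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1.0 and a[1..order] are the
    coefficients of the prediction error filter, and err is the final
    prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:
        return a, 0.0  # all-zero input: nothing to predict
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:
            break
    return a, err
```

In use, `autocorrelation(mdct_coeffs, p)` would be followed by `levinson_durbin(r, p)`, where `p` is the order obtained from the lowband pitch information.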
The linear prediction coefficient quantizer 204 quantizes the linear prediction coefficient obtained in the linear prediction analyzer 203 based on lowband encoding information, i.e., the synthesized output signal of the lowband encoder.
Meanwhile, the residual signal quantizer 205 dequantizes the linear prediction coefficient quantized in the linear prediction coefficient quantizer 204, and obtains a residual signal by performing linear prediction analysis filtering. The residual signal is called a linear prediction residual MDCT coefficient. The residual signal quantizer 205 then quantizes the residual signal. In short, it divides the residual MDCT coefficients into several bands and quantizes the energy of each band and the normalized residual MDCT coefficients. Herein, when the energy of each band is quantized, fixed codebook gain information of the lowband encoder can be used. In other words, quantization efficiency can be increased by quantizing the difference between the energy of each band and the fixed codebook gain of the lowband encoder, instead of quantizing the energy of each band directly.
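The residual computation and the differential energy quantization described above may be sketched as follows. The use of a decibel scale, the uniform step `step_db`, the equal-width band split, and all function and parameter names are assumptions made for this sketch; the quantization of the normalized coefficients themselves is not shown.

```python
import math


def lp_analysis_filter(coeffs, a):
    """Filter MDCT coefficients with A(z); a[0] is assumed to be 1.0."""
    order = len(a) - 1
    residual = []
    for k in range(len(coeffs)):
        acc = coeffs[k]
        for i in range(1, order + 1):
            if k - i >= 0:
                acc += a[i] * coeffs[k - i]
        residual.append(acc)
    return residual


def encode_band_energies(residual, num_bands, fcb_gain_db, step_db=1.5):
    """Return (indices, shapes) for a band-wise gain/shape split.

    indices: per band, the quantized difference between the band energy
             (dB) and the lowband fixed codebook gain (dB).
    shapes:  per band, the residual normalized by the band RMS.
    """
    n = len(residual)
    if num_bands <= 0 or num_bands > n:
        raise ValueError("num_bands must be between 1 and len(residual)")
    indices, shapes = [], []
    for b in range(num_bands):
        band = residual[b * n // num_bands:(b + 1) * n // num_bands]
        mean_sq = sum(v * v for v in band) / len(band)
        energy_db = 10.0 * math.log10(mean_sq + 1e-12)
        # Quantize only the offset from the lowband fixed codebook gain.
        indices.append(round((energy_db - fcb_gain_db) / step_db))
        rms = 10.0 ** (energy_db / 20.0)
        shapes.append([v / rms for v in band])
    return indices, shapes
```

When the band energies track the fixed codebook gain, the offsets stay in a narrow range, so fewer index values are needed than for the absolute energies.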
The first LSP converting unit 301 converts a highband linear prediction coefficient generated in the linear prediction analyzer 203 of
The synthesized output signal of the lowband encoder is transformed into frequency domain coefficients in the frequency domain converting unit 302. For example, the frequency domain converting unit 302 converts the time-domain synthesized signal of the lowband encoder into the frequency domain through the MDCT.
In the linear prediction analyzing unit 303, an auto-correlation coefficient of the lowband synthesized signal converted into the frequency domain is obtained, and then the linear prediction coefficient is calculated based on the Levinson-Durbin algorithm. The second line spectrum pair (LSP) converting unit 304 converts the linear prediction coefficient of the lowband synthesized signal into a line spectrum pair. The difference between the highband LSP obtained in the first LSP converting unit 301 and the lowband LSP obtained in the second LSP converting unit 304 is vector-quantized in the vector quantizing unit 305.
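The differential vector quantization step may be sketched as follows, assuming that the highband and lowband LSP vectors have already been computed and have the same length. The conversion from linear prediction coefficients to LSPs is not shown, and the function names, the squared-error distance, and the toy codebook in the usage note are hypothetical.

```python
def lsp_difference(high_lsp, low_lsp):
    """Return the element-wise difference highband LSP - lowband LSP."""
    if len(high_lsp) != len(low_lsp):
        raise ValueError("LSP vectors must have the same length")
    return [h - l for h, l in zip(high_lsp, low_lsp)]


def vq_encode(vector, codebook):
    """Return the index of the codeword closest to vector (squared error)."""
    best_index = 0
    best_dist = float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index
```

For example, with a toy codebook `[[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0]]`, a highband LSP of `[0.6, 1.0]` and a lowband LSP of `[0.5, 1.0]` give a difference close to `[0.1, 0.0]`, so index 1 would be transmitted.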
The residual signal decoder 401 reconstructs a residual signal based on the normalized residual coefficients and the energy of each frequency band transmitted from the highband encoding apparatus. Herein, when the energy of each frequency band is not itself quantized and transmitted, but the difference between the energy of each frequency band and the fixed codebook gain of the lowband encoding system is quantized and transmitted instead, the energy of each frequency band is obtained by dequantizing the difference and adding the fixed codebook gain to the dequantized value.
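The decoder-side reconstruction may be sketched as follows, under the same assumptions as an encoder that quantizes the energy offset on a decibel scale with a uniform step. The function name, the decibel scale, and the parameter `step_db` are assumptions of the sketch; `shapes` stands for the already-decoded normalized residual coefficients of each band.

```python
def decode_band_energies(indices, shapes, fcb_gain_db, step_db=1.5):
    """Rebuild the residual from per-band energy indices and shapes.

    indices:     quantized (band energy - fixed codebook gain) offsets.
    shapes:      per band, the decoded normalized residual coefficients.
    fcb_gain_db: fixed codebook gain of the lowband decoder, in dB.
    """
    if len(indices) != len(shapes):
        raise ValueError("one energy index is required per band")
    residual = []
    for index, shape in zip(indices, shapes):
        # Dequantize the offset, then add the fixed codebook gain back.
        energy_db = fcb_gain_db + index * step_db
        rms = 10.0 ** (energy_db / 20.0)
        residual.extend(v * rms for v in shape)
    return residual
```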
The linear prediction order determiner 402 determines the linear prediction order by using lowband encoding information, namely pitch information, just as in the encoding process. The linear prediction coefficient dequantizer 403 dequantizes the linear prediction coefficient information transmitted from the highband encoding apparatus based on the determined linear prediction order and the lowband encoding information, namely the lowband synthesized signal, and thereby decodes the linear prediction coefficient.
The linear prediction synthesizer 404 performs linear prediction synthesis on the decoded residual signal based on the dequantized linear prediction coefficient. That is, it generates MDCT coefficients by performing linear prediction synthesis filtering on the decoded residual signal. The frequency domain deconverter 405 converts the linear-prediction-synthesized signal into a highband signal of a time domain. That is, it outputs a time-domain highband signal by performing an inverse MDCT (IMDCT).
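The synthesis filtering step may be sketched as follows; the inverse MDCT is not shown. The sketch assumes the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p with a[0] equal to 1, and the function names are hypothetical. The analysis filter is included only to show that the two operations are inverses of each other.

```python
def lp_analysis_filter(coeffs, a):
    """Encoder side: residual[k] = coeffs[k] + sum(a[i] * coeffs[k - i])."""
    residual = []
    for k in range(len(coeffs)):
        acc = coeffs[k]
        for i in range(1, len(a)):
            if k - i >= 0:
                acc += a[i] * coeffs[k - i]
        residual.append(acc)
    return residual


def lp_synthesis_filter(residual, a):
    """Decoder side: recursive inverse 1/A(z) run across frequency bins."""
    out = []
    for k in range(len(residual)):
        acc = residual[k]
        for i in range(1, len(a)):
            if k - i >= 0:
                acc -= a[i] * out[k - i]
        out.append(acc)
    return out
```

Applying `lp_synthesis_filter` to the output of `lp_analysis_filter` with the same coefficients returns the original MDCT coefficients up to rounding; in the decoder the residual additionally carries quantization error, which the synthesis filter shapes.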
The vector dequantizing unit 501 restores a line spectrum pair by performing vector dequantization on the linear prediction coefficient information transmitted from the highband encoding apparatus. In short, it restores the difference between the highband LSP and the lowband LSP. The frequency domain converting unit 502 converts the time-domain synthesized output of the lowband decoder into frequency domain coefficients.
The linear prediction coefficients of the synthesized output signal of the lowband decoder are calculated in the linear prediction analyzing unit 503. The linear prediction coefficients are calculated in the frequency domain.
The LSP converting unit 504 converts the linear prediction coefficient of the lowband synthesized signal into a line spectrum pair. The LSP output from the vector dequantizing unit 501 and the LSP of the transformed lowband synthesized signal output from the LSP converting unit 504 are added, and the sum is converted into a linear prediction coefficient in the LPC converting unit 505. In short, a linear prediction coefficient of the highband signal is generated.
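The addition of the dequantized difference to the lowband LSP may be sketched as follows. The function name, the codebook layout, and the final ordering safeguard with `min_gap` are assumptions of the sketch and are not part of the description above; the subsequent conversion from LSPs back to linear prediction coefficients is not shown.

```python
def reconstruct_highband_lsp(index, codebook, low_lsp, min_gap=1e-3):
    """Rebuild the highband LSP vector from a received codebook index.

    index:    vector quantization index received from the encoder.
    codebook: list of LSP-difference codewords shared with the encoder.
    low_lsp:  LSP vector computed locally from the lowband synthesized
              signal.
    min_gap:  minimum spacing enforced between neighboring LSPs
              (hypothetical safeguard to keep them in increasing order).
    """
    diff = codebook[index]
    if len(diff) != len(low_lsp):
        raise ValueError("codeword and lowband LSP lengths differ")
    lsp = [l + d for l, d in zip(low_lsp, diff)]
    for i in range(1, len(lsp)):
        if lsp[i] < lsp[i - 1] + min_gap:
            lsp[i] = lsp[i - 1] + min_gap
    return lsp
```

Because the lowband LSP is derived from the lowband synthesized signal, which is available at both ends, only the codebook index needs to be transmitted.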
At step S602, a linear prediction order is determined based on lowband encoding information, e.g., pitch information of a lowband signal. Subsequently, at step S603, a linear prediction coefficient is obtained by analyzing the highband signal converted into the frequency domain based on the determined linear prediction order. In detail, after an auto-correlation coefficient of the highband signal converted into the frequency domain is obtained, a linear prediction coefficient is calculated based on the Levinson-Durbin algorithm.
At step S604, the linear prediction coefficient is quantized by using the lowband encoding information, e.g., the lowband synthesized signal. At step S605, the quantized linear prediction coefficient is dequantized, a residual signal is obtained by performing linear prediction analysis filtering based on the dequantized linear prediction coefficient, and the residual signal is quantized. In short, the residual signal is divided into several bands, and the energy of each band and the normalized residual coefficients are quantized. Herein, when the energy of each band is quantized, fixed codebook gain information of the lowband encoder can be utilized. The quantization efficiency can be increased by quantizing the difference between the energy of each band and the fixed codebook gain of the lowband encoder, instead of quantizing the energy of each band directly.
Meanwhile, at step S702, the lowband synthesized signal is converted into a frequency domain. At step S703, linear prediction analysis is carried out on the lowband synthesized signal converted into the frequency domain. Specifically, an auto-correlation coefficient of the lowband synthesized signal converted into the frequency domain is obtained, and then a linear prediction coefficient is calculated based on the Levinson-Durbin algorithm. At step S704, the linear prediction coefficient is converted into a line spectrum pair.
At step S705, the difference between a line spectrum pair of a highband signal generated at the step S701 and a line spectrum pair of a lowband synthesized signal generated at the step S704 is calculated. At step S706, the difference is vector-quantized.
At step S802, a linear prediction order is determined based on lowband encoding information, e.g., pitch information, just as in the encoding process. At step S803, the linear prediction coefficient quantized and transmitted from the highband encoding apparatus is dequantized based on the lowband encoding information, e.g., a lowband synthesized signal.
At step S804, linear prediction synthesis is carried out on the residual signal decoded at the step S801 by using the dequantized linear prediction coefficient. That is, linear prediction synthesis filtering is performed on the decoded residual signal. At step S805, the linear prediction synthesized signal is converted into a highband signal of a time domain.
At step S902, the lowband synthesized signal is converted into a frequency domain. At step S903, an auto-correlation coefficient of the lowband synthesized signal converted into the frequency domain is obtained, and a linear prediction coefficient is calculated based on the Levinson-Durbin algorithm.
Subsequently, at step S904, the linear prediction coefficient of the lowband synthesized signal is converted into a line spectrum pair. At step S905, the line spectrum pair restored at the step S901 is added to the line spectrum pair obtained at the step S904, and the resulting line spectrum pair is converted into a linear prediction coefficient.
The present invention described above can remove a pre-echo by calculating an optimal linear prediction order for Temporal Noise Shaping (TNS) based on lowband encoding information and applying the optimal linear prediction order to highband encoding. In other words, removing the pre-echo effectively removes noise generated not only in a transition section but also in a voiced sound, thereby producing a high-quality signal. Also, the present invention can quantize the linear prediction coefficient used for highband encoding at a low transmission rate based on the lowband encoding information.
The present application contains subject matter related to Korean patent application No. 2004-0103158, filed in the Korean Intellectual Property Office on Dec. 8, 2004, the entire contents of which are incorporated herein by reference.
While the present invention has been described with respect to certain preferred embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the scope of the invention as defined in the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2004-0103158 | Dec 2004 | KR | national |