The present invention relates to a multi-codebook fixed bitrate CELP signal block encoding/decoding method and apparatus and a multi-codebook structure.
CELP speech coders typically use codebooks to store excitation vectors that are intended to excite synthesis filters to produce a synthetic speech signal. For high bit rates these codebooks contain a large variety of excitation vectors to cope with a large spectrum of sound types. However, at low bit rates, for example around 4–7 kbits/s, the number of bits available for the codebook index is limited, which means that the number of vectors to choose from must be reduced. Therefore low bit rate coders will have a codebook structure that is a compromise between accuracy and richness. Such coders will give fair speech quality for some types of sound and barely acceptable quality for other types of sound.
To solve this problem with low bitrate coders, a number of multi-mode solutions have been presented [1–5].
References [1–2] describe variable bitrate coding methods that use dynamic bit allocation, where the type of sound to be encoded controls the number of bits that are used for encoding.
References [3–4] describe constant bitrate coding methods that use several equal size codebooks that are optimized for different sound types. The sound type to be encoded controls which codebook is used.
These prior art coding methods all have the drawback that mode information has to be transferred from encoder to decoder in order for the decoder to use the correct decoding mode. Such mode information, however, requires extra bandwidth.
Reference [5] describes a constant bitrate multi-mode coding method that also uses equal size codebooks. In this case an already determined adaptive codebook gain of the previous subframe is used to switch from one coding mode to another coding mode. Since this parameter is transferred from encoder to decoder anyway, no extra mode information is required. This method, however, is sensitive to bit errors in the gain factor caused by the transfer channel.
An object of the present invention is an encoding/decoding scheme in which coding is improved without the need for explicitly transmitting coding mode information from encoder to decoder.
This object is achieved in accordance with the enclosed claims.
Briefly, the present invention achieves the above object by using several different equal size codebooks. Each codebook is weak for some signals, but the other codebooks do not share this weakness for those signals. By deterministically (without regard to signal type) switching between these codebooks from speech block to speech block, the coding quality is improved. There is no need to transfer information on which codebook was selected for a particular speech block, since both encoder and decoder use the same deterministic switching algorithm.
The invention, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:
In the following description and in the claims the expression “encoder/decoder” is intended to mean either an encoder or a decoder, since the invention is equally applicable to both cases.
The basic principles of the present invention will now be described with reference to
A way of viewing a codebook is to consider it as a multi-dimensional (typically 40-dimensional) “needle cushion”, in which the “needles” represent code vectors. In this model an untrained stochastic codebook would be represented by a “hyper-spherical” needle cushion, in which the code vectors are evenly distributed in every “direction” (the codebook is “white”). The training process mentioned above redistributes these vectors in such a way that certain “directions” are more densely populated than other “directions”. The least densely populated “directions” correspond to the weak points of the codebook. Each codebook is trained differently in a way that ensures that the codebooks do not have common weak points.
Often a stochastic codebook is approximated by an algebraic codebook, see [6]. Such a codebook may, for example, contain code vectors having a length of 40 samples. However, only very few sample positions actually have values that differ from zero. Furthermore, in many such algebraic codebooks the only allowed values (different from zero) are +1 or −1.
When one of these codebooks is searched, 1 pulse is positioned in one of the allowed positions of track 0, and 1 pulse is positioned in one of the allowed positions of track 1 of a track pair. This pulse combination is used as a potential code vector group. The group includes 4 possible code vectors, namely 1 vector having 2 positive pulses, 1 vector having 2 negative pulses and 2 vectors having 1 positive and 1 negative pulse. By shifting pulse positions within each of the 2 tracks in the track pair it is possible to form other such code vector groups. The same principles apply to track pair 1. By testing each possible combination the best code vector is selected. This code vector is defined by its corresponding track pair, 2 pulse positions in the tracks of this pair, and the pulse signs. This requires 1 bit to specify the track pair, 2·3=6 bits to specify the pulse positions in the tracks of this pair (there are 8 positions in a track, which requires 3 bits per pulse), and 2 bits to specify the sign of each pulse. Thus, a total of 1+6+2=9 bits defines a code vector.
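The bit count above can be checked with a short sketch. The track layout used here is an assumption made only for illustration (4 interleaved tracks of 8 positions each in a 40-sample vector, with track pair 0 formed by tracks 0 and 1 and track pair 1 by tracks 2 and 3); the bit ordering and the names `pack_index`, `unpack_index` and `code_vector` are likewise hypothetical and not taken from the description.

```python
# Assumed layout (illustrative only, not from the description):
# track t holds sample positions t, t+5, ..., t+35 of a 40-sample vector.
VECTOR_LEN = 40
STEP = 5
POSITIONS_PER_TRACK = 8


def track_positions(track):
    """Sample positions belonging to one track under the assumed layout."""
    return [track + STEP * k for k in range(POSITIONS_PER_TRACK)]


def pack_index(pair, pos0, pos1, neg0, neg1):
    """Pack 1 pair bit, 3+3 position bits and 2 sign bits into 9 bits."""
    return (pair << 8) | (pos0 << 5) | (pos1 << 2) | (neg0 << 1) | neg1


def unpack_index(index):
    """Inverse of pack_index."""
    return ((index >> 8) & 1, (index >> 5) & 7, (index >> 2) & 7,
            (index >> 1) & 1, index & 1)


def code_vector(index):
    """Build the 40-sample code vector with two pulses of value +1 or -1."""
    pair, pos0, pos1, neg0, neg1 = unpack_index(index)
    vec = [0] * VECTOR_LEN
    vec[track_positions(2 * pair)[pos0]] = -1 if neg0 else 1
    vec[track_positions(2 * pair + 1)[pos1]] = -1 if neg1 else 1
    return vec


# Every 9-bit index should map to a distinct code vector.
all_vectors = {tuple(code_vector(i)) for i in range(2 ** 9)}
```

Under these assumptions the 2 track pairs, 8·8 position combinations and 4 sign combinations give 2·64·4 = 512 = 2^9 distinct code vectors, which matches the 9-bit index.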
Returning to
The codebooks 10A–D all have the same bitrate, but their weakest performance points are not shared. By deterministically switching between the codebooks from signal block to signal block, the deficiencies of each codebook will be compensated over time. It has been found that the average perceived sound quality of the encoded and thereafter decoded audio signals actually increases in spite of the fact that signal type is disregarded in the switching algorithm. This may be explained by noting that the resulting distortion from one single codebook is not repeated in every subframe or block. Instead the varying distortions will be smoothed out. Thus, the distortion from this low bitrate (multi) codebook is perceived as less annoying, since it is not continuously repeated.
One embodiment of the selection algorithm is to sequentially and cyclically select each codebook 10A–D. The encoder and decoder are automatically in sync if the number of codebooks corresponds to the number of subframes in a frame and a codebook counter in encoder and decoder is reset every frame. Otherwise synchronization may be achieved by resetting a modulo n counter, where n is the number of codebooks, in both encoder and decoder at call-setup and handover.
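The cyclic selection can be sketched as a modulo n counter that is kept identically in encoder and decoder. This is a minimal illustration, assuming four codebooks indexed 0 to 3; the class name `CyclicCodebookSelector` and its methods are hypothetical.

```python
class CyclicCodebookSelector:
    """Modulo-n counter that selects codebooks sequentially and cyclically."""

    def __init__(self, num_codebooks):
        self.num_codebooks = num_codebooks
        self.counter = 0

    def reset(self):
        # Called every frame, or at call-setup and handover, on both sides.
        self.counter = 0

    def next_codebook(self):
        # Return the codebook for the current block, then advance the counter.
        selected = self.counter
        self.counter = (self.counter + 1) % self.num_codebooks
        return selected


# Encoder and decoder run the same counter, so no mode bits are transmitted.
encoder = CyclicCodebookSelector(4)
decoder = CyclicCodebookSelector(4)
encoder_choices = [encoder.next_codebook() for _ in range(6)]
decoder_choices = [decoder.next_codebook() for _ in range(6)]
```

Both lists hold the same codebook indices because the two counters start from the same reset state and advance once per block.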
Another selection algorithm is to use a pseudo-random sequence to select codebooks from the set. In this case the seed of the algorithm that generates the pseudo-random sequence is known to both encoder and decoder. Synchronization between encoder and decoder may, for example, be achieved by a pseudo-random sequence that is based on transmitted and received frame parameters that are determined and analyzed prior to the codebook search.
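A pseudo-random selection of this kind might look as follows. The description does not name a generator, so the linear congruential generator and its constants are assumptions chosen only because they behave identically on any platform; the class name `PseudoRandomCodebookSelector` is hypothetical.

```python
class PseudoRandomCodebookSelector:
    """Selects codebooks from a pseudo-random sequence with a shared seed."""

    def __init__(self, seed, num_codebooks):
        self.state = seed & 0xFFFFFFFF
        self.num_codebooks = num_codebooks

    def next_codebook(self):
        # 32-bit linear congruential step (assumed generator, not from the text).
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        # Use the upper bits, which are better distributed than the low bits.
        return (self.state >> 16) % self.num_codebooks


# Both sides start from the same seed and therefore pick the same codebooks.
shared_seed = 12345
encoder = PseudoRandomCodebookSelector(shared_seed, 4)
decoder = PseudoRandomCodebookSelector(shared_seed, 4)
encoder_choices = [encoder.next_codebook() for _ in range(8)]
decoder_choices = [decoder.next_codebook() for _ in range(8)]
```

In line with the text, the seed could instead be derived from frame parameters that are transmitted anyway, so that a decoder can regain synchronization without extra side information.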
As in
Since the parameters that are used for set selection will be transferred from encoder to decoder anyway, no bandwidth is lost for transferring set selection information. Preferably only channel protected parameters are used for set detection. Furthermore, an especially preferred embodiment of the encoder/decoder of
Since the set selection precedes the codebook selection, the embodiment of
Typically the functionality of set and codebook selectors 22, 28 is implemented by one or several microprocessors or micro/signal processor combinations.
It will be understood by those skilled in the art that various modifications and changes may be made to the present invention without departing from the scope thereof, which is defined by the appended claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 9803164 | Sep 1998 | SE | national |
| Number | Name | Date | Kind |
|---|---|---|---|
| 4932061 | Kroon et al. | Jun 1990 | A |
| 5371853 | Kao et al. | Dec 1994 | A |
| 5617145 | Huang et al. | Apr 1997 | A |
| 5754976 | Adoul et al. | May 1998 | A |
| 5778335 | Ubale et al. | Jul 1998 | A |
| 5991717 | Minde et al. | Nov 1999 | A |
| 6055496 | Heidari et al. | Apr 2000 | A |
| 6122608 | McCree | Sep 2000 | A |
| Number | Date | Country |
|---|---|---|
| 0770985 | May 1997 | EP |
| 05-265496 | Oct 1993 | JP |
| WO 9516260 | Jun 1995 | WO |