1. Technical Field
The present invention relates to a high-speed search method for LSP (Line Spectrum Pair) parameters using SVQ (Split Vector Quantization) and for the fixed codebook of the G.729 speech encoder. More particularly, it relates to a high-speed search method which decreases overall computational complexity without sacrificing spectral distortion performance by using the order character of the LSP parameters to reduce the size of the codebook to be searched, since the codebook search has high computational complexity when quantizing a split vector of the LSP parameters of a speech encoder used to compress voice signals at a low bit rate; and to a high-speed search method which dramatically reduces computational complexity without loss of tone quality by selecting and searching tracks on the basis of the magnitude order of a correlation signal (d′(n)), obtained from an impulse response and a target signal, in the process of searching the fixed codebook of the G.729 speech encoder.
2. Description of the Prior Art
Generally, in speech encoding at a transmission rate below 16 kbps, where the bandwidth is limited, the speech is not transmitted directly; instead, parameters representing the speech are sampled and quantized to reduce the amount of data.
For high-quality encoding, a low-bit-rate speech encoder quantizes LPC coefficients, in which the optimal LPC coefficients are obtained by dividing the input speech signal into frames and minimizing the prediction error energy in each frame.
The LPC filter is commonly a 10th-order all-pole filter.
In the above conventional method, many bits must be assigned to quantize the 10 LPC coefficients. Moreover, when the LPC coefficients are quantized directly, the characteristics of the filter are very sensitive to the quantization error, and stability of the LPC filter is not assured after the coefficients are quantized.
Therefore, the present invention is designed to overcome the problems of the prior art. An object of the present invention is to provide a high speed search method for a speech encoder having decreased overall computational complexity, and in which spectral distortion performance is not sacrificed.
These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings, in which like components are referred to by like reference numerals. In the drawings:
FIGS. 4a and 4b show a start point and an end point of a code vector group satisfying the order character.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Quantizing the overall vector at one time is practically impossible because the vector table becomes too large and the search takes too much time. To solve this problem, the present invention employs SVQ (Split Vector Quantization), which divides the overall vector into several sub-vectors and then quantizes the sub-vectors independently. A predictive SVQ, which adds a prediction unit to the SVQ, uses the correlation between frames of the LSP (Line Spectrum Pair) parameters for more efficient quantization. That is, the predictive SVQ does not quantize the LSP of the current frame directly, but predicts the LSP of the current frame on the basis of the LSP of the previous frame and then quantizes the prediction error. The LSP is closely related to the frequency characteristics of the speech signal, which makes prediction over time possible with great gains.
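The prediction step can be illustrated with a minimal sketch. It assumes a first-order predictor that scales the previous quantized LSP by a fixed coefficient; the function name, the coefficient `alpha`, and the `quantize` callback are hypothetical and are not specified in this description.

```python
def predictive_encode(lsp, prev_quantized, quantize, alpha=0.5):
    """Quantize the prediction error instead of the LSP itself.

    lsp            : LSP vector of the current frame
    prev_quantized : quantized LSP vector of the previous frame
    quantize       : callable that quantizes a residual vector (assumed)
    alpha          : hypothetical first-order prediction coefficient
    """
    predicted = [alpha * q for q in prev_quantized]
    residual = [p - pr for p, pr in zip(lsp, predicted)]
    q_residual = quantize(residual)
    # The decoder can rebuild the same value from the prediction and the index.
    reconstructed = [pr + r for pr, r in zip(predicted, q_residual)]
    return q_residual, reconstructed
```

In a predictive SVQ the `quantize` step would be the split codebook search; only the residual is passed to it.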
When quantizing the LSP parameters with such VQ, most quantizers have a large LSP codebook. In order to reduce computational complexity in searching for an optimal code vector in the codebook, the quantizer narrows the range of code vectors to be searched by using the order of the LSP parameters. That is, the quantizer arranges the code vectors in the codebook in descending order according to the element value at a specific position in the sub-vector. The optimal code vector, which minimizes distortion in the arranged codebook, then has nearly the same values as the target vector, which implies that it shares the order character of the target vector. On this presumption, the present invention compares the element value at the specific position, arranged in descending order, with the element values at the adjacent positions, calculates the distortion, which has high computational complexity, only for the code vectors that satisfy the order character, and skips the calculation for the other code vectors.
Such a method may greatly reduce the overall computational complexity.
$0 < p_1 < p_2 < \cdots < p_p < \pi$ [Equation 1]
In the Equation 2, the error criterion El,m is expressed in terms of p and p̂, in which pm is the target vector used to search the mth codebook, and p̂l,m is the lth code vector in the codebook for the mth sub-vector. Here, an optimal code vector for each sub-vector is selected to minimize the error criterion El,m and is then transmitted as the finally selected codebook index (l).
In the Equation 2, the LSP code vector (p̂) is divided into M sub-vectors, and the codebook for the mth sub-vector consists of Lm code vectors. The M codebook sizes (L0, L1, . . . , LM−1) may be assigned per sub-vector to improve tone quality. Wm is a weighting matrix for the mth sub-vector and is obtained from the non-quantized LSP vector (p).
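The split search can be sketched as follows. The sketch assumes that the error criterion has the usual weighted squared-error form, E = (pm − p̂l,m)ᵀ Wm (pm − p̂l,m), and that Wm is diagonal; the function names and the `splits` argument are hypothetical.

```python
def weighted_error(target, code_vector, weights):
    """Weighted squared error, assuming a diagonal weighting matrix W_m."""
    return sum(w * (t - c) ** 2 for t, c, w in zip(target, code_vector, weights))

def search_subvector(target, codebook, weights):
    """Full search: return (index, error) of the code vector with least error."""
    errors = [weighted_error(target, cv, weights) for cv in codebook]
    best = min(range(len(codebook)), key=errors.__getitem__)
    return best, errors[best]

def split_vq(lsp, codebooks, weights, splits):
    """Quantize each sub-vector independently; splits holds (start, stop) pairs."""
    indices = []
    for (start, stop), codebook in zip(splits, codebooks):
        idx, _ = search_subvector(lsp[start:stop], codebook, weights[start:stop])
        indices.append(idx)
    return indices
```

Each call to `search_subvector` evaluates every code vector, which is the cost that the order character is later used to reduce.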
In order to employ a high-speed search method in the present invention, conversion of the conventional codebook is needed. This is a process of replacing the conventional codebook with a new codebook having L reference rows, as illustrated in the accompanying drawings.
where the subscripts l and n of pl,n denote the lth index of the nth reference row.
As seen in the Equation 3, the element value of the (n−1)th row in the target vector should be less than the element value of the nth row in the codebook, while the element value of the (n+1)th row in the target vector should be greater than the element value of the nth row in the codebook.
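Read literally, the condition described here can be written as the inequality below, where $p_{n-1}$ and $p_{n+1}$ are elements of the target vector and $\hat{p}_{l,n}$ is the reference-row element of the lth code vector. This is a restatement of the prose and not the originally typeset equation.

```latex
p_{n-1} < \hat{p}_{l,n} < p_{n+1}
```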
Presuming that the reference rows optimized for each codebook are N0, N1, . . . , Nm and that a 10th-order LSP vector is the target vector, the search range of a codebook is determined by comparing the element value of the reference row in the codebook to be searched with the element values of the rows before and after the reference row in the target vector, using the following Equations 4 and 5, and then excluding the code vectors that do not satisfy the order character from the searching process.
In this specification, comparing the element value of the Nth row of a code vector with the element value of the (N−1)th row of the target vector, as shown in the Equation 4, to determine whether they satisfy the order character is called a forward comparison. Comparing the element value of the Nth row of the code vector with the element value of the (N+1)th row of the target vector, as shown in the Equation 5, to determine whether they satisfy the order character is called a backward comparison.
Hereinafter, preferred embodiments of the present invention are explained with reference to the accompanying drawings.
The process of obtaining an actual start point and an end point of a code vector group satisfying the order character for the given target vector is shown in FIGS. 4a and 4b.
FIGS. 4a and 4b are flowcharts illustrating the process of obtaining the start point and the end point of the code vector group that satisfies the order character, for the forward and backward comparison, respectively. The search range of the codebook can be calculated from the start point and the end point obtained by these flowcharts.
Once the start point and the end point are calculated, an optimally quantized vector may be selected by computing the distortion only for the code vectors within the range between the start point and the end point.
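The range-limited search can be sketched as below. The sketch assumes that the codebook is already sorted in descending order of its reference-row element, that `lower` and `upper` are the target-vector elements of the rows before and after the reference row, and that the weighting is diagonal. It locates the start and end points by binary search rather than by the flowcharts of the drawings, and all names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def search_range(ref_values_desc, lower, upper):
    """Return (start, end) so that ref_values_desc[start:end] lies strictly
    between lower (target row N-1) and upper (target row N+1)."""
    negated = [-v for v in ref_values_desc]   # ascending order for bisect
    start = bisect_right(negated, -upper)     # skip values >= upper
    end = bisect_left(negated, -lower)        # stop before values <= lower
    return start, max(start, end)

def limited_search(target_sub, codebook_desc, ref_row, lower, upper, weights):
    """Compute the distortion only for code vectors inside the search range."""
    refs = [cv[ref_row] for cv in codebook_desc]
    start, end = search_range(refs, lower, upper)
    best, best_err = None, float("inf")
    for l in range(start, end):
        err = sum(w * (t - c) ** 2
                  for t, c, w in zip(target_sub, codebook_desc[l], weights))
        if err < best_err:
            best, best_err = l, err
    return best, (start, end)
```

With a five-entry codebook and a range of (1, 4), only three distortions are computed instead of five; the saving grows with the codebook size.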
An efficient search method for the fixed codebook is very important for high-quality speech encoding in a low-bit-rate speech encoder. In the G.729 speech encoder, the fixed codebook is searched for each sub-frame; a 17-bit algebraic codebook is used as the fixed codebook, and the index of the selected code vector is transmitted. Each vector in the fixed codebook has 4 pulses. As shown in Table 1, each pulse has an amplitude of +1 or −1 at a designated position and is represented by the Formula 6.
in which c(n) is a fixed codebook vector, δ(n) is a unit pulse and mi is a position of the ith pulse.
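As a minimal sketch, a fixed codebook vector can be built from four signed unit pulses, assuming the form c(n) = Σ si δ(n − mi) over a 40-sample sub-frame. The track layout below follows the positions named in this description for tracks 0 to 2 and assumes that track 3 holds the remaining 16 positions; the names are hypothetical.

```python
SUBFRAME = 40

# Allowed pulse positions per track (track 3 is assumed to have 16 positions).
TRACKS = [
    list(range(0, 40, 5)),                          # t0: 0, 5, ..., 35
    list(range(1, 40, 5)),                          # t1: 1, 6, ..., 36
    list(range(2, 40, 5)),                          # t2: 2, 7, ..., 37
    list(range(3, 40, 5)) + list(range(4, 40, 5)),  # t3: 3, 8, ..., 38, 4, ..., 39
]

def build_code_vector(positions, signs):
    """Place one pulse of amplitude +1 or -1 at each given position."""
    c = [0] * SUBFRAME
    for track, (m, s) in enumerate(zip(positions, signs)):
        if m not in TRACKS[track] or s not in (1, -1):
            raise ValueError("invalid pulse for track %d" % track)
        c[m] += s
    return c
```

Three bits address each of the 8-position tracks and four bits address track 3, giving 13 position bits; with one sign bit per pulse this makes the 17-bit index.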
A target signal x′(n) for the fixed codebook search is obtained by removing the contribution of the adaptive codebook from the target signal x(n) used in the pitch search, and may be represented by the following Formula 7.
in which gp is the gain of the adaptive codebook, and y(n) is the adaptive codebook vector.
Assuming that the codebook vector of index (k) is ck, the optimal code vector is selected as the codebook vector which maximizes the following Formula 8.
in which d is the correlation vector between the target signal x′(n) and the impulse response h(n) of the synthesis filter, and Φ is the correlation matrix of h(n). That is, d and Φ are given by the following Formulas 9 and 10.
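The quantities used by the search can be sketched as below. The sketch assumes the conventional definitions x′(n) = x(n) − gp·y(n), d(n) = Σ x′(i)·h(i − n) for i ≥ n, and φ(i, j) = Σ h(n − i)·h(n − j) for n ≥ max(i, j), with a criterion of the form (dᵀc)² / (cᵀΦc). These forms stand in for Formulas 7 to 10, and the function names are hypothetical.

```python
def fixed_codebook_target(x, y, g_p):
    """x'(n) = x(n) - g_p * y(n): remove the adaptive codebook contribution."""
    return [xn - g_p * yn for xn, yn in zip(x, y)]

def correlation_vector(x_target, h):
    """d(n) = sum over i >= n of x'(i) * h(i - n)."""
    size = len(x_target)
    return [sum(x_target[i] * h[i - n] for i in range(n, size))
            for n in range(size)]

def correlation_matrix(h):
    """phi(i, j) = sum over n >= max(i, j) of h(n - i) * h(n - j)."""
    size = len(h)
    phi = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            value = sum(h[n - i] * h[n - j] for n in range(j, size))
            phi[i][j] = phi[j][i] = value
    return phi

def criterion(c, d, phi):
    """Squared correlation divided by the energy of the filtered code vector."""
    size = len(c)
    numerator = sum(dn * cn for dn, cn in zip(d, c)) ** 2
    denominator = sum(c[i] * phi[i][j] * c[j]
                      for i in range(size) for j in range(size))
    return numerator / denominator
```

Because d and Φ depend only on the target signal and the impulse response, they can be computed once per sub-frame before any pulse position is tried.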
The codebook search consists of 4 nested loops, each of which determines a new pulse. The correlation Ck that is squared in the numerator of the Formula 8 is given by C in the following Formula 11, and the denominator of the Formula 8 is given by the following Formula 12 (in which φ(mi,mj) corresponds to Φ(i,j) of the Formula 10).
in which mi is the position of the ith pulse, and si is its sign.
In order to reduce the computational complexity of the codebook search, the following process is employed. First, d(n) is decomposed into an absolute value d′(n)=|d(n)| and its sign. At this time, the sign value is determined in advance for each of the 40 available pulse positions in Table 1. The matrix Φ is then modified into φ′(i,j)=sign[s(i)] sign[s(j)] φ(i,j) for i≠j and φ′(i,i)=0.5φ(i,i) in order to include the previously obtained sign values. Therefore, the Formula 11 may be represented as:
C=d′(m0)+d′(m1)+d′(m2)+d′(m3)
and the Formula 12 may be represented as:
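With the signs folded into φ′ and the diagonal terms halved, the energy reduces to a plain sum of φ′ terms. The expression below is a reconstruction from those definitions, assuming that the Formula 12 has the form E = Σ φ(mi,mi) + 2 Σ si sj φ(mi,mj) over i < j; it is not copied from the original formula.

```latex
\frac{E}{2} = \phi'(m_0,m_0)
            + \phi'(m_1,m_1) + \phi'(m_0,m_1)
            + \phi'(m_2,m_2) + \phi'(m_0,m_2) + \phi'(m_1,m_2)
            + \phi'(m_3,m_3) + \phi'(m_0,m_3) + \phi'(m_1,m_3) + \phi'(m_2,m_3)
```

Each row adds the terms contributed by one new pulse, which matches the four nested loops of the search.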
In order to search all available pulse positions, 2^13 (=8,192) compositions would have to be searched. However, in order to reduce computational complexity, a threshold value (Cth) is determined for selecting the candidates for which the 16 available pulse positions in the final track (t3) are searched. Candidates having low possibility among all 2^9 (=512) compositions are excluded on the basis of experimental data, and the pulses in the track (t3) are searched only for the candidates which are over the threshold value.
At this time, the threshold value (Cth) is determined as a function of the maximum correlation value and the average correlation value of the prior three tracks (t0, t1, t2). The maximum correlation value of the tracks (t0, t1, t2) can be expressed as the following Formula 13.
Cmax=max[d′(t0)]+max[d′(t1)]+max[d′(t2)] [Equation 13]
in which max[d′(ti)] is the maximum value of d′(n) in the track ti (i = 0, 1, 2). The average correlation value based on the tracks (t0, t1, t2) is as follows.
Here, the threshold value is given as the following Formula 15.
Cth=Cav+(Cmax−Cav)αt [Equation 15]
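The threshold computation can be sketched as follows. The sketch assumes that the average correlation value is the sum, over the three tracks, of the mean of d′(n) within each 8-position track; the value of αt is left as a parameter because it is not given here, and the function names are hypothetical.

```python
def track_positions(track):
    """Pulse positions of track 0, 1 or 2 in a 40-sample sub-frame."""
    return range(track, 40, 5)

def threshold(d_abs, alpha_t):
    """C_th = C_av + (C_max - C_av) * alpha_t over tracks t0, t1, t2.

    d_abs   : list of 40 absolute correlation values d'(n)
    alpha_t : control factor for the number of candidates (assumed input)
    """
    maxima = [max(d_abs[n] for n in track_positions(t)) for t in range(3)]
    means = [sum(d_abs[n] for n in track_positions(t)) / 8 for t in range(3)]
    c_max = sum(maxima)
    c_av = sum(means)
    return c_av + (c_max - c_av) * alpha_t
```

With αt = 0 the threshold equals the average and many candidates pass; with αt = 1 it equals the maximum and no candidate exceeds it.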
The threshold value is determined before searching the fixed codebook, and only candidates over the threshold value are subject to the search of the final track (t3). Here, the value of αt is used to control the number of candidates for which the final track (t3) is searched: out of all N=512 candidates, the average becomes N=60, and only 5% are over N=90. In addition, the track (t3) is limited to N1=105, and the maximum number of candidates is limited to 180−N1. At this time, among the 8,192 compositions, 90×16=1,440 searches are performed.
When searching the fixed codebook by the above process, most of the computation is spent searching for the position index of the optimal pulse in the loop of each track. Therefore, the high-speed search method of the present invention sorts the d′(n) values in each of the tracks (t0, t1, t2) and then searches first the position index which has the largest d′(n) value in each of the three loops. Tables 2 and 3 show examples of the high-speed search method for a specific sub-frame, which proceeds as follows.
First, the position indexes of the tracks (t0, t1, t2) are arranged in descending order of the d′(n) value, so that the position index that has the highest probability of being an optimal pulse position is searched first.
Then, because the threshold value in the Formula 15 is composed only of the d′(n) values, i.e., the correlations between the target signal and the impulse response of the synthesis filter for each of the tracks (t0, t1, t2), as described above, and the position indexes are arranged in descending order of the d′(n) values, the sum of the d′(n) values of the tracks (t0, t1, t2) is calculated and compared with the predetermined threshold value. The search process is executed if the sum is over the threshold value, but the codebook search is finished if the sum is not over the threshold value.
As described above, the candidates over the threshold value may be searched at high speed by sequentially arranging the fixed codebook according to the d′(n) values and calculating the correlation value Ck on the basis of the arranged codebook.
As shown in the Table 3, the tracks (t0, t1, t2) are searched in an order that depends on the size of d′(n). However, not all 8 position values of each track are searched; only a limited number of position values, chosen according to probability, are searched. As an example based on Table 4, only 4 position values are searched in the track (t0), only 5 position values are searched in the track (t1) and only 6 position values are searched in the track (t2), while the searching process for the other position values having low probability is excluded, thereby reducing computational complexity without loss of tone quality.
Interactions between the steps are described below with reference to Tables 1, 2, 3 and 4.
The step T100 of determining the correlation values for each pulse position index in the tracks (t0, t1, t2) determines the correlation value for each pulse position index in each track. That is, if the correlation value is d′(n), the step T100 determines the sizes of d′(0), d′(5), d′(10), . . . , d′(35) for the track 0 (t0), the sizes of d′(1), d′(6), d′(11), . . . , d′(36) for the track 1 (t1), and the sizes of d′(2), d′(7), d′(12), . . . , d′(37) for the track 2 (t2).
Table 2 is a chart showing the correlation values for each pulse position index of the tracks (t0, t1, t2) in a specific sub-frame.
The step T110 of arranging the pulse position indexes of the tracks (t0, t1, t2) according to the correlation value of each track involves comparing the sizes of the correlation values of the pulse position indexes in each track and then arranging them in descending order.
In other words, the step T110 compares the correlation value magnitudes obtained for all pulse position indexes of the track 0 (t0) and then arranges the pulse position indexes in descending order of those magnitudes. The step T110 arranges the tracks 1 and 2 in descending order by the same approach.
Table 3 is a chart showing the process of arranging the pulse position indexes in a descending order according to the correlation value magnitudes of each of the tracks (t0, t1, t2) in a specific sub-frame.
Referring to Tables 2 and 3, Table 2 assumes that the correlation value is given for each pulse position index and Table 3 shows pulse positions (or position indexes) arranged in a descending order on the basis of the correlation value.
Therefore, the pulse position indexes are newly arranged in the tracks (t0, t1, t2), in which the pulse position indexes are arranged as 5, 25 . . . , 30 in the track 0, as 6, 1 . . . , 31 in the track 1, and as 32, 37, . . . , 27 in the track 2.
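Steps T100 and T110 can be sketched as a per-track sort. The function name is hypothetical, and the correlation values used in the usage note are made up for illustration, since the values of Table 2 are not reproduced here.

```python
def sort_tracks(d_abs):
    """For tracks t0, t1, t2, return the pulse position indexes sorted in
    descending order of the correlation magnitude d'(n).

    d_abs : list of 40 absolute correlation values d'(n)
    """
    return [sorted(range(track, 40, 5), key=lambda n: d_abs[n], reverse=True)
            for track in range(3)]
```

For instance, if d′(5) and d′(25) are the two largest values in track 0, the sorted list for track 0 begins with 5, 25.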
The step T120 calculates a sum of the correlation values for each pulse position index of the tracks (t0, t1, t2).
Referring to Table 3, the step T120 obtains the sum of the correlation values, |d(5)|+|d(6)|+|d(32)|, for the pulse position index composition (5, 6, 32) of the tracks (t0, t1, t2).
In addition, the step T130 of checking whether the calculated sum is over the threshold value compares the calculated sum for the pulse position index composition with the threshold value determined before the fixed codebook search.
The step T140 searches an optimal pulse position in the track 3 for the pulse position index composition if the calculated sum is over the threshold value.
As an example, if the sum of the correlation values for the pulse position index composition (5, 6, 32) in Table 3 is bigger than the threshold value, the search candidates for searching an optimal pulse position in the track 3 become (5, 6, 32, 3), (5, 6, 32, 8), . . . , (5, 6, 32, 39). They are the compositions formed by adding each pulse position index of the track 3 to the composition (5, 6, 32).
The step T150 checks, after the track 3 (t3) has been searched, whether the search for all pulse position index compositions of the tracks (t0, t1, t2) is completed, that is, whether the track 3 has been searched for all candidates in the case that the calculated sum is over the threshold value.
The step T160 increases the pulse position indexes of the tracks (t0, t1, t2) if the search for all pulse position index compositions of the tracks (t0, t1, t2) is not completed, so as to obtain the next pulse position index composition for the tracks 0, 1 and 2 in the case that the calculated sum is over the threshold value.
As an example, if the current search candidate is (5, 6, 32) for the tracks 0, 1 and 2, the next search candidate obtained by increasing the pulse position index may be (5, 6, 37).
If the pulse position index is increased one more time, the next search candidate may be (5, 6, 12).
If the calculated sum is equal to or less than the threshold value, the search for the track 3 is not performed and the fixed codebook search for the corresponding sub-frame is finished (T170).
Therefore, if there is a candidate not over the threshold value when determining candidates for searching the track 3, the other candidates are also not over the threshold value, so the fixed codebook search is stopped to reduce unnecessary computational complexity.
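The candidate selection of steps T100 to T170 can be sketched as below. The sketch follows the steps literally: compositions are visited with the track 2 index advancing fastest, and the whole search stops at the first composition whose sum is not over the threshold. The function names are hypothetical, and the order of the track 3 positions is assumed from the example (5, 6, 32, 3), (5, 6, 32, 8), . . . , (5, 6, 32, 39).

```python
from itertools import product

# Track 3 positions in the order suggested by the example: 3, 8, ..., 38, 4, ..., 39.
TRACK3 = list(range(3, 40, 5)) + list(range(4, 40, 5))

def select_candidates(d_abs, threshold):
    """Return the (m0, m1, m2) compositions for which track 3 is searched."""
    # T100/T110: sort the position indexes of each track by descending d'(n).
    order = [sorted(range(t, 40, 5), key=lambda n: d_abs[n], reverse=True)
             for t in range(3)]
    selected = []
    # T160: product() advances the last (track 2) index first.
    for m0, m1, m2 in product(*order):
        total = d_abs[m0] + d_abs[m1] + d_abs[m2]   # T120
        if total <= threshold:                       # T130
            break                                    # T170: finish the search
        selected.append((m0, m1, m2))                # T140 follows for this one
    return selected

def expand_track3(composition):
    """T140: the 16 four-pulse candidates built from one composition."""
    return [composition + (m3,) for m3 in TRACK3]
```

Each selected composition costs 16 evaluations of the search criterion, so stopping early bounds the work by 16 times the number of selected compositions.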
As explained above, Table 4 is a chart showing the statistical probabilities that each pulse position for the tracks 0, 1 and 2 is selected as an optimal pulse position. As shown in the table, the probability values that each pulse position for the tracks 0, 1 and 2 is selected as an optimal pulse position are arranged sequentially. Their arrangement is identical to the descending order based on the size of the correlation value for each pulse position index.
This will be well understood with reference to the Formula 8, in which the numerator contributes more than the denominator because the numerator of the Formula 8, based on the d′(n) value, is squared.
Therefore, the pulse position which maximizes the correlation value (Ck) is very likely to be the optimal pulse position, and the pulse position having the largest correlation value is the most likely to be the optimal pulse position.
According to such method, only a limited number of pulse positions are searched according to the probability, or the size of the correlation value, rather than all 8 pulse positions of each of the tracks 0, 1 and 2.
As an example, in Table 4, only 4 pulse positions are searched in the track (t0), only 5 pulse positions are searched in the track (t1) and only 6 pulse positions are searched in the track (t2), while the searching process for the other pulse positions having low probability is excluded, thereby reducing computational complexity without loss of tone quality.
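The effect of limiting the positions per track can be checked with a short sketch. It keeps only the first 4, 5 and 6 positions of the sorted tracks, as in the example above; the function names are hypothetical.

```python
def limited_orders(d_abs, limits=(4, 5, 6)):
    """Keep only the highest-ranked positions of tracks t0, t1, t2."""
    orders = []
    for track, limit in enumerate(limits):
        ranked = sorted(range(track, 40, 5),
                        key=lambda n: d_abs[n], reverse=True)
        orders.append(ranked[:limit])
    return orders

def composition_count(orders):
    """Number of (m0, m1, m2) compositions spanned by the kept positions."""
    count = 1
    for positions in orders:
        count *= len(positions)
    return count
```

With these limits at most 4 × 5 × 6 = 120 compositions of the first three tracks remain, compared with 8 × 8 × 8 = 512 when every position is kept.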
In other words, by using the method of the present invention, lower computational complexity in the fixed codebook search is expected than in the prior art, with the same tone quality.
Furthermore, the high-speed fixed codebook search method of the present invention may be applied to the search process for various types of fixed codebook having an algebraic structure.
The present invention has the effects of reducing the computational complexity required to search the codebook without signal distortion when quantizing the LSP parameters of a speech encoder using SVQ, and of reducing computational complexity without loss of tone quality in the G.729 fixed codebook search by performing candidate selection and search on the basis of the correlation value size of each pulse position index.
Number | Date | Country | Kind |
---|---|---|---|
2000-1756 | Jan 2000 | KR | national |
2000-9519 | Feb 2000 | KR | national |
2000-18838 | Apr 2000 | KR | national |
Number | Date | Country | |
---|---|---|---|
20010010038 A1 | Jul 2001 | US |