Speech encoding method and apparatus using tree-structure delta code book

Information

  • Patent Grant
  • Patent Number
    5,864,650
  • Date Filed
    Thursday, December 12, 1996
  • Date Issued
    Tuesday, January 26, 1999
Abstract
A larger number, L', of delta vectors Δ_i (i = 0, 1, 2, ..., L'-1) than the required number L are each multiplied by a matrix of a linear predictive synthesis filter (3), their power (AΔ_i)^T (AΔ_i) is evaluated (42), and the delta vectors are reordered in decreasing order of power (43); then, L delta vectors are selected in decreasing order of power, the largest power first, to construct a tree-structure delta code book (41), using which A-b-S vector quantization is performed (48). This provides increased freedom for the space formed by the delta vectors and improves the quantization characteristic. Further, variable rate encoding is achieved by taking advantage of the structure of the tree-structure delta code book.
Description

TECHNICAL FIELD
The present invention relates to a speech encoding method and apparatus for compressing speech signal information, and more particularly to a speech encoding method and apparatus based on Analysis-by-Synthesis (A-b-S) vector quantization for encoding speech at transfer rates of 4 to 16 kbps.
BACKGROUND ART
In recent years, speech encoders based on A-b-S vector quantization, such as the code-excited linear prediction (CELP) encoder, have been drawing attention in the fields of LAN systems, digital mobile radio systems, etc., as promising speech encoders capable of compressing speech signal information without degrading its quality. In such a vector quantization speech encoder (hereinafter simply called the encoder), predictive weighting is applied to each code vector in a code book to reproduce a signal, and the error power between the reproduced signal and the input speech signal is evaluated to determine the number (index) of the code vector with the smallest error, which is then transmitted to the receiving end.
The encoder based on such an A-b-S vector quantization system performs linear predictive filtering on each of the speech source signal vectors, stored as about 1,000 patterns in the code book, and searches those patterns for the one that minimizes the error between the reproduced signal and the input speech signal to be encoded.
Since the encoder is required to ensure the instantaneousness of voice communication, the above search process must be performed in real time. This means that the search process must be performed repeatedly at very short time intervals, for example, at 5 ms intervals, for the duration of voice communication.
However, as will be described in detail, the search process involves complex mathematical operations, such as filtering and correlation calculations, and the amount of calculation required for these operations is enormous, on the order of hundreds of megaoperations per second (Mops). To handle such operations, several chips would be required even if the fastest digital signal processors (DSPs) currently available were used. In portable telephone applications, for example, this presents a problem, as it makes it difficult to reduce equipment size and power consumption.
To overcome the above problem, the present applicant proposed, in Japanese Patent Application No. 3-127669 (Japanese Patent Unexamined Publication No. 4-352200), a speech encoding system using a tree-structure code book: instead of storing the code vectors themselves as in previous systems, a code book storing delta vectors, which represent differences between signal vectors, is used, and these delta vectors are sequentially added and subtracted through a tree structure to generate the code vectors.
According to this system, the memory capacity required to store the code book can be reduced drastically; furthermore, since the filtering and correlation calculations, which were previously performed on each code vector, are performed on the delta vectors and the results are sequentially added and subtracted, a drastic reduction in the amount of calculation can be achieved.
In this system, however, the code vectors are generated as a linear combination of a small number of delta vectors that serve as fundamental vectors; therefore, the generated code vectors do not have components other than the delta vector components. More specifically, in a space where the vectors to be encoded are distributed (usually, 40- to 64-dimensional space), the code vectors can only be mapped in a subspace having a dimension corresponding at most to the number of delta vectors (usually, 8 to 10).
Accordingly, the tree-structure delta code book has had the problem that the quantization characteristic degrades as compared with the conventional code book free from structural constraints, even if the fundamental vectors (delta vectors) are well designed on the basis of the statistical distribution of the speech signal to be encoded.
Noting that, when the linear predictive filtering operation is performed on each code vector to evaluate the distance, the amplification is not uniform for all vector components but carries a certain bias, and that the contribution each delta vector makes to the code vectors in the tree-structure delta code book can be changed by changing the order of the delta vectors, the present applicant proposed, in Japanese Patent Application No. 3-515016, a method of improving the characteristic by using a tree-structure code book in which, each time the coefficients of the linear predictive filter are determined, a filtering operation is performed on each delta vector, the resulting powers (the lengths of the vectors) are compared, and the delta vectors are reordered in order of decreasing power.
However, with this method also, code vectors are generated from a limited number of delta vectors, as with the previous method, so that there is a limit to improving the characteristic. A further improvement in the characteristic is therefore demanded.
Another challenge for the speech encoder based on A-b-S vector quantization is to realize variable bit rate encoding. Variable bit rate encoding is an encoding scheme capable of varying the bit rate such that the encoding bit rate is adaptively varied according to situations such as the remaining capacity of the transmission path, significance of the speech source, etc., to achieve a greater encoding efficiency as a whole.
If the vector quantization system is to be applied to variable bit rate speech encoding, it is necessary to prepare a code book containing the patterns for each transmission rate, and to perform encoding by switching code books according to the desired transmission rate.
In the case of conventional code books, each constructed from a simple arrangement of code vectors, N×M words of memory, corresponding to the product of the vector dimension (N) and the number of patterns (M), would be necessary to store each code book. Since the number of patterns M is proportional to 2^n, where n is the bit length of a code vector index, an enormous amount of memory is required in order to increase the variable range of the transmission rate or to control the transmission rate in smaller increments.
Also, in variable bit rate transmission, there are cases in which the rate of the transmission signals has to be reduced according to a request from the transmission network side even after encoding. In such cases, the decoder has to reproduce the speech signal from bit-dropped information, i.e. information with some bits dropped from the encoded information generated by the encoder.
For scalar quantization, which is inferior in efficiency to vector quantization, various techniques have so far been devised to cope with bit drop situations, for example, by performing control so that bits are dropped from the LSB side in increasing order of significance, or by constructing a high bit rate quantizer in such a manner as to contain the quantization levels of a low bit rate quantizer (embedded encoding).
However, in the case of the vector quantization system that uses conventional code books constructed from a simple arrangement of code vectors, since no structuring schemes are employed in the construction of the code books, there are no differences in significance among index bits for a code vector (whether the dropped bit is the LSB or MSB, the result will be the same in that an entirely different vector is called), and the same techniques as employed for scalar quantization cannot be used. The resulting problem is that a bit drop situation will cause a significant degradation in sound quality.
DISCLOSURE OF THE INVENTION
Accordingly, it is a first object of the invention to provide a speech encoding method and apparatus that use a tree-structure delta code book achieving a further improvement on the above-described system.
It is another object of the invention to provide a speech encoding method and apparatus employing vector quantization which do not require an enormous amount of memory for the code book and are capable of coping with bit drop situations.
According to the present invention, there is provided a speech encoding method by which an input speech signal vector is encoded using an index assigned to a code vector that, among premapped code vectors, is closest in distance to the input speech signal vector, comprising the steps of:
a) storing a plurality of differential code vectors;
b) multiplying each of the differential code vectors by a matrix of a linear predictive synthesis filter;
c) evaluating the power amplification ratio of each differential code vector multiplied by the matrix;
d) reordering the differential code vectors, each multiplied by the matrix, in decreasing order of the evaluated power amplification ratio;
e) selecting from among the reordered vectors a prescribed number of vectors in decreasing order of the evaluated power amplification ratio, the largest ratio first;
f) evaluating the distance between the input speech signal vector and each of linear-predictive-synthesis-filtered code vectors formed by sequentially adding and subtracting the selected vectors through a tree structure; and
g) determining the code vector for which the evaluated distance is the smallest.
According to the present invention, there is also provided a speech encoding apparatus by which an input speech signal vector is encoded using an index assigned to a code vector that, among premapped code vectors, is closest in distance to the input speech signal vector, comprising:
means for storing a plurality of differential code vectors;
means for multiplying each of the differential code vectors by a matrix of a linear predictive synthesis filter;
means for evaluating the power amplification ratio of each differential code vector multiplied by the matrix;
means for reordering the differential code vectors, each multiplied by the matrix, in decreasing order of the evaluated power amplification ratio;
means for selecting from among the reordered vectors a prescribed number of vectors in decreasing order of the evaluated power amplification ratio, the largest ratio first;
means for evaluating the distance between the input speech signal vector and each of linear-predictive-synthesis-filtered code vectors formed by sequentially adding and subtracting the selected vectors through a tree structure; and
means for determining the code vector for which the evaluated distance is the smallest.
According to the present invention, there is also provided a variable-length speech encoding method by which an input speech signal vector is variable-length encoded using a variable-length code assigned to a code vector that, among premapped code vectors, is closest in distance to the input speech signal vector, comprising the steps of:
a) storing a plurality of differential code vectors;
b) evaluating the distance between the input speech signal vector and each of code vectors formed by sequentially performing additions and subtractions, working from the root of a tree structure, on the number of differential code vectors corresponding to a desired code length;
c) determining a code vector for which the evaluated distance is the smallest; and
d) determining a code, of the desired code length, to be assigned to the thus determined code vector.
According to the present invention, there is also provided a variable-length speech encoding apparatus by which an input speech signal vector is variable-length encoded using a variable-length code assigned to a code vector that, among premapped code vectors, is closest in distance to the input speech signal vector, comprising:
means for storing a plurality of differential code vectors;
means for evaluating the distance between the input speech signal vector and each of code vectors formed by sequentially performing additions and subtractions, working from the root of a tree structure, on the number of differential code vectors corresponding to a desired code length;
means for determining a code vector for which the evaluated distance is the smallest; and
means for determining a code, of the desired code length, to be assigned to the thus determined code vector.





BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating the concept of a speech sound generating system;
FIG. 2 is a block diagram illustrating the principle of a typical CELP speech encoding system;
FIG. 3 is a block diagram showing the configuration of a stochastic code book search process in A-b-S vector quantization according to the prior art;
FIG. 4 is a block diagram illustrating a model implementing an algorithm for the stochastic code book search process;
FIG. 5 is a block diagram for explaining a principle of the delta code book;
FIGS. 6A and 6B are diagrams for explaining a method of adaptation of a tree-structure code book;
FIGS. 7A, 7B, and 7C are diagrams for explaining the principles of the present invention;
FIG. 8 is a block diagram of a speech encoding apparatus according to the present invention; and
FIGS. 9A and 9B are diagrams for explaining a variable rate encoding method according to the present invention.





BEST MODE FOR CARRYING OUT THE INVENTION
There are two types of speech sound: voiced and unvoiced. Voiced sounds are generated by a pulse sound source caused by vocal cord vibration; the characteristic of the vocal tract, such as the throat and mouth, of each individual speaker is appended to the pulse sounds to form speech sounds. Unvoiced sounds are generated without vibrating the vocal cords: the sound source is a Gaussian noise train which is forced through the vocal tract to form speech sounds. Therefore, the speech sound generating mechanism can be modelled, as shown in FIG. 1, by a pulse sound generator PSG that generates voiced sounds, a noise sound generator NSG that generates unvoiced sounds, and a linear predictive coding filter LPCF that appends the vocal tract characteristic to the signals output from the respective generators. Human voice has a pitch periodicity which corresponds to the period of the pulse train output from the pulse sound generator and which varies with each individual speaker and the way he or she speaks.
It follows that if the period of the pulse sound generator and the noise train of the noise sound generator corresponding to an input speech sound can be determined, the input speech sound can be encoded using the pulse period and the code data (index) by which the noise train is identified.
Here, as shown in FIG. 2, vectors P obtained by delaying a past value (bP+gC) by different numbers of samples are stored in an adaptive code book 11, and a vector bP, obtained by multiplying each vector P from the adaptive code book 11 by a gain b, is input to a linear predictive filter 12 for filtering; then, the result of the filtering, bAP, is subtracted from the input speech signal X, and the resulting error signal is fed to an error power evaluator 13 which then selects from the adaptive code book 11 a vector P that minimizes the error power and thereby determines the period.
After that, or concurrently with the above operation, each code vector C from a stochastic code book 1, in which a plurality of noise trains (each represented by an N-dimensional vector) are prestored, is multiplied by a gain g, and the result is input to a linear predictive synthesis filter 3 for processing; then, a code vector that minimizes the error between the reconstructed signal vector gAC output from the filter 3 and the input signal vector X (an N-dimensional vector) is determined by an error power evaluator 5. In this manner, the speech sound can be encoded by using the period and the data (index) that specifies the code vector. The above description given with reference to FIG. 2 has dealt with the example in which the vectors AC and AP are orthogonal to each other; in cases other than the illustrated example, a code vector is determined which minimizes the error relative to the vector X - bAP representing the difference between the input signal vector X and the vector bAP.
FIG. 3 shows the configuration of a speech transmission (encoding) system that uses A-b-S vector quantization; the configuration shown corresponds to the lower half of FIG. 2. More specifically, 1 is a stochastic code book that stores M N-dimensional code vectors C, 2 is an amplifier of gain g, 3 is a linear predictive filter whose coefficients are determined by a linear predictive analysis based on the input signal X and which performs linear predictive filtering on the output of the amplifier 2, 4 is an error generator that outputs the error of the reproduced signal vector output from the linear predictive filter 3 relative to the input signal vector, and 5 is an error power evaluator that evaluates the error and obtains the code vector that minimizes it.
In this A-b-S quantization, unlike conventional vector quantization, each code vector (C) from the stochastic code book 1 is first multiplied by the optimum gain (g), and then filtered through the linear predictive filter 3, and the resulting reproduced signal vector (gAC) is fed into the error generator 4 which generates an error signal (E) representing the error relative to the input signal vector (X); then, using the power of the error signal as an evaluation function (a distance measure), the error power evaluator 5 searches the stochastic code book 1 for a code vector that minimizes the error power. Using the code (index) that specifies the thus obtained code vector, the input signal is encoded for transmission.
The error power at this time is given by
|E|^2 = |X - gAC|^2   (1)
The optimum code vector and gain g are so determined as to minimize the error power shown by Equation (1). Since the power varies with the sound level of the voice, the power of the reproduced signal is matched to the power of the input signal by optimizing the gain g. The optimum gain can be obtained by partially differentiating Equation (1) with respect to g.
d|E|^2 / dg = 0
g is given by
g = (X^T AC) / ((AC)^T (AC))   (2)
Substituting g into Equation (1) gives
|E|^2 = |X|^2 - (X^T AC)^2 / ((AC)^T (AC))   (3)
When the cross-correlation between the input signal X and the output AC of the linear predictive filter 3 is denoted by R_XC, and the autocorrelation of the output AC is denoted by R_CC, the two are respectively expressed as
R_XC = X^T AC   (4)
R_CC = (AC)^T (AC)   (5)
Since the code vector C that minimizes the error power given by Equation (3) maximizes the second term on the right-hand side of Equation (3), the code vector C can be expressed as
C = argmax (R_XC^2 / R_CC)   (6)
Using the cross-correlation and autocorrelation that satisfy Equation (6), the optimum gain, from Equation (2), is given by
g = R_XC / R_CC   (7)
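For illustration, the exhaustive search defined by Equations (1) to (7) can be sketched in Python with numpy as follows; the function and variable names are illustrative, not taken from the patent:

    import numpy as np

    def search_codebook(X, A, C):
        """Exhaustive A-b-S search over a stochastic code book.

        X : (N,) input signal vector
        A : (N, N) linear predictive filter matrix
        C : (M, N) code book, one code vector per row
        Returns the index of the best code vector and the optimum gain.
        """
        best_j, best_F, best_g = -1, -np.inf, 0.0
        for j, c in enumerate(C):
            AC = A @ c                  # filtered code vector
            R_xc = X @ AC               # cross-correlation, Eq. (4)
            R_cc = AC @ AC              # autocorrelation, Eq. (5)
            F = R_xc * R_xc / R_cc      # quantity maximized in Eq. (6)
            if F > best_F:
                best_j, best_F = j, F
                best_g = R_xc / R_cc    # optimum gain, Eq. (7)
        return best_j, best_g

Note that computing AC as a dense matrix product costs N^2 operations per code vector; the per-vector count of Np·N given below assumes the filtering is instead performed by the order-Np filter recursion.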
FIG. 4 is a block diagram illustrating a model implementing an algorithm for searching the stochastic code book, based on the above equations, for a code vector that minimizes the error power, and for encoding the input signal on the basis of the obtained code vector. The model shown comprises a calculator 6 for calculating the cross-correlation R_XC (= X^T AC), a calculator 7 for calculating the square of the cross-correlation R_XC, a calculator 8 for calculating the autocorrelation R_CC of AC, a calculator 9 for calculating R_XC^2 / R_CC, and an error power evaluator 5 for determining the code vector that maximizes R_XC^2 / R_CC, or in other words minimizes the error power, and outputting a code that specifies that code vector. The configuration is functionally equivalent to that shown in FIG. 3.
The above-described conventional code book search algorithm performs three basic functions: (1) the filtering of the code vector C, (2) the calculation of the cross-correlation R_XC, and (3) the calculation of the autocorrelation R_CC. When the order of the LPC filter 3 is denoted by Np, and the order of vector quantization (the code vector dimension) by N, the calculation amounts required in (1), (2), and (3) for each code vector are Np·N, N, and N, respectively. Therefore, the calculation amount required in the code book search for one code vector is (Np+2)·N.
A commonly used stochastic code book 1 has a dimension of about 40 and a size of about 1024 (N=40, M=1024), and the order of analysis of the LPC filter 3 is usually about 10. Therefore, the number of addition and multiplication operations required for one code book search amounts to
(10+2)·40·1024 ≈ 480×10^3
If such a code book search is to be performed for every subframe (5 msec) of speech encoding, it requires a processing capacity as large as 96 megaoperations per second (Mops); to realize real-time processing, several chips would be needed even if the fastest digital signal processors currently available (with a maximum computational capacity of 20 to 40 Mops) were used.
Furthermore, for storing and retaining such a stochastic code book 1 as a table, a memory capacity of N·M (= 40·1024 = 40K words) is required.
In particular, in the field of car telephones and portable telephones where the speech encoder based on A-b-S vector quantization has potential use, smaller equipment size and lower power consumption are essential conditions, and the enormous amount of calculation and large memory capacity requirements described above present a serious problem in implementing the speech encoder.
In view of the above situation, the present applicant proposed, in Japanese Patent Application No. 3-127669 (Japanese Patent Unexamined Publication No. 4-352200), the use of a tree-structure delta code book, as shown in FIG. 5, in place of the conventional stochastic code book, to realize a speech encoding method capable of reducing the amount of calculation required for stochastic code book searching and also the memory capacity required for storing the stochastic code book.
Referring to FIG. 5, an initial vector C_0 (= Δ_0), representing one reference noise train, and delta vectors Δ_1 to Δ_{L-1} (L = 10), representing (L-1) kinds (levels) of delta noise trains, are prestored in a delta code book 10, and the respective delta vectors Δ_1 to Δ_{L-1} are added to and subtracted from the initial vector C_0 at each level through a tree structure, thereby forming code vectors (code words) C_0 to C_1022 capable of representing (2^10 - 1) kinds of noise trains in the tree structure. Alternatively, a -C_0 vector (or a zero vector) is added to these vectors to form code vectors C_0 to C_1023 representing 2^10 noise trains.
In this manner, from the initial vector Δ_0 and the (L-1) kinds of delta vectors Δ_1 to Δ_{L-1} (L = 10) stored in the delta code book 10, 2^L - 1 (= 2^10 - 1 = M - 1) kinds of code vectors, or 2^L (= 2^10 = M) kinds, can be generated sequentially, and the memory capacity of the delta code book 10 can be reduced to L·N (= 10·N), a drastic reduction compared with the memory capacity M·N (= 1024·N) required for the conventional noise code book.
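To make the tree structure concrete, the following sketch (same illustrative Python conventions as above) expands the code book of FIG. 5 explicitly. The real encoder never performs this expansion; avoiding it is precisely the point of the recurrences derived next:

    import numpy as np

    def expand_tree(deltas):
        """Generate all 2**L - 1 code vectors of FIG. 5.

        deltas : (L, N) array; row 0 is the initial vector Delta_0 = C_0.
        The children of C_k, formed with the delta vector of the next
        level, are C_{2k+1} = C_k + Delta_i and C_{2k+2} = C_k - Delta_i.
        """
        L, N = deltas.shape
        C = np.empty((2**L - 1, N))
        C[0] = deltas[0]                      # C_0 = Delta_0
        for i in range(1, L):                 # Delta_i acts at level i+1
            for k in range(2**(i - 1) - 1, 2**i - 1):
                C[2*k + 1] = C[k] + deltas[i]
                C[2*k + 2] = C[k] - deltas[i]
        return C

Only the (L, N) array of delta vectors needs to be stored, which is exactly the L·N versus M·N saving just described.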
Using the tree-structure delta code book 10 of such configuration, the cross-correlations R_XC^(j) and autocorrelations R_CC^(j) for the code vectors C_j (j = 0 to 1022 or 1023) can be expressed by the following recurrence relations. That is, when each vector is expressed as
C_{2k+1} = C_k + Δ_i,   i = 1, 2, ..., L-1   (8)
or
C_{2k+2} = C_k - Δ_i,   2^{i-1} - 1 ≤ k < 2^i - 1   (9)
then
R_XC^(2k+1) = R_XC^(k) + X^T (AΔ_i)   (10)
or
R_XC^(2k+2) = R_XC^(k) - X^T (AΔ_i)   (11)
and
R_CC^(2k+1) = R_CC^(k) + (AΔ_i)^T (AΔ_i) + 2(AΔ_i)^T (AC_k)   (12)
or
R_CC^(2k+2) = R_CC^(k) + (AΔ_i)^T (AΔ_i) - 2(AΔ_i)^T (AC_k)   (13)
Thus, for the cross-correlation R_XC, once the cross-correlation X^T (AΔ_i) has been calculated for each delta vector Δ_i (i = 0 to L-1; Δ_0 = C_0), the cross-correlations R_XC^(j) for all code vectors C_j are obtained immediately by sequentially adding or subtracting X^T (AΔ_i) in accordance with recurrence relation (10) or (11), i.e., through the tree structure shown in FIG. 5. In the case of the conventional code book, a number of addition and multiplication operations amounting to
M·N (= 1024·N)
was required to calculate the cross-correlations for the code vectors of all noise trains. By contrast, in the case of the tree-structure code book, the cross-correlation R_XC^(j) is not calculated directly from each code vector C_j (j = 0, 1, ..., 2^L - 1), but by first calculating the cross-correlation relative to each delta vector Δ_j (j = 0, 1, ..., L-1) and then adding or subtracting the results sequentially. Therefore, the number of addition and multiplication operations can be reduced to
L·N (= 10·N)
thus achieving a drastic reduction in the number of operations.
For the orthogonal term (AΔ_i)^T (AC_k) in the third term of Equations (12) and (13), when C_k is expressed as
C_k = Δ_0 ± Δ_1 ± Δ_2 ± ... ± Δ_{i-1}
then
(AΔ_i)^T (AC_k) = (AΔ_i)^T (AΔ_0) ± (AΔ_i)^T (AΔ_1) ± ... ± (AΔ_i)^T (AΔ_{i-1})   (14)
Therefore, by calculating the cross-correlations (AΔ_i)^T (AΔ_j) (j = 0, 1, ..., i-1) between Δ_i and Δ_0, Δ_1, ..., Δ_{i-1}, and sequentially adding or subtracting the results in accordance with the tree structure of FIG. 5, the third term is obtained. Further, by calculating the autocorrelation (AΔ_i)^T (AΔ_i) of each delta vector Δ_i in the second term, and sequentially adding or subtracting the results in accordance with Equation (12) or (13), i.e., through the tree structure of FIG. 5, the autocorrelations R_CC^(j) of all the code vectors C_j are obtained immediately.
In the case of the conventional code book, the number of addition and multiplication operations amounting to
M·N (= 1024·N)
was required to calculate the autocorrelations. By contrast, in the case of the tree-structure code book, the autocorrelation R_CC^(j) is not calculated directly from each code vector C_j (j = 0, 1, ..., 2^L - 1), but from the autocorrelation of each delta vector Δ_j (j = 0, 1, ..., L-1) and the cross-correlations of all possible combinations of different delta vectors. Therefore, the number of addition and multiplication operations can be reduced to
L(L+1)·N/2 (= 55·N)
thus achieving a drastic reduction in the number of operations.
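To put the two counts side by side for the typical values used above (L = 10, M = 1024): the conventional code book requires M·N = 1024·N multiply-add operations for the cross-correlations and another 1024·N for the autocorrelations, about 2048·N in total, whereas the tree-structure code book requires 10·N + 55·N = 65·N, roughly a thirtyfold reduction.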
However, since codewords (code vectors) in such a tree-structure delta code book are all formed as a linear combination of delta vectors, the code vectors do not have components other than delta vector components. More specifically, in a space where the vectors to be encoded are distributed (usually, 40- to 64-dimensional space), the code vectors can only be mapped in a subspace having a dimension corresponding at most to the number of delta vectors (usually, 8 to 10).
Accordingly, the tree-structure delta code book has had the problem that the quantization characteristic degrades as compared with the conventional code book free from structural constraints, even if the fundamental vectors (delta vectors) are well designed on the basis of the statistical distribution of the speech signal to be encoded.
On the other hand, as previously described, the CELP speech encoder, for which the present invention is intended, performs vector quantization which, unlike conventional vector quantization, involves determining the optimum vector by evaluating distance in a signal vector space containing code vectors processed through a linear predictive filter having a filter transfer function A(z).
Therefore, as shown in FIGS. 6A and 6B, a residual signal space (the sphere shown in FIG. 6A for L=3) is converted by the linear predictive filter into a reproduced signal space; in general, at this time the directional components of the axes are not uniformly amplified, but are amplified with a certain distortion, as shown in FIG. 6B.
That is, the characteristic (A) of the linear predictive filter exhibits a different amplitude amplification characteristic for each delta vector which is a component element of the code book, and consequently, the resulting vectors are not distributed uniformly throughout the space.
Furthermore, in the tree-structure delta code book shown in FIG. 5, the contribution of each delta vector to the code vectors varies depending on the position of the delta vector in the delta code book 10. For example, the delta vector Δ_1 at the second position contributes to all the code vectors at the second and lower levels, and likewise the delta vector Δ_2 at the third position contributes to all the code vectors at the third and lower levels, whereas the delta vector Δ_9 contributes only to the code vectors at the 10th level. This means that the contribution of each delta vector to the code vectors can be changed by changing the order of the delta vectors.
Noting the above facts, the present applicant has shown, in Japanese Patent Application No. 3-515016, that the characteristic can be improved as compared with the conventional tree-structure code book having a biased distribution, when encoding is performed using a code book constructed in the following manner: each delta vector Δ_i is processed with the filter characteristic (A), the power |AΔ_i|^2 = (AΔ_i)^T (AΔ_i) is calculated for the resulting vector AΔ_i (the power of AΔ_i is equal to the amplification ratio if the delta vector is normalized), and the delta vectors are reordered in order of decreasing power by comparing the calculated results with each other.
However, in this case also, the number of delta vectors provided is equal to the number actually used, and encoding is performed simply by reordering them. This places a constraint on the freedom of the code book.
For example, to simplify the discussion, consider the case of L = 2, that is, a tree-structure delta code book wherein code vectors C_0, C_1 (= Δ_0 + Δ_1), and C_2 (= Δ_0 - Δ_1) are generated from the vector C_0 (= Δ_0) and the delta vector Δ_1. If the vectors used as Δ_0 and Δ_1 are limited to the unit vectors e_x and e_y, as shown in FIG. 7A, the generated code vectors are confined to the x-y plane indicated by oblique hatching even if the order is changed. On the other hand, when two vectors are selected from among three linearly independent unit vectors, e_x, e_y, and e_z, and used as Δ_0 and Δ_1, greater freedom is allowed in the selection of the subspace, as shown in FIGS. 7A to 7C.
Improvement of the Tree-Structure Delta Code Book
The present invention aims at a further improvement of the delta code book, which is achieved as follows. L' delta vector candidates (L' > L), larger in number than the L vectors actually used for the construction of the code book (L vectors = initial vector + (L-1) delta vectors), are provided; these candidates are reordered by performing the same operation as described above, and from them the desired number of delta vectors (L vectors) are selected in order of decreasing amplification ratio to construct the code book. The code book thus constructed provides greater freedom and contributes to improving the quantization characteristic.
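Under the same illustrative Python conventions as before, and assuming (as stated for the embodiment below) that every candidate is normalized so that the power of AΔ_i equals its amplification ratio, the selection step can be sketched as:

    import numpy as np

    def select_deltas(A, candidates, L):
        """Keep the L of L' candidate delta vectors amplified most by A.

        candidates : (Lp, N) array with Lp > L rows, each normalized
                     so that |Delta_i| = 1.
        Returns the chosen filtered vectors A·Delta_i, largest power first.
        """
        filtered = candidates @ A.T              # row i is A·Delta_i
        power = np.sum(filtered**2, axis=1)      # (A·Delta_i)^T (A·Delta_i)
        order = np.argsort(-power)               # decreasing amplification
        return filtered[order[:L]]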
The above description has dealt with the encoder; in the matching decoder, the same delta vector candidates as on the encoding side are provided and the same control is performed, so that a code book with the same contents as in the encoder is constructed, thereby maintaining matching with the encoder.
FIG. 8 is a block diagram showing one embodiment of a speech encoding apparatus according to the present invention based on the above concept. In this embodiment, the delta vector code book 10 stores and holds an initial vector C_0 (= Δ_0), representing one reference noise train, and delta vectors Δ_1 to Δ_{L'-1}, representing (L'-1) delta noise trains, larger in number than the (L-1) actually used. The initial vector C_0 and the delta vectors Δ_1 to Δ_{L'-1} are each N-dimensional vectors, formed by encoding the noise amplitudes of N samples generated in time series.
Also, in this embodiment, the linear predictive filter 3 is constructed from an IIR filter of order Np. An N×N matrix A, generated from the impulse response of this filter, is multiplied by each delta vector Δ_i to perform the filtering A on the delta vector Δ_i, and the resulting vector AΔ_i is output. The Np coefficients of the IIR filter vary in accordance with the input speech signal and are determined by a known method. More specifically, since there exists a correlation between adjacent samples of the input speech signal, a correlation coefficient between samples is obtained, from which a partial autocorrelation coefficient, known as the PARCOR coefficient, is obtained; then, from this PARCOR coefficient, the alpha coefficients of the IIR filter are determined, and using the impulse response train of the filter, the N×N matrix A is formed to perform the filtering on each vector Δ_i.
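As an illustration of this step, the sketch below assumes the usual all-pole synthesis filter 1 / (1 - Σ a_k z^(-k)); the patent states only that A is generated from the impulse response, so the filter form and the names here are assumptions:

    import numpy as np
    from scipy.linalg import toeplitz

    def impulse_response_matrix(alpha, N):
        """Build the N x N filtering matrix A from the alpha coefficients.

        alpha : (Np,) predictor coefficients a_1 .. a_Np of the assumed
                all-pole synthesis filter 1 / (1 - sum_k a_k z^-k).
        """
        h = np.zeros(N)
        h[0] = 1.0
        for n in range(1, N):                 # impulse response recursion
            h[n] = sum(alpha[k] * h[n - 1 - k]
                       for k in range(min(len(alpha), n)))
        # (A x)[n] = sum_m h[n - m] x[m], a lower-triangular Toeplitz form
        return toeplitz(h, np.zeros(N))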
The L' filtered vectors AΔ_i (i = 0, 1, ..., L'-1) are stored in a memory 40, and the power |AΔ_i|^2 = (AΔ_i)^T (AΔ_i) is evaluated in a power evaluator 42. Since each delta vector is normalized (|Δ_i|^2 = Δ_i^T Δ_i = 1), the degree of amplification through the filtering A is evaluated directly by just evaluating the power. Next, based on the evaluation results supplied from the power evaluator 42, the vectors are reordered in a sorting section 43 in order of decreasing power. In the example of FIG. 6B, the vectors are reordered as follows.
Δ_0 = e_z, Δ_1 = e_x, Δ_2 = e_y
The reordered vectors AΔ_i (i = 0, 1, ..., L'-1) total L' in number, but the subsequent encoding process uses only the L vectors AΔ_i (i = 0, 1, ..., L-1) actually needed.
Therefore, L vectors are selected in order of decreasing amplification ratio and stored in a selection memory 41. In the above example, Δ_0 = e_z and Δ_1 = e_x are selected from among the delta vectors listed above. Then, using the tree-structure delta code book constructed from these selected vectors, the encoding process is performed in exactly the same manner as previously described for the conventional tree-structure delta code book.
Details of the Encoding Process
The following describes in detail an encoder 48 that determines, from the input signal vector X and the tree-structure code book consisting of the vectors AΔ_0, AΔ_1, AΔ_2, ..., AΔ_{L-1} stored in the selection memory 41, the index of the code vector C that is closest in distance to the input signal vector X.
The encoder 48 comprises: a calculator 50 for calculating the cross-correlation X^T (AΔ_i) between the input signal vector X and each delta vector Δ_i; a calculator 52 for calculating the autocorrelation (AΔ_i)^T (AΔ_i) of each delta vector Δ_i; a calculator 54 for calculating the cross-correlations (AΔ_i)^T (AΔ_j) (j = 0, 1, ..., i-1) between delta vectors; a calculator 55 for calculating the orthogonal term (AΔ_i)^T (AC_k) from the output of the calculator 54; a calculator 56 for accumulating the cross-correlations of the delta vectors from the calculator 50 and calculating the cross-correlation R_XC between the input signal vector X and each code vector C; a calculator 58 for accumulating the autocorrelation (AΔ_i)^T (AΔ_i) of each delta vector Δ_i fed from the calculator 52 and each orthogonal term (AΔ_i)^T (AC_k) fed from the calculator 55, and calculating the autocorrelation of each code vector C; a calculator 60 for calculating R_XC^2 / R_CC; a smallest-error noise train determining device 62; and a speech encoder 64.
First, the parameter i indicating the tree-structure level under calculation is set to 0. In this state, the calculators 50 and 52 calculate and output X^T (AΔ_0) and (AΔ_0)^T (AΔ_0), respectively. The calculators 54 and 55 output 0. X^T (AΔ_0) and (AΔ_0)^T (AΔ_0), output from the calculators 50 and 52, are stored in the calculators 56 and 58 as the cross-correlation R_XC^(0) and the autocorrelation R_CC^(0), respectively, and are output. From R_XC^(0) and R_CC^(0), the calculator 60 calculates and outputs the value of F(X, C) = R_XC^2 / R_CC.
The smallest-error noise train determining device 62 compares the thus calculated F(X, C) with the maximum value Fmax (initial value 0) of the previous values of F(X, C); if F(X, C) > Fmax, Fmax is updated by taking F(X, C) as the new Fmax, and at the same time the stored code is updated by the code that specifies the noise train (code vector) providing this Fmax.
Next, the parameter i is updated from 0 to 1. In this state, the calculators 50 and 52 calculate and output X^T (AΔ_1) and (AΔ_1)^T (AΔ_1), respectively. The calculator 54 calculates and outputs (AΔ_1)^T (AΔ_0). The calculator 55 outputs the input value as the orthogonal term (AΔ_1)^T (AC_0). From the stored R_XC^(0) and the value of X^T (AΔ_1) output from the calculator 50, the calculator 56 calculates the values of the cross-correlations R_XC^(1) and R_XC^(2) at the second level in accordance with Equation (10) or (11); the calculated values are output and stored. From the stored R_CC^(0) and the values of (AΔ_1)^T (AΔ_1) and (AΔ_1)^T (AC_0) respectively output from the calculators 52 and 55, the calculator 58 calculates the values of the autocorrelations R_CC^(1) and R_CC^(2) at the second level in accordance with Equation (12) or (13); the values are output and stored. The operation of the calculator 60 and the smallest-error noise train determining device 62 is the same as when i = 0.
Next, the parameter i is updated from 1 to 2. In this state, the calculators 50 and 52 calculate and output X^T (AΔ_2) and (AΔ_2)^T (AΔ_2), respectively. The calculator 54 calculates the cross-correlations (AΔ_2)^T (AΔ_1) and (AΔ_2)^T (AΔ_0) of Δ_2 relative to Δ_1 and Δ_0, respectively. From these values, the calculator 55 calculates the orthogonal terms (AΔ_2)^T (AC_1) and (AΔ_2)^T (AC_2) in accordance with Equation (14), and outputs the results. From the stored R_XC^(1) and R_XC^(2) and the value of X^T (AΔ_2) fed from the calculator 50, the calculator 56 calculates the values of the cross-correlations R_XC^(3) to R_XC^(6) at the third level in accordance with Equation (10) or (11); the calculated values are output and stored. From the stored R_CC^(1) and R_CC^(2) and the values of (AΔ_2)^T (AΔ_2) and the orthogonal terms respectively output from the calculators 52 and 55, the calculator 58 calculates the values of the autocorrelations R_CC^(3) to R_CC^(6) at the third level in accordance with Equation (12) or (13); the calculated values are output and stored. The operation of the calculator 60 and the smallest-error noise train determining device 62 is the same as when i = 0 or 1.
The above process is repeated until the processing for i=L-1 is completed, upon which the speech encoder 64 outputs the latest code stored in the smallest-error noise train determining device 62 as the index of the code vector that is closest in distance to the input signal vector X.
When (AΔ_i)^T (AΔ_i) is calculated in the calculator 52, the calculation result from the power evaluator 42 can be used directly.
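The level-by-level procedure just described can be gathered into a single sketch, continuing the illustrative Python used earlier; Equations (10) to (14) replace all per-code-vector filtering, and the sign pattern of each node is kept in an explicit array so that Equation (14) can be evaluated:

    import numpy as np

    def tree_search(X, AD):
        """Find the code vector closest to X, as encoder 48 does.

        X  : (N,) input signal vector
        AD : (L, N) filtered, reordered, selected deltas; AD[i] = A·Delta_i
        Returns the index of the best code vector in the tree of FIG. 5.
        """
        L = AD.shape[0]
        xc_d = AD @ X                   # X^T (A·Delta_i), calculator 50
        gram = AD @ AD.T                # (A·Delta_i)^T (A·Delta_j), 52 and 54
        n = 2**L - 1
        R_xc, R_cc = np.empty(n), np.empty(n)
        signs = np.zeros((n, L))        # +/-1 composition of each C_k
        R_xc[0], R_cc[0] = xc_d[0], gram[0, 0]
        signs[0, 0] = 1.0
        best_j, best_F = 0, R_xc[0]**2 / R_cc[0]
        for i in range(1, L):
            for k in range(2**(i - 1) - 1, 2**i - 1):
                ortho = signs[k, :i] @ gram[i, :i]   # Eq. (14), calculator 55
                for child, s in ((2*k + 1, 1.0), (2*k + 2, -1.0)):
                    R_xc[child] = R_xc[k] + s * xc_d[i]                 # Eq. (10)/(11)
                    R_cc[child] = R_cc[k] + gram[i, i] + 2 * s * ortho  # Eq. (12)/(13)
                    signs[child] = signs[k]
                    signs[child, i] = s
                    F = R_xc[child]**2 / R_cc[child]   # calculator 60
                    if F > best_F:                     # determining device 62
                        best_j, best_F = child, F
        return best_j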
Variable Rate Encoding
Using the previously described tree-structure delta code book or the tree-structure delta code book improved by the present invention, variable rate encoding can be realized that does not require as much memory as is required for the conventional code book and is capable of coping with bit drop situations.
That is, a tree-structure delta code book having the structure shown in FIG. 9A and consisting of Δ_0, Δ_1, Δ_2, ..., is stored. If, of these vectors, encoding is performed using only the vector Δ_0 at the first level, so that the two code vectors
C_* = 0 (zero vector)
C_0 = Δ_0
are generated, as shown in FIG. 9B, then one-bit encoding is accomplished, with one bit of information indicating whether or not C_0 is selected as the index data.
If encoding is performed using the vectors Δ_0 and Δ_1 down to the second level, so that the four code vectors
C_* = 0
C_0 = Δ_0
C_1 = Δ_0 + Δ_1
C_2 = Δ_0 - Δ_1
are generated, then two-bit encoding is accomplished with two bits of information, one bit indicating whether C_0 is selected as the index data and the other specifying +Δ_1 or -Δ_1.
Likewise, using the vectors Δ_0, Δ_1, ..., Δ_{i-1} down to the ith level, i-bit encoding can be accomplished. Accordingly, by using one tree-structure delta code book containing Δ_0, Δ_1, ..., Δ_{L-1}, the bit length of the generated index data can be varied as desired within the range of 1 to L.
If variable bit rate encoding with 1 to L bits is to be realized using the conventional code book, the number of words in the required memory will be
N×(2^0 + 2^1 + ... + 2^L) = N×(2^{L+1} - 1)
where N is the vector dimension. By contrast, if the tree-structure delta code book of FIG. 9A is used as shown in FIG. 9B, the number of words in the required memory will be
N×L
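For the typical values N = 40 and L = 10, this is the difference between 40×(2^11 - 1) = 81,880 words for the conventional approach and 40×10 = 400 words for the tree-structure delta code book.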
Any of the tree-structure delta code books described so far may be used here: the one wherein the vectors are not reordered, the one wherein the delta vectors are reordered according to the amplification ratio by A, or the one wherein L delta vectors are selected for use from among L' candidates.
Variable bit rate control can be accomplished easily by stopping the processing in the encoder 48 at the level corresponding to the desired bit length. For example, for four-bit encoding, the encoder 48 is controlled to perform the above-described processing for i = 0, 1, 2, and 3.
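In the tree_search sketch given earlier, this truncation amounts to bounding the level loop, since the first b levels of the tree contain exactly the 2^b - 1 nonzero code vectors of a b-bit code; the wrapper below is hypothetical, and the zero vector C_* is assumed to be handled separately:

    def tree_search_at_rate(X, AD, bits):
        """Encode with a `bits`-bit index: search only the first `bits`
        levels of the tree, i.e. stop encoder 48 at i = bits - 1."""
        return tree_search(X, AD[:bits])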
Embedded Encoding
Embedded encoding is an encoding scheme capable of reproducing voice at the decoder even if some of the bits are dropped along the transmission channel. In variable rate encoding using the above tree-structure delta code book, this can be accomplished by constructing the encoding scheme so that, if any bit is dropped, the affected code vector is reproduced as the code vector of its parent or ancestor in the tree structure. For example, in a four-bit encoding system [C_0, C_1, ..., C_14], if one bit is dropped, C_13 and C_14 are reproduced as C_6 in a three-bit code, and C_11 and C_12 as C_5 in a three-bit code. In this manner, speech sound can be reproduced without significant degradation in sound quality, since code vectors having a parent-child relationship have relatively close values.
Tables 1 to 4 show an example of such an encoding scheme.
TABLE 1 (transmitted bits: 1)
code vector     transmitted code
C_*             0
C_0             1

TABLE 2 (transmitted bits: 2)
code vector     transmitted code
C_*             00
C_0             01
C_1             11
C_2             10

TABLE 3 (transmitted bits: 3)
code vector     transmitted code
C_*             000
C_0             001
C_1             011
C_2             010
C_3             111
C_4             110
C_5             101
C_6             100

TABLE 4 (transmitted bits: 4)
code vector     transmitted code
C_*             0000
C_0             0001
C_1             0011
C_2             0010
C_3             0111
C_4             0110
C_5             0101
C_6             0100
C_7             1111
C_8             1110
C_9             1101
C_10            1100
C_11            1011
C_12            1010
C_13            1001
C_14            1000
In the case of 4 bits, for example, the above encoding scheme is set as follows.
C_11 = Δ_0 - Δ_1 + Δ_2 + Δ_3 has four delta vector elements whose signs are (+, -, +, +) in decreasing order of significance, and is therefore expressed as "1011".
C_2 = Δ_0 - Δ_1 has only two delta vector elements, whose signs are (+, -) in this order. The code in this case is treated as equivalent to (0, 0, +, -) and expressed as "0010".
Table 5 shows how the thus encoded information is reproduced when a one-bit drop has occurred, reducing 4 bits to 3 bits.
TABLE 5 (one-bit drop: 4 bits to 3 bits)
encode (4 bits)    transmission channel (bit drop)    decode (3 bits)
C_*     0000       0000 → 000                         000    C_*
C_0     0001       0001 → 000                         000    C_*
C_1     0011       0011 → 001                         001    C_0
C_2     0010       0010 → 001                         001    C_0
C_3     0111       0111 → 011                         011    C_1
C_4     0110       0110 → 011                         011    C_1
C_5     0101       0101 → 010                         010    C_2
C_6     0100       0100 → 010                         010    C_2
C_7     1111       1111 → 111                         111    C_3
C_8     1110       1110 → 111                         111    C_3
C_9     1101       1101 → 110                         110    C_4
C_10    1100       1100 → 110                         110    C_4
C_11    1011       1011 → 101                         101    C_5
C_12    1010       1010 → 101                         101    C_5
C_13    1001       1001 → 100                         100    C_6
C_14    1000       1000 → 100                         100    C_6
As can be seen from Table 5 in conjunction with FIG. 9A, when a one-bit drop occurs, the affected code is reproduced as the vector one level upward.
When two bits are dropped, the code is reconstructed as shown in Table 6.
TABLE 6 (two-bit drop: 4 bits to 2 bits)
encode (4 bits)    transmission channel (bit drop)    decode (2 bits)
C_*     0000       0000 → 00                          00    C_*
C_0     0001       0001 → 00                          00    C_*
C_1     0011       0011 → 00                          00    C_*
C_2     0010       0010 → 00                          00    C_*
C_3     0111       0111 → 01                          01    C_0
C_4     0110       0110 → 01                          01    C_0
C_5     0101       0101 → 01                          01    C_0
C_6     0100       0100 → 01                          01    C_0
C_7     1111       1111 → 11                          11    C_1
C_8     1110       1110 → 11                          11    C_1
C_9     1101       1101 → 11                          11    C_1
C_10    1100       1100 → 11                          11    C_1
C_11    1011       1011 → 10                          10    C_2
C_12    1010       1010 → 10                          10    C_2
C_13    1001       1001 → 10                          10    C_2
C_14    1000       1000 → 10                          10    C_2
In this case, the affected code is reproduced as the vector of its ancestor two levels upward.
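The mapping behind Tables 1 to 6 can be stated compactly: the transmitted code is a 1 for Δ_0 followed by one sign bit per level (+ as 1, - as 0), left-padded with zeros to the transmitted width, so that dropping trailing bits lands on an ancestor. A sketch under these assumptions:

    def transmit_code(signs, width):
        """Embedded code of Tables 1 to 4.

        signs : sign pattern of the code vector, e.g. [+1, -1, +1, +1]
                for C_11 = D0 - D1 + D2 + D3; [] encodes C_*.
        """
        bits = "".join("1" if s > 0 else "0" for s in signs)
        return bits.rjust(width, "0")

    assert transmit_code([+1, -1, +1, +1], 4) == "1011"   # C_11 in Table 4
    assert transmit_code([+1, -1, +1], 3) == "101"        # its parent C_5, Table 3
    # a one-bit drop reproduces the parent, as in Table 5:
    assert transmit_code([+1, -1, +1, +1], 4)[:-1] == "101"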
Tables 7 to 10 show another example of the embedded encoding scheme of the present invention.
TABLE 7 (transmitted bits: 1)
code vector     transmitted code
C_*             0
C_0             1

TABLE 8 (transmitted bits: 2)
code vector     transmitted code
C_*             00
C_0             01
C_1             10
C_2             11

TABLE 9 (transmitted bits: 3)
code vector     transmitted code
C_*             000
C_0             001
C_1             010
C_2             011
C_3             100
C_4             101
C_5             110
C_6             111

TABLE 10 (transmitted bits: 4)
code vector     transmitted code
C_*             0000
C_0             0001
C_1             0010
C_2             0011
C_3             0100
C_4             0101
C_5             0110
C_6             0111
C_7             1000
C_8             1001
C_9             1010
C_10            1011
C_11            1100
C_12            1101
C_13            1110
C_14            1111
In this encoding scheme also, when one bit is dropped, the parent vector of the affected vector is substituted, and when two bits are dropped, the ancestor vector two levels upward is substituted.
Claims
  • 1. A speech encoding method by which an input speech signal vector is encoded using an index assigned to a code vector that, among predetermined code vectors, is closest in distance to said input speech signal vector, comprising the steps of:
  • a) storing a plurality of differential code vectors having a tree structure;
  • b) multiplying each of said differential code vectors by a matrix of a linear predictive filter;
  • c) evaluating a power amplification ratio of each differential code vector multiplied by said matrix;
  • d) reordering the differential code vectors, each multiplied by said matrix, in decreasing order of said evaluated power amplification ratio;
  • e) selecting from among said reordered vectors a prescribed number of vectors in decreasing order of said evaluated power amplification ratio, the largest ratio first, the number of the selected vectors being smaller than a number of the reordered vectors;
  • f) evaluating the distance between said input speech signal vector and each of linear-predictive-filtered code vectors that are to be formed by sequentially adding and subtracting said selected vectors through the tree structure; and
  • g) determining the code vector for which said evaluated distance is the smallest.
  • 2. A method according to claim 1, wherein each of said differential code vectors is normalized.
  • 3. A method according to claim 1, wherein
  • said step f) includes: calculating a cross-correlation R_XC between said input speech signal vector and each of said linear-predictive-filtered code vectors by calculating the cross-correlation between said input speech signal vector and each of said selected vectors and by sequentially performing additions and subtractions through the tree structure; calculating an autocorrelation R_CC of each of said linear-predictive-filtered code vectors by calculating the autocorrelation of each of said selected vectors and the cross-correlation of every possible combination of different vectors and by sequentially performing additions and subtractions through the tree structure; and calculating the quotient of a square of the cross-correlation R_XC by the autocorrelation R_CC, R_XC^2 / R_CC, for each of said code vectors, and
  • said step g) includes determining the code vector that maximizes the value of R_XC^2 / R_CC, as the code vector that is closest in distance to said input speech signal vector.
  • 4. A speech encoding apparatus by which an input speech signal vector is encoded using an index assigned to a code vector that, among predetermined code vectors, is closest in distance to said input speech signal vector, comprising:
  • means for storing a plurality of differential code vectors having a tree structure;
  • means for multiplying each of said differential code vectors by a matrix of a linear predictive filter;
  • means for evaluating a power amplification ratio of each differential code vector multiplied by said matrix;
  • means for reordering the differential code vectors, each multiplied by said matrix, in decreasing order of said evaluated power amplification ratio;
  • means for selecting from among said reordered vectors a prescribed number of vectors in decreasing order of said evaluated power amplification ratio, the largest ratio first, the number of the selected vectors being smaller than a number of the reordered vectors;
  • means for evaluating the distance between said input speech signal vector and each of linear-predictive-filtered code vectors that are to be formed by sequentially adding and subtracting said selected vectors through the tree structure; and
  • means for determining the code vector for which said evaluated distance is the smallest.
  • 5. An apparatus according to claim 4, wherein each of said differential code vectors is normalized.
  • 6. An apparatus according to claim 4, wherein
  • said distance evaluation means includes: means for calculating a cross-correlation R_XC between said input speech signal vector and each of said linear-predictive-filtered code vectors by calculating the cross-correlation between said input speech signal vector and each of said selected vectors and by sequentially performing additions and subtractions through the tree structure; means for calculating an autocorrelation R_CC of each of said linear-predictive-filtered code vectors by calculating the autocorrelation of each of said selected vectors and the cross-correlation of every possible combination of different vectors and by sequentially performing additions and subtractions through the tree structure; and means for calculating the quotient of a square of the cross-correlation R_XC by the autocorrelation R_CC, R_XC^2 / R_CC, for each of said code vectors, and
  • said code vector determining means includes means for determining the code vector that maximizes the value of R_XC^2 / R_CC, as the code vector that is closest in distance to said input speech signal vector.
  • 7. A variable-length speech encoding method by which an input speech signal vector is variable-length encoded using a variable-length code assigned to a code vector that, among predetermined code vectors, is closest in distance to said input speech signal vector, comprising the steps of:
  • a) storing a plurality of differential code vectors having a tree structure;
  • b) evaluating a distance between said input speech signal vector and each of the code vectors that are to be formed by sequentially performing additions and subtractions with regard to differential code vectors the number of which corresponds to a variable code length, working from a root of the tree structure;
  • c) determining a code vector for which said evaluated distance is the smallest; and
  • d) determining a code, of the variable code length, to be assigned to said determined code vector.
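A minimal sketch of the variable-length search of claim 7, again with hypothetical names and bit conventions: every node of the tree, not only the leaves, is a candidate code vector, and the code emitted for a node is simply its branch path from the root, one bit per level. A better match deeper in the tree costs more bits, so the encoder can trade rate against distortion frame by frame.

```python
import numpy as np

def variable_length_search(x, filtered_deltas, max_depth):
    """Score every node of the sign tree up to max_depth and return the
    bit path of the best one. Here bit 1 encodes adding the level's
    filtered delta and bit 0 subtracting it (an assumed convention)."""
    best = {"score": -np.inf, "bits": None}

    def descend(k, vec, bits):
        if k > 0:                   # the root itself carries no code vector
            r_xc, r_cc = x @ vec, vec @ vec
            score = r_xc * r_xc / r_cc
            if score > best["score"]:
                best["score"], best["bits"] = score, bits
        if k == max_depth:
            return
        fd = filtered_deltas[k]
        descend(k + 1, vec + fd, bits + [1])
        descend(k + 1, vec - fd, bits + [0])

    descend(0, np.zeros_like(filtered_deltas[0]), [])
    return best["bits"], best["score"]
```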
  • 8. A method according to claim 7, further comprising the step of multiplying each of said differential code vectors by a matrix of a linear predictive filter, wherein in said step b) the distance is evaluated between said input speech signal vector and each of the linear-predictive-filtered code vectors that are to be formed by sequentially adding and subtracting the differential code vectors, each multiplied by said matrix, through the tree structure.
  • 9. A method according to claim 8, wherein
  • said step b) includes: calculating a cross-correlation R.sub.XC between said input speech signal vector and each of said linear-predictive-filtered code vectors by calculating the cross-correlation between said input speech signal vector and each of said differential code vectors multiplied by said matrix and by sequentially performing additions and subtractions through the tree structure; calculating an autocorrelation R.sub.CC of each of said linear-predictive-filtered code vectors by calculating the autocorrelation of each of said differential code vectors multiplied by said matrix and the cross-correlation of every possible combination of different vectors and by sequentially performing additions and subtractions through the tree structure; and calculating the quotient of a square of the cross-correlation R.sub.XC by the autocorrelation R.sub.CC, R.sub.XC.sup.2 /R.sub.CC, for each of said code vectors, and
  • said step c) includes determining the code vector that maximizes the value of R.sub.XC.sup.2 /R.sub.CC, as the code vector that is closest in distance to said input speech signal vector.
  • 10. A method according to claim 9, further comprising the steps of:
  • evaluating a power amplification ratio of each differential code vector multiplied by said matrix; and
  • reordering the differential code vectors, each multiplied by said matrix, in decreasing order of said evaluated power amplification ratio;
  • wherein in said step b) the additions and subtractions are performed in the thus reordered sequence through the tree structure.
  • 11. A method according to claim 10, further comprising the step of selecting from among said reordered vectors a prescribed number of vectors in decreasing order of said evaluated power amplification ratio, the largest ratio first, wherein in said step b) the additions and subtractions are performed on said selected vectors through the tree structure.
  • 12. A method according to claim 7, wherein codes are assigned to said code vectors in such a manner that, when one bit is dropped from the code assigned to any of said code vectors, the resulting code is the code assigned to the code vector corresponding to the parent thereof in the tree structure.
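Under the path-per-bit convention sketched above, the assignment of claim 12 needs no extra bookkeeping: a code is the sign path from the root, so dropping its last bit yields exactly the code of the parent code vector. One practical consequence, shown in this hypothetical decoder, is that a truncated code still decodes to a coarser approximation on the same tree path.

```python
import numpy as np

def decode(code_bits, deltas):
    """Rebuild a code vector from its variable-length code; because
    dropping the last bit yields the parent's code (claim 12), any
    prefix of code_bits decodes to an ancestor of the full vector."""
    c = np.zeros_like(deltas[0])
    for k, bit in enumerate(code_bits):
        c = c + deltas[k] if bit else c - deltas[k]
    return c
```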
  • 13. A variable-length speech encoding apparatus by which an input speech signal vector is variable-length encoded using a variable-length code assigned to a code vector that, among predetermined code vectors, is closest in distance to said input speech signal vector, comprising:
  • means for storing a plurality of differential code vectors having a tree structure;
  • means for evaluating a distance between said input speech signal vector and each of the code vectors that are to be formed by sequentially performing additions and subtractions with regard to differential code vectors the number of which corresponds to a variable code length, working from a root of the tree structure;
  • means for determining a code vector for which said evaluated distance is the smallest; and
  • means for determining a code, of the variable code length, to be assigned to said determined code vector.
  • 14. An apparatus according to claim 13, further comprising means for multiplying each of said differential code vectors by a matrix of a linear predictive filter, wherein said distance evaluating means evaluates the distance between said input speech signal vector and each of the linear-predictive-filtered code vectors that are to be formed by sequentially adding and subtracting the differential code vectors, each multiplied by said matrix, through the tree structure.
  • 15. An apparatus according to claim 14, wherein
  • said distance evaluating means includes: means for calculating a cross-correlation R.sub.XC between said input speech signal vector and each of said linear-predictive-filtered code vectors by calculating the cross-correlation between said input speech signal vector and each of said differential code vectors multiplied by said matrix and by sequentially performing additions and subtractions through the tree structure; means for calculating an autocorrelation R.sub.CC of each of said linear-predictive-filtered code vectors by calculating the autocorrelation of each of said differential code vectors multiplied by said matrix and the cross-correlation of every possible combination of different vectors and by sequentially performing additions and subtractions through the tree structure; and means for calculating the quotient of a square of the cross-correlation R.sub.XC by the autocorrelation R.sub.CC, R.sub.XC.sup.2 /R.sub.CC, for each of said code vectors, and
  • said code vector determining means includes means for determining the code vector that maximizes the value of R.sub.XC.sup.2 /R.sub.CC, as the code vector that is closest in distance to said input speech signal vector.
  • 16. An apparatus according to claim 15, further comprising:
  • means for evaluating a power amplification ratio of each differential code vector multiplied by said matrix; and
  • means for reordering the differential code vectors, each multiplied by said matrix, in decreasing order of said evaluated power amplification ratio;
  • wherein said distance evaluating means performs the additions and subtractions in the thus reordered sequence through the tree structure.
  • 17. An apparatus according to claim 15, further comprising means for selecting from among said reordered vectors a prescribed number of vectors in decreasing order of said evaluated power amplification ratio, the largest ratio first, wherein said distance evaluating means performs the additions and subtractions on said selected vectors through the tree structure.
  • 18. An apparatus according to claim 13, wherein codes are assigned to said code vectors in such a manner that, when one bit is dropped from the code assigned to any of said code vectors, the resulting code is the code assigned to the code vector corresponding to the parent thereof in the tree structure.
Priority Claims (1)
Number Date Country
4-246491 Sep 1992 JPX
Parent Case Info

This application is a continuation of application Ser. No. 08/244,068, filed as PCT/JP93/01323 on Sep. 16, 1993, published as WO94/07239 on Mar. 31, 1994, and now abandoned.

US Referenced Citations (2)
Number Name Date
5323486 Taniguchi et al. Jun 1994
5359696 Gerson et al. Oct 1994
Foreign Referenced Citations (10)
Number Date Country
59-012499 Jan 1984 JPX
61-184928 Aug 1986 JPX
2-055400 Feb 1990 JPX
4-039679 Jun 1992 JPX
4-352200 Dec 1992 JPX
4-344699 Dec 1992 JPX
5-088698 Apr 1993 JPX
5-158500 Jun 1993 JPX
5-210399 Aug 1993 JPX
5-232996 Sep 1993 JPX
Continuations (1)
Relation Number Date
Parent 244068 May 1994