Claims
- 1. A method for robust pattern recognition, comprising the steps of:
(a) generating N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) combining said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed in accordance with the equation: p(x1, x2, . . . , xN|sj) = f_n{K + [w1·p(x1|sj)^q + w2·p(x2|sj)^q + . . . + wN·p(xN|sj)^q]^(1/q)}, where: f_n is one of an exponential function exp( ) and a logarithmic function log( ); sj is a label for a class j; N is greater than or equal to 2; p(x1, x2, . . . , xN|sj) is the conditional probability of feature vectors x1, x2, . . . , xN given that they are generated by said class j; K is a normalization constant; w1, w2, . . . , wN are weights assigned to x1, x2, . . . , xN, respectively, according to confidence levels therein; and q is a real number corresponding to a desired combination function.
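As an illustration only (not part of the claims), the combination equation of step (b) can be sketched in Python; the function name `combine_likelihoods` and its default arguments are assumptions made for the example, not terms taken from the specification.

```python
import math

def combine_likelihoods(likelihoods, weights, q=1.0, K=0.0, f_n=math.exp):
    """Sketch of the claimed rule:
    p(x1..xN|sj) = f_n( K + [ sum_i w_i * p(xi|sj)^q ]^(1/q) ),
    where f_n may be math.exp or math.log per the claim."""
    inner = sum(w * (p ** q) for w, p in zip(weights, likelihoods))
    return f_n(K + inner ** (1.0 / q))

# Two feature streams, equal confidence weights, q = 1 (plain weighted sum):
score = combine_likelihoods([0.2, 0.4], [0.5, 0.5], q=1.0, K=0.0, f_n=math.exp)
```

With q = 1 and K = 0 the rule reduces to exp of the weighted sum of per-stream likelihoods; other choices of q interpolate between averaging and max-style behavior.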
- 2. The method of claim 1, wherein f_n is said logarithmic function.
- 3. The method of claim 2, wherein K is approximately 100.
- 4. The method of claim 1, wherein f_n is said exponential function.
- 5. The method of claim 4, wherein step (b) further comprises:
ranking all classes for each of said feature vectors; and generating a merged rank list by picking that class from among each of said feature vectors which yields a highest rank; whereby it is possible to discriminate among correct and incorrect ones of said classes.
- 6. The method of claim 4, further comprising the additional step of, prior to step (b), evaluating a given one of said feature vectors, xk, via information theory-based techniques to determine whether said given one of said feature vectors contains valuable information and should be combined.
- 7. The method of claim 6, wherein said evaluating step comprises the sub-steps of:
computing mutual information, I, in accordance with: I(xm, xk) = Σ Σ p(xm, xk) log((p(xm, xk))/(p(xm)p(xk))), where: the first summation is over xm ∈ XM and the second summation is over xk ∈ XK, k ≠ m; xm is another given one of said N sets of feature vectors; and XM and XK are sets of all similar feature vectors computed on all training data; and selecting said given feature vector, xk, when I is less than a preselected threshold; whereby complementary feature vectors are combined for maximum benefit.
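A minimal sketch of the mutual-information test of claim 7, assuming the feature streams have been quantized to discrete symbols so the probabilities can be estimated by counting co-occurrences; the function names are illustrative, not from the specification.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # I(X;Y) = sum over (x, y) of p(x, y) * log( p(x, y) / (p(x) p(y)) ),
    # with probabilities estimated from co-occurrence counts.
    n = len(xs)
    p_xy = Counter(zip(xs, ys))
    p_x, p_y = Counter(xs), Counter(ys)
    total = 0.0
    for (x, y), count in p_xy.items():
        joint = count / n
        total += joint * math.log(joint * n * n / (p_x[x] * p_y[y]))
    return total

def is_complementary(xs, ys, threshold):
    # Per claim 7: keep the candidate stream only when its shared
    # information with the other stream is below a preselected threshold.
    return mutual_information(xs, ys) < threshold
```

Identical streams yield maximal shared information, while statistically independent streams yield I near 0, which is why a low-I stream is the one worth combining.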
- 8. The method of claim 4, wherein:
said observation vectors are frames of speech; said feature vectors are acoustic feature vectors; and said pattern is a time waveform corresponding to speech, such that x1, x2, . . . , xN can be represented as x⃗(t), where t is time.
- 9. The method of claim 8, wherein said acoustic feature vectors include at least two of mel cepstra, LDA, centroids, perceptual linear prediction (PLP), LPC cepstra, multiple spectral bands, linear transformations, nonlinear transformations, maximum likelihood linear transformations (MLLT), principal component analysis (PCA), and vocal tract length normalized features (VTL).
- 10. The method of claim 4, wherein N=2.
- 11. The method of claim 4, wherein N>2.
- 12. The method of claim 4, wherein the sum of all of said weights w1, w2, . . . wN is substantially equal to 1.
- 13. The method of claim 4, wherein all of said weights w1, w2, . . . wN are substantially equal.
- 14. The method of claim 4, wherein at least some of said weights w1, w2, . . . wN are not equal.
- 15. The method of claim 4, wherein step (b) comprises the sub-step of employing different weights for different classes.
- 16. The method of claim 4, wherein step (b) includes the sub-step of assigning to at least one of said weights a value of substantially zero so as to shut off a given feature space due to unreliability of said given feature space under predetermined conditions.
- 17. The method of claim 4, wherein step (b) further comprises the sub-step of arbitrarily choosing K to facilitate mathematical operations.
- 18. The method of claim 4, wherein K is assigned a value of zero and f_n is said exponential function.
- 19. The method of claim 4, wherein q is substantially equal to 1.
- 20. The method of claim 4, wherein q approaches infinity, such that: p(x1, x2, . . . , xN|sj) = f_n{K + max[p(x1|sj), p(x2|sj), . . . , p(xN|sj)]}.
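A numeric aside (not part of the claims): as q grows, the bracketed term [w1·p(x1|sj)^q + . . . + wN·p(xN|sj)^q]^(1/q) approaches the single largest stream likelihood, since any fixed positive weight is washed out by the 1/q exponent. A hypothetical check:

```python
def power_mean(likelihoods, weights, q):
    # [ sum_i w_i * p_i^q ]^(1/q); tends toward max(p_i) as q grows,
    # because the largest term dominates the sum and finite weights
    # vanish under the 1/q root.
    return sum(w * p ** q for w, p in zip(weights, likelihoods)) ** (1.0 / q)

ps, ws = [0.2, 0.7, 0.4], [1.0, 1.0, 1.0]
# power_mean(ps, ws, 1) is the plain weighted sum; at large q it nears max(ps).
```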
- 21. An apparatus for robust pattern recognition, said apparatus comprising:
(a) a feature vector generator which generates N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) a feature vector combiner which combines said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed in accordance with the equation: p(x1, x2, . . . , xN|sj) = f_n{K + [w1·p(x1|sj)^q + w2·p(x2|sj)^q + . . . + wN·p(xN|sj)^q]^(1/q)}, where: f_n is one of an exponential function exp( ) and a logarithmic function log( ); sj is a label for a class j; N is greater than or equal to 2; p(x1, x2, . . . , xN|sj) is the conditional probability of feature vectors x1, x2, . . . , xN given that they are generated by said class j; K is a normalization constant; w1, w2, . . . , wN are weights assigned to x1, x2, . . . , xN, respectively, according to confidence levels therein; and q is a real number corresponding to a desired combination function.
- 22. The apparatus of claim 21, wherein f_n is said logarithmic function.
- 23. The apparatus of claim 22, wherein K is approximately 100.
- 24. The apparatus of claim 21, wherein f_n is said exponential function.
- 25. The apparatus of claim 21, wherein said feature vector combiner further comprises:
a class ranker which ranks all classes for each of said feature vectors; and a merged rank list generator which generates a merged rank list by picking that class from among each of said feature vectors which yields a highest rank; whereby it is possible to discriminate among correct and incorrect ones of said classes.
- 26. The apparatus of claim 21, further comprising a feature vector evaluator which evaluates a given one of said feature vectors via information theory-based techniques to determine whether said given one of said feature vectors contains valuable information and should be combined.
- 27. The apparatus of claim 26, wherein said feature vector evaluator in turn comprises:
a mutual information computation module which computes mutual information, I, in accordance with: I(xm, xk) = Σ Σ p(xm, xk) log((p(xm, xk))/(p(xm)p(xk))), where: the first summation is over xm ∈ XM and the second summation is over xk ∈ XK, k ≠ m; xm is another given one of said N sets of feature vectors; and XM and XK are sets of all similar feature vectors computed on all training data; and a feature vector set selector which selects said given feature vector, xk, when I is less than a preselected threshold; whereby complementary feature vectors are combined for maximum benefit.
- 28. The apparatus of claim 21, wherein:
said observation vectors are frames of speech; said feature vectors are acoustic feature vectors; and said pattern is a time waveform corresponding to speech, such that x1, x2, . . . , xN can be represented as x⃗(t), where t is time.
- 29. The apparatus of claim 28, wherein said acoustic feature vectors include at least two of mel cepstra, LDA, centroids, perceptual linear prediction (PLP), LPC cepstra, multiple spectral bands, linear transformations, nonlinear transformations, maximum likelihood linear transformations (MLLT), principal component analysis (PCA), and vocal tract length normalized features (VTL).
- 30. The apparatus of claim 21, wherein N=2.
- 31. The apparatus of claim 21, wherein N>2.
- 32. The apparatus of claim 21, wherein the sum of all of said weights w1, w2, . . . wN is substantially equal to 1.
- 33. The apparatus of claim 21, wherein all of said weights w1, w2, . . . wN are substantially equal.
- 34. The apparatus of claim 21, wherein at least some of said weights w1, w2, . . . wN are not equal.
- 35. The apparatus of claim 21, wherein said feature vector combiner is configured to employ different weights for different classes.
- 36. The apparatus of claim 21, wherein at least one of said weights is substantially zero so as to shut off a given feature space due to unreliability of said given feature space under predetermined conditions.
- 37. The apparatus of claim 21, wherein K is arbitrarily chosen to facilitate mathematical operations.
- 38. The apparatus of claim 21, wherein K is assigned a value of zero and f_n is said exponential function.
- 39. The apparatus of claim 21, wherein q is substantially equal to 1.
- 40. The apparatus of claim 21, wherein q approaches infinity, such that: p(x1, x2, . . . , xN|sj) = f_n{K + max[p(x1|sj), p(x2|sj), . . . , p(xN|sj)]}.
- 41. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for robust pattern recognition, said method steps comprising:
(a) generating N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) combining said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed in accordance with the equation: p(x1, x2, . . . , xN|sj) = f_n{K + [w1·p(x1|sj)^q + w2·p(x2|sj)^q + . . . + wN·p(xN|sj)^q]^(1/q)}, where: f_n is one of an exponential function exp( ) and a logarithmic function log( ); sj is a label for a class j; N is greater than or equal to 2; p(x1, x2, . . . , xN|sj) is the conditional probability of feature vectors x1, x2, . . . , xN given that they are generated by said class j; K is a normalization constant; w1, w2, . . . , wN are weights assigned to x1, x2, . . . , xN, respectively, according to confidence levels therein; and q is a real number corresponding to a desired combination function.
- 42. A method for robust pattern recognition, comprising the steps of:
(a) generating N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) combining said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed via one of:
a weighted likelihood combination scheme wherein a set of weights are assigned to corresponding likelihoods from each of said N sets of feature vectors; and a rank-based state-selection scheme wherein that one of said N sets of feature vectors for which a corresponding one of said likelihoods has a highest rank is selected.
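The two alternative schemes of claim 42 can be sketched as follows; the function names and the example data are illustrative assumptions, not language from the claims.

```python
def weighted_likelihood_combination(stream_likelihoods, weights):
    # Weighted likelihood combination: one weight per feature stream; the
    # combined score is the weighted sum of per-stream likelihoods for a
    # candidate class.
    return sum(w * p for w, p in zip(weights, stream_likelihoods))

def rank_based_state_selection(stream_ranks):
    # Rank-based state selection: pick the stream whose likelihood ranks
    # highest for the candidate class (rank 1 = best rank).
    return min(range(len(stream_ranks)), key=lambda i: stream_ranks[i])
```

For example, per-stream ranks of [3, 1, 7] select stream index 1, while likelihoods [0.2, 0.4] with equal weights [0.5, 0.5] combine to 0.3.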
- 43. An apparatus for robust pattern recognition, said apparatus comprising:
(a) a feature vector generator which generates N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) a feature vector combiner which combines said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed via one of:
a weighted likelihood combination scheme wherein a set of weights are assigned to corresponding likelihoods from each of said N sets of feature vectors; and a rank-based state-selection scheme wherein that one of said N sets of feature vectors for which a corresponding one of said likelihoods has a highest rank is selected.
- 44. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for robust pattern recognition, said method steps comprising:
(a) generating N sets of feature vectors x1, x2, . . . , xN from a set of observation vectors which are indicative of a pattern which it is desired to recognize, at least one of said sets of feature vectors being different from at least one other of said sets of feature vectors and being preselected for purposes of containing at least some complementary information with regard to said at least one other of said sets of feature vectors; and (b) combining said N sets of feature vectors in a manner to obtain an optimized set of feature vectors which best represents said pattern, said combining being performed via one of:
a weighted likelihood combination scheme wherein a set of weights are assigned to corresponding likelihoods from each of said N sets of feature vectors; and a rank-based state-selection scheme wherein that one of said N sets of feature vectors for which a corresponding one of said likelihoods has a highest rank is selected.
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application Serial No. 60/238,841 filed Oct. 6, 2000.
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60238841 | Oct 2000 | US |