Claims
- 1. A method comprising the steps of:
creating an evaluation model from at least one evaluation phone; creating a synthesizer model from at least one synthesizer phone; and determining a matrix from the evaluation and synthesizer models.
- 2. The method of claim 1, wherein the at least one evaluation phone comprises a first plurality of evaluation phones and the at least one synthesizer phone comprises a first plurality of synthesizer phones; and wherein the method further comprises the steps of: creating a new matrix by subtracting the matrix from an identity matrix; creating an intermediate matrix comprising the new matrix and a second identity matrix; determining a first set of specific elements of the intermediate matrix; and determining acoustic confusability from one of the specific elements.
- 3. The method of claim 2, further comprising the steps of:
creating a second evaluation model comprising the first plurality of evaluation phones and additional evaluation phones; creating a second matrix from the second evaluation model and the synthesizer model; creating a second new matrix by subtracting the second matrix from a third identity matrix; creating a second intermediate matrix comprising the second new matrix and a fourth identity matrix; determining a second set of specific elements of the intermediate matrix, the specific elements corresponding to a column of the second intermediate matrix, wherein the second set of specific elements comprises the first set of specific elements and a new set of specific elements; and determining a second acoustic confusability by using previously performed calculations of the first set of elements and by calculating the new set of specific elements.
- 4. The method of claim 1, wherein the evaluation model comprises a hidden Markov model of the at least one evaluation phone and wherein the synthesizer model comprises a hidden Markov model of the at least one synthesizer phone.
- 5. The method of claim 4, wherein at least one of the hidden Markov models comprises a plurality of states and a plurality of transitions between states, wherein at least one of the transitions is a transition from one of the states to itself, wherein at least one of the transitions is a transition from one of the states to another of the states, wherein each transition has a transition probability associated with it, and wherein each state has a probability density associated with it.
- 6. The method of claim 5, wherein the plurality of states comprises a starting state, an ending state and an intermediate state, wherein the plurality of transitions comprise:
a transition from the starting state to itself; a transition from the starting state to the intermediate state; a transition from the intermediate state to itself; a transition from the intermediate state to the ending state; and a transition from the ending state to itself.
- 7. The method of claim 1, further comprising the steps of:
creating a new matrix by subtracting the matrix from an identity matrix; determining an inverse of the new matrix by the following steps:
creating an intermediate matrix comprising the new matrix and a second identity matrix; determining a specific entry of the second identity matrix that corresponds to acoustic confusability; determining a specific column or row in which the specific entry resides; and performing column or row manipulations to create a third identity matrix in the new matrix while calculating only entries of the specific column or row in the second identity matrix; and selecting the specific entry as the acoustic confusability.
- 8. The method of claim 1, further comprising the steps of:
creating a new matrix by subtracting the matrix from an identity matrix; determining an inverse of the new matrix; and determining acoustic confusability by using the inverse of the new matrix.
- 9. The method of claim 8, wherein the step of determining acoustic confusability by using the inverse of the new matrix comprises the step of selecting one element of the inverse of the new matrix as the acoustic confusability.
- 10. The method of claim 5, wherein the step of determining a matrix from the evaluation and synthesizer models comprises the steps of:
determining a plurality of product machine states; and determining a plurality of product machine transitions between the product machine states.
- 11. The method of claim 10, wherein:
each of the product machine states corresponds to one of the states of the evaluation model and one of the states of the synthesizer model; each of the product machine transitions connects one of the product machine states to the same or another product machine state; and a product machine transition exists when one or both of the following are true: a transition connects one evaluation model state with the same or another evaluation model state and a transition connects one synthesizer model state with the same or another synthesizer model state.
- 12. The method of claim 10, wherein the step of determining a matrix from the evaluation and synthesizer models further comprises the steps of:
determining a product machine transition probability for each of the plurality of product machine transitions; and determining a synthetic likelihood for each of the product machine states.
- 13. The method of claim 10, wherein the matrix comprises a plurality of elements and wherein each element of the matrix corresponds to a potential transition between two of the product machine states.
- 14. The method of claim 13, wherein the step of determining a matrix from the evaluation and synthesizer models further comprises the steps of:
selecting an element of the matrix; assigning a probability to the element if a product machine transition exists between two product machine states corresponding to a potential transition that corresponds to the element, else assigning a zero to the element; and continuing the steps of selecting and assigning until each element of the matrix has been assigned.
- 15. A method comprising the steps of:
a) creating an evaluation model from a plurality of evaluation phones, each of the phones corresponding to a first word; b) creating a synthesizer model from a plurality of synthesizer phones, each of the phones corresponding to a second word; c) creating a product machine from the evaluation model and synthesizer model, the product machine comprising a plurality of transitions and a plurality of states; d) determining a matrix from the product machine; and e) determining acoustic confusability of the first word and the second word by using the matrix.
- 16. The method of claim 15, wherein each of the evaluation and synthesizer models comprises a hidden Markov model.
- 17. The method of claim 16, further comprising the step of determining synthetic likelihoods for each of the plurality of product machine states.
- 18. The method of claim 17, wherein each synthetic likelihood is a measure of the acoustic confusability of two specific observation densities associated with the hidden Markov models of the evaluation and synthesizer models.
- 19. The method of claim 17, wherein the synthetic likelihoods are compressed by normalization.
- 20. The method of claim 17, wherein the synthetic likelihoods are compressed by ranking.
- 21. The method of claim 17, wherein all synthetic likelihoods are determined through a method selected from the group consisting essentially of a cross-entropy measure, a dominance measure, a decoder measure, and an empirical measure.
- 22. The method of claim 15, further comprising the steps of:
f) performing steps (a) through (e) for a plurality of word pairs, each word pair comprising evaluation and synthesizer models, thereby determining a plurality of acoustic confusabilities; and g) determining acoustic perplexity by using the plurality of acoustic confusabilities.
- 23. The method of claim 15, further comprising the steps of:
f) performing steps (a) through (e) for a plurality of word pairs, each word pair comprising evaluation and synthesizer models, thereby determining a plurality of acoustic confusabilities; and g) determining synthetic acoustic word error rate by using the plurality of acoustic confusabilities.
- 24. A method comprising the steps of:
a) determining acoustic confusability for each of a plurality of word pairs; and b) determining a metric by using the acoustic confusabilities.
- 25. The method of claim 24, wherein step (b) further comprises the step of determining an acoustic perplexity by using the confusabilities.
- 26. The method of claim 25, further comprising the steps of:
c) performing steps (a) and (b) to determine an acoustic perplexity of a base bigram language model; d) performing steps (a) and (b) to determine an acoustic perplexity of an augmented language model; and e) determining gain comprising a logarithm of a fraction determined by dividing the acoustic perplexity of the augmented language model by the acoustic perplexity of the base bigram language model.
- 27. The method of claim 25, further comprising the step of:
c) minimizing acoustic perplexity during training of a language model.
- 28. The method of claim 27, wherein step (c) further comprises the step of maximizing a negative logarithm of the acoustic perplexity.
- 29. The method of claim 24, wherein step (b) further comprises the step of determining a Synthetic Acoustic Word Error Rate (SAWER) by using the confusabilities.
- 30. The method of claim 29, further comprising the steps of:
c) performing steps (a) and (b) to determine a SAWER of a base bigram language model; d) performing steps (a) and (b) to determine a SAWER of an augmented language model; and e) determining an improvement comprising a difference between the SAWER of the augmented language model and the SAWER of the base bigram language model.
- 31. The method of claim 29, further comprising the step of:
c) minimizing the SAWER during training of a language model.
- 32. The method of claim 31, wherein step (c) further comprises the step of maximizing one minus the SAWER.
- 33. The method of claim 29, further comprising the steps of:
c) performing steps (a) and (b) to determine a SAWER for a vocabulary; d) augmenting the vocabulary with at least one additional word; e) performing steps (a) and (b) to determine a SAWER for the augmented vocabulary; and f) determining an improvement comprising a difference between the SAWER for the vocabulary and the SAWER for the augmented vocabulary.
- 34. The method of claim 33, further comprising the steps of:
g) performing steps (d) through (f) for a plurality of additional words; h) determining a particular word of the additional words that has the best improvement; and i) adding the particular word to the vocabulary.
- 35. The method of claim 24, wherein each of the words of the word pairs is represented by a hidden Markov model, and wherein step (a) further comprises the steps of:
creating a product machine for each of the plurality of word pairs, each product machine comprising a plurality of states and a plurality of transitions determined by the hidden Markov models of a corresponding word pair; and for each product machine, determining synthetic likelihoods for each of the plurality of product machine states.
- 36. The method of claim 35, wherein each synthetic likelihood is a measure of the acoustic confusability of two specific observation densities associated with the hidden Markov models of the corresponding word pair.
- 37. The method of claim 35, wherein the synthetic likelihoods are compressed by normalization.
- 38. The method of claim 35, wherein the synthetic likelihoods are compressed by ranking.
- 39. The method of claim 35, wherein all synthetic likelihoods are determined through a method selected from the group consisting essentially of a cross-entropy measure, a dominance measure, a decoder measure, and an empirical measure.
- 40. The method of claim 35:
wherein step (a) further comprises the step of, for each acoustic confusability:
determining a matrix from a corresponding product machine; and determining an inverse of a second matrix created by subtracting the matrix from an identity matrix; and wherein each hidden Markov model comprises a plurality of phones; wherein a larger word and a smaller word have an identical sequence of phones; wherein the larger of the two words comprises an additional set of phones; and wherein a set of calculations performed when determining the inverse of the matrix for the smaller word is cached and used again when determining the inverse of the matrix for the larger word.
- 41. The method of claim 24, wherein step (a) further comprises the steps of, for each of the word pairs:
determining an edit distance between each word of the word pair; and determining acoustic confusability from the edit distance.
- 42. The method of claim 41, wherein the edit distance is determined by determining a number of operations and a type of each operation to change one word of the word pair into the other word of the word pair.
- 43. The method of claim 42, wherein the operations are selected from the group consisting essentially of deletions, substitutions and additions of phones.
- 44. The method of claim 42, further comprising the step of weighting each operation.
- 45. The method of claim 42, further comprising the step of assigning a cost to each operation.
- 46. A method for determining acoustic confusability of a word pair, the method comprising the steps of:
determining an edit distance between each word of the word pair; and determining acoustic confusability from the edit distance.
- 47. The method of claim 46, wherein the edit distance is determined by determining a number of operations and a type of each operation to change one word of the word pair into the other word of the word pair.
- 48. The method of claim 47, wherein the operations are selected from the group consisting essentially of deletions, substitutions and additions of phones.
- 49. The method of claim 47, further comprising the step of weighting each operation.
- 50. The method of claim 47, further comprising the step of assigning a cost to each operation.
- 51. A system comprising:
a memory that stores computer-readable code; and a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to: create an evaluation model from at least one evaluation phone; create a synthesizer model from at least one synthesizer phone; and determine a matrix from the evaluation and synthesizer models.
- 52. A system comprising:
a memory that stores computer-readable code; and a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to: a) determine acoustic confusability for each of a plurality of word pairs; and b) determine a metric by using the acoustic confusabilities.
- 53. The system of claim 52, wherein the computer-readable code is further configured, when performing step (b), to determine an acoustic perplexity by using the confusabilities.
- 54. The system of claim 52, wherein the computer-readable code is further configured, when performing step (b), to determine a Synthetic Acoustic Word Error Rate (SAWER) by using the confusabilities.
- 55. A system for determining acoustic confusability of a word pair, the system comprising:
a memory that stores computer-readable code; and a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to: determine an edit distance between each word of the word pair; and determine acoustic confusability from the edit distance.
- 56. An article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising: a step to create an evaluation model from at least one evaluation phone; a step to create a synthesizer model from at least one synthesizer phone; and a step to determine a matrix from the evaluation and synthesizer models.
- 57. An article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising: a) a step to determine acoustic confusability for each of a plurality of word pairs; and b) a step to determine a metric by using the acoustic confusabilities.
- 58. The article of manufacture of claim 57, wherein the computer-readable program code means further comprises, when performing step (b), a step to determine an acoustic perplexity by using the confusabilities.
- 59. The article of manufacture of claim 57, wherein the computer-readable program code means further comprises, when performing step (b), a step to determine a Synthetic Acoustic Word Error Rate (SAWER) by using the confusabilities.
- 60. An article of manufacture for determining acoustic confusability of a word pair, the article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising: a step to determine an edit distance between each word of the word pair; and a step to determine acoustic confusability from the edit distance.
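By way of illustration only, the matrix-based determination recited in claims 1 and 7 through 14 might be sketched in Python as follows. The sketch builds a product machine from two small hidden Markov models, fills a matrix M with weighted transition probabilities, and reads one entry of the inverse of (I - M) while computing only the column in which that entry resides. Several points are assumptions of this sketch and are not taken from the claims: the function names (`product_matrix`, `inverse_column`, `acoustic_confusability`) are hypothetical; a product machine transition is taken to exist only when both component transitions exist; each transition is weighted by the synthetic likelihood of its source state pair; the selected entry is the one in the row of the joint starting state and the column of the joint ending state; and all numeric values are made up.

```python
from itertools import product


def product_matrix(eval_trans, synth_trans, likelihood):
    """Return (M, index) for the product machine of two HMMs.

    eval_trans[a][b] and synth_trans[a][b] are transition probabilities.
    likelihood[e][s] is a synthetic likelihood for the state pair (e, s);
    weighting each transition by the likelihood of its source pair is an
    assumption of this sketch.
    """
    states = list(product(range(len(eval_trans)), range(len(synth_trans))))
    index = {state: k for k, state in enumerate(states)}
    n = len(states)
    M = [[0.0] * n for _ in range(n)]  # elements default to zero
    for (e1, s1), (e2, s2) in product(states, states):
        p = eval_trans[e1][e2] * synth_trans[s1][s2]
        if p > 0.0:  # assumed: both component transitions must exist
            M[index[(e1, s1)]][index[(e2, s2)]] = p * likelihood[e1][s1]
    return M, index


def inverse_column(A, col):
    """Return column `col` of the inverse of A by Gauss-Jordan elimination.

    Only one column of the identity side is carried through the row
    operations, so the rest of the inverse is never computed.
    """
    n = len(A)
    aug = [row[:] + [1.0 if i == col else 0.0] for i, row in enumerate(A)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        if abs(aug[pivot][k]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(n):
            if r != k:
                f = aug[r][k] / aug[k][k]
                for c in range(k, n + 1):
                    aug[r][c] -= f * aug[k][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def acoustic_confusability(eval_trans, synth_trans, likelihood):
    """Select one entry of (I - M)^-1 as the confusability (assumed entry:
    row of the joint starting state, column of the joint ending state)."""
    M, index = product_matrix(eval_trans, synth_trans, likelihood)
    n = len(M)
    new = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(n)]
           for i in range(n)]
    start = index[(0, 0)]
    final = index[(len(eval_trans) - 1, len(synth_trans) - 1)]
    return inverse_column(new, final)[start]


if __name__ == "__main__":
    # Three-state topology as in claim 6: starting, intermediate and ending
    # states, each with a self-transition. Probabilities are made up.
    hmm = [[0.6, 0.4, 0.0],
           [0.0, 0.7, 0.3],
           [0.0, 0.0, 0.5]]
    uniform = [[0.9] * 3 for _ in range(3)]  # hypothetical likelihoods
    print(acoustic_confusability(hmm, hmm, uniform))
```

Solving for a single column rather than forming the full inverse is what keeps the cost low when only one entry is needed; a production implementation would more likely call a linear algebra library for that solve.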
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of United States Provisional Application No. 60/199,062, filed Apr. 20, 2000.
Provisional Applications (1)
| Number | Date | Country |
| 60199062 | Apr 2000 | US |