Claims
- 1. A speech recognition system for identifying words from a series of feature vectors representing speech, the system comprising:
a segment model capable of providing a trajectory expression for each of a set of segment states; and
a decoder capable of generating a path score that is indicative of the probability that a sequence of words is represented by the series of feature vectors, the path score being based on a feature probability that is determined in part based on differences between a sequence of feature vectors and a segment state's trajectory expression.
- 2. The speech recognition system of claim 1 wherein the segment model comprises a set of probabilistic parameters and wherein the feature probability represents the probability of the sequence of feature vectors given the probabilistic parameters.
- 3. The speech recognition system of claim 2 wherein the probabilistic parameters comprise a trajectory parameter matrix and a covariance matrix.
- 4. The speech recognition system of claim 2 further comprising a trainer for training the probabilistic parameters for each segment state.
- 5. The speech recognition system of claim 4 wherein the trainer adaptively trains the probabilistic parameters based on a probability that is determined in part by taking the difference between a sequence of training feature vectors and the trajectory expression provided by the segment model.
- 6. The speech recognition system of claim 5 wherein the probabilistic parameters comprise a trajectory parameter matrix that is adaptively trained according to:
- 7. The speech recognition system of claim 6 wherein γm|ki is calculated based in part on a feature probability that provides the likelihood of the current sequence of training feature vectors given a segment model of a previous iteration, the feature probability calculated as:
- 8. A method of speech recognition comprising:
accessing a segment model's description of a curve for a segment of speech;
determining differences between the curve and input feature vectors associated with the segment of speech;
using the differences to determine a segment probability that describes the likelihood of the input feature vectors given the segment model; and
identifying a most likely sequence of hypothesized words based in part on the segment probability.
- 9. The method of claim 8 further comprising training the segment model through an iterative process that trains a present iteration's segment model in part by determining differences between training feature vectors and a curve described by a previous iteration's segment model.
- 10. The method of claim 9 wherein determining differences between the training feature vectors and a curve described by a previous iteration's segment model comprises multiplying a parameter matrix of a previous iteration by a generation matrix to produce a product and subtracting the product from a matrix containing the training feature vectors.
- 11. The method of claim 10 wherein training the present iteration's segment model further comprises determining the present iteration's parameter matrix through the calculation of:
- 12. The method of claim 11 wherein γm|ki is calculated based in part on a feature probability that provides the likelihood of the current sequence of training feature vectors given a parameter matrix of a previous iteration and a covariance matrix of a previous iteration, the feature probability calculated as:
- 13. A method of training a speech recognition system using training feature vectors generated from a training speech signal, the method comprising:
segmenting the training feature vectors into segments aligned with units of training text;
determining differences between training feature vectors of a segment and a curve defined by a segment model associated with the segment's unit of text; and
using the differences to determine a revised segment model for the unit of text.
- 14. A computer-readable medium having computer-executable components for performing steps comprising:
evaluating trajectory expressions at selected frames of a speech signal, the trajectory expressions representing a segment model for a speech recognition system;
determining differences between the evaluated trajectory expressions and feature vectors generated from a speech signal;
using the differences to determine a segment probability that describes the likelihood of the feature vectors given the segment model; and
identifying the likelihood of a sequence of words being present in the speech signal based in part on the segment probability.
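
The difference computation recited in claims 8 and 10 (multiply a parameter matrix by a generation matrix, subtract the product from the matrix of feature vectors, and turn the differences into a segment probability) can be illustrated with a minimal sketch. This is not the claimed implementation. It assumes a polynomial trajectory whose generation matrix holds powers of normalized frame time, and it assumes a diagonal covariance; neither choice is stated in the claims. All function and variable names are hypothetical.

```python
import math

def generation_matrix(order, num_frames):
    # Assumed form: row r holds (t / (N - 1)) ** r for each frame t.
    denom = max(num_frames - 1, 1)
    return [[(t / denom) ** r for t in range(num_frames)]
            for r in range(order + 1)]

def matmul(a, b):
    # Plain matrix product of two lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def residuals(features, params, gen):
    # Differences between the feature vectors and the trajectory params * gen.
    # features is D x N (one row per feature dimension, one column per frame).
    trajectory = matmul(params, gen)
    return [[features[d][t] - trajectory[d][t]
             for t in range(len(features[0]))]
            for d in range(len(features))]

def segment_log_likelihood(features, params, variances):
    # Gaussian log-likelihood of the frames around the trajectory,
    # assuming a diagonal covariance given as one variance per dimension.
    order = len(params[0]) - 1
    gen = generation_matrix(order, len(features[0]))
    total = 0.0
    for d, row in enumerate(residuals(features, params, gen)):
        for err in row:
            total -= 0.5 * (math.log(2 * math.pi * variances[d])
                            + err * err / variances[d])
    return total

# Hypothetical segment: 2 feature dimensions, 3 frames, linear trajectory.
params = [[1.0, 2.0],    # dimension 0: 1 + 2t
          [0.0, -1.0]]   # dimension 1: -t
variances = [1.0, 1.0]
on_curve = [[1.0, 2.0, 3.0], [0.0, -0.5, -1.0]]    # lies exactly on the trajectory
off_curve = [[1.5, 2.0, 2.0], [0.0, 0.5, -1.0]]    # deviates from the trajectory

print(segment_log_likelihood(on_curve, params, variances))
print(segment_log_likelihood(off_curve, params, variances))
```

Frames that follow the trajectory closely yield small differences and therefore a higher segment log-likelihood than frames that deviate from it, which is the quantity a decoder could fold into a path score.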
REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Divisional of U.S. Patent Application 09/559,509, filed on Apr. 27, 2000 and entitled SPEECH RECOGNITION METHOD AND APPARATUS UTILIZING SEGMENT MODELS.
Divisions (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09559509 | Apr 2000 | US |
| Child | 10866934 | Jun 2004 | US |