Claims
- 1. A computer system comprising:
(a) a database of speech models;
(b) a speech recognition (SR) engine adapted to compare user utterances to the database of speech models to recognize the user utterances;
(c) an adaptation module adapted to modify the database of speech models based on a set of user utterances corresponding to a set of known inputs;
(d) a pronunciation evaluation module adapted to characterize user utterances relative to corresponding speech models in the database; and
(e) a sequence generator adapted to generate the set of known inputs used by the adaptation module to modify the database of speech models, wherein the sequence generator automatically selects at least a subset of the known inputs based on the characterization of previous user utterances by the pronunciation evaluation module.
- 2. The invention of claim 1, wherein the speech models are phoneme templates in a parametric domain.
- 3. The invention of claim 1, wherein, using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use by the adaptation module and the pronunciation evaluation module.
- 4. The invention of claim 1, further comprising a score management module adapted to collect results from the pronunciation evaluation module and identify one or more problem phonemes, wherein the sequence generator selects additional known inputs for the set of known inputs based on the one or more problem phonemes.
- 5. The invention of claim 4, wherein the score management module thresholds phoneme pronunciation scores from the pronunciation evaluation module to identify the one or more problem phonemes.
- 6. The invention of claim 1, wherein the generation of known inputs for adaptation of speech models in the database automatically terminates when the system determines that all of the speech models are sufficiently adapted.
- 7. The invention of claim 1, wherein:
the speech models are phoneme templates in a parametric domain;
using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use by the adaptation module and the pronunciation evaluation module;
further comprising a score management module adapted to collect results from the pronunciation evaluation module and identify one or more problem phonemes, wherein:
the sequence generator selects additional known inputs for the set of known inputs based on the one or more problem phonemes; and
the score management module thresholds phoneme pronunciation scores from the pronunciation evaluation module to identify the one or more problem phonemes; and
the generation of known inputs for adaptation of speech models in the database automatically terminates when the system determines that all of the speech models are sufficiently adapted.
- 8. A computer-based method for training a computer application having a speech recognition (SR) engine adapted to compare user utterances to a database of speech models to recognize the user utterances, the method comprising:
generating a set of known inputs;
modifying the database of speech models based on a set of user utterances corresponding to the set of known inputs; and
characterizing user utterances relative to corresponding speech models in the database, wherein at least a subset of the known inputs are automatically selected based on the characterization of previous user utterances.
- 9. The invention of claim 8, wherein the speech models are phoneme templates in a parametric domain.
- 10. The invention of claim 8, wherein, using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use in modifying the database and characterizing the user utterances.
- 11. The invention of claim 8, further comprising collecting results from the pronunciation evaluation module and identifying one or more problem phonemes, wherein additional known inputs are selected for the set of known inputs based on the one or more problem phonemes.
- 12. The invention of claim 11, wherein phoneme pronunciation scores are thresholded to identify the one or more problem phonemes.
- 13. The invention of claim 8, wherein the generation of known inputs for adaptation of speech models in the database automatically terminates when it is determined that all of the speech models are sufficiently adapted.
- 14. The invention of claim 8, wherein:
the speech models are phoneme templates in a parametric domain;
using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use in modifying the database and characterizing the user utterances;
further comprising collecting results from the pronunciation evaluation module and identifying one or more problem phonemes, wherein:
additional known inputs are selected for the set of known inputs based on the one or more problem phonemes; and
phoneme pronunciation scores are thresholded to identify the one or more problem phonemes; and
the generation of known inputs for adaptation of speech models in the database automatically terminates when it is determined that all of the speech models are sufficiently adapted.
- 15. A machine-readable medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method for training a computer application having a speech recognition (SR) engine adapted to compare user utterances to a database of speech models to recognize the user utterances, the method comprising:
generating a set of known inputs;
modifying the database of speech models based on a set of user utterances corresponding to the set of known inputs; and
evaluating the user utterances, wherein at least a subset of the known inputs are automatically selected based on the evaluation of previous user utterances.
- 16. The invention of claim 15, wherein, using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use in modifying the database and characterizing the user utterances.
- 17. The invention of claim 15, further comprising collecting results from the pronunciation evaluation module and identifying one or more problem phonemes, wherein additional known inputs are selected for the set of known inputs based on the one or more problem phonemes.
- 18. The invention of claim 17, wherein phoneme pronunciation scores are thresholded to identify the one or more problem phonemes.
- 19. The invention of claim 15, wherein the generation of known inputs for adaptation of speech models in the database automatically terminates when it is determined that all of the speech models are sufficiently adapted.
- 20. The invention of claim 15, wherein:
the speech models are phoneme templates in a parametric domain;
using the database of speech models, the SR engine generates and compares parametric representations of the set of known inputs to parametric representations of the user utterances to generate segmentation results for use in modifying the database and characterizing the user utterances;
further comprising collecting results from the pronunciation evaluation module and identifying one or more problem phonemes, wherein:
additional known inputs are selected for the set of known inputs based on the one or more problem phonemes; and
phoneme pronunciation scores are thresholded to identify the one or more problem phonemes; and
the generation of known inputs for adaptation of speech models in the database automatically terminates when it is determined that all of the speech models are sufficiently adapted.
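
By way of illustration only, the thresholding, selection, and termination behavior recited in the claims above might be sketched as follows. This is a minimal, non-limiting Python sketch and not the claimed implementation. The threshold value, the round limit, the phoneme labels, the lexicon layout, and the function names `find_problem_phonemes`, `select_known_inputs`, and `run_training` are all hypothetical. The `evaluate` and `adapt` callables are stand-ins for the pronunciation evaluation and adaptation modules, and the sketch assumes that "sufficiently adapted" means that no phoneme score falls below the threshold.

```python
# Illustrative sketch only; all names and values below are assumptions.

SCORE_THRESHOLD = 0.6   # hypothetical cut-off for a "problem" phoneme
MAX_ROUNDS = 10         # hypothetical safety limit on training rounds


def find_problem_phonemes(scores, threshold=SCORE_THRESHOLD):
    """Threshold per-phoneme scores; return the low-scoring phonemes, sorted."""
    return sorted(p for p, s in scores.items() if s < threshold)


def select_known_inputs(problem_phonemes, lexicon, limit=3):
    """Pick up to `limit` words that contain the most problem phonemes.

    `lexicon` maps each candidate word to its list of phonemes.
    Ties are broken alphabetically so the selection is deterministic.
    """
    targets = set(problem_phonemes)
    ranked = sorted(
        ((len(targets & set(phones)), word) for word, phones in lexicon.items()),
        key=lambda item: (-item[0], item[1]),
    )
    return [word for hits, word in ranked if hits > 0][:limit]


def run_training(lexicon, evaluate, adapt, initial_inputs, max_rounds=MAX_ROUNDS):
    """Prompt, evaluate, adapt, and reselect until no problem phonemes remain.

    `evaluate(inputs)` returns a dict of phoneme -> score for the utterances
    spoken in response to `inputs`; `adapt(inputs)` updates the speech models.
    Returns the list of prompt sets that were presented, one per round.
    """
    inputs = list(initial_inputs)
    history = []
    for _ in range(max_rounds):
        history.append(inputs)
        scores = evaluate(inputs)
        adapt(inputs)
        problems = find_problem_phonemes(scores)
        if not problems:
            break  # assumed meaning of "sufficiently adapted"
        inputs = select_known_inputs(problems, lexicon)
        if not inputs:
            break  # no candidate word exercises the remaining problem phonemes
    return history
```

In this sketch, the next set of known inputs depends only on the scores from the immediately preceding round; an implementation could equally accumulate scores across rounds before thresholding.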
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The subject matter of this application is related to U.S. patent application Ser. No. 10/188,539 filed Jul. 3, 2002 as attorney docket no. Gupta 8-1-4 (referred to herein as “the Gupta 8-1-4 application”), the teachings of which are incorporated herein by reference.