Claims
- 1. A speech recognition system which separately outputs text and commands comprising:
- an isolated word speech recognizer;
- accessible by said isolated word speech recognizer, a first vocabulary of respective text word models, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with an identified one or more of said models;
- a continuous speech recognizer;
- accessible by said continuous speech recognizer, a second vocabulary of respective command word models, said continuous speech recognizer operating to compare speech input to said second vocabulary of command word models and to provide a score indicating the degree of match of said speech input with at least one identified sequence of the respective models;
- an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
- 2. A recognition system as set forth in claim 1 further comprising means for applying a relative scaling to said recognizer scores by a factor empirically trained to minimize incursions by each of said vocabularies on correct results from the other vocabulary.
- 3. A recognition system as set forth in claim 2 wherein said scaling factor is determined by applying, with a given factor,
- A. isolated word test data to both recognizers and counting the intrusions by CSR models on correct ISR translations, and
- B. continuous word test data to both recognizers and counting intrusions by ISR models on correct CSR translations,
- and then adjusting said factor to minimize intrusions.
- 4. A recognition system as set forth in claim 1 further comprising means for testing a sequence of models scored by said continuous speech recognizer to determine if the sequence parses into an executable command.
- 5. A recognition system as set forth in claim 1 wherein said text word vocabulary includes in excess of 5000 models and said command word vocabulary includes fewer than 2000 models.
- 6. A recognition system as set forth in claim 1 further comprising means for normalizing the score provided by said continuous speech recognizer on the basis of the length of the speech input.
- 7. A speech recognition system which separately outputs text and commands comprising:
- an isolated word speech recognizer;
- accessible by said isolated word speech recognizer, a first vocabulary of respective text word models numbering in excess of 5000, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with identified ones of said models;
- a continuous speech recognizer;
- accessible by said continuous speech recognizer, a second vocabulary of respective command word models numbering less than 2000, said continuous speech recognizer operating to compare speech input to said second vocabulary of models and to provide a score indicating the degree of match of said speech input with an identified sequence of the respective models;
- means for normalizing the score provided by said continuous speech recognizer on the basis of the length of the speech input;
- an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
- 8. A recognition system as set forth in claim 7 further comprising means for applying a relative scaling to said recognizer scores by a factor empirically trained to minimize incursions by each of said vocabularies on correct results from the other vocabulary.
- 9. A speech recognition system which separately outputs text and commands comprising:
- an isolated word speech recognizer;
- accessible by said isolated word speech recognizer, a first vocabulary of respective text word models numbering in excess of 5000, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with identified ones of said models;
- a continuous speech recognizer;
- accessible by said continuous speech recognizer, a second vocabulary of respective command word models numbering less than 2000, said continuous speech recognizer operating to compare speech input to said second vocabulary of models and to provide a score indicating the degree of match of said speech input with an identified sequence of the respective models;
- means for normalizing the score provided by said continuous speech recognizer on the basis of the length of the speech input;
- means for applying a relative scaling to said recognizer scores by a factor empirically trained to minimize incursions by each of said vocabularies on correct results from the other vocabulary; and
- an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
- 10. A recognition system as set forth in claim 9 wherein said scaling factor is determined by applying, with a given factor,
- A. isolated word test data to both recognizers and counting the intrusions by CSR models on correct ISR translations, and
- B. continuous word test data to both recognizers and counting intrusions by ISR models on correct CSR translations,
- and then adjusting said factor to minimize intrusions.
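The arbitration, length normalization, and scaling-factor training recited in the claims above can be illustrated with a minimal sketch. It is not the patented implementation. All names (`IsrHypothesis`, `CsrHypothesis`, `arbitrate`, `tune_scale`) are hypothetical, and the sketch assumes that higher scores mean better matches, that the ISR scores are already on a per-frame basis, that the scaling factor multiplies the normalized CSR score, and that each test utterance is recognized correctly by its own recognizer, so any selection from the other vocabulary counts as an intrusion.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class IsrHypothesis:
    word: str      # text word model identified by the isolated word recognizer
    score: float   # assumed: higher is better, already per-frame


@dataclass
class CsrHypothesis:
    command_words: List[str]  # identified sequence of command word models
    score: float              # assumed: accumulated over the whole utterance
    num_frames: int           # length of the speech input, in frames


Case = Tuple[List[IsrHypothesis], Optional[CsrHypothesis]]


def normalize_csr(hyp: CsrHypothesis) -> float:
    """Normalize the CSR score by the length of the speech input."""
    return hyp.score / max(hyp.num_frames, 1)


def arbitrate(
    isr_hyps: Sequence[IsrHypothesis],
    csr_hyp: Optional[CsrHypothesis],
    scale: float,
    is_executable: Callable[[List[str]], bool] = lambda words: True,
) -> Tuple[str, object]:
    """Return ("text", word) or ("command", words), whichever scores higher."""
    best_isr = max(isr_hyps, key=lambda h: h.score, default=None)
    # A command sequence is only eligible if it parses into an executable command.
    csr_ok = csr_hyp is not None and is_executable(csr_hyp.command_words)
    if best_isr is None and not csr_ok:
        return ("none", None)
    if not csr_ok:
        return ("text", best_isr.word)
    csr_score = scale * normalize_csr(csr_hyp)
    if best_isr is None or csr_score > best_isr.score:
        return ("command", csr_hyp.command_words)
    return ("text", best_isr.word)


def count_intrusions(
    isolated_cases: Sequence[Case], continuous_cases: Sequence[Case], scale: float
) -> int:
    """Count cross-vocabulary intrusions for one candidate scaling factor."""
    # A: isolated word test data, where selecting a command is a CSR intrusion.
    csr_intrusions = sum(
        1 for isr, csr in isolated_cases if arbitrate(isr, csr, scale)[0] == "command"
    )
    # B: continuous test data, where selecting text is an ISR intrusion.
    isr_intrusions = sum(
        1 for isr, csr in continuous_cases if arbitrate(isr, csr, scale)[0] == "text"
    )
    return csr_intrusions + isr_intrusions


def tune_scale(
    isolated_cases: Sequence[Case],
    continuous_cases: Sequence[Case],
    candidates: Sequence[float],
) -> float:
    """Pick the candidate factor that minimizes total intrusions."""
    return min(
        candidates,
        key=lambda s: count_intrusions(isolated_cases, continuous_cases, s),
    )
```

A simple search over candidate factors stands in here for the empirical training; the claims do not say how the factor is adjusted, only that it is adjusted to minimize intrusions.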
CROSS-REFERENCE TO RELATED APPLICATION
This application is a continuation-in-part of application Ser. No. 08/496,979 filed Jun. 30, 1995 and entitled Speech Recognition System Using Arbitration Between Continuous Speech And Isolated Word Modules, now U.S. Pat. No. 5,677,991 issued on Oct. 14, 1997.
Non-Patent Literature Citations (1)
Jeffrey C. Scott, "The voices of automation", Computer Shopper, vol. 16, No. 9, p. 550(6), from Computer Select, Aug. 1997.
Continuation in Parts (1)

| | Number | Date | Country |
|---|---|---|---|
| Parent | 496979 | Jun 1995 | |