Claims
- 1. A speech recognition interface for a speech recognition engine, the interface comprising:
a compiler that produces a binary grammar from a markup language grammar written in a markup language; and a grammar engine that provides the binary grammar to the speech recognition engine.
- 2. The speech recognition interface of claim 1 wherein the markup language grammar is written in an extensible markup language.
- 3. The speech recognition interface of claim 1 wherein the markup language grammar represents a context-free grammar.
- 4. The speech recognition interface of claim 1 wherein the markup language grammar comprises a switch grammar tag that indicates to the speech recognition engine to switch to a different grammar during the recognition of at least one word.
- 5. The speech recognition interface of claim 4 wherein the switch grammar tag is a dictation tag that indicates to the speech recognition engine to switch to a dictation grammar.
- 6. The speech recognition interface of claim 5 wherein the dictation tag indicates to the speech recognition engine to switch to the dictation grammar during the recognition of more than one word.
- 7. The speech recognition interface of claim 4 wherein the switch grammar tag is a text buffer tag that indicates to the speech recognition engine to switch to a grammar stored in a text buffer.
- 8. The speech recognition interface of claim 7 wherein the text buffer comprises a sequence of words and wherein the speech recognition engine identifies a sub-sequence of the words in the sequence of words from an input speech signal.
- 9. The speech recognition interface of claim 1 wherein the markup language grammar comprises a rule tag that delimits a grammar structure that may be referenced by a name attribute of the rule tag.
- 10. The speech recognition interface of claim 9 wherein the rule tag further comprises an interpreter attribute that indicates that code is to be executed when the grammar structure delimited by the rule tag is recognized by the speech recognition engine.
- 11. The speech recognition interface of claim 10 wherein the markup language grammar further comprises a resource tag indicating at least one resource to be provided to the code associated with the interpreter attribute.
- 12. The speech recognition interface of claim 9 wherein the markup language grammar further comprises a script tag that delimits script code to be interpreted when the grammar structure delimited by a rule tag is recognized by the speech recognition engine.
- 13. A computer-readable medium having computer-interpretable instructions comprising:
an application providing a speech interface that expects to receive speech from the user as possible input; and a speech grammar associated with the application and defining valid word patterns for the user's speech, the speech grammar written in a markup language.
- 14. The computer-readable medium of claim 13 wherein the speech grammar comprises grammar tags representing the outermost tags of the grammar.
- 15. The computer-readable medium of claim 13 wherein the speech grammar comprises rule tags that delimit a valid grammar structure for the grammar and that comprise a name attribute which is set equal to the name by which the grammar structure can be referenced.
- 16. The computer-readable medium of claim 15 wherein the speech grammar further comprises rule reference tags that provide a reference to one grammar structure from within a second grammar structure.
- 17. The computer-readable medium of claim 15 wherein the rule tags further comprise an interpreter attribute that indicates whether code associated with the rule tags should be invoked when the grammar structure delimited by the rule tags is recognized from a speech signal.
- 18. The computer-readable medium of claim 17 wherein the speech grammar further comprises resource tags that delimit the identity of a resource that is to be provided to the code associated with a rule tag.
- 19. The computer-readable medium of claim 17 wherein the code associated with rule tags receives values of semantic properties that have been set because the grammar structure delimited by the rule tags has been recognized from the speech signal.
- 20. The computer-readable medium of claim 15 wherein the speech grammar further comprises script tags delimiting script code that is to be interpreted when the grammar structure delimited by a pair of rule tags is recognized from a speech signal.
- 21. The computer-readable medium of claim 15 wherein the rule tags comprise a semantic property name attribute and a semantic property value attribute such that the semantic property represented by the semantic property name is set equal to the semantic property value when the grammar structure delimited by the rule tags is recognized from a speech signal.
- 22. The computer-readable medium of claim 13 wherein the speech grammar further comprises grammar switch tags that indicate that a different grammar should be used during a part of speech recognition.
- 23. The computer-readable medium of claim 22 wherein the grammar switch tags comprise dictation tags that indicate that a dictation grammar should be used for the recognition of at least one word in the grammar structure.
- 24. The computer-readable medium of claim 22 wherein the grammar switch tags comprise text buffer tags that indicate that sub-sequences of words in a sequence of words should be used as a grammar for the recognition of at least one word in the grammar structure.
- 25. The computer-readable medium of claim 13 wherein the speech grammar further comprises phrase tags that delimit at least one word in a grammar structure.
- 26. The computer-readable medium of claim 25 wherein the phrase tags comprise a semantic property name attribute and a semantic property value attribute such that the semantic property represented by the semantic property name is set equal to the semantic property value when the at least one word delimited by the phrase tags is recognized from a speech signal.
- 27. The computer-readable medium of claim 13 wherein the speech grammar further comprises list tags that delimit a list of alternative grammar structures.
- 28. The computer-readable medium of claim 27 wherein the list tags comprise a semantic property name attribute and a semantic property value attribute such that the semantic property represented by the semantic property name is set equal to the semantic property value when at least one of the alternative grammar structures delimited by the list tags is recognized from a speech signal.
- 29. The computer-readable medium of claim 13 wherein the grammar comprises optional tags that delimit a grammar structure that can be but does not have to be recognized from a speech signal in order for a grammar structure that contains the optional tags to be recognized from the speech signal.
- 30. A method of defining a grammar for speech recognition, the method comprising:
delimiting a grammar structure in rule tags that conform to a markup language; and delimiting all of the rule tags for the grammar in grammar tags that conform to a markup language.
- 31. The method of claim 30 wherein delimiting a grammar structure in rule tags comprises setting a name attribute of the rule tags so that the grammar structure can be referred to by the name of the rule tags.
- 32. The method of claim 30 wherein delimiting the grammar structure in rule tags comprises setting a value for an interpreter attribute to indicate that code is to be invoked when the grammar structure delimited by the rule tags is recognized from a speech signal.
- 33. The method of claim 32 wherein delimiting the grammar structure in rule tags further comprises delimiting a resource identifier within resource tags within the rule tags to identify a resource to be provided to the code associated with the interpreter attribute.
- 34. The method of claim 30 wherein delimiting the grammar structure in rule tags comprises delimiting script code within script tags between the rule tags, the script code to be interpreted when the grammar structure delimited by the rule tags is recognized from a speech signal.
- 35. The method of claim 30 wherein delimiting the grammar structure with rule tags comprises setting a semantic property identifier attribute of the rule tag such that the semantic property identified by the semantic property identifier attribute is set equal to a value when the grammar structure delimited by the rule tags is recognized from a speech signal.
- 36. The method of claim 30 wherein delimiting the grammar structure with rule tags comprises delimiting at least one word of the grammar structure in phrase tags.
- 37. The method of claim 36 wherein delimiting at least one word of the grammar structure in phrase tags comprises setting a semantic property identifier attribute and a semantic property value attribute of the phrase tag such that the semantic property identified by the semantic property identifier attribute is set equal to the semantic property value when the at least one word delimited by the phrase tags is recognized from a speech signal.
- 38. The method of claim 30 wherein delimiting the grammar structure with rule tags comprises delimiting a list of alternative grammar sub-structures with list tags.
- 39. The method of claim 38 wherein delimiting a list of alternative grammar sub-structures with list tags comprises setting a semantic property identifier attribute of the list tag such that the semantic property identified by the semantic property identifier attribute is set equal to a value when at least one of the grammar sub-structures in the list of alternative grammar sub-structures is recognized from a speech signal.
- 40. The method of claim 30 wherein delimiting the grammar structure with rule tags comprises delimiting an optional grammar sub-structure as optional such that the grammar structure delimited by the rule tags can be recognized from a speech signal regardless of whether the optional grammar sub-structure is recognized from the speech signal.
- 41. The method of claim 30 wherein delimiting the grammar structure with rule tags comprises including a grammar switch tag in the grammar structure to indicate that a different grammar should be used to recognize at least one word from a speech signal.
- 42. The method of claim 41 wherein including a grammar switch tag comprises including a dictation tag to indicate that a dictation grammar should be used to recognize at least one word from the speech signal.
- 43. The method of claim 41 wherein including a grammar switch tag comprises including a text buffer tag to indicate that sub-sequences of words from a sequence of words should be used to recognize at least one word from the speech signal.
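The tag structure recited in the claims above (grammar tags enclosing rule tags, with phrase, list, optional, and rule reference tags inside a rule, and semantic property attributes on phrase tags) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the tag names `grammar`, `rule`, `phrase`, `list`, `opt`, and `ruleref`, the attributes `name`, `propname`, and `propval`, and the functions `match`, `match_sequence`, and `recognize` do not come from the claims. The sketch also interprets the markup directly against a sequence of text words; it does not compile to a binary grammar and does not process a speech signal.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup grammar; tag and attribute names are illustrative only.
GRAMMAR_XML = """
<grammar>
  <rule name="color">
    <list>
      <phrase propname="color" propval="RED">red</phrase>
      <phrase propname="color" propval="BLUE">blue</phrase>
    </list>
  </rule>
  <rule name="command">
    <phrase>paint</phrase>
    <opt><phrase>it</phrase></opt>
    <ruleref name="color"/>
  </rule>
</grammar>
"""


def match(node, words, pos, rules):
    """Yield (next_pos, props) for every way `node` can match words[pos:]."""
    if node.tag == "phrase":
        tokens = (node.text or "").split()
        end = pos + len(tokens)
        if words[pos:end] == tokens:
            props = {}
            if "propname" in node.attrib:
                # Set the semantic property when the phrase is recognized.
                props[node.get("propname")] = node.get("propval")
            yield end, props
    elif node.tag == "list":
        # Alternatives: any one child may match.
        for child in node:
            yield from match(child, words, pos, rules)
    elif node.tag == "opt":
        # Optional content: either skip it or match it.
        yield pos, {}
        yield from match_sequence(list(node), words, pos, rules)
    elif node.tag == "ruleref":
        # Reference to another rule by its name attribute.
        yield from match(rules[node.get("name")], words, pos, rules)
    elif node.tag == "rule":
        yield from match_sequence(list(node), words, pos, rules)
    else:
        raise ValueError(f"unsupported tag: {node.tag}")


def match_sequence(children, words, pos, rules):
    """Match the child elements one after another, with backtracking."""
    if not children:
        yield pos, {}
        return
    for mid, props in match(children[0], words, pos, rules):
        for end, more in match_sequence(children[1:], words, mid, rules):
            yield end, {**props, **more}


def recognize(grammar_xml, rule_name, text):
    """Return the semantic properties if `text` fits the rule, else None."""
    root = ET.fromstring(grammar_xml)
    rules = {rule.get("name"): rule for rule in root.iter("rule")}
    words = text.lower().split()
    for end, props in match(rules[rule_name], words, 0, rules):
        if end == len(words):  # the whole utterance must be consumed
            return props
    return None
```

Under these assumptions, `recognize(GRAMMAR_XML, "command", "paint it red")` matches the `command` rule with the optional word present and returns the semantic property set by the recognized phrase, while an utterance outside the grammar, such as "paint green", returns `None`.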
REFERENCE TO RELATED APPLICATION
[0001] This application claims priority from a U.S. Provisional Application having serial No. 60/219,861, filed on Jul. 20, 2000 and entitled “MICROSOFT SPEECH SDK (SAPI 5.0)”.
Provisional Applications (1)

| Number | Date | Country |
|---|---|---|
| 60219861 | Jul 2000 | US |