Claims
- 1. A method for generating voice output for an application, comprising:
receiving a symbolic representation of data to be outputted from the application, wherein the symbolic representation is locale-independent; obtaining a locale attribute that identifies a version of a language that is spoken in a locale; expanding the symbolic representation of the data into a fully articulated locale-specific textual representation of the data; and associating the textual representation of the data with one or more audio files containing locale-specific voice output corresponding to the textual representation.
- 2. The method of claim 1, wherein the method further comprises outputting the audio files to a user.
- 3. The method of claim 2, wherein outputting the audio files to the user involves:
sending references to the audio files from an application server to a voice gateway; and allowing the voice gateway to output the audio files to the user.
- 4. The method of claim 2, further comprising:
receiving a voice input from the user; and interpreting the voice input using a locale-specific grammar.
- 5. The method of claim 1, wherein the locale attribute is encoded in an application markup language.
- 6. The method of claim 1, wherein the locale attribute is encoded in a Voice eXtensible Markup Language (VoiceXML) document that contains:
a locale-independent representation of how voice output is to be presented to a user; and a locale-independent representation of how a voice input is to be received from the user.
- 7. The method of claim 1, wherein obtaining the locale attribute involves receiving the locale attribute as an application parameter.
- 8. The method of claim 1, wherein obtaining the locale attribute involves receiving the locale attribute as an application parameter associated with a particular user.
- 9. The method of claim 1, wherein the locale attribute includes:
a language code that identifies the language; and a region code that identifies a geographic region in which a locale-specific version of the language is spoken.
- 10. The method of claim 6, wherein the method further comprises translating a Multi-channel eXtensible Markup Language (MXML) document into the VoiceXML document;
wherein the MXML document can also be translated into other markup languages, such as HyperText Markup Language (HTML).
- 11. The method of claim 1, wherein associating the textual representation of the data with the audio files involves matching the largest possible substrings of the textual representation with corresponding audio files from a library.
- 12. A computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for generating voice output for an application, the method comprising:
receiving a symbolic representation of data to be outputted from the application, wherein the symbolic representation is locale-independent; obtaining a locale attribute that identifies a version of a language that is spoken in a locale; expanding the symbolic representation of the data into a fully articulated locale-specific textual representation of the data; and associating the textual representation of the data with one or more audio files containing locale-specific voice output corresponding to the textual representation.
- 13. The computer-readable storage medium of claim 12, wherein the method further comprises outputting the audio files to a user.
- 14. The computer-readable storage medium of claim 13, wherein outputting the audio files to the user involves:
sending references to the audio files from an application server to a voice gateway; and allowing the voice gateway to output the audio files to the user.
- 15. The computer-readable storage medium of claim 12, wherein the method further comprises:
receiving a voice input from the user; and interpreting the voice input using a locale-specific grammar.
- 16. The computer-readable storage medium of claim 12, wherein the locale attribute is encoded in an application markup language.
- 17. The computer-readable storage medium of claim 12, wherein the locale attribute is encoded in a Voice eXtensible Markup Language (VoiceXML) document that contains:
a locale-independent representation of how voice output is to be presented to a user; and a locale-independent representation of how a voice input is to be received from the user.
- 18. The computer-readable storage medium of claim 12, wherein obtaining the locale attribute involves receiving the locale attribute as an application parameter.
- 19. The computer-readable storage medium of claim 12, wherein obtaining the locale attribute involves receiving the locale attribute as an application parameter associated with a particular user.
- 20. The computer-readable storage medium of claim 12, wherein the locale attribute includes:
a language code that identifies the language; and a region code that identifies a geographic region in which a locale-specific version of the language is spoken.
- 21. The computer-readable storage medium of claim 17, wherein the method further comprises translating a Multi-channel eXtensible Markup Language (MXML) document into the VoiceXML document;
wherein the MXML document can also be translated into other markup languages, such as HyperText Markup Language (HTML).
- 22. The computer-readable storage medium of claim 12, wherein associating the textual representation of the data with the audio files involves matching the largest possible substrings of the textual representation with corresponding audio files from a library.
- 23. An apparatus that generates voice output for an application, comprising:
a receiving mechanism configured to receive a symbolic representation of data to be outputted from the application, wherein the symbolic representation is locale-independent; wherein the receiving mechanism is configured to obtain a locale attribute that identifies a version of a language that is spoken in a locale; an expansion mechanism configured to expand the symbolic representation of the data into a fully articulated locale-specific textual representation of the data; and an association mechanism configured to associate the textual representation of the data with one or more audio files containing locale-specific voice output corresponding to the textual representation.
- 24. The apparatus of claim 23, wherein the apparatus further comprises an output mechanism configured to output the audio files to a user.
- 25. The apparatus of claim 24,
wherein the expansion mechanism and the association mechanism reside within an application server; and wherein the output mechanism resides within a voice gateway, which is configured to receive references to the audio files from the application server, and to output the audio files to the user.
- 26. The apparatus of claim 23, further comprising a voice input mechanism configured to:
receive a voice input from the user; and interpret the voice input using a locale-specific grammar.
- 27. The apparatus of claim 23, wherein the locale attribute is encoded in an application markup language.
- 28. The apparatus of claim 23, wherein the locale attribute is encoded in a Voice eXtensible Markup Language (VoiceXML) document that contains:
a locale-independent representation of how voice output is to be presented to a user; and a locale-independent representation of how a voice input is to be received from the user.
- 29. The apparatus of claim 23, wherein the receiving mechanism is configured to obtain the locale attribute as an application parameter.
- 30. The apparatus of claim 23, wherein the receiving mechanism is configured to obtain the locale attribute as an application parameter associated with a particular user.
- 31. The apparatus of claim 23, wherein the locale attribute includes:
a language code that identifies the language; and a region code that identifies a geographic region in which a locale-specific version of the language is spoken.
- 32. The apparatus of claim 28, wherein the apparatus further comprises a translation mechanism configured to translate a Multi-channel eXtensible Markup Language (MXML) document into the VoiceXML document;
wherein the MXML document can also be translated into other markup languages, such as HyperText Markup Language (HTML).
- 33. The apparatus of claim 23, wherein the association mechanism is configured to match the largest possible substrings of the textual representation with corresponding audio files from a library.
- 34. A means for generating voice output for an application, comprising:
a receiving means for receiving a symbolic representation of data to be outputted from the application, wherein the symbolic representation is locale-independent; wherein the receiving means is configured to obtain a locale attribute that identifies a version of a language that is spoken in a locale; an expansion means for expanding the symbolic representation of the data into a fully articulated locale-specific textual representation of the data; and an association means for associating the textual representation of the data with one or more audio files containing locale-specific voice output corresponding to the textual representation.
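For illustration, the sketches below walk through the techniques recited in the claims. None of this code is taken from the patent; every helper name, locale string, and file path is hypothetical. First, a minimal Python sketch of the expand-then-associate flow of claim 1: a locale-independent symbolic date is expanded into fully articulated locale-specific text, and each token of that text is then mapped to a prerecorded audio file.

```python
from datetime import date

def expand_date(symbolic: date, locale: str) -> str:
    """Expand a locale-independent date into fully articulated locale-specific text."""
    months_en = ["January", "February", "March", "April", "May", "June", "July",
                 "August", "September", "October", "November", "December"]
    months_fr = ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
                 "août", "septembre", "octobre", "novembre", "décembre"]
    if locale == "en-US":  # "January 14, 2003"
        return f"{months_en[symbolic.month - 1]} {symbolic.day}, {symbolic.year}"
    if locale == "fr-FR":  # "14 janvier 2003"
        return f"{symbolic.day} {months_fr[symbolic.month - 1]} {symbolic.year}"
    raise ValueError(f"unsupported locale: {locale}")

def associate_audio(text: str, library: dict[str, str]) -> list[str]:
    """Map each token of the expanded text to a prerecorded audio file."""
    return [library[token.lower().strip(",")] for token in text.split()]

# Usage: the same symbolic date yields different text (and audio) per locale.
library_en = {"january": "en-US/january.wav", "14": "en-US/14.wav",
              "2003": "en-US/2003.wav"}
text = expand_date(date(2003, 1, 14), "en-US")
print(text)                                # January 14, 2003
print(associate_audio(text, library_en))   # ['en-US/january.wav', ...]
```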
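Claims 3 and 6 describe the delivery path: the application server sends references to the audio files, wrapped in a VoiceXML document, and the voice gateway plays them to the user. A minimal sketch, assuming standard VoiceXML 2.0 prompt/audio elements; the helper name and URLs are hypothetical.

```python
def render_voicexml(audio_urls: list[str], locale: str) -> str:
    """Wrap audio-file references in a VoiceXML document for a voice gateway."""
    audio_tags = "\n      ".join(f'<audio src="{url}"/>' for url in audio_urls)
    return f"""<?xml version="1.0"?>
<vxml version="2.0" xml:lang="{locale}">
  <form>
    <block>
      <prompt>
      {audio_tags}
      </prompt>
    </block>
  </form>
</vxml>"""

print(render_voicexml(
    ["http://appserver/prompts/your_balance_is.wav",
     "http://appserver/digits/five.wav",
     "http://appserver/units/dollars.wav"],
    "en-US"))
```

Under this arrangement the application server never streams audio itself; the gateway dereferences each src URL and plays the files in order.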
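Claims 9, 20, and 31 recite a locale attribute composed of a language code and a region code. A minimal sketch, assuming the conventional language-REGION string form (e.g. en-US); the Locale class is a hypothetical illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Locale:
    language: str   # e.g. "en" (English), "fr" (French)
    region: str     # e.g. "US", "GB", "CA" -- distinguishes en-US from en-GB

    @classmethod
    def parse(cls, attribute: str) -> "Locale":
        language, region = attribute.split("-", 1)
        return cls(language, region)

print(Locale.parse("en-US"))   # Locale(language='en', region='US')
print(Locale.parse("fr-CA"))   # Canadian French, distinct from fr-FR
```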
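Claims 11, 22, and 33 recite matching the largest possible substrings of the textual representation against audio files in a library. A minimal sketch of one way to realize that as a greedy longest-prefix matcher; the library keys and file names are hypothetical.

```python
def match_audio(text: str, library: dict[str, str]) -> list[str]:
    """Greedily match the longest recorded phrase at each position."""
    words = text.split()
    files: list[str] = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, shrinking until a hit.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j]).lower()
            if phrase in library:
                files.append(library[phrase])
                i = j
                break
        else:
            raise KeyError(f"no audio for: {words[i]}")
    return files

library = {
    "your balance is": "prompts/your_balance_is.wav",   # one multi-word prompt
    "five": "digits/five.wav",
    "dollars": "units/dollars.wav",
}
# Prefers the single three-word prompt over three separate word files.
print(match_audio("Your balance is five dollars", library))
```

Matching the longest recorded phrase first lets a single studio-recorded prompt stand in for several concatenated word files, which generally sounds more natural.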
RELATED APPLICATION
[0001] This application hereby claims priority under 35 U.S.C. §119 to U.S. Provisional Patent Application No. 60/440,309, filed on 14 Jan. 2003, entitled “Concatenated Speech Server,” by inventor Christopher Rusnak (Attorney Docket No. OR03-01301PSP), and to U.S. Provisional Patent Application No. 60/446,145, filed on 10 Feb. 2003, entitled “Concatenated Speech Server,” by inventor Christopher Rusnak (Attorney Docket No. OR03-01301PSP2). This application additionally claims priority under 35 U.S.C. §119 to U.S. Provisional Patent Application No. 60/449,078, filed on 21 Feb. 2003, entitled “Globalization of Voice Applications,” by inventors Ashish Vora, Kara L. Sprague and Christopher Rusnak (Attorney Docket No. OR03-03501PRO).
[0002] This application is additionally related to a non-provisional patent application entitled, “Structured Datatype Expansion Framework,” by inventors Ashish Vora, Kara L. Sprague and Christopher Rusnak filed on the same day as the instant application (Attorney Docket No. OR03-03501).
[0003] This application is also related to a non-provisional patent application entitled, “Method and Apparatus for Using Locale-Specific Grammars for Speech Recognition,” by inventor Ashish Vora filed on the same day as the instant application (Attorney Docket No. OR03-03601).
Provisional Applications (3)

| Number | Date | Country |
| --- | --- | --- |
| 60/440,309 | Jan 2003 | US |
| 60/446,145 | Feb 2003 | US |
| 60/449,078 | Feb 2003 | US |