Claims
- 1. A system, comprising an application and audio processing engines, wherein the exchange of audio between the audio processing engines is decoupled from control and application level exchanges.
- 2. The system of claim 1, wherein the application generates control messages that configure and control the audio processing engines in a manner that renders the exchange of control messages independent of the application model and location of the engines.
- 3. The system of claim 1, wherein the audio processing engines comprise web services.
- 4. The system of claim 3, wherein the web services are described and accessed using WSDL (Web Services Description Language), or an extension thereof.
- 5. The system of claim 1, further comprising a task manager that is used to abstract from the application, the discovery of the audio processing engines and remote control of the engines.
- 6. The system of claim 1, further comprising a load manager, responsive to control messages from the application, for selecting an audio processing engine and associating the engine to an audio I/O (input/output) port on one of a session, call and utterance basis.
- 7. The system of claim 1, wherein the application comprises a legacy IVR (interactive voice response) application.
- 8. The system of claim 1, wherein the control and application level exchanges are implemented using SOAP (simple object access protocol) or an extension thereof.
- 9. The system of claim 1, wherein the audio processing engines are dynamically associated with the application on one of a persistent, call, session, and utterance basis.
- 10. The system of claim 1, wherein WSFL (web services flow language) or an extension thereof is used to dynamically configure the processing flow of the system.
- 11. A distributed speech processing system, comprising:
a conversational application; an audio I/O processing service, which is programmable by control messages generated by the conversational application to provide audio I/O services for the conversational application; and a speech engine service, which is programmable by control messages generated by the conversational application to provide speech processing services for the conversational application.
- 12. The system of claim 11, wherein the audio I/O processing service and speech engine service comprise Web services.
- 13. The system of claim 11, wherein the control messages are encoded using XML (eXtensible Markup Language) and wherein the control messages are exchanged using SOAP (Simple Object Access Protocol).
- 14. The system of claim 11, wherein each service comprises interfaces that are described using WSDL (Web Services Description Language).
- 15. The system of claim 14, wherein WSFL (web services flow language) or an extension thereof is used to dynamically configure the processing flow of the system.
- 16. The system of claim 11, wherein the speech engine service provides one of automatic speech recognition (ASR) services, text-to-speech (TTS) synthesis services, natural language understanding (NLU) services, and a combination thereof.
- 17. The system of claim 11, wherein the audio I/O processing service provides one of speech encoding/decoding services, audio recording services, audio playback services, and a combination thereof.
- 18. The system of claim 11, further comprising a load manager that dynamically allocates and assigns the services for the conversational application, based on control messages generated by the conversational application.
- 19. The system of claim 11, wherein the services are programmed to negotiate uplink and downlink audio codecs for generating RTP-based audio streams.
- 20. The system of claim 11, wherein the speech engine services are dynamically allocated to the conversational application on one of a call, session, utterance and persistent basis.
- 21. The system of claim 11, wherein the services are discoverable using UDDI (Universal Description, Discovery and Integration) or an extension thereof.
- 22. The system of claim 11, wherein services provided by the speech engine service and audio I/O processing service are defined as a collection of ports.
- 23. The system of claim 22, wherein types of ports comprise audio in, audio out, control in, and control out.
- 24. The system of claim 11, further comprising a task manager that is used to abstract from the conversational application, the discovery of the services and remote control of the services.
- 25. The system of claim 11, wherein the audio I/O service comprises a gateway that connects audio streams from a network to the speech processing services.
- 26. The system of claim 25, wherein the network comprises a PSTN (public switched telephone network).
- 27. The system of claim 25, wherein the network comprises a VoIP (voice over IP) network.
- 28. The system of claim 25, wherein the network comprises a wireless network.
- 29. A speech processing web service, comprising:
a listener for receiving and parsing control messages that are used for programming the speech processing web service, wherein the control messages are encoded using XML (eXtensible Markup Language) and exchanged using SOAP (Simple Object Access Protocol); a business interface layer for exposing speech processing services offered by the web service, wherein the services are described and accessed using WSDL (Web Services Description Language); and a business logic layer for providing speech processing services, the speech processing services comprising one of automatic speech recognition, speech synthesis, natural language understanding, acoustic feature extraction, audio encoding/decoding, audio recording, audio playback, and any combination thereof.
- 30. The speech processing web service of claim 29, wherein a service of the speech processing web service is dynamically allocated and assigned to a conversational application and programmed by the conversational application.
- 31. The speech processing web service of claim 29, wherein the web service is advertised via UDDI.
- 32. A method for dynamically allocating speech services in a distributed speech processing system, comprising the steps of:
receiving an incoming call by an application; generating a control message for requesting a speech processing service to service the incoming call; dynamically allocating a speech processing service to the application; generating a control message for dynamically programming the allocated speech service; and processing the incoming call using the programmed speech service.
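By way of illustration only, the steps recited in claim 32 can be sketched in Python as follows. This is a minimal in-process sketch under stated assumptions, not the claimed implementation: the class names (`SpeechService`, `LoadManager`), the helper `make_control_message`, and the XML element and attribute names are all hypothetical, and the control messages are passed as plain XML strings rather than carried over SOAP.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


def make_control_message(action, **params):
    """Encode a control message as XML (hypothetical schema, not SOAP)."""
    root = ET.Element("control", {"action": action})
    for name, value in params.items():
        ET.SubElement(root, "param", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


@dataclass
class SpeechService:
    """Stand-in for a speech processing service (hypothetical)."""
    name: str
    kind: str
    busy: bool = False
    config: dict = field(default_factory=dict)

    def handle(self, message):
        # Listener: parse a control message and apply "program" parameters.
        root = ET.fromstring(message)
        if root.get("action") == "program":
            for param in root.findall("param"):
                self.config[param.get("name")] = param.text

    def process(self, audio):
        # Placeholder for real speech processing.
        grammar = self.config.get("grammar")
        return f"{self.name} processed {len(audio)} bytes with grammar={grammar}"


class LoadManager:
    """Allocates an idle service of the requested kind (hypothetical)."""

    def __init__(self, services):
        self.services = list(services)

    def allocate(self, message):
        root = ET.fromstring(message)
        params = {p.get("name"): p.text for p in root.findall("param")}
        for service in self.services:
            if service.kind == params.get("kind") and not service.busy:
                service.busy = True
                return service
        raise RuntimeError("no idle service of the requested kind")

    def release(self, service):
        service.busy = False
        service.config.clear()


def handle_incoming_call(load_manager, audio):
    """Receive a call, request and program a service, then process the call."""
    request = make_control_message("allocate", kind="asr")
    service = load_manager.allocate(request)
    try:
        service.handle(make_control_message("program", grammar="digits"))
        return service.process(audio)
    finally:
        # Allocation here is on a per-call basis: free the service afterwards.
        load_manager.release(service)
```

In this sketch the application never holds a fixed reference to an engine: it asks the load manager for a service, programs whichever one it is given, and releases it when the call ends.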
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to U.S. Provisional Application Serial No. 60/300,755, filed on Jun. 25, 2001, which is incorporated herein by reference.
Provisional Applications (1)

| Number | Date | Country |
|---|---|---|
| 60300755 | Jun 2001 | US |