The present invention relates to an interaction server for adapting modalities of a dialog between one or several clients and one or several server applications as well as a method and a computer software product for adapting modalities of such dialog.
The invention is based on a priority application, EP 03292201.5, which is hereby incorporated by reference.
In recent years, computers have been provided with a plurality of different types of input devices, such as a keyboard, a mouse, a touch panel, an image scanner, a video camera, a pen and a microphone to enable various information items to be input in various forms. Also a plurality of different types of output devices, such as different forms of display units and a loudspeaker have been provided for outputting various information items in a variety of forms, such as different graphical forms or spoken language. Further, communication terminals provided with different types of input and output devices are enabling the input and the output of information items in various forms. For example, JP10107877 A describes a multi-modal telephone set which uses both a visual display and a synthesized voice to communicate with the user.
Further, application programming languages support alternative information types and variant records, like alt texts in HTML/HTTP, for performing access to arbitrary types of information.
It is the object of the present invention to improve the interaction between a user and a server application executed by a network server.
The object of the present invention is achieved by an interaction server for adapting modalities of a dialog between one or several clients and one or several server applications, the interaction server comprising a modality handler receiving a set of dialog modality capability data and/or dialog modality requirement data assigned to an intended dialog between a particular client and a particular server application, the modality handler mediating between the dialog modality capabilities and/or dialog modality requirements of the client and the server application and selecting the dialog modalities of the dialog based on said mediation.

The object of the present invention is further achieved by a method for adapting modalities of a dialog between a client and a server, the method comprising the steps of providing a set of dialog modality capabilities or dialog modality requirements to an interaction server, mediating between the dialog modality capabilities and/or dialog modality requirements of the client and the server, and selecting the dialog modalities of said dialog based on said mediation.

The object of the present invention is further achieved by a computer software product providing, when executed by a computer, the steps of receiving a set of dialog modality capability data or dialog modality requirement data of the client and/or the server, mediating between the dialog modality capabilities and/or dialog modality requirements of the client and the server, and selecting the dialog modalities of said dialog based on said mediation.

The present invention implements the basic idea of managing modalities separately, with a proxy, i.e. neither at the client (e.g. terminal) nor at the server or server application.
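The three claimed steps, providing capability/requirement data, mediating, and selecting, can be illustrated with a minimal sketch. All names here are hypothetical and not part of any claimed interface; the simplest possible mediation (set intersection) is assumed:

```python
# Minimal sketch of the claimed mediation steps; all names are
# illustrative and the intersection-based mediation is an assumption.

def mediate(client_capabilities, server_requirements):
    """Select the dialog modalities supported by both parties."""
    # Step 1: capability/requirement sets are provided to the proxy.
    client = set(client_capabilities)
    server = set(server_requirements)
    # Step 2: mediate by intersecting what the client can render
    # with what the server application needs.
    common = client & server
    # Step 3: the intersection is the selected set of dialog modalities.
    return sorted(common)

selected = mediate(
    client_capabilities={"voice", "text", "graphics"},
    server_requirements={"voice", "text"},
)
# selected == ["text", "voice"]
```

A real mediation would additionally weigh preferences and environmental data, as described in the embodiments below.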
Several advantages are achieved by the present invention:
A universal modality interface is provided between clients and server applications. Neither the client nor the server has to take care of meeting the modality capabilities and requirements of the other party. Client and server implementation is simplified. Network services are no longer restricted to one specific type of terminal; instead, a plurality of different kinds of network services may be accessed by a plurality of different types of terminals via the interaction server. The heterogeneous client environment is hidden from the network. Consequently, the efficiency of service creation and service provisioning is increased and the number of available network services is drastically increased.
Further advantages are achieved by the embodiments indicated by the dependent claims.
A modality mediating interaction server is located between the clients and the servers. A server could offer several modalities to the interaction server, and a client could subscribe to several modalities. The interaction server performs a matching and/or arranges a modality agreement, a modality adaptation, or a modality conversion. Further, transfer protocols could be adapted, leading to a more efficient communication.
According to a preferred embodiment of the invention, the modality handler considers modality capability data describing terminal and service presentation capabilities and modality requirement data describing service and/or user presentation requirements to perform the mediation between the client and the server application.
Modality capabilities are, for example, the available input and output devices of a terminal, the type or classification of these input/output devices, e.g. the size and the color capabilities of a liquid crystal display, and the terminal hardware/software support of these input/output devices.
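Such capability data might, for instance, be represented as follows. The field names and values are invented for illustration; no particular encoding is claimed:

```python
# Hypothetical structure for terminal modality capability data:
# devices, their classification, and the software support, as
# described above. All field names are illustrative only.
terminal_profile = {
    "client_id": "client-44",
    "input_devices": ["keypad", "microphone", "touch_screen"],
    "output_devices": [
        {"type": "lcd", "size_px": (320, 240), "colors": 65536},
        {"type": "speaker", "channels": 1},
    ],
    "software_support": {"html": True, "voicexml": False},
}

def supports_graphics(profile):
    """A client supports a graphical dialog if it has a display."""
    return any(d["type"] == "lcd" for d in profile["output_devices"])
```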
If all these aforementioned data are considered by the modality handler during the mediation step, the modality handler is in a position to select the dialog modalities of a particular dialog in an optimized way.
Further advantages are achieved if the modality handler considers in addition terminal and/or user preferences. For example, the user could personally select modalities he intends to use for interacting with a network service. Also, a server application providing a network service may set modality preferences indicating the modalities which are best suited to provide the service. Further, environmental data, for example the actual location of the client, may be taken into consideration for the selection of the dialog modalities. These additional features help the modality handler to adapt dialog modalities to current user needs.
To support the above described process, the server application provides information like the identification of the service provided by the server application, its service class, its service type and its service preferences to the modality handler. Conversely, the modality handler receives information like client type, client identifier, user, user type, location of the client, user preferences, client preferences and/or client environmental data for performing the mediation step.
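The two information items exchanged with the modality handler could be modeled as simple message types. These dataclasses are a sketch only; the field set follows the enumeration above, but the type names are invented:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical message types for the information items exchanged
# with the modality handler; names are illustrative only.

@dataclass
class ClientInfo:  # corresponds to the client-side information 71
    client_id: str
    client_type: str
    user_id: str
    location: Optional[str] = None
    preferences: dict = field(default_factory=dict)

@dataclass
class ServiceInfo:  # corresponds to the service-side information 72
    service_id: str
    service_class: str
    service_type: str
    preferences: dict = field(default_factory=dict)
```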
Further advantages are achieved if the modality handler accesses a terminal, user and/or service profile for mediating between the dialog modality capabilities and/or dialog modality requirements of the client and the server application. This helps to minimize the data flow over the network and supports a proper mediation between the different interests.
According to a further preferred embodiment of the invention, the modality handler provides, in addition to the above described modality selection function, modality adaptation and/or modality conversion functions. For example, the modality handler creates a dialog modality specification of said dialog specifying the selected dialog modalities, which is used by a dialog engine for performing the adaptation and conversion functions.
An efficient and proper adaptation is achieved by a dialog engine accessing a multi-modal service script provided by the server application, the created dialog specification and a dialog building data base.
The efficiency of the modality conversion and/or adaptation functions of the server is increased by providing a multi-modal backend within the interaction server executing a dialog script created by the dialog engine. The multi-modal backend may comprise a set of browser applications selected case by case to communicate with the client. Further, the multi-modal backend may comprise a set of media-processing units, for example speech recognition, text to speech or handwriting recognition, for communicating with the client. This specific architecture increases the efficiency of the data processing and makes it possible to support a large number of different types of clients in an efficient way.
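The case-by-case selection of a media-processing unit can be pictured as a simple dispatch table. The unit implementations below are stubs invented for the sketch, not real speech-processing code:

```python
# Sketch of a multi-modal backend dispatching to media-processing
# units; the units themselves are stubbed and purely illustrative.

def text_to_speech(text):
    # Stub: wrap text in a marker standing in for synthesized audio.
    return f"<audio:{text}>"

def speech_recognition(audio):
    # Stub inverse of the above: extract the text payload again.
    assert audio.startswith("<audio:") and audio.endswith(">")
    return audio[len("<audio:"):-1]

MEDIA_UNITS = {
    "tts": text_to_speech,
    "asr": speech_recognition,
}

def backend_process(unit_name, payload):
    """Select a media-processing unit case by case and run it."""
    unit = MEDIA_UNITS[unit_name]
    return unit(payload)
```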
Further advantages are achieved by the provisioning of a protocol conversion function within the interaction server converting between the protocols used for the communication between the interaction server and the servers executing the server applications and the protocols used for the communication between a specific client and the interaction server.
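A protocol conversion function of this kind translates between a client-side and a server-side wire format. Both formats in this sketch are invented; only the round-trip idea is illustrated:

```python
import json

# Illustrative protocol-conversion layer inside the interaction
# server; both wire formats are assumptions made for this sketch.

def client_to_server(form_data):
    """Convert a client-side key=value form into server-side JSON."""
    pairs = dict(item.split("=", 1) for item in form_data.split("&"))
    return json.dumps(pairs, sort_keys=True)

def server_to_client(json_payload):
    """Convert server-side JSON back into the client-side form."""
    pairs = json.loads(json_payload)
    return "&".join(f"{k}={v}" for k, v in sorted(pairs.items()))
```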
These as well as other features and advantages of the invention may be better appreciated by reading the following detailed description of preferred exemplary embodiments taken in conjunction with accompanying drawings of which:
The client 41 is a voice phone, for example an ISDN-phone or PSTN-phone (ISDN=Integrated Services Digital Network; PSTN=Public Switched Telephone Network). As indicated in
The client 42 is a data enabled phone, for example a GSM cell phone with GPRS capability (GSM=Global System for Mobile Communication; GPRS=General Packet Radio Service).
The client 43 is a portable computer connected with the communication network 1. The client 43 communicates, for example, via the TCP/IP protocol with the interaction server 2, wherein the TCP/IP communication is based, for example, on a wireless LAN or DSL connection (TCP=Transmission Control Protocol; IP=Internet Protocol; DSL=Digital Subscriber Line).
The client 44 is a smart phone, for example a UMTS phone with multi-modal inputting and outputting capabilities (UMTS=Universal Mobile Telecommunications System).
Each of the different clients 41 to 44 uses different communication protocols to communicate with the interaction server 2. Further, each of the different clients 41 to 44 provides a different set of modalities for the interaction with the respective user.
A modality describes the way in which information is presented from the client to the user or from the user to the client. For example, information may be submitted as a voice message, as written information on a screen, by an icon or a graphic displayed on a screen, by pressing a specific key of a key-pad, by entering a handwritten command, by a pen, by a mouse-pad, by a voice command, by a typed command word or by touching an icon on a touch pad.
Depending on the kind of client, a specific set of the modalities is supported by the respective client. For example, the smart phone 44 might use voice only, graphical only or operate in a multi-modal way, for example via HTML plus or flash with speech recognition, text to speech and handwriting recognition located in the interaction server 2 or in the client 44.
The interaction server 2 handles voice, graphic and multi-modal interaction for all the different clients 41 to 44. On the other side, the interaction server 2 interacts, for example via a multi-modal markup language, e.g. via HTML plus, SALT, or X+V, or via Flash, with the servers 31 to 33. Each of the servers 31 to 33 executes at least one server application which can be contacted by one of the clients 41 to 44 for performing a respective network service. When contacted by one of the clients 41 to 44, the server application interacts with the respective client 41 to 44 via the interaction server 2, which handles the different modalities of the clients 41 to 44 for the server application, ensures a consistent dialog structure and provides independence of the IP and TDM worlds (TDM=Time Division Multiplexing).
The interaction server 2 provides a common interface for service applications, for example IM, e-mail and presence. Further, it performs network translation and call control functions for server applications, for example for voice, video, gaming and conferencing.
The interaction server 2 controls the dialog between the various clients 41 to 44 and the various server applications. A set of dialog modality capabilities or dialog modality requirements of clients and/or server applications is provided to the interaction server. Based on this set of information, the interaction server mediates between the dialog modality capabilities and/or dialog modality requirements of a client and the respectively assigned server application, and selects the dialog modalities for this specific dialog between one specific client and one specific server application based on said mediation.
The interaction server 2 consists of a hardware platform, a software platform based on the hardware platform, and several application programs executed by the system platform formed by the software and hardware platforms. These application programs, or a selected part of them, constitute a computer software product providing the function of a modality handler as described in the following, when executed on the system platform. Further, such a computer software product is constituted by a storage medium storing these application programs or said selected part of the application programs.
From a functional point of view, the interaction server 2 comprises a modality handler 5 and a profile data base 57. The modality handler 5 contains a dialog controller 51 and a dialog engine 52. During the establishment of a session with a server application 91, the client 44 submits several information items 71 to the dialog controller 51, which are used by the dialog controller 51 to determine the dialog modality capabilities and/or dialog modality requirements of the client 44. Further, the server application 91 submits information items 72 to the dialog controller 51, which enable the dialog controller 51 to determine the dialog modality capabilities and/or dialog modality requirements of the selected server application 91.
For example, the information 71 contains an identifier of the client 44, an identifier of the user currently associated with the client 44, the client type, the user type, the client class or user class assigned to the client 44, user preferences, client preferences and/or client environmental data like the location of the client, the local temperature, lighting conditions and so on.
The information 72 contains, for example, a server application identifier, a service type, a service class associated with the server application, and/or service preferences.
Further, it is also possible that the information items 71 and 72 already contain dialog modality capability data and/or dialog modality requirement data assigned by the client 44 and the server application 91, respectively, to the intended dialog. Preferably, however, these data, or at least a part of them, are determined by the dialog controller 51 itself, for example by means of accessing the profile data base 57 or another data source depending on the submitted information items 71 and 72.
The profile data base 57 contains terminal, user, and/or service profiles of different clients, different users and different server applications. For example, the profile data base 57 contains a terminal profile for each different type or each different class of clients, for example one terminal profile for each of the different clients 41 to 44. Each of these terminal profiles comprises a set of specific data describing the terminal presentation capabilities and terminal preferences of the respective type or class of terminals. For example, it specifies the respective input and output devices of these clients and the respective associated hardware and software support as well as details of these input and output devices influencing the presentation or the inputting of data (for example size, resolution and color capabilities of a screen). Further, the profile data base 57 contains, for example, a set of user preferences which might be personalized by each user by online access to the interaction server 2. Further, the profile data base 57 contains modality requirement data and modality capability data describing the service presentation/interaction capabilities and requirements of different server applications.
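The three profile categories held by the profile data base 57 can be pictured as a keyed store. The in-memory dictionary below is a stand-in for a real database; all keys and profile contents are invented for illustration:

```python
# Hypothetical in-memory stand-in for the profile data base 57:
# one sub-store each for terminal, user and service profiles.
PROFILE_DB = {
    "terminal": {
        "smart_phone": {"output": ["screen", "speaker"],
                        "screen": {"size_px": (320, 240), "colors": 65536}},
        "voice_phone": {"output": ["speaker"]},
    },
    "user": {
        "user-1": {"preferred_modalities": ["voice"]},
    },
    "service": {
        "service-91": {"required_modalities": ["voice", "graphics"]},
    },
}

def lookup(kind, key):
    """Fetch a profile by category and key; a local lookup like this
    avoids re-sending profile data over the network for each dialog."""
    return PROFILE_DB[kind].get(key, {})
```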
As already indicated above, all these modality capability and modality requirement data may be provided by the profile data base 57, by the client and/or the server application linked in a dialog, and/or by a further data base accessed by the dialog controller 51.
The dialog controller 51 considers the available information, i.e. the determined dialog modality capability and/or dialog modality requirement data of the server application 91 and the client 44, as well as the client, user and/or server application preferences and the obtained environmental data, to mediate between the client 44 and the server application 91 and to come to a selection of dialog modalities for this dialog that meets the interests of the client 44 and the requirements and limitations set by the server application 91.
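One way to fold user preferences into this selection is to rank the commonly supported modalities by the user's stated order. This ranking scheme is an assumption made for the sketch; the invention does not prescribe a particular weighting:

```python
# Sketch of preference-weighted modality selection; the ranking
# scheme is an invented example, not a prescribed algorithm.

def select_modalities(client_caps, service_reqs, user_prefs):
    """Intersect capabilities with requirements, then order the
    result by the user's preference list (unranked ones last)."""
    candidates = set(client_caps) & set(service_reqs)

    def rank(modality):
        try:
            return user_prefs.index(modality)
        except ValueError:
            return len(user_prefs)  # unranked modalities come last

    return sorted(candidates, key=rank)

order = select_modalities(
    client_caps=["voice", "text", "graphics"],
    service_reqs=["voice", "graphics"],
    user_prefs=["graphics", "voice"],
)
# order == ["graphics", "voice"]
```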
In the following, the selected dialog modalities form the framework of the further interaction between the client 44 and the server application 91 as well as the framework for the interaction between the user and the service provided by means of the server application 91 and the client 44.
It is possible that the selected dialog modalities are submitted to the client 44 and/or the server application 91 to establish the further interaction based on these constraints. Preferably, however, these data are used by the modality handler 5 to adapt the modalities between the client 44 and the server application 91. Preferably, the modality handler 5 comprises a dialog engine 52 performing a modality and communication conversion or adaptation function between the client 44 and the server application 91. The dialog engine 52 receives information 74 from the dialog controller 51 containing a dialog modality specification describing the selected dialog modalities for the intended dialog. The dialog engine comprises dialog conversion and adaptation means 56 and communication protocol conversion and adaptation means 53 to 55 performing these functionalities.
The modality handler 5 provides dialog management including terminal adaptation and user profile handling. It accesses a multi-modal service script provided by the server application 91. For example, this multi-modal service script is encoded in XML (XML=Extensible Markup Language). The dialog building data base 7 comprises libraries providing functional sets of different service interactions as well as design or restriction parameters. The dialog engine of the modality handler 5 accesses the multi-modal service script of the server application 91, the dialog specification prepared by the dialog controller and the dialog building data base to create a dialog specification executed by the multi-modal backend 6. For example, the modality handler submits a dialog specification encoded in a multimodal markup language, e.g. HTML plus, SALT or X+V (HTML=Hyper Text Markup Language), to the multi-modal backend. In contrast to the multi-modal service script provided by the server application 91 to the modality handler 5, the dialog specification provided by the modality handler to the multi-modal backend 6 already respects the selected dialog modalities. The multi-modal backend comprises a terminal 61 and several multi-media processing units 62 supporting the work of the terminal 61. For example, the media processing units provide text-to-speech translation, handwriting recognition and speech recognition functionalities.
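The reduction of a multi-modal service script to a dialog specification that respects the selected modalities can be sketched as a per-step filter. The script format below is invented for illustration; a real script would be XML or a multimodal markup language as stated above:

```python
# Illustrative reduction of a multi-modal service script to a
# dialog specification; the script format is an assumption.

SERVICE_SCRIPT = [
    {"step": "greet", "voice": "Welcome!", "graphics": "<h1>Welcome</h1>"},
    {"step": "ask", "voice": "Say your name.", "graphics": "<input/>"},
]

def build_dialog_spec(script, selected_modalities):
    """Keep, per dialog step, only the renderings belonging to the
    modalities selected during mediation."""
    spec = []
    for step in script:
        kept = {m: step[m] for m in selected_modalities if m in step}
        spec.append({"step": step["step"], **kept})
    return spec
```

For a voice-only client, `build_dialog_spec(SERVICE_SCRIPT, ["voice"])` would drop every graphical rendering while preserving the dialog structure.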
In an analogous way as described above, it is also possible to provide a specific mono-mode service script to the modality handler 5, which then provides, by help of the dialog building data base 7, a dialog specification respecting the selected dialog modalities.
The client 47 is a voice only terminal. The interaction between this voice only terminal and the server application 91 is supported by the modality handler 5 and the voice browser 65 controlled by the modality handler.
The client 48 is a multi-modal terminal. The interaction between this terminal and the server application 91 is supported by the modality handler, the voice browser 65 and the multi-modal browser 66. For voice use, the modality handler controls the voice browser 65 which forms the server side backend. For multi-modal use, the interaction is performed through the multi-modal browser 66. For HTML-use, the modality handler directly interacts with the client 48.
The client 49 is a computer communicating with the interaction server 2 via HTML. In case of an interaction between the client 49 and the server application 91, the modality handler 5 directly interacts with the client 49.
In addition to the above demonstrated possibilities of client-server interaction, it is further possible that the modality handler controls a direct interaction between the server application 91 and one of the browser applications 65 and 66 as well as one of the clients 48 and 49, if the server application supports the specific modality and protocol. If, for example, the server application 91 supports VoiceXML, a direct interaction between the server application 91 and the browser application 65 is possible. Further, a direct interaction between the server application 91 and the HTML terminal provided by the client 49 is possible, if the server application 91 supports HTML.
Number | Date | Country | Kind |
---|---|---|---|
03292201.5 | Sep 2003 | EP | regional |