Computing device user interfaces implemented as conversation flows are becoming increasingly common. Such user interfaces may be configured to accept a single user utterance as an input, or to conduct a multi-step conversational dialog flow in which the computer and a user exchange multiple queries and responses.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
Examples are disclosed that relate to a development environment for designing conversational user interfaces. One example provides a computing system comprising a logic subsystem and a data-holding subsystem. The data-holding subsystem comprises instructions executable by the logic subsystem to receive input defining a machine conversation dialog flow, display in an editing user interface a first representation of the machine conversation dialog flow in the form of a symbolic representation, receive input requesting display of a second representation of the machine conversation dialog flow, and in response to the request display in the editing user interface the machine conversation dialog flow in a character-based representation. The instructions are further executable to, based upon the machine conversation dialog flow, update a machine conversation schema template to form an updated machine conversation schema, and form an agent definition file based upon the updated machine conversation schema for use in executing the machine conversation dialog flow.
Another example provides a computing system comprising a logic subsystem and a data-holding subsystem comprising computer-readable instructions executable by the logic subsystem to operate an agent development environment. The agent development environment is configured to receive input defining a machine conversation dialog flow, display an editable representation of the machine conversation dialog flow in an editing field of a user interface of the agent development environment, receive an input selecting a state within the machine conversation dialog flow displayed in the editing field, and in response, display in a preview field of the user interface a preview of the state within the machine conversation dialog flow as the state would be presented during runtime. The agent development environment is further configured to, based upon the input defining the machine conversation dialog flow, update a machine conversation schema template to form an updated machine conversation schema, and form an agent definition file based upon the updated machine conversation schema for use in executing the machine conversation dialog flow.
Another example provides a computing system comprising a logic subsystem and a data-holding subsystem comprising computer-readable instructions executable by the logic subsystem. The instructions are executable to receive input defining a machine conversation dialog flow, based upon the input, modify a machine conversation schema template to form an agent definition, and receive an input requesting sharing of the agent definition, the input defining user accounts to which to grant access to the agent definition. The instructions are further executable to grant access to the machine conversation dialog flow to the defined user accounts, receive feedback regarding testing usage of the agent definition from the user accounts, receive an input requesting display of a representation of the feedback regarding testing usage of the agent definition from the user accounts, and in response, display a representation of the machine conversation dialog flow and the representation of the feedback at one or more locations within the representation of the machine conversation dialog flow.
As mentioned above, a computing device user interface may be implemented in the form of a conversational dialog flow conducted via speech or text. Conversation-based user interfaces may be used to perform a wide variety of tasks. For example, a digital personal assistant, which may take the form of a software module on a mobile phone or desktop computer, may utilize conversation-based user inputs to book a reservation at a restaurant, order food, set a calendar reminder, and/or order movie tickets, given a suitable conversation design. Likewise, bots may be used as conversational interfaces for carrying out many types of transactions.
A conversation-based user interface may be implemented as a conversational agent definition, referred to herein as an agent definition, that defines the conversation flow between a user and a computing device. As different inputs and outputs may be provided for different types of computing devices and in different computing contexts, it may be difficult to efficiently adapt a conversation (e.g. for making a restaurant reservation) authored for one computing device context, such as mobile, to a different context, such as desktop, holographic, small screen, or audio-only. As such, a developer may have to develop a different agent for each desired context in which the developer wishes to use a particular conversation flow.
Accordingly, examples are disclosed herein that relate to an agent development environment configured to facilitate the development of conversational user interface flows and the adaptation of an authored conversational user interface flow across a variety of computing contexts. As one example, a developer creating a website for a delicatessen may author a machine conversation dialog flow for a website application that comprises states specifying, for example, the types of breads, meats, cheeses, and spreads a customer may choose to build a sandwich. The agent development environment may then facilitate the reuse of that conversation across multiple platforms, for example, as a bot conversation on the website of the delicatessen, an SMS-based ordering platform in which users could use text-messaging to respond to text prompts in order to build a sandwich, and/or as an audio-only conversational interface in which a visitor may speak with a bot equipped with language understanding and language generating modules (e.g. for use in an automobile context). The agent development environment further permits the developer to preview the user interfaces for each state in the conversation flow under development in each of a variety of device contexts, and also preview a runtime simulation of the conversation flow, without having to manually adapt the machine conversation for each desired context or code control logic for executing the runtime simulation.
Further, the disclosed examples also provide for the presentation of a conversation flow under development in different views, such that a user may choose to view a conversation flow under development in a symbolic view (e.g. as a flow diagram), in a character-based view (e.g. as a script that represents the conversation flow as text-based code), and/or in various other views. The disclosed examples further allow a conversation flow under development to be tested by a defined group of other users for the purpose of gathering testing usage feedback, and to present such testing usage feedback as markup to a displayed view of the conversation flow under development, illustrating various statistical information gathered from the testing process at each conversation state and/or transition.
When authoring a machine conversation dialog flow, a developer may specify various information for the flow, such as information regarding a domain, one or more intents associated with the domain, one or more slots for a domain-intent pair, one or more states for an intent, transitions between states, and response templates for the flow. A domain is a category which encompasses a group of functions for a module or tool, an intent is at least one action used to perform at least one function of a category of functions for an identified domain, and a slot is a specific value or set of values used for completing a specific action for a given domain-intent pair. For example, an “alarm time” slot may be specified for a “set an alarm” intent in the “alarm” domain.
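As an illustration of these relationships, the following minimal sketch represents a domain-intent-slot specification as Python data classes. The structure and names are illustrative assumptions only, not a format taken from the disclosed environment.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    """A specific value or set of values used to complete an action."""
    name: str
    values: list = field(default_factory=list)

@dataclass
class Intent:
    """At least one action used to perform a function of a domain's category."""
    name: str
    slots: list = field(default_factory=list)

@dataclass
class Domain:
    """A category encompassing a group of functions for a module or tool."""
    name: str
    intents: list = field(default_factory=list)

# The "alarm time" slot specified for the "set an alarm" intent in the
# "alarm" domain, per the example above.
alarm_domain = Domain(
    name="alarm",
    intents=[Intent(name="set an alarm", slots=[Slot(name="alarm time")])],
)
```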
After the dialog flow for performing the desired agent functionalities has been structured by a developer, the agent development environment updates a schema template using the information provided by the developer. The schema formed by updating the schema template, in combination with code to implement business logic of the conversation flow and potentially other control logic (e.g. transitional control) of the conversation flow, forms the agent definition. The schema template may comprise a document (e.g. implemented as XML or another type of computer-readable document) with code segments defining the states of a machine conversation dialog flow. Thus, the term "agent" may represent any suitable data/command structure which may be used to implement, via a runtime environment, a conversation flow associated with a device functionality. The code that implements business logic may be entered by a developer, for example, when building the agent definition. Code for controlling the conversation flow likewise may be entered by the developer, or may be provided by a runtime environment in which the agent is executed.
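Purely for illustration, a schema template of the kind described above might resemble the following sketch, in which an XML-like template held in a Python string is filled with developer-supplied values to form an updated schema. All element and attribute names here are invented; the disclosure specifies only that the template is a computer-readable document (e.g. XML) with code segments defining states.

```python
import xml.etree.ElementTree as ET

# Hypothetical template; element and attribute names are assumptions.
SCHEMA_TEMPLATE = """
<agent>
  <domain name="{domain}">
    <intent name="{intent}">
      <state id="start" prompt="{prompt}">
        <transition on="user_reply" to="done"/>
      </state>
      <state id="done" prompt="{confirmation}"/>
    </intent>
  </domain>
</agent>
"""

def update_template(domain, intent, prompt, confirmation):
    """Fill the template with developer-supplied values to form an updated schema."""
    filled = SCHEMA_TEMPLATE.format(domain=domain, intent=intent,
                                    prompt=prompt, confirmation=confirmation)
    return ET.fromstring(filled)  # check that the updated schema is well-formed

schema = update_template("alarm", "set an alarm",
                         "What time should the alarm be set for?",
                         "Your alarm is set.")
```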
The agent definition may be configured for a specific operating and/or device context (e.g. a bot, personal assistant, or other computing device interface), or may be configured to execute in multiple different contexts, e.g. by including in the schema input and output options for each of the multiple different contexts at relevant states in the conversation flow.
The agent development environment 102 may comprise suitable logic, circuitry, interfaces, and/or code, and may be operable to provide functionalities associated with agent definitions (including generating and editing such definitions), as explained herein. The agent development environment 102 may comprise an agent generator 128, a U/I design block 130, a schema template block 132, a response/flow design block 134, a language generation engine 136, a localization engine 138, and a feedback and in-line analytics module 140. The agent development environment 102 may include a visual editing tool, as described in more detail below, and/or any other suitable editing tools. In other examples, a development environment may capture an agent definition via a combination of different documents and views. As a more detailed example, a conversation flow may be captured in a first document, and the responses captured in a second document. Such a development environment may help streamline the authoring experience by bringing these separate documents together.
The schema template block 132 may be operable to provide a schema template, such as the template shown in the accompanying figures.
The U/I design block 130 may comprise suitable logic, circuitry, interfaces, and/or code (e.g. retrieved from U/I database 144), and may be operable to generate and provide to the agent generator 128 one or more user interfaces for use with the agent definition 142.
The response/flow design module 134 may comprise suitable logic, circuitry, interfaces, and/or code to provide one or more response strings for use by the agent generator 128. For example, response strings (and presentation modes for the response strings) may be selected from responses database 148. The language generation engine 136 may be used to generate one or more human-readable responses for use in connection with a given domain-intent-slot configuration (e.g., based on inputs 150, 152 and 154). The response/flow design module 134 may also provide the agent generator 128 with flow design in connection with a machine conversation dialog flow.
In an example, for an agent definition 142 generated by the agent generator 128, the selection of response strings and/or a presentation mode for such responses may be based upon the digital context chosen by the developer for the agent definition, as well as any number of other factors, such as a user's distance from a device, the user's posture, the noise level, current user activity, or the social environment around the user. Such context-dependent presentation is described below in more detail.
The agent generator 128 may receive input from a programming specification 156. For example, the programming specification 156 may specify a domain, one or more intents, and one or more slots, via inputs 150, 152 and 154 respectively. The agent generator 128 may also acquire the schema template 132 and generate an updated schema 104 based on, for example, user input received via the U/I design module 130. Response/flow input from the response/flow design module 134, as well as localization input from the localization engine 138, may be used by the agent generator 128 to further update the schema template 132 and generate the updated schema 104. An additional programming code segment 106 may also be generated (e.g. based upon user input of code) to implement and manage performing of one or more requested functions by a digital personal assistant, bot, and/or computing device (e.g. to implement business logic). The updated schema 104 and the programming code segment 106 may be combined to generate the agent definition 142. The agent definition 142 may then be output to a display 126 and/or stored in storage 158 in some operating contexts.
Runtime environment 118 comprises suitable logic, circuitry, interfaces, and/or code to execute a machine conversation dialog flow defined by an agent definition. The runtime environment 118 may be implemented as a portable library configured to interpret and execute state transitions of a machine conversation flow defined by the agent definition. Because the runtime environment implements the conversation runtime, a developer does not have to rewrite bot-specific execution code each time a different bot is created. This may simplify the development of agent definitions, and thus allow conversational interfaces to be developed more efficiently. Further, language understanding and language generation can be shared across agents to allow assets to be reused across a larger developer ecosystem. Runtime simulation 119 may be utilized by agent development environment 102 to provide runtime simulations for a machine conversation dialog flow under development.
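As a rough illustration of how such a portable runtime library might interpret state transitions, the following Python sketch walks a dialog flow from state to state. The flow format and function names are assumptions for illustration, not the disclosed library's interfaces; because input and output are passed in as callables, the same flow definition could drive a chat, SMS, or voice surface.

```python
def run_dialog(flow, get_input, send_output):
    """Interpret a dialog flow: emit each state's prompt, then follow a
    transition chosen from the user's reply until a terminal state is reached."""
    states = {s["id"]: s for s in flow["states"]}
    current = states[flow["initial"]]
    while True:
        send_output(current["prompt"])
        transitions = current.get("transitions")
        if not transitions:
            break  # no outgoing transitions: terminal state
        reply = get_input()
        # Follow the first transition whose condition matches the reply,
        # falling back to the first transition listed.
        matched = next((t for t in transitions if t.get("on") == reply), transitions[0])
        current = states[matched["to"]]

# A two-state flow; swapping get_input/send_output adapts it to other surfaces.
flow = {
    "initial": "ask_time",
    "states": [
        {"id": "ask_time", "prompt": "What time?", "transitions": [{"to": "confirm"}]},
        {"id": "confirm", "prompt": "Alarm set."},
    ],
}
run_dialog(flow, get_input=input, send_output=print)
```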
The feedback system 168 is configured to gather and present testing usage data from other users 110, 112 who test a machine conversation dialog flow under development. Testing usage metrics are gathered from users 110, 112 via telemetry and stored in a feedback database 108 after passing through a privacy filter 114 to remove personal information regarding the users. Example feedback is described in more detail below.
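A minimal sketch of such a privacy filter is shown below. The field names and event structure are assumptions, as the disclosure states only that personal information is removed before telemetry is stored.

```python
# The set of personal fields is an assumption; the disclosure states only
# that personal information is removed before storage.
PERSONAL_FIELDS = {"user_id", "user_name", "email", "location"}

feedback_db = []  # stands in for feedback database 108

def privacy_filter(event):
    """Drop personally identifying fields from a telemetry event (privacy filter 114)."""
    return {k: v for k, v in event.items() if k not in PERSONAL_FIELDS}

def record_usage(event):
    """Store a filtered testing-usage metric in the feedback database."""
    feedback_db.append(privacy_filter(event))

record_usage({"user_id": "tester-42", "state": "ask_time", "understood": False})
# feedback_db now holds [{"state": "ask_time", "understood": False}]
```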
The user interface 200 further comprises a taskbar 203 that illustrates event triggers and user triggers that a developer may create and/or modify. For example, a developer may create a user trigger to define the user voice commands configured to initiate the machine conversation in runtime. In some examples, a set of slots (e.g. date, location, time) may capture entities from the voice commands, and the captured entities may be stored as parameters that define the user trigger. Alternatively or additionally, a developer may create an event trigger to automatically initiate a task. For example, a developer may select an upcoming event as a trigger to initiate the machine conversation in runtime.
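The two trigger types might be represented along the following lines; the dictionary structure and field names here are hypothetical.

```python
# Hypothetical trigger records mirroring the description above.
user_trigger = {
    "type": "user",
    "voice_commands": ["book a table", "make a dinner reservation"],
    "slots": ["date", "location", "time"],  # entities captured as trigger parameters
}

event_trigger = {
    "type": "event",
    "event": "upcoming_calendar_event",  # automatically initiates the task
}

def should_start(trigger, signal):
    """Return True if a runtime signal (an utterance or event name) fires the trigger."""
    if trigger["type"] == "user":
        return signal in trigger["voice_commands"]
    return signal == trigger["event"]

assert should_start(user_trigger, "book a table")
assert should_start(event_trigger, "upcoming_calendar_event")
```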
Taskbar 203 also illustrates a hierarchical organization of the dialog flow under development. As illustrated, a machine conversation dialog flow may be organized into one or more dialog flow references, wherein each dialog flow reference may contain its own conversation sub-flow. Taskbar 203 may allow for efficient re-use of dialog flows within flow references in a larger dialog flow. For example, if a dialog flow that pertains to making a dinner reservation comprises “setTime” and “setDate” dialog flow references, the “setTime” and “setDate” dialog flow references may be re-used if the developer decides to create a plurality of states in which the conversation asks a user to input dates and times. As part of the dinner reservation example, the conversation flow may request user input of multiple dates and times for dinner reservations, in order of preference, should the first-preferred date and time not be available.
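To illustrate such reuse, the following sketch inlines hypothetical "setDate" and "setTime" dialog flow references at several points in a reservation flow. The expansion function and flow format are illustrative assumptions; a real expansion would also namespace duplicated state identifiers.

```python
# Each dialog flow reference names a sub-flow defined once and reused.
sub_flows = {
    "setDate": [{"id": "ask_date", "prompt": "Which date?"}],
    "setTime": [{"id": "ask_time", "prompt": "What time?"}],
}

# The dinner-reservation flow reuses setDate/setTime for a fallback preference.
reservation_flow = [
    {"ref": "setDate"},
    {"ref": "setTime"},
    {"ref": "setDate"},  # second-preference date, same reference reused
    {"ref": "setTime"},  # second-preference time
    {"id": "confirm", "prompt": "Requesting your reservation."},
]

def expand(flow, refs):
    """Inline each referenced sub-flow; a real expansion would also rename
    duplicated state ids to keep them unique."""
    expanded = []
    for step in flow:
        expanded.extend(refs[step["ref"]] if "ref" in step else [step])
    return expanded
```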
Further, a developer may select one or more digital contexts for the dialog flow, and separately define responses for each context, for example as shown at steps 705 and 710 of the accompanying figure.
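One way to picture separately defined responses per context is sketched below; the context names and the fallback rule are illustrative assumptions, not part of the disclosure.

```python
# Separately authored responses per digital context for one conversation state.
responses = {
    "ask_time": {
        "chat": "What time? (e.g. 7:00 PM)",
        "voice_only": "What time would you like the reservation?",
        "small_screen": "Time?",
    },
}

def render(state_id, context):
    """Pick the response authored for the active context, defaulting to chat."""
    per_state = responses[state_id]
    return per_state.get(context, per_state["chat"])

print(render("ask_time", "voice_only"))  # -> spoken-style prompt
```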
Once an agent definition has been completed, the agent definition may be installed and run on a developer's local machine for testing by including appropriate code-behind in the definition for interaction with the desired operating environment. Further, the agent development environment 102 is configured to allow the developer to share an agent definition with other defined user accounts, and provides feedback and in-line analytics functions 140 to allow the developer to view feedback from testing usage by other users.
The user interface also may include an Error List window 1050 that lists errors detected in the machine conversation dialog flow under development.
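Purely as a hypothetical illustration of the kind of check that could populate such an error list, the following sketch flags transitions to undefined states and unreachable states; the disclosure does not specify which errors are detected.

```python
def validate(flow):
    """Collect errors of the kind an error list might display."""
    state_ids = {s["id"] for s in flow["states"]}
    errors, targets = [], set()
    for s in flow["states"]:
        for t in s.get("transitions", []):
            targets.add(t["to"])
            if t["to"] not in state_ids:
                errors.append(f"state '{s['id']}' transitions to undefined state '{t['to']}'")
    for sid in state_ids - targets - {flow["initial"]}:
        errors.append(f"state '{sid}' is unreachable")
    return errors
```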
In some examples, to predict whether expected user utterances will be identified and understood, a developer may train and test a language understanding model.
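The shape of such a train-then-test workflow is sketched below with a deliberately trivial bag-of-words scorer; a production language understanding service would replace this, and the sample utterances and intent names are invented for illustration.

```python
from collections import Counter

def train(examples):
    """Build a trivial bag-of-words profile per intent from (utterance, intent) pairs."""
    model = {}
    for utterance, intent in examples:
        model.setdefault(intent, Counter()).update(utterance.lower().split())
    return model

def predict(model, utterance):
    """Score each intent by word overlap and return the best match."""
    words = utterance.lower().split()
    return max(model, key=lambda intent: sum(model[intent][w] for w in words))

model = train([
    ("set an alarm for seven", "set_alarm"),
    ("wake me up at seven", "set_alarm"),
    ("book a table for two", "make_reservation"),
    ("reserve dinner tonight", "make_reservation"),
])

# Test with held-out utterances before assigning the model to a dialog flow.
assert predict(model, "please set an alarm") == "set_alarm"
```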
The agent development tool is further configured to allow a language understanding model to be assigned to a dialog flow.
As mentioned above, a preview of the selected state may be displayed within the machine conversation dialog flow as adapted to a type of device selected in the user interface.
In some examples, the methods and processes described herein may be tied to a computing system of one or more computing devices. In particular, such methods and processes may be implemented as a computer-application program or service, an application-programming interface (API), a library, and/or other computer-program product.
Computing system 1800 includes a logic subsystem 1802 and a storage subsystem 1804. Computing system 1800 may optionally include a display subsystem 1806, input subsystem 1808, communication subsystem 1810, and/or other components not shown.
Logic subsystem 1802 includes one or more physical devices configured to execute instructions. For example, the logic subsystem may be configured to execute instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, achieve a technical effect, or otherwise arrive at a desired result.
The logic subsystem may include one or more processors configured to execute software instructions. Additionally or alternatively, the logic subsystem may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. Processors of the logic subsystem may be single-core or multi-core, and the instructions executed thereon may be configured for sequential, parallel, and/or distributed processing. Individual components of the logic subsystem optionally may be distributed among two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of the logic subsystem may be virtualized and executed by remotely accessible, networked computing devices configured in a cloud-computing configuration.
Storage subsystem 1804 includes one or more physical devices configured to hold instructions executable by the logic subsystem to implement the methods and processes described herein. When such methods and processes are implemented, the state of storage subsystem 1804 may be transformed—e.g., to hold different data.
Storage subsystem 1804 may include removable and/or built-in devices. Storage subsystem 1804 may include optical memory (e.g., CD, DVD, HD-DVD, Blu-Ray Disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. Storage subsystem 1804 may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content-addressable devices.
It will be appreciated that storage subsystem 1804 includes one or more physical devices. However, aspects of the instructions described herein alternatively may be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.) that is not held by a physical device for a finite duration.
Aspects of logic subsystem 1802 and storage subsystem 1804 may be integrated together into one or more hardware-logic components. Such hardware-logic components may include field-programmable gate arrays (FPGAs), program- and application-specific integrated circuits (PASIC/ASICs), program- and application-specific standard products (PSSP/ASSPs), system-on-a-chip (SOC), and complex programmable logic devices (CPLDs), for example.
The terms “module,” “program,” and “engine” may be used to describe an aspect of computing system 1800 implemented to perform a particular function. In some cases, a module, program, or engine may be instantiated via logic subsystem 1802 executing instructions held by storage subsystem 1804. It will be understood that different modules, programs, and/or engines may be instantiated from the same application, service, code block, object, library, routine, API, function, etc. Likewise, the same module, program, and/or engine may be instantiated by different applications, services, code blocks, objects, routines, APIs, functions, etc. The terms “module,” “program,” and “engine” may encompass individual or groups of executable files, data files, libraries, drivers, scripts, database records, etc.
It will be appreciated that a “service”, as used herein, is an application program executable across multiple user sessions. A service may be available to one or more system components, programs, and/or other services. In some implementations, a service may run on one or more server-computing devices.
When included, display subsystem 1806 may be used to present a visual representation of data held by storage subsystem 1804. This visual representation may take the form of a graphical user interface (GUI). As the herein-described methods and processes change the data held by the storage subsystem, and thus transform the state of the storage subsystem, the state of display subsystem 1806 may likewise be transformed to visually represent changes in the underlying data. Display subsystem 1806 may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with logic subsystem 1802 and/or storage subsystem 1804 in a shared enclosure, or such display devices may be peripheral display devices.
When included, input subsystem 1808 may comprise or interface with one or more user-input devices such as a keyboard, mouse, touch screen, or game controller. In some embodiments, the input subsystem may comprise or interface with selected natural user input (NUI) componentry. Such componentry may be integrated or peripheral, and the transduction and/or processing of input actions may be handled on- or off-board. Example NUI componentry may include a microphone for speech and/or voice recognition; an infrared, color, stereoscopic, and/or depth camera for machine vision and/or gesture recognition; a head tracker, eye tracker, accelerometer, and/or gyroscope for motion detection and/or intent recognition; as well as electric-field sensing componentry for assessing brain activity.
When included, communication subsystem 1810 may be configured to communicatively couple computing system 1800 with one or more other computing devices. Communication subsystem 1810 may include wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, the communication subsystem may be configured for communication via a wireless telephone network, or a wired or wireless local- or wide-area network. In some embodiments, the communication subsystem may allow computing system 1800 to send and/or receive messages to and/or from other devices via a network such as the Internet.
Another example provides a computing system comprising a logic subsystem and a data-holding subsystem comprising computer-readable instructions executable by the logic subsystem to receive input defining a machine conversation dialog flow, display in an editing user interface a first representation of the machine conversation dialog flow in the form of a symbolic representation, receive input requesting display of a second representation of the machine conversation dialog flow, in response to the request, display in the editing user interface the machine conversation dialog flow in a character-based representation, and based upon the machine conversation dialog flow, update a machine conversation schema template to form an updated machine conversation schema, and form an agent definition file based upon the updated machine conversation schema for use in executing the machine conversation dialog flow. The instructions may be additionally or alternatively executable to display in the editing user interface the symbolic representation as a flow diagram comprising states of the machine conversation in symbol form, and the character-based representation as a script view comprising states of the machine conversation in character form. The instructions may be additionally or alternatively executable to selectively display the machine conversation in the editing user interface via one or more views other than the flow diagram view and the script view. The instructions may be additionally or alternatively executable to receive, via the editing user interface, user inputs adding additional states to the machine conversation dialog flow via the symbolic representation and also via the character-based representation. The instructions may be additionally or alternatively executable to receive an input selecting a selected flow diagram symbol in the symbolic representation of the machine conversation dialog flow, and in response, display in the editing user interface editable code that is executable at a state in the machine conversation dialog flow represented by the selected flow diagram symbol. The instructions may be additionally or alternatively executable to receive an input selecting a selected state in a currently displayed representation of the machine conversation dialog flow, and to display a preview representing an appearance of a runtime user interface at the selected state in a preview field of the editing user interface. The instructions may be additionally or alternatively executable to receive a user input comprising one or more of a speech input and a text input to further define a selected state within the machine conversation dialog flow. The instructions may be additionally or alternatively executable to receive user inputs of a plurality of different types of triggers configured to initiate the machine conversation in runtime, the plurality of different types of triggers comprising a user input-based trigger type and an event-based trigger type. The instructions may be additionally or alternatively executable to display a user interface configured to permit adaptation of the machine conversation dialog flow to a plurality of different device contexts, and to receive inputs of runtime user interface presentation settings for each of the different types of device contexts.
Another example provides a computing system comprising a logic subsystem and a data-holding subsystem comprising computer-readable instructions executable by the logic subsystem to operate an agent development environment configured to receive input defining a machine conversation dialog flow, display an editable representation of the machine conversation dialog flow in an editing field of a user interface of the agent development environment, receive an input selecting a state within the machine conversation dialog flow displayed in the editing field, in response, display in a preview field of the user interface a preview of the state within the machine conversation dialog flow as the state would be presented during runtime, and based upon the input defining the machine conversation dialog flow, update a machine conversation schema template to form an updated machine conversation schema, and form an agent definition file based upon the updated machine conversation schema for use in executing the machine conversation dialog flow. The instructions may be additionally or alternatively executable to display the preview of the selected state within the machine conversation dialog flow as adapted to a type of device selected in the user interface. The instructions may be additionally or alternatively executable to display in the user interface a list of selectable hardware specifications for the type of device, receive input of a selected hardware specification, and display the preview of the selected state as the selected state would be presented on a computing device having the selected hardware specification. The preview may additionally or alternatively comprise visual elements obtained for the preview from outside of an operating environment of the agent development environment.
Another example provides a computing system, comprising a logic subsystem and a data-holding subsystem comprising computer-readable instructions executable by the logic subsystem to receive input defining a machine conversation dialog flow, based upon the input, modify a machine conversation schema template to form an agent definition, receive an input requesting sharing of the agent definition, the input defining user accounts to which to grant access to the agent definition, grant access to the machine conversation dialog flow to the defined user accounts, receive feedback regarding testing usage of the agent definition from the user accounts, receive an input requesting display of a representation of the feedback regarding testing usage of the agent definition from the user accounts, and in response, display a representation of the machine conversation dialog flow and the representation of the feedback at one or more locations within the representation of the machine conversation dialog flow. The instructions may be additionally or alternatively executable to display the feedback by displaying a representation of a number of times a pathway between states was followed. The instructions may be additionally or alternatively executable to display the feedback by displaying a representation of a number of dropouts of user accounts at each of a plurality of states of the machine conversation dialog flow. The instructions may be additionally or alternatively executable to display the feedback by displaying a representation of a number of successful language understandings at a location of the machine conversation dialog flow, wherein each successful language understanding represents an instance of the machine conversation dialog flow recognizing a user input, and displaying a representation of a number of unsuccessful language understandings at the location of the machine conversation dialog flow, wherein each unsuccessful language understanding represents an instance of the machine conversation dialog flow not recognizing the user input. The instructions may be additionally or alternatively executable to receive an input via the editing user interface selecting a selected state in the machine conversation dialog flow, and in response provide an output of a user input received during testing usage that was not understood at the selected state. The instructions may be additionally or alternatively executable to modify a language understanding model such that an unsuccessful user query, input, or utterance is understood. Receiving feedback regarding testing usage of the agent definition from the user accounts may additionally or alternatively comprise using telemetry to collect information on how the conversation flow is used.
It will be understood that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Likewise, the order of the above-described processes may be changed.
The subject matter of the present disclosure includes all novel and non-obvious combinations and sub-combinations of the various processes, systems and configurations, and other features, functions, acts, and/or properties disclosed herein, as well as any and all equivalents thereof.
This application is a continuation of U.S. patent application Ser. No. 15/636,503, filed Jun. 28, 2017, which claims priority to U.S. Provisional Patent Application No. 62/418,068, filed Nov. 4, 2016, the entirety of each of which is hereby incorporated herein by reference.
Related U.S. Application Data: Provisional application No. 62/418,068, filed Nov. 4, 2016 (US). Parent application Ser. No. 15/636,503, filed Jun. 28, 2017 (US); child application Ser. No. 16/949,430 (US).