A digital assistant (DA) is a software agent that performs tasks or services for a user interacting with a device. Typically, the tasks or services are based on user input and the user's location. For example, the user may query the digital assistant for a list of restaurants near the user's current location. The digital assistant can then access information from online resources to provide a list of restaurants in close proximity to the user. The user may then review the list, select one of the restaurants, or initiate another query.
However, today's digital assistants are personalized only with respect to the current state of the user; that is, they leverage the user's current location, the current time of day, calendar information, and the like. The digital assistant has no digital memory of past actions such as the user's queries, the results provided, or the online resources accessed in response to a particular query. In particular, the digital assistant does not remember how the user interacted with the digital assistant's response to a query. What is needed is a way to structure and store information so that it may be accessed at a later time without having to reinitiate the same query and without requiring the user to repeat past interactions or remember the steps previously taken regarding the same or similar query results.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description section. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.
According to one aspect disclosed herein, a method is presented for personalizing a user's digital assistant. The method disclosed herein includes accessing the digital assistant via a device, receiving a first input from the user via the digital assistant and performing a task via the digital assistant in response to the first input. The method also includes defining and storing a first session linking the first input from the user with the completed task performed via the digital assistant and generating a knowledge base of information associated with a plurality of sessions. The method may also include retrieving information about the first session from the knowledge base in response to subsequently receiving a second input from the user.
According to another aspect disclosed herein, a system is presented for enhancing a digital assistant. The system disclosed herein includes at least one processor and an operating environment executing using the at least one processor to perform actions including receiving a first input from the user via a digital assistant, performing a task via the digital assistant in response to the user input, defining and storing a session linking the first input from the user with the completed task performed via the digital assistant, and generating a knowledge base of information based on a plurality of sessions. The system may also perform actions including retrieving information about the session from the knowledge base in response to subsequently receiving a second input from the user referencing information associated with the session of the first input and generating a recommendation to the user based on information in the knowledge base.
According to yet another aspect disclosed herein, a computer-readable storage medium including instructions for enhancing a digital assistant on a user's device is disclosed. The instructions executed by a processor include accessing the digital assistant via a device, receiving a first query as input from the user via the digital assistant and, in response to the user's first query, generating search results via the digital assistant accessing an application on the device or accessing an online resource. The instructions also include defining and storing a session linking the first query from the user with the search results generated via the digital assistant, generating a knowledge base of information based on a plurality of sessions, and retrieving information about the session from the knowledge base in response to subsequently receiving a second query as input from the user, wherein the second query references the first query.
Examples are implemented as a computer process, a computing system, or as a computer program product for one or more computers. According to an aspect, the computer program product is a server of a computer system having a computer program comprising instructions for executing a computer process.
The details of one or more aspects are set forth in the accompanying drawings and description below. Other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that the following detailed description is explanatory only and is not restrictive of the claims.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate various aspects. In the drawings:
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar elements. While examples may be described, modifications, adaptations, and other implementations are possible. For example, substitutions, additions, or modifications may be made to the elements illustrated in the drawings, and the methods described herein may be modified by substituting, reordering, or adding stages to the disclosed methods. Accordingly, the following detailed description is not limiting, but instead, the proper scope is defined by the appended claims. Examples may take the form of a hardware implementation, or an entirely software implementation, or an implementation combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
System 100 may also have a database 104 for storing a variety of information. The network 108 facilitates communication between devices, such as computing device 102, database 104 and server 106. The network 108 may include the Internet and/or any other type of local or wide area networks. Communication between devices allows for the exchange of data and files such as answers to indirect questions, information associated with digital profiles, and other information.
The digital assistant 112 of
The digital assistant 212 of
While
The digital assistant 112, 212 may be accessed by a user to receive input such as a query. For example, the user queries the digital assistant 112, 212 for a list of restaurants that are nearby. The digital assistant may then access an application on the device or resources external to the device to perform the task of generating results such as a list of restaurants nearby. In one or more embodiments, the digital assistant accesses an online source in order to perform a task. For example, the user's digital assistant can access a website of one or more vendors over the Internet.
After the digital assistant generates the search results, the user can then select one of the restaurants to call and place an order or drive to the location of one of the restaurants. However, after a period of time such as a few days, weeks or longer, the user may not remember the name of the restaurant or its location. The user must then initiate the same query again, and the digital assistant must again provide the same or similar results.
Thus, the digital assistant can recall neither the list of results that were provided in response to the user's previous queries nor which search result the user previously selected. Moreover, although the devices 102 or the server 206 may include a digital profile 110, 210, these profiles 110, 210 traditionally do not include information associated with past queries or past tasks performed by the digital assistants 112, 212. Moreover, the profiles 110, 210 do not include information about the user's interaction with the device or the digital assistant, or information provided as a result of accessing applications of the device and external resources.
In one or more embodiments, a knowledge base could be generated that includes information about the user's past inputs to the digital assistant, the tasks the digital assistant then performs, and the user's interactions. For example, a particular query from the user could be linked to a particular task performed by the digital assistant. Also, a user input and a corresponding task completed or performed by the digital assistant could together define a session that is stored in a knowledge base that is accessible to the digital assistant. The knowledge base could store any number of sessions in order to allow the digital assistant to retrieve information about any particular session. For example, a (second) input from a user received by the digital assistant could reference or refer to a previous (first) input. In particular, sometime after a first query and a task performed by the digital assistant of a first session, the digital assistant receives a second query referencing information about the first session such as information related to the first query or the corresponding completed task. The digital assistant can then retrieve from the knowledge base the information about the first session such as the first query, search results provided in response to the first query and/or the user's interactions with the digital assistant that occurred in response to receiving the particular search results to the first query.
In one or more embodiments, each session may be assigned an identifier (ID) which may be utilized by the digital assistant to retrieve information associated with a particular session. In such a case, defining a first session includes assigning an ID to the first session, and retrieving information about the first session from the knowledge base includes utilizing the ID of the first session to retrieve the information associated with the first session. In one or more embodiments, the ID may be part of an RTF structure or other structure having an ID.
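By way of illustration only, and not limitation, the session and ID mechanism described above might be sketched as follows. The class names, the in-memory dictionary, the use of a random hexadecimal ID, and the restaurant names are all assumptions made for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import uuid


@dataclass
class Session:
    """One user input linked to the task the assistant completed for it."""
    user_input: str
    task_result: List[str]
    selected: Optional[str] = None  # which result the user acted on, if any
    # Hypothetical ID scheme: a random hex string assigned at creation.
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class KnowledgeBase:
    """Hypothetical in-memory knowledge base keyed by session ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def store(self, session: Session) -> str:
        """Store a session and return the ID under which it was stored."""
        self._sessions[session.session_id] = session
        return session.session_id

    def retrieve(self, session_id: str) -> Optional[Session]:
        """Return the session for an ID, or None if the ID is unknown."""
        return self._sessions.get(session_id)


kb = KnowledgeBase()
first = Session(
    user_input="mexican restaurants near me",
    task_result=["Casa Azul", "El Toro"],  # made-up search results
    selected="El Toro",
)
first_id = kb.store(first)
recalled = kb.retrieve(first_id)
```

In this sketch, a later query that references the first session would resolve to `first_id`, and `recalled.selected` would tell the digital assistant which result the user previously chose.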
Referring again to the runtime flow, the candidate generator of block 320 makes a call to the knowledge base 340, which returns the candidate results to the candidate generator 320. The candidate results are the past conversations of one or more sessions that could be used in the current session based on the current user query. The candidate results are sent to the machine learning model 350. The machine learning model 350 is trained with the training data 352 via the learning algorithm 354 as understood by those skilled in the art of machine learning models. The machine learning model 350 ranks the results 356 and returns the top result. Thus, the best response retrieved from the knowledge base 340 is provided in response to the user's current query. In one or more embodiments, the model 350 is accessed and trained to identify from the knowledge base 340 the most likely task to perform in response to the user's input.
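By way of illustration only, the candidate generation and ranking flow described above might be sketched as follows. A simple word-overlap (Jaccard) score stands in for the trained machine learning model 350; the disclosure does not specify the model, so this scoring function, the function names, and the sample sessions are assumptions of the sketch.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical past sessions: (past query, response the assistant gave).
PAST_SESSIONS: List[Tuple[str, str]] = [
    ("mexican restaurants near me", "El Toro"),
    ("remind me about buying milk", "Reminder set"),
    ("pizza delivery near me", "Pizza Place"),
]


def tokens(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


def generate_candidates(query: str) -> List[Tuple[str, str]]:
    """Return past sessions sharing at least one word with the query."""
    q = tokens(query)
    return [s for s in PAST_SESSIONS if q & tokens(s[0])]


def score(query: str, past_query: str) -> float:
    """Jaccard overlap, a stand-in for a trained model's relevance score."""
    q, p = tokens(query), tokens(past_query)
    return len(q & p) / len(q | p)


def top_result(query: str) -> Optional[Tuple[str, str]]:
    """Rank the candidates and return the best one, or None if none match."""
    candidates = generate_candidates(query)
    if not candidates:
        return None
    return max(candidates, key=lambda s: score(query, s[0]))


best = top_result("which mexican restaurants did I pick")
```

A production system would presumably replace `score` with the learned ranker; the split between a cheap candidate generator and a separate ranking step is the only structure this sketch takes from the description.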
The machine learning models of
Processing of information regarding a user by machine learning model 350 includes the machine learning model 350 processing responses to indirect questions to determine what information may be inferred from the response. For example, the machine learning model 350 may have access to information that associates responses to indirect questions. The machine learning model 350 may use statistical modeling to make a prediction based on a user's response to an indirect query.
In addition to determining from the knowledge base 340 which task to perform in response to a user's input/query, inferences may also be determined from the knowledge base 340 without user input/query. Within the knowledge base 340, the inferences are linked to information such as personal preferences, likes or dislikes, typical user interactions, and actions taken in the past.
A software application such as a bot may run automated tasks or scripts which can retrieve information. In at least one embodiment, an alter ego bot of a user can perform an action on the user's behalf without requiring an actual conversation. In other words, a bot could be the user's digital assistant. The bot can also utilize information of the sessions from the knowledge base and the shadow profile in order to perform a task on the user's behalf. For example, when the digital assistant is placing an order in response to a user input, a query may be received from a live person or from a vendor's bot. The query from the bot may be something like “Do you want any toppings on that pizza?” In such a case, the response to the vendor's bot derived from the knowledge base or the shadow profile could be, for example, “No, I never add toppings to the pizza.” Also, for example, the vendor's bot could ask “Where do you want it delivered?” The response derived from either the knowledge base or the shadow profile could then be the user's physical address. The bot also could add to the information of the knowledge base or the shadow profile. Thus, the bot or digital assistant captures, retrieves and reasons over previous actions and information.
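By way of illustration only, the exchange with a vendor's bot described above might be sketched as follows. The shadow-profile keys, the keyword routing table, and the function name are hypothetical, and the keyword match is a deliberately naive stand-in for whatever language understanding an actual embodiment would use.

```python
from typing import Dict, Optional

# Hypothetical shadow-profile facts, as might be learned from past sessions.
SHADOW_PROFILE: Dict[str, str] = {
    "pizza.toppings": "No, I never add toppings to the pizza.",
    "delivery.address": "<user's stored address>",  # placeholder value
}

# Hypothetical routing from a keyword in the vendor's question to a fact.
KEYWORD_TO_FACT: Dict[str, str] = {
    "toppings": "pizza.toppings",
    "delivered": "delivery.address",
}


def answer_vendor_question(question: str) -> Optional[str]:
    """Answer a vendor bot's question from the profile, if a fact applies."""
    text = question.lower()
    for keyword, fact_key in KEYWORD_TO_FACT.items():
        if keyword in text:
            return SHADOW_PROFILE.get(fact_key)
    return None  # no known answer; the assistant would defer to the user


reply = answer_vendor_question("Do you want any toppings on that pizza?")
```

Returning `None` when no fact applies lets the digital assistant fall back to asking the user rather than guessing on the user's behalf.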
The shadow profile 362 may be the same as or part of the digital profiles 110, 210 of
The knowledge base 310 may be modeled as a set of assertions. Each assertion a is a triple of the form {e_sbj^id, p, e_obj^id}, where p denotes a predicate, and e_sbj^id and e_obj^id denote the subject and the object entities of a, each with a unique ID (id). The task completion platform (TCP) runtime task frame is used for instantiating the assertion triples. The task frame parameters P represent the predicates, each task frame instance serves as the subject entity e_sbj^id, and the resolved parameter values correspond to the object entity e_obj^id. Aspects of this were a focus of U.S. patent application Ser. No. 14/704,564, filed May 5, 2015, entitled “Building Multimodal Collaborative Dialogs with Task Frames,” and U.S. patent application Ser. No. 14/797,444, filed Jul. 13, 2015, entitled “Task State Tracking in Systems and Services” which are incorporated herein by reference in their entireties.
As an example consider a user task to “remind me about buying milk at 10 am on Friday” with a simplified task frame as:
This results in knowledge base assertions as:
a1: {e1, Reminder, e}
a2: {e, Reminder.ReminderText, e}
a3: {e, Reminder.TimeTrigger, e}
a4: {e, Reminder.FinalAction, e}
a5: {e, Entity.Value, “buymilk”}
a6: {e, Entity.Type, “string”}
a7: {e, Entity.State, “Resolved”}
a8: {e, Entity.Value, “20160101T-530”}
. . .
. . .
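By way of illustration only, flattening a task frame instance into assertion triples of the kind listed above might be sketched as follows. The frame contents, the sequential entity IDs (e1, e2, and so on), and the function name are hypothetical; they are not a reconstruction of the task frame or of the entity IDs of the example above.

```python
from itertools import count
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, predicate, object)


def frame_to_assertions(task: str, params: Dict[str, Dict[str, str]]) -> List[Triple]:
    """Flatten one task-frame instance into (subject, predicate, object) triples."""
    ids = count(1)

    def new_id() -> str:
        # Hypothetical ID scheme: e1, e2, e3, ... in order of creation.
        return f"e{next(ids)}"

    root, frame = new_id(), new_id()
    triples: List[Triple] = [(root, task, frame)]
    for name, attrs in params.items():
        entity = new_id()
        # The frame parameter becomes the predicate linking frame to entity.
        triples.append((frame, f"{task}.{name}", entity))
        # Each resolved attribute of the parameter hangs off the entity.
        for attr, value in attrs.items():
            triples.append((entity, f"Entity.{attr}", value))
    return triples


# Made-up frame for a reminder task; values are illustrative only.
reminder_frame = {
    "ReminderText": {"Value": "buy milk", "Type": "string", "State": "Resolved"},
    "TimeTrigger": {"Value": "Friday 10 am", "Type": "datetime", "State": "Resolved"},
}
assertions = frame_to_assertions("Reminder", reminder_frame)
```

Each parameter contributes one linking triple plus one triple per attribute, so the two-parameter frame in this sketch yields nine assertions in total.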
Search queries which are not handled by the TCP platform are transformed to assertions with predicate as “Query.Search” and object as the user search query.
a11: {e, Query.Search, e}
a12: {e, Entity.Value, “mexican restaurants near me”}
a13: {e, Domain, “restaurant”}
a14: {e, Query.Search.ClickedResult, e}
a15: {e, Local.Cuisine, “mexican”}
. . .
. . .
This approach builds a true semantic graph over user tasks and search queries connecting the user's past conversational actions with the semantic web. The graph formalism allows the platform to embed meta information like query time and location seamlessly.
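By way of illustration only, traversing such a graph to recall which result the user clicked for a past search might be sketched as follows. The entity IDs, the `Meta.QueryTime` predicate used to show embedded meta information, and the timestamp value are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

Triple = Tuple[str, str, str]
Index = Dict[str, Dict[str, List[str]]]

# Hypothetical assertions; entity IDs and the Meta.* predicate are illustrative.
ASSERTIONS: List[Triple] = [
    ("s1", "Query.Search", "q1"),
    ("q1", "Entity.Value", "mexican restaurants near me"),
    ("q1", "Domain", "restaurant"),
    ("q1", "Query.Search.ClickedResult", "r1"),
    ("r1", "Local.Cuisine", "mexican"),
    ("q1", "Meta.QueryTime", "2016-01-01T10:00"),  # made-up meta information
]


def build_index(triples: List[Triple]) -> Index:
    """Index triples as subject -> predicate -> list of objects."""
    index: DefaultDict[str, DefaultDict[str, List[str]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def clicked_results(index: Index, domain: str) -> List[Tuple[str, str]]:
    """Return (query text, clicked entity) for past searches in a domain."""
    found = []
    for preds in index.values():
        if domain in preds.get("Domain", []):
            for clicked in preds.get("Query.Search.ClickedResult", []):
                for text in preds.get("Entity.Value", []):
                    found.append((text, clicked))
    return found


index = build_index(ASSERTIONS)
hits = clicked_results(index, "restaurant")
```

Because meta information is stored as ordinary triples on the same entity, a filter on query time or location would only need one more predicate lookup in `clicked_results`.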
The knowledge base 340 is sampled to build the training set of data 352, and the machine learning model 350 retrieves the relevant information/response. Tools targeting question answering problems are available in the literature, ranging from information-retrieval-based approaches in the recent past to layered, end-to-end trainable artificial neural networks more recently.
The process 400 may also include the process block 460 for retrieving information about the first session from the knowledge base in response to subsequently receiving a second input from the user. For example, in response to a second query received from the user during a second session, the digital assistant may retrieve or recall information from the knowledge base about a prior first session linking a first query with the digital assistant's corresponding completed task. The process 400 may also include the process block 470 for generating a recommendation via the digital assistant to the user based on information in the knowledge base. Generating recommendations to the user based on information in the knowledge base could occur without prompting by the user, in response to the digital assistant learning information from the knowledge base or the shadow profile. Generating recommendations to the user based on information in the knowledge base or the shadow profile could also occur in response to the digital assistant determining a current status of the user such as the user's current location and/or a particular time. For example, the digital assistant could learn from the knowledge base or the shadow profile that the user previously ordered or dined at a particular location in close proximity to the user's current location. It is to be understood that additional operations may be performed between the process steps described here or in addition to those steps.
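By way of illustration only, a location-triggered recommendation of the kind described for process block 470 might be sketched as follows. The place names, coordinates, one-kilometer radius, and function names are assumptions of the sketch; the haversine formula is used here simply as one common way to estimate distance between two latitude/longitude points.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical places from past sessions: (name, latitude, longitude).
PAST_PLACES: List[Tuple[str, float, float]] = [
    ("El Toro", 47.6101, -122.2015),
    ("Pizza Place", 47.7000, -122.3000),
]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def recommend(lat: float, lon: float, radius_km: float = 1.0) -> Optional[str]:
    """Return the nearest previously visited place within the radius, if any."""
    nearby = [
        (haversine_km(lat, lon, place_lat, place_lon), name)
        for name, place_lat, place_lon in PAST_PLACES
    ]
    nearby = [entry for entry in nearby if entry[0] <= radius_km]
    if not nearby:
        return None
    return min(nearby)[1]  # smallest distance wins


suggestion = recommend(47.6105, -122.2010)
```

Here the recommendation is produced from the user's current location alone, without a query, which corresponds to the unprompted case described above.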
Embodiments, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart or described herein with reference to the Figures. For example, two steps or processes shown or described in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
As stated above, a number of program modules and data files may be stored in the system memory 504. While executing on the processing unit 502, the program modules 508 may perform processes including, but not limited to, one or more of the stages of the methods and processes illustrated in the figures. Other program modules that may be used in accordance with embodiments of the present invention may include electronic mail and contacts applications, word processing applications, spreadsheet applications, database applications, slide presentation applications, drawing or computer-aided application programs, etc.
Furthermore, embodiments of the invention may be practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. For example, embodiments of the invention may be practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in
The computing device 500 may also have one or more input device(s) 530 such as a keyboard, a mouse, a pen, a sound input device, a touch input device, etc. The output device(s) 532 such as a display, speakers, a printer, etc. may also be included. The aforementioned devices are examples and others may be used. The computing device 500 may include one or more communication connections 534 allowing communications with other computing devices 540. Examples of suitable communication connections 534 include, but are not limited to, RF transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.
The term computer readable media as used herein may include computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules. The system memory 504, the removable storage device 524, and the non-removable storage device 526 are all computer storage media examples (i.e., memory storage). Computer storage media may include RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 500. Any such computer storage media may be part of the computing device 500. Computer storage media does not include a carrier wave or other propagated or modulated data signal.
Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
One or more application programs 656 may be loaded into the memory 658 and run on or in association with the operating system 660. Examples of the application programs include digital assistants, phone dialer programs, e-mail programs, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, and so forth. The system 650 also includes a non-volatile storage area 662 within the memory 658. The non-volatile storage area 662 may be used to store persistent information that should not be lost if the system 650 is powered down. The application programs 656 may use and store information in the non-volatile storage area 662, such as e-mail or other messages used by an e-mail application, and the like. A synchronization application (not shown) also resides on the system 650 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 662 synchronized with corresponding information stored at the host computer. As should be appreciated, other applications may be loaded into the memory 658 and run on the mobile computing device 600.
The system 650 has a power supply 670, which may be implemented as one or more batteries. The power supply 670 might further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries. The system 650 may also include a radio 672 that performs the function of transmitting and receiving radio frequency communications. The radio 672 facilitates wireless connectivity between the system 650 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 672 are conducted under control of the operating system 660. In other words, communications received by the radio 672 may be disseminated to the application programs 656 via the operating system 660, and vice versa.
The visual indicator 632 may be used to provide visual notifications, and/or an audio interface 674 may be used for producing audible notifications via the audio transducer 636. In the illustrated embodiment, the visual indicator 632 is a light emitting diode (LED) and the audio transducer 636 is a speaker. These devices may be directly coupled to the power supply 670 so that when activated, they remain on for a duration dictated by the notification mechanism even though the processor 680 and other components might shut down for conserving battery power. The LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device. The audio interface 674 is used to provide audible signals to and receive audible signals from the user. For example, in addition to being coupled to the audio transducer 636, the audio interface 674 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation. In accordance with embodiments of the present invention, the microphone may also serve as an audio sensor to facilitate control of notifications, as will be described below. The system 650 may further include a video interface 682 that enables an operation of an on-board camera to record still images, video stream, and the like.
A mobile computing device 600 implementing the system 650 may have additional features or functionality. For example, the mobile computing device 600 may also include additional data storage devices (removable and/or non-removable) such as, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
Data/information generated or captured by the mobile computing device 600 and stored via the system 650 may be stored locally on the mobile computing device 600, as described above, or the data may be stored on any number of storage media that may be accessed by the device via the radio 672 or via a wired connection between the mobile computing device 600 and a separate computing device associated with the mobile computing device 600, for example, a server computer in a distributed computing network, such as the Internet. As should be appreciated such data/information may be accessed via the mobile computing device 600 via the radio 672 or via a distributed computing network. Similarly, such data/information may be readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.
Embodiments of the present invention, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the invention. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
The description and illustration of one or more embodiments provided in this application are not intended to limit or restrict the scope of the invention as claimed in any way. The embodiments, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of the claimed invention. The claimed invention should not be construed as being limited to any embodiment, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an embodiment with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate embodiments falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope of the claimed invention.