The present disclosure relates generally to the fields of natural language understanding (NLU) and artificial intelligence (AI), and more specifically, to an artifact pinning subsystem for NLU.
This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present disclosure, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Cloud computing relates to the sharing of computing resources that are generally accessed via the Internet. In particular, a cloud computing infrastructure allows users, such as individuals and/or enterprises, to access a shared pool of computing resources, such as servers, storage devices, networks, applications, and/or other computing-based services. By doing so, users are able to access computing resources on demand that are located at remote locations, and these resources may be used to perform a variety of computing functions (e.g., storing and/or processing large quantities of computing data). For enterprise and other organization users, cloud computing provides flexibility in accessing cloud computing resources without accruing large up-front costs, such as purchasing expensive network equipment or investing large amounts of time in establishing a private network infrastructure. Instead, by utilizing cloud computing resources, users are able to redirect their resources to focus on their enterprise's core functions.
Such a cloud computing service may host a virtual agent, such as a chat agent, that is designed to automatically respond to issues with the client instance based on natural language requests from a user of the client instance. For example, a user may provide a request to a virtual agent for assistance with a password issue, wherein the virtual agent is part of a Natural Language Processing (NLP) or Natural Language Understanding (NLU) system. NLP is a general area of computer science and AI that involves some form of processing of natural language input. Examples of areas addressed by NLP include language translation, speech generation, parse tree extraction, part-of-speech identification, and others. NLU is a sub-area of NLP that specifically focuses on understanding user utterances. Examples of areas addressed by NLU include question-answering (e.g., reading comprehension questions), article summarization, and others. For example, an NLU system may use algorithms to reduce human language (e.g., spoken or written) into a set of known symbols for consumption by a downstream virtual agent. NLP is generally used to interpret free text for further analysis. Current approaches to NLP are typically based on deep learning, which is a type of AI that examines and uses patterns in data to improve the understanding of a program.
Certain existing virtual agents implementing NLU techniques attempt to derive meaning from a received user utterance by comparing features of the user utterance to a stored collection of sample utterances. Based on any matches therebetween, the virtual agents may understand a request of a received user utterance and perform suitable actions or provide suitable replies in response to the request. In such search-based implementations of NLU, it may be important to compare multiple interpretations of the user utterance to a sizeable quantity of sample utterances to provide a widely-scoped meaning search. However, undirected expansion of user utterances and/or sample utterances to achieve this wide search scope may introduce challenges with respect to processing and memory resources, inference latency, precision, and consistency during meaning derivation.
A summary of certain embodiments disclosed herein is set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these certain embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be set forth below.
Present embodiments are directed to an agent automation framework that is designed to extract meaning from user utterances, such as requests received by a virtual agent, and suitably respond to these user utterances. To perform these tasks, the agent automation framework includes an NLU framework and an intent-entity model having defined intents and entities (e.g., artifacts) that are associated with sample utterances. The NLU framework includes a meaning extraction subsystem that is designed to generate meaning representations for the sample utterances of the intent-entity model to construct an understanding model, as well as generate meaning representations for a received user utterance to construct an utterance meaning model. As noted herein, each meaning representation embodies a different understanding or interpretation of an underlying utterance, whether the utterance is a sample utterance or a received user utterance. Additionally, the disclosed NLU framework includes a meaning search subsystem that is designed to search the meaning representations of the understanding model (which defines the meaning search space) to locate matches for meaning representations of the utterance meaning model (which defines the meaning search keys). As discussed herein, the meaning search subsystem performs expansion of both the search space and the search keys, as well as targeted pinning of the search space, to provide structure-specific search participants for improved meaning derivation. As such, present embodiments generally improve search-based NLU by providing focused expansion of the search keys and the search space against which the search keys are compared.
More specifically, present embodiments are directed to an artifact pinning subsystem that includes the meaning extraction subsystem and the meaning search subsystem mentioned above. By coordinating operation of the meaning extraction subsystem and the meaning search subsystem, the artifact pinning subsystem leverages relational cues provided during user interaction with the virtual agent (e.g., a behavior engine) and/or during compilation of the understanding models for improved meaning searching. For example, the artifact pinning subsystem may receive structural information from embedded relationships within the intent-entity model, contextual information from a behavior engine (BE) guiding end-user interaction, or both, to generate a tailored search space against which the NLU framework compares user-utterance-based search keys to more adeptly interact with and satisfy requests of the users interfacing with the NLU framework. To generate a particular search space, the artifact pinning subsystem disclosed herein may first generate multiple meaning representations as utterance tree structures, which provide potential representations of the various understandings derivable from each sample utterance of the intent-entity model. In some cases, the meaning representations are generated via vocabulary cleansing, vocabulary injection, and/or various part-of-speech assignments that are applied to respective tokens associated with nodes of the utterance trees. However, any suitable techniques that generate alternative meaning representations corresponding to various understandings of the sample utterances, including alternative parse structure discovery, vocabulary substitution, re-expressions, and so forth, may be implemented within the NLU framework. A significant number of potential candidates for the search space are thus generated by construing each of the sample utterances to have multiple different understandings, each corresponding to a respective meaning representation.
Notably, to guide pruning of the potential candidates, the artifact pinning subsystem identifies that each sample utterance of the intent-entity model was annotated with artifact labels that define relationships between the intents and the one or multiple entities of each sample utterance, within the structure defined by the particular intent-entity model. For example, an author of the intent-entity model may identify that a particular sample utterance relates to a particular intent, then label or annotate any suitable entities within the respective sample utterance that correspond to (e.g., belong to, are related to) the particular intent. The author may similarly label additional entities corresponding to additional intents of each given sample utterance. As recognized herein, the artifact pinning subsystem leverages the artifact labels of the sample utterances to prune any meaning representations that are not valid representations of a particular sample utterance, thereby improving a quality of a subsequent intent match by excluding non-relevant meaning representations for the intent match.
In particular, with respect to each intent of the intent-entity model, the artifact pinning subsystem may generate a set of meaning representations from a respective sample utterance that include the respective intent, as well as one or multiple respective entities that correspond to the labeled entity of a corresponding sample utterance. As discussed in more detail below, verifying that the entities of the generated meaning representations align with the artifact labels annotated in a respective intent-entity model enables the artifact pinning subsystem to efficiently prune invalid or non-relevant meaning representations from consideration for improved meaning search quality. The artifact pinning subsystem may then re-express the set of meaning representations by altering the arrangement or included number of nodes of utterance trees associated with the set, removing any duplicate candidates, and finally, generating the search space based on the remaining meaning representations of the set that have the labeled entity in a proper entity format. Similarly, the artifact pinning subsystem may form the search keys by generating multiple potential meaning representations of a user utterance, then, during a meaning search or inference, compare the search keys to the search space that includes the meaning representations that survive the above-mentioned model-based entity pinning. The meaning search may also be performed with respect to an inferenced, contextual intent of conversation between the user and a behavior engine, such that meaning representation matches are identified based on their correspondence to the contextual intent (e.g., an intent previously inferenced by the NLU during a dialog with a user) and thereby guide targeted pruning or refinement of the search space. Further, during the meaning search, the artifact pinning subsystem may increase the contribution (e.g., increase the respective similarity score) of meaning representations within the search space that match the contextual intent to improve similarity scoring processes based on a particular conversational situation or identified topic of conversation. In other embodiments, the artifact pinning subsystem may remove meaning representations that are not associated with the contextual intent, expediting the similarity scoring processes via implementation of a narrower embodiment of the search space.
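For illustration only, the following sketch (in Python, with hypothetical names such as Candidate and pin_candidates that do not appear in the present disclosure) shows one way the described pruning could be realized: candidate meaning representations whose entities do not align with the labeled entities of the intent-entity model are discarded, and artifact-level duplicates are removed before the search space is compiled.

```python
# Minimal sketch (not the patented implementation) of model-based entity pinning.
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """One possible understanding (meaning representation) of a sample utterance."""
    intent: str
    entities: Tuple[str, ...]      # entity tokens extracted under this interpretation
    structure_signature: str       # e.g., serialized utterance-tree shape, used for dedup


def pin_candidates(candidates: List[Candidate],
                   intent: str,
                   labeled_entities: Set[str]) -> List[Candidate]:
    """Keep (pin) candidates for `intent` whose entities align with the model's labels."""
    pinned, seen = [], set()
    for cand in candidates:
        if cand.intent != intent:
            continue
        # Valid entity formulation: every labeled entity appears in this interpretation.
        if not labeled_entities.issubset(cand.entities):
            continue
        # Remove artifact-level duplicates so equivalent interpretations appear only once.
        key = (cand.intent, cand.entities, cand.structure_signature)
        if key in seen:
            continue
        seen.add(key)
        pinned.append(cand)
    return pinned
```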
Various aspects of this disclosure may be better understood upon reading the following detailed description and upon reference to the drawings in which:
One or more specific embodiments will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
As used herein, the term “computing system” or “computing device” refers to an electronic computing device such as, but not limited to, a single computer, virtual machine, virtual container, host, server, laptop, and/or mobile device, or to a plurality of electronic computing devices working together to perform the function described as being performed on or by the computing system. As used herein, the term “machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store one or more instructions or data structures. The term “non-transitory machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the computing system and that cause the computing system to perform any one or more of the methodologies of the present subject matter, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such instructions. The term “non-transitory machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of non-transitory machine-readable media include, but are not limited to, non-volatile memory, including by way of example, semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices), magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks.
As used herein, the terms “application,” “engine,” and “plug-in” refer to one or more sets of computer software instructions (e.g., computer programs and/or scripts) executable by one or more processors of a computing system to provide particular functionality. Computer software instructions can be written in any suitable programming languages, such as C, C++, C#, Pascal, Fortran, Perl, MATLAB, SAS, SPSS, JavaScript, AJAX, and JAVA. Such computer software instructions can comprise an independent application with data input and data display modules. Alternatively, the disclosed computer software instructions can be classes that are instantiated as distributed objects. The disclosed computer software instructions can also be component software, for example JAVABEANS or ENTERPRISE JAVABEANS. Additionally, the disclosed applications or engines can be implemented in computer software, computer hardware, or a combination thereof.
As used herein, the term “framework” refers to a system of applications and/or engines, as well as any other supporting data structures, libraries, modules, and any other supporting functionality, that cooperate to perform one or more overall functions. In particular, a “natural language understanding framework” or “NLU framework” comprises a collection of computer programs designed to process and derive meaning (e.g., intents, entities, artifacts) from natural language utterances based on an understanding model. As used herein, a “behavior engine” or “BE,” also known as a reasoning agent or RA/BE, refers to a rule-based agent, such as a virtual agent, designed to interact with users based on a conversation model. For example, a “virtual agent” may refer to a particular example of a BE that is designed to interact with users via natural language requests in a particular conversational or communication channel. With this in mind, the terms “virtual agent” and “BE” are used interchangeably herein. By way of specific example, a virtual agent may be or include a chat agent that interacts with users via natural language requests and responses in a chat room environment. Other examples of virtual agents may include an email agent, a forum agent, a ticketing agent, a telephone call agent, and so forth, which interact with users in the format of email, forum posts, autoreplies to service tickets, phone calls, and so forth.
As used herein, an “intent” refers to a desire or goal of a user which may relate to an underlying purpose of a communication, such as an utterance. As used herein, an “entity” refers to an object, subject, or some other parameterization of an intent. It is noted that, for present embodiments, certain entities are treated as parameters of a corresponding intent. More specifically, certain entities (e.g., time and location) may be globally recognized and extracted for all intents, while other entities are intent-specific (e.g., merchandise entities associated with purchase intents) and are generally extracted only when found within the intents that define them. As used herein, “artifact” collectively refers to both intents and entities of an utterance. As used herein, an “understanding model” is a collection of models used by the NLU framework to infer meaning of natural language utterances. An understanding model may include a vocabulary model that associates certain tokens (e.g., words or phrases) with particular word vectors, an intent-entity model, an entity model, or a combination thereof. As used herein, an “intent-entity model” refers to a model that associates particular intents with particular sample utterances, wherein entities associated with the intent may be encoded as a parameter of the intent within the sample utterances of the model. As used herein, the term “agents” may refer to computer-generated personas (e.g., chat agents or other virtual agents) that interact with users within a conversational channel. As used herein, a “corpus” refers to a captured body of source data that includes interactions between various users and virtual agents, wherein the interactions include communications or conversations within one or more suitable types of media (e.g., a help line, a chat room or message string, an email string). As used herein, an “utterance tree” refers to a data structure that stores a meaning representation of an utterance. As discussed, an utterance tree has a tree structure (e.g., a dependency parse tree structure) that represents the syntactic and grammatical structure of the utterance (e.g., relationships between words, part-of-speech (POS) taggings), wherein nodes of the tree structure store vectors (e.g., word vectors, subtree vectors) that encode the semantic meaning of the utterance. As used herein, a “quality” of an inference or meaning search refers to a quantitative measure based on one or multiple of an accuracy, a precision, and/or any suitable F-score of the inference, as would be understood by one of ordinary skill in the art of NLU.
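As a purely illustrative aid, an utterance-tree node of the kind defined above might be sketched as the following data structure; the field names are readability-oriented assumptions rather than the actual structures of the NLU framework.

```python
# Hypothetical sketch of a single node of an utterance tree.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UtteranceTreeNode:
    token: str                              # word or phrase represented by this node
    pos_tag: str                            # part-of-speech tagging (e.g., "VERB", "NOUN")
    word_vector: List[float]                # semantic vector encoding the token's meaning
    dependency: Optional[str] = None        # grammatical relation to the parent node
    class_annotation: Optional[str] = None  # e.g., "intent" or "entity" annotation
    artifact_label: Optional[str] = None    # model-specific label (e.g., an entity of an intent)
    children: List["UtteranceTreeNode"] = field(default_factory=list)
```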
As used herein, “source data” or “conversation logs” may include any suitable captured interactions between various agents and users, including but not limited to, chat logs, email strings, documents, help documentation, frequently asked questions (FAQs), forum entries, items in support ticketing, recordings of help line calls, and so forth. As used herein, an “utterance” refers to a single natural language statement made by a user or agent that may include one or more intents. As such, an utterance may be part of a previously captured corpus of source data, and an utterance may also be a new statement received from a user as part of an interaction with a virtual agent. As used herein, “machine learning” or “ML” may be used to refer to any suitable statistical form of artificial intelligence capable of being trained using machine learning techniques, including supervised, unsupervised, and semi-supervised learning techniques. For example, in certain embodiments, ML techniques may be implemented using a neural network (NN) (e.g., a deep neural network (DNN), a recurrent neural network (RNN), a recursive neural network). As used herein, a “vector” (e.g., a word vector, an intent vector, a subject vector, a subtree vector) refers to a linear algebra vector that is an ordered n-dimensional list (e.g., a 300 dimensional list) of floating point values (e.g., a 1×N or an N×1 matrix) that provides a mathematical representation of the semantic meaning of a portion (e.g., a word or phrase, an intent, an entity, a token) of an utterance.
As used herein, the terms “dialog” and “conversation” refer to an exchange of utterances between a user and a virtual agent over a period of time (e.g., a day, a week, a month, a year, etc.). As used herein, an “episode” refers to distinct portions of dialog that may be delineated from one another based on a change in topic, a substantial delay between communications, or other factors. As used herein, “context” refers to information associated with an episode of a conversation that can be used by the BE to determine suitable actions in response to extracted intents and/or entities of a user utterance. As used herein, a “contextual intent” refers to an intent that was previously identified or processed by the NLU framework during a flow or conversation between the BE and the user. As used herein, “domain specificity” refers to how attuned a system is to correctly extracting intents and entities expressed in actual conversations in a given domain and/or conversational channel. As used herein, an “understanding” of an utterance refers to an interpretation or a construction of the utterance by the NLU framework. As such, it may be appreciated that different understandings of an utterance are generally associated with different meaning representations having different structures (e.g., different nodes, different relationships between nodes), different POS taggings, and so forth.
As mentioned, a computing platform may include a chat agent, or another similar virtual agent, that is designed to automatically respond to user requests to perform functions or address issues on the platform via NLU techniques. The disclosed NLU framework is based on principles of cognitive construction grammar (CCG), in which an aspect of the meaning of a natural language utterance can be determined based on the form (e.g., syntactic structure, shape) and semantic meaning of the utterance. The disclosed NLU framework is capable of generating multiple meaning representations that form one or more search keys for an utterance. Additionally, the disclosed NLU framework is capable of generating an understanding model having multiple meaning representations for certain sample utterances, which expands the search space for meaning search, thereby improving operation of the NLU framework. However, when attempting to derive user intent from natural language utterances by comparing the search keys to the search space, it is presently recognized that certain NLU frameworks may perform inefficient searches when considering search keys and search spaces that include a large number of meaning representations, potentially returning irrelevant intent matches and/or incurring undesirable inference latency that reduces user satisfaction with such NLU frameworks.
Accordingly, present embodiments are generally directed toward an agent automation framework capable of leveraging CCG techniques to generate multiple meaning representations for utterances, including sample utterances in the intent-entity model and utterances received from a user. In particular, an artifact pinning subsystem of the agent automation system directs a meaning extraction subsystem and a meaning search subsystem of the agent automation framework during both compilation of a search space and inference of a received user utterance that is transformed into one or more search keys and compared to the search space. During generation of the search space, the artifact pinning subsystem may determine multiple different understandings of sample utterances within one or multiple intent-entity models by performing vocabulary adjustment, varied part-of-speech assignment, and/or any other suitable processes that generate multiple meaning representations corresponding to various understandings of each sample utterance. As such, the artifact pinning subsystem generates a potentially-sizeable quantity of candidates for inclusion within the search space.
To selectively prune the candidates, the artifact pinning subsystem disclosed herein leverages artifact correlations of the sample utterances to prune any meaning representations that are not valid representations of a particular sample utterance. That is, the sample utterances generally each belong to an identified intent that may have been labeled (e.g., by an author or ML-based annotation subsystem) with any suitable number of corresponding entities, within a structure or set of relationships defined by the intent-entity model. To validate the relevance of each candidate meaning representation for an identified intent, the artifact pinning subsystem may identify and pin a set of meaning representations that include the particular intent and include one or multiple respective entities corresponding to one or multiple labeled entities of a corresponding sample utterance. That is, the meaning representations of the set are desirably retained (e.g., pinned) as valid formulations of an associated sample utterance, within the structure defined by artifact labels of a respective intent-entity model. The set of meaning representations may then be re-expressed or expanded via any suitable processes, such as altering the arrangement or included number of nodes of meaning representations (e.g., utterance trees) associated with the set. The artifact pinning subsystem of certain embodiments may therefore remove any duplicate candidates and generate (e.g., compile) the search space based on the remaining meaning representations with the appropriate pinned entity.
During a meaning search performed to derive meaning from an on-going conversation between a user and a behavior engine, the artifact pinning subsystem may form the one or more search keys to compare against the search space by generating multiple potential meaning representations of a user utterance. Notably, the artifact pinning subsystem may pin the search space with respect to an inferenced, contextual intent of the on-going conversation, guiding further targeted pruning of the search space that may otherwise decrease a quality or increase a resource demand of a meaning search. For example, the artifact pinning subsystem may identify relevant meaning representations within the search space or the underlying understanding model to provide a similarity scoring bonus to the relevant meaning representations or to prune other, non-relevant meaning representations from the search space, thereby providing more direct search paths for the meaning searches. As will be understood, the herein-disclosed pinning of the multiple various candidates of the search space enables the agent automation system to target particularly-relevant candidates for improved meaning search quality. Pruning the search space to these candidates may also limit computing resource usage and improve the efficiency of the NLU framework.
With the preceding in mind, the following figures relate to various types of generalized system architectures or configurations that may be employed to provide services to an organization in a multi-instance framework and on which the present approaches may be employed. Correspondingly, these system and platform examples may also relate to systems and platforms on which the techniques discussed herein may be implemented or otherwise utilized. Turning now to
For the illustrated embodiment,
In
To utilize computing resources within the platform 20, network operators may choose to configure the data centers 22 using a variety of computing infrastructures. In one embodiment, one or more of the data centers 22 are configured using a multi-tenant cloud architecture, such that one of the server instances 24 handles requests from and serves multiple customers. Data centers 22 with multi-tenant cloud architecture commingle and store data from multiple customers, where multiple customer instances are assigned to one of the virtual servers 24. In a multi-tenant cloud architecture, the particular virtual server 24 distinguishes between and segregates data and other information of the various customers. For example, a multi-tenant cloud architecture could assign a particular identifier for each customer in order to identify and segregate the data from each customer. Generally, implementing a multi-tenant cloud architecture may suffer from various drawbacks, such as a failure of a particular one of the server instances 24 causing outages for all customers allocated to the particular server instance.
In another embodiment, one or more of the data centers 22 are configured using a multi-instance cloud architecture to provide every customer its own unique customer instance or instances. For example, a multi-instance cloud architecture could provide each customer instance with its own dedicated application server and dedicated database server. In other examples, the multi-instance cloud architecture could deploy a single physical or virtual server 24 and/or other combinations of physical and/or virtual servers 24, such as one or more dedicated web servers, one or more dedicated application servers, and one or more database servers, for each customer instance. In a multi-instance cloud architecture, multiple customer instances could be installed on one or more respective hardware servers, where each customer instance is allocated certain portions of the physical server resources, such as computing memory, storage, and processing power. By doing so, each customer instance has its own unique software stack that provides the benefit of data isolation, relatively less downtime for customers to access the platform 20, and customer-driven upgrade schedules. An example of implementing a customer instance within a multi-instance cloud architecture will be discussed in more detail below with reference to
Although
As may be appreciated, the respective architectures and frameworks discussed with respect to
By way of background, it may be appreciated that the present approach may be implemented using one or more processor-based systems such as shown in
With this in mind, an example computer system may include some or all of the computer components depicted in
The one or more processors 82 may include one or more microprocessors capable of performing instructions stored in the memory 86. Additionally or alternatively, the one or more processors 82 may include application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or other devices designed to perform some or all of the functions discussed herein without calling instructions from the memory 86.
With respect to other components, the one or more busses 84 include suitable electrical channels to provide data and/or power between the various components of the computing system 80. The memory 86 may include any tangible, non-transitory, and computer-readable storage media. Although shown as a single block in
It should be appreciated that the cloud-based platform 20 discussed above provides an example of an architecture that may utilize NLU technologies. In particular, the cloud-based platform 20 may include or store a large corpus of source data that can be mined to facilitate the generation of a number of outputs, including an intent-entity model. For example, the cloud-based platform 20 may include ticketing source data having requests for changes or repairs to particular systems, dialog between the requester and a service technician or an administrator attempting to address an issue, a description of how the ticket was eventually resolved, and so forth. Then, the generated intent-entity model can serve as a basis for classifying intents in future requests, and can be used to generate and improve a conversational model to support a virtual agent that can automatically address future issues within the cloud-based platform 20 based on natural language requests from users. As such, in certain embodiments described herein, the disclosed agent automation framework is incorporated into the cloud-based platform 20, while in other embodiments, the agent automation framework may be hosted and executed (separately from the cloud-based platform 20) by a suitable system that is communicatively coupled to the cloud-based platform 20 to process utterances, as discussed below.
With the foregoing in mind,
The embodiment of the agent automation framework 100 illustrated in
For the embodiment illustrated in
For the embodiment illustrated in
For the illustrated embodiment, the NLU framework 104 includes an NLU engine 116 and a vocabulary manager 118. It may be appreciated that the NLU framework 104 may include any suitable number of other components. In certain embodiments, the NLU engine 116 is designed to perform a number of functions of the NLU framework 104, including generating word vectors (e.g., intent vectors, subject or entity vectors, subtree vectors) from word or phrases of utterances, as well as determining distances (e.g., Euclidean distances) between these vectors. For example, the NLU engine 116 is generally capable of producing a respective intent vector for each intent of an analyzed utterance. As such, a similarity measure or distance between two different utterances can be calculated using the respective intent vectors produced by the NLU engine 116 for the two intents, wherein the similarity measure provides an indication of similarity in meaning between the two intents.
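By way of a non-limiting sketch, the distance or similarity comparison described above could be computed as follows; cosine similarity and Euclidean distance are shown as two common choices, without implying that either is the specific measure used by the NLU engine 116.

```python
# Illustrative similarity measures between two intent vectors.
import numpy as np


def intent_similarity(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    """Cosine similarity in [-1, 1]; higher indicates closer meaning."""
    denom = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(vec_a, vec_b) / denom)


def euclidean_distance(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    """Smaller distance indicates more similar intents."""
    return float(np.linalg.norm(vec_a - vec_b))
```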
The vocabulary manager 118 addresses out-of-vocabulary words and symbols that were not encountered by the NLU framework 104 during vocabulary training. For example, in certain embodiments, the vocabulary manager 118 can identify and replace synonyms and domain-specific meanings of words and acronyms within utterances analyzed by the agent automation framework 100 (e.g., based on the collection of rules 114), which can improve the performance of the NLU framework 104 to properly identify intents and entities within context-specific utterances. Additionally, to accommodate the tendency of natural language to adopt new usages for pre-existing words, in certain embodiments, the vocabulary manager 118 handles repurposing of words previously associated with other intents or entities based on a change in context. For example, the vocabulary manager 118 could handle a situation in which, in the context of utterances from a particular client instance and/or conversation channel, the word “bike” actually refers to a motorcycle rather than a bicycle.
Once the intent-entity model 108 and the conversation model 110 have been created, the agent automation framework 100 is designed to receive a user utterance 122 (in the form of a natural language request) and to appropriately take action to address the request. For example, for the embodiment illustrated in
It may be appreciated that, in other embodiments, one or more components of the agent automation framework 100 and/or the NLU framework 104 may be otherwise arranged, situated, or hosted for improved performance. For example, in certain embodiments, one or more portions of the NLU framework 104 may be hosted by an instance (e.g., a shared instance, an enterprise instance) that is separate from, and communicatively coupled to, the client instance 42. It is presently recognized that such embodiments can advantageously reduce the size of the client instance 42, improving the efficiency of the cloud-based platform 20. In particular, in certain embodiments, one or more components of the artifact pinning subsystem discussed below may be hosted by a separate instance (e.g., an enterprise instance) that is communicatively coupled to the client instance 42, as well as other client instances, to enable improved meaning searching for suitable matching meaning representations within the search space to enable identification of artifact matches for the utterance 122.
With the foregoing in mind,
In particular, the NLU framework 104 illustrated in
For the embodiment of the agent automation framework 100 illustrated in
As illustrated in
As mentioned, the NLU framework 104 includes two primary subsystems that cooperate to convert the hard problem of NLU into a manageable search problem—namely: a meaning extraction subsystem and a meaning search subsystem. For example,
For the embodiment illustrated in
As an example of one of the meaning representations 158, 162 disclosed herein,
The form or shape of the utterance tree 166 illustrated in
Moreover, in other embodiments, each of the nodes 202 may be annotated by the structure subsystem with additional information about the word or phrase represented by the node to form an annotated embodiment of the utterance tree 166. For example, each of the nodes 202 may include a respective tag, identifier, shading, or cross-hatching that is indicative of a class annotation of the respective node. In particular, for the example utterance tree 166 illustrated in
As such, it may be appreciated that the utterance tree 166, from which the meaning representations are generated, serves as a basis (e.g., an initial basis) for artifact extraction. Further to this effect, the nodes 202 of certain embodiments of the utterance tree 166, such as those in which the utterance is a sample utterance 155 of the intent-entity model 108, may also be annotated or tagged with respective artifact labels (e.g., intent labels and/or entity labels) that identify a particular one of the nodes as a particular entity that is defined within a particular intent. For example, during construction of the intent-entity model 108, an author of the intent-entity model 108 may identify the sample utterance 155 as belonging to a particular intent. Within the intent-entity model 108, the author may then identify (e.g., highlight, annotate, label) certain tokens within the sample utterance 155, such as particular entities, as entities that belong to or are associated with the intent. For example, a “purchase product” intent may have a number of labeled entities within an intent-entity model 108 that belong to the intent, such as a “brand” entity, a “model” entity, a “color” entity, a “size” entity, a “shipping address” entity, and so forth, depending on the nature of the product. In some embodiments, at least a portion of the sample utterances 155 are associated with artifact labels by machine learning features of the NLU framework, either in addition to or in alternative to the manually-specified annotations of an author. In either case, the artifact labels for the sample utterances 155 may be specific to the particular relationships defined by the associated understanding model, as well as the underlying intent-entity model, providing a structure that enables improved inference within the scope of the particular intent-entity model 108.
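The following fragment is a hypothetical, simplified rendering of such an authored intent-entity model; the schema and field names are illustrative assumptions only and do not reflect the actual model format.

```python
# Hypothetical fragment of an authored intent-entity model: a "purchase product"
# intent with entities defined within it, and a sample utterance whose tokens
# carry author-applied artifact (entity) labels belonging to that intent.
intent_entity_model = {
    "intents": {
        "purchase_product": {
            "entities": ["brand", "model", "color", "size", "shipping_address"],
            "sample_utterances": [
                {
                    "text": "I want to buy a blue medium jacket",
                    # Token spans mapped to entities defined within the intent.
                    "labels": {"blue": "color", "medium": "size", "jacket": "model"},
                },
            ],
        }
    }
}
```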
Based on cognitive construction grammar techniques, the structure subsystem may consider the artifact labeling of the sample utterances 155 to guide determination of the shape of the utterance tree 166. It should be understood that the artifact labels applied to the tokens of the sample utterances 155 may be propagated to the meaning representations 158, and in particular, are included as parameters associated with the respective nodes of the meaning representations, where each node represents a token of the utterance. For example, in the present embodiment, the entirety of the sample utterance 155 of the utterance tree 166 may be identified as belonging to one (or multiple) of a desire intent, a travel intent, or a purchase intent, based on the presence of the “want” of node 202A, the “to go” of node 202B, the “to buy” of node 202C, and the “to return” of node 202D. These intents may therefore be leveraged within the structure of a particular intent-entity model 108 to identify meaning matches to meaning representations within the intent categories.
Further, certain nodes of the utterance tree 166 may be labeled as an entity that is associated with a particular intent of the sample utterance 155. For example, the tokens “store” of node 202F and “mall” of node 202G may include artifact labels to indicate that the tokens are labeled entities that correspond to the purchase intent. The nodes 202F, 202G may both be annotated as entities that represent locations within the travel intent. That is, each of these labeled entities further define their associated intent to enable the NLU framework 104 to derive meaning and perspective from user utterances 122 related to the associated intent. The artifact labels discussed herein are further leveraged by an artifact pinning subsystem, as discussed below, to verify whether certain cleansed forms and/or alternative forms of meaning representations generated for various sample utterances 155 have valid form interpretations that uphold the embedded artifact relationship information within a particular intent-entity model 108. By excluding interpretations that construe entities in manners unsuited to (e.g., different from, not in accordance with) the artifact labels of the sample utterances 155, the ambiguity of complex user or sample utterances, such as polysemic utterances and/or utterances having multi-word entities that are interpretable in multiple ways, can be reduced or eliminated. It should be understood that embodiments of the utterance tree 166 that are generated from a received user utterance 122 provided to the BE 102 may also include any other suitable tags that the BE 102 derives from the context of interaction with an end-user, such as tags indicating that a particular user utterance 122 was received during discussion of a particular intent, during a certain time of day, during occurrence of a particular news or weather event, and so forth.
Providing more detail herein with respect to generation of the search space 252, the artifact pinning subsystem 250 may aggregate the sample utterances 155 of a set 270 of intent-entity models, such as multiple intent-entity models 108 that are each suited for a particular purpose or domain. As discussed above, the sample utterances 155 are individually associated with artifact labels 272 that link model-specific relationships between various intents and various entities of the sample utterances 155. For example, each intent-entity model 108 of the set 270 may include sample utterances 155 that provide guidance for the NLU framework 104 to perform meaning searches with respect to any suitable natural language interaction with users, such as greeting users, managing meetings, managing a particular product of an enterprise, managing human resource actions, and/or concluding conversations with users, among many other suitable interactions. The sample utterances 155 are analyzed by the meaning extraction subsystem 150 of the artifact pinning subsystem 250 to generate a set 282 of meaning representations that assign possible forms to, as well as consider polysemic expression of, each respective sample utterance 155. For the set 282 of meaning representations, a respective understanding model of a set 284 of understanding models may therefore be generated, wherein each understanding model of the set 284 defines a respective model-specific search space 286.
As discussed below, the artifact pinning subsystem 250 desirably pins suitable meaning representation candidates of the multiple model-specific search spaces 286 to compile the search space 252 (e.g., compiled search space). In particular, for a given model-specific search space 286, the artifact pinning subsystem 250 identifies and pins suitable candidates from the set 282 of meaning representations that align with the artifact labels 272 of the particular one of the set 270 of intent-entity models from which the particular one of the set 284 of understanding models was derived. It should be understood that the artifact pinning subsystem 250 may compile the search space 252 after one or more conversations between a user and the BE 102, periodically, in response to receipt of new or updated sample utterances 155, and so forth.
Similarly, during search key generation and utilization, the artifact pinning subsystem 250 receives a user utterance 122 and derives a set 290 of meaning representations for the user utterance 122 that assigns potential entities to tokens within the user utterance 122. Notably, the artifact pinning subsystem 250 may not prune the set 290 of meaning representations because the search space 252 is already pruned to surviving meaning representations 158 that uphold the artifact relationships set forth by the artifact labels 272 of the set 270 of intent-entity models. Additionally, the artifact pinning subsystem 250 may further refine the set 282 of meaning representations of the set 270 of intent-entity models based on a context of a conversation between the user and the BE 102, such as a conversation to order a product or schedule a meeting. Thus, the artifact pinning subsystem 250 generates the tailored utterance meaning model 160 from the set 290 of meaning representations as the search keys 254 for comparison to the particularly context-aware search space 252. Indeed, as discussed in more detail below, the meaning search subsystem 152 may compare the meaning representations of the set 290 defining the search keys 254 to the meaning representations 158 of the search space 252 to identify any suitable, matching meaning representations 158, which enable the NLU framework 104 to identify the extracted artifacts 140 therefrom. The meaning search subsystem 152 may also score the matching meaning representations 158 and/or the artifacts therein with an accompanying confidence level to facilitate appropriate agent responses 124 and/or actions 142 to the most likely extracted artifacts 140 from the meaning representations 158. As will be understood, the disclosed embodiments of the model-based and context-specific expansion and pruning (e.g., pinning) of the meaning representations performed by the artifact pinning subsystem 250 provide enhancements to the search space 252 and/or the search keys 254 to facilitate efficient, relationship-aware identification of the extracted artifacts 140 for improved meaning search quality.
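As a schematic illustration (not the actual meaning search algorithm), the comparison of the search keys 254 against the compiled search space 252 could be organized as follows, where MeaningRep and the similarity callable are assumed placeholders for the structures and scoring used by the meaning search subsystem 152.

```python
# Schematic meaning search: compare every search key against every candidate in
# the compiled search space, score each pair, and return the best matches with
# their confidence scores.
from typing import Callable, List, Tuple


def meaning_search(search_keys: List["MeaningRep"],
                   search_space: List["MeaningRep"],
                   similarity: Callable[["MeaningRep", "MeaningRep"], float],
                   threshold: float = 0.7) -> List[Tuple["MeaningRep", float]]:
    matches = []
    for key in search_keys:                      # each interpretation of the user utterance
        for candidate in search_space:           # each pinned sample-utterance interpretation
            score = similarity(key, candidate)   # e.g., a form-aware vector comparison
            if score >= threshold:
                matches.append((candidate, score))
    # Highest-confidence matches first; extracted artifacts are read off the winners.
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```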
Generally, each meaning representation candidate of the set 310 may have a respective CCG form that is assigned based on the structure of the sample utterance 155, including the interdependencies of the intents and entities within each meaning representation candidate. The disclosed artifact pinning subsystem 250 performs entity pinning to enable the meaning search subsystem 152 to focus on meaning representation candidates of the set 310 that align with the aforementioned relational cues embedded within the associated intent-entity model 108 (e.g., defined herein as meaning representations having valid entity formulations), thereby leveraging these embedded relationships to verify the prescribed forms of the set 310 of meaning representation candidates. That is, assuming that each meaning representation candidate of the set 310 is related to a particular intent, the artifact pinning subsystem 250 pins one or multiple entities 312 (illustrated via filled circles in the present embodiment) therein that have a valid entity formulation and that correspond with the labeled or annotated entities within the set 270 of intent-entity models (defined by the artifact labels 272). Other meaning representation candidates of set 310 that do not contain a valid entity formulation (e.g., that do not correspond with the artifact labels 272) may therefore be disregarded as unviable candidates, while the entity-pinned meaning representation candidates of the set 310 are retained (e.g., pinned). The entity pinning faculties discussed herein may be performed with respect to the search space 252 to leverage the artifact-level relationships embedded within the set 270 of intent-entity models for structure-verification-based pruning. Indeed, with the search space 252 pruned and compiled to match the embedded relationships of the intent-entity model 108, the artifact pinning subsystem 250 enables the search keys 254 to be expanded, not pinned, and utilized as broader search components for improved inference quality. However, in other embodiments, the search keys 254 may be pruned in a manner similar to that of the search space 252, but with respect to potential, estimated artifact labels that are generated via ML techniques and contrasted to the artifact labels 272 of the sample utterances 155.
As a particular example, for a sample utterance 155 of “Order [three-way switch],” the set 310 of meaning representation candidates may include entity formulations and associated entity assignments (indicated as brackets) that express the sample utterance 155 as a first meaning representation for a first interpretation of the utterance as “I would like to order one three-way switch” and a second meaning representation for a second interpretation of the utterance as “I would like to order three and to switch my way.” Because the first interpretation includes the particular labeled entity [three way switch], the artifact pinning subsystem 250 may retain and pin the first meaning representation as a valid parse of the sample utterance 155 in light of the associated intent-entity model 108. In contrast, because the second meaning representation does not include the particular labeled entity of the sample utterance 155, the artifact pinning subsystem 250 may determine that the second meaning representation is not a valid interpretation of the sample utterance 155 and remove or prune the second meaning representation from the search space 252, thereby directing appropriate consumption of utterances for improved understanding accuracy.
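A minimal sketch of this validity check follows, using the labeled entity from the example above; the helper name has_valid_entity_formulation is an illustrative assumption rather than a function of the disclosed subsystem.

```python
# Sketch of the validity check: a candidate interpretation is pinned only if the
# entity span labeled in the sample utterance ("three-way switch") is preserved
# as a single entity in that interpretation.
from typing import Iterable


def has_valid_entity_formulation(candidate_entities: Iterable[str],
                                 labeled_entity_spans: Iterable[str]) -> bool:
    """Return True if every labeled span appears intact among the candidate's entities."""
    candidate = {entity.lower() for entity in candidate_entities}
    return all(span.lower() in candidate for span in labeled_entity_spans)


labeled = ["three-way switch"]
first_interpretation = ["three-way switch"]   # "I would like to order one three-way switch"
second_interpretation = ["three", "way"]      # "I would like to order three and to switch my way"

assert has_valid_entity_formulation(first_interpretation, labeled)        # pinned
assert not has_valid_entity_formulation(second_interpretation, labeled)   # pruned
```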
Moreover, with respect to meaning searches performed by comparing the search keys 254 to the search space 252, the artifact pinning subsystem 250 may also leverage contextual information provided by the BE 102 to verify whether any of the meaning representation candidates of the set 310 generated for the search space 252 align with the context of conversation (e.g., a current or on-going conversation) between an end-user and the BE 102. For example, in response to receiving a user utterance 122 requesting to set up a meeting at a particular time, NLU framework 104 may perform a meaning search based on the user utterance 122 to determine that the utterance corresponds to a “meeting setup intent” defined in a particular intent-entity model 108. The NLU framework 104 provides this intent, along with any entities identified in the utterance, to the BE 102, as discussed above with respect to
As will be understood, these techniques thereby efficiently leverage implicit clues derivable from the current conversation and/or provided by an author of the intent-entity model 108 to pin candidate meaning representations as suitable search spaces 252 that provide an improved meaning search. It should be understood that although the artifact labels 272 of
The artifact pinning subsystem 250 performing the illustrated embodiment of the process 350 begins with a vocabulary refinement phase 352 that cleanses and validates (block 354) the sample utterances 155 stored within one intent-entity model of the set 270. For example, the artifact pinning subsystem 250 may utilize a vocabulary subsystem 356 of the NLU framework 104 (e.g., corresponding to the vocabulary manager 118 in certain embodiments) to access and apply the rules 114 stored in the database 106 that modify certain tokens (e.g., words, phrases, punctuation, emojis) of the sample utterances 155. In some embodiments, the vocabulary subsystem 356 performs the vocabulary refinement phase 352 based on a vocabulary model 360 that is stored with the intent-entity model 108 within a particular understanding model of the set 282 or based on an aggregated vocabulary model derived from each respective vocabulary model of the set 282. By way of example, in certain embodiments, cleansing may involve applying a rule that removes non-textual elements (e.g., emoticons, emojis, punctuation) from the sample utterances 155. In certain embodiments, cleansing may involve correcting misspellings or typographical errors in the sample utterances 155. Additionally, in certain embodiments, cleansing may involve substituting certain tokens with other tokens. For example, the vocabulary subsystem 356 may apply a rule that replaces all entities with references to time or color with a generic or global entity, such as a global entity for a phone number, a time, a color, a meeting room, and so forth. In certain cases in which the intent-entity model 108 is a pre-built vocabulary model, the artifact pinning subsystem 250 may omit cleansing, while proceeding with validation to ensure the sample utterances 155 are valid sample utterances in view of validation rules stored within the database 106.
Continuing the vocabulary refinement phase 352 of the process 350, the artifact pinning subsystem 250 then performs (block 362) vocabulary injection on tokens of the sample utterances 155, thereby re-rendering the sample utterances 155 by adjusting phraseology and/or terminology of the sample utterances 155. Based on the vocabulary model 360 (or aggregate vocabulary model) stored within the understanding model 157, the artifact pinning subsystem 250 utilizes the vocabulary subsystem 356 to replace the content of certain tokens of the sample utterances 155 with more discourse-appropriate phrases and/or terms. In certain embodiments, multiple phrases and/or terms may be replaced, and the various permutations of such replacements are used to generate a set 364 of utterances (e.g., vocabulary-adjusted sample utterances) for each sample utterance 155 of the particular intent-entity model 108. For example, in certain embodiments, the vocabulary subsystem 356 may access the vocabulary model 360 of the understanding model 157 to identify alternative vocabulary that can be used to generate re-expressions of the utterances having different tokens. By way of specific example, in an embodiment, the vocabulary subsystem 356 may determine that a synonym for “developer” is “employee,” and may generate a new utterance in which the term “developer” is substituted by the term “employee.”
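A rough sketch of the cleansing and vocabulary injection steps might look like the following; the cleansing rule and the synonym table are illustrative assumptions rather than the rules 114 or vocabulary model 360 themselves.

```python
# Illustrative cleansing rule followed by vocabulary injection, which produces a
# set of vocabulary-adjusted variants (permutations of substitutions) for each
# sample utterance.
import re
from itertools import product
from typing import Dict, List


def cleanse(utterance: str) -> str:
    utterance = re.sub(r"[^\w\s'-]", "", utterance)   # strip emojis/punctuation-like elements
    return re.sub(r"\s+", " ", utterance).strip()


def inject_vocabulary(utterance: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Generate permutations of the utterance with discourse-appropriate substitutions."""
    tokens = utterance.split()
    options = [[tok] + synonyms.get(tok.lower(), []) for tok in tokens]
    return [" ".join(choice) for choice in product(*options)]


variants = inject_vocabulary(cleanse("The developer needs a laptop!"),
                             {"developer": ["employee"], "laptop": ["notebook"]})
# -> ['The developer needs a laptop', 'The developer needs a notebook',
#     'The employee needs a laptop', 'The employee needs a notebook']
```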
For the embodiment illustrated in
The artifact pinning subsystem 250 may subsequently direct remaining candidates of the set 364 to the structure subsystem 370 of an alternative form expansion phase 372 of the process 350 for part-of-speech (POS) tagging and parsing. That is, in the illustrated embodiment, the artifact pinning subsystem 250 implements the structure subsystem 370 to generate (block 374) the set 282 of one or more meaning representations that are representative of the sample utterances 155 of the intent-entity model 108. For the embodiment illustrated in
The artifact pinning subsystem 250 performing the illustrated embodiment of
For example, given a sample utterance stating “Book meeting,” a first meaning representation of the set 282 of the meaning representations may indicate a parse and POS tagging in which “book” is interpreted as a verb-style intent (e.g., corresponding to a “schedule” intent) having “meeting” as an entity. As such, the first meaning representation of the set 282 may be interpreted as, “I want to schedule a meeting.” For a second meaning representation of the set 282, the parse and POS tagging may interpret “book” as a noun, such that the second meaning representation is interpreted as, “I want a book meeting.” Assuming that a “schedule” intent of the intent-entity model 108 includes at least one sample utterance 155 with author-labeled entities defined therein, the artifact pinning subsystem 250 analyzes the artifact labels 272 to determine whether “meeting” in the first meaning representation corresponds to at least one appropriate labeled entity of the “schedule” intent of the intent-entity model 108. In response to determining that “meeting” in the first meaning representation corresponds to a “meeting” entity of the “schedule” intent within the intent-entity model 108, the artifact pinning subsystem 250 may pin the first meaning representation as a valid formulation of the sample utterance 155. In contrast, upon analyzing a “desire” or “I want” intent of the intent-entity model 108 when considering the second meaning representation of the set 282, the artifact pinning subsystem 250 may determine that there are no sample utterances having a labeled “book meeting” entity defined for the “desire” intent within the intent-entity model 108. As such, the artifact pinning subsystem 250 may recognize that the second meaning representation is an invalid or irrelevant parse of the particular sample utterance 155, at least with respect to the particular domain in which the intent-entity model 108 is defined.
With the set 388 of meaning representations pinned as having suitable structure for each intent, the artifact pinning subsystem 250 may perform a re-expression phase 392 of the process 350 in which the artifact pinning subsystem 250 determines (block 394) suitable re-expressions of the tokens of the meaning representations of the set 388. For example, for the validated and number-reduced set 388 of meaning representations, the artifact pinning subsystem 250 may adjust the order of tokens, add tokens, remove tokens, transform tokens, generalize tokens, or otherwise manipulate the expression of each token of each meaning representation of the set 310. In certain embodiments, the re-expression phase 392 of the process 350 is performed by the NLU framework 104 described in U.S. patent application Ser. No. 16/239,218, entitled, “TEMPLATED RULE-BASED DATA AUGMENTATION FOR INTENT EXTRACTION,” filed Jan. 3, 2019, which is incorporated by reference herein in its entirety for all purposes.
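One way to picture the re-expressions of block 394 is with simple token-level rules, sketched below; the specific rules and the flat token-list representation are illustrative assumptions and do not reproduce the templated rules of the incorporated application.

```python
def reexpress(tokens):
    """Generate simple re-expressed variants of a tokenized meaning representation."""
    variants = {tuple(tokens)}
    # Remove optional politeness tokens (illustrative rule).
    variants.add(tuple(t for t in tokens if t not in {"please", "kindly"}))
    # Generalize first-person pronouns (illustrative rule).
    variants.add(tuple("someone" if t in {"i", "we"} else t for t in tokens))
    return [list(v) for v in variants]

for variant in reexpress(["i", "want", "to", "schedule", "a", "meeting", "please"]):
    print(" ".join(variant))
```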
The step of block 394 may generate multiple different meaning representations for each meaning representation of the set 388, again expanding the number of potential meaning representation candidates for inclusion in the search space 252. However, the artifact pinning subsystem 250 of certain embodiments may again leverage the artifact labels 272 to discard (block 396) certain meaning representations of the re-expressed variants of the set 388 that do not include the respective pinned entity therein. Then, the artifact pinning subsystem 250 removes (block 398) any artifact-level duplicates across the entire set 388 of meaning representations, as discussed above with respect to block 376. As such, the artifact pinning subsystem 250 may provide the suitably-distinct meaning representations remaining within the set 388 as meaningful candidates within a compiled understanding model 400, which corresponds to the compiled search space 252 introduced above. In some embodiments, the NLU framework 104 having the artifact pinning subsystem 250 may maintain the compiled search space 252 for a threshold amount of time, for use for a threshold number of meaning searches, and so forth.
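The pruning of blocks 396 and 398 amounts to a pinned-entity filter followed by deduplication, as in the minimal sketch below; the tuple-based duplicate key is an assumption about what artifact-level equivalence might look like, not a definitive implementation.

```python
def compile_candidates(variants, pinned_entity):
    """Filter re-expressed variants to those containing the pinned entity,
    then drop artifact-level duplicates."""
    kept, seen = [], set()
    for mr in variants:
        if pinned_entity not in mr["entities"]:
            continue  # block 396: discard variants missing the pinned entity
        key = (mr["intent"], tuple(sorted(mr["entities"])))  # assumed duplicate key
        if key in seen:
            continue  # block 398: remove artifact-level duplicates
        seen.add(key)
        kept.append(mr)
    return kept

variants = [
    {"intent": "schedule", "entities": ["meeting"]},
    {"intent": "schedule", "entities": ["meeting"]},   # duplicate
    {"intent": "schedule", "entities": ["laptop"]},    # missing pinned entity
]
print(compile_candidates(variants, "meeting"))
```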
In other embodiments, the refinement and expansion phase 452 may be modified to include any suitable processes that generate the set 454 of potential meaning representations having zero, one, or more suitably-distinct variations of the respective user utterance 122, in accordance with the present disclosure. For example, any one or multiple of vocabulary cleansing, vocabulary substitution, vocabulary injection, various part-of-speech assignments, alternative parse structure discovery, and re-expressions may be performed to generate the set 454 of potential meaning representations. As such, the artifact pinning subsystem 250 may aggregate the suitably-distinct meaning representations of each generated set 454 of potential meaning representations as meaningful candidates within the utterance meaning model 160, thereby forming the utterance meaning model 160 as the one or more search keys 254 suitable for efficient comparison to the search space 252.
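Purely as an illustrative sketch, the aggregation of suitably-distinct candidates into the search keys 254 might resemble the following; the expander functions and string-level representation are hypothetical simplifications of the refinement and expansion phase 452.

```python
def build_search_keys(utterance, expanders):
    """Aggregate suitably-distinct candidate expressions of a user utterance
    into a single collection of search keys."""
    candidates = {utterance}
    for expand in expanders:
        # Apply each expansion to every candidate accumulated so far.
        candidates.update(expand(u) for u in list(candidates))
    return sorted(candidates)

# Illustrative expanders standing in for vocabulary substitution and re-expression.
expanders = [
    lambda u: u.replace("developer", "employee"),
    lambda u: u.replace("i want to ", ""),
]
print(build_search_keys("i want to schedule a meeting with a developer", expanders))
```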
Notably, it is presently recognized that the artifact pinning subsystem 250 may leverage contextual intent information related to a current conversation between the user and the BE 102 via a BE context intent pinning phase 470 that improves targeted refinement of the search space 252. In the illustrated embodiment, the artifact pinning subsystem 250 may identify (block 472) a contextual intent 474 from the context of the current episode between the user and the BE 102. As examples, the contextual intent 474 may be a purchase intent, a meeting setup intent, a travel intent, a desire intent, and so forth, based on the current operation of the BE 102 with respect to the dialog between the user and the BE 102. In some embodiments, the BE 102 may provide or make available to the NLU framework 104 the current contextual intent 474 of the BE 102, which may be utilized to tailor operation of the NLU framework 104 for subsequent intent inference operations. For example, in certain embodiments, the contextual intent 474 may be in the form of an indication of a flow (e.g., script, process) that is currently being executed by the BE 102 in response to a previously received user utterance 122, wherein the flow corresponds to a particular intent (e.g., a purchase intent, a schedule meeting intent) that was inferenced by the NLU framework 104 from the previously received user utterance 122. In other embodiments, the artifact pinning subsystem 250 may determine the contextual intent 474 from previous episodes, or from any suitable context information associated with an episode of a conversation that the BE 102 may use to determine suitable actions in response to extracted intents and/or entities of the user utterance 122. Moreover, in certain embodiments, the artifact pinning subsystem 250 and/or the BE 102 may identify that a slot-filling operation or conversation is to be performed to gather additional information or entities associated with the contextual intent 474.
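By way of a minimal, non-limiting illustration of deriving the contextual intent 474 from the flow currently being executed by the BE 102, consider the sketch below; the flow-to-intent mapping and function name are hypothetical.

```python
# Hypothetical mapping from BE flows to the intents they were launched to satisfy.
flow_to_intent = {
    "purchase_item_flow": "purchase",
    "schedule_meeting_flow": "schedule meeting",
}

def identify_contextual_intent(active_flow, fallback=None):
    """Return the contextual intent associated with the BE's active flow, if any."""
    return flow_to_intent.get(active_flow, fallback)

print(identify_contextual_intent("schedule_meeting_flow"))  # "schedule meeting"
```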
Based on the contextual intent 474 of the BE 102, the artifact pinning subsystem 250 may aggregate (block 476) or identify relevant meaning representations 480 of the meaning representations 382 of the compiled understanding model 400 that are associated with (e.g., include) the contextual intent 474. It should be understood that the compiled understanding model 400 may have been previously generated via the model-based entity pinning phase 380 discussed above with respect to
With the relevant meaning representations 480 identified, the artifact pinning subsystem 250 may pin (block 482) or intent-pin the relevant meaning representations 480 within the search space 252. The artifact pinning subsystem 250 may therefore pin the meaning search to particular meaning representation candidates that relate to the contextual intent 474, therefore efficiently leveraging available information to improve how the user may interface with the BE 102. In certain embodiments, the pinning performed by the artifact pinning subsystem 250 at block 482 enables the agent automation system 100 to leverage the contextual intent 474 to soft-pin (e.g., identify and retain) particular subsets of the compiled understanding model 400 for particularly relevant meaning searches. The relevant meaning representations 480 identified within the compiled understanding model 400 may be provided a scoring bonus during such searching processes, thereby soft-pinning the generated search space 252 to provide more direct search paths for the meaning searches. The scoring bonus provided to the relevant meaning representations 480 (or scoring penalty provided to irrelevant meaning representations) may be any suitable increase above a threshold score. Thus, the artifact pinning subsystem 250 may increase the contribution of similarity scores determined for meaning representations within the search space 252 that match the contextual intent 474 (e.g., current topic of discussion) to improve similarity scoring processes, imparting a preference to the contextual intent 474 based on a particular conversational frame of reference. By influencing similarity scores (e.g., soft-pinning) instead of pruning the search space 252 based on the contextual intent 474, the artifact pinning subsystem 250 is designed to enable the agent automation system 100 to correctly interpret user utterances that change the topic of conversation, such as by enabling other intents to remain within meaning search results and potentially be identified as meaning search matches to the changed topic.
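Soft-pinning can be approximated as a score adjustment applied during the meaning search, as sketched below; the bonus value, the candidate representation, and the scoring interface are illustrative assumptions rather than the actual similarity scoring process.

```python
def soft_pin_scores(scored_candidates, contextual_intent, bonus=0.15):
    """Boost similarity scores of candidates whose intent matches the contextual
    intent, without removing any candidate from consideration."""
    adjusted = []
    for candidate, score in scored_candidates:
        if candidate["intent"] == contextual_intent:
            score += bonus  # soft-pin: prefer the current topic of discussion
        adjusted.append((candidate, score))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)

scored = [
    ({"intent": "schedule meeting", "gloss": "book a room"}, 0.62),
    ({"intent": "order item", "gloss": "order a laptop"}, 0.70),
]
print(soft_pin_scores(scored, "schedule meeting"))
# Topic changes remain possible because non-matching candidates keep their scores.
```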
In other embodiments, the artifact pinning subsystem 250 executes the pinning of block 482 as hard-pinning, in which meaning representations 382 of the compiled understanding model 400 that are not associated with the contextual intent 474 are removed from consideration (e.g., pruned). The artifact pinning subsystem 250 may therefore generate a restrictive or constrained embodiment of the search space 252 against which the search keys 254 are compared to identify the extracted artifacts 240, directing searching and similarity scoring resources to the remaining relevant meaning representations 480 of the compiled understanding model 400. It should be understood that the artifact pinning subsystem 250 may be individually tailored to perform the BE context intent pinning phase 470 via any suitable combination of soft-pinning or hard-pinning. For example, the artifact pinning subsystem 250 may utilize soft-pinning in response to determining that the contextual intent 474 is relatively uncommon, is associated with an average episode duration that is below a threshold duration, or has any other suitable quality indicative of an increased likelihood of imminent topic change. In embodiments in which the contextual intent 474 is not associated with an increased likelihood of imminent topic change, the artifact pinning subsystem 250 may utilize hard-pinning. In any case, it should be understood that the process 450 may be repeated in response to subsequent user utterances 122 until the contextual intent 474 is satisfied (e.g., all sub-branches of a slot-filling operation are filled).
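Hard-pinning, and the choice between soft-pinning and hard-pinning based on the likelihood of an imminent topic change, could be sketched as follows; the threshold value and the notion of a per-intent topic-change likelihood are illustrative assumptions.

```python
def hard_pin(candidates, contextual_intent):
    """Prune candidates whose intent does not match the contextual intent."""
    return [c for c in candidates if c["intent"] == contextual_intent]

def pin_search_space(candidates, contextual_intent, topic_change_likelihood,
                     threshold=0.5):
    """Use hard-pinning when a topic change is unlikely; otherwise keep all
    candidates and rely on soft-pinning via score adjustment."""
    if topic_change_likelihood < threshold:
        return hard_pin(candidates, contextual_intent)
    return candidates

candidates = [{"intent": "purchase"}, {"intent": "schedule meeting"}]
print(pin_search_space(candidates, "purchase", topic_change_likelihood=0.2))
```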
The BE context intent pinning phase 470 may be better understood with respect to an example embodiment of a particular slot-filling operation of BE 102 in which the user has already indicated a particular intent to purchase an item. Within the purchase intent or context identified as a particular contextual intent 474 (e.g., target intent, intent identified by BE 102 from previous user utterance 122), the BE 102 may respond by prompting the user for additional information related to the purchase intent, and the artifact pinning subsystem 250 may analyze subsequently-received user utterances by targeting (e.g., intent-pinning) the entities as likely being slot-filling responses associated with the contextual intent 474. In such embodiments, the BE 102 progresses through an episode of conversation that requests entities from the user that are missing and important for satisfying the contextual intent 474, without redundantly requesting information that the user has already provided.
As another specific example, in response to receiving a particular user utterance 122 of “I want to schedule a meeting at 10:00 am tomorrow,” the NLU framework 104 may identify that the contextual intent 474 is a “schedule meeting” intent that is associated with a “meeting start time” entity within the intent-entity model 108. As such, the BE 102 may provide an agent response to request that the user provide entities for meeting participants, a meeting subject, and/or a meeting end time. Because the NLU framework 104 is performing the BE context intent pinning phase 470, the search space 252 for meaning search may be desirably narrowed to or directed toward the meeting setup intent, thereby improving agent responses by discrediting or disregarding meaning representations 382 that are not associated with the meeting setup intent.
In other words, the artifact pinning subsystem 250 may recognize that a particular intent is currently being discussed or was previously being discussed during the course of conversation based on input from the BE 102. As such, when the user has already requested scheduling of a meeting, the NLU framework 104 may receive a subsequent user utterance 122 indicating a meeting end time, which is identified as including a potential entity without any potential intent (e.g., a noun phrase). To better analyze the user utterance 122, the artifact pinning subsystem 250 desirably associates the contextual intent 474 with the entity of the user utterance 122 and boosts the confidence level (e.g., above a threshold confidence level) to encourage or force intent-specific entity matching, while permitting potential topic changes. This association effectively operates as a contextual intent pinning to facilitate high-quality and context-relevant meaning searches.
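The association described here, in which an entity-only follow-up utterance is attached to the contextual intent and its confidence is boosted, might be sketched as below; the parse dictionary, confidence values, and threshold are hypothetical assumptions for illustration.

```python
def pin_entity_only_utterance(parse, contextual_intent, min_confidence=0.8):
    """Attach the contextual intent to an utterance that carries only entities,
    boosting confidence so intent-specific entity matching is encouraged."""
    if parse.get("intent") is None and parse.get("entities"):
        parse = dict(parse, intent=contextual_intent)
        parse["confidence"] = max(parse.get("confidence", 0.0), min_confidence)
    return parse

# Follow-up utterance such as "11:00 am" parsed as an entity with no intent.
follow_up = {"intent": None, "entities": ["11:00 am"], "confidence": 0.4}
print(pin_entity_only_utterance(follow_up, "schedule meeting"))
```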
Technical effects of the present disclosure include providing an agent automation framework that implements an artifact pinning subsystem that controls operation of the meaning extraction subsystem and the meaning search subsystem to expand and subsequently narrow a search space used during intent inference. During generation of the search space, the artifact pinning subsystem may determine multiple different understandings of sample utterances within one or multiple intent-entity models by performing vocabulary adjustment and varied part-of-speech assignment to each sample utterance, thus generating a potentially-sizeable quantity of candidates for inclusion within the search space. To prune the candidates, the disclosed artifact pinning subsystem leverages artifact correlations of the sample utterances to discard meaning representations that are not valid representations of a particular sample utterance. That is, the sample utterances generally each belong to an intent that may have been labeled (e.g., by an author or ML-based annotation subsystem) with a particular entity, within the structure defined by the related intent-entity model. The artifact pinning subsystem analyzing the various meaning representations for the identified intent may therefore identify a set of meaning representations that are associated with the particular intent and include a respective entity corresponding to the labeled entity of a corresponding sample utterance. The artifact pinning subsystem may then re-express the set of meaning representations by altering the arrangement or number of nodes of the meaning representations in the set, remove any duplicate candidates, and generate the search space based on the remaining meaning representations with the appropriate pinned entity. During a meaning search based on a received utterance from a user, the search keys may be formed by generating multiple potential meaning representations of the received user utterance. The search space may also be generated or refined during the meaning search with respect to an inferenced, contextual intent or current topic of conversation between the user and a behavior engine, guiding further targeted pruning of the search space. As such, the artifact pinning subsystem improves the quality of meaning searches by removing unviable and irrelevant meaning representations, or by assigning them similarity scores below a threshold, during search space compilation and inference processes.
The specific embodiments described above have been shown by way of example, and it should be understood that these embodiments may be susceptible to various modifications and alternative forms. It should be further understood that the claims are not intended to be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of this disclosure.
The techniques presented and claimed herein are referenced and applied to material objects and concrete examples of a practical nature that demonstrably improve the present technical field and, as such, are not abstract, intangible or purely theoretical. Further, if any claims appended to the end of this specification contain one or more elements designated as “means for [perform]ing [a function] . . . ” or “step for [perform]ing [a function] . . . ”, it is intended that such elements are to be interpreted under 35 U.S.C. 112(f). However, for any claims containing elements designated in any other manner, it is intended that such elements are not to be interpreted under 35 U.S.C. 112(f).
This application claims priority from and the benefit of U.S. Provisional Application No. 62/869,811, entitled “PINNING ARTIFACTS FOR EXPANSION OF SEARCH KEYS AND SEARCH SPACES IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK,” filed Jul. 2, 2019, which is incorporated by reference herein in its entirety for all purposes. This application is also related to U.S. Provisional Application No. 62/869,864, entitled “SYSTEM AND METHOD FOR PERFORMING A MEANING SEARCH USING A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK”; U.S. Provisional Application No. 62/869,817, entitled “PREDICTIVE SIMILARITY SCORING SUBSYSTEM IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK”; and U.S. Provisional Application No. 62/869,826, entitled “DERIVING MULTIPLE MEANING REPRESENTATIONS FOR AN UTTERANCE IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK,” which were each filed Jul. 2, 2019 and are incorporated by reference herein in their entirety for all purposes.