The present disclosure relates to computational methods and computer systems for determining an answer in response to a query.
Question-answer models can employ large question-answer datasets to enable a computer, when provided a question, to provide an answer. Because such models must operate across a wide variety of linguistic contexts, they lack a commonsense approach and accordingly yield an undesirably high percentage of inaccurate answers.
According to one embodiment, a method of using a dialogue computer is disclosed. The method may comprise: receiving a query from a user; providing the query to an input layer of a neural network; injecting one or more triples of a knowledge graph into a plurality of nodes of an output layer of the neural network; and determining an answer to the query based on the output layer.
According to another embodiment, a non-transitory computer-readable medium comprising computer-executable instructions and a memory for maintaining the computer-executable instructions is disclosed. The computer-executable instructions when executed by one or more processors of a computer may perform the following functions: receive a query from a user; provide the query to an input layer of a neural network; inject one or more triples of a knowledge graph into a plurality of nodes of an output layer of the neural network; and determine an answer to the query based on the output layer.
According to another embodiment, a dialogue computer is disclosed. The dialogue computer may comprise: one or more processors; and memory coupled to the one or more processors storing a plurality of instructions executable by the one or more processors, the plurality of instructions comprising instructions to: receive a query from a user; provide the query to an input layer of a neural network; inject one or more triples of a knowledge graph into a plurality of nodes of an output layer of the neural network; and determine an answer to the query based on the output layer.
According to the at least one example set forth above, a computing device comprising at least one processor and memory is disclosed that is programmed to execute any combination of the examples of the method(s) set forth herein.
According to the at least one example, a computer program product is disclosed that includes a computer readable medium that stores instructions which are executable by a computer processor, wherein the instructions of the computer program product include any combination of the examples of the method(s) set forth herein and/or any combination of the instructions executable by the one or more processors, as set forth herein.
Embodiments of the present disclosure are described herein. It is to be understood, however, that the disclosed embodiments are merely examples and other embodiments can take various and alternative forms. The figures are not necessarily to scale; some features could be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the embodiments. As those of ordinary skill in the art will understand, various features illustrated and described with reference to any one of the figures can be combined with features illustrated in one or more other figures to produce embodiments that are not explicitly illustrated or described. The combinations of features illustrated provide representative embodiments for typical applications. Various combinations and modifications of the features consistent with the teachings of this disclosure, however, could be desired for particular applications or implementations.
Turning now to the figures, wherein like reference numerals indicate like or similar features and/or functions, a dialogue computer 10 is shown for generating an answer to a query or question posed by a user (not shown). According to an example, the dialogue computer 10 may form part of a question-and-answer (Q&A) system 12 that further comprises a human-machine interface (HMI) 14, one or more storage media devices 16, and a communication network 18.
A user of the Q&A system 12 may be a human being who communicates a query (a.k.a., a question) with a desire to receive a corresponding response. According to one embodiment, the query may regard any suitable subject matter. In other embodiments, the query may pertain to a predefined category of information (e.g., customer technical support for a product or service). These are merely examples; other embodiments also exist and are contemplated herein. An example process of providing an answer to a user's query will be described following a description of illustrative elements of system 12.
Human-machine interface (HMI) 14 may comprise any suitable electronic input-output device which is capable of: receiving a query from a user, communicating with dialogue computer 10 in response to the query, receiving an answer from dialogue computer 10, and in response, providing the answer to the user. According to the illustrated example, HMI 14 may comprise an input device 20, a controller 22, an output device 24, and a communication device 26.
Input device 20 may comprise one or more electronic input components for receiving a query from the user. Non-limiting examples of input components include: a microphone, a keyboard, a camera or sensor, an electronic touch screen, switches, knobs, or other hand-operated controls, and the like. Thus, via the input device 20, HMI 14 may receive the query from the user in any suitable communication format—e.g., in the form of typed text, uttered speech, user-selected symbols, image data (e.g., camera or video data), sign language, a combination thereof, or the like. Further, the query may be received in any suitable language.
Controller 22 may be any electronic control circuit configured to interact with and/or control the input device 20, the output device 24, and/or the communication device 26. It may comprise a microprocessor, a field-programmable gate array (FPGA), or the like; however, in some examples only discrete circuit elements are used. According to an example, controller 22 may utilize any suitable software as well (e.g., non-limiting examples include: DialogFlow™, a Microsoft chatbot framework, and Cognigy™). While not shown here, in some implementations, the dialogue computer 10 may communicate directly with controller 22. Further, in at least one example, controller 22 may be programmed with software instructions that comprise—in response to receiving at least some image data—determining user gestures and reading the user's lips. The controller 22 may provide the query to the dialogue computer 10 via the communication device 26. In some instances, the controller 22 may extract portions of the query and provide these portions to the dialogue computer 10—e.g., controller 22 may extract a subject of the sentence, a predicate of the sentence, an action of the sentence, a direct object of the sentence, etc.
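By way of illustration only, the following sketch shows one way such extraction of query portions could be implemented; it assumes the spaCy natural-language library and its small English model, neither of which is required by (or named in) this disclosure.

```python
# Hypothetical sketch of query-portion extraction (assumes the spaCy library).
import spacy

nlp = spacy.load("en_core_web_sm")  # small English dependency-parsing model


def extract_portions(query: str) -> dict:
    """Return the subject, action (root verb), and direct object of a query sentence."""
    doc = nlp(query)
    portions = {"subject": None, "action": None, "direct_object": None}
    for token in doc:
        if token.dep_ in ("nsubj", "nsubjpass"):
            portions["subject"] = token.text
        elif token.dep_ == "ROOT":
            portions["action"] = token.lemma_
        elif token.dep_ == "dobj":
            portions["direct_object"] = token.text
    return portions


# e.g., extract_portions("Which oven bakes a cake fastest?") might return
# {"subject": "oven", "action": "bake", "direct_object": "cake"}
```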
Output device 24 may comprise one or more electronic output components for presenting an answer to the user, wherein the answer corresponds with a query received via the input device 20. Non-limiting examples of output components include: a loudspeaker, an electronic display, or the like. In this manner, when the dialogue computer 10 provides an answer to the query, HMI 14 may use the output device 24 to present the answer to the user according to any suitable format. Non-limiting examples include presenting the user with the answer in the form of audible speech, displayed text, one or more symbol images, a sign language video clip, or a combination thereof.
Communication device 26 may comprise any electronic hardware necessary to facilitate communication between dialogue computer 10 and at least one of controller 22, input device 20, or output device 24. Non-limiting examples of communication device 26 include a router, a modem, a cellular chipset, a satellite chipset, a short-range wireless chipset (e.g., facilitating Wi-Fi, Bluetooth, or the like), or a combination thereof. In at least one example, the communication device 26 is optional. E.g., dialogue computer 10 could communicate directly with the controller 22, input device 20, and/or output device 24.
Storage media devices 16 may be any suitable writable and/or non-writable storage media communicatively coupled to the dialogue computer 10. While two storage media devices are shown in the illustrated example, more or fewer may be used in other examples. According to an example, the storage media devices 16 may store structured data (e.g., structured data 42), unstructured data, or both, which may be provided to the dialogue computer 10.
Structured data may be data that is labeled and/or organized by field within an electronic record or electronic file. Non-limiting examples of structured data include a knowledge graph (e.g., having a plurality of nodes (each node defining a different subject matter domain), wherein some of the nodes are interconnected by at least one relation), a data array (an array of elements in a specific order), metadata (e.g., having a resource name, a resource description, a unique identifier, an author, and the like), a linked list (a linear collection of nodes of any type, wherein the nodes have a value and also may point to another node in the list), a tuple (an aggregate data structure), and an object (a structure that has fields and methods which operate on the data within the fields).
According to at least some examples, the knowledge graph may comprise one or more knowledge types. Non-limiting examples include: a declarative commonsense knowledge type (scope comprising factual knowledge; e.g., “the sky is blue,” “Paris is in France,” etc.); a taxonomic knowledge type (scope comprising classification; e.g., “football players are athletes,” “cats are mammals,” etc.); a relational knowledge type (scope comprising relationships; e.g., “the nose is part of the head,” “handwriting requires a hand and a writing instrument,” etc.); a procedural knowledge type (scope comprising prescriptive knowledge, a.k.a., order of operations; e.g., “one needs an oven before baking cakes,” “the electricity should be disconnected while the switch is being repaired,” etc.); a sentiment knowledge type (scope comprising human sentiments; e.g., “rushing to the hospital makes people worried,” “being on vacation makes people relaxed,” etc.); and a metaphorical knowledge type (scope comprising idiomatic structures; e.g., “time flies,” “it's raining cats and dogs,” etc.).
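Purely as a non-limiting illustration, the sketch below represents the knowledge types above as labeled triples; the Python schema, field names, and entries are assumptions made for the sketch, not a required implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str
    knowledge_type: str  # e.g., "declarative", "taxonomic", ...


# Hypothetical entries mirroring the examples in the text.
KNOWLEDGE_GRAPH = [
    Triple("sky", "has_color", "blue", "declarative"),
    Triple("football player", "is_a", "athlete", "taxonomic"),
    Triple("nose", "part_of", "head", "relational"),
    Triple("baking a cake", "requires", "an oven", "procedural"),
    Triple("rushing to the hospital", "causes_feeling", "worried", "sentiment"),
    Triple("time", "metaphorically", "flies", "metaphorical"),
]


def triples_of_type(kind: str) -> list:
    """Return all triples belonging to a given knowledge type."""
    return [t for t in KNOWLEDGE_GRAPH if t.knowledge_type == kind]
```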
Unstructured data may be information that is not organized in a pre-defined manner (i.e., which is not structured data). Non-limiting examples of unstructured data include text data, electronic mail (e-mail) data, social media data, internet forum data, image data, mobile device data, communication data, and media data, just to name a few. Text data may comprise word processing files, spreadsheet files, presentation files, message field information of e-mail files, data logs, etc. Electronic mail (e-mail) data may comprise any unstructured data of e-mail (e.g., a body of an e-mail message). Social media data may comprise information from commercial websites such as Facebook™, Twitter™, LinkedIn™, etc. Internet forum data (also called message board data) may comprise online discussion information (of a website) wherein the website presents saved written communications of forum users (these written communications may be organized or curated by topic); in some examples, forum data may comprise a question and one or more public answers (e.g., question and answer (Q&A) data). Of course, Q&A data may form parts of other data types as well. Image data may comprise information from commercial websites such as YouTube™, Instagram™, other photo-sharing sites, and the like. Mobile device data may comprise Short Message Service (SMS) or other short message data, mobile device location data, etc. Communication data may comprise chat data, instant message data, phone recording data, collaborative software data, etc. And media data may comprise Moving Picture Experts Group (MPEG) Audio Layer III files (MP3s), digital photos, audio files, video files (e.g., including video clips (e.g., a series of one or more frames of a video file)), etc.; some media data may overlap with image data. These are merely examples of unstructured data; other examples also exist. Further, these and other suitable types of unstructured data may be received by the dialogue computer 10—receipt may occur concurrently or otherwise.
As shown in the illustrated example, dialogue computer 10 may comprise one or more processors 30, memory 32, and non-volatile memory 34.
Processor(s) 30 may be programmed to process and/or execute digital instructions to carry out at least some of the tasks described herein. Non-limiting examples of processor(s) 30 include one or more of a microprocessor, a microcontroller or controller, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), one or more electrical circuits comprising discrete digital and/or analog electronic components arranged to perform predetermined tasks or instructions, etc.—just to name a few. In at least one example, processor(s) 30 read from memory 32 and/or non-volatile memory 34 and execute multiple sets of instructions which may be embodied as a computer program product stored on a non-transitory computer-readable storage medium (e.g., such as in non-volatile memory 34). Some non-limiting examples of instructions are described in the process(es) below and illustrated in the drawings. These and other instructions may be executed in any suitable sequence unless otherwise stated. The instructions and the example processes described below are merely embodiments and are not intended to be limiting.
Memory 32 may include any non-transitory computer-usable or -readable medium, which may include one or more storage devices or storage articles. Exemplary non-transitory computer-usable storage devices include a conventional hard disk, solid-state memory, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), and electrically erasable programmable read-only memory (EEPROM), as well as any other volatile or non-volatile media. Non-volatile media include, for example, optical or magnetic disks and other persistent memory; volatile media may include, for example, dynamic random-access memory (DRAM). These storage devices are non-limiting examples; other forms of computer-readable media exist, including magnetic media, compact disc ROMs (CD-ROMs), digital video discs (DVDs), other optical media, any suitable memory chip or cartridge, or any other medium from which a computer can read. As discussed above, memory 32 may store one or more sets of instructions, which may be embodied as software, firmware, or other suitable programming instructions executable by the processor(s) 30—including but not limited to the instruction examples set forth herein. In operation, processor(s) 30 may read data from and/or write data to memory 32.
Non-volatile memory 34 may comprise ROM, EPROM, EEPROM, CD-ROM, DVD, and other suitable non-volatile memory devices. Further, as memory 32 may comprise both volatile and non-volatile memory devices, in at least one example additional non-volatile memory 34 may be optional.
Communication network 18 facilitates electronic communication between dialogue computer 10, the storage media device(s) 16, and HMI 14. Communication network 18 may comprise a land network, a wireless network, or a combination thereof. For example, the land network may enable connectivity to a public switched telephone network (PSTN), such as that used to provide hardwired telephony, packet-switched data communications, internet infrastructure, and the like. And for example, the wireless network may comprise cellular and/or satellite communication architecture potentially covering a wide geographic region. Thus, at least one example of a wireless communication network may comprise eNodeBs, serving gateways, base station transceivers, and the like.
According to at least one example, the structured data 42 comprises at least a question-and-answer (Q&A) pair feature set. The Q&A pair feature set may be provided to a language model 44 stored in non-volatile memory 34 and executed using processor(s) 30 of dialogue computer 10. The language model 44 may be a neural network configured to receive a query (e.g., a query in the form of an interrogative sentence) and provide an answer (e.g., in the form of a declarative sentence). In some examples, dialogue computer 10 further determines a metadata feature set (and/or other feature set) and also provides these set(s) to language model 44. Any combination of feature sets may be provided to the language model 44 so that the language model 44 may be built and trained, thereby increasing the likelihood that the answer is accurate.
Returning to the structured data, the storage media device(s) 16 and/or the dialogue computer 10 may store a knowledge graph 46 comprising one or more of the knowledge types described above.
The various knowledge types of the knowledge graph 46 may be comprised of triples which are interconnected to form a data structure. According to an example, a triple may comprise a subject element, a relationship element, and an object element. According to an example, knowledge graph 46 may be configured to improve the answer determination of the language model 44—e.g., the triples may comprise a subject of a sentence (subject element), a predicate of a sentence (relationship element), and an object of the sentence (object element), wherein the object is part of the predicate of the sentence. Accordingly, potentially millions of subject elements could be linked to different object elements via respective relationship elements. Each of the subject, relationship, and object elements may be assigned a value; e.g., each triple may be represented as a mathematical vector. In this manner, the triple may be utilized by the language model 44.
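The assignment of values to triple elements could take many forms; the following non-limiting sketch uses a simple hashing-based embedding (an assumption made only for illustration) to map a triple to a mathematical vector that a language model could consume.

```python
import hashlib

import numpy as np

DIM = 8  # illustrative embedding size per element


def embed(text: str) -> np.ndarray:
    """Map a string to a small deterministic vector (hashing trick)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return np.frombuffer(digest[: DIM * 4], dtype=np.uint32).astype(np.float32) / 2**32


def triple_to_vector(subject: str, relation: str, obj: str) -> np.ndarray:
    """Concatenate element embeddings so the triple can be consumed numerically."""
    return np.concatenate([embed(subject), embed(relation), embed(obj)])


vec = triple_to_vector("nose", "part_of", "head")  # vector of shape (24,)
```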
As shown in the illustrated example, the language model 44 may comprise a neural network having an input layer 60, an output layer 62, and one or more hidden layers 64, 66 arranged therebetween.
Thus, as shown in the illustrated example, the language model 44 and the knowledge graph 46 may together form a hybrid language model; e.g., one or more triples of the knowledge graph 46 may be injected into a plurality of nodes of the output layer 62, and an output selection 48 may be used to select an answer from among the output nodes.
According to an example (see the description of process 700 below), the dialogue computer 10 receives the user's query, processes it using the hybrid language model, and selects as the answer the output node of the output layer 62 having the highest probability value.
Once the answer is selected, the answer is provided to HMI 14a. As described above, via at least one output device 24, the user is presented with the answer (A). Thus, continuing with the example above: a user may approach HMI 14a (e.g., a kiosk) and utter or type a query via the input device 20; the controller 22 may provide the query to the communication device 26, which may transmit it to the dialogue computer 10; the dialogue computer 10 may execute the hybrid language model (as described above) and, upon determining an answer to the query, provide the answer to the communication device 26; the communication device 26 may provide the answer to the controller 22; and the controller 22 may provide the answer to the output device 24, wherein the output device 24 may present the answer audibly or visually (e.g., via a display).
Turning now to process 600, a process of building and updating the hybrid language model using dialogue computer 10 is described. The process may comprise any suitable combination of the blocks described below, executed in the order shown or in another suitable order.
In block 610, computer 10 may determine whether it is receiving unstructured data—e.g., from storage media devices 16 or the like. If computer 10 is receiving unstructured data, the process may proceed to block 620; otherwise, it may loop back and repeat block 610. It should be appreciated that unstructured data could be received in other ways as well; e.g., initially, a quantity of unstructured data could be moved from a network drive to memory 32 (and/or memory 34).
In block 620, computer 10 may update a first feature set. According to an example, the first feature set could be a Q&A pair feature set, as described above. For example, dialogue computer 10 may parse the unstructured data received in block 610 and organize the data into structured data, such as the Q&A pair feature set. According to at least one example, the Q&A pair feature set is a data array; however, this is not required in all examples. Block 630 may follow.
As shown in the illustrated example, the Q&A pair feature set may comprise a plurality of question-and-answer pairs extracted from the unstructured data (e.g., from question and answer (Q&A) data of internet forum data) and organized as structured data.
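A minimal sketch of block 620 is given below; it assumes, purely for illustration, that the unstructured data arrives as free text in which a line ending with a question mark is a question and the following non-empty line is its public answer.

```python
def update_qa_feature_set(unstructured_text: str, qa_pairs: list) -> list:
    """Append (question, answer) pairs parsed from raw forum text to a data array."""
    lines = [ln.strip() for ln in unstructured_text.splitlines() if ln.strip()]
    for i, line in enumerate(lines[:-1]):
        if line.endswith("?"):                     # naive question detection
            qa_pairs.append((line, lines[i + 1]))  # next line taken as the answer
    return qa_pairs


qa_feature_set: list = []  # the Q&A pair feature set held as a data array
raw = "How do I reset the device?\nHold the power button for ten seconds.\n"
update_qa_feature_set(raw, qa_feature_set)
```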
In block 630, computer 10 may update the language model 44 using the information of block 620. Updated structured data, once imported, may enable the language model 44 to be more refined and yield more accurate answers. Block 630 may or may not include addition of new neural network layers and/or neural network nodes in existing and/or new layers.
Blocks 610-640 may occur before a user approaches HMI 14a. In block 650, the computer 10 determines whether an ‘answer’ to a query has been determined using the knowledge graph 46 (e.g., abbreviated ‘KG’ in the figures). When no answer has been determined, process 600 may loop back to block 610 and continue to receive and process unstructured data to improve the language model 44, the knowledge graph 46, or both. (Alternatively, process 600 could end).
However, when an answer has been determined in block 650, process 600 may loop back and update the language model 44 in block 630. E.g., the nodes of the language model 44 may be tuned further using the answer derived from the injected triples of the knowledge graph feature set. Block 650 may again follow, and then, when no new answer is determined in block 650, process 600 may proceed to block 610, as previously described.
Other embodiments of process 600 also exist. For example, additional feature sets may be used to improve the language model. According to one embodiment, a metadata feature set (and/or other suitable feature set) may be determined and provided to the language model 44, as described above.
According to another embodiment, process 600 further may comprise training the language model using supervised learning. E.g., a query may be presented to the neural network together with one or more predetermined answers to the query so that the neural network learns to determine the correct answer. This process may be repeated until the accuracy of the neural network exceeds a predetermined threshold.
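The supervised-learning loop described above can be pictured with the following sketch; the toy model, the training pairs, and the 95% threshold are stand-ins chosen for illustration and are not part of this disclosure.

```python
import random


class ToyQAModel:
    """Stand-in for the neural network; memorizes a fraction of pairs per epoch."""

    def __init__(self):
        self.known = {}

    def train_step(self, query: str, answer: str) -> None:
        if random.random() < 0.5:  # learn each example with some probability
            self.known[query] = answer

    def predict(self, query: str) -> str:
        return self.known.get(query, "")


TRAINING_PAIRS = [
    ("Is the sky blue?", "Yes"),
    ("Is Paris in France?", "Yes"),
    ("Are cats mammals?", "Yes"),
    ("Does handwriting require a hand?", "Yes"),
]
THRESHOLD = 0.95  # predetermined accuracy threshold (illustrative)

model = ToyQAModel()
accuracy = 0.0
while accuracy < THRESHOLD:  # repeat until accuracy meets or exceeds the threshold
    for q, a in TRAINING_PAIRS:
        model.train_step(q, a)
    correct = sum(model.predict(q) == a for q, a in TRAINING_PAIRS)
    accuracy = correct / len(TRAINING_PAIRS)
```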
Turning now to process 700, a process of determining an answer to a user query using the hybrid language model of dialogue computer 10 is described. The process may comprise any suitable combination of the blocks described below.
In block 710, computer 10 may determine whether a user query has been received. For example, a default state may be that no user query has been received; however, computer 10 may receive an indication from controller 22 (via communication device 26) that a query has been received. According to one non-limiting example, the user query may be received as a RESTful (REpresentational State Transfer) application program interface (API) request via Hypertext Transfer Protocol Secure (HTTPS); however, other examples also exist (including but not limited to a Simple Object Access Protocol (SOAP) request).
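For illustration only, such a RESTful request over HTTPS might resemble the following sketch; the endpoint URL, path, and JSON field names are hypothetical and are not defined by this disclosure.

```python
import json
import urllib.request


def send_query(query_text: str) -> dict:
    """POST a user query to a (hypothetical) dialogue-computer endpoint over HTTPS."""
    payload = json.dumps({"query": query_text}).encode("utf-8")
    req = urllib.request.Request(
        "https://dialogue.example.com/api/v1/query",  # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # HTTPS transport
        return json.loads(resp.read().decode("utf-8"))
```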
According to another embodiment of block 710, controller 22 determines whether a query has been received from a user (e.g., via HMI 14a). When no query is determined to be received, process 700 may loop back and repeat block 710 until a query is received. Otherwise, the process may proceed to block 720.
In block 720, computer 10 may determine one or more triples from knowledge graph 46 using the query. According to a non-limiting example, the computer 10 may evaluate the query sentence, and based on the evaluation, computer 10 may determine at least one associated knowledge type (e.g., one of a declarative commonsense knowledge type, a taxonomic knowledge type, a relational knowledge type, a procedural knowledge type, a sentiment knowledge type, a metaphorical knowledge type, or other suitable type) and provide one or more triples from the knowledge graph 46 based on one or more appropriate knowledge types.
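One plausible (non-limiting) realization of block 720 is sketched below, in which simple keyword cues stand in for the query evaluation and a tiny in-memory graph stands in for knowledge graph 46; the cue words and tuples are assumptions made for the sketch.

```python
# Hypothetical keyword cues for mapping a query to a knowledge type (block 720).
TYPE_CUES = {
    "procedural": ("how do", "steps", "before", "first"),
    "taxonomic": ("kind of", "type of", "is a"),
    "sentiment": ("feel", "worried", "relaxed"),
    "relational": ("part of", "requires"),
}

# Minimal illustrative knowledge graph: (subject, relation, object, knowledge_type).
GRAPH = [
    ("sky", "has_color", "blue", "declarative"),
    ("oven", "needed_before", "baking cakes", "procedural"),
    ("cat", "is_a", "mammal", "taxonomic"),
    ("nose", "part_of", "head", "relational"),
]


def triples_for_query(query: str) -> list:
    """Pick a knowledge type from cue words, then return triples of that type."""
    q = query.lower()
    for ktype, cues in TYPE_CUES.items():
        if any(cue in q for cue in cues):
            return [t for t in GRAPH if t[3] == ktype]
    return [t for t in GRAPH if t[3] == "declarative"]  # fallback type


# e.g., triples_for_query("How do I bake a cake?") returns the procedural triple.
```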
Block 730 may follow block 720, or may occur at least partially concurrently, or may even be executed before block 720. In block 730, computer 10 may provide the query to the language model 44. In some examples, the entire interrogative sentence is provided; in other examples, computer 10 parses the query and provides to the input layer 60 of language model 44 portions of the interrogative sentence.
In block 740, computer 10 may process the query using the language model 44. According to an example, node(s) of the various layers 60, 64, 66 of the neural network are activated. The output layer 62 may be processed in the blocks that follow.
Once the one or more triples are determined (block 720) and once the neural network is executed except for the output layer 62 (block 740), block 750 may be executed by computer 10. As shown in the illustrated example, in block 750, computer 10 may inject the one or more triples (e.g., represented as mathematical vectors) into a plurality of nodes of the output layer 62.
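As a non-limiting illustration of block 750, the sketch below adds knowledge-graph-derived scores to the pre-activation values ("logits") of matching output-layer nodes; the scoring rule, vocabulary, and numbers are assumptions made for the sketch.

```python
import numpy as np


def inject_triples(logits: np.ndarray, triple_scores: dict, vocab: list) -> np.ndarray:
    """Add knowledge-graph-derived scores to the matching output-layer nodes."""
    adjusted = logits.copy()
    for i, candidate_answer in enumerate(vocab):
        # Boost nodes whose candidate answer is supported by a retrieved triple.
        adjusted[i] += triple_scores.get(candidate_answer, 0.0)
    return adjusted


vocab = ["Paris is in France.", "Paris is in Italy.", "The sky is blue."]
logits = np.array([1.2, 1.1, 0.3])            # pre-activations from the hidden layers
triple_scores = {"Paris is in France.": 2.0}  # e.g., from ("Paris", "located_in", "France")
adjusted_logits = inject_triples(logits, triple_scores, vocab)
```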
In block 760, computer 10—via language model 44—may calculate a probability value for at least some of the output nodes of the output layer 62. According to an example, this may be a matrix or vector calculation.
In block 770 which follows, computer 10 may determine which output node has the highest probability value (e.g., of a probability distribution). As discussed above, computer 10 may use output selection 48 to make this determination.
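Blocks 760 and 770 can be pictured as a softmax over the (knowledge-adjusted) output-node values followed by selection of the highest-probability node; the sketch below is generic and is not a statement of the language model's actual arithmetic.

```python
import numpy as np


def select_answer(adjusted_logits: np.ndarray, vocab: list) -> str:
    """Convert output-node values to probabilities and pick the most probable answer."""
    exp = np.exp(adjusted_logits - adjusted_logits.max())  # numerically stable softmax
    probabilities = exp / exp.sum()                        # block 760: probability values
    best = int(np.argmax(probabilities))                   # block 770: highest probability
    return vocab[best]


# Continuing the block-750 sketch above, select_answer(adjusted_logits, vocab)
# would return "Paris is in France."
```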
In block 780, computer 10 may determine an answer based on the output node having the highest probability value and provide this as an answer to the user query. According to one example, computer 10 may provide this to HMI 14a—via communication network 18 (e.g., as an asynchronous response via HTTPS). Communication device 26 may receive this answer and provide it to controller 22 which in turn may provide the answer to output device 24 so that the answer may be presented to the user. Thereafter, process 700 may end, or alternatively, process 700 may loop back to block 710 to repeat the process.
Other embodiments of process 700 also exist. For example, when the knowledge graph 46 is not stored at dialogue computer 10, computer 10 may transmit a request for the one or more triples, and the knowledge graph (e.g., being stored on a different computer or server) may transmit a response that comprises one or more triples.
Further illustrated examples of the system 12 and its elements also exist and are contemplated herein.
Still other examples also exist. For example, knowledge graph 46 described above may be public or private. E.g., a private knowledge graph may comprise proprietary data known only to the entity which manufactures and sells the dialogue computer 10. In at least one example, the proprietary data may comprise personally identifiable information (PII); e.g., the dialogue computer 10 may be configured for a workplace environment.
Thus, there has been described a question and answer system that receives a query from a user, processes the query using a hybrid language model, and then provides an answer to the query. The hybrid language model may comprise aspects of a data-oriented language model and aspects of a knowledge-oriented language model. The hybrid language model may increase the accuracy of answers to user queries.
The processes, methods, or algorithms disclosed herein can be deliverable to/implemented by a processing device, controller, or computer, which can include any existing programmable electronic control unit or dedicated electronic control unit. Similarly, the processes, methods, or algorithms can be stored as data and instructions executable by a controller or computer in many forms including, but not limited to, information permanently stored on non-writable storage media such as ROM devices and information alterably stored on writeable storage media such as floppy disks, magnetic tapes, CDs, RAM devices, and other magnetic and optical media. The processes, methods, or algorithms can also be implemented in a software executable object. Alternatively, the processes, methods, or algorithms can be embodied in whole or in part using suitable hardware components, such as Application Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), state machines, controllers or other hardware components or devices, or a combination of hardware, software and firmware components.
While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms encompassed by the claims. The words used in the specification are words of description rather than limitation, and it is understood that various changes can be made without departing from the spirit and scope of the disclosure. As previously described, the features of various embodiments can be combined to form further embodiments of the invention that may not be explicitly described or illustrated. While various embodiments could have been described as providing advantages or being preferred over other embodiments or prior art implementations with respect to one or more desired characteristics, those of ordinary skill in the art recognize that one or more features or characteristics can be compromised to achieve desired overall system attributes, which depend on the specific application and implementation. These attributes can include, but are not limited to cost, strength, durability, life cycle cost, marketability, appearance, packaging, size, serviceability, weight, manufacturability, ease of assembly, etc. As such, to the extent any embodiments are described as less desirable than other embodiments or prior art implementations with respect to one or more characteristics, these embodiments are not outside the scope of the disclosure and can be desirable for particular applications.