This patent application is based on and claims priority pursuant to 35 U.S.C. § 119(a) to Japanese Patent Application No. 2021-008362, filed on Jan. 22, 2021, in the Japan Patent Office, the entire disclosure of which is hereby incorporated by reference herein.
Embodiments of the present disclosure relate to an information processing apparatus, an information processing system, an information processing method, and a non-transitory computer-executable medium.
In recent years, deep learning has been used in various situations as one of machine learning techniques. In this deep learning, a framework is known in which a model is automatically built by learning rules of input/output relations of given existing data as a distributed network, and a prediction for new input data is output based on the model. In natural language processing as well, as represented by Bidirectional Encoder Representations from Transformers (BERT), deep learning-based models have appeared that obtain a distributed representation that takes context into consideration. Such models have top-level performance in a wide range of tasks related to natural language processing. Further, application examples of such models include classification of message contents on a social networking service (SNS), Voice of Customer (VOC) analysis of electronic commerce (EC) sites, and document summarization and generation. Use of such models is growing.
Furthermore, a technique is known that estimates answer media (text, image, video, voice) to be used based on a question sentence in natural language and outputs an answer to the question sentence by the estimated answer medium.
An embodiment of the present disclosure includes an information processing apparatus including circuitry. The circuitry receives a question input and transmitted by an input apparatus. The circuitry obtains answer source information for creating an answer to the received question, the answer source information associating natural language information given in advance with non-language information by deep learning, the non-language information including configuration information. The circuitry transmits, to the input apparatus, answer content information to the question or additional content request information requesting an input of additional content to the question, the answer content information and the additional content request information being created based on the answer source information.
Another embodiment of the present disclosure includes an information processing method performed by an information processing apparatus. The information processing method includes receiving a question input and transmitted by an input apparatus. The information processing method includes obtaining answer source information for creating an answer to the received question, the answer source information associating natural language information given in advance with non-language information by deep learning, the non-language information including configuration information. The information processing method includes transmitting, to the input apparatus, answer content information to the question or additional content request information requesting an input of additional content to the question, the answer content information and the additional content request information being created based on the answer source information.
Another embodiment of the present disclosure includes a non-transitory computer-executable medium storing a program storing instructions which, when executed by a computer, cause the computer to perform an information processing method performed by an information processing apparatus. The information processing method includes receiving a question input and transmitted by an input apparatus. The information processing method includes obtaining answer source information for creating an answer to the received question, the answer source information associating natural language information given in advance with non-language information by deep learning, the non-language information including configuration information. The information processing method includes transmitting, to the input apparatus, answer content information to the question or additional content request information requesting an input of additional content to the question, the answer content information and the additional content request information being created based on the answer source information.
A more complete appreciation of the disclosure and many of the attendant advantages and features thereof can be readily obtained and understood from the following detailed description with reference to the accompanying drawings, wherein:
The accompanying drawings are intended to depict embodiments of the present invention and should not be interpreted to limit the scope thereof. The accompanying drawings are not to be considered as drawn to scale unless explicitly noted. Also, identical or similar reference numerals designate identical or similar components throughout the several views.
In describing embodiments illustrated in the drawings, specific terminology is employed for the sake of clarity. However, the disclosure of this specification is not intended to be limited to the specific terminology so selected and it is to be understood that each specific element includes all technical equivalents that have a similar function, operate in a similar manner, and achieve a similar result.
Referring now to the drawings, embodiments of the present disclosure are described below. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Referring to the drawings, embodiments of the present disclosure are described. In the description of the drawings, the same elements are denoted by the same reference numerals, and redundant descriptions thereof are omitted.
A description is given below of the present embodiment, with reference to
Overview of Configuration of Question Answering System 1:
Input Apparatus:
The input apparatus 2 is implemented by an information processing apparatus (computer system) installed with a general-purpose operating system (OS). The input apparatus 2 receives an input of voice (sound) information of voice (natural language) that is spoken by a human and obtained through a microphone or sound generated by a machine, converts the voice (sound) information into text information, and transmits the text information to the information processing apparatus 3 through the communication network 100. Further, the input apparatus 2 receives text information transmitted by the information processing apparatus 3, converts the text information into voice (sound) information, and outputs sound or voice according to the voice (sound) information to the outside through a speaker. Furthermore, the input apparatus 2 converts the received text information into screen information to be displayed on display means, and causes the display means to display a screen based on the screen information. In one example, the input apparatus 2 is a communication terminal having communication capability such as a smartphone, a tablet terminal, a personal digital assistant (PDA), or a wearable personal computer (PC) of sunglasses type or wristwatch type, for example. In another example, the input apparatus 2 is a general-purpose PC. In other words, the input apparatus 2 to be used may be any terminal capable of executing software such as browser software.
Information Processing Apparatus:
The information processing apparatus 3 is implemented by an information processing apparatus (computer system) installed with a general-purpose OS and having a server function. The information processing apparatus 3 communicates with the input apparatus 2 through the communication network 100, and processes data related to question content transmitted by the input apparatus 2. Further, the information processing apparatus 3 generates answer (response) information related to the question content and information for receiving additional question content, and transmits the generated information to the input apparatus 2. In the following description, the term “deep learning”, which is an example of machine learning, is used to refer to the machine learning. Furthermore, in the present embodiment, the information processing apparatus 3 generates a “knowledge source 300” representing answer source information for creating an answer to a particular question, by using natural language information and non-language information, such as an image or configuration information. The natural language information and the non-language information are associated with each other by deep learning. In the following description, the “knowledge source 300” may be referred to simply as the knowledge source 300. The information processing apparatus 3 deductively generates an answer to a question content given by a user by using the generated knowledge source 300. In another example, the information processing apparatus 3 is a general-purpose PC, provided that the information processing apparatus 3 is configured to generate the knowledge source 300.
Terms:
The term “question answering” used in the present embodiment refers to analyzing a given question using deep learning and providing, to a user (questioner) who has asked the question, the exact information that the user wants. On the other hand, the term “search” refers to performing a search with a keyword that one comes up with by oneself and retrieving desired information by analyzing the search result.
Hardware Configuration:
Hardware Configuration of Input Apparatus and Information Processing Apparatus:
A description is now given of a hardware configuration of each apparatus, according to an embodiment.
The CPU 201 controls overall operation of the input apparatus 2. The ROM 202 stores a program to boot the CPU 201. The RAM 203 is used as a work area for the CPU 201. The HD 204 stores various data such as a control program. The HDD controller 205 controls reading or writing of various data from or to the HD 204 under control of the CPU 201. The display 206 displays various information such as a cursor, menu, window, characters, virtual numeric keypad, execution key, or image. The display 206 is one example of a display device (display means). The external device connection I/F 208 is an interface for connecting the input apparatus 2 to various external devices. Examples of the external devices include, but are not limited to, a universal serial bus (USB) memory and a USB device. Examples of the bus line 210 include, but are not limited to, an address bus and a data bus, which electrically connect the elements such as the CPU 201 with each other.
The network I/F 209 is an interface that enables the input apparatus 2 to perform data communication through the communication network 100. The keyboard 211 is an example of an input device (input means) provided with a plurality of keys that allows a user to input characters, numerals, or various instructions. The pointing device 212 is an example of an input device (input means) that allows a user to select or execute various instructions, select an object for processing, or move a cursor being displayed. In another example, the input device (input means) includes at least one of a touch panel or a voice input apparatus, in addition to or in alternative to the keyboard 211 and the pointing device 212. The DVD-RW drive 214 controls reading or writing (storing) of various data from or to a DVD-RW 213, which is an example of a removable storage medium. In another example, the removable storage medium includes at least one of a digital versatile disc recordable (DVD-R) or a Blu-ray® disc, in addition to or in alternative to the DVD-RW. The medium I/F 216 controls reading or writing data from or to a storage medium 215 such as a flash memory. The microphone 218 is an example of a sound collecting device (sound collecting means) that collects voice or ambient sound (audio signal). The speaker 219 is an example of a sound output device (sound output means) that outputs an output sound signal obtained by converting an input sound signal. The sound input/output I/F 217 is a circuit that processes an input or output of a sound signal between the microphone 218 and the speaker 219 under control of the CPU 201.
The information processing apparatus 3 is implemented by a general-purpose computer. As illustrated in
Of these hardware elements, the CPU 301 to the pointing device 312 have the same or substantially the same configurations as the hardware elements from the CPU 201 to the pointing device 212 of the input apparatus 2, and redundant detailed descriptions thereof are omitted below. The medium I/F 316 controls reading or writing (storing) data from or to a storage medium 315 such as a flash memory. In one example, when the information processing apparatus 3 is a general-purpose PC, the information processing apparatus 3 includes a hardware resource corresponding to the DVD-RW drive 214 of the input apparatus 2.
The computer illustrated in
In one example, any one of the above-described control programs is recorded in a file in a format installable or executable on a computer-readable storage medium or is downloaded through a network for distribution. Examples of the storage medium include, but are not limited to, a compact disc recordable (CD-R), a DVD, a Blu-ray® disc, a secure digital (SD) card, and a USB memory. In another example, such a storage medium is provided in domestic or foreign markets as a program product. For example, the information processing apparatus 3 implements an information processing method according to the present disclosure by executing a program according to the present disclosure.
Functional Configuration of Question Answering System:
A description is now given of a functional configuration of each apparatus according to an embodiment, with reference to
Functional Configuration of Input Apparatus:
As illustrated in
Each Functional Unit of Input Apparatus:
A detailed description is now given of each functional unit of the input apparatus 2. The transmission/reception unit 21 of the input apparatus 2 illustrated in
The operation reception unit 22 is implemented mainly by the keyboard 211 and the pointing device 212 illustrated in
The sound input/output unit 23 is implemented mainly by the microphone 218 and the sound input/output I/F 217 illustrated in
The display control unit 24 is implemented mainly by processing of the CPU 201 to the display 206 illustrated in
The conversion and creation unit 27 is implemented mainly by processing of the CPU 201 illustrated in
The storing and reading unit 29 is implemented mainly by processing of the CPU 201 illustrated in
Functional Configuration of Information Processing Apparatus:
As illustrated in
Object-to-be-Extracted Management Table:
The countermeasure is information obtained by extracting what kind of countermeasure has been taken in the managed apparatus to be maintained. For example, as the countermeasure, “replaced a unit”, “replaced a part”, and “repaired” are given and managed.
The location is information obtained by extracting where (at which part) a malfunction has occurred in the managed apparatus to be maintained. For example, as the location, “Part A”, “Part B”, and the like are given and managed. In another example, the above various data (the malfunction phenomenon, the countermeasure, and the location) managed in the object-to-be-extracted management table are automatically extracted using a named entity extraction method or predicate-argument structure analysis, for example. In still another example, the above various data are automatically extracted using a named entity extraction method or predicate-argument structure analysis, for example, in a process of creating the knowledge source 300 described below, instead of being managed as table data.
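The extraction described above can be illustrated by a simplified, rule-based sketch. In the embodiment, the extraction is performed by a named entity extraction method or predicate-argument structure analysis; the regular-expression patterns, field names, and example sentence below are illustrative assumptions only.

```python
import re

# Hypothetical rule-based extractor standing in for the named entity
# extraction or predicate-argument structure analysis described above.
# The patterns and field names are illustrative assumptions.
PATTERNS = {
    "location": re.compile(r"Part [A-Z]|rotation unit|conveyance unit", re.IGNORECASE),
    "malfunction_phenomenon": re.compile(r"abnormal sound|paper jam|damaged", re.IGNORECASE),
    "countermeasure": re.compile(r"replaced? a unit|replaced? a part|repair(?:ed)?", re.IGNORECASE),
}

def extract_objects(report_sentence):
    """Extract the location, malfunction phenomenon, and countermeasure
    from one maintenance-report sentence, mirroring the columns of the
    object-to-be-extracted management table."""
    record = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(report_sentence)
        if match:
            record[field] = match.group(0)
    return record

record = extract_objects("Abnormal sound occurred at Part B; replaced a unit.")
# record contains "Part B" as the location and "replaced a unit" as the countermeasure
```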
Image Information Management Table:
As described above with reference to
The image indicates an image of a part, a unit, or the like of the managed apparatus. For example, as the image, “Image A”, “Image B”, and the like are given and managed. The image icon indicates an icon corresponding to the image. The image icon is given and managed in an image file format of each icon (e.g., .bmp, .jpg).
Product Configuration Management Table:
The first layer is at the highest level when parts, units, and the like of the managed apparatus are represented by computer-aided design (CAD) data. For example, as the first layer, “Unit A” is given and managed. The second layer is at the next lower level to the first layer to which Unit A belongs. For example, as the second layer, a “rotation unit”, a “conveyance unit”, and the like are given and managed. The third layer is at the next lower level to the second layer and includes parts of the second layer. For example, as the third layer, “Part A”, “Part B”, and the like are given and managed.
In another example, the product configuration management table includes up to the second layer, instead of the third layer. In still another example, the product configuration management table further includes the fourth or higher layer.
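The layered structure described above may be sketched, for illustration, as a nested mapping. The layer and part names follow the examples given above, while the grouping of particular parts under particular units is an assumption.

```python
# A minimal sketch of the product configuration management table as a
# three-layer hierarchy derived from CAD data. The grouping of parts
# under units is an illustrative assumption.
PRODUCT_CONFIGURATION = {
    "Unit A": {                                  # first layer (highest level)
        "rotation unit": ["Part A", "Part B"],   # second layer -> third layer
        "conveyance unit": ["Part C", "Part D"],
    },
}

def find_parent_layers(part):
    """Return the (first layer, second layer) pair to which a third-layer
    part belongs, or None when the part is not managed."""
    for first, second_layers in PRODUCT_CONFIGURATION.items():
        for second, parts in second_layers.items():
            if part in parts:
                return first, second
    return None

layers = find_parent_layers("Part C")  # -> ("Unit A", "conveyance unit")
```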
In another example, the object-to-be-extracted management table (the object-to-be-extracted management DB 3001), the image information management table (image information management DB 3002), and the product configuration management table (product configuration management DB 3003) are data managed in respective certain areas of the storage unit 3000, instead of being managed as table data.
Each Functional Unit of Information Processing Apparatus:
A detailed description is now given of each functional unit of the information processing apparatus 3. The transmission/reception unit 31 of the information processing apparatus 3 illustrated in
The analysis unit 32 is implemented mainly by processing of the CPU 301 illustrated in
The extraction and generation unit 33 is implemented mainly by processing of the CPU 301 illustrated in
The determination unit 35 is implemented mainly by processing of the CPU 301 illustrated in
The answer creation unit 36 is implemented mainly by processing of the CPU 301 illustrated in
The storing and reading unit 39 is implemented mainly by processing of the CPU 301 illustrated in
Form of Knowledge Source:
A description is now given of the knowledge source 300, according to the present embodiment.
The knowledge source 300 is extracted and generated from a natural language data group and a structured/unstructured data group described below, which are given in advance. In the present embodiment, the natural language data group is treated as an example of natural language information, and the structured/unstructured data group is treated as an example of non-language information other than the natural language information. The natural language data group is, for example, data related to a natural language held by a manufacturer as a provider of the managed apparatus to be maintained and managed by the question answering system. The natural language data group as an example of the natural language information includes a design specification, a manual, a maintenance report, and parts information. On the other hand, the structured/unstructured data group as an example of the non-language information is an example of a data group other than the natural language data group. Among the structured/unstructured data group, for example, CAD data, three-dimensional (3D) CAD data (mechanism and configuration information of a product), and a bill of material (BOM) are structured data, and a CAD image and an image in a document are unstructured data. As a form of the knowledge source 300, for example, a structured knowledge graph and a relational database (RDB) are given.
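As one illustrative sketch, the knowledge source 300 may be pictured as a set of records associating the natural language information with the non-language information. The embodiment names a structured knowledge graph and an RDB as example forms; the flat record layout and field values below are assumptions.

```python
# A minimal sketch of the knowledge source 300 as records that associate
# natural language information (phenomenon, countermeasure) with
# non-language information (location in the configuration, image).
# The layout and values are illustrative assumptions.
KNOWLEDGE_SOURCE = [
    {"malfunction_phenomenon": "abnormal sound", "location": "Part B",
     "countermeasure": "replace a unit", "image": "Image B"},
    {"malfunction_phenomenon": "abnormal sound", "location": "Part D",
     "countermeasure": "repair", "image": "Image D"},
]

def lookup(phenomenon):
    """Return every record whose malfunction phenomenon matches the
    phenomenon mentioned in the question content."""
    return [r for r in KNOWLEDGE_SOURCE
            if r["malfunction_phenomenon"] == phenomenon]

candidates = lookup("abnormal sound")  # two records -> answer not yet unique
```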
Regarding the knowledge source 300, a description is given of a case as a specific example in which a user who uses the input apparatus 2 or a person in charge of maintenance of the managed apparatus searches for a countermeasure for a malfunction that has occurred in the managed apparatus using the input apparatus 2. In the following description, the user who uses the input apparatus 2 or the person in charge of maintenance of the managed apparatus is referred to as a “user or a maintenance person” for the sake of explanatory convenience. Specific processing includes: (1) information extraction from the natural language data group; (2) information extraction from the data group other than the natural language data group; and (3) association of results obtained by (1) and (2).
(1) Extraction of Information from Natural Language Data Group
In an example in which the natural language data group is data held by the manufacturer that provides the managed apparatus, examples of the natural language data group include, but are not limited to, a design specification, a manual, a maintenance report, and parts information. From the above examples, location information and phenomenon information (e.g., a malfunction phenomenon and a countermeasure) are automatically extracted using at least one of the named entity extraction method and the predicate-argument structure analysis. The location information indicates where parts, units, or the like are arranged in a product. The phenomenon information indicates a phenomenon that has occurred in the parts, units, or the like. Although
As a specific example of the information extraction from the natural language data group, the object-to-be-extracted management table illustrated in
(2) Extraction of Information from Data Group Other than Natural Language Data Group
Layer structure information of a product is acquired from CAD data as the data group other than the natural language data group. In another example, a BOM is also used. Association of an image in a CAD and an image in a document as image data with the natural language data (information managed by “location” in
“Part A” and “Part C” managed as the information of “location” in the knowledge source 300 illustrated in
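The association of image data with the natural-language “location” entries may be sketched as follows. The image-file names and the grouping of “Part A” and “Part C” as one part (e.g., two mounting positions of the same motor) are assumptions for illustration; the embodiment performs the association by deep learning.

```python
# A sketch of associating non-language image data with the "location"
# entries of the knowledge source 300. Image names and the same-part
# grouping are illustrative assumptions.
IMAGE_TABLE = {"Part A": "Image A", "Part B": "Image B",
               "Part C": "Image C", "Part D": "Image D"}
SAME_PART_GROUPS = [{"Part A", "Part C"}]  # managed as the same part

def images_for_location(location):
    """Return the images associated with a location, expanding any
    same-part group that the location belongs to."""
    locations = {location}
    for group in SAME_PART_GROUPS:
        if location in group:
            locations |= group
    return sorted(IMAGE_TABLE[loc] for loc in locations)

images = images_for_location("Part A")  # -> ["Image A", "Image C"]
```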
A description is now given of an operation and processes performed by the question answering system 1 according to the present embodiment, with reference to
Knowledge Source Generation Processing:
In one example, the extraction and generation unit 33 updates the knowledge source 300 according to an update frequency of the object-to-be-extracted management table (the object-to-be-extracted management DB 3001, see
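One possible way to tie regeneration of the knowledge source 300 to the update frequency of the object-to-be-extracted management table is a timestamp comparison, sketched below. The trigger mechanism shown is an assumption, not a detail of the embodiment.

```python
import time

# A sketch of rebuilding the knowledge source 300 whenever the
# object-to-be-extracted management table has changed since the last
# build (timestamp-based trigger is an illustrative assumption).
class KnowledgeSourceUpdater:
    def __init__(self):
        self.generated_at = 0.0  # time of the last knowledge source build

    def maybe_regenerate(self, table_updated_at, regenerate):
        """Rebuild only when the table changed after the last build;
        return whether a rebuild actually ran."""
        if table_updated_at > self.generated_at:
            regenerate()
            self.generated_at = time.time()
            return True
        return False
```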
Question Input Processing:
A description is now given of an operation of receiving an input of a question performed by the input apparatus 2 and an operation of responding to the question performed by the information processing apparatus 3. As illustrated in
Example Display Screen:
Question Response Processing:
Referring again to
Operation of Creating Answer and Additional Question:
In response to receiving the question request, the information processing apparatus 3 performs a process of creating an answer sentence or an additional question to the question indicated by the question content information (step S23). Specifically, the answer creation unit 36 refers to the knowledge source 300, to create an answer sentence corresponding to the content of the given question or an additional question indicated by additional content request information for requesting an input of an additional content for selecting the answer to the question.
Next, the answer creation unit 36 creates an answer candidate group, which is a group of candidates for an answer to the question, by using the knowledge source 300 generated by the extraction and generation unit 33 (step S23-2). Specifically, based on a content obtained by the analysis, the answer creation unit 36 creates two information items as an answer source. Of these two information items, one information item is “location”: “Part B”, “countermeasure”: “replace a unit”, and the other information item is “location”: “Part D”, “countermeasure”: “repair”, both including “abnormal sound” in the malfunction phenomenon.
Next, the determination unit 35 determines whether an answer is uniquely determined from the created answer candidate group (step S23-3). When the determination unit 35 determines that an answer is uniquely determined from the created answer candidate group (step S23-3: YES), the answer creation unit 36 creates an answer sentence to the question based on the content of the created answer candidate group (step S23-4). Then, the operation of the flowchart ends.
By contrast, when the determination unit 35 determines that an answer is not uniquely determined from the created answer candidate group (step S23-3: NO), the answer creation unit 36 creates an additional question sentence for the question to obtain information for uniquely determining an answer to the question (step S23-5). In the example of step S23-2, since the answer creation unit 36 creates the two information items as the answer source, an answer is not uniquely determined. Accordingly, the answer creation unit 36 performs the process of step S23-5 described above. Then, the operation of the flowchart ends. In this case, a content of the additional question, that is, the additional content for selecting the answer to the question is, for example, “Which of the “rotation unit” or the “conveyance unit” is the location where the abnormal sound is occurring?”.
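The flow of steps S23-2 through S23-5 described above can be sketched as follows; the record layout and sentence templates are illustrative assumptions.

```python
# A minimal sketch of candidate creation (S23-2), the uniqueness
# determination (S23-3), and answer or additional-question creation
# (S23-4 / S23-5). Record layout and templates are assumptions.
CANDIDATES = [
    {"location": "Part B", "layer": "rotation unit", "countermeasure": "replace a unit"},
    {"location": "Part D", "layer": "conveyance unit", "countermeasure": "repair"},
]

def respond(candidates):
    """Return ("answer", sentence) when the answer is uniquely determined
    (step S23-3: YES), otherwise ("additional_question", sentence)."""
    if len(candidates) == 1:                                   # step S23-4
        c = candidates[0]
        return "answer", f'For {c["location"]}: {c["countermeasure"]}.'
    layers = " or ".join(f'the "{c["layer"]}"' for c in candidates)
    return "additional_question", (                            # step S23-5
        f"Which of {layers} is the location where the abnormal sound is occurring?")

kind, sentence = respond(CANDIDATES)  # two candidates -> additional question
```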
Referring again to
In response to receiving the response to the question, the storing and reading unit 29 temporarily stores the received response to the question in a particular area of the storage unit 2000 (step S25).
Next, the display control unit 24 reads the response information indicating the temporarily stored answer or the information requesting an input of additional content from the storage unit 2000 in cooperation with the storing and reading unit 29, and displays the read response information or information requesting an input of additional content on the display 206 (step S26). Note that, in response to receiving an additional question from the user or the managed apparatus after performing the process of step S26, the operation transitions to the process of step S21, and the input apparatus 2 repeats the subsequent processes.
Referring again to step S22, when new information (question) is given by the user or the maintenance person, the answer creation unit 36 performs the processes of step S23-1 to step S23-4 or step S23-5 again in accordance with the given information, and repeats the processes until an answer is uniquely determined.
Example Display Screen:
Further, the response notification screen 2102 includes a “confirm” button 2113 (hereinafter, referred to as a confirm button 2113). Thus, the user or the maintenance person confirms the content displayed on the response notification screen 2102, and then operates (e.g., presses or taps) the confirm button 2113. In response to the operation by the user or the maintenance person, the screen transitions to another screen.
The response notification screen 2102 illustrated in
Example Display Screen:
Further, the response notification screen 2103 includes a “confirm” button 2114 (hereinafter, referred to as a confirm button 2114). Thus, the user or the maintenance person confirms the content displayed on the response notification screen 2103, and then operates (e.g., presses or taps) the confirm button 2114. In response to the operation by the user or the maintenance person, the screen transitions to another screen.
The response notification screen 2103 illustrated in
In still another example, the display control unit 24 displays at least one of the inference process through which the answer to the particular question is obtained as illustrated in
Example Display Screen:
The description given above is of the examples of the screen display generated in response to the speech by the user or the maintenance person such as “Please tell me how to deal with the abnormal sound XXXYYY”, in order to search for how to deal with the malfunction of the managed apparatus. In another example, in response to a speech by the user or the maintenance person, for example, “The motor seems damaged, so please tell me how to deal with it” in order to search for how to deal with the malfunction of the managed apparatus, the answer creation unit 36 creates another additional question in step S23-5. For example, the answer creation unit 36 creates an additional question sentence including a content such as “The location where the motor is damaged can be either “Part A” or “Part C”. Do you know which one is damaged?” Further, in this case, “Part A” and “Part C” are managed as the same part in the knowledge source 300. Accordingly, in one example, the display control unit 24 of the input apparatus 2 controls the display 206 to display both the image icon of “Image A” and the image icon of “Image C” transmitted by the information processing apparatus 3 in step S24, to allow the user or the maintenance person to select either one of the displayed image icons.
In one example, in the question answering system according to the present embodiment, for example, when the above-described processes of step S22 and step S24 are performed, another apparatus or the like resides between the input apparatus 2 and the information processing apparatus 3. In other words, in one example, information (data) to be exchanged between the input apparatus 2 and the information processing apparatus 3 is exchanged via another apparatus. The above-described configuration and processing may also be applied to other processing steps between the input apparatus 2 and the information processing apparatus 3.
In the present embodiment, the user and the maintenance person are collectively referred to as a “user”. Further, the user includes, in addition to the maintenance person, a service person who manages various services provided by the managed apparatus, and a repair person who repairs the managed apparatus.
As described above, according to the present embodiment, the information processing apparatus 3 refers to the knowledge source 300, to create an answer sentence or an additional question corresponding to a content of a given question (step S23). Further, the information processing apparatus 3 transmits, as a response to the question to the input apparatus 2, an answer sentence to the question or the additional content request information requesting an input of additional content for selecting an answer to the question (step S24). Thus, since the information processing apparatus 3 requests the input apparatus 2 for new information for selecting the answer to the question, the accuracy of an answer to a given particular question content is improved.
Further, to obtain desired information for a question content given by the user, the information processing apparatus 3 analyzes information across modalities and performs a search based on a combination of images and language information, to generate an answer. Specifically, the information processing apparatus 3 presents, to the user, a process (inference process) of inferring an answer content with reference to the knowledge source 300 and a process (explanation process) of how an answer is obtained, with respect to a particular question content. This enables the information processing apparatus 3 to improve the reliability for the user and the work efficiency of the user.
The functions of one embodiment described above can be implemented by a computer-executable program described in a legacy programming language such as assembler, C, C++, C#, or Java®, or in an object-oriented programming language. The program to implement the functions in each embodiment can be distributed via a telecommunication line.
Further, the program for executing the functions of one embodiment can be stored, for distribution, on a storage medium such as a ROM, an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a flash memory, a flexible disk (FD), a CD-ROM, a DVD-ROM, a DVD-RAM, a DVD-Rewritable (DVD-RW), a Blu-ray® disk, a secure digital (SD) card, or a magneto-optical disc (MO).
Furthermore, some or all of the functions of one embodiment may be mounted on a programmable device (PD) such as an application specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA), a system on a chip (SOC), or a graphics processing unit (GPU). In this case, the functions may be distributed by a storage medium as circuit configuration data (bit stream data) to be downloaded to the PD in order to implement the functions of the embodiment on the PD, or as data described in Hardware Description Language (HDL), Very High Speed Integrated Circuits Hardware Description Language (VHDL), Verilog-HDL, or the like for generating the circuit configuration data.
Each of the tables obtained in the above-described embodiment may be generated as a result of machine learning. In addition, instead of using the tables, the data of each related item may be classified by machine learning. In the present disclosure, machine learning is defined as a technology that enables a computer to acquire human-like learning ability. In other words, machine learning refers to a technology in which a computer autonomously generates an algorithm required for determination, such as data identification, from learning data loaded in advance, and applies the generated algorithm to new data to make a prediction. Any suitable learning method may be applied for machine learning, for example, any one of supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning, or a combination of two or more of these learning methods.
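To make the definition above concrete, the following is a minimal, purely illustrative sketch of supervised learning (one of the learning methods listed): the computer derives a decision rule from labeled training data loaded in advance and applies it to new data to make a prediction. The nearest-neighbor rule used here is an assumption chosen for brevity, not a method prescribed by the embodiment.

```python
# Minimal illustration (hypothetical, not the embodiment itself) of
# supervised learning: a decision rule is derived from labeled
# training data and applied to new data to make a prediction.

def nearest_neighbor_classify(train, new_point):
    """train: list of ((x, y), label) pairs.
    Returns the label of the training point closest to new_point
    (1-nearest-neighbor rule)."""
    def dist2(p, q):
        # Squared Euclidean distance between two 2-D points.
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    _, label = min(train, key=lambda pair: dist2(pair[0], new_point))
    return label


# Labeled learning data loaded in advance.
training_data = [((0.0, 0.0), "class A"), ((1.0, 1.0), "class B")]

# Prediction for new data based on the derived rule.
print(nearest_neighbor_classify(training_data, (0.9, 0.8)))  # class B
```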
In another example, the input apparatus 2 described in one embodiment includes the functions and means of the information processing apparatus 3, thereby enabling the input apparatus to function as an input response apparatus. In this case, the input apparatus includes functional units including the knowledge source and the extraction and generation unit that extracts and generates the knowledge source.
In the technique according to the related art, there has been no idea of generating an answer to a given question content based on multiple information items, including texts and images related to the question content, when returning an answer obtained by deep learning in response to the given question content. This may lead to low accuracy of the answer to the question content.
According to one or more embodiments of the present disclosure, the accuracy of an answer to a given question content is improved.
Although the information processing apparatus, the question answering system, the information processing method, and the program according to embodiments of the present disclosure have been described above, the above-described embodiments are illustrative and do not limit the present disclosure. Thus, numerous additional modifications and variations are possible in light of the above teachings. For example, elements and/or features of different illustrative embodiments may be combined with each other and/or substituted for each other within the scope of the present disclosure.
Any one of the above-described operations may be performed in various other ways, for example, in an order different from the one described above.
Each of the functions of the described embodiments may be implemented by one or more processing circuits or circuitry. Processing circuitry includes a programmed processor, as a processor includes circuitry. A processing circuit also includes devices such as an application specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA), and conventional circuit components arranged to perform the recited functions.
Number | Date | Country | Kind
---|---|---|---
2021-008362 | Jan. 22, 2021 | JP | national