The present invention relates to an interactive device, a method of controlling an interactive device, and a control program. For example, the present invention relates to an interactive device that converses with a user by voice or text.
Interactive devices that converse with a user by voice or text have conventionally been developed. For example, Patent Literature 1 discloses an interactive device that converses with a user by voice. Some such interactive devices are configured to store a user's speeches in a database and to use the user's previous speeches stored in the database to generate speeches of the interactive device.
Patent Literature 1: Japanese Patent Application Publication, Tokukai, No. 2015-87728 (Publication date: May 7, 2015)
However, a user sometimes omits some phrase in his/her speech. For example, in a case where the interactive device says “(Do you) like apples?”, the user may say “Sure” (the subject is omitted), “Yes” (answer is shortened), or the like, instead of saying “(I) like apples”. In such cases, the interactive device is sometimes unable to make effective use of the user's speech in generating a speech of the interactive device. One way to construct a more valuable database would be to complete the user's speech and store it in the database; however, if the interactive device completes the user's speech by adding a seemingly omitted phrase, the completed user's speech may be incorrect. That is, the completed user's speech may be different from the user's intended one. Such an incorrectly completed user's speech cannot be made effective use of in generation of a speech of the interactive device in some cases.
The present invention was made in view of the above issue, and an object thereof is, by storing a user's speech in a state with no omissions or incorrect parts, to make effective use of the stored user's previous speech in order to generate a speech of the interactive device.
In order to attain the above object, an interactive device in accordance with one aspect of the present invention is an interactive device configured to converse with a user by voice or text, including: a speech completion section configured to, if a speech of the user inputted to the interactive device lacks some phrase, complete the speech of the user on the basis of at least one of a previous speech of the interactive device and a previous speech of the user; a correct/incorrect determination section configured to determine whether the speech of the user completed by the speech completion section is correct or incorrect on the basis of a specified determination condition; a speech storing section configured to, if the correct/incorrect determination section determines that the speech of the user is correct, store information of the speech of the user in a speech database; and a speech generation section configured to generate a speech of the interactive device with use of the speech of the user that has been stored in the speech database by the speech storing section.
In order to attain the above object, a method of controlling an interactive device in accordance with one aspect of the present invention is a method of controlling an interactive device that is configured to converse with a user by voice or text, the method including: a speech completing step including, if a speech of the user inputted to the interactive device lacks some phrase, completing the speech of the user on the basis of at least one of a previous speech of the interactive device and a previous speech of the user; a correct/incorrect determining step including determining whether the speech of the user completed in the speech completing step is correct or incorrect on the basis of a specified determination condition; a speech storing step including, if it is determined in the correct/incorrect determining step that the speech of the user is correct, storing information of the speech of the user in a speech database that is for use in generation of a speech of the interactive device; and a speech generating step including generating the speech of the interactive device with use of the speech of the user that has been stored in the speech database in the speech storing step.
According to one aspect of the present invention, it is possible, by storing a user's speech in a state with no omissions or incorrect parts, to make effective use of the stored user's previous speech in order to generate a speech of an interactive device.
The following description will discuss embodiments of the present invention in detail.
(Configuration of Interactive Device 1)
The following description will discuss a configuration of the interactive device 1 in accordance with Embodiment 1, with reference to the drawings.
As illustrated in the drawings, the interactive device 1 includes a speech input section 10, a control section 20, a speech output section 30, a scenario database 40, a speech database 50, and a category table 60.
The speech input section 10 detects a user's speech, and generates speech data that corresponds to the user's speech. The speech input section 10 is, specifically, a microphone. The generated speech data is transmitted to the control section 20.
The control section 20 generates a speech of the interactive device 1. The control section 20 carries out speech recognition of the user's speech detected by the speech input section 10 to thereby obtain information of the user's speech, and stores the obtained information in the speech database 50. As illustrated in the drawings, the control section 20 includes a speech recognition section 21, a morphological analysis section 22, a completion processing section 23, a speech generation section 24, a speech storing section 25, and a correct/incorrect determination section 26.
The speech output section 30 outputs the speech of the interactive device 1, which is generated by the control section 20, in the form of a sound. The speech output section 30 is, specifically, a speaker. In one variation, the interactive device 1 may output the speech of the interactive device 1 in text form.
The scenario database 40 stores therein scenarios for use in generation of a speech of the interactive device 1. The scenarios include question scenarios (described later).
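As a concrete illustration only, the following Python sketch shows one possible in-memory representation of the scenario database 40, the speech database 50, and the category table 60. The class names, fields, and sample entries are assumptions made for illustration; the embodiments are not limited to this form.

```python
# Hypothetical sketch of the three data stores used by the interactive device 1.
# All class and field names are illustrative assumptions, not the disclosed design.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Scenario:
    """One entry of the scenario database 40 (e.g., a question scenario)."""
    scenario_id: str
    template: str                    # e.g. "Do you like [A]?" or "Is(Are) [A] [B]?"
    topic_categories: list           # topic categories the template's slots accept
    condition: Optional[str] = None  # condition/result pair referred to in S202/S203
    result: Optional[str] = None

@dataclass
class SpeechRecord:
    """One information item of a previous speech in the speech database 50."""
    speaker: str                     # "user" or "device"
    text: str                        # speech text (completed where applicable)
    topic_categories: list           # categories of the words in the speech
    user_id: Optional[str] = None    # identification information of the user

@dataclass
class SpeechDatabase:
    records: list = field(default_factory=list)

    def most_recent(self, speaker):
        """Return the most recently stored speech of the given speaker."""
        for record in reversed(self.records):
            if record.speaker == speaker:
                return record
        return None

# The category table 60 modeled as a word-to-topic-category correspondence.
CATEGORY_TABLE = {
    "grapes": "FRUIT", "lemons": "FRUIT", "apples": "FRUIT",
    "sweet": "SWEETNESS", "sour": "SOURNESS",
}
```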
(Flow of Speech Information Obtaining Process)
The following description will discuss a flow of a speech information obtaining process carried out by the control section 20, with reference to the drawings.
As shown in the drawings, the control section 20 first generates and outputs a speech of the interactive device 1 (S1, speech generation process; details will be described later).
The speech recognition section 21 receives, from the speech input section 10, the speech data that corresponds to the user's speech (S2, speech obtaining step). The speech recognition section 21 carries out a speech recognition process with respect to the speech data received from the speech input section 10, and thereby converts the speech data that corresponds to the user's speech into text data (S3). The speech recognition section 21 may be configured such that, if the speech recognition fails, the speech recognition section 21 requests the user to speak again by use of a display notification, a sound notification, or the like, or waits until the user speaks again. The speech recognition section 21 supplies, to the morphological analysis section 22, the result of the speech recognition (i.e., the text data that corresponds to the user's speech). The speech recognition section 21 may be configured such that, even if the speech recognition fails, the speech recognition section 21 supplies the result of the speech recognition to the morphological analysis section 22. Note that, in a case where the interactive device 1 is a machine that converses with a user by text, the morphological analysis section 22 receives, in step S2, text inputted by the user, and the foregoing step S3 is omitted. In the following descriptions, the text data obtained as a result of speech recognition or of text input is referred to as user's speech data.
The morphological analysis section 22 carries out a morphological analysis of the user's speech data obtained from the speech recognition section 21 (S4). Specifically, the morphological analysis section 22 breaks the user's speech into morphemes (e.g., words), each of which is the smallest meaningful unit in the grammar of a language. The morphological analysis is an existing technique, and therefore descriptions therefor are omitted here.
Next, the morphological analysis section 22 evaluates the result of the morphological analysis (S5). Specifically, the morphological analysis section 22 determines whether the user's speech omits a phrase or not. Note here that a phrase is made up of one or more words.
If it is determined that the user's speech omits a phrase (Yes in S6), the completion processing section 23 completes the user's speech by adding a seemingly omitted phrase (e.g., subject, predicate, modifier) based on at least one of the immediately preceding speech of the interactive device 1 and a previous speech of the user (S7, speech completing step). A flow of the speech completion process (S7) carried out by the completion processing section 23 will be described later. On the other hand, if it is determined that the user's speech does not omit any phrase (No in S6), the completion processing section 23 does not carry out the speech completion process.
The speech storing section 25 obtains the user's speech data from the completion processing section 23. As described earlier, if it is determined that the user's speech omits some phrase, the completion processing section 23 completes the user's speech by adding a seemingly omitted phrase in step S7. Therefore, the user's speech that the speech storing section 25 obtains is a complete speech having no phrases omitted.
Next, the speech storing section 25 determines a topic category of each word contained in the user's speech, with reference to the category table 60. The information of the user's speech is then stored in the speech database 50 in the speech storing process (S8, speech storing step).
In a case where the completion processing section 23 completes the user's speech in step S7, the completed user's speech may be different from the user's intended one. For example, in a case where the user says “sweet”, the completion processing section 23 completes the user's speech by adding a subject that was seemingly omitted in the user's speech; however, the subject added by the completion processing section 23 may be different from the user's intended subject. To address this, the correct/incorrect determination section 26 determines whether the completed user's speech is correct or incorrect on the basis of a specified determination condition, and, only if it is determined that the completed user's speech is correct, the speech storing section 25 stores the information of the completed user's speech in the speech database 50. The determination of whether the completed user's speech is correct or incorrect, carried out by the correct/incorrect determination section 26, can be made on the basis of any determination condition. For example, the correct/incorrect determination section 26 may use information of the user's immediately preceding speech or information of the immediately preceding speech of the interactive device 1 to determine whether the completed user's speech is correct or incorrect. One example of the speech storing process (S8) carried out by the correct/incorrect determination section 26 will be described later. With this, the speech information obtaining process ends.
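The overall flow of steps S2 through S8 can be summarized in code. The sketch below is a minimal, hypothetical orchestration in which the helper methods (recognize, analyze, and so on) stand in for the speech recognition section 21, morphological analysis section 22, completion processing section 23, and correct/incorrect determination section 26; none of these method names come from the embodiments themselves, and SpeechRecord reuses the earlier sketch.

```python
# Hypothetical orchestration of the speech information obtaining process.
# `device` is assumed to expose one helper per functional block.
def obtain_speech_information(audio, device, db):
    text = device.recognize(audio)            # S2-S3: speech recognition
    morphemes = device.analyze(text)          # S4: morphological analysis
    if device.phrase_omitted(morphemes):      # S5-S6: evaluate the result
        text = device.complete(text, db)      # S7: speech completion process
        if not device.is_correct(text, db):   # S8: correct/incorrect determination
            return                            # judged incorrect: do not store
    # S8: store the (completed) user's speech with its topic categories.
    db.records.append(SpeechRecord("user", text, device.categories_of(text)))
```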
According to the above-described speech information obtaining process, it is possible to store a user's speech in a complete state, that is, in a state having no phrases omitted, in the speech database 50. Information of user's previous speeches stored in the speech database 50 can be used to generate a speech of the interactive device 1. A method of generating a speech of the interactive device 1 with the use of the information of the user's previous speeches stored in the speech database 50 will be described later.
(S1: Flow of Speech Generation Process)
The following description will discuss a flow of step S1 of the foregoing speech information obtaining process, that is, a flow of the speech generation process, with reference to the drawings.
As shown in the drawings, the speech generation section 24 first searches the scenario database 40 for a scenario that contains the same topic category as the topic category associated with the user's immediately preceding speech (S201).
Note that, if there are no scenarios containing the same topic category as the topic category associated with the user's immediately preceding speech in the scenario database 40 (No in S201), the interactive device 1 may respond to the user's speech by an action such as back-channel feedback, without outputting any speech, or the speech generation section 24 may select a scenario that contains a different topic category (S205). In a case where the topic category of the next speech of the interactive device 1 differs greatly from the topic category of the user's immediately preceding speech, the speech generation section 24 may generate a speech that informs the user of a topic change (e.g., “By the way”).
On the other hand, if there are scenarios containing the same topic category as the topic category associated with the user's immediately preceding speech in the scenario database 40 (Yes in S201), the speech generation section 24 extracts conditions and results associated with those scenarios (S202).
If the speech database 50 contains no information of a user's preceding speech or of a preceding speech of the interactive device 1 that satisfies one of the conditions and results corresponding to the scenarios extracted in S202 (No in S203), the speech generation section 24 selects, from the scenario database 40, a scenario that contains a different topic category from the topic category associated with the user's immediately preceding speech (S205). On the other hand, if the speech database 50 contains such information (Yes in S203), the speech generation section 24 selects one of the extracted scenarios (S204). Then, the speech generation section 24 generates a next speech of the interactive device 1 by replacing the topic category of the scenario selected in S204 or S205 with the topic category of the user's preceding speech or of the preceding speech of the interactive device 1 (S206, speech generating step). With this, the speech generation process ends.
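A minimal sketch of this selection logic follows, under the assumption that each scenario carries a template, topic categories, and an optional condition (as in the data-structure sketch above); the helper names and the condition check are illustrative assumptions, not the disclosed design.

```python
# Hypothetical sketch of steps S201-S206 of the speech generation process.
def condition_satisfied(scenario, db):
    # Stand-in for S203: does any stored previous speech satisfy the
    # scenario's condition/result pair? (Details here are an assumption.)
    return scenario.condition is None or any(
        scenario.condition in record.text for record in db.records)

def fill_template(scenario, topic_word):
    # Stand-in for S206: replace the scenario's slot with a concrete word.
    return scenario.template.replace("[A]", topic_word)

def generate_device_speech(user_topic, scenarios, db):
    # S201: scenarios sharing the topic category of the user's last speech.
    matching = [s for s in scenarios if user_topic in s.topic_categories]
    # S202-S203: keep those whose condition/result is satisfied by a speech
    # stored in the speech database 50.
    satisfied = [s for s in matching if condition_satisfied(s, db)]
    if satisfied:
        chosen = satisfied[0]                     # S204: select one of them
    else:
        # No in S201 or No in S203: select a scenario with a different topic
        # category (S205); returning None models back-channel feedback only.
        others = [s for s in scenarios if user_topic not in s.topic_categories]
        if not others:
            return None
        chosen = others[0]
    return fill_template(chosen, user_topic)      # S206: generate the speech
```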
(S7: Flow of Speech Completion Process)
The following description will discuss a flow of step S7 of the foregoing speech information obtaining process, that is, a flow of the speech completion process, with reference to the drawings.
As shown in the drawings, the completion processing section 23 first determines whether or not the subject was omitted in the user's speech (S301). If it is determined that the subject was omitted in the user's speech (YES in S301), the completion processing section 23 completes the user's speech by adding a subject to the user's speech on the basis of the immediately preceding speech of the interactive device 1 (S302).
Specifically, the completion processing section 23 refers to the speech database 50 to obtain information of the immediately preceding speech of the interactive device 1 (that is, the completion processing section 23 obtains the most recently stored one of the information items of previous speeches of the interactive device 1 stored in the speech database 50). Then, the completion processing section 23 completes the user's speech by adding a subject to the user's speech, based on the subject of the immediately preceding speech of the interactive device 1. For example, in a case where the interactive device 1 says “Do you like grapes?” in accordance with “Scenario 2” in the scenario database 40 and the user says “sweet”, the completion processing section 23 adds the subject “grapes” and generates the completed user's speech “Grapes are sweet”.
In a case where the subject is not omitted in the user's speech (NO in S301), the completion processing section 23 next determines whether or not the predicate was omitted in the user's speech (S303). If it is determined that the predicate was omitted in the user's speech (YES in S303), the completion processing section 23 completes the user's speech by adding a predicate to the user's speech on the basis of the immediately preceding speech of the interactive device 1 (S304). For example, in a case where the immediately preceding speech of the interactive device 1 is “Do you like grapes?” and the user said “I do”, the completion processing section 23 generates the completed user's speech “XX (registered name of the user) likes grapes”. The completion processing section 23 may further carry out a step of adding a modifier to the user's speech (this arrangement is not illustrated).
If it is determined that the predicate is not omitted in the user's speech (NO in S303), the completion processing section 23 next determines whether or not the answer was shortened in the user's speech (S305). That is, the completion processing section 23 determines whether the user's speech is an affirmative response such as “Yes” or a negative response such as “No”. If it is determined that the answer is shortened in the user's speech (YES in S305), the completion processing section 23 refers to the speech database 50 to obtain the immediately preceding speech of the interactive device 1, and completes the user's shortened answer into a full sentence on the basis of the obtained speech (S306).
If it is determined that the answer is not shortened either (NO in S305), that is, if none of the phrases in the user's speech is omitted, the completion processing section 23 does not carry out the speech completion process with respect to the user's speech.
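To make the branching concrete, here is a toy sketch of steps S301 through S306. The omission detectors and the English sentence patterns are illustrative assumptions only; the embodiments determine omission from the morphological analysis result, which is not reproduced here.

```python
# Toy sketch of the speech completion process (S301-S306).
def extract_topic_word(device_speech):
    # Hypothetical: pull the topic word out of e.g. "Do you like grapes?".
    return device_speech.rstrip("?").split()[-1]

def subject_omitted(text):
    # Hypothetical stand-in for the morphological check of S301.
    return text.strip().lower() in {"sweet", "sour"}

def predicate_omitted(text):
    # Hypothetical stand-in for the morphological check of S303.
    return text.strip().lower() == "i do"

def answer_shortened(text):
    # Hypothetical stand-in for the check of S305.
    return text.strip().lower() in {"yes", "yep", "sure", "no", "nope"}

def complete_user_speech(user_text, device_prev, user_name):
    topic = extract_topic_word(device_prev)              # e.g. "grapes"
    if subject_omitted(user_text):                       # YES in S301
        return f"{topic.capitalize()} are {user_text}"   # S302: add a subject
    if predicate_omitted(user_text):                     # YES in S303
        return f"{user_name} likes {topic}"              # S304: add a predicate
    if answer_shortened(user_text):                      # YES in S305
        return f"{user_name} likes {topic}"              # S306: expand the answer
    return user_text                                     # NO in S305: leave as-is
```

Under these assumptions, complete_user_speech("sweet", "Do you like grapes?", "XX") would yield "Grapes are sweet".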
(S8: Flow of Speech Storing Process)
The following description will discuss a flow of step S8 of the foregoing speech information obtaining process, that is, a flow of the speech storing process, with reference to the drawings.
As shown in the drawings, the correct/incorrect determination section 26 first refers to the speech database 50 and searches for information of a user's previous speech that is associated with the same topic category as a topic category of a word contained in the completed user's speech (S401).
If the correct/incorrect determination section 26 fails to find information of a user's previous speech associated with the same topic category as a topic category of a word contained in the completed user's speech (NO in S402), the correct/incorrect determination section 26 determines that the completed user's speech is incorrect. In this case, the speech storing section 25 does not store the information of the completed user's speech in the speech database 50 (S403). Note however that, if the correct/incorrect determination section 26 determines that the completed user's speech is incorrect, the interactive device 1 may ask the user whether the completed user's speech is correct or incorrect. In this arrangement, if the user's answer is that the completed user's speech is appropriate, the speech storing section 25 also stores, in the speech database 50, the completed user's speech that has been determined to be incorrect by the correct/incorrect determination section 26. This arrangement will be described later in Embodiment 3.
On the other hand, if the correct/incorrect determination section 26 succeeds in finding information of a user's previous speech associated with the same topic category as a topic category of a word contained in the completed user's speech (YES in S402), the correct/incorrect determination section 26 determines that the completed user's speech is correct. In this case, the speech storing section 25 stores the information of the user's speech completed by the completion processing section 23 in the speech database 50 (S404). Note that, in a case where the completion processing section 23 did not carry out the completing process with respect to the user's speech in step S7 of the speech information obtaining process, the correct/incorrect determination section 26 may skip the determination of whether the user's speech is correct or incorrect, and the speech storing section 25 may store the user's speech which has not been subjected to the completing process.
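A minimal sketch of this determination condition follows, assuming topic categories are looked up through a word-to-category table like the category table 60 and that the database holds SpeechRecord-like entries as sketched earlier; the function and argument names are illustrative.

```python
# Sketch of the Embodiment 1 determination (S401-S404): the completed speech
# is judged correct when the speech database 50 already holds a previous
# user speech sharing a topic category with some word in the completed speech.
def is_completion_correct(completed_words, db, category_table):
    completed_categories = {category_table[w] for w in completed_words
                            if w in category_table}
    for record in db.records:                    # S401: search the database
        if (record.speaker == "user"
                and completed_categories & set(record.topic_categories)):
            return True                          # YES in S402 -> stored (S404)
    return False                                 # NO in S402 -> not stored (S403)
```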
(Variation)
In one variation, the correct/incorrect determination section 26 may determine whether the completed user's speech is correct or incorrect on the basis of not only the condition in terms of topic category of the completed user's speech but also a condition in terms of who (which user) made the speech. According to the arrangement of this variation, whether the completed user's speech is correct or incorrect is determined based on an increased number of conditions, and therefore it is possible to more accurately determine whether the completed user's speech is correct or incorrect.
In this variation, if the correct/incorrect determination section 26 succeeds in finding, from the speech database 50, information of a user's previous speech associated with the same topic category as a topic category of the completed user's speech (YES in S402), the correct/incorrect determination section 26 further determines whether or not the found previous speech was made by the same user as the user who made the completed speech, and determines that the completed user's speech is correct only if the two speeches were made by the same user.
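A sketch of this variation, assuming each stored record carries identification information of the user (as in Aspect 4 of the recap); the extra same-user condition simply narrows the Embodiment 1 check above.

```python
# Variation: the matching previous speech must also come from the same user.
def is_completion_correct_for_user(completed_words, user_id, db, category_table):
    completed_categories = {category_table[w] for w in completed_words
                            if w in category_table}
    return any(record.speaker == "user"
               and record.user_id == user_id      # additional condition
               and completed_categories & set(record.topic_categories)
               for record in db.records)
```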
(Example of Speech Database 50)
The following arrangement, which is not illustrated, may be employed: in the speech database 50, an information item of a user's previous speech is provided with (i) an accompanying information item that is indicative of a means (voice input, text input) via which the speech was inputted into the interactive device 1 or (ii) an accompanying information item that is indicative of the state (having been subjected to the completing process or not) of the speech when the speech was stored in the speech database 50.
(Example of Category Table 60)
Some of the topic categories may be in an inclusion relation with each other. Specifically, a word associated with a certain topic category may be one of the words that are associated with another topic category (superordinate category). For example, the topic categories “SWEETNESS”, “SOURNESS”, and “UMAMI” in the category table 60 may each be a subordinate category of a common superordinate category relating to taste.
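The inclusion relation can be modeled with a parent map, as in the sketch below, which treats a match against a subordinate category as also matching its superordinate category. The concrete hierarchy, including the superordinate name "TASTE", is an assumption for illustration.

```python
# Illustrative category hierarchy; "TASTE" as a superordinate is an assumption.
SUPERORDINATE = {"SWEETNESS": "TASTE", "SOURNESS": "TASTE", "UMAMI": "TASTE"}

def categories_match(category_a, category_b):
    """True if the two topic categories are identical or in inclusion relation."""
    return (category_a == category_b
            or SUPERORDINATE.get(category_a) == category_b
            or SUPERORDINATE.get(category_b) == category_a)
```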
In the speech storing process S8 of Embodiment 1, the correct/incorrect determination section 26 determines that the completed user's speech is correct if a topic category of a word contained in the completed user's speech is the same as a topic category of a user's previous speech. In contrast, in the speech storing process S8 of Embodiment 2, the correct/incorrect determination section 26 determines whether the completed user's speech is correct or incorrect on the basis of a combination of topic categories of a plurality of words contained in the completed user's speech.
(S8: Flow of Speech Storing Process)
The following description will discuss a flow of a speech storing process S8 in accordance with Embodiment 2, with reference to the drawings.
As shown in the drawings, the correct/incorrect determination section 26 first refers to the category table 60 and determines whether or not a combination of topic categories of a plurality of words contained in the completed user's speech is the same as a combination of topic categories associated with the immediately preceding speech of the interactive device 1 (S501).
If a combination of topic categories of a plurality of words contained in the completed user's speech is not the same as the combination of topic categories associated with the immediately preceding speech of the interactive device 1 (NO in S502), the speech storing section 25 does not store the information of the completed user's speech in the speech database 50 (S503). Note that the following arrangement, like that described later in Embodiment 3, may be employed: if the correct/incorrect determination section 26 determines that the completed user's speech is incorrect, the interactive device 1 asks the user whether the completed user's speech is correct or incorrect. In this arrangement, if the user answers that the completed user's speech is appropriate, the speech storing section 25 also stores, in the speech database 50, the completed user's speech that has been determined to be incorrect by the correct/incorrect determination section 26.
On the other hand, if a combination of topic categories of a plurality of words contained in the completed user's speech is the same as the combination of topic categories associated with the immediately preceding speech of the interactive device 1 (YES in S502), the speech storing section 25 stores the information of the completed user's speech in the speech database 50 (S504). Note that, if the completion processing section 23 does not carry out the completing process with respect to the user's speech in step S7 of the speech information obtaining process, the correct/incorrect determination section 26 may or may not carry out the determination of whether the user's speech is correct or incorrect. In a case where the correct/incorrect determination section 26 does not carry out the determination of whether the user's speech is correct or incorrect, the speech storing section 25 may store the user's speech that has not been subjected to the completing process.
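A sketch of the Embodiment 2 condition follows, modeling a "combination of topic categories" as a set; this representation is an assumption (a multiset or ordered comparison would also be possible).

```python
# Sketch of S501-S504: the completed speech is judged correct only when the
# combination of topic categories of its words equals the combination for
# the interactive device 1's immediately preceding speech.
def combination_matches(completed_words, device_prev_words, category_table):
    completed = {category_table[w] for w in completed_words
                 if w in category_table}
    preceding = {category_table[w] for w in device_prev_words
                 if w in category_table}
    return completed == preceding    # YES in S502 -> store (S504); else S503
```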
In a case where the interactive device 1 and a user are having a conversation about a certain topic, the user's speech is closely related to the immediately preceding speech of the interactive device 1. On the other hand, if the user has changed topics, the user's speech is less related to the immediately preceding speech of the interactive device 1. As described earlier, the completion processing section 23 completes the user's speech on the basis of the immediately preceding speech of the interactive device 1, and therefore the completion processing section 23 is highly likely to be able to correctly complete the user's speech in the former case; however, in the latter case, the completion processing section 23 is less likely to be able to correctly complete the user's speech. According to the arrangement of Embodiment 2, the speech storing section 25 stores the completed user's speech in the speech database 50 only if the topic categories of the words contained in the completed user's speech are the same as the topic categories of the words contained in the immediately preceding speech of the interactive device 1, that is, only in the former case. As such, the speech storing section 25 is capable of storing, in the speech database 50, only information of a user's speech that is highly likely to have been completed correctly.
Note that the speech storing process discussed in Embodiment 2 and the speech storing process discussed in Embodiment 1 may be employed in combination. For example, the following arrangement may be employed. As described earlier in Embodiment 1, the correct/incorrect determination section 26 first determines whether or not a topic category of a word contained in the completed user's speech is the same as a topic category of a user's previous speech. If it is determined that the topic category of a word contained in the completed user's speech is the same as a topic category of a user's previous speech, the correct/incorrect determination section 26 determines that the completed user's speech is correct. On the other hand, if it is determined that the topic category of a word contained in the completed user's speech is not the same as a topic category of a user's previous speech, the correct/incorrect determination section 26 further carries out a determination of whether the completed user's speech is correct or incorrect in the manner described in Embodiment 2. According to this arrangement, the correct/incorrect determination section 26 is capable of more accurately determining whether the completed user's speech is correct or incorrect.
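Combining the two determinations as described could look like the following sketch, reusing the hypothetical helpers from the Embodiment 1 and Embodiment 2 sketches above.

```python
# First apply the Embodiment 1 check; only if it fails, fall back to the
# Embodiment 2 combination check.
def is_completion_correct_combined(completed_words, device_prev_words,
                                   db, category_table):
    if is_completion_correct(completed_words, db, category_table):
        return True
    return combination_matches(completed_words, device_prev_words,
                               category_table)
```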
Embodiment 3 deals with an arrangement in which, if it is determined in the speech storing process S8 of the speech information obtaining process that the completed user's speech is not to be stored, the interactive device 1 asks the user whether the completed user's speech is correct or incorrect.
(Speech Confirmation Process)
The following description will discuss a flow of a speech confirmation process in accordance with Embodiment 3, with reference to the drawings.
As shown in the drawings, the speech generation section 24 first searches the scenario database 40 for a question scenario that contains the same topic category as a topic category of a word contained in the completed user's speech (S601).
If the speech generation section 24 fails to find a scenario that contains the same topic category as that of a word contained in the completed user's speech from the scenario database 40 (NO in S602), the speech generation section 24 generates a speech of the interactive device 1 on the basis of the topic category of the user's speech (S603). For example, in a case where the completed user's speech is “Lemons are sweet”, the speech generation section 24 generates a speech of the interactive device 1 on the basis of a topic category (e.g., fruit) associated with “lemons” and a topic category (e.g., sweetness) associated with “sweet”. For example, the speech generation section 24 may generate the speech “Are lemons sweet?” as a speech of the interactive device 1. In a case where the user's speech which has not been subjected to the completing process is “sweet”, the morphological analysis section 22 carries out a morphological analysis of the user's speech to thereby determine that the subject ([What]) was omitted in the user's speech. Then, the speech generation section 24 may generate the speech “What tastes sweet?” as a speech of the interactive device 1, on the basis of the result of the morphological analysis by the morphological analysis section 22 and the topic category “sweet” which is associated with the user's speech.
On the other hand, if the speech generation section 24 succeeds in finding a question scenario that contains the same topic category as that of the completed user's speech from the scenario database 40 (YES in S602), the speech generation section 24 generates a speech of the interactive device 1 in accordance with the found question scenario (S604). For example, if the completed user's speech is “Lemons are sweet”, the speech generation section 24 obtains, from the scenario database 40, a question scenario that contains topic categories corresponding to “lemon” and “sweet” (such topic categories are, for example, fruit, sweetness, sourness, umami, and the like). Then, the speech generation section 24 may generate a speech of the interactive device 1 in accordance with the obtained question scenario. For example, in a case where the question scenario obtained by the speech generation section 24 is “Is(Are) [A] [B]?”, the speech generation section 24 may replace [A] with “lemons” and replace [B] with “sweet”, and thereby generate the speech “Are lemons sweet?” as a speech of the interactive device 1.
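For illustration, a sketch of filling the question scenario template follows. The Is/Are agreement heuristic is an assumption for this toy example; the embodiments would rely on the morphological analysis result instead.

```python
# Sketch of S604: build the confirmation question from a question scenario.
def build_confirmation_question(template, subject, predicate):
    question = template.replace("[A]", subject).replace("[B]", predicate)
    verb = "Are" if subject.endswith("s") else "Is"   # naive plural heuristic
    return question.replace("Is(Are)", verb)

# build_confirmation_question("Is(Are) [A] [B]?", "lemons", "sweet")
# -> "Are lemons sweet?"
```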
The speech generation section 24 causes the speech output section 30 to output the thus-generated speech (question) of the interactive device 1 (S605). Then, the control section 20 of the interactive device 1 waits for a certain period of time to receive a user's response to the speech of the interactive device 1.
If the user does not respond within the certain period of time after the speech of the interactive device 1 (No in S606), the speech confirmation process ends. On the other hand, if the user responds within the certain period of time (Yes in S606), the correct/incorrect determination section 26 determines whether the user's response is affirmative (such as “Yes” or “Yep”) or negative (such as “No” or “Nope”) (S607). If the user's response is affirmative (YES in S607), the speech storing section 25 stores the completed user's speech in the speech database 50 (S608). On the other hand, if the user's response is negative (NO in S607), the speech storing section 25 does not store the completed user's speech in the speech database 50.
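The waiting-and-branching part of S605 through S608 can be sketched as follows; the affirmative/negative word lists are assumptions, and SpeechRecord reuses the earlier sketch.

```python
# Sketch of S605-S608: store the completed speech only on an affirmative reply.
AFFIRMATIVE = {"yes", "yep"}
NEGATIVE = {"no", "nope"}

def handle_confirmation_response(response, completed_speech, categories, db):
    """response: the user's reply within the waiting period, or None (S606)."""
    if response is None:                          # No in S606: end the process
        return
    word = response.strip().lower()
    if word in AFFIRMATIVE:                       # YES in S607
        db.records.append(SpeechRecord("user", completed_speech, categories))
    # NO in S607 (negative reply): the completed speech is not stored.
```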
According to the arrangement of Embodiment 3, if the correct/incorrect determination section 26 determines that the completed user's speech is incorrect, the speech generation section 24 asks the user whether the completed user's speech is correct or incorrect. If the user answers that the completed user's speech is correct, the speech storing section 25 stores the user's speech in the speech database 50. Thus, it is possible to more accurately determine whether the completed user's speech is correct or incorrect. In addition, it is possible to reduce the likelihood that information of a correctly completed user's speech will fail to be stored in the speech database 50.
[Software Implementation Example]
The control section 20 of the interactive device 1 can be realized by a logic circuit (hardware) provided in an integrated circuit (IC chip) or the like or can be alternatively realized by software as executed by a central processing unit (CPU).
In the latter case, the interactive device 1 includes a CPU that executes instructions of a program that is software realizing the foregoing functions; a read only memory (ROM) or a storage device (each referred to as “storage medium”) in which the program and various kinds of data are stored so as to be readable by a computer (or a CPU); and a random access memory (RAM) in which the program is loaded. An object of the present invention can be achieved by a computer (or a CPU) reading and executing the program stored in the storage medium. Examples of the storage medium encompass “a non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, and a programmable logic circuit. The program can be supplied to the computer via any transmission medium (such as a communication network or a broadcast wave) which allows the program to be transmitted. Note that the present invention can also be achieved in the form of a computer data signal in which the program is embodied via electronic transmission and which is embedded in a carrier wave.
[Recap]
An interactive device (1) in accordance with Aspect 1 of the present invention is an interactive device configured to converse with a user by voice or text, comprising: a speech completion section (completion processing section 23) configured to, if a speech of the user inputted to the interactive device lacks some phrase, complete the speech of the user on the basis of at least one of a previous speech of the interactive device and a previous speech of the user; a correct/incorrect determination section (26) configured to determine whether the speech of the user completed by the speech completion section is correct or incorrect on the basis of a specified determination condition; a speech storing section (25) configured to, if the correct/incorrect determination section determines that the speech of the user is correct, store information of the speech of the user in a speech database (50); and a speech generation section (24) configured to generate a speech of the interactive device with use of the speech of the user that has been stored in the speech database by the speech storing section.
According to the above arrangement, it is possible to generate a speech of the interactive device with the use of information of a user's speech inputted to the interactive device. Furthermore, if the user's speech lacks some phrase, the user's speech is completed. It follows that information of a complete user's speech with no lack of phrases is stored in the speech database. This makes it possible for the interactive device to generate a speech of the interactive device by making effective use of a user's speech stored in the speech database.
An interactive device in accordance with Aspect 2 of the present invention may be arranged such that, in Aspect 1, the speech completion section is configured to complete the speech of the user on the basis of a word that is contained in the at least one of the previous speech of the interactive device and the previous speech of the user. Note that, if information of both the previous speech of the interactive device and the previous speech of the user is stored in the speech database, the speech completion section may complete the speech of the user on the basis of the speech of the interactive device or of the user most recently stored in the speech database.
According to the above arrangement, it is possible to easily complete the speech of the user on the basis of a topic of a previous conversation between the interactive device and the user. For example, if at least one of the interactive device and the user previously talked about some topic related to a certain word, the certain word is highly likely to be contained in a subsequent speech of the user. As such, if the certain word is added to the speech of the user to complete the speech, the completed speech of the user is highly likely to be correct.
An interactive device in accordance with Aspect 3 of the present invention may be arranged such that, in Aspect 1 or 2, the correct/incorrect determination section is configured to (a) refer to information indicative of a correspondence relationship between words and categories thereof, and (b) if a category of a word that is contained in the speech of the user completed by the speech completion section is the same as a category of a word that is contained in the at least one of the previous speech of the interactive device and the previous speech of the user, determine that the speech of the user is correct.
According to the above arrangement, it is possible to easily determine whether the completed user's speech is correct or incorrect. This makes it possible to selectively store, in the speech database, only information of user's speeches that are highly likely to be correct.
An interactive device in accordance with Aspect 4 of the present invention may be arranged such that, in any of Aspects 1 to 3, the speech storing section is configured to store, in the speech database, the speech of the user and at least one of (i) information indicative of one or more categories of one or more words that are contained in the speech of the user, (ii) information indicative of date and time or place at which the speech of the user was inputted, and (iii) identification information of the user.
According to the above arrangement, it is possible to improve the accuracy of determination of whether the speech of the user is correct or incorrect, by making use of the information stored in the speech database.
An interactive device in accordance with Aspect 5 of the present invention may be arranged such that, in any of Aspects 1 to 4, the correct/incorrect determination section is configured to (a) refer to information indicative of a correspondence relationship between words and categories thereof, and (b) if a combination of categories corresponding to a plurality of words that are contained in the speech of the user completed by the speech completion section is the same as a combination of categories corresponding to a plurality of words that are contained in at least one of a speech of the interactive device and a speech of the user which are stored in the speech database, determine that the speech of the user completed by the speech completion section is correct.
According to the above arrangement, it is possible to more accurately determine whether the speech of the user is correct or incorrect, on the basis of a combination of categories of a plurality of words that are contained in at least one of a previous speech of the interactive device and a previous speech of the user.
An interactive device in accordance with Aspect 6 of the present invention may be arranged such that, in any of Aspects 1 to 5, the correct/incorrect determination section is configured to (a) output a speech, of the interactive device, which asks the user whether the speech of the user completed by the speech completion section is correct or incorrect, and (b) if a speech, of the user, which indicates that the speech of the user completed by the speech completion section is correct is inputted to the interactive device, determine that the speech of the user completed by the speech completion section is correct.
According to the above arrangement, it is possible to more accurately determine whether the completed speech of the user is correct or incorrect.
A method of controlling an interactive device in accordance with Aspect 7 of the present invention is a method of controlling an interactive device (1) that is configured to converse with a user by voice or text, the method including: a speech completing step including, if a speech of the user inputted to the interactive device lacks some phrase, completing the speech of the user on the basis of at least one of a previous speech of the interactive device and a previous speech of the user; a correct/incorrect determining step including determining whether the speech of the user completed in the speech completing step is correct or incorrect on the basis of a specified determination condition; a speech storing step including, if it is determined in the correct/incorrect determining step that the speech of the user is correct, storing information of the speech of the user in a speech database (50) that is for use in generation of a speech of the interactive device; and a speech generating step including generating the speech of the interactive device with use of the speech of the user that has been stored in the speech database in the speech storing step. According to this arrangement, it is possible to provide similar effects to those of the interactive device in accordance with Aspect 1.
The interactive device according to the foregoing embodiments of the present invention may be realized by a computer. In this case, the present invention encompasses: a control program for the interactive device which program causes a computer to operate as the foregoing sections (software elements) of the interactive device so that the interactive device can be realized by the computer; and a computer-readable storage medium storing the control program therein.
The present invention is not limited to the embodiments, but can be altered by a skilled person in the art within the scope of the claims. The present invention also encompasses, in its technical scope, any embodiment derived by combining technical means disclosed in differing embodiments. Further, it is possible to form a new technical feature by combining the technical means disclosed in the respective embodiments.
Priority Application: Japanese Patent Application No. 2016-198479, filed October 2016 (JP, national).
International Filing: PCT/JP2017/030408, filed August 24, 2017 (WO).