This application claims the benefit of Japanese Patent Application No. 2015-125632, filed on Jun. 23, 2015, which is hereby incorporated by reference herein in its entirety.
Field of the Invention
The present invention relates to a technique for determining a status of a group made up of a plurality of speakers engaged in a conversation.
Description of the Related Art
In recent years, research and development of techniques by which computers perform various types of interventions on humans, such as making proposals and providing support, have been underway. For example, Japanese Patent Application Laid-open No. 2009-36998 and Japanese Patent Application Laid-open No. 2009-36999 disclose selecting a keyword uttered by a user from conversation data to comprehend contents of the utterance and responding in accordance with the utterance contents. Other systems are known which provide information in accordance with a status or preferences of an individual.
The methods described in Japanese Patent Application Laid-open No. 2009-36998 and Japanese Patent Application Laid-open No. 2009-36999 assume a dialogue between one speaker and a computer and do not assume intervening in a conversation carried out by a group made up of a plurality of speakers.
A conversation carried out by a group may include a conversation for decision making such as deciding on a destination. Even when intervening in such a conversation with a focus on statuses or preferences of individuals, it is unclear as to whose opinion should be valued in the event that opinions of members differ from one another. When determining contents of an intervention based solely on utterance contents, opinions of members who have presented arguments with more explicit and specific contents tend to be prioritized. However, this means that members unable to voice explicit opinions will feel increasingly dissatisfied.
In consideration of problems such as those described above, an object of the present invention is to determine a status of a group made up of a plurality of speakers engaged in a conversation in order to enable an appropriate intervention to be performed on the group. Another object of the present invention is to perform an appropriate intervention in accordance with a group status determined in this manner.
In order to achieve the object described above, a first aspect of the present invention is a group status determining device determining a status of a group made up of a plurality of speakers engaged in a conversation, the group status determining device including: an acquiring unit that acquires conversation situational data, which is data regarding a series of groups of utterances made by a plurality of speakers and estimated to be on a same conversation theme; a storage that stores determination criteria, based on the conversation situational data, with respect to a plurality of group types; and a determining unit that acquires a type of the group made up of the plurality of speakers, based on the conversation situational data and the determination criteria, as a group status of the group made up of the plurality of speakers.
A group type is a classification indicating a relationship among members that make up a group. Although group types may be arbitrarily defined, conceivable examples include “a group with a flat relationship and high intimacy, in which members are able to mutually voice their opinions frankly”, “a group with a hierarchical relationship but high intimacy, in which a specific member leads decision making of the group”, and “a group with a hierarchical relationship and low intimacy, in which a specific member leads decision making of the group”. The storage stores determination criteria for determining, based on conversation situational data, which group type a given group corresponds to.
In this case, as data regarding a series of groups of utterances, conversation situational data can include, for example, a speaker of each utterance, a correspondence relationship between utterances, semantics and an intention of each utterance, emotions of a speaker during each utterance, an utterance frequency of each speaker, an utterance feature value of each speaker, and a relationship between the speakers.
For example, when the conversation situational data includes an utterance feature value of each speaker in a series of groups of utterances, criteria for determining a group type based on utterance feature values can be adopted as the determination criteria. In this case, the determining unit can determine which group type a given group corresponds to, based on utterance feature values contained in conversation situational data and determination criteria stored in the storage.
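As a concrete illustration of this aspect, a determining unit of this kind can be sketched as a rule lookup against stored criteria. The following Python sketch is a minimal example only; the feature names, the threshold, and the group label are assumptions for illustration and are not specified by the present disclosure.

```python
from dataclasses import dataclass

@dataclass
class ConversationSituationalData:
    """Data regarding a series of groups of utterances on a same theme."""
    # Hypothetical content: utterance feature values per speaker, e.g.
    # {"A": {"utterance_rate": 0.7}, "B": {"utterance_rate": 0.2}, ...}
    features_by_speaker: dict

class GroupStatusDeterminer:
    """Determines a group type by checking conversation situational data
    against determination criteria held in a storage."""

    def __init__(self, criteria):
        # criteria: list of (group_type, predicate) pairs, where the
        # predicate inspects the per-speaker feature values.
        self.criteria = criteria

    def determine(self, data: ConversationSituationalData) -> str:
        for group_type, predicate in self.criteria:
            if predicate(data.features_by_speaker):
                return group_type
        return "unknown"

# Hypothetical criterion: if one speaker produces most of the utterances,
# assume a hierarchical group led by that speaker.
def dominated_by_one_speaker(features):
    return max(f["utterance_rate"] for f in features.values()) > 0.6

determiner = GroupStatusDeterminer([("hierarchical", dominated_by_one_speaker)])
data = ConversationSituationalData(features_by_speaker={
    "A": {"utterance_rate": 0.7},
    "B": {"utterance_rate": 0.2},
    "C": {"utterance_rate": 0.1},
})
print(determiner.determine(data))  # -> hierarchical
```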
In addition, when the conversation situational data further includes a relationship between utterances and utterance intentions in the series of groups of utterances, the determining unit may favorably estimate an opinion exchange situation in the group based on the information and determine a group type also in consideration of the opinion exchange situation. In this case, the determining unit may determine at least any of liveliness of exchange of opinions in the group, a ratio of agreements against disagreements to a proposal, and presence or absence of an influencer in decision making as the opinion exchange situation.
In the present invention, favorably, the determining unit further determines a relationship among a plurality of speakers included in a group as a group status based on a relationship between utterances and utterance intentions. Examples of relationships among speakers include an influencer and a follower in decision making, a superior and a subordinate, a parent and a child, and friends. The relationship among speakers can be considered as being expressive of roles performed by the respective speakers in the group.
The relationship among speakers can be determined based on wording used in the utterances. For example, when there is a person using commanding language and a person responding thereto in honorifics in the group, the speakers can be determined as a superior and a subordinate. In addition, speakers respectively using informal language can be determined as speakers having a relationship of equals. Furthermore, when one person is using child language and another is using language that is typically used to address a child, the speakers can be determined as an adult and a child or a parent and a child.
In the present invention, the determining unit can acquire a status change of a group as a group status. An example of a status change of a group is an occurrence of stagnation of utterances. An occurrence of stagnation of utterances can be determined based on utterance feature values. Moreover, stagnation of utterances includes both stagnation of utterances by a specific speaker and stagnation of utterances by the group as a whole.
With the group status determining device according to the present aspect, what kind of status a group made up of a plurality of speakers is in can be optimally determined.
A second aspect of the present invention is a support device which intervenes in and supports a conversation held by a group made up of a plurality of speakers. The support device according to the present aspect includes: the group status determining device described above; an intervention policy storing unit which stores a correspondence between group statuses and intervention policies; and an intervening unit which determines contents of an intervention in a conversation by the group based on an intervention policy corresponding to a group status obtained by the group status determining device and which performs an intervention in the conversation.
In the present aspect, favorably, the intervention policies define which member in a group is to be preferentially supported for each group type. In this case, a member in a group can be specified based on a relationship or roles of members in the group. For example, the intervention policies can define preferentially supporting an influencer in a group or preferentially supporting a follower in the group. In addition, a member to be preferentially supported can be specified as a member who has experienced a given status change. For example, the intervention policy can define preferentially supporting a member whose utterance frequency has declined.
With the support device according to the present aspect, optimal support can be provided in accordance with a group status.
Moreover, the present invention can be considered as a group status determining device or a support device including at least a part of the units described above. In addition, the present invention can also be considered as a conversation situation analyzing method or a supporting method which executes at least a part of the processes performed by the units described above. Furthermore, the present invention can also be considered as a computer program that causes these methods to be executed by a computer or a computer-readable storage unit that non-transitorily stores the computer program. The respective units and processes described above can be combined with one another in any way possible to constitute the present invention.
According to the present invention, what kind of status a group made up of a plurality of speakers is in can be optimally determined. In addition, according to the present invention, appropriate support can be provided based on a group status optimally determined in this manner.
<System Configuration>
The present embodiment is a conversation intervention support system which intervenes in a conversation held by a plurality of persons in a vehicle to provide information or support for decision making. The present embodiment is configured so that an appropriate intervention can also be performed in a conversation held by a plurality of persons and, in particular, a conversation held by three or more persons.
In the present embodiment, the respective functions are shared between the navigation device 111 mounted in the vehicle 110 and the server device 120: the navigation device 111 handles the acquisition of conversational speech and the output of presented information, while the server device 120 handles the analysis and determination processes.
Moreover, the navigation device 111 and the server device 120 are both computers including a processing device such as a CPU, a storage device such as a RAM and a ROM, an input device, an output device, a communication interface, and the like, and realize the respective functions described above as the processing device executes a program stored in the storage device. However, some or all of the functions described above may be realized by dedicated hardware. In addition, the server device 120 need not necessarily be one device and may be constituted by a plurality of devices (computers) connected to one another via a communication line, in which case the functions are to be shared among the respective devices.
<Overall Process>
In step S301, the navigation device 111 acquires conversational speech by a plurality of passengers in the vehicle 110 via the microphone 201. In the present embodiment, since subsequent processes on the acquired speech are to be performed by the server device 120, the navigation device 111 transmits the acquired conversational speech to the server device 120 via the communication device 114. Moreover, although the number and arrangement of microphones used are not particularly limited, a plurality of microphones or microphone arrays are favorably used.
In step S302, the server device 120 extracts respective utterances of each speaker from the conversational speech using the noise eliminating unit 202 and the sound source separating unit 203. Moreover, an "utterance" refers both to the act of generating language in the form of speech and to the speech generated as a result. The process performed at this point includes noise elimination by the noise eliminating unit 202 and sound source separation (speaker separation) by the sound source separating unit 203. The noise eliminating unit 202 specifies and eliminates noise based on, for example, a difference between speech obtained from a microphone arranged near a noise generation source and speech obtained from another microphone. In addition, the noise eliminating unit 202 eliminates noise using a correlation in speech input to a plurality of microphones. The sound source separating unit 203 detects a direction and a distance of each speaker with respect to a microphone based on a time difference between inputs of speech to the plurality of microphones in order to specify a speaker.
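As an illustration of speaker-direction detection from inter-microphone time differences, the following is a minimal sketch of a generic time-difference-of-arrival (TDOA) estimate under a far-field, two-microphone assumption. It is not presented as the implementation of the sound source separating unit 203; the sampling rate and microphone spacing are assumed values.

```python
import numpy as np

def estimate_direction(sig_a, sig_b, fs, mic_distance, c=343.0):
    """Estimate a source direction (radians from broadside) from the
    time difference of arrival between two microphone signals."""
    # Cross-correlate to find the lag at which the signals best align.
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)   # lag in samples
    tdoa = lag / fs                            # lag in seconds
    # Far-field model: path-length difference = mic_distance * sin(theta).
    sin_theta = np.clip(tdoa * c / mic_distance, -1.0, 1.0)
    return float(np.arcsin(sin_theta))

# Synthetic check: a pulse that reaches microphone B five samples later.
fs = 16000
pulse = np.zeros(256)
pulse[100] = 1.0
delayed = np.roll(pulse, 5)
angle = estimate_direction(pulse, delayed, fs, mic_distance=0.2)
print(np.degrees(angle))  # the sign indicates which side the source is on
```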
In step S303, the conversation situation analyzing unit 204 analyzes a situation of a conversation held by a plurality of persons. In order to analyze a situation of a conversation held by a plurality of persons and, in particular, three or more persons, for example, whether or not there is a correlation among utterances by the respective speakers and, in a case where a correlation exists, what kind of relationship exists among the utterances must be recognized. In consideration thereof, the conversation situation analyzing unit 204 extracts a group of utterances related to a same conversation theme as a series of groups of utterances, and further comprehends a relationship among the utterances in the group of utterances to analyze a situation of the conversation and a relationship among the speakers in consideration of the relationship among the utterances. Specific contents of the process performed by the conversation situation analyzing unit 204 will be described later.
In step S304, based on conversation situational data provided by the conversation situation analyzing unit 204, the group status determining unit 207 determines a group type of a group of speakers participating in a same conversation or a status of the group of speakers. Conceivable examples of group types include "a group with a flat relationship and high intimacy, in which members are able to mutually voice their opinions frankly", "a group with a hierarchical relationship but high intimacy, in which a specific member leads decision making of the group", and "a group with a hierarchical relationship and low intimacy, in which a specific member leads decision making of the group". In addition, conceivable examples of status changes of a group include a decline in an utterance frequency of a specific member, a decline in utterance frequency of an entire group, a change in emotion of a specific member, and a change in influencers of a group. Specific contents of the process performed by the group status determining unit 207 will be described later.
In step S305, the intervening/arbitrating unit 209 determines an intervention policy in accordance with a group status provided by the group status determining unit 207 and determines a specific timing and contents of the intervention based on the intervention policy and contents of a current conversation. For example, in a case of a group with a flat relationship and high intimacy, in which members are able to mutually voice their opinions frankly, an intervention policy may conceivably be adopted in which detailed reference information is more or less equally presented to everyone to facilitate a lively discussion. In addition, for example, when an utterance frequency of a specific speaker or the entire group has declined, an intervention policy of providing guidance so as to stimulate the conversation may conceivably be adopted. Once an intervention policy is determined, the intervening/arbitrating unit 209 acquires information to be presented in accordance with a current conversation topic from the recommendation system 121, the database 122 of information for store advertisement, or the related information website 130 and issues an intervention instruction. Specific contents of the process performed by the intervening/arbitrating unit 209 will be described later.
In step S306, the output control unit 212 generates synthesized speech or a text to be output in accordance with the intervention instruction output from the intervening/arbitrating unit 209 and reproduces the synthesized speech or the text using the speaker 213 or the display 214.
An intervention in a conversation held by a plurality of speakers in the vehicle 110 may be performed as described above.
Next, details of the conversation situation analyzing process in step S303 will be described.
In step S401, the conversation situation analyzing unit 204 detects utterance sections from speech data obtained by sound source separation and adds a section ID and a time stamp to each utterance section. Moreover, an utterance section is a single continuous section in which speech is being uttered. An utterance section is assumed to end when, for example, a non-utterance period of 1500 milliseconds or more occurs. Due to this process, conversational speech can be separated into a plurality of pieces of speech data for each speaker and for each utterance section. Hereinafter, speech of an utterance in one utterance section may also be simply referred to as an utterance.
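A minimal sketch of such utterance section detection follows, using a simple frame-energy check for voice activity. Only the 1500-millisecond gap comes from the description above; the frame length and energy threshold are assumptions.

```python
import numpy as np

def detect_utterance_sections(speech, fs, frame_ms=20, energy_thresh=1e-4,
                              max_gap_ms=1500):
    """Split one speaker's separated speech into utterance sections:
    voiced stretches ended by 1500 ms (or more) of non-utterance."""
    speech = np.asarray(speech, dtype=float)
    frame = int(fs * frame_ms / 1000)
    n_frames = len(speech) // frame
    energies = np.array([np.mean(speech[i * frame:(i + 1) * frame] ** 2)
                         for i in range(n_frames)])
    voiced = energies > energy_thresh
    max_gap = max_gap_ms // frame_ms

    sections, start, gap = [], None, 0
    for i, v in enumerate(voiced):
        if v:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= max_gap:  # enough silence: close the current section
                sections.append((start * frame / fs, (i - gap + 1) * frame / fs))
                start, gap = None, 0
    if start is not None:       # flush a section still open at the end
        sections.append((start * frame / fs, (n_frames - gap) * frame / fs))
    return [{"section_id": j, "start_s": s, "end_s": e}
            for j, (s, e) in enumerate(sections)]

fs = 16000
sig = np.zeros(fs * 4)
sig[:8000] = 0.1                  # 0.5 s of speech
sig[fs * 3:fs * 3 + 8000] = 0.1   # more speech after 2.5 s of silence
print(detect_utterance_sections(sig, fs))
```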
In step S402, the conversation situation analyzing unit 204 calculates utterance feature values (speech feature values) for each utterance. Examples of utterance feature values include a power level of voice, a pitch, a tone, a duration, and an utterance speed (an average mora length). A power level of voice indicates a sound pressure level of an utterance. A tone indicates the height of a sound itself and is specified by the number of vibrations (frequency) of the sound wave per second. A pitch indicates the perceived height of a sound and is specified by the physical height (fundamental frequency) of the sound. An average mora length is calculated as the length (period of time) of an utterance per mora, a mora being a single beat of speech. In this case, with respect to the power level of voice, the pitch, the tone, and the utterance speed, favorably, an average value, a maximum value, a minimum value, a variation width, a standard deviation, or the like in an utterance section is obtained. While the utterance feature values described above are calculated in the present embodiment, not all of the feature values exemplified above need be calculated, and utterance feature values other than those exemplified above may be calculated.
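The per-section statistics can be pictured as follows. This sketch assumes that per-frame power and pitch tracks for one utterance section, as well as a mora count (for example, from a speech recognition result), are already available from upstream processing.

```python
import numpy as np

def utterance_features(power_db, pitch_hz, duration_s, mora_count):
    """Summarize per-frame power and pitch of one utterance section and
    compute the utterance speed as an average mora length."""
    def stats(x):
        x = np.asarray(x, dtype=float)
        return {"mean": float(x.mean()), "max": float(x.max()),
                "min": float(x.min()), "range": float(x.max() - x.min()),
                "std": float(x.std())}
    return {
        "power": stats(power_db),   # sound pressure level statistics
        "pitch": stats(pitch_hz),   # fundamental frequency statistics
        "avg_mora_length_s": duration_s / mora_count,   # utterance speed
    }

print(utterance_features(power_db=[60, 63, 61], pitch_hz=[180, 210, 195],
                         duration_s=1.2, mora_count=8))
```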
In step S403, the conversation situation analyzing unit 204 obtains an emotion of a speaker for each utterance from a change in utterance feature values. Examples of emotions to be obtained include satisfaction, dissatisfaction, excitement, anger, sadness, anticipation, relief, and anxiety. An emotion can be obtained based on, for example, a change in a power level, a pitch, or a tone of an utterance from a normal status thereof. Utterance feature values during a normal status of each speaker may be derived from previously obtained utterance feature values, or information stored in the database 123 of user information and usage history may be used. Moreover, an emotion of a speaker need not be determined based solely on utterances (speech data). An emotion of a speaker can also be obtained from contents (a text) of an utterance. Alternatively, for example, a facial feature value can be calculated from a facial image of a speaker taken by the camera 113, in which case an emotion of the speaker can be obtained based on a change in the facial feature value.
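One simple way to picture a "change from a normal status" is a z-score comparison against per-speaker baselines, as in the sketch below. The thresholds and emotion labels are illustrative assumptions.

```python
def estimate_emotion(features, baseline):
    """Crude illustration: compare an utterance's mean power and pitch
    against the speaker's normal-status baseline (mean, std per feature)."""
    def z(name):
        mean, std = baseline[name]
        return (features[name] - mean) / std
    if z("power_mean") > 1.5 and z("pitch_mean") > 1.5:
        return "excitement"       # notably louder and higher than usual
    if z("power_mean") < -1.5:
        return "sadness"          # notably subdued voice
    return "neutral"

baseline = {"power_mean": (62.0, 2.0), "pitch_mean": (190.0, 15.0)}
print(estimate_emotion({"power_mean": 68.0, "pitch_mean": 230.0}, baseline))
# -> excitement
```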
In step S404, the conversation situation analyzing unit 204 performs a speech recognition process on each utterance using the speech recognition corpus/dictionary 205 to convert the utterance contents into a text. Known techniques may be applied for the speech recognition process. The utterance contents (the text) obtained by this process are used in the subsequent analyses.
In step S405, the conversation situation analyzing unit 204 estimates an intention and a conversation topic of each utterance from the contents (the text) of the utterance by referring to the vocabulary/intention understanding corpus/dictionary 206. Examples of an utterance intention include starting a conversation, making a proposal, agreeing or disagreeing with a proposal, and consolidating opinions. Examples of a conversation topic of an utterance include a category of the utterance, a location, and a matter. Examples of a category of an utterance include drinking and eating, travel, music, and weather. Examples of a location brought up as a conversation topic include a place name, a landmark, a store name, and a facility name. The vocabulary/intention understanding corpus/dictionary 206 includes dictionaries of vocabularies respectively used in cases of “starting a conversation, making a proposal, asking a question, voicing agreement, voicing disagreement, consolidating matters”, and the like, dictionaries of vocabularies related to “drinking and eating, travel, music, weather, and the like” for specifying a category of an utterance, and dictionaries of vocabularies related to “a place name, a landmark, a store name, a facility name, and the like” for specifying a location brought up as a conversation topic. Moreover, when estimating the utterance intention, an emotion of a speaker is favorably taken into consideration in addition to the text of the utterance. For example, when the utterance contents (the text) indicates consent to a proposal, the utterance intention can be estimated in greater detail by taking the emotion of the speaker into consideration such as a case of joyful consent and a case of grudging consent.
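A minimal sketch of this dictionary-based estimation follows. The vocabulary entries are invented English stand-ins for the contents of the vocabulary/intention understanding corpus/dictionary 206, which is not disclosed at this level of detail.

```python
INTENTION_VOCAB = {   # invented stand-ins for dictionary 206 entries
    "starting a conversation": ["let's decide", "shall we"],
    "proposal": ["how about", "why not"],
    "agreement": ["sounds good", "i agree"],
    "disagreement": ["i'd rather not", "not really"],
    "consolidating": ["that settles it", "so it's decided"],
}
CATEGORY_VOCAB = {
    "drinking and eating": ["lunch", "restaurant", "food", "fish"],
    "travel": ["trip", "sightseeing"],
}
LOCATION_VOCAB = ["kita-kamakura", "kamakura", "sea"]  # longest match first

def estimate_intention_and_topic(text):
    """Match an utterance text against the vocabularies and return the
    estimated (category, location, intention) triple; None on a miss."""
    t = text.lower()
    def find(vocab):
        return next((label for label, words in vocab.items()
                     if any(w in t for w in words)), None)
    location = next((p for p in LOCATION_VOCAB if p in t), None)
    return (find(CATEGORY_VOCAB), location, find(INTENTION_VOCAB))

print(estimate_intention_and_topic("How about Italian food in Kita-Kamakura?"))
# -> ('drinking and eating', 'kita-kamakura', 'proposal')
```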
As a result of the process of step S405, an intention of a speaker such as "what the speaker wants to do" and a category that is being discussed as a conversation topic can be estimated for each utterance. For example, with respect to a text reading "How about Italian food in Kita-Kamakura?" designated by utterance ID2, a category of "drinking and eating", a location "Kita-Kamakura" as the conversation topic, and an utterance intention of "proposal" can be estimated.
Utterance n(S)=(Cn, Pn, In)
In this case, n denotes an utterance ID (1 through k) which is assumed to be assigned in an order of occurrence of utterances. S denotes a speaker (A, B, C, . . . ), and Cn, Pn, and In respectively denote an estimated category of the utterance, an estimated location being brought up as a conversation topic, and an estimated utterance intention.
For example, when a collation of an utterance 1 by a speaker A with the vocabulary/intention understanding corpus/dictionary 206 results in matches of "C1: drinking and eating", "P1: Kamakura", and "I1: starting a conversation", the utterance 1 is expressed as follows.
Utterance 1(A) = (drinking and eating, Kamakura, starting a conversation)
Moreover, with respect to each utterance, information such as a category being brought up as a conversation topic, a location as a conversation topic, and an utterance intention is favorably obtained by also taking information other than the contents (the text) of the utterance into consideration. In particular, the utterance intention is favorably obtained by also taking the emotion of the speaker obtained from utterance feature values into consideration. Even when the utterance contents indicate an agreement to a proposal, utterance feature values enable a distinction to be made between a joyful consent and a grudging consent. Furthermore, depending on the utterance, such information may not be extractable from the utterance contents (the text) alone. In such a case, the conversation situation analyzing unit 204 may estimate the utterance intention by considering extraction results of intentions and utterance contents (texts) previously and subsequently occurring along a time series.
In step S406, the conversation situation analyzing unit 204 extracts utterances estimated to be on a same theme, in consideration of the category of each utterance and the time-sequential order of utterances obtained in step S405, and specifies the extracted group of utterances as a series of utterances included in one conversation. Through this process, the utterances included in one conversation, from its start to its end, can be specified.
In identity determination of a conversation theme, similarities of categories and locations as conversation topics of utterances are taken into consideration. For example, with respect to utterance ID5, while a category thereof is determined as "drinking and eating" from an extracted word "fish" and a location as the conversation topic is determined as "sea" from an extracted word "sea", since both are concerned with the category "drinking and eating", the utterance can be determined to have a same conversation theme. In addition, utterances may sometimes include a word ("let's decide") that enables a determination of "starting a conversation" to be made as in the case of utterance ID1 or a word ("that settles it") that enables a determination of "consolidating" to be made as in the case of utterance ID9, and each of these utterances can be estimated to be an utterance made at the start or the end of a conversation on a same theme. Furthermore, in consideration of the temporal relationship among utterances, different conversation themes may be determined when a time interval between utterances is too long, even when the category or the location as the conversation topic of the utterances is the same. Moreover, there may be utterances that do not include words from which an intention or a category can be extracted. In such a case, in consideration of the time-sequential flow of utterances, utterances by a same speaker occurring between the start and the end of a same conversation may be assumed to be included in that conversation.
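The grouping rules just described can be pictured with the following sketch. The 120-second gap threshold is an assumed value; the category and intention labels follow the examples in the text.

```python
def group_into_conversations(utterances, max_gap_s=120.0):
    """Group time-ordered utterances into conversations. An utterance joins
    the current conversation when its category matches (or is unknown) and
    the pause before it is not too long; a "starting a conversation"
    intention forces a new conversation and "consolidating" closes one."""
    conversations, current = [], []
    for u in utterances:  # u: dict with keys t (s), speaker, category, intention
        if current:
            same_theme = (u["category"] is None or
                          current[-1]["category"] is None or
                          u["category"] == current[-1]["category"])
            recent = u["t"] - current[-1]["t"] <= max_gap_s
            if (u["intention"] == "starting a conversation" or
                    not (same_theme and recent)):
                conversations.append(current)
                current = []
        current.append(u)
        if u["intention"] == "consolidating":
            conversations.append(current)
            current = []
    if current:
        conversations.append(current)
    return conversations

utts = [
    {"t": 0, "speaker": "A", "category": "drinking and eating",
     "intention": "starting a conversation"},
    {"t": 4, "speaker": "B", "category": "drinking and eating",
     "intention": "proposal"},
    {"t": 9, "speaker": "C", "category": None, "intention": "agreement"},
    {"t": 15, "speaker": "A", "category": "drinking and eating",
     "intention": "consolidating"},
]
print(len(group_into_conversations(utts)))  # -> 1
```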
In the present embodiment, for example, a series of “Conversation m” specified in this manner is expressed by the following equation.
Conversation m (SA, SB, SC . . . )={utterance 1 (SA), utterance 2 (SB), utterance 3 (SC) . . . }=Tm {(CA, PA, IA), (CB, PB, IB), (CC, PC, IC) . . . }
In this case, m denotes a conversation ID (1 through k) which is assumed to be assigned in an order of occurrence of conversations. SA, SB, SC, . . . denote the speakers (A, B, C, . . . ), and Tm, Cn, Pn, and In respectively denote an estimated conversation theme, an estimated category of an utterance, an estimated location being brought up as a conversation topic, and an estimated utterance intention.
For example, when a group of utterances regarding a theme “drinking and eating” by the speakers A, B, and C is specified as belonging to conversation 1, conversation 1 is expressed as follows.
Conversation 1 (A, B, C)=T"drinking and eating" {("drinking and eating (lunch)", "Kamakura", "starting a conversation"), ("drinking and eating (cuisine)", "Kamakura", "proposal"), ("drinking and eating (cuisine)", "na", "negation/proposal") . . . }
In step S407, the conversation situation analyzing unit 204 generates and outputs conversation situational data that integrates the analysis results described above. For example, the conversation situational data includes information such as a speaker of each utterance, a correspondence relationship between utterances, utterance intentions, conversation topics, utterance feature values, and emotions of the speakers.
The conversation situation analyzing unit 204 outputs conversation situational data such as that described above to the group status determining unit 207. Using conversation situational data enables a flow of a conversation to be linked with changes in feature values of each utterance and enables a status of a group engaged in a conversation to be optimally estimated.
<Group Status Determining Process>
Next, details of the group status determining process in step S304 will be described.
In step S1001, the group status determining unit 207 acquires conversation situational data output by the conversation situation analyzing unit 204. By performing the following processes based on the conversation situational data, the group status determining unit 207 analyzes a group status including a group type, a role of each member (relationship), and a status change of the group.
In step S1002, the group status determining unit 207 determines connections among speakers in a conversation. Conversation situational data includes a speaker of each utterance, a connection among utterances, and intentions (proposal, agreement, disagreement, and the like) of the utterances. Therefore, based on conversation situational data, a frequency of conversation between a pair of speakers (for example, “speaker A and speaker B are frequently engaged in direct conversation” or “there is no direct communication between speaker A and speaker B”) and how often utterances of proposals, agreements, and disagreements are made between a pair of speakers (for example, “speaker A has voiced X number of proposals, Y number of agreements, and Z number of disagreements with respect to speaker B”) can be comprehended. The group status determining unit 207 obtains the information described above for each pair of speakers in the group.
In step S1003, the group status determining unit 207 determines an opinion exchange situation among the members. An opinion exchange situation includes information such as liveliness of exchange of opinions in the group, a ratio of agreements against disagreements with respect to a proposal, and presence or absence of an influencer in decision making. The liveliness of exchange of opinions can be assessed based on, for example, the number of utterances or the number of agreements or disagreements between when a proposal is made and when a final decision is made. In addition, the presence or absence of an influencer in decision making can be assessed based on, for example, whether or not there is only a small number of disagreements with respect to a proposal made by a specific speaker and only consent or agreements occur or whether or not a proposal or an opinion of a specific speaker is adopted at a high rate as a final opinion. Since conversation situational data includes a speaker of each utterance, a connection among utterances, utterance intentions, contents of the utterances, and the like, the group status determining unit 207 can determine the opinion exchange situation described above based on the conversation situational data.
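Steps S1002 and S1003 can be illustrated together with a small counting sketch. It assumes that each utterance record carries the speaker, the utterance it replies to (if any), and the estimated intention; the liveliness and influencer heuristics below are simplified stand-ins for the assessments described above.

```python
from collections import Counter

def opinion_exchange_situation(utterances):
    """Count proposals/agreements/disagreements per ordered speaker pair
    and derive simple opinion-exchange indicators for the group."""
    pair_counts = Counter()    # (speaker, addressee, intention) -> count
    totals = Counter()         # intention -> count
    proposals_by = Counter()   # speaker -> number of proposals made
    for u in utterances:       # u: dict with speaker, reply_to, intention
        totals[u["intention"]] += 1
        if u.get("reply_to"):
            pair_counts[(u["speaker"], u["reply_to"], u["intention"])] += 1
        if u["intention"] == "proposal":
            proposals_by[u["speaker"]] += 1

    agree, disagree = totals["agreement"], totals["disagreement"]
    top = proposals_by.most_common(1)
    return {
        "pair_counts": dict(pair_counts),
        # liveliness: reactions drawn per proposal, on average
        "liveliness": (agree + disagree) / max(totals["proposal"], 1),
        "agreement_ratio": agree / max(agree + disagree, 1),
        # possible influencer: the main proposer, when proposals meet
        # almost no disagreement (simplified here to none at all)
        "influencer": (top[0][0]
                       if top and top[0][1] >= 2 and disagree == 0
                       else None),
    }

utts = [
    {"speaker": "A", "reply_to": None, "intention": "proposal"},
    {"speaker": "B", "reply_to": "A", "intention": "agreement"},
    {"speaker": "A", "reply_to": None, "intention": "proposal"},
    {"speaker": "C", "reply_to": "A", "intention": "agreement"},
]
print(opinion_exchange_situation(utts)["influencer"])  # -> A
```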
In step S1004, the group status determining unit 207 estimates a group type (a group model) based on utterance feature values and wording of the utterance contents included in the conversation situational data, the connection among speakers obtained in step S1002, and the opinion exchange situation among speakers obtained in step S1003. Group types are defined in advance; for example, a group type A (a flat relationship and high intimacy), a group type B (a hierarchical relationship but high intimacy), and a group type C (a hierarchical relationship and low intimacy) may be defined.
Determination criteria for each group type are stored in the group model definition storage unit 208. The group model definition storage unit 208 stores a plurality of determination criteria based on utterance feature values, wording of utterance contents, a connection among speakers, opinion exchange information, and the like.
Although the group status determining unit 207 may determine a group type using only the assessment value obtained above or, in other words, may determine a group type based solely on utterance feature values, the group status determining unit 207 determines a group type by also taking other elements into consideration in order to further improve determination accuracy.
For example, the group status determining unit 207 analyzes utterance contents (texts) in a conversation to acquire a frequency of appearance of commanding language, honorifics, polite language, deferential language, informal language (language used in intimate relationships), language used by children, language used for children, and the like in utterances of each speaker. Accordingly, the wording of each speaker in the conversation can be revealed. The group status determining unit 207 estimates the group type by also taking wording into consideration. For example, when “there is a person using commanding language and a person responding thereto in honorifics, polite language, or deferential language in the group”, a determination can be made that the group type is likely to be group type C. In addition, when “a group includes a person using commanding language but also a person responding in informal language thereto”, a determination can be made that the group type is likely to be group type A. Furthermore, when “most speakers in a group use a lot of informal language”, a determination can be made that the group type is likely to be group type A or B. Moreover, when “a group includes a person using wording that is typically used by a parent (adult) to address a child and a person using wording that is typically used by a child”, a determination can be made that the group type is likely to be group type B. The cases described above are merely examples, and as long as correlations between group types and wording are defined in advance, the group status determining unit 207 can determine which group type the current group is most likely to correspond to.
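As a sketch of such wording-based determination, the scoring below maps the example correlations in this paragraph onto group types A, B, and C. The per-speaker wording class counts are assumed to come from the text analysis, and the score weights are arbitrary illustrative choices.

```python
def estimate_group_type(wording_counts):
    """Score group types A/B/C from per-speaker wording class counts.
    wording_counts: {speaker: {"commanding": n, "honorific": n,
                               "informal": n, "child": n, "to_child": n}}
    Reply structure is ignored here as a simplification."""
    speakers = list(wording_counts.values())
    commanding = any(w["commanding"] > 0 for w in speakers)
    honorifics_used = any(w["honorific"] > 0 for w in speakers)
    informal_used = any(w["informal"] > 0 for w in speakers)
    mostly_informal = (sum(w["informal"] for w in speakers) >
                       sum(w["honorific"] for w in speakers))
    parent_child = (any(w["child"] > 0 for w in speakers) and
                    any(w["to_child"] > 0 for w in speakers))

    scores = {"A": 0, "B": 0, "C": 0}
    if commanding and honorifics_used:
        scores["C"] += 2   # hierarchical relationship, low intimacy
    if commanding and informal_used:
        scores["A"] += 2   # frank responses despite commanding language
    if mostly_informal:
        scores["A"] += 1
        scores["B"] += 1
    if parent_child:
        scores["B"] += 2   # hierarchical but high intimacy (e.g. family)
    return max(scores, key=scores.get)

counts = {
    "A": {"commanding": 3, "honorific": 0, "informal": 0,
          "child": 0, "to_child": 0},
    "B": {"commanding": 0, "honorific": 5, "informal": 0,
          "child": 0, "to_child": 0},
}
print(estimate_group_type(counts))  # -> C
```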
In addition, the group status determining unit 207 can also determine a group type based on an opinion exchange situation in a conversation. For example, when a lively exchange of opinions is taking place in a group or when a relatively large number of rejections or disagreements are being made with respect to a proposal, a determination can be made that the group type is likely to be group type A or B. In addition, when the exchange of opinions in a group is not lively or when an influencer is present in the group, a determination can be made that the group type is likely to be group type C. The cases described above are merely examples, and as long as correlations between group types and opinion exchange situations are defined in advance, the group status determining unit 207 can determine which group type the current group is most likely to correspond to.
The group status determining unit 207 integrates group types estimated based on utterance feature values, wording, opinion exchange situations, and a connection among speakers as described above and determines a group type which best matches the current group as a group type of the current group.
In step S1005, the group status determining unit 207 estimates a role of each member in the group using the results of analyses performed in steps S1002 and S1003 and other conversation situational data. Examples of roles in a group include an influencer in decision making and a follower with respect to the influencer. In addition, a superior, a subordinate, a parent, a child, and the like may also be estimated as roles. When estimating a role of a member, favorably, the group type determined in step S1004 is also taken into consideration.
In step S1006, the group status determining unit 207 estimates a status change of a group. A group status includes utterance frequencies, participants in a conversation, specification of an influencer of the conversation, and the like. Examples of the status change estimated in step S1006 include a decline in utterance frequency of a specific speaker, a decline in overall utterance frequency, separation of a conversation group, and a change of influencers.
In step S1007, the group status determining unit 207 consolidates the group type estimated in step S1004, the roles of the respective members estimated in step S1005, and the status change of the group estimated in step S1006 to create group status data, and outputs the group status data to the intervening/arbitrating unit 209. By referring to the group status data, the intervening/arbitrating unit 209 can comprehend what kind of status a group currently engaged in a conversation is in and can perform an appropriate intervention in accordance with the status.
<Intervening/Arbitrating Process>
Next, details of the intervention content determining process in step S305 will be described.
In step S1201, the intervening/arbitrating unit 209 acquires the conversation situational data output by the conversation situation analyzing unit 204 and the group status data output by the group status determining unit 207. By performing the following processes based on these pieces of data, the intervening/arbitrating unit 209 determines contents of information to be presented when performing an intervention or arbitration.
In step S1202, the intervening/arbitrating unit 209 acquires an intervention policy in accordance with the group type or the group status change included in the group status data from the intervention policy definition storage unit 210. An intervention policy refers to information indicating which member in the group is to be preferentially supported, and in what way, in accordance with the group status. Such intervention policies are defined in advance in the intervention policy definition storage unit 210.
The intervention policies described above may be considered information defining a priority of an intervention and what kind of intervention is to be performed with respect to each member in a group in accordance with a group type and a status change of the group. Instead of being set with respect to individual members, a priority of intervention is set with respect to a member performing a role (such as an influencer) in a group or a member satisfying specific conditions (such as a decline in utterance frequency). However, all intervention policies need not necessarily include an intervention priority.
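A minimal sketch of the policy lookup in step S1202 follows. The policy table entries are hypothetical examples patterned on the policies mentioned in this description: presenting information equally for group type A, supporting an influencer for group type C, and supporting a member whose utterance frequency has declined.

```python
# Hypothetical intervention policy definitions: keyed either by group type
# or by a status change; each names who to support first and how.
INTERVENTION_POLICIES = {
    ("type", "A"): {"support_first": "everyone",
                    "method": "present detailed reference information "
                              "more or less equally to all members"},
    ("type", "C"): {"support_first": "influencer",
                    "method": "provide information accommodating the "
                              "preferences of the other members"},
    ("change", "utterance_frequency_declined"): {
        "support_first": "member whose utterance frequency declined",
        "method": "provide topics preferred by that member"},
}

def select_policy(group_status):
    """Pick a policy; a detected status change takes precedence over the
    policy associated with the group type."""
    for change in group_status.get("status_changes", []):
        policy = INTERVENTION_POLICIES.get(("change", change))
        if policy:
            return policy
    return INTERVENTION_POLICIES.get(("type", group_status["group_type"]))

status = {"group_type": "C", "status_changes": []}
print(select_policy(status))  # -> the influencer-first policy for type C
```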
In step S1203, the intervening/arbitrating unit 209 determines an intervention object member and an intervention method based on the intervention policy acquired in step S1202. For example, the intervening/arbitrating unit 209 makes a determination to provide an influencer with information accommodating preferences of other members or to provide information related to a conversation topic that is preferred by a speaker whose utterances have stagnated. Moreover, a determination to not perform an intervention at this time may be made in step S1203. The determination in step S1203 need not necessarily be made solely based on an intervention policy and is also favorably made based on other information such as conversation situational data. For example, when it is determined based on the utterance intentions included in conversation situational data that an exchange of opinions for decision making is being performed in a group, an intervention object and an intervention method may be determined based on an intervention policy for supporting decision making.
In step S1204, the intervening/arbitrating unit 209 generates or acquires information to be presented in accordance with the intervention object member and the intervention method. For example, when providing an influencer with information accommodating preferences of other members, first, the preferences of the other members are determined based on previously-discussed conversation themes and emotions (levels of excitement or the like) of the members or acquired from the database 123 of user information and usage history. For example, in a case where a conversation about a place for lunch is being carried out and a member prefers Italian cuisine, information regarding Italian restaurants is acquired from the related information website 130 or the like. In doing so, favorably, the restaurants to be presented are narrowed down by also taking into consideration positional information acquired from the GPS device 112 of the vehicle 110.
In step S1205, the intervening/arbitrating unit 209 generates intervention instruction data including the information to be presented generated or acquired in step S1204 and outputs the intervention instruction data. In the present embodiment, the intervention instruction data is transmitted from the server device 120 to the navigation device 111 of the vehicle 110. Based on the intervention instruction data, the output control unit 212 of the navigation device 111 generates synthesized speech or a text to be displayed and presents the information through the speaker 213 or the display 214 (S306).
The series of conversation intervention supporting processes is performed as described above.
In the present embodiment, the conversation situation analyzing unit 204 is capable of specifying a group of utterances including a same conversation theme in a conversation held by a plurality of speakers and further comprehending whether or not a relationship exists between respective utterances and, if so, what kind of relationship. Furthermore, a situation of the conversation can be estimated based on intervals and degrees of overlapping of utterances among the speakers with respect to a same conversation. With the conversation situation analysis method according to the present embodiment, even when a large number of speakers are split into different groups and are simultaneously engaged in conversations, a situation of each conversation can be comprehended.
In addition, in the present embodiment, the group status determining unit 207 is capable of comprehending a type or a status change of a group engaged in a conversation, as well as a role of each speaker and a relationship among the respective speakers in the group, based on conversation situational data and the like. The ability to comprehend such information enables a determination to be made as to which speaker is to be preferentially supported when the system intervenes in a conversation and enables an appropriate intervention to be performed in accordance with the status of the group.
<Modifications>
While an example of a conversation intervention support system being configured as a telematics service in which a vehicle and a server device cooperate with each other has been described above, a specific mode of the system is not limited thereto. For example, the system can be configured so as to acquire a conversation taking place indoors such as in a conference room and to intervene in the conversation.
The present invention can be implemented by a combination of software and hardware. For example, the present invention can be implemented as an information processing device (a computer) including a processor such as a central processing unit (CPU) or a micro processing unit (MPU) and a non-transitory memory that stores a computer program, in which case the functions described above are provided as the processor executes the computer program. Alternatively, the present invention can be implemented with a logic circuit such as an application specific integrated circuit (ASIC) or a field programmable gate array (FPGA). Further alternatively, the present invention can be implemented using both a combination of software and hardware and a logic circuit. In the present disclosure, a processor configured so as to realize a specific function and a processor configured so as to function as a specific module refer to both a CPU or an MPU which executes a program for providing the specific function or a function of the specific module and an ASIC or an FPGA which provides the specific function or a function of the specific module.