The present invention relates to analyzing, searching and retrieving content, and more particularly, to a method and system for obtaining and presenting content that is relevant to an ongoing conversation.
Professionals in search of new and creative ideas have always sought inspiring environments in which to brainstorm, make new associations, and think in different ways in order to develop new insights and ideas. Even during leisure activities, people seek to interact socially and philosophize with each other in a stimulating environment. In all of these situations, it is helpful to have a creative inspirator who is involved in the conversation, who has a deep knowledge of the subject matter, and who can inject novel associations that lead to new avenues of discussion. In today's networked world, it would be equally valuable to have an intelligent network play the role of a creative inspirator.
To accomplish this, the intelligent system would need to monitor the conversation and understand what topic(s) were being discussed without requiring explicit input from the participants. Based on the conversation, the system would search for and retrieve content and information, including related words and topics, that could suggest new avenues of discussion. Such a system would be suitable for use in various environments, including living rooms, trains, libraries, meeting rooms, and waiting rooms.
A method and system are disclosed for determining the topic of a conversation and obtaining and presenting content that is related to the conversation. The disclosed system provides a “creative inspirator” in an ongoing conversation. The system extracts keywords from the conversation and utilizes the keywords to determine the topic(s) being discussed. The disclosed system then conducts searches within an intelligent, networked environment to obtain content based on the topic(s) of the conversation. The content can be presented to the participants in the conversation to supplement their discussion.
A method is also disclosed for determining the topic of a text document including transcripts of audio tracks, newspaper articles, and journal papers. The topic determination method uses hypernym trees of keywords and wordstems extracted from the text to identify parents in the hypernym trees that are common to two or more of the extracted words. Hyponym trees of selected common parents are then used to determine the common parents with the highest coverage of keywords. These common parents are then selected to represent the topic of the text document.
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
According to a further aspect of the invention, the expert system 200 can identify supplemental information that may be presented to one or more of the participants 105, 110 to provide additional information, inspire the participants 105, 110 or encourage a new avenue of discussion. The expert system 200 can search for supplemental content, for example, that is stored on a networked environment (such as the Internet) 160 or in a local database 155 utilizing the identified conversation topic(s). The supplemental content is then presented to the participants 105, 110 to supplement their discussion. In the exemplary implementation, the expert system 200 presents the content in the form of audio information, including speech, sounds, and music, since the conversation exists only in a verbal form. The content can also be presented to a user, for example, in the form of text, video or images, using a display device, as would be apparent to a person of ordinary skill in the art.
Memory 202 will configure the processor 201 to implement the methods, steps, and functions disclosed herein. The memory 202 could be distributed or local and the processor 201 could be distributed or singular. The memory 202 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. The term “memory” should be construed broadly enough to encompass any information able to be read from or written to an address in the addressable space accessed by processor 201.
As shown in
The speech recognition system 210 captures the conversation of one or more participants 105, 110 and converts the audio information to text in the form of a complete or partial transcript, in a known manner. If the participants 105, 110 in the conversation are located in the same geographic area and if the speech of the participants 105, 110 overlaps in time, then recognizing their speech may be difficult. In one implementation, beam-forming technology using microphone arrays (not shown) may be utilized to improve speech recognition by picking up a separate speech signal from each individual 105, 110. Alternatively, each participant 105, 110 could wear a lapel microphone to pick up that participant's speech. If the participants 105, 110 in the conversation are in separate areas, then recognizing their speech can be accomplished without the use of the microphone arrays or lapel microphones. The expert system 200 may utilize one or more speech recognition system(s) 210.
Keyword extractor 220 extracts keywords from the transcript of the audio track of each participant 105, 110, in a known manner. As each keyword is extracted, it may optionally be time-stamped with the time it was spoken. (Alternatively, the keyword may be time-stamped with the time it was recognized or the time it was extracted.) The timestamps may optionally be used to relate the content discovered to the portion of the conversation that contained the keyword.
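The time-stamping described above can be illustrated with a minimal sketch. The stop-word list, the `TimedKeyword` type, and the `extract_keywords` function are hypothetical names introduced only for illustration; the disclosure does not prescribe a particular extraction technique, and the sketch assumes the speech recognizer supplies each word together with the time at which it was spoken.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; a practical extractor would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "is", "we", "or", "should", "take"}


@dataclass(frozen=True)
class TimedKeyword:
    word: str
    spoken_at: float  # seconds from the start of the conversation


def extract_keywords(timed_words):
    """Keep non-stop-words from (word, time) pairs, preserving each timestamp."""
    keywords = []
    for word, spoken_at in timed_words:
        token = word.lower().strip(".,!?")
        if token and token not in STOP_WORDS:
            keywords.append(TimedKeyword(token, spoken_at))
    return keywords


# Assumed recognizer output: each word paired with the time it was spoken.
transcript = [("Should", 0.0), ("we", 0.3), ("take", 0.5), ("the", 0.8),
              ("train", 1.0), ("or", 1.4), ("the", 1.6), ("car?", 1.8)]
keywords = extract_keywords(transcript)
```

Because each keyword carries its own timestamp, content found later can be related back to the portion of the conversation in which the keyword occurred.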
As discussed further below in conjunction with
The content presentation system 250 presents the content in a variety of formats. In a telephone conversation, for example, the content presentation system 250 will present an audio track. In other embodiments, the content presentation system 250 may present other types of content including text, graphics, images, and videos. In this example, the content presentation system 250 utilizes a tone to signal the participants 105, 110 in the conversation that new content is available. The participants 105, 110 then signal the expert system 200 to present (play) the content by using an input mechanism, such as voice commands or dual tone multi-frequency (DTMF) tone(s) from the telephone.
For example, if the participants 105, 110 are discussing the weather, the system 200 may inspire the participants 105, 110 by presenting the weather forecast or historical weather information; if they are discussing plans for a vacation in Australia, the system 200 may present photographs and nature sounds of Australia; and if they are simply discussing what to have for dinner, the system 200 may present pictures of entrees along with their recipes.
If the wordstem test (step 422) determines that a wordstem was found for the selected keyword, then the wordstem is added to the list of wordstems (step 427) and a test is performed to determine if all the keywords have been read (step 428). If it is determined during step 428 that not all the keywords have been read, then step 410 is repeated; otherwise, the process continues with step 430.
During step 430, the hypernym trees for all senses (semantic meanings) of all words in the wordstem set are determined. A hypernym is the generic term used to designate a whole class of specific instances; that is, Y is a hypernym of X if X is a type of Y. For example, ‘car’ is a kind of ‘vehicle,’ so ‘vehicle’ is a hypernym of ‘car.’ A hypernym tree is a tree of all hypernyms of a word up to the highest level in the hierarchy, including the word itself.
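A hypernym tree of this kind can be sketched as follows. The `HYPERNYMS` table is a hypothetical toy taxonomy and `hypernym_paths` is an illustrative helper, neither of which is taken from the disclosure; an actual embodiment would presumably consult a lexical database. A word with several senses lists one direct hypernym per sense, so it yields one chain per sense.

```python
# Hypothetical toy taxonomy: each term maps to its direct hypernyms.
# "train" has two senses here (a conveyance and a sequence).
HYPERNYMS = {
    "car": ["vehicle"],
    "vehicle": ["conveyance"],
    "train": ["conveyance", "sequence"],
    "conveyance": ["artifact"],
    "sequence": ["entity"],
    "artifact": ["entity"],
    "entity": [],
}


def hypernym_paths(word, taxonomy):
    """Return every chain from `word` up to the top, the word itself included."""
    parents = taxonomy.get(word, [])
    if not parents:
        return [[word]]
    return [[word] + path
            for parent in parents
            for path in hypernym_paths(parent, taxonomy)]


car_paths = hypernym_paths("car", HYPERNYMS)
train_paths = hypernym_paths("train", HYPERNYMS)
```

In this toy taxonomy, `car_paths` holds a single chain while `train_paths` holds two, one for each sense of ‘train.’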
A comparison is then made between all pairs of hypernym trees to find a common parent at a specific level (or lower) in the hierarchy during step 440. A common parent is the first hypernym in a hypernym tree that is the same for two or more words in the keyword set. It is noted that a level-5 parent, for instance, is an entry in the hierarchy at the fifth level, four steps down from the highest level in the hierarchy, that is either a hypernym of a common parent or a common parent by itself. The level selected to be the specified level should have an appropriate level of abstraction such that the topic is not so specific that no relevant content can be found and not so abstract that the content discovered is not relevant to the conversation. In the present embodiment, level-5 is selected as the specified level in the hierarchy.
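The comparison of step 440 and the notion of a level-5 parent can be illustrated with a minimal sketch. The `PARENT` table is a hypothetical single-sense taxonomy whose depths were chosen so that ‘conveyance’ sits at level 5; the levels in a real lexical database will differ. The function names are likewise illustrative only.

```python
# Hypothetical single-sense taxonomy: term -> direct hypernym (None at the top).
# Levels: entity=1, object=2, artifact=3, instrumentality=4, conveyance=5.
PARENT = {
    "entity": None,
    "object": "entity",
    "artifact": "object",
    "instrumentality": "artifact",
    "conveyance": "instrumentality",
    "vehicle": "conveyance",
    "car": "vehicle",
    "train": "conveyance",
}


def chain_to_root(word):
    """Word-first list of the word and all of its hypernyms."""
    chain = []
    while word is not None:
        chain.append(word)
        word = PARENT[word]
    return chain


def common_parent(word_a, word_b):
    """First entry in word_a's chain that also appears in word_b's chain."""
    chain_b = set(chain_to_root(word_b))
    for term in chain_to_root(word_a):
        if term in chain_b:
            return term
    return None


def level_parent(term, level=5):
    """Ancestor of `term` at `level` (1 = top), or None if `term` lies above it."""
    root_first = chain_to_root(term)[::-1]
    return root_first[level - 1] if len(root_first) >= level else None


shared = common_parent("car", "train")
topic_candidate = level_parent(shared)
```

A common parent that lies above the specified level (for example ‘artifact’ at level 3 in this toy taxonomy) has no level-5 parent and is returned as `None`, which reflects the requirement that the common parent be at the specified level or lower.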
A search is then conducted to find the corresponding level-5 parent(s) for all common parent(s) (step 450). The hyponym trees are then determined for all the senses of the level-5 parents (step 460). A hyponym is the specific term used to designate a member of a class; that is, X is a hyponym of Y if X is a type of Y. For example, ‘car’ is a type of ‘vehicle,’ so ‘car’ is a hyponym of ‘vehicle.’ A hyponym tree is a tree of all hyponyms of a word down to the lowest level in the hierarchy, including the word itself. For each of the hyponym trees, the number of words that are common to the hyponym tree and the set of keywords is counted (step 470).
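Steps 460 and 470 can be sketched as a downward traversal followed by a set intersection. The `HYPONYMS` table and both function names are hypothetical and serve only to illustrate the counting; they are not part of the disclosure.

```python
# Hypothetical toy taxonomy: each term maps to its direct hyponyms.
HYPONYMS = {
    "conveyance": ["vehicle", "train"],
    "vehicle": ["car", "bicycle"],
}


def hyponym_tree(term, taxonomy):
    """Set containing `term` and every hyponym below it."""
    members = {term}
    for child in taxonomy.get(term, []):
        members |= hyponym_tree(child, taxonomy)
    return members


def count_covered(term, keywords, taxonomy):
    """Number of keywords that also appear in the hyponym tree of `term`."""
    return len(hyponym_tree(term, taxonomy) & set(keywords))


covered = count_covered("vehicle", ["car", "bicycle", "computer"], HYPONYMS)
```

Here the hyponym tree of ‘vehicle’ contains ‘car’ and ‘bicycle’ but not ‘computer,’ so two of the three keywords are covered.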
A list of the level-5 parents whose hyponym tree covers (contains) more than two words in the wordstem set is then compiled during step 480. Finally, the one or two level-5 parents that have the highest coverage (contain the most words from the wordstem set) are selected (step 490) to represent the topic(s) of the conversation. In one alternative embodiment of the topic finder process 400, if common parents exist for senses of keywords utilized to select previous topics, then steps 440 and/or steps 450 can ignore common parents of the senses of the keyword that were not utilized in selecting the topic based on a particular sense of the keyword. This will eliminate unnecessary processing and will result in more stable topic selection.
In a second alternative embodiment, steps 450 through 480 are skipped and step 490 selects the topic based on the common parents of previous topics and the common parents discovered in step 440. Similarly, in a third alternative embodiment, steps 450 through 480 are skipped and step 490 selects the topic based on previous topics and the common parents discovered in step 440. In a fourth alternative embodiment, steps 460 through 480 are skipped and step 490 selects topics based on all the specific-level parents determined in step 450.
For example, consider the sentence 510 in
In the present example, the number of words in the hyponym tree of (device) that are also in the wordstem set is determined to be two: ‘computer’ and ‘train.’ Similarly, the number of words in the hyponym tree of (conveyance, transport) that are also in the set is determined to be three: ‘train,’ ‘vehicle,’ and ‘car.’ The coverage of (device) is therefore ½; the coverage of (conveyance, transport) is ¾. At step 480, both level-5 parents would be reported and the topic would be set to (conveyance, transport) (step 490) since it has the highest associated word count.
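The arithmetic of this example can be reproduced with a short sketch. It assumes that the wordstem set consists of exactly the four words named above, which is inferred from the stated fractions rather than given explicitly, and the hyponym sets list only the members relevant to the example.

```python
from fractions import Fraction

# Assumed wordstem set, inferred from the coverages of 1/2 and 3/4.
wordstems = {"computer", "train", "vehicle", "car"}

# Assumed hyponym-tree members; only entries relevant to the example are listed.
hyponym_trees = {
    "device": {"device", "computer", "train"},
    "conveyance, transport": {"conveyance", "transport", "train", "vehicle", "car"},
}

# Coverage = covered wordstems / total wordstems, for each level-5 parent.
coverage = {
    parent: Fraction(len(tree & wordstems), len(wordstems))
    for parent, tree in hyponym_trees.items()
}

# Step 490: select the level-5 parent with the highest coverage.
topic = max(coverage, key=coverage.get)
```

Under these assumptions the sketch yields a coverage of 1/2 for (device) and 3/4 for (conveyance, transport), and selects (conveyance, transport) as the topic.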
The content finder 240 would then search for content in a local database 155 or in an intelligent, networked environment 160 based on this topic (conveyance, transport) of the conversation in a known manner. For example, an Internet search engine such as Google can be requested to perform a worldwide search utilizing the topic, or a combination of topic(s), discovered in the conversation. A list of the content found, and/or the content itself, is then sent to the content presentation system 250 for presentation to the participants 105, 110.
The content presentation system 250 presents the content to the participants 105, 110 in an active or passive manner. In the active mode, the content presentation system 250 interrupts the conversation to present the content. In the passive mode, the content presentation system 250 alerts the participants 105, 110 to the availability of content. The participants 105, 110 may then access the content in an on-demand manner. In the present example, the content presentation system 250 alerts the participants 105, 110 in the telephone conversation with an audio tone. The participants 105, 110 can then select which content is to be presented and specify the time at which it is to be presented utilizing DTMF signals generated by the telephone keypad. The content presentation system 250 would then play the selected audio track at the specified time.
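The distinction between the active and passive modes can be sketched as follows. The `Mode` enumeration, the `ContentPresenter` class, and the string log standing in for audio output are all hypothetical; the sketch only illustrates the control flow and makes no claim about how an embodiment would generate tones or decode DTMF signals.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"    # interrupt the conversation and present immediately
    PASSIVE = "passive"  # signal availability and wait for a request


class ContentPresenter:
    def __init__(self, mode):
        self.mode = mode
        self.pending = []  # content announced but not yet played
        self.log = []      # stand-in for what the participants would hear

    def offer(self, item):
        """Handle newly found content according to the presentation mode."""
        if self.mode is Mode.ACTIVE:
            self.log.append(f"play:{item}")
        else:
            self.pending.append(item)
            self.log.append("tone")

    def request(self, index):
        """Play a pending item on demand (e.g. after a key press)."""
        item = self.pending.pop(index)
        self.log.append(f"play:{item}")
        return item


presenter = ContentPresenter(Mode.PASSIVE)
presenter.offer("weather forecast")
played = presenter.request(0)
```

In the passive case the log records a tone followed by the requested item; an active presenter would record the item immediately with no tone.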
It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/IB2005/050191 | 1/17/2005 | WO | 00 | 7/20/2006

Number | Date | Country
---|---|---
60537808 | Jan 2004 | US