The present invention relates to a method and a system for providing users access to information of interest.
The number of networked devices in local area networks such as home networks is on the rise, and so is the amount of data stored on them. Typically, home network users store and access several types of content (such as audio, video, image and other data files) in different formats on/via their home devices. In addition to accessing these, home users also commonly access audio/video broadcast data streams via broadcast television or cable networks.
Further, the amount of information available on sources such as external networks, the Internet (e.g., the World Wide Web), etc. is continually on the rise. For example, it is very likely that a user can find useful information on the Internet related to at least some of the data stored on the devices in the user's home network. It is highly likely that the user can find large quantities of such related information in different formats (structured, semi-structured and unstructured) via multiple sources.
However, there is no system available that would allow a user to access such related information easily and seamlessly. The only way a user can achieve this is by manually performing a search for the desired information using an Internet search engine or by directly accessing a website (through a Web browser) that the user believes may contain such related information. Thus, the user is forced to comprehend and analyze large quantities of information to identify/access the exact information the user is looking for.
There are existing approaches in which a user can obtain information in a network of resources. In one approach, the user requests the information. The user specifies information using keywords and then browses the information to find the piece of information that satisfies the user's needs. However, specifying keywords using devices without keyboards, such as consumer electronics (CE) devices, can be a tedious task.
Another approach involves a configuration that uses a TV and a PC. The PC analyzes the subtitles of the TV program and categorizes the program as general, news, medical, etc. The hierarchy of categories is fixed and built from questions posed to broadcast TV viewers. Content of a particular program is mapped to a fixed number of categories. The user can view additional information only when the content matches one of the specified categories. Queries are linked to fixed sources, limiting the amount of information that can be retrieved for the user. Further, the PC is required and the system cannot function when the PC is turned off. There is, therefore, a need for a method and a system for analyzing and obtaining information of interest to the user, without limiting specific sources of information.
The present invention provides a method and system for providing access to information of potential interest to a user. In one embodiment, this involves analyzing closed-caption information and obtaining information of interest to a user, without limiting specific sources of information. Such an approach is useful in providing access to information of potential interest to a user of an electronic device, by monitoring the user interaction with the device to identify information accessed by the user, determining key information based on the identified information, wherein the identified information includes closed-caption information, and searching available sources for information of potential interest to the user based on said key information. Searching available sources includes forming a query based on the key information, and searching an external network such as the Internet using the query.
One example of such an electronic device is a CE device such as a TV that receives TV programming including closed-caption information. The closed-caption information of a TV program being accessed/viewed by a user is analyzed and key information extracted. This involves converting the closed-caption information to text, removing stop words, and ranking the remaining words based on their frequency of occurrence, proper noun information, and/or other criteria. The ranked words represent key information such as keywords/phrases that are used to form queries and conduct searches using search engines such as available Internet search engines. The search results are presented to the user as recommendations, representing information of potential interest to the user. The user can select among the recommendations for further searching to find additional and/or more refined information of interest to the user.
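The keyword-ranking steps described above can be illustrated with a minimal Python sketch. It assumes the closed-caption information has already been converted to mixed-case text. The stop-word list, the `rank_keywords` name, the bonus weight, and the capitalization heuristic for proper nouns are all hypothetical choices made for illustration and are not taken from this description.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to", "was", "with"}


def rank_keywords(cc_text, top_n=3, proper_noun_bonus=1):
    """Rank non-stop words by frequency, with a bonus for likely proper nouns."""
    scores = Counter()
    for sentence in re.split(r"[.!?]+", cc_text):
        words = re.findall(r"[A-Za-z']+", sentence)
        for position, word in enumerate(words):
            key = word.lower()
            if key in STOP_WORDS:
                continue
            scores[key] += 1
            # Crude heuristic (assumption): a capitalized word that is not
            # sentence-initial is treated as a proper noun.
            if position > 0 and word[0].isupper():
                scores[key] += proper_noun_bonus
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]
```

Closed captions are often transmitted in all capitals, in which case the capitalization heuristic would not apply and a proper-noun list or dictionary lookup would be needed instead.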
These and other features, aspects and advantages of the present invention will become understood with reference to the following description, appended claims and accompanying figures.
The present invention provides a method and a system for analyzing and obtaining information of interest to a user, without limiting specific sources of information. Potential information that the user may be interested in is determined by monitoring the user's interactions with a device in a local network of devices, connected to an external network. Such a device can be a CE device in a local area network (e.g., a home network) that is connected to the Internet.
In one implementation, this involves receiving closed-captioned programming including closed-caption (CC) information, and analyzing the closed-caption information for key information indicating user interests. The key information is then used to find related information from sources of information such as the Internet, which the user may potentially be interested in.
On a typical CE device such as a TV, in the absence of a keyboard, it is difficult for a user to search for information on the Internet by entering keywords. If a user is watching a TV program, that is a good indication that the user is interested in the content of the TV program. Therefore, the content of the TV program is analyzed by gathering and analyzing text received as CC information for the TV program. Further, contextual information is gathered from the information about the channel being watched. The CC information and the contextual information can be combined and used to make recommendations to the user about information the user may potentially be interested in.
The gathered information is used to determine one or more keywords of potential interest to the user. The keywords are then used to search for related information on the Internet. For example, if the user is watching news coverage involving Baltimore, the word “Baltimore” is extracted as a keyword. That keyword is used to form a query to search the Internet by using a search engine to find information, such as websites that include information about Baltimore city or Baltimore Ravens, etc.
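As a rough illustration of forming a query from extracted keywords, the following Python sketch builds a search URL. The endpoint `https://search.example.com/search`, the `q` parameter name, and the `build_query_url` function are placeholders assumed for this example; an actual search engine would define its own request format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the request format of the search engine in use.
SEARCH_ENDPOINT = "https://search.example.com/search"


def build_query_url(keywords, endpoint=SEARCH_ENDPOINT):
    """Join the keywords into one query string and URL-encode it."""
    query = " ".join(keywords)
    return endpoint + "?" + urlencode({"q": query})
```

For instance, `build_query_url(["Baltimore"])` yields `https://search.example.com/search?q=Baltimore`.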
The search results are presented to the user as recommendations, comprising potential search queries which may be selected by the user and executed to find further information on the Internet that may be of interest to the user. For example, while the user is watching a documentary on Antarctica on a TV, the word “Antarctica” is selected as a keyword and a search on the Internet returns “polar bears” as a recommendation of potential interest to the user. The user can then choose that recommendation to find more information about polar bears. If so, a query for “polar bears” is sent to a search engine and the results are displayed for the user.
Searching is not limited to a predetermined or fixed number of categories or queries or information sources. In one example, keywords are identified based on the CC information for searching. The keywords may be suggested to the user, wherein upon user selection, additional information is obtained using search engines that search available sources on the Internet (different websites available to the search engines), rather than a predetermined and/or a fixed number of sources such as one or more particular websites.
The devices 20 and 30, respectively, can implement the UPnP protocol for communication therebetween. Those skilled in the art will recognize that the present invention is useful with other network communication protocols (e.g., Jini, HAVi, IEEE 1394, etc.). Further, the network 10 can be a wired network, a wireless network, or a combination thereof.
A system that implements a process for analyzing TV CC information receives a TV signal as input. The channel being viewed by the user is monitored and corresponding CC information that is a part of the TV signal is analyzed. Then, a set of keywords is determined which captures the gist of what is being viewed by the user.
The monitor 201 monitors the TV/cable signal and determines channel information that is accessed/viewed by the user. That information includes CC information which is analyzed to extract words that capture the context, by utilizing the example process 300 in
Steps 316 and 318 allow the user to find more information about a program that the user recently viewed on the TV, and can be repeated as the user desires to provide the user with additional and/or further refined information of interest to the user.
Electronic Program Guide (EPG) information, which includes information about TV programs on cable TV, satellite TV, etc., provides the name of the channel being viewed, which is used along with the channel and program information to frame the queries in steps 316 and 318. For example, when the user is viewing the “Panorama” program on BBC America, the words “Panorama” and “BBC America” are appended to the extracted keywords to provide related information in the context of the channel and program for searching.
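A minimal sketch of appending the program and channel names to the extracted keywords might look like the following. The `frame_query` function and its parameters are hypothetical, and quoting multi-word names so that a search engine treats them as phrases is an assumption of this sketch rather than something stated in this description.

```python
def frame_query(keywords, program=None, channel=None):
    """Append program and channel names (e.g., from EPG data) to the keywords."""
    terms = list(keywords)
    for context in (program, channel):
        # Skip missing context and avoid repeating a term already present.
        if context and context not in terms:
            terms.append(context)
    # Assumption: multi-word terms are wrapped in double quotes as phrases.
    return " ".join(f'"{term}"' if " " in term else term for term in terms)
```

With the example above, `frame_query(["elections"], program="Panorama", channel="BBC America")` returns `elections Panorama "BBC America"`.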
Further, the extracted keywords can be translated into different languages and used for searching to find additional information on the Internet 50. Translating keywords, as opposed to sentences, from one language to another is simple and can be done using a language-to-language dictionary. This is beneficial to users who may understand only a minor portion of the language in the TV program being watched.
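Dictionary-based keyword translation can be sketched in a few lines of Python. The two-entry English-to-Spanish dictionary and the `translate_keywords` function below are hypothetical stand-ins for a real language-to-language dictionary; keeping untranslatable keywords (such as proper nouns) unchanged is a design choice assumed here.

```python
# Hypothetical two-entry English-to-Spanish dictionary for illustration only.
EN_TO_ES = {"storm": "tormenta", "election": "elección"}


def translate_keywords(keywords, dictionary):
    """Translate each keyword by lookup, leaving unknown words unchanged."""
    return [dictionary.get(keyword.lower(), keyword) for keyword in keywords]
```

Because each keyword is looked up independently, no grammatical analysis of the source sentence is needed.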
In this embodiment, the Keyword Extractor 212 not only relies on information from the Proper Noun Detector 206 and the Indexer 208, but also uses information from the Phrase Extractor 214 to obtain keywords. The Phrase Extractor 214 includes a phrase identifier function that identifies important phrases using frequency and co-occurrence information recorded by the Indexer 208, along with a set of rules. This is important in identifying multi-word phrases such as “United Nations”, “Al Qaeda”, etc.
In operation, the gathered CC text is first passed through the phrase identifier to capture phrases, and then the captured phrases are indexed. The phrase identifier internally maintains three lists: a list of proper nouns, a dictionary, and a list of stop words. The phrase identifier uses an N-gram based approach to phrase extraction, in which conceptually, to capture a phrase of length ‘N’ words, a window of size ‘N’ words is slid across the text and all possible phrases (of length ‘N’ words) are collected. Then they are passed through the following set of three rules to filter out meaningless phrases:
The Phrase Extractor 214 includes a term extractor function which extracts the highest score terms and phrases from the index. The terms and phrases are presented to the user and can be used for further searching to provide additional information of interest to the user.
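The sliding-window collection and highest-score extraction described above can be sketched as follows. The three filtering rules are not reproduced in this sketch; `edge_filter` is a single stand-in rule assumed purely for illustration, and the function names, the stop-word list, and the use of raw frequency as the score are likewise hypothetical.

```python
from collections import Counter

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def collect_ngrams(words, n):
    """Slide a window of n words across the text and collect every n-gram."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def edge_filter(ngram):
    """Stand-in rule (assumption): reject n-grams that start or end with a stop word."""
    return ngram[0].lower() not in STOP_WORDS and ngram[-1].lower() not in STOP_WORDS


def top_phrases(words, n=2, top_k=1, keep=edge_filter):
    """Count the n-grams that pass the filter and return the most frequent ones."""
    counts = Counter(g for g in collect_ngrams(words, n) if keep(g))
    # Sort by descending count, then lexicographically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [" ".join(ngram) for ngram, _ in ranked[:top_k]]
```

Passing a different `keep` predicate allows other filtering rules to be substituted without changing the window logic.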
Alternatively, the Phrase Extractor 214 includes a natural language processing (NLP) tagger and a set of extraction rules to extract important phrases. In operation, the NLP tagger tags each word in the closed-caption text with its part of speech (i.e., whether the word is a ‘noun’, ‘adjective’, ‘proper noun’, etc.). The extraction rules define the kinds of sequences of such tags that are important. For example, one rule can be to extract phrases which are “a sequence of more than one ‘proper noun’” and another rule can be to extract “a sequence of one or more ‘adjectives’ followed by one or more ‘nouns’.” The Phrase Extractor applies these rules to the text tagged by the part-of-speech tagger and extracts phrases that follow these sequences. It can also be used to extract single-word keywords by using appropriate rules.
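The two example rules above can be sketched in Python by encoding each tag as a single character and matching tag sequences with regular expressions. The sketch assumes the text has already been tagged; the tag names (`PROPN`, `ADJ`, `NOUN`), the `extract_phrases` function, and the encoding scheme are hypothetical and do not correspond to any particular tagger.

```python
import re

# Hypothetical tag names mapped to one-character codes; other tags become "x".
TAG_CODES = {"PROPN": "P", "ADJ": "A", "NOUN": "N"}

RULES = [
    re.compile(r"P{2,}"),  # a sequence of more than one proper noun
    re.compile(r"A+N+"),   # one or more adjectives followed by one or more nouns
]


def extract_phrases(tagged_words):
    """Return the phrases whose tag sequences match any extraction rule."""
    # One character per word, so match offsets map directly to word indices.
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged_words)
    phrases = []
    for rule in RULES:
        for match in rule.finditer(codes):
            words = [word for word, _ in tagged_words[match.start():match.end()]]
            phrases.append(" ".join(words))
    return phrases
```

Adding a rule such as `re.compile(r"P")` would extend the same mechanism to single-word keywords.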
In one example, in
Although a TV is used to receive closed-caption information in the examples provided herein, the present invention can be applied to other devices (e.g., a music player) that receive information that can be analyzed to determine and search for information of interest to the user.
Further, although in
As is known to those skilled in the art, the example architectures described above, according to the present invention, can be implemented in many ways, such as program instructions for execution by a processor, as logic circuits, as an application specific integrated circuit, as firmware, etc. The present invention has been described in considerable detail with reference to certain preferred versions thereof; however, other versions are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the preferred versions contained herein.
Number | Date | Country |
---|---|---|
20080266449 A1 | Oct 2008 | US |