The present invention relates to processing queries in a network, and more particularly to a framework for determining and pre-processing potential user queries related to content stored in a network.
The number of networked devices in local networks is on the rise, and so is the amount of data stored on them. Examples of networked devices include consumer electronic (CE) devices in local area networks such as home networks. Typically, consumers store and access several types of content (e.g., audio, video, image, other data files, etc.) in different formats on/via their devices. In addition to accessing such content, consumers commonly access audio/video broadcast data streams via external sources such as live broadcast television, cable networks, the Internet, etc.
Likewise, the amount of information available via the Internet is continually on the rise. A significant amount of information on the Internet relates to at least some of the content stored on a home network.
Media players (e.g., Windows Media Player, RealPlayer, etc.) extract metadata information from the Internet for content that is currently or was previously played by a user (i.e., content of interest to the user). Typically, such media players maintain a standard set of metadata types that can be extracted and displayed to a user, and rely on specific websites to obtain the required metadata. However, the amount of information made available to the user is limited since such media players only communicate with fixed websites on the Internet. As a result, the user cannot access arbitrary information related to the content of interest. In other words, if information related to the content of interest is not among the standard metadata information available on the specific websites that the media player is pre-configured to access, then the user is not presented with such related information.
Desktop search applications such as Google Desktop Search and Copernic are extensions of Internet searches where users can search for content on their PCs. However, drawbacks of such search extensions include: (1) requiring users to form queries and to refine the queries in order to obtain desired results, (2) requiring computing resources that far exceed what CE devices provide for analyzing large volumes of search results, and (3) requiring input devices such as a keyboard to enter a significant amount of query text for searching.
Therefore, there is a need for a method and system that simplifies processing of potential user queries related to content stored in a network.
The present invention provides a framework that identifies data that a user would likely be interested in accessing, then extracts and stores such data for the user to efficiently use when desired. A method of searching for information related to content stored in a network includes determining one or more potential user queries for information related to the content stored in the network, and resolving the queries by searching available sources before an actual user request for information related to the content stored in the network.
Thereby, the framework allows the user to access several types of information efficiently, without the user having to explicitly request the information. As such, the framework does not restrict the user to choosing the type of information the user wishes to access from a limited list of metadata information.
In addition, in some embodiments, the framework enables users to use a CE device, such as a TV, for accessing information using a small number of keys without the need for a typical keyboard. Further, the framework allows the users to obtain information from an external network (e.g., the Internet) with minimum involvement in query construction. The framework suggests information based on query context to augment user experience of using CE devices with additional data. Accordingly, the power of the Internet is delivered to consumers that use CE devices in an efficient manner in terms of performance and ease of use.
These and other features, aspects and advantages of the present invention will become understood with reference to the following description, appended claims and accompanying figures.
The present invention provides a framework for identifying and pre-processing potential user queries based on the knowledge of the various types of content in a local network, such as a home network, enterprise network, etc. The content can comprise content of interest to the user. Once identified, the potential user queries are resolved even before a user expresses interest in them. The query results can be accessed locally by the user when desired.
In one implementation, query pre-processing according to the present invention includes forming a query based on metadata and framing it into a format that can be searched by an external source, such as a search engine on the Internet. For example, metadata can be <Artist: Sting>, based on which a query for a search engine can be framed as “artist sting.” Further, resolving the query involves a process by which results for the query are obtained from the source, before an actual user request for information related to the content stored in the network. In this example, this involves obtaining the search results for the query “artist sting.”
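The framing step in the <Artist: Sting> example can be sketched as follows. This is a minimal illustration only, assuming metadata arrives as a single "<Field: Value>" string; the function name frame_query and the lowercasing rule are assumptions made for the sketch and are not prescribed by the description.

```python
import re


def frame_query(metadata: str) -> str:
    """Turn a '<Field: Value>' metadata string into a keyword query.

    Hypothetical helper: the description only states that <Artist: Sting>
    can be framed as "artist sting"; the parsing rule here is an assumption.
    """
    match = re.fullmatch(r"\s*<\s*([^:<>]+?)\s*:\s*([^<>]+?)\s*>\s*", metadata)
    if match is None:
        raise ValueError(f"unrecognized metadata format: {metadata!r}")
    field, value = match.groups()
    # Join field and value into a plain keyword string a search engine accepts.
    return f"{field} {value}".lower()


print(frame_query("<Artist: Sting>"))  # artist sting
```

Resolving the query would then amount to submitting the framed string to the chosen source and keeping the results.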
Accordingly, the framework identifies, pre-processes and resolves potential user queries even before the user asks for them, such that it becomes possible to respond to many user requests (actual queries) very efficiently. Because desired results are pre-fetched and stored locally, the results can be quickly accessed when needed, compared to accessing the Internet to obtain them when the user asks for them.
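The benefit of pre-fetching can be illustrated with a small local store. The sketch below is an assumption-laden illustration, not the framework's actual storage design: PrefetchStore, fake_search and the remote_calls counter are hypothetical names, and the search callable stands in for any external source.

```python
from typing import Callable, Dict, Iterable, List


class PrefetchStore:
    """Hypothetical local store for results of pre-resolved potential queries."""

    def __init__(self, search: Callable[[str], List[str]]) -> None:
        self._search = search              # stand-in for an external source
        self._results: Dict[str, List[str]] = {}
        self.remote_calls = 0              # counts accesses to the external source

    def _resolve(self, query: str) -> List[str]:
        self.remote_calls += 1
        return self._search(query)

    def prefetch(self, queries: Iterable[str]) -> None:
        """Resolve potential queries before the user asks for them."""
        for query in queries:
            if query not in self._results:
                self._results[query] = self._resolve(query)

    def lookup(self, query: str) -> List[str]:
        """Answer an actual user request, locally when possible."""
        if query not in self._results:
            self._results[query] = self._resolve(query)
        return self._results[query]


def fake_search(query: str) -> List[str]:
    # Placeholder for a real search engine call.
    return [f"result for {query}"]


store = PrefetchStore(fake_search)
store.prefetch(["artist sting"])
hits = store.lookup("artist sting")   # served from the local store
print(hits, store.remote_calls)
```

In this sketch the user's actual request triggers no additional access to the external source, because the result was already stored during pre-fetching.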
Preferably, the framework identifies many potential queries that a home user may be interested in, and can resolve such queries even before the user expresses interest in them. Further, for example, in a device such as a media player that implements such a framework, there is essentially no restriction on the sources accessed for information. The metadata can be obtained locally or from external sources. Further, information of potential interest can be obtained from various sources. For example, the metadata information <Artist:Sting> can be composed as an Internet search engine query or it can be composed specifically for a music website.
The devices 20 and 30 can be part of an IP-based network and therefore can communicate with each other. In the example described herein, the UPnP protocol is utilized by the network 10. However, those skilled in the art will recognize that the present invention is useful with other network communication protocols (e.g., Bluetooth, Jini, HAVi, IEEE 1394, etc.). The network 10 can be a wired network, a wireless network, or a combination thereof.
Said framework can be implemented as a logical module on any of the devices 20 and 30 in
In one example, a music album of the artist “Sting” is available in the home network 10 and the available metadata information includes artist, album, title and genre of the music. Given this information, an appropriate Internet website (e.g., allmusic.com) is accessed to find other information available about a music album in general and that particular music album by “Sting.” This allows discovery of additional information such as, “release date for the album,” “biography of Sting,” “lyrics for song X in the album” that are available. Based on this information, general queries (e.g., artist biography, release date of album, etc.) and specific queries (e.g., Sting's biography, etc.) are formed to extract additional information from additional (e.g., not music specific) sources such as search engines. The extracted additional information is stored locally in the network for efficient access and speed.
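The formation of general and specific queries can be sketched as below. This is an illustrative assumption about data shapes: discovered information types are modeled as (field, topic) pairs, the function name form_queries is hypothetical, and "Album X" is a placeholder title because the description does not name the album.

```python
from typing import Dict, List, Tuple


def form_queries(
    metadata: Dict[str, str], topics: List[Tuple[str, str]]
) -> Tuple[List[str], List[str]]:
    """Build general and specific queries from discovered information types.

    Each topic is a (metadata field, kind of information) pair, e.g.
    ("artist", "biography"). This pairing is an assumption of the sketch.
    """
    general: List[str] = []
    specific: List[str] = []
    for field, topic in topics:
        # General query: uses only the field name, e.g. "artist biography".
        general.append(f"{field} {topic}")
        value = metadata.get(field)
        if value is not None:
            # Specific query: substitutes the actual value, e.g. "Sting biography".
            specific.append(f"{value} {topic}")
    return general, specific


metadata = {"artist": "Sting", "album": "Album X"}  # "Album X" is a placeholder
topics = [("artist", "biography"), ("album", "release date")]
general, specific = form_queries(metadata, topics)
print(general)   # ['artist biography', 'album release date']
print(specific)  # ['Sting biography', 'Album X release date']
```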
The queries are optionally customized based on contextual information, such as the user's history/preferences, etc. For example, if a particular user has previously requested the age of an artist (a request which the framework has never been able to guess as a potential user query), then the framework ensures that such information is searched and made available in case the user desires it in the future.
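One way to picture this customization is to extend the list of potential queries with topics taken from the user's request history. The sketch below is hypothetical: the history format of (field, topic) pairs and the name customize_queries are assumptions, not part of the described framework.

```python
from typing import Dict, List, Tuple


def customize_queries(
    potential: List[str],
    history: List[Tuple[str, str]],
    metadata: Dict[str, str],
) -> List[str]:
    """Add queries for topics the user has requested before.

    history holds (metadata field, topic) pairs from earlier user requests;
    this representation is an assumption made for illustration.
    """
    customized = list(potential)
    for field, topic in history:
        value = metadata.get(field)
        if value is None:
            continue  # the current content has no value for this field
        query = f"{value} {topic}"
        if query not in customized:
            customized.append(query)
    return customized


potential = ["Sting biography"]
history = [("artist", "age")]  # the user once asked for an artist's age
queries = customize_queries(potential, history, {"artist": "Sting"})
print(queries)  # ['Sting biography', 'Sting age']
```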
In the example herein, unstructured data refers to a data segment (e.g., free text data segment, or marked up data segment) whose semantics cannot be analyzed (e.g., Google search ‘pope’ or <other>pope</other>). Structured data refers to XML data with tags that closely define the semantics of small sections of free-form data (e.g., CD song information <artist>Sting</artist>). Semi-structured data refers to data (such as XML) with tags that define part of the free-form data, but do not describe the semantics of significant sections of the data (e.g., EPG data sections <review> . . . </review>). Web pages are included in both unstructured and semi-structured sources. Most web pages are unstructured (e.g., most web pages with free descriptive text), but some web pages are semi-structured (e.g., those with content from a database).
The CF 106 includes a Query Execution Planner (QEP) 118, a Correlation Plan Executor (CPE) 120, a Correlation Constructor (CC) 122, and Rulelets 124. The CIG 104 and the data extraction plug-ins 108 obtain local content and Internet data for the CF 106. Potential user queries are formed and resolved by the CF 106, and the query results are locally stored and presented to the user when requested.
Specifically, the CIG 104 gathers information about the user, such as current user and device activity, content stored on devices 110, user history, preferences, etc.
The QEP 118 constructs a plan for forming and resolving potential queries, based on information gathered by the CIG 104. The plan essentially describes the steps to execute in order to identify the potential information of interest to the user, and to form one or more queries to search for that information.
The CPE 120 executes this plan using the Rulelets 124 to perform the individual steps in the plan. The Rulelets 124 comprise specialized processes that execute a specific task (e.g., extracting metadata information for a music album, etc.). The Rulelets 124 invoke the data extraction plug-ins 108 for obtaining information from external sources. The data extraction plug-ins 108 extract the requested data from appropriate sources, including the local devices/media repository and the Internet (via Internet search engines such as Google, Yahoo, etc., and seed sources such as CDDB.org, allmusic.com). For example, results from a search engine (e.g., Google) can be obtained from a search engine plug-in 108, whereas CDs by Sting can be obtained from an online music store through a plug-in 108 designed to work with that music store site.
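The relationship between a plan, the Rulelets and the data extraction plug-ins can be sketched as a simple step executor. Everything named below (execute_plan, the rulelet functions, the "search_engine" plug-in key, the shared context dictionary) is a hypothetical illustration; the description does not specify how plans or Rulelets are represented.

```python
from typing import Callable, Dict, List

Plugin = Callable[[str], List[str]]
Rulelet = Callable[[dict, Dict[str, Plugin]], None]


def extract_metadata(ctx: dict, plugins: Dict[str, Plugin]) -> None:
    # Stand-in for reading metadata from the local media repository.
    ctx["metadata"] = {"artist": "Sting"}


def form_queries(ctx: dict, plugins: Dict[str, Plugin]) -> None:
    ctx["queries"] = [f"{k} {v}".lower() for k, v in ctx["metadata"].items()]


def resolve_queries(ctx: dict, plugins: Dict[str, Plugin]) -> None:
    # A rulelet invokes a data extraction plug-in to reach an external source.
    search = plugins["search_engine"]
    ctx["results"] = {q: search(q) for q in ctx["queries"]}


# Each rulelet performs one specific task; a plan is an ordered list of names.
RULELETS: Dict[str, Rulelet] = {
    "extract_metadata": extract_metadata,
    "form_queries": form_queries,
    "resolve_queries": resolve_queries,
}


def execute_plan(plan: List[str], plugins: Dict[str, Plugin]) -> dict:
    """Run each plan step in order, sharing results through a context dict."""
    ctx: dict = {}
    for step in plan:
        RULELETS[step](ctx, plugins)
    return ctx


plugins = {"search_engine": lambda q: [f"hit for {q}"]}  # fake plug-in
ctx = execute_plan(["extract_metadata", "form_queries", "resolve_queries"], plugins)
print(ctx["results"])  # {'artist sting': ['hit for artist sting']}
```

Under this arrangement, supporting a new source means registering another plug-in, and supporting a new kind of information means adding a rulelet and a plan that uses it.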
The CC 122 stores the search results and identifies correlations amongst them and the locally available data for presentation to the user via the client UI 102 when the user requests such information. In this example, the client UI 102 further provides interfaces for the user to access local content and any related Internet data that the CF 106 identifies and provides to the CC 122.
The modules 104, 106 and 108 in
The identified correlations can comprise the correlations in step 210, or others, depending on the available local and extracted related data. As such, building over the example in step 210, if the artist of both albums X and Y is the same, the stored correlations and information are used later as follows: A user interface displays the release date of an album alongside other metadata information, when the user accesses the album. The user interface also displays (and allows users to browse) the additional correlations identified, such as the different kinds of virtual album groupings that the framework created.
A framework according to the present invention does not restrict the user to choosing from a limited/standard list of related/metadata information. Rather, the framework identifies additional information that the user would likely be interested in accessing, then extracts and stores such data for the user to efficiently use when desired. Additional information can be made accessible by adding new plans and new plug-ins. Thereby, the framework allows the user to access several types of information efficiently, without the user having to request the information.
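A correlation such as "albums X and Y share the same artist" can be pictured as a grouping over stored metadata. The sketch below is only an assumed illustration of one kind of virtual album grouping; the record layout and the function name group_by_artist are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_artist(albums: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group album titles by artist and keep artists with several albums.

    Each resulting group is one possible virtual album grouping.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for album in albums:
        groups[album["artist"]].append(album["title"])
    # Only artists shared by more than one album form a correlation.
    return {artist: titles for artist, titles in groups.items() if len(titles) > 1}


albums = [
    {"title": "X", "artist": "Sting"},
    {"title": "Y", "artist": "Sting"},
    {"title": "Z", "artist": "Another Artist"},  # placeholder entry
]
groupings = group_by_artist(albums)
print(groupings)  # {'Sting': ['X', 'Y']}
```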
The framework further enables a user to utilize a CE device, such as a TV, for accessing information using a small number of keys without a keyboard. As discussed herein (e.g., steps 202-212 above), the Internet data is made available to the user based on resolution of the potential queries that are related to the home network content. Further, the framework allows the users to obtain information from an external network (e.g., the Internet) with minimum involvement in query construction. The framework suggests information based on query context to augment user experience of using CE devices with additional data. Accordingly, the power of the Internet is delivered to consumers that use CE devices in an efficient manner in terms of performance and ease of use.
While the example embodiments herein are related to media content, those skilled in the art will recognize that a framework according to the present invention is also useful with, and can be applied to, several other kinds of data such as sports programs, news clippings, etc.
As is known to those skilled in the art, the example architectures described above, according to the present invention, can be implemented in many ways, such as program instructions for execution by a processor, as logic circuits, as an application specific integrated circuit, as firmware, etc. The present invention has been described in considerable detail with reference to certain preferred versions thereof; however, other versions are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the preferred versions contained herein.