The amount of information and content available on the Internet continues to grow exponentially. Given the vast amount of information, search engines have been developed to facilitate web searching. In particular, users may search for information and documents by entering search queries comprising one or more terms that may be of interest to the user. After receiving a search query from a user, a search engine identifies documents and/or web pages that are relevant based on the terms. A search page is returned with a list of hyperlinks to “landing pages” that correspond with the identified documents and/or web pages. Because of its utility, web searching, that is, the process of finding relevant web pages and documents for user-issued search queries, has arguably become one of the most popular services on the Internet today. However, in some instances, when a user selects a search result and accesses a landing page, that particular landing page may not have the relevant information, even though it is within a website (i.e., a collection of web pages within a given domain) that contains the information the user is seeking. As a result, the user may have to browse or search pages within the website to find that information.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Embodiments of the present invention relate to identifying actions from social data and using the action data to provide deeplinks for search results. Social data from social networking services may initially be accessed. The social data may be analyzed to identify actions discussed by end users. Additionally, the social data may be analyzed to identify uniform resource locators (URLs) of web pages at which the actions may be performed. Information is stored regarding identified actions and corresponding URLs for providing deeplinks as part of search results returned in response to user search queries.
The present invention is described in detail below with reference to the attached drawing figures, wherein:
The subject matter of the present invention is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.
As discussed above, one problem that users may face when employing search engines is that although a search result returned may correspond with a web page within a website containing the relevant information the user is seeking or a particular action the user wishes to perform, the user may be required to browse the website after selecting the search result to find the information or to perform the action. For instance, suppose a user wishes to check into a flight on an airline. The user may issue a search query for the airline and receive search results that include a hyperlink to the main web page of the website for the airline. After selecting the hyperlink to the main web page for the airline, the user would then need to find a location within the website for checking into the flight. In some cases, this may be a time-consuming task.
One approach to addressing this problem has been the inclusion of deeplinks in search results on search result pages. As used herein, the term “deeplinks” refers to additional hyperlinks that are provided in association with a matching search result. In other words, a search result may include a hyperlink to a main destination web page, as well as deeplinks to other web pages to which the main destination web page links. For instance, in the example above, the main web page for the airline (i.e., the main destination web page) may include a hyperlink to a web page within the airline's website for checking into flights for the airline. Accordingly, a hyperlink to the location for checking into flights for the airline could be included as a deeplink in association with the hyperlink to the main web page for the airline provided as a search result to a user's search query. Although deeplinks allow users to more quickly access the information they are seeking or perform particular actions, the deeplinks included in search results vary from web page to web page, resulting in a fragmented experience for users. As a result, users may not quickly recognize a deeplink and may even simply select the hyperlink to the main destination web page for the search result, thereby missing out on the benefits of the deeplinks.
Some embodiments of the present invention are directed to providing action-based deeplinks with search results to provide a more consistent experience across web pages within a category of web pages. The action-based deeplinks link to locations that allow users to perform actions that are common to a given web page category. For instance, suppose that a web page category is an airline category, which includes the web pages of different airlines. The action-based deeplinks that may be provided would link to locations within the airline websites that allow users to perform airline-related actions such as checking into flights, checking the status of flights, and booking flight reservations. To provide a consistent experience, the search results for airline web pages returned in response to search queries may have the same type of action-based deeplinks. Action-based deeplinks are similar to traditional deeplinks but differ in one respect: while traditional deeplinks are specific to a given web page, action-based deeplinks are similar for web pages within a given category, thereby providing a more consistent user experience. This approach makes it easier for users to quickly navigate to a desired location and perform key tasks, thereby reducing the overall time required to perform the tasks.
In some embodiments of the present invention, action-based deeplinks may be identified for web pages by first categorizing web pages into a variety of categories. Each category is then analyzed to identify action-based deeplinks for web pages in that category. For a given category, hyperlinks within web pages of that category are identified and grouped into a number of clusters. Each cluster may correspond with a particular action users perform when visiting the web pages. For instance, in the example of an airlines category, the actions may include checking into a flight, checking the status of a flight, and making flight reservations. Hyperlinks are identified within web pages that allow users to perform each action. Based on that information, action-based deeplinks may be provided when returning search results for those web pages. Again, because the action-based deeplinks may be similar among web pages within a given category, a more consistent user experience may be provided that allows users to more quickly perform desired tasks.
Social data may be used in some embodiments to identify actions performed by users for purposes of determining which deeplinks to provide with search results. In particular, end users often discuss actions that they perform in social networking messages, such as posts or status update messages. For instance, an end user may post a message that states: “I just watched Star Wars: Episode I at Regal Cinemas.” Here, the action identified would be a “watch” action or a “buy movie tickets” action. By analyzing social data, actions that are commonly being performed by end users may be identified. Accordingly, deeplinks that allow end users to perform those actions may be provided with search results.
Accordingly, in one aspect, an embodiment of the present invention is directed to one or more computer storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method. The method includes accessing social data from one or more social networking services. The method also includes analyzing the social data to identify a plurality of actions. The method further includes analyzing the social data to identify at least one URL for each action, each URL corresponding with a webpage at which a corresponding action may be performed. The method still further includes storing information regarding at least a portion of the plurality of actions and corresponding URLs for providing deeplinks for search results.
In another aspect, an embodiment of the present invention is directed to one or more computer storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method. The method includes accessing social data from one or more social networking services. The method also includes analyzing the social data to identify a plurality of actions and analyzing the social data to identify a plurality of segments. The method further includes identifying a group of actions for a first segment based on the plurality of actions and plurality of segments from the social data. The method also includes selecting top actions from the group of actions for the first segment. The method further includes identifying a URL for each top action. The method still further includes storing information regarding the top actions and corresponding URLs for providing deeplinks for search results.
A further embodiment of the present invention is directed to a method. The method includes receiving a search query from an end user at a search engine service. The method also includes identifying a web page in response to the search query by querying a search engine index based on the search query. The method further includes identifying deeplinks for the web page, at least one deeplink having been identified based at least in part on an analysis of social data. The method also includes generating a search result for the web page to include the deeplinks. The method still further includes providing the search result for presentation to the end user.
Having briefly described an overview of embodiments of the present invention, an exemplary operating environment in which embodiments of the present invention may be implemented is described below in order to provide a general context for various aspects of the present invention. Referring initially to
The invention may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program modules including routines, programs, objects, components, data structures, etc., refer to code that performs particular tasks or implements particular abstract data types. The invention may be practiced in a variety of system configurations, including hand-held devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. The invention may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network.
With reference to
Computing device 100 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computing device 100 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 100. Computer storage media does not include signals per se. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
Memory 112 includes computer-storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical-disc drives, etc. Computing device 100 includes one or more processors that read data from various entities such as memory 112 or I/O components 120. Presentation component(s) 116 present data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, etc.
I/O ports 118 allow computing device 100 to be logically coupled to other devices including I/O components 120, some of which may be built in. Illustrative components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, etc.
As previously indicated, action-based deeplinks may be provided with search results to allow users to access and perform actions that are common to web pages within a given category. Referring to
Among other components not shown, the system 200 may include a user device 202, content server 204, and search engine server 206. Each of the components shown in
The search engine server 206 generally operates to index information regarding web pages served by content servers, such as the content server 204, in a search engine index 210. When the search engine server 206 receives search queries from user devices, such as the user device 202, the search engine queries the search engine index 210 to identify search results based on the users' search queries and returns those search results to the user devices. In accordance with embodiments of the present invention, the search engine server 206 is also configured to identify action-based deeplinks for some web pages and to provide those action-based deeplinks when providing search results corresponding with those web pages.
In the embodiment shown in
The web page categorization component 212 operates to identify a category for each of a number of different web pages served by content servers, such as the content server 204, and indexed in the search engine index 210. As a result, web pages are clustered together into various categories. By way of example only and not limitation, the web page categorization component 212 may identify web pages within a restaurants category, hotels category, airlines category, and social networks category, to name a few.
Web page categorization may be performed in any of a number of different manners within the scope of embodiments of the present invention. In some embodiments, the categorization may be based on an existing repository of web page categorizations, such as the Open Directory Project (ODP). In some embodiments, web pages may be automatically grouped together into categories by analyzing the content of the pages. For instance, clustering techniques may be employed to cluster the web pages based on their content. As another example, the web pages may be categorized by looking for particular keywords in the content of the web pages. Web page categorization could also be automatically performed by analyzing the hyperlinks within the content of the web pages. Web page categorization may also include a manual approach based on editorial review of web pages to manually place the web pages into the different categories. In still further embodiments, a sample of manually-categorized web pages may be used as seeds for an automatic approach in which other web pages are compared against the seed web pages to categorize the other web pages. Any and all such variations and combinations thereof are contemplated to be within the scope of embodiments of the present invention.
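By way of illustration only, the keyword-based categorization mentioned above might be sketched as follows. The keyword lists, the function name categorize_page, and the min_hits parameter are hypothetical and are not taken from the description; the sketch simply assigns a page to the category whose keywords occur most often in its content.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical keyword lists. The description names example categories
# but does not specify which keywords would be used for each.
CATEGORY_KEYWORDS: Dict[str, Set[str]] = {
    "airlines": {"flight", "flights", "boarding", "baggage", "airline"},
    "hotels": {"hotel", "rooms", "suite", "checkout"},
    "restaurants": {"menu", "reservations", "dining", "chef"},
}


def categorize_page(content: str, min_hits: int = 2) -> Optional[str]:
    """Return the category whose keywords occur most often, or None.

    min_hits is an assumed cutoff so that pages with too few keyword
    occurrences are left uncategorized.
    """
    tokens = re.findall(r"[a-z]+", content.lower())
    best_category, best_hits = None, 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = sum(1 for token in tokens if token in keywords)
        if hits > best_hits:
            best_category, best_hits = category, hits
    return best_category if best_hits >= min_hits else None
```

A page left uncategorized by such a sketch could still be handled by one of the other approaches described above, such as editorial review or comparison against seed web pages.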
The link clustering component 214 operates to cluster hyperlinks found on web pages within each category. For a given category, the link clustering component 214 may analyze hyperlinks contained within the web pages within that given category to cluster the hyperlinks into a number of clusters. The clustering may be performed in some embodiments by analyzing the words in the anchor text of the hyperlinks. As is known in the art, the anchor text refers to the displayed text of a hyperlink. Hyperlinks containing similar words would be clustered together. In some embodiments, the content of each hyperlink's destination web page may be analyzed to cluster the hyperlinks. In still further embodiments, the clustering may include manual review of hyperlinks and/or destination web pages of the hyperlinks to facilitate clustering.
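As one possible sketch of anchor-text clustering, hyperlinks could be grouped greedily by the word overlap of their anchor text. The Jaccard similarity measure, the 0.5 threshold, and all names below are assumptions chosen for illustration; the description does not prescribe a particular clustering technique.

```python
import re
from typing import List, Set, Tuple

Hyperlink = Tuple[str, str]  # (anchor_text, url)


def tokenize(anchor_text: str) -> Set[str]:
    """Lowercase the anchor text and return its set of words."""
    return set(re.findall(r"[a-z]+", anchor_text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of words common to both sets, out of all words in either."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_anchor_text(
    links: List[Hyperlink], threshold: float = 0.5
) -> List[List[Hyperlink]]:
    """Greedy single-pass clustering (an assumed, simplified technique).

    Each link joins the first cluster whose seed (first member) has
    sufficiently similar anchor text; otherwise it starts a new cluster.
    """
    clusters: List[Tuple[Set[str], List[Hyperlink]]] = []
    for link in links:
        words = tokenize(link[0])
        for seed_words, members in clusters:
            if jaccard(words, seed_words) >= threshold:
                members.append(link)
                break
        else:
            clusters.append((words, [link]))
    return [members for _, members in clusters]
```

With this sketch, anchor texts such as "Check in" and "Online check in" from different airline web pages would fall into one cluster, while "Flight status" would start another.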
In some embodiments, the link clustering component 214 may analyze and cluster all hyperlinks within each web page within the category. In other embodiments, the link clustering component 214 may cluster only a portion of hyperlinks from the web pages. For instance, the link clustering component 214 may consider only the hyperlinks that meet some threshold based on user clicks on the hyperlinks. In some embodiments, the search engine server 206 may receive click-through data collected by web browsers, search toolbars, or other mechanisms on user devices, such as the user device 202. The click-through data may indicate the hyperlinks that users have clicked within web pages when viewing those web pages. Based on such click-through data, the most-clicked hyperlinks on a given web page may be identified, and only those hyperlinks may be considered by the link clustering component 214. For instance, only the hyperlinks that have received a threshold number of clicks or that have a threshold click-through rate (i.e., the percentage of web page visits for the web page that have resulted in a click on the hyperlink) may be considered.
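The click-based filtering described above could be sketched along the following lines. The LinkStats record, its field names, and the select_links function are hypothetical; the click-through rate is computed as clicks divided by page visits and is expressed here as a fraction rather than a percentage. Either threshold may be used on its own by leaving the other at its default.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkStats:
    """Hypothetical per-hyperlink click-through record."""
    url: str
    clicks: int        # clicks on the hyperlink
    page_visits: int   # visits to the web page containing the hyperlink

    @property
    def click_through_rate(self) -> float:
        # Fraction of page visits that resulted in a click on the link.
        return self.clicks / self.page_visits if self.page_visits else 0.0


def select_links(
    stats: List[LinkStats], min_clicks: int = 0, min_ctr: float = 0.0
) -> List[LinkStats]:
    """Keep only hyperlinks that satisfy the configured thresholds."""
    return [
        s for s in stats
        if s.clicks >= min_clicks and s.click_through_rate >= min_ctr
    ]
```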
In some embodiments, the search engine server 206 may be configured to provide traditional deeplinks. In such embodiments, the link clustering component 214 may consider only hyperlinks corresponding with the deeplinks for the web page. All deeplinks may be considered in some embodiments, while only a portion of the deeplinks may be considered in other embodiments. For an example of the latter, the search engine server 206 may track user clicks of deeplinks from search results. Based on such click data, the search engine server 206 may identify the hyperlinks that correspond with the most-clicked deeplinks, and only those hyperlinks may be considered by the link clustering component 214. For instance, only the hyperlinks that correspond with deeplinks that have received a threshold number of clicks or that have a threshold click-through rate (i.e., the percentage of search results for the web page that have resulted in a click on the deeplink) may be considered.
Based on the clustering of hyperlinks from the link clustering component 214, the action-based deeplink identification component 216 may identify one or more different types of actions for the web page category being analyzed. In particular, each action may correspond with a type of action users perform using hyperlinks within a cluster of hyperlinks. For instance, if the web page category being analyzed is an airlines category, a first cluster of hyperlinks may correspond with locations for checking into flights, a second cluster of hyperlinks may correspond with locations for checking the status of flights, and a third cluster of hyperlinks may correspond with locations for making flight reservations. As such, a check-in action may be identified based on the first cluster of hyperlinks, a check-status action may be identified based on the second cluster of hyperlinks, and a reservations action may be identified based on the third cluster of hyperlinks.
In some embodiments, the action-based deeplink identification component 216 may consider each cluster identified by the link clustering component 214 and identify an action for each of those clusters. In other embodiments, only clusters that meet some threshold may be processed by the action-based deeplink identification component 216 to identify an action for each of those clusters. For instance, in some embodiments, only clusters that include a threshold number of hyperlinks may be further processed by the action-based deeplink identification component 216. In some embodiments, click-through rates for each hyperlink in a cluster may be analyzed to determine whether to process the cluster. In such embodiments, actions may be identified only for clusters with hyperlinks that satisfy a threshold level of click-throughs. The click-through data may represent user clicks on hyperlinks when visiting the web page. Such click-through data may be collected by web browsers, search engine toolbars, or other mechanisms on user devices, such as the user device 202, and provided to the search engine server 206. In instances in which the hyperlinks correspond with deeplinks, deeplink click-through data may be employed. The deeplink click-through data represents user clicks on deeplinks presented in search results.
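A minimal sketch of such cluster thresholding is shown below. The function name, the dictionary layout, and the choice to sum clicks across a cluster are assumptions; the description leaves open how click-through data would be aggregated for a cluster.

```python
from typing import Dict, List


def clusters_to_process(
    clusters: Dict[str, List[dict]],
    min_links: int = 3,
    min_total_clicks: int = 0,
) -> List[str]:
    """Return labels of clusters large and popular enough to yield an action.

    clusters maps a hypothetical cluster label to its hyperlinks, each a
    dict that may carry a "clicks" count. Summing clicks per cluster is
    one assumed way of applying a click-through threshold.
    """
    selected = []
    for label, links in clusters.items():
        total_clicks = sum(link.get("clicks", 0) for link in links)
        if len(links) >= min_links and total_clicks >= min_total_clicks:
            selected.append(label)
    return selected
```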
After identifying an action for a web page category, the action-based deeplink identification component 216 may identify, for web pages within the category, hyperlinks that correspond with that action. For instance, if the web page category being analyzed is an airlines category and an identified action is checking into flights, hyperlinks on web pages within the category that link to locations that allow users to check into flights would be identified as corresponding with that action. The URL for those locations or other information may then be stored in association with each web page in the search engine index 210 to allow for providing action-based deeplinks when returning search results to search queries.
A hyperlink corresponding with an action may be identified for web pages within a given category in a number of different ways. In some instances, the hyperlinks may be identified from the cluster of hyperlinks for that action. In some cases, a web page may not have had a hyperlink placed in that cluster. For such a web page, hyperlinks from that web page may be analyzed to identify a hyperlink that corresponds with the action. This may include, for instance, automatically analyzing the anchor text of hyperlinks and/or the content of the destination web pages of the hyperlinks to identify a hyperlink that corresponds with the action. For instance, the anchor text or content of the destination web page for a hyperlink may be compared against the anchor text and/or destination web page content for hyperlinks in the cluster of hyperlinks used to identify the action. In some embodiments, editors may manually review web pages to identify hyperlinks that correspond with an action. Any and all such variations and combinations thereof are contemplated to be within the scope of embodiments of the present invention.
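For a web page that has no hyperlink in an action's cluster, the anchor-text comparison described above might look like the following sketch. The scoring rule (the fraction of a hyperlink's anchor-text words that also appear in the cluster's anchor texts), the 0.3 cutoff, and all names are assumptions made for illustration.

```python
import re
from typing import List, Optional, Set, Tuple


def words(text: str) -> Set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def find_action_link(
    page_links: List[Tuple[str, str]],   # (anchor_text, url) pairs on one page
    cluster_anchor_texts: List[str],     # anchor texts from the action's cluster
    min_score: float = 0.3,
) -> Optional[str]:
    """Return the URL of the page's best-matching hyperlink, or None."""
    cluster_vocab: Set[str] = set()
    for text in cluster_anchor_texts:
        cluster_vocab |= words(text)

    best_url, best_score = None, 0.0
    for anchor_text, url in page_links:
        link_words = words(anchor_text)
        if not link_words:
            continue
        # Share of this link's words that the cluster also uses.
        score = len(link_words & cluster_vocab) / len(link_words)
        if score > best_score:
            best_url, best_score = url, score
    return best_url if best_score >= min_score else None
```

A comparison against destination web page content could follow the same pattern, with the page text substituted for the anchor text.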
In some embodiments, a hyperlink may be identified for a particular action for only a portion of web pages in the category. In other embodiments, a hyperlink for an action may be identified for most or even all web pages in the category. In this way, an action-based deeplink may be provided with search results for the web pages in the category to provide users with a consistent experience. This would allow users to more quickly get to the information and perform desired actions.
The user interface component 218 provides an interface to user devices, such as the user device 202, that may allow users to submit search queries to the search engine server 206 and to receive search results from the search engine server 206. It should be understood that the user device 202 may be any type of computing device employed by a user to submit search queries and receive search results. By way of example only and not limitation, the user device 202 may be a desktop computer, a laptop computer, a tablet computer, a mobile device, or other type of computing device. The user device 202 may include an application that allows a user to enter a search query and submit the search query to the search engine server 206 to retrieve search results. For instance, the user device 202 may include a web browser that includes a search input box or allows a user to access a search page to submit a search query. Other mechanisms for submitting search queries to search engines are contemplated to be within the scope of embodiments of the present invention.
When the search engine server 206 receives a search query, the search engine index 210 is queried to identify search results. In some instances, a search result may have a corresponding action-based deeplink that has been identified by the action-based deeplink identification component 216. Accordingly, when the search engine server 206 returns the search result to the user device 202, the search result includes not only a hyperlink to the destination web page of the search result, but also an action-based deeplink that links to a location that allows the user to perform a corresponding action. In some embodiments, the search engine server 206 may return a search result that includes a hyperlink to the destination web page of the search result, one or more traditional deeplinks, and one or more action-based deeplinks based on information indexed for a web page corresponding with the search result.
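To make the shape of such a search result concrete, the following sketch assembles a result from a hypothetical index entry. The SearchResult type, the dictionary keys ("title", "url", "deeplinks", "actions"), and the example URLs are illustrative assumptions and do not reflect any particular index format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SearchResult:
    """Hypothetical search result carrying both kinds of deeplinks."""
    title: str
    url: str                                   # destination web page
    deeplinks: List[Dict[str, str]] = field(default_factory=list)
    action_deeplinks: List[Dict[str, str]] = field(default_factory=list)


def build_search_result(index_entry: dict) -> SearchResult:
    """Combine the main hyperlink with any stored deeplinks."""
    return SearchResult(
        title=index_entry["title"],
        url=index_entry["url"],
        deeplinks=list(index_entry.get("deeplinks", [])),
        action_deeplinks=[
            {"action": action, "url": action_url}
            for action, action_url in index_entry.get("actions", {}).items()
        ],
    )


# Example usage with made-up data:
entry = {
    "title": "Example Air",
    "url": "https://air.example",
    "actions": {"Check in": "https://air.example/checkin"},
}
result = build_search_result(entry)
```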
Examples of search results that include action-based deeplinks are illustrated in the screen displays shown in
Referring first to
Turning next to
In some embodiments, the action-based deeplinks displayed for search results corresponding with each web page in a given category may be consistent across the search results. That is, action-based deeplinks may be provided for the same actions. Additionally, the action-based deeplinks may be presented in a similar manner using common anchor text and/or icons. Accordingly, while the deeplinks presented may vary from web page to web page, the action-based deeplinks may be similar or the same. For instance, when search results for other airlines' web pages are provided in response to search queries, the deeplinks may be different from the deeplinks 308 shown in
With reference now to
As shown at block 504, a web page category is selected for analysis. Hyperlinks within the web pages within the selected category are identified, as shown at block 506. In some embodiments, all hyperlinks contained within those web pages may be identified for further processing. However, in other embodiments, only a subset of hyperlinks may be employed. For instance, some embodiments may identify hyperlinks to consider based on user click data representing user clicks on hyperlinks when users visit the web pages or user clicks on deeplinks in search results, the deeplinks corresponding with particular hyperlinks in the web pages.
The hyperlinks are clustered into a number of clusters, as shown at block 508. In various embodiments, the hyperlink clustering may be based on the anchor text of the hyperlinks, the content of the destination web pages corresponding with the hyperlinks, and/or other data. Common actions for the category of web pages are identified based on the hyperlink clusters, as shown at block 510. In particular, a cluster may be identified as corresponding with a particular action. In some instances, an action is identified for each cluster. In other embodiments, actions are only identified for clusters that meet a threshold, which may be based on, for instance, a total number of hyperlinks within a given cluster, click-through data for hyperlinks within a given cluster, or other data.
Action-based deeplinks are identified for each action for at least a portion of the web pages in the category, as shown at block 512. The action-based deeplinks correspond with hyperlinks in the web pages that link to locations corresponding with each action. Data is stored identifying the action-based deeplinks for web pages, as shown at block 514. This allows for the action-based deeplinks to be provided with search results for the web pages in response to search queries.
As indicated previously, some embodiments may identify action-based deeplinks by analyzing deeplinks from web pages within a given web page category as opposed to analyzing all hyperlinks from those web pages. This specific approach is illustrated in the method 600 shown in
Deeplinks for web pages in the selected category are identified, as shown at block 606. These deeplinks correspond with hyperlinks that are provided in association with search results for the web pages when returning the web pages as search results in response to search queries. The deeplinks may have been previously identified for the web pages, for instance, by analyzing the hyperlinks in the web pages to identify important or popular hyperlinks (e.g., based on user clicks on the hyperlinks when users visit the web pages).
Popular deeplinks are identified at block 608. This may be performed by analyzing click-through data for the deeplinks. The click-through data may comprise information regarding user clicks on deeplinks when the deeplinks are provided with search results in response to search queries. By way of example only and not limitation, the click-through data for a deeplink may include information such as raw click data or click-through rates based on the number of clicks on the deeplink as compared to the number of times the deeplink is returned with search results.
The popular deeplinks are clustered into a number of clusters, as shown at block 610. The clustering may be based on, for instance, the anchor text of the deeplinks and/or the content of the destination web pages of the deeplinks. Actions are identified based on the clusters of deeplinks, as shown at block 612. In some instances, an action may be identified for each cluster. In other instances, only clusters that satisfy some threshold may be considered for identifying an action. The threshold may be based on, for instance, the number of deeplinks within a cluster or the popularity of the deeplinks in a cluster (e.g., based on click-through data).
For each identified action, hyperlinks within web pages within the selected category are identified as corresponding with the action, as shown at block 614. This may be done for a given action by identifying the deeplinks within the cluster corresponding with the action. In some embodiments, the hyperlinks may be automatically identified by analyzing the anchor text of hyperlinks or content of the destination web pages of the hyperlinks. This may include comparison of the anchor text or content of the destination web pages to the action or the cluster of deeplinks corresponding with the action (for instance, the anchor text or destination web pages for those deeplinks). In further embodiments, the hyperlinks may be manually identified by editors who review the web pages to identify hyperlinks corresponding with an action.
As shown at block 616, data is stored identifying the action-based deeplinks for web pages. This allows for the action-based deeplinks to be provided with search results for the web pages in response to search queries.
Turning now to
When the search result is presented, the search result may include a hyperlink to a destination web page corresponding with the search result. Additionally, the search result may include the deeplinks and the action-based deeplinks, which link to different web pages. In embodiments, the search result may include any number of deeplinks and action-based deeplinks. In some embodiments, the action-based deeplinks may be presented more prominently than the traditional deeplinks. The end user may select an action-based deeplink from the search result, as shown at block 714. In response to the user selection, the end user is navigated to the destination web page corresponding with the selected action-based deeplink, as shown at block 716.
Surfacing Actions from Social Data
End users often report actions they perform in their social networking messages. As such, social data from social networking services may be analyzed to identify actions that end users are performing and that information may be used in determining what deeplinks to provide for search results returned by a search engine service in response to user search queries.
For instance, an end user may post a status message such as: “I just watched Star Wars: Episode I at Regal Cinemas.” Here, the action performed would be a “watch” action or a “buy movie tickets” action. When this action is seen in multiple social networking messages from various end users, it's likely that this is a popular action that should be surfaced as a deeplink when returning search results.
Accordingly, some instances of the present invention are directed to accessing social data and analyzing the social data to identify actions. Using the social data may improve the precision and recall of the action dataset for deeplinks.
The social data may generally include a number of discrete social networking messages from social networking services. As used herein, a “social networking service” may be any type of online service that facilitates sharing messages and other content among end users within a social network. As used herein, a “social network” may be a group of end users who are linked together for content sharing purposes. In some instances, the linking between two end users may require a first end user to request to be linked to the second end user and the second end user to accept that linking (e.g., “friending” via “friend request” and “acceptance”). In other instances, the linking may simply include an end user subscribing to be linked to another user or group of users. Examples of social networking services include the FACEBOOK, TWITTER, LINKEDIN, MYSPACE, and GOOGLE PLUS social networking services. As used herein, a “social networking message” may be any type of message or other content shared among end users within a social network via a social networking service. For example, a social networking message may be a post, a status update, or a “tweet.” A social networking message may be shared among all end users within a particular social network or only a subset of the end users within the social network.
The social data may be analyzed to identify common actions discussed in social networking messages. Additionally, the social data may be analyzed to identify URLs of web pages at which those actions may be performed. That information may then be used to store metadata that may be used to select deeplinks for search results.
In some instances, the social networking messages may be analyzed to identify not only actions but entities and/or segments associated with the actions and that information may be used in determining what information to extract and store for deeplink purposes. An entity may be the object associated with the action in a social networking message. For instance, in the above example post: “I just watched Star Wars: Episode I at Regal Cinemas,” the entity that may be identified with the “watch” action may be “Star Wars: Episode I” and/or “Regal Cinemas.” A segment is a broader category to which a related entity belongs and/or a website category (e.g., as discussed hereinabove) to which websites at which the action may be performed belong. For instance, in the above example post, the identified entity may be classified to a segment for “movies.” Alternatively, a URL for the “Regal Cinemas” website may be identified and that URL may be determined to belong to a website category corresponding with a segment for “movies.”
By identifying entities and/or segments associated with actions, the various actions for a given entity and/or segment may be identified and the most popular or “top” actions for an entity and/or segment may be determined based on various factors, such as the number of social networking messages discussing the action or how recently the messages were posted.
Turning now to
The social data is analyzed to identify actions discussed by end users within the social networking messages, as shown at block 804. For instance, in the example post: “I just watched Star Wars: Episode I at Regal Cinemas,” the action may be “watched.” An action may be identified in a social networking message in a number of different ways. For instance, an action may be identified by parsing the message and performing lexical analysis to identify an action word or phrase in the text of the message. In some instances, a social networking message may have an associated schema defined by the social networking service or otherwise and the action may be identified based on the schema associated with the message.
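By way of example only, a very simple form of the lexical analysis described above might match message tokens against a lexicon of action words. The `ACTION_LEXICON` table and the `identify_action` helper are hypothetical; a practical system would likely rely on part-of-speech tagging or a message schema rather than a fixed word list.

```python
import re

# Hypothetical lexicon mapping surface forms to canonical actions.
ACTION_LEXICON = {
    "watched": "watch",
    "watching": "watch",
    "saw": "watch",
    "bought": "buy",
    "ordered": "buy",
    "booked": "book",
    "reserved": "book",
}


def identify_action(message: str):
    """Return the canonical action for the first action word found, else None."""
    for token in re.findall(r"[a-z']+", message.lower()):
        if token in ACTION_LEXICON:
            return ACTION_LEXICON[token]
    return None
```

Applied to the example post, `identify_action("I just watched Star Wars: Episode I at Regal Cinemas.")` yields the canonical action "watch".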
The social data is also analyzed to identify URLs of web pages at which the actions may be performed, as shown at block 806. For instance, continuing the example post above, the URL of a web page at which an end user may purchase tickets to watch a movie at Regal Cinemas may be identified.
In some instances, URLs may be embedded with the social data. For example, end users may include URLs when they compose messages. In other instances, URLs may be automatically included when social networking messages are generated. Accordingly, when a social networking message includes an embedded URL, that URL may be identified as corresponding with the action identified from that social networking message.
In other instances, URLs may be identified from additional analysis. For instance, when a social networking message is analyzed, both an action and an entity may be identified. An entity may be identified, for instance, by performing lexical analysis of a social networking message or relying on a schema for the social networking message to identify the object of the message. The object may be identified as a particular entity. A website for that entity may then be identified. The website may be identified for an entity, for instance, by relying on an existing knowledge base, such as the ODP, that may link entities to websites or particular web pages. The website may then be analyzed to identify a web page (and corresponding URL) at which the action (identified from the social networking message) may be performed. The web page for an action may be identified, for instance, by analyzing the content of the web page or analyzing anchor text of a link to the web page. By way of example to illustrate, continuing the post above, the entity “Regal Cinemas” may be identified from the post, a website for “Regal Cinemas” may be identified, and the website may be analyzed to identify a web page at which movie tickets may be purchased.
In some instances, URLs identified from social data may need to be normalized for use in deeplinks. In particular, the URLs identified from social data for a particular action may not point to a web page that allows performance of the particular action generically but may include additional parameters. For instance, in the example in which the action corresponds with buying movie tickets at a particular theater, the associated social networking messages that include the action may include URLs that correspond with specific movies. To illustrate, a number of end users may have posted messages that include URLs that correspond with the various movies the end users watched at a particular theater. By way of example to illustrate, the URLs may include: www.exampletheatresite.com/buytickets?movie=firstmovie and www.exampletheatresite.com/buytickets?movie=secondmovie. As can be seen, these two URLs correspond with different movies (i.e., “firstmovie” and “secondmovie”). To identify a URL at which tickets may be purchased for any movie (i.e., a generic URL for the action), the URLs may be analyzed to identify a pattern. That pattern may then be analyzed to determine additional parameters that may be removed to identify the generic URL. Those additional parameters may then be removed to provide the generic URL for the action. For instance, from the above example URLs, the generic URL for a “buy movie ticket” action may be: www.exampletheatresite.com/buytickets.
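The normalization described above might be sketched as follows, assuming the simplest pattern: the URLs share a common path and differ only in query parameter values. The `generic_url` helper is hypothetical. It keeps query parameters whose values agree across every URL and drops those that vary, which is only one of many ways the pattern analysis could be performed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def generic_url(urls):
    """Derive a generic URL from URLs that differ only in query parameters.

    Returns None when the URLs do not share the same scheme, host, and path,
    since this sketch does not attempt to generalize across paths.
    """
    parts = [urlsplit(u) for u in urls]
    bases = {(p.scheme, p.netloc, p.path) for p in parts}
    if len(bases) != 1:
        return None
    queries = [dict(parse_qsl(p.query)) for p in parts]
    # Keep only parameters that carry the same value in every URL.
    shared = {
        key: value
        for key, value in queries[0].items()
        if all(q.get(key) == value for q in queries)
    }
    scheme, netloc, path = bases.pop()
    return urlunsplit((scheme, netloc, path, urlencode(shared), ""))
```

For the two example URLs above, the varying `movie` parameter is removed, leaving www.exampletheatresite.com/buytickets as the generic URL.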
Information regarding identified actions and corresponding URLs is stored, as shown at block 808. The information stored for URLs may be generic URLs determined as discussed above. As such, this information may be used during runtime, when a search query is issued to a search engine, to provide deeplinks as part of search results returned in response to the search query.
In some instances, information obtained from social data (e.g., actions and/or URLs) may be validated using the traditional deeplinks pipeline of the search engine service. The traditional deeplinks pipeline may identify deeplinks by identifying top URLs that users visit within websites. Accordingly, actions and/or URLs identified from social data may be validated and/or otherwise used in conjunction with the traditional deeplinks pipeline for determining which deeplinks to provide for search results.
Social data may be processed to identify actions and URLs corresponding with the actions in a variety of different manners in accordance with various embodiments of the present invention. In some instances, the process may include clustering, classifying, or other techniques to improve precision, for instance, by identifying the top actions and more reliably identifying associated URLs for actions. In one approach, actions may be identified from social networking messages (e.g., using lexical analysis or another technique as indicated above), and social networking messages may be grouped into clusters (e.g., using a naïve clustering algorithm or other technique) based on the actions identified for the messages. URLs may then be identified from the social networking messages within each cluster. In some instances, top actions may also be identified, and information for only those top actions may be stored (e.g., at block 808 of
In some instances, social data may be analyzed to identify actions associated with entities and that information may be employed to determine what information to store for deeplink purposes. In particular, social networking messages may include identification of both an action and an associated entity. Referring to
Correspondences between actions and entities are also determined, as shown at block 908. This may include, for instance, identifying action, entity pairs, in which each action, entity pair represents an action and a corresponding entity associated with the action.
Action, entity pairs may be identified from the social data in a number of different ways. By way of example, social networking messages may first be clustered around either actions or entities and each of those clusters may then be subclustered based on the other. For instance, social networking messages may be first clustered around actions and each of those action clusters may then be subclustered around entities. Each subcluster would represent an action, entity pair. Another approach for identifying action, entity pairs may be to perform a first clustering of the social networking messages around actions and another clustering of the social networking messages around entities. The overlap of social networking messages between action clusters and entity clusters may be used to identify the action, entity pairs.
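As an illustrative sketch of the first approach, messages may be grouped by action and each action group then subgrouped by entity. The sketch assumes that an action and an entity have already been extracted for each message, and it uses exact-match grouping in place of a true clustering algorithm; the function name and input format are hypothetical.

```python
from collections import defaultdict


def action_entity_pairs(messages):
    """Group messages by action, then subgroup each action group by entity.

    messages: iterable of (message_id, action, entity) tuples.
    Returns a dict mapping each (action, entity) pair to its message ids.
    """
    by_action = defaultdict(lambda: defaultdict(list))
    for message_id, action, entity in messages:
        by_action[action][entity].append(message_id)
    return {
        (action, entity): ids
        for action, entities in by_action.items()
        for entity, ids in entities.items()
    }
```

The size of each subgroup gives a simple indication of how often a given action is discussed for a given entity.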
In this manner, the various actions available for a given entity may be identified, and that information may be used for determining what information to store. For instance, top actions for a particular entity may be identified, as shown at block 910. Top actions may be selected based on various factors, such as the number of social networking messages discussing the action, how recently the messages were posted, and the number of end users discussing the action. URLs for those top actions may be identified at block 912, and information for those top actions and corresponding URLs may be stored, as shown at block 914.
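By way of example only, top actions for an entity might be ranked with a simple score that combines the factors mentioned above. The equal weighting, the 30-day recency window, and the `top_actions` helper are all assumptions made for illustration rather than a prescribed scoring method.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def top_actions(posts, now, k=2, recent_window=timedelta(days=30)):
    """Rank the actions discussed for one entity and return the top k.

    posts: iterable of (action, user_id, posted_at) tuples.
    Score = message count + recent message count + distinct user count
    (an assumed, equally weighted combination).
    """
    counts = defaultdict(int)
    recent = defaultdict(int)
    users = defaultdict(set)
    for action, user_id, posted_at in posts:
        counts[action] += 1
        users[action].add(user_id)
        if now - posted_at <= recent_window:
            recent[action] += 1

    def score(action):
        return counts[action] + recent[action] + len(users[action])

    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(counts, key=lambda a: (-score(a), a))[:k]
```

Counting distinct users separately from messages is one way to keep a single prolific user from dominating the ranking.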
In further instances, social data may be analyzed to identify actions associated with segments and that information may be employed to determine what information to store for deeplink purposes. As previously indicated, a segment refers to a category of entities and/or websites. Referring to
URLs may be classified to particular segments using web page categorization techniques similar to the web page categorization techniques described hereinabove. As previously discussed, web page categorization may be performed in any of a number of different manners within the scope of embodiments of the present invention. In some embodiments, the categorization may be based on an existing repository of web page categorizations, such as the Open Directory Project (ODP). In some embodiments, web pages may be automatically grouped together into categories by analyzing the content of the pages. For instance, clustering techniques may be employed to cluster the web pages based on their content. As another example, the web pages may be categorized by looking for particular keywords in the content of the web pages. Web page categorization could also be automatically performed by analyzing the hyperlinks within the content of the web pages. Web page categorization may also include a manual approach based on editorial review of web pages to manually place the web pages into the different categories. In still further embodiments, a sample of manually-categorized web pages may be used as seeds for an automatic approach in which other web pages are compared against the seed web pages to categorize the other web pages. Any and all such variations and combinations thereof are contemplated to be within the scope of embodiments of the present invention.
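The keyword-based categorization mentioned above might be sketched as follows. The `SEGMENT_KEYWORDS` table, the `min_hits` threshold, and the `classify_segment` helper are hypothetical; the sketch assigns a page to the segment whose keywords appear most often in the page content.

```python
import re

# Hypothetical keyword lists for two example segments.
SEGMENT_KEYWORDS = {
    "movies": {"movie", "showtimes", "theater", "tickets", "trailer"},
    "restaurants": {"menu", "reservation", "dinner", "cuisine"},
}


def classify_segment(page_text, min_hits=2):
    """Return the segment with the most distinct keyword hits, or None.

    A page must match at least min_hits distinct keywords of a segment to be
    classified; otherwise it is left uncategorized.
    """
    tokens = set(re.findall(r"[a-z]+", page_text.lower()))
    best, best_hits = None, 0
    for segment, keywords in SEGMENT_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best, best_hits = segment, hits
    return best if best_hits >= min_hits else None
```

Pages left uncategorized by such a rule could, for instance, fall back to a repository lookup or to editorial review.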
Correspondences between actions and segments are also determined, as shown at block 1008. This may include, for instance, identifying action, segment pairs, in which each action, segment pair represents an action and a corresponding segment associated with the action.
Action, segment pairs may be identified in a number of different ways. By way of example, social networking messages may first be clustered around either actions or segments and each of those clusters may then be subclustered based on the other. For instance, social networking messages may be first clustered around actions and each of those action clusters may then be subclustered around segments. Each subcluster would represent an action, segment pair. Another approach for identifying action, segment pairs may be to perform a first clustering of the social networking messages around actions and another clustering of the social networking messages around segments. The overlap of social networking messages between action clusters and segment clusters may be used to identify the action, segment pairs.
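As a minimal sketch of the second approach, the overlap between independently formed action clusters and segment clusters may be computed as a set intersection over message identifiers. The input format, the `min_overlap` threshold, and the `pairs_from_overlap` helper are assumptions made for illustration.

```python
def pairs_from_overlap(action_clusters, segment_clusters, min_overlap=1):
    """Identify action, segment pairs from overlapping message clusters.

    action_clusters: dict mapping action -> set of message ids.
    segment_clusters: dict mapping segment -> set of message ids.
    A pair is kept when at least min_overlap messages appear in both clusters.
    """
    pairs = {}
    for action, action_ids in action_clusters.items():
        for segment, segment_ids in segment_clusters.items():
            shared = set(action_ids) & set(segment_ids)
            if len(shared) >= min_overlap:
                pairs[(action, segment)] = shared
    return pairs
```

Raising `min_overlap` filters out pairs supported by only a few messages, which may improve precision at some cost to recall.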
In this manner, the various actions available for a given segment may be identified, and that information may be used for determining what information to store. For instance, top actions for a given segment may be identified, as shown at block 1010. URLs for those top actions may be identified at block 1012, and information for those top actions and corresponding URLs may be stored, as shown at block 1014.
Turning now to
Deeplinks are selected for one of the web page results, as shown at block 1106. The deeplinks may be selected, for instance, based on metadata stored in a search engine index for the web page. At least one of the deeplinks selected for the web page result is based on information gathered from social data. In particular, the deeplink may correspond with an action and URL that were identified by analyzing social data, for instance, using one of the above-discussed methods.
A search result is generated to include the deeplinks, as shown at block 1108. The search result is provided for presentation to the end user, as shown at block 1110. The deeplinks included with the search result may include at least one deeplink identified based on analysis of social data and at least one deeplink identified independent of social data (e.g., a default deeplink for the web page). In some instances, a deeplink identified from social data may be shown with other deeplinks. If provided with other deeplinks, the social data-based deeplink may be shown the same as the other deeplinks. Alternatively, the social data-based deeplink may be shown differently than the other deeplinks, for instance, by highlighting the social data-based deeplink or providing some other text treatment (e.g., bolding). In other instances, a deeplink identified from social data may be shown separately from other deeplinks (e.g., at a different location within the search result). In some instances, a social data-based deeplink may be displayed as an action-based deeplink, such as those described above. For instance, a social data-based deeplink may be displayed as an action-based deeplink separate from other deeplinks in a search result. Additionally, the social data-based deeplink may correspond with an action common to websites within a particular category and be displayed similarly to the way other action-based deeplinks of that type of action are displayed for search results for other websites within that website category to provide users with a more consistent search experience.
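By way of illustration only, generating a search result that combines social data-based deeplinks with default deeplinks might be sketched as follows. The `Deeplink` and `SearchResult` records, the `from_social_data` flag, and the choice to list social data-based deeplinks first are hypothetical and represent only one of the presentation options described above.

```python
from dataclasses import dataclass, field


@dataclass
class Deeplink:
    label: str
    url: str
    from_social_data: bool = False  # lets a renderer highlight or separate it


@dataclass
class SearchResult:
    title: str
    url: str
    deeplinks: list = field(default_factory=list)


def build_search_result(title, url, default_links, social_links, max_links=4):
    """Merge deeplinks, listing social data-based ones first and dropping
    duplicates that point to the same URL."""
    merged, seen = [], set()
    for link in list(social_links) + list(default_links):
        if link.url not in seen:
            seen.add(link.url)
            merged.append(link)
    return SearchResult(title, url, merged[:max_links])
```

A renderer could use the `from_social_data` flag to apply a distinct text treatment or to place those deeplinks at a different location within the search result.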
The end user may select a deeplink from the search result, as shown at block 1112. In response to the user selection, the end user is navigated to the destination web page corresponding with the selected deeplink, as shown at block 1114.
A number of search results may be returned in response to search queries. In some instances, only the top search result may be processed to provide deeplinks using embodiments of the present invention. In other instances, deeplinks may be provided with any search result. Any and all combinations and variations thereof are contemplated to be within the scope of embodiments of the present invention.
Referring next to
Among other components not shown, the system 1200 may include a user device 1202, content server 1204, and search engine server 1206. Each of the components shown in
The search engine server 1206 generally operates to index information regarding web pages served by content servers, such as the content server 1204, in a search engine index 1210. When the search engine server 1206 receives search queries from user devices, such as the user device 1202, the search engine queries the search engine index 1210 to identify search results based on the users' search queries and returns those search results to the user devices. In accordance with some embodiments of the present invention, the search engine server 1206 is also configured to identify actions based on social data and to provide deeplinks with search results based at least in part on actions identified from the social data.
In the embodiment shown in
The social data accessing component 1212 operates to access social data from various social networking services. The action identification component 1214 operates to identify actions in the social data. The entity identification component 1216 operates to identify entities associated with actions in the social data. The segment identification component 1218 operates to identify segments associated with actions in the social data. The URL identification component 1220 operates to identify URLs associated with actions in the social data. These components may operate to identify actions, entities, segments, and/or URLs, for instance, using approaches such as those described above with reference to
The user interface component 1222 provides an interface to user devices, such as the user device 1202, that may allow users to submit search queries to the search engine server 1206 and to receive search results from the search engine server 1206. It should be understood that the user device 1202 may be any type of computing device employed by a user to submit search queries and receive search results. By way of example only and not limitation, the user device 1202 may be a desktop computer, a laptop computer, a tablet computer, a mobile device, or other type of computing device. The user device 1202 may include an application that allows a user to enter a search query and submit the search query to the search engine server 1206 to retrieve search results. For instance, the user device 1202 may include a web browser that includes a search input box or allows a user to access a search page to submit a search query. Other mechanisms for submitting search queries to search engines are contemplated to be within the scope of embodiments of the present invention.
When the search engine server 1206 receives a search query, the search engine index 1210 is queried to identify search results. In some instances, a search result may have a corresponding action that has been identified by the action identification component 1214, with a corresponding URL identified by the URL identification component 1220. Accordingly, when the search engine server 1206 returns the search result to the user device 1202, the search result includes not only a hyperlink to the destination web page of the search result, but also deeplinks to pages available at the website corresponding with the web page. The deeplinks may include at least one deeplink that was identified based at least in part on analysis of social data.
As can be understood, embodiments of the present invention include identifying actions from social data and using that information for deeplink purposes. The present invention has been described in relation to particular embodiments, which are intended in all respects to be illustrative rather than restrictive. Alternative embodiments will become apparent to those of ordinary skill in the art to which the present invention pertains without departing from its scope.
From the foregoing, it will be seen that this invention is one well adapted to attain all the ends and objects set forth above, together with other advantages which are obvious and inherent to the system and method. It will be understood that certain features and subcombinations are of utility and may be employed without reference to other features and subcombinations. This is contemplated by and is within the scope of the claims.
This application is a continuation of U.S. patent application Ser. No. 13/406,203, filed Feb. 27, 2012, which is a continuation-in-part of U.S. patent application Ser. No. 13/190,744, filed Jul. 26, 2011, each of which is herein incorporated by reference in its entirety. This application is also related by subject matter to the inventions disclosed in the following U.S. patent applications: U.S. patent application Ser. No. 13/406,181, entitled “Context-Aware Parameterized Action Links for Search Results;” and U.S. patent application Ser. No. 13/406,192, entitled “Personalized Deeplinks for Search Results;” which are incorporated in this application by reference.
Relation | Number | Date | Country
---|---|---|---
Parent | 13406203 | Feb 2012 | US
Child | 15155864 | | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13190744 | Jul 2011 | US
Child | 13406203 | | US