This invention relates to generating an implied object graph based on user behavior.
Various analytical techniques are used to understand relationships between objects in online systems, such as web pages and other documents or items of content. These techniques include, for example, techniques for calculating a global ranking for objects in a corpus of objects, determining a centrality of objects in a corpus, and comparing a similarity of object graphs. But each of these techniques requires a citation graph, which is a graph of links between the objects in the corpus of objects. Citation graphs are often generated by examining explicit links between the objects in the corpus, such as web pages that link to other web pages. But in the absence of explicit links, these analytical techniques for understanding relationships between the objects cannot be used.
An online system monitors behaviors of users with respect to objects, such as documents distributed by or accessible from the online system. Based on the monitored behaviors, the online system determines connections between the objects and one or more users who interacted with the objects. If more than one object is connected to a given user, the online system generates implied links between the objects that are connected to the same user. The implied links between objects connected to the same user may be represented as a local object graph for that user. The online system then merges local object graphs constructed for each of a plurality of users to generate a global object graph. The global object graph represents the relationships within a corpus of objects in the online system, as indicated by users' mutual interests in the objects.
In one embodiment, the online system extracts an adjacency matrix from the global object graph, or from each local object graph. An adjacency matrix stores the links among the objects in the global object graph, which may be measured using weights that represent the strength or closeness of the links between two objects in the global object graph. Using the adjacency matrix and the weights, the online system may apply graph analysis techniques to analyze the relationships between the objects in the corpus. Accordingly, embodiments described herein enable the analysis of the relationships between objects in the online system without relying upon explicit links between the objects.
The features and advantages described in this summary and the following detailed description are not all-inclusive. Many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims.
The figures depict various embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
An online system identifies implied links between objects based on user interactions with the objects. Using these connections, the online system generates an implied object graph representing the relationships between the objects. The online system may analyze the object graph to determine the global rank of objects in the system. The objects may then be ranked based on their global ranks to provide recommendations and relevant search results to users. By inferring relationships between the objects based on the same user having interacted with the objects, the online system calculates the global rank of objects without relying on explicit links between the objects.
The content processing system 106 receives content items from the sources 102, processes the content items to build pages, and serves the pages to a client 104. The content processing system 106 may group pages into “sections,” where each section includes pages from a similar source, relating to a similar topic, or otherwise determined to be similar. The pages and sections, as well as domains or URLs associated with the content sources 102, are referred to herein as “objects” in the system environment 100. Other objects may also be present in the system environment 100. The set of all objects in the environment 100 makes up a “corpus” of objects.
A client 104 can be any computing device equipped with a browser for accessing web pages and a display for viewing them, such as a personal computer, a tablet computer, or a mobile device. A client 104 receives pages from the content processing system 106 and displays them to a user. Although a single client 104 is shown in
Using the clients 104, users interact with the objects in the environment 100 by, for example, reading content, saving content, adding content to a feed, or sharing content with social network connections. The content processing system 106 monitors users' interactions with the objects to identify connections between users and objects. Based on the user-object connections, the content processing system 106 generates implied links between objects and constructs an implied object graph. Similarly, the content processing system 106 may infer probabilistic weights for the implied links. The content processing system 106 may use information about the implied links and their respective probabilistic weights to calculate object-object proximity. This enables proximity to be calculated without relying on explicit links between objects.
A process for generating an implied object graph is illustrated in the flowchart of
The content processing system 106 monitors 202 user behaviors with respect to the objects of the corpus. If the objects are documents or other content items, the behaviors may include, for example, providing explicit positive or negative feedback about the content item, such as by adding the content item to a favorites collection or by reporting the content item as spam or abusive, or providing implicit feedback about the item, such as by reading or viewing the content item. For reading or viewing the content item, the system may take into account the user's dwell time (i.e., the amount of time a user spends reading a content item). Other behaviors may include social sharing activities (e.g., sharing an object with one or more connections on a social network).
In one embodiment, the content processing system 106 monitors user behaviors over a sliding time window that depends, for example, on the type of object. The sliding time window provides behaviors that are contemporary enough to be relevant. For example, the content processing system 106 may monitor behaviors of users with respect to news articles over a relatively short time period (e.g., 24 hours), as a given news article may only be relevant for a short period of time. Similarly, the content processing system 106 may monitor behaviors of users with respect to sections over a longer time period (e.g., two weeks), as the relevance of a section may attenuate less rapidly. Rather than filtering behaviors based on time, the impact of the monitored behaviors may also be decayed based on the time since the behavior occurred, thereby providing a smooth drop-off in the effect of that behavior on the implied graph. Different types of behaviors may be decayed at different rates, or not at all. For example, a user's reading a document may be decayed faster than a user's providing explicit feedback that the document is interesting to the user.
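The time-based decay described above may be sketched as follows. This is a minimal illustration only: the exponential form, the `HALF_LIFE_HOURS` table, its values, and the function name `decayed_weight` are assumptions made for the example and are not specified by the description.

```python
# Hypothetical half-lives (in hours) per behavior type; None means "never decay".
# The behavior names and values are illustrative assumptions.
HALF_LIFE_HOURS = {"read": 24.0, "share": 72.0, "explicit_like": None}


def decayed_weight(base_weight, behavior, age_hours):
    """Return base_weight reduced according to the age of the behavior.

    Exponential decay is assumed here; the description only requires that
    the impact of a behavior drop off smoothly over time.
    """
    half_life = HALF_LIFE_HOURS[behavior]
    if half_life is None:
        # This behavior type is not decayed at all.
        return base_weight
    return base_weight * 0.5 ** (age_hours / half_life)
```

With these assumed values, a read loses half of its weight after 24 hours, whereas explicit feedback keeps its full weight regardless of age.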
Based on user behaviors with respect to the objects, the content processing system 106 identifies 204 connections between users and objects. In particular, if a user's behavior with respect to an object satisfies a link criterion, the content processing system 106 creates a connection between the user and the object. Link criteria may include, for example, reading the content of an object, dwelling on an object for longer than a threshold dwell time, or sharing the object with a social network connection. An example set of connections between a user 300 and objects 302 is illustrated in
The content processing system 106 may also quantify the strength of the connections between objects and users based on the link criteria, with each link criterion associated with a weight. Different types of user interactions with objects may lead to differently weighted links between the user and the objects. For example, a social sharing criterion may be weighted more heavily than a reading criterion, since a user who shares an object with other users is likely to be more interested in the object than a user who merely reads the content. As another example, a longer dwell time may be weighted more heavily than a shorter dwell time. As mentioned above, these weights may be decayed over time, thereby lessening the impact of the user behaviors to the implied graph as those actions become stale.
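One possible way to derive weighted user-object connections from monitored behaviors is sketched below. The criterion names, the weights in `CRITERION_WEIGHTS`, the dwell threshold, and the choice to sum the weights of all satisfied criteria are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical weights per link criterion: sharing outweighs reading,
# and a long dwell adds weight on top of a plain read.
CRITERION_WEIGHTS = {"read": 1.0, "long_dwell": 2.0, "share": 4.0}
DWELL_THRESHOLD_SECONDS = 30.0  # assumed threshold dwell time


def criteria_for(behavior, dwell_seconds=0.0):
    """Return the link criteria satisfied by a single behavior event."""
    if behavior == "share":
        return ["share"]
    if behavior == "read":
        if dwell_seconds > DWELL_THRESHOLD_SECONDS:
            return ["read", "long_dwell"]
        return ["read"]
    return []  # other behaviors create no connection in this sketch


def connection_weights(events):
    """Map (user, object) to a connection weight.

    events is an iterable of (user, object, behavior, dwell_seconds) tuples.
    Weights of all satisfied criteria are summed, which is an assumption.
    """
    weights = defaultdict(float)
    for user, obj, behavior, dwell_seconds in events:
        for criterion in criteria_for(behavior, dwell_seconds):
            weights[(user, obj)] += CRITERION_WEIGHTS[criterion]
    return dict(weights)
```

A behavior that satisfies no link criterion produces no entry, so the corresponding user and object remain unconnected.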
If two or more objects are linked to a common user, the content processing system 106 generates 206 implied links between the objects and constructs a local object graph. As used herein, a “local object graph” represents the relationships among the objects with which a given user interacts. An example of a local object graph is illustrated in
The implied links between the objects 302 may be associated with weights, which are determined based on the weights of the links between the user 300 and the objects 302. In various embodiments, a weight for an implied link between two objects connected to a user may be an arithmetic mean of the weights between each object and the user, the geometric mean of the user-object connections, a summation of the logarithms of each of the two user-object link weights, or the greater of the two user-object link weights. Other methods of calculating weights for the implied links are also possible. In another embodiment, the content processing system 106 assigns weights to implied links between objects by summing the user-object link weights for the set of objects connected to a given user. If the cumulative weight of two user-object connections is in the top n cumulative link weights associated with the user, the weight of the implied link between the two corresponding objects is assigned a value of 1. Otherwise, the weight is assigned a value of zero. For example, suppose objects A, B, and C are connected to a user. The weight of the link between object A and the user is wA, the weight of the link between object B and the user is wB, and the weight of the link between object C and the user is wC. The content processing system 106 calculates the sums sAB=wA+wB, sAC=wA+wC, and sBC=wB+wC. If, for example, it is determined that sAB>sAC>sBC and n=1, the content processing system 106 may assign the implied link between objects A and B a weight of 1, and assign weights of 0 to the implied links between objects A and C and between B and C. The number n of cumulative link weights assigned a value of 1 may be selected so as to provide sparsity in an adjacency matrix resulting from the implied object-object links.
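The weighting alternatives above may be sketched as follows for a single user. The function and variable names are hypothetical, and the sketch assumes strictly positive user-object weights so that the logarithm is defined.

```python
import math
from itertools import combinations

# The four combination rules named in the description.
COMBINERS = {
    "arithmetic_mean": lambda a, b: (a + b) / 2.0,
    "geometric_mean": lambda a, b: math.sqrt(a * b),
    "log_sum": lambda a, b: math.log(a) + math.log(b),
    "max": max,
}


def local_graph(user_weights, method="arithmetic_mean"):
    """Build weighted implied links among the objects connected to one user.

    user_weights maps object -> user-object link weight (assumed > 0).
    Returns a dict mapping an (object, object) pair to its implied weight.
    """
    combine = COMBINERS[method]
    return {
        (a, b): combine(user_weights[a], user_weights[b])
        for a, b in combinations(sorted(user_weights), 2)
    }


def top_n_binary_graph(user_weights, n):
    """Assign 1 to the n pairs with the largest summed weights, else 0."""
    sums = {
        (a, b): user_weights[a] + user_weights[b]
        for a, b in combinations(sorted(user_weights), 2)
    }
    top = set(sorted(sums, key=sums.get, reverse=True)[:n])
    return {pair: (1 if pair in top else 0) for pair in sums}
```

For example, with wA=3, wB=2, and wC=1, the sums are sAB=5, sAC=4, and sBC=3, so `top_n_binary_graph` with n=1 keeps only the link between A and B.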
The content processing system 106 may generate local object graphs for each user (or a subset of the users) who interacts with content served by the content processing system 106. If an object occurs in more than one of the local object graphs, the content processing system 106 merges 208 the local object graphs containing the object. The result of merging 208 the local object graphs is a global object graph representing relationships of the objects in the corpus.
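A minimal sketch of the merging step is shown below. The description does not state how weights are combined when the same implied link appears in several local object graphs; summing them is an assumption made here, as is the representation of each graph as a dictionary keyed by object pairs.

```python
from collections import defaultdict


def merge_local_graphs(local_graphs):
    """Merge per-user local object graphs into one global object graph.

    Each local graph maps an (object, object) pair to a weight. Pairs are
    treated as undirected, and weights for the same pair are summed
    (an assumed merge rule).
    """
    global_graph = defaultdict(float)
    for graph in local_graphs:
        for (a, b), weight in graph.items():
            # Normalize the pair so (A, B) and (B, A) share one entry.
            key = (a, b) if a <= b else (b, a)
            global_graph[key] += weight
    return dict(global_graph)
```

Under this rule, an implied link supported by many users accumulates a larger weight in the global object graph than one supported by a single user.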
The merging 208 of two local object graphs having at least one object in common is illustrated in
In one embodiment, the content processing system 106 extracts 210 an adjacency matrix from the global object graph. The content processing system 106 may alternatively extract 210 an adjacency matrix from each local object graph and generate the adjacency matrix for the global object graph based on the local adjacency matrices. The adjacency matrix is a data structure representing the implied links between the objects in the corpus, and it may be stored on a computer-readable storage medium, such as a memory of the content processing system 106. The content processing system 106 may use the adjacency matrix and graph analysis techniques to rank the objects, recommend objects to users, or otherwise analyze the relationships between the objects. For example, the content processing system 106 may use power iteration to calculate the eigenvector centrality of the objects, representing the influence of each object in the global object graph. Accordingly, the embodiments disclosed herein enable the content processing system 106 to apply techniques from graph theory that were not available in the absence of explicit object-object links.
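The extraction of an adjacency matrix and the power-iteration calculation may be sketched as follows. The function names and the dictionary representation of the global object graph are hypothetical. The sketch iterates on the adjacency matrix plus the identity, an assumed variant that leaves the eigenvectors unchanged while avoiding oscillation on bipartite graphs.

```python
def adjacency_matrix(global_graph):
    """Return (objects, matrix) for an undirected weighted graph.

    global_graph maps an (object, object) pair to an implied-link weight.
    """
    objects = sorted({obj for pair in global_graph for obj in pair})
    index = {obj: i for i, obj in enumerate(objects)}
    n = len(objects)
    matrix = [[0.0] * n for _ in range(n)]
    for (a, b), weight in global_graph.items():
        matrix[index[a]][index[b]] = weight
        matrix[index[b]][index[a]] = weight  # implied links are symmetric
    return objects, matrix


def eigenvector_centrality(matrix, iterations=200, tol=1e-12):
    """Approximate eigenvector centrality by power iteration.

    Each step applies (A + I) and rescales the scores to sum to 1.
    """
    n = len(matrix)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [
            scores[i] + sum(matrix[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        total = sum(updated)
        updated = [value / total for value in updated]
        if max(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores
```

For a corpus of realistic size, a sparse matrix representation would likely be preferable to the dense nested lists used in this sketch.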
One application of an implied object graph as described herein provides a method for ranking objects in a digital magazine. For example, a digital magazine application may provide a personalized, customizable digital magazine for a user. Based on selections made by the user and/or on behalf of the user, the digital magazine may contain a personalized collection of content from a number of sources, thereby providing a useful interface by which the user can consume content that interests and inspires the user.
The digital magazine may be organized into a number of sections, where each section contains content obtained from a particular source or otherwise has a common characteristic. For example, one section of the digital magazine may include articles from an online news source (such as a website for a news organization), another section may contain articles from a third-party-curated collection of content around a particular topic (e.g., a technology compilation), and yet another section may contain content obtained from the user's account on one or more social networking systems.
As one example, the digital magazine application may recommend objects to users of the digital magazine based on the implied object graph. For example, the digital magazine application may identify an implied link between articles A and B. If a user reads article A, the digital magazine application may recommend article B to the user based on the implied links between the articles.
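A recommendation of this kind may be sketched as a lookup of the objects sharing an implied link with the object just read. The function name `recommend`, the parameter `k`, and the ordering by descending link weight are assumptions for illustration.

```python
def recommend(global_graph, read_object, k=3):
    """Return up to k objects with implied links to read_object.

    global_graph maps an (object, object) pair to an implied-link weight.
    Candidates are ordered by descending weight, then by name for ties.
    """
    neighbors = {}
    for (a, b), weight in global_graph.items():
        if weight <= 0:
            continue  # a zero weight means no implied link
        if a == read_object:
            neighbors[b] = weight
        elif b == read_object:
            neighbors[a] = weight
    return sorted(neighbors, key=lambda obj: (-neighbors[obj], obj))[:k]
```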
As another example, the digital magazine application may rank objects in the digital magazine based on the implied object graph, and use the ranking to provide relevant search results to users. For example, users of the digital magazine may submit search queries for articles or sections relating to a particular topic. In response to receiving the search query, the digital magazine application may identify articles and/or sections corresponding to the query as search results. The digital magazine application may then rank the search results for presentation to the user based on the eigenvector centrality of the corresponding objects in the digital magazine. Alternatively, the digital magazine application may suggest high-ranking objects to human editors, who may then perform further processing to generate content packages. For example, editors may be alerted to popular user-generated content, which they may choose to promote within the digital magazine application to other users. Furthermore, the highly-ranked objects may be analyzed by algorithmic editing processes to determine entities (such as people, places, organizations, concepts, or events) named in the objects. If common entities are named in the highly-ranked objects, the digital magazine application may identify trends in currently dominant topics of discussion.
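The ranking of search results by eigenvector centrality may be sketched as a simple sort. The function name and the treatment of objects lacking a centrality score (ranked last, with an assumed score of zero) are illustrative assumptions.

```python
def rank_search_results(matching_objects, centrality):
    """Order search results by descending eigenvector centrality.

    centrality maps object -> score. Objects without a score are treated
    as having a score of 0.0, and ties are broken by object name.
    """
    return sorted(
        matching_objects,
        key=lambda obj: (-centrality.get(obj, 0.0), obj),
    )
```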
As yet another example, the digital magazine application may construct a section based on implied links between articles, URLs, or other objects. For example, if the digital magazine application identifies an implied link between an article C and a URL D, and the digital magazine application adds article C to a section, the digital magazine application may also add articles (or other content) retrieved from URL D to the section. Thus, a section may comprise a set of objects linked to one another in a global object graph of the digital magazine.
The foregoing description of the embodiments of the invention has been presented for the purpose of illustration; it is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Persons skilled in the relevant art can appreciate that many modifications and variations are possible in light of the above disclosure.
Some portions of this description describe the embodiments of the invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are commonly used by those skilled in the data processing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs or equivalent electrical circuits, microcode, or the like. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules, without loss of generality. The described operations and their associated modules may be embodied in software, firmware, hardware, or any combinations thereof.
Any of the steps, operations, or processes described herein may be performed or implemented with one or more hardware or software modules, alone or in combination with other devices. In one embodiment, a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described.
Embodiments of the invention may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, and/or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a tangible computer readable storage medium or any type of media suitable for storing electronic instructions, and coupled to a computer system bus. Furthermore, any computing systems referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
Embodiments of the invention may also relate to a computer data signal embodied in a carrier wave, where the computer data signal includes any embodiment of a computer program product or other data combination described herein. The computer data signal is a product that is presented in a tangible medium or carrier wave and modulated or otherwise encoded in the carrier wave, which is tangible, and transmitted according to any suitable transmission method.
Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the disclosure of the embodiments of the invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
This application is a continuation of U.S. patent application Ser. No. 13/905,016, filed May 29, 2013, which claims the benefit of U.S. Provisional Application No. 61/700,308, filed Sep. 12, 2012, and U.S. Provisional Application No. 61/752,952, filed Jan. 15, 2013, which are incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20150227563 A1 | Aug 2015 | US |
Number | Date | Country | |
---|---|---|---|
61700308 | Sep 2012 | US | |
61752952 | Jan 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 13905016 | May 2013 | US |
Child | 14691370 | US |