News distribution is one of the many features that arise from the use of social networking sites. For example, a person may post a social communication to his/her social network in a social networking site, where the social communication references and comments on some news article or content from a news organization. Upon receiving the social communication, one or more of the receiving members of the social network may comment on or repost the original social communication, and these comments/reposts are delivered or made available to members of their respective social networks. In this manner, news distribution through social networking sites is said to “go viral”, meaning a widespread, growing distribution of news content through social networking sites often occurring within a short amount of time. News content that is “going viral,” i.e., in the process of the widespread, growing distribution of the news content, is referred to as trending news content.
Due, at least in part, to the importance of delivering timely news to people, there have been attempts to identify trending news content from social communications of various social networking sites. However, identifying news content in a timely manner among the vast amount of social communications that are posted, as well as generating informative summary topics, has proven to be a difficult challenge.
The following Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. The Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
According to aspects of the disclosed subject matter, a method for identifying trending topics on social networking sites is presented. The method includes obtaining a plurality of social communications posted on one or more social networking sites. Links within the social communications are extracted and a set of potential topic descriptors is determined. A topic descriptor from the set of potential topic descriptors is selected and the selected topic descriptor is stored with the content as a trending topic of a social networking site.
According to additional aspects of the disclosed subject matter, computer-readable media bearing computer executable instructions for identifying trending topics on social networking services are presented. In execution on a processor, the computer executable instructions carry out a method comprising first obtaining a plurality of social communications posted on one or more social networking sites. A plurality of links to content is extracted from the plurality of social communications, wherein each of the plurality of links is a link to different content. For each of the plurality of links to content, the method categorizes the linked content according to news or non-news. Moreover, a set of potential topic descriptors is determined for the linked content categorized as news. A topic descriptor is selected from the set of potential topic descriptors and the selected topic descriptor is stored with the content as a trending topic in a trending topic data store.
According to further aspects of the disclosed subject matter, a computer system for identifying trending topics on social networking sites is presented. The computer system comprises a processor and a memory wherein the processor executes instructions stored in the memory as part of or in conjunction with additional components to identify trending topics on social networking services. These additional components include, by way of illustration and not limitation: a social communication retrieval component to obtain social communications from one or more social networking sites; a link extraction component configured to extract a plurality of links to content from the social communications; a news evaluation component configured to filter the plurality of links to content according to whether or not the linked content can be categorized as news; and a topic descriptor generator that, for each link to content that can be categorized as news, generates a topic descriptor for each of the items of content and stores the generated topic descriptor with the corresponding link in a trending topic data store as a trending topic.
The foregoing aspects and many of the attendant advantages of the disclosed subject matter will become more readily appreciated as they are better understood by reference to the following description when taken in conjunction with the following drawings, wherein:
For purposes of clarity, the use of the term “exemplary” in this document should be interpreted as serving as an illustration or example of something, and it should not be interpreted as an ideal and/or a leading illustration of that thing. Additionally, a social networking site (also called a social networking service) refers to an online platform/service in which a computer user can build and interact with social networks or social relations with other people and groups for various purposes, including by way of illustration and not limitation, shared interests, activities, backgrounds, or real-life connections. Typically, social networking sites allow computer users to share ideas, pictures, posts, activities, events, and interests with people and groups in their social network.
The term “social communication” should be interpreted as a general term for content that a computer user shares on a social networking site. By way of illustration and not limitation, this content may include ideas, comments, views, pictures, videos, activities, events, and/or interests. Often, though not exclusively, social communications may include links (sometimes referred to as hypertext links) that reference content outside of the social communication. While each social networking site (or service) may employ its own set of terminology, including the terminology given to the act of sharing a social communication (e.g., “tweet,” “post,” and the like), the active term “post” will be used in this application to indicate the act of sharing a social communication on a social networking site. In other words, a computer user will “post” a social communication on a social networking site.
Turning to
The trending topic service 110 is configured to identify trending topics of news content according to social communications from one or more social networking sites, such as the social networking sites 114-116. According to various embodiments, the trending topic service 110 may be implemented as part of an online search engine that responds to search queries from computer users. In another non-limiting embodiment, the trending topic service 110 may be implemented as part of a social networking site. Irrespective of the exact configuration of the trending topic service 110 (i.e., whether it is part of a search engine, a social networking site, an online news site, another online service, or implemented as an independent service), the trending topic service obtains social communications from one or more social networking sites and identifies trending topics of news content.
As mentioned above, a social networking site, such as social networking sites 114-116, corresponds to an online platform/service in which a computer user can build and interact with social networks or social relations with other people and groups for various purposes, including by way of illustration and not limitation, shared interests, activities, backgrounds, or real-life connections. Examples of social networking sites include, by way of illustration and not limitation, Facebook, Twitter, FourSquare, LinkedIn, Google+, and the like. Often, the social communications posted on these various social networking sites are made in regard to events and/or developments that are considered “newsworthy,” i.e., information and/or commentary describing events, topics, developments and the like. Additionally, the social communications may be related to and include links to established news content, such as news content hosted by news site 112.
According to aspects of the disclosed subject matter, a suitably configured trending topic service 110 obtains the social communications from one or more social networking sites, such as social networking sites 114 and 116, to identify trending news topics from social networking sites. Identifying trending topics from social networking sites 114-116 according to social communications is described in greater detail with regard to
With reference to both
As indicated in
As indicated at
Turning to
At block 408, key terms and phrases are extracted to a memory from the linked content of the current link. According to one embodiment, the key terms and phrases of the linked content are identified according to a natural language processor and/or a lexical analysis tool. Both natural language processing and lexical analysis to extract key terms and phrases are known in the art. After extracting the key terms and phrases from the linked content, at block 410, the natural language processor and/or the lexical analysis tool constructs one or more potential topic descriptors according to combinations of the various key terms and phrases.
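Blocks 408 and 410 can be illustrated with a minimal sketch. It assumes a simple frequency count stands in for the natural language processor or lexical analysis tool, which the description does not specify. The names `STOP_WORDS`, `extract_key_terms`, and `build_candidate_descriptors` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter
from itertools import permutations

# Hypothetical stop-word list; a real lexical analysis tool would use a fuller lexicon.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is",
              "for", "at", "by", "with", "as"}


def extract_key_terms(text, max_terms=3):
    """Return the most frequent non-stop-word tokens, most frequent first."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(max_terms)]


def build_candidate_descriptors(key_terms, max_len=2):
    """Combine key terms into ordered phrases of 1..max_len terms."""
    candidates = []
    for n in range(1, max_len + 1):
        for combo in permutations(key_terms, n):
            candidates.append(" ".join(combo))
    return candidates


if __name__ == "__main__":
    content = "Storm hits coast. Storm damage on the coast grows as storm moves."
    terms = extract_key_terms(content, max_terms=2)
    print(terms)
    print(build_candidate_descriptors(terms))
```

Term frequency is only one of many possible signals; it is used here because it keeps the sketch short and deterministic.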
At block 412, the potential topic descriptors derived from the search logs 216 and the potential topic descriptors derived from the key terms and phrases are combined. At block 414, the combined set of potential topic descriptors is scored according to various heuristics including, but not limited to, an estimation as to how informative a potential topic descriptor is; the number of times that the potential topic descriptor was submitted as a search query to a search engine; how lexically well formed the potential topic descriptor is; and the like. The goal is to score the potential topic descriptors such that an optimal topic descriptor can be identified. Thus, after scoring each of the potential topic descriptors, at block 416 the optimal (i.e., highest scoring) topic descriptor is selected as the generated topic descriptor for the current link. The selected topic descriptor may then be associated with the content as a trending topic or, alternatively, may simply be stored as a trending topic (without the content). Thereafter, routine 400 terminates.
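The combining, scoring, and selection of blocks 412 through 416 might be sketched as follows. The three heuristic functions are stand-in proxies assumed for illustration (distinct word count for informativeness, a logarithm of query count for search popularity, and a function-word check for lexical form). The description names the heuristics but not their formulas, and the weights and the `FUNCTION_WORDS` set are likewise hypothetical.

```python
import math

# Hypothetical function-word list used by the well-formedness proxy.
FUNCTION_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "for"}


def informativeness(descriptor):
    """Proxy: more distinct words (capped at 4) counts as more informative."""
    return min(len(set(descriptor.lower().split())), 4) / 4


def well_formed(descriptor):
    """Proxy: a phrase should not begin or end with a function word."""
    words = descriptor.lower().split()
    if not words:
        return 0.0
    return 0.0 if words[0] in FUNCTION_WORDS or words[-1] in FUNCTION_WORDS else 1.0


def score(descriptor, query_counts, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three heuristic proxies (weights are assumptions)."""
    w_info, w_query, w_form = weights
    popularity = math.log1p(query_counts.get(descriptor, 0))
    return (w_info * informativeness(descriptor)
            + w_query * popularity
            + w_form * well_formed(descriptor))


def select_topic_descriptor(from_search_logs, from_key_terms, query_counts):
    """Combine both candidate sets, score each, and return the highest scoring."""
    combined = list(dict.fromkeys(from_search_logs + from_key_terms))  # keeps order, drops repeats
    if not combined:
        return None
    return max(combined, key=lambda d: score(d, query_counts))


if __name__ == "__main__":
    counts = {"coast storm damage": 120}
    print(select_topic_descriptor(["coast storm damage"], ["storm", "storm of"], counts))
```

A weighted sum is one simple way to merge several heuristics into a single score; any monotone combination would serve the same selection step.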
Returning to routine 300 of
As will be appreciated, topics often enjoy only a short period of popularity, especially in regard to trending/popular topics on social networking sites. Accordingly, it may be advantageous to periodically update the set of trending topics.
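One possible way to keep the set of trending topics current is to drop entries older than some maximum age on each periodic update. The sketch below assumes a trending topic data store modeled as a dictionary keyed by link, with a hypothetical `identified_at` timestamp on each entry; the 12-hour default and the function name `purge_old_topics` are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def purge_old_topics(store, now, max_age=timedelta(hours=12)):
    """Return a new store holding only entries identified within max_age of now."""
    cutoff = now - max_age
    return {link: entry for link, entry in store.items()
            if entry["identified_at"] >= cutoff}


if __name__ == "__main__":
    now = datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc)
    store = {
        "http://news.example.com/a": {"descriptor": "coast storm damage",
                                      "identified_at": now - timedelta(hours=1)},
        "http://news.example.com/b": {"descriptor": "old story",
                                      "identified_at": now - timedelta(hours=30)},
    }
    print(list(purge_old_topics(store, now)))
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the purge step easy to test and to run on a schedule.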
As indicated above, the trending topics generated by the trending topic service 110 may be utilized by any number of network sites. For example, a search engine, such as search engine 220, may indicate trending topics of social networking sites on one or more of its web pages.
Regarding routines 300, 400 and 500, while these routines are expressed in regard to discrete steps, these steps should be viewed as being logical in nature and may or may not correspond to any actual and/or discrete steps of a particular implementation. Nor should the order in which these steps are presented in the various routines be construed as the only order in which the steps may be carried out. Moreover, while these routines include various novel features of the disclosed subject matter, other steps (not listed) may also be carried out in the execution of the routines. Further, those skilled in the art will appreciate that logical steps of these routines may be combined together or be comprised of multiple steps. Steps of routines 300, 400 and 500 may be carried out in parallel or in series. For example, routine 400 is illustrated as having parallel paths but this is just an example of one embodiment and should not be construed as the only arrangement of the routine. Often, but not exclusively, the functionality of the various routines is embodied in software (e.g., applications, system services, libraries, and the like) that is executed on computer hardware and/or systems as described below in regard to
While many novel aspects of the disclosed subject matter are expressed in routines embodied in applications (also referred to as computer programs), apps (small, generally single or narrow purposed, applications), and/or methods, these aspects may also be embodied as computer-executable instructions stored by computer-readable media, also referred to as computer-readable storage media. As those skilled in the art will recognize, computer-readable media can host computer-executable instructions for later retrieval and execution. When the computer-executable instructions stored on the computer-readable storage devices are executed, they carry out various steps, methods and/or functionality, including those steps, methods, and routines described above in regard to routines 300, 400 and 500. Examples of computer-readable media include, but are not limited to: optical storage media such as Blu-ray discs, digital video discs (DVDs), compact discs (CDs), optical disc cartridges, and the like; magnetic storage media including hard disk drives, floppy disks, magnetic tape, and the like; memory storage devices such as random access memory (RAM), read-only memory (ROM), memory cards, thumb drives, and the like; cloud storage (i.e., an online storage service); and the like. For purposes of this disclosure, however, computer-readable media expressly excludes carrier waves and propagated signals.
Turning now to
The processor 702 executes instructions retrieved from the memory 704 in carrying out various functions, particularly in regard to identifying trending topics on social networking sites according to social communications. The processor 702 may be comprised of any of various commercially available processors such as single-processor, multi-processor, single-core units, and multi-core units. Moreover, those skilled in the art will appreciate that the novel aspects of the disclosed subject matter may be practiced with other computer system configurations, including but not limited to: mini-computers; mainframe computers; personal computers (e.g., desktop computers, laptop computers, tablet computers, etc.); handheld computing devices such as smartphones, personal digital assistants, and the like; microprocessor-based or programmable consumer electronics; game consoles; and the like.
The system bus 710 provides an interface for the various components to inter-communicate. The system bus 710 can be of any of several types of bus structures that can interconnect the various components (including both internal and external components). The trending topic service 700 further includes a network communication component 712 for interconnecting the network site with other computers (including, but not limited to, user computers such as user computers 102-106, other network sites including social networking sites 114-116, news site 112, and one or more search engines (not shown)) as well as other devices on a computer network 108. The network communication component 712 may be configured to communicate with an external network, such as network 108, via a wired connection, a wireless connection, or both.
The trending topic service 700 also includes a social communication retrieval component 720. The social communication retrieval component 720 obtains social communications from one or more social networking sites, such as social networking sites 114 and 116. As discussed, according to various embodiments the social communication retrieval component 720 obtains social communications from one or more social networking sites corresponding to a predetermined prior period of time, such as (by way of illustration and not limitation) the past 12 hours, the past hour, the past day, and the like. According to still further embodiments of the disclosed subject matter, the social communication retrieval component 720 may be configured to obtain social communications from only those social networking sites to which a computer user has subscribed. Moreover, in yet a further embodiment, the social communication retrieval component 720 may be configured to obtain social communications from only those social networking sites to which a computer user has subscribed and social communications from members of the computer user's social networks.
The trending topic service 700 also includes a link extraction component 714, a news evaluation component 716, a topic descriptor generator 716, and a trending topic data store 208. The link extraction component 714 scans the obtained social communications and extracts to a memory links to external content (typically in the form of universal resource locators (URLs) or universal resource identifiers (URIs)) within the social communications. In extracting links from the social communications, the link extraction component 714 is also configured to delete duplicates. The news evaluation component 716 filters the links extracted by the link extraction component according to whether or not the linked content can be categorized as news content, retaining those links that can be categorized as news content. The topic descriptor generator 716 generates a topic descriptor for the linked content (as discussed above in regard to routine 400 of
While not shown, the trending topic service 700 may also include a timing component such that the trending topic service 700 periodically determines trending topics from social networking sites, as well as a component for purging the trending topic data store 208 of old trending topic pairs.
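The behavior of the link extraction component 714 (extracting links and deleting duplicates) and the news evaluation component 716 (retaining only links categorized as news) described above can be sketched as follows. The regular expression, the `NEWS_HOSTS` allow-list, and the function names are hypothetical; in particular, checking a link's host against a fixed list is only an assumed stand-in for whatever categorization of linked content an implementation actually uses.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http/https links; an assumption, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

# Hypothetical allow-list of hosts treated as news sources.
NEWS_HOSTS = {"news.example.com", "www.example-news.org"}


def extract_links(communications):
    """Extract links from the posts, dropping duplicates and keeping first-seen order."""
    seen = {}
    for text in communications:
        for match in URL_PATTERN.findall(text):
            url = match.rstrip(".,;:!?)")  # trim trailing sentence punctuation
            seen.setdefault(url, None)
    return list(seen)


def is_news(url, news_hosts=NEWS_HOSTS):
    """Stand-in news check: the link's host appears in the allow-list."""
    host = urlparse(url).hostname or ""
    return host in news_hosts


def news_links(communications):
    """Links from the posts that the stand-in check categorizes as news."""
    return [url for url in extract_links(communications) if is_news(url)]


if __name__ == "__main__":
    posts = [
        "Wow: http://news.example.com/storm.",
        "Same story http://news.example.com/storm and http://photos.example.net/cat",
    ]
    print(extract_links(posts))
    print(news_links(posts))
```

Each link that survives this filter would then be handed to the topic descriptor generator, one link at a time.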
While various novel aspects of the disclosed subject matter have been described, it should be appreciated that these aspects are exemplary and should not be construed as limiting. Variations and alterations to the various aspects may be made without departing from the scope of the disclosed subject matter.
Number | Date | Country | |
---|---|---|---|
20140358885 A1 | Dec 2014 | US |