More and more people are posting content on the Internet via social networking sites. Unfortunately, once a person posts content on a social networking site, that content is often difficult to locate, especially after a period of time and after that person posts other content. Content is especially difficult to locate if a person posts it in the domain of another person, e.g., posting a comment on a friend's “wall”, tagging an image posted by another person to identify one or more persons in the image, etc. Indeed, search services, including search services of social networking sites and general search engines, are unable to respond to a query for content posted on a social networking site. For example, a search engine cannot respond with good results to a search query such as “image of my friend Steve at Lake of the Angels,” even when the person submitting the search query knows that there are one or more images of his/her friend, “Steve,” while at the “Lake of the Angels,” posted on at least one social networking site.
The following Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. The Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
A method directed to aggregating social content from social networking sites, and making the social content available for searching is described. Social content, corresponding to multiple computer users, is obtained from a plurality of social networking sites. The social content is stored in a content store, making it available for searching. In response to receiving a search query (directed to social content), a set of search results is identified, the search results including at least one item of social content obtained from a social networking site. The social content in the search results is filtered according to privacy constraints. A presentation of the filtered search results is generated and provided to the requesting computer user.
A computer system for responding with search results to a search query is also described. The computer system includes a processor and a memory, wherein the processor executes instructions stored in the memory as part of or in conjunction with additional components to respond to a search query. Additionally, the computer system includes a key extraction component that extracts key terms from each instance of social content and stores the social content in a content store, indexed according to the key terms. A query parsing component identifies a plurality of key terms for identifying social content in the content store. A content retrieval component identifies a set of search results from the content store, wherein the set of search results includes at least one instance of social content. An access filter component filters the set of search results according to access privileges of the requesting computer user with regard to the at least one instance of social content. A presentation generation component generates a presentation of the filtered set of search results and returns the presentation to the requesting computer user.
The foregoing aspects and many of the attendant advantages of the disclosed subject matter will become more readily appreciated as they are better understood by reference to the following description when taken in conjunction with the following drawings, wherein:
For purposes of clarity, the use of the term “exemplary” in this document should be interpreted as serving as an illustration or example of something, and it should not be interpreted as an ideal and/or a leading illustration of that thing. Additionally, the term “social content” refers to content that is uploaded to a network site such that the content may be viewed by one or more other persons/computer users. While much of the description of the disclosed subject matter is made with regard to images, images are just an example of social content. Social content may include, by way of illustration and not limitation: images, videos, hyperlinks, textual posts, comments, audio files, check-in data, and likes. In obtaining social content from a social networking site, both the social content (i.e., the content that was posted to the social networking site) and content data (i.e., information that relates to the posted content) are obtained, though the two parts may not necessarily be obtained at the same time. A “social networking site” refers to a network location where persons/computer users are able to upload social content such that it may be viewed by one or more other persons/computer users.
Turning now to the figures,
Those skilled in the art will appreciate that, generally speaking, a search engine 110 corresponds to an online service hosted on one or more computers, or computing systems, located and/or distributed throughout the network 108. The search engine 110 receives and responds to search queries submitted over the network 108 from various computer users, such as computer user 101 using user computer 102. In particular, responsive to receiving a search query from a computer user, the search engine 110 obtains search results information related and/or relevant to the received search query (as defined by one or more query terms of the search query). The search results information includes search results, i.e., references (typically in the form of hyperlinks) to relevant content available at various network locations located throughout the network 108, such as network sites 112-116. The content sites may include (by way of illustration and not limitation): social networking sites, such as social networking sites 112 and 116; online shopping sites, such as online shopping site 114; news outlets (not shown); educational and research sites (not shown); and the like.
It should be appreciated that while much of the following discussion will be made with regard to a search engine 110 that is separate from a social networking site, such as social networking sites 112 and 116, this should not be construed as limiting upon the disclosed subject matter. Indeed, aspects of the disclosed subject matter may be suitably implemented in the search services of the various social networking sites, such as social networking sites 112 and 116. Moreover, social content may be obtained from any number of sources, not just social networking sites. For example, a computer user may post a comment/review with regard to a product available on a shopping site, such as shopping site 114. This product review may be obtained by a search engine 110 and made available for searching in accordance with aspects of the disclosed subject matter.
To better appreciate how the search engine 110 aggregates social content from social networking sites, reference is made to
As shown in
While obtaining the image 204 or a link to the image at the social networking site is important for the search engine 110 (so that it can serve up the image in response to a search query), obtaining related image data 206 is also important. According to aspects of the disclosed subject matter, the search engine uses the related image data 206 to identify keys and other aspects or factors upon which the image can be indexed in a content store. Generally speaking, the search engine 110 uses numerous aspects of the related image data 206 to identify keys/criteria/factors that help identify the particular image. These aspects include (by way of illustration and not limitation): the title of the album in which the obtained image is found; album comments (from both the image poster and other persons allowed to comment on the album at the social networking site); tagged entities in the image, including people and/or places; image titles; image comments; location information regarding the geographic location of the subject of the image; the date and/or time that the image was taken; the number of “likes” that the image has received; popularity of the item, e.g., the number of times that the image has been viewed; the number of times that the image has been shared (reposted) by others; and the date the image was uploaded to the social networking site. Also included in the related image data 206 is an Uploader ID (identifier), the Uploader ID being an identifier corresponding to the person/computer user that uploaded the image to the social networking site. As will be discussed below, the Uploader ID plays an important function in determining whether a requesting computer user has sufficient privileges to access a given image.
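By way of illustration only, the related image data 206 enumerated above might be carried in a record such as the following. This is a minimal sketch: the class names, field names, and the example URL are all hypothetical and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class RelatedImageData:
    """Hypothetical record for the related image data 206 (field names are illustrative)."""
    uploader_id: str                       # Uploader ID of the person who posted the image
    site: str                              # social networking site the image was obtained from
    album_title: str = ""
    album_comments: List[str] = field(default_factory=list)
    tagged_entities: List[str] = field(default_factory=list)  # people and/or places
    image_title: str = ""
    image_comments: List[str] = field(default_factory=list)
    location: Optional[str] = None         # geographic location of the subject
    taken_at: Optional[datetime] = None    # date/time the image was taken
    uploaded_at: Optional[datetime] = None # date the image was uploaded
    likes: int = 0
    views: int = 0                         # popularity, e.g., number of views
    shares: int = 0                        # number of reposts by others


@dataclass
class ImageContent:
    """Hypothetical pairing of the image 204 (or a link to it) with its data 206."""
    image_ref: str
    data: RelatedImageData


# Example instance with made-up values.
example = ImageContent(
    image_ref="https://social.example/img/1",
    data=RelatedImageData(
        uploader_id="u-42",
        site="social.example",
        tagged_entities=["Steve", "Lake of the Angels"],
    ),
)
```

Keeping the Uploader ID and the originating site as required fields reflects their later use in access filtering; every other field is optional because a given site may not supply it.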
Upon receiving the image content 202, the search engine 110 examines the related image data 206 and identifies and/or determines criteria, referred to as “key terms,” upon which the image may be indexed in a content store for efficient retrieval. According to various embodiments of the disclosed subject matter, the search engine 110 extracts the key terms from the image content 202 via a key extraction component 208. Once the key terms are identified, the image content 202, comprising the image 204 and related image data 206, is stored in the content store 210, indexed according to the identified key terms.
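The extraction and indexing just described can be sketched as follows. This is an illustrative approximation, not the key extraction component 208 itself: it assumes key terms are simply lowercase words drawn from a few textual fields, and the stop-word list, dictionary keys, and `ContentStore` class are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Assumed stop-word list; purely illustrative.
STOP_WORDS = {"a", "an", "and", "at", "in", "my", "of", "on", "the"}


def extract_key_terms(image_data: Dict[str, object]) -> Set[str]:
    """Collect lowercase word tokens from the textual fields of the image data."""
    texts: List[str] = []
    for name in ("album_title", "image_title", "location"):
        value = image_data.get(name)
        if isinstance(value, str):
            texts.append(value)
    for name in ("tags", "comments"):
        texts.extend(image_data.get(name, []))
    tokens = re.findall(r"[a-z0-9]+", " ".join(texts).lower())
    return {t for t in tokens if t not in STOP_WORDS}


class ContentStore:
    """Toy in-memory store: items by id plus an inverted index from key term to ids."""

    def __init__(self) -> None:
        self.items: Dict[str, Dict[str, object]] = {}
        self.index: Dict[str, Set[str]] = defaultdict(set)

    def add(self, content_id: str, image_content: Dict[str, object], key_terms: Set[str]) -> None:
        self.items[content_id] = image_content
        for term in key_terms:
            self.index[term].add(content_id)

    def lookup(self, term: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self.index.get(term.lower(), set()))


store = ContentStore()
image_content = {
    "image_ref": "https://social.example/img/1",
    "image_data": {
        "uploader_id": "u-42",
        "image_title": "Steve at the lake",
        "tags": ["Steve", "Lake of the Angels"],
    },
}
terms = extract_key_terms(image_content["image_data"])
store.add("img-1", image_content, terms)
```

An inverted index is only one possible way to index content "according to the identified key terms"; the disclosure does not prescribe a particular index structure.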
Turning now to
At block 304, another iteration loop is begun in which the process iterates through each social networking site associated with the current computer user. In this manner, the process is able to obtain image data associated with the current computer user from each of the social networking sites with which the computer user is associated, and where image data regarding the current computer user is found. As with the prior iteration loop, the term “current social networking site” refers to the social networking site currently being processed in this iteration loop.
At block 306, the process obtains image content 202 corresponding to the current user from the current social networking site. As mentioned above in regard to
At block 308, yet another iteration loop is begun in which the process iterates through each of the images obtained for the current computer user from the current social networking site. As with the other iteration loops, the image that is currently being processed in the iteration loop is referred to as the “current image.” At block 310, key terms are extracted from the image data 206 associated with the current image. At block 312, the image content 202 (including the image 204 or a link to the image, and the image data 206) is added to the content store 210 and indexed according to the extracted key terms. At block 314, if there are any additional images to be processed, the routine 300 returns to block 308 where the next image is selected as the current image for processing, and the loop repeats for the current image. However, if there are no more images for the current user from the current social networking site to be processed, the routine proceeds to block 316.
At block 316, if there are any additional social networking sites associated with the current computer user, the routine 300 returns to block 304 where the next social networking site is selected as the current social networking site for processing and the iteration loop repeats. However, if there are no more social networking sites corresponding to the current user from which image content 202 may be obtained, the routine proceeds to block 318.
At block 318, if there are any additional computer users to be processed, the routine 300 returns to block 302 where the next computer user is selected as the current computer user and the loop repeats as described above. However, if there are no more computer users for processing, the routine 300 terminates.
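The three nested iteration loops of routine 300 can be summarized in a short sketch. The function and parameter names below are hypothetical; in particular, `sites_for_user` and `fetch_images` stand in for whatever mechanism an implementation uses to enumerate a user's social networking sites and to obtain image content from them, and the index is a plain dictionary rather than the content store 210.

```python
from typing import Callable, Dict, Iterable, List, Set

ImageContent = Dict[str, object]


def aggregate_image_content(
    users: Iterable[str],
    sites_for_user: Callable[[str], Iterable[str]],
    fetch_images: Callable[[str, str], Iterable[ImageContent]],
    extract_key_terms: Callable[[ImageContent], Set[str]],
    index: Dict[str, List[ImageContent]],
) -> int:
    """Walk users, then sites, then images; index each image. Returns images stored."""
    stored = 0
    for user in users:                                      # blocks 302 / 318
        for site in sites_for_user(user):                   # blocks 304 / 316
            for image_content in fetch_images(user, site):  # blocks 306, 308 / 314
                key_terms = extract_key_terms(image_content)  # block 310
                for term in key_terms:                        # block 312
                    index.setdefault(term, []).append(image_content)
                stored += 1
    return stored


# Stub data standing in for real social networking sites.
sites = {"alice": ["site-a", "site-b"], "bob": ["site-a"]}
photos = {
    ("alice", "site-a"): [{"id": "a1", "tags": ["steve", "lake"]}],
    ("alice", "site-b"): [],
    ("bob", "site-a"): [{"id": "b1", "tags": ["lake"]}],
}
index: Dict[str, List[ImageContent]] = {}
count = aggregate_image_content(
    users=sites,
    sites_for_user=lambda u: sites[u],
    fetch_images=lambda u, s: photos[(u, s)],
    extract_key_terms=lambda c: set(c["tags"]),
    index=index,
)
```

Passing the site and fetch operations in as callables keeps the loop structure independent of any particular site's interface, which the disclosure leaves unspecified.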
Those skilled in the art should appreciate that the process described in routine 300 is only one example of a routine suitable for processing image content (or, of a more general nature, social content). Indeed, those skilled in the art will appreciate that obtaining image content and then processing it may be performed in any number of manners. Significantly, image content 202 (or, more generally, social content) corresponding to a computer user is obtained from a plurality of social networking sites, the image content includes a source (Uploader ID), key terms are extracted from the image data, and the image content is stored in the content store 210 and indexed according to one or more of the extracted key terms.
While routine 300 describes a process for aggregating image content (or, more generally, social content) into a content store 210,
At block 404, the routine 400 parses the search query to identify the query topic (or topics) and conditions of the search query. With reference to the example above, in parsing the search query the routine 400 identifies “my friend Steve” as the query topic, as well as identifying sub-topics and/or conditions such as “image” and “Lake of the Angels.” Of course, in this example the search engine 110 may consult with one or more networking sites associated with the requesting computer user to accurately identify “my friend Steve.” Thus, at block 406, a set of images is identified from the content store 210 that are relevant to the search query, both the query topic as well as the conditions. These results are available for searching since the images have been stored and indexed in the content store 210.
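A deliberately simplified sketch of blocks 404 and 406 follows. It assumes that a query can be reduced to a content type plus a list of key terms by dropping stop words, and that relevant content is the intersection of the index entries for those terms. It does not attempt to resolve “my friend Steve” against the requesting user's social connections, which the routine may do by consulting a networking site. All names and word lists are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Assumed vocabularies; illustrative only.
CONTENT_TYPE_WORDS = {"image": "image", "images": "image", "video": "video", "videos": "video"}
STOP_WORDS = {"a", "an", "at", "friend", "my", "of", "the"}


def parse_query(query: str) -> Tuple[Optional[str], List[str]]:
    """Split a query into an optional content type and an ordered list of key terms."""
    content_type: Optional[str] = None
    key_terms: List[str] = []
    for token in re.findall(r"[a-z0-9]+", query.lower()):
        if token in CONTENT_TYPE_WORDS:
            content_type = CONTENT_TYPE_WORDS[token]
        elif token not in STOP_WORDS and token not in key_terms:
            key_terms.append(token)
    return content_type, key_terms


def retrieve(index: Dict[str, Set[str]], key_terms: List[str]) -> Set[str]:
    """Return ids of content indexed under every key term (set intersection)."""
    if not key_terms:
        return set()
    result = set(index.get(key_terms[0], set()))
    for term in key_terms[1:]:
        result &= index.get(term, set())
    return result


# Toy index mapping key terms to content ids.
toy_index = {
    "steve": {"img-1", "img-2"},
    "lake": {"img-1", "img-3"},
    "angels": {"img-1"},
}
content_type, key_terms = parse_query("image of my friend Steve at Lake of the Angels")
matches = retrieve(toy_index, key_terms)
```

Requiring every key term to match is a strict choice made here for brevity; an implementation could equally rank partial matches.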
As those skilled in the art will appreciate, the images (or, more generally, social content) that are posted on social networking sites are often posted by people that have various access restrictions in place as to who is able to view the posts and who is not. As a result, the images in the initial set of results may or may not be accessible to the requesting computer user, depending on the access restrictions of the person that posted the content in the first place. Thus, at block 408, the initial set of search results/images is filtered according to privacy constraints.
Beginning at block 502, an iteration loop is begun to iterate through each of the images in the initial set of resulting images, with each image being processed in the iteration loop being referred to as the “current image.” Since each social networking site may have its own privacy policies and each user may establish different privacy settings for access to content that he/she posts on the social networking site, at block 504 the social networking site associated with the current image is identified from the image data 206 that was stored as part of the image content 202 in the content store 210. Additionally, the Uploader ID (identifier for the user that uploaded the image to the social networking site) is also identified from the image content 202 in the content store 210.
At decision block 508, the identified social networking site is queried as to whether the requesting computer user (typically identified by a user identifier) has permission to access/view the current image (based on the Uploader ID). If the response is that the requesting computer user has permission to access the image, the routine 500 proceeds to block 512 where the process determines whether there are any additional images to examine in the process. However, if at decision block 508 it is determined that the requesting computer user does not have sufficient privileges to access the current image, the routine 500 proceeds to block 510. At block 510, the current image is removed from the set of images selected as potential search results for the search query.
At block 512, if there are additional images in the initial set to validate, the routine 500 returns to block 502 where the next image in the set of images is selected as the current image and the process described above repeats. Alternatively, if there are no additional images to validate (i.e., check whether the requesting computer user has sufficient privilege to access/view), the routine 500 has completed filtering the initial set of images and, correspondingly, terminates.
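The filtering of routine 500 can be sketched as a single pass over the initial results. The `has_permission` callback is a hypothetical stand-in for the query sent to the identified social networking site; no real site interface is implied, and the dictionary keys are illustrative. The sketch builds a new list of accessible images rather than removing entries in place, which yields the same filtered set.

```python
from typing import Callable, Dict, List

ImageRecord = Dict[str, str]

# Hypothetical signature: (site, requesting_user_id, uploader_id, image_id) -> bool
PermissionCheck = Callable[[str, str, str, str], bool]


def filter_by_access(
    initial_results: List[ImageRecord],
    requesting_user_id: str,
    has_permission: PermissionCheck,
) -> List[ImageRecord]:
    """Keep only images the requesting user is permitted to view."""
    accessible: List[ImageRecord] = []
    for image in initial_results:                 # blocks 502 / 512
        site = image["site"]                      # block 504: site from the stored image data
        uploader_id = image["uploader_id"]        # Uploader ID from the stored image content
        if has_permission(site, requesting_user_id, uploader_id, image["id"]):  # block 508
            accessible.append(image)
        # Otherwise the image is left out of the results (block 510).
    return accessible


# Stub permission model: uploaders share only with themselves and listed friends.
friends_of = {"u-steve": {"u-me"}, "u-stranger": set()}


def fake_site_check(site: str, requester: str, uploader: str, image_id: str) -> bool:
    return requester == uploader or requester in friends_of.get(uploader, set())


results = [
    {"id": "img-1", "site": "site-a", "uploader_id": "u-steve"},
    {"id": "img-2", "site": "site-b", "uploader_id": "u-stranger"},
]
visible = filter_by_access(results, "u-me", fake_site_check)
```

Because the check is made at query time against the originating site, a change in the uploader's privacy settings takes effect on the next search without re-indexing.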
Returning again to
Regarding routines 300, 400 and 500, while these routines are expressed in regard to discrete steps, these steps should be viewed as being logical in nature and may or may not correspond to any actual and/or discrete steps of a particular implementation. Nor should the order in which these steps are presented in the various routines be construed as the only order in which the steps may be carried out. Moreover, while these routines include various novel features of the disclosed subject matter, other steps (not listed) may also be carried out in the execution of the routines. Further, those skilled in the art will appreciate that logical steps of these routines may be combined together or be comprised of multiple steps. Steps of routines 300, 400 and 500 may be carried out in parallel or in series. Often, but not exclusively, the functionality of the various routines is embodied in software (e.g., applications, system services, libraries, and the like) that is executed on computer hardware and/or systems described below in regard to
It should be appreciated that while many novel aspects of the disclosed subject matter are expressed in routines, applications (also referred to as computer programs), apps (small, generally single or narrow purposed, applications), and/or methods, these aspects may also be embodied as computer-executable instructions stored in computer-readable media, also referred to as computer-readable storage media. As those skilled in the art will recognize, computer-readable media can host computer-executable instructions for later retrieval and execution. When executed on a computing device, the computer-executable instructions stored on one or more computer-readable storage devices carry out various steps, methods and/or functionality, including those steps, methods, and routines described above in regard to routines 300, 400 and 500. Examples of computer-readable media include, but are not limited to: optical storage media such as Blu-ray discs, digital video discs (DVDs), compact discs (CDs), optical disc cartridges, and the like; magnetic storage media including hard disk drives, floppy disks, magnetic tape, and the like; memory storage devices such as random access memory (RAM), read-only memory (ROM), memory cards, thumb drives, and the like; cloud storage (i.e., an online storage service); and the like. For purposes of this disclosure, however, computer-readable media expressly excludes carrier waves and propagated signals.
Turning now to
In contrast to the pinboard view arrangement of
The processor 802 executes instructions retrieved from the memory 804 in carrying out various functions, particularly in aggregating social content from various social networking sites and making the aggregated content available for searching. The processor 802 may be comprised of any of various commercially available processors such as single-processor, multi-processor, single-core units, and multi-core units. Moreover, those skilled in the art will appreciate that the novel aspects of the disclosed subject matter may be practiced with other computer system configurations, including but not limited to: mini-computers; mainframe computers; personal computers (e.g., desktop computers, laptop computers, tablet computers, etc.); handheld computing devices such as smartphones, personal digital assistants, and the like; microprocessor-based or programmable consumer electronics; and the like.
The system bus 810 provides an interface for the search engine's components to inter-communicate. The system bus 810 can be of any of several types of bus structures that can interconnect the various components (including both internal and external components). The illustrative search engine 110 further includes a network communication component 812 for interconnecting the search engine with other computers (such as user computers 102-106 and networking sites 112-116) as well as other devices on a computer network 108. The network communication component 812 may be configured to communicate with an external network, such as network 108, via a wired connection, a wireless connection, or both.
The exemplary search engine 110 further includes a query parsing component 814. As previously suggested, the query parsing component 814 parses search queries received from computer users in order to identify a query topic (or topics), sub-topics, and conditions such that the search engine 110 is able to identify a set of search results of social content responsive to the query. These topics and conditions are used as keys by a content retrieval component 820 to identify and/or retrieve social content in the content store 210 that satisfies the search query. An access filter component 816 filters potential search results according to current access permissions before the search engine provides the identified search results to the computer user. A presentation generation component 818 generates a presentation of the identified search results for the requesting computer user, such as the pinboard or slideshow views described in
Those skilled in the art will appreciate that the various components described above, including the query parsing component 814, access filter component 816, key extraction component 208, presentation generation component 818 and others may be implemented as executable software modules within the search engine 110, as hardware modules, or a combination of the two. Moreover, each of the various components may be implemented as an independent, cooperative process or device, operating in conjunction with the search engine 110. It should be further appreciated, of course, that the various components described above in regard to the search engine 110 should be viewed as logical components for carrying out the various described functions. As those skilled in the art appreciate, logical components (or subsystems) may or may not correspond directly in a one-to-one manner to actual components. In an actual embodiment, the various components identified as being part of the search engine 110 in
While various novel aspects of the disclosed subject matter have been described, it should be appreciated that these aspects are exemplary and should not be construed as limiting. Variations and alterations to the various aspects may be made without departing from the scope of the disclosed subject matter.