This disclosure relates generally to a system and method for providing a “reputation” for a Social Media identity and for a Reputation Service (RS) available to users of Social Media sites. More particularly, but not by way of limitation, this disclosure relates to systems and methods to determine a reputation of an identity based on a plurality of conditions and, in some embodiments, across a plurality of Social Media and other types of web environments which may not fall strictly under the category of “Social Media.” Users can then use the determined reputation, for example, to filter information from an “untrustworthy” identity or to highlight information from a “trustworthy” identity.
Today Social Media is very popular and appears to be growing more popular still. Many different types of “social” interaction can take place via Internet sites. Some types of sites (e.g., Facebook®, LinkedIn® and Twitter®) are primarily concerned with sharing content of a purely social nature. (FACEBOOK is a registered trademark of Facebook, Inc., LINKEDIN is a registered trademark of LinkedIn Corp., TWITTER is a registered trademark of Twitter, Inc.) Other types of sites have been used and continue to be used to share a combination of social and business-relevant information. For example, professional web logs (blogs) and forum sites allow individuals to collaborate on discussion topics and share information and content directed toward a particular topic for an interested Internet community. A third type of “social” interaction on the web takes place when a buyer and seller make a transaction on sites such as eBay®, Craigslist®, Amazon.com®, etc. And still other types of “social” interaction take place on dating sites (e.g., Match.com®, eHarmony.com®, etc.), ancestry sites (Ancestry.com®, MyHeritage.com, etc.), and reunion sites, to name a few. (eBay is a registered trademark of eBay Inc., Craigslist is a registered trademark of craigslist, Inc., Amazon.com is a registered trademark of Amazon.com, Inc., Match.com is a registered trademark of Match.com, LLC., eHarmony.com is a registered trademark of eHarmony.com Corp., and Ancestry.com is a registered trademark of Ancestry.com Operations Inc.)
In each of these types of social environments on the web, it may be possible for a user to become an “untrustworthy” participant and perhaps propagate inappropriate, malicious, factually inaccurate, or electronically hazardous materials (e.g., malware) to other interested users. “Inappropriate content” includes, but is not limited to: inaccurate content; malicious content; illegal content; or annoyance content, etc. Even a “trustworthy” participant can sometimes provide content that may be considered “inappropriate content”; however, the percentage of time that this happens should be low. Additionally, there are numerous examples where electronic messages (e.g., tweets, short messages, emails, etc.) purporting to be from celebrities or politicians have been faked, resulting in an inappropriate post.
If a social environment becomes overly populated with “untrustworthy” content, the popularity of that environment will diminish or die. Prior art solutions to limit bad content are typically directed to areas other than social media, such as “email filters” that look for malicious content (e.g., viruses, malware, Trojans, spyware, etc.) or for spam-like content (e.g., advertisements, chain e-mails, etc.), and do not address social media interaction. Generally, when a user is deemed “untrustworthy” or “trustworthy,” that user has formed a “reputation.”
To address these and other problems users encounter with social media content, systems and methods are disclosed to provide a Reputation Service (RS) which can determine a score for individual posts and an aggregate score for the identities providing those posts. Given this score, other users and user devices can receive an indication of an “untrustworthy” post or an “untrustworthy” user. Actions that devices can take based on such indications, and other improvements for providing Reputation Services for a Social Media identity, are described in the Detailed Description section below.
Various embodiments, described in more detail below, provide a technique for determining a reputation for a Social Media identity and for providing a Reputation Service (RS) to provide reputation information to subscribers of the service. The implementation could utilize a “cloud” of resources for centralized analysis. Individual users and systems interacting with the cloud need not be concerned with the internal structure of resources in the cloud and can participate in a coordinated manner to ascertain potential “untrustworthy” and “trustworthy” users on the Internet in Social Media sites and other web environments. For simplicity and clarity of disclosure, embodiments are disclosed primarily for a tweet message. However, a user's interaction with other Social Media environments (such as Facebook, LinkedIn, etc.) and web commerce communities (such as eBay, Amazon, etc.) could similarly be measured and provide input to a reputation determination. In each of these illustrative cases, users can be protected from, or informed about, users who may be untrustworthy. Alternatively, “trustworthy” users can benefit from a good reputation earned over time based on their reliable interaction in their Internet activities.
Also, this detailed description will present information to enable one of ordinary skill in the art of web and computer technology to understand the disclosed methods and systems for determining a reputation and implementing an RS for identities on Social Media and other web communities. As explained above, computer users post many types of items to the Internet. Posts can include links to songs, movies, videos, and software, among other things. Other users in turn can initiate a download of posted content in a variety of ways. For example, a user could “click” on a link provided in a message (e.g., a tweet or blog entry). Also, the content of a post could be deemed inappropriate, as explained above, because the content may be considered spam-like or reference (via link) malicious or illegal downloads. To address these and other cases, systems and methods are described here that could inform the user of a “quality” score for the post based on the post itself and a determined score for the identity making the post.
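The "quality" score described above blends two inputs: a score for the post's own content and the aggregate score of the posting identity. The sketch below illustrates one way such a blend could work; the function names, weights, and thresholds are illustrative assumptions and not part of the disclosed system.

```python
# Minimal sketch of combining a per-post content score with the posting
# identity's aggregate reputation score. All names, weights, and thresholds
# are illustrative assumptions.

def post_quality_score(content_score, identity_score, content_weight=0.6):
    """Blend a post's content score with its author's reputation score.

    Both inputs are assumed to be normalized to [0.0, 1.0], where 0.0 is
    untrustworthy and 1.0 is trustworthy.
    """
    if not (0.0 <= content_score <= 1.0 and 0.0 <= identity_score <= 1.0):
        raise ValueError("scores must be normalized to [0, 1]")
    return content_weight * content_score + (1 - content_weight) * identity_score


def classify_post(quality, block_below=0.2, warn_below=0.5):
    """Map a blended quality score onto a coarse action for a client device."""
    if quality < block_below:
        return "block"      # e.g., strip the post before display
    if quality < warn_below:
        return "warn"       # e.g., show a visual indicator or pop-up
    return "allow"
```

A filtering client could, for instance, call `classify_post(post_quality_score(0.1, 0.3))` and strip the post, since both the content and the identity score poorly.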
Coupled to networks 102 are data server computers 104 which are capable of communicating over networks 102. Also coupled to networks 102 and data server computers 104 is a plurality of end user computers 106. Such data server computers 104 and/or client computers 106 may each include a desktop computer, laptop computer, hand-held computer, mobile phone, peripheral (e.g., printer, etc.), any component of a computer, and/or any other type of logic. In order to facilitate communication among networks 102, at least one gateway or router 108 is optionally coupled therebetween.
Referring now to
System unit 210 may be programmed to perform methods in accordance with this disclosure (examples of which are in
Processing device 200 may have resident thereon any desired operating system. Embodiments may be implemented using any desired programming languages, and may be implemented as one or more executable programs, which may link to external libraries of executable routines that may be provided by the provider of the illegal content blocking software, the provider of the operating system, or any other desired provider of suitable library routines. As used herein, the term “a computer system” can refer to a single computer or a plurality of computers working together to perform the function described as being performed on or by a computer system.
In preparation for performing disclosed embodiments on processing device 200, program instructions to configure processing device 200 to perform disclosed embodiments may be stored on any type of non-transitory computer-readable media, or may be downloaded from a server 104 onto program storage device 280.
Referring now to
To facilitate reputation services for social media identities, GTI cloud 310 can include information and algorithms to map a posting entity back to a real world entity. For example, a user's profile could be accessed to determine a user's actual name rather than their login name. The actual name and other identifying information (e.g., residence address, email account, birth date, resume information, etc.) available from a profile could be compared with information gathered from another profile on another site and used to normalize the multiple (potentially different) login identifiers back to a common real world entity. Also, GTI cloud 310 can include information about accounts to assist in determining a reputation score. For example, while a Twitter account existing for less than 7 days may have an average reputation, the same account posting a GTI-flagged bad link may immediately be flagged as dangerous. In contrast, an account existing for some months, with a history of innocent link posting, would not be penalized for an occasional malware link. To define a “score” for the identity, items and account information such as age, history, frequency, connections to other social media accounts, connections to a physical person, etc. could be used. This “score” could be used by filtering software, such as personal firewalls, web filters, etc., to strip content posted by identified low-reputation accounts or to provide an indication to other users via a visual indicator (an indication of which could be received or added) when the post is made available to a receiving user. Alternatively or in addition, a pop-up style message could appear when a user accesses the questionable post.
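The account-based rules above (a brand-new account posting a known-bad link is flagged immediately, while an established account tolerates an occasional flagged link) can be sketched as a simple heuristic. The function name and thresholds below are illustrative assumptions, not part of the disclosed system.

```python
from datetime import date, timedelta

# Hypothetical sketch of the account-based heuristic described above.
# Thresholds (7 days, 10% flagged ratio) are illustrative assumptions.

def account_reputation(created, today, total_posts, flagged_posts):
    """Return a coarse reputation label for an account.

    created/today: dates; total_posts: all posts seen; flagged_posts:
    posts whose links were flagged bad by a URL reputation service.
    """
    account_age = today - created
    if flagged_posts and account_age < timedelta(days=7):
        return "dangerous"      # new account already posting bad links
    if total_posts and flagged_posts / total_posts > 0.10:
        return "suspicious"     # persistent pattern of flagged content
    return "average"            # new-but-clean or established accounts
```

For example, an account two days old with one flagged post would be labeled dangerous, while a year-old account with one flagged post out of two hundred would keep an average reputation.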
User reputation could be calculated using a supervised learning algorithm along with defined business rules. Business rules may determine a reputation level for filtering an organization's accessible content (e.g., content to prevent from passing a corporate firewall) or provide a business-specific algorithm to use in conjunction with other disclosed embodiments. The supervised learning algorithm could be trained to classify user accounts in one of the score dimensions (e.g., malicious link tweeter, spammy tweeter, unreliable information tweeter, etc.). The training set could be labeled using automated systems with some possible human interaction as needed. For example, users who send tweets with links to malware can be automatically labeled by analyzing a tweet's link and content with a suite of security software—e.g., anti-virus, cloud-based URL reputation services (such as GTI cloud 310), etc. The Twitter user attributes used in training can include, but may not necessarily be limited to:
Once the machine learning model has been trained, new users (i.e., posts of first impression) can be submitted to the model and classified as trustworthy, potentially spammy, potentially malicious (or gradients between these extremes). These classifications (i.e., the identity's score) can be used in security applications to perform functions including, but not limited to:
Other “features” which could be extracted from transactions (e.g., posts, dates, sales) and used as metrics for establishing reputation include: graph properties of relationships (friends of friends, etc.); direct addressing of the user in Twitter (which implies a real-world relationship); text-learning techniques to analyze for spam, profanity, etc.; network properties of postings (same server/IP, domain age); unfollowing/unfriending-type activity; consistency of information between social environments; seller rankings on e-commerce sites; and other rating-type information on other available sites to which the identity can be mapped.
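Feature extraction of the kind described above can be sketched as turning an identity's transaction history into a flat feature vector for the supervised learning algorithm. The particular features and post structure below are illustrative assumptions mirroring a few of the listed metrics (posting volume, link concentration, direct addressing).

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative feature extraction over an identity's post history. The
# feature set and the assumed post shape are hypothetical examples.

def extract_features(posts):
    """Turn a list of post dicts into a flat feature vector (a dict).

    Each post is assumed to look like:
        {"text": str, "links": [url, ...], "mentions": [handle, ...]}
    """
    n = len(posts)
    links = [u for p in posts for u in p.get("links", [])]
    domains = Counter(urlparse(u).netloc for u in links)
    mentions = sum(len(p.get("mentions", [])) for p in posts)
    return {
        "post_count": n,
        "links_per_post": len(links) / n if n else 0.0,
        "distinct_domains": len(domains),
        # Many links concentrated on one domain can suggest a spam campaign.
        "top_domain_share": max(domains.values()) / len(links) if links else 0.0,
        # Direct addressing of other users suggests real-world relationships.
        "mentions_per_post": mentions / n if n else 0.0,
    }
```

A training pipeline could compute such a vector per labeled account and feed the result to any standard classifier; the same vector computed for a new account would then yield its classification.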
Referring now to
Referring now to
Process 500 is illustrated in
Process 550 is illustrated in
As should be apparent from the above explanation, embodiments disclosed herein allow the user, the reputation services server, web sites, and end users to work together to create and determine (in an on-going manner) a reputation of an identity on the Internet. Also, in the embodiments specifically disclosed herein, the reputation has been formed from the context of a post; however, other types of Internet interaction by an identity are contemplated and could benefit from concepts of this disclosure. It may also be worth noting that both the score and reputation of an identity may be applied to more than just web-based environments and could be used in real-world transactions to bolster or deflate a person's reputation. For example, credit ratings or loan approval amounts could be lowered or raised in the real world.
In the foregoing description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, to one skilled in the art that the disclosed embodiments may be practiced without these specific details. In other instances, structure and devices are shown in block diagram form in order to avoid obscuring the disclosed embodiments. References to numbers without subscripts or suffixes are understood to reference all instances of subscripts and suffixes corresponding to the referenced number. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter. Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one disclosed embodiment, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.
It is also to be understood that the above description is intended to be illustrative, and not restrictive. For example, above-described embodiments may be used in combination with each other and illustrative process steps may be performed in an order different than shown. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention therefore should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.”