Ad-serving companies, e.g., Microsoft®, need to serve advertisements to users that visit particular web sites. Typically, the ad-serving company bills an advertiser for legitimate responses, e.g., clicks or actions, from interested users. Unfortunately, advertisers, publishers, and users may abuse this system for their own financial gain.
Advertisers may generate vast numbers of advertisements that are irrelevant to the web sites being visited by the users. Because it is inexpensive to “mass market” rather than carefully target customers, this behavior benefits the advertisers that engage in offering irrelevant advertisements. Although this is not necessarily malicious, this behavior degrades the overall relevance of the advertisements served by the ad-serving company and adversely affects the likelihood that publishers will be interested in these advertisements. Accordingly, it is beneficial to identify and discourage irrelevant advertising.
Publishers may create a web site and indicate display categories that are irrelevant when compared to the web site. In addition, publishers may select keywords as being associated with their web sites so as to attract high value advertisements, e.g., utilizing terms like “mesothelioma” with a $100 cost-per-click (CPC), even though the topic of the web site is not related to the selected keyword. Further, the publisher may engage in “click fraud,” where the publisher itself clicks on advertisements being displayed at the publisher's web site, thus, causing false charges to the advertisers.
Users, often when affiliated with an advertiser or publisher, may also engage in click fraud, i.e., responding to advertisements without any interest therein. As such, the advertiser is billed for clicks or actions that do not relate to interest in the material within the advertisement being served by the ad-serving companies.
This malicious, and even illegal, behavior of advertisers, publishers, and users may be automated through the employment of robotic users, e.g., robots. Due to the complex and variable design of robotic users, ad networks have difficulty distinguishing between the requests and responses from robotic users and those from human users, and consequently, accurately detecting the inappropriate behavior. Because many ad-serving companies utilize a pricing scheme that charges the advertiser per action or click-through (e.g., charge-per-click (CPC) or charge-per-action (CPA) pricing models), and because actions and click-throughs may be automated by the robotic users, the advertiser's budget may be prematurely expended without the intended sales while the publisher's revenue is artificially increased. Robotic users may also drain the advertiser's computing bandwidth and/or deplete revenue received by the publisher. Accordingly, these robotic users accelerate online detrimental behavior and inaccurate advertising charges.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Embodiments of the present invention relate to computerized methods and systems for identifying automated click fraud programs. Upon receiving a request for presentation of a web page, the probability that the user is robotic vs. human is determined, at least in part, based upon the nature of the request. The determined probability, along with historic behavior related to the requesting user, if available, is used to determine a score that may be utilized to select advertisements for presentation to the user. If the score indicates a high likelihood that the user is robotic, an advertisement designed to solicit user behavior known to be associated with robots may be selected to confirm the suspicion. Alternatively, if the likelihood that the user is robotic is high enough, advertisement presentation may be largely suppressed. If, on the other hand, the score indicates a high likelihood that the user is human, a standard advertisement and/or an advertisement designed to solicit user feedback related to advertisements and/or publishers may be selected. The user behavior related to a trap or feedback advertisement, probability and/or score are stored in association with a user identifier and may be utilized to train the system for future scoring, if desired.
The present invention is described in detail below with reference to the attached drawing figures, wherein:
The subject matter of the present invention is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.
Embodiments of the present invention provide computerized methods and systems, and computer-readable media having computer-executable instructions embodied thereon, for presenting advertisements designed to aid in differentiating human from robotic users. As utilized herein, the term “advertisement” is not meant to be limiting. Further, the term “advertisement” could be, or include, a promotional communication between a seller offering goods or services and a prospective purchaser (e.g., a human user) of such goods or services; or a noncommercial communication presented by a publisher on its own web page, e.g., a trap advertisement, a virus warning, or the like. In addition, an advertisement may contain any type or amount of data that is capable of being communicated for the purpose of generating interest in and/or sale of goods or services, e.g., text, animation, executable information, video, audio, and other various forms known to those of ordinary skill in the art.
“Presentation,” as contemplated by one aspect of the present invention, includes display in association with a user interface. As utilized herein, the term “user interface” may include an aggregate of means by which users interact with a particular machine, device, computer program or other complex tool (e.g., computing system). The user interface provides means of both input, allowing the users to manipulate a computing system (e.g., inputting a request or communicating a click-through), and output, allowing the computing system to produce the effects of the users' manipulation (e.g., presenting advertisements).
Embodiments of the present invention relate to computerized methods and systems for selecting one or more advertisements for presentation based upon at least one request for a web page submitted by a user. In embodiments, the web page request may be received in association with the presentation of a trap advertisement (e.g., an unapparent advertisement or an image advertisement) or in association with the presentation of a feedback advertisement designed to solicit advertisement and/or publisher feedback from human users. The nature of the request, as more fully described below, is utilized to determine a probability that the requesting user is robotic as opposed to human. This determined probability, along with historic behavior related to the requesting user, is used to provide a score that is subsequently utilized in selecting one or more advertisements for presentation to the user. In one embodiment, if the score overcomes a threshold pre-defined based on robotic traffic patterns, a virus cleaner advertisement is presented to warn a potential human user of suspected infection and/or provide a mechanism for cleaning their system of viruses. In another embodiment, the score is utilized to adjust the rate at which commercial advertisements, as opposed to trap advertisements, are presented, thereby optimizing web page publisher revenue and reducing inappropriate billing for invalid requests.
Accordingly, in one aspect, the present invention provides one or more computer-readable media having computer-executable instructions embodied thereon that, when executed, perform a method for identifying automated click fraud programs. The method includes presenting an advertisement to a user, the user being associated with an identifier; measuring at least one user behavior related to the presented advertisement; utilizing the measured at least one user behavior to determine a probability that the user is robotic; and storing the probability and the associated at least one user behavior in association with the user identifier.
In another aspect of the present invention, a computer system is provided for identifying automated click fraud programs. The computer system includes a probability determining module configured to determine a probability that a user submitting a request for a web page is a robotic user based upon at least one measured user behavior; a scoring module configured to analyze at least one of the probability that the user submitting the request for the web page is a robotic user and historic user behavior and to assign a score to the user; an advertisement selection module configured to utilize the assigned user score to select one or more advertisements for presentation; and an historic user behavior database configured to store one or more of the determined probability, the at least one measured user behavior, the assigned score and the one or more selected advertisements in association therewith.
In another aspect, the present invention provides a computerized method for selecting one or more advertisements for presentation that are designed to warn a user of a potential virus. The method includes, incident to receiving at least one user request for a web page, determining a probability that the at least one request originated from a robotic user and utilizing the determined probability to assist in selecting the one or more advertisements to present. If the determined probability is high, the one or more selected advertisements include at least one virus warning.
Having briefly described an overview of embodiments of the present invention, an exemplary operating environment suitable for use in implementing embodiments of the present invention is described below.
Referring to the drawings in general, and initially to
The invention may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program components, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program components including routines, programs, objects, components, data structures, and the like, refer to code that performs particular tasks, or implements particular abstract data types. Embodiments of the present invention may be practiced in a variety of system configurations, including hand-held devices, consumer electronics, general-purpose computers, specialty computing devices, and the like. Embodiments of the present invention may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network.
With continued reference to
Computing device 100 typically includes a variety of computer-readable media. By way of example, and not limitation, computer-readable media may comprise Random Access Memory (RAM); Read Only Memory (ROM); Electronically Erasable Programmable Read Only Memory (EEPROM); flash memory or other memory technologies; CDROM, digital versatile disks (DVD) or other optical or holographic media; magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to encode desired information and be accessed by computing device 100.
Memory 112 includes computer-storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical-disc drives, and the like. Computing device 100 includes one or more processors that read data from various entities such as memory 112 or I/O components 120. Presentation component(s) 116 present data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, etc. I/O ports 118 allow computing device 100 to be logically coupled to other devices including I/O components 120, some of which may be built in. Illustrative components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, and the like.
Turning now to
Computing system 200 includes an advertisement delivery engine 210, a user device 212, an advertisement database 214, and a historic user behavior database 216 all in communication with one another via a network 218. The network 218 may include, without limitation, one or more local area networks (LANs) and/or wide area networks (WANs). Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. Accordingly, the network 218 is not further described herein.
The advertisement database 214 may be configured to store information associated with various types of advertisements, as more fully discussed below. In various embodiments, such information may include, without limitation, one or more unapparent advertisements, one or more image advertisements, one or more virus cleaning/warning advertisements, one or more user feedback advertisements, advertiser and/or publisher identities and the like. In addition, the advertisement database 214 may include zero advertisements stored in association therewith but rather contain an organizational blueprint with an empty set. In some embodiments, the advertisement database 214 is configured to be searchable for one or more advertisements to be selected for presentation, as more fully described below.
It will be understood and appreciated by those of ordinary skill in the art that the information stored in the advertisement database 214 may be configurable and may include any information relevant to an advertisement. Further, though illustrated as a single, independent component, database 214 may, in fact, be a plurality of databases, for instance, a database cluster, portions of which may reside on a computing device associated with the advertisement delivery engine 210, the user device 212, another external computing device (not shown), and/or any combination thereof.
The historic user behavior database 216 may be configured to store information associated with a plurality of system users and their associated user behaviors, as more fully discussed below. In various embodiments, such information may include, without limitation, one or more user identities, one or more probabilities related to a user, one or more scores assigned to a user, and the like. In addition, the historic user behavior database 216 may include no actual user behavior information stored in association therewith but rather contain an organizational blueprint with an empty set. In some embodiments, the historic user behavior database 216 is configured to be searchable for one or more user identities based upon, for instance, an IP address or the like, and associated information, as more fully described below.
It will be understood and appreciated by those of ordinary skill in the art that the information stored in the historic user behavior database 216 may be configurable and may include any information relevant to a user and their associated user behavior. Further, though illustrated as a single, independent component, database 216 may, in fact, be a plurality of databases, for instance, a database cluster, portions of which may reside on a computing device associated with the advertisement delivery engine 210, the user device 212, another external computing device (not shown), and/or any combination thereof.
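The kind of record such a database might hold can be sketched as follows. This is a minimal in-memory illustration only, assuming the user identifier is an IP address string; the names `UserRecord`, `HistoricUserBehaviorStore`, `lookup`, and `record` are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserRecord:
    """Hypothetical record layout; field names are illustrative only."""
    user_id: str  # assumed here to be an IP address
    behaviors: List[str] = field(default_factory=list)
    probabilities: List[float] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)


class HistoricUserBehaviorStore:
    """In-memory stand-in for a searchable historic user behavior database."""

    def __init__(self) -> None:
        # Starts as an "organizational blueprint with an empty set".
        self._records: Dict[str, UserRecord] = {}

    def lookup(self, user_id: str) -> Optional[UserRecord]:
        """Return the record for a user identifier, or None if unseen."""
        return self._records.get(user_id)

    def record(self, user_id: str, behavior: str,
               probability: float, score: float) -> UserRecord:
        """Store a behavior, probability, and score against the identifier."""
        rec = self._records.setdefault(user_id, UserRecord(user_id))
        rec.behaviors.append(behavior)
        rec.probabilities.append(probability)
        rec.scores.append(score)
        return rec
```

A deployed system would more plausibly use a persistent or clustered database, as the description notes; the dictionary here only illustrates the lookup-by-identifier pattern.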
Each of the advertisement delivery engine 210 and the user device 212 shown in
As shown in
The trap advertisement presenting module 220 is configured to provide, incident on receiving at least one request associated therewith, user indicia pertaining to a robotic user. By way of example, the request may be received at a user interface as the result of user input. It will be understood and appreciated by those of ordinary skill in the art that multiple methods exist by which a user may input a request. For instance, requests may be input, by way of example only, utilizing a keyboard, joystick, trackball, touch-pad, or the like. Alternative user interfaces known in the software industry are contemplated by the invention. The at least one request is typically a user-initiated action or response that is received at a user interface, as discussed above. Examples of a request are a click, click-through, or selection by a user, e.g., human user or robotic user; however, it is understood and appreciated by one of ordinary skill in the art that a request may take any number of forms of indication at a web page. Further, it is contemplated by the present invention that a robotic user may be any non-human operator (i.e., an internet bot, web bot program, virus, robot, web crawler, web spidering program, or any software application that runs automated tasks over the Internet), which is an artificial agent that, by its actions, conveys a sense that it has intent or agency of its own. Even further, a human user is contemplated as being a human, but also, an entity (virtual or physical) acting under the present intent of a human operator.
The trap advertisement presentation module 220 includes an unapparent advertisement (or honey pot advertisement) component 232 and an image advertisement component 234. The unapparent trap advertisement component 232 is configured to present one or more advertisements that may trigger at least one request from a robotic user, as more fully discussed below with reference to
By way of example only, the unapparent advertisement may be an “<A HREF>,” a 1×1 pixel, or an alphanumeric character of the same color as the background of a web page, yet having the same linking structure as other advertisements on the web page, more fully discussed below with reference to
The image advertisement component 234 is configured to solicit at least one request, wherein the coordinates of the at least one request on a user interface are determined, as more fully described below with reference to
Upon determination of the coordinates of a request, the image advertisement component 234 may compare those coordinates with expected coordinates, e.g., coordinates of the “call-to-action” of the image advertisement, more fully discussed below with reference to
Although two different configurations of trap advertisements have been shown, it should be understood and appreciated by those of ordinary skill in the art that other trap advertisements or robotic user identification components could be used, and that the invention is not limited to those embodiments shown and described.
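The coordinate comparison described for the image advertisement can be sketched as below. This is an illustrative sketch only, assuming the expected coordinates are a rectangular "call-to-action" region in pixel units; `Region`, `click_is_expected`, and the `tolerance` parameter are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangle for the call-to-action, in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def click_is_expected(x: int, y: int, expected: Region,
                      tolerance: int = 0) -> bool:
    """Return True when the request coordinates fall inside the expected
    region, optionally widened by `tolerance` pixels on every side."""
    return (expected.left - tolerance <= x <= expected.right + tolerance
            and expected.top - tolerance <= y <= expected.bottom + tolerance)
```

A request whose coordinates fall outside the expected region would then be treated as user indicia of a robotic user; how strict the comparison should be is a design choice the description leaves open.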
The feedback advertisement presentation module 222 is configured to present a feedback advertisement, wherein the feedback advertisement comprises noncommercial content that is accessible by satisfying a user-validation query, as more fully discussed below with reference to
In the illustrated embodiment, the feedback advertisement includes a user-validation query component 236 and a survey component 238. The user-validation query component 236 is configured to provide a user-validation query upon selection of a feedback advertisement prompt (
If the user-validation query is satisfied, a survey may be presented, e.g., utilizing the survey component 238. Alternatively, if the user-validation query is not satisfied, then the survey is not presented. However, in either of these instances, the IP address of the user and the status of whether the user-validation query is satisfied are sent as user indicia of a human user or robotic user to the probability determining module 224. Accordingly, the user indicia generated from the user-validation query component 236 are useful in providing examples of requests that are likely from a human user or robotic user.
The survey component 238 is configured to present noncommercial content, e.g., a survey. In other embodiments, the noncommercial content may comprise a solicitation of relevance of the at least one advertisement, quality of a publisher, and relevance of at least one advertisement with regard to an advertiser, as more fully described with reference to
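The handling of a user-validation query can be sketched as follows. This is a minimal illustration, assuming the query has a single expected text answer compared case-insensitively; the names `UserIndicia` and `handle_validation` are hypothetical, and the description does not specify the form of the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserIndicia:
    """Hypothetical indicia forwarded to the probability determining module."""
    ip_address: str
    validation_satisfied: bool
    survey_presented: bool


def handle_validation(ip_address: str, expected_answer: str,
                      given_answer: str) -> UserIndicia:
    """Check the answer (assumed: case-insensitive text match) and build
    the indicia. The survey is presented only when the query is satisfied,
    but indicia are produced in either case."""
    satisfied = (given_answer.strip().lower()
                 == expected_answer.strip().lower())
    return UserIndicia(ip_address, satisfied, survey_presented=satisfied)
```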
Useful feedback from the large, engaged, and interested audience of human users may provide a variety of input to a web page publisher. In one instance, the survey may assist in judging the relevance of an advertisement. Here, the human users have an opportunity to comment on advertisements that may be irrelevant, untargeted, selling illegal schemes (e.g., porn, hate, money-making), or any other advertisement where the content is questionable. In another instance, the survey may help gather feedback on the relevance and quality of the web page publisher. Here, human users have an opportunity to report publishers that purvey illegal schemes, as discussed above, or that simply provide a poor user experience upon entering that particular web page. In yet another instance, the survey asks for ratings on the quality and relevance of the advertisement with regards to publisher, e.g., effectiveness of the ad-matching algorithm. Although several instances of survey material are discussed above, other fields of useful feedback are apparent to those of ordinary skill in the art to which the present invention pertains. Examples of questions that achieve the ends discussed above are provided at
Incident to receiving a request for a web page originating from a presented advertisement, the probability determining module 224 is configured to determine a probability that a user submitting the web page request is a robotic user based upon at least one measured user behavior. More specifically, information related to the advertisement associated with the request (and possibly the requesting user's IP address) is utilized in determining whether it was a human user or robotic user that provided the request. In one exemplary embodiment, if the request is associated with an unapparent advertisement, then a determination of high probability that the request originated from a robotic user is likely. In another exemplary embodiment, if the request is associated with an image advertisement and the coordinates of a request and the coordinates of an expected request are dissimilar upon comparison, then a determination of high probability that the request originated from a robotic user is likely. However, in yet another embodiment, if the request is associated with a feedback advertisement and the user-validation query is satisfied, then a determination of low probability that the request originated from a robotic user is likely. Incident to a determination of a probability that the requesting user is a robotic user, the determination is forwarded to the scoring module 226.
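The three exemplary embodiments above can be sketched as a simple rule table. This is a hedged illustration only: the numeric values for "high" and "low" probability, the neutral fallback, the advertisement type strings, and the function name `robot_probability` are all assumptions made for the example, since the description does not assign numbers.

```python
from typing import Optional

# Assumed illustrative values; the description gives no numeric probabilities.
HIGH, LOW, UNKNOWN = 0.9, 0.1, 0.5


def robot_probability(ad_type: str,
                      coordinates_match: Optional[bool] = None,
                      validation_satisfied: Optional[bool] = None) -> float:
    """Map the nature of a request to a probability that it is robotic."""
    if ad_type == "unapparent":
        # A request on an advertisement a human cannot see.
        return HIGH
    if ad_type == "image" and coordinates_match is False:
        # Request coordinates differ from the expected coordinates.
        return HIGH
    if ad_type == "feedback" and validation_satisfied is True:
        # The user-validation query was satisfied.
        return LOW
    # No rule applies: fall back to an assumed neutral value.
    return UNKNOWN
```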
The scoring module 226 is configured to analyze at least one of the probability that the user submitting the request for the web page is a robotic user and historic user behavior and to assign a score to the user, as more fully discussed below with reference to
In embodiments, the scoring module 226 is further configured to be trained. Training is comprised of receiving information, examining that information in view of click-stream traffic patterns already stored in association with the scoring module 226 (and/or accessible from historic user behavior database 216), and updating the stored information such that the scoring module 226 is better able to distinguish a human user from a robotic user upon receiving future requests. Receiving information includes receiving the determination of probability of the request originating from a robotic user and the requesting user's IP address from probability determining module 224. If an IP address is received, the scoring module 226 may additionally request any historic behavior related to that IP address, for instance, from historic user behavior database 216. Examining the information includes comparing the historic behavior against known or previously collected click-stream traffic patterns of a robotic user, a human user, or both. By way of example only, comparison comprises analyzing click-through rate or conversion statistics that are robotic in nature in view of historical behavior associated with user indicia of a robotic user. Updating, with reference to the previous example, includes incorporating any differences between the historical behavior of an identified robot and known robotic click-stream traffic patterns into the scoring module 226 and storing the comparison as an update therein.
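One way to picture scoring and training is sketched below. It is an assumption-laden illustration, not the scoring method of the invention: the score is taken to be higher for more human-like users, the current probability is blended with the mean of historic probabilities using an assumed `history_weight`, and the stored robotic pattern is reduced to a running mean click-through rate. The names `assign_score` and `RoboticPattern` are hypothetical.

```python
from typing import Sequence


def assign_score(robot_probability: float,
                 historic_probabilities: Sequence[float],
                 history_weight: float = 0.5) -> float:
    """Blend the current robot probability with historic ones and return
    a score in [0, 1], where higher means more likely human (assumed)."""
    if not 0.0 <= robot_probability <= 1.0:
        raise ValueError("robot_probability must be in [0, 1]")
    if historic_probabilities:
        historic = sum(historic_probabilities) / len(historic_probabilities)
        blended = ((1.0 - history_weight) * robot_probability
                   + history_weight * historic)
    else:
        # No history available: rely on the current probability alone.
        blended = robot_probability
    return 1.0 - blended


class RoboticPattern:
    """Assumed stand-in for a stored robotic click-stream pattern:
    a running mean of click-through rates seen from identified robots."""

    def __init__(self) -> None:
        self.mean_ctr = 0.0
        self.samples = 0

    def update(self, observed_ctr: float) -> None:
        """Fold the difference between an identified robot's observed
        click-through rate and the stored mean back into the pattern."""
        self.samples += 1
        self.mean_ctr += (observed_ctr - self.mean_ctr) / self.samples
```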
The advertisement selection module 228 is configured to utilize the assigned user score to select one or more advertisements for presentation, as more fully discussed below with reference to
As can be understood and appreciated by those of ordinary skill in the art, the advantage of selection is that it can serve a variety of purposes. For instance, if the score overcomes the threshold value, then it is likely that the request originated from a human user, and correspondingly, a commercial advertisement is selected for presentation. Further, the rate of presentation (i.e., the frequency at which non-commercial, or trap, advertisements are presented in context to the commercial advertisements) may be adjusted for that particular requesting user. As such, revenue is optimized for the web page publisher by reducing the rate of presenting non-commercial advertisements. If the score does not overcome the threshold value, then it is likely that the request originated from a robotic user, and correspondingly, the commercial advertisements are withheld by adjusting the rate of presentation. Accordingly, inappropriate advertiser billing is reduced. It will be understood and appreciated by those of ordinary skill in the art that methods for selecting the rate of presentation and the type of advertisements associated therewith are not limited to the embodiments described herein and that the nature of the threshold value may vary accordingly.
The advertisement delivery module 230 is configured to deliver one or more advertisements to the user device 212 for presentation, for instance, at a user interface associated therewith, as more fully discussed below with reference to
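The threshold comparison and rate adjustment can be sketched as follows. This is a minimal illustration under stated assumptions: a score above the threshold indicates a likely human user, the threshold and the two trap rates are invented values, and `trap_rate_for` and `select_advertisement` are hypothetical names.

```python
import random
from typing import Optional


def trap_rate_for(score: float, threshold: float = 0.5,
                  low_rate: float = 0.05, high_rate: float = 0.9) -> float:
    """Return the fraction of presentations that should be trap (or other
    non-commercial) advertisements for a user with this score."""
    # Likely human: few traps, so publisher revenue is preserved.
    # Likely robotic: mostly traps, so commercial ads are withheld.
    return low_rate if score > threshold else high_rate


def select_advertisement(score: float, threshold: float = 0.5,
                         rng: Optional[random.Random] = None) -> str:
    """Pick 'trap' or 'commercial' at the rate implied by the score."""
    rng = rng or random.Random()
    rate = trap_rate_for(score, threshold)
    return "trap" if rng.random() < rate else "commercial"
```

Passing a seeded `random.Random` makes the selection reproducible, which may be convenient when auditing why a given advertisement was served.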
As discussed above, the type of advertisement may be commercial (e.g., provided by an advertiser), non-commercial (e.g., feedback advertisement provided by a web page publisher), or a warning of robotic user. The warning of robotic user is typically presented to a suspected human user's device that has indicated a robotic user originated a request therefrom. That is, based on the adjusted rate of presentation, the advertisement delivery module 230 may present a warning upon noticing that the more recent requests are of a robotic nature as opposed to historic behavior indicating a human user, e.g., IP address. Embodiments of the warning include virus cleaning advertisements and are discussed in more detail below with reference to
Turning now to
With reference to
As shown in
Turning now to
With reference to
As shown in
Turning now to
Referring to
Turning to
Referring to
An illustrative screen display 1400, similar to the exemplary user interface 1300 of
Turning now to
The illustrated screen displays 1300 (
As can be seen, embodiments of the present invention relate to computerized methods and systems for selecting one or more advertisements for presentation based upon at least one request for a web page submitted by a user. In embodiments, the web page request may be received in association with the presentation of a trap advertisement (e.g., an unapparent advertisement or an image advertisement) or in association with the presentation of a feedback advertisement designed to solicit advertisement and/or publisher feedback from human users. The nature of the request is utilized to determine a probability that the requesting user is robotic as opposed to human. This determined probability, along with historic behavior related to the requesting user, is used to provide a score that is subsequently utilized in selecting one or more advertisements for presentation to the user. In one embodiment, if the score overcomes a threshold pre-defined based on robotic traffic patterns, a virus cleaner advertisement is presented to warn a potential human user of suspected infection and/or provide a mechanism for cleaning their system of viruses. In another embodiment, the score is utilized to adjust the rate at which commercial advertisements, as opposed to trap advertisements, are presented, thereby optimizing web page publisher revenue and reducing inappropriate billing for invalid requests.
The present invention has been described in relation to particular embodiments, which are intended in all respects to be illustrative rather than restrictive. Alternative embodiments will become apparent to those of ordinary skill in the art to which the present invention pertains without departing from its scope.
From the foregoing, it will be seen that this invention is one well adapted to attain all the ends and objects set forth above, together with other advantages which are obvious and inherent to the system and method. It will be understood that certain features and sub-combinations are of utility and may be employed without reference to other features and sub-combinations. This is contemplated by and is within the scope of the claims.