The subject matter is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the subject matter. It may be evident, however, that subject matter embodiments may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the embodiments.
As used in this application, the term “component” is intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a computer component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
Web-search engines typically make money by showing advertisements above and/or to the side of the non-monetized (i.e., organic) results. Advertisers compete with each other in an auction to decide which advertisements are shown. In the most common pricing model, an advertiser pays the search engine each time the advertiser's advertisement is clicked by the person doing the search. Search engines can log, for each query, all of the monetization information for that query, such as how many bidders were competing, what the bids were, how much was charged, etc. Instances disclosed herein utilize the monetization information to facilitate reducing the effects of spam. In general, an indication of how valuable a particular search query is can be obtained by looking at the monetization information: search queries for which advertisers are willing to pay a lot of money are also likely to be targets of spammers. Other instances also incorporate specific user advertising interactions to tune the effect based on individual users.
For example, search relevance can be compromised by web spam. Web spammers try to trick search engines into showing particular pages higher up in the result set than those pages deserve. One such web-spam trick is to embed hidden words in the HTML of a web page that might suggest to a ranking algorithm that the page is more relevant than it really is. Similarly, communications can be compromised by email spam. Email spammers overwhelm email systems with large quantities of unsolicited emails every day. Some spammers even attempt to camouflage the true content of the spam emails and/or the true sender of the email to entice recipients to open their spam emails. Thus, instances provided herein can utilize data collected from a search advertising system to help rank organic (non-advertisement) search results, to help filter email spam, and/or to facilitate providing “user-specific” results and/or filtering based additionally on specific user impressions/clicks/purchases and the like.
In
For example, a user who enters the term “flower” into a search engine may also be interested in purchasing flowers as well as finding out further information about a flower—thus, it is beneficial for a company that sells flowers to advertise to that user at the point in time that the user is searching for a relevant term. Thus, frequently, users who are searching for information will see related advertisements and click on such advertisements to purchase flowers, thereby creating business for the flower retailer. The search engine itself is also provided with additional revenue by selling advertisement space for a particular period of time to a retailer when a relevant term, such as, for example, the term “flower,” is utilized as a search term.
The advertising space relating to search terms is typically bought or sold in an auction. More specifically, a search engine can receive a query (from a user) that includes one or more search terms that are of interest to a plurality of buyers. The buyers can place bids with respect to at least one of the search terms, and a buyer that corresponds to, for example, the highest bid will have their advertisement displayed upon a resulting page view. The search engine stores this advertising information in logs so that it can track which advertisers bid, how much they bid, and for what search terms, etc. The advertiser monetization information 108 can also be obtained in substantially real-time and/or from a local and/or remote data store.
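The auction and logging steps described above can be sketched as follows. This is a minimal illustration only: the `Bid` record, the `run_auction` function, and the log field names are hypothetical and are not taken from any particular search engine, and the sketch assumes the simplest case in which the single highest bid on any term in the query wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    advertiser: str  # hypothetical advertiser identifier
    term: str        # search term the bid applies to
    amount: float    # bid per click, in arbitrary currency units


def run_auction(query_terms, bids):
    """Pick the highest bid on any query term and build a log entry.

    Returns (winning Bid or None, log entry dict). The log entry records
    the kind of monetization information described in the text: how many
    bidders competed and what the bids were.
    """
    relevant = [b for b in bids if b.term in query_terms]
    winner = max(relevant, key=lambda b: b.amount, default=None)
    log_entry = {
        "query_terms": list(query_terms),
        "num_bidders": len({b.advertiser for b in relevant}),
        "bids": sorted((b.amount for b in relevant), reverse=True),
        "winner": winner.advertiser if winner else None,
    }
    return winner, log_entry
```

A real system would also record the amount charged per click, which can differ from the winning bid depending on the pricing rule.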
The search query monetization value 106, in effect, represents the commercial desirability or worth of the search query 104. Because spammers tend to target highly desirable search queries, the search query monetization value 106 is a good indicator of likely spammer targets. Thus, the search query monetization value 106 can be leveraged to facilitate, for example, improving organic search relevance and/or reducing spam emails and the like, as described in detail infra. This affords a substantial improvement over current spam reduction techniques.
Turning to
The advertiser-monetization value component 210 determines the search query monetization value 206 based, at least in part, on the advertiser monetization information 212. To facilitate the determination, the advertiser-monetization value component 210 can employ monetization value mapping processes that utilize, for example, an independent phrase value algorithm 214 and/or an independent bids algorithm 216 and the like. These two mapping algorithms are not the only algorithms that can be utilized with instances described herein; there are many acceptable ways to map from monetization information to the monetization value of a search query.
The two simple algorithms mentioned above are now described in detail. In a first example, independent phrase values are utilized. For each search query that occurs during some time span (e.g., a day), the total amount charged to advertisers is calculated. This amount is then distributed uniformly among the phrases in the search query 204 (a phrase can be a single word and/or multiple words). When a new search query comes in, the advertiser-monetization value component 210 can sum up the phrase values to determine the search query monetization value 206. For example:
Phrase Values:
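As a minimal sketch of the independent phrase value approach, the following code builds phrase values from logged charges and then scores a new query. The function names and the sample charges are hypothetical. The sketch also makes two assumptions not stated in the description: each phrase is a single whitespace-separated word, and a phrase's value is the accumulated total of its shares over the time span.

```python
from collections import defaultdict


def build_phrase_values(charged_by_query):
    """Distribute each query's total charge uniformly over its phrases.

    charged_by_query maps a query string to the total amount charged to
    advertisers for that query during the time span (e.g., one day).
    """
    values = defaultdict(float)
    for query, total in charged_by_query.items():
        phrases = query.lower().split()  # assumption: one word per phrase
        if not phrases:
            continue
        share = total / len(phrases)
        for phrase in phrases:
            values[phrase] += share  # assumption: shares accumulate
    return dict(values)


def query_value(query, phrase_values):
    """Sum the stored phrase values for a new query; unseen phrases add 0."""
    return sum(phrase_values.get(p, 0.0) for p in query.lower().split())


# Hypothetical logged charges for one day.
charges = {"digital camera": 10.0, "camera bag": 4.0, "weather": 0.0}
phrase_values = build_phrase_values(charges)
value = query_value("cheap digital camera", phrase_values)
```

With these made-up charges, "camera" receives 5.0 from the first query and 2.0 from the second, so a new query containing both "digital" and "camera" is valued at 12.0.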
In a second example, independent bids are utilized. For each search query that has occurred, record the third-highest (or second-highest, or . . .) bid for a click on that search query. As in the independent phrase value example above, distribute this bid uniformly across the phrases in the search query 204, and store with each phrase the average “phrase bid” attributed to the phrase when it appears in a search query that has at least one advertisement. When a new search query comes in, the advertiser-monetization value component 210 takes the average (or the sum) of the individual phrase bids to determine the search query monetization value 206.
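The independent bids approach can be sketched in the same style. The names below are hypothetical, phrases are again assumed to be single whitespace-separated words, and the fallback to the lowest bid when a query has fewer bids than the chosen rank is an assumption made only so that the sketch handles short bid lists.

```python
from collections import defaultdict


def reference_bid(bids, rank=3):
    """Return the rank-th highest bid (1-based).

    Assumption: when fewer than `rank` bids exist, the lowest bid is used.
    """
    ordered = sorted(bids, reverse=True)
    return ordered[min(rank, len(ordered)) - 1]


def build_phrase_bids(bid_log, rank=3):
    """Average per-phrase share of the reference bid over logged queries.

    bid_log is an iterable of (query, list_of_bids). Queries without any
    bids (i.e., without at least one advertisement) are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for query, bids in bid_log:
        phrases = query.lower().split()
        if not bids or not phrases:
            continue
        share = reference_bid(bids, rank) / len(phrases)
        for phrase in phrases:
            totals[phrase] += share
            counts[phrase] += 1
    return {p: totals[p] / counts[p] for p in totals}


def query_value_from_bids(query, phrase_bids, combine="average"):
    """Combine phrase bids for a new query by average (default) or sum."""
    vals = [phrase_bids.get(p, 0.0) for p in query.lower().split()]
    if not vals:
        return 0.0
    return sum(vals) if combine == "sum" else sum(vals) / len(vals)
```

Treating unseen phrases as having a bid of zero is one possible choice; skipping them instead would keep rare words from pulling the average down.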
Referring to
Thus, the search relevance improvement system 300 comprises a search relevance improvement component 302 that receives the search query 304 and provides search result ranking 306. The search relevance improvement component 302 comprises the advertiser-monetization component 308 and the web-search ranking component 310. The advertiser-monetization component 308 receives the search query 304 and the advertiser monetization information 312. A search query monetization value is then determined for the search query 304 based, at least in part, on the advertiser monetization information 312. To accomplish this, the advertiser-monetization component 308 can employ monetization value mapping techniques that utilize, for example, an independent phrase value algorithm 320 and/or an independent bids algorithm 322 and the like. It can be appreciated that the advertiser-monetization component 308 can employ many different monetization mapping techniques.
The web-search ranking component 310 obtains the search query monetization value from the advertiser-monetization component 308. In other instances, the web-search ranking component 310 can also obtain the search query 304 directly (e.g., directly from a search engine, advertising system, etc.). The web-search ranking component 310 employs web-search algorithm(s) 314 to facilitate determining the search result ranking 306. The web-search algorithm(s) 314 are based, at least in part, on the search query monetization value determined by the advertiser-monetization component 308. For example, the web-search algorithm(s) 314 can include a low monetization value queries algorithm 316 and/or a high monetization value queries algorithm 318 and the like. It can be appreciated that there are many algorithms that can be employed with instances disclosed herein. Different algorithms can be employed by the web-search ranking component 310 to adequately weight the search query monetization values to improve search relevance.
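One way the choice between a low and a high monetization value algorithm could look is sketched below. The description does not specify what either algorithm does, so everything here is an assumption for illustration: each result is assumed to carry a relevance score and a spam likelihood from some existing classifier, and high-value queries simply apply a heavier spam penalty. The function name, threshold, and penalty weights are hypothetical.

```python
def rank_results(results, monetization_value, high_value_threshold=5.0,
                 low_penalty=0.1, high_penalty=1.0):
    """Order results by relevance minus a value-dependent spam penalty.

    results is a list of (url, relevance, spam_likelihood) tuples with
    scores in [0, 1]. Queries whose monetization value reaches the
    threshold are treated as likely spam targets and penalized harder.
    """
    if monetization_value >= high_value_threshold:
        penalty = high_penalty  # hypothetical "high value" behavior
    else:
        penalty = low_penalty   # hypothetical "low value" behavior
    scored = [(relevance - penalty * spam, url)
              for url, relevance, spam in results]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]
```

Under this sketch, a page with high relevance but a high spam likelihood can still rank first for a low-value query, yet drops below a cleaner page when the query is commercially valuable.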
The web-search ranking component 310 can also utilize specific user advertisement interactions 324 to facilitate determining the search result ranking 306 for a specific user. The specific user advertisement interactions 324 include, but are not limited to, clicking on advertisements, viewing advertisements, purchasing items through an advertisement, and other interaction activities (e.g., hovering a mouse pointer over an advertisement, prolonged eye contact detected via eye movement detection devices, and/or attention detected via environmentally aware devices, and the like). In general, the specific user advertisement interactions 324 are obtained from search engines and/or advertising systems which typically log information associated with user advertisement interactions. The information can also include, but is not limited to, levels of interaction such as looking but not clicking, clicking, and/or actually making purchases and the like related to an advertisement. Many search engines can also track individual users so these types of information can be user specific. In scenarios where a “user” is considered to be a specific computing entity (i.e., it could be shared by many different people), “user specific” refers to the computing entity when individual people utilizing the machine cannot be identified.
The specific user advertisement interactions 324 can be invaluable to user satisfaction during the determination of the search result ranking 306 by the web-search ranking component 310. Users who desire search results related to a likely spam topic will be dissatisfied if those topics are removed and/or downgraded in the search result ranking 306. Thus, by incorporating specific user advertisement interactions 324 into the search result ranking 306, the web-search ranking component 310 can provide tuned results for individual users, substantially increasing the user's satisfaction with the search result ranking 306.
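A minimal sketch of such per-user tuning follows, assuming the ranking applies some spam penalty that can be scaled. The interaction levels mirror those named in the description (viewing, clicking, purchasing), but the numeric weights, the `user_affinity` and `tuned_penalty` names, and the idea of matching advertisement phrases against query words are all hypothetical.

```python
from enum import IntEnum


class Interaction(IntEnum):
    VIEWED = 1      # looked at an advertisement without clicking
    CLICKED = 2     # clicked an advertisement
    PURCHASED = 3   # made a purchase through an advertisement


def user_affinity(interactions, query):
    """Return the user's strongest interaction with the query's phrases.

    interactions is a list of (ad_phrase, Interaction) pairs logged for
    one user. The result is scaled to [0, 1]; 0 means no interaction.
    """
    phrases = set(query.lower().split())
    levels = [int(level) for phrase, level in interactions
              if phrase.lower() in phrases]
    return max(levels, default=0) / int(Interaction.PURCHASED)


def tuned_penalty(base_penalty, affinity):
    """Shrink a spam penalty for users who engage with the topic."""
    return base_penalty * (1.0 - affinity)
```

With this scaling, a user who has purchased through advertisements on a topic sees no downgrading for that topic, while a user with no logged interactions receives the full penalty.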
The monetization value from search advertisements can also be utilized to help detect email spam. The intuition is similar: if certain phrases such as “digital camera” are worth a lot to search advertisers, it might be expected that email spam will be sent from those same advertisers trying to sell users digital cameras. Thus, this can be leveraged to increase the effectiveness of spam filtering. Looking at
The advertiser-monetization component 408 receives the search query 404 and advertiser monetization information 412. A search query monetization value is then determined for the search query 404 based on, at least in part, the advertiser monetization information 412. To accomplish this, the advertiser-monetization component 408 can employ monetization value mapping techniques that utilize, for example, independent phrase value algorithm 420 and/or independent bids algorithm 422 and the like. It can be appreciated that the advertiser-monetization component 408 can employ many different monetization mapping techniques.
The email spam filtering component 410 obtains the search query monetization value from the advertiser-monetization component 408. The email spam filtering component 410 employs spam filter algorithm(s) 414 to facilitate determining the filtered email 406. The spam filter algorithm(s) 414 are based, at least in part, on the search query monetization value determined by the advertiser-monetization component 408. For example, the spam filter algorithm(s) 414 can include a low monetization value queries algorithm 416 and/or a high monetization value queries algorithm 418 and the like. It can be appreciated that there are many algorithms that can be employed with instances disclosed herein. Different algorithms can be employed by the email spam filtering component 410 to adequately weight the search query monetization values to improve email spam filtering.
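The following sketch shows one way a monetization value could feed a spam filter. It rests on several assumptions that the description leaves open: the email text is scored like a search query by summing per-word phrase values, a base spam score in [0, 1] is available from some existing filter, and high-value messages are simply held to a stricter cutoff. All names, thresholds, and cutoffs are hypothetical.

```python
import re


def email_monetization_value(text, phrase_values):
    """Sum stored phrase values over the words found in an email body."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sum(phrase_values.get(word, 0.0) for word in words)


def is_spam(base_spam_score, monetization_value, high_value_threshold=5.0,
            strict_cutoff=0.5, lenient_cutoff=0.9):
    """Apply a stricter cutoff to messages about commercially valuable topics.

    base_spam_score is assumed to come from an existing spam classifier.
    """
    if monetization_value >= high_value_threshold:
        cutoff = strict_cutoff   # hypothetical "high value" behavior
    else:
        cutoff = lenient_cutoff  # hypothetical "low value" behavior
    return base_spam_score >= cutoff


# Hypothetical phrase values, e.g. derived from advertising logs.
phrase_values = {"digital": 5.0, "camera": 7.0}
value = email_monetization_value("Buy a DIGITAL camera now!", phrase_values)
flagged = is_spam(0.6, value)
```

In this example the message mentions two valuable phrases, so a borderline base score of 0.6 is enough to flag it, whereas the same score on a message with no valuable phrases would pass.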
The email spam filtering component 410 can also utilize specific user advertisement interactions 424 to facilitate determining the filtered email 406 for a specific user. The specific user advertisement interactions 424 include, but are not limited to, clicking on advertisements, viewing advertisements, purchasing through advertisements, and other interaction activities (e.g., hovering a mouse pointer over an advertisement, prolonged eye contact detected via eye movement detection devices, and/or attention detected via environmentally aware devices, and the like). In general, the specific user advertisement interactions 424 are obtained from search engines and/or advertisement systems which typically log information associated with user advertisement interactions. The information can include, but is not limited to, levels of interaction such as looking but not clicking, clicking, and/or actually making purchases and the like related to an advertisement. Many search engines can also track individual users so these types of information can be user specific. In scenarios where a “user” is considered to be a specific computing entity (i.e., it could be shared by many different people), “user specific” refers to the computing entity when individual people utilizing the machine cannot be identified.
The specific user advertisement interactions 424 can be invaluable to user satisfaction during the filtering of the filtered email 406 by the email spam filtering component 410. Users who desire to receive email related to a likely spam topic will be dissatisfied if those topics are removed and/or flagged in the filtered email 406. Thus, by incorporating specific user advertisement interactions 424 into the filtered email 406, the email spam filtering component 410 can provide tuned spam filtering for individual users, substantially increasing the user's satisfaction with the filtered email 406.
In view of the exemplary systems shown and described above, methodologies that may be implemented in accordance with the embodiments will be better appreciated with reference to the flow charts of
The embodiments may be described in the general context of computer-executable instructions, such as program modules, executed by one or more components. Generally, program modules include routines, programs, objects, data structures, etc., that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various instances of the embodiments.
In
Turning to
In general, specific user advertisement interactions are obtained from search engines and/or advertising systems which typically log information associated with user advertisement interactions. The information can include, but is not limited to, levels of interaction such as looking but not clicking, clicking, and/or actually making purchases and the like related to an advertisement. Many search engines can also track individual users so these types of information can be user specific. In scenarios where a “user” is considered to be a specific computing entity (i.e., it could be shared by many different people), “user specific” refers to the computing entity when individual people utilizing the machine cannot be identified.
At least one filtering process is then employed that filters search results for a search query based on, at least in part, the search query monetization value and/or specific user advertising interactions 606, ending the flow 608. For example, a filtering process can include a low monetization value queries algorithm and/or a high monetization value queries algorithm and the like. The filtering process can also, for example, be utilized to provide ranking of the search results. It can be appreciated that there are many algorithms that can be employed with instances disclosed herein. Different algorithms can be employed to adequately weight the search query monetization values to improve search query filtering such as enhanced ranking/relevance. Use of specific user advertisement interactions can also be invaluable to user satisfaction during the search result filtering. For example, users who desire search results related to a likely spam topic will be dissatisfied if those topics are removed and/or downgraded in the search result filtering. Thus, by incorporating specific user advertisement interactions into the search result ranking, results can be tuned for individual users, substantially increasing the user's satisfaction with such filtering processes as search result ranking.
Referring to
In general, specific user advertisement interactions are obtained from search engines and/or advertising systems which typically log information associated with user advertisement interactions. The information can include, but is not limited to, levels of interaction such as looking but not clicking, clicking, and/or actually making purchases and the like related to an advertisement. Many search engines can also track individual users so these types of information can be user specific. In scenarios where a “user” is considered to be a specific computing entity (i.e., it could be shared by many different people), “user specific” refers to the computing entity when individual people utilizing the machine cannot be identified.
Email spam is then filtered by employing, at least in part, search query monetization values and/or specific user advertisement interactions 706, ending the flow 708. For example, a filtering process can include a low monetization value queries algorithm and/or a high monetization value queries algorithm and the like. It can be appreciated that there are many different algorithms that can be employed with instances disclosed herein. Different algorithms can be employed to adequately weight the search query monetization values to improve email spam filtering. Use of specific user advertisement interactions can also be invaluable to user satisfaction during the filtering of email. Users who desire to receive email related to a likely spam topic will be dissatisfied if those topics are removed and/or flagged. Thus, by incorporating specific user advertisement interactions into the filtering process, the spam filtering can be tuned for individual users, substantially increasing the user's satisfaction with the filtering process.
The various components and processes described above can reside in similar and/or disparate locations that require various communication means to retrieve/obtain information/data.
It is to be appreciated that the systems and/or methods of the embodiments can be utilized in search query monetization value facilitating computer components and non-computer related components alike. Further, those skilled in the art will recognize that the systems and/or methods of the embodiments are employable in a vast array of electronic related technologies, including, but not limited to, computers, servers and/or handheld electronic devices, and the like.
What has been described above includes examples of the embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of the embodiments are possible. Accordingly, the subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.