Online advertising has become a significant aspect of the Web browsing experience. Today, many search engine providers receive revenue through advertisements positioned adjacent to a user's query results. In particular, when a user submits a search query to a search engine, the search engine will select advertisements and present the advertisements in conjunction with general search results for the user's query. Typically, search engine providers receive payment from advertisers based upon pay-per-performance models (e.g., cost-per-click or cost-per-action models). In such models, the advertisements returned with search results for a given search query include links to landing pages that contain the advertisers' content. A search engine provider receives payment from an advertiser when a user clicks on the advertiser's advertisement to access the landing page and/or otherwise performs some action after accessing the landing page (e.g., purchases the advertiser's product).
In the pay-per-performance model, search engine providers select advertisements for search queries based on monetization. In other words, search engine providers select advertisements to return for a given search query to maximize advertising revenue. This is typically performed through an auction process. Search engine providers permit advertisers to bid for particular words and/or phrases as a way of selecting advertisements and determining the order in which advertisements will be displayed for a given search query. Bids are typically made as cost-per-click (CPC) commitments. That is, the advertiser bids a dollar amount it is willing to pay each time a user selects or clicks on a displayed advertisement presented as a result of a given search query.
One monetization method that search engines may use to determine selection and placement of different advertisements is to simply rank by the CPC bid and give the best or most prominent placement to the advertiser bidding the highest amount. For instance, Hotel A may “bid” or agree to pay the search engine $1.00 for each user that accesses its information as a result of its advertisement appearing with the search results of a given query while Hotel B may “bid” or agree to pay the search engine $1.50 for each user that accesses its information upon its advertisement appearing with the query results. In this instance, Hotel B would “win” the bid and, accordingly, its advertisement would be placed in a more prominent position on the web page on which the results of a search initiated by a query that exactly or partially matches the bid terms are displayed.
Another monetization method that search engines may use to determine the selection and placement of advertisements as the result of a particular search query is to take the product of the advertiser's CPC bid and the probability that a user will access the information associated with the advertisement. This probability is typically determined based on historical information regarding an advertisement's click-through rate (CTR), which is the rate at which users have clicked on a particular advertisement when presented. The most prominent placement is provided to the advertiser having the highest product (CPC bid×CTR). In this way, the search engine provider can attempt to maximize its expected profit.
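By way of illustration only, the two monetization methods described above can be sketched in Python as follows. The CPC bids are the Hotel A and Hotel B figures given above; the CTR values, the `Ad` class, and the function names are hypothetical and chosen only to show that the two methods may order the same advertisements differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    advertiser: str
    cpc_bid: float  # dollars the advertiser pays per click
    ctr: float      # historical click-through rate in [0, 1] (hypothetical values below)


def rank_by_bid(ads):
    """Order advertisements by CPC bid alone, highest bid first."""
    return sorted(ads, key=lambda ad: ad.cpc_bid, reverse=True)


def rank_by_expected_revenue(ads):
    """Order advertisements by the product CPC bid x CTR, highest first."""
    return sorted(ads, key=lambda ad: ad.cpc_bid * ad.ctr, reverse=True)


# Bids from the hotel example; the CTRs are assumed for illustration.
ads = [Ad("Hotel A", 1.00, 0.08), Ad("Hotel B", 1.50, 0.04)]
```

Under these assumed CTRs, Hotel B wins when ranking by bid alone, while Hotel A wins when ranking by the product of bid and CTR, since its expected revenue per impression is higher.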
The selection of advertisements based on CPC bids, CTRs, and/or other monetization factors, however, often results in irrelevant advertisements being returned with search results. For example, if an advertiser's landing page is about children's books and the advertiser bids on the bid term “children,” it is possible that the advertisement would be returned for all search queries that include the term “children.” This may often result in the advertisement being presented for search queries for which the advertisement is irrelevant, such as “orphaned children” and “children medical conditions.” Showing irrelevant advertisements for search queries hurts a search engine provider's revenue as the irrelevant advertisements are not likely to be selected. Additionally, providing irrelevant advertisements hurts the brand-name for the search engine, as advertisers are dissatisfied when their advertisements are irrelevant to the search queries for which they are returned. In particular, users are not only less likely to click on an irrelevant advertisement but are also less likely to purchase a product or otherwise complete an action when an irrelevant advertisement is selected by a user. As such, advertisers are likely to enter lower bids to search engines providing irrelevant advertisements.
Some approaches have been taken to check the relevance of bid terms for submitted advertisements (and their associated landing pages) at the time of their submission to the search engine provider as an attempt to provide relevant advertisements. In particular, the landing page is analyzed to determine whether the bid terms the advertiser selected are relevant to the landing page. If it is determined that the advertiser has bid on irrelevant terms, the bid terms may be removed from the advertisement and/or the search engine may refuse to use the advertisement. However, verifying the relevance of bid terms for a given advertisement does not ensure that relevant advertisements will be selected for a given search query. For instance, in the above example, the bid term “children” would be determined to be relevant to the advertisement relating to children's books. Accordingly, the advertisement could still be returned for search queries, such as “orphaned children” and “children medical conditions,” despite the irrelevance of the advertisement to the search queries.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Embodiments of the present invention relate to verifying the relevance of advertisements for search queries received at a search engine. In particular, the relevance of an advertisement for a given search query is determined by comparing the content of a landing page associated with the advertisement against search results for the search query. Advertisement relevance for a given search query is used to select and/or rank advertisements to return in conjunction with search results for the search query.
In some embodiments, advertisement relevance for a given search query is used to identify irrelevant advertisements and remove the irrelevant advertisements from consideration. In some embodiments, irrelevant and relevant advertisements are determined by comparing a relevance score for each advertisement against a relevance threshold. Accordingly, after removing irrelevant advertisements from consideration, an auction process that considers only relevant advertisements proceeds using monetization factors (such as CPC bids and CTRs) to select and rank advertisements. In other embodiments, advertisement relevance for a given search query is used during the auction process in conjunction with monetization factors to select and order (e.g., rank) advertisements to return for the search query. In further embodiments, advertisement relevance for a given search query is used both to filter irrelevant advertisements from consideration before the auction process and to select and rank advertisements in conjunction with monetization factors during the auction process. In still further embodiments, an auction process is conducted to provide a set of candidate advertisements, which are then filtered based on relevance to produce a set of advertisements to return for the search query.
The present invention is described in detail below with reference to the attached drawing figures, wherein:
The subject matter of the present invention is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.
As indicated previously, embodiments of the present invention provide online relevance verification to present relevant advertisements in response to search queries. Accordingly, in one aspect, an embodiment of the invention is directed to a computerized method for providing advertisements in response to a search query. The method includes receiving the search query. The method also includes determining a relevance score for an advertisement and the search query by comparing content of a landing page associated with the advertisement against search results for the search query. The method further includes selecting one or more advertisements based at least in part on the relevance score and at least one monetization factor. The method still further includes communicating the advertisements for presentation.
In another embodiment, an aspect of the invention is directed to one or more computer-readable media embodying computer-useable instructions for performing a method of providing a set of advertisements for a given search query. The method includes accessing information regarding a set of search results for the given search query and accessing information regarding content of a landing page associated with an advertisement. The method also includes calculating a relevance score indicative of the relevancy of the advertisement for the given search query by comparing the information regarding the set of search results against the information regarding the content of the landing page. The method further includes, in response to receiving a search request from a user including the given search query or an equivalent thereof, selecting a set of advertisements based at least in part on the relevance score for the advertisement and at least one monetization factor. The method still further includes communicating the set of advertisements for presentation.
In yet a further aspect of the invention, an embodiment is directed to one or more computer-readable media embodying computer-useable instructions for performing a method of providing a set of advertisements for a search query. The method includes receiving the search query. The method also includes performing an auction based on at least one monetization factor to identify a set of candidate advertisements for the search query. The method further includes determining a relevance score for at least one candidate advertisement by comparing content from a landing page associated with the at least one candidate advertisement against search results for the search query. The method still further includes selecting a set of advertisements by removing one or more of the candidate advertisements from the set of candidate advertisements based on a relevance score and communicating at least a portion of the set of advertisements for presentation.
Having briefly described an overview of the present invention, an exemplary operating environment in which various aspects of the present invention may be implemented is described below in order to provide a general context for various aspects of the present invention. Referring initially to
The invention may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program modules, including routines, programs, objects, components, data structures, etc., refer to code that performs particular tasks or implements particular abstract data types. The invention may be practiced in a variety of system configurations, including hand-held devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. The invention may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network.
With reference to
Computing device 100 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computing device 100 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 100. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
Memory 112 includes computer-storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, nonremovable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical-disc drives, etc. Computing device 100 includes one or more processors that read data from various entities such as memory 112 or I/O components 120. Presentation component(s) 116 present data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, etc.
I/O ports 118 allow computing device 100 to be logically coupled to other devices including I/O components 120, some of which may be built in. Illustrative components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, etc.
Referring now to
Among other components not shown, the system 200 may include a search engine server 202, an advertisement server 204, a source device 206, an advertiser server 208, and a user device 210. Each of the components shown in
Source devices, such as the source device 206, may maintain a variety of content such as web pages. For example, the source device 206 may be a web server that maintains multiple web pages. The search engine server 202 may access web page information by communicating with these source devices. For example, the search engine server 202 may periodically crawl the source device 206 to access web page information and/or index the information.
By accessing and/or indexing web page information from various source devices, the search engine server 202 may provide search capabilities to user devices, such as the user device 210. In particular, a user may employ a web browser 214 or other mechanism on the user device 210 to communicate with the search engine server 202. For instance, a user may issue a search query to the search engine server 202 and receive search results. The search query may comprise one or more search terms, and the search engine server 202 attempts to provide search results that are relevant to those search terms.
In embodiments of the present invention, advertisements are also selected based on the search query and returned to the user device 210 with the search results. Each advertisement may be provided by an advertiser and associated with a landing page. For instance, an advertiser may maintain an advertiser server 208, which includes a landing page 216 associated with one or more advertisements for the advertiser.
Advertisements to return for search queries may be selected by an advertisement server 204 and presented to the user via the user device 210 in hyperlink form, allowing user interaction with the advertisements. As such, a user may select an advertisement and be directed to a landing page associated with the advertisement, such as the landing page 216 located at the advertiser server 208. In embodiments of the invention, relevance of advertisements for given search queries is checked to reduce and/or prevent the presentation of irrelevant advertisements. In particular, advertisements are not selected based on monetization factors alone (such as CPC bids and CTRs) but are also selected based on relevance of the advertisements for search queries. The relevance of an advertisement for a given search query is determined based on a comparison of the content of a landing page for an advertisement (e.g., landing page 216) against search results for the given search query. In some embodiments, if an advertisement is determined to be irrelevant, the advertisement will be rejected for the search query. Accordingly, an auction may be performed for the search query without consideration of the irrelevant advertisement. In other embodiments, an auction for a search query may be performed using relevance scores for advertisements in conjunction with monetization factors, such as CPC bids and CTRs, with or without initially filtering advertisements based on relevance. In further embodiments, auction results may be filtered based on relevance to provide a set of advertisements to return to the user device 210.
Turning now to
As shown at block 306, a set of advertisements are selected based at least in part on the relevance of the advertisements for the search query. Relevance may be used in a number of different manners to select advertisements for a search query in various embodiments of the invention. In one embodiment, relevance scores may be used to remove irrelevant advertisements from consideration before a final auction using monetization factors is conducted to select the advertisements to return to the requesting device. For instance, as shown in
Returning again to
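$$\text{Rank} = (\varepsilon + \text{Bid})^{1 - \sum_i k_i} \cdot (\varepsilon_1 + \text{PClick})^{k_1} \cdot (\varepsilon_2 + \text{RVScore})^{k_2} \cdots$$
Wherein Bid represents CPC bid value, PClick represents CTR, RVScore represents relevance score, and for every i, $\varepsilon_i$ is any number, $k_i \in [0, 1]$ and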
In a further embodiment, advertisements may be selected at block 306 by using relevance scores to remove irrelevant advertisements before an auction process is conducted and during the auction process to rank advertisements. Accordingly, in the present embodiment, irrelevant advertisements are identified based on a comparison of relevance scores against a relevance threshold. The irrelevant advertisements are filtered and an auction is conducted for the relevant advertisements in which the relevant advertisements are ranked based on both monetization and relevance. Any and all such variations are contemplated to be within the scope of embodiments of the present invention.
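By way of example only, the combination of threshold filtering and relevance-aware ranking might be sketched in Python as follows. The sketch assumes the ranking expression has only the three factors named above (Bid, PClick, and RVScore), with the exponent on the bid term equal to 1 − k1 − k2, and it assumes that k1 + k2 does not exceed 1 so that the bid exponent is non-negative. The default values of k1, k2, and the ε offsets, as well as the function names and dictionary keys, are hypothetical.

```python
def rank_score(bid, p_click, rv_score, k1=0.3, k2=0.3, eps=0.0, eps1=0.0, eps2=0.0):
    """Weighted geometric combination of bid, click probability, and relevance.

    Assumes three factors only and k1 + k2 <= 1 (an assumption of this sketch).
    """
    if not (0.0 <= k1 <= 1.0 and 0.0 <= k2 <= 1.0 and k1 + k2 <= 1.0):
        raise ValueError("k1 and k2 must lie in [0, 1] and sum to at most 1")
    return (
        (eps + bid) ** (1.0 - k1 - k2)
        * (eps1 + p_click) ** k1
        * (eps2 + rv_score) ** k2
    )


def filter_then_rank(ads, threshold, **weights):
    """Drop ads scoring below the relevance threshold, then rank the rest."""
    relevant = [ad for ad in ads if ad["rv_score"] >= threshold]
    return sorted(
        relevant,
        key=lambda ad: rank_score(ad["bid"], ad["p_click"], ad["rv_score"], **weights),
        reverse=True,
    )
```

With k1 = k2 = 0 the rank reduces to the bid alone, and raising k2 shifts weight from the bid toward the relevance score.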
The selected advertisements are returned to the requesting device (e.g., the user device 210 of
Turning to
As shown at block 508, the set of candidate advertisements is filtered based on the relevance scores to identify a set of advertisements. In particular, relevance scores for the candidate advertisements are compared against a relevance threshold. Those candidate advertisements having a relevance score below the relevance threshold are deemed to be irrelevant and are removed from the set of candidate advertisements.
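A minimal Python sketch of this auction-then-filter ordering is given below, by way of example only. It assumes the auction ranks by the product of CPC bid and CTR and keeps a fixed number of candidates; the `relevance_of` callable, the dictionary keys, and the parameter names are hypothetical.

```python
def auction_then_filter(ads, relevance_of, n_candidates, threshold):
    """Run a monetization-only auction, then drop irrelevant candidates.

    relevance_of is called only for the auction winners, so relevance
    scores need not be computed for every advertisement.
    """
    candidates = sorted(
        ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True
    )[:n_candidates]
    # Candidates scoring below the threshold are deemed irrelevant.
    return [ad for ad in candidates if relevance_of(ad) >= threshold]
```

Because filtering happens after the auction, the returned set may contain fewer than `n_candidates` advertisements.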
The set of advertisements (or at least a portion thereof) is then returned to the requesting device (e.g., the user device 210 of
Referring now to
As shown at block 606, a word score is determined for each word from the collection of words for the landing page. In an embodiment, the word score comprises a term frequency-inverse document frequency (TFIDF) score. Although other forms of TFIDF scores may be employed in various embodiments of the invention, the following equation is used in an embodiment to calculate the word score for a given word:
$$\text{WordScore} = \left(TF \cdot \log\frac{N}{DF}\right)^{k}$$
Wherein TF represents the frequency that the word appears in the document; N is the number of documents in a document corpus; DF is the number of documents in the document corpus that contain the word; and k is a coefficient determined experimentally for enhancing the result quality. Experimentation has shown that good results may be achieved using a k between 0.2 and 0.7, and preferably using k=0.45.
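By way of example only, the word score calculation might be sketched in Python as follows. The sketch assumes that k is applied as an exponent to the TFIDF product, that TF is a raw occurrence count, and that the logarithm is the natural logarithm; none of these choices is fixed by the description above. The function name and the `doc_freq` mapping are hypothetical.

```python
import math
from collections import Counter


def word_scores(words, doc_freq, n_docs, k=0.45):
    """Return {word: (TF * log(N / DF)) ** k} for the words of one document.

    words    -- list of words from the landing page or search results
    doc_freq -- mapping word -> number of corpus documents containing it
    n_docs   -- number of documents in the corpus (N)
    """
    scores = {}
    for word, tf in Counter(words).items():
        df = doc_freq.get(word, 0)
        if df <= 0:
            continue  # assumption: words absent from the corpus are skipped
        # Clamp at zero so a fractional exponent is never applied to a negative.
        idf = max(math.log(n_docs / df), 0.0)
        scores[word] = (tf * idf) ** k
    return scores
```

An exponent below 1 compresses the range of scores, so that a few very frequent words do not dominate the word-bag.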
As shown at block 608, a collection of words is also accessed for the search query. In particular, a search is performed using the search query, and the collection of words is gathered from the search results. In an embodiment, the words are gathered from the top N (e.g., top 100) search results. In some embodiments, the words are gathered from the content of documents associated with the search results (e.g., by crawling the documents). In other embodiments, the words are collected from search result snippets. Experimentation has shown that using search result snippets reduces processing time while providing accurate results. Information from search results for the search may have been previously gathered such that the collection of words is available in a data store from which it may be accessed. At block 610, a word score is determined for each word in the collection of words for the search query. The word score may be calculated based on a TFIDF score such as that described above for the landing page.
Using the word scores determined for the collection of words for both the landing page and the search query, a relevance score for the landing page and search query pair is determined, as shown at block 612. In various embodiments of the invention, the relevance score may be calculated using all words or only a portion of words for a search query and landing page. For instance, in one embodiment, the top N words (e.g., top 100 words) based on their TFIDF score are selected for the landing page and designated as dominant words for the landing page. Similarly, the top N words (e.g., top 100 words) are selected for the search query and designated as dominant words for the search query. In such an embodiment, the relevance score is determined based on the dominant words for the landing page and the search query. In some embodiments of the invention, dominant word stores are maintained for search queries and landing pages. The dominant word stores store the dominant words and word scores for various search queries and landing pages. Accordingly, because relevance scores may need to be calculated for various search query and landing page pairs, instead of accessing a collection of words and determining a word score of each word for the landing pages and search queries (as in blocks 604, 606, 608, and 610), the dominant word stores may be accessed to obtain word scores for a given landing page and search query pair for use in calculating relevance.
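By way of example only, selecting dominant words from a word-bag might be sketched in Python as follows. The function name is hypothetical, and a plain dictionary stands in for whatever form a dominant word store takes.

```python
import heapq


def dominant_words(scores, n=100):
    """Keep the n highest-scoring words of a {word: score} word-bag."""
    return dict(heapq.nlargest(n, scores.items(), key=lambda item: item[1]))


# A dominant word store could then map each query or landing page
# identifier to its reduced word-bag (hypothetical structure).
dominant_store = {}
dominant_store["example query"] = dominant_words({"a": 3.0, "b": 1.0, "c": 2.0}, n=2)
```

Truncating each word-bag to its top N entries bounds the cost of every later comparison between a search query and a landing page.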
By way of example only and not limitation, the relevance score for a given landing page and search query pair may be determined by the following process. First, a vector distance is calculated based on the vector of words for the landing page and search query. In an embodiment, the vector distance is calculated using a cosine similarity for the collection of words according to the following equation:
Wherein A represents a word-bag containing word scores for the collection of words for the search query and B represents a word-bag containing word scores for the collection of words for the landing page.
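For reference, the standard cosine similarity of two vectors is their dot product divided by the product of their Euclidean norms, $\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$. A minimal Python sketch applying that standard definition to two word-bags is given below, by way of example only. Representing each word-bag as a dictionary from word to word score, and returning 0.0 for an empty word-bag, are assumptions of this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two word-bags given as {word: score} dicts."""
    # Only words present in both bags contribute to the dot product.
    dot = sum(score * b[word] for word, score in a.items() if word in b)
    norm_a = math.sqrt(sum(score * score for score in a.values()))
    norm_b = math.sqrt(sum(score * score for score in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty or all-zero bag matches nothing
    return dot / (norm_a * norm_b)
```

Word-bags that share no words score 0, and word-bags whose scores are proportional to one another score 1, regardless of the absolute size of the scores.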
The vector distance is then converted to a linear value using, for instance, the following equation:
A relevance score for the search query and landing page pair is next determined by converting to a signal function using, for instance, the following equation:
Wherein R represents the relevance score, N is a system value (typically 1000), and x is the linear value of the vector distance such as that shown hereinabove.
Having described the overall process of selecting relevant advertisements for search queries, a specific system architecture 700 and process for implementing one embodiment of the invention will now be described with reference to
The search query and advertisements determined based on monetization are provided to an online store 708, which facilitates online relevance verification for the advertisements. Typically, an identifier for each advertisement may be provided. In some cases, because the advertisements are associated with a landing page, an identifier for a landing page (e.g., a URL) may be provided instead of or in addition to its corresponding advertisement. Also, in some embodiments, only the top N (e.g., top 100) advertisements are passed to the online store.
As shown in
The online relevance handler 710 queries the query data store 712 to determine if it contains the current search query. If the current search query is not stored in the query data store, the online relevance handler 710 returns default relevance scores to the delivery engine 706. Alternatively, if the current search query is stored in the query data store 712, the online relevance handler receives a word-bag comprising the collection of words and their corresponding word scores for the current search query.
Similar to the query data store 712, the advertisement/landing page data store 714 includes information associated with a number of advertisements and/or landing pages. For each stored advertisement/landing page, the advertisement/landing page data store 714 includes a collection of words and corresponding word scores. The collection of words is collected from the content of the landing page. In some cases, the advertisement/landing page data store includes the top N (e.g., top 100) words for a given advertisement/landing page which have been designated as dominant words for the advertisement/landing page.
The online relevance handler 710 queries the advertisement/landing page data store 714 for each of the advertisements/landing pages identified by the delivery engine 706. Some advertisements/landing pages may not be stored in the advertisement/landing page data store 714. For such advertisements/landing pages, a default relevance score may be assigned and returned to the delivery engine. For each advertisement/landing page stored in the advertisement/landing page data store 714, a word-bag having a collection of words and their corresponding word scores is retrieved.
After retrieving information from the query data store 712 and the advertisement/landing page data store 714, the online relevance handler 710 calculates a relevance score for each of the advertisements/landing pages for which a word-bag was available from the advertisement/landing page data store 714. The relevance scores may be determined based on a matching algorithm, for instance, such as that described with reference to
The relevance scores for the identified advertisements/landing pages are returned from the online relevance handler 710 to the delivery engine 706. The relevance scores may include both calculated relevance scores for those advertisements/landing pages for which data was available and default relevance scores for those advertisements/landing pages for which data was unavailable. The delivery engine 706 then selects and orders advertisements for return to the user device 704. As described previously, in some embodiments, irrelevant advertisements are filtered (e.g., based on a relevance threshold and the advertisements' relevance scores) and an auction is conducted using the relevant advertisements. In some cases, an auction does not need to be performed as the results of the auction previously performed by the delivery engine 706 are merely filtered based on the relevance scores. In other embodiments, the relevance scores may be used in conjunction with monetization factors to rank the advertisements without first filtering irrelevant advertisements. In further embodiments, relevance scores may be used to both filter advertisements and calculate ranking in conjunction with monetization factors. Any and all such variations are contemplated to be within the scope of embodiments of the present invention.
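By way of example only, the lookup-with-defaults behavior of the online relevance handler might be sketched in Python as follows. Plain dictionaries stand in for the query data store and the advertisement/landing page data store, the matching algorithm is passed in as `score_fn`, and the default value of 0.5 is hypothetical; no particular default is specified above.

```python
DEFAULT_RELEVANCE = 0.5  # hypothetical default relevance score


def relevance_scores(query, ad_ids, query_store, ad_store, score_fn,
                     default=DEFAULT_RELEVANCE):
    """Return {ad_id: relevance score}, using the default where data is missing.

    query_store -- mapping query -> word-bag ({word: score})
    ad_store    -- mapping ad/landing page id -> word-bag
    score_fn    -- callable(query_bag, ad_bag) -> relevance score
    """
    query_bag = query_store.get(query)
    if query_bag is None:
        # Unknown query: every advertisement receives the default score.
        return {ad_id: default for ad_id in ad_ids}
    scores = {}
    for ad_id in ad_ids:
        ad_bag = ad_store.get(ad_id)
        scores[ad_id] = default if ad_bag is None else score_fn(query_bag, ad_bag)
    return scores
```

The choice of default matters when filtering follows: a default below the relevance threshold suppresses advertisements that have not yet been crawled, while a default above it lets them through.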
As indicated previously, in some cases, the query data store 712 may not contain a given search query or the advertisement/landing page data store 714 may not contain a given advertisement/landing page. In such cases, a message is sent to a dynamic priority queue 716 for the missing query or advertisement/landing page. The dynamic priority queue 716 is responsible for managing the priority of landing pages and search queries that need to be crawled using a crawler 718. Generally, the dynamic priority queue 716 will manage a list of landing pages and search queries that are sorted by number of hits (i.e., the number of times the dynamic priority queue has been requested to crawl a landing page or search query). If the requested query or advertisement/landing page is not currently in the queue, it is added to the queue. Alternatively, if the query or advertisement/landing page is currently in the queue, its priority may be adjusted by the request (i.e., the number of hits is incremented based on the request).
The dynamic priority queue 716 sends the top record to be crawled when the crawler 718 becomes available. Results for a given search query or advertisement/landing page will include a collection of words for which a word score is calculated for each word. The information is then stored in the appropriate location (i.e., query data store 712 or the advertisement/landing page data store). Accordingly, the information is available for subsequent search queries for use in determining relevance scores.
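By way of example only, a queue that prioritizes crawl requests by hit count might be sketched in Python as follows. The class and method names are hypothetical, and the linear scan in `pop_top` is chosen for brevity; a heap with lazy invalidation would likely be preferable at scale.

```python
class DynamicPriorityQueue:
    """Crawl queue ordered by how many times each item has been requested."""

    def __init__(self):
        self._hits = {}  # item -> number of requests received so far

    def request(self, item):
        """Add the item with one hit, or increment its hits if already queued."""
        self._hits[item] = self._hits.get(item, 0) + 1

    def pop_top(self):
        """Remove and return the most-requested item, or None if empty."""
        if not self._hits:
            return None
        top = max(self._hits, key=self._hits.get)
        del self._hits[top]
        return top

    def __len__(self):
        return len(self._hits)
```

In use, each lookup miss would call `request` with the missing query or landing page identifier, and the crawler would call `pop_top` whenever it becomes available.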
As can be understood, embodiments of the present invention provide relevant advertisements in response to a search query. Advertisement relevance is determined by comparing content from a landing page associated with a given advertisement against search results for a given search query. Relevance is used before, during, and/or after an auction to filter and/or rank advertisements to return for a search query.
The present invention has been described in relation to particular embodiments, which are intended in all respects to be illustrative rather than restrictive. Alternative embodiments will become apparent to those of ordinary skill in the art to which the present invention pertains without departing from its scope.
From the foregoing, it will be seen that this invention is one well adapted to attain all the ends and objects set forth above, together with other advantages which are obvious and inherent to the system and method. It will be understood that certain features and subcombinations are of utility and may be employed without reference to other features and subcombinations. This is contemplated by and is within the scope of the claims.