In response to a search query, an online search engine may provide sponsored search results in the form of online advertisements along with general web search results. The online advertisements may be displayed in order according to their estimated click-through rates and the advertising fees paid by the advertisers. When a user clicks on an advertisement, the advertiser may pay the search engine provider a fee for the click. This revenue model is referred to as the pay-per-click model. Generally speaking, the pay-per-click model is based on the assumption that advertisement clicks are very important to both search engine providers and advertisers. For example, the clicks on advertisements provide revenue for the search engine provider, and for advertisers, the clicks on advertisements mean potential customers and purchases.
Described herein are techniques for determining the attractiveness of an online advertisement to users, and predicting a user click probability by taking into account both the relevance of the online advertisement to a user search query and the attractiveness of the online advertisement.
The relevance between a search query and an online advertisement may be one of the important factors in explaining user advertisement click behaviors. However, relevance is not the only factor in determining whether a user will click on an online advertisement. In some instances, an online advertisement that is well matched to a query may have a lower click-through rate and fewer clicks than another online advertisement that does not match the query as well. An additional factor that affects whether a user will click on an online advertisement may be the attractiveness of the online advertisement to the user. The attractiveness of an online advertisement may be contingent upon the ability of the words in the online advertisement to attract the attention of users. The techniques described herein may provide a way to quantify the attractiveness of an online advertisement, and predict a probability that a user may click on the online advertisement based on the attractiveness of the advertisement in conjunction with the relevance of the online advertisement to a search query.
In at least one embodiment, an advertisement attractiveness model for estimating an attractiveness of an online advertisement to users may be developed. A click behavior model is then created by combining the advertisement attractiveness model with a relevance model. The relevance model may be used for estimating relevance between the online advertisement and a search query. The click behavior model may be applied to features extracted from the online advertisement to calculate a click probability for the online advertisement.
This Summary is provided to introduce a selection of concepts in a simplified form that is further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference number in different figures indicates similar or identical items.
The embodiments described herein pertain to techniques for determining the attractiveness of an online advertisement to users, and predicting a user click probability by taking into account both the relevance of the online advertisement to a user search query and the attractiveness of the online advertisement.
The relevance between a search query and an advertisement may be one of the important factors in explaining user advertisement click behaviors. However, relevance is not the only factor in determining whether a user will click on an advertisement. An additional factor that affects whether a user will click on an online advertisement may be the attractiveness of the online advertisement to the user. The attractiveness of an online advertisement may be contingent upon the ability of the words in the online advertisement to attract the attention of a user.
In various embodiments, the attractiveness of an online advertisement may be quantified using an advertisement attractiveness model. The advertisement attractiveness model may be developed from a word-level attractiveness model that measures the attractiveness of individual words in the online advertisement. Further, the probability that the online advertisement may be clicked on by a user may be quantified using a click behavior model that is developed based on the advertisement attractiveness model and a relevance model. The relevance model may quantify the relevance between the online advertisement and a search query submitted by the user.
Accordingly, applying the models to an online advertisement may produce word-level attractiveness scores that measure the attractiveness of words in the online advertisement to users. The application may further produce an advertisement attractiveness score that measures the overall attractiveness of the online advertisement to users. The application may additionally produce a click probability that measures the likelihood that the user will click on the online advertisement given the attractiveness of the online advertisement and the relevance of the online advertisement to a search query of the user.
The scores that are produced by the techniques described herein may be used by the online advertisers to gauge the effectiveness of their online advertisements in attracting user attention. Accordingly, rather than simply improving the relevance of their online advertisement to user search queries, the online advertisers may alternatively or concurrently improve the content attractiveness of their online advertisements to increase the number of user clicks on their online advertisements. Various examples of techniques for implementing attractiveness-based online advertisement click prediction in accordance with the embodiments are described below with reference to
The analysis of the online advertisement 106 may enable the user click inference engine 102 to generate a user click probability 112 for the online advertisement 106. The user click probability 112 may be generated based on the attractiveness of the words in the online advertisement 106 and the relevance of the online advertisement 106 to the search query 110. The user click probability 112 may represent the likelihood that a user may click on the online advertisement 106 when the online advertisement 106 is displayed as a sponsored search result with the list of search results 108.
In addition to the user click probability 112, the user click inference engine 102 may also provide word attractiveness scores 114 and an advertisement attractiveness score 116 for the online advertisement 106. Each of the word attractiveness scores 114 may quantify the appeal of a corresponding word in the online advertisement 106 to users. The advertisement attractiveness score 116 may quantify the overall appeal of the online advertisement 106 to users.
In operation, the user click inference engine 102 may extract a set of attractiveness features 118 from each word in the online advertisement 106. The extracted attractiveness features for a word may include two types of features. The first type of features may be textual features, such as the position of the word in an online advertisement, the length of the word, the part of speech (POS) of the word, and so forth. The second type of features for each word may be features that are extracted from the online advertisement 106 based on a historic record of user impressions and clicks, which may represent prior user preferences on words in online advertisements.
The user click inference engine 102 may also extract a set of relevance features 120 that quantify the relevance of the online advertisement 106 to the search query 110. The extracted relevance features 120 may include features that are visible to users, such as word frequency, inverse document frequency, topical page rank, and/or so forth, which are extracted by using the query words of a search query and content of the online advertisement 106. In some embodiments, the extracted relevance features 120 may exclude features that are invisible to users, such as bid keywords and/or content of an advertisement landing page that displays the online advertisement 106.
The user click inference engine 102 may generate the user click probability 112 for the online advertisement 106 using a click behavior model 122. In various embodiments, the click behavior model 122 may be developed from a relevance model 124 and an advertisement attractiveness model 126. In turn, the advertisement attractiveness model 126 may be derived from a word-level attractiveness model 128. The user click inference engine 102 may further use the word-level attractiveness model 128 to generate a word attractiveness score 114 for each word in the online advertisement 106 based on corresponding attractiveness features. For example, words such as “free”, “save”, “deal”, and “affordable” may be correlated with high word attractiveness scores. Likewise, the user click inference engine 102 may use the advertisement attractiveness model 126 to generate the advertisement attractiveness score 116 for the online advertisement 106 based on the attractiveness features 118.
The computing device 104 may include one or more processors 202, memory 204, and/or user controls that enable a user to interact with the electronic device. The memory 204 may be implemented using computer readable media, such as computer storage media. Computer-readable media includes, at least, two types of computer-readable media, namely computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, or other transmission mechanism. As defined herein, computer storage media does not include communication media. The computing device 104 may have network capabilities. For example, the computing device 104 may exchange data with other electronic devices (e.g., laptop computers, servers, etc.) via one or more networks, such as the Internet.
The one or more processors 202 and the memory 204 of the computing device 104 may implement components of the user click inference engine 102. The user click inference engine 102 may include a relevance module 206, an attractiveness module 208, a click behavior module 210, a training module 212, a relevance feature extraction module 214, an attractiveness feature extraction module 216, and a user interface module 218. The memory 204 may also implement a data store 220.
In various embodiments, the user click inference engine 102 may use a factor graph to model user click behavior based on relevance and attractiveness factors. The high-level dependency between user clicks and the relevance and attractiveness factors may be expressed by the factor graph 222. As shown in the factor graph 222, $f_c$ is $\mathcal{N}(w_{c,1} r + w_{c,2} a,\ \beta_c)$, and $\Phi$ may be a logistic function. Further, node $c$ may represent whether an advertisement is clicked ($c=1$) or not ($c=0$).
Accordingly, the click probability, $p(c=1)$, based on the relevance and attractiveness factors may be defined using a logistic function:
$p(c=1 \mid s) = \dfrac{1}{1 + e^{-s}}$, (1)
in which $s$ is the click score, and a larger click score may mean that the advertisement is more likely to be clicked by users. Correspondingly, the non-click probability, $p(c=0)$, may be defined as $p(c=0 \mid s) = 1 - \dfrac{1}{1 + e^{-s}}$.
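The logistic mapping from a click score to click and non-click probabilities can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the described embodiments; the branch on the sign of the score is only a standard way to avoid floating-point overflow.

```python
import math


def click_probability(s: float) -> float:
    """Return p(c=1 | s) = 1 / (1 + exp(-s)) for a click score s."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    # For negative scores, use the algebraically equivalent form
    # exp(s) / (1 + exp(s)) so that exp() never receives a large argument.
    z = math.exp(s)
    return z / (1.0 + z)


def non_click_probability(s: float) -> float:
    """Return p(c=0 | s) = 1 - p(c=1 | s)."""
    return 1.0 - click_probability(s)
```

A score of zero maps to a click probability of 0.5, and larger scores map to probabilities closer to 1.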
As further shown in the factor graph 222, score $s$ may depend on the relevance score $r$ of an advertisement to the query and the attractiveness score $a$ of the online advertisement. Accordingly, the probability $p(s \mid r, a, w_c)$ may be defined using a Gaussian distribution:
$s \mid r, a, w_c \sim \mathcal{N}(w_{c,1} r + w_{c,2} a,\ \beta_c)$, (2)
in which the mean of the Gaussian distribution may be the linear combination of the relevance score and the attractiveness score using a two-dimensional weight vector $w_c$. The vector $w_c$ may represent the tradeoffs between the relevance and attractiveness factors in their contributions to the overall click score, and $\beta_c$ may represent a hyperparameter controlling precision of clicks, that is, the variance of the Gaussian distribution. Additionally, the weight vector $w_c$ may be assumed to have a Gaussian prior:
$w_c \sim \mathcal{N}(\mu_c, \sigma_c)$ (3)
As such, given $r$ and $a$, the click probability for an online advertisement may be estimated as follows:
$p(c \mid r, a) = \iint p(c \mid s)\, p(s \mid r, a, w_c)\, p(w_c)\, dw_c\, ds$ (4)
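Because the logistic function is integrated against Gaussian densities, equation (4) generally has no simple closed form. One way to approximate it is plain Monte Carlo sampling, sketched below in Python. This is an illustration rather than the inference method of the embodiments. It assumes that the two components of the weight vector are independent, and that the prior spread and the click hyperparameter are given as variances. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Sequence


def _sigmoid(s: float) -> float:
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    z = math.exp(s)
    return z / (1.0 + z)


def estimate_click_probability(
    r: float,
    a: float,
    mu_c: Sequence[float],
    sigma_c: Sequence[float],
    beta_c: float,
    n_samples: int = 20000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of p(c=1 | r, a) as in equation (4).

    Assumptions (not stated in the source): sigma_c holds one variance per
    weight component, the components are independent, and beta_c is a variance.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw the weight vector from its Gaussian prior, equation (3).
        w1 = rng.gauss(mu_c[0], math.sqrt(sigma_c[0]))
        w2 = rng.gauss(mu_c[1], math.sqrt(sigma_c[1]))
        # Draw the click score given r, a and the weights, equation (2).
        s = rng.gauss(w1 * r + w2 * a, math.sqrt(beta_c))
        # Average the logistic click probability over the samples.
        total += _sigmoid(s)
    return total / n_samples
```

With positive prior means for both weights, raising either the relevance score or the attractiveness score raises the estimated click probability, which matches the tradeoff role of the weight vector described above.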
The relevance module 206 may use the relevance model 124 to estimate the relevance between an online advertisement and a search query inputted by a user. For example, the online advertisement may be the online advertisement 106, and the search query may be the search query 110. The relevance may be quantified by the relevance module 206 as a relevance score.
In various embodiments, the relevance model 124 may be a probabilistic model that is described by a factor graph 224, in which $N$ is $\mathcal{N}(w_r; \mu_r, \sigma_r)$, and $f_r$ is $\mathcal{N}(w_r, x_r, \beta_r)$. The probabilistic model may assume that there is a relevance score $r$ for each advertisement-query pair. Similar to the click score $s$ introduced earlier, $r$ may also be a Gaussian random variable:
$r \sim \mathcal{N}(w_r, x_r, \beta_r)$ (5)
in which $x_r$ may be the relevance features, $w_r$ may be a weight variable, and $\beta_r$ may be a hyperparameter controlling the precision of relevance. Further, $w_r$ may be assumed to be a Gaussian random variable: $w_r \sim \mathcal{N}(\mu_r, \sigma_r)$.
In various embodiments, the relevance features xr may include features that the users may see in a sponsored search, such as word frequency, inverse document frequency, topical page rank, and/or so forth, which are extracted by using the query words of a search query and the online advertisement. In other words, relevance features xr may exclude features that are invisible to users, such as bid keywords and/or content of an advertisement landing page.
Thus, given the relevance features $x_r$, the relevance model 124 may be used to obtain a joint probability of $r, w_r$ as follows:
$p(r, w_r \mid x_r) = p(r \mid w_r, x_r)\, p(w_r)$, (6)
in which $p(r \mid w_r, x_r)$ is $\mathcal{N}(w_r, x_r, \beta_r)$.
Further, if the prior of $w_r$ is known, the relevance model 124 may be used to estimate a probability of a relevance score for a query-advertisement pair as follows:
$p(r \mid x_r) = \int p(r, w_r \mid x_r)\, dw_r$ (7)
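Under some additional assumptions, the marginal in equation (7) is itself Gaussian and can be written down directly. The Python sketch below assumes that the mean of the relevance score is the inner product of the weights and the relevance features (the notation in the text does not spell this out), that the weights are independent with one prior variance each, and that the relevance hyperparameter is a variance. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def relevance_marginal(
    x_r: Sequence[float],
    mu_r: Sequence[float],
    sigma_r: Sequence[float],
    beta_r: float,
) -> Tuple[float, float]:
    """Return (mean, variance) of p(r | x_r) after integrating out w_r.

    Assumes r | w_r ~ N(w_r . x_r, beta_r) and independent Gaussian priors
    w_r[j] ~ N(mu_r[j], sigma_r[j]); both are illustrative assumptions.
    """
    # A linear function of Gaussian weights is Gaussian: the mean is the
    # prior mean projected onto the features.
    mean = sum(m * x for m, x in zip(mu_r, x_r))
    # The variance adds the weight uncertainty (scaled by squared features)
    # to the observation variance beta_r.
    variance = sum(v * x * x for v, x in zip(sigma_r, x_r)) + beta_r
    return mean, variance
```

The variance term shows why larger feature values make the relevance estimate less certain when the weights themselves are uncertain.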
The attractiveness module 208 may use an advertisement attractiveness model 126 to quantify the attractiveness of an online advertisement, such as the online advertisement 106. However, since the attractiveness of an online advertisement depends on the attractiveness of words that are in the online advertisement, the advertisement attractiveness model 126 may be defined based on the word-level attractiveness model 128. The word-level attractiveness model 128 may be used to generate an attractiveness score for each word in an online advertisement.
As shown in
ai:N(wa,xa
Further, as in the relevance model 124, $w_a$ may be a weight vector which has a Gaussian prior: $w_a \sim \mathcal{N}(\mu_a, \sigma_a)$.
In various embodiments, the attractiveness features 118 that are quantified by the attractiveness module 208 may include two types of features. The first type of features may be textual features, such as the position of each word in an online advertisement, the length of each word, the part of speech (POS) of each word, and so forth. Each word may be tagged using POS tags, such as a Noun tag, a Verb tag, an Adjective tag, an Adverb tag, an Unknown tag, and/or so forth.
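The first type of features can be illustrated with a short Python sketch that records the position, length, and POS tag of each word. The tiny lookup table stands in for a real part-of-speech tagger and is purely hypothetical, as are the class and function names. Words missing from the table receive the Unknown tag.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon; a real system would use a trained POS tagger.
TOY_POS_LEXICON = {
    "free": "Adjective",
    "affordable": "Adjective",
    "save": "Verb",
    "deal": "Noun",
    "now": "Adverb",
}


@dataclass
class WordTextualFeatures:
    word: str
    position: int  # zero-based index of the word in the advertisement
    length: int    # number of characters in the word
    pos_tag: str   # Noun, Verb, Adjective, Adverb, or Unknown


def extract_textual_features(ad_text: str) -> List[WordTextualFeatures]:
    """Extract position, length, and POS tag for each word of an advertisement."""
    features = []
    for position, token in enumerate(ad_text.lower().split()):
        # Strip surrounding punctuation so "now!" and "now" are the same word.
        word = token.strip(".,!?;:\"'")
        if not word:
            continue
        tag = TOY_POS_LEXICON.get(word, "Unknown")
        features.append(WordTextualFeatures(word, position, len(word), tag))
    return features
```

For instance, `extract_textual_features("Save now! Free shipping deal")` yields one record per word, with "shipping" tagged Unknown because it is absent from the toy lexicon.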
The second type of features for each word may be features that are extracted from an online advertisement based on a historic record of user impressions and clicks, which may represent prior user preferences on words in online advertisements provided by an advertisement platform. The advertisement platform may be an advertisement space provided by a specific search engine. The second type of features may include one or more of the following:
Accordingly, by using the attractiveness features, the word-level attractiveness model 128 may provide the joint probability of ai,wa given attractiveness features xa
p(ai,wa|xa
in which p(ai|wa,xa
Further, given that the prior of weight vector wa is known, the probability of an attractiveness score for a word may be estimated as follows:
p(ai|xa
The advertisement attractiveness model 126 may be defined based on the word-level attractiveness model 128. In defining the advertisement attractiveness model 126, the attractiveness score of an online advertisement may be assumed to be a Gaussian random variable. Further, the Gaussian random variable may take a sum of the attractiveness of the words in the online advertisement as its mean:
$a \mid \{a_i\}_{i=1}^{n} \sim \mathcal{N}\left(\sum_{i=1}^{n} a_i,\ \beta_a\right)$
in which $a$ is the attractiveness score of an online advertisement, $a_i$ is the attractiveness score of the $i$-th word in the online advertisement, $n$ is the number of words in the online advertisement, and $\beta_a$ is a hyperparameter controlling a precision of attractiveness.
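The way word-level scores roll up into an advertisement-level score can be sketched in Python. The sketch treats each word score as an independent Gaussian summarized by a mean and a variance, and treats the attractiveness hyperparameter as a variance. Both are simplifying assumptions: in the model described here the word scores share one weight vector and are therefore correlated. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def ad_attractiveness_marginal(
    word_means: Sequence[float],
    word_variances: Sequence[float],
    beta_a: float,
) -> Tuple[float, float]:
    """Return (mean, variance) of the advertisement attractiveness score.

    Assumes independent Gaussian word scores (a simplification) and that
    the advertisement score is Gaussian around the sum of the word scores
    with variance beta_a.
    """
    # The mean of the advertisement score is the sum of the word-score means.
    mean = sum(word_means)
    # Independent variances add, plus the advertisement-level variance.
    variance = sum(word_variances) + beta_a
    return mean, variance
```

Under this simplification, adding a word with a positive expected score raises the expected advertisement score, while every added word also contributes its own uncertainty.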
As shown in
p(a,{ai}i=1n,wa|xa)=p(a|{ai}i=1n)(Πi p(ai|wa,xa
in which xa={xa
$p(a \mid x_a) = \iint p(a, \{a_i\}_{i=1}^{n}, w_a \mid x_a)\, dw_a\, d\{a_i\}_{i=1}^{n}$ (16).
The click behavior module 210 may use a click behavior model 122 to perform user click behavior analysis. The click behavior model 122 may be generated based on the relevance model 124 and the advertisement attractiveness model 126. As shown in
$p(c \mid x_r, x_a) = \iint p(c \mid r, a)\, p(r \mid x_r)\, p(a \mid x_a)\, dr\, da$ (17)
in which $p(c \mid r, a)$ may be defined by equation (4), $p(r \mid x_r)$ by equation (7), and $p(a \mid x_a)$ by equation (16).
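The composition in equation (17) can be approximated by sampling a relevance score and an attractiveness score from their marginals and averaging the conditional click probability. The Python sketch below assumes both marginals are Gaussian and are supplied as (mean, variance) pairs, and it takes the conditional click probability as a caller-supplied function. The helper `toy_p_click` uses fixed weights and no noise, so it is only a stand-in for equation (4). All names are hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def click_probability_from_marginals(
    r_marginal: Tuple[float, float],
    a_marginal: Tuple[float, float],
    p_click_given_scores: Callable[[float, float], float],
    n_samples: int = 20000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of p(c=1 | x_r, x_a) as in equation (17)."""
    rng = random.Random(seed)
    r_mean, r_var = r_marginal
    a_mean, a_var = a_marginal
    total = 0.0
    for _ in range(n_samples):
        # Sample relevance and attractiveness scores from their marginals.
        r = rng.gauss(r_mean, math.sqrt(r_var))
        a = rng.gauss(a_mean, math.sqrt(a_var))
        # Average the conditional click probability over the samples.
        total += p_click_given_scores(r, a)
    return total / n_samples


def toy_p_click(r: float, a: float) -> float:
    """Illustrative stand-in for p(c=1 | r, a) with fixed weights 0.8 and 0.6."""
    return 1.0 / (1.0 + math.exp(-(0.8 * r + 0.6 * a)))
```

Keeping the conditional click probability as a parameter mirrors the modular structure of the model, in which the relevance and attractiveness parts can be replaced independently.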
In various embodiments, the click behavior model 122 may use two categories of parameters in order to perform user click behavior analysis. These two categories may include:
The parameters in category A may be manually set, and the parameters in category B may be learned from a set of training data. The parameters in category B may have a vector/matrix form whose dimension depends on the dimension of input features. A training module 212 may be used to learn the parameters in category B and facilitate the training of the click behavior model 122.
Thus, given a set of training examples (impression events represented by triples of {xr,xa,c}), the training module 212 may learn the parameters in category B by maximizing their likelihood. In each of the triples, xr may be a set of relevance features, xa may be a set of attractiveness features, and c may be a ground truth in binary format. For example, c=1 may represent that a corresponding online advertisement was clicked, and c=0 may represent that the corresponding online advertisement was not clicked. The training examples may be collected from sponsored search logs of a search engine for a predetermined time period.
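The training data and the likelihood objective can be made concrete with a small Python sketch. It represents each impression event as a triple and computes the Bernoulli log-likelihood of the observed clicks under any click-probability predictor. The type and function names are hypothetical, and the predictor is passed in as a function, so the sketch does not commit to how the model parameters are fit.

```python
import math
from typing import Callable, Iterable, NamedTuple, Sequence


class Impression(NamedTuple):
    x_r: Sequence[float]            # relevance features for the query-ad pair
    x_a: Sequence[Sequence[float]]  # one attractiveness feature vector per word
    c: int                          # ground truth: 1 if clicked, 0 otherwise


def log_likelihood(
    examples: Iterable[Impression],
    predict: Callable[[Sequence[float], Sequence[Sequence[float]]], float],
) -> float:
    """Sum of log p(c | x_r, x_a) over the impression events."""
    eps = 1e-12  # keeps log() finite when a predictor returns exactly 0 or 1
    total = 0.0
    for example in examples:
        p = min(max(predict(example.x_r, example.x_a), eps), 1.0 - eps)
        total += math.log(p) if example.c == 1 else math.log(1.0 - p)
    return total
```

A parameter setting that assigns higher probability to clicked impressions and lower probability to non-clicked ones yields a larger log-likelihood, which is the quantity the training seeks to maximize.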
In some embodiments, in order to perform the likelihood estimation in an efficient manner, the training module 212 may exploit an approximate message passing algorithm to train the click behavior model 122. The messages and marginals may be approximated by moment matching to a Gaussian distribution with the same mean and variance using expectation propagation. Such estimation may be achieved by minimizing a Kullback-Leibler divergence between the true and the approximated probabilities. In at least one embodiment, the training of the click behavior model 122 may be accomplished via a framework for running Bayesian inference in graphical models.
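A single moment-matching step can be sketched in Python for a one-dimensional case. The sketch takes a Gaussian belief over a score, multiplies it by the logistic click factor, and returns the mean and variance of the resulting non-Gaussian distribution. A Gaussian with those two moments is the one that minimizes the Kullback-Leibler divergence from the true distribution to the approximation. The numerical integration on a grid, the grid size, and the function names are illustrative choices, not details of the embodiments.

```python
import math
from typing import Tuple


def _sigmoid(s: float) -> float:
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    z = math.exp(s)
    return z / (1.0 + z)


def moment_match_click_factor(
    mean: float, variance: float, n_grid: int = 4001, width: float = 10.0
) -> Tuple[float, float]:
    """Match a Gaussian to N(s; mean, variance) * sigmoid(s) by its moments."""
    sd = math.sqrt(variance)
    lo = mean - width * sd
    step = (2.0 * width * sd) / (n_grid - 1)
    norm = first = second = 0.0
    for k in range(n_grid):
        s = lo + k * step
        # Unnormalized density of the tilted distribution at grid point s.
        weight = math.exp(-0.5 * (s - mean) ** 2 / variance) * _sigmoid(s)
        norm += weight
        first += weight * s
        second += weight * s * s
    matched_mean = first / norm
    matched_variance = second / norm - matched_mean ** 2
    return matched_mean, matched_variance
```

Observing a click shifts the belief about the score upward and narrows it, which is visible in the matched mean and variance relative to the inputs.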
The learning of the parameters in the category B may further enable the attractiveness module 208 to use the word-level attractiveness model 128 to obtain an attractiveness score of a word in an online advertisement. In at least one embodiment, the attractiveness score of a word, a*i, may be inferred as follows:
in which p(ai|xa
Likewise, the learning of the parameters in the category B may further enable the attractiveness module 208 to use the advertisement attractiveness model 126 to obtain an attractiveness score of an online advertisement. In at least one embodiment, the attractiveness score of the online advertisement, a*, may be inferred as follows:
in which p(a|xa) is defined in equation (16).
The relevance feature extraction module 214 may extract a set of relevance features from each online advertisement that is to be analyzed, such as the online advertisement 106. As described above, the extracted relevance feature may include features that the users may see in a sponsored search, such as term frequency, inverse document frequency, topical page rank, and/or so forth. The features may be extracted by using the query words of a search query and the online advertisement. In some embodiments, the extracted relevance features may exclude features that are invisible to users, such as bid keywords and/or content of an advertisement landing page.
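Two of the user-visible relevance features, term frequency and inverse document frequency, can be sketched in Python as follows. The sketch uses only the query words and the advertisement text, consistent with excluding bid keywords and landing page content. The whitespace tokenizer, the smoothed inverse document frequency formula, the reference corpus of advertisements, and the function names are all assumptions made for illustration. Topical page rank is not covered.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def _tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def relevance_features(
    query: str, ad_text: str, corpus: Sequence[str]
) -> Dict[str, float]:
    """Compute summed TF and TF-IDF of the query words within an advertisement."""
    ad_tokens = _tokenize(ad_text)
    counts = Counter(ad_tokens)
    doc_sets = [set(_tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)
    tf_sum = 0.0
    tf_idf_sum = 0.0
    for term in set(_tokenize(query)):
        # Term frequency: share of advertisement words equal to the query word.
        tf = counts[term] / len(ad_tokens) if ad_tokens else 0.0
        # Smoothed inverse document frequency over the reference corpus.
        df = sum(1 for doc in doc_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        tf_sum += tf
        tf_idf_sum += tf * idf
    return {"tf": tf_sum, "tf_idf": tf_idf_sum}
```

A query whose words do not appear in the advertisement produces zero for both features, while query words that are rare across the corpus contribute more to the weighted feature.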
The attractiveness feature extraction module 216 may extract a set of attractiveness features for each word in an online advertisement that is to be analyzed, such as the online advertisement 106. As described above, the extracted attractiveness features for a word may include two types of features. The first type of features may be textual features, such as the position of each word in an online advertisement, the length of each word, the part of speech (POS) of each word, and so forth. The second type of features for each word may be features that are extracted from an online advertisement based on a historic record of user impressions and clicks, which may represent prior user preferences on words in online advertisements.
The user interface module 218 may enable the user to interact with the modules of the user click inference engine 102 using a user interface (not shown). The user interface may include a data output device (e.g., visual display, audio speakers), and one or more data input devices. The data input devices may include, but are not limited to, combinations of one or more of keypads, keyboards, mouse devices, touch screens, microphones, speech recognition packages, and any other suitable devices or other electronic/software selection methods.
In some embodiments, the user may select online advertisements to be analyzed by the user click inference engine 102 via the user interface module 218. In other embodiments, a user may use the user interface module 218 to manually input category A parameters into the training module 212, and/or upload training examples for learning category B parameters into the training module 212. In still other embodiments, the user interface module 218 may be used to select the types of relevance features and attractiveness features to be analyzed by the user click inference engine 102.
The data store 220 may store the various models that are used by the user click inference engine 102. The stored models may include the relevance model 124, the advertisement attractiveness model 126, the word-level attractiveness model 128, and the click behavior model 122. The data store 220 may further store the factor graphs 222-230, as well as other data and/or intermediate products that are used by the user click inference engine 102, such as the category A and category B parameters, training examples, search queries, and online advertisements to be analyzed. The data store 220 may also store scores generated by the user click inference engine 102. The scores may include word attractiveness scores, advertisement attractiveness scores, relevance scores, and/or probability of clicks for online advertisements.
The relevance model 124 may be constructed to quantify a set of relevance features that are visible to users, such as term frequency, inverse document frequency, topical page rank, and/or so forth, which are extracted by using the query words of a search query and the online advertisement. In some embodiments, the relevance features may exclude features that are invisible to users, such as bid keywords and/or content of an advertisement landing page.
At block 304, the advertisement attractiveness model 126 for estimating an attractiveness of the online advertisement to users may be developed for use by the attractiveness module 208. In various embodiments, the advertisement attractiveness model 126 may be a probabilistic model that is described by the factor graph 228.
At block 306, the click behavior model 122 may be created by combining the relevance model 124 and the advertisement attractiveness model 126. In various embodiments, the click behavior model 122 may be represented by the factor graph 230. The click behavior model 122 may use two categories of parameters in order to perform user click behavior analysis, in which the parameters in a first category may be manually set, while the parameters in a second category may be learned from a set of training data.
At block 308, the click behavior model 122 may be trained. The click behavior model 122 may be trained with the manual setting of the parameters in the first category. Additionally, the training module 212 may further train the click behavior model 122 by obtaining the parameters in the second category from a set of training examples by maximizing the likelihood of the training examples. In some embodiments, in order to perform the likelihood estimation in an efficient manner, the training module 212 may exploit an approximate message passing algorithm to train the click behavior model 122.
At block 310, the click behavior module 210 may apply the click behavior model 122 to features of an online advertisement, such as the online advertisement 106, to calculate a click probability of the online advertisement. The features of the online advertisement 106 may include the attractiveness features 118 and the relevance features 120. The click probability may be further reported to the online advertiser that provided the online advertisement 106 so that the online advertiser may improve the content of the online advertisement 106. For example, the online advertiser may modify the online advertisement to include additional words that are more appealing to users.
At block 402, a set of attractiveness features for quantifying attractiveness of words in an online advertisement may be identified. In various embodiments, the attractiveness features may include two types of features. The first type of features may be textual features, such as the position of each word in an online advertisement, the length of each word, the part of speech (POS) of each word, and so forth. The second type of features may be features that are identified based on a historic record of user impressions and clicks, which may represent prior user preferences for online advertisements and words in online advertisements.
At block 404, the word-level attractiveness model 128 that quantifies the set of attractiveness features may be generated. In various embodiments, the word-level attractiveness model 128 may be represented by the factor graph 226. The word-level attractiveness model 128 may use a Gaussian distribution to model the attractiveness scores of words in an online advertisement. In some embodiments, the word-level attractiveness model 128 may be used to generate an attractiveness score for each word in the online advertisement.
At block 406, the advertisement attractiveness model 126 may be defined based on the word-level attractiveness model 128. In defining the advertisement attractiveness model 126, the attractiveness score of an online advertisement may be assumed to be a Gaussian random variable. The advertisement attractiveness model 126 may be used to generate the advertisement attractiveness score 116 for an online advertisement.
At block 502, the relevance feature extraction module 214 may extract relevance features 120 that reflect the relevance of the online advertisement 106 to a search query, such as the search query 110. The extracted relevance features 120 may include features that are visible to users, such as word frequency, inverse document frequency, topical page rank, and/or so forth, which are extracted by using the query words of a search query 110 and the online advertisement 106.
At block 504, the attractiveness feature extraction module 216 may extract attractiveness features 118 of each word in the online advertisement 106. In various embodiments, the extracted attractiveness features may include two types of features. The first type of features may be textual features, such as the position of each word in an online advertisement, the length of each word, the part of speech (POS) of each word, and so forth. The second type of features may be features that are identified based on a historic record of user impressions and clicks, which may represent prior user preferences for online advertisements and words in online advertisements.
At block 506, the click behavior module 210 may infer a click probability for the online advertisement 106 by applying a click behavior model, such as the click behavior model 122, to the relevance features 120 and the attractiveness features 118 of the online advertisement 106.
In additional embodiments, the attractiveness module 208 may further use the word-level attractiveness model 128 to generate a word attractiveness score 114 for each word in the online advertisement 106 based on the attractiveness features 118. Likewise, the attractiveness module 208 may also use the advertisement attractiveness model 126 to generate the advertisement attractiveness score 116 for the online advertisement 106 based on the attractiveness features 118.
The attractiveness of an online advertisement is dependent on the ability of the words in the online advertisement to attract the attention of a user. The techniques described herein may provide a way to quantify the attractiveness of an online advertisement, and predict a probability that a user may click on the online advertisement based on the attractiveness of the advertisement in conjunction with the relevance of the online advertisement to a search query. Accordingly, rather than simply improving the relevance of their online advertisement to user search queries, the online advertisers may alternatively or concurrently use the click probabilities of online advertisements to improve the content attractiveness of their online advertisements to increase the number of user clicks. For example, words such as “free”, “save”, “deal”, and “affordable” may be used to increase the appeal of online advertisements to consumers.
In closing, although the various embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended representations is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed subject matter.