The present invention is directed to a method, apparatus and computer readable medium for selecting messages such as advertisements to serve to a web page, using past performance scores, relevance scores, and message revenue information.
When a user makes a request for base content to a server via a network, additional content is also typically sent to the user along with the base content. The user may be a human user interacting with a user interface of a computer that transmits the request for base content. The user could also be another computer process or system that generates and transmits the request for base content programmatically.
Base content may include a variety of content provided to a user and presented, for example, on a published Web page. For example, base content may include published information, such as articles about politics, business, sports, movies, weather, finance, health, consumer goods, etc. Base content may also include applications (such as iPhone apps), PDF files, or any other documents (such as document files in which ads are dynamically placed). Additional content may include content that is relevant to the base content or a user. For example, additional content that is relevant to the user may include advertisements for products or services in which the user is likely to have an interest. This type of additional content will be referred to hereinafter as “messages.”
Base content providers receive revenue from advertisers who wish to have their messages, such as advertisements, displayed to users, and who generally pay a particular amount each time a user clicks on one of their messages. This is known as the price per click (PPC) model. If this pricing method is used, the revenue received is a function of PPC × click-through rate (CTR). Another possible pricing method is price per impression, wherein a charge is levied each time a message is displayed, regardless of click rate. Base content providers employ a variety of methods to determine which additional content to display to a user in order to maximize revenue received. For example, user interest in particular subject categories (i.e., relevance) may be used to determine which additional content to display to the user. Refinements in the selection of message display content are important for the improvement of revenue received.
A method and apparatus are provided for selecting additional message content to display to a user when the user requests base content. In order to optimize the initial selection pool of message candidates (from which a final selection is chosen), for each candidate message, a historical aggregate of CTR data is combined with an offline estimate, also obtained from historical data, of RElative Probability of Action (REPA, which combines computed relevancy scores and ranking information) and with bid amount to obtain an estimate for revenue generation. This estimate is combined with dynamic matching between the message and the page/user pair to obtain a final score for each message, which is used to create the initial selection pool of messages for the page displayed. Final message selection uses a feedback loop, using the specific CTR in conjunction with the specific page displayed, for the messages in the initially selected pool.
a illustrates one embodiment for the basic quality index data flow.
b is a flow diagram corresponding to
When a user makes a request for base content to a base content provider via a network, additional content is also typically sent to the user along with the base content. As used herein, base content connotes content requested by a user. Base content may be presented, for example, as a Web page and may include a variety of content (e.g., news articles, emails, chat-rooms, etc.). Base content may take a variety of forms including text, images, video, audio, animation, program code, data structures, hyperlinks, etc. The base content may be formatted according to the Hypertext Markup Language (HTML), the Extensible Markup Language (XML), Standard Generalized Markup Language (SGML), or any other language.
As used herein, additional content is content sent to the user along with the requested base content. Additional content may include content that is relevant to the base content or a user. Additional content may include, for example, an advertisement or hyperlink (e.g., sponsor link, integrated link, inside link, or the like) in which the user has an interest. Additional content may include a similar variety of content and form as the base content described above.
In some embodiments, a base content provider is a network service provider (e.g., Yahoo! News, Yahoo! Music, Yahoo! Finance, Yahoo! Movies, Yahoo! Sports, etc.) that operates one or more servers that contain base content and receives requests for and transmits base content. A base content provider also sends additional content to users and employs methods for determining additional content to send along with the requested base content, the methods typically being implemented by the one or more servers it operates.
It is clear that displaying the most relevant messages (i.e., additional content) on a Web page results in good user experiences, and also leads to a better message click-through-rate (CTR). However, publishers are most interested in maximizing revenue, which is at least to some extent correlated to positive user experiences. In the standard cost-per-click (CPC) advertising system, advertisers are charged for each click on the message, resulting in revenue of CTR×CPC. Embodiments are disclosed herein to boost CTR as well as CPC, and as a result, boost revenue.
In one embodiment, a score for each message chosen in an initial selection pool is computed. In one embodiment, the score is comprised of two portions: a first portion, referred to herein as the “Static Score”, and a second portion, referred to herein as the “Dynamic Score.” The Static Score is a static property of the message itself. In some embodiments, the Static Score is formulated, for example, from bid amount, message quality measurement, and historical CTR information for the message. The second portion of the score, the Dynamic Score, consists of the dynamic matching between the message and the page/user pair. The total, or “Final Score”, is the weighted average of the Dynamic Score and the Static Score, and may be expressed as:
Final Score=α×Static Score+(1−α)×Dynamic Score,
wherein, α is a user-determined weighting parameter, with a value between 0 and 1.
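By way of illustration only, the weighted average above may be sketched as follows. The function name and the sample scores are hypothetical and are not taken from the disclosed embodiments.

```python
def final_score(static_score: float, dynamic_score: float, alpha: float) -> float:
    """Weighted average of the Static Score and the Dynamic Score.

    alpha is the user-determined weighting parameter in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * static_score + (1.0 - alpha) * dynamic_score


# Hypothetical sample values, chosen only to show the arithmetic.
print(final_score(800_000.0, 400_000.0, alpha=0.25))
```

With α=0 the Final Score reduces to the Dynamic Score, and with α=1 it reduces to the Static Score.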
In some embodiments, the Static Score is determined and normalized for linear combination with the Dynamic Score to generate the Final Score. The Static Score, as defined herein, is a function of the product of the bid amount with a quality index, referred to herein as the Relative Clickability Score (“RCB”). The RCB provides a quantitative comparison between the quality of a specified message with other peer messages. A key aspect of the model is combining two factors into the RCB:
In some embodiments, the RCB model uses a linear combination of the true historical CTR and the estimated CTR (eCTR), with user-determined parameters that define relative weights ascribed to the CTR and eCTR. When there is a large amount of historical data for a specified message (i.e., there has been ample opportunity for the message to show its performance), the historical true CTR is given more weight. However, when there is limited historical data (i.e., not enough “impressions”), the estimated CTR is given more weight. The CTR is normalized as a percentile so as to transform it to the same scale as the eCTR: [0,1].
The mathematical description of the construction of the RCB model in accordance with some embodiments is as follows:
RCB=βCTR+(1−β)eCTR
Where: β=f(x)=1/(1+e^(−c1·x)),
x=g(impression)=log(impression)−log(c2)
Mathematically, if log(impression)>log(c2), then β>½, so the CTR, or historically derived score, is weighted more heavily; if log(impression)<log(c2), then β<½, so the eCTR, or estimated score, is weighted more heavily. The tuning parameters c1 and c2 are determined by a grid search, and are chosen based on the highest accuracy in predicting good messages for the following week.
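The blending behavior described above may be sketched as follows. This is a minimal illustration that assumes β is the logistic function of c1·x, one reading consistent with the statement that β>½ exactly when log(impression)>log(c2). The function names are hypothetical, and the values c1=0.5 and c2=400 are borrowed from the example optimum reported later in this description.

```python
import math


def beta_weight(impressions: float, c1: float, c2: float) -> float:
    """Weight given to the historical CTR (assumed logistic in c1 * x)."""
    x = math.log(impressions) - math.log(c2)
    return 1.0 / (1.0 + math.exp(-c1 * x))


def rcb(ctr: float, ectr: float, impressions: float, c1: float, c2: float) -> float:
    """Relative Clickability Score: blend of historical CTR and estimated CTR."""
    b = beta_weight(impressions, c1, c2)
    return b * ctr + (1.0 - b) * ectr


# Both ctr and ectr are assumed to be already normalized to [0, 1].
for n in (40, 400, 4000):
    print(n, round(beta_weight(n, c1=0.5, c2=400), 3))
```

At exactly c2 impressions the two estimates are weighted equally; fewer impressions shift weight toward the eCTR, and more impressions shift it toward the historical CTR.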
In one embodiment, CTR is defined as (sum of clicks)/(sum of impressions), aggregated over the preceding 30 day time period. The clicks and impressions in this definition are both validated events (i.e., the clicks and impressions were not derived from fraudulent spamming clicks or automated web crawlers and robots). The basic quality index flow, including the gathering of daily and aggregated statistics, is described below.
eCTR, or estimated CTR, may be calculated based on the relevancy score RS and the rank information (i.e., the positioning or ranking of the message on the display page) of the message in question, compared to all other messages displayed, aggregated over the preceding 30 day time period. The eCTR is used as an initial estimate before there is adequate true historical click through data.
An exemplary mathematical description of the eCTR as incorporated into the RCB formulation described above is:
An important factor in the above analysis is the collection of historical data. Based on a sensitivity study of the behavior of high quality messages, messages with a high CTR in a given month tend to maintain it in the following month, and then begin to disappear or decay. Thus, in order to estimate the RCB for selection of future messages, the data from the previous month is especially important. Data analysis shows that using the monthly data to predict the following week's top messages is an effective method. The decay speed of good messages tends to be relatively slow. Therefore, a 30-day sliding window database may be used to get the history of a message's performance at the creative level. For this embodiment, the database stores, for each creative/message, information about: click numbers, impressions, rank and relevance. This information is used to estimate and forecast the RCB on a daily basis for each message.
a illustrates one embodiment for an exemplary basic quality index data flow. For this embodiment, the data is divided into two major parts. The first part, which stores the data from each day, identifies the daily statistics for the past 30 days. The second part, which contains the aggregated statistics as a whole, stores the rolling 30-day aggregated statistics.
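The two-part organization described above (per-day statistics plus a rolling aggregate) may be sketched for a single creative as follows. The class and field names are hypothetical; only clicks and impressions are tracked here, although the described database also stores rank and relevance. The sketch assumes the events have already been validated.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DailyStats:
    clicks: int
    impressions: int


class RollingWindow:
    """Keeps the last `days` daily records plus their running totals."""

    def __init__(self, days: int = 30) -> None:
        self.daily = deque(maxlen=days)  # part 1: per-day statistics
        self.clicks = 0                  # part 2: aggregated statistics
        self.impressions = 0

    def add_day(self, stats: DailyStats) -> None:
        # When the window is full, the oldest day is about to be evicted,
        # so remove its contribution from the aggregate first.
        if len(self.daily) == self.daily.maxlen:
            oldest = self.daily[0]
            self.clicks -= oldest.clicks
            self.impressions -= oldest.impressions
        self.daily.append(stats)
        self.clicks += stats.clicks
        self.impressions += stats.impressions

    def ctr(self) -> float:
        """(sum of clicks) / (sum of impressions) over the window."""
        return self.clicks / self.impressions if self.impressions else 0.0
```

Keeping running totals means each daily update touches only the newest and oldest records rather than re-summing the whole window.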
The Static Score may be calculated from the RCB, as follows:
Static Score=F(bid, RCB)=1,000,000, if RCB^s×bid≥τ, or else 1,000,000×(RCB^s×bid)/τ.
The (RCB^s×bid) amount is the Estimated Revenue Per Million impressions (ERPM) for the message, and “s” is the “squashing factor” that distributes weight between the RCB score and the bid amount (i.e., increasing “s” increases the weight of the RCB score, while a smaller “s” gives a larger weight to the contribution of the bid amount to the Static Score). τ is a threshold value: at or above it, the Static Score is given the highest score. Therefore, τ acts as a normalizing factor. The maximum value is chosen so that the Static Score follows a similar value distribution as the Dynamic Score.
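The capped, normalized Static Score may be sketched as follows. The sketch assumes that “s” acts as an exponent on the RCB, which is the reading suggested by the description of the squashing factor. The function name and sample values are hypothetical.

```python
MAX_STATIC_SCORE = 1_000_000


def static_score(bid: float, rcb: float, s: float, tau: float) -> float:
    """Static Score: ERPM scaled by the threshold tau and capped at the maximum."""
    erpm = (rcb ** s) * bid  # assumed form: RCB raised to the squashing factor, times bid
    if erpm >= tau:
        return float(MAX_STATIC_SCORE)
    return MAX_STATIC_SCORE * erpm / tau


# Hypothetical sample values.
print(static_score(bid=2.0, rcb=0.25, s=0.5, tau=4.0))
```

Because the RCB lies in [0, 1], an exponent of 0 would make the score depend on the bid alone, while larger exponents penalize low-RCB messages more strongly.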
Initially, Final Score is set equal to:
Final Score=α×Static Score+(1−α)×Dynamic Score (500, FIG. 5).
Then, Static Score is computed (505, FIG. 5) by:
Static Score=F(bid, RCB)=1,000,000, if RCB^s×bid≥τ, or else 1,000,000×(RCB^s×bid)/τ.
Set RCB=βCTR+(1−β)eCTR (510, FIG. 5),
wherein: β=f(x)=1/(1+e^(−c1·x)),
x=g(impression)=log(impression)−log(c2)
summed over j, where RS is the externally computed relevancy score.
Due to the huge amount of data processing required to implement the RCB and ERPM generation model disclosed herein, in some embodiments, the model is implemented using grid computing. In general, grid computing is described in Hadoop: The Definitive Guide, 1st Edition, O'Reilly Media, 2009. Described hereinafter is an exemplary computer implemented method. Note that the method is exemplary and other methods may be used without deviating from the spirit or scope of the invention.
Using grid computing, the majority of both the RCB and ERPM calculation is conducted offline. In one embodiment, the 30-day aggregated database and daily statistics are obtained by map-reduce jobs, described in “MapReduce: Simplified Data Processing on Large Clusters”, by Jeffrey Dean and Sanjay Ghemawat, usenix.org. Specifically, Hadoop streaming and Pig jobs (see “Pig Latin: A Not-So-Foreign Language for Data Processing”, Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar and Andrew Tomkins, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, 2008) may be implemented using 60 nodes for about 1 hour daily to generate the data. The offline jobs are triggered by the daily serve and click events, and run automatically under a scheduler such as the Yawn scheduler (e.g., Yahoo's internal grid computing scheduler). Once the RCB score computation is completed, it is joined with a database that stores the message inventory, to obtain the bid information for each creative. The bid value and RCB score are combined to get the ERPM score for each creative. If the bid values are not available for some creatives, the ERPM values for those creatives are set, as a default, to the median of the other ERPM values, in order to give those creatives some chance to be shown in the future.
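The join with the message inventory and the median default for creatives lacking a bid may be sketched in memory as follows. This is an illustration only: the described embodiment performs these steps as offline grid jobs, the function and variable names are hypothetical, and the ERPM is assumed to take the form RCB^s×bid.

```python
from statistics import median
from typing import Dict


def erpm_with_default(
    rcb_by_creative: Dict[str, float],
    bid_by_creative: Dict[str, float],
    s: float = 1.0,
) -> Dict[str, float]:
    """Compute ERPM per creative; creatives without a bid get the median ERPM."""
    known = {
        cid: (rcb ** s) * bid_by_creative[cid]
        for cid, rcb in rcb_by_creative.items()
        if cid in bid_by_creative
    }
    # Fallback of 0.0 when no bids at all are known is an assumption of this sketch.
    default = median(known.values()) if known else 0.0
    return {cid: known.get(cid, default) for cid in rcb_by_creative}


# Hypothetical creatives; "d" has no bid in the inventory.
rcbs = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.5}
bids = {"a": 1.0, "b": 2.0, "c": 3.0}
print(erpm_with_default(rcbs, bids))
```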
The model parameters in this exemplary embodiment are updated once a month. The model is trained, for example, the first day of each month, and the configuration file is modified to plug in the new optimal parameters obtained. The training objective is to find the best c1 and c2 with the highest recall for good messages, using the grid search with exponential steps. The optimal parameters may be used in the model for the following month. In one example, c1 was searched from 1e-4 to 1e5 in exponential steps, and c2 was searched from 1 to 1e5. For this example, c1=0.5 and c2=400 yield the highest accuracy in forecasting good messages (i.e., with an accuracy of 93.46%).
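A grid search of this kind may be sketched as follows. Several details are assumptions made for illustration and are not stated in this description: the exponential step is taken as a factor of 10, recall is measured over the top k messages ranked by RCB, β is assumed to be logistic in c1·x, and each training record is a hypothetical dictionary with keys "ctr", "ectr", "impressions" and "good".

```python
import math
from itertools import product


def sigmoid(z: float) -> float:
    """Numerically stable logistic function (avoids overflow for large |z|)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def rcb(m: dict, c1: float, c2: float) -> float:
    x = math.log(m["impressions"]) - math.log(c2)
    beta = sigmoid(c1 * x)
    return beta * m["ctr"] + (1.0 - beta) * m["ectr"]


def recall_at_k(messages: list, c1: float, c2: float, k: int) -> float:
    """Fraction of known-good messages that appear among the top k by RCB."""
    ranked = sorted(messages, key=lambda m: rcb(m, c1, c2), reverse=True)
    total_good = sum(1 for m in messages if m["good"])
    found = sum(1 for m in ranked[:k] if m["good"])
    return found / total_good if total_good else 0.0


def exponential_grid(lo_exp: int, hi_exp: int) -> list:
    """Powers of ten from 10**lo_exp to 10**hi_exp inclusive (assumed step)."""
    return [10.0 ** e for e in range(lo_exp, hi_exp + 1)]


def grid_search(messages: list, k: int) -> tuple:
    """Return (best_recall, c1, c2) over the ranges given in the example above."""
    best = None
    for c1, c2 in product(exponential_grid(-4, 5), exponential_grid(0, 5)):
        score = recall_at_k(messages, c1, c2, k)
        if best is None or score > best[0]:
            best = (score, c1, c2)
    return best
```

A finer grid, or a second pass around the best coarse point, would be needed to land on values between grid points.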
Once the ERPM score for each creative message is computed, it is used as one attribute of the message in the index, and the index is pushed to all the serving nodes. At online evaluation time, the ERPM score is linearly combined with dynamic scores computed with different relevance models, to generate the final evaluation score of each message. Only the messages with the highest final scores are selected as candidate messages (i.e., messages selected for delivery). In some embodiments, final message selection uses a feedback loop, using the specific CTR in conjunction with the specific page displayed, for the messages in the initially selected pool of candidate messages.
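The online selection step may be sketched as follows, assuming a single dynamic score per message for simplicity (the description above allows several relevance models). The function name and the record layout are hypothetical.

```python
import heapq
from typing import Dict, List


def select_candidates(messages: List[Dict], alpha: float, k: int) -> List[Dict]:
    """Return the k messages with the highest final scores, best first."""

    def final(m: Dict) -> float:
        return alpha * m["static"] + (1.0 - alpha) * m["dynamic"]

    return heapq.nlargest(k, messages, key=final)


# Hypothetical indexed messages: "static" is precomputed offline,
# "dynamic" is computed per request for the page/user pair.
pool = [
    {"id": "a", "static": 1_000_000.0, "dynamic": 0.0},
    {"id": "b", "static": 0.0, "dynamic": 1_000_000.0},
    {"id": "c", "static": 500_000.0, "dynamic": 500_000.0},
]
print([m["id"] for m in select_candidates(pool, alpha=0.25, k=2)])
```

Because the static part is a stored attribute of each indexed message, only the dynamic part and the linear combination need to be evaluated per request.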
Table 1 shows exemplary evaluation data comparing different weightings of the Static Score.
The data is divided into different testing buckets, shown in column 1. Column 2 displays the server configuration, where the Static Score Weight (ssw) α is varied (i.e., varying how much relative weight is given to the dynamic score versus the static score calculated using the one-month aggregated history model). Baseline bucket B0 uses the bid amount of the message as the Static Score, and uses an ssw of 0.13. Subsequent buckets use RCB×bid to obtain the static score, and the ssw varies between 0 and 0.25. Column 3 displays the Normalized Discounted Cumulative Gain (Ndcg-3) results, a measure of the effectiveness of a Web search engine algorithm or related applications, often used in information retrieval. In this case, it measures the relevance of the selected message group. Column 4 lists test coverage, and column 5 lists the difference in the message selection results between the bucket in question and baseline bucket B0.
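For readers unfamiliar with the metric, a common formulation of NDCG over the top three positions may be sketched as follows. The exact variant used to produce the Ndcg-3 column is not stated here; this sketch assumes a linear gain and a log2 position discount, and the relevance grades in the example are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain: each grade is divided by log2(position + 1)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 3) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance grades of the selected messages, in displayed order.
print(ndcg_at_k([3, 2, 1]), ndcg_at_k([1, 2, 3]))
```

An ordering that places the most relevant messages first scores 1.0; any other ordering of the same grades scores lower.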
As shown in Table 1, the ERPM score method disclosed herein yields similar relevance measurements to the baseline method, but the message candidate selection is considerably different, up to 56%. This indicates that use of the ERPM methods provides potential for improvement of the CTR, PPC and revenue.
One set of experimental bucket test results shows that the ERPM based static scoring technique generates a 4.61% CTR gain, a 9.63% PPC gain and a 14.72% RPM gain (revenue per thousand impressions). Thus, experimental results indicate substantial improvement in click through rate and revenue generation when the message selection model outlined herein is used.
Any node of the network 700 may comprise a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof capable of performing the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g. a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration, etc.).
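As a rough consistency check, if revenue per impression is taken to be proportional to CTR×PPC (the pricing relationship stated earlier), the two reported gains should compound multiplicatively. The sketch below performs only that arithmetic; the variable names are hypothetical.

```python
ctr_gain = 0.0461  # reported CTR gain
ppc_gain = 0.0963  # reported PPC gain

# Relative gains compound: (1 + g_ctr) * (1 + g_ppc) - 1.
combined_gain = (1.0 + ctr_gain) * (1.0 + ppc_gain) - 1.0
print(f"{combined_gain:.2%}")
```

The compounded figure is close to the reported 14.72% RPM gain. The small residual is not explained by this sketch and may reflect rounding of the individual figures or how the buckets were measured.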
In alternative embodiments, a node may comprise a machine in the form of a virtual machine (VM), a virtual server, a virtual client, a virtual desktop, a virtual volume, a network router, a network switch, a network bridge, a personal digital assistant (PDA), a cellular telephone, a web appliance, or any machine capable of executing a sequence of instructions that specify actions to be taken by that machine. Any node of the network may communicate cooperatively with another node on the network. In some embodiments, any node of the network may communicate cooperatively with every other node of the network. Further, any node or group of nodes on the network may comprise one or more computer systems (e.g. a client computer system, a server computer system) and/or may comprise one or more embedded computer systems, a massively parallel computer system, and/or a cloud computer system.
The computer system 750 includes a processor 708 (e.g. a processor core, a microprocessor, a computing device, etc), a main memory 710 and a static memory 712, which communicate with each other via a bus 714. The machine 750 may further include a display unit 716 that may comprise a touch-screen, or a liquid crystal display (LCD), or a light emitting diode (LED) display, or a cathode ray tube (CRT). As shown, the computer system 750 also includes a human input/output (I/O) device 718 (e.g. a keyboard, an alphanumeric keypad, etc), a pointing device 720 (e.g. a mouse, a touch screen, etc), a drive unit 722 (e.g. a disk drive unit, a CD/DVD drive, a tangible computer readable removable media drive, an SSD storage device, etc), a signal generation device 728 (e.g. a speaker, an audio output, etc), and a network interface device 730 (e.g. an Ethernet interface, a wired network interface, a wireless network interface, a propagated signal interface, etc).
The drive unit 722 includes a machine-readable medium 724 on which is stored a set of instructions (i.e. software, firmware, middleware, etc) 726 embodying any one, or all, of the methodologies described above. The set of instructions 726 is also shown to reside, completely or at least partially, within the main memory 710 and/or within the processor 708. The set of instructions 726 may further be transmitted or received via the network interface device 730 over the network bus 714.
It is to be understood that embodiments of this invention may be used as, or to support, a set of instructions executed upon some form of processing core (such as the CPU of a computer) or otherwise implemented or realized upon or within a machine- or computer-readable medium. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g. a computer). For example, a machine-readable medium includes read-only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g. carrier waves, infrared signals, digital signals, etc); or any other type of media suitable for storing or transmitting information.
While the invention has been described with reference to numerous specific details, it is not intended that the invention be limited to the exact embodiments disclosed herein. It should be clear to one skilled in the art that changes and modifications may be made without departing from the inventive concept. The scope of the invention should be construed in view of the claims.